Random generation is very slow #234

Closed
nh2 opened this issue Nov 25, 2018 · 7 comments

@nh2
Contributor

nh2 commented Nov 25, 2018

For testing my binding to the lz4 compression library (https://github.com/nh2/lz4-frame-conduit), I need to generate a lot of large ByteStrings.

Unfortunately I can barely use quickcheck for this, because its random generator is too slow.

When generating ByteStrings using quickcheck-instances with size = 1000, e.g. with

QC.property $ \(bsList :: [ByteString]) -> QCM.monadicIO $ do
  QCM.run $ hPutStrLn stderr (show $ sum $ map BS.length bsList)
  QCM.assert True

it generates only 1 MB/s; generating [String] is only slightly faster.

These numbers are very low compared to the 230 MB/s that /dev/urandom gets me on the same machine.

Can we do something about it?

@phadej
Contributor

phadej commented Nov 25, 2018

Something is weird; when I last benchmarked it, tf-random was 5-10 times faster than random. How do you generate your ByteStrings?

The way quickcheck-instances generates them is not optimal, but nobody has mentioned that it's too slow for their needs (it generates a list of Word8s rather than the Word32s that tf-random generates natively, so it's at least 4x slower than it could be).
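
To make the Word8-versus-Word32 point concrete, here is a minimal sketch of packing a ByteString from a supply of 32-bit words, so that each word drawn from the generator yields four bytes instead of one. The names `word32Bytes` and `packFromWord32s` are hypothetical and are not taken from quickcheck-instances; in a real generator the `[Word32]` supply would come from the random source.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftR)
import Data.Word (Word8, Word32)

-- Split one 32-bit word into its four bytes, most significant first.
-- fromIntegral keeps only the low 8 bits after each shift.
word32Bytes :: Word32 -> [Word8]
word32Bytes w = [fromIntegral (w `shiftR` s) | s <- [24, 16, 8, 0]]

-- Build a ByteString of at most n bytes from a supply of 32-bit words.
-- If the supply runs out early, the result is shorter than n.
packFromWord32s :: Int -> [Word32] -> BS.ByteString
packFromWord32s n ws = BS.pack (take n (concatMap word32Bytes ws))
```

This still goes through an intermediate list, so it only illustrates the "one draw, four bytes" idea, not the fastest possible way to fill a buffer.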

@nh2
Contributor Author

nh2 commented Nov 25, 2018

How do you generate your ByteStrings?

Using quickcheck-instances, as you said.

I'm now trying to work around this slowness by using pcg-random (with a seed and target length generated by QuickCheck's choose) to generate the ByteStrings. I've measured that this works at ~300 MB/s, even faster than /dev/urandom. But of course using this workaround isn't great.
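
A rough sketch of the shape of that workaround, under stated assumptions: QuickCheck's `choose` supplies only a seed and a length, and a separate fast PRNG expands the seed into bytes. To stay self-contained this uses a hand-written SplitMix64-style step as a stand-in for pcg-random, so it is not the code that was measured above. The names `step`, `bytesFromSeed`, and `genLargeByteString` are hypothetical.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftR, xor)
import Data.Word (Word8, Word64)
import Test.QuickCheck (Gen, choose)

-- One step of a SplitMix64-style generator (stand-in for pcg-random).
-- Returns the top byte of the mixed state and the next state.
step :: Word64 -> (Word8, Word64)
step s = (fromIntegral (z `shiftR` 56), s')
  where
    s' = s + 0x9e3779b97f4a7c15
    z1 = (s' `xor` (s' `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z2 = (z1 `xor` (z1 `shiftR` 27)) * 0x94d049bb133111eb
    z  = z2 `xor` (z2 `shiftR` 31)

-- Deterministically expand a seed into n pseudo-random bytes.
bytesFromSeed :: Word64 -> Int -> BS.ByteString
bytesFromSeed seed n = fst (BS.unfoldrN n (Just . step) seed)

-- QuickCheck picks only the seed and the length; the bytes come from step.
genLargeByteString :: Int -> Gen BS.ByteString
genLargeByteString maxLen = do
  seed <- choose (minBound, maxBound :: Int)
  len  <- choose (0, maxLen)
  pure (bytesFromSeed (fromIntegral seed) len)
```

It could be used as `forAll (genLargeByteString 1000000) prop`. Because the bytes are derived from the seed, a failing case is reproducible from the seed and length alone, but values produced this way do not shrink meaningfully.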

Something is weird; when I last benchmarked it, tf-random was 5-10 times faster than random.

Hmm, let me give that a "quick check".

@phadej
Contributor

phadej commented Nov 25, 2018

@nh2 could you try this version of quickcheck-instances: haskellari/qc-instances@master...phadej:faster-bytestring? How much faster than the released version is it in your case (and how much slower than pcg-random)?

In the simple benchmark there, it was already 20x faster.

(I'll resort to createAndTrim next)

@nick8325
Owner

I think that the best place to fix this particular problem is in the Arbitrary ByteString instance (e.g. using @phadej's patch).

More generally, there are a couple of things that might cause Gen to be slow:

  • Gen uses splitting to distribute random numbers throughout the computation, which may be slower than generating random numbers in the traditional way. The fix to this would be to change Gen into a state monad, but I don't want to do that as it would no longer be possible to generate infinite structures.
  • When I looked at the Core generated by GHC a while ago, it seemed rather suboptimal. I seem to remember that a lot of combinators were not getting specialised as one would hope. I've been meaning to look more into this but have not got round to it yet.
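
To illustrate the first point, here is a toy model of a splitting-based generator monad. It is a sketch and not QuickCheck's actual source: `Seed`, `mix`, `splitSeed`, and this simplified `Gen` (without the size parameter) are hypothetical, and `mix` is not a vetted PRNG. Bind splits the seed, so the continuation never has to wait for a "final state" of the first computation, and a value can still be drawn after an infinite list has been generated.

```haskell
import Data.Bits (shiftR, xor)
import Data.Word (Word64)

newtype Seed = Seed Word64

-- Toy mixing function in the style of SplitMix64's finalizer.
mix :: Word64 -> Word64
mix z0 = z3 `xor` (z3 `shiftR` 31)
  where
    z1 = z0 + 0x9e3779b97f4a7c15
    z2 = (z1 `xor` (z1 `shiftR` 30)) * 0xbf58476d1ce4e5b9
    z3 = (z2 `xor` (z2 `shiftR` 27)) * 0x94d049bb133111eb

-- Derive two child seeds from one parent seed.
splitSeed :: Seed -> (Seed, Seed)
splitSeed (Seed s) = (Seed (mix (2 * s)), Seed (mix (2 * s + 1)))

-- A generator is just a function of its seed.
newtype Gen a = Gen { runGen :: Seed -> a }

instance Functor Gen where
  fmap f (Gen g) = Gen (f . g)

instance Applicative Gen where
  pure x = Gen (const x)
  Gen f <*> Gen x = Gen $ \s ->
    let (s1, s2) = splitSeed s in f s1 (x s2)

instance Monad Gen where
  Gen m >>= k = Gen $ \s ->
    let (s1, s2) = splitSeed s in runGen (k (m s1)) s2

-- Draw a single word from the current seed.
word :: Gen Word64
word = Gen (\(Seed s) -> mix s)

-- A lazily produced infinite list; each tail gets its own split seed.
infiniteWords :: Gen [Word64]
infiniteWords = (:) <$> word <*> infiniteWords

-- Generating a value after the infinite list still terminates,
-- because y is drawn from an independent seed.
pairWithInfinite :: Gen ([Word64], Word64)
pairWithInfinite = do
  xs <- infiniteWords
  y  <- word
  pure (take 3 xs, y)
```

In a state-threading design, `y` would need the generator state left over after `infiniteWords`, which is never produced. The price of splitting is one `splitSeed` call per bind, which is the overhead the first bullet refers to.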

@moodmosaic

@nick8325

there are a couple of things that might cause Gen to be slow

Just a thought, but generating infinite structures only works when the bottom of the monad stack is Identity, right? If so, perhaps it's worth considering StateT over splitting(?)

@phadej
Contributor

phadej commented Mar 25, 2019

https://hackage.haskell.org/package/quickcheck-instances-0.3.20 has been released with a faster ByteString generator.

@MaximilianAlgehed
Collaborator

Seeing as this was fixed in quickcheck-instances, I'm going to close this.
