Very low throughput #51

nh2 · 2018-11-25T03:26:10Z

main = do
  let rs :: [Word8]
      rs = randoms (mkStdGen 42)

  print $ sum $ map (\x -> fromIntegral x :: Int) $ take 10000000 rs

gives me around 5 MB/s on my laptop.

main = do
  let rs :: [Int]
      rs = randoms (mkStdGen 42)

  print $ sum $ map (\x -> fromIntegral x :: Int) $ take 10000000 rs

gives me around 8 MB/s on my laptop.

This is very slow for a pseudorandom number generator.

/dev/urandom gives me 230 MB on the same machine.

The text was updated successfully, but these errors were encountered:

nh2 · 2018-11-25T03:26:24Z

I found this issue when trying to test my lz4 binding with QuickCheck (tests would take forever when generating reasonably sized input), so I started measuring the speed of random gens.

idontgetoutmuch · 2018-11-25T11:13:19Z

@nh2 don't use this package - see

#31
#29
https://ghc.haskell.org/trac/ghc/ticket/2280

Throughput is not its only problem.

I have made suggestions for replacing it and deprecating it.

@nh2 I thought QuickCheck used tf-random these days on account of this package not implementing split int a sensible way? Are you sure QuickCheck is actually using this package?

FYI - this runs about x10 faster for me.

import System.Random.MWC

main :: IO ()
main2 = do
    gen <- create

    let testUniform 0 !x = return (x :: Int)
        testUniform n x = do
            y <- uniform gen
            testUniform (n - 1) (x + y)

    total <- testUniform 10000000 0
    print total

cartazio · 2018-11-25T13:33:43Z

@nh2 master has a much more performant and good quality algorithm. Old random (current hackage ), Tf random and mwc all have poor statistical quality on any big crush. Addditionally, at no point have any of their Haskell interfaces been optimized for performance wrt batch throughout. Aka stream / unfold style interface Dominic: since you’ve done zero work of ever contributing to the new stuff, could you please just stop being involved on tickets and let’s just get you off the maintainer list? I can start doing drive by comments on stuff you do that I don’t help you on too otherwise. It’ll be really fun and demotivating. Just like you’ve been to meeeee. So please stop it. Please. You’re not helping accomplish anything except making an antagonistic dynamic that helps no one.

…

On Sun, Nov 25, 2018 at 6:13 AM idontgetoutmuch ***@***.***> wrote: @nh2 <https://github.com/nh2> don't use this package - see #31 <#31> #29 <#29> https://ghc.haskell.org/trac/ghc/ticket/2280 Throughput is not its only problem. I have made suggestions for replacing it and deprecating it. @nh2 <https://github.com/nh2> I thought QuickCheck used tf-random these days on account of this package not implementing split int a sensible way? Are you sure QuickCheck is actually using *this* package? FYI - this runs about x10 faster for me. import System.Random.MWC main :: IO () main2 = do gen <- create let testUniform 0 !x = return (x :: Int) testUniform n x = do y <- uniform gen testUniform (n - 1) (x + y) total <- testUniform 10000000 0 print total — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub <#51 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAQwiBK8nB3XlNhbv0PD35UPII5Vt1jks5uyntQgaJpZM4Yxscq> .

cartazio · 2018-11-25T13:57:14Z

nh2: hey I can actually help you if you want on the rng stuff. If you want a fast decent rng today, use the pcg package. The next major version of random is gonna have some breaking changes and have a pcg variant as one of the recommended defaults. Pcg should be way faster / better quality than all the other non cryptographic rngs in hackage. What is the application domain / way of using rngs you have going on? On Sun, Nov 25, 2018 at 8:33 AM Carter Schonwald <[email protected]> wrote:

…

@nh2 master has a much more performant and good quality algorithm. Old random (current hackage ), Tf random and mwc all have poor statistical quality on any big crush. Addditionally, at no point have any of their Haskell interfaces been optimized for performance wrt batch throughout. Aka stream / unfold style interface Dominic: since you’ve done zero work of ever contributing to the new stuff, could you please just stop being involved on tickets and let’s just get you off the maintainer list? I can start doing drive by comments on stuff you do that I don’t help you on too otherwise. It’ll be really fun and demotivating. Just like you’ve been to meeeee. So please stop it. Please. You’re not helping accomplish anything except making an antagonistic dynamic that helps no one. On Sun, Nov 25, 2018 at 6:13 AM idontgetoutmuch ***@***.***> wrote: > @nh2 <https://github.com/nh2> don't use this package - see > > #31 <#31> > #29 <#29> > https://ghc.haskell.org/trac/ghc/ticket/2280 > > Throughput is not its only problem. > > I have made suggestions for replacing it and deprecating it. > > @nh2 <https://github.com/nh2> I thought QuickCheck used tf-random these > days on account of this package not implementing split int a sensible way? > Are you sure QuickCheck is actually using *this* package? > > FYI - this runs about x10 faster for me. > > import System.Random.MWC > > main :: IO () > main2 = do > gen <- create > > let testUniform 0 !x = return (x :: Int) > testUniform n x = do > y <- uniform gen > testUniform (n - 1) (x + y) > > total <- testUniform 10000000 0 > print total > > — > You are receiving this because you are subscribed to this thread. > Reply to this email directly, view it on GitHub > <#51 (comment)>, or mute > the thread > <https://github.com/notifications/unsubscribe-auth/AAAQwiBK8nB3XlNhbv0PD35UPII5Vt1jks5uyntQgaJpZM4Yxscq> > . >

nh2 · 2018-11-25T17:15:59Z

I thought QuickCheck used tf-random these days on account of this package not implementing split int a sensible way? Are you sure QuickCheck is actually using this package?

I know, but tf-random is equally at least the way quickcheck-instances uses it, see: nick8325/quickcheck#234

don't use this package

@idontgetoutmuch I'm aware of the issues but I cannot easily patch it out all the way down to QuickCheck -- as you can imagine, this issue is just a sidetrack in my 5-levels deep recursion stack of what I actually want to work on :)

master has a much more performant and good quality algorithm

@cartazio This is nice, but for it to have a real impact it must be on Hackage and libs like QuickCheck must be using it.

What is the application domain / way of using rngs you have going on?

Right now, I just want QuickCheck to work at reasonable speeds. I'm writing an lz4 binding and wanted to test it with it, for which I need to generate lots of input ByteStrings, including large ones. I found that I cannot write a Gen ByteString that exceeds 8 MB/s, which makes my tests take forever.

So I started measuring the underlying libraries (random and tf-random and found that they are all slow).

Pcg should be way faster / better quality

Good hint, appreciated!

I have already implemented a workaround based on pcg-random (with a seed and target length generated by QuickCheck's choose) to generate the ByteStrings. I've measured that this works at ~300 MB/s, even faster than /dev/urandom.

But of course having to do this using this workaround isn't great; I am already many hours into a complete side-project -- I had expected this stuff to just work (tm).

could you please just stop being involved on tickets and let’s just get you off the maintainer list?

Let's not get bitter about things, we all want the same thing in the end. It is understandable that people are frustrated because a core component that they must rely on (by choice or dependency) doesn't work out of the box, for a long time. I too am frustrated because I didn't expect I'd be spending many hours of my weekend dealing with random number generation.

Pointing out problems with the current state is also a useful contribution (this is what my ticket here does, too). Let's not get discouraged by it, but rather encouraged to get things fixed.

cartazio · 2018-11-25T17:26:42Z

Nh2 I’m gonna be doing a release later this holiday season. Maybe this week. There’s some portability issues in the current interface that means you don’t get reproducible results on every platform. This holiday season Unless some work or personal emergencies derail stuff Additionally : there isn’t a a well defined split operation for pcg, or at least not one that’s well studied.

…

On Sun, Nov 25, 2018 at 12:16 PM Niklas Hambüchen ***@***.***> wrote: I thought QuickCheck used tf-random these days on account of this package not implementing split int a sensible way? Are you sure QuickCheck is actually using *this* package? I know, but tf-random is equally at least the way quickcheck-instances uses it, see: nick8325/quickcheck#234 <nick8325/quickcheck#234> don't use this package @idontgetoutmuch <https://github.com/idontgetoutmuch> I'm aware of the issues but I cannot easily patch it out all the way down to QuickCheck -- as you can imagine, this issue is just a sidetrack in my 5-levels deep recursion stack of what I actually want to work on :) master has a much more performant and good quality algorithm @cartazio <https://github.com/cartazio> This is nice, but for it to have a real impact it must be on Hackage and libs like QuickCheck must be using it. What is the application domain / way of using rngs you have going on? Right now, I just want QuickCheck to work at reasonable speeds. I'm writing an lz4 binding and wanted to test it with it, for which I need to generate lots of input ByteStrings, including large ones. I found that I cannot write a Gen ByteString that exceeds 8 MB/s, which makes my tests take forever. So I started measuring the underlying libraries (random and tf-random and found that they are all slow). Pcg should be way faster / better quality Good hint, appreciated! I have already implemented a workaround based on pcg-random (with a seed and target length generated by QuickCheck's choose) to generate the ByteStrings. I've measured that this works at ~300 MB/s, even faster than /dev/urandom. But of course having to do this using this workaround isn't great; I am already many hours into a complete side-project -- I had expected this stuff to just work (tm). could you please just stop being involved on tickets and let’s just get you off the maintainer list? Let's not get bitter about things, we all want the same thing in the end. It is understandable that people are frustrated because a core component that they must rely on (by choice or dependency) doesn't work out of the box, for a long time. I too am frustrated because I didn't expect I'd be spending many hours of my weekend dealing with random number generation. Pointing out problems with the current state is also a useful contribution (this is what my ticket here does, too). Let's not get discouraged by it, but rather encouraged to get things fixed. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#51 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAAQwkHd_9ipFkK0eZFlkh9GTp4xWdWcks5uytBPgaJpZM4Yxscq> .

ecthiender · 2019-04-25T21:30:50Z

@cartazio was this pcg version released? Just curious to know if the fast version has landed on hackage.

cartazio · 2019-04-25T22:18:03Z

atm not yet, but the split-mix and pcg packages on hackage should be decent today
(i had to work though how to make sure stuff is forward migrable, and now its on my release queue again once i suss things out with relevant CLC members)

cartazio · 2019-04-25T22:24:40Z

this is definitely my fault, but its one of those things where i need to / want to navigate evolving the ecosystem and breakages carefully

ecthiender · 2019-04-26T07:07:52Z

Ok, thanks for the update.

lehins · 2020-05-26T09:44:49Z

For anyone that is interested in this ticket, apparently it has been a problem for ~~12 years~~ almost 15 years and there is finally a solution in #61

All we are waiting for now is getting it reviewed and released, right @cartazio ?

author Alexey Kuleshevich <[email protected]> 1581472095 +0300 committer Leonhard Markert <[email protected]> 1590493894 +0200 This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word8 | 14 | 0.03 | 422| | pure/uniform/Word16 | 13 | 0.03 | 375| | pure/uniform/Word32 | 21 | 0.03 | 594| | pure/uniform/Word64 | 42 | 0.03 | 1283| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int8 | 15 | 0.03 | 511| | pure/uniform/Int16 | 15 | 0.03 | 507| | pure/uniform/Int32 | 22 | 0.03 | 749| | pure/uniform/Int64 | 44 | 0.03 | 1405| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| | pure/uniform/CChar | 14 | 0.03 | 485| | pure/uniform/CSChar | 14 | 0.03 | 455| | pure/uniform/CUChar | 13 | 0.03 | 448| | pure/uniform/CShort | 14 | 0.03 | 473| | pure/uniform/CUShort | 13 | 0.03 | 457| | pure/uniform/CInt | 21 | 0.03 | 737| | pure/uniform/CUInt | 21 | 0.03 | 742| | pure/uniform/CLong | 43 | 0.03 | 1544| | pure/uniform/CULong | 42 | 0.03 | 1460| | pure/uniform/CPtrdiff | 43 | 0.03 | 1494| | pure/uniform/CSize | 43 | 0.03 | 1475| | pure/uniform/CWchar | 22 | 0.03 | 785| | pure/uniform/CSigAtomic | 21 | 0.03 | 749| | pure/uniform/CLLong | 43 | 0.03 | 1554| | pure/uniform/CULLong | 42 | 0.03 | 1505| | pure/uniform/CIntPtr | 43 | 0.03 | 1476| | pure/uniform/CUIntPtr | 42 | 0.03 | 1463| | pure/uniform/CIntMax | 43 | 0.03 | 1535| | pure/uniform/CUIntMax | 42 | 0.03 | 1493| API changes =========== MonadRandom ----------- This patch adds a class 'MonadRandom': -- | 'MonadRandom' is an interface to monadic pseudo-random number generators. class Monad m => MonadRandom g s m | g m -> s where {-# MINIMAL freezeGen,thawGen,(uniformWord32|uniformWord64) #-} type Frozen g = (f :: Type) | f -> g freezeGen :: g s -> m (Frozen g) thawGen :: Frozen g -> m (g s) uniformWord32 :: g s -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g s -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds Conceptually, in 'MonadRandom g s m', 'g s' is the type of the generator, 's' is the state type, and 'm' the underlying monad. Via the functional dependency 'g m -> s', the state type is determined by the generator and monad. 'Frozen' is the type of the generator's state "at rest". It is defined as an injective type family via 'f -> g', so there is no ambiguity as to which 'g' any 'Frozen g' belongs to. This definition is generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full 'MonadRandom Gen' instance. Four 'MonadRandom' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.10' (GHC-8.2) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>

This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word8 | 14 | 0.03 | 422| | pure/uniform/Word16 | 13 | 0.03 | 375| | pure/uniform/Word32 | 21 | 0.03 | 594| | pure/uniform/Word64 | 42 | 0.03 | 1283| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int8 | 15 | 0.03 | 511| | pure/uniform/Int16 | 15 | 0.03 | 507| | pure/uniform/Int32 | 22 | 0.03 | 749| | pure/uniform/Int64 | 44 | 0.03 | 1405| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| | pure/uniform/CChar | 14 | 0.03 | 485| | pure/uniform/CSChar | 14 | 0.03 | 455| | pure/uniform/CUChar | 13 | 0.03 | 448| | pure/uniform/CShort | 14 | 0.03 | 473| | pure/uniform/CUShort | 13 | 0.03 | 457| | pure/uniform/CInt | 21 | 0.03 | 737| | pure/uniform/CUInt | 21 | 0.03 | 742| | pure/uniform/CLong | 43 | 0.03 | 1544| | pure/uniform/CULong | 42 | 0.03 | 1460| | pure/uniform/CPtrdiff | 43 | 0.03 | 1494| | pure/uniform/CSize | 43 | 0.03 | 1475| | pure/uniform/CWchar | 22 | 0.03 | 785| | pure/uniform/CSigAtomic | 21 | 0.03 | 749| | pure/uniform/CLLong | 43 | 0.03 | 1554| | pure/uniform/CULLong | 42 | 0.03 | 1505| | pure/uniform/CIntPtr | 43 | 0.03 | 1476| | pure/uniform/CUIntPtr | 42 | 0.03 | 1463| | pure/uniform/CIntMax | 43 | 0.03 | 1535| | pure/uniform/CUIntMax | 42 | 0.03 | 1493| API changes =========== MonadRandom ----------- This patch adds a class 'MonadRandom': -- | 'MonadRandom' is an interface to monadic pseudo-random number generators. class Monad m => MonadRandom g s m | g m -> s where {-# MINIMAL freezeGen,thawGen,(uniformWord32|uniformWord64) #-} type Frozen g = (f :: Type) | f -> g freezeGen :: g s -> m (Frozen g) thawGen :: Frozen g -> m (g s) uniformWord32 :: g s -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g s -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds Conceptually, in 'MonadRandom g s m', 'g s' is the type of the generator, 's' is the state type, and 'm' the underlying monad. Via the functional dependency 'g m -> s', the state type is determined by the generator and monad. 'Frozen' is the type of the generator's state "at rest". It is defined as an injective type family via 'f -> g', so there is no ambiguity as to which 'g' any 'Frozen g' belongs to. This definition is generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full 'MonadRandom Gen' instance. Four 'MonadRandom' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.10' (GHC-8.2) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>

cartazio · 2020-05-26T18:21:20Z

There’s a bunch of fixes and improvements indeed.

…

On Tue, May 26, 2020 at 5:45 AM Alexey Kuleshevich ***@***.***> wrote: For anyone that is interested in this ticket, apparently it has been a problem for 12 years <https://gitlab.haskell.org/ghc/ghc/issues/2280> and there is finally a solution in #61 <#61> All we are waiting for now is getting it reviewed and released, right @cartazio <https://github.com/cartazio> ? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#51 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAABBQTIMVSD2WFNCLZ2U4LRTOFR7ANCNFSM4GGGY4VA> .

This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word8 | 14 | 0.03 | 422| | pure/uniform/Word16 | 13 | 0.03 | 375| | pure/uniform/Word32 | 21 | 0.03 | 594| | pure/uniform/Word64 | 42 | 0.03 | 1283| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int8 | 15 | 0.03 | 511| | pure/uniform/Int16 | 15 | 0.03 | 507| | pure/uniform/Int32 | 22 | 0.03 | 749| | pure/uniform/Int64 | 44 | 0.03 | 1405| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| | pure/uniform/CChar | 14 | 0.03 | 485| | pure/uniform/CSChar | 14 | 0.03 | 455| | pure/uniform/CUChar | 13 | 0.03 | 448| | pure/uniform/CShort | 14 | 0.03 | 473| | pure/uniform/CUShort | 13 | 0.03 | 457| | pure/uniform/CInt | 21 | 0.03 | 737| | pure/uniform/CUInt | 21 | 0.03 | 742| | pure/uniform/CLong | 43 | 0.03 | 1544| | pure/uniform/CULong | 42 | 0.03 | 1460| | pure/uniform/CPtrdiff | 43 | 0.03 | 1494| | pure/uniform/CSize | 43 | 0.03 | 1475| | pure/uniform/CWchar | 22 | 0.03 | 785| | pure/uniform/CSigAtomic | 21 | 0.03 | 749| | pure/uniform/CLLong | 43 | 0.03 | 1554| | pure/uniform/CULLong | 42 | 0.03 | 1505| | pure/uniform/CIntPtr | 43 | 0.03 | 1476| | pure/uniform/CUIntPtr | 42 | 0.03 | 1463| | pure/uniform/CIntMax | 43 | 0.03 | 1535| | pure/uniform/CUIntMax | 42 | 0.03 | 1493| API changes =========== MonadRandom ----------- This patch adds a class 'MonadRandom': -- | 'MonadRandom' is an interface to monadic pseudo-random number generators. class Monad m => MonadRandom g s m | g m -> s where {-# MINIMAL freezeGen,thawGen,(uniformWord32|uniformWord64) #-} type Frozen g = (f :: Type) | f -> g freezeGen :: g s -> m (Frozen g) thawGen :: Frozen g -> m (g s) uniformWord32 :: g s -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g s -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds Conceptually, in 'MonadRandom g s m', 'g s' is the type of the generator, 's' is the state type, and 'm' the underlying monad. Via the functional dependency 'g m -> s', the state type is determined by the generator and monad. 'Frozen' is the type of the generator's state "at rest". It is defined as an injective type family via 'f -> g', so there is no ambiguity as to which 'g' any 'Frozen g' belongs to. This definition is generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full 'MonadRandom Gen' instance. Four 'MonadRandom' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.10' (GHC-8.2) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>

This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word8 | 14 | 0.03 | 422| | pure/uniform/Word16 | 13 | 0.03 | 375| | pure/uniform/Word32 | 21 | 0.03 | 594| | pure/uniform/Word64 | 42 | 0.03 | 1283| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int8 | 15 | 0.03 | 511| | pure/uniform/Int16 | 15 | 0.03 | 507| | pure/uniform/Int32 | 22 | 0.03 | 749| | pure/uniform/Int64 | 44 | 0.03 | 1405| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| | pure/uniform/CChar | 14 | 0.03 | 485| | pure/uniform/CSChar | 14 | 0.03 | 455| | pure/uniform/CUChar | 13 | 0.03 | 448| | pure/uniform/CShort | 14 | 0.03 | 473| | pure/uniform/CUShort | 13 | 0.03 | 457| | pure/uniform/CInt | 21 | 0.03 | 737| | pure/uniform/CUInt | 21 | 0.03 | 742| | pure/uniform/CLong | 43 | 0.03 | 1544| | pure/uniform/CULong | 42 | 0.03 | 1460| | pure/uniform/CPtrdiff | 43 | 0.03 | 1494| | pure/uniform/CSize | 43 | 0.03 | 1475| | pure/uniform/CWchar | 22 | 0.03 | 785| | pure/uniform/CSigAtomic | 21 | 0.03 | 749| | pure/uniform/CLLong | 43 | 0.03 | 1554| | pure/uniform/CULLong | 42 | 0.03 | 1505| | pure/uniform/CIntPtr | 43 | 0.03 | 1476| | pure/uniform/CUIntPtr | 42 | 0.03 | 1463| | pure/uniform/CIntMax | 43 | 0.03 | 1535| | pure/uniform/CUIntMax | 42 | 0.03 | 1493| API changes =========== StatefulGen ----------- This patch adds a class 'StatefulGen': -- | 'StatefulGen' is an interface to monadic pseudo-random number generators. class Monad m => StatefulGen g m where uniformWord32 :: g -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds In 'StatefulGen g m', 'g' is the type of the generator and 'm' the underlying monad. Four 'StatefulGen' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. FrozenGen --------- This patch also introduces a class 'FrozenGen': -- | 'FrozenGen' is designed for stateful pseudo-random number generators -- that can be saved as and restored from an immutable data type. class StatefulGen (MutableGen f m) m => FrozenGen f m where type MutableGen f m = (g :: Type) | g -> f freezeGen :: MutableGen f m -> m f thawGen :: f -> m (MutableGen f m) 'f' is the type of the generator's state "at rest" and 'm' the underlying monad. 'MutableGen' is defined as an injective type family via 'g -> f' so for any generator 'g', the type 'f' of its at-rest state is well-defined. Both 'StatefulGen' and 'FrozenGen' are generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full instances. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.10' (GHC-8.2) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>

This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| API changes =========== StatefulGen ----------- This patch adds a class 'StatefulGen': -- | 'StatefulGen' is an interface to monadic pseudo-random number generators. class Monad m => StatefulGen g m where uniformWord32 :: g -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds In 'StatefulGen g m', 'g' is the type of the generator and 'm' the underlying monad. Four 'StatefulGen' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. FrozenGen --------- This patch also introduces a class 'FrozenGen': -- | 'FrozenGen' is designed for stateful pseudo-random number generators -- that can be saved as and restored from an immutable data type. class StatefulGen (MutableGen f m) m => FrozenGen f m where type MutableGen f m = (g :: Type) | g -> f freezeGen :: MutableGen f m -> m f thawGen :: f -> m (MutableGen f m) 'f' is the type of the generator's state "at rest" and 'm' the underlying monad. 'MutableGen' is defined as an injective type family via 'g -> f' so for any generator 'g', the type 'f' of its at-rest state is well-defined. Both 'StatefulGen' and 'FrozenGen' are generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full instances. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.8' (GHC-7.10) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>

lehins · 2020-06-23T15:20:48Z

@nh2 I am really happy to let you know that this is no longer an issue! 😄 random-1.2 is now on hackage.

Let me know the results if you get a chance to try out your throughput benchmark comparison against the /dev/urandom ;)

nh2 mentioned this issue Nov 25, 2018

Random generation is very slow nick8325/quickcheck#234

Closed

idontgetoutmuch mentioned this issue Apr 29, 2020

Interface to performance idontgetoutmuch/random#1

Open

idontgetoutmuch mentioned this issue May 20, 2020

V1.2 proposal #61

Merged

lehins mentioned this issue Jun 4, 2020

V1.2 proposal #62

Merged

lehins closed this as completed Jun 23, 2020

lehins added this to the 1.2.0 milestone Jan 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Very low throughput #51

Very low throughput #51

nh2 commented Nov 25, 2018

nh2 commented Nov 25, 2018

idontgetoutmuch commented Nov 25, 2018

cartazio commented Nov 25, 2018 via email

cartazio commented Nov 25, 2018 via email

nh2 commented Nov 25, 2018

cartazio commented Nov 25, 2018 via email

ecthiender commented Apr 25, 2019

cartazio commented Apr 25, 2019

cartazio commented Apr 25, 2019

ecthiender commented Apr 26, 2019

lehins commented May 26, 2020 •

edited

Loading

cartazio commented May 26, 2020 via email

lehins commented Jun 23, 2020

Very low throughput #51

Very low throughput #51

Comments

nh2 commented Nov 25, 2018

nh2 commented Nov 25, 2018

idontgetoutmuch commented Nov 25, 2018

cartazio commented Nov 25, 2018 via email

cartazio commented Nov 25, 2018 via email

nh2 commented Nov 25, 2018

cartazio commented Nov 25, 2018 via email

ecthiender commented Apr 25, 2019

cartazio commented Apr 25, 2019

cartazio commented Apr 25, 2019

ecthiender commented Apr 26, 2019

lehins commented May 26, 2020 • edited Loading

cartazio commented May 26, 2020 via email

lehins commented Jun 23, 2020

lehins commented May 26, 2020 •

edited

Loading