-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
V1.2 proposal #62
V1.2 proposal #62
Conversation
cf2c518
to
80ef577
Compare
Loss of ability to save/restore state for mutable generators is regrettable since now each generator will have to create separate implementations and it makes much more difficult to implement API for uniform generator initialization etc. I think that crux of problem is that generator for pure PRNG is just some normal value of type |
Just to make the difference completely clear, here is the direct comparison between the original proposal (#61) and the version without the state token parameter (this PR). From #61: random/src/System/Random/Internal.hs Lines 179 to 205 in 463b8a1
This PR: random/src/System/Random/Internal.hs Lines 176 to 183 in 91b79ed
|
91b79ed
to
5773379
Compare
For those who don’t read the news, nyc is a bit complicated this past month or two and even more so this past week, with curfew. There is no sane planet where expecting me to be responsive and fast is possible when there’s that sort of stuff going one. I’ve deleted the comments that aren’t around technical discourse (and are oblivious to larger world events that might lead to slow replies). |
To be clear: I’m keen to focus on this, but the past few weeks in nyc are not a reasonable environment to facilitate that. |
Pardon this morning, I was a bit grumpy and was dealing with some asthma and packing this morning. Reviewing this squashed patch this evening |
@Shimuuar : is there any way to add this back in the form of a super class with the lost functionality put back in or is the argument simply over whether this is good design in the first place? |
@Boarders I was literally just discussing this on a private channel. Were you listening? ;P |
5773379
to
14df799
Compare
Current master has been reset to the version of This PR now has all of git commits since beginning of proposal, which we will probably squash as one mega commit before merging into master. For now we encourage anyone with interest in the newly added API to review and comment on this PR. Looking forward to some feedback I also update the description of this PR to reflect the actual proposal. |
Great work! |
14df799
to
86e06b8
Compare
@lehins the text of the first post now reads:
Note that |
@ulysses4ever Thanks for catching it, I updated the text. he class was renamed recently to avoid the what turned out to be a common confusion of what the intended use for the class should be. If you see MonadRandom in some other place, please ping me, it needs to be renamed to StatefulGen. |
@lehins the link to the PR in the original comment is dead, presumably because it has been deleted. also, a hyperlink for StatefulGen is still labelled 'MonadRandom'. |
This is very thorough! Going to be digesting this for a while. First thought: Is GHC >= 8.2 required just for |
performance is over x1000 times better; the minimum performance | ||
increase for the types listed below is more than x36. | ||
|
||
Name | 1.1 Mean | 1.2 Mean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These numbers are simply phenomenal.
@@ -1,3 +1,112 @@ | |||
# 1.2 | |||
|
|||
1. Breaking change which mostly maintains backwards compatibility, see |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these changes are great!
I actually think something along these lines would be a good idea. It's a pretty natural question and one that might get brought up in the issue tracker. Having something to point to and then "Close and comment" would be pretty useful. |
Oops, when I wrote "Close and comment", I had a brain malfunction and ended up hitting the "Close and comment" button. |
Not quite.
Until someone can show me an example where |
This would be another great set of points to include in any explanation. |
@Shimuuar Here is a puzzle for you. Use |
It's very much possible. But it requires structuring library differently and to expose only frozen variant of generator. Mutable generator is implementation detail hidden in some reader monad class MonadRandom g m | m -> g where
save :: m g
restore :: g -> m ()
...
True. I however don't see it as a big problem. In pre-DerivingVia era I would probably agree with you.
And this is example of more power! However StatefulGen is somewhat limited in this respect. It's impossible to use SplitMix and say some LCG without resorting to IO/ST tricks with IORefs.
Whole idea of this style of API is to rely on typeclasses to do all the lifting.
It's like saying walk over there but don't move your legs. In API that dispatches on type of monad one have to define new types. Mirror image is puzzle add generation of randomness in some function 3-layers deep in call graph but don't pass any extra parameters manually. |
@Shimuuar, with @chessai we were discussing this
You never know, one might be able to walk on hands ;P @Shimuuar The whole point is that stack of transformers is not user friendly AT ALL. And even if you cook up a proof that the two classes are equivalent in their power with definition of newtype wrappers and such, from the user's perspective, either |
That's my point! It will look ridiculous and won't work well.
I think this is focal point of disagreement. MonadRandom is natural choice for projects structured in mtl-style: Similarly intergating MonadRandom into mtl-style project seems to be very easy task. Slap MonadRandom into context here and there and add instance. Even with deriving machinery it seems to be rather simple task In the end all word have been said and all we can to is to agree to disagree |
@Shimuuar You are really wrong about this! Behold the proof. Let's look at a ludicrous example using mtl style programming and mix it with randomWithRejectionN :: (Random a, RandomGen g, Monad m) => Int -> (a -> m Bool) -> g -> m ([a], g)
randomWithRejectionN n f g = fmap swap $ runWriterT $ flip execRandT g $ evalStateT go 0
where
go = do
i <- get
when (i < n) $ do
v <- lift getRandom
reject <- lift $ lift $ lift $ f v
unless reject $ do
lift (lift (tell [v]))
put (i + 1)
go Running it we produce a list of odd numbers in a very convoluted way. λ> randomWithRejectionN 5 (pure . even) (mkStdGen 217) :: IO ([Int8], StdGen)
([37,3,23,65,45],StdGen {unStdGen = SMGen 18077526032776641972 15251669095119325999}) Question is, what do we need to do in order to use
data RandGenM g = RandGenM
instance (Monad m, RandomGen g) => StatefulGen (RandGenM g) (RandT g m) where
uniformWord32R r = applyRand (genWord32R r)
...
instance (Monad m, RandomGen g) => RandomGenM (RandGenM g) g (RandT g m) where
applyRandomGenM = applyRand
applyRand :: Applicative m => (g -> (a, g)) -> RandGenM g -> RandT g m a
applyRand f _ = liftRandT (pure . f) Note that the whole instance is isomorphic to
randomWithRejectionN ::
forall a g m. (Random a, RandomGen g, Monad m) => Int -> (a -> m Bool) -> g -> m ([a], g)
randomWithRejectionN n f g =
fmap swap $ runWriterT $ flip execRandT g $ evalStateT go 0
where
go = do
i <- get
when (i < n) $ do
v <- lift $ randomM (RandGenM :: RandGenM g)
reject <- lift $ lift $ lift $ f v
unless reject $ do
lift (lift (tell [v]))
put (i + 1)
go That's right, the only change that was needed is: Defining some helper functions as I did for @Shimuuar With this concrete example on the screen, can you tell me how And that is what I keep trying to say to you, If I wanted to go the other way around and try to define I really hope with this example we can agree to agree. PS. This is a way to add askRandom :: (Random a, RandomGenM g r m, MonadReader g m) => m a
askRandom = ask >>= randomM
randomWithRejectionN ::
(Random a, RandomGen g, Monad m) => Int -> (a -> m Bool) -> g -> m ([a], g)
randomWithRejectionN n f g =
fmap swap $ runWriterT $ execStateGenT g $ runReaderT $ evalStateT go 0
where
go = do
i <- get
when (i < n) $ do
v <- lift $ askRandom
reject <- lift $ lift $ lift $ lift $ f v
unless reject $ do
lift (lift (tell [v]))
put (i + 1)
go CC @chessai hopefully this demystifies the relation of |
Well, From my point of view, once idontgetoutmuch#169 lands into this PR (which is really couple of keystrokes away, Levenshtein distance is 8), we are good to merge. |
But example above is strawman of mtl. Its central idea is to rely on type classes to dispatch function. So randomWithRejectionN ::
(Random a, MonadRandom g m) => Int -> (a -> m Bool) -> m [a]
randomWithRejectionN n f g = replicateM n $ go n
where
go 0 = return []
go n = do
x <- randomM
reject <- f v
if reject then go (n-1) else (x:) <$> go (n-1) PRNG state is abstracted away, no needless complication from gratuitous Writer/State monad. I think that StatefulGen approach will have vaguely similar structure. I still maintain that both approaches have similar expressive power. P.S. While this exchange is quite interesting it shouldn't affect merge of this PR I think |
@Bodigrim, I think you are missing the point here. This question came up already a few times and it has to do with the fact that this PR provides @Shimuuar looks like you are missing the point as well. But whatever, I am not gonna waste my time anymore on trying to convince you. Putting all that nonsense behind us we are almost ready to get this PR merged with a release that will follow right after. Here is a summary of what's left to do:
Submit a bunch of precrafted PRs to a few affected repositories (QuickCheck, MonadRandom, uuid, ....) |
This is what the Haddock for -- | Returns an 'Int' that is uniformly distributed over the range returned by
-- 'genRange' (including both end points), and a new generator. Using 'next'
-- is inefficient as all operations go via 'Integer'. See
-- [here](https://alexey.kuleshevi.ch/blog/2019/12/21/random-benchmarks) for
-- more details. It is thus deprecated.
next :: g -> (Int, g) It's not mentioning rejection sampling, and the explanation is short, but I think the message that |
e9eade4
to
8d51546
Compare
This patch is mostly backwards compatible. See "Breaking Changes" below for the full list of backwards incompatible changes. This patch fixes quality and performance issues, addresses additional miscellaneous issues, and introduces a monadic API. Issues addressed ================ Priority issues fixed in this patch: - Title: "The seeds generated by split are not independent" Link: haskell#25 Fixed: changed algorithm to SplitMix, which provides a robust 'split' operation - Title: "Very low throughput" Link: haskell#51 Fixed: see "Performance" below Additional issues addressed in this patch: - Title: "Add Random instances for tuples" Link: haskell#26 Addressed: added 'Uniform' instances for up to 6-tuples - Title: "Add Random instance for Natural" Link: haskell#44 Addressed: added 'UniformRange' instance for 'Natural' - Title: "incorrect distribution of randomR for floating-point numbers" Link: haskell#53 Addressed: see "Regarding floating-point numbers" below - Title: "System/Random.hs:43:1: warning: [-Wtabs]" Link: haskell#55 Fixed: no more tabs - Title: "Why does random for Float and Double produce exactly 24 or 53 bits?" Link: haskell#58 Fixed: see "Regarding floating-point numbers" below - Title: "read :: StdGen fails for strings longer than 6" Link: haskell#59 Addressed: 'StdGen' is no longer an instance of 'Read' Regarding floating-point numbers: with this patch, the relevant instances for 'Float' and 'Double' sample more bits than before but do not sample every possible representable value. The documentation now clearly spells out what this means for users. Quality (issue 25) ================== The algorithm [1] in version 1.1 of this library fails empirical PRNG tests when used to generate "split sequences" as proposed in [3]. SplitMix [2] passes the same tests. This patch changes 'StdGen' to use the SplitMix implementation provided by the splitmix package. Test batteries used: dieharder, TestU1, PractRand. [1]: P. L'Ecuyer, "Efficient and portable combined random number generators". https://doi.org/10.1145/62959.62969 [2]: G. L. Steele, D. Lea, C. H. Flood, "Fast splittable pseudorandom number generators". https://doi.org/10.1145/2714064.2660195 [3]: H. G. Schaathun, "Evaluation of splittable pseudo-random generators". https://doi.org/10.1017/S095679681500012X Performance (issue 51) ====================== The "improvement" column in the following table is a multiplier: the improvement for 'random' for type 'Float' is 1038, so this operation is 1038 times faster with this patch. | Name | Mean (1.1) | Mean (patch) | Improvement| | ----------------------- | ---------- | ------------ | ---------- | | pure/random/Float | 30 | 0.03 | 1038| | pure/random/Double | 52 | 0.03 | 1672| | pure/random/Integer | 43 | 0.33 | 131| | pure/uniform/Word | 44 | 0.03 | 1491| | pure/uniform/Int | 43 | 0.03 | 1512| | pure/uniform/Char | 17 | 0.49 | 35| | pure/uniform/Bool | 18 | 0.03 | 618| API changes =========== StatefulGen ----------- This patch adds a class 'StatefulGen': -- | 'StatefulGen' is an interface to monadic pseudo-random number generators. class Monad m => StatefulGen g m where uniformWord32 :: g -> m Word32 -- default implementation in terms of uniformWord64 uniformWord64 :: g -> m Word64 -- default implementation in terms of uniformWord32 -- plus methods for other word sizes and for byte strings -- all have default implementations so the MINIMAL pragma holds In 'StatefulGen g m', 'g' is the type of the generator and 'm' the underlying monad. Four 'StatefulGen' instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail. FrozenGen --------- This patch also introduces a class 'FrozenGen': -- | 'FrozenGen' is designed for stateful pseudo-random number generators -- that can be saved as and restored from an immutable data type. class StatefulGen (MutableGen f m) m => FrozenGen f m where type MutableGen f m = (g :: Type) | g -> f freezeGen :: MutableGen f m -> m f thawGen :: f -> m (MutableGen f m) 'f' is the type of the generator's state "at rest" and 'm' the underlying monad. 'MutableGen' is defined as an injective type family via 'g -> f' so for any generator 'g', the type 'f' of its at-rest state is well-defined. Both 'StatefulGen' and 'FrozenGen' are generic enough to accommodate, for example, the 'Gen' type from the 'mwc-random' package, which itself abstracts over the underlying primitive monad and state token. The documentation shows the full instances. 'Uniform' and 'UniformRange' ---------------------------- The 'Random' typeclass has conceptually been split into 'Uniform' and 'UniformRange'. The 'Random' typeclass is still included for backwards compatibility. 'Uniform' is for types where it is possible to sample from the type's entire domain; 'UniformRange' is for types where one can sample from a specified range. Breaking Changes ================ This patch introduces these breaking changes: * requires 'base >= 4.8' (GHC-7.10) * 'StdGen' is no longer an instance of 'Read' * 'randomIO' and 'randomRIO' where extracted from the 'Random' class into separate functions In addition, there may be import clashes with new functions, e.g. 'uniform' and 'uniformR'. Deprecations ============ This patch introduces 'genWord64', 'genWord32' and similar methods to the 'RandomGen' class. The significantly slower method 'next' and its companion 'genRange' are now deprecated. Co-authored-by: Alexey Kuleshevich <[email protected]> Co-authored-by: idontgetoutmuch <[email protected]> Co-authored-by: Leonhard Markert <[email protected]>
f751492
to
b45bd28
Compare
Context
Following @lehins' performance analysis of Haskell pseudo-random number libraries and the ensuing discussion, @lehins, @idontgetoutmuch and @curiousleo with help from @Shimuuar set out to improve
random
as both an interface for and implementation of a pseudo-random number generator for Haskell.Our goals were to fix #25 (filed in 2015) and #51 (filed in 2018), see "Quality" and "Performance" below.
In the process of tackling these two issues, we addressed a number of other issues too (see "Other issues addressed" below) and added a monadic interface to the library so monadic pseudo-random number generators can be used interchangeably with
random
, see "API changes" below.This PR is the result of that effort. The changes are considerable. To signal this, we propose to release this as version 1.2 (the previous released version is 1.1, from 2014).
However, the API changes are generally backwards-compatible, see "Compatibility" below.
Quality (#25)
We created an environment for running statistical pseudo-random number generator tests, tested
random
v1.1 andsplitmix
using dieharder, TestU01, PractRand and other test suites and recorded the results.The results clearly show that the
split
operation inrandom
v1.1 produces pseudo-random number generators which are correlated, corroborating #25. Thesplit
operation insplitmix
showed no weakness in our tests.As a result, we replaced the pseudo-random number generator implementation in
random
by the one provided bysplitmix
.Performance (#51)
@lehins' performance analysis has the data for
random
v1.1. It is slow, and using faster pseudo-random number generators viarandom
v1.1 makes them slow.By switching to
splitmix
and improving the API, this PR speeds up pseudo-random number generation withrandom
by one to three orders of magnitude, depending on the number type. See Benchmarks for details.API changes
StatefulGen
The major API addition in this PR is the definition of a new class
StatefulGen
:Conceptually, in
StatefulGen g m
,g
is the type of the generator, andm
the underlying monad.This definition is generic enough to accommodate, for example, the
Gen
type frommwc-random
, which itself abstracts over the underlying primitive monad and state token. This is the full instance declaration (provided here as an example - this instance is not part ofrandom
asrandom
does not depend onmwc-random
):Four
StatefulGen
instances ("monadic adapters") are provided for pure generators to enable their use in monadic code. The documentation describes them in detail.Uniform
andUniformRange
The
Random
typeclass has conceptually been split intoUniform
andUniformRange
. TheRandom
typeclass is still included for backwards compatibility.Uniform
is for types where it is possible to sample from the type's entire domain;UniformRange
is for types where one can sample from a specified range.Changes left out
There were changes we considered and decided against including in this PR.
Some pseudo-random number generators are splittable, others are not. A good way of communicating this is to have a separate typeclass,
Splittable
, say, which only splittable generators implement. After long discussions (see this issue and this PR), we decided against addingSplittable
: the interface changes would either have been backwards-incompatible or too complex. For now,split
stays part of theRandomGen
typeclass. The new documentation suggests thatsplit
should callerror
if the generator is not splittable.Due to floating point rounding, generating a floating point number in a range can yield surprising results. There are techniques to generate floating point numbers in a range with actual guarantees, but they are more complex and likely slower than the naive methods, so we decided to postpone this particular issue.
Ranges on the real number line can be inclusive or exclusive in the lower and upper bound. We considered API designs that would allow users to communicate precisely what kind of range they wanted to generate. This is particularly relevant for floating point numbers. However, we found that such an API would make more sense in conjunction with an improved method for generating floating point numbers, so we postponed this too.
Compatibility
We strove to make changes backwards compatible where possible and desirable.
The following changes may break existing packages:
uniform
anduniformR
randomIO
andrandomRIO
where extracted outside ofRandom
class as separate functions, which means some packages need to adjust how they are importedStdGen
is no longer an instance ofRead
base >= 4.10
(GHC-8.2)In addition,
genRange
andnext
have been deprecated.We have built all of Stackage against the code in this PR, and confirmed that no other build breakages occurred.
For more details, see this comment and the "Compatibility" section in the docs.
Other issues addressed
This PR also addresses #26, #44, #53, #55, #58 and #59, see Issues Addressed for details.