New blog post.
athas committed Apr 18, 2024
1 parent 7ba9557 commit 8cd2ef7
Showing 1 changed file with 174 additions and 0 deletions.
174 changes: 174 additions & 0 deletions blog/2024-04-18-random-numbers-with-uniqueness-types.md
@@ -0,0 +1,174 @@
---
title: Avoiding RNG bugs through uniqueness types
description: A programmer complained about a source of bugs, and this is how I think it could be fixed.
---

While visiting [Chalmers](https://www.chalmers.se/en/) to give a guest
lecture in their course on Parallel Functional Programming, I met a
student who had implemented a ray tracer in Futhark (I didn't catch
the name, please write me if you are that student and I'll put it
here). He mentioned that he'd had a bunch of bugs related to random
number generation (RNG), which is admittedly a bit awkward in Futhark.
The main challenge is that you manually need to maintain the RNG
*state*, since Futhark does not allow side effects.

To illustrate the problem, let us define a small set of functions for
generating random numbers. First, we define a type for storing the RNG
state:

```Futhark
type rng = u32
```

To initialise the state from a seed, we do a few rounds of a hash
function I [found on Stack
Overflow](https://stackoverflow.com/questions/664014/what-integer-hash-function-are-good-that-accepts-an-integer-hash-key/12996028#12996028):

```Futhark
def mk_rng (seed: i32) : rng =
  let x = u32.i32 seed
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x)
  in x
```

For the random number generation itself, for simplicity we will just
do a basic [linear congruential
generator](https://en.wikipedia.org/wiki/Linear_congruential_generator):

```Futhark
def rand (l: i32) (x: rng) : (rng, i32) =
  let a = 48271
  let m = 2147483647
  let rng' = (a * x) % m
  in (rng', i32.u32 (rng' % u32.i32 l))
```

Note how the `rand` function returns both a new state and a randomly
generated number in the range from 0 (inclusive) to `l` (exclusive).
We can use `rand` like this:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng', x) = rand 6 rng
  let (rng'', y) = rand 6 rng'
  in x + y
```

This works fine. However, it is easy to use an old RNG state by
accident:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng', x) = rand 6 rng
  let (rng'', y) = rand 6 rng
  in x + y
```

Note how I typed `rng` instead of `rng'` in the last call to `rand`.
In this case the compiler will complain about `rng'` being unused, but
it's not hard to imagine a larger program where `rng'` is indeed used
for something else later. Especially when refactoring, it is easy to
accidentally reuse the same RNG state twice, which will lead to
random numbers being correlated. For a ray tracer, this can result
in fun visual artefacts, but for other programs it may just result in
a number being wrong, which is both boring and tedious to debug.
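
To make the failure mode concrete, here is a small sketch. The name
`same_twice` is made up for illustration, and `mk_rng` and `rand` are
repeated from above only so that the block compiles on its own. Since
`rand` is a pure function, passing it the same state twice must
produce the same number twice:

```Futhark
type rng = u32

-- Repeated from above so that this block stands on its own.
def mk_rng (seed: i32) : rng =
  let x = u32.i32 seed
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x) * 0x45d9f3b
  let x = ((x >> 16) ^ x)
  in x

def rand (l: i32) (x: rng) : (rng, i32) =
  let a = 48271u32
  let m = 2147483647u32
  let rng' = (a * x) % m
  in (rng', i32.u32 (rng' % u32.i32 l))

-- Hypothetical helper: both calls receive the same state, so the
-- two "dice rolls" are always equal.
def same_twice (seed: i32) : bool =
  let rng = mk_rng seed
  let (_, x) = rand 6 rng
  let (_, y) = rand 6 rng
  in x == y
```

This returns `true` for every seed, which is the extreme case of
"correlated": the second roll carries no new randomness at all.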

In an imperative language, generating a random number mutates the
state, so it cannot be reused. In languages like Haskell, you can use
a state monad to simulate the same thing, and similarly avoid reuse.
In Futhark, it turns out you can imitate a form of *affine types* using
Futhark's [slightly obscure support for uniqueness
types](2022-06-13-uniqueness-types.html). Affine types allow you to
express that a value can be used *at most once* (whereas linear types
allow you to require *exactly once*, which is why they are useful for
resource management, as then the last use must be a cleanup function).
By constructing an RNG library such that number generation *consumes*
a state and *produces* a new one, we can ensure that each state is
used at most once.

Although uniqueness types are really designed for dealing with arrays,
they can also be used for abstract types via [the module
system](https://futhark-lang.org/blog/2017-01-25-futhark-module-system.html).
So first we define a module type that describes the RNG interface:

```Futhark
module type rand = {
  type rng
  val mk_rng : i32 -> rng
  val rand : i32 -> *rng -> (rng, i32)
}
```

Note the asterisk on the second parameter type of `rand`: it denotes a
consuming parameter, meaning the `rng` we pass in may not be used
again.
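
For comparison, this is roughly what a consuming parameter looks like
in the setting uniqueness types were designed for, namely in-place
array updates. It is only a sketch, and the names `bump_first` and
`caller` are made up for illustration:

```Futhark
-- The `*` on the parameter lets the function update `xs` in place;
-- the `*` on the result says that it does not alias anything else.
def bump_first [n] (xs: *[n]i32) : *[n]i32 =
  xs with [0] = xs[0] + 1

def caller () : [3]i32 =
  let xs = replicate 3 0i32
  let ys = bump_first xs
  -- Mentioning `xs` from here on would be a type error, because it
  -- was consumed by the call above.
  in ys
```

The RNG interface borrows only the "may not be used again" half of
this mechanism; nothing is actually updated in place.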

We implement the module using the same code as above:

```Futhark
module rand : rand = {
  type rng = u32
  def mk_rng (seed: i32) : rng =
    let x = u32.i32 seed
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x)
    in x
  def rand (l: i32) (x: rng) : (rng, i32) =
    let a = 48271u32
    let m = 2147483647u32
    let rng' = (a * x) % m
    in (rng', i32.u32 (rng' % u32.i32 l))
}
```

We can only access the functions through the types defined in the
module type, which means `rand` will consume its `rng` argument,
despite the actual function not doing anything odd. Code like this
will now work:

```Futhark
def use (seed: i32) =
  let rng = rand.mk_rng seed
  let (rng', x) = rand.rand 6 rng
  let (rng'', y) = rand.rand 6 rng'
  in x + y
```

But if we try to reuse an RNG state, the type checker will tell us:

```Futhark
def abuse (seed: i32) =
  let rng = rand.mk_rng seed
  let (rng', x) = rand.rand 6 rng
  let (rng'', y) = rand.rand 6 rng
  in x + y
```

```
Error: Using variable "rng", but this was consumed at 3:31-34.
```

If we *want* to duplicate an RNG state, we can still do so, as the
`copy` prelude function can copy anything you can put in an array.
This is explicit, and so unlikely to lead to unintended behaviour.
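
A deliberate duplication might look like the sketch below. It assumes,
as stated above, that `copy` accepts the abstract `rng` type. The name
`replay` is made up for illustration, and the module type and module
are repeated from above only so that the block compiles on its own:

```Futhark
-- Interface and implementation repeated from above.
module type rand = {
  type rng
  val mk_rng : i32 -> rng
  val rand : i32 -> *rng -> (rng, i32)
}

module rand : rand = {
  type rng = u32
  def mk_rng (seed: i32) : rng =
    let x = u32.i32 seed
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x) * 0x45d9f3b
    let x = ((x >> 16) ^ x)
    in x
  def rand (l: i32) (x: rng) : (rng, i32) =
    let a = 48271u32
    let m = 2147483647u32
    let rng' = (a * x) % m
    in (rng', i32.u32 (rng' % u32.i32 l))
}

-- Hypothetical helper: deliberately roll twice from the same state.
def replay (seed: i32) : (i32, i32) =
  let rng = rand.mk_rng seed
  -- `copy` returns a fresh value, so consuming `rng` below leaves
  -- `rng_copy` usable. The copy must be made before `rng` is consumed.
  let rng_copy = copy rng
  let (_, x) = rand.rand 6 rng
  let (_, y) = rand.rand 6 rng_copy
  in (x, y)
```

Both components of the result are equal by construction, but here the
reuse is visible in the source rather than hidden in a mistyped name.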

The most widely used Futhark library for random numbers is
[cpprandom](https://github.com/diku-dk/cpprandom), which does not use
this approach, but I'm wondering whether it would be better if it did.
However, in practice, bugs like this are not too difficult to avoid,
if we simply use shadowing to make the old RNG states inaccessible:

```Futhark
def use (seed: i32) =
  let rng = mk_rng seed
  let (rng, x) = rand 6 rng
  let (rng, y) = rand 6 rng
  in x + y
```
