I'm rereading this paper by Knuth and Yao [1], which could be called "How to throw a die when all you have is a penny", or something. Its premise is, how can you generate a random variable with arbitrary distribution when all you have is a perfectly fair *binary* random variate? It begins with this blindingly eye-catching diagram:

Every round node is a coin-toss (a bit-flip), and transitions are taken depending on the value obtained, until one of the dice nodes are reached. What's the expected number of flips needed to get a die? As explained in the paper, it's `T` = 3 + 2/8⋅`T`, which solves for `T` = 4. After this preliminaries, they show us *another* diagram, but not before observing that [t]he reader may have noticed a source of inefficiency in [the figure]: When three equal values are obtained, we could have used this common result as the value of the next flip, since 000 and 111 are equally probable.

I certainly hadn't; the paper states that the average number of bit-flips needed is 11/3, and that that is minimal. The rest of the paper is devoted to prove *constructively* that this is so, by giving an algorithm that minimizes the number of bit-flips required to compute an arbitrary random variate. What it doesn't show is how to arrive at this number of flips. Applying the method cited above, we have `T` = 1 + `U` and `U` = 2 + 1/4⋅`U`, for `U` = 8/3, `T` = 11/3.

All fine and dandy but, why am I writing this? For one, to show you pretty finite state machines (actually, Markov chains) that generate dice throws! You might think, what a roundabout way to do it, and counter with:

let die () = 1 + Random.int 6

One problem I see is that, as usually implemented, it is easier to find in the wild fairer bits than dice; or rather, usual implementations of integer random variates tend to take the lower bits from a linear congruential generator, which is a rather lousy way to do it. It is better to do:

let bit () = Random.float 1. < 0.5

which, as it takes into account the more significant bits first, is more apt to be fairer. Another good source of random bits is a linear feedback shift register; the table in [2, 376] shows that `x`^{31} + `x`^{3} + 1 is a primitive polynomial, modulo 2, which gives the following generator with period 2^{31} - 1:

let seed = ref 0x33c6ef37 let set_seed s = seed := if s == 0 then 0x33c6ef37 else s let rand_bit () = let b = !seed land 1 in seed := (!seed lsr 1) lxor (-b land 0x40000004); b

Of course, if you can control your source of bits this disquisition is probably moot; if not, it might be something you'd want to account for. Anyway, back to the bones. There are many ways to write a FSM, but most involve using a table. Instead of falling so low, I'll show you a little functional magic. A state is a finite map from an

type ('a, 'b) state = 'a -> ('a, 'b) edge and ('a, 'b) edge = Halt of 'b | Next of ('a, 'b) state

I'll represent the second machine as a value of type `(bool, int) state`

, with maximally-shared states via value recursion. I number the states from left to right, and from top to bottom:

let die_fsm : (bool, int) state = let rec st0 = function false -> Next st1 | true -> Next st2 and st1 = function false -> Next st3 | true -> Next st4 and st2 = function false -> Next st5 | true -> Next st6 and st3 = function false -> Next st1 | true -> Halt 1 and st4 = function false -> Halt 2 | true -> Halt 3 and st5 = function false -> Halt 4 | true -> Halt 5 and st6 = function false -> Halt 6 | true -> Next st2 in st0

The state machine is run simply via the following function:

let rec run_fsm st input = match st (input ()) with | Next st -> run_fsm st input | Halt x -> x

That's it, really! I can throw my die like so:

let die () = run_fsm die_fsm (fun () -> rand_bit () == 1)

Let's see how fair our die is. For that, I'll compute a histogram of recorded values in an experiment involving `n` throws:

let test_die n = let hist = Array.make 6 0 in for i = 1 to n do let d = die () - 1 in hist.(d) <- hist.(d) + 1 done; Array.map (fun s -> float s /. float n) hist

You can check that the die is fair:

# test_die 100_000 ;; - : float array = [|0.16379; 0.16713; 0.16702; 0.16768; 0.1661; 0.16828|]

And of course, let's make sure I haven't made a silly mistake:

# Array.fold_left (+.) 0. (test_die 100_000) ;; - : float = 1.

Now, shoot!

#### References

- Knuth, D.E.K. and Yao, A. C.
The Complexity of Nonuniform Random Number Generation

. Reprinted in Donald E. Knuth, Selected Papers on Analysis of Algorithms (Stanford: CSLI lecture notes**102**, 2000). - Bruce Schneier, Applied Cryptography Second Edition: protocols, algorithms, and source code in C (New York: John Wiley & Sons, Inc. 1996)

## No comments:

Post a Comment