I'm rereading this paper by Knuth and Yao [1], which could be called "How to throw a die when all you have is a penny", or something. Its premise is, how can you generate a random variable with arbitrary distribution when all you have is a perfectly fair binary random variate? It begins with this blindingly eye-catching diagram:
Every round node is a coin-toss (a bit-flip), and transitions are taken depending on the value obtained, until one of the dice nodes is reached. What's the expected number of flips needed to get a die? As explained in the paper, it's T = 3 + 2/8⋅T, which solves to T = 4: three flips decide among eight equiprobable outcomes, six of which are faces and two of which restart the machine. After these preliminaries, they show us another diagram, but not before observing that "[t]he reader may have noticed a source of inefficiency in [the figure]: When three equal values are obtained, we could have used this common result as the value of the next flip, since 000 and 111 are equally probable."
I certainly hadn't; the paper states that the average number of bit-flips needed is 11/3, and that this is minimal. The rest of the paper is devoted to proving constructively that this is so, by giving an algorithm that minimizes the number of bit-flips required to compute an arbitrary random variate. What it doesn't show is how to arrive at this number of flips. Applying the method cited above: the first flip selects one half of the machine, and after it each further pair of flips either yields a face or, with probability 1/4, returns to the same state; hence T = 1 + U and U = 2 + 1/4⋅U, giving U = 8/3 and T = 11/3.
All fine and dandy but, why am I writing this? For one, to show you pretty finite state machines (actually, Markov chains) that generate dice throws! You might think, what a roundabout way to do it, and counter with:
let die () = 1 + Random.int 6
One problem I see is that, as things are usually implemented, fair bits are easier to find in the wild than fair dice; or rather, the usual implementations of integer random variates tend to take the lower bits of a linear congruential generator, which is a rather lousy way to do it. It is better to do:
let bit () = Random.float 1. < 0.5
which, as it takes into account the more significant bits first, is more apt to be fair. Another good source of random bits is a linear feedback shift register; the table in [2, p. 376] shows that x^31 + x^3 + 1 is a primitive polynomial modulo 2, which gives the following generator with period 2^31 − 1:
let seed = ref 0x33c6ef37
let set_seed s = seed := if s = 0 then 0x33c6ef37 else s
let rand_bit () =
  let b = !seed land 1 in
  seed := (!seed lsr 1) lxor (-b land 0x40000004);
  b
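As a quick sanity check, the stream coming out of this register should be close to balanced. Here is a sketch of such a check (repeating the generator verbatim so it runs standalone) that measures the fraction of ones over a run of draws; `bit_bias` is my name for it, not anything standard:

```ocaml
(* Repeated from above so the sketch runs standalone. *)
let seed = ref 0x33c6ef37
let rand_bit () =
  let b = !seed land 1 in
  seed := (!seed lsr 1) lxor (-b land 0x40000004);
  b

(* Fraction of ones among n successive bits; for a maximal-length
   LFSR over a long run this should hover very near 0.5. *)
let bit_bias n =
  let ones = ref 0 in
  for _i = 1 to n do ones := !ones + rand_bit () done;
  float !ones /. float n
```

Over a million draws, `bit_bias` should come out within a small fraction of a percent of 0.5.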
Of course, if you can control your source of bits this disquisition is probably moot; if not, it might be something you'd want to account for. Anyway, back to the bones. There are many ways to write a FSM, but most involve using a table. Instead of falling so low, I'll show you a little functional magic. A state is a finite map from inputs to edges, where an edge either halts with a final value or continues to another state:
type ('a, 'b) state = 'a -> ('a, 'b) edge
and ('a, 'b) edge = Halt of 'b | Next of ('a, 'b) state
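To get a feel for the type before tackling the die, here is a tiny machine in the same mold (my own example, not from the paper): von Neumann's trick for extracting a fair bit from a possibly biased coin. Flip twice; if the flips differ, halt with the first value, otherwise start over:

```ocaml
(* The type from above, repeated so the example runs standalone. *)
type ('a, 'b) state = 'a -> ('a, 'b) edge
and ('a, 'b) edge = Halt of 'b | Next of ('a, 'b) state

(* Flip twice: if the flips differ, the first one is the fair bit;
   if they agree, discard both and start over. *)
let fair_bit : (bool, bool) state =
  let rec start = fun b -> Next (second b)
  and second b = fun b' -> if b = b' then Next start else Halt b
  in start
```

Note how `second` closes over the first flip, so no explicit state table is needed.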
I'll represent the second machine as a value of type (bool, int) state, with maximally-shared states via value recursion. I number the states from left to right, and from top to bottom:
let die_fsm : (bool, int) state =
  let rec st0 = function false -> Next st1 | true -> Next st2
  and st1 = function false -> Next st3 | true -> Next st4
  and st2 = function false -> Next st5 | true -> Next st6
  and st3 = function false -> Next st1 | true -> Halt 1
  and st4 = function false -> Halt 2 | true -> Halt 3
  and st5 = function false -> Halt 4 | true -> Halt 5
  and st6 = function false -> Halt 6 | true -> Next st2
  in st0
The state machine is run simply via the following function:
let rec run_fsm st input = match st (input ()) with
| Next st -> run_fsm st input
| Halt x  -> x
That's it, really! I can throw my die like so:
let die () = run_fsm die_fsm (fun () -> rand_bit () = 1)
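As an empirical check on the 11/3 figure, we can instrument the input function to count flips. This sketch repeats the machine and driver from above so it runs standalone, and takes its bits from OCaml's Random.bool rather than the LFSR; `mean_flips` is my name for the check:

```ocaml
(* The type, machine, and driver from above, repeated so the
   sketch runs standalone. *)
type ('a, 'b) state = 'a -> ('a, 'b) edge
and ('a, 'b) edge = Halt of 'b | Next of ('a, 'b) state

let die_fsm : (bool, int) state =
  let rec st0 = function false -> Next st1 | true -> Next st2
  and st1 = function false -> Next st3 | true -> Next st4
  and st2 = function false -> Next st5 | true -> Next st6
  and st3 = function false -> Next st1 | true -> Halt 1
  and st4 = function false -> Halt 2 | true -> Halt 3
  and st5 = function false -> Halt 4 | true -> Halt 5
  and st6 = function false -> Halt 6 | true -> Next st2
  in st0

let rec run_fsm st input = match st (input ()) with
| Next st -> run_fsm st input
| Halt x  -> x

(* Average number of flips per throw over n throws; it should
   approach 11/3 = 3.666… as n grows. *)
let mean_flips n =
  let flips = ref 0 in
  let input () = incr flips; Random.bool () in
  for _i = 1 to n do ignore (run_fsm die_fsm input) done;
  float !flips /. float n
```

A run of `mean_flips 1_000_000` should land very close to 3.6667.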
Let's see how fair our die is. For that, I'll compute a histogram of recorded values in an experiment involving n throws:
let test_die n =
  let hist = Array.make 6 0 in
  for _i = 1 to n do
    let d = die () - 1 in
    hist.(d) <- hist.(d) + 1
  done;
  Array.map (fun s -> float s /. float n) hist
You can check that the die is fair:
# test_die 100_000 ;;
- : float array = [|0.16379; 0.16713; 0.16702; 0.16768; 0.1661; 0.16828|]
And of course, let's make sure I haven't made a silly mistake:
# Array.fold_left (+.) 0. (test_die 100_000) ;;
- : float = 1.
Now, shoot!
References
1. Knuth, D. E. and Yao, A. C., "The Complexity of Nonuniform Random Number Generation". Reprinted in Donald E. Knuth, Selected Papers on Analysis of Algorithms (Stanford: CSLI Lecture Notes 102, 2000).
2. Bruce Schneier, Applied Cryptography, Second Edition: Protocols, Algorithms, and Source Code in C (New York: John Wiley & Sons, Inc., 1996).