If you do statistics you know of erf
and erfc
; if you work in OCaml you surely miss them. It is not very difficult to port the canonical implementation given by Numerical Recipes (which I won't show and not just for licensing reasons); if you Google for a coefficient you'll see that this approximation is, indeed, ubiquitous. There exists a better approximation in the literature, one that is more than 40 years old. Since I find it inconceivable that it is not more widely used (again, Google for a coefficient), I present a straightforward implementation of it.
This approximation boasts a relative error of 10-19.5 in the range |x| ≤ 0.46875 = 15/32, of 10-18.59 in the range 0.46875 ≤ |x| ≤ 4, and of 10-18.24 in the range 4 ≤ |x|. The only difficulty is to accurately transcribe the coefficient tables into code; I've double-checked them and ran some tests to assess the proper functioning of the code:
let r0 = [| 3.20937_75891_38469_47256_2e+03, 2.84423_68334_39170_62227_3e+03; 3.77485_23768_53020_20813_7e+02, 1.28261_65260_77372_27564_5e+03; 1.13864_15415_10501_55649_5e+02, 2.44024_63793_44441_73305_6e+02; 3.16112_37438_70565_59694_7e+00, 2.36012_90952_34412_09349_9e+01; 1.85777_70618_46031_52673_0e-01, 1.00000_00000_00000_00000_0e+00; |] let r1 = [| 1.23033_93547_97997_25272e+03, 1.23033_93548_03749_42043e+03; 2.05107_83778_26071_46532e+03, 3.43936_76741_43721_63696e+03; 1.71204_76126_34070_58314e+03, 4.36261_90901_43247_15820e+03; 8.81952_22124_17690_90411e+02, 3.29079_92357_33459_62678e+03; 2.98635_13819_74001_31132e+02, 1.62138_95745_66690_18874e+03; 6.61191_90637_14162_94775e+01, 5.37181_10186_20098_57509e+02; 8.88314_97943_88375_94118e+00, 1.17693_95089_13124_99305e+02; 5.64188_49698_86700_89180e-01, 1.57449_26110_70983_47253e+01; 2.15311_53547_44038_46343e-08, 1.00000_00000_00000_00000e+00; |] let r2 = [| -6.58749_16152_98378_03157e-04, 2.33520_49762_68691_85443e-03; -1.60837_85148_74227_66278e-02, 6.05183_41312_44131_91178e-02; -1.25781_72611_12292_46204e-01, 5.27905_10295_14284_12248e-01; -3.60344_89994_98044_39429e-01, 1.87295_28499_23460_47209e+00; -3.05326_63496_12323_44035e-01, 2.56852_01922_89822_42072e+00; -1.63153_87137_30209_78498e-02, 1.00000_00000_00000_00000e+00; |]
The tables are laid out as pairs of coefficients (pi, qi) for the numerator and denominator polynomials, respectively, in ascending order of monomial power (that is, the independent coefficients are in the first position of the arrays). The rational functions can be evaluated with a suitable variation of Horner's schema:
let horner2 r x = let n = Array.length r in let s = ref 0. and t = ref 0. in for i = n - 1 downto 0 do let p, q = Array.unsafe_get r i in s := !s *. x +. p; t := !t *. x +. q done; !s /. !t
For clarity I haven't inlined Horner's recursion in the following code; this is not very difficult to do, so I've left it as an exercise for the implementor:
let iqpi = 5.64189_58354_77562_86948_1e-01 let erfc x = let z = abs_float x in let z2 = z *. z in let y = if z < 0.46875 then 1. -. z *. horner2 r0 z2 else if z < 4. then exp (-. z2) *. horner2 r1 z else let z' = 1. /. z in let z'2 = z' *. z' in exp (-. z2) *. z' *. (iqpi +. z'2 *. horner2 r2 z'2) in if x < 0. then 2. -. y else y
As explained in the paper, erfc x
must be everywhere equal to 1. -. erf x
:
let erf x = 1. -. erfc x
If you have Mathematica™ lying around, you might want to output a list of points to test the accuracy of the approximation:
let mma f x0 x1 n = let buf = Buffer.create 256 in Buffer.add_string buf "l = Map[N[\ (1 - 2 BitAnd[1, BitShiftRight[#, 63]])\ 2^(BitAnd[16^^7ff, BitShiftRight[#, 52]] - 1023)\ (1 + 2^-52 BitAnd[16^^fffffffffffff, #]),\ $MachinePrecision]&, #, {2}]&@{"; for i = 0 to n do let x = (x0 *. float (n - i) +. x1 *. float i) /. float n in let y = f x in if i != 0 then Buffer.add_char buf ','; Printf.bprintf buf "{16^^%Lx,16^^%Lx}" (Int64.bits_of_float x) (Int64.bits_of_float y) done; Buffer.add_string buf "};"; Buffer.contents buf
In order to exactly marshal the IEEE 754 double values I convert them to rational numbers equivalent to machine floats on Mathematica's end. My informal tests show that around the cut-off points Mathematica evaluates erf
to the same exact values this code does (that is, the absolute error is zero). When evaluated to sufficient precision, the maximum absolute error among 5,000 equispaced points in the interval [3.99, 4.01] is 5.5437×10-17.
To mathematical libraries implementors everywhere, I urge you: please steal this code!
References
- W. J. Cody. Rational Chebyshev Approximations for the Error Function, Mathematics of Computation Vol. 23, No. 107 (Jul., 1969), pp. 631-63. Online.
1 comment:
Nice post!
Post a Comment