Well, Jeff Atwood has weighed in the discussion, with a clear explanation for "the rest of them". Seeing Batchelder's Python code, I realized I didn't do as well as I could with my OCaml version. Specifically, I didn't look hard enough for a simple way to split the string between decimal and non-decimal parts. The idea is not to delimit parts by looking for transitions between one kind and the other, but to use one of the kinds as the delimiter itself:
let split = let delim = Str.regexp "[0-9]+" in List.map (function Str.Delim s -> s | Str.Text s -> s) % Str.full_split delim
As is to be expected with OCaml, the result of full_split
is rather odd, in that it uses an ADT to distinguish between text and delimiters. I use a forgetful mapping to coalesce all the parts into a plain string list.
Another point where the comparison could be more discriminating is when comparing numbers with leading zeros. While it is obvious that "010" > "08", it is also to be expected that "010" > "10". The fix for this is to compare first by value, then by length:
let cmp_alnum x y = try let m, n = int_of_string x, int_of_string y in let c = Pervasives.compare m n in if c != 0 then c else Pervasives.compare (String.length x) (String.length y) with Failure _ -> Pervasives.compare x y
There's always opportunity for polish…
No comments:
Post a Comment