Expressiveness or Performance?

Don't pick just one!

Over reddit, a lively discussion on which language, Python or Ruby, is faster. In the comments, the usual suspects and the expected opinions. Except for Slava Pestov's (of Factor and jEdit fame), whose pointedness is on par with his stature as a programmer. I concur wholeheartedly with his opinion; in particular, his observation that

Slow language implementations are fundamentally less expressive because they give you less freedom in how you implement your code. You have to optimize […] a slow language is by definition not very expressive because it constrains your coding style […]

is right on the money. To which I'd like to add: when out shopping for a programming language, caveat emptor and choose a compiled, native one. The usual argument about turnaround time and programmer productivity is bogus: after all, machines were getting equally faster both for dumb interpreters and multi-pass compilers. And if what bogs you down are the compile-time errors, an interpreted language will not make you any more productive: as much as you'd like to, you cannot sink the costs of debugging and failing at execution time.

A More Elegant Necklace

In my previous post, I showed how to use rplacd to create a circular list with the elements of a given list. I used a "roving pointer", a reference to the last cell added to keep track of the next element to insert. There is a solution a little more elegant using a fold. The idea is to use the result of the fold as an accumulating parameter for the last element. This forces the fold to be a fold_left, and the resulting code is:

let necklace = function
| [] -> invalid_arg "necklace"
| h :: t ->
  let rec r = h :: r in
  List.tl (List.fold_left (fun p h ->
    let s = h :: r in
    ignore (rplacd p s);
    s) r t)

On return from the fold we get the last element of the list. Since it is circular, the element past the last one is the first, so we only need to take the tl. Of course, the alternative is to ignore the result of the fold_left, and simply return r.


The Unsafe Clasp

Over the OCaml Group, Till Varoquaux asks if an arbitrary list can be tied over itself strictly, like you can do with recursive value definitions:

let rec arpeggio = 1 :: 2 :: 3 :: 2 :: arpeggio

In other words, a function necklace is sought such that:

⟨∀ l : |l| > 0 : ⟨∀ k ≥ 0 :: take k (necklace l) = take k (drop |l| (necklace l))⟩⟩

As Alain Frisch points out, it cannot be done safely, as only values of fixed shape (i.e., statically known) can be built recursively. However, with a parsimonious amount of unsafeness it can be done quite simply. I've written before of reversing trees in-place. The only operation needed is rplacd, that replaces the tail of a list with another one. With that, the only care needed is to copy the original list before closing the loop:

let necklace = function
| [] -> invalid_arg "necklace"
| h :: t ->
  let rec r = h :: r in 
  let p = ref r in
  List.iter (fun h ->
    let s = h :: r in
    ignore (rplacd !p s);
    p := s) t;

Given a nonempty list, the code begins by building a singleton circular list on r. At each point, the code keeps a reference to the "tail" (i.e., the last node) of the circular list on p. With each further element of the original list, a new node pointing back to the beginning of the circular list is built on s, so that it both replaces and becomes the current tail.

Since at each step the cons cell is fresh, the original list is not modified and the circular list is built from scratch, cell by cell, always pointing back to itself. Hence the unsafe operation is completely hidden, and the code is safe.


Counting Working Days (mostly for Profit)

The week seems such an orderly thing. Calendrical calculations are littered with conditionals and corner cases, and the little humble week with its staid repetition of seven days might appear deceptively manageable. Pun intended, since it hides a surprising difficulty when trying to count a number of working (or business) days from a given date in either direction. In the Western hemisphere, Saturdays and Sundays are almost everywhere days of rest. In any case, since every definition holds a kernel of arbitrariness in itself, in what follows I will assume a function weekday that, given a date t, returns the corresponding week day. Also, and as a sort of oracle, I have a predicate is_wday that returns true for a given date.

Aside: this might seem overcomplicated; the reason I abstract those out is that there are varying definitions for the starting day of the week. Business weeks start at Monday; with that proviso, let is_wday treat every multiple of seven as a Monday, so that

is_wday t ≡ 0 ≤ weekday t mod 7 < 5

And I say that is_wday is a sort of oracle since, by treating it as a black box, it can be made to return false not only for weekends but also for non-working holidays.

In what follows, I'll start with the naive way to skip a number of weekdays forwards or backwards, and I'll progressively refine the algorithm. The starting point is simple; I traverse the calendar from the given date, checking which of the days are work days, and accounting for them appropriately:

let wday_add tm delta =
  let i = if delta < 0 then -1 else 1 in
  let s = ref 0
  and d = ref delta in
  while !d != 0 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i
  dt_add tm !s

(The highlighted portion is the core of the algorithm; I'll massage it while keeping the rest invariant. In what follows, replace the block of code for the highlighted original.) The variable i is bound to the sign of the shift delta, so that the same loop traverses the calendar forwards or backwards. Then s is the accumulated total shift, and d is the number of weekdays yet to be accounted for. This quantity is brought to zero (accounting for direction) whenever s weekdays starting from the initial date are, in fact, working days. The last line finally applies this difference to the initial date and returns it.

This code works but is quite slow, since it carefully checks day by day if it is working or not. The first big insight is that every seven days we can account for exactly five workdays. In the limit, the ratio of workdays to weekdays approaches exactly 5/7; however, for small (that is, finite) shifts this ratio is imprecise. But with a little care we can skip ahead whole weeks:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  while is_wday (dt_add tm !s) && i * !d >= 5 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i

I kept the body of the original loop, and added another loop that, whenever there is a workday week or more left to count starting from a workday, counts whole weeks. This seems like a step backwards. It is a stepping-stone, a small rewrite that allows assuming in the inner loop that workdays are counted by fives and weekdays by seven:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  while is_wday (dt_add tm !s) && i * !d >= 5 do
    s := !s + 7 * i;
    d := !d - 5 * i

This is actually faster, and the inner loop is evidently a division in disguise: d is decremented, say q times by 5q until the result is less than 5; s is the shift correspondingly incremented by 7q.

Note: This speedup is bought at a cost, however: the code would have stepped over non-working holidays if is_wday accounted for them; now, by leaping up by fives and seven, it decidedly assumes that every working week is five days long without exception. All is not lost, though; it is possible to tweak it to account for non-working holidays later.

Now the greatest q such that d - 5q < 5 is precisely, q = ⌊d/5⌋:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  if is_wday (dt_add tm !s) && i * !d >= 5 then begin
    let q = i * !d / 5 in
    s := !s + 7 * i * q;
    d := !d - 5 * i * q

The inner loop is gone, but the loop guard must be kept to ensure that the skipping ahead by whole weeks is actually possible and valid. Note that the division involves i×d where i is d's sign; in other words the absolute value |d|. This is necessary to ensure that the division rounds to zero and the quotient is positive. Otherwise, negative (that is, towards past dates) deltas would overshoot the last week, leaving d negative and decreasing, and the loop would never end. Had the integer division built-in round-to-zero semantics all this would be, however, unnecessary. Fortunately, this is the case in virtually all modern computer architectures and languages, and the code simplifies a little bit more:

while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i;
  if is_wday (dt_add tm !s) && i * !d >= 5 then begin
    let q = !d / 5 in
    s := !s + 7 * q;
    d := !d - 5 * q

I could stop here. It might not be evident, but this loop will never execute more than seven times, since the remainder after the weekly jump does not exceed five, and the worst case for that is hitting a weekend either before or after it. There is, however, place for further simplification, and that comes from the fact that the loop condition and the inner guard overlap in a non-quite-obvious way. The standard technique to simplify the conditions is to repeat the loop, maintaining the guards:

while !d != 0 && not (is_wday (dt_add tm !s)) do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
if i * !d >= 5 && is_wday (dt_add tm !s) then begin
  let q = !d / 5 in
  s := !s + 7 * q;
  d := !d - 5 * q
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i

The first copy of the loop prepares for the jump by skipping over a possible weekend. Then comes the jump and finally the last copy of the loop tidies up the remaining days to account for. Now, it is obvious that the first loop finishes by having shifted to a weekday, hence the second conjunct of the conditional is superfluous. Also, if |d| < 5, q = 0 (with the usual, round-to-zero semantics mentioned above), and the code can avoid a conditional at the price of two multiplications by zero.

while !d != 0 && not (is_wday (dt_add tm !s)) do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i

There is not much more simplification to be done, except in the first loop, whose condition again overlaps with that of the inner conditional. Again, by splitting it in two and simplifying:

if !d != 0 && not (is_wday tm) then begin
  while not (is_wday (dt_add tm !s)) do
    s := !s + i
  d := !d - i
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
while !d != 0 do
  s := !s + i;
  if is_wday (dt_add tm !s) then d := !d - i

From now on we are in the Land of Diminishing Returns. The first loop can execute at most twice, the only possibilities being a Saturday or a Sunday. Those two days can be special-cased with conditionals. However, the deltas must also be conditional on the direction in which we are counting days; the cascade is unwieldy. This code is satisfactory enough for me.

There is the added complication of non-working holidays, though. We need to account for the missed holidays after the division. It is not enough to adjust the shift s by the number of holidays missed by the jump, and let the last loop tidy things up: it could be that the extra shift landed us in a non-working day even if there are no remaining days left to account for. In this case, we could simply return the shifted date as is, or adjust it to the next working day; in this case, we must be careful not to enter the loop with zero days. The trouble is that, if the division was exact, the date overshoots by one day:

if !d != 0 && not (is_wday tm) then begin
  while not (is_wday (dt_add tm !s)) do
    s := !s + i
  d := !d - i
let t0 = dt_add tm !s in
let q = !d / 5 in
s := !s + 7 * q;
d := !d - 5 * q;
s := !s + i * holidays_between t0 (dt_add tm !s);
while not (is_wday (dt_add tm !s)) do
  s := !s + i
if !d != 0 then begin
  d := !d - i;
  while !d != 0 do
    s := !s + i;
    if is_wday (dt_add tm !s) then d := !d - i

(note that d might be off by one.) The alternative is to check whether the division was exact or not, and back up enough weeks to allow for the comprised holidays. By carrying this far enough, we'd land exactly where we started: there is no alternative but to check each calendar date individually to see if it is a working day or not.


Gödelizing Turing

Down at O'Reilly, Nitesh Dhanjani makes an extremely well-reasoned point that you cannot outright dismiss static verification tools by an unthinking appeal to Turing completeness, even if in principle (or shall I say, in the limit) you are right. He argues that every static analysis is an approximation, and you should take at face value the analytic power of that approximation even though it never ceases to be one, and in truth it never fails to advertise that it is, in fact, an approximation in the first place.

The halting problem is a direct consequence of Gödel's Incompleteness Theorem. Turing titled his paper "On computable numbers, with an application to the Entscheidungsproblem", that is, he started from Hilbert's Decision Problem much as Gödel did. The upshot of this relationship is that decidable = halting, or incomplete = not provably halting (note, not provably non-halting). Thus, one way to make a meaningful halting analysis (or other kinds of static verification) possible is to use a restricted enough computing semantics to weaken the formal system to the point it is rendered decidable.

This is precisely what static typing systems do. A typing system is a logic (witness the Curry-Howard correspondence) that is powerful enough to be useful, but not so powerful that it becomes undecidable. The typechecker then is a theorem verifier in the logic induced by the typing rules. A typechecker for an undecidable type system would hang, that is, not halt, for some programs and their types, which is not quite desirable. Note that a typing system could be sound and complete, that is, decidable for checking but not for inferring; that is, even though all programs could be verified type-correct or not, some (or most) might not be proven so, without manual intervention from the programmer in the form of typing annotations. This is the case with Haskell's extended typing rules (overlapping instances of types), I believe; that's why Haskell programs tend to be more heavily annotated with types than ML programs, even though both nominally use the Hindley-Milner type system, which is decidably inferrable.

There are many kinds of approximate static verifications that are possible, useful and even in current use: aliasing (and the special case, linearity, that analyzes and removes allocation and garbage generation whenever possible), bounds-preserving in indexed accesses, side-effecting and purity, security, and of course, type soundness. Inasmuch these analysis treat data and control flow statically, in exactly the same way that a theorem verifier checks every step of the derivation tree, the end result is the proof that the verified properties are upheld, or, in the case of the approximations, an "I don't know" result that could be refined either with more information supplied by the programmer to the tool, or with further checks that can be automatically introduced to corroborate that the stated properties are met in runtime.

This, in other words, is the gist of the raging debate between static- and dynamic-typing proponents. The point being denounced by Dhanjani's post, and that I think is more often missed, is that this is not an "either-or" proposition, but only the extreme points in a continuum in which language designers have a conscious choice to make their language lie in.


Tallying Grass

Or rather, how to become a vegetarian by discounting beings with a stomach.

In the last post, I showed how to count a kind of flat critters living on the integer-tiled plane (the two-dimensional integer lattice) called ominos. The ominos are sets of cells with integer coordinates that are connected horizontally or vertically; in other words, they are formed by square tiles that share an edge. This is the so-called F pentomino:

For every integer n ≥ 1, I showed how to generate (and count) all the n-celled ominos by extending the (n-1)-ominos, one cell at a time. Some of these ominos have holes, or stomachs in them; that is, they completely surround a region of the plane such that it remains disconnected from the outside region, or exterior. For instance, this 9-omino (or enneomino) has a stomach:

This isn't the simplest animal with a cavity. The obvious mutilation that results in a square ring isn't either, surprisingly. The simplest non-simply-connected omino (that's the medical, I mean technical term) is the following heptomino:

I want to wean my program off meat; that is, I want it to count only simple ominos. These I could call trees, but trees are a special kind of branching ominos that lack "blocks"; I can then call them vegetables, or veggies for short, as trees are a special case of vegetables. The problem is how to determine the realm an omino belongs to; in other words, I need to come up with a predicate that distinguishes the ominos that are simply-connected from those that aren't.

Inspecting cells one by one might seem like a plausible way to do that; however, simplicity is a global property of the omino, and no matter how clever the cell inspection is, it can be fooled by some configuration. The simplicity condition is geometrical in nature, and the test must somehow take this into account. Enter the Jordan Curve Theorem: a simply-connected region has a boundary that neatly divides the plane in just two regions, namely the interior and the exterior. By taking into account the simple geometry of the integer lattice and the connectivity properties of an omino's cells, a consequence of the theorem is that an omino is simple if and only if the exterior is comprised of one connected region.

I don't need to take into account the entire integer lattice; it is enough to consider a representative sample of the exterior cells. These are the cells that completely surround the omino; since our representation includes the omino's extents, including a further border of one cell is sufficient.

This rectangular region consists of all cells whose coordinates are, each independently, in range, so I first need to generate the list of all integers comprised by said range:

let rec range i j =
  if i > j then [] else
  i :: range (succ i) j

Then, I need to combine two ranges, each for the independent x- and y-coordinates into pairs, giving the list of cells. This is the familiar Cartesian product:

let cart l m =
  List.fold_right (fun i ->
    List.fold_right (fun j r -> (i, j) :: r) m) l []

Aside: This is perhaps more recognizable when expressed as a list comprehension:

cart l m = r
 where r = [(i, j) | i <- l, j <- m]

With that, a region is simply the set of comprised cells between the given coordinates:

let region (im, jm) (ix, jx) : C.t =
  C.of_list (cart (range im ix) (range jm jx))

Given an omino, its exterior is the set of cells in the including region not belonging to it:

let exterior o : C.t =
  C.diff (region (-1, -1) (o.wid, o.hgt)) o.cel

(Note how the exterior completely surrounds the omino by extending its bounds.) Now the hard part: determining if the resulting set contains one connected component, or more than one. If the former, the exterior is connected; if the latter, some cells are separated from the exterior by the omino's cells and it is non-simple. A graph traversal that "marks" or remembers visited cells is what I need:

let find_component s =
  let neighbors = C.inter s % flower in
  let rec visit visited pending =
    if C.is_empty pending then visited else
    let c = C.choose pending in
    let pending = C.remove c pending in
    let unvisited = C.diff (neighbors c) visited in
    visit (C.add c visited) (C.union unvisited pending)
  visit (C.empty) (C.singleton (C.choose s))

The algorithm keeps a set of visited cells in parallell with a set of cells pending of visit. If there are no cells of the latter kind, we are done; if not, we choose arbitrarily one, discount it, find its yet-unvisited neighbors in the region and recur, accounting for the cell as visited and remembering its neighbors for further consideration. We jumpstart the process by beginning with an arbitrary cell in the set, and without cells yet visited. Since it starts with an arbitrary cell, the returned connected component is arbitrary. This is the standard algorithm, except that the choice of cells, instead of following a first-in, first-out discipline (breadth-first), or a last-in, first-out discipline (depth-first) is completely non-deterministic.

Now, our predicate is, quite simply:

let is_connected o =
  let t = exterior o in
  C.equal t (find_component t)

That is, if any of the connected components of the exterior comprise the entire exterior, there are no holes and the omino is simple. I'll make a modification to the extensions predicate to filter candidates by an arbitrary predicate:

let extensions p o =
  let candidates c : C.t = C.diff (flower c) o.cel
  and extend c : omino = canon (normalize (C.add c o.cel)) in
  let extend_at c : O.t -> O.t = C.fold (fun c s ->
    let o = extend c in
    if p o then O.add o s else s) (candidates c) in
  C.fold extend_at o.cel O.empty

(For efficiency, the predicate is fused with the fold.) Now I must modify extend_set accordingly:

let extend_set p s = O.fold (O.union % extensions p) s O.empty

To test everything so far, I'll check that, given the true predicate (that is, that which is true everywhere) I get the same results as the last time. For that, I will use the constant function which returns the same value for all its arguments:

let const k _ = k

Then, exactly as before:

# List.map O.cardinal (nest 10 (extend_set (const true)) o1) ;;
- : int list = [1; 1; 2; 5; 12; 35; 108; 369; 1285; 4655]

That is, A000105. But now, by using is_connected:

# List.map O.cardinal (nest 10 (extend_set is_connected) o1) ;;
- : int list = [1; 1; 2; 5; 12; 35; 107; 363; 1248; 4460]

Which agrees with A000104, the number of n-ominos without holes. This takes a while; an obvious optimization would be to use sets with O(1) membership test; for instance, hash sets.

Thanks to Alejandro and Boris for the insightful discussion that led me to this solution.


Counting Cattle

Well, not exactly cattle, but a different kind of animals, the polyominos (or "ominos", for short). These are the kind of tiles you can form by gluing squares by their edges. The article "Generating Polyominoes" in The Monad Reader 5 presents a direct implementation of the naïve algorithm for generating the set of polyominos comprised of n squares. I decided to try my hand at doing the same in OCaml, while trying to write the simplest code I could come up with that solved the problem.

I try to use, whenever possible (and the monomorphic restriction allows!) composition of higher order functions. To make it more pleasant, I use the following operator:

let (%) f g x = f (g x)

Whenever introducing an operator, I find it sound practice to add to it enough structure to make it as generally applicable as possible. The identity of the composition is, unsurprisingly, the identity function:

let id x = x

I will be using sets extensively for computing ominos and their (ha!) sets. Unfortunately, the standard Set module in OCaml lacks a couple of useful functions. Fortunately, OCaml's module system is extremely flexible, and allows a kind of "extension by subclassing" that models pretty well the object-oriented mechanism of extension by inheritance:

module FSet(O : Set.OrderedType) = struct
  include Set.Make(O)
  let map f s = fold (add % f) s empty
  let of_list l = List.fold_right add l empty

I simply include the original module, and add the functions I need. The first addition to this module FSet (short for "functorial set") is a monomorphic map that applies a function f to each element of the set s, returning the set of results. The second function is the fix for a plain oversight of the module authors, that is, the lack of a function that returns the list of elements of a set, which is the inverse of Set.elements.

A cell is simply a pair of integer coordinates:

type cell = int * int

Cells will be treated individually, or as cell sets:

module C = FSet(struct
  type t = cell
  let compare (i, j) (i', j') =
    let c = compare j j' in
    if c <> 0 then c else
    compare i i'

Cells are sorted so that those in the first row come before those in the rows above; in other words, the one with the least y-coordinate is lesser. An omino is a cell set together with its integer extents:

type omino = { wid : int; hgt : int; cel : C.t; }

Knowing how many cells high and wide the region occupied by the cells is will make some of the operations on ominos more efficient. Now, an omino set is, plainly, an FSet of ominos:

module O = FSet(struct
  type t = omino
  let compare o o' =
    let c = compare o.hgt o'.hgt in
    if c <> 0 then c else
    let c = compare o.wid o'.wid in
    if c <> 0 then c else
    C.compare o.cel o'.cel

Again, the lesser omino is the shorter one, or the skinnier one, or the one whose cell set comes first. But, how do we know what the region occupied by a set of cells is? By traversing the cell set and recording the minimum and maximum coordinates seen:

let extents s =
  C.fold (fun (i, j) (mi, mj, xi, xj) ->
    (min mi i, min mj j, max xi i, max xj j))
    (max_int, max_int, min_int, min_int)

Since I am not storing the origin of the region but just its extents, I need a way to normalize the cell set by translating it so that it lies in the first quadrant:

let normalize s =
  let (mi, mj, xi, xj) = extents s in
  { wid = xi - mi + 1; hgt = xj - mj + 1;
    cel = C.map (fun (i, j) -> (i - mi, j - mj)) s; }

(Here's where the extended map in FSet comes handy. Note that the extents are one more than the maximum x- and y-coordinates in the set.) Now, given a list of otherwise unconstrained cells, I build an omino from it simply by normalizing the resulting cell set:

let of_list = normalize % C.of_list

I want to enumerate ominos disregarding plane symmetries. That is, if two ominos can be transformed one into the other by a sequence of rotations and reflections, they are treated as one and the same omino. The first symmetry we have already seen, it is the identity transformation, as implemented by id. A rotation by a quarter turn transforms coordinates (x, y) into (-y, x). By taking into account the omino's extents, I don't need to normalize the result:

let rot o =
  { wid = o.hgt; hgt = o.wid;
    cel = C.map (fun (i, j) -> (o.hgt - 1 - j, i)) o.cel; }

Of course, the old width is the transformed height and vice versa. The third basic transformation is a flip that swaps cells around the horizontal axis by changing coordinates (x, y) into (x, -y):

let flip o =
  { o with
    cel = C.map (fun (i, j) -> (i, o.hgt - 1 - j)) o.cel; }

Again, I take advantage of the extents to perform the transformation in such a way that avoids the need for normalization. These basic transformations are sufficient to carry all the plane symmetries of a rectangular region. In technical terms they finitely generate, or give rise to the so-called dihedral group of order 4, or D4. The remaining transformations are built up from these by composition:

let transforms = [
  rot % rot;
  rot % rot % rot;
  rot % flip;
  rot % rot % flip;
  rot % rot % rot % flip

The symmetries of a polyomino is the set resulting from applying to it each of these transforms in turn:

let symmetries o = O.of_list (List.map (fun x -> x o) transforms)

Aside: I could have fused the map and the fold implied in O.of_list, but doing this is mechanical and would have obscured this rather simple expression.

Given that I want to treat ominos related by a symmetry as identical, I need a way to pick a canonical representation of an omino. This is an arbitrary but fixed element in the set of transforms of the omino. Since ominos are totally ordered by O.compare, this can mean either the least or the greatest element of said set. I opted for the former:

let canon = O.min_elt % symmetries

I am now ready to start growing legs on our animals. I mean, start adding cells to our ominos. Since cells must be connected by an edge, the flower of a cell is the set of all four such neighbors:

let flower (i, j) =
  (C.add (i + 1, j) %
   C.add (i - 1, j) %
   C.add (i, j + 1) %
   C.add (i, j - 1)) C.empty

To extend an omino, we first need to find the candidate set of neighbors of any of its cells that actually fall outside the omino. Then, given a valid neighbor, I extend the omino by normalizing its cell set extended with the new cell. Then, focusing on a particular cell in the set, the set of extensions at that cell is generated by the flower of candidate neighbors. Note that the result of extend is a function from omino sets to omino sets; we will use it to accumulate into the final result the ominos generated by considering in turn all the extensions possible at each of the original omino's cells (I annotate the range types of the functions for documentation):

let extensions o =
  let candidates c : C.t = C.diff (flower c) o.cel
  and extend c : omino = canon (normalize (C.add c o.cel)) in
  let extend_at c : O.t -> O.t = C.fold (O.add % extend) (candidates c) in
  C.fold extend_at o.cel O.empty

The lifting of the extension of an omino to the extension of an omino set is straightforward:

let extend_set s = O.fold (O.union % extensions) s O.empty

(This is what is usually called a flat map, or a monadic join.) We are now ready to enumerate!

I start with the simplest possible omino set, the set of 1-celled ominos. Its only member is the (unique) one-celled omino:

let o1 = O.singleton (of_list [0,0])

To make things easy, instead of manually constructing the extension, the extension of the extension and so on, I'll use the iterated composition function:

let rec nest n f x =
 if n == 0 then [] else
 if n == 1 then [x] else
 x :: nest (pred n) f (f x)

This repeatedly applies the function f to its argument, up to n times. Instead of just returning the n-th iterate, fn(x), it returns the list of partial iterates. This makes simple to compute the enumeration:

let enumerate_to n = List.map O.cardinal (nest n extend_set o1)

For instance, let's enumerate the ominos up to the decominos (the critters with exactly ten tiles in their bodies):

# enumerate_to 10 ;;
- : int list = [1; 1; 2; 5; 12; 35; 108; 369; 1285; 4655]

This takes a while, but the result does indeed coincide with the beginning of the On-Line Encyclopedia of Integer Sequences' A000105!

I'm usually surprised at the conciseness and elegance most Haskell programs achieve, through a combination of laziness, a great standard library and higher-order combinators. OCaml programs, in my experience, are much more low-level, and they tend to have a "procedural" flavor to them caused by too much explicit scaffolding being visible. Here, a judicious combination of the proper data structure (Sets) and some general-purpose recursive higher order functions make this implementation short, simple and manifestly correct.

Plane Geometry in Java (III)

See also Part I and Part II.

Now that we have vectors and points, I can show how a line on the plane can be implemented:

final class Line {
  public final Pt p;
  public final Vec v;

  public Line(Pt p, Vec v) {
    if (p == null || v == null)
      throw new IllegalArgumentException("Arguments cannot be null");
    this.p = p;
    this.v = v;

  public boolean equals(Object o) {
    if (o == null || !(o instanceof Line)) return false;
    else if (this == o) return true;
    final Line that = (Line) o;
    return this.p.equals(that.p) && this.v.equals(that.v);

  public int hashCode() {
    return 127 * p.hashCode() + v.hashCode();

  public String toString() {
    return String.format("%s-->%s", p, v);

A difference with previous value classes is that we must make sure that a Line is properly constructed with actual values; other than that, the implementation is unsurprising. A basic property of a line is its slope:

  public double slope()            { return v.arg(); }

Given two lines, we might want to know whether they are parallel or perpendicular:

  private static final double EPSILON = 1e-11;

  public boolean isPerpendicular(Line l) {
    return Math.abs(v.dot(l.v)) < EPSILON;

  public boolean isParallel(Line l) {
    return isPerpendicular(l.perp(l.p));

This is necessarily an approximation, since it involves a subtraction. In any case, the angle between two lines is the angle between its respective direction vectors:

  public double angle(Line l)      { return v.angle(l.v); }

Our first construction on lines gives the perpendicular through a given point:

  public Line perp(Pt q)           { return new Line(q, v.perp()); }

In particular, the median between to points is the perpendicular to the line comprising both and passing through their midpoint:

  public static Line median(Pt p, Pt q) {
    return p.through(q).perp(p.mid(q));

Note how the definition formalizes the English description. The second, fundamental construction is the intersection point between two lines:

  public Pt meet(Line l) {
    final Vec u = l.v.perp();
    return p.at(v.scale(u.dot(p.to(l.p)) / u.dot(v)));

This is a direct translation of Property 20. If the lines are parallel, u.dot(v) is zero and the division gives the point at infinity. Another usual construction is to find the foot of a point on a given line; that is, the nearest point to it on the line:

  public Pt foot(Pt q) {
    final Vec u = v.unit();
    return p.at(u.scale(u.dot(p.to(q))));

To test whether two segments (that is, the subset of the line comprised between the base point and its translate by the direction vector) intersect, check if the quadrilateral formed by said points is convex or not:

  public boolean meets(Line l) {
    final Vec u = l.v.perp();
    final Vec w = p.to(l.p);
    final double r = u.dot(w);
    final double s = v.perp().dot(w);
    final double t = u.dot(v);
    return 0 <= r && r <= t && 0 <= s && s <= t;

Sometimes is useful to specify a line in implicit form, as ax + by + c = 0. This constructor does so:

  public static Line implicit(double a, double b, double c) {
    final Pt p = b != 0 ? new Pt(0, c / b) : new Pt(c / a, 0);
    final Vec v = b > 0 ? new Vec(b, -a) : new Vec(-b, a);
    return new Line(p, v);

Conversely, the implicit coefficients of a line can be retrieved with:

  public double[] coeffs() {
    final Vec u = v.perp();
    return new double[] { u.i, u.j, -u.dot(p.to(Pt.ORIGIN)) };

Now we have everything in place to program simple Plane Geometry constructions. For instance, we can check that the altitudes of a triangle meet at a point. First, it is convenient to have a way to produce triangles that checks for us that we passed sensible values:

public class Geom {
  static Pt[] triangle(double a, double b, double c) {
    double t;
    if (a < b) { t = a; a = b; b = t; }
    if (b < c) { t = b; b = c; c = t; }
    if (a < b) { t = a; a = b; b = t; }
    // Now a >= b >= c
    if (a >= b + c)
      throw new IllegalArgumentException("Unsatisfied triangle inequality");

Our triangle will have the origin as its first vertex, its second vertex on the positive x-axis, and the third on the first quadrant. To find the latter, a simple system of equations involving the Pythagorean Theorem suffices:

    final double x = 0.5 * (a * a + (b + c) * (b - c))/a;
    final double y = Math.sqrt(b * b - x * x);
    return new Pt[] { Pt.ORIGIN, new Pt(a, 0), new Pt(x, y) };

Since we have to find the three altitudes, another utility function will save repetition. This one-liner reads exactly as a mathematical description:

  static Line altitude(Pt a, Pt b, Pt c) {
    return a.through(b).foot(c).through(c);

(albeit backwards!) Maybe this justifies writing a Tri class? Finally, the test:

  public static void main(String[] args) {
    final Pt[] tri = triangle(3, 6, 8);
    final Line u = altitude(tri[0], tri[1], tri[2]);
    final Line v = altitude(tri[1], tri[2], tri[0]);
    final Line w = altitude(tri[2], tri[0], tri[1]);
    final Pt r = u.meet(v);
    final Pt s = v.meet(w);
    System.out.printf("r = %s, s = %s\nDistance between points: %f\n", r, s, r.dist(s));

The result shows, as expected, the coincidence of both points of intersection:

r = (5.68750 , 6.88204), s = (5.68750 , 6.88204)
Distance between points: 0.000000


Plane Geometry in Java (II)

See also Part I and Part III.

We have our vectors ready. Now, points are entirely analogous, insofar they are value objects:

final class Pt {
  public final double x;
  public final double y;

  public Pt(double x, double y)    { this.x = x; this.y = y; }

  public static final Pt ORIGIN = new Pt(0, 0);

  public boolean equals(Object o) {
    if (o == null || !(o instanceof Pt)) return false;
    else if (this == o) return true;
    final Pt that = (Pt) o;
    return this.x == that.x && this.y == that.y;

  public int hashCode() {
    return (int) ((Double.doubleToLongBits(x) >> 32) & 0xffffffffL)
         ^ (int) ((Double.doubleToLongBits(y) >> 32) & 0xffffffffL);

  public String toString() {
    return String.format("(%g , %g)", x, y);

There is a distinguished point, the origin.

It is important to understand the need to duplicate code that is essentially the same as in Vec; in other words, it is not appropriate to refactor common code in classes Vec and Pt, for two reasons:

  1. First, a vector is not a point, nor vice-versa. There is not even a meaningful common supertype between them. The whole point of using the typed algebra is to manipulate formulas with the added security afforded by the typing rules
  2. Second, even if a private base class could abstract some code, the bulk of the code needed to correctly implement a base type is in equals. This method needs to cast to the static class of the object in order for the comparison to be meaningful (else it would be comparing a derived class to the base class), so the duplicate equals are essential anyway

The Algebra has two "constructors" linking points and vectors: the point constructor and the vector constructor. Both operate on points, so naturally they are members of the Pt class. Also, as Axiom 1 states, both operations are some kind of additive inverses one of the other:

  public Pt at(Vec v)              { return new  Pt(x + v.i, y + v.j); }
  public Vec to(Pt p)              { return new Vec(p.x - x, p.y - y); }

A natural operation on points is to find the distance between them. It is, quite simply, p.to(q).norm(). However for convenience I include it directly in Pt:

  public double dist(Pt p)         { return to(p).norm(); }

Linear interpolation, or lerp mixes two points to give a third:

  public Pt lerp(Pt p, double k)   { return at(to(p).scale(k)); }

In particular, the midpoint between two given points is:

  public Pt mid(Pt p)              { return lerp(p, 0.5); }

The set of all linear interpolants is obviously a line. The simplest way to determine one is however to give a point through which it passes and the direction it takes:

  public Line towards(Vec v)       { return new Line(this, v); }
  public Line through(Pt p)        { return new Line(this, to(p)); }

I will show next how to use lines.


On the Integer Spiral

The problem: find a procedure to map the integers 0, 1, 2 … uniquely into points (i, j) ∈ Z² with integer coordinates, starting with 0 ↦ (0, 0), and continuing in a counterclockwise spiral pattern. The first few points are (0, 0), (1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (2, -1), (2, 0), (2, 1), (2, 2), (2, 1), (2, 0) … The idea is to find a bijective mapping between N and Z². Of course there are simpler ones, but the integer spiral is particularly pretty:

An instant's doodling on graphing paper shows that every complete turn of the spiral completely surrounds the origin with a square, each having 1, 8, 16, 24 … new points. The first, or "zero-th" such "square" is the origin itself, having "side" 1. Each new square's boundary points are at a distance 1 from those of the older one (except for the corner), so each new square has side two plus the older square's side, or l = 2⋅n + 1. An appeal to the inclusion-exclusion principle proves that the perimeter of a square is four times the side minus the four corners counted twice; hence, the perimeter is comprised of p = 8⋅n points. The total area of this square is, obviously a = (2⋅n + 1)² points.

The i-th point lands on the n-th square's perimeter if and only if it is not inside it, that is, if it doesn't land on the (n-1)-th square's area. The index starts at zero, so this condition is equivalent to a.(n-1) ≤ i < a.n. Massaging the inequality to extract n:

   a.(n-1) ≤ i < a.n
≡ { Definition a }
   (2⋅(n-1) + 1)² ≤ i < (2⋅n + 1)²
   (2⋅n - 1)² ≤ i < (2⋅n + 1)²
≡ { i is nonnegative }
   2⋅n - 1 ≤ √i < 2⋅n + 1
≡ { Algebra }
   n ≤ (√i + 1)/2 < n + 1
≡ { Definition floor }
   n = ⌊(√i + 1)/2⌋

Let d = i - (2⋅n - 1)² be how far around the n-th square's perimeter the index falls. There are four sides to it, each thus having 2⋅n points. If d is less than 2⋅n it is mapped to the first (i.e., left) side; if it is less than 4⋅n it's mapped to the second (top) one, and so on. Let this side index be k = ⌊d/(2⋅n)⌋.

As the spiral goes, each side has start- and end-points that depend on k like this:

0(n, 1-n)(n, n)
1(n-1, n)(-n, n)
2(-n, n-1)(-n, -n)
3(1-n, -n)(n, n)

So, if sx, sy, dx, dy are the signs and offsets applied to each starting point (sx⋅(n - dx), sy⋅(n - dy)) as a function of k, we have:


Note that dy.k = dx.(k - 1) and sy.k = sx.(k - 1), modulo 4. It is not difficult to see that dx.k = k % 2, and sx.k = 1 - k % 4 + k % 2.

Aside: To explain it in full: first, sx.k must have period four, hence the reduction of the argument modulo 4. Second, it takes two consecutive values in pairs; this means that 0 and 1, and 2 and 3 must be treated the same, hence the reduction modulo 2. Finally, the result is shifted so that the range is ±1.

Some algebraic manipulation gives the alternative sx.k = 1 - 2⋅⌊(k % 4)/2⌋, which when implemented with bitwise operators is the very compact 1 - (k & 2).

Similarly, we need to find two characteristic functions, or multipliers, to apply to the corner the side offset, call it o = d % (2⋅n); so that points are traversed first up, then left, then down and finally down:

0 0+1
1-1 0
2 0-1
3+1 0

But this is simply -sxdx (resp. -sydy)!. Expanding and collecting equal terms, we have that the i-th point has coordinates (sx⋅(n - dx⋅(o + 1)), sy⋅(n - dy⋅(o + 1))).

Aside: Had I chosen instead to subtract the offset from the side's endpoint, I would have arrived to this directly, without finding the need to simplify the expression.

Hence, it all translates directly to OCaml as:

let spiral i =
  let n  = truncate (0.5 *. (sqrt (float i) +. 1.)) in
  let d  = i - (2*n - 1) * (2*n - 1) in
  let k  = d / (2*n)
  and o  = d mod (2*n) in
  let sx = 1 -  k    land 2
  and sy = 1 - (k+3) land 2
  and dx =      k    land 1
  and dy =     (k+3) land 1 in
  (sx * (n - dx * (o + 1)), sy * (n - dy * (o + 1)))


Plane Geometry in Java (I)

See also Part II and Part III.

Algebras have computational content, and the Plane Geometry Algebra I presented is no exception. It's time to put it to test by implementing a DSL embodying it. I'll use value objects to represent geometric objects in the plane. The reason for this decision, if it is not obvious enough, is that points, vectors and lines, as geometric entities, have no meaningful state.

As pointed out in C2, value objects should be immutable; this implies that there is no point in hiding the instance variables behind setters, as the values are determined at the moment of construction. On the other hand, it is necessary to ensure proper value semantics by correctly implementing equals and hashCode.

I find it easiest to start with the class Vec, as vectors have a very well-known, rich structure that you will probably find familiar. But first, the basic Object contract:

final class Vec {
  public final double i;
  public final double j;

  public Vec(double i, double j)   { this.i = i; this.j = j; }

  public static final Vec ZERO = new Vec(0, 0);

  public boolean equals(Object o) {
    if (o == null || !(o instanceof Vec)) return false;
    else if (this == o) return true;
    final Vec that = (Vec) o;
    return this.i == that.i && this.j == that.j;

  public int hashCode() {
    return (int) ((Double.doubleToLongBits(i) >> 32) & 0xffffffffL)
         ^ (int) ((Double.doubleToLongBits(j) >> 32) & 0xffffffffL);

  public String toString() {
    return String.format("[%g , %g]", i, j);

There is a distinguished vector, the zero vector.

In order for equals to be an equivalence relation, it must be closed on its domain (that is, it must compare meaningfully only Vecs, line 1), reflexive (line 2), symmetric and transitive. The equality operator in Java fulfills these properties, and so (line 4), using it to compare by value the coordinates ensures that the overall result conforms.

With respect to hashCode, there are many proposed methods to hash IEEE 754 doubles, but most aim at "chunking" buckets of nearby values by chopping off the least significant bits of the mantissa. To avoid clustering more in one coordinate than over the other, I mix symmetrically bits from both values using an exclusive-or.

Now, let's flesh out Vec's structure, not forgetting the perp operation in the Algebra:

  public Vec add(Vec v)            { return new Vec(i + v.i, j + v.j); }
  public Vec neg()                 { return new Vec(-i, -j); }
  public Vec scale(double k)       { return new Vec(k * i, k * j); }
  public Vec perp()                { return new Vec(-j, i); }

and endow it with a norm and a scalar product:

  public double dot(Vec v)         { return i * v.i + j * v.j; }
  public double norm()             { return Math.hypot(i, j); }
  public double arg()              { return Math.atan2(j, i); }
  public Vec unit()                { return scale(1/norm()); }

A natural operation on vectors is to find the enclosed (convex) angle between them. The dual operation is to rotate a vector by a given angle. It is well-known that the dot product between two plane vectors is proportional to the cosine of the comprised angle. By Theorem 16, the perp-dot product is proportional to the sine of the same angle. The cross product captures both proportionalities in vectorial form:

  public Vec cross(Vec v)          { return new Vec(dot(v), perp().dot(v)); }

This operation satisfies the following properties:

(0.0)  u × (v + w) = u × v + u × w
(0.1)  (u + v) × w = u × w + v × w
(0.2)  (ku) × v = k ⋅ (u × v)
(0.3)  u × (kv) = k ⋅ (u × v)
(0.4)  (-u) × v = u × (-v) = -(u × v)
(0.5)  (u × v) = u × v
(0.6)  u × v = u × v
(0.7)  u × v = -u × v

Properties (0.0) to (0.3) show that cross is linear on both arguments. This is a consequence of both the linearity of dot product and that of perp. From this, Property (0.4) is an easy consequence. Properties (0.4) to (0.7) capture the rotational invariance of the cross product, at least for quarter rotations and reflections. With this, we can define:

  public static Vec rotor(double a) {
    return new Vec(Math.cos(a), Math.sin(a));

  public double angle(Vec v)       { return cross(v).arg(); }
  public Vec rotate(double a)      { return rotor(-a).cross(this); }

That angle correctly captures the intended functionality is clear from the definitions:

= { Definition }
= { Definition }
   (new Vec(u.dot(v), u.perp().dot(v))).arg()
= { Properties of dot, perp-dot }
   (new Vec(|u|⋅|v|⋅cos.(θ.u.v), |u|⋅|v|⋅sin.(θ.u.v)).arg()
= { Definition arg, atan2 }
   atan.((|u|⋅|v|⋅sin.(θ.u.v)) / (|u|⋅|v|⋅cos.(θ.u.v)))
= { Algebra, definition tan }

Other properties of angle are:

(1.0)  θ.u.v = -θ.v.u
(1.1)  θ.u.v = θ.(ku).v = θ.u.(kv)
(1.2)  |v|⋅u.rotate(θ.u.v) = |u|⋅v
(1.3)  u × (u.rotate(a)) = |u|²⋅rotor(a)
(1.4)  θ.u.(u.rotate(a)) = a

Next, I will present points.