2008-03-19

Generating Combinations

Only for the record, and because I arrived at this form of the algorithm after massaging the one given in Algorithms for Programmers, Algorithm 6.1 "Lexicographic and Co-Lexicographic Order" and I find it very pretty. First the algorithm, then the explanation:

let iter_comb n k (f : int array -> unit) =
  let v = Array.init k (fun i -> i) in
  f v;
  while v.(0) != n - k do
    let j = ref (k - 1) in
    while v.(!j) == n - k + !j do decr j done;
    let z = v.(!j) - !j + 1 in
    while !j != k do
      v.(!j) <- z + !j;
      incr j
    done;
    f v
  done

Since the number of combinations grows factorially, I opted to write a generator instead of something returning a list. The continuation f will receive each of the combinations of k elements taken from a set of n elements, with each represented as an integer between 0 and n - 1.

Aside: This means that it can be translated to Python, say, via a direct syntactic mapping:

def comb(n, k):
  v = range(0, k)
  yield v
  while v[0] != n - k:
    j = k - 1
    while v[j] == n - k + j:
      j -= 1
    z = v[j]
    while j != k:
      z += 1
      v[j] = z
      j += 1
    yield v
  return

I've nominally faked that with the help of a cheat sheet.

These subsets are in turn represented as an integer array sorted in ascending order, and each combination will be lexicographically the successor to the previous one. This is what is normally called lex-order generation.

Taken together, these two preconditions determine univocally what the algorithm does. The lexicographically first combination satisfying these preconditions is {0, 1, … k - 1 }; this is built on line 2 and passed to the generator's continuation f on line 3. Dually, the lexicographically last combination is { n - k, n - k + 1, … n - 1 }; this gives us the loop invariant:

P ≡ ⟨∃i : 0 ≤ i < k : v.(i) ≠ n - k + i

The loop condition on line 4 takes advantage of this. The inner loop then sets to change the last element on the subset that can possibly be changed, so it has to find it. The loop on line 6 looks for the greatest j such that all elements to the right of j have their maximum allowed value, so that:

Q ≡ v.(j) ≠ n - k + j ∧ ⟨∀i : j < i < k : v.(i) = n - k + i

and the loop condition imply P, and so v.(j) can be incremented without violating the invariant; the next elements must be changed to restore it fully. Note that in line 7, the increment is 1 ≤ zn - k, so the algorithm makes progress on each iteration.

Aside: The upper bound for z follows from straightforward algebraic manipulation; the lower bound stems from the fact that the minimum value that v.(j) can take after the execution of line 7 is 0, but only for j = 0 since the subset is sorted.

This algorithm, when specialized to k = n, is equivalent to Dijkstra's algorithm for the Next Lexicographic Permutation (A Discipline of Programming, Prentice-Hall, 1997.)

2 comments:

Anonymous said...

seems like a lot of work for something that can be done in a more straightforward functional manner...

let rec combos = function
| [] -> [[]]
| x::xs -> List.concat (List.map (fun rest -> [x::rest; rest]) (combos xs))

Matías Giovannini said...

Yes, it is some work to formally justify a hairy imperative algorithm like this one, especially when the straightforward recursive definition is so clear. But, as I remarked in the post, the intent was to analyze the imperative algorithm for generating all the combinations and proving it actually works, not just to show an algorithm for it.