2008-07-12

Picturing Networks

This is what a sorting network looks like:

It is a sorting network of width 8 with a minimal number of 19 exchange boxes and 6 parallel stages. Each item to be sorted is represented by a horizontal line, and there are eight of them. Each exchange box is represented by a thicker vertical line between inputs. If the exchange boxes can be scheduled to run in parallel without interference, they are brought closer together, to emphasize grouping. It is not difficult to generate these graphics. I'll emit EPS as it is a simple textual format which Preview in Mac OS X can rasterize readily enough.

As before, a network is an array of pairs of integer indices ij representing an exchange operation between inputs i and j:

type network = (int * int) array

The program will take networks represented in textual format: each exchange box is a pair of integers separated by a colon:

let re_box = Str.regexp "\\([0-9]+\\):\\([0-9]+\\)"

sequences of such boxes are separated by whitespace, commas or colons:

let re_sep = Str.regexp "[\t ]*[,;][\t ]*"

A box is parsed with the former, by matching and converting. If the matching succeeds, it is sufficient validation to let me convert both indices to decimal without fear:

let box_of_string s =
  if Str.string_match re_box s 0 then
    let i = int_of_string (Str.matched_group 1 s)
    and j = int_of_string (Str.matched_group 2 s) in
    if i > j then j, i else i, j
  else failwith "malformed exchange box"

Converting an entire network is as simple as splitting the input line according to the latter regular expression, and applying the preceding function to each pair. Afterwards, I count the greatest referenced input line in the network to obtain its width:

let network_of_string s : int * network =
  let l = List.map box_of_string (Str.split re_sep s) in
  succ (List.fold_left (fun width (_, j) -> max width j) 0 l), Array.of_list l

For instance, the above network can be parsed like this:

# network_of_string "4:5,0:6,3:7,1:2;2:5,0:4,1:3,6:7;2:6,3:4,0:1;1:3,5:7,4:6;4:5,2:3;5:6,1:2,3:4" ;;
- : int * network =
(8,
 [|(4, 5); (0, 6); (3, 7); (1, 2); (2, 5); (0, 4); (1, 3); (6, 7); (2, 6);
   (3, 4); (0, 1); (1, 3); (5, 7); (4, 6); (4, 5); (2, 3); (5, 6); (1, 2);
   (3, 4)|])

In order to determine which group of comparators can be scheduled in parallel, I simulate placing them onto wires and "pushing" them as far to the left as previously placed comparators would allow them. The following function abstract this logic into a generic fold, which lets me reuse it in a couple of places without bothering with this logic:

let fold_network f e width (net : network) =
  let bounds = Array.make width 0 in
  fst (Array.fold_left (fun (e, depth) (i, j) ->
    if i == j then (e, depth) else
    let d = max depth (1 + max bounds.(i) bounds.(j)) in
    bounds.(i) <- d; bounds.(j) <- d;
    (f (d > depth) e (i, j), d)) (e, 0) net)

The array bounds keeps tab of how many comparators are already placed on line i. The variable depth in the fold accumulator tracks the overall depth of the network. An exchange box i = j with both terminals on the same input is considered a "no-operation" and not counted. Otherwise, the endpoints would accumulate on the respective entries in bounds; if the resulting number is larger than the recorded depth, the accumulating function is notified and the accumulator is updated accordingly.

With this, is not difficult to pretty print a network in a format that round-trips correctly via parsing:

let string_of_network width (net : network) =
  let buffer = Buffer.create 16 in
  ignore (fold_network (fun is_sequential any (i, j) ->
    if any then
      Buffer.add_char buffer (if is_sequential then ';' else ',');
    Buffer.add_string buffer (string_of_int i);
    Buffer.add_char   buffer ':';
    Buffer.add_string buffer (string_of_int j);
    true) false width net);
  Buffer.contents buffer

Exchange boxes that can be grouped in parallel are separated by commas, and sequential groups are separated by semicolons.

An EPS document is characterized by a reduced number of PostScript commands; the file itself must follow the DSC set forth by Adobe in the respective manual. The most important comment is the bounding box that must accurately enclose the drawing. This requires calculating the total width and height the network will occupy, and to avoid clipping I use a little margin:

let open_document out ?(margin=0.) width height =
  Printf.fprintf out "%%!PS-Adobe-3.0 EPSF-3.0\n";
  Printf.fprintf out "%%%%BoundingBox: 0 0 %d %d\n"
    (truncate (ceil (width  +. 2. *. margin)))
    (truncate (ceil (height +. 2. *. margin)));
  Printf.fprintf out "%%%%DocumentData: Clean7Bit\n";
  Printf.fprintf out "%%%%EndComments\n"

Similarly, the document must be finalized with a standard prelude:

let close_document out =
  Printf.fprintf out "showpage\n";
  Printf.fprintf out "%%%%EOF\n"

To ease the measurements in PostScript, a conversion from millimeters to points:

let mm = ( *. ) (72. /. 25.4)

This lets me express sizes in metric units (even if neither Preview's nor Photoshop's conversions accurately reflect these sizes). The distance between input lines is:

let line_distance = mm 5.

and the distances between exchange boxes are:

let parallel_distance   = mm 2.
and sequential_distance = mm 5.

The lines have their different thicknesses, too:

let line_thickness     = 0.5
and exchange_thickness = 2.

To compute the bounding box I need to calculate the overall width of the laid out network. As you can see in the graphic above, the first and last boxes are inset by a sequential_distance from the input lines' ends. Then it's just a matter of determining if the box must be laid in a group or after one, and add the corresponding distance:

let network_width width (net : network) =
  fold_network (fun is_sequential width _ ->
    width +. if is_sequential then sequential_distance else parallel_distance)
  sequential_distance width net

Note that I have no use for the indices in the fold function, and that the initial width is a sequential_distance as explained above. Note also that the parameter width is actually the number of input lines (the network's width), not its graphical horizontal dimension.

Generating the actual PostScript is a little messy, so I'll intersperse these comments with the code instead of presenting it as a complete function:

let output_network out width (net : network) =

First I compute the dimensions of the graphic. I will use an exchange_thickness of margin around the picture, so that no line or terminal gets clipped. The height is the line_distance times the number of inter-line gaps, that is, one less the network width:

  let m = exchange_thickness
  and h = line_distance *. float (pred width)
  and w = network_width width net in

With that I can output the document header:

  open_document out ~margin:m w h;

I lay first the input line chord:

  Printf.fprintf out "%g setlinewidth\n" line_thickness;
  for i = 0 to pred width do
    let y = m +. line_distance *. float i in
    Printf.fprintf out "%g %g moveto %g %g lineto stroke\n"
      m y (m +. w) y
  done;

I use %g format specifiers to minimize the number of decimal digits printed. Also, I take into account the margins both in the horizontal and vertical coordinates. Now the actual placing of the exchange boxes is a matter of folding over the network:

  ignore (fold_network (fun is_sequential width (i, j) ->

The horizontal coordinate is accumulated over, and updated for each new exchange box being placed:

    let x  = width +. if is_sequential then sequential_distance else parallel_distance

The vertical coordinates present no mystery:

    and y  = m +. line_distance *. float i
    and y' = m +. line_distance *. float j in

With exchange_thickness I lay the vertical lines:

    Printf.fprintf out "%g setlinewidth\n" exchange_thickness;
    Printf.fprintf out "%g %g moveto %g %g lineto stroke\n" x y x y';

And with a zero line to avoid overstepping the bounding box I place connector dots on both endpoints:

    Printf.fprintf out "0 setlinewidth\n";
    Printf.fprintf out "%g %g %g 0 360 arc fill\n" x y  exchange_thickness;
    Printf.fprintf out "%g %g %g 0 360 arc fill\n" x y' exchange_thickness;

The fold accumulates the horizontal coordinate starting with a margin:

    x) m width net);

Finally, I close the document:

  close_document out
;;

The terminating double-colon is just to emphasize that this function must be reconstructed from its component pieces. A minimally simple command-line driver for this pretty-printer takes its argument from the line, and writes its output to standard output:

let () =
  let width, network = network_of_string Sys.argv.(1) in
  output_network stdout width network

A robust driver with options, on-screen usage and an exception handler is left to the reader.

No comments: