Also, quasicrystals. Have you ever found yourself wanting to do generated animations but didn't know how to visualize them? Since I couldn't find a self-contained GIF encoder, I made myself one. Even though it is not fast, I think its 220 lines are clear enough to be a good reference implementation to build upon. The encoder conforms to the following interface:
type palette val palette : ?background:int -> color array -> palette type t val make : ?ct:palette -> width:int -> height:int -> (t -> unit) -> out_channel -> unit val image : ?ct:palette -> ?transparent:int -> ?fps:int -> ?x:int -> ?y:int -> ?width:int -> ?height:int -> string -> t -> unit val comment : string -> t -> unit val repeat : ?count:int -> t -> unit
The idea is to build the animation via repeat
, comment
and image
inside the higher-order argument to make
to ensure a well-formed output file.
GIF is a block-oriented format specifically allowing decoders to quickly skip over unknown blocks. GIF data is written as length-prefixed blocks, with the block size occupying a byte, hence a data block can be no larger than 255 bytes. Variable-length data is written as a succession of nonempty blocks terminated by an empty block, that is, a zero byte.
For this reason I need a buffering layer upon channel output. This is where I find objects a nice programming construct: manipulating stateful abstractions in an imperative way, even when the overall interface is functional. The output
stream can be in immediate mode, where bytes are passed directly to the output channel, or in block mode, where bytes are accumulated in the buffer
to be written in block:
class output otch = object (self) val buffer = String.create 255 val mutable current = -1
The member current
is -1 in the first case, and in the second it marks the next empty position in buffer
. When the buffer
is full or at the end of the block any accumulated and unwritten bytes must be flush
ed out:
method private flush = if current > 0 then begin output_byte otch current; output otch buffer 0 current; current <- 0 end
The output
stream is byte
-oriented:
method byte n = let b = n land 255 in if current == -1 then output_byte otch b else begin if current == 255 then self#flush; buffer.[current] <- char_of_int b; current <- current + 1 end
If in immediate mode the byte
is written to the underlying channel, or else it is accumulated. This primitive is the building block for the other methods:
method le16 n = self#byte n; self#byte (n lsr 8) method string str = String.iter (fun c -> self#byte (int_of_char c)) str
(I admitted this is not particularly fast.) The switch into and out of block mode is performed via the following pair of methods:
method begin_block = if current != -1 then failwith "begin_block"; current <- 0 method end_block = if current == -1 then failwith "end_block" else self#flush; current <- -1; self#byte 0
It doesn't make sense to nest block modes, so as a sanity check I verify before changing mode. Finally, a bit of clean-up:
method close = if current != -1 then self#end_block; flush otch end
Pretty simple except maybe for the fact that the output
stream must be careful not to commit a fence error when writing a block of exactly 255 characters. This class is functional enough to support all the data required to build a GIF file, including the LZW-compressed image data. For that I also use a class, again to encapsulate a blob of mutable data. That doesn't mean that I will forgo functional style, though:
let fold_string f e s = let res = ref e in String.iter (fun c -> res := f !res c) s; !res
The LZW compressor used by GIF starts with a given code size, in principle the pixel bit-width, and expands it as it compresses data. It uses two special codes for signaling a stream reset and the end of the stream:
class lzw ?(bits=8) out = let count = 1 lsl bits in let clear_code = count and end_of_info = count + 1 in object (self)
Since the stream is bit-oriented, the compressor keeps a bit buffer on which it append
s data until there are enough bits to make whole bytes:
val mutable code_size = 0 val mutable buffer = 0 val mutable length = 0 method append n = buffer <- buffer lor ((n land pred (1 lsl code_size)) lsl length); length <- length + code_size; while length >= 8 do out#byte (buffer land 0xff); buffer <- buffer lsr 8; length <- length - 8 done
Bits accumulate from least significative to most significative, that is, right to left. Of course, any left-over bits must be flushed
out too:
method finish = if length > 0 then out#byte (buffer land pred (1 lsl length))
The basic idea behind the algorithm is to build a table of those strings repeatedly seen in the input stream and to replace them with a code whenever they are detected. A good (but frustratingly sketchy in crucial parts) reference is that of Steve Blackstock, who recommends using a hash table to quickly find substrings:
val table : (int * int, int) Hashtbl.t = Hashtbl.create 5003 val mutable next_code = 0
The table maps substrings of the form <prefix>⋅c to codes; the <prefix> is represented as the corresponding code, in effect treating substrings as linked lists of characters. The initial table is populated with the roots, the 2bits
possible input values, plus the two special codes:
method private reset = Hashtbl.clear table; for i = 0 to count - 1 do Hashtbl.add table (-1, i) i done; code_size <- bits + 1; next_code <- count + 2
Since the total number of roots is 2bits
+ 2, the code_size
is bits + 1
. Whenever a new string is added to the table, the encoder must check that the code_size
is sufficient, and adjust it if not. The GIF specification allows for a maximum of 12-bit codes, so if that size is reached the stream is reset:
method private add_string s = if next_code == 1 lsl code_size then code_size <- code_size + 1; Hashtbl.add table s next_code; next_code <- next_code + 1; if next_code == 0x1000 then begin self#append clear_code; self#reset; end
The compressor follows essentially the description given in the reference:
method compress data = self#reset; self#append clear_code; let last = fold_string (fun prefix c -> let k = int_of_char c in try Hashtbl.find table (prefix, k) with Not_found -> self#append prefix; self#add_string (prefix, k); k) (-1) data in self#append last; self#append end_of_info; self#finish end
It took me a number of tries to discover the correct expand/reset sequence; all in all, I find the LZW algorithm really clever.
GIF is an indexed-color graphics format. Pixels are indices into a Color Table or palette
:
type color = { red : int; green : int; blue : int; } type palette = { background : int; bits : int; table : color array; }
Creating a palette is a matter of validating the input (GIF is restricted to power-of-two color tables) and making sure that the array of color
s can't be mutated outside the palette
:
let palette ?(background=0) table = let size = Array.length table in if size < 2 || size > 256 || size land (size - 1) != 0 || background < 0 || background >= size then invalid_arg "color_table" else let bits = truncate (log (float size) /. log 2.) in { background; bits; table = Array.copy table; }
The GIF type records the screen dimensions, the Global Color Table and the output
stream:
type t = { width : int; height : int; ct : palette option; out : output; }
The GIF is built part by part, beginning with the Header and ending with the Trailer:
let header { out; _ } = out#string "GIF89a" let trailer { out; _ } = out#byte 0x3b
(Note the concision that I gain by using a record as the data type instead of an object.) Color Tables are laid out sequentially:
let color_table { table; _ } { out; _ } = Array.iter (fun { red; green; blue } -> out#byte red ; out#byte green; out#byte blue) table
A GIF file consists of one or more images laid out in a graphics space represented by a Logical Screen Descriptor:
let logical_screen ({ width; height; ct; out } as gif) = out#le16 width; out#le16 height; match ct with | None -> out#byte 0; out#byte 0; out#byte 0 (* default aspect ratio (1:1 = 49) *) | Some ({ background; bits; _ } as ct) -> out#byte (0xf0 lor pred bits); out#byte background; out#byte 0; color_table ct gif
If the GIF includes a global color table, it follows the Logical Screen Descriptor. In order for make
to be safe it must close the output stream even in the presence of a client error:
let unwind ~(protect : 'a -> unit) f x = try let y = f x in protect x; y with e -> protect x; raise e
Then it is a matter of creating the data record, writing the header and the logical screen description, calling the client function to generate the content, and wrapping up the file with the trailer:
let make ?ct ~width ~height proc otch = unwind ~protect:(fun gif -> trailer gif; gif.out#close) (fun gif -> header gif; logical_screen gif; proc gif) { width; height; ct; out = new output otch; }
GIF87a already allows a GIF file to contain more than one image to be displayed in the logical screen, for instance by tiling. GIF89a includes Graphic Control Extension to make cell animations out of those images, by specifying transparency and frame delay:
let graphics_control_extension transparent fps { out; _ } = let delay = if fps = 0 then 0 else (200 + fps) / (fps + fps) in out#byte 0x21; (* GIF Extension Code *) out#byte 0xf9; (* Graphic Control Label *) out#byte 0x04; (* Length of Data Sub-Block *) out#byte (match transparent with None -> 0 | Some _ -> 9); out#le16 delay; out#byte (match transparent with None -> 0 | Some c -> c); out#byte 0x00 (* Data Sub-Block Terminator *)
An image itself is introduced by an Image Descriptor giving its position and size relative to the logical screen, and optionally a Local Color Table:
let image_descriptor ct x y width height ({ out; _ } as gif) = out#byte 0x2c; out#le16 x; out#le16 y; out#le16 width; out#le16 height; match ct with | None -> out#byte 0x00 | Some ({ bits; _ } as ct) -> out#byte (0x80 lor pred bits); color_table ct gif
The image is laid out as a sequence of byte-sized pixels, in height
rows of width
bytes
, represented as a string for compactness:
let image ?ct ?transparent ?(fps=0) ?(x=0) ?(y=0) ?width ?height bytes ({ out; _ } as gif) = let w = match width with None -> gif.width | Some w -> w and h = match height with None -> gif.height | Some h -> h in if x < 0 || x + w > gif.width || y < 0 || x + h > gif.height || String.length bytes != w * h then invalid_arg "image"; let bits = match ct, gif.ct with | Some ct, _ | None , Some ct -> max 2 ct.bits | None , None -> invalid_arg "image" in (match transparent, fps with | None, 0 -> () | _ , _ -> graphics_control_extension transparent fps gif); image_descriptor ct x y w h gif; out#byte bits; out#begin_block; (new lzw ~bits out)#compress bytes; out#end_block
I validate the image dimensions and make sure that there is at least a color table applicable to this image to compute the pixel size in bits (GIF requires a minimum of 2-bit pixels even for bilevel images). If the image has a transparent color or an animation speed, it must be modified by a Graphics Control Extension block. Then I write the Image Descriptor and the Table-Based Image Data.
Another extension block allows the GIF file to include Comments:
let comment text { out; _ } = out#byte 0x21; (* GIF Extension Code *) out#byte 0xfe; (* Comment Label *) out#begin_block; out#string text; out#end_block
These can be useful for debugging, for instance. Looping is specified by a specific Application Extension Block introduced by Netscape:
let repeat ?(count=0) { out; _ } = out#byte 0x21; (* GIF Extension code *) out#byte 0xff; (* Application Extension Label *) out#byte 0x0b; (* Length of application block *) out#string "NETSCAPE"; (* Application Identifier *) out#string "2.0"; (* Appl. Authentication Code *) out#byte 0x03; (* Length of Data Sub-Block *) out#byte 0x01; (* Loop sub-block ID *) out#le16 count; (* Loop count (0 = forever) *) out#byte 0x00; (* Data Sub-Block Terminator *)
That is it. As an application, let me show you the Quasicrystal generator. First I need a grayscale palette:
let grayscale = GIF.palette (Array.init 256 (fun i -> { red = 255 - i; green = 255 - i; blue = 255 - i }))
and a way to write to a named file:
let with_output_file proc fname = unwind ~protect:close_out proc (open_out_bin fname)
The code itself is a direct port from the original:
let quasicrystal ?(nwaves=7) ?(freq=27) ?(nsteps=30) ~width ~height fname = let pi = 3.14159265358979312 in let dp = 2. *. pi /. float nsteps and dt = pi /. float nwaves and ds = 1.0 /. float (max width height) and omega = 2. *. pi *. float freq in let frame phase pixels = let o = ref 0 in let y = ref (-0.5 *. ds *. float height) in for i = 0 to height - 1 do let x = ref (-0.5 *. ds *. float width) in for j = 0 to width - 1 do let s = ref 0. in for k = 0 to nwaves - 1 do let t = dt *. float k in let b = !y *. cos t -. !x *. sin t in s := !s +. 0.5 *. (cos (b *. omega +. phase) +. 1.) done; s := !s -. 2. *. floor (0.5 *. !s); if !s > 1.0 then s := 2. -. !s; pixels.[!o] <- char_of_int (truncate (255. *. !s)); incr o; x := !x +. ds done; y := !y +. ds done in let pixels = String.create (width * height) in with_output_file (GIF.make ~ct:grayscale ~width ~height (fun gif -> GIF.repeat gif; for p = 0 to nsteps - 1 do GIF.comment (Printf.sprintf "Frame %d/%d" (p+1) nsteps) gif; frame (float p *. dp) pixels; GIF.image ~fps:25 pixels gif; done) ) fname
The last 8 lines are the ones actually dealing with the GIF output. For instance, the call:
quasicrystal ~width:256 ~height:256 ~nwaves:7 ~freq:13 ~nsteps:25 "quasicrystal.gif" ;;
creates the following image:
I hope you find the code useful.
No comments:
Post a Comment