2011-11-17

Brainfuck in Java

(I don't feel like writing a punning title.) Last time I showed how the use of functors allows for modular and type-safe specification of semantics. I wrote I can't imagine this flexibility in object-oriented languages like Java or Python; fighting words for which I was justly taken to task. Here's my penance: a morally equivalent reimplementation in Java of the Brainfuck interpreter. The translation was mechanic, but I realized with some surprise that the result is more flexible than the original OCaml: whereas the latter only allows static composition of semantics (unless you use first-class modules), Java allows fully dynamic, run-time stacking of semantics.

Even though the Java is better decoupled than the original (which uses the same memory representation for every interpreter variant), this is more by necessity than design. In the process, the interpreter suffered more than a 300% code expansion by line count at 520 lines. Bear with me! This is a single-file, multiple-class program; feel free to package and refactor it to your heart's content. The external requirements are minimal:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

As advertised, the top-level Brainfuck class parses, prints and evaluates Brainfuck programs (a List of Instructions) using fully compositional semantics:

public final class Brainfuck {
    public static void main(String[] args) {
        String hello =
">+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]"+
"<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+.";
        Semantics semantics = new S3_(5000, new S2(new S1_(new S0())));
        List<Instruction> program = Brainfuck.parse(hello);
        System.out.print(Brainfuck.print(program));
        Brainfuck.evaluate(semantics, program);
    }

The class is completely static and reentrant; the main operations delegate to appropriate visitors:

private Brainfuck() { }

    public static void evaluate(Semantics semantics, List<Instruction> program) {
        new Evaluator(semantics).evaluate(program);
    }

    public static String print(List<Instruction> program) {
        return new Printer().print(program);
    }

The parser is standard, recursive-descent but with explicitly-managed recursion. For simple (concrete) instructions it just adds the corresponding Instruction to the current program. When it sees an opening bracket it uses a Stack to hold the partially-parsed List of Instructions and starts fresh. When it encounters a closing bracket, it builds a Repeat instruction from the just-parsed program, pops the context and adds the former to the latter:

public static List<Instruction> parse(String text) {
        final Stack<List<Instruction> > stack = new Stack<List<Instruction> >();
        List<Instruction> program = new ArrayList<Instruction>();

        for (int i = 0; i != text.length(); i++)
            switch (text.charAt(i)) {
            case '>': program.add(Instruction.FWD); break;
            case '<': program.add(Instruction.BAK); break;
            case '+': program.add(Instruction.INC); break;
            case '-': program.add(Instruction.DEC); break;
            case '.': program.add(Instruction.OUT); break;
            case ',': program.add(Instruction.INP); break;
            case '[':
                stack.push(program);
                program = new ArrayList<Instruction>();
                break;
            case ']':
                if (stack.empty())
                    throw new IllegalArgumentException("unmatched `]'");
                final Instruction rep = Instruction.REP(program);
                program = stack.pop();
                program.add(rep);
                break;
            }
        if (!stack.empty())
            throw new IllegalArgumentException("unmatched `['");
        return program;
    }
}

A textbook example that completes the class. Now for the real program. An Instruction is an abstract class implementing the case class pattern à la Scala (cf Wadler's expression problem). Every instruction supports two operations, evaluation and printing, by using the visitor pattern:

abstract class Instruction {
    private Instruction() { }

    public abstract void evaluate(Evaluator evaluator);
    public abstract void print(Printer printer);

Each concrete Instruction is tucked away inside this class. They simply forward the call to the passed-in visitor. The first six are mind-numbingly straightforward:

private static final class Forward extends Instruction {
        private Forward() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateForward();
        }

        @Override
        public void print(Printer printer) {
            printer.printForward();
        }

    private static final class Backward extends Instruction {
        private Backward() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateBackward();
        }

        @Override
        public void print(Printer printer) {
            printer.printBackward();
        }

    private static final class Increment extends Instruction {
        private Increment() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateIncrement();
        }

        @Override
        public void print(Printer printer) {
            printer.printIncrement();
        }

    private static final class Decrement extends Instruction {
        private Decrement() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateDecrement();
        }

        @Override
        public void print(Printer printer) {
            printer.printDecrement();
        }

    private static final class Output extends Instruction {
        private Output() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateOutput();
        }

        @Override
        public void print(Printer printer) {
            printer.printOutput();
        }

    private static final class Input extends Instruction {
        private Input() { }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateInput();
        }

        @Override
        public void print(Printer printer) {
            printer.printInput();
        }

The last Instruction is fractionally more interesting in that it has a program as a parameter:

private static final class Repeat extends Instruction {
        private final List<Instruction> program;

        private Repeat(List<Instruction> program) {
            this.program = program;
        }

        @Override
        public void evaluate(Evaluator evaluator) {
            evaluator.evaluateRepeat(program);
        }

        @Override
        public void print(Printer printer) {
            printer.printRepeat(program);
        }

These classes are fully private to the enclosing case class Instruction; the Brainfuck interpreter uses static members to refer to each:

public static final Instruction FWD = new Forward();
    public static final Instruction BAK = new Backward();
    public static final Instruction INC = new Increment();
    public static final Instruction DEC = new Decrement();
    public static final Instruction OUT = new Output();
    public static final Instruction INP = new Input();

    public static Instruction REP(List<Instruction> program) {
        return new Repeat(program);
    }
}

This is all the (essentially boilerplate) code necessary just to reproduce an algebraic data type over which OCaml code would pattern match. The visitors implement each arm of this match, so more boilerplate follows. The first one is the Evaluator:

class Evaluator {
    private final Semantics semantics;
    private final Memory memory;

It takes a Semantics and sets itself to operate upon an initial Memory:

public Evaluator(Semantics semantics) {
        this.semantics = semantics;
        this.memory = semantics.initial();
    }

This is all a bit abstract because both Semantics and Memory are interfaces that can be plugged-in at will. To evaluate a program, it visits each Instruction in turn:

public void evaluate(List<Instruction> program) {
        for (Instruction instruction : program)
            instruction.evaluate(this);
    }

Again, this is a bog-standard visitor. Each visiting method simply delegates to the chosen Semantics with the current state of the Memory:

public void evaluateForward() {
        semantics.forward(this.memory);
    }

    public void evaluateBackward() {
        semantics.backward(this.memory);
    }

    public void evaluateIncrement() {
        semantics.increment(this.memory);
    }

    public void evaluateDecrement() {
        semantics.decrement(this.memory);
    }

    public void evaluateOutput() {
        semantics.output(this.memory);
    }

    public void evaluateInput() {
        semantics.input(this.memory);
    }

The Repeat instruction is an Action over a Memory that corresponds to evaluate-ing the repeated program:

public void evaluateRepeat(final List<Instruction> program) {
        semantics.repeat(new Action<Memory>() {
            @Override
            public void apply(Memory _unused) {
                evaluate(program);
            }
        }, this.memory);
    }
}

(Why is the Memory _unused? Because evaluate already has it in scope, but the Semantics are fully parametric with respect to the store. If you find an elegant solution to this that doesn't involve each Instruction passing the Memory around, let me know.) This is the moral equivalent of a higher-order closure; someday you might be able to write a lambda instead of that. The second visitor is the Printer:

class Printer {
    private final StringBuffer buffer = new StringBuffer(72);
    private int linelen;

It formats the Brainfuck program in neat lines of 72 characters:

public Printer() {
        linelen = 0;
    }

    private void putc(char c)  {
        if (linelen == 72 || c == '\n') {
            buffer.append('\n');
            linelen = 0;
        }
        if (c != '\n') {
            buffer.append(c);
            linelen++;
        }
    }

The visitor sets the buffer up, prints each instruction by visiting them in turn and closes the buffer:

public String print(List<Instruction> program) {
        buffer.setLength(0);
        for (Instruction instruction : program)
            instruction.print(this);
        putc('\n');
        return buffer.toString();
    }

Each visited Instruction accumulates its representation in the buffer:

public void printForward()   { putc('>'); }
    public void printBackward()  { putc('<'); }
    public void printIncrement() { putc('+'); }
    public void printDecrement() { putc('-'); }
    public void printOutput()    { putc('.'); }
    public void printInput()     { putc(','); }

Aside: It is not the responsibility of the Instruction to know its external representation since it denotes Brainfuck's abstract syntax. Do not be misled by phony invocations to the Law of Demeter.

Repeat just emits its list of Instructions between brackets by mutual recursion between it and this visitor:

public void printRepeat(List<Instruction> program) {
        putc('[');
        for (Instruction instruction : program)
            instruction.print(this);
        putc(']');
    }
}

Neat. Now for the raison d'être of the exercise: A Semantics encapsulates the effects of each Brainfuck instruction upon the Memory:

interface Semantics {
    Memory initial();
    void forward(Memory memory);
    void backward(Memory memory);
    void increment(Memory memory);
    void decrement(Memory memory);
    void output(Memory memory);
    void input(Memory memory);
    void repeat(Action<Memory> program, Memory memory);
}

The effect of a repetition depends on the program being repeated; this is represented by an Action taking the Memory as a parameter:

interface Action<T> {
    void apply(T argument);
}

Aside: Don't confuse a program as a List of Instructions with its meaning as an Action or effect. It wouldn't be appropriate for repeat to take anything else.

In object-oriented languages there are two reuse mechanisms: inheritance and composition. The first is static, compile-time and determined by the providing (i.e. library) code; the second is dynamic, run-time and controlled by the consuming (i.e. client) code. Coplien's envelope pattern allows to combine both in a sort of dynamic inheritance:

abstract class DelegatingSemantics implements Semantics {
    private final Semantics delegate;

    public DelegatingSemantics(Semantics delegate) {
        this.delegate = delegate;
    }

    @Override
    public Memory initial()              { return delegate.initial(); }

    @Override
    public void forward(Memory memory)   { delegate.forward(memory); }

    @Override
    public void backward(Memory memory)  { delegate.backward(memory); }

    @Override
    public void increment(Memory memory) { delegate.increment(memory); }

    @Override
    public void decrement(Memory memory) { delegate.decrement(memory); }

    @Override
    public void output(Memory memory)    { delegate.output(memory); }

    @Override
    public void input(Memory memory)     { delegate.input(memory); }

    @Override
    public void repeat(Action<Memory> program, Memory memory) {
        delegate.repeat(program, memory);
    }
}

Envelopes take a delegating letter to which to forward by default the implemented methods; subclasses need only override the methods of interest and let the envelope handle the rest. The initial semantics S0 is, however, fully concrete. It operates on an unbounded memory of machine integers:

class S0 implements Semantics {
    @Override
    public Memory initial()              { return new UnboundedMemory(); }

    @Override
    public void forward(Memory memory)   { memory.next(); }

    @Override
    public void backward(Memory memory)  { memory.prev(); }

    @Override
    public void increment(Memory memory) { memory.set(memory.get() + 1); }

    @Override
    public void decrement(Memory memory) { memory.set(memory.get() - 1); }

    @Override
    public void output(Memory memory) {
        System.out.print((char) (memory.get() & 255));
    }

The input behavior on EOF is to leave the memory untouched:

@Override
    public void input(Memory memory) {
        try {
            final int c = System.in.read();
            if (c != -1) memory.set(c);
        } catch (IOException e) {
            System.err.println(e);
        }
    }

The meaning of repeat is to execute the enclosed program until the current memory cell is zero:

@Override
    public void repeat(Action<Memory> program, Memory memory) {
        int c;
        while ( (c = memory.get()) != 0 )
            program.apply(memory);
    }
}

This interpretation of repeat will probably never change (but consider a non-Turing-complete version of Brainfuck with bounded repetition, e.g. for server-side Core Wars competitions). The first difference, however, is on input:

class S1 extends DelegatingSemantics {
    public S1(Semantics delegate) { super(delegate); }

    @Override
    public void input(Memory memory) {
        try {
            final int c = System.in.read();
            memory.set(c == -1 ? 0 : c);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

The S1 interpretation overrides a Semantics so that it sets the current cell to zero on EOF. The other variant is to set it to -1:

class S1_ extends DelegatingSemantics {
    public S1_(Semantics delegate) { super(delegate); }

    @Override
    public void input(Memory memory) {
        try {
            final int c = System.in.read();
            memory.set(c);
        } catch (IOException e) {
            System.err.println(e);
        }
    }
}

The second difference is in the memory cell size, in this case 8-bit-wide:

class S2 extends DelegatingSemantics {
    public S2(Semantics delegate) { super(delegate); }

    @Override
    public void increment(Memory memory) {
        memory.set((memory.get() + 1) & 255);
    }

    @Override
    public void decrement(Memory memory) {
        memory.set((memory.get() - 1) & 255);
    }
}

Although it uses the delegate Semantics's memory, it truncates it on mutation. The third difference is in the organization of the Memory itself:

class S3 extends DelegatingSemantics {
    protected final int length;

    public S3(int length, Semantics delegate) {
        super(delegate);
        this.length = length;
    }

    @Override
    public Memory initial() { return new BoundedMemory(length); }
}

It uses a bounded memory of the size provided, which disallows moving past its bounds. A variation is to wrap around the current cell when reaching either end:

class S3_ extends DelegatingSemantics {
    protected final int length;

    public S3_(int length, Semantics delegate) {
        super(delegate);
        this.length = length;
    }

    @Override
    public Memory initial() { return new CircularMemory(length); }
}

The same code except for the underlying store selected. The Memory interface itself is simple:

interface Memory {
    int get();
    void set(int value);
    void next();
    void prev();
}

The only operations on it are getting and setting the current cell's value and moving it one position in each direction. The implementation of an UnboundedMemory is, however, a bit sophisticated. It uses the standard Turing Machine trick of simulating a doubly-infinite tape with two semi-infinite ones:

class UnboundedMemory implements Memory {
    private final List<Integer> left  = new ArrayList<Integer>();
    private final List<Integer> right = new ArrayList<Integer>();
    private int cursor;

The right tape contains cells with index 0, 1, 2… while the left tape contains cells indexed by -1, -2, -3… Initially only the zero-th cell exists:

public UnboundedMemory() {
        right.add(0);
        cursor = 0;
    }

The class maintains the invariant that the current cell, to which cursor points, always exists:

@Override
    public int get() {
        if (cursor < 0)
            return left.get(-1 - cursor);
        else
            return right.get(cursor);
    }

    @Override
    public void set(int value) {
        if (cursor < 0)
            left.set(-1 - cursor, value);
        else
            right.set(cursor, value);
    }

New cells are created on demand on the appropriate tape upon cursor movement:

@Override
    public void next() {
        cursor++;
        if (cursor >= 0 && right.size() == cursor)
            right.add(0);
    }

    @Override
    public void prev() {
        --cursor;
        if (cursor < 0 && left.size() == -1 - cursor)
            left.add(0);
    }
}

A BoundedMemory is, by contrast, much simpler:

class BoundedMemory implements Memory {
    protected final int[] tape;
    protected int cursor;

    public BoundedMemory(int length) {
        this.tape = new int[length];
        cursor = 0;
    }

    @Override
    public int get() {
        return tape[cursor];
    }

    @Override
    public void set(int value) {
        tape[cursor] = value;
    }

The only intelligence required is in avoiding moving the cursor outside its bounds:

@Override
    public void next() {
        if (cursor != tape.length - 1)
            cursor++;
    }

    @Override
    public void prev() {
        if (cursor != 0)
            --cursor;
    }
}

A CircularMemory is just a BoundedMemory with a different out-of-bounds behavior:

class CircularMemory extends BoundedMemory {
    public CircularMemory(int length) {
        super(length);
    }

    @Override
    public void next() {
        cursor++;
        if (cursor == tape.length)
            cursor = 0;
    }

    @Override
    public void prev() {
        if (cursor == 0)
            cursor = tape.length;
        --cursor;
    }
}

tl; dr: writing interpreters in Java is tedious. Interestingly enough, I find the object-oriented expression of the Semantics over the different types of Memory very natural, more so even than what modular programming allows. The downsides I find, at least for Java, are two:

  1. The sheer amount of boilerplate necessary to express the problem. Even allowing for the lack of closures, algebraic data types and pattern matching, it lacks an automatic delegation mechanism. This is actually a shortcoming of every mainstream object-oriented language (and no, metaprogramming is not a substitute for a language construct that can be checked and verified statically for a number of properties that run-time reflection would not)
  2. The need to name everything. Nominal type disciplines not only suck on principle, they actually encourage naming trivial program constructs (classes, interfaces, methods) that therefore take an artificial existence of their own, bloating (module) interfaces and opening new ports for coupling

Still with me? Let me thank you and express my admiration for your fortitude.

2011-11-11

Modular Semantics for Brainfuck

The problem with Brainfuck is that its semantics are given by its different implementations but not all its implementations agree so that writing an interpreter is straightforward but writing portable Brainfuck programs is not. OCaml functors allows playing with pluggable semantics in a way that would be very difficult laborious with other languages. I can't imagine this flexibility in object-oriented languages like Java or Python, but in OCaml the mix-and-match approach with module inclusion is straightforward.

Update: OK, so reddit justly challenged my grandstanding. The moral equivalent of this typesafe, modular, extensible interpreter requires more than 520 tight lines, of which 270 go to the parser, pretty-printer and the evaluator (two visitors over 7 case classes), and 250 go to the plug-in semantics. The end result is more flexible than the following OCaml code, since I can choose the composition dynamically, at run-time:

public static void main(String[] args) {
  String hello =
">+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]"+
"<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+.";

  Semantics s = new S3_(5000, new S2(new S1_(new S0())));
  Brainfuck bf = new Brainfuck(s);
  bf.evaluate(bf.parse(hello), s.initial());
}

The key is using delegation instead of inheritance to override behavior and spending six hours typing code like a madman.

The language is defined by the effect that its eight instructions have on a memory:

module type SEMANTICS = sig
  type mem
  val mem       : mem
  val forward   : mem -> mem
  val backward  : mem -> mem
  val increment : mem -> mem
  val decrement : mem -> mem
  val output    : mem -> mem
  val input     : mem -> mem
  val repeat    : (mem -> mem) -> (mem -> mem)
end

The initial state of the memory is given by mem. The next six functions correspond directly to the six instructions > < + - . , while the operational semantics of each of the brackets is consolidated into a single small-step function, repeat, so that for a properly balanced Brainfuck program [P] its effect on mem is repeat P mem.

Given a SEMANTICS, the interpretation of a Brainfuck program is straightforward:

module Brainfuck (S : SEMANTICS) = struct

The program is represented by an abstract syntax comprising of a sequence of instructions mapping one-to-one to the surface syntax of the language, except for the pair of brackets that represents a single unit of repetition:

type instr = Fwd | Bak | Inc | Dec | Out | Inp | Rep of program
  and program = instr list

Note: The fact that properly balanced [ and ] denote a repeating computation can be taken as a definition. The translation from concrete syntax to abstract syntax ensures that the mapping is faithful and the operational semantics given for [ and ] as separate instructions corresponds to the one for Rep.

Since the semantics of each instruction is a function from memory to memory, the semantics of a program is the left-to-right composition of the semantics of the individual instructions comprising it:

let rec eval p t = List.fold_left (fun t -> function
  | Fwd   -> S.forward   t
  | Bak   -> S.backward  t
  | Inc   -> S.increment t
  | Dec   -> S.decrement t
  | Out   -> S.output    t
  | Inp   -> S.input     t
  | Rep p -> S.repeat (eval p) t) t p

The translation from concrete to abstract syntax is given by a parser:

let parse str =
    let syntax_err fmt = Printf.kprintf failwith fmt in
    let res = ref []
    and stk = ref [] in
    let emit  i  = res := i :: !res
    and enter () = stk := !res :: !stk; res := []
    and leave () = match !stk with
    | r :: s -> stk := s; res := Rep (List.rev !res) :: r
    | []     -> syntax_err "unmatched `]'"
    in String.iter (function
    | '>' -> emit Fwd
    | '<' -> emit Bak
    | '+' -> emit Inc
    | '-' -> emit Dec
    | '.' -> emit Out
    | ',' -> emit Inp
    | '[' -> enter ()
    | ']' -> leave ()
    |  _  -> ()) str;
    if !stk != [] then syntax_err "unmatched `['" else
    List.rev !res

Only the concrete instructions > < + - . , [ and ] have meaning in a program; all other possible characters are ignored and [ and ] must be properly balanced. An adjoint pretty-printer or formatter translates abstract syntax back to concrete, such that for all abstract programs P, parse (format P) ≡ P:

let format prog =
    let buf = Buffer.create 72
    and len = ref 0 in
    let putc c =
      if !len == 72 || c == '\n' then begin
        Buffer.add_char buf '\n';
        len := 0
      end;
      if c != '\n' then begin
        Buffer.add_char buf c;
        incr len
      end
    in
    let rec go prog = List.iter (function
    | Fwd   -> putc '>'
    | Bak   -> putc '<'
    | Inc   -> putc '+'
    | Dec   -> putc '-'
    | Out   -> putc '.'
    | Inp   -> putc ','
    | Rep p -> putc '['; go p; putc ']') prog
    in go prog; putc '\n'; Buffer.contents buf
end

(The proof of this last property would go by mutual induction on format and parse if it were not that they are imperative; the conversion to pure recursive form is straightforward and mechanical but left as an exercise to the reader.) It is time to give concrete semantics for each of the instructions.

The simplest possible store is a tape of memory cells which can be conveniently represented by a zipper:

type tape = Tape of int list * int * int list

The initial memory has an infinite number of zeros:

module S0 = struct
  type mem = tape

  let mem =
    let rec zeros = 0 :: zeros in
    Tape (zeros, 0, zeros)

Advancing and retreating the cell pointer simply moves the zipper, since both ends are infinite:

let forward  (Tape (ls, c, r :: rs)) = Tape (c :: ls, r, rs)
  let backward (Tape (l :: ls, c, rs)) = Tape (ls, l, c :: rs)

(The cyclic lists at both ends can be considered as sentinel nodes.) The simplest representation for a cell is as the underlying machine integer:

let increment (Mem (ls, c, rs)) = Mem (ls, c + 1, rs)
  let decrement (Mem (ls, c, rs)) = Mem (ls, c - 1, rs)

Character output is consistently among implementations just the output to console of the cell value as an ASCII code:

let output (Mem (_, c, _) as t) =
    output_char stdout (char_of_int (c land 255));
    t

Character input differs on the treatment of end-of-file; probably the most general behavior is to not modify the cell on EOF:

let input (Mem (ls, c, rs) as t) =
    let c = try int_of_char (input_char stdin) with End_of_file -> c in
    Tape (ls, c, rs)

Repetition is the effect of p whenever the current cell is not zero:

let rec repeat p t = match t with
  | Mem (_, 0, _) -> t
  | _             -> repeat p (p t)
end

One variation between interpreters is the EOF behavior. Some of them set the current cell to 0 on EOF:

module S1 (S : SEMANTICS with type mem = tape) = struct
  include S

  let input (Tape (ls, c, rs)) =
    let c = try int_of_char (input_char stdin) with End_of_file -> 0 in
    Tape (ls, c, rs)
end

Others set it to -1:

module S1' (S : SEMANTICS with type mem = tape) = struct
  include S

  let input (Tape (ls, c, rs)) =
    let c = try int_of_char (input_char stdin) with End_of_file -> -1 in
    Tape (ls, c, rs)
end

Another variation between interpreters is the cell width in bytes. The standard reference interpreter uses eight-bit cells:

module S2 (S : SEMANTICS with type mem = tape) = struct
  include S

  let increment (Tape (ls, c, rs)) = Tape (ls, (c + 1) land 255, rs)
  let decrement (Tape (ls, c, rs)) = Tape (ls, (c - 1) land 255, rs)
end

(To avoid redefining all functions I simply mask off excess bits from the native integer representation.) Yet another possible variation is to have a fixed-size tape. What to do when trying to move the cell pointer past the ends is a free choice; I could opt for simply ignoring the attempted movement:

let fill v =
  let rec go l n =
    if n == 0 then l else
    go (v :: l) (pred n)
  in go []

module S3 (N : sig val size : int end)
          (S : SEMANTICS with type mem = tape) = struct
  include S

  let mem = Tape ([], 0, fill 0 N.size)

  let forward  t = match t with
  | Tape (ls, c, r :: rs) -> Tape (c :: ls, r, rs)
  | Tape (_ , _, []     ) -> t

  let backward t = match t with
  | Tape (l :: ls, c, rs) -> Tape (ls, l, c :: rs)
  | Tape ([]     , _, _ ) -> t
end

The other alternative is to wrap the cell pointer at either end:

module S3' (N : sig val size : int end)
           (S : SEMANTICS with type mem = tape) = struct
  include S3 (N) (S)

  let forward  t = match t with
  | Tape (ls, c, r :: rs) -> Tape (c :: ls, r, rs)
  | Tape (ls, c, []     ) ->
    let r :: rs = List.rev (c :: ls) in
    Tape ([], r, rs)

  let backward t = match t with
  | Tape (l :: ls, c, rs) -> Tape (ls, l, c :: rs)
  | Tape ([]     , c, rs) ->
    let l :: ls = List.rev (c :: rs) in
    Tape (ls, l, [])
end

Now the choice of semantics is a matter of stacking the desired functors:

let hello = "
>+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++[<++++>-]
<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+."

let _ =
  let module S  = S3'(struct let size = 5000 end)(S2(S1(S0))) in
  let module BF = Brainfuck(S) in
  BF.(eval (parse hello)) S.mem

This makes BF a Brainfuck interpreter running on a bounded memory of 5.000 byte-sized cells with 0 on EOF.