OCaml Planet

May 25, 2015


ICFP programming contest 2014: a retrospective (part 2/2)

This is the second post on the Gagallium participation to the ICFP programming context last year -- 2014. See the first post for the introduction and a link to the source code; this post details the design of the ghost code.

The ghost we submitted is the last of a series of increasingly sophisticated designs, each of which is an incremental enhancement of the previous one. Given the short amount of time available, we did not do a rewrite from scratch, no matter how much we wanted to. This method let us make a pretty good ghost with a small amount of work (about 3 man-days) but it's not without pitfalls: because serious testing was not an option, some huge bugs stayed undiscovered for quite some time.

Like almost everyone in the competition, we wrote the ghosts directly in assembly code, with only a small tool to make the GHC assembly language more palatable: a macro-processor that implements symbolic labels (for jumps) and names (for arguments). The first feature allowed us to retain our sanity when inserting and deleting instructions in the code, while the second one let us use names for our variables stored in memory.


Our first version is called stupid. It starts with 6 steps up, 6 steps right, 6 steps down, 6 steps left (in an attempt to get out of the initial box) then it always tries to move toward lambda-man. It restarts this algorithm every time it gets eaten.

To implement it, we had to solve two interesting problems: 1. Detect when the ghost gets eaten or a new round has started. 2. Find out the direction closest to lambda-man's position.

Detecting restarts

Detecting when the ghost gets eaten or a new round is started is almost easy: just detect any sudden jump in position that is inconsistent with normal movement. This is done by getting the current position in (a,b) and having the previous position stored in ([1],[2]). We subtract [1] from a and [2] from b, which yields two numbers in {-1, 0, 1}. Adding them together yields either -1 or 1 because one of them must be zero and the other nonzero (the ghost has moved either vertically or horizontally). Adding one yields 0 or 2. Any other value indicates that the ghost's position has jumped suddenly. An unsigned comparison with 3 is enough to check this condition. This algorithm is not exact: it will fail whenever the ghost gets eaten at a position very close to its starting point. We decided we could live with this imprecision.

Computing the direction

Computing the direction of lambda-man is a bit harder as it involves trigonometry. Since GHC lacks instructions to compute sines, cosines, tangents, and the like, we do it with addition, subtraction, and exclusive-or instead.

We start by defining an angle alpha, encoded in units of 45 degrees. Alpha will be the middle of the quadrant where lambda-man is found (1, 3, 5, or 7).

The origin and orientation for alpha are the ones used by the INT 0 instruction: 0 is up, 1 is upper-right, 2 is right, etc.

+-----------> X
|   7  0  1
|    \ | /
|  6 - . - 2
|    / | \
|   5  4  3

Consider dx = lambdaman.x - ghost.x and dy = lambdaman.y - ghost.y

The correct answer depends on the signs of dx and dy (to determine in which quadrant lambda-man is) and the comparison of |dx| and |dy| (to determine which octant in this quadrant). The following table enumerates all the cases:

dx sign | dy sign | |dx|>|dy| | direction
   +    |    +    |     T     |  2 = 3-1
   +    |    +    |     F     |  4 = 3+1
   +    |    -    |     T     |  2 = 1+1
   +    |    -    |     F     |  0 = 1-1
   -    |    +    |     T     |  6 = 5+1
   -    |    +    |     F     |  4 = 5-1
   -    |    -    |     T     |  6 = 7-1
   -    |    -    |     F     |  0 = 7+1 (modulo 8)

The result can be decomposed in two values: alpha (1, 3, 5 or 7) and phi (+1 or -1). alpha can be computed by:

  alpha = 1 + (6 if dx is negative) XOR (2 if dy is positive)

and phi is simply a XOR of all three inputs: it is the product of the three signs of the input, taking - for T and + for F. The code simply computes alpha, phi, and the absolute values simultaneously, by testing the signs of dx and dy only once. Then just dividing the result by two yields the direction of lambda-man.

We checked by hand that (as expected because the functions are continuous) we get correct results when lambda-man lies exactly in one of our 8 directions (i.e. when a sign is 0 or |dx| is equal to |dy|).

In retrospect, we could probably get shorter code by simply implementing:

  if |dx| > |dy| then
    if dx positive then 2 else 6
    if dy positive then 4 else 0

Panic mode

To complete the ghost's code, it tests the "fright mode" flag returned by interrupt 6, and reverses the direction if it's in fright mode, thus fleeing away from lambda-man instead of chasing him.


Idiot is a very small variation on the Stupid ghost: it just tries to avoid banging its head on walls. If there is a wall in the direction chosen by Stupid, Idiot will try the other side of the quadrant where lambda-man lies.

This new direction is easy to compute from the outputs of the Stupid code: the primary direction (the direction chosen by Stupid) is alpha + phi and the secondary direction (the other side of the quadrant) is alpha - phi.

To have a look at the square that lies in a given direction, we used our first subroutine. Since the GHC processor doesn't have a stack, we had to manage the return address by hand, storing it in a global variable.


The Smart ghost uses a trick from the ghosts in the real PacMan game: alternate between scatter and chase. Each ghost maintains a "clock" by counting its execution cycles, and in 32 cycles out of 128, the ghost goes into "scatter" mode. In this mode, it sets a target point that depends on the ghost number (a different corner of the map for each ghost). Otherwise, it is in "chase" mode, where the target point is the coordinates of lambda-man.

The goal of this system is to avoid having all the ghosts chasing behind lambda-man without ever blocking his path. By scattering from time to time, the ghosts have a better chance of surrounding lambda-man at some point.

Note that the scatter modes of the ghosts start synchronized, but their clocks run at different speeds, so they will become unsynchronized as the game progresses. In addition, there is no way to know the size of the map, so we used an approximate notion of "corner of the map".

Finally, when Smart is in panic mode (i.e. when lambda-man has eaten a power pill and is chasing the ghost) the ghost will run the scatter algorithm rather than just fleeing in the direction opposite to lambda-man. This is to disperse the ghosts, thus maximising the probability that some of them will escape lambda-man.


Brilliant is the ghost that adds the most important feature: before going into a given direction, it follows the corresponding corridor until it finds one of three things:

  1. lambda-man
  2. an intersection or a fork
  3. a dead end

In cases 1 and 2, the ghost goes into that direction, while in case 3 it will chose another direction: there's no point in going into a dead-end unless lambda-man is there.

Unlike its predecessors, Brilliant will not only test the principal direction to avoid banging its head on walls, it also tests it to make sure it is not a U-turn (the rules of the game forbid U-turns for ghosts, except at the bottom of a dead end). This is done at lines 146 to 178: if there is a wall in the primary direction, or the primary direction is a U-turn, or it's a dead-end without lambda-man, then take the secondary direction.

At least that's the theory, because there is a nasty bug: after following the corridor, Brilliant will just ignore this information and take the principal direction anyway... not really brilliant after all!

To follow a corridor from a starting point and a direction, we first make a step in that direction (lines 193 to 196) then look in the three allowed direction (excluding a U-turn) and count the number of non-wall tiles in %good_dir_num, recording one of them in %good_dir_val.

Line 220 has a nasty bug: it should jump to lwalling1 rather than lwalling2. As a result, the second direction (90 degrees clockwise from the current direction) is never tested and the whole corridor-following code seldom returns a correct result.


Genius is a small but important variation on the Brilliant code: when following the corridor, check for the presence of other ghosts in front of the current ghost. If there is already a ghost going down that corridor, avoid entering it.

This means that the ghosts will naturally disperse and try alternate routes to lambda-man, thus cutting of more of his escape paths.

There is a difficulty here: the ghost doesn't know how many other ghosts there are, so it cannot query their position. We solved it by only testing for lower-numbered ghosts, that is ghost 0 ignores all other ghosts, ghost 1 only tests for presence of ghost 0, etc. It works pretty well because lower-numbered ghosts walk faster, so they will tend to be in front anyway.


Since Genius tends to scatter naturally around lambda-man, it was natural to try a version that didn't alternate between scatter and chase phases, but did chase all the time (except when in panic mode, obviously). This is what Genius2 does; otherwise it is identical to Genius.

by Damien Doligez at May 25, 2015 08:00 AM

May 24, 2015

Shayne Fletcher

Church Numerals

This is just a little fun. Jason Hickey in "Introduction to Objective Caml" poses some little end of chapter problems to define arithmetic operations for a type of unary (base-1) natural numbers. The type is

type num = Z | S of num
where Z represents the number zero and if i is a unary number, then S i is i + 1.

This formulation of Church numerals using a recursive type and pattern matching means in truth, the problems can be solved in less than 5 minutes or so. Of course, the real Church numerals are numbers encoded in functions

  • $c_{0} = \lambda s.\lambda z.\;z$
  • $c_{1} = \lambda s.\lambda z.\;s\;z$
  • $c_{2} = \lambda s.\lambda z.\;s\;(s\;z)$
  • $c_{3} = \lambda s.\lambda z.\;s\;(s\;(s\;z))$
  • $\cdots$
and their represenation including arithmetic operations can be formulated in OCaml too (and it's a good exercise but harder than what we are going to do here -- if you'd like to see more about that, have a look at this Cornell lecture).

Alright, without further ado, here we go then.

type num = Z | S of num

let scc (x : num) : num = S x
let prd : num -> num = function | S n -> n | _ -> Z

let rec add (x : num) (y : num) : num =
match (x, y) with
| (Z, _) -> y
| (_, Z) -> x
| (S m, n) -> scc (add m n)

let rec sub (x : num) (y : num) : num =
match (x, y) with
| (Z, _) -> Z
| (n, Z) -> n
| (S m, n) -> sub m (prd n)

let rec mul (x : num) (y : num) : num =
match (x, y) with
| (Z, _) -> Z
| (_, Z) -> Z
| (S Z, x) -> x
| (x, S Z) -> x
| (S m, n) -> add (mul m n) n

let rec to_int : num -> int = function | Z -> 0 | S n -> 1 + to_int n
let rec from_int (x : int) = if x = 0 then Z else scc (from_int (x - 1))
For example, in the top-level we can write things like,

# to_int (mul (sub (from_int 23) (from_int 11)) (from_int 2));;
- : int = 24

The main thing I find fun about this little program though is how obvious its mapping to C++. Of course you need a discriminated union type my default choice being boost::variant<> (by the way, standardization of a variant type for C++ is very much under active discussion and development, see N4450 for example from April this year - it seems that support for building recursive types might not be explicitly provided though... That would be a shame in my opinion and if that's the case, I beg the relevant parties to reconsider!).

#include <boost/variant.hpp>
#include <boost/variant/apply_visitor.hpp>

#include <stdexcept>
#include <iostream>

struct Z;
struct S;

typedef boost::variant<Z, boost::recursive_wrapper<S>> num;

struct Z {};
struct S { num i; };

int to_int (num const& i);

struct to_int_visitor
: boost::static_visitor<int> {
int operator ()(Z const& n) const { return 0; }
int operator ()(S const& n) const { return 1 + to_int (n.i); }

int to_int (num const& i) {
return boost::apply_visitor (to_int_visitor (), i);

num from_int (int i) {
if (i == 0){
return Z {};
return S {from_int (i - 1)};

num add (num l, num r);

struct add_visitor : boost::static_visitor<num> {
num operator () (Z, S s) const { return s; }
num operator () (S s, Z) const { return s; }
num operator () (Z, Z) const { return Z {}; }
num operator () (S s, S t) const { return S { add (s.i, t) }; }

num add (num l, num r) {
return boost::apply_visitor (add_visitor (), l, r);

num succ (num x) { return S{x}; }

struct prd_visitor : boost::static_visitor<num>{
num operator () (Z z) const { return z; }
num operator () (S s) const { return s.i; }

num prd (num x) {
return boost::apply_visitor(prd_visitor (), x);

num sub (num x, num y);

struct sub_visitor : boost::static_visitor<num> {
num operator () (Z, Z) const { return Z {}; }
num operator () (Z, S) const { return Z {}; }
num operator () (S m, Z) const { return m; }
num operator () (S m, S n) const { return sub (m.i, prd (n)); }

num sub (num x, num y) {
return boost::apply_visitor (sub_visitor (), x, y);


int main () {

num zero = Z {};
num one = succ (zero);
num two = succ (succ (zero));
num three = succ (succ (succ (zero)));

std::cout << to_int (add (two, three)) << std::endl;
std::cout << to_int (sub (from_int (23), from_int (12))) << std::endl;

return 0;
I didn't get around to implementing mul in the above. Consider it an "exercise for the reader"!

by Shayne Fletcher (noreply@blogger.com) at May 24, 2015 08:13 PM

May 23, 2015

Daniel Bünzli

On the book « More OCaml »

I eventually read through John Whitington's latest book « More OCaml. » It's again a very good one, I made a small review here.

May 23, 2015 03:17 PM

May 22, 2015


ICFP programming contest 2014: a retrospective (part 1/2)

The dates for the 2015 ICFP programming contest have been announced on the contest webpage. This is a good occasion to dig out a summary of our 2014 participation, that had been written over the span of several months (mostly right after the contest), and that we had unexplainably failed to publish before.

In 2013, some Gallium members (Thibaut Balabonski, Thomas Braibant, Jacques-Henri Jourdan and Gabriel Scherer) participated to the ICFP programming contest, and it was a lot of fun. We participated again in 2014, with a bit less gallium in the team (blame the precarity of post-docs, and some badly-chosen dates for a honeymoon) and a skewed alphabetic distribution: François Bobot, Pierre Boutillier, Thomas Braibant, Damien Doligez and Gabriel Scherer.

This (double) blog post was written over the course of the last ~5 months by three different persons. With hindsigth, we could probably have phrased some of the post differently. We believe that this blog post is best enjoyed as a pot-pourri of our thoughts, much like our ragtag submission.

The subject of the 2014 contest was the following: a computer game called Lambda-Man, a variation on Pacman, running on an arcane (simulated) computer architecture with heterogeneous processors. We were asked to implement a Lambda-man AI (the player character) on a Lisp-machine-inspired processor architecture, using a functional but relatively low-level programming language, and to write ghost programs (the attackers) on very low-level co-processors, using a low-level assembly-like language. The organizers funnily enough called the functional low-level language "Gcc", and the Ghost language "Ghc".

Because it would be too long to be read comfortably in one shot, we split the document in two posts: we will now mostly focus on the Lambda-Man AI, and describe the ghost in a later post.

Our broad design

We wrote a Ghost in ghost machine code, using a very thin assembler to have named labels and simple constants. The ghost AI, due to Damien and Thomas, is relatively elaborate and is probably the best part of our submission (Editor's note: the judges specifically refered to our ghost as the best of the competition. See the annoucement.).

For the Lambda-man AI, we wrote a compiler from OCaml (what else?) to the Lisp machine. Specifically, we use the OCaml compiler API (as exposed in the not-well-documented compiler-libs ocamlfind package) to get the "lambda" representation to the Lisp machine.

"Lambda" is the representation used just after the compilation of pattern-matching, the choice of data representation, and a couple of very simple type-directed specializations; it is an untyped representation that is rather close to Lisp, and was thus a fitting choice. In particular, lower-level representations (bytecode or Cmm) have already performed closure conversion, which was unnecessary as our backend supports closures. You can inspect the lambda-representation of an OCaml module foo.ml by running ocamlc -dlambda -c foo.ml.

We then wrote the Lambda-man AI using OCaml, in the subset that our compiler supports (more on this in the "# Compiler limitations" section). The AI is rather simple; a problem in our team (and I guess many other teams) is that everyone was quite excited at writing a compiler, but nobody was really fond of writing AIs, so we delayed this part to the very last day of the contest.

PS: We spent quite a few hours implementing a simulator (interpreters for Gcc and Ghc, and game rules engine), but we couldn't make much use of them in the end. Time would probably have been better spent experimenting with the Javascript API directly from the start. On the other hand, we kept the option open to actually simulate other's ghosts in our lambda-man, which would have made use of our Ghc simulator -- written in OCaml; we didn't have the time budget to make that work, though.

In retrospect, I think that reusing/porting/hacking an existing language (OCaml) was a great choice, because it allowed us to test our IA code not only on the organizers' simulator, but also by compiling it as an usual OCaml program with the usual OCaml toolchain, which has less bugs (than our own compiler backend) and is much more efficient. This saved our sanity of mind many times.

The lambda-man AI

We iterated quite a lot on the lambda-man AI in the last few hours of the contest. One problem is that we had a lot of fun with the ghosts and the compiler, and we delayed working on the main part of the contest till the end. Here are the main features of our lambda-man, which was mostly implemented by Thomas with help from Gabriel:

  • at initialisation time, it computes the graph structure of the map (seen as a vector of vector of possible directions); this makes it possible to have a more efficient main loop in the time constrained part of the game.

  • our main loop is a depth-first search that computes the best possible path of a given length, based on a few heuristics. Broadly speaking, the value of a path is the sum of the value of the pills that are eaten along the way, plus an approximation of the value of the ghosts that are eaten, minus a huge value in case our lambda-man gets eaten. One nice thing is that Gabriel implemented a function that predicts the possible positions of the ghosts in the next few steps, which helps take good decisions in tight situations (rather than flee blindly if a ghost comes nearby).

  • since this DFS is quite costly and cannot be used to find a long path, we had to put a failsafe mechanism in place that tries to find a pill if no "good" path is available.

  • finally, our lambda-man "remembers" its decisions and try to keep its path, unless its path is blocked by a ghost or a ghost is too close.

There are quite a few problems with this basic solution. For instance, our lambda-man will quite often let a pill alone (because its AI forbids backtracking). Then, it does not try to maximise the use of power pills (and does not even give the proper value to the ghosts eaten in succession). Also, it does not computes the proper timing of the moves of the ghosts -- each ghost move at its own pace, and keeping track of this situation might be useful in tight situations.

Finally, in the heat of the last fifteen minutes, Thomas added (amond other changes) the Fruit line in the part of our code that computes the value of a path:

let content = get2 map pos in
let value, fright  =
  match content with
    | Pill -> value + 10, fright
    | Power_pill -> value + 50, fright + 20
    | Fruit -> value + 15, fright (* Blunder *)
    | _ -> value, fright

This is a real bad idea, and it was obvious soon enough. Problem is we ran out of time to correct it ("Never try to update your submission in the last 120 seconds of the contest"). The problem is that Fruit indicates a cell of the map where the fruit may be, not the position of the fruit updated in real-time. It turns out that this single line makes our lambda-man perform quite badly, because it might try to run in circle around the Fruit cell. There are plenty of one-line changes that could solve or mitigate this issue: e.g., removing the line, or changing this 15 value into 5, or just checking that the fruit is indeed present at the time of the evaluation. Too bad.

The OCaml-to-Lisp-machine compiler

The compiler, written mainly by François (which first experimented the backend by hacking on the miniML compiler of Andrej Bauer's PL zoo, thanks Andrej!), only implements the bare minimum we need to write our AIs. We made two design choices:

  1. We would only implement the minimum needed; so we have no modules, no polymorphic variants (or at least we didn't check that they were supported), and even our compilation of pattern-matching was a bit lousy: the OCaml compiler will generate various kind of code depending on whether it thinks, for example, that binary tests or jump tables are the best way to test something, and we only added support for the various lambda-code instruction as our AI examples revealed a need for. It's a compiler that is meant to be used in tandem with a backend hacker, implementing (or fixing) the features as your need evolve.

  2. We wanted the OCaml datatypes to be compiled in such a way that we would need no separate FFI layer to talk with the values given to us, in Lisp-land, by the game engine. At the beginning of each turn the lambda-man AI is passed a big tuple of linked lists that represents the map, each, and we wanted those to fit exactly the representation of some OCaml datatype. Another option would be to represent the OCaml datatypes however we like, and add the FFI types as primitives that can only be inspected through special compiler primitives.

Those two choices are inter-linked: representing OCaml records as Lisp tuples (our goal) is harder than as Lisp lists or Lisp closures (did you notice how frame environments are the only memory structure in this Lisp machine that allows reasonably compact storage of data?), because the access pattern for the very last element is not the same as the others. We ended up without records, but with good support for OCaml tuples (implemented as Lisp tuples).

Another OCaml-using team (the OCamlpro guys) made the different design choice of having the FFI as compiler primitives, and also implemented a more complete support for the rest of the OCaml language -- we'll let them describe their solution if they want to.

Data representation tricks

We initially wished to have a valid OCaml representation for Lisp lists (either 0 for the empty list or a cons-cell (x . xs) for... a cons cell x::xsq), and arrange our compilation of OCaml tuples and records to be exactly Lisp tuples (eg. (a,b,c) is (a . (b . c)), note how there is no end-marker). This is all that was needed to talk natively with the world representation passed to our Lambda-AI. We ended up with no records, OCaml tuples that were exactly Lisp tuples and, surprisingly, OCaml lists that mapped directly to Lisp lists (we would have been happy with having to define a hand-made OCaml datatype).

Focusing on tuples for now, we faced two independent issues:

  • Constructor tags: OCaml variants (sums) and tuples/records (products) come with a header word that contains some GC-specific information and a "constructor tag"; it always is 0 for tuples and records, and for variants it indicates the constructor. For example with the datatype

    type foo =
     | A
     | B
     | C of foo
     | D of bar * baz

    A and B are represented as (tagged) integers, and the heap-allocated blocks are C x (tag 0, one parameter) and D (y, z) (tag 1, two parameters).

    Directly translating this to cons-cells would give to the OCaml tuple (a, b, c) the representation (0 . (a . (b . c))), which is not good to talk to the FFI (of course we could convert first, but constraints sometime provoke creativity).

  • Irregular access: with the Lisp representation, the access pattern for the third-element of a triplet is distinct from the first or second. Unfortunately, the lambda-representation of field access in the OCaml compiler tells us which field to access, but not the total length of the structure (when it is statically known; the same field-access instruction is used for arrays and strings). We would need to compile getfield foo 3 different if foo has 3 fields, or 4 or more, and we don't know which it is.

Solving the irregular access

Thomas had a cool idea to solve the irregular access problem: let's preprocess the OCaml source-code to replace each n-tuple by a composition of 2-tuples, just as in the Lisp representation: (a, b, c) becomes (a, (b, c)), and we gain the invariant that field accesses are always to index 0 and 1, which map directly to car and cdr.

In theory this also works for records: translate record types to tuples and, on record field selection, use the type information associated to the field to know which field number it is, and translate that accordingly. (OCaml 4.00 supports distinct record types sharing field names, so you really need typing information here.) It works in theory, but processing the OCaml typedtree is painful enough in practice to be worth it; so we did not implement this for records.

The preprocessing for tuples is amazingly easy to do using Alain Frisch's visitor-pattern Ast_mapper library.

Solving the constructor tags

The two data structures we care about the most are linked lists and tuples; in both case the tag is 0. Gabriel suggested that we make that a rule: let's forbid non-0 tags, that is, sum types with strictly more than one non-constant constructor. We can then never include the tag and get exactly the Lisp tuple representation.

That may seem like an awful limitation, but as PhD students (or, for the most part, former PhD students), we get paid to know about arcane type-system tricks we would never hope to suggest in a serious discussion, but can use in situations like that: you can translate any OCaml type declaration into an isomorphic one respecting this condition (at most one non-constant constructor), using GADTs to simulate sigma-types. The type declaration above gets rewritten as:

type foo =
 | Foo : 'a foo_tag * 'a -> foo
and foo_tag =
 | A : unit foo_tag
 | B : unit foo_tag
 | C : foo foo_tag
 | D : (bar * baz) foo_tag

And we used this trick in our AI implementation, in this rushed implementation of logarithmic-access immutable arrays (we didn't implement any support for mutation, as pure and strict is clearly the way to go):

type 'a vect_tree = Tree : ('a, 'b) vect_tag * 'b -> 'a vect_tree
and ('a, _) vect_tag =
| Leaf : ('a, 'a) vect_tag
| Node : ('a, 'a vect_tree * 'a vect_tree) vect_tag
type 'a vect = 'a vect_tree * int

let get_vect (t, n) i =
  let rec find : type a . int -> int -> a vect_tree -> a =
    fun i n -> function
    | Tree (Leaf, v) -> v
    | Tree (Node, (left, right)) ->
      let mid = n / 2 in
      if i < mid
      then find i mid left
      else find (i - mid) (n - mid) right
  in find i n t

let rec vect_of_list li =
  let rec consume li n =
    if n = 1 then
      begin match li with
        | [] -> assert false
        | x::xs -> (Tree (Leaf, x), xs)
    else begin
      let mid = n / 2 in
      let left, li = consume li mid in
      let right, li = consume li (n - mid) in
      Tree (Node, (left, right)), li
  let len = list_length li in
  match consume li len with
    | tree, [] -> (tree, len)
    | _, _::_ -> assert false

Our productions

For reference, we uploaded the sorry state of our code at the very end of the contest as this archive. (This is not an open-source release, because it would be work to prepare one.)

The ghost (readable) assembly are in:

  • ghosts/genius.g
  • ghosts/genius2.g

The lambda-man source is:

  • mlia/graph_local.ml

Actually running our code

We use

  • ocaml 4.01 (including compilerlibs (+unix +graphics))
  • ocamlfind

In code/

  • gcc/ contains compiler ./ocamlgcc.native from OCaml (minus records, references, exceptions) to gcc. « make » compiles it.

  • mlia/ is the AI of lambda man written in OCaml. In this directory do ../gcc/ocamlgcc.native « an_ia.ml » --print --no-exec to get some gcc code

  • ghc/ contains an assembler ./asm.byte of ghc. « make » compiles it.

  • ghosts/*.g contains the assembly of our ghost AI. It must go though ../ghc/asm.byte « an_ia.g » to give correct ghc code.

  • At top-level you can try ocamlbuild -use-ocamlfind main.native to get our not working own game simulator (you’ll need « cmdliner » to compile it)

by Gabriel Scherer, Thomas Braibant at May 22, 2015 08:00 AM

May 19, 2015

Shayne Fletcher

Labeled and optional arguments

I don't know why this is, but of all the features of the OCaml language, somehow remembering how to make use of labeled or optional arguments is difficult for me. I find when I need to use them, I have to go back and "relearn" the syntax every time. For that reason, I've written these summary notes for myself (based mostly on Jacques Garrigue's "Labels and Variants" chapter of the OCaml reference) and I hope they might be of help you too!


I've seen people write this function from time to time.

let flip (f : 'a -> 'b -> 'c) = fun y x -> f x y

flip applied to a function f in two arguments, produces a new function that when presented reverses their order in the application of f. For example, if sub is defined by let sub x y = x - y, then sub 3 2 will give the result 1 whereas (flip sub) 3 2 will produce the value -1.

The reason you'd find yourself wanting a function like flip is because you want to partially apply f but the argument you want to bind is inconveniently in the 'wrong' position. For example, the function that subtracts 2 from it's argument can be written (flip sub 2) (whereas sub 2 is the function that subtracts its argument from 2).

Labeled arguments do away with the need for functions like flip. You see, arguments in function types can be annotated.

val sub : (x:int) -> (y:int) -> int
In the above, x and y are labels. When you define a function with labeled arguments, you prefix the labels with a tilde ~.

let sub : (x:int -> y:int -> int) = fun ~x ~y -> x - y

Labels and variables are not the same things. A label is a name associated with a variable e.g. let u = 3 and v = 2 in sub ~x:u ~y:v (u, v are variables x, y labels) and this also works when the variables are replaced with literals as in sub ~x:3 ~y:2.

The expression ~x on it's own is shorthand for ~x:x which means that you can conveniently write let x = 3 and y = 2 in sub ~x ~y for example. This is "punning" - meaning if the variable name and the label are the same, the variable name is permitted to be elided in the application.

Labels enable making function application commutative (changing the order of the arguments does not change the result); one can write sub ~y ~x and that will yield the same result as sub ~x ~y. Where this is useful of course is partial application of a function on any any argument. For example, we can create functions from sub by binding either of x or y such as sub ~x:3 or sub ~y:2 without having to resort to ad-hoc trickery like flip sub.

Some technical details:

  • If more than one argument of a function has the same label (or no label), position prevails and they will not commute among themselves;
  • If an application is total, labels can be omitted (e.g. like in sub 3 2). Be wary though - when a function is polymorphic in its return type, it will never be considered as totally applied. Consider for example , let app ~f ~x = f x. Then app has type f:('a -> 'b') -> x:'a -> 'b. Now we attempt total application but omit the required labels app (fun x -> x) 1 and this types to f:('a -> ('b -> 'b) -> int -> 'c) -> x:'a -> 'c. You can see that the arguments have been "stacked up" in anticipation of presentation of f and x and the result although it types, is not at all what we were going for!
  • When a function is passed to a higher order function, labels must match in both types, labels may not be added or removed.

Optional arguments

In a function type involving an optional argument, the annotation ? appears.

val range : ?step:int -> int -> int -> int list
The ? appears again in the value syntax of a function definition involving an optional argument. Additionally, at that time, optional parameters can be given default values. When calling a function with a value for an optional parameter, well, you use the fact it is a labeled argument and provide a ~ as normal.

let rec range ?(step:int = 1) (m : int) (n : int) : int list =
if m >= n then []
else m :: (range ~step (m + step) n)

Now, this bit is important. A function can not take optional arguments alone. There must be some non-optional ones there too and it is when a non-optional parameter that comes after an optional one in the function type is encountered is it determined if the optional parameter has been omitted or not. For example, if we reorder the argument list as in the below, we find we can't 'erase' the optional argument anymore.

let rec range (m : int) (n : int) ?(step : int = 1): int list = ...
Warning 16: this optional argument cannot be erased.
That said, optional parameters may commute with non-optional or unlabeled ones, as long as they are applied simultaneously. That is, going back to this definition,

let rec range ?(step:int = 1) (m : int) (n : int) : int list =
if m >= n then []
else m :: (range ~step (m + step) n)
(range 0 ~step:2) 100 is valid whereas, (range 0) ~step:2 100 is not.

Optional arguments are implemented as option types. In the event no default is given, you can treat them as exactly that by matching on None or Some x to switch behaviors. Here's a schematic of what that looks like.

let f ?s y =
match s with
| None -> ...
| Some x -> ...

When you are just "passing through" an optional argument from one function call to another, you can prefix the applied argument with ? as in the following.

let g ?s x = ...
let h ?s x = g ?s x (*[s] is passed through to [g]*)

by Shayne Fletcher (noreply@blogger.com) at May 19, 2015 06:09 PM

Andrej Bauer

The troublesome reflection rule (TYPES 2015 slides)

Here are the slides of my TYPES 2015 talk “The troublesome reflection rule” with fairly detailed presenter notes. The meeting is  taking place in Tallinn, Estonia – a very cool country in many senses (it’s not quite spring yet even though we’re in the second half of May, and it’s the country that gave us Skype).

Download slides: The troublesome reflection rule (TYPES 2015) [PDF].

by Andrej Bauer at May 19, 2015 03:10 PM

May 18, 2015


Sliding Tile OCaml iOS App

May 18, 2015

I revamped another OCaml iOS app from a few years ago to make it run under iOS 8.3. It implements a sliding tile puzzle that was a fad in the 1880s and was also popular in my childhood a few years after that.

Instructions for downloading, building, and running the app are here:

Slide24: Sliding Tile Puzzle for iOS

You can download the sources directly here:

Slide24 2.0.2, OCaml app for iOS 8.3 (111 KB)

I had a lot of fun with this puzzle as a kid, and I still find it enjoyable to play with though it’s not particularly challenging to solve. To make the app more interesting, I coded up a heuristic search that solves the puzzle.

Here are some insights I had while revamping the app:

  • iOS devices have gotten tremendously faster since I wrote the code. I should rewrite it to solve with fewer moves. I’d really love to see what an extremely short solution looks like. I’ll bet it looks impossibly wise, like those Swedish girls in First Aid Kit.

  • It’s a little strange for a puzzle to solve itself. Like what if there was a button on the jigsaw puzzle box that made the pieces crawl themselves on the table into the solution. It would be fun to watch, but would it be a puzzle any more? More like a garden I guess.

Now, reader, it’s late at night and I have to rest up. There are exciting new developments in the OCaml-on-iOS world to talk about in the coming weeks. A version of the the iOS cross compiler, much improved with help from the truly knowledgeable, and with support for 64-bit ARM, will soon be available through OPAM. I’m really looking forward to this, and indeed this is why I’m working through these apps as fast as I can. I’m testing them against the new compiler.

If you have any trouble (or success) getting Slide24 to work for you, leave a comment below or email me at jeffsco@psellos.com.

Posted by: Jeffrey

<style type="text/css"> .flowaroundimg { float: left; margin: 0em 1em 0em 0em; } .rightoffloat li { position: relative; left: 1em; } pre { white-space: pre-wrap; width: 96%; margin-bottom: 24px; overflow: hidden; padding: 3px 10px; -webkit-border-radius: 3px; background-color: #fed; border: 1px solid #dcb; } pre code { white-space: pre-wrap; border: none; padding: 0; background-color: transparent; -webkit-border-radius: 0; } td { padding: 0em 1em 0em 1em; } th { padding: 0em 1em 0em 1em; } </style>

May 18, 2015 07:00 PM


Reduced Memory Allocations with ocp-memprof

In this blog post, we explain how ocp-memprof helped us identify a piece of code in Alt-Ergo that needed to be improved. Simply put, a function that merges two maps was performing a lot of unnecessary allocations, negatively impacting the garbage collector's activity. A simple patch allowed us to prevent these allocations, and thus speed up Alt-Ergo's execution.


The Story

Il all started with a challenging example coming from an industrial user of Alt-Ergo, our SMT solver. It was proven by Alt-Ergo in approximately 70 seconds. This seemed abnormnally long and needed to be investigated. Unfortunately, all our tests with different options (number of triggers, case-split analysis, ...) and different plugins (satML plugin, profiling plugin, fm-simplex plugin) of Alt-Ergo failed to improve the resolution time. We then profiled an execution using ocp-memprof to understand the memory behavior of this example.


Profiling an Execution with ocp-memprof

As usual, profiling an OCaml application with ocp-memprof is very simple (see the user manual for more details). We just compiled Alt-Ergo in the OPAM switch for ocp-memprof (version 4.01.0+ocp1) and executed the following command:

$ ocp-memprof --exec ./ae-4.01.0+ocp1-public-without-patch pb-many-GCs.why

The execution above triggers 612 garbage collections in about 114 seconds. The analysis of the generated dumps produces the evolution graph below. We notice on the graph that:

  • we have approximately 10 MB of hash-tables allocated since the beginning of the execution, which is expected;

  • the second most allocated data in the major heap are maps, and they keep growing during the execution of Alt-Ergo.

We are not able to precisely identify the allocation origins of the maps in this graph (maps are generic structures that are intensively used in Alt-Ergo). To investigate further, we wanted to know if some global value was abnormally retaining a lot of memory, or if some (non recursive-terminal) iterator was causing some trouble when applied on huge data structures. For that, we extended the analysis with the --per-root option to focus on the memory graph of the last dump. This is done by executing the following command, where 4242 is the PID of the process launched by ocp-memprof --exec in the previous command:

$ ocp-memprof --load 4242 --per-root 611



The per-root graph (above, on the right) gives more interesting information. When expanding the stack node and sorting the sixth column in decreasing order, we notice that:

  • a bunch of these maps are still in the stack: the item Map_at_192_offset_1 in the first column means that most of the memory is retained by the fold function, at line 192 of the Map module (resolution of stack frames is only available in the commercial version of ocp-memprof);

  • the "kind" column corresponding to Map_at_192_offset_1 gives better information. It provides the signature of the function passed to fold. This information is already provided by the online version.

Uf.Make(Uf.??Make.X).LX.t ->
Explanation.t ->
Explanation.t Map.Make(Uf.Make(Uf.??Make.X).LX).t ->
Explanation.t Map.Make(Uf.Make(Uf.??Make.X).LX).t

This information allows us to see the precise origin of the allocation: the map of module LX used in uf.ml. Lucky us, there is only one fold function of LX's maps in the Uf module with the same type.


Optimizing the Code

Thanks to the information provided by the --per-root option, we identified the code responsible for this behavior:

(*** function extracted from module uf.ml ***)

01 | module MapL = Map.Make(LX)
02 |
03 | let update_neqs r1 r2 dep env =
04 |   let merge_disjoint_maps l1 ex1 mapl =
05 |     try
06 |       let ex2 = MapL.find l1 mapl in
07 |       let ex = Ex.union (Ex.union ex1 ex2) dep in
08 |       raise (Inconsistent (ex, cl_extract env))
09 |     with Not_found ->
10 |       MapL.add l1 (Ex.union ex1 dep) mapl
11 |   in
12 |   let nq_r1 = lookup_for_neqs env r1 in
13 |   let nq_r2 = lookup_for_neqs env r2 in
14 |   let mapl = MapL.fold merge_disjoint_maps nq_r1 nq_r2 in
15 |   MapX.add r2 mapl (MapX.add r1 mapl env.neqs)

Roughly speaking, the function above retrieves two maps nq_r1 and nq_r2 from env, and folds on the first one while providing the second map as an accumulator. The local function merge_disjoint_maps (passed to fold) raises Exception.Inconsistent if the original maps were not disjoint. Otherwise, it adds the current binding (after updating the corresponding value) to the accumulator. Finally, the result mapl of the fold is used to update the values of r1 and r2 in env.neqs.

After further debugging, we observed that one of the maps (nq_r1 and nq_r2) is always empty in our situation. A straightforward fix consists in testing whether one of these two maps is empty. If it is the case, we simply return the other map. Here is the corresponding code:

(*** first patch: testing if one of the maps is empty ***)

.. | ...
14 |   let mapl =
15 |      if MapL.is_empty nq_r1 then nq_r2
16 |      else
17 |        if MapL.is_empty nq_r2 then nq_r1
18 |        else
19 |          MapL.fold merge_disjoint_maps nq_r1 nq_r2
.. | ...

Of course, a more generic solution should not just test for emptiness, but should fold on the smallest map. In the second patch below, we used a slightly modified version of OCaml's maps that exports the height function (implemented in constant time). This way, we always fold on the smallest map while providing the biggest one as an accumulator.

(*** second (better) patch : folding on the smallest map ***)

01 | module MapL = Emap.Make(LX)
.. | ...
14 |   let small, big =
15 |     if MapL.height nq_r1 < MapL.height nq_r2 then nq_r1, nq_r2
16 |     else nq_r2, nq_r1
17 |   in
18 |   let mapl = MapL.fold merge_disjoint_maps small big in
.. |   ...

Checking the Efficiency of our Patch

Curious to see the result of the patch ? We regenerate the evolution and memory graphs of the patched code (fix 1), and we noticed:

  • a better resolution time: from 69 seconds to 16 seconds;

  • less garbage collection : from 53,000 minor collections to 19,000;

  • a smaller memory footprint : from 26 MB to 24 MB;





We show in this post that ocp-memprof can also be used to optimize your code, not only by decreasing memory usage, but by improving the speed of your application. The interactive graphs are online in our gallery of examples if you want to see and explore them (without the patch and with the patch).

AE AE + patch Remarks
4.01.0 69.1 secs 16.4 secs substantial improvement on the example
4.01.0+ocp1 76.3 secs 17.1 secs when using the patched version of Alt-Ergo
dumps generation 114.3 secs (+49%) 17.6 secs (+2.8%) (important) overhead when dumping
memory snapshots
# dumps (major collections) 612 GCs 31 GCs impressive GC activity without the patch
dumps analysis
(online ocp-memprof)
759 secs 24.3 secs
dumps analysis
(commercial ocp-memprof)
153 secs 3.7 secs analysis with commercial ocp-memprof is
~ x5 faster than public version (above)
AE memory footprint 26 MB 24 MB memory consumption is comparable
minor collections 53K 19K fewer minor GCs thanks to the patch

Do not hesitate to use ocp-memprof on your applications. Of course, all feedback and suggestions are welcome, just email us !

More information:

by Çagdas Bozman (cagdas.bozman@ocamlpro.com) at May 18, 2015 12:00 AM