OCaml Planet

December 19, 2014

Psellos

OCaml 4.01 for iOS 8 Simulator

December 19, 2014

OCamlXARM compiles for an iOS device, but OCamlXSim compiles for an iOS Simulator. The same ocamloptrev script that compiles OCaml for iOS 8 can also get OCamlXSim to compile OCaml for the iOS 8 Simulator. The only thing that changes is the location of the compiler.

If you want to try out OCaml on the iOS 8 Simulator, here is an update to the script that compiles for either an iOS device or an iOS Simulator (ocamloptrev):

#!/bin/bash
#
# ocamloptrev     ocamlopt for specified iOS revision
#
USAGE='ocamloptrev  -rev M.N  [ -sim ]  other-ocamlopt-options ...'

OCAMLDIR=/usr/local/ocamlxarm/v7
OCAMLSIMDIR=/usr/local/ocamlxsim

REV=''
SIM=n
declare -a ARGS
while [ $# -gt 0 ] ; do
    case $1 in
    -rev)
        if [ $# -gt 1 ]; then
            REV=$2
            shift 2
        else
            echo "$USAGE" >&2
            exit 1
        fi
        ;;
    -sim)
        SIM=y
        shift
        ;;
    *)  ARGS[${#ARGS[*]}]="$1"
        shift 1
        ;;
    esac
done
if [ "$REV" = "" ]; then
    echo "$USAGE" >&2
    exit 1
fi

HIDEOUT=/Applications/Xcode.app/Contents/Developer 

case $SIM in
y)  PLT=$HIDEOUT/Platforms/iPhoneSimulator.platform 
    SDK=/Developer/SDKs/iPhoneSimulator${REV}.sdk 
    OCAMLC=$OCAMLSIMDIR/bin/ocamlopt
    ;;
n)  PLT=$HIDEOUT/Platforms/iPhoneOS.platform 
    SDK=/Developer/SDKs/iPhoneOS${REV}.sdk 
    OCAMLC=$OCAMLDIR/bin/ocamlopt
    ;;
esac

$OCAMLC -ccopt -isysroot -ccopt "$PLT$SDK" "${ARGS[@]}"

To compile for the iOS Simulator, specify -sim along with -rev M.N.

Let’s make a tiny OCaml program for testing:

$ Q="Do you know what it's like on the outside?\\n"
$ echo "Printf.printf \"$Q\"" > ny1941.ml

Here’s what happens if you compile with the current OCamlXSim on a system with the iOS 8.1 SDK:

$ /usr/local/ocamlxsim/bin/ocamlopt -o ny1941 ny1941.ml
clang: warning: no such sysroot directory: '/Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/Developer/SDKs/iPhoneSimulator7.1.sdk'
ld: library not found for -lSystem
clang: error: linker command failed with exit code 1 (use -v to see invocation)
File "caml_startup", line 1:
Error: Error during linking

As you can see, it’s trying and failing to use the default iOS Simulator 7.1 SDK. Here’s how to use ocamloptrev (the above script):

$ ocamloptrev -sim -rev 8.1 -o ny1941 ny1941.ml
$ ls -l ny1941
-rwxr-xr-x  1 jeffsco  staff  303364 Dec 19 23:02 ny1941
$ file ny1941
ny1941: Mach-O executable i386
$

You can actually run an iOS simulator app from the OS X command line, though there are many things that don’t work properly.

$ ny1941
Do you know what it's like on the outside?
$

See iOS Simulator Vs. OS X for a description of some differences between the OS X and the iOS Simulator environments.

If you don’t specify -sim, the script compiles for an iOS device as before:

$ ocamloptrev -rev 8.1 -o ny1941 ny1941.ml -cclib -Wl,-no_pie
$ file ny1941
ny1941: Mach-O executable arm
$ 

When not working in the subbasement of my alma mater, I’m working in my cluttered underground workroom on several OCaml-on-iOS projects. Along with holiday joys and the delights of coding in node.js during the day, I’ll keep working through them as fast as I can.

I hope this script will be useful for folks who want to try OCaml on the iOS Simulator while I’m updating my humble patches to the latest versions of everything and keeping all the irons in the fire.

If you have any trouble (or success) with the script, or have any other comments, leave them below or email me at jeffsco@psellos.com.

Posted by: Jeffrey

<style type="text/css"> .flowaroundimg { float: left; margin: 0em 1em 0em 0em; } .rightoffloat li { position: relative; left: 1em; } pre { white-space: pre-wrap; width: 96%; margin-bottom: 24px; overflow: hidden; padding: 3px 10px; -webkit-border-radius: 3px; background-color: #fed; border: 1px solid #dcb; } pre code { white-space: pre-wrap; border: none; padding: 0; background-color: transparent; -webkit-border-radius: 0; } td { padding: 0em 1em 0em 1em; } th { padding: 0em 1em 0em 1em; } </style>

December 19, 2014 07:00 PM

@typeocaml

Become a BST Ninja - Genin Level

bst_ninja

Binary Search Tree (BST) is one of the most classic data structures. The definition for its structure is shown as below:

  • It consists of Nodes and Leaves.
  • A Node has a child bst on the left side, a key (, a data), and a child bst on the right side. Note here a node's left or right child is not a node, instead, is indeed another binary search tree.
  • A Leaf has nothing but act only as a STOP sign.
type 'a bst_t =  
  | Leaf
  | Node of 'a bst_t * 'a * 'a bst_t (* Node (left, key, right) *)
(* a node may also carry a data associated with the key. 
   It is omitted here for neater demonstration *)

The important feature that makes BST unique is

  • for any node
    • all keys from its left child are smaller than its own key
    • all keys from its right child are bigger (assumming keys are distinct)

And here is an example of BST:

example_bst

Instead of continuing to present the basics of BST, this post will now focus on how to attack BST related problems with the most powerful weapon: Recursion.

Recursion on BST

From Recursion Reloaded, we know that one way to model recursion is:

  1. Assume we already got a problem solver: solve.
  2. Don't think about what would happen in each iteration.
  3. Split the problem to smallers sizes (N = N1 + N2 + ...).
  4. Solve those smaller problems like solve N1, solve N2, ... Here is the tricky part: still, we do not know or care what are inside solve and how f would do the job, we just believe that it will return the correct results.
  5. Now we have those results for smaller problems, how to wire them up to return the result for N? This is the ultimate question we need to answer.
  6. Of course, we do not forget the STOP sign for some edge cases.
  7. Together with point 5 and 6, we get our solve.

A good thing coming from BST is that the split step has been done already, i.e., a BST problem can be always divided into left child, root, and right child.

Thus if we assume we already got solve, we just need to solve left and / or solve right, then do something with root, and finally wire all outcomes to obtain the final result. Again, we should of course never forget the STOP sign and in BST, usually it is the Leaf, i.e., we need to ask ourselves what if the BST is a Leaf.

The thinking flow is illustrated as the diagram below.

solve

Before we start to look at some problems, note that in the diagram above or Recursion Reloaded, we seem to always solve both left and right, or say, all sub-problmes. It is actually not necessary. For BST, sometimes either left or right is enough. Let's have a look at this case first.

Either Left or Right

Our starter for this section is the simplest yet very essential operation: insert a key to a BST.

Insert

From the definition of BST, we know the basic rule is that if the new key is smaller than a root, then it belongs to the root's left child; otherwise, to the root's right child. Let's follow the modelling in the previous diagram to achieve this.

  1. We assume we already got insert key bst
  2. We know the problem can be split into left, root, and right.
  3. Because a new key can go either left or right, so we need to deal_with root first, i.e., compare the new key and the key of the root.
  4. Directly taken from the rule of BST, if the new key is smaller, then we need to solve left thus insert key left; otherwise insert key right.
  5. What if we get to a Leaf? It means we can finally place our new key there as a new Node and of course, at this moment both children of the new Node are Leaves.

insert

Note that the BST type in OCaml we defined early on is pure functional, which means every time we need to update something, we have to create new. That's why in the diagram, even if we just insert x to left or right, we need to construct a new Node because we are updating the left child or the right one. The code is shown as below.

let rec insert x = function  
  | Leaf -> Node (Leaf, x, Leaf) (* Leaf is a STOP *)
  | Node (left, k, right) ->
    if x < k then Node (insert x left, k, right) (* if x is smaller, go left *)
    else Node (left, k, insert x right) (* otherwise, go right *)

Member

member is to check whether a given key exists in the BST or not. It is very similar to insert. There are three differences:

  1. When dealing with root, we need to see whether the given key equals to the root's key or not. If yes, then we hit it.
  2. member is a readonly function, so we directly solve left or right without constructing new Node.
  3. If arriving at a Leaf, then we have nowhere to go any more and it means the given key is not in the BST.
let rec mem x = function  
  | Leaf -> false
  | Node (left, k, right) ->
    if x < k then mem x left
    else if x = k then true
    else mem x right

Both Left and Right

Usually when we need to retrieve some properties of a BST or possibly go through all nodes to get an answer, we have to solve both children. However, the modelling technique does not change.

Height

Recall from Some properties of a tree, the height of a tree is the number of edges that the longest path (from the root to a leaf) has. From this definition, it seems easy to get the height. We simply try to find all possible paths from root and for each path we record its number of edges. The answer will be the max of them. It sounds straightforward, but if you really try to write the code in this way, I bet the code would be a bit messy. Honestly, I never wrote in this way and I will never do that.

Another way is to think recursively. First let analyse a little bit about the longest path matter.

height

As we can see from the above diagram, Root has two edges: one to Left and the other to Right. So whatever the longest path from Root might be, it must pass either Left or Right. If we somehow could obtain the longest path from the root of Left and the longest path from the root of Right, the longest path from Root should be the bigger one of the two paths, right?

Let's now assume we already got height and it will return the height of a BST. Then we can obtain h_left and h_right. The answer is what is the h (height of Root)? Note that the height implies the longest path already (that's the definition). So from the paragraph above, What we need to do is getting max h_left h_right. Since Root has an edge to either child, h = 1 + max h_left h_right.

Don't forget the STOP sign: the height of a Leaf is 0.

let rec height = function  
  | Leaf -> 0
  | Node (left, _, right) -> 1 + max (height left) (height right)

Simple, isn't it?

Keys at a certain depth

So far, it seems our hypothetic solve function takes only the sub-probem as parameter. In many cases this is not enough. Sometimes we need to supply more arguments to help solve. For example, in the problem of retriving all keys at a certain depth definitely needs current depth information.

depth

Only with the help of current_depth, the Root can know whether it belongs to the final results.

  1. If current_depth = target_depth, then Root should be collected. Also we do not continue to solve Left or Right as we know their depths will never equal to target_deapth.
  2. Otherwise, we need to solve both Left and Right with argument of 1 + current_depth.
  3. Assume our solve is working. Then solve left (1+current_depth) would return a list of target nodes and so does solve right (1+current_depth). We simply then concatenate two target lists.
  4. STOP sign: Leaf is not even a Node, so the result will be empty list.

The code is like this:

let rec from_depth d cur_d = function  
  | Leaf -> []
  | Node (left, k, right) ->
    if cur_d = d then [k]
    else from_depth d (cur_d + 1) left @ from_depth d (cur_d + 1) right

let all_keys_at depth bst = from_depth d 0 bst  

Genin

The story of BST is not ending, but actually just beginning. Up to now, we have learnt essential recursive skills to somehow survive in BST's world but only if we face some trivial monsters. It is not enough. There are many problems related to BST and some of them are really hard to conquer. Anyhow, we do not need to be afraid. We are a proper Ninja now, although only with rank Genin, we still have the hope to improve ourselves and realise the dream of being a master one day.

From Ninja Wiki

A system of rank existed. A jōnin ("upper man") was the highest rank, representing the group and hiring out mercenaries. This is followed by the chūnin ("middle man"), assistants to the jōnin. At the bottom was the genin ("lower man"), field agents drawn from the lower class and assigned to carry out actual missions.

So in Ninja's world, we have three levels:

  1. Genian (lower)
  2. Chunin (middle)
  3. Jonin (master)

There will be two more dedicated posts on BST Ninja career. We will obtain rank Chunin soon and eventually be a boss Jonin. Let's sit tight and see.


Ps.

Some readers contacted me. They hope that maybe I can use more advanced knowledge or harder examples in my posts and the current ones might seem a little boring. I think I need to explain a bit here.

The general idea behind Many things about OCaml is not to write a cookbook for certain problems related to OCaml or be a place where quick solution is there and copy / paste sample code would do the trick. Instead, Many things means some important aspects in OCaml that might be overlooked, or some particular problems that can show the greatness of OCaml, or the elegant OCaml solutions for some typical data structures and algorithms, etc. As long as something are valuable and that value shows only in OCaml or Functional Programming, I would like to add them all in here one by one.

Fortunately or unfortunately, even though I have only limited experiences in OCaml, I found that the many is actually quite big. And due to this many, I had to make a plan to present them all in a progressive way. Topics can interleave with each other in terms of time order as we do not want to have the same food for a month. More importantly, however, all should go from simple / easy to advanced / hard. In order to present some advanced topic, we need to make sure we have a solid foundation. This is why, for example, I even produced a post for the properties of a tree although they are so basic. This is also why I reloaded recursion since recursion is everywhere in OCaml. They are a kind of preparations.

Moreover, I believe in fundamentals. Fundamentals are normally concise and contain the true essence. But sometimes they can be easily overlooked or ignored and we may need to experience a certain number of difficult problems afterwards, then start to look back and appreciate the fundamentals.

The reason of using simple examples is that it makes my life easier for demonstrations. I love visualisations and one graph can be better than thousands of words. For complicated problems and solutions, it is a bit more difficult to draw a nice and clean diagram to show the true idea behind. I will try to do so later on, but even if I could achieve it in this early stage, some readers might easily get lost or confused because of the unavoidable complication of the graph. As a result, the point of grasping fundamentals might be missed.

Anyway, please don't worry too much. Attractive problems in OCaml are always there. For example, in my plan, I will later start to present a number (maybe 15 ~ 17) of my beloved Functional Pearls in OCaml and if you are really chasing for some awesomeness, I hope they would satisfy you.

by Jackson Tale at December 19, 2014 12:28 PM

December 18, 2014

Shayne Fletcher

Compiling regular expressions (I)

This post picks up from here which was concerned with parsing - obtaining representations of regular expressions as abstract syntax trees. The ultimate goal is, given a string representation of a regular expression $e$ , produce a 'recognizer' for the expression (referred to as compiling a regular expression). That is, a function string -> bool that can be used to categorize strings as either belonging to the language $\mathcal{L_{e}}$ or not.

Having produced an abstract syntax tree for a regular expression $e$, the first step in compiling the expression is to compute an abstract syntax tree of the corresponding augmented regular expression $(e)\#$. This augmented regular expression is the original expression $e$ concatenated with a unique end-marker $\#$. For the given expression $e$, the accepting state for $e$ is given a transition on $\#$. This is a device that allows us to "forget" about accepting states as the computation of a recognizer proceeds; when the construction is complete, any state with a transition on $\#$ must be an accepting state.

Leaves in the abstract syntax tree of the augmented regular expression $(e)\#$ are labeled by $\epsilon$ or a symbol from from $\mathcal{A}$. For those non-$\epsilon$ leaves we attach a unique integer. Accordingly, we will need functions to generate unique integers (positions) that we will employ as we transform the AST of $e$ into the AST of the augmented expression $(e)\#$ leading to our first code example.


let reset_label, generate_label =
let r = ref (-1) in
((fun () -> r := (-1)), (fun () -> r := !r + 1; !r))

As we construct the syntax tree of the $(e)\#$ we compute four functions : null_pos, first_pos, last_pos and following:

  1. null_pos is $true$ for a syntax-tree node $n$ if and only if the sub-expression represented by $n$ has $\epsilon$ in its language. That is, $true$ if the regular sub-expression recognizes the empty string and $false$ otherwise;
  2. first_pos is the set of positions in the sub-tree rooted at $n$ that correspond to the first symbol of at least one string in the language of the sub-expression rooted at $n$. That is, the set of symbols that can begin a string recognized by the regular sub-expression;
  3. last_pos is the set of positions in the sub-tree rooted at the syntax-tree node $n$ that corresponds to the last symbol of at least one string in the language of the sub-expression rooted at $n$. That is, the set of symbols that can terminate a string recognized by the regular sub-expression;
  4. following, for a position $p$ is the set of positions $q$ in the entire syntax-tree such that there is some string $x = a_{1}a_{2} \cdots a_{n}$ in $\mathcal{L_{(e)\#}}$ such that for some $i$, there is some way to explain the membership of $x$ in $\mathcal{L_{(e)\#}}$ by matching $a_{i}$ to $p$ and $a_{i + 1}$ to a position in $q$.
Of these, following is the last to compute as it relies upon the values of first_pos and last_pos. If the definition is confusing for now, don't worry about it. The rules for computing following will come later and will be obvious at that point. We'll focus for now on null_pos, first_pos and last_pos.

The results of first_pos, last_pos and follow are sets of integers. Accordingly, we are going to need a type to represent these.


module Int_set : Set.S with type elt = int = Set.Make (
struct
let compare = Pervasives.compare
type t = int
end)
With this we can present the type of ASTs for augmented regular expressions.

type augmented_regexp =
| Epsilon
| Character of char * int
| Sequence of augmented_regexp * augmented_regexp * pos
| Alternative of augmented_regexp * augmented_regexp * pos
| Repetition of augmented_regexp * pos
| Accept of int
and pos = {
null:bool;
first:Int_set.t;
last:Int_set.t;
}

For a given node $n$, the values of its pos record depend only on the sub-expressions of that node. Assuming constructed augmented regular expression syntax trees, we can write null_pos, first_pos and last_pos like this.


let (null_pos : augmented_regexp -> bool) =
fun x ->
match x with
| Epsilon -> true
| Character (_, i) -> false
| Sequence (_, _, p) -> p.null
| Alternative (_, _, p) -> p.null
| Repetition (_, p) -> p.null
| Accept _ -> false

let (first_pos : augmented_regexp -> Int_set.t) =
fun x ->
match x with
| Epsilon -> Int_set.empty
| Character (_, i) -> Int_set.add i (Int_set.empty)
| Alternative (_, _, p) -> p.first
| Repetition (_, p) -> p.first
| Sequence (_, _, p) -> p.first
| Accept i -> Int_set.add i (Int_set.empty)

let (last_pos : augmented_regexp -> Int_set.t) =
fun x ->
match x with
| Epsilon -> Int_set.empty
| Character (_, i) -> Int_set.add i (Int_set.empty)
| Alternative (_, _, p) -> p.last
| Repetition (_, p) -> p.last
| Sequence (_, _, p) -> p.last
| Accept i -> Int_set.add i (Int_set.empty)

Our strategy in building the syntax-tree of $(e)\#$ from the syntax tree of $e$ will be to visit each node of $e$ and invoke a function to construct the corresponding node of $(e)\#$ inductively. These functions will include the generation of unique integers for the non-$\epsilon$ leaves and encode the rules for building the pos records:

  • null
    • Sequence $(e_{1}, e_{2})$ : null_pos $e_{1}$ and null_pos $e_{2}$
    • Alternative $(e_{1}, e_{2})$ : null_pos $e_{1}$ or null_pos $e_{2}$
    • Repetition : $true$
  • first
    • Alternative $(e_{1}, e_{2})$ : first_pos $e_{1} \cup$ first_pos $e_{2}$
    • Sequence $(e_{1}, e_{2})$ : if null_pos $e_{1}$ then first_pos $e_{1} \cup$ first_pos $e_{2}$ else first_pos $e_{1}$
    • Repetition $e$ : first_pos $e$
  • last
    • Alternative $(e_{1}, e_{2})$ : last_pos $e_{1} \cup$ last_pos $e_{2}$
    • Sequence $(e_{1}, e_{2})$ : if null_pos $e_{2}$ then last_pos $e_{1} \cup$ last_pos $e_{2}$ else last_pos $e_{2}$
    • Repetition $e$ : last_pos $e$

Here then are the augmented regular expression syntax-tree node constructor functions.


let (epsilon : unit -> augmented_regexp) =
fun () ->
Epsilon

and (character : char -> augmented_regexp) =
fun c ->
Character (c, generate_label ())

and (repetition : augmented_regexp -> augmented_regexp) =
fun e ->
Repetition (e, {null=true;first=first_pos e; last=last_pos e})

and (alternative : augmented_regexp -> augmented_regexp -> augmented_regexp) =
fun e1 e2 ->
Alternative (e1, e2,
{null=null_pos e1 || null_pos e2;
first=Int_set.union (first_pos e1)(first_pos e2);
last=Int_set.union (last_pos e1) (last_pos e2)})

and (sequence : augmented_regexp -> augmented_regexp -> augmented_regexp) =
fun e1 e2 ->
let b1 = null_pos e1
and b2 = null_pos e2 in
Sequence (e1, e2,
{null=b1 && b2;
first=
if b1 then
Int_set.union (first_pos e1)(first_pos e2)
else
first_pos e1;
last=
if b2 then
Int_set.union (last_pos e1) (last_pos e2)
else
last_pos e2})

let (accept : augmented_regexp -> augmented_regexp) =
fun e ->
sequence e (Accept (generate_label ()))

We are now in a position to write the function that transforms a syntax tree of the regular expression $e$ into the syntax tree of the augmented regular expression $(e)\#$.


let rec (augmented_regexp : Syntax.regular_expression -> augmented_regexp) =
fun x ->
match x with
| Syntax.Epsilon -> epsilon ()
| Syntax.Character i -> character (Char.chr i)
| Syntax.Sequence (x, y) ->
(*Be very careful here. Evaluation order matters!*)
let x' = (augmented_regexp x)
and y' = (augmented_regexp y) in
sequence x' y'
| Syntax.Alternative (x, y) ->
(*Be very careful here. Evaluation order matters!*)
let x' = (augmented_regexp x)
and y' = (augmented_regexp y) in
alternative x' y'
| Syntax.Repetition x -> repetition (augmented_regexp x)

We can wrap all of the above up in a convenience function parse_augmented_regexp which first parses a string to build the syntax tree of the regular expression it represents and then transforms the result into the syntax tree of the corresponding augmented regular expression.


let (parse_augmented_regexp : string-> augmented_regexp * int) =
fun s ->
let () = reset_label () in
let ast = regexp_of_string s in
let re1 = augmented_regexp ast in
let re2 = accept re1 in
let count = generate_label () in
(re2, count)
Notice that this function returns a pair of the syntax-tree and the number of positions it contains.

The next step in compiling a recognizer from the expression $(e)\#$ is to compute the follow function. To do this we "unite" the information encoded by the first_pos and last_pos functions. Put plainly, follow is a function that takes each symbol (position) in the regular expression to the (set of) symbols (positions) that can follow it. The information is stored in an array of length equal to the number of symbols appearing in the regular expression. There are only two ways a position in a regular expression can follow another:

  • If $n$ is a Sequence node with left child $c_{1}$ and right child $c_{2}$, then for every position $i$ in lastpos $c_{1}$, all positions in firstpos $c_{2}$ are in follow_pos $c_{i}$
  • If $n$ is a Repition and $i$ a position in lastpos $n$, then all positions in first_pos $n$ are in follow_pos $i$
In addition to computing follow the code below also stores the association between positions and characters of the regular expression. That information goes into an array. The elements of the array have type char option since the Accept symbol has a position but no character associated with it.

let (compute_follow : Int_set.t array -> char option array -> augmented_regexp -> unit) =
fun follow chars x ->
let rec compute x =
match x with
| Sequence (e1, e2, p) ->
compute e1; compute e2;
let first2 = first_pos e2 in
let f i =
follow.(i) <- Int_set.union first2 (follow.(i)) in
Int_set.iter f (last_pos e1)
| Repetition (e, p) ->
compute e;
let f i =
follow.(i) <- Int_set.union (p.first) (follow.(i)) in
Int_set.iter f (p.last)
| Alternative (e1, e2, p) -> compute e1; compute e2
| Epsilon -> ()
| Accept i -> chars.(i) <- None
| Character (c, i) -> chars.(i) <- Some c in
compute x

Now the computation of the augmented regular expression syntax-tree and all four of the associated functions together with the mapping from positions to symbols of $\mathcal{A}$ can be wrapped up in another "high-level" convenience function.


let (regexp_follow : string -> augmented_regexp * Int_set.t array * char option array) =
fun s ->
let re, n = parse_augmented_regexp s in
let follow = Array.make n (Int_set.empty) in
let chars = Array.make n None in
compute_follow follow chars re;
(re, follow, chars)

We're in good shape - but a hop, skip and a jump to computing a recognizer from a regular expression. We'll leave off here on this local maxima for today and finish off the job in the next post!

by Shayne Fletcher (noreply@blogger.com) at December 18, 2014 08:05 PM

OCaml Platform

OPAM 1.2 and Travis CI

The new pinning feature of OPAM 1.2 enables new interesting workflows for your day-to-day development in OCaml projects. I will briefly describe one of them here: simplifying continuous testing with Travis CI and GitHub.

Creating an opam file

As explained in the previous post, adding an opam file at the root of your project now lets you pin development versions of your project directly. It's very easy to create a default template with OPAM 1.2:

$ opam pin add <my-project-name> . --edit
[... follow the instructions ...]

That command should create a fresh opam file; if not, you might need to fix the warnings in the file by re-running the command. Once the file is created, you can edit it directly and use opam lint to check that is is well-formed.

If you want to run tests, you can also mark test-only dependencies with the {test} constraint, and add a build-test field. For instance, if you use oasis and ounit, you can use something like:

build: [
  ["./configure" "--prefix=%{prefix}%" "--%{ounit:enable}%-tests"]
  [make]
]
build-test: [make "test"]
depends: [
  ...
  "ounit" {test}
  ...
]

Without the build-test field, the continuous integration scripts will just test the compilation of your project for various OCaml compilers. OPAM doesn't run tests by default, but you can make it do so by using opam install -t or setting the OPAMBUILDTEST environment variable in your local setup.

Installing the Travis CI scripts

Travis CI is a free service that enables continuous testing on your GitHub projects. It uses Ubuntu containers and runs the tests for at most 50 minutes per test run.

To use Travis CI with your OCaml project, you can follow the instructions on https://github.com/ocaml/ocaml-travisci-skeleton. Basically, this involves:

  • adding .travis.yml at the root of your project. You can tweak this file to test your project with different versions of OCaml. By default, it will use the latest stable version (today: 4.02.1, but it will be updated for each new compiler release). For every OCaml version that you want to test (supported values for <VERSION> are 3.12, 4.00, 4.01 and 4.02) add the line:
env:
 - OCAML_VERSION=<VERSION>
  • signing in at TravisCI using your GitHub account and enabling the tests for your project (click on the + button on the left pane).

And that's it, your project now has continuous integration, using the OPAM 1.2 pinning feature and Travis CI scripts.

Testing Optional Dependencies

By default, the script will not try to install the optional dependencies specified in your opam file. To do so, you need to manually specify which combination of optional dependencies you want to tests using the DEPOPTS environment variable. For instance, to test cohttp first with lwt, then with async and finally with both lwt and async (but only on the 4.01 compiler) you should write:

env:
   - OCAML_VERSION=latest DEPOPTS=lwt
   - OCAML_VERSION=latest DEPOPTS=async
   - OCAML_VERSION=4.01   DEPOPTS="lwt async"

As usual, your contributions and feedback on this new feature are gladly welcome.

by Thomas Gazagnaire at December 18, 2014 12:00 AM

December 14, 2014

Psellos

OCaml App for iOS 8.1 (Sources)

December 14, 2014

I coded up a simple OCaml iOS app to run in iOS 8.1. Instructions for downloading, building, and running the app are here:

Portland: Which Way Is Up on iOS?

You can download the sources directly here:

Portland 2.0.3, OCaml app for iOS 8.1 (29 KB)

This is a revamped version of Portland, the first example OCaml iOS app I made a few years ago. For maximum clarity it doesn’t do anything particularly impressive. It really just shows how to code an iOS app in OCaml.

Here are some things I learned while revamping.

  • Remember to call caml_main() in your main program (see main.m). If you forget, you’ll get the “undefined atom table” error at link time. I wrote about this in Undefined caml_atom_table.

  • If you keep disembodied OCaml values in the Objective C world, remember to register them as global roots using caml_register_global_root. Otherwise you’ll experience chaos at the first GC. I wrote about this in OCaml, Objective C, Rule 4.

  • Automatic Reference Counting imposes some restrictions on what you can do in wrapper code. For the Portland example (and probably for many real-world apps) it’s enough to have a table of Objective C objects that are conceptually referenced from OCaml. That is, the table in the Objective C world references the objects as a proxy for references from the OCaml world. You can see the code for this in wrap.m. I hope to write more about this. Maybe you, reader, have some ideas for a better approach.

  • Modern day iOS apps are based on View Controllers rather than on Views. In particular, it’s usual to define a custom subclass of UIViewController for each piece of the interface. This is tricky for OCaml on iOS, as it’s not (currently) possible to define an OCaml subclass of an Objective C class. For Portland I’m using an Objective C subclass of UIViewController that delegates to an OCaml object. Here too, this is probably good enough for many real-world apps. I hope to write more about this also.

  • There are several cyclic dependencies among the classes of Cocoa Touch used in Portland. To represent them in OCaml I use a common set of definitions named ui.mli, where the cycles can be accommodated using class type a =and b = … . It seems to me this is a strength of OCaml’s structural typing for objects. That is, it’s possible to define class types independently of particular classes. In this way cycles can be represented without forward-reference loopholes. (However it’s possible that the number of cycles in a full interface to Cocoa Touch would become overwhelming.)

It’s dark, chilly, and wet here by Puget Sound; I’m going to retire now to my tent and my dreams. The next thing on my OCaml-on-iOS schedule is to update to the latest OCaml compiler. I’m getting serious polymathic help on this, as I hope you’ll hear about soon.

If you have any trouble (or success) with the Portland app, or have any other comments, leave them below or email me at jeffsco@psellos.com.

Posted by: Jeffrey

<style type="text/css"> .flowaroundimg { float: left; margin: 0em 1em 0em 0em; } .rightoffloat li { position: relative; left: 1em; } pre { white-space: pre-wrap; width: 96%; margin-bottom: 24px; overflow: hidden; padding: 3px 10px; -webkit-border-radius: 3px; background-color: #fed; border: 1px solid #dcb; } pre code { white-space: pre-wrap; border: none; padding: 0; background-color: transparent; -webkit-border-radius: 0; } td { padding: 0em 1em 0em 1em; } th { padding: 0em 1em 0em 1em; } </style>

December 14, 2014 07:00 PM

December 13, 2014

OCamlCore Forge News

OCaml EFL 1.12.0 released

Major changes: - Moved to version 1.12 of the EFL/Elementary - Rewriting of the build toolchain (although ocamlbuild still do the compilation part) - The package adapts to the EFL/Elementary version of the user: An archive for each version of the EFL is not necessary any more. For example, if version 1.11 is installed, the interfaces to 1.12 specific functions will not be built but the library obtained will still be usable. - An experimental interraction with Lwt (available in an example) - Compiling with OCaml 4.02 should not create any warning. More precisely, the mutability of strings is not used any more. However, it is still possible to use the old OCaml 3.12.

by Alexis Bernadet at December 13, 2014 11:00 PM

Shayne Fletcher

Recognizers

In my last post, I gave an OCaml program to parse regular expressions. I intend however to show you, over the coming weeks, not just how to parse them, but how to compile them to recognizers too. Doing so will require me to share quite a lot of code that I gleaned from the book The Functional Approach to Programming by Guy Cousineau and Michel Mauny from which I learned how to do this. In doing so, I will along the way, provide updated code in modern OCaml as the book is presented in the Caml language a predecessor of today's OCaml, and not everywhere equivalent. Additionally, there will be functions of my own "filling in" gaps left as exercises in the book. That said, I don't think there's enough effort of my own to justify "plagiarizing" the material so blatantly so am determined to add at least a little value, where my limited skills allow, in order that I might redeem myself.

So today, let us start with a minimal combinator library implementing "recognizers" for recursive descent parsers, based on Cousineau and Mauny transliterated to the C++ programming language. I hope on reading the following, that you will agree, the multi-paradigm language C++-11 admits "The Functional Approach to Programming" as a "first class citizen" and the result here might be viewed as the beginning of a mini Boost.Spirit! Let's begin.

We are concerned with the problem of "recognizing" phrases belonging to a given language. We will produce programs (functions) that accept character strings as input which decide to accept or reject them. First some headers.


#include <boost/variant.hpp>
#include <boost/variant/apply_visitor.hpp>
#include <boost/utility.hpp>
#include <boost/range.hpp>
#include <boost/detail/lightweight_test.hpp>

#include <list>
#include <functional>
#include <numeric>
#include <string>

Recognizers work on lists. When recognition succeeds, part of the list is consumed and the part of the list remaining is returned or, recognition fails. So the type remaining<A> is a sum type with two cases, where A is the type of "tokens" (symbols) contained by the list (typically but not necessarily char).


//Recognition succeeded
template <class A>
struct remains{
std::list<A> left;
template <class ItT>
remains (ItT begin, ItT end)
: left (begin, end)
{}
};

//Recognition failed
template <class A>
struct recognition_fails {};

//Result of a recognizer. Indicates the result of attempting to
//recognize something from a list
template <class A> using remaining =
boost::variant<remains <A>, recognition_fails<A> >;
Recognizers then are but functions from lists of tokens to values of remaining<A>.

//A 'recognizer' is a function from a list to a value of remaining<A>
template <class A> using recognizer =
std::function<remaining<A>(std::list<A> const&)>;
Here's an example. This function (constant) produces a recognizer that matches the empty string. The recognizer it produces always succeeds and never consumes any input.

//A recognizer that recognizes the empty string. It always succeeds and
//no input is ever consumed
template <class A>
recognizer<A> epsilon ()
{
return [](std::list<A> const& cs) -> remaining<A>
{
return remains<A> (cs.begin (), cs.end ());
};
}
This next factory function produces recognizers that recognize elements that satisfy a user provided predicate.

//Given a predicate, 'recognizer_of_token' produces the recognizer
//associated with the elements that satisfy this predicate
template <class A, class F/*bool(A)*/>
recognizer<A> recognizer_of_token (F test)
{
return
[=] (std::list<A> const& cs) -> remaining<A>
{
if (cs.empty ())
return recognition_fails<A> ();

if (test (cs.front ()))
return remains<A> (boost::next (cs.begin()), cs.end());

return recognition_fails<A>();
};
}
Recognizers can be composed. This function takes two recognizers and returns a recognizer that will, when presented with a list of tokens, attempt recognition via the first and if that doesn't work out, backtrack and attempt to recognize via the second. It's the "if then else" of recognizers.

//Recognizer disjunction
template <class A>
recognizer<A> compose_or (recognizer<A> p, recognizer<A> q)
{
return [=](std::list<A> const& toks) -> remaining<A>
{
remaining <A> res = p (toks);
if (remains<A>* rem = boost::get<remains<A> >(&res)) return *rem;

return q (toks);
};
}
This next function is the builder of the corresponding "and then" sort of recognizer.

//Recognizer conjunction
template <class A>
recognizer<A> compose_and (recognizer<A> p, recognizer<A> q)
{
return [=](std::list<A> const& toks) -> remaining<A>
{
remaining <A> res = p (toks);
if (remains<A>* rem = boost::get<remains<A> >(&res))
return q (rem->left);

return recognition_fails<A> ();
};
}
With disjunction and conjunction of recognizers, we can implement a function to compute, given a recognizer, the corresponding Kleene star recognizer. That is, zero or more occurrences of a given phrase.

//The '*' iterator (Kleene star)
template <class A>
recognizer<A> zero_or_more (recognizer<A> p)
{
return [=] (std::list<A> const& toks) -> remaining <A>
{
return compose_or (
compose_and (p, zero_or_more<A> (p)) , epsilon<A> ()) (toks);
};
}
Well, let's now get concrete. Here's a function to produce a recognizer of a specific token.

//A function to produce a recognizer that recognizes only the
//given character
template <A>
recognizer<A> char_ (A c)
{
return recognizer_of_token<A>([=](A x) -> bool { return c == x; });
}
The composite recognizer for a given "string" of tokens can be constructed then like this.

//A function to produce a recognizer of a specific string
template <class C>
recognizer<typename boost::range_value<C>::type> string_ (C s)
{
typedef typename boost::range_value<C>::type value;

std::list <recognizer<value> > rs;

typedef std::back_insert_iterator<std::list<recognizer<value> > > it_t;

std::accumulate (
boost::begin (s)
, boost::end (s)
, std::back_inserter (rs)
, [](it_t dst, value c) -> it_t
{ *dst++ = char_ (c); return dst; });

return std::accumulate (
boost::next (rs.begin ())
, rs.end ()
, rs.front ()
, [](recognizer<value> acc, recognizer<value> r) -> recognizer<value>
{ return compose_and (acc, r); });
}
That will do for our mini-recognizer library for today. Let's turn attention to testing. First some utilities.

//Match on a remaining<A> returns 'true' if it contains a 'remains<A>'
//value with all input consumed, 'false' if it contains a 'recognition_fails<A>' value
template <class A>
struct accept_visitor
{
typedef bool result_type;
bool operator () (remains<A> const& r) const { return r.left.empty (); }
bool operator () (recognition_fails<A> const& r) const { return false; }
};

//Function to determine if recognition was achieved
template <class A>
bool accept (remaining<A> const& r)
{
return boost::apply_visitor( accept_visitor<A> (), r);
}

//Test if the provided recognizer can recognize the given string
bool parse (recognizer<char> parser, std::string const& s)
{
return accept (parser (std::list<char>(s.begin (), s.end())));
}
Now, a little test.

int main ()
{
//char_
BOOST_TEST( parse (char_ ('a'), "a") );
BOOST_TEST( !parse (char_ ('b'), "a") );
BOOST_TEST( parse (char_ ('b'), "b") );

//'*'
BOOST_TEST( parse (zero_or_more (char_ ('a')), "aa") );
BOOST_TEST( parse (zero_or_more (char_ ('a')), "") );
BOOST_TEST( !parse (zero_or_more (char_ ('a')), "ab") );

//string_
BOOST_TEST (parse (string_ (std::string ("foo")), "foo") );
BOOST_TEST (!parse (string_ (std::string ("foo")), "bar") );

return boost::report_errors ();
}
Et voilà. L' approche fonctionnelle de C++! See you soon for more about regular expressions. Happy holidays!

by Shayne Fletcher (noreply@blogger.com) at December 13, 2014 04:35 PM

December 08, 2014

Github OCaml jobs

Full Time: Software Developer (Functional Programming) at Jane Street in New York, NY; London, UK; Hong Kong

Software Developer (Functional Programming)

Jane Street is looking to hire great software developers with an interest in functional programming. OCaml, a statically typed functional programming with similarities to Haskell, Scheme, Erlang, F# and SML, is our language of choice. We've got the largest team of OCaml developers in any industrial setting, and probably the world's largest OCaml codebase. We use OCaml for running our entire business, supporting everything from research to systems administration to trading systems. If you're interested in seeing how functional programming plays out in the real world, there's no better place.

The atmosphere is informal and intellectual. There is a focus on education, and people learn about software and trading, both through formal classes and on the job. The work is challenging, and you get to see the practical impact of your efforts in quick and dramatic terms. Jane Street is also small enough that people have the freedom to get involved in many different areas of the business. Compensation is highly competitive, and there's a lot of room for growth.

You can learn more about Jane Street and our technology from our main site, janestreet.com. You can also look at a a talk given at CMU about why Jane Street uses functional programming (http://ocaml.janestreet.com/?q=node/61), and our programming blog (http://ocaml.janestreet.com).

We also have extensive benefits, including:

  • 90% book reimbursement for work-related books
  • 90% tuition reimbursement for continuing education
  • Excellent, zero-premium medical and dental insurance
  • Free lunch delivered daily from a selection of restaurants
  • Catered breakfasts and fresh brewed Peet's coffee
  • An on-site, private gym in New York with towel service
  • Kitchens fully stocked with a variety of snack choices
  • Full company 401(k) match up to 6% of salary, vests immediately
  • Three weeks of paid vacation for new hires in the US
  • 16 weeks fully paid maternity/paternity leave for primary caregivers, plus additional unpaid leave

More information at http://janestreet.com/culture/benefits/

December 08, 2014 02:34 PM