OCaml Planet

## January 29, 2015

### OCamlPro

#### Private Release of Alt-Ergo 1.00

After the public release of Alt-Ergo 0.99.1 last December, it's time to announce a new major private version (1.00) of our SMT solver. As usual:

## Quick Evaluation

A quick comparison between this new version and the previous releases is given below. Timeout is set to 60 seconds. The benchmark is made of 19044 formulas: (a) some of these formulas are known to be invalid, and (b) some of them are out of scope of current SMT solvers. The results are obtained with Alt-Ergo's native input language.

| | public release 0.95.2 | public release 0.99.1 | private release 1.00 |
|---|---|---|---|
| Proved formulas | 15980 | 16334 | 17638 |
| Proved formulas (%) | 84,01 % | 85,77 % | 92,62 % |
| | 10831 | 10504 | 9767 |
| | 1,47 | 1,55 | 1,81 |

## Main Novelties of Alt-Ergo 1.00

### General Improvements

• theories data structures: semantic values (the theories' internal representation of terms) are now hash-consed. This enables the use of hash-based comparison (instead of structural comparison) when possible

• theories combination: the dispatcher component, which sends literals assumed by the SAT solver to different theories depending on whether these literals are equalities, disequalities or inequalities, has been re-implemented. The new code is much simpler and enables some optimizations and factorizations that could not be made before

• case-split analysis: we made several improvements in the heuristics of the case-split analysis mechanism over finite domains

• explanations propagation: we improved explanations propagation in congruence closure and linear arithmetic algorithms. This makes the proofs faster thanks to a better back-jumping in the SAT solver part

• linear integer arithmetic: we re-implemented several parts of linear arithmetic and introduced important improvements in the Fourier-Motzkin algorithm to make it run on smaller sub-problems and avoid some useless executions. These optimizations allowed a significant speed up on our internal benchmarks

• data structures: we optimized hash-consing and some functions in the "formula" and "literal" modules

• SAT solving: we made a lot of improvements in the default SAT-solver and in the SatML plugin. In particular, the solvers now send lists of facts (literals) to "the decision procedure part" instead of sending them one by one. This avoids intermediate calls to some "expensive" algorithms, such as Fourier-Motzkin

• Matching: we extended the E-matching algorithm to also perform matching modulo the theory of records. In addition, we simplified matching heuristics and optimized the E-matching process to avoid computing the same instances several times

• Memory management: thanks to the ocp-memprof tool (http://memprof.typerex.org/), we identified some parts of Alt-Ergo that needed some improvements in order to avoid useless memory allocations, and thus unburden the OCaml garbage collector

• the function that retrieves the used axioms and predicates (when option 'save-used-context' is activated) has been improved

### Bug Fixes

• 6 in the "inequalities" module of linear arithmetic

• 4 in the "formula" module

• 3 in the "ty" module used for types representation and manipulation

• 2 in the "theories front-end" module that interacts with the SAT solvers

• 1 in the "congruence closure" algorithm

• 1 in the "existential quantifiers elimination" module

• 1 in the "type-checker"

• 1 in the "AC theory" of associative and commutative function symbols

• 1 in the "union-find" module

### New OCamlPro Plugins

• profiling plugin: when activated, this plugin records and prints some information about the current execution of Alt-Ergo every 'x' seconds. In particular, one can observe a module being activated, a function being called, the amount of time spent in every module/function, the current decision/instantiation level, the number of decisions/instantiations made so far, and the number of case-splits, boolean/theory conflicts, assumptions in the decision procedure, and generated instances per axiom

• fm-simplex plugin: when activated, this plugin is used instead of the Fourier-Motzkin method to infer bounds for linear integer arithmetic affine forms (which are used in the case-split analysis process). This module uses the Simplex algorithm to simulate particular runs of Fourier-Motzkin, which makes it scale better on linear integer arithmetic problems containing a lot of inequalities

### New Options

-no-theory: deactivate theory reasoning. In this case, only the SAT solver and the matching parts are active

-inequalities-plugin: specify a plugin to use, instead of the "default" Fourier-Motzkin algorithm, to handle inequalities of linear arithmetic

-tighten-vars: when this option is set, the Fm-Simplex plugin will try to infer bounds for integer variables as well. Note that this option may be very expensive

-profiling-plugin: specify a profiling plugin to use to monitor an execution of Alt-Ergo

-profiling <freq>: makes the profiling module print its information every <freq> seconds

-no-tcp: deactivate constraint propagation modulo theories

### Removed Capabilities

• the pruning module used in the frontend is now removed

• the SMT and SMT2 front-ends are removed. We plan to implement a new front-end for SMT2 in upcoming releases

## January 26, 2015

### GaGallium

#### Fast vectorizable math functions approximations

How and why I designed my own approximate implementations of log, exp, sin and cos. The approximation error is quite small, and the functions are fast and vectorizable. For example, my logapprox function is 7.5x faster in tight loops than the icsi_log function, while being more precise.


Recently, I helped my wife optimize some code doing simulations for her bio-statistics research project. Without going into details, her problem requires doing Bayesian inference using MCMC methods. Those methods are typically very CPU intensive, and in this particular code, most of the time was spent computing logarithms and exponentials. This is a bit disappointing, because the results of these computations are in the end only used as a probability from which we draw a Boolean (they are rejection probabilities of the Metropolis-Hastings algorithm). We were clearly spending a lot of time computing these numbers with full double precision, while we needed much less.

My first optimization idea was to use approximated versions of the logarithm and exponential functions. Namely, I used the well-known icsi_log function for the logarithm, and I wrote a similar function for the exponential. I am going to explain the principle for the logarithm: the exponential uses similar tricks, but in the other direction. The first idea is to transform the problem of the natural logarithm into the base-2 logarithm by multiplying the result by $$\log 2$$. Next, recall that floating-point numbers are represented by a mantissa and an exponent: using bitmasks, it is very easy to extract from a normalized floating-point number $$f$$ two numbers $$m$$ and $$e$$ such that $$f = m2^e$$, $$m \in [1,2[$$, $$e \in \mathbb{Z}$$. Thus, we have $$\log_2 f = \log_2 m + e$$, so the problem is reduced to computing the logarithm of a number in $$[1,2[$$, which is done in icsi_log using precomputed look-up tables.

While the use of these functions brought some speedup, there were two major drawbacks. First, because the look-up table must not be too big (otherwise it will not fit in low-level cache, and look-ups will be too slow), the precision is limited. Second, the loops that contained calls to the approximated functions accessed arrays at random indexes, and thus could not be vectorized.

Loop vectorization is a technique able to provide huge speedups (typically x3 at least) when the hottest part of the code is simple enough. Not-so-old processors have special instructions, called SIMD instructions, that execute the same operation on several pieces of data simultaneously. In the x86 world, they are called SSE instructions. They are typically used when doing repetitive operations on arrays, and can lead to big speedups. Until recently, if you wanted your code to be vectorized, you had to manually write code that explicitly used these features of the processor. This leads to code that is neither easily maintainable nor portable. However, a lot of progress has been made, and recent compilers are able to vectorize simple loops automatically: that is, you write loops as if they were completely sequential, and the compiler figures out that the iterations are independent and, when possible, binds them together using SIMD instructions. This can lead to huge speedups, and I wanted to take advantage of this opportunity. Automatic vectorization is enabled in recent versions of GCC when using the -O3 option. For some loops, it is necessary to add -funsafe-math-optimizations to allow GCC to reorder some operations and enable vectorization. The problem with icsi_log is that accessing tabulated values cannot be vectorized, and so these functions blocked GCC's automatic vectorization process.

I searched for other approximate implementations of these functions: I found Intel's Approximate Math Library, which is oldish, written in assembly (and thus not portable to other platforms), and would have needed rewriting to be used with GCC. Moreover, using it would have required manual vectorization, which is not handy.

So, I crafted my own version of log and exp, reusing the trick about the binary representation of floating-point numbers, but replacing the table look-up by a good polynomial approximation. I used the Sollya tool to find those polynomial approximations. Here is, for example, the Sollya script that can be used to find the best polynomial approximation of the natural logarithm in the interval $$[1, 2]$$ -- using the Remez algorithm. It also plots the error :

f = remez(log(x), 4, [1,2]);
plot(f-log(x), [1,2]);

These functions are vectorizable, giving big speedups if used in appropriate loops. We gain x21 for exponentials (compared to the expf function in glibc, which does not give much more precision) and x38 for logarithms (compared to the logf function in glibc, which gives almost twice as many correct mantissa bits). Moreover, their precision is quite good: for the exponential, the relative error is bounded by $$10^{-5}$$, and for the logarithm, the absolute error is bounded by $$7 \times 10^{-5}$$ (both bounds hold for "reasonable" inputs, that is, when denormal numbers are not involved). That's better than both the tabulated and Intel's versions.
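
To make the scheme concrete, here is a hedged OCaml sketch of the same idea (the real logapprox is C and uses a Sollya-optimized degree-4 polynomial; the degree-2 coefficients below are merely fitted at m = 1, 1.5 and 2 for illustration, so the error is much larger than in the real implementation):

```ocaml
(* Natural log via the exponent/mantissa split: x = m' * 2^e'.
   Float.frexp returns m' in [0.5, 1), so renormalize to m in [1, 2). *)
let log_approx x =
  let m, e = Float.frexp x in
  let m = 2.0 *. m and e = e - 1 in
  (* crude degree-2 approximation of log2 m on [1, 2),
     exact at m = 1, 1.5 and 2 *)
  let log2_m = -1.679703 +. m *. (2.019554 +. m *. (-0.339851)) in
  (float_of_int e +. log2_m) *. 0.69314718  (* back from base 2 to base e *)
```

With these toy coefficients the absolute error stays below about $$10^{-2}$$ for normal inputs; swapping in the Remez polynomial produced by the Sollya script above is what brings the error down to the range quoted in the previous paragraph.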

In the hope it could be useful to someone, I provide the implementations of both functions, logapprox and expapprox, with their benchmarking scripts (feel free to reproduce this), under MIT license:   https://github.com/jhjourdan/SIMD-math-prims

I also provide implementations (sinapprox, cosapprox) of approximated $$\sin$$ and $$\cos$$, based on polynomial approximations and correct only in $$[-\pi, \pi]$$, which represents a common use case. Future work would be to design approximations for other common math functions: I'd be happy to implement or merge your own, if asked.

Here are the performance results for the logarithm:

Benchmarking logf...         88.9M/s
Benchmarking icsi_log...    470.7M/s
Benchmarking logapprox...  3557.5M/s

And the accuracy measurements, compared against logf:

Comparing the behavior of logapprox against logf, in the interval [1e-10, 10]:
Bias:                      -1.672978e-06
Mean absolute error:        3.756378e-05
Mean square error:          4.211918e-05
Min difference:            -6.055832e-05
Max difference:             6.103516e-05

For reference, icsi_log, also compared against logf, is less precise:

Comparing the behavior of icsi_log against logf, in the interval [1e-10, 10]:
Bias:                       6.748079e-06
Mean absolute error:        1.771525e-04
Mean square error:          2.089661e-04
Min difference:            -4.338026e-04
Max difference:             4.688501e-04

## January 25, 2015

### Psellos

#### Further OCaml GC Disharmony

January 25, 2015

While working on an OCaml app to run in the iPhone Simulator, I discovered another wrapper code construct that looks plausible but is incorrect. I was able to reproduce the error in a small example under OS X, so I am writing it up for the record.

The error is in code that calls from Objective C into OCaml. In an OCaml iOS app these calls happen all the time, since events originate in iOS. You can imagine events being formed originally from an Objective C-like substance, then being remanufactured into an OCaml material and passed through the interface.

As a teensy example, assume that you want to create a point and a rectangle in C and pass them to a function in OCaml. To make it interesting, assume that you want to count the fraction of time a rectangle with randomly chosen origin and size (uniform values in [0, 1]) contains the point (0.5, 0.5).

Here is some C code that does this (r4b.c):

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

#define CAML_NAME_SPACE
#include "caml/memory.h"
#include "caml/alloc.h"
#include "caml/callback.h"

double dran()
{
    static unsigned long state = 72;
    state = state * 6364136223846793005L + 1442695040888963407L;
    return (double) state / (double) ULONG_MAX;
}

static value Val_point(double x, double y)
{
    CAMLparam0();
    CAMLlocal3(point, fx, fy);
    point = caml_alloc(2, 0);
    fx = caml_copy_double(x);
    fy = caml_copy_double(y);
    Store_field(point, 0, fx);
    Store_field(point, 1, fy);
    CAMLreturn(point);
}

static value ran_rect()
{
    CAMLparam0();
    CAMLlocal5(rect, fx, fy, fwd, fht);
    rect = caml_alloc(4, 0);
    fx = caml_copy_double(dran());
    fy = caml_copy_double(dran());
    fwd = caml_copy_double(dran());
    fht = caml_copy_double(dran());
    Store_field(rect, 0, fx);
    Store_field(rect, 1, fy);
    Store_field(rect, 2, fwd);
    Store_field(rect, 3, fht);
    CAMLreturn(rect);
}

int main(int ac, char *av[])
{
    CAMLparam0();
    int ct, i;
    CAMLlocal2(point, isinside);
    value *inside;

    caml_main(av);

    point = Val_point(0.5, 0.5);
    inside = caml_named_value("inside");

    ct = 0;
    for (i = 0; i < 1000000000; i++) {
        isinside = caml_callback2(*inside, point, ran_rect());
        if (Bool_val(isinside))
            ct++;
    }
    printf("%d (%f)\n", ct, (double) ct / (double) 1000000000);
    CAMLreturnT(int, 0);
}

This OCaml code tests whether the point is inside the rectangle (inside.ml):

let inside (px, py) (x, y, w, h) =
px >= x && px <= x +. w && py >= y && py <= y +. h

let () = Callback.register "inside" inside

The basic idea is sound, but if you build and run this code in OS X you see the following:

$ ocamlopt -output-obj -o inside.o inside.ml
$ W=`ocamlopt -where`; clang -I $W -L $W -o r4b r4b.c inside.o -lasmrun
$ r4b
Segmentation fault: 11

You, reader, are probably way ahead of me as usual, but the problem is in this line:

isinside = caml_callback2(*inside, point, ran_rect());

The problem is that ran_rect() allocates OCaml memory to hold the rectangle and its float values. Every once in a while, this will cause a garbage collection. If the OCaml value for point has already been calculated (i.e., if the parameters to caml_callback2 are evaluated left to right), this can cause the calculated value to become invalid before the call happens. This will lead to trouble: either a crash (as here) or, worse, the wrong answer.

The solution is to call ran_rect() beforehand:

int main(int ac, char *av[])
{
    CAMLparam0();
    int ct, i;
    CAMLlocal3(point, rect, isinside);
    value *inside;

    caml_main(av);

    point = Val_point(0.5, 0.5);
    inside = caml_named_value("inside");

    ct = 0;
    for (i = 0; i < 1000000000; i++) {
        rect = ran_rect();
        isinside = caml_callback2(*inside, point, rect);
        if (Bool_val(isinside))
            ct++;
    }
    printf("%d (%f)\n", ct, (double) ct / (double) 1000000000);
    CAMLreturnT(int, 0);
}

This revised version works correctly:

$ ocamlopt -output-obj -o inside.o inside.ml
$ W=`ocamlopt -where`; clang -I $W -L $W -o r4b r4b.c inside.o -lasmrun
$ r4b
140625030 (0.140625)

(If my calculations are correct, the expected fraction is indeed 9/64, or 0.140625.)

In retrospect the problem is obvious, but I’ve wondered for years whether this construct is OK. As far as I can tell it isn’t explicitly forbidden by any of the GC Harmony Rules. In many ways, though, it’s related to Rule 4: the calculated value to be passed is like a global value, in that it’s outside the reach of the CAMLlocal() macros.

The general principle seems to be that you shouldn’t write an expression as an argument to a function if it can cause OCaml allocation. If necessary, evaluate the expression before the call.

I hope this may help some other humble OCaml developer seeking to attune his or her life with the Garbage Collector. If you have any comments or encouragement, leave them below or email me at jeffsco@psellos.com.

Posted by: Jeffrey


### @typeocaml

#### Recursive Memoize & Untying the Recursive Knot

When I wrote the section When we need later substitution in Mutable, I struggled. I found out that I didn't fully understand recursive memoize myself, so all I could do was copy the knowledge from Real World OCaml. Luckily, after the post was published, glacialthinker commented on reddit:

(I had never thought before that a recursive function could be split like this, honestly. I don't know how one comes up with such a formulation and can't explain it further. I guess we just learn it as it is and continue. More on it can be found in the book.)

This is "untying the recursive knot". And I thought I might find a nice wikipedia or similar entry... but I mostly find Harrop. :) He actually had a nice article on this many years back in his OCaml Journal. Anyway, if the author swings by, searching for that phrase may turn up more material on the technique.

It greatly enlightened me. Hence, in this post, I will share with you my further understanding of recursive memoize, together with the key technique, untying the recursive knot, that makes it possible.

# Simple Memoize revamped

We talked about the simple memoize before. It takes a non-recursive function and returns a new function which has exactly the same logic as the original but with the new ability of caching (argument, result) pairs.

let memoize f =
  let table = Hashtbl.Poly.create () in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  g


The greatness of memoize is its flexibility: as long as f takes a single argument, memoize can make a memo version out of it without touching anything inside f.

This means while we create f, we don't need to worry about the ability of caching but just focus on its own correct logic. After we finish f, we simply let memoize do its job. Memoization and functionality are perfectly separated.

Unfortunately, the simple memoize cannot handle recursive functions. If we try to do memoize f_rec, we will get this:

f_rec is a recursive function, so it calls itself inside its body. memoize f_rec will produce f_rec_memo, which is a little similar to the previous f_memo, yet with the difference that applying it is not a single call of f_rec arg as it was with f arg. Instead, f_rec arg may call f_rec again and again with new arguments.

Let's look at it more closely with an example. Say, arg in the recursive process will be always decreased by 1 until 0.

1. Let's first do f_rec_memo 4.
2. f_rec_memo will check 4 against the Hashtbl and find it is not there.
3. So f_rec 4 will be called for the first time.
4. Then f_rec 3, f_rec 2, f_rec 1 and f_rec 0.
5. After the 5 calls, the result is obtained. The (4, result) pair is then stored in the Hashtbl and the result is returned.
6. Now let's do f_rec_memo 3. What will happen? Obviously, 3 won't be found in the Hashtbl, as only 4 was stored in step 5.
7. But should the (3, result) pair be found? Yes, of course, because we dealt with 3 in step 4, right?
8. Why has 3 been computed but not stored?
9. Ah, it is because we did f_rec 3, not f_rec_memo 3, while only the latter has the caching ability.

Thus, we can use memoize f_rec to produce a memoized version of f_rec anyway, but it changes only the surface, not the f_rec inside, and hence is not that useful. How can we make it better then?
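
The walk-through above is easy to observe by counting raw calls. Below is a self-contained sketch (hedged: it uses the stdlib Hashtbl rather than Core's, and a concrete Fibonacci-style f_rec, since the discussion leaves f_rec abstract). Memoizing only the surface caches the top-level argument (here 20) but none of the arguments visited inside:

```ocaml
let calls = ref 0

(* the raw recursive function: every entry increments the counter *)
let rec fib_rec n =
  incr calls;
  if n <= 1 then 1
  else fib_rec (n - 1) + fib_rec (n - 2)

(* surface-only memoization: caches the outer argument, nothing inside *)
let table : (int, int) Hashtbl.t = Hashtbl.create 16

let fib_rec_memo n =
  match Hashtbl.find_opt table n with
  | Some y -> y
  | None ->
    let y = fib_rec n in
    Hashtbl.add table n y;
    y

let () = ignore (fib_rec_memo 20)
let after_20 = !calls                (* raw calls for the first query *)
let () = ignore (fib_rec_memo 19)
let recomputed = !calls - after_20   (* nonzero: 19 was visited inside
                                        fib_rec 20, but never cached *)

let () = Printf.printf "%d %d\n" after_20 recomputed
```

The second query triggers thousands of fresh f_rec calls even though 19 was already computed during the first one, which is exactly the problem described above.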

# Recursive Memoize revamped

What we really want for memoizing a recursive function is to blend the memo ability deep inside, like this:

Essentially we have to replace f_rec inside with f_rec_memo:

And only in this way can f_rec be fully memoized. However, we have one problem: **it seems that we have to change the internals of f_rec**.

If we could modify f_rec directly, we could solve it easily. For instance, for Fibonacci:

let rec fib_rec n =
if n <= 1 then 1
else fib_rec (n-1) + fib_rec (n-2)


we can make the memoized version:

let fib_rec_memo_trivial n =
  let table = Hashtbl.Poly.create () in
  let rec fib_rec_memo x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y =
        if x <= 1 then 1
        else fib_rec_memo (x-1) + fib_rec_memo (x-2)
      in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  fib_rec_memo n


In the above solution, we replaced the original fib_rec inside with fib_rec_memo; however, we also changed the declaration to fib_rec_memo completely. In fact, fib_rec is now totally ditched, and fib_rec_memo is a new function that blends the logic of memoize with the logic of fib_rec.

Well, yes, fib_rec_memo_trivial can achieve our goal, but only for fib_rec specifically. If we need to make a memoized version of another recursive function, then we need to change the body of that function again. This is not what we want. We wish for a memoize_rec that can turn f_rec directly into a memoized version, just like what the simple memoize can do for f.

So we don't have a shortcut. Here is what we need to achieve:

1. We have to replace the f_rec inside the body of f_rec with f_rec_memo.
2. We have to keep the declaration of f_rec.
3. We must assume we can't know the specific logic inside f_rec.

It sounds a bit hard. It is like being given a compiled function without source code and being asked to modify its content. And more importantly, your solution must be generalised.

Fortunately, we have a great solution to create our memoize_rec without any hacking or reverse-engineering, and untying the recursive knot is the key.

# Untying the Recursive Knot

To me, this term sounds quite fancy. In fact, I had never heard of it before 2015-01-21. After I dug into it a little, I found it actually quite simple but very useful (this recursive memoize case is an ideal demonstration). Let's have a look at what it is.

Every recursive function somehow follows a similar pattern where it calls itself inside its body:

Once a recursive function application starts, it is out of our hands and we know it will continue and continue by calling itself until the STOP condition is satisfied. What if the users of our recursive function need some more control even after it gets started?

For example, say we provide our users fib_rec without source code; what if the users want to print out a detailed trace of each iteration? They are not able to unless they ask us for the source code and make a new version with printing. That is not very convenient.

So if we don't want to give out our source code, we somehow need to reshape our fib_rec a little and leave space for our users to insert whatever they want at each iteration.

let rec fib_rec n =
if n <= 1 then 1
else fib_rec (n-1) + fib_rec (n-2)


Look at the above fib_rec carefully again: the logic of fib_rec is already determined at the binding; it is the fib_recs inside that control the iteration. What if we rename the fib_recs within the body to f and add it as an argument?

let fib_norec f n =
if n <= 1 then 1
else f (n-1) + f (n-2)

(* we actually should now change the name of fib_norec
to something like fib_alike_norec as it is not necessarily
doing fibonacci anymore, depending on f *)


So now fib_norec won't automatically repeat unless f tells it to. Moreover, fib_norec becomes a pattern which returns 1 when n <= 1 and otherwise adds f (n-1) and f (n-2). As long as you think this pattern is useful to you, you can inject your own logic into it by providing your own f.

Going back to the printing requirement, a user can now build their own version of fib_rec_with_trace like this:

let rec fib_rec_with_trace n =
Printf.printf "now fibbing %d\n" n;
fib_norec fib_rec_with_trace n


Untying the recursive knot is a functional design pattern. It turns the recursive call inside the body into a new parameter f. In this way, it breaks the iteration and turns a recursive function into a pattern into which new or additional logic can be injected via f.

It is very easy to untie the knots of recursive functions. You just add a parameter f and replace f_rec everywhere inside with it. For example, for quicksort:

let quicksort_norec f = function
  | [] | _ :: [] as l -> l
  | pivot :: rest ->
    (* split rest into elements smaller than and not smaller than pivot *)
    let left, right = List.partition (fun x -> x < pivot) rest in
    f left @ (pivot :: f right)

let rec quicksort l = quicksort_norec quicksort l


There are more examples in Martin's blog, though they are not in OCaml. A formalized description of this topic is in the article Tricks with recursion: knots, modules and polymorphism from The OCaml Journal.

Now let's come back to recursive memoize problem with our new weapon.

# Solve Recursive Memoize

At first, we can require that every recursive function f_rec must be supplied to memoize_rec in the untied form f_norec. This is not a harsh requirement since it is easy to transform f_rec to f_norec.

Once we get f_norec, we of course cannot apply memoize (non-rec version) on it directly because f_norec now takes two parameters: f and arg.

Although we could create f_rec as let rec f_rec arg = f_norec f_rec arg, we won't do it that straightforwardly here, as it makes no sense to recover exactly the same recursive function. Instead, we can for now do something like let f_rec_tmp arg = f_norec f arg.

We still do not know what f will be, but f_rec_tmp is non-recursive, so we can apply memoize to it: let f_rec_tmp_memo = memoize f_rec_tmp.

f_rec_tmp_memo now has the logic of f_norec and the ability of memoization. If f can be f_rec_tmp_memo, then our problem is solved, because f is inside f_norec controlling the iteration, and we wished it to be memoized.

The magic that can help us here is making f mutable. Because f needs to be known beforehand while f_rec_tmp_memo is created afterwards, we can temporarily define f as a trivial function and, later on, after we create f_rec_tmp_memo, change f to f_rec_tmp_memo.

Let's use a group of bindings to demonstrate:

(* trivial initial function and it should not be ever applied in this state *)
let f = ref (fun _ -> assert false)

let f_rec_tmp arg = f_norec !f arg

(* memoize is the simple non-rec version *)
let f_rec_tmp_memo = memoize f_rec_tmp

(* the later substitution made possible by being mutable *)
f := f_rec_tmp_memo


The final code for memoize_rec is:

let memo_rec f_norec =
let f = ref (fun _ -> assert false) in
let f_rec_memo = memoize (fun x -> f_norec !f x) in
f := f_rec_memo;
f_rec_memo
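
To see the whole machinery run end to end, here is a self-contained variant (hedged: it uses the stdlib Hashtbl instead of Core's Hashtbl.Poly, purely so it compiles standalone) combining memoize, memo_rec and the untied fib_norec from earlier:

```ocaml
(* simple memoize for one-argument, non-recursive functions *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

(* untie the knot, then tie it back through the memoized reference *)
let memo_rec f_norec =
  let f = ref (fun _ -> assert false) in
  let f_rec_memo = memoize (fun x -> f_norec !f x) in
  f := f_rec_memo;
  f_rec_memo

(* the untied Fibonacci pattern from earlier in the post *)
let fib_norec f n =
  if n <= 1 then 1
  else f (n - 1) + f (n - 2)

let fib = memo_rec fib_norec
let () = Printf.printf "%d\n" (fib 80)
```

Without memoization, fib 80 would take an astronomically long time (the naive recursion makes an exponential number of calls); here it finishes instantly because every inner call goes through the cache.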


## January 21, 2015

### Coq

#### Coq 8.5 beta 1 is out!

The first beta release of Coq 8.5 is available for testing. The 8.5 version brings several major features to Coq:
• asynchronous editing of documents under CoqIDE to keep working on a proof while Coq checks the other proofs in the background (by Enrico Tassi);
• universe polymorphism making it possible to reuse the same definitions at various universe levels (by Matthieu Sozeau);
• primitive projections improving space and time efficiency of records, and adding eta-conversion for records (by Matthieu Sozeau);
• a new tactic engine allowing dependent subgoals, fully backtracking tactics, as well as tactics which can consider multiple goals together (by Arnaud Spiwack);
• a new reduction procedure called native_compute to evaluate terms using the OCaml native compiler, for proofs with large computational steps (by Maxime Dénès).
More information about the changes from 8.4 to 8.5 can be found in the CHANGES file. Feedback and bug reports are extremely welcome. Enjoy!

### Thomas Leonard

#### Securing the Unikernel

Back in July, I used MirageOS to create my first unikernel, a simple REST service for queuing file uploads, deployable as a virtual machine. While a traditional VM would be a complete Linux system (kernel, init system, package manager, shell, etc), a Mirage unikernel is a single OCaml program which pulls in just the features (network driver, TCP stack, web server, etc) it needs as libraries. Now it’s time to look at securing the system with HTTPS and access controls, ready for deployment.

(This post also appeared on Hacker News and Reddit.)

## Introduction

As a quick reminder, the service (“Incoming queue”) accepts uploads from various contributors and queues them until the (firewalled) repository software downloads them, checks the GPG signatures, and merges them into the public software repository, signed with the repository’s key:

Although the queue service isn’t security critical, since the GPG signatures are made and checked elsewhere, I would like to ensure it has a few properties:

• Only the repository can fetch items from the queue.
• Only authorised users can upload to it.
• I can see where an upload came from and revoke access if necessary.
• An attacker cannot take control of the system and use it to attack other systems.

We often think of security as a set of things we want to prevent - taking away possible actions from a fundamentally vulnerable underlying system (such as my original implementation, which had no security features). But ideally I’d like every component of the system to be isolated by default, with allowed interactions (shown here by arrows) specified explicitly. We should then be able to argue (informally) that the system will meet the goals above without having to verify that every line of code is correct.

My unikernel is written in OCaml and runs as a guest OS under the Xen hypervisor, so let’s look at how well those technologies support isolation first…

## OCaml

I want to isolate the components of my unikernel, giving each just the access it requires. When writing an OS, some unsafe code will occasionally be needed, but it should be clear which components use unsafe features (so they can be audited more carefully), and unsafe features shouldn’t be needed often.

For example, the code for handling an HTTP upload request should only be able to use our on-disk queue’s Uploader interface and its own HTTP connection. Then we would know that an attacker with upload permission can only cause new items to be added to the queue, no matter how buggy that code is. It should not be able to read the web server’s private key, establish new out-bound connections, corrupt the disk, etc.

Like most modern languages, OCaml is memory-safe, so components can’t interfere with each other through buggy pointer arithmetic or unsafe casts of the kind commonly found in C code.

But we also need to avoid global variables, which would allow two components to communicate without us explicitly connecting them. I can’t reason about the security of the system by looking at arrows in the architecture diagrams if unconnected components can magically create new arrows by themselves! I’ve seen a few interesting approaches to this problem (please correct me if I’ve got this wrong):

• Haskell avoids all side-effects, which makes global variables impossible (without using “unsafe” features), since updating them would be a side-effect.
• Rust almost avoids the problem of globals through its ownership types. Since only one thread can have a pointer to a mutable value at a time, mutable values can’t normally be global. Rust does allow “mutable statics” (which contain only pointers to fixed data), but requires an explicit “unsafe” block to use them, which is good.
• E allows modules to declare top-level variables, but each time the module is imported it creates a fresh copy.

OCaml does allow global variables, but by convention they are generally not used.

A second problem is controlling access to the outside world, including the network and disks (which you could consider to be more global variables):

• Haskell doesn’t allow functions to access the outside world at all, but they can return an IO type if they want to do something (the caller must then pass this value up to the top level). This makes it easy to see that e.g. evaluating a function “uriPath :: URI -> String” cannot access the network. However, it appears that all IO gets lumped in together: a value of type IO String may cause any side-effects at all (disk, network, etc), so the entire side-effecting part of the program needs to be audited.
• Rust allows all code full access to the system via its standard library. For example, any code can read or write any file.
• E passes all access to the outside world to the program’s entry point. For example, <file> grants access to the file system and <unsafe> grants access to all unsafe features. These can be passed to libraries to grant them access, and can be attenuated (wrapped) to provide limited access.

For example:

<notextile><figure class="code"><figcaption>main.e </figcaption>
def queue := makeQueue(<file:/var/myprog/queue>)
</figure></notextile>

Here, queue has read-write access to the /var/myprog/queue sub-tree (and nothing else). It also has no way to share data with any other parts of the program, including other queues.

Like Rust, OCaml does not limit access to the outside world. However, Mirage itself uses E-style dependency injection everywhere, with the unikernel’s start function being passed all external resources as arguments:

<notextile><figure class="code"><figcaption>unikernel.ml </figcaption>
module Main (C : V1_LWT.CONSOLE)
            (B : V1_LWT.BLOCK)
            (H : Cohttp_lwt.Server) = struct

  module Q = Upload_queue.Make(B)

  let start console block http =
    lwt queue = Q.create block in
    ...
end
</figure></notextile>

Because everything in Mirage is defined using abstract types, libraries always expect to be passed the things they need explicitly. We know that Upload_queue above won’t access a block device directly because it needs to support different kinds of block device.

OCaml does enforce its abstractions. There’s no way for the Upload_queue to discover that block is really a Xen block device with some extra functionality (as a Java program might do with if (block instanceof XenBlock), for example). This means that we can reason about the limits of what functions may do by looking only at their type signatures.

The use of functors means you can attenuate access as desired. For example, if we want to grant just part of block to the queue then we can create our own module implementing the BLOCK type that exposes just some partition of the device, and pass that to the Upload_queue.Make functor.
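The same attenuation pattern is easy to sketch outside OCaml. Here is a minimal Python illustration (the `RamDisk` and `ReadOnlyBlock` names are invented for this sketch): a wrapper exposes only part of a device’s interface, so code holding the wrapper simply has no way to reach the rest.

```python
class RamDisk:
    """Toy block device: a list of fixed-size sectors (invented for this sketch)."""
    def __init__(self, n_sectors, sector_size=512):
        self.sectors = [bytes(sector_size) for _ in range(n_sectors)]

    def read(self, sector):
        return self.sectors[sector]

    def write(self, sector, data):
        self.sectors[sector] = data


class ReadOnlyBlock:
    """Attenuated view: forwards reads but provides no write method at all,
    so holders of this object cannot modify the underlying device."""
    def __init__(self, block):
        self._block = block

    def read(self, sector):
        return self._block.read(sector)


disk = RamDisk(8)
disk.write(0, b"secret".ljust(512, b"\0"))
view = ReadOnlyBlock(disk)
assert view.read(0).startswith(b"secret")   # reading still works
assert not hasattr(view, "write")           # but no write capability exists
```

The functor version does the same job, with the added benefit that OCaml’s abstract types prevent the holder from casting the view back to the full device.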

In summary then, we can reason about security fairly well in Mirage if we assume the libraries are not malicious, but we cannot make hard guarantees. It should be possible to check with automatic static analysis that we’re not using any “unsafe” features such as global variables, direct access to devices, or allocating uninitialised memory, but I don’t know of any tools to do that (except Emily, but that seems to be just a proof-of-concept). But these issues are minor: any reasonably safe modern language will be a huge improvement over legacy C or C++ systems!

## Xen

The Xen hypervisor allows multiple guest operating systems to run on a single physical machine. It is used by many cloud hosting providers, including Amazon’s AWS. I run it on my CubieTruck - a small ARM board. Xen allows me to run my unikernel on the same machine as other services, but ideally with the same security properties as if it had its own dedicated machine. If some other guest on the machine is compromised, it shouldn’t affect my unikernel, and if the unikernel is compromised then it shouldn’t affect other guests.

A typical Xen deployment, running four domains.

The diagram above shows a deployment with Linux and Mirage guests. Only dom0 has access to the physical hardware; the other guests only see virtual devices, provided by dom0.

How secure is Xen? The Xen security advisories page shows that there are about 3 new Xen advisories each month. However, it’s hard to compare programs this way because the number of vulnerabilities reported depends greatly on the number of people using the program, whether they use it for security-critical tasks, and how seriously the project takes problems (e.g. whether a denial-of-service attack is considered a security bug).

I started using Xen in April 2014. These are the security problems I’ve found myself so far:

XSA-93: Hardware features unintentionally exposed to guests on ARM
While trying to get Mini-OS to boot, I tried implementing ARM’s recommended boot code for invalidating the cache (at startup, you should invalidate the cache, dropping any previous contents). When running under a hypervisor this should be a no-op, since the cache is always valid by the time a guest is running, but Xen allowed the operation to go ahead, discarding pending writes by the hypervisor and other guests and crashing the system. This could probably be used for a successful attack.
XSA-94: Hypervisor crash setting up GIC on arm32
I tried to program an out-of-range register, causing the hypervisor to dereference a null pointer and panic. This is just a denial of service (the host machine needs to be rebooted); it shouldn’t allow access to other VMs.
I got the length wrong when creating the zImage for the unikernel (I included the .bss section in the length). The xl create tool didn’t notice and tried to read the extra data, causing the tool to segfault. You could use this to read a bit of private data from the xl process, but it’s unlikely there would be anything useful there.

Although that’s more bugs than you might expect, note that they’re all specific to the relatively new ARM support. The second and third are both due to using C, and would have been avoided in a safer language. I’m not really sure why the “xl” tool needs to be in C - that seems to be asking for trouble.

To drive the physical hardware, Xen runs the first guest (dom0) with access to everything. This is usually Linux, and I had various problems with that. For example:

Bug sharing the same page twice
Linux got confused when the unikernel shared the same page twice (it split a single page of RAM into multiple TCP segments). This wasn’t a security bug (I think), but after it was fixed, I then got:
Page still granted
If my unikernel sent a network packet and then exited quickly, dom0 would get stuck and I’d have to reboot.
Oops if network is used too quickly
The Linux dom0 initialises the virtual network device while the VM is booting. My unikernel booted fast enough to send packets before the device structure had been fully filled in, leading to an oops.
Linux Dom0 oops and reboot on indirect block requests
Sending lots of block device requests to Linux dom0 from the unikernel would cause Linux to oops in swiotlb_tbl_unmap_single and reboot the host. I wasn’t the first to find this though, and backporting the patch from Linux 3.17 seemed to fix it (I don’t actually know what the problem was).

So it might seem that using Xen doesn’t get us very far. We’re still running Linux in dom0, and it still has full access to the machine. For example, a malicious network packet from outside or from a guest might still give an attacker full control of the machine. Why not just use KVM and run the guests under Linux directly?

The big (potential) advantage of Xen here is Dom0 Disaggregation. With this, Dom0 gives control of different pieces of physical hardware to different VMs rather than driving them itself. For example, Qubes (a security-focused desktop OS using Xen) runs a separate “NetVM” Linux guest just to handle the network device. This is connected only to the FirewallVM - another Linux guest that just routes packets to other VMs.

This is interesting for two reasons. First, if an attacker exploits a bug in the network device driver, they’re still outside your firewall. Secondly, it provides a credible path to replacing parts of Linux with alternative implementations, possibly written in safer languages. You could, for example, have Linux running dom0 but use FreeBSD to drive the network card, Mirage to provide the firewall, and OpenBSD to handle USB.

Finally, it’s worth noting that Mirage is not tied to Xen, but can target various systems (mainly Unix and Xen currently, but there is some JavaScript support too). If it turns out that e.g. Genode on seL4 (a formally verified microkernel) provides better security, we should be able to support that too.

## Transport Layer Security

We won’t get far securing the system while attackers can read and modify our communications. The ocaml-tls project provides an OCaml implementation of TLS (Transport Layer Security), and in September Hannes Mehnert showed it running on Mirage/Xen/ARM devices. Given the various flaws exposed recently in popular C TLS libraries, an OCaml implementation is very welcome. Getting the Xen support in a state where it could be widely used took a bit of work, but I’ve submitted all the patches I made, so it should be easier for other people now - see https://github.com/mirage/mirage-dev/pull/52.

### C stubs for Xen

TLS needs some C code for the low-level cryptographic functions, which have to be constant time to avoid leaking information about the key, so first I had to make packages providing versions of libgmp, ctypes, zarith and nocrypto compiled to run in kernel mode.

You need to compile C code specially for kernel mode because, on x86 processors, user-mode code can assume the existence of a red zone, which allows some optimisations that aren’t safe in kernel mode.

### Ethernet frame alignment

The Mirage network driver sends Ethernet frames to dom0 by sharing pages of memory. Each frame must therefore be contained in a single page. The TLS code was (correctly) passing a large buffer to the TCP layer, which incorrectly asked the network device to send each TCP-sized chunk of it. Chunks overlapping page boundaries then got rejected.
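The constraint is simple to state: a chunk may only be handed to the network device if it does not straddle a page boundary. A small Python sketch of the check (assuming 4 KB pages; illustrative only, not Mirage code):

```python
PAGE_SIZE = 4096

def fits_in_one_page(offset, length):
    """True if the byte range [offset, offset+length) lies within one page."""
    if length <= 0 or length > PAGE_SIZE:
        return False
    return offset // PAGE_SIZE == (offset + length - 1) // PAGE_SIZE

# A 1460-byte TCP segment starting 3000 bytes into a page crosses a boundary:
print(fits_in_one_page(3000, 1460))  # False
print(fits_in_one_page(0, 1460))     # True
```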

My previous experiments with tracing the network layer had shown that we actually share two pages for each packet: one for the IP header and one for the payload. Doing this avoids the need to copy the data to a new buffer, but adds the overhead of granting and revoking access to both pages. I modified the network driver to copy the data into a single block inside a single page and got a large speed boost. Indeed, it got so much faster that it triggered a bug handling full transmit buffers - which made it initially appear slower!

In addition to fixing the alignment problem when using TLS, and being faster, this has a nice security benefit: the only data shared with the network driver domain is data explicitly sent to it. Before, we had to share the entire page of memory containing the application’s buffer, and there was no way to know what else might have been there. This offers some protection if the network driver domain is compromised.

### HTTP API

My original code configured a plain HTTP server on port 8080 like this:

<notextile><figure class="code"><figcaption>config.ml </figcaption>
let server =
  http_server (`TCP (`Port 8080)) (conduit_direct (stack default_console))

let () =
  register "queue" [
    queue $ default_console $ storage $ server
  ]
</figure></notextile>

stack creates the TCP/IP stack. conduit_direct can dynamically select different transports (e.g. http or vchan). http_server applies the configuration to the conduit to get an HTTP server using plain HTTP.

I added support to Conduit_mirage to let it wrap any underlying conduit with TLS. However, the configuration needed for TLS is fairly complicated, and involves a secret key which must be protected. Therefore, I switched to creating only the conduit in config.ml and having the unikernel itself load the key and certificate, by copying a local “keys” directory into the unikernel image as a “crunch” filesystem:

<notextile><figure class="code"><figcaption>config.ml </figcaption>
let conduit =
  conduit_direct ~tls:(tls_over_conduit default_entropy) (stack default_console)

let () =
  register "queue" [
    queue $ default_console $ storage $ conduit $ crunch "keys"
  ]
</figure></notextile>

<notextile><figure class="code"><figcaption>unikernel.ml </figcaption>
let start c b conduit kv =
  lwt certificate = X509.certificate kv `Default in
  let tls_config = Tls.Config.(of_server (server ~certificate ())) in
  let http spec =
    let ctx = conduit in
    let mode = `TLS (tls_config, `TCP (`Port 8443)) in
    Conduit.serve ~ctx ~mode (H.listen spec) in
  ...
</figure></notextile>

This runs a secure HTTPS server on port 8443. The rest of the code is as before.

### The private key

The next question is where to store the real private key. The examples provided by the TLS package compile it into the unikernel image using crunch, but it’s common to keep unikernel binaries in Git repositories, and people don’t expect kernel images to contain secrets. In a traditional Linux system, we’d store the private key on the disk, so I decided to try the same here.
(I did think about storing the key encrypted in the unikernel and storing the password on the disk, so you’d need both to get the key, but the TLS library doesn’t support encrypted keys yet.)

I don’t use a regular filesystem for my queuing service, and I wouldn’t want to share it with the key if I did, so instead I reserved a separate 4KB partition of the disk for the key. It turned out that Mirage already has partitioning support in the form of the ocaml-mbr library. I didn’t actually create an MBR at the start, but just used the Mbr_partition functor to split the underlying block device into two parts.

How safe is this? I don’t want to audit all the code for handling the queue, and I shouldn’t have to: the only components with access to the key are the disk, the partitions and the TLS library. We need to trust that the TLS library will protect the key (not easy, but that’s its job) and that queue_partition won’t let queue access the part of the disk with the key. We also need to trust the disk, but if the partitions only allow correct requests through, that shouldn’t be too much to ask.

### The partition code

Before relying on the partition code, we’d better take a look at it, because it may not have been designed to enforce security.
Indeed, a quick look at the code shows that it isn’t:

<notextile><figure class="code">
let read t start_sector buffers =
  let length = Int64.add start_sector (length buffers) in
  if length > t.length_sectors
  then return (`Error (`Unknown (Printf.sprintf "read %Ld %Ld out of range" start_sector length)))
  else B.read t.b (Int64.add start_sector t.start_sector) buffers

let write t start_sector buffers =
  let length = Int64.add start_sector (length buffers) in
  if length > t.length_sectors
  then return (`Error (`Unknown (Printf.sprintf "write %Ld %Ld out of range" start_sector length)))
  else B.write t.b (Int64.add start_sector t.start_sector) buffers
</figure></notextile>

It checks only that the requested start sector plus the length of the result buffers doesn’t exceed the length of the partition. To (hopefully) make this bullet-proof, I:

• moved the checks into a single function, so we don’t have to check two copies,
• added a check that the start sector isn’t negative,
• modified the end check to avoid integer overflow, and
• added some unit-tests.

<notextile><figure class="code">
let adjust_start name op t start_sector buffers =
  let buffers_len_sectors = length t buffers in
  if start_sector < 0L ||
     start_sector > t.id.length_sectors -- buffers_len_sectors
  then return (`Error (`Unknown (Printf.sprintf "%s %Ld+%Ld out of range" name start_sector buffers_len_sectors)))
  else op t.id.b (Int64.add start_sector t.id.start_sector) buffers

let read = adjust_start "read" B.read
let write = adjust_start "write" B.write
</figure></notextile>

The need to protect against overflow is an annoyance. OCaml’s Int64.(add max_int one) doesn’t abort, but returns Int64.min_int. That’s disappointing, but not surprising. I wrote a unit-test that tried to read sector Int64.max_int and ran it (before updating the code) to check it detected the problem.
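As an aside, the shape of that corrected check translates directly to other languages. A Python sketch (illustrative only; Python’s integers don’t overflow, but the point is to compare `start` against `length - buffer` rather than computing `start + buffer`, which can wrap in a fixed-width type):

```python
def in_partition(start_sector, buf_sectors, length_sectors):
    """Overflow-safe bounds check: rejects negative starts, and compares
    start against (length - buffer) instead of computing start + buffer."""
    if start_sector < 0 or buf_sectors < 0:
        return False
    return start_sector <= length_sectors - buf_sectors

INT64_MAX = 2**63 - 1
assert in_partition(0, 8, 8)               # exactly fills the partition
assert not in_partition(7, 2, 8)           # runs off the end
assert not in_partition(-1, 1, 8)          # negative start
assert not in_partition(INT64_MAX, 1, 8)   # the attack from the unit-test
```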
I was expecting the partition code to pass the request to the underlying block device, which I expected to return an error about the invalid sector, but it didn’t! It turns out that Int64.to_int (used by my in-memory test block device) silently truncates out-of-range integers:

<notextile><figure class="code">
# Int64.max_int;;
- : int64 = 9223372036854775807L
# Int64.(add max_int one);;
- : int64 = -9223372036854775808L
# Int64.min_int;;
- : int64 = -9223372036854775808L
# Int64.(to_int min_int);;
- : int = 0
</figure></notextile>

So, if the queue can be tricked into asking for sector 9223372036854775807, then the partition would accept it as valid and the block device would truncate it and give access to sector 0: the sector with the private key!

Still, this is a nice demonstration of how we can add security in Mirage by inserting a new module (Mbr_partition) between two existing ones. Rather than having some complicated fixed policy language (e.g. SELinux), we can build whatever security abstractions we like. Here I just limited which parts of the disk the queue could access, but we could do many other things: make a partition read-only, make it readable only until the unikernel finishes initialising, apply rate-limiting on reads, etc.

Here’s the final code. It:

1. Takes a block device, a conduit, and a KV store as inputs.
2. Creates two partitions (views) onto the block device.
3. Creates a queue on one partition.
4. Reads the private key from the other, and the certificate from the KV store.
5. Begins serving HTTPS requests.

<notextile><figure class="code"><figcaption>unikernel.ml </figcaption>
module Main (C : V1_LWT.CONSOLE)
            (B : V1_LWT.BLOCK)
            (Conduit : Conduit_mirage.S)
            (CertStore : V1_LWT.KV_RO) = struct
  module Part = Mbr_partition.Make(B)
  module Q = Upload_queue.Make(Part)
  module Http = HTTP.Make(Conduit)

  [...]

  let start c b conduit cert_store =
    B.get_info b >>= fun info ->
    let key_sectors =
      (key_partition_size + info.B.sector_size - 1) / info.B.sector_size
      |> Int64.of_int in
    let queue_sectors = info.B.size_sectors -- key_sectors in

    Part.(connect {b; start_sector = 0L; length_sectors = key_sectors})
    >>|= fun key_partition ->
    Part.(connect {b; start_sector = key_sectors; length_sectors = queue_sectors})
    >>|= fun queue_partition ->

    Q.create queue_partition >>= fun q ->

    lwt certs =
      read_from_kv cert_store "tls/server.pem" >|= X509.Cert.of_pem_cstruct in
    lwt pk =
      read_from_partition key_partition ~len:private_key_len
      >|= X509.PK.of_pem_cstruct1 in
    let tls_config = Tls.Config.(of_server (server ~certificate:(certs, pk) ())) in

    let spec = Http.Server.make ~callback:(callback q) ~conn_closed () in
    let mode = `TLS (tls_config, `TCP (`Port 8443)) in
    Conduit.serve ~ctx:conduit ~mode (Http.Server.listen spec)
end
</figure></notextile>

### Entropy

Another issue is getting good random numbers, which is required for the cryptography. On start-up, the unikernel displayed:

Entropy_xen_weak: using a weak entropy source seeded only from time.

To fix this, you need to use Dave Scott’s version (with a slight patch from me):

opam pin add mirage-entropy-xen 'https://github.com/talex5/mirage-entropy.git#handshake'

You should then see:

Entropy_xen: attempting to connect to Xen entropy source org.openmirage.entropy.1
Console.connect org.openmirage.entropy.1: doesn't currently exist, waiting for hotplug

Now run xentropyd in dom0 to share the host’s entropy with guests. The interesting question here is what Linux guests do for entropy, especially on ARM where there’s no RdRand instruction.

## Access control

Traditional means of access control involve issuing users with passwords or X.509 client certificates, which they share with the software they’re running. All requests sent by the client can then be authenticated as coming from them and approved based on some access control policy.
This approach leads to all the well-known problems with traditional access control: the confused deputy problem, Cross-Site Request Forgery, Clickjacking, etc., so I want to avoid that kind of “ambient authority”.

The previous diagram let us reason about how the different components within the unikernel could interact with each other, showing the possible (initial) interactions with arrows. Now I want to stretch arrows across the Internet, so I can reason in the same way about the larger distributed system that includes my queue service with the uploaders and downloaders.

Like C pointers, traditional web URLs do not give us what we want: a compromised CA anywhere in the world will allow an attacker to impersonate our service, and our URLs may be guessable. Instead, I decided to try a YURL:

“[…] the identifier MUST provide enough information to: locate the target site; authenticate the target site; and, if required, establish a private communication channel with the target site. A URL that meets these requirements is a YURL.”

The latest version of this (draft) scheme I could find was some brief notes in HTTPSY (2014), which uses the format:

httpsy://algorithm:fingerprint@domain:port/path1/!redactedPath2/…

There are two parts we need to consider: how the client determines that it is connected to the real service, and how the service determines what the client can do.

To let the client authenticate the server without relying on the CA system, YURLs include a hash (fingerprint) of the server’s public key. You can get the fingerprint of an X509 certificate like this:

$ openssl x509 -in server.pem -fingerprint -sha256 -noout
SHA256 Fingerprint=3F:27:2D:E6:D6:3D:7C:08:E0:E3:EF:02:A8:DA:9A:74:62:84:57:21:B4:72:39:FD:D0:72:0E:76:71:A5:E9:94


Base32-encoding shortens this to h4ts3zwwhv6aryhd54bkrwu2orriivzbwrzdt7oqoihhm4nf5gka.

Alternatively, to get the value with OCaml, use:

<notextile><figure class="code">
Certificate.cs_of_cert cert |> Nocrypto.Hash.digest `SHA256
</figure></notextile>
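The same shortening can be reproduced in Python from the openssl output above (the expected result is the fingerprint given earlier in this post):

```python
import base64

# SHA256 fingerprint exactly as printed by openssl above
fp_hex = ("3F:27:2D:E6:D6:3D:7C:08:E0:E3:EF:02:A8:DA:9A:74:"
          "62:84:57:21:B4:72:39:FD:D0:72:0E:76:71:A5:E9:94")

raw = bytes.fromhex(fp_hex.replace(":", ""))
short = base64.b32encode(raw).decode("ascii").lower().rstrip("=")
print(short)  # h4ts3zwwhv6aryhd54bkrwu2orriivzbwrzdt7oqoihhm4nf5gka
```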

To control what each user of the service can do, we give each user a unique YURL containing a Swiss number, which is like a password except that it applies only to a specific resource, not to the whole site. The Swiss number comes after the ! in the URL, which indicates to browsers that it shouldn’t be displayed, included in Referer headers, etc. You can use any unguessably long random string here (I used pwgen 32 1). After checking the server’s fingerprint, the client requests the path with the Swiss number included.

Putting it all together, then, a sample URL to give to the downloader looks like this:

httpsy://sha256:h4ts3zwwhv6aryhd54bkrwu2orriivzbwrzdt7oqoihhm4nf5gka@10.0.0.2:8443/downloader/!eequuthieyexahzahShain0abeiwaej4


The old code for handling requests looked like this:

<notextile><figure class="code">
match Uri.path request.Cohttp.Request.uri with
| "/uploader" -> handle_uploader q request body
| "/downloader" -> handle_downloader q request
</figure></notextile>

This becomes:

<notextile><figure class="code">
let re_endpoint = Str.regexp "^/\\(uploader\\|downloader\\)/!\\(.*\\)$" in
let path = Uri.path request.Cohttp.Request.uri in
if Str.string_match re_endpoint path 0 then (
  let label = Str.matched_group 1 path in
  let swiss_hash = Str.matched_group 2 path
    |> Cstruct.of_string
    |> Nocrypto.Hash.digest `SHA256
    |> Cstruct.to_string
    |> B64.encode in
  match label, swiss_hash with
  | "uploader", "kW8VOKYP1eu/cWInpJx/jzYDSJzo1RUR14GoxV/CImM=" ->
      handle_uploader q request body ~user:"Alice"
  | "downloader", "PEj3nuboy3BktVGzi9y/UBmgkrGuhHD1i6WsXDw1naI=" ->
      handle_downloader q request ~user:"0repo"
  | _ -> bad_path path
) else bad_path path
</figure></notextile>

I hashed the Swiss number here so that the unikernel doesn’t have to contain any secrets and I therefore don’t have to worry about timing attacks. Even if the attacker knows the hash we’re looking for, they still shouldn’t be able to generate a URL which hashes to that value.
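The scheme is easy to model in a few lines of Python (the Swiss number and user name here are placeholders, not the real ones): the service stores only hashes, and maps each (endpoint, hash) pair to the user it was issued to.

```python
import base64
import hashlib

def swiss_hash(swiss):
    """SHA256 of the Swiss number, base64-encoded: safe to embed in the image."""
    return base64.b64encode(hashlib.sha256(swiss.encode()).digest()).decode("ascii")

# Recorded at deployment time; only hashes are stored in the unikernel.
ACL = {("uploader", swiss_hash("example-swiss-number")): "Alice"}

def lookup(label, presented_swiss):
    """Return the user who owns this endpoint + Swiss number pair, or None."""
    return ACL.get((label, swiss_hash(presented_swiss)))

assert lookup("uploader", "example-swiss-number") == "Alice"
assert lookup("uploader", "wrong-guess") is None
```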

By giving each user of the service a different Swiss number we can keep records of who authorised each request and revoke access individually if needed (here the ~user:"Alice" indicates this is the uploader URL we gave to Alice).

Of course, the YURLs need to be sent to users securely too. In my case, the users already have known GPG keys, so I can just email them an encrypted version.

### Python client

The downloader (0repo) is written in Python, so the next step was to check that it could still access the service. The Python SSL API was rather confusing, but this seems to work:

<notextile><figure class="code"><figcaption>client.py </figcaption>
#!/usr/bin/python3
from urllib.parse import urlparse
from http import client
import ssl
import hashlib
import base64

# Note: MUST check fingerprint BEFORE sending URL path with Swiss number

class FingerprintContext:
    required_fingerprint = None
    verify_mode = ssl.CERT_REQUIRED
    check_hostname = False

    def __init__(self, fingerprint):
        self.required_fingerprint = fingerprint

    def wrap_socket(self, sock, server_hostname, **kwargs):
        wrapped = ssl.wrap_socket(sock, **kwargs)
        cert = wrapped.getpeercert(binary_form = True)
        actual_fingerprint = base64.b32encode(hashlib.sha256(cert).digest()).lower().decode('ascii').strip('=')
        if actual_fingerprint != self.required_fingerprint:
            raise Exception("Expected server certificate to have fingerprint:\n%s but got:\n%s" % (
                self.required_fingerprint, actual_fingerprint))
        return wrapped

# Testing...
url = "httpsy://sha256:h4ts3zwwhv6aryhd54bkrwu2orriivzbwrzdt7oqoihhm4nf5gka@localhost:8443/downloader/!eequuthieyexahzahShain0abeiwaej4"

parsed = urlparse(url)
assert parsed.scheme == "httpsy"
assert parsed.username == "sha256"
conn = client.HTTPSConnection(
        host = parsed.hostname,
        port = parsed.port,
        context = FingerprintContext(parsed.password),
        check_hostname = False)
conn.request("GET", parsed.path)
resp = conn.getresponse()
print("Fetching item of size %s..." % resp.getheader('Content-Length'))
d = resp.read()
print("Fetched %d bytes" % len(d))
</figure></notextile>

## Conclusions

MirageOS should allow us to build systems that are far more secure than traditional operating systems. By starting with isolated components and then connecting them together in a controlled way we can feel some confidence that our security goals will be met.

At the language level, OCaml’s abstract types and functors make it easy to reason informally about how the components of our system will interact. Mirage passes values granting access to the outside world (disks, network cards, etc) to our unikernel’s start function. Our code can then delegate these permissions to the rest of the code in a controlled fashion. For example, we can grant the queuing code access only to its part of the disk (and not the bit containing the TLS private key) by wrapping the disk in a partition functor. Although OCaml doesn’t actually prevent us from bypassing this system and accessing devices directly, code that does so would not be able to support the multiple different back-ends (e.g. Unix and Xen) that Mirage requires and so could not be written accidentally. It should be possible for a static analysis tool to verify that modules don’t do this.

Moving up a level from separating the components of our unikernel, Xen allows us to isolate multiple unikernels and other VMs running on a single physical host. Just as we interposed a disk partition between the queue and the disk within the unikernel, we can use Xen to interpose a firewall VM between the physical network device and our unikernel.

Finally, the use of transport layer security and YURLs allows us to continue this pattern of isolation to the level of networks, so that we can reason in the same way about distributed systems. My current code mixes the handling of YURLs with the existing application logic, but it should be possible to abstract this and make it reusable, so that remote services appear just like any local service. In many systems this is awkward because local APIs are used synchronously while remote ones are asynchronous, but in Mirage everything is non-blocking anyway, so there is no difference.

I feel I should put some kind of warning here about these very new security features not being ready for real use and how you should instead use mature, widely deployed systems such as Linux and OpenSSL. But really, can it be any worse?

If you’ve spotted any flaws in my reasoning or code, please add comments! The code for this unikernel can be found on the tls branch at https://github.com/0install/0repo-queue/tree/tls.

### GaGallium

#### Left-recursive versus right-recursive lists in LR parsers

Contrary to top-down (LL) parsers, which do not support left recursion, bottom-up (LR) parsers support both left recursion and right recursion. When defining a list-like construct, a user of an LR parser generator, such as Menhir, faces a choice between these two flavors. Which one should she prefer?

Two considerations guide this choice: expressiveness (which flavor leads to fewer conflicts?) and performance (which flavor leads to a more efficient parser?).

In this post, I am mainly interested in discussing expressiveness. I also comment on performance in the setting of Menhir.

As we will see, the bottom line is that neither formulation seems deeply preferable to the other.

### An example where left recursion is preferable

This example is inspired by the syntax of C variable-argument functions. Suppose we wish to parse a list of arguments separated with commas. Furthermore, after this list, we wish to allow an optional ellipsis.

arguments:
separated_list(COMMA, arg) vararg

vararg:
/* epsilon */
| COMMA ELLIPSIS

There is a potential problem because the token COMMA appears both as a delimiter within separated_list and as the first symbol of vararg.

If one uses a left-recursive definition of separated_list, everything is fine (for reference, the code for left- and right-recursive definitions is listed at the end of this post). When an LR(1) parser encounters COMMA, it shifts, which means that it explores in parallel two possibilities: either this is the beginning of a new list element, or this is the beginning of a vararg. Looking at the next token, which should be either the beginning of an arg or ELLIPSIS, resolves this choice, and all is well.

If one uses a right-recursive definition of separated_list, then a conflict appears. When an LR(1) parser encounters the very first COMMA, it must immediately decide whether this COMMA announces the continuation of the list (in which case it should shift) or is the beginning of a vararg (in which case it should reduce what it has read up to this point to a list).
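For concreteness, the two flavors of separated list look roughly like this in Menhir notation (an illustrative sketch; the full definitions referred to at the end of the post follow the same shape):

```
(* left-recursive: the list grows at the left; reduce after each element *)
separated_nonempty_list_left(sep, elem):
  elem
| separated_nonempty_list_left(sep, elem) sep elem

(* right-recursive: the list grows at the right; reductions happen at the end *)
separated_nonempty_list_right(sep, elem):
  elem
| elem sep separated_nonempty_list_right(sep, elem)
```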

This example has been encountered in real life by a user of Menhir, and is also described online (Practical Considerations for LALR(1) Grammars). The author says: "left recursion is good, right recursion is bad", and indeed this example seems to support this motto. However, there are also situations where the converse is true, as shown by the next example.

### An example where right recursion is preferable

Suppose we (again) wish to parse a list whose elements are separated with commas. Furthermore, we wish to allow an optional trailing comma to appear at the end of the list.

There are several ways of expressing this. An interesting view is this: either the comma is a separator, or it is a terminator. Thus, one writes:

arguments:
  LPAREN separated_or_terminated_list(COMMA, arg) RPAREN

separated_or_terminated_list(sep, elem):
  nonempty_list(terminated(elem, sep))
| separated_nonempty_list(sep, elem)

terminated(elem, sep):
  elem sep

Perhaps surprisingly, if one uses right-recursive definitions of nonempty_list and separated_nonempty_list, everything is fine. An LR(1) parser reads the whole list without ever reducing. It shifts all the way through, which means that it explores in parallel the two possibilities: either this is a terminated list, or this is a separated list. At the very end, when it finds either COMMA RPAREN or just RPAREN, it decides between the two alternatives, and performs all of the required reductions.

If one uses left-recursive definitions of nonempty_list and separated_nonempty_list, a conflict appears. When an LR(1) parser encounters the very first COMMA, it must immediately decide whether this COMMA is part of the first element in a terminated list (in which case it should shift) or is the first separator in a separated list (in which case it should reduce what it has read so far to a separated list).

### Moral of the story

The idea that "left recursion is good, right recursion is bad" is a myth, at least in terms of conflicts. There are ways of working with left recursion and ways of working with right recursion; they are just not the same.

### A few words on flexible lists

So, what is the right way of describing a flexible list, where the final delimiter is optional?

One way is to describe it as a separated list, followed with an optional delimiter. In this case, a left-recursive version of separated lists must be used, for the reasons explained in the first part of this post. Here it is in the syntax of Menhir:

%inline flexible_list(delim, X):
  xs = separated_llist(delim, X) delim?
    { xs }

The definition of separated_llist is given at the end of this post.

Another way is to give a direct recursive definition, with two base cases and a single recursive case. In this case, it seems that we do not have a choice: this definition is naturally right-recursive.

right_flexible_list(delim, X):
| (* nothing *)
    { [] }
| x = X
    { [x] }
| x = X delim xs = right_flexible_list(delim, X)
    { x :: xs }

In the end, I like this definition, because I think it is the simplest and most direct. It leads to a smaller LR automaton and (potentially) to simpler syntax error messages.

Symmetrically, one can define a left_flexible_list where delimiters precede elements and where the first delimiter is optional. The various ways of doing this are left as an exercise for the reader!

### A few words on performance

The GNU Bison manual says "you should always use left recursion, because it can parse a sequence of any number of elements with bounded stack space".

In Menhir, the LR stack is heap-allocated, so there is no hard limit on its size. If one wishes to read a list from a file and construct a list in memory, where the first list element in the file becomes the first list element in memory, then both approaches are equivalent. In the right-recursive approach, the parser pushes O(n) elements onto the stack, then pops them off and constructs the desired list. In the left-recursive approach, the parser uses only O(1) stack space but constructs a reversed list, which takes up O(n) space and must be reversed afterwards.
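In plain list-building terms, the two strategies correspond roughly to the following sketch (illustrative only; this is not the code Menhir generates):

```ocaml
(* Right recursion: the result list is built directly, but each pending
   element occupies a stack frame, so stack usage is O(n). *)
let rec build_right = function
  | [] -> []
  | x :: rest -> x :: build_right rest

(* Left recursion: accumulate in reverse using O(1) extra stack, then
   reverse once at the end, an O(n) pass over an O(n) intermediate list. *)
let build_left elems =
  List.rev (List.fold_left (fun acc x -> x :: acc) [] elems)
```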

The only two scenarios where left recursion may be more efficient are: if one wishes to construct a reversed list (in which case left recursion saves a constant factor), or if one wishes to process the list elements on the fly, without allocating a list at all (in which case left recursion allows working in O(1) space instead of O(n)).

### Code for separated lists (for reference)

Here is a right-recursive version of separated lists, as currently found in Menhir's standard library:

%inline separated_list(separator, X):
  xs = loption(separated_nonempty_list(separator, X))
    { xs }

loption(X):
  /* nothing */
    { [] }
| x = X
    { x }

separated_nonempty_list(separator, X):
  x = X
    { [ x ] }
| x = X; separator; xs = separated_nonempty_list(separator, X)
    { x :: xs }

Here is a left-recursive version of separated lists:

reverse_separated_nonempty_llist(separator, X):
  x = X
    { [ x ] }
| xs = reverse_separated_nonempty_llist(separator, X); separator; x = X
    { x :: xs }

%inline reverse_separated_llist(separator, X):
    { [] }
| xs = reverse_separated_nonempty_llist(separator, X)
    { xs }

%inline separated_llist(separator, X):
  xs = reverse_separated_llist(separator, X)
    { List.rev xs }

#### Technical remark.

The symmetry between the left- and right-recursive versions of separated lists (above) is not quite perfect, because one uses loption whereas the other does not. One could define reverse_separated_llist(separator, X) as loption(reverse_separated_nonempty_llist(separator, X)) but this causes a conflict to re-appear in the vararg example. One can eliminate this conflict by declaring %inline loption, but doing that in the standard library might break existing code.

### @typeocaml

#### Mutable

While OCaml is a functional programming language and emphasises a pure functional style, it also allows mutable variables and values, and hence an imperative programming style. The reason is stated in Real World OCaml:

Imperative code is of fundamental importance to any practical programming language, because real-world tasks require that you interact with the outside world, which is by its nature imperative. Imperative programming can also be important for performance. While pure code is quite efficient in OCaml, there are many algorithms that can only be implemented efficiently using imperative techniques.

How to write imperative code in OCaml is well introduced both in OCaml's documentation and user's manual and in Real World OCaml. The imperative control flow is very similar to that of other imperative languages. For example, we can write an OCaml binary search like this:

let binary_search a x =
  let len = Array.length a in
  let p = ref 0 and q = ref (len-1) in
  let loc = ref None in
  while !p <= !q && !loc = None && !p >= 0 && !q < len do
    let mi = (!p + !q) / 2 in
    if x = a.(mi) then loc := Some mi
    else if x > a.(mi) then p := mi + 1
    else q := mi - 1
  done;
  !loc


Mutability, or the imperative style, can be a snake. People who are forced to use OCaml (for work or course assignments) while not yet convinced of its functional power may be happy when they discover that code can be written much as in Java. They may also intend to (secretly) write imperative code in OCaml as much as possible, just to get things done.

It sounds bad, and it indeed is. What is the point of coding in a heavily imperative way in a functional programming language?

However, I won't be worried even if one does that, because I know it is just initial avoidance and it won't last long. Why? Because writing imperative code in OCaml is a bit troublesome anyway. Let's revisit the binary_search we wrote before.

You can see that nothing is straightforward there.

1. When we define a mutable variable, we have to use ref;
2. When we get the value out, we have to use ! everywhere;
3. When we assign a value, we can't forget : before =;
4. We even need to add . before () when we access arrays.

Even after we finish the function, it appears quite verbose and a little uncomfortable to read, right? This is why I am sure that, in the longer term, one will give up the imperative style, truly learn the functional style and eventually understand OCaml's beautiful side.
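For comparison, here is the same search in a recursive, functional style, with no refs, no `!` and no `:=` (my own rewrite, offered only as a sketch):

```ocaml
(* Recursive binary search over a sorted array: the bounds p and q are
   ordinary immutable arguments instead of mutable references. *)
let binary_search_rec a x =
  let rec go p q =
    if p > q then None
    else
      let mi = (p + q) / 2 in
      if x = a.(mi) then Some mi
      else if x > a.(mi) then go (mi + 1) q
      else go p (mi - 1)
  in
  go 0 (Array.length a - 1)
```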

So if we will not / should not write everything imperatively, when should we leverage the benefits of mutability?

# When performance requires it

In order to manipulate a sequence of elements, we normally use an array in imperative languages; in a pure functional language, however, we prefer a list, as it is immutable.

The best thing an array offers us is O(1) access to an element via its index. We lose this advantage with a list: we have to do a linear scan, which is O(n), to get the ith value. Sometimes that is too slow. To demonstrate this, let's have a look at the problem of shuffling:

Given a sequence, randomize the order of all the elements inside.

A typical algorithm for shuffle can be:

1. i initially is 0 and len is the length of the sequence.
2. Randomly generate an integer r within [i, len).
3. Swap the element at i and the one at r.
4. Increase i to target the next element.
5. Repeat from step 2.

If we use list in OCaml, then we will have:

let rm_nth n l =
  let rec aux acc i = function
    | [] -> List.rev acc
    | _::tl when i = n -> List.rev_append acc tl
    | x::tl -> aux (x::acc) (i+1) tl
  in
  aux [] 0 l

(* a slight modification when using a list:
   we do not swap; instead, we remove the element at the randomised index
   and put it at the head of the new list. *)
let shuffle l =
  Random.self_init ();
  let len = List.length l in
  let rec aux i acc l =
    if i < len then
      let r = Random.int (len - i) in
      aux (i+1) (List.nth l r :: acc) (rm_nth r l)
    else acc
  in
  aux 0 [] l


We have to scan the list twice in each iteration: once to get the element at the randomised index r and once to remove it. In total we have n iterations; thus the time complexity is O(n^2).

If we use an array instead, it is much faster, with time complexity O(n), as locating an element and swapping two elements in an array each cost only O(1).

let swap a i j =
  let tmp = a.(i) in
  a.(i) <- a.(j);
  a.(j) <- tmp

let shuffle_array a =
  Random.self_init ();
  let len = Array.length a in
  for i = 0 to len-1 do
    let r = i + Random.int (len - i) in
    swap a i r
  done


In addition to arrays, some other imperative data structures can also improve performance. For example, Hashtbl is a traditional hash table with O(1) average time complexity for lookups, while the functional Map module is implemented as an AVL tree and thus provides O(log n) operations. If one really cares about such a difference, Hashtbl can be given higher priority.
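A minimal comparison sketch (the module and key choices here are mine, using the standard library rather than Core):

```ocaml
(* Map.Make builds a purely functional map over string keys. *)
module SMap = Map.Make (String)

let () =
  (* Hashtbl: mutated in place, O(1) average lookup. *)
  let tbl = Hashtbl.create 16 in
  Hashtbl.replace tbl "answer" 42;
  assert (Hashtbl.find tbl "answer" = 42);
  (* Map: each add returns a new map, O(log n) lookup. *)
  let m = SMap.add "answer" 42 SMap.empty in
  assert (SMap.find "answer" m = 42)
```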

Note that we should not use a potential performance boost as an excuse to write imperative code everywhere in OCaml. In many cases, functional data structures and algorithms have comparable performance; for example, the functional queue based on two lists offers amortised O(1) operations.
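For example, the two-list functional queue can be sketched like this (my own minimal version, not taken from the post's sources):

```ocaml
(* enqueue pushes onto [back]; dequeue pops from [front], and when
   [front] is empty it reverses [back] once. Each element is moved at
   most once overall, hence amortised O(1) per operation. *)
type 'a queue = { front : 'a list; back : 'a list }

let empty = { front = []; back = [] }

let enqueue x q = { q with back = x :: q.back }

let dequeue q =
  match q.front with
  | x :: front -> Some (x, { q with front })
  | [] ->
    (match List.rev q.back with
     | [] -> None
     | x :: front -> Some (x, { front; back = [] }))
```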

# When we need to remember something

Constantly creating new values makes functional programming safer and logically clearer, as discussed previously. However, occasionally we don't want everything to be newly created; instead, we need a mutable object that stays put in memory but can be updated. In this way, we can remember values in a single place.

Write a function that does x + y * z. It outputs the correct result and in addition, prints how many times it has been called so far.

The x + y * z part is trivial, but the latter part, recording the number of calls, is tricky. Let's first try to implement such a function in pure functional style.

(* pure functional solution *)
let f x y z last_called_count =
  print_int (last_called_count + 1);
  x + y * z, last_called_count + 1


The code above can roughly achieve the goal. However, it is not great.

1. It needs an additional argument which is meant to be the most recent count of calls. This could work, but it is fragile, because the function accepts whatever integer it is given and there is no way to constrain it to be the real last count.
2. It has to return the new call count along with the real result.

When a user uses this function, he/she will find it awkward. last_called_count needs to be handled very carefully in the code flow to avoid miscounting. The result is a tuple, so pattern matching is necessary to obtain the real value of x + y * z. And again, one needs to somehow remember the new call count in order to supply it to the next call.

This is where imperative programming comes to help:

(* with help of imperative programming.
   note that the function g can be anonymous. *)
let f =
  let countable_helper () =
    let called_count = ref 0 in
    let g x y z =
      called_count := !called_count + 1;
      print_int !called_count;
      x + y * z
    in
    g
  in
  countable_helper ()


countable_helper is a helper function. When called, it first creates a mutable called_count, which g captures. Because called_count is mutable, g can freely increase its value by 1. Eventually x + y * z is computed after printing called_count. g is returned to f as the result of countable_helper (), i.e., f is g.
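The same idea in a more compact form: a closure over a ref, without the named helper (a sketch equivalent to the code above):

```ocaml
(* f carries a hidden counter; each call increments and prints it. *)
let f =
  let called_count = ref 0 in
  fun x y z ->
    incr called_count;
    print_int !called_count;
    x + y * z
```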

# When we need later substitution

I found this interesting case in the chapter on imperative programming in Real World OCaml; it is about memoization. In this section we borrow content directly from the book, but we will emphasise the substitution part.

Python developers should be quite familiar with the memoize decorator. Essentially, it makes a function remember argument, result pairs so that if the same argument is supplied again, the stored result is returned immediately without repeated computation.

We can write memoize in OCaml too:

(* directly copied from Real World OCaml, does not use an anonymous function *)
let memoize f =
  let table = Hashtbl.Poly.create () in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  g


It uses a Hashtbl as the imperative storage box, and the mechanism is similar to our previous call-count example.

The fascinating bit comes from the fact that this memoize cannot handle recursive functions. If you don't believe it, let's try it on the Fibonacci function:

let rec fib n =
  if n <= 1 then 1 else fib (n-1) + fib (n-2)

let fib_mem = memoize fib


Hmmm... it compiles and seems to work. Yes, fib_mem will return correct results, but we should take a closer look to see whether it really remembers the argument, result pairs. So what is fib_mem exactly? By replacing f with fib inside memoize, we get:

(* not proper OCaml code, just for demonstration purposes *)

fib_mem is actually
  match Hashtbl.find table x with
  | Some y -> y
  | None ->
    let y = fib x in (* here is fib *)
    Hashtbl.add_exn table ~key:x ~data:y;
    y


So inside fib_mem, when a new argument comes, it calls fib, and fib itself is not memoized at all. What does this mean? It means fib_mem may eventually remember the new argument and its result, but it will never remember the arguments and results computed along the recursive way.

For example, let's do fib_mem 5.

1. 5 is not in the Hashtbl, so fib 5 is called.
2. According to the definition of fib, fib 5 = fib 4 + fib 3, so now fib 4 is computed first.
3. fib 4 = fib 3 + fib 2, no problem, but look: fib 3 will be computed again after fib 4 finishes.
4. Assuming fib 4 is done, we go back to the right-hand side of the + in point 2, which means we need to do fib 3.
5. Will we really compute fib 3 now? Yes, unfortunately, even though it was already computed during fib 4, because fib itself is not memoized.

To solve this problem, we need the substitution technique with the help of imperative programming.

## An attractive but WRONG solution

When I read the book for the first time, before continuing for the solution of memoized fib, I tried something on my own.

So the problem is that the f inside memoize is not memoized, right? How about we make that f mutable; then, after we define g, we assign g to f? Since g is memoized and g calls g inside, the whole thing would be memoized, right?

let mem_rec_bad f =
  let table = Hashtbl.Poly.create () in
  let sub_f = ref f in
  let g x =
    match Hashtbl.find table x with
    | Some y -> y
    | None ->
      let y = !sub_f x in
      Hashtbl.add_exn table ~key:x ~data:y;
      y
  in
  sub_f := g;
  g


It passes the compiler but will raise Stack_overflow if you do let fib_mem = mem_rec_bad fib;; fib_mem 5. After we define g and replace the original value of sub_f with g, it seems fine, but what is g now?

(* g is now actually like: *)
let g x =
  match Hashtbl.find table x with
  | Some y -> y
  | None ->
    let y = g x in
    Hashtbl.add_exn table ~key:x ~data:y;
    y


g will check the Hashtbl and then call itself forever! We totally lose the functionality of the original f, and g becomes meaningless.

So we can't directly replace the f inside. Is there any way to preserve the functionality of f but somehow substitute some part of it with the memoization?

## The elegant solution

Updated on 2015-01-25: Please go to Recursive Memoize & Untying the Recursive Knot for a better explanation of recursive memoization. The following content is not as good.

The answer is yes, and we have to build a non-recursive function out of the recursive fib.

(* directly copied from the book *)
let fib_norec f n =
  if n <= 1 then 1
  else f (n-2) + f (n-1)

let rec fib_rec n = fib_norec fib_rec n


(Honestly, I had never thought before that a recursive function could be split like this. I cannot explain how to derive such a transformation in general; I guess we just learn it as it is and continue. More description of it is in the book.) ([1])

Basically, for a recursive function f_rec, we can:

1. change f_rec to f_norec with an additional parameter f;
2. replace the recursive calls to f_rec in the original body with f;
3. then define let rec f_rec parameters = f_norec f_rec parameters.

In this way, f_rec invokes f_norec whenever it is called, so the recursive logic is still controlled by f_norec.
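For instance, applying these steps to factorial (a hypothetical example of my own, not from the book):

```ocaml
(* Steps 1 and 2: abstract the recursive call out as a parameter f. *)
let fact_norec f n = if n <= 1 then 1 else n * f (n - 1)

(* Step 3: tie the knot again. *)
let rec fact_rec n = fact_norec fact_rec n
```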

Here the important part is f_norec and its parameter f gives us the chance for correct substitution.

let memo_rec f_norec =
  let fref = ref (fun _ -> assert false) in
  (* !fref below will have the new function defined next, which is
     the memoized one *)
  let h x = f_norec !fref x in
  let f_mem = memoize h in
  fref := f_mem;
  f_mem


Now fref's value becomes the memoized function built with f_norec. Since f_norec controls the logic and does the real work, we do not lose any functionality, yet we remember every argument, result pair along the recursive way.
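Putting the pieces together, here is a self-contained sketch of the whole construction; it uses the standard library's Hashtbl instead of Core's Hashtbl.Poly used above, but the structure is the same:

```ocaml
(* Plain memoization over an arbitrary function. *)
let memoize f =
  let table = Hashtbl.create 16 in
  fun x ->
    match Hashtbl.find_opt table x with
    | Some y -> y
    | None ->
      let y = f x in
      Hashtbl.add table x y;
      y

(* Memoize a recursion expressed in "open" style: the reference is
   patched so that every recursive call goes through the memo table. *)
let memo_rec f_norec =
  let fref = ref (fun _ -> assert false) in
  let f_mem = memoize (fun x -> f_norec !fref x) in
  fref := f_mem;
  f_mem

let fib_norec f n = if n <= 1 then 1 else f (n - 2) + f (n - 1)

let fib_mem = memo_rec fib_norec
```

With this version, fib_mem 30 is computed in linear time, since every intermediate fib value is cached.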

# Summary

In this post, we have listed three quite typical cases where imperative programming can be helpful. There are many others, such as lazy evaluation.

Moreover, one more suggestion on using the imperative style: don't expose imperative interfaces publicly if you can avoid it. That means we should try to embed imperative code inside pure functions as much as possible, so that the users of our functions cannot access the mutable parts. This lets our functional code stay pure on the outside while enjoying the juice of imperative programming internally.

[1]. glacialthinker has commented here. This technique is called untying the recursive knot.

Mystique in the head image is another major character in the X-Men universe. She is a shapeshifter who can mimic the appearance and voice of any person with exquisite precision. Like a mutable in OCaml, which sticks to the same type once created, a human is the only thing she can mutate into.

## January 20, 2015

### Amir Chaudhry

#### Brewing MISO to serve Nymote

The mission of Nymote is to enable the creation of resilient decentralised systems that incorporate privacy from the ground up, so that users retain control of their networks and data. To achieve this, we reconsider all the old assumptions about how software is created in light of the problems of the modern, networked environment. Problems that will become even more pronounced as more devices and sensors find their way into our lives.

We want to make it simple for anyone to be able to run a piece of the cloud for their own purposes and the first three applications Nymote targets are Mail, Contacts and Calendars, but to get there, we first have to create solid foundations.

### Defining the bedrock

In order to create applications that work for the user, we first have to create a robust and reliable software stack that takes care of fundamental problems for us. In other words, to be able to assemble the applications we desire, we must first construct the correct building blocks.

We’ve taken a clean-slate approach so that we can build long-lasting solutions with all the benefits of hindsight but none of the baggage. As mentioned in earlier posts, there are three main components of the stack, which are: Mirage (OS for the Cloud/IoT), Irmin (distributed datastore) and Signpost (identity and connectivity) - all built using the OCaml programming language.

#### Using the MISO stack to build Nymote

As you’ve already noticed, there’s a useful acronym for the above tools — MISO. Each of the projects mentioned is a serious undertaking in its own right and each is likely to be impactful as a stand-alone concept. However, when used together we have the opportunity to create applications and services with high levels of security, scalability and stability, which are not easy to achieve using other means.

In other words, MISO is the toolstack that we’re using to build Nymote — Nymote is the decentralised system that works for its users.

Each of the projects is at a different phase but they have all made great strides over the last year.

#### Mirage

Mirage — a library operating system that constructs unikernels — is the most mature part of the stack. I previously wrote about the Mirage 1.0 release and only six months later we had an impressive 2.0 release, with continuing advances throughout the year. We achieved major milestones such as the ability to deploy unikernels to ARM-based devices, as well as a clean-slate implementation of the transport layer security (TLS) protocol.

In addition to the development efforts, there have also been many presentations to audiences, ranging from small groups of startups all the way to prestigious keynotes with 1000+ attendees. Ever since we’ve had ARM support, the talks themselves have been delivered from unikernels running on Cubieboards and you can see the growing collection of slides at decks.openmirage.org.

All of these activities have led to a tremendous increase in public awareness of unikernels and the value they can bring to developing robust, modern software as well as the promise of immutable infrastructure. As more people look to get involved and contribute to the codebase, we’ve also begun curating a set of Pioneer Projects, which are suitable for a range of skill-levels.

You can find much more information on all the activities of 2014 in the comprehensive Mirage review post. As it’s the most mature component of the MISO stack, anyone interested in the development of code towards Nymote should join the Mirage mailing list.

#### Irmin

Irmin — a library to persist and synchronize distributed data structures — made significant progress last year. It’s based on the principles of Git, the distributed version control system, and allows developers to choose the appropriate combination of consistency, availability and partition tolerance for their needs.

Early last year Irmin was released as an alpha with the ability to speak ‘fluent Git’ and by the summer, it was supporting user-defined merge operations and fast in-memory views. A couple of summer projects improved the merge strategies and synchronisation strategies, while an external project — Xenstore — used Irmin to add fault-tolerance.

More recent work has involved a big clean-up in the user-facing API (with nice developer documentation) and a cleaner high-level REST API. Upcoming work includes proper documentation of the REST API, which means Irmin can more easily be used in non-OCaml projects, and full integration with Mirage projects.

Irmin is already being used to create a version controlled IMAP server and a version controlled distributed log system. It’s no surprise that the first major release is coming very soon!

#### Signpost

Signpost will be a collection of libraries that aims to provide identity and connectivity between devices. Forming efficient connections between end-points is becoming ever more important as the number of devices we own increases. These devices need to be able to recognise and reach each other, regardless of their location on the network or the obstacles in between.

This is very much a nascent project and it involves a lot of work on underlying libraries to ensure that security aspects are properly considered. As such, we must take great care in how we implement things and be clear about any trade-offs we make. Our thoughts are beginning to converge on a design we think will work and that we would entrust with our own data, but we’re treating this as a case of ‘Here Be Dragons’. This is a critical piece of the stack and we’ll share what we learn as we chart our way towards it.

Even though we’re at the design stage of Signpost, we did substantial work last year to create the libraries we might use for implementation. A particularly exciting one is Jitsu — which stands for Just In Time Summoning of Unikernels. This is a DNS server that spawns unikernels in response to DNS requests and boots them in real-time with no perceptible lag to the end user. In other words, it makes much more efficient use of resources and significantly reduces latency of services for end-users — services are only run when they need to be, in the places they need to be.

There’s also been lots of efforts on other libraries that will help us iterate towards a complete solution. Initially, we will use pre-existing implementations but in time we can take what we’ve learned and create more robust alternatives. Some of the libraries are listed below (but note the friendly disclaimers!).

• Source code

#### OCaml

OCaml is a mature, powerful and highly pragmatic language. It’s proven ideal for creating robust systems applications and many others also recognise this. We’re using it to create all the tools you’ve read about so far and we’re also helping to improve the ecosystem around it.

One of the major things we’ve been involved with is the coordination of the OCaml Platform, which combines the OCaml compiler with a coherent set of tools and workflows to be more productive in the language and speed up development time. We presented the first major release of these efforts at OCaml 2014 and you can read the abstract or watch the video.

There’s more to come, as we continue to improve the tooling and also support the community in other ways.

### Early steps towards applications

Building blocks are important but we also need to push towards working applications. There are different approaches we’ve taken to this, which include building prototypes, wireframing use-cases and implementing features with other toolstacks. Some of this work is also part of a larger EU funded project* and below are brief summaries of the things we’ve done so far. We’ll expand on them as we do more over time.

Mail - As mentioned above, a prototype IMAP server exists (IMAPlet) which uses Irmin to store data. This is already able to connect to a client to serve mail. The important feature is that it’s an IMAP server which is version controlled in the backend and can expose a REST API from the mailstore quite easily.

Contacts - We first made wireframe mockups of the features we might like in a contacts app (to follow in later post) and then built a draft implementation. To get here, code was first written in OCaml and then put through the js_of_ocaml compiler. This is valuable as it takes us closer to a point where we can build networks using our address books and have the system take care of sharing details in a privacy-conscious manner and with minimal maintenance. The summary post has more detail.

Calendar - This use-case was approached in a completely different way as part of a hackathon last year. A rough but functional prototype was built over one weekend, with a team formed at the event. It was centralised but it tested the idea that a service which integrates intimately with your life (to the point of being very invasive) can provide disproportionate benefits. The experience report describes the weekend and our app — Clarity — won first place. This was great validation that the features are desirable so we need to work towards a decentralised, privacy-conscious version.

### Time to get involved!

The coming year represents the best time to be working on the MISO stack and using it to make Nymote a reality. All source code is publicly available and the projects are varied enough that there is something for everyone. Browse through issues, consider the projects or simply write online and share with us the things you’d like to see. This promises to be an exciting year!

Sign up to the Nymote mailing list to keep up to date!

* The research leading to these results has received funding from the European Union's Seventh Framework Programme FP7/2007-2013 under the UCN project, grant agreement no 611001.