OCaml Planet

March 03, 2015

Functional Jobs

Full-Stack Senior Functional Web Engineer at Front Row Education (Full-time)

Position

Senior full-stack functional web engineer to join a fast-growing education startup that is changing how over a million young students learn math.

TL;DR - Reasons to care about working with Front Row

  • Our mission is important to us, and we want it to be important to you as well: hundreds of thousands of kids learn math using Front Row every month. Our early results show students improve twice as much while using Front Row as their peers who aren’t using the program.
  • You’ll be one of the first engineers on the team, which means you’ll have an immense impact on our company, product, and culture; you’ll have a ton of autonomy; and you’ll have equity to match the weight of this role
  • A lot of flexibility: while we all work towards the same goals, you’ll have a lot of autonomy in what you work on, you can work from home up to one day a week, and we have a very flexible unlimited vacation days policy
  • You’ll use the most effective tools in the industry: Haskell, Postgres, Backbone.js, Ansible and more. Front Row is one of the very few organizations in the world that use Haskell in production for most of their systems and is an active member of the Haskell community.
  • In addition to doing good, we’re doing really well: in just over a year after launch, we are in more than 20% of all US elementary & middle schools.

The Business

Millions of teachers around the USA are struggling to help 30+ students in their class learn math because every student is at their own level. In a typical fourth grade classroom, there may be students learning to count, students learning to add, students learning to multiply, and students learning how exponents work - and one teacher somehow needs to address all these needs.

Front Row makes that impossible task possible, and as of today, more than a hundred thousand students use Front Row to receive personalized guidance in their learning. Thousands of teachers use Front Row every day to save hours of time and make sure their students are growing at the fastest rate achievable. Front Row active users have been growing over 25% a month for the past 6 months.

Front Row is successfully venture-funded and on the road to profitability.

The Role

As one of our very first engineers, you will be part of a team of developers who are passionate about their vocation, take pride in their craft and who push each other to grow as professionals. You will strive for pragmatism and 80/20 in your work. You will be using tools that make you most effective. By working really smart, you will produce more than the average developer ever will, but without the crazy hours.

We love generalists who can quickly bring themselves up to speed with any technology we’re using: you will have the chance to learn a lot, and fast too. You will receive continuous support and mentorship on your journey to achieving mastery. We do, however, expect you not to need hand-holding or to rely on others for your own growth. You will have full autonomy over your work.

You will work in an effective team that plans, executes and reflects together. Because we’re a small team, everything you create will go into production and be used by students. You will never do unimportant work: every contribution will make a clear and tangible impact on the company’s trajectory. Your personal success will be directly aligned with that of the company.

Most importantly, your work will have purpose: Front Row is a mission-driven company that takes pride in making a significant impact in the lives of hundreds of thousands of students.

Tools

  • Front Row is a polyglot combination of multiple web applications, mobile apps and asset generation tools.
  • Web front-ends are a custom version of Backbone.js + plugins.
  • The backend is a series of Haskell+Yesod-based applications talking to PostgreSQL and 3rd party services.
  • All test, build and deployment automation relies on Ansible. AWS for hosting.
  • We have mobile apps for both iOS and Android.
  • Work is continuously happening to simplify and minimize the codebases.

Must haves

  • You have experience doing full-stack web development. You understand HTTP, networking, databases and the world of distributed systems.
  • You have functional programming experience.
  • Extreme hustle: you’ll be solving a lot of problems you haven’t faced before without the resources and the support of a giant organization. You must thrive on getting things done, whatever the cost.

Nice-to-haves

  • You have familiarity with a functional stack (Haskell / Clojure / Scala / OCaml etc)
  • You're comfortable with the Behavior-Driven Development style
  • You have worked at a very small startup before: you thrive on having a lot of responsibility and little oversight
  • You have worked in small and effective Agile/XP teams before
  • You have delivered working software to large numbers of users before
  • You have familiarity with system and network administration

Benefits

  • Competitive salary
  • Generous equity option grants
  • Medical, Dental, and Vision
  • Lunch is on us three times a week, and there is a half-day event every month (trip to Sonoma, BBQ, etc.)
  • Equipment budget
  • Flexible work schedule
  • Flexible, untracked vacation day policy
  • Working from downtown SF, very accessible location

Front Row - our mission

It's an unfortunate reality that students from less affluent families perform worse in school than students from wealthier families. Part of the reason has to do with home environment and absentee parents, but much of it has to do with inferior resources and less experienced teachers. The worst part of this problem is that if a student falls behind in any grade, they will forever be behind in every grade.

That's the core problem Front Row solves - it doesn't let students fall behind. And if they fall behind, Front Row helps catch them up really quickly because Front Row arms teachers with the resources and knowledge to develop their students individually. Now, the probability of falling behind in any given grade is irrelevant, because it will never compound. The student who would have been the most at risk will instead be up to speed, and therefore far more motivated.

Get information on how to apply for this position.

March 03, 2015 08:52 PM

March 02, 2015

Heidi Howard

Part 2: Running your own DNS Resolver with MirageOS

Last time, we wrote a simple “dig like” unikernel. Given a domain and the address of a nameserver, the unikernel resolved the domain by asking the nameserver and printed the response to the console.

Today, we will look at another way to resolve a DNS query: being a DNS server. This is useful in its own right, but it also allows us to do cool things with our local DNS resolver, such as locally overriding DNS names and resolving .local names, both of which we will add to our DNS resolver another day.

Today we use features only added to the ocaml-dns library in version 0.15 (currently PR #52), so if you do not have this version or later, update via OPAM or pin the master branch on GitHub.

Building a DNS server with MirageOS is simple; consider the following code:

open Lwt
open V1_LWT
open Dns
open Dns_server

let port = 53
let zonefile = "test.zone"

module Main (C:CONSOLE) (K:KV_RO) (S:STACKV4) = struct

  module U = S.UDPV4
  module DNS = Dns_server_mirage.Make(K)(S)

  let start c k s =
    let t = DNS.create s k in
    DNS.serve_with_zonefile t ~port ~zonefile
end

The above code will serve DNS requests on port 53, responding with the resource records (RRs) in test.zone. We have provided an example zone file in the repo with the code from this guide. To use this unikernel, we also need to edit the config.ml file from yesterday.

open Mirage

let data = crunch "./data"

let handler =
  foreign "Unikernel.Main" (console @-> kv_ro @-> stackv4 @-> job)

let ip_config:ipv4_config = {
  address= Ipaddr.V4.make 192 168 1 2;
  netmask= Ipaddr.V4.make 255 255 255 0;
  gateways= [Ipaddr.V4.make 192 168 1 1];
}

let direct =
  let stack = direct_stackv4_with_static_ipv4 default_console tap0 ip_config  in
  handler $ default_console $ data $ stack

let () =
  add_to_ocamlfind_libraries ["dns.mirage";"dns.lwt-core"];
  add_to_opam_packages ["dns"];
  register "dns" [direct]

We are using crunch to access the zone file in the data directory. As explained in part 1, this config file is specific to my network setup for Xen backends and can easily be generalised.
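
For reference, zone files use the standard master-file (BIND) syntax. Below is a minimal sketch of what test.zone might contain, consistent with the dig query that follows (the records here are hypothetical; the actual example file in the repo may differ):

$ORIGIN d1.signpo.st.
$TTL 3600
@    IN  SOA  ns0.d1.signpo.st. hostmaster.d1.signpo.st. (
              1      ; serial
              3600   ; refresh
              600    ; retry
              86400  ; expire
              3600 ) ; minimum TTL
@    IN  NS   ns0.d1.signpo.st.
ns0  IN  A    192.168.1.2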

You can now test your DNS server and see it work:

$ dig @192.168.1.2 ns0.d1.signpo.st.


by Heidi-ann at March 02, 2015 09:50 PM

February 24, 2015

Functional Jobs

Senior Software Engineer at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District.

We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include JavaScript most prominently, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you do have some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

February 24, 2015 03:34 PM

February 21, 2015

@typeocaml

Pearl No.2 - The Max Number of Surpassers


In a list of unsorted numbers (not necessarily distinct), such as

[figure: problem1]

The surpassers of an element are all elements that appear after it in the list and have larger values. For example, the element 1's surpassers are 9, 5, 5, and 6, so its number of surpassers is 4.

[figure: problem2]

And we can also see that 9 doesn't have any surpassers, so its number of surpassers is 0.

So the problem of this pearl is:

Given an unsorted list of numbers, find the max number of surpassers; O(n log n) is required.

The answer for the above example is 5, for the element -10.

An easy but not optimal solution

As usual, let's put a trivial solution on the table first ([1]). It is straightforward:

  1. For every element, we scan all the elements behind it and maintain a counter ns, its number of surpassers.
  2. If an element behind it is larger, then we increase ns.
  3. After we finish with all elements, the max of all the nses is what we are looking for.

The diagram below demonstrates the process on the element 1.
[figure: trivial solution]

let num_surpasser p l = List.fold_left (fun c x -> if x > p then c+1 else c) 0 l

let max_num_surpasser l =  
  let rec aux ms l =
    match ms, l with
    | _, [] -> ms
    | None, p::tl -> aux (Some (p, num_surpasser p tl)) tl
    | Some (_, c), p::tl -> 
      let ns = num_surpasser p tl in
      if ns > c then aux (Some (p, ns)) tl
      else aux ms tl
  in
  aux None l

(* note: modelling the answer as an `option` makes the code look more complicated, but it is not bad practice, as an empty list has no max number of surpassers *)
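
Trying it in the toplevel, on small lists of my own choosing:

# num_surpasser 1 [9; 5; 5; 6];;
- : int = 4
# max_num_surpasser [7; 1; 5; 2; 6];;
- : (int * int) option = Some (1, 3)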

The solution is easy enough to obtain, but its time complexity is O(n^2), which is worse than the required O(n log n).

Introducing Divide and Conquer


The algorithm design technique divide and conquer was mentioned in Recursion Reloaded. I believe it is a good time to properly introduce it now, as it provides an elegant approach towards a better solution for pearl 2.

What is it, literally

Suppose we want to replace the dragon lady and become the king of the land below ([2]).

[figure: game_of_thrones]

We are very lucky to have a strong army, and now the only question is how to conquer the realm.

One "good" plan is no plan. We believe in our troops so much that we can just let all of them enter the land and pwn everywhere.

[figure: pwn]

Maybe our army is so good, in terms of both quality and quantity, that this plan will eventually lead us to win. However, is it really a good plan? Some soldiers may march to places that have already been conquered; some soldiers may rush off for more victories right after a win, only to be called back by a local rebellion... The whole process won't be efficient, and it costs too much gold in food for men and horses.

Fortunately, we have a better plan.

[figure: better]

We divide the land into smaller regions, and further into smaller ones inside, until dividing further is unnecessary. For each small region, we station an appropriate number of soldiers there for the battles. After the soldiers finish their assigned region, they don't need to move on; they just make sure the region stays with us. This is more organised and more efficient in terms of both gold and time. After all, if we conquer all the tiny regions, who would say we were not the king?

What is it in algorithm design, with accumulation

divide and conquer in algorithm design is not a universal solution provider, but a problem-attacking strategy or paradigm. Moreover, although this classic term has lasted for quite a long time, personally I would like to add one more action - accumulate - to make it appear more complete. Let's check the 3 actions one by one to see how we can apply the technique.

Divide

Conceptually this action is simple: we know we need to divide a big problem into smaller ones. But how to divide is non-trivial and really context-sensitive. Generally we need to ask ourselves 2 questions first:

What are the sizes of the smaller sub-problems?

Normally we intend to halve the problem because it can lead to an O(log n) factor in our final time complexity.

But it is not a must. For example, 3-way quicksort divides the problem set into 3 smaller ones. 3-way partition can give quicksort a recursion depth of O( $ \log_3{n} $ ). However, do not forget that the number of comparisons also increases, as we need to check equality during partition.

Moreover, sometimes we may have to split the problem into a sub-problem of size 1 and another sub-problem with the rest, like what we did for the sum function. This kind of divide won't give us any performance boost, and it turns out to be a normal recursion design.
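
For reference, a minimal sketch of such a size-1 split (my rendering of the usual sum; I assume it matches the one in Recursion Reloaded):

let rec sum = function
  | [] -> 0                  (* edge case: nothing to add *)
  | hd :: tl -> hd + sum tl  (* divide into [hd] and tl; accumulate with + *)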

Do we directly split the problem, or do we somehow reshape the problem and then do the splitting?

In mergesort, we simply split the problem into 2; while in quicksort, we use partition to rearrange the list and then obtain the 2 sub-problems.

The point of this question is to bear in mind that we do not have shortcuts. We can have a very simple splitting, but later on we will probably need to face a more complicated accumulate, like mergesort relying on merge. Or we can do the important work during the divide phase and have a straightforward accumulate (quicksort just needs to concatenate the solutions of the two sub-problems with the pivot in the middle).

Conquer

This action implies two things:

  1. Recursion. We divided the problem, then we need to conquer. How to conquer? We need to apply divide and conquer and accumulate again until we are not able to divide any more.
  2. Edge cases. This means that if we cannot divide further, then it is time to really give a solution. For example, let's say our target is a list. If we reach an empty list or a one-element list, then what shall we do? Normally, if this happens, we do not need to accumulate and just return the answer for the edge cases.

I believe the conquer part in the original divide and conquer term also implies the accumulate. I separate accumulate out, as explained next.

Accumulate

After we conquer every little area of the land, we should now somehow combine all our accomplishments and really build a kingdom out of them. This is the step of accumulate.

A key way to figure out how to accumulate is to start small. In mergesort, if each of the 2 sub-problems has just one element, then the corresponding answer is a list containing that element, and we have finished the conquer step. Now we have two lists, each of which has one element; how can we accumulate them into a single sorted list? Simple: the smaller element goes first into our resulting list. What if we have two sorted lists, each of which has two elements? The same: the smaller element goes first again.
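
As a sketch (my own code, not from the original post), that smaller-goes-first merge of two sorted lists is simply:

let rec merge l1 l2 =
  match l1, l2 with
  | [], l | l, [] -> l
  | h1 :: t1, h2 :: t2 ->
    (* the smaller head goes first into the resulting list *)
    if h1 <= h2 then h1 :: merge t1 l2 else h2 :: merge l1 t2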

If we decide to divide the problem in a fairly simple way, then accumulate is normally non-trivial and also dominates the time complexity. Figuring out a cost-efficient approach to accumulate is very important.

Summary

Again, divide and conquer and accumulate is just a framework for solving an algorithm problem. All the concrete solutions are problem context based and can spread into all 3 steps.

In addition, a fundamental hint for using this technique: if we are given a problem, and we know that the way we will solve it does not depend on the size of the problem, then we should try divide and conquer and accumulate.

Solve pearl 2

Although pearl 2 asks us for the max number of surpassers, we can:

  1. Get the number of surpassers for every element (we need them all anyway).
  2. Then do a linear scan for the max one.

The second step is O(n). If we can achieve the first step in O(n log n), the overall time complexity stays O(n log n).

So our current goal is to use divide and conquer and accumulate to get all numbers of surpassers.

Divide the problem of pearl 2

[figure: problem1]

We have such a list, and we want to get a new list that has the same elements, where each element is associated with its number of surpassers. Now we want to divide the original list (problem set).

Can we directly halve the list?

[figure: divide_1]

As pearl 2 states, an element only cares about the elements behind it. So if we split the list in the middle, we know the numbers of surpassers for all elements in sub-problem 2 need no special treatment, and those answers can directly become part of the future resulting list.

For the elements inside sub-problem 1, the answers are not fully accomplished yet, as they will be affected by the elements in sub-problem 2. But hey, how we obtain full answers for sub-problem 1 with the help of the solutions of sub-problem 2 should be the job of accumulate, right? For now, I believe halving the problem is a good choice for divide, as at least we already solve half of the problem directly.

Conquer

We will of course use recursion. For the sub-problems, we further halve them into smaller sub-problems until we are not able to, which means we have reached the edge cases.

There are possibly two edge cases we need to conquer:

  1. Empty list
  2. A list with only one element.

[figure: conquer]

For an empty list, we just return an empty list, as there is no element for which to count surpassers. For a one-element list, we return a one-element resulting list where the only element has 0 surpassers.

Accumulate

Now we have finished dividing and conquering, as below, and it is time to accumulate (we take sub-problem 1 only, for illustration).

[figure: accumulate 1]

It is easy to combine the solutions of sp 111 and sp 112: just compare 8 from sp 111 with 1 from sp 112, update 8's number of surpassers if necessary, and leave sp 112 alone, as discussed during divide. The same approach can be applied to sp 121 and sp 122. Then we get:

[figure: accumulate 2]

Now both sp 11 and sp 12 have more than one element. In order to get the solution for sp 1, sp 12 can stay put. How about sp 11? An obvious approach is to let every element in sp 11 compare against every element in sp 12, and update its number of surpassers accordingly. This could serve as our accumulate action; however, it is O(n^2). We need to accumulate better.

We said at the very beginning of this post, for our trivial solution, that the original order of the list matters. However, is the order still significant after we get the solution (for a sub-problem)?

[figure: accumulate 3]

As we can see, once the answer of sp 11 is obtained, the order between 8 and 1 doesn't matter, as they no longer rely on each other for their numbers of surpassers.

If we can obtain the solution in a sorted manner, it will help us a lot. For example, assume the resulting lists for sp 11 and sp 12 are sorted like this:

[figure: accumulate 4]

Then we can avoid comparing every pair of elements by using merge like this:

[figure: accumulate 5]

We can see that 8 in the left-hand-side list doesn't have to be compared to -10 any more. However, this example has not shown the full picture yet. If we keep track of the length of the resulting list on the right-hand side, we can save even more comparisons. Let's assume both sp 1 and sp 2 have been solved as sorted lists with their lengths attached.

[figure: accumulate 6]

We begin our merge.

[figure: accumulate 7]

Have you noticed the fascinating part? Because -10 < -2, without going further down the resulting list on the right-hand side, we can directly update the number of surpassers of -10 and move it out. Why? Because -2 is the smallest element on the right, and if it is bigger than -10, then the rest of the elements on the right must all be bigger than -10, right? Through only one comparison (instead of 4), we get the number of surpassers.

Thus, as long as the solutions of all sub-problems are sorted lists with their lengths attached, we can accumulate like this:

  1. Compare the heads hd1 and hd2 of the two sorted resulting lists l1 and l2, respectively.
  2. If hd1 >= hd2, then hd2 gets out; go to step 1 with an updated length for l2.
  3. If hd1 < hd2, then hd1 gets out, and its ns gets updated by adding the length of l2 to the existing value; go to step 1 (the length of l2 is unchanged).

The full process of accumulating sp 1 and sp 2 is illustrated as follows:

[figure: accumulate 8]

Two things might need to be clarified:

  1. Although we assumed the resulting lists of sub-problems to be sorted, they naturally become sorted anyway, because we are doing the smaller-goes-first merging.
  2. We need to attach the lengths to each resulting list on the right and keep updating them, because computing the length of a list on demand takes O(n).

Obviously this way of accumulating gives us O(n). Because we can divide at most O(log n) times, our divide and conquer and accumulate solution will be O(n log n).
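
In recurrence form (a standard analysis, not from the original post), with $ T(n) $ denoting the cost on a list of length n:

$ T(n) = 2T(n/2) + O(n) \implies T(n) = O(n \log n) $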

Code

First, we divide. Note that this version of divide is actually a split at the middle: the original order of the elements matters before we obtain any solutions.

(* we have a parameter n indicating the length of l.
   it is passed in by the caller so that
   we do not need to scan l for its length every time.

   divide returns left, right and the length of the right. *)
let divide l n =  
  let m = n / 2 in
  let rec aux left i = function
    | [] -> List.rev left, [], 0
    | right when i >= m -> List.rev left, right, n-i
    | hd::tl -> aux (hd::left) (i+1) tl
  in
  aux [] 0 l
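
A quick sanity check in the toplevel (input list of my own choosing):

# divide [8; 1; 9; 5] 4;;
- : int list * int list * int = ([8; 1], [9; 5], 2)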

Now accumulate. We write it before conquer because conquer calls it, so it must be defined first.

let accumulate l1 l2 len2 =  
  let rec aux acc len2 = function
    | l, [] | [], l -> List.rev_append acc l
    | (hd1,ns1)::tl1 as left, ((hd2,ns2)::tl2 as right) ->
      if hd1 >= hd2 then aux ((hd2,ns2)::acc) (len2-1) (left, tl2)
      else aux ((hd1,ns1+len2)::acc) len2 (tl1, right)
  in
  aux [] len2 (l1, l2)
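
For example (again with made-up input; each element is paired with its ns so far, and both lists are sorted):

# accumulate [(1, 0); (8, 0)] [(-2, 0); (5, 0)] 2;;
- : (int * int) list = [(-2, 0); (1, 1); (5, 0); (8, 0)]

The element 1 gains one surpasser, because only 5 remained on the right when it exited, and the merged result stays sorted.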

conquer is the controller.

(* note the list parameter is a list of tuple, i.e., (x, ns) *)
let rec conquer n = function  
  | [] | _::[] as l -> l
  | l ->
    let left, right, right_len = divide l n in
    let solution_sp1 = conquer (n-right_len) left in
    let solution_sp2 = conquer right_len right in
    accumulate solution_sp1 solution_sp2 right_len

So if we are given a list of numbers, we can now get all numbers of surpassers:

let numbers_of_surpassers l =  
  List.map (fun x -> x, 0) l |> conquer (List.length l)

Are we done? No! We should find the max number of surpassers out of them:

(* we should always consider the situation where no answer is possible, by using option, although it is a bit troublesome *)
let max_num_surpassers = function  
  | [] -> None
  | l ->
    let nss = numbers_of_surpassers l in
    Some (List.fold_left (fun max_ns (_, ns) -> max max_ns ns) 0 nss)
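
Putting it all together in the toplevel, with the same made-up list as before (the element 1 is surpassed by 5, 2 and 6):

# max_num_surpassers [7; 1; 5; 2; 6];;
- : int option = Some 3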

[1] Unless I can see an optimal solution instantly, I always tend to think of the most straightforward one, even though it sometimes sounds stupid. I believe this is not a bad habit. After all, many good solutions grow out of brute-force ones. As long as we have a solution anyway, we can work further from it and see whether we can make it better.

[2] Yes, I believe the dragon lady will end the game and win the throne. It is the circle and fate.

by Jackson Tale at February 21, 2015 01:24 AM

February 20, 2015

OCaml Platform

Improving the OCaml documentation toolchain

Last week, we published an alpha version of a new OCaml documentation generator, codoc 0.2.0. In the 2014 OCaml workshop presentation (abstract, slides, video), we mentioned the 'module wall' for documentation, and codoc attempts to fix it. To try it out, simply follow the directions in the README in that repository, or browse some samples of the current, default output of the tool. Please do bear in mind that codoc and its constituent libraries are still under heavy development and are not feature complete.

codoc's aim is to provide a widely useful set of tools for generating OCaml documentation. In particular, we are striving to:

  1. Cover all of OCaml's language features
  2. Provide accurate name resolution and linking
  3. Support cross-linking between different packages
  4. Expose interfaces to the components we've used to build codoc
  5. Provide a magic-free command-line interface to the tool itself
  6. Reduce external dependencies and default integration with other tools

We haven't yet achieved all of these at all levels of our tool stack but are getting close. codoc 0.2.0 is usable today (if a little rough in some areas like default CSS). This post outlines the architecture of the new system to make it easier to understand the design decisions that went into it.

The five stages of documentation

There are five stages in generating documentation from OCaml source code. Here we describe how each was handled in the past (using OCamldoc), the present (using our current prototype), and the future (using the final version of the tools we are developing).

Associating comments with definitions

The first stage is to associate the various documentation comments in an .ml or .mli file with the definitions that they correspond to.

Past

Associating comments with definitions is handled by the OCamldoc tool, which does this in two steps. First it parses the file using the regular OCaml parser or camlp4, just as in normal compilation. Then it re-parses the file looking for comments, guided by the location information in the syntax tree from the first step; for example, if there is a definition which ends on line 5, then OCamldoc will look for comments to attach to that definition starting at line 6.

The rules used for attaching comments are quite intricate and whitespace dependent. This makes it difficult to parse the file and attach comments using a single parser. In particular, it would be difficult to do so in a way that doesn't cause a lot of problems for camlp4 extensions. This is why OCamldoc does the process in two steps.

A disadvantage of this two-step approach is that it assumes that the input to any preprocessor is something which could reasonably be read by the compiler/tool creating documentation, which may not always be the case.

Present

Our current prototype associates comments with definitions within the compiler itself. This relies on a patch to the OCaml compiler (pull request #51 on GitHub). Comment association is activated by the -doc command-line flag. It uses (a rewritten version of) the same two-step algorithm currently used by OCamldoc. The comments are then attached to the appropriate node in the syntax tree as an attribute. These attributes are passed through the type-checker and appear in .cmt/.cmti files, where they can be read by other tools.

Future

We intend to move away from the two-step approach taken by OCamldoc. To do this we will need to simplify the rules for associating comments with definitions. One suggestion was to use the same rules as attributes; however, that seems to be overly restrictive. So the approach we hope to take is to keep quite close to what OCamldoc currently supports, but disallow some of the more ambiguous cases. For example,

val x : int
(** Is this for x or y? *)
val y : float

may well not be supported in our final version. We will take care to understand the impact of such design decisions and we hope to arrive at a robust solution for the future. By avoiding the two-step approach, it should be safe to always turn on comment association rather than requiring a -doc command-line flag.
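
For comparison, here is a sketch (mine, following the usual OCamldoc conventions) of unambiguous placements that such a design would keep:

(** Placed before the definition: unambiguously documents x. *)
val x : int

val y : float
(** Placed directly after the definition: unambiguously documents y. *)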

Parsing the contents of comments

Once you have associated documentation comments with definitions, you must parse the contents of these comments.

Past

OCamldoc parses the contents of comments.

Present

In our current prototype, the contents of comments are parsed in the compiler, so that the documentation attributes available in .cmt/.cmti files contain a structured representation of the documentation.

Future

We intend to separate parsing the contents of documentation comments from the compiler. This means that the documentation will be stored as strings within the .cmt/.cmti files and parsed by external tools. This will allow the documentation language (and its parser) to evolve faster than the distribution cycle of the compiler.

Representing compilation units with types and documentation

The typed syntax tree stored in .cmt/.cmti files is not a convenient representation for generating documentation from, so the next stage is to convert the syntax tree and comments into some suitable intermediate form. In particular, this allows .cmt files and .cmti files to be treated uniformly.

Past

OCamldoc generates an intermediate form from a syntax tree, a typed syntax tree, and the comments that it found and parsed in the earlier stages. The need for both an untyped and typed syntax tree is a historical artefact that is no longer necessary.

Present

Our current prototype creates an intermediate form in the doc-ock library. This form can currently be created from .cmti files or .cmi files. .cmi files do not contain enough information for complete documentation, but you can use them to produce partial documentation if the .cmti files are not available to you.

This intermediate form can be serialised to XML using doc-ock-xml.

Future

In the final version, doc-ock will also support reading .cmt files.

Resolving references

Once you have a representation for documentation, you need to resolve all the paths and references so that links can point to the correct locations. For example,

(** This type is used by {!Foo} *)
type t = Bar.t

The path Bar.t and the reference Foo must be resolved so that the documentation can include links to the corresponding definitions.

If you are generating documentation for a large collection of packages, there may be more than one module called Foo. So it is important to be able to work out which one of these Foos the reference is referring to.

Unlike most languages, resolving paths can be very difficult in OCaml due to the powerful module system. For example, consider the following code:

module Dep1 : sig
 module type S = sig
   class c : object
     method m : int
   end
 end
 module X : sig
   module Y : S
 end
end

module Dep2 :
 functor (Arg : sig module type S module X : sig module Y : S end end) ->
   sig
     module A : sig
       module Y : Arg.S
     end
     module B = A.Y
   end

type dep1 = Dep2(Dep1).B.c;;

Here it looks like Dep2(Dep1).B.c would be defined by a type definition c within the submodule B of the functor Dep2. However, Dep2.B's type is actually dependent on the type of Dep2's Arg parameter, so the actual definition is the class definition within the module type S of the Dep1 module.

Past

OCamldoc does resolution using a very simple string-based lookup. This is not designed to handle collections of projects, where module names are not unique. It is also not sophisticated enough to handle advanced uses of OCaml's module system (e.g. it fails to resolve the path Dep2(Dep1).B.c in the above example).

Present

In our current prototype, path and reference resolution are performed by the doc-ock library. The implementation amounts to a reimplementation of OCaml's module system that tracks additional information required to produce accurate paths and references (it is also lazy to improve performance). The system uses the digests provided by .cmti/.cmi files to resolve references to other modules, rather than just relying on the module's name.

Future

There are still some paths handled incorrectly by doc-ock-lib, which will be fixed, but mostly the final version will be the same as the current prototype.

Producing output

Finally, you are ready to produce some output from the tools.

Past

OCamldoc supports a variety of output formats, including HTML and LaTeX. It also includes support for plugins called "custom generators" which allow users to add support for additional formats.

Present

codoc only supports HTML and XML output at present, although extra output formats such as JSON should be very easy to add once the interfaces settle down. codoc defines a documentation index XML format for tracking package hierarchies, documentation issues, and hierarchically localized configuration.

codoc also defines a scriptable command-line interface giving users access to its internal documentation phases: extraction, linking, and rendering. The latest instructions on how to use the CLI can be found in the README. We provide an OPAM remote with all the working versions of the new libraries and compiler patches required to drive the new documentation engine.

Future

As previously mentioned, codoc and its constituent libraries doc-ock-lib and doc-ock-xml are still under heavy development and are not yet feature complete. Notably, there are some important outstanding issues:

  1. Class and class type documentation has no generated HTML. (issue codoc#9)
  2. CSS is subpar. (issue codoc#27)
  3. codoc HTML does not understand --package. (issue codoc#42)
  4. opam doc is too invasive (temporary for demonstration purposes; tracked by issue codoc#48)
  5. Documentation syntax errors are not reported in the correct phase or obviously enough. (issue codoc#58)
  6. Character sets are not handled correctly (issue doc-ock-lib#43)
  7. -pack and cmt extraction are not supported (issue doc-ock-lib#35 and issue doc-ock-lib#3)
  8. Inclusion/substitution is not supported (issue doc-ock-lib#2)

We are very happy to take bug reports and patches at https://github.com/dsheets/codoc/issues. For wider suggestions, comments, complaints and discussions, please join us on the Platform mailing list. We do hope that you'll let us know what you think and help us build a next generation documentation tool which will serve our community admirably.

by OCaml Platform Team at February 20, 2015 12:00 AM

February 18, 2015

Heidi Howard

Part 1: Running your own DNS Resolver with MirageOS

The following is the first part in a step-by-step guide to setting up your own DNS resolver using MirageOS. I will be running this on a low-power, low-cost ARM device called the Cubieboard 2. Up-to-date code for each version of the DNS resolver is on GitHub. This guide assumes some basic experience with Lwt and MirageOS, up to the level of the Hello World tutorial.

Feedback on this article and pull requests to the demo code are welcome.

Part 1.1 – Setting up the cubieboard with MirageOS

Plenty of information on setting up a cubieboard with Xen and MirageOS is available elsewhere.

For debugging, I am a big fan of Wireshark. I run a full Wireshark session on the machine which is connection-sharing to my cubieboard network, to check all external traffic.

For this guide, I will always be compiling for the Xen ARM backend, with a direct network connection via br0 and a static IP for all unikernels. My test network router is configured to give out static IPs of the form 192.168.1.x to hosts with the MAC address 00:00:00:00:00:0x. As a result, my config.ml file looks like:

open Mirage

let ip_config:ipv4_config = {
  address= Ipaddr.V4.make 192 168 1 2;
  netmask= Ipaddr.V4.make 255 255 255 0;
  gateways= [Ipaddr.V4.make 192 168 1 1];
}

let client =
  foreign "Unikernel.Client" @@ console @-> stackv4 @-> job

let () =
  add_to_ocamlfind_libraries [ "dns.mirage"; ];
  register "dns-client"
    [ client $ default_console $ direct_stackv4_with_static_ipv4 default_console tap0 ip_config ]

Since the IP address of the unikernel is 192.168.1.2, before launching the unikernel, I do:

echo "vif = [ 'mac=00:00:00:00:00:02,bridge=br0' ]" >> dns-client.xl

I build the unikernel using the usual commands:

mirage configure --xen
make depend; make; make run
# edit file.xl
sudo xl create -c file.xl

Part 1.2 – Getting Started

The following is the complete code for a unikernel which queries a DNS server for a DNS domain and prints to console the IP address returned.

open Lwt
open V1_LWT

let domain = "google.com"
let server = Ipaddr.V4.make 8 8 8 8

module Client (C:CONSOLE) (S:STACKV4) = struct

  module U = S.UDPV4
  module DNS = Dns_resolver_mirage.Make(OS.Time)(S)

  let start c s =
    let t = DNS.create s in
    OS.Time.sleep 2.0 
    >>= fun () ->
    C.log_s c ("Resolving " ^ domain)
    >>= fun () ->
    DNS.gethostbyname t ~server domain
    >>= fun rl ->
    Lwt_list.iter_s
      (fun r ->
         C.log_s c ("Answer " ^ (Ipaddr.to_string r))
      ) rl

end

This unikernel will query a DNS server at 8.8.8.8 (Google's public DNS resolver) for the domain google.com. Here we are using the simple function DNS.gethostbyname, which has the following type signature:

  val gethostbyname : t ->
    ?server:Ipaddr.V4.t -> ?dns_port:int ->
    ?q_class:Dns.Packet.q_class ->
    ?q_type:Dns.Packet.q_type ->
    string -> Ipaddr.t list Lwt.t

This returns a list of IPs, which we then iterate over with Lwt_list.iter_s and print to the console.

Part 1.3 – Boot time parameters

Hardcoding the server and domain is far from ideal; instead, we will provide them at boot time with Bootvar. The interface for Bootvar is below:

type t
(* read boot parameter line and store in assoc list - expected format is "key1=val1 key2=val2" *)
val create: unit -> t Lwt.t

(* get boot parameter *)
val get: t -> string -> string option

(* get boot parameter; raises Not_found if the key is absent *)
val get_exn: t -> string -> string

We can now use this to provide the domain and server at boot time instead of compile time:

let start c s =
    Bootvar.create () >>= fun bootvar ->
    let domain = Bootvar.get_exn bootvar "domain" in
    let server = Ipaddr.V4.of_string_exn (Bootvar.get_exn bootvar "server") in
    ...

Part 1.4 – Using Resolve

Now, a real DNS resolver will need to accept many more parameters (any DNS query) and return full DNS responses, not just IP addresses. Thus we need to move on from DNS.gethostbyname to the less abstract resolve function:

  val resolve :
    (module Dns.Protocol.CLIENT) ->
    t -> Ipaddr.V4.t -> int ->
    Dns.Packet.q_class ->
    Dns.Packet.q_type ->
    Dns.Name.domain_name ->
    Dns.Packet.t Lwt.t 

We can achieve the same result as gethostbyname as follows:

...
    DNS.resolve (module Dns.Protocol.Client) t server 53 Q_IN Q_A (string_to_domain_name domain)
    >>= fun r ->
    let ips =
      List.fold_left (fun a x ->
          match x.rdata with
          | A ip -> (Ipaddr.V4 ip) :: a
          | _ -> a) [] r.answers
    in
...

We are now explicit about parameters such as the port, class and type. Note that we have opened the Dns.Name and Dns.Packet modules. The return value of resolve is a Dns.Packet.t; we fold over its answers to produce an Ipaddr.V4 list, as with gethostbyname. We can also use the to_string function in Packet to print the whole response.
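
For instance, a one-line sketch (assuming the console c and the response r from the snippets above are in scope):

C.log_s c ("Response: " ^ Dns.Packet.to_string r)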

I’ve taken a break to do some refactoring work on the ocaml-dns library. In the next post, Part 2, we will expand our code to a DNS stub resolver.


by Heidi-ann at February 18, 2015 05:08 PM

OCaml Platform

Why we use OPAM for XenServer development

This is a guest post from an OPAM user about how they use it. If you would like to post about your own use, please let us know.

XenServer uses the Xen project's "Xapi toolstack": a suite of tools, written mostly in OCaml, for managing clusters of Xen hosts.

The Xapi toolstack is built from a large set of libraries and components which are developed independently and versioned separately. It's easy for us to share code with other open-source projects like Mirage; however, this flexibility comes at a cost: when one binary such as "xapi" (the cluster manager) depends on 45 separate libraries, how do we quickly set up a build environment? Exactly which libraries do we need? How do we apply updates? If we change one of these libraries (e.g. to make a bugfix), exactly which bits should we rebuild? This is where OPAM, the source package manager, makes everything easy.

Installing a build environment with OPAM is particularly easy. For example, in a CentOS 6.5 VM, first install OPAM, and then:

$ opam init --comp=4.01.0
$ eval `opam config env`

Next, install the necessary C libraries and development tools for xapi using a command like:

$ sudo yum install `opam install xapi -e centos`

Finally to build xapi itself:

$ opam install xapi
  ∗  install obuild                 0.1.1          [required by cdrom, nbd]
  ∗  install base-no-ppx            base           [required by lwt]
  ∗  install cmdliner               0.9.7          [required by nbd, tar-format]
  ∗  install camlp4                 4.01.0         [required by nbd]
  ∗  install ocamlfind              1.5.5          [required by xapi]
  ∗  install xmlm                   1.2.0          [required by xapi-libs-transitional, rpc, xen-api-client]
  ∗  install uuidm                  0.9.5          [required by xapi-forkexecd]
  ∗  install type_conv              111.13.00      [required by rpc]
  ∗  install syslog                 1.4            [required by xapi-forkexecd]
  ∗  install ssl                    0.4.7          [required by xapi]
  ∗  install ounit                  2.0.0          [required by xapi]
  ∗  install omake                  0.9.8.6-0.rc1  [required by xapi]
  ∗  install oclock                 0.4.0          [required by xapi]
  ∗  install libvhd                 0.9.0          [required by xapi]
  ∗  install fd-send-recv           1.0.1          [required by xapi]
  ∗  install cppo                   1.1.2          [required by ocplib-endian]
  ∗  install cdrom                  0.9.1          [required by xapi]
  ∗  install base-bytes             legacy         [required by ctypes, re]
  ∗  install sexplib                111.17.00      [required by cstruct]
  ∗  install fieldslib              109.20.03      [required by cohttp]
  ∗  install lwt                    2.4.7          [required by tar-format, nbd, rpc, xen-api-client]
  ∗  install xapi-stdext            0.12.0         [required by xapi]
  ∗  install stringext              1.2.0          [required by uri]
  ∗  install re                     1.3.0          [required by xapi-forkexecd, tar-format, xen-api-client]
  ∗  install ocplib-endian          0.8            [required by cstruct]
  ∗  install ctypes                 0.3.4          [required by opasswd]
  ∗  install xenctrl                0.9.26         [required by xapi]
  ∗  install rpc                    1.5.1          [required by xapi]
  ∗  install xapi-inventory         0.9.1          [required by xapi]
  ∗  install uri                    1.7.2          [required by xen-api-client]
  ∗  install cstruct                1.5.0          [required by tar-format, nbd, xen-api-client]
  ∗  install opasswd                0.9.3          [required by xapi]
  ∗  install xapi-rrd               0.9.1          [required by xapi-idl, xapi-rrd-transport]
  ∗  install cohttp                 0.10.1         [required by xen-api-client]
  ∗  install xenstore               1.2.5          [required by xapi]
  ∗  install tar-format             0.2.1          [required by xapi]
  ∗  install nbd                    1.0.2          [required by xapi]
  ∗  install io-page                1.2.0          [required by xapi-rrd-transport]
  ∗  install crc                    0.9.0          [required by xapi-rrd-transport]
  ∗  install xen-api-client         0.9.7          [required by xapi]
  ∗  install message-switch         0.10.4         [required by xapi-idl]
  ∗  install xenstore_transport     0.9.4          [required by xapi-libs-transitional]
  ∗  install mirage-profile         0.4            [required by xen-gnt]
  ∗  install xapi-idl               0.9.19         [required by xapi]
  ∗  install xen-gnt                2.2.0          [required by xapi-rrd-transport]
  ∗  install xapi-forkexecd         0.9.2          [required by xapi]
  ∗  install xapi-rrd-transport     0.7.2          [required by xapi-rrdd-plugin]
  ∗  install xapi-tapctl            0.9.2          [required by xapi]
  ∗  install xapi-netdev            0.9.1          [required by xapi]
  ∗  install xapi-libs-transitional 0.9.6          [required by xapi]
  ∗  install xapi-rrdd-plugin       0.6.0          [required by xapi]
  ∗  install xapi                   1.9.56
===== ∗  52 =====
Do you want to continue ? [Y/n] y

Obviously it's extremely tedious to do all that by hand!

OPAM also makes iterative development very easy. Consider a scenario where a common interface has to be changed. Without OPAM, we have to figure out which components to rebuild manually -- this is both time-consuming and error-prone. With OPAM, when we want to make some local changes we simply clone the repo and tell OPAM to "pin" the package to the local checkout. OPAM will take care of rebuilding only the dependent packages:

$ git clone git://github.com/xapi-project/xcp-idl
... make some local changes ...
$ opam pin add xapi-idl ./xcp-idl
$ opam install xapi
...
xapi-idl needs to be reinstalled.
The following actions will be performed:
  ↻  recompile xapi-idl               0.9.19*
  ↻  recompile xapi-rrd-transport     0.7.2    [uses xapi-idl]
  ↻  recompile xapi-forkexecd         0.9.2    [uses xapi-idl]
  ↻  recompile xapi-tapctl            0.9.2    [uses xapi-forkexecd]
  ↻  recompile xapi-netdev            0.9.1    [uses xapi-forkexecd]
  ↻  recompile xapi-libs-transitional 0.9.6    [uses xapi-forkexecd]
  ↻  recompile xapi-rrdd-plugin       0.6.0    [uses xapi-idl]
  ↻  recompile xapi                   1.9.56   [uses xapi-idl]
===== ↻  8 =====
Do you want to continue ? [Y/n] 

It's even easier if you just want to pin to a branch, such as master:

$ opam pin add xapi-idl git://github.com/xapi-project/xcp-idl

It's important to be able to iterate quickly when testing a bugfix -- OPAM makes this easy too. After making a change to a "pinned" repository, the user just has to type:

$ opam update -u

and only the affected components will be rebuilt.

OPAM also allows us to create our own 'development remotes' containing the latest, bleeding-edge versions of our libraries. To install these unstable versions, we only need to type:

$ opam remote add xapi-project git://github.com/xapi-project/opam-repo-dev
$ opam update -u

=-=- Updating package repositories =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
[xapi-project] git://github.com/xapi-project/opam-repo-dev already up-to-date
[default] /home/djs/ocaml/opam-repository synchronized

Updates available for 4.01.0, apply them with 'opam upgrade':
=== ∗  1   ↻  6   ↗  6 ===
The following actions will be performed:
  ∗  install   xapi-backtrace         0.2               [required by xapi-idl, xapi-stdext]
  ↗  upgrade   xenctrl                0.9.26 to 0.9.28
  ↗  upgrade   xapi-stdext            0.12.0 to 0.13.0
  ↻  recompile xapi-rrd               0.9.1             [uses xapi-stdext]
  ↻  recompile xapi-inventory         0.9.1             [uses xapi-stdext]
  ↗  upgrade   xapi-idl               0.9.19 to 0.9.21
  ↻  recompile xapi-rrd-transport     0.7.2             [uses xapi-idl]
  ↻  recompile xapi-forkexecd         0.9.2             [uses xapi-idl, xapi-stdext]
  ↗  upgrade   xapi-libs-transitional 0.9.6 to 0.9.7
  ↻  recompile xapi-tapctl            0.9.2             [uses xapi-stdext]
  ↻  recompile xapi-netdev            0.9.1             [uses xapi-stdext]
  ↗  upgrade   xapi-rrdd-plugin       0.6.0 to 0.6.1
  ↗  upgrade   xapi                   1.9.56 to 1.9.58
===== ∗  1   ↻  6   ↗  6 =====
Do you want to continue ? [Y/n]

When a linked set of changes is ready to be pushed, we can make a single pull request updating a set of components, which triggers the Travis integration tests.

Summary

The Xapi toolstack is built from a large set of libraries, independently versioned and released, many of them shared with other projects (such as Mirage). The libraries are easy to build and test separately, but the sheer number of dependencies makes it difficult to build the whole project -- this is where OPAM really shines. OPAM simplifies our day-to-day lives by:

  • automatically rebuilding dependent software when dependencies change
  • allowing us to share 'development remotes' containing bleeding-edge software amongst the development team
  • allowing us to 'release' a co-ordinated set of versions with a git push and then trigger integration tests via travis

If you have a lot of OCaml code to build, try OPAM!

by Dave Scott at February 18, 2015 12:00 AM

Anil Madhavapeddy

ICFP 2015 - a call for sponsorship and how you can help

The call for papers for this year's International Conference on Functional Programming closes in two weeks, and over a hundred cutting-edge research papers will be submitted on the theory, application and experiences behind functional programming and type theory. In addition to the main conference, there are also over 10 big affiliated workshops that run throughout the week, on topics ranging from specific languages (Erlang, Haskell, OCaml) to the broader commercial community, and even art and music.

The ICFP conference experience can be a remarkable one for students. Some great ideas have emerged from random corridor conversations between talks with the likes of Phil Wadler, from rain-soaked discussions with Simon PJ at Mikkeller, or, in my case, from being convinced to write a book while in a smoky Tokyo bar. This year, it will be held in the beautiful city of Vancouver in the fall.

We’re committed to growing the ICFP community, not just in numbers but also in diversity. The Programming Language Mentoring Workshop has been at capacity since it started and will run again. For the first time ever, I am really excited to announce that the Ada Initiative will also be running an Ally Skills workshop during the conference.

Sustaining these activities and responsible growth means that we need to reach ever wider to support the activities of the (not-for-profit) ICFP conference. So, as this year's industrial relations chair, I invite any organization that would like to support ICFP to get in touch with us (e-mail avsm2@cl.cam.ac.uk) and sponsor us. I've put an abridged version of the e-mail solicitation below that describes the benefits. Sponsorship starts as low as $500 and is often tax-deductible in many countries.

I’m writing to ask if you would be willing to provide corporate financial support for the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP), which takes place in Vancouver, Canada, from August 30th through September 5th, 2015:

http://icfpconference.org/icfp2015/

Corporate support funds are primarily used to subsidize students – the lifeblood of our community – and in turn serve to raise the community profile of the supporting companies through a high-profile industrial recruitment event.

Last year, unprecedented levels of support from you and folks like you at over 25 companies and institutions made it possible for students from all over the world to attend ICFP 2014 in Sweden. The Industrial Reception, open to all attendees, was by all accounts a roaring success. All 2014 sponsoring companies had the opportunity to interact with the gathered students, academics, and software professionals.

This year, let’s build on that success and continue to grow our community, and bring even more students to ICFP 2015 in Vancouver!

Your generosity will make it possible for students from all over the world to attend ICFP, the premier conference in functional programming. There, they will meet luminaries in the field, as well as people who’ve built a successful career and/or business on functional programming. They will return home inspired to continue pursuing functional programming in the confidence that exciting future careers await them. For the first time, we will also host an Ally Skills workshop by the Ada Initiative, as well as continue the successful student mentoring workshop from previous years.

This year, we’re continuing a similar system of levels of financial support to last year’s. Our goal is to enable smaller companies to contribute while allowing larger companies to be as generous as they wish (with additional benefits, in recognition of that generosity).

The support levels and their associated pledge amounts and benefits are as follows (costs in US dollars).

Bronze: $500: Logo on website, poster at industrial reception, listed in proceedings.

Silver: $2500: As above plus: logo in proceedings, logo on publicity materials (e.g., posters, etc.)

Gold: $5000: As above plus: named supporter of industrial reception with opportunity to speak to the audience, and opportunity to include branded merchandise in participants’ swag bag.

Platinum: $10000: As above plus: named supporter of whole event, logo on lanyards, badge ribbon, table/booth-like space available (in coffee break areas), other negotiated benefits (subject to ACM restrictions on commercial involvement).

Thank you for your time and especially for your generosity! I look forward to seeing you in Vancouver. If you are willing to be a sponsor, it would be helpful to hear back by March 9th to help us plan and budget.

If you are interested, please get in touch with me or any of the organizing committee. If you’re interested in helping out ICFP in a non-financial capacity (for example as a student volunteer), then there will also be plenty of opportunity to sign up later in the year.

February 18, 2015 12:00 AM