OCaml Planet

September 08, 2014

Shayne Fletcher

Concatenation of a list of strings

Concatenation of a list of strings

Here's another fun (but probably silly) exercise. Its value I posit, is in highlighting the fundamental similarities that exist between the C++ and OCaml languages (that emerge when one "peeks" beyond the apparent dissimilarities on the surface). Maybe this sort of comparison aids in "lowering the barrier to entry" for the C++ programmer embarking on a journey into OCaml? Anyway, here we go.

The OCaml String module contains a function concat which concatenates a list of strings whilst inserting a separator between each of the elements. Prior to OCaml 4.02 at least, it's implementation went as follows:


let concat sep l =
match l with
[] -> ""
| hd :: tl ->
let num = ref 0 and len = ref 0 in
List.iter (fun s -> incr num; len := !len + length s) l;
let r = create (!len + length sep * (!num - 1)) in
unsafe_blit hd 0 r 0 (length hd);
let pos = ref(length hd) in
List.iter
(fun s ->
unsafe_blit sep 0 r !pos (length sep);
pos := !pos + length sep;
unsafe_blit s 0 r !pos (length s);
pos := !pos + length s)
tl;
r

So, a faithful translation of this program into C++ (unsafe_blit 'n all), yields this:


#include <boost/range.hpp>

#include <string>
#include <numeric>
#include <cstring>

namespace string_util /*In honor of Stefano of http://spacifico.org/ :)*/
{
template <class RgT>
std::string concat (std::string const& sep, RgT lst)
{
if (boost::empty (lst)) return "";

std::size_t num = 0, len = 0;
std::accumulate (
boost::begin (lst), boost::end (lst), 0,
[&](int _, std::string const& s) ->
int { ++num, len += s.size(); return _; } );
std::string r(len + sep.size () * (num - 1), '\0');
std::string const& hd = *(boost::begin (lst));
std::memcpy ((void*)(r.data ()), (void*)(hd.data ()), hd.size());
std::size_t pos = hd.size();
std::accumulate (
boost::next (boost::begin (lst)), boost::end (lst), 0,
[&](int _, std::string const& s) ->
int {
std::memcpy((void*)(r.data()+pos),(void*)(sep.data()),sep.size());
pos += sep.size ();
std::memcpy ((void*)(r.data()+pos),(void*)(s.data()),s.size());
pos += s.size ();
return _; });

return r;
}
}//namespace<string_util>
For example, this fragment

#include <boost/assign/list_of.hpp>

// ...

std::list <std::string> lst = boost::assign::list_of ("foo")("bar")("baz");
std::string r = string_util::concat (",", lst);
will produce the string "foo,bar,baz".

So there it is... As usual, a little more verbosity required on the C++ side but otherwise, not much between them IMHO. Agree?

by Shayne Fletcher (noreply@blogger.com) at September 08, 2014 01:38 AM

September 03, 2014

WODI

OCaml 4.02.0 available

WODI, the OCaml distribution for Windows, now officially supports OCaml 4.02.0. Updated installers and archives can be found in the download section.

Not all packages are yet officially ported to OCaml 4.02.0. In some cases, they were patched, so that depending libraries and program could be built; the rest will hopefully be available soon.

The changes to WODI apart from the upgrade to OCaml 4.02.0 were already discussed in the previous posts that announces OCaml 4.02.0rc1 and 4.02.0beta:

  • external libraries are now based on fedora’s repository.

  • better integration into cygwin (windows-style newlines \r\n breaks autoconf scripts and many makefile, they are therefore disabled for common invocations like ocamlc -where).

  • the quote handling of ocamlbuild was changed.

  • the performance of godi-tools has been improved.

Upgrade

Instead of upgrading a previous version, it’s probably easier to start with a fresh installation:

/opt/wodi32/lib/godi/winconfig --remove # This removes environment
  # variables that were set by WODI from the environment outside cygwin
mv -i /opt/wodi32 /opt/wodi32.old
wget 'http://ml.ignorelist.com/wodi/7/wodi32.tar.xz' -O - | tar -C /tmp -xJf-
bash /tmp/wodi32/install.sh
eval $(/opt/wodi32/sbin/godi_env)
godi_add godi-core_kernel godi-batteries # your favorite packages here

Or just use the installer and install WODI to a different location. You can use different installations of cygwin/WODI in parallel. Just disable the export of environment variables like PATH or OCAMLLIB during the installation process (and don’t create links in the start menu).

September 03, 2014 10:00 PM

September 02, 2014

Shayne Fletcher

Terms With Variables (C++)

Terms with Variables (C++)

In this earlier post I showed a nifty OCaml type for modeling terms with variables for problems involving substitutions. I got interested in what it would take to implement the type in C++(03) (doing 'sum' types in C++ elegantly is a perennial problem). It ain't nowhere near as succinct but we got there nonetheless.

#include <list>

#include <boost/variant.hpp>

/*
type ('a, 'b) term =
| Term of 'a * ('a, 'b) term list
| Var of 'b
*/

template <class A, class B> struct term;
template <class B> struct var;

template <class A, class B>
struct make_tree
{
typedef boost::variant <
boost::recursive_wrapper<term <A, B> >, var<B> > type;
};

template <class A, class B>
struct term
{
typedef typename make_tree <A, B>::type tree;
A a;
std::list <tree> children;
term (A a
, std::list<tree> const& children)
: a (a), children (children)
{}
};

template <class A, class B>
inline term <A, B> make_term (
A a, std::list<typename make_tree<A, B>::type> const& c, B const*)
{
return term<A, B> (a, c);
}

template <class B>
struct var
{
B tag;
var (B tag) : tag (tag) {}
};

template <class B>
inline var<B> make_var (B tag) { return var<B> (tag); }
For example, this little program builds the term represented by the concrete syntax "a(b(), c)".

int main ()
{
typedef make_tree<std::string, std::string>::type tree;
typedef std::list<tree> term_list;
std::string const* tag_str=(std::string const*)0L;

// a(b(), c)
term_list l;
l.push_back (make_term(std::string("b"), term_list (), tag_str));
l.push_back (make_var(std::string("c")));
tree t = make_term(std::string("a"), l, tag_str);

return 0;
}

by Shayne Fletcher (noreply@blogger.com) at September 02, 2014 09:52 PM

August 30, 2014

Anil Madhavapeddy

Talks from OCaml Labs during ICFP 2014

It’s the ever-exciting week of the International Conference on Functional Programming again in Sweden, and this time OCaml Labs has a variety of talks, tutorials and keynotes to deliver throughout the week. This post summarises all them so you can navigate your way to the right session. Remember that once you register for a particular day at ICFP, you can move between workshops and tutorials as you please.

Quick links to the below in date order:

Language and Compiler Improvements

The first round of talks are about improvements to the core OCaml language and runtime.

» Modular implicits

Leo White and Frederic Bour have been taking inspiration from Scala implicits and Modular Type Classes by Dreyer et al, and will describe the design and implementation of a system for ad-hoc polymorphism in OCaml based on passing implicit module parameters to functions based on their module type.

This provides a concise way to write functions to print or manipulate values generically, while maintaining the ML spirit of explicit modularity. You can actually get get a taste of this new feature ahead of the talk, thanks to a new facility in OCaml: we can compile any OPAM switch directly into an interactive JavaScript notebook thanks to iocamljs by Andy Ray.

» Multicore OCaml

Currently, threading in OCaml is only supported by means of a global lock, allowing at most one thread to run OCaml code at any time. Stephen Dolan, Leo White and Anil Madhavapeddy have been building on the early design of a multicore OCaml runtime that they started in January, and now have a (early) prototype of a runtime design that is capable of shared memory parallelism.

  • Abstract
  • Date: 09:10-10:00, OCaml Workshop, Fri Sept 5th

» Type-level Module Aliases

Leo White has been working with Jacques Garrigue on adding support for module aliases into OCaml. This significantly improves the compilation speed and executable binary sizes when using large libraries such as Core/Async.

» Coeffects: A Calculus of Context-dependent Computation

Alan Mycroft has been working with Tomas Petricek and Dominic Orchard on defining a broader notion of context than just variables in scope. Tomas will be presenting a research paper on developing a generalized coeffect system with annotations indexed by a correct shape.

  • Paper
  • Date: 16:30-17:20, ICFP Day 1, Mon Sep 1st.

 Mirage OS 2.0

We released Mirage OS 2.0 in July, and there will be several talks diving into some of the new features you may have read on the blog.

» Unikernels Keynote at Haskell Symposium

Since MirageOS is a unikernel written entirely in OCaml, it makes perfect sense to describe it in detail to our friends over at the Haskell Symposium and reflect on some of the design implications between Haskell type-classes and OCaml functors and metaprogramming. Anil Madhavapeddy will be doing just that in a Friday morning keynote at the Haskell Symposium.

  • Haskell Symposium Program
  • Date: 0900-1000, Haskell Symposium, Fri Sep 5th.

» Transport Layer Security in OCaml

Hannes Menhert and David Kaloper have been working hard on integrating a pure OCaml Transport Layer Security stack into Mirage OS. They’ll talk about the design principles underlying the library, and reflect on the next steps to build a TLS stack that we can rely on not to been more insecure than telnet.

  • Abstract
  • Date: 10:25-11:20, OCaml Workshop, Fri Sep 5th.

Hannes will also continue his travels and deliver a couple of talks the week after ICFP on the same topic in Denmark, so you can still see it if you happen to miss this week’s presentation:

  • 9th Sep at 15:00, IT University of Copenhagen (2A08), details
  • 11th Sep Aarhus University, same talk (time and room TBA)

» Irmin: a Branch-consistent Distributed Library Database

Irmin is an OCaml library to persist and synchronize distributed data structures both on-disk and in-memory. It enables a style of programming very similar to the Git workflow, where distributed nodes fork, fetch, merge and push data between each other. The general idea is that you want every active node to get a local (partial) copy of a global database and always be very explicit about how and when data is shared and migrated.

This has been a big collaborative effort lead by Thomas Gazagnaire, and includes contributions from Amir Chaudhry, Anil Madhavapeddy, Richard Mortier, David Scott, David Sheets, Gregory Tsipenyuk, Jon Crowcroft. We’ll be demonstrating Irmin in action, so please come along if you’ve got any interesting applications you would like to talk to us about.

  • Abstract
  • Blog Post
  • Date: 15:10-16:30, Joint Poster Session for OCaml/ML Workshop, Fri Sep 5th 2014.

» Metaprogramming with ML modules in the MirageOS

Mirage OS lets the programmer build modular operating system components using a combination of OCaml functors and generative metaprogramming. This ensures portability across both Unix binaries and Xen unikernels, while preserving a usable developer workflow.

The core Mirage OS team of Anil Madhavapeddy, Thomas Gazagnaire, David Scott and Richard Mortier will be talking about the details of the functor combinators that make all this possible, and doing a live demonstration of it running on a tiny ARM board!

  • Abstract
  • Date: 14:50-15:10, ML Workshop, Thu Sep 4th 2014.

» CUFP OCaml Language Tutorial

Leo White and Jeremy Yallop (with much helpful assistance from Daniel Buenzli) will be giving a rather different OCaml tutorial from the usual fare: they are taking you on a journey of building a variant of the popular 2048 game in pure OCaml, and compiling it to JavaScript using the js_of_ocaml compiler. This is a very pragmatic introduction to using statically typed functional programming combined with efficient compilation to JavaScript.

In this tutorial, we will first introduce the basics of OCaml using an interactive environment running in a web browser, as well as a local install of OCaml using the OPAM package manager. We will also explore how to compile OCaml to JavaScript using the js_of_ocaml tool.

The tutorial is focused around writing the 2048 logic, which will then be compiled with js_of_ocaml and linked together with a frontend based on (a pre-release version of) Useri, React, Gg and Vg, thanks to Daniel Buenzli. There’ll also be appearances from OPAM, IOCaml, Qcheck and OUnit.

There will also be a limited supply of special edition OCaml-branded USB sticks for the first tutorial attendees, so get here early for your exclusive swag!

» The OCaml Platform

The group here has been working hard all summer to pull together an integrated demonstration of the new generation of OCaml tools being built around the increasingly popular OPAM package manager. Anil Madhavapeddy will demonstrate all of these pieces in the OCaml Workshop, with guest appearances of work from Amir Chaudhry, Daniel Buenzli, Jeremie Diminio, Thomas Gazagnaire, Louis Gesbert, Thomas Leonard, David Sheets, Mark Shinwell, Christophe Troestler, Leo White and Jeremy Yallop.

The OCaml Platform combines the OCaml compiler toolchain with a coherent set of tools for build, documentation, testing and IDE integration. The project is a collaborative effort across the OCaml community, tied together by the OCaml Labs group in Cambridge and with other major contributors.

» The 0install Binary Installation System

Thomas Leonard will also be delivering a separate talk about cross-platform binary installation via his 0install library, which works on a variety of platforms ranging from Windows, Linux and MacOS X. He recently rewrote it in OCaml from Python, and will be sharing his experiences on how this went as a new OCaml user, as well as deliver an introduction to 0install.

  • Abstract
  • Date: 10:25-10:50, OCaml Workshop, Fri Sep 5th 2014.

» Service and Socialising

Heidi Howard and Leonhard Markert are acting as student volunteers at this years ICFP, and assisting with videoing various workshops such as CUFP Tutorials, Haskell Symposium, the Workshop on Functional High-Performance Computing and the ML Family Workshop. Follow their live blogging on the Systems Research Group SysBlog and leave comments about any sessions you’d like to know more about!

Anil Madhavapeddy is the ICFP industrial relations chair and will be hosting an Industrial Reception on Thursday 4th September in the Museum of World Culture starting from 1830. There will be wine, food and some inspirational talks from the ICFP sponsors that not only make the conference possible, but provide an avenue for the academic work to make its way out into industry (grad students that are job hunting: this is where you get to chat to folk hiring FP talent).

This list hasn’t been exhaustive, and only covers the activities of my group in OCaml Labs and the Systems Research Group at Cambridge. There are numerous other talks from the Cambridge Computer Lab during the week, but the artistic highlight will be on Saturday evening following the CUFP talks: Sam Aaron will be doing a live musical performance sometime after 8pm at 3vaningen. Sounds like a perfect way to wind down after what’s gearing to up to be an intense ICFP 2014. I look forward to seeing old friends and making new ones in Gothenburg soon!

August 30, 2014 10:00 PM

August 29, 2014

Richard Jones

virt-v2v: better living through new technology

If you ever used the old version of virt-v2v, our software that converts guests to run on KVM, then you probably found it slow, but worse still it was slow and could fail at the end of the conversion (after possibly an hour or more). No one liked that, least of all the developers and support people who had to help people use it.

A V2V conversion is intrinsically going to take a long time, because it always involves copying huge disk images around. These can be gigabytes or even terabytes in size.

My main aim with the rewrite was to do all the work up front (and if the conversion is going to fail, then fail early), and leave the huge copy to the last step. The second aim was to work much harder to minimize the amount of data that we need to copy, so the copy is quicker. I achieved both of these aims using a lot of new technology that we developed for qemu in RHEL 7.

Virt-v2v works (now) by putting an overlay on top of the source disk. This overlay protects the source disk from being modified. All the writes done to the source disk during conversion (eg. modifying config files and adding device drivers) are saved into the overlay. Then we qemu-img convert the overlay to the final target. Although this sounds simple and possibly obvious, none of this could have been done when we wrote old virt-v2v. It is possible now because:

  • qcow2 overlays can now have virtual backing files that come from HTTPS or SSH sources. This allows us to place the overlay on top of (eg) a VMware vCenter Server source without having to copy the whole disk from the source first.
  • qcow2 overlays can perform copy-on-read. This means you only need to read each block of data from the source once, and then it is cached in the overlay, making things much faster.
  • qemu now has excellent discard and trim support. To minimize the amount of data that we copy, we first fstrim the filesystems. This causes the overlay to remember which bits of the filesystem are used and only copy those bits.
  • I added support for fstrim to ntfs-3g so this works for Windows guests too.
  • libguestfs has support for remote storage, cachemode, discard, copy-on-read and more, meaning we can use all these features in virt-v2v.
  • We use OCaml — not C, and not type-unsafe languages — to ensure that the compiler is helping us to find bugs in the code that we write, and also to ensure that we end up with an optimized, standalone binary that requires no runtime support/interpreters and can be shipped everywhere.

by rich at August 29, 2014 09:26 PM

August 28, 2014

Functional Jobs

Senior Software Engineer (Functional) at McGraw-Hill Education (Full-time)

This Senior Software Engineer position is with the new LearnSmart team at McGraw-Hill Education's new and growing Research & Development center in Boston's Innovation District. We make software that helps college students study smarter, earn better grades, and retain more knowledge.

The LearnSmart adaptive engine powers the products in our LearnSmart Advantage suite — LearnSmart, SmartBook, LearnSmart Achieve, LearnSmart Prep, and LearnSmart Labs. These products provide a personalized learning path that continuously adapts course content based on a student’s current knowledge and confidence level.

On our team, you'll get to:

  • Move textbooks and learning into the digital era
  • Create software used by millions of students
  • Advance the state of the art in adaptive learning technology
  • Make a real difference in education

Our team's products are built with Flow, a functional language in the ML family. Flow lets us write code once and deliver it to students on multiple platforms and device types. Other languages in our development ecosystem include especially JavaScript, but also C++, SWF (Flash), and Haxe.

If you're interested in functional languages like Scala, Swift, Erlang, Clojure, F#, Lisp, Haskell, and OCaml, then you'll enjoy learning Flow. We don't require that you have previous experience with functional programming, only enthusiasm for learning it. But if you do have some experience with functional languages, so much the better! (On-the-job experience is best, but coursework, personal projects, and open-source contributions count too.)

We require only that you:

  • Have a solid grasp of CS fundamentals (languages, algorithms, and data structures)
  • Be comfortable moving between multiple programming languages
  • Be comfortable with modern software practices: version control (Git), test-driven development, continuous integration, Agile

Get information on how to apply for this position.

August 28, 2014 09:18 PM

Github OCaml jobs

Full Time: Software Developer (Functional Programming) at Jane Street in New York, NY; London, UK; Hong Kong

Software Developer (Functional Programming)

Jane Street is looking to hire great software developers with an interest in functional programming. OCaml, a statically typed functional programming with similarities to Haskell, Scheme, Erlang, F# and SML, is our language of choice. We've got the largest team of OCaml developers in any industrial setting, and probably the world's largest OCaml codebase. We use OCaml for running our entire business, supporting everything from research to systems administration to trading systems. If you're interested in seeing how functional programming plays out in the real world, there's no better place.

The atmosphere is informal and intellectual. There is a focus on education, and people learn about software and trading, both through formal classes and on the job. The work is challenging, and you get to see the practical impact of your efforts in quick and dramatic terms. Jane Street is also small enough that people have the freedom to get involved in many different areas of the business. Compensation is highly competitive, and there's a lot of room for growth.

You can learn more about Jane Street and our technology from our main site, janestreet.com. You can also look at a a talk given at CMU about why Jane Street uses functional programming (http://ocaml.janestreet.com/?q=node/61), and our programming blog (http://ocaml.janestreet.com).

We also have extensive benefits, including:

  • 90% book reimbursement for work-related books
  • 90% tuition reimbursement for continuing education
  • Excellent, zero-premium medical and dental insurance
  • Free lunch delivered daily from a selection of restaurants
  • Catered breakfasts and fresh brewed Peet's coffee
  • An on-site, private gym in New York with towel service
  • Kitchens fully stocked with a variety of snack choices
  • Full company 401(k) match up to 6% of salary, vests immediately
  • Three weeks of paid vacation for new hires in the US
  • 16 weeks fully paid maternity/paternity leave for primary caregivers, plus additional unpaid leave

More information at http://janestreet.com/culture/benefits/

August 28, 2014 01:04 PM