In this post I compare and contrast Haskell and F#.  It may come as no surprise that, with so much shared history, they have a great deal in common.  However, it’s interesting to consider how the perspectives of the languages’ developers play a large role in determining the differences between the languages.

 

A Shared History

As far as the family tree of functional programming is concerned, F# and Haskell are not-too-distant cousins.

[Figure: Haskell and F# history]

The two share a very similar syntax as well as a large number of features.  A great example of this is Hindley–Milner type inference.
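To make the inference point concrete, here is a minimal sketch in Haskell.  The function names are made up for illustration; the point is only that no type signatures are written, yet every definition is statically typed.

```haskell
-- No type signatures appear below; the compiler infers them.

-- Inferred as (t -> t) -> t -> t (type variable names may differ).
applyTwice f x = f (f x)

-- Inferred as (a, b) -> (b, a).
swapPair (x, y) = (y, x)

-- Inferred as (b -> c) -> (a -> b) -> a -> c.
compose f g x = f (g x)
```

Loading this into GHCi and asking for `:type applyTwice` reports the inferred type.  The equivalent F# definitions (`let applyTwice f x = f (f x)`) are inferred in much the same way.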

ML was the first widely used language to leverage Hindley–Milner for statically inferred typing, a feature to which it owes much of its success.  However, most statically typed functional programming languages now also have this feature.  The FP community has always been fast to adopt obviously useful features.  Some other things that fit into this category are garbage collection (Lisp) and lazy evaluation (Lazy ML).

The most obvious difference between Haskell and F# is somewhat easy to infer from this graph: object-oriented constructs.  That is to say, OCaml pioneered the use of object-oriented data structures in functional programming, and F# is its direct descendant.  This has made OCaml (and in turn F#) somewhat of a black sheep in the theoretical functional programming world.

Many functional programming theorists dislike objects because they want a language based on math.  Unlike the majority of the ideas in functional programming, objects don’t have roots in either lambda calculus or category theory.  However, this has not stopped OCaml from being successful.  In fact, quite the opposite.

The use of objects mitigates one of the largest roadblocks in the path to functional programming adoption by engineers:  the difficulty inherent in organizing large functional programs.  The OCaml language engineers also showed that leveraging the object-oriented paradigm did not hamper their ability to use static analysis techniques.  Because of this, OCaml approaches the speed of C.

While it is not pure, OCaml is almost an ideal compromise between theory and engineering.  Indeed, no other functional language fits the paradigms of the Microsoft .NET framework as well.  It’s easy to see why Microsoft chose to extend OCaml when building a functional language to bring to its software engineering masses.

On the other hand, Haskell is almost the ideal language for academic exploration of functional programming.  Its strict limits on side effects and its adherence to abstract mathematical concepts mean no side-effect surprises.  Also, because it’s a committee language, a researcher who gathers enough support for an idea can be almost sure it will be included in the next iteration of the language.

 

Haskell as a Committee Language

Repeat the mantra after me:  Haskell is Lazy; Haskell is Pure; Haskell has Type Classes; Haskell is a Committee Language.

Of all of these, the most defining characteristic of Haskell is that it is a committee language.  It’s an amalgamation of many different goals with no clear vision.  This is at the same time Haskell’s greatest strength and greatest weakness.  While it is the most widely used pure functional programming language, the quirks of committee design are obvious.

Here are some I ran into within two hours of starting with Haskell:

The first was integer rollover.  Haskell has two main integer datatypes: Integer and Int.  Integer is arbitrary precision (limited only by available memory) but can be quite slow, and because of that it’s rather infrequently used.  Int, on the other hand, is fast but, just like in C, can roll over.  The standard Prelude gives you no way to check the overflow bit.
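A small sketch of the rollover, assuming GHC (the Haskell Report leaves the overflow behavior of Int unspecified; GHC wraps around).  The `checkedAdd` helper is hypothetical, not a library function: it does not read any hardware overflow flag, it simply redoes the sum in Integer and compares against the bounds of Int.

```haskell
-- Fixed-width Int wraps around in GHC.
wrapped :: Int
wrapped = maxBound + 1 -- equals minBound under GHC

-- Integer just keeps growing.
exact :: Integer
exact = toInteger (maxBound :: Int) + 1

-- Hypothetical helper: add two Ints, returning Nothing on overflow.
-- The check is done in software by widening both operands to Integer.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd a b
  | wide < toInteger (minBound :: Int) = Nothing
  | wide > toInteger (maxBound :: Int) = Nothing
  | otherwise = Just (a + b)
  where
    wide = toInteger a + toInteger b
```

The widening costs exactly the speed that made Int attractive in the first place, which is the trade-off in a nutshell.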

So, Ints can roll over; I can accept that.  What it implies to me is that speed is more important to Haskell than robustness.  However, this brings me to my second point:  many basic list operations, such as head and tail, will throw errors on an empty list.  This seems entirely inconsistent to me.

I understand that if they didn’t, a logic error would be much more likely to cause an infinite loop in a tail recursive function.  However, this seems completely at odds with the “speed first” definition of an Int.  It also means that almost everyone ends up wrapping the default list operations in the Maybe monad.
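The kind of wrapper I mean looks roughly like this.  The names `safeHead`, `safeTail` and `sumFirstTwo` are illustrative, not Prelude functions (though `Data.Maybe.listToMaybe` in base behaves like `safeHead`).

```haskell
-- Total versions of head and tail: the empty list yields Nothing
-- instead of a runtime error.
safeHead :: [a] -> Maybe a
safeHead [] = Nothing
safeHead (x : _) = Just x

safeTail :: [a] -> Maybe [a]
safeTail [] = Nothing
safeTail (_ : xs) = Just xs

-- Chaining the wrappers in the Maybe monad: the whole computation
-- is Nothing if the list has fewer than two elements.
sumFirstTwo :: [Int] -> Maybe Int
sumFirstTwo xs = do
  a <- safeHead xs
  rest <- safeTail xs
  b <- safeHead rest
  return (a + b)
```

With these, `sumFirstTwo [1]` is `Nothing` rather than an exception, at the price of threading Maybe through every caller.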

The third issue was that operations with the Float data type are slow.  Real World Haskell suggests always using Double, because a great deal of focus has gone into optimizing Double arithmetic but very little into Float.  This demonstrates another thing that comes about with committee languages: often things as important as optimization of basic data types can fall through the cracks because everyone involved wants to work on more exciting things.

Please don’t misunderstand me here: I really like Haskell.  I’m hard on it because I can see that it has a great deal of unrealized potential.  If Haskell is to be a language used for real software engineering, the committee needs to sit down and think hard about an overarching vision for the project.

 

What is the goal here?

The biggest difference between the world of theorists and the world of engineers is that each group has an entirely different set of concerns.

Theorists want to implement ideas fast so that they can crank out papers fast.  A large part of this is having a language that is very close to math so that implementing ideas directly from the chalkboard is trivial.  As the theory world changes so fast, they don’t often care much about organization or maintainability. 

As the committee responsible for Haskell is mainly made up of theorists, it’s easy to see why the language has taken the direction it has.  It’s a language that is very close to math.  As the lifecycle of most academic code is very short, small implementation details which might cause a reduction in robustness are less important.

Engineers want to minimize time spent maintaining code.  Part of this is having a language that emphasizes safety in that it facilitates catching as many bugs as possible, as early in the process as possible.  Another important part of this is code organization as every moment that is spent trying to find a bug is a moment spent not fixing it.  As the cost of maintaining software generally dwarfs the initial development cost, development speed must take a back seat to testing and organization.

The syntax-heavy C# is a great example of this.  It’s slow to write in but relatively safe, and it provides many constructs for the organization and testing of code.  On top of this, a great number of design patterns exist to further categorize substructures in a computer program, and mountains of best practices have been written to guide its developers.

However, we in the software engineering world are in the midst of a crisis.  It turns out that traditional imperative object-oriented programs do not lend themselves to heavy parallelization.  Yet, parallelize we must.  We are looking at exponential growth in the number of cores contained in each processor.  Because of this we engineers find ourselves at a bit of an impasse.  Those who are looking ahead know…

Engineers will soon want very badly to minimize time spent maintaining parallelized code.  We need our programs to be easy to organize, manage and test.  Yet, as we will soon need to deal with massively parallelized systems, we find many of our ideas about what makes code robust and maintainable are broken.  At the same time, to move to a purely functional language means leaving behind years of thought on how computer programs ought to be constructed, tested and maintained.  Having any pattern, even if it’s wasteful or has many corner cases, is much better than having none.  This is why a hybrid language is so important.

OCaml and F# provide engineers with the set-and-forget concurrency that comes along with the functional tradition.  At the same time, these languages have all of the organizational constructs of object-oriented programming.  This means that we can continue to use the same types of large scale organizational structures in our programs and also gain the safe parallelism that implicit immutability provides.
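To keep the examples in this post in one language, here is the idea sketched in Haskell rather than F# or OCaml.  It is a minimal sketch using only base; `parallelMap` and `sumSquares` are hypothetical names.  Because the mapped function is pure and its inputs are immutable, the worker threads share nothing that needs a lock.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- A pure function: the result depends only on the argument.
sumSquares :: Int -> Int
sumSquares n = sum [x * x | x <- [1 .. n]]

-- Hypothetical helper: apply a pure function to each input in its own
-- thread, then collect the results in the original order.
-- Sketch only: if f throws, the matching takeMVar never receives a value.
parallelMap :: (a -> b) -> [a] -> IO [b]
parallelMap f xs = do
  vars <- mapM spawn xs
  mapM takeMVar vars
  where
    spawn x = do
      var <- newEmptyMVar
      -- ($!) forces the result inside the worker thread.
      _ <- forkIO (putMVar var $! f x)
      return var
```

To actually spread the work over several cores with GHC, the program has to be built with `-threaded` and run with `+RTS -N`.  The only shared mutable state is the set of result boxes, and each box is written exactly once.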

 

Conclusion

And so we see that it’s important to consider a language in terms of how its creators envisioned its use.  Haskell has been developed mainly with research in mind and so is a fantastic research language.  F# has been developed mainly with engineering in mind and so is much better suited for engineering.

In understanding this it’s also easy to see that comparing Haskell to F# is like comparing the tools of a physicist to those of an engineer.  They may have much in common superficially, but they are designed with very different ends in mind.

 

Links

A History of Haskell: Being Lazy with Class
Tutorial: OCaml for Scientific Computation (Contains some history)