Why F#? (TechEd '09 DEV450)

by joel 6. September 2009 13:14

I’m presenting at TechEd Australia 2009 on Microsoft’s new functional language F#. The session is called “DEV450 Big Algorithms Made Easy with Microsoft’s F#”. It’s jammed packed with content – too much to easily absorb in an hour, so I figured I’d start dumping stuff we may miss for pre/post talk review and follow up – homework if you like. The talk is scheduled for Friday the 9th in Central A, and you can find the session abstract here: http://www.msteched.com/australia/Public/SessionList.aspx. For those not attending, I think it’s being filmed, so I’ll see about grabbing a link and posting here after the show.

What I’m Talking About

The title suggests “big algorithms”, but I’d have preferred to say “interesting algorithms you can start learning and applying today in your apps” – the former seems more marketable. I deliver a survey of algorithms that they’re classifying these days as “Collective Intelligence Algorithms” – classic Web 2.0 hyper-super-nomenclature. What this really means is looking at some search and rank, classification and recommendation algorithms in the context of big data – typically crowd generated data. We’ll then marry them to the powerful language and platform feature set of the F# functional language. We’re trying to achieve two things: explore some of the algorithms that the big guns like Google and Microsoft are using in a fun, absorbable way; and show off how easy it is to implement them in a functional language like F#. At the end of the hour, I hope to convince you that it’s worth spending the time learning more about these algorithms, and more importantly, adopting the functional programming religion.

In my talk, I don’t spend too much time on what F# actually represents as a language or ideology, so I’ll start by surveying a lot of that background here. We’ll start with the ultimate question: why bother with all this? Then we’ll take a quick look at the F# elevator pitch, and finally, I’ve got all the code I use in my presentation available for download. If I have time before the talk this week, I try and also dump some pre-reading content on the CI algorithms themselves.

Let’s start with the ultimate question:

Why Should I Learn a Functional Language?

I think the “why” is a judgement call for the individual. Learning a functional language (particularly if you’re engrained in the imperative) requires brain re-wiring. It’s a frustrating path, especially when the payoff isn’t immediate - based on a very small sample of people I know who made the leap, I’d say it’s a 40-60 hour ramp up to get comfortable and productive. And the motivations and benefits are going to be largely intrinsic. There’s no widespread market reward for knowing a functional programming language, and there’s certainly no forcing functions for adoption (at least not right now).

So where’s the beef? I figured I’d stand on the shoulders of giants and give a quick link dump/summary for your leisurely review:

John Hughes, Why Functional Programming Matters (1984). An absolute classic.

“Conventional languages place conceptual limits on the way problems can be modularised. Functional languages push those limits back.”

He talks about functional languages providing efficient modularity, composability, safety through immutability and provides numerous program examples juxtaposed against imperative equivalents. His core theme is modularity – arguing that clean modularity is the key to successful programming, and languages that aim to enable modularity are inherently more productive.

Reg Braithwaite has great review of Hughes’ paper here. I particularly like his extension to Hughes discussion through his analysis of Seperation of Concerns (SoC). He gives an example of Object Orientations poor SoC in the game design world:

“Let’s ask a question about Monopoly (and Enterprise Software). Where do the rules live? In a noun-oriented design, the rules are smooshed and smeared across the design, because every single object is responsible for knowing everything about everything it can ‘do’. All the verbs are glued to the nouns as methods.”

He goes on to note:

“The great insight is that better programs separate concerns. They are factored more purely, and the factors are naturally along the lines of responsibility (rather than in Jenga piles of abstract virtual base mixin module class proto_ extends private implements). Languages that facilitate better separation of concerns are more powerful in practice than those that don’t.”

His core point is that functional languages offers users the ability to get to the verb right away, without being admonished to objects and interfaces. The comments of Reg’s post are also worth the read. Particularly the one chap that points out the obvious: there’s a problem with the semantic distance between the language of the problem domain, and the language used to implement the solution. I call this developer cognitive dissonance – how hard is it to map what you want to “do” to the language and tools you’re using to implement?

Paul Graham extends the theme of developer cognitive dissonance and introduces differentiation as a key motivation for the “why functional” through his essay Beating the Averages. Referring to the initial stages of a startup he was involved in, he talks about his startups language choice:

“If you can use any language, which do you use? We chose Lisp. For one thing, it was obvious that rapid development would be important in this market. We were all starting from scratch, so a company that could get new features done before its competitors would have a big advantage. We knew Lisp was a really good language for writing software quickly, and server-based applications magnify the effect of rapid development, because you can release software the minute it's done.”

And then goes on to say:

“What were the results of this experiment? Somewhat surprisingly, it worked. We eventually had many competitors, on the order of twenty to thirty of them, but none of their software could compete with ours.”

Graham’s essay shines a light on a software business truism: the barriers to entry are really low, so expect lots of competition. Those with more business acumen than me often point out that first mover advantage makes a competitive difference, so if you can shorten your development cycle, get to market first, all with high quality, who wouldn’t? And with today’s audience almost expecting features like recommendations, awesome search, social networking, classification, high availability, speed, scale and more, perhaps it’s time we mapped that sort of feature set to the best programming language and toolset for the job.

The book “Real World Haskell” (Haskell being the infamous pure functional language used by academics, geeks and language nuts) also has a great deal for those that choose the functional programming path: Novelty, Power and Enjoyment. The book is a great read, and worth the time if you’re looking for a deep dive in to one of the more ‘purist’ functional languages around. 

Sill not convinced? Erik Meijer (among others) have made the case for purist functional programming ideology being the saviour to the concurrency problem we’re facing today. He makes the case that state and side-effects (which include IO, exceptions, threads, resources, reflection etc etc) are the bane of concurrency, and that imperative/OO languages today simply aren’t equipped to declare and manage those side-effects in a safe way. I like his unexpected exception example on page 15:

var xs = new[] { 5, 4, 3, 2, 1, 0 };
IEnumerable<int> q = null;

    q = from x in xs select 1 / x;
    q = new int[] {};

foreach (var z in q)
    // DivideByZero gets thrown here and not caught!

The link between unexpected exception behaviour and concurrency is vague at best, but the basic concept is this: if you can’t describe side-effects using the available language syntax, the compiler can’t reason about it, and inevitably can’t tell you if it’s safe to parallelise or not. He later goes on to talk about how a functional feature called Monads are able to describe side-effects right there in the type system, thus allowing the compiler, the programmer, and/or concurrency libraries to take advantage of that when figuring out how to parallelise safely.

Tim Sweeney from Epic Games (who brought me Gears of War, and I love them for it) also shares Erik’s view – his POPL (Principles of Programming Languages) talk titled “The Next Mainstream Programming Language: A Game Developers Prospective” walks through the life of a game programmer in an imperative OO world. The reliability and concurrency/data parallelism slides have strong arguments for language features found in traditional functional programming languages. Slides 28-39 start with an incredible statistic: 50% of Unreal engine bugs can be traced to common imperative/OO compiler problems (out-of-bounds array access, dereferencing null pointers, integer overflow and uninitialized variable access).

Under the QCon track “Historically Bad Idea’s”, Tony Hoare, type system designer for ALGOL, call’s the invention of the null reference a “billion dollar mistake”. In similar vein to Erik and Tim, he call’s for the eradication of null from the type systems of current languages – something functional programmers generally don’t need to worry about.

And if better design traits, easier composability, reduced developer cognitive dissonance, competitive differentiation, fun, power, enjoyment, tamed side-effects, safety and being smarter and better looking aren’t enough, you could always start looking to the functional language feature set to see what you’re missing out on.

Functional Language Feature Set

Wikipedia gives a decent run down on some of the notable functional programming language features: higher-order functions, recursion, laziness, evaluation, pattern matching, type inference, immutability and meta-programming. I suggest reading and understanding each of those features listed (or you could come to my talk and learn about them there ;)).

Matt Podwysocki and Erik Meijer have a great discussion about functional programming and it’s feature set (in fact Matt’s blog is just fantastic for all things functional, and more importantly all things F#).

Erik also goes solo with a whiteboard session on some core functional programming features and concepts.

The comp.lang.functional FAQ has a great overview of what functional programming is, and some of the features that make them special.

And if you want to build a multi-billion dollar search company, you better ramp up on Map Reduce (two simple functions that are available in most functional programming languages). These two simple functions gave Google the programming model it needed to scale up and out and crawl billions of web pages.

“MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper.

Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system.”

While you may not get the distributed computing part for free when adopting a functional programming language, you get all the language enablers to go do it yourself (and my personal belief is that your favourite cloud computing provider will provide you with a similar model anyway).

And with that, we begin looking at the functional feature set with a F#/.NET slant: “The Don” (as I like to call him), talks on video about F# and its feature set. His “comparison’s with C#” slides are good – one core point being that in C#, there’s more noise than signal (as in, there’s a lot more lines of code in C#, than there is in the same F# example). The “Pleasure and Pain” slides (about 8 mins in) you may instantly relate to if you’ve ever tried to build a compiler in an OO language.

Amanda Launcher and Ted Neward have a chat with the .NET Rocks team about Functional Programming, it’s feature set, and F#.

Chris Smith (the F# team’s guru tester) and Dustin Campbell chat on the Deep Fried Bytes podcast about F# and functional programming.

And StackOverflow has a discussion of functional features and how they relate to the C# language.

The F# Elevator Pitch

The modern software application has evolved in to a heterogeneous mesh of platforms, architecture, data, and design, which spans a variety of topologies like client/server, n-tier, Peer-to-Peer, cloud and more. The modern software consumer has also evolved. They want  high availability, reliable apps that light up on new hardware with many-cores. They want their apps to scale and accommodate the massive amount of data that they and their friends generate.

And you just want to get through the day building what they want, cheaply and quickly without complexity.

F# has superb fundamentals to help you:

  • A full-featured functional language with all the guts and glory to help solve the problems we’ve talked about above.
  • Works great with large datasets in a beautifully succinct and scalable way. (Check out how the Applied Games group parsed terabytes of data easily with F#)
  • And works great in the small and adds value to your app dev cycle straight away.
  • Helps you program in a naturally concurrent way, and gives you concurrency features out-of-the box to scale across many-cores.

And its core value proposition:

  • It’s a .NET language, which comes with all the cross-language interop goodness we’ve come to expect from the CLR.
  • Because it speaks .NET, you can use all the .NET libraries you know and love (BCL, WPF, WCF, ASP.NET etc).
  • And that also means you get to use ALL the .NET tools you know and love (profilers, code analysis,
  • It speaks OO and mutability too: while immutable and functional by default, gives you plenty of escape hatches to walk the mutable/OO plank.

But most of all, it’s hacking F# is just plain fun. And if you get nothing out of the 2000+ word “why functional, why F#” pitch, take this away: learning F# and/or the functional paradigm will make you a much better C#/VB.NET programmer. And that to me is worth the 40-60 hours of brain-hurt. So get started now. Download the free VS 2008 add-on, or grab the VS 2010 beta bits (it comes default installed!)

Hopefully see you at the talk!

Download DEV450 Demo Code Search, Netflix Recommendation Engine, SVM Classifier, Fuzzy matching.


blog comments powered by Disqus

Powered by BlogEngine.NET
A modified theme by Mads Kristensen

About the author

Joel Pobar works on languages and runtimes at Facebook