Joel Spolsky: Can your programming language do this? (joelonsoftware.com)
222 points by krat0sprakhar on April 23, 2011 | 110 comments



I think there's some reasonable stuff buried in here, I really do.

But... having actually spent some time in the trenches dealing with a hard problem on a massively parallel machine - more than once - I find it hard to believe that something like map/reduce or the like - or any given small-scale language feature is going to be particularly significant in terms of parallelizing any goddamn thing that's actually hard to do. I see a lot of fatuous claims that language feature X is the missing link in parallelism for the everyday working programmer but I don't see a lot of new solutions for anything hard as a proof of concept.

We've only had OpenMP and all sorts of other kludgy extensions for Fortran and C for what, about 15 years? I'm not saying that they're great or elegant or anything, but so many of the things that are hard about parallel programming are NOT FUCKING SOLVED BY MAP-REDUCE. Oops, sorry, shouty. But anything that can be solved by map-reduce wasn't enormously hard to begin with. Map-reduce was itself initially publicized in terms of 'less useful but easier for mere mortals than generalized parallel prefix' which made sense to me.

What doesn't make sense to me is all this frenzied enthusiasm for dragging parallel programming into the never-ending programming language abstraction wars; at least when the topics being discussed only touch on the very shallowest things needed by parallel programming. You want some respect? Solve something hard.

Yes, you can do the same thing to each element of an array. Whaddya want, a cookie?


I agree that map-reduce does not solve parallel computing. However, that is just one example of how closures/lambdas can help you design software. It helps in almost any system in which you have two different functions that do almost the same thing, but differ a little.

For example, say you are writing a doubly linked list. The push-to-head and push-to-tail functions may be completely the same, except one uses the head and one uses the tail, so you could just write one function and pass in the head or tail, right?

But you can't. The problem is that when you add something to one side, the links are the opposite. That is, a new head goes to the right of the old head, while a new tail goes to the left of the old tail. In order to really do this, you also have to pass in a function.

Now, I've written a few doubly linked lists in Java (don't get mad, there _are_ good reasons to do this), and in the end I just copy the function and make the inline changes. Trying to create the right interface and then pass in anonymous classes is almost impossible. And even if in my supreme cleverness I could figure it out, the code would be so confusing as to be unmaintainable.

Anyway, I would say that closures are immensely useful, not _just_ because of map reduce, but because of a whole class of examples that work in this manner.


Here's the main Haskell hero, Simon Peyton Jones giving a talk about data parallel Haskell, which I think is a promising language abstraction to do hard things with parallelism relatively easily:

http://www.youtube.com/watch?v=NWSZ4c9yqW8

As for techniques that can be used now: Haskell's `par` and `pseq` combinators allow adding parallelization to "hard" code after-the-fact pretty easily.


On a few occasions, I've taken a program and added a few functions from Control.Parallel, and five minutes later, I had a significantly faster program running on several cores. It was surprisingly easy.

Granted, I was going after the low-hanging-fruit cases, but even so, I was impressed.


Wasn't the promise of haskell the idea that parallelism would be implicit? What are operators like those doing there?


No, that was never "the promise" of Haskell.

Making automatic implicit parallelism in Haskell is trivial - but it will yield too much parallelism and the overhead will trump the benefits.

The `par` and `pseq` combinators let the programmer specify explicitly which computations to parallelize, both to avoid parallelizing computations that are too small to be worth it and to let the programmer take care of data locality.

Despite being explicit, they are still far easier than other explicit parallelism mechanisms such as explicit threads because they are:

* So easy to throw in the program

* Guaranteed not to alter the semantics of your program - so you can throw them in, profile the change, and know they can only affect performance (for better or worse), never correctness.
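To make that concrete, a minimal sketch of the kind of usage being described (not from the parent post; `fib` is just a stand-in for an expensive pure computation, and you need GHC's -threaded runtime plus +RTS -N to actually use multiple cores):

  import Control.Parallel (par, pseq)

  -- fib is only a stand-in for any expensive pure computation
  fib :: Int -> Integer
  fib n | n < 2     = fromIntegral n
        | otherwise = fib (n - 1) + fib (n - 2)

  -- `par` sparks x for parallel evaluation; `pseq` forces y first, so the
  -- main thread does useful work while the spark runs. Removing `par` and
  -- `pseq` cannot change the result, only the timing.
  parSum :: Int -> Int -> Integer
  parSum a b = x `par` (y `pseq` (x + y))
    where
      x = fib a
      y = fib b

  main :: IO ()
  main = print (parSum 33 34)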


"I find it hard to believe that something like map/reduce or the like - or any given small-scale language feature is going to be particularly significant in terms of parallelizing any goddamn thing that's actually hard to do."

I don't believe it either, but that wasn't the point of the article. It was not about magically solving hard problems, but about getting parallelism for free where it is easy.

OTOH it is not apparent that those language features are actually indispensable for that. In most cases where 'map' would apply, it is quite possible for a compiler to detect that the loop body does not depend on previous iterations, so the loop can be parallelized automatically.


Yes, the last bit was largely my point.


It's a little misleading to link MapReduce to language support for anonymous functions. The original implementation of MapReduce was in C++!


Joel's point is that these things are easy now because of techniques or features like the ones he brings up. Try to do something like that in assembly or FORTRAN or C and it's not all that easy any more.

You're working a level above: map/reduce is easy there, and things that would be completely unthinkable to maintain in assembly/FORTRAN/C are the real hard problems.


My feeling is that this notion that each level of abstraction makes 'everything above that level easier, including sophisticated algorithms that would be 2 or 3 levels up' is simply not true.

Just because easy things are easier in a language with map/reduce and gc and whatever else, doesn't necessarily mean that things that were hard before will fit cleanly into the mechanism.

Your idea that things that are 'completely unthinkable to maintain in Fortran or C' are somehow much easier in this kind of environment is just silly; there are multiple orders of magnitude more parallel code written in both of those languages than map/reduce and many of the classic parallel algorithms and (wince) design patterns like parallel prefix were written in these languages originally or are easily expressed that way.

The _hard_ algorithms are ones with sophisticated communication requirements that aren't well captured by map/reduce, so in this case, having an abstraction to deal with the easy cases has accomplished nothing towards making the hard cases easier.

I'm not against abstraction, of course - just the premature triumphalism that picking an 'easy things become easy' approach gets you, especially if you don't solve the hard things.


Yes. However, doesn't it make sense that it's easier at least to focus on the hard things when you don't have to worry about the notational clumsiness that comes from a lack of first-class functions, etc.?


Even as a Haskell programmer, I don't really agree with this particular example. This kind of parallelization is extremely easy in C++ and offers a lot of control if you need to do more complicated things:

http://en.wikipedia.org/wiki/OpenMP

I think it's just that a more complicated example is needed to really show the advantage of the first-class functions and such.


> But anything that can be solved by map-reduce wasn't enormously hard to begin with.

Depends on your definition of hard. Algorithmically, yes the tasks are relatively simple, but try programming an 80,000 node MPI/PVM cluster to run general jobs and you'll get a very different definition of "hard".


Also, Fortran has long been used for parallel programs (on mainframes). Guy Steele is working on Fortress, and he has some interesting insights on the issue.

But I do think that a practical solution will have to rethink not only programming languages, but the concept of programs. It will be made by a young and naive tinkerer, trying to solve a specific problem on actual parallel hardware, not by guys who have been entrenched in the field for decades. So it won't happen til that concrete technology (eg. 100 cores) is widely available. While there may be similarities with functional and object oriented programming (Alan Kay's version), it won't be based on them.


In practice, as someone who has spent the last 18 months writing MR code at Quantcast, the difficult part of writing map reduce programs is taking algorithms that need communication/global knowledge and either (1) changing the algorithm to require only a handful of passes, each with partial knowledge, or (2) creating a probabilistic algorithm that solves an approximation of your problem and can be computed with only partial knowledge. No one says otherwise.

But MapReduce does, IMO, do one thing that MPI doesn't: the easy things are easy. Hard stuff is still hard, but easy things are easy. If you have a low-communication algorithm that requires only local knowledge and fits into the MR paradigm well, Hadoop + HDFS makes it really simple to write and reliably execute, including automatically handling things like (i) worker nodes dying, (ii) sorters dying, (iii) storage nodes dropping out, and (iv) whole steps disappearing and having to be rerun.


> Correction: The last time I used FORTRAN was 27 years ago. Apparently it got functions.

FORTRAN had user-defined functions since FORTRAN II in 1958; see http://archive.computerhistory.org/resources/text/Fortran/10... on page numbers 5, 14, and 15.

Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can use functions as values in them (anonymous inner classes in Java) but they aren't closures. And his comment about automatically parallelizing "map" is a little off-base; if you take some random piece of code and stick it into a transparently parallel "map", you're very likely to discover that it isn't safe to run multiple copies of it concurrently, which is why languages like Erlang have a different name for the "parallel map" function. The "map" in MapReduce is inspired by the function of that name in Lisp and other functional languages; it isn't a drop-in replacement for it.

As usual, while Joel's overall point is reasonably accurate, most of his supporting points are actually false to the point of being ignorant nonsense. I think someone could tell as good a story in as entertaining a way without stuffing it full of lies, although admittedly my own efforts fall pretty far short.


You're nitpicking about details that are not relevant to the article at all.

His point was that languages that treat functions as first-class citizens enable a class of abstractions that are tedious/impossible in languages which don't support them.

Whether the naive implementation of map can be easily parallelized across multiple machines/threads/processes is not relevant. In fact, he even says that you need a 'super genius' to write the scaling map and reduce operations.


> Whether the naive implementation of map can be easily parallelized across multiple machines/threads/processes is not relevant. In fact, he even says that you need a 'super genius' to write the scaling map and reduce operations.

I think this brings out why I wasn't just "nitpicking." Even though you're presumably pretty savvy, Joel's article has confused you. You're thinking that the easy parallelization is an attribute of the implementation of map, and that a sufficiently smart implementor could make it work.

In fact, the implementation of parallel map is pretty irrelevant here. The important thing is that the function provided to map create no inter-loop dependencies. And given that, you don't need to be a super-genius to implement MapReduce. The super-genius was in coming up with the interface constraints in the first place and recognizing that they were applicable to a large class of problems.
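As a toy illustration of that constraint (a hedged Haskell sketch using the parallel package, not part of the argument above): when the mapped function is pure and each element is independent, evaluating the elements in parallel cannot change the result:

  import Control.Parallel.Strategies (parMap, rpar)

  -- No iteration reads another iteration's result, so the elements
  -- can be evaluated in any order, or all at once.
  squares :: [Int] -> [Int]
  squares = parMap rpar (\x -> x * x)

  main :: IO ()
  main = print (squares [1 .. 10])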

So there are two problems with these "details that are not relevant to the article".

One is that Joel is presenting these details in order to convince you of his point of view. But since they aren't actually true, any confidence they inspire in you is misplaced. You'd be better off with a bald assertion of the importance of functional programming, backed up with "trust me, I know better than you." Admittedly, that's not as much fun to read.

The second problem is that, although Joel has achieved his goal of convincing many people of his point of view, he has done damage to their models of the world in the process. I fully expect to see newbies for years believing that you could never possibly implement MapReduce in C or C++, or that Java requires you to create a whole new file instead of an anonymous function, or that Google Search works by invoking a MapReduce job. And these false beliefs will do real damage to their lives.


But isn't that point pretty obvious?


Perhaps to you. Perhaps not to someone who programs exclusively in Java or VB.


Just to complement your point, try to do recursion in 80s BASIC.


> Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can use functions as values in them (anonymous inner classes in Java) but they aren't closures

I don't at all think he completely misses the point.

First, anonymous inner classes do capture their lexical environment, and you can access outer variables provided they are declared as final.

Second, even if closures allow another bunch of programming techniques, and even if they seem a natural way of programming with anonymous functions, they are not needed for everything related to being able to pass functions as arguments, at all. Notably, every example joel has given in his article doesn't rely on functions capturing their outer lexical environment. mapping an add function to an array doesn't, and is still a very useful programming technique.

Admittedly though, i really prefer my languages to support full lexical scoping and closures. But i don't think the article was about that, and i feel more like you missed what joel had to say because you think joel should have introduced the full notion of what a first class function is.


> First, anonymous inner classes do capture their lexical environment, and you can access outer variables provided they are declared as final.

Well, I admit I haven't written much in Java in years, but doesn't that mean they only capture a small part of their lexical environment --- an immutable part? And that means you can't write jQuery.each in Java, right? Any time you want to write a loop with data dependencies between iterations, you need to use some mechanism other than just mutation of variables from the outer scope?

> every example joel has given in his article doesn't rely on functions capturing their outer lexical environment. mapping an add function to an array doesn't, and is still a very useful programming technique.

You are absolutely correct. Thank you. (That means you can do it in C with almost no trouble.)

> i feel more like you missed what joel had to say because you think joel should have introduced the full notion of what a first class function is.

See my other comment about "nitpicking" for my commentary on that.


> Well, I admit I haven't written much in Java in years, but doesn't that mean they only capture a small part of their lexical environment ...

They capture only the immutable variables. Whether that's a good or bad thing in itself is open to discussion, but that's clearly not the only problem with anonymous inner classes. Most people end up never using them where you would have passed a function just because it's so long and boilerplatey to do so.

> You are absolutely correct. Thank you. (That means you can do it in C with almost no trouble.)

Yeah that's right, and part of why i consider C a less retarded language than java.

> See my other comment about "nitpicking" for my commentary on that

I read it, didn't really understand where you were going. You're first saying that joel is incorrect about some details in his article, but i've yet to see which ones. I see him overgeneralizing, sure. The point you then develop is that the fact that he is over-generalizing will harm newbie programmers.

This has been discussed a thousand times before in other contexts, but suffice to say that i profoundly disagree with that kind of statement, and that i find it weak to even go there.

First, you have no real proof of that, so even mentioning it, saying you are sure of it, is weak. You're basically saying "what you've written is gonna harm people", which is a pretty serious accusation in my opinion, without backing it with any facts at all.

Second, i really don't believe that even if it is true and you can prove it, the responsibility should be on the writer. No individual has perfect knowledge of his field; that's why it is your responsibility as an engineer to question the facts, and to demand hard data before you say something is true or false. If you fail to do that you have bigger problems than what Joel Spolsky is writing on his blog.

In the end, to explain more generally why i find your critique quite unfair: while i agree that Joel is over-generalizing, and that things are far from being as simple as he says they are, the general idea, which i'm gonna sum up as "having first class functions should be a basic building block of your language", is not only valid, but seriously needs to spread to the whole programming community.

It was because of such an article that i learnt functional programming in the first place. It probably wasn't this article, but it wasn't a lot better, maybe even worse. It doesn't mean that i think functional programming is the silver bullet for parallelism.


Well, java is retarded, so the standard trick is to declare an array of size one and declare the array itself as final. So you can't change the array object but you're certainly free to change the object it contains. Stupid? As all hell. Makes static local classes much closer to closures, though, even if twenty times as lengthy to write.


"Joel unfortunately completely misses the point of why C and Java suck at this stuff: you can define anonymous functions in them (anonymous inner classes in Java) but they aren't closures."

You can't define anonymous functions in C.

And I wouldn't "attack" Joel for "stuffing the article full of lies". It's more like "abstracting away the details". When people talk of the benefits of some practice or programming paradigm, they don't always mention all the work going into it - they just explain the concept. That's what a good teacher does, IMO. He takes complex concepts and explains the important parts.


> You can't define anonymous functions in C.

You are of course correct. I have corrected the parent post.

My point is, though, that you almost can't define anonymous functions in Python, either, but you can do nearly all of these clever patterns. The difference is that Python functions are closures.

> And I wouldn't "attack" Joel for "stuffing the article full of lies". It's more like "abstracting away the details".

I agree that that's what a good teacher does, but I don't see this article as doing that.


Anonymous inner classes in Java actually are closures. The compiler will force you to declare any closed-over local variables as final. Like everything else in Java, it's verbose but it works. The problems with using closures in Java are that the type system doesn't play well with function objects and that the standard library doesn't care about functional programming, and it's really only the former that stops you from doing functional programming in Java.


There's a great talk on parallelism by Guy Steele: http://www.infoq.com/presentations/Thinking-Parallel-Program...

Notice that you need different data structures.


He also misses the point that FORTRAN compilers are bloody good at finding and exploiting opportunities for parallelization. You have much less need of these higher level language constructs if the compiler is doing it for you under the hood! On VAX this was as simple as the /PARALLEL flag at compile time...


You're missing it. Here's what Joel is saying without saying it: C# 4 is "that language." Both his companies are built on the MS stack.


Given that the article is from 2006 that'd be some prescient writing.

Also at the time of this article they had their own custom language Wasabi. This article explaining Wasabi is dated a month after the original article:

http://www.joelonsoftware.com/items/2006/09/01b.html

It's based on VBScript, not C#.


C# 4.0 didn't exist when the article was written in 2006.


While you need a lot of syntactic bullshit to get there, Java does have closures.


By the time I am done with the syntactic BS, I pretty much forgot what I was doing and why...

Java is just not very conducive to exploratory programming.


I think it might be important to note that while the terms map and reduce do come from Lisp, they're not one-to-one with what these functions do in Lisp. The original MapReduce paper mentions the borrowing, but doesn't really go into specifics. There's a good paper by Ralf Lämmel that describes the relation that MapReduce has to "map" and "reduce" at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.104.... . I liked this paper much better and found it the most informative functional explanation to MapReduce (note, it's in Haskell).

I think MapReduce is really part of a more general pattern where you have an (in more Haskell-y terms) unfold (anamorphism) to a foldr (catamorphism). If your operations on the items in your intermediate set of data in the MapReduce are associative/commutative, you can work out parallelization more or less for free. It's pretty cool stuff, and really not that complicated when you sit down and think about it.
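A rough sequential model of that shape (an illustrative Haskell sketch; the names are mine, not Google's API): each record unfolds into key/value pairs, the pairs are grouped by key, and each group is folded down independently, which is where the per-key parallelism comes from:

  import qualified Data.Map as Map

  -- "map" step: unfold one input record into key/value pairs
  -- "reduce" step: fold all the values collected for one key
  mapReduce :: Ord k => (a -> [(k, v)]) -> (k -> [v] -> r) -> [a] -> [(k, r)]
  mapReduce mapper reducer input =
      [ (k, reducer k vs) | (k, vs) <- Map.toList grouped ]
    where
      grouped = Map.fromListWith (++) [ (k, [v]) | (k, v) <- concatMap mapper input ]

  -- a word count as the canonical example
  main :: IO ()
  main = print (mapReduce (\doc -> [ (w, 1 :: Int) | w <- words doc ])
                          (\_ ns -> sum ns)
                          ["to be or not to be"])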


Wow. Thank you. I wish I could upvote you more. I have never looked into Google's "MapReduce" but if what you say is true then my assumptions about it were completely wrong. I assumed it was 1-1 with map and reduce. But if it is generating a set of unfolds into "essentially" a fold (is it? I only skimmed the first pages due to time) then that is a very important detail and makes the name a bit of a misnomer. Why don't people make a bigger deal about this - unfold/fold is not harder, but it requires a different mindset and approach than map/fold.

If you are approaching it thinking it is just a map and fold then you are going in disadvantaged and will have to unlearn that to properly leverage something that is actually even more impressive/useful/powerful than a mere fold o map.


This article shows that Javascript is truly the common man's functional programming language. Despite its ugliness, it got lambdas/anonymous functions right.


As right as you can get them in an imperative programming language, perhaps, but the semantics of closing over environments containing mutable variables is still confusing to even the smartest of people (cf. [1]; it's about Python, but JavaScript has essentially the same nature in this regard).

[1] http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/


Lua's ability to bind a closure to a mutable variable is something I use all the time. But due to the way loop variables are handled, the example that's "broken" in Python works "as expected" in Lua:

  fns = {}
  
  for i=1,10 do
    fns[i]= function() return i; end
  end
  
  for i=1,10 do
    print("fns[i]",fns[i]())
  end
This produces:

  fns[i] 1
  fns[i] 2
  fns[i] 3
  fns[i] 4
  fns[i] 5
  fns[i] 6
  fns[i] 7
  fns[i] 8
  fns[i] 9
  fns[i] 10
If you change the set-up loop to look like this, though:

  local j
  for i=1,10 do
    j=i
    fns[i]= function() return i,j; end
  end
...then it prints:

  fns[i] 1 10
  fns[i] 2 10
  fns[i] 3 10
  fns[i] 4 10
  fns[i] 5 10
  fns[i] 6 10
  fns[i] 7 10
  fns[i] 8 10
  fns[i] 9 10
  fns[i] 10 10
In that case, it has bound the mutable "j" to each of the 10 closures, though you can see it's created a new binding for i on each pass through the loop. It WILL create a new instance of the mutable "j" each time "local j" is executed, though, so if the above code is in a function or another block that's executed more than once, then each time those functions will be bound to a new "j".

EDIT: Get the code markup right.


Not ECMAScript Harmony! [1] And no, that isn't the definitive semantics of closing over mutable variables. It is one out of at least two possible semantics.

[1] http://brendaneich.com/2010/11/paren-free/ -- see the "Implicit Fresh Bindings" section.


Harmony's new for in construct implicitly introduces a fresh let binding per iteration, yes, but it doesn't fundamentally change the language semantics with regard to the kind of example Eich mentions. So while one can, in Harmony, write code that behaves in the way that people like Andrej Bauer expect, it doesn't mean that the language's closure semantics have suddenly changed.

It's also hard for me to see how one could have a 'definitive' semantics without having at least one alternative—not that I claimed that either JS's closure semantics were the only ones going. Presumably the alternative you mean is where the variables present in the local environment are copied on creation of a closure.

I'd also resist the idea that I claimed that JavaScript's closure semantics were in some way definitive, although I can see how one might infer that from what I did write. That being said, given the potential problems with the alternative, I would claim that they're a reasonable choice.


> it doesn't mean that the language's closure semantics have suddenly changed.

Yes, but what are the issues with fresh let-bindings per iteration?

> Presumably the alternative you mean is where the variables present in the local environment are copied on creation of a closure.

No, that's bad -- it leaves you unable to show that objects with mutable state are reducible to closures.


Something that I only realised the other day, which made me feel kinda embarrassed: in Ruby, the reduce method is called inject.

For years I've been doing MapReduce functions, without realising it. MapReduce was in my mental pile of "genius things that cleverer people than me do, must be looked into when there is time."

For info on inject: http://blog.jayfields.com/2008/03/ruby-inject.html


In Ruby the reduce method is also called reduce:

    [1,2,3].inject{|s,i| s += i}
    => 6
    [1,2,3].reduce{|s,i| s += i}
    => 6


Reduce is an alias for inject, and depending on which template you use to generate the docs, method aliases are not given the same prominence as full methods. Some templates simply say "this method is also known as..." underneath the original method in a tiny font.

If it were the other way around, and inject were an alias for reduce, I think there would be a lot fewer confused Rubyists.


I wrote about these ancient ways of great languages a while ago: http://metaphysicaldeveloper.wordpress.com/2009/05/02/closur...


What would it do on an empty list? It should probably take an argument for the empty case...

What is the history behind the name "inject" here?


The name comes from Smalltalk. I imagine that the idea is that you're "injecting" an operator between items in the list: 1 2 3 4 becomes 1 + 2 + 3 + 4.


By default, inject uses the first element of the array as the argument, so it would use nil on an empty list. It optionally takes an argument to kick off the process.
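For comparison, a Haskell sketch of the same two behaviours (the analogy isn't exact: Ruby returns nil where foldl1 raises an error on an empty list):

  sumInject, sumSeeded :: [Int] -> Int
  sumInject = foldl1 (+)    -- like inject with no argument: errors on an empty list
  sumSeeded = foldl (+) 0   -- like inject(0): returns the seed on an empty list

  main :: IO ()
  main = print (sumInject [1, 2, 3], sumSeeded [])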


Most likely Ruby takes the name inject from Smalltalk. (Smalltalk also uses 'collect' for map, 'select' for filter, 'reject' for an inverse filter.)


Thank you for pointing out the history! The names inject, collect, select, and reject drive me crazy for some reason. Maybe it's because they rhyme. But every time I saw them, I wondered why Matz made these up instead of using map/filter/reduce. At least now I know.


A nice relevant post from raganwald - http://weblog.raganwald.com/2008/02/1100inject.html


Thank you for your nice words. I would have suggested this one, although it has been discussed ad nauseam on HN in the past:

http://weblog.raganwald.com/2007/03/why-why-functional-progr...


Even when I'm working in a language that doesn't have first class functions, I find it easier to lay out my code by writing functional pseudocode and then "unrolling" maps into loops, closures into structs/objects, compositions into a sequence of calls, etc. It probably leads to idiomatically awful Java, but I find it easier to read and write, and nobody else needs to deal with my code. So...


Yup, any language I've worked with, including Java and C, can do that just fine. They just spread the verbosity differently. Organizing large projects is a pain in JavaScript, trivial in Java. Using anonymous functions is a pain in Java, trivial in JavaScript.

(Not so) fun fact: The public MapReduce services by Google and Amazon do not (directly) support JavaScript.


He gives reduce as an example of "purely functional programs have no side effects and are thus trivially parallelizable", but reduce by definition is not trivially parallelizable.


It is when your reduce operation is associative. (Example: sum, or --- less trivially --- merge sort)
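To sketch why (illustrative Haskell; `parReduce` is a made-up name): an associative reduce can be split into halves that are reduced independently and then combined:

  import Control.Parallel (par, pseq)

  -- Splitting regroups the operations: ((a+b)+c)+d becomes (a+b)+(c+d).
  -- Only associativity guarantees both groupings give the same answer.
  parReduce :: (a -> a -> a) -> a -> [a] -> a
  parReduce _ z []  = z
  parReduce _ _ [x] = x
  parReduce f z xs  = left `par` (right `pseq` f left right)
    where
      (ls, rs) = splitAt (length xs `div` 2) xs
      left     = parReduce f z ls
      right    = parReduce f z rs

  main :: IO ()
  main = print (parReduce (+) 0 [1 .. 1000 :: Int])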


Reduce is trivially parallelizable if you can ensure your operation is associative, but you have to work in languages like Epigram and Agda to guarantee that.

(retract "he's wrong")


No, you just have to make sure the function you pass is associative.

Only in really strongly typed languages like Agda can you make the associativity constraint /part of the type/ of the function, and thus statically checked by the compiler; but it's perfectly reasonable to make it part of the specification of the function's behavior that it expects its argument to be associative. (Or, slightly more generally, that the order of reduction is unspecified; in which case only an associative function will produce a totally deterministic result.)


Even when you can't have a _proof_ of associativity in the type, it's perfectly reasonable to document the associativity requirement in types -- e.g. the Monoid typeclass in Haskell.
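For instance (a small sketch): Haskell's Monoid class doesn't prove associativity, but it lets a reducer state the requirement in the type it asks for:

  import Data.Monoid (Sum (..))

  -- mconcat's contract assumes (<>) is associative, so any grouping of
  -- the list -- including a parallel one -- must give the same result.
  total :: [Int] -> Int
  total = getSum . mconcat . map Sum

  main :: IO ()
  main = print (total [1 .. 10])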


Good point.

On a related note, I would like to see a "perverse" compiler and library which exhibits random behavior when undefined behavior is specified by the language.


Has anyone else just read this and realised they need to go off and learn some form of functional programming? I ignored it for such a long time because I felt it wasn't relevant to my current situation, but I was wrong. You gain some incredible fundamental knowledge that you would otherwise be completely oblivious to.

Is lisp really the way to go though?


I would recommend Haskell over Lisp, because Haskell will force you to do functional programming, whereas Lisp is less rigid in that regard. In addition, the #haskell IRC channel on FreeNode is filled with people who have infinite patience with Haskell newbies, and it has an excellent introductory text at http://learnyouahaskell.com/

About two years ago I decided to learn Haskell. Had about 4 false starts and didn't "get" it until my 5th attempt, at which point I figured out what the type system meant and had a bit of a Matrix "I know kung fu!"-moment. Two years of hanging around in #haskell later, I know a billion more things about programming/programming languages and learned type theory (although I still don't grok all conversations there :p).

Some other comments on the usefulness of Haskell:

http://news.ycombinator.com/item?id=1145743

http://news.ycombinator.com/item?id=1145743


You may also have luck learning FP with Erlang. It's also functional, but approaches it from a different direction, focusing on reliability and concurrency rather than types and purity. Between Haskell and Erlang, one of them will probably fit your mind better.

I started FP with OCaml, but OCaml makes it too easy to continue writing imperative code when you're actively trying to learn new idioms.


I agree Erlang is also very interesting. I think some of the features look absolutely fascinating (hot updatable code!), etc. I just haven't found the time to learn it yet.

I wish Erlang's syntax was more like Haskell, though :\


Your two links are identical..


My bad the other one was supposed to be http://news.ycombinator.com/item?id=2322920

Bit of a copy and paste fail...


Javascript is just as good for learning the concept of functional programming. It's often (jokingly) called Scheme with C-like syntax. Unlike lisps though, it doesn't give you macros, which are even more mind-boggling than first-class functions.

Ruby also has decent fp-semantics, but they are not as clearly cut as in Javascript.


Learn Scheme in 20 minutes, because it's that simple: http://www.youtube.com/watch?v=GAHA0nJv8EA


Try javascript and/or coffeescript. Valuable skills!


I hear Lisp is pretty sweet. Haskell as well; there's a lot going on there. You can program in a very functional style in Ruby or JavaScript if you want something with a more active community, it's just that the resources you find on those languages won't be so targeted at FP, because it's not their defining feature.


I would recommend OCaml over Lisp, once you've had type inference, there's no going back.


You can try Scala (especially if you know Java).


I initially tried Scala when getting back to using a language on the JVM. While Scala boasts less verbosity than Java, I felt like the syntax became more complex than what I was accustomed to and slowed my progress, if you can believe that. Java was the first real OO language I learned, and then I moved on to Ruby. Currently working with C. I think in pursuing functional programming I'll probably go with something pure like Haskell, ML, or Lisp.


Yes


FTA:

The very fact that Google invented MapReduce, and Microsoft didn't, says something about why Microsoft is still playing catch up trying to get basic search features to work

I don't believe this is true, and that's easy to prove: there was parallelism of SELECTs in SQL Server 2000. So there is a part of MS that is perfectly happy with the idea, even if another bit of MS isn't. They just need to talk more...


Yes, it can! Scala:

  def Cook(i1: String, i2: String, f: String => Unit) {
    println("get the " + i1)
    f(i1)
    f(i2)
  }
  
  Cook("lobster", "water", x => println("pot " + x))
  Cook("chicken", "coconut", x => println("boom " + x))

  List(1,2,3).sum
  List(1,2,3).mkString
  List(1,2,3).map(println) // or more verbose List(1,2,3).map(x => println(x))


Today I tried to explain to someone what exactly boost::mpl::fold does and how it is supposed to be used (for those unfamiliar: boost::mpl is a collection of compile-time metaprogramming mechanisms for C++).

It took me a while to realize that the person I was explaining it to had only minor problems with the templates and compile-time part but close to no idea what a fold or a lambda are.

Not knowing some basics of functional programming can keep a person from understanding so many different things, and I have encountered such gaps in different fields (e.g. explicitly in formal semantics, or implicitly in different theories of morphology).

I think the real point here is that different paradigms offer you new views onto the world and enhance your understanding, all the programming-language things aside.
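For reference, the "fold" in question is just this recursion scheme (sketched here in Haskell with a made-up name; boost::mpl::fold applies the same idea to a compile-time sequence of types):

  -- thread an accumulating value through a list, one element at a time
  myFold :: (b -> a -> b) -> b -> [a] -> b
  myFold _ acc []       = acc
  myFold f acc (x : xs) = myFold f (f acc x) xs

  main :: IO ()
  main = print (myFold (+) 0 [1, 2, 3, 4 :: Int])   -- (((0+1)+2)+3)+4 == 10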


I think this could also be called "can your brain think like this?"... Many programmers shy away from thinking at a massive scale and gravitate toward problems with similar, familiar codebases.


Programmers are perfectly willing to think this way if you disguise it like this: http://api.jquery.com/each/

People are put off by esoteric and academic-sounding names like "first-class functions". But people will adopt it if you put it in a practical context.


It might even be the other way round. I have a (slightly) harder time teaching first-year students how to use the for loop in JavaScript than how to use anonymous functions in jQuery. I think it's a better match for how the brain works.


Of course. But I'm thinking about it in terms of big data. A lot of us are and will be increasingly dealing with sets of data that should be thought of in parallel solutions. Not just a single process.


Does anyone else think it's a travesty that the AP Computer Science curriculum is taught in Java? Java was my first programming language and I've spent the past 8 years trying to unlearn most of it.


Java's not too bad for educational purposes; at least it has a rigid type system and checked exceptions, both of which make it harder to ignorantly slam some code together and nudge it until it works.


"ignorantly slam some code together and nugde it until it works"

haha pretty sure I still do that most of the time...followed by rewriting everything and calling that "refactoring"


I remember my days as a CS student.

The single most mind-opening course I took was functional programming, where I learned LISP and Prolog.

That knowledge today is crucial as it deeply changed my mindset when tackling most any problem.


It's funny that he mentions Google as an example of a company that gets the paradigm, but most of Google is C++ and Java. C# has better functional paradigm support than both of those.


Yes, the additions that were made to support LINQ make C# (imho) the most advanced production language today. By advanced I mean it's fully supported and everything works as intended. It's not academic. I can use advanced features today for real work and get full IDE support and good documentation.

I just built a search engine. I was able to make it massively parallel with some trivial parallel LINQ code. The power of LINQ and the parallel extensions is just amazing.


The Google Search engine was at some point written in Python.


The crawler was in Python, not the search engine itself.

http://infolab.stanford.edu/~backrub/google.html

"Both the URLserver and the crawlers are implemented in Python."


Do you have a source for this? I'd love to read more about it; I knew they had python in use in some places, but this is the first I've heard that it was used in their search engine.


I don't like the way inline functions hurt the readability of code. Is there anything out there that solves that issue?

Also, I haven't had an excuse to use it yet, but F# seems to have great syntactic sugar for parallelizing things in a more natural way than the typical map reduce.


Inline anonymous functions don't have to look so funny. Here's an example in coffeescript that takes no arguments and performs a quick action: https://github.com/dpritchett/chatbox/blob/master/public/cha...

Explanation: after the 500ms textbox fadeIn finishes, it fires an anonymous callback function which calls a 3000ms fadeOut.


I implemented these examples in Ruby 1.9, would love to know if there are more efficient ways of doing some of these:

    def cook(p1, p2, f)
      puts "get the " + p1.to_s
      f.call(p1)
      f.call(p2)
    end

    cook( "lobster", "water", lambda {|x| puts "pot " + x })
    cook( "chicken", "coconut", lambda {|x| puts "boom " + x })

    @a = [1,2,3]
    @a.map {|x| puts x*2}
    @a.map {|x| puts x}

    def sum(a)
      a.reduce(0) do |acc, x|
        acc + x
      end
    end

    def join(a)
      a.reduce("") do |acc, x|
        acc.to_s + x.to_s
      end
    end

    puts "sum " + sum(@a).to_s
    puts "join " + join(@a)


Here's one more efficient way... use perl!

  #!/usr/bin/perl
  
  use Modern::Perl;
  use List::Util 'reduce';
  
  sub cook {
    my ($i1, $i2, $f) = @_;
    say "get the $i1";
    $f->($i1);
    $f->($i2);
  }
  
  cook "lobster", "water",   sub { say "pot "  . shift };
  cook "chicken", "coconut", sub { say "boom " . shift };
  
  my @a = (1, 2, 3);
  
  map { say $_ * 2 } @a;
  map { say $_     } @a;
  
  sub my_sum {
    reduce { $a + $b } 0, @_;
  }
  
  sub my_join {
    reduce { $a . $b } "", @_;
  }
  
  say "sum "  . my_sum(@a);
  say "join " . my_join(@a);


The moment I read this, I immediately thought of the work I performed on Cray's Chapel parallel language. Chapel has an elegant way of expressing functional parallel code like this that is much more difficult to write in Unified Parallel C and High Performance Fortran. In fact, one Google search later and I found a student's presentation on Chapel and MapReduce.

http://www.slidefinder.net/l/l22_parallel_programming_langua...


Can Xcode 4 compile code using Objective-C blocks into iOS 3.x-compatible binaries? This article made me realize how much I miss anonymous functions/lambda expressions from C# and JavaScript.



WOW, welcome to the year 1959.


We don't have coroutines yet. Maybe in another 5 years JavaScript will make it all the way to 50 years ago.


I think this article was my first introduction to functional programming.

Yea, don't look at me like that. My university mostly taught us Java/C++; we only did functional programming in one course.


"Look! We're passing in a function as an argument. Can your language do this?"

Umm, yes it can....

  #!/usr/bin/perl

  sub cook_time { ($hours, $min) = @_; $result = "$hours hours and $min minutes\n"; return $result; }

  sub animal { $animal = shift; return $animal; }

  sub cook_animal { ($get_animal, $get_time) = @_; return "Cook $get_animal for $get_time"; }

  print cook_animal(animal("cow"), cook_time(5, 23));


Last time I used FORTRAN was all of 11 months ago. Thank god I've moved on to O-O and can actually declare functions.


i'm smarter for reading this. need more.


Joel is a good writer. Especially his early stuff is golden. Go read some of it.


Yes, Go lets me do this. Though I don't like passing anonymous functions too much, as the code becomes hard to read rather quickly.


ahhh... Joel at his best. great piece of writing. and a gem about programming languages and abstraction.


So are we supposed to be impressed by closures now? Or are we supposed to be impressed that the "great" Joel Spolsky (an ex-manager on the Excel team!!!!) writes about them?



