Hi!
Thanks for reading my suggested approach. I can see how map-reduce can process things faster (by processing chunks in parallel and then using reduce to aggregate the results?).
But I can't see how the reduce step (of map-reduce) helps with dropping word-pairs (i.e. in Step 3).
I think it's an imperfect solution, though it can still work well enough in practice. It depends on your reduction strategy, because the drop operation isn't associative, so the order in which partial results are combined changes the outcome. If you do a left fold, the accumulator builds up so much relevance that you quickly start discarding every word coming from the right-hand data set. If you instead split the reduction binarily (a balanced tree of merges), you're much more likely to be OK.
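To make the non-associativity point concrete, here's a minimal Python sketch. The combiner, the threshold, and all the names are made up for illustration -- this is not the implementation I posted, just the shape of the problem:

```python
from functools import reduce

# Made-up combiner: merge two {word_pair: score} maps and drop any pair
# whose accumulated score crosses a threshold. The scoring is illustrative;
# the point is that this combine step is NOT associative, so the shape of
# the reduction changes which pairs survive.
DROP_THRESHOLD = 10.0

def combine(left, right):
    merged = dict(left)
    for pair, score in right.items():
        total = merged.get(pair, 0.0) + score
        if total >= DROP_THRESHOLD:
            merged.pop(pair, None)   # accumulated too much relevance: drop it
        else:
            merged[pair] = total
    return merged

def fold_left(chunks):
    # Left fold: the accumulator carries the full running score for every
    # pair, so pairs contributed by later (right-hand) chunks hit the
    # threshold much sooner and get dropped.
    return reduce(combine, chunks, {})

def tree_reduce(chunks):
    # Binary split: each combine sees partial results of similar size,
    # so scores build up more evenly across the data set.
    if not chunks:
        return {}
    if len(chunks) == 1:
        return chunks[0]
    mid = len(chunks) // 2
    return combine(tree_reduce(chunks[:mid]), tree_reduce(chunks[mid:]))
```

Run both over the same list of per-chunk score maps and you'll generally get different surviving pairs -- that's the "breaking associativity" issue: the left fold discards far more from the later chunks, while the balanced split stays closer to what the map phase intended.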
I used a binary split, and thanks to the sparseness of the data (and the naivety of my algorithm) I didn't run into too many spurious drops. But the implementation I posted is very basic and shows my lack of background in NLP-related pursuits -- I'm lucky it did anything useful at all :)