This is an amazing dataset. I did some work with it in 2012 at a hackathon, in particular linking it to Wikipedia entities. I used Wikipedia redirects as drug synonyms and automatically extracted drug names and linked them to their Wikipedia pages, which then offered more structured info on the drugs (e.g. side effects). From that, I did some basic NLP to identify disease pages that mentioned those drugs, thereby linking each drug to the diseases it can be used to treat for further analysis. The next step, which we didn't do much with, was linking it to the British National Formulary - the UK clinical guidelines on drug use (a digital version wasn't readily available then). You can see some of those wiki datasets on my website (http://www.stewh.com - I'm on a train so can't get the exact URL, but you should find it in the menu). I had more datasets and the MapReduce code for working with the dataset and wiki dumps if anyone could use it.
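For anyone curious what the redirect-as-synonym idea might look like, here is a minimal Python sketch. It is not the original hackathon or MapReduce code: the redirect pairs, page texts, and function names (`build_synonym_index`, `find_drugs`, `link_diseases`) are all made up for illustration, and plain case-insensitive string matching stands in for the "basic NLP" step.

```python
import re
from collections import defaultdict

# Hypothetical (redirect_title, target_title) pairs, of the kind one
# might parse out of a Wikipedia dump. Sample data only.
REDIRECTS = [
    ("Tylenol", "Paracetamol"),
    ("Acetaminophen", "Paracetamol"),
    ("Advil", "Ibuprofen"),
]


def build_synonym_index(redirects):
    """Map every lowercased alias (and the canonical title) to the canonical title."""
    index = {}
    for alias, target in redirects:
        index[alias.lower()] = target
        index[target.lower()] = target
    return index


def find_drugs(text, index):
    """Return the canonical drug titles mentioned anywhere in `text`."""
    # Longest aliases first, so multi-word names win over their prefixes.
    aliases = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(a) for a in aliases) + r")\b",
        re.IGNORECASE,
    )
    return {index[m.group(1).lower()] for m in pattern.finditer(text)}


def link_diseases(disease_pages, index):
    """Map each drug to the set of diseases whose page text mentions it."""
    links = defaultdict(set)
    for disease, text in disease_pages.items():
        for drug in find_drugs(text, index):
            links[drug].add(disease)
    return dict(links)


if __name__ == "__main__":
    index = build_synonym_index(REDIRECTS)
    pages = {
        "Fever": "Often treated with acetaminophen or ibuprofen.",
        "Migraine": "Ibuprofen may help in some cases.",
    }
    for drug, diseases in sorted(link_diseases(pages, index).items()):
        print(drug, "->", sorted(diseases))
```

A mention on a disease page is only a rough signal that a drug treats that disease (the page could just as well say the drug is contraindicated), so the output is better treated as candidate links for further analysis than as ground truth.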