While it is predominantly a statistics language, there is also a huge wealth of data manipulation capabilities in packages and functions like plyr, aggregate, *apply, ave, subset, etc.
Just in terms of organizing data sets, ignoring any statistical analysis, R is fantastic.
I've found Python + Pandas much better in this regard than R. Maybe it's just me, but for grouping, indexing, and manipulating tabular data, Python syntax just makes more sense.
That said, R is better for stats and matrix operations.
They might have borrowed from R.
Wes McKinney admits to being influenced by R, especially data frames... but it makes data analysis all the easier when I can do everything I want within the Python environment.
pandas is proving to have a bit of a steeper learning curve, I must admit, but then the Python environment and native Matplotlib support made life oh so much simpler.
The split-apply-combine framework for dealing with group-by tasks (http://www.jstatsoft.org/v40/i01/paper, not that there aren't other precedents), for one. But generally, Wes has used R to figure out what people want to do, and then ported an elegant interface to Python.
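For anyone who hasn't seen it, here is a minimal sketch of what split-apply-combine looks like in pandas. The data and the column names (`group`, `value`) are made up for illustration, and this is just one way to write it, not necessarily how anyone in this thread does it.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b", "b"],
    "value": [1.0, 3.0, 2.0, 4.0, 6.0],
})

# Split on "group", apply mean and count to each piece,
# then combine the results into one summary table indexed by group.
summary = df.groupby("group")["value"].agg(["mean", "count"])

# transform() returns a result aligned with the original rows,
# which is roughly the role ave() plays in R.
df["group_mean"] = df.groupby("group")["value"].transform("mean")

print(summary)
print(df)
```

The `agg` call gives one row per group, while `transform` keeps the original row count, so which one you want depends on whether you need a summary or a per-row annotation.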
Can you elaborate on data.table being 'a game changer'? I am inclined to agree, but I am just starting to get a handle on it. I am still hesitant, and I switch between sqldf, reshape2, base::merge and data.table more than I would like. Do you think it could become a dominant method for data preparation?
Python has PyTables, which complements Pandas nicely and seems to offer the same sort of features as data.table (note: I've not actually used data.table).
I am using R to analyse and document (knitr and LaTeX) epidemiologic data, which does not involve parsing a lot of text to extract my analysis data set. Data preparation for this type of research involves more combining data from different source tables, restructuring repeated measures, etc. I only know how to do that using R. Can Python be incorporated into the knitr literate programming framework, and is it worth learning another language?
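To give a rough idea of what that kind of preparation looks like on the pandas side, here is a small sketch that combines two source tables and restructures repeated measures from wide to long. The tables, the column names (`id`, `sex`, `bp_visit1`, `bp_visit2`) and the values are all invented for illustration; real study data will of course need more care.

```python
import pandas as pd

# Hypothetical source tables; names and values are assumptions.
patients = pd.DataFrame({"id": [1, 2], "sex": ["F", "M"]})
visits = pd.DataFrame({
    "id": [1, 2],
    "bp_visit1": [120, 135],
    "bp_visit2": [118, 130],
})

# Combine the two tables on the shared key (similar to base::merge in R).
combined = patients.merge(visits, on="id", how="left")

# Restructure repeated measures from wide to long
# (similar in spirit to reshape2::melt).
long = combined.melt(
    id_vars=["id", "sex"],
    value_vars=["bp_visit1", "bp_visit2"],
    var_name="visit",
    value_name="bp",
)

# Turn "bp_visit1" / "bp_visit2" into the integer visit number.
long["visit"] = long["visit"].str.replace("bp_visit", "", regex=False).astype(int)
long = long.sort_values(["id", "visit"]).reset_index(drop=True)

print(long)
```

The result has one row per patient per visit, which is usually the shape needed for repeated-measures models.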