I'm becoming a stronger and stronger advocate of teaching command-line interfaces even to novice programmers...it's easier in many ways to think of data being worked on by "filters" and "pipes"...and more importantly, every time you try a step, something happens, which makes it much easier to iterate interactively through a process.
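A minimal sketch of the filters-and-pipes idea: each stage is a small program that does one thing, and you can run the pipeline one stage at a time to see something happen at every step. The log file and its format are hypothetical, fabricated here just for the demo.

```shell
# Fabricate a tiny sample log (method + path + status per line).
printf 'GET /a 200\nGET /b 404\nGET /c 200\n' > /tmp/access_sample.log

# Count requests per status code: extract -> sort -> count -> rank.
# Try each stage alone (stop after "sort", say) to inspect intermediate data.
awk '{print $3}' /tmp/access_sample.log | sort | uniq -c | sort -rn
```

Because every stage reads stdin and writes stdout, swapping or dropping a filter is a one-edit experiment rather than a rewrite.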
That it also happens to be very fast and powerful (when memory isn't a limiting factor) is nice icing on the cake. I moved over to doing much more on the CLI after realizing that something as simple as "head -n 1 massive.csv" to inspect the headers of corrupt multi-GB CSV files made my data-munging life substantially more enjoyable than opening them in Sublime Text.
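For the CSV-peeking trick, a small sketch: `head` only reads the first block of the file, so file size is irrelevant. The "massive.csv" name is a stand-in; a tiny sample is fabricated here so the commands are runnable.

```shell
# Stand-in for a multi-GB file; only the header matters for this demo.
printf 'id,name,amount\n1,alice,10\n' > /tmp/massive_sample.csv

# Grab just the header row -- constant time regardless of file size.
head -n 1 /tmp/massive_sample.csv

# Same header, one column name per line, easier to scan for wide files.
head -n 1 /tmp/massive_sample.csv | tr ',' '\n'
```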
A few years ago between projects, my coworkers cooked up some satirical "amazing Web 2.0" data science tools. They used git, did a screencast, and distributed it internally.
It was basically a few compiled perl scripts and some obfuscated shell scripts with a layer of glitz. People actually used it and LOVED it... It was supposedly better than the real tools some groups were using.
It was one of the more epic work trolls I've ever seen!
Your CSV-peeking epiphany was in essence a matter of code vs. tools, though, rather than necessarily CLI vs. GUI. On Windows you might just as well have discovered that you could fire up LINQPad and enter File.ReadLines("massive.csv").First(), for example.
In a real production environment, that command line would be put into a script parameterized with named variables, and the embedded awk scripts would be changed to here-docs.
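A sketch of what that production-izing step might look like: named variables up top instead of positional magic, and the awk program held in a here-doc rather than a cramped inline string. The file name, column number, and data are all hypothetical.

```shell
#!/bin/sh
# Named parameters with defaults (hypothetical input file and column).
INPUT=${1:-/tmp/sales_sample.csv}
COLUMN=${2:-2}

# Demo data so the script is self-contained.
printf 'a,10\nb,20\na,5\n' > /tmp/sales_sample.csv

# The awk program as a quoted here-doc: multi-line, no shell-escaping
# headaches, and the shell never expands anything inside it.
PROG=$(cat <<'AWK'
{ sum += $col }
END { print sum }
AWK
)

# Pass the column in via -v rather than interpolating it into the program.
awk -F, -v col="$COLUMN" "$PROG" "$INPUT"
```

Keeping the awk source out of the shell's quoting rules (the quoted `<<'AWK'` delimiter) is the main win: the script stays readable as it grows.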
Sounds good, although at that point it's just programming, and there are tools that are cleaner, faster, and more robust than piping semi-structured strings around from a command line.
The one real benefit that can be argued is ubiquity (on *ix). Not every system has Perl, Python, or Ruby installed - or Hadoop for that matter - but there's usually a programmable shell and some variant of the standard utilities that will get something done in a pinch. If it happens to be 200x faster than some enormous framework, so much the better.
The code you're replying to was carefully and correctly written. You just replied as if you knew how it works so you could look like you know what you're talking about.
If you're unlucky, someone who actually knows how File.ReadLines() works will show up in an hour or two and explain that it's lazily evaluated.