I believe the point is to be a bit more instructive about how to build a pipeline. Also, building such solutions iteratively and quickly often leads to this kind of "inefficiency", but it makes each step easier to reason about. Besides, the awk step might have been factored out by the end, so it wouldn't have made sense to optimise it early. And by the time the author reaches the end, the pipeline is IO-bound, so there's not much need to optimise further (in the context of the exercise).