The post you responded to is merely giving evidence that the GP's overall claim "Any of these text problems where you're tying to optimize over a whole corpus is kind of not hard to see to be NP-complete" is sometimes not true in surprising ways -- Knuth conjectured that there was no linear time algorithm for longest common substring (but then suffix trees came along).