> It seems like there should be a way to determine whether a regex can be compiled using the classic O(n) DFA algorithm or with whatever madness PCREs use to support backtracking and so on.
Obviously. If the "regex" includes a backreference, it requires backtracking. If it includes only regular operations (I can't call any other nonregular operations that people might expect to mind), it doesn't. This is information that can be trivially surfaced by a parser.
It's backreference free. Backtracking to process it is just bad design.
Here's a state machine that recognizes /^.*\s+$/:
1: initial state (non-matching)
2: matching state
all states transition to 2 on \s, and to 1 on \S. Done.
Taking this a little further, we can observe that since the transition table is identical for every state, there's no real need to have states at all -- you get the same effect by just checking the last character.
Obviously. If the "regex" includes a backreference, it requires backtracking. If it includes only regular operations (I can't call any other nonregular operations that people might expect to mind), it doesn't. This is information that can be trivially surfaced by a parser.