Most simple regular expression evaluators are basically state machines. They hold a few variables:

a) what part of the regex am I currently trying to match

b) what point in the string am I currently starting at

c) how much of the string has this piece of the regex consumed so far

Then the state machine basically has three transitions:

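* if the text at (b+c) matches (a) and (a) can't consume any more, move (a) on to the next piece of the regex, advance (b) past what was consumed, and reset (c)

* if the text at (b+c) matches (a), but more could be consumed, increment (c) and check again

* if the text at (b+c) doesn't match (a), go back to the first piece of the regex, increment (b) and reset (c)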

So yeah, you could put in shortcuts for things like "well, this didn't match because the next piece of the regex is matching on a line ending", but keeping it simple is straightforward and generally smiled upon.
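To make the three transitions concrete, here is a rough Python sketch of that kind of loop. It is only an illustration under my own assumptions, not how any real regex engine is written: the toy syntax (literal characters, `.` for any character, and a trailing `+` for "one or more"), the names `parse` and `search`, and the extra `start` variable (which remembers where the current attempt began, so a failure can restart one character later) are all made up for the example.

```python
def parse(pattern):
    """Split a toy pattern into pieces: (char, repeat) pairs.

    Hypothetical syntax for this sketch only: a literal character,
    '.' for any character, and an optional trailing '+' (one or more).
    """
    pieces = []
    i = 0
    while i < len(pattern):
        repeat = i + 1 < len(pattern) and pattern[i + 1] == "+"
        pieces.append((pattern[i], repeat))
        i += 2 if repeat else 1
    return pieces


def search(pattern, text):
    """Return (begin, end) of the first match found, or None."""
    pieces = parse(pattern)
    start = 0  # where the current attempt began (not in the original a/b/c list)
    a = 0      # (a) which piece of the regex we are trying to match
    b = 0      # (b) where in the string the current piece starts
    c = 0      # (c) how much of the string this piece has consumed so far

    while a < len(pieces):
        ch, repeat = pieces[a]
        pos = b + c
        hit = pos < len(text) and (ch == "." or ch == text[pos])

        if hit and repeat:
            # Matches, but more could be consumed: grow (c) and check again.
            c += 1
        elif hit or (repeat and c > 0):
            # Terminal match: next piece, move (b) past what was consumed.
            a += 1
            b += c + (1 if hit else 0)
            c = 0
        else:
            # No match: restart from the first piece, one character later.
            start += 1
            if start >= len(text):
                return None
            a, b, c = 0, start, 0

    return (start, b)


print(search("ab+c", "xxabbbc"))
print(search("a.c", "abd abc"))
```

Because this loop never backtracks, a greedy piece can eat characters that a later piece needed: `search("a+a", "aaa")` returns `None` here, even though a full regex engine would find a match. That is the price of keeping it this simple.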