> If you're OK with the occasional catastrophically slow regex: https://swtch.co...

coldtea · on Feb 9, 2024

>You just need to read one blog post that tells you to avoid regular expression matchers that use backtracking, and you are good to go. You don't even need to understand why matching via backtracking is bad.

Yeah, no.

You might not be able to avoid using your standard lib's regex, or your project's chosen regex dependency - based on team/company policy. So it's not as simple as "use a regex engine that doesn't have this flaw".

Then if you want to avoid the cost, you need to know what backtracking is, well enough to understand which kinds of expressions can give you those performance issues.
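To make that concrete, here is a rough sketch in Python, whose stdlib `re` module is a backtracking matcher. The helper name `time_failed_match` and the two patterns are mine, picked only for illustration: a nested quantifier like `(a+)+$` forces the engine to try every way of splitting the run of "a"s once the trailing "b" rules out a match, while the flat `a+$` fails quickly on the same input.

```python
import re
import time


def time_failed_match(pattern: str, text: str) -> float:
    """Return the seconds spent on a match attempt that is expected to fail."""
    compiled = re.compile(pattern)
    start = time.perf_counter()
    result = compiled.match(text)
    elapsed = time.perf_counter() - start
    # The trailing "b" in the inputs below means neither pattern can match.
    assert result is None
    return elapsed


# Each extra "a" roughly doubles the work for the nested pattern,
# so keep n small; values much above 25 can take a very long time.
for n in (14, 16, 18, 20):
    text = "a" * n + "b"
    nested = time_failed_match(r"(a+)+$", text)
    flat = time_failed_match(r"a+$", text)
    print(f"n={n:2d}  nested={nested:.4f}s  flat={flat:.6f}s")
```

Exact timings depend on the machine and the Python version, but the nested column should grow much faster than the flat one. That is the kind of pattern you have to learn to spot if you are stuck with a backtracking engine.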

>Why?

Because there are tons of factors that can affect your regex experience with Unicode: normalization, different lower/upper case treatment, composite characters that don't match even though it looks like you typed the same character in your query, handling new Unicode characters (ASCII 7/8 bit has been fixed for decades) and so on.
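A small sketch of the composite-character case, again in Python with only the stdlib. The function name `matches_after_nfc` is made up for this example, and NFC is just one reasonable choice of normalization form. The same visible "é" can be stored as one code point or as "e" plus a combining accent, and a literal pattern built from one form won't match the other until both are normalized.

```python
import re
import unicodedata

composed = "caf\u00e9"     # "é" as a single code point (U+00E9)
decomposed = "cafe\u0301"  # "e" followed by a combining acute accent (U+0301)


def matches_after_nfc(pattern_text: str, candidate: str) -> bool:
    """Normalize both sides to NFC, then do a literal full match."""
    pattern = re.compile(re.escape(unicodedata.normalize("NFC", pattern_text)))
    return pattern.fullmatch(unicodedata.normalize("NFC", candidate)) is not None


# Without normalization the two spellings do not match each other.
print(re.fullmatch(re.escape(composed), decomposed))  # None

# After normalizing both sides they do.
print(matches_after_nfc(composed, decomposed))  # True
```

None of this is visible if you only look at the rendered text, which is why the regex abstraction leaks here too.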

eru · on Feb 9, 2024

> You might not be able to avoid using your standard lib's regex, or your project's chosen regex dependency - based on team/company policy. So it's not as simple as "use a regex engine that doesn't have this flaw".

Well, yes, if someone forces you to use tools that have flaws, you need to learn about the flaws so you can work around them. Like when using a shoe as a hammer.

I'm not sure that proves anything about abstractions?

See also https://blog.codinghorror.com/the-php-singularity/

> Because there are tons of factors that can affect your regex experience with Unicode: normalization, different lower/upper case treatment, composite characters that don't match even though it looks like you typed the same character in your query, handling new Unicode characters (ASCII 7/8 bit has been fixed for decades) and so on.

Thanks.