Yeah, it's actually an interesting case study in performance optimization: GNU grep puts all this effort into optimizing the performance characteristics of the system calls it uses based on deep kernel knowledge, but ripgrep is orders of magnitude faster for many users via the simple trick of "completely ignore a lot of files by default".
That's not really what's happening in the article. If you read through the single-file benchmark, you'll see several clever algorithmic improvements (like rarest byte guessing, building a set of variants for Unicode-aware multiple pattern matching, etc.).
The author literally concedes that the .gitignore feature was not added for performance, and that it actually carries significant overhead in large directory trees. For the sake of comparability, the study controlled for the .gitignore overhead.
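To make "rarest byte guessing" a bit more concrete, here is a minimal Python sketch of the general idea: pick the byte of the pattern that is assumed to be least common, scan for that single byte with a fast primitive, and only then verify the full pattern. The ranking in `assumed_rank` is a made-up illustration, not ripgrep's actual frequency table, and the function names are hypothetical.

```python
# Bytes assumed to be very common in typical text (an illustrative guess,
# not a table taken from ripgrep).
COMMON = b" etaoinsrhldcu\n"


def assumed_rank(b: int) -> int:
    """Return a rough commonness score for a byte: lower means assumed rarer."""
    if b in COMMON:
        return 2
    if 0x30 <= b <= 0x39 or 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A:
        return 1  # other ASCII letters and digits
    return 0      # punctuation and everything else


def rarest_offset(needle: bytes) -> int:
    """Offset of the byte in `needle` that we guess is the rarest."""
    return min(range(len(needle)), key=lambda i: assumed_rank(needle[i]))


def find(haystack: bytes, needle: bytes) -> int:
    """Return the offset of the first match of `needle`, or -1."""
    if not needle:
        return 0
    off = rarest_offset(needle)
    rare = bytes([needle[off]])
    pos = off  # a match cannot place the rare byte before this offset
    while True:
        hit = haystack.find(rare, pos)  # fast single-byte scan
        if hit == -1:
            return -1
        start = hit - off
        # Verify the whole pattern only at candidate positions.
        if haystack[start:start + len(needle)] == needle:
            return start
        pos = hit + 1
```

The point of the guess is that the fast single-byte scan stops less often when the chosen byte really is rare, so fewer candidate positions need the full comparison.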
> simple trick of "completely ignore a lot of files by default"
The author of rg wrote a blog post about this. As far as I recall, he ran the performance comparisons under the same limitations and scope for each tool, so the difference in that benchmark isn't explained by something as obvious as this.
This is very very very wrong. GNU grep is not doing any optimizations based on "deep kernel knowledge" that ripgrep doesn't do. I'm honestly not even sure what you're referring to. GNU grep uses standard 'read' syscalls. ripgrep does that too (but also uses memory maps in some cases). There is some buffer size tuning, but otherwise, nothing particularly interesting there.
ripgrep's speed might come from ignoring files in any given use case, and it might even be the biggest reason why a search completes faster. But in my linked blog post, I control for all of that. Yes, while ripgrep might be faster in some cases because of its "smart" filtering, it's also faster in cases where "smart" filtering isn't enabled.
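For readers unfamiliar with the distinction between plain `read` calls and memory maps mentioned above, here is a rough Python sketch of the two strategies. It is only an illustration of the general technique, not how either tool is implemented; the function names and the 64 KiB default buffer size are assumptions.

```python
import mmap
import os


def find_with_read(path: str, needle: bytes, bufsize: int = 64 * 1024) -> int:
    """First match offset (or -1) using fixed-size read() calls."""
    if not needle:
        return 0
    keep = len(needle) - 1  # bytes to carry over so matches can span buffers
    tail = b""
    base = 0                # file offset of the first byte of `window`
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                return -1
            window = tail + chunk
            i = window.find(needle)
            if i != -1:
                return base + i
            tail = window[-keep:] if keep else b""
            base += len(window) - len(tail)


def find_with_mmap(path: str, needle: bytes) -> int:
    """First match offset (or -1) by searching a read-only memory map."""
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return -1  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m.find(needle)
```

The buffer size passed to `find_with_read` is the kind of knob that "buffer size tuning" refers to; the carry-over of `len(needle) - 1` bytes is what keeps a match that straddles two buffers from being missed.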