Hacker News new | past | comments | ask | show | jobs | submit login

What is really oddly specific is this instruction :p but I believe it is a common trick to quickly scan for short string matches.

It computes `sum(|x[i] - y[i]|)` for consecutive `i` at different offsets, so it should be zero at substring matches.

For context: https://epubs.siam.org/doi/pdf/10.1137/1.9781611972931.10

I was slightly mistaken, the instruction of interest is _mm256_mpsadbw_epu8




Oh I see. Yes, that's what is commonly used in academic publications. But I've yet to see it used in the wild.

I mentioned exactly that paper (I believe) in my write-up on Teddy: https://github.com/BurntSushi/aho-corasick/tree/master/src/p...


I think Wojciech Muła, who devised the original SIMD-oriented Rabin Karp algorithm, also did measure MPSADBW approaches and found that it is not a good fit for general string-in-string searches [1]. Maybe not today though.

[1] http://0x80.pl/articles/simd-strfind.html#id7




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: