
Isn't there also a significant difference in what the input is being parsed into? My expectation is for a protobuf library to parse messages into structs, with names resolved and constant-time field access. simdjson parses a JSON object to an iterator, with field access being linear time and requiring string comparisons rather than just indexing to a known memory offset.

I.e. it seems like simdjson trades off performance at access time for making the parsing faster. Whether that tradeoff is good depends on the access pattern.
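To make the tradeoff concrete, here is a toy sketch of the two access models. Everything in it is hypothetical (`Point`, `parse_eager`, `LazyObject`, `find_field`); it is not simdjson's or protobuf's actual API, only an illustration of "resolve names once at parse time" versus "scan and compare keys on every lookup".

```python
from dataclasses import dataclass


@dataclass
class Point:  # hypothetical message type, for illustration only
    x: int
    y: int
    label: str


def parse_eager(pairs):
    """Resolve field names once at parse time; later access is a plain attribute read."""
    fields = dict(pairs)
    return Point(fields["x"], fields["y"], fields["label"])


class LazyObject:
    """Keeps the raw (key, value) sequence; every lookup is a linear scan with string compares."""

    def __init__(self, pairs):
        self._pairs = pairs
        self.comparisons = 0  # counts key comparisons so the access cost is visible

    def find_field(self, name):
        for key, value in self._pairs:
            self.comparisons += 1
            if key == name:
                return value
        raise KeyError(name)
```

If each field is read once and in document order, the lazy model loses little. If the same fields are read repeatedly or in random order, the per-lookup scan is where the cost moves to.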



But the same could be true for protobuf: decode fields only when you need them, and 'parse' only to find the field boundaries and cardinality. I did something like that for an internal protobuf-like tool, and with precomputed message profiles you can get amazing perf. Just get the last or first bit of most bytes (vgather if not on AMD) and you can do some magic.
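As a rough scalar sketch of that idea (not the internal tool mentioned above, and with no SIMD), the following walks the standard protobuf wire format and records only where each field's payload starts and ends. The names `read_varint`, `scan_fields` and `decode_varint_field` are made up for this example, and it assumes a well-formed buffer with no truncation handling. The continuation bit tested in the varint branch is the per-byte high bit that a vectorized version would gather in bulk.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte < 0x80:  # continuation bit clear: this was the last byte
            return result, pos
        shift += 7


def scan_fields(buf):
    """Return (field_number, wire_type, start, end) per field without decoding payloads."""
    index = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        start = pos
        if wire_type == 0:  # varint: skip bytes while the continuation bit is set
            while buf[pos] & 0x80:
                pos += 1
            pos += 1
        elif wire_type == 1:  # fixed 64-bit
            pos += 8
        elif wire_type == 2:  # length-delimited: payload starts after the length prefix
            length, pos = read_varint(buf, pos)
            start = pos
            pos += length
        elif wire_type == 5:  # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        index.append((field_number, wire_type, start, pos))
    return index


def decode_varint_field(buf, entry):
    """Decode a varint payload on demand from a previously recorded index entry."""
    value, _ = read_varint(buf, entry[2])
    return value
```

For example, scanning `bytes.fromhex("089601120774657374696e67")` yields two entries, one for field 1 (a varint) and one for field 2 (length-delimited), and nothing is decoded until `decode_varint_field` or a slice of the buffer is actually requested.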



