Count-min sketch [1] is another one. It gives reasonably accurate per-event counts, even across millions of unique events, in a fixed amount of memory. It shows up a lot in high-volume stream processing, and like the Bloom filter it's easy to understand.
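Roughly, it's just a small 2-D grid of counters plus one hash function per row: every insert bumps one counter in each row, and a query takes the minimum across rows, which can only overestimate. A minimal Python sketch of that idea (the width/depth defaults and the salted built-in hash are my own simplifications, not from the paper):

    import random

    class CountMinSketch:
        def __init__(self, width=1024, depth=4):
            self.width = width
            self.depth = depth
            self.seeds = [random.randrange(1 << 32) for _ in range(depth)]
            self.table = [[0] * width for _ in range(depth)]

        def _index(self, item, row):
            # salt the hash differently per row to approximate
            # independent hash functions
            return hash((self.seeds[row], item)) % self.width

        def add(self, item, count=1):
            for row in range(self.depth):
                self.table[row][self._index(item, row)] += count

        def estimate(self, item):
            # collisions only inflate counters, so the minimum over
            # all rows is an overestimate that's usually close
            return min(self.table[row][self._index(item, row)]
                       for row in range(self.depth))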
Another cool data structure is HeavyKeeper [2], which was built as an improvement on count-min sketch for one of its use cases: ranking the most frequent events (like for a leaderboard). It can hit four nines of precision with a small, fixed amount of memory even on enormous data sets. It's the algorithm behind Redis's TOPK.
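The trick is that each bucket stores a fingerprint plus a counter, and a colliding item can only shrink the resident counter probabilistically, with the odds falling off exponentially as the counter grows, so genuinely heavy hitters tend to hold onto their buckets. A minimal Python sketch of just that part, leaving out the top-k heap the paper maintains alongside the grid (the 1.08 decay base is from the paper, the rest is my own simplification):

    import random

    class HeavyKeeper:
        def __init__(self, width=1024, depth=4, decay=1.08):
            self.width, self.depth, self.decay = width, depth, decay
            self.seeds = [random.randrange(1 << 32) for _ in range(depth)]
            # each bucket holds (fingerprint, count)
            self.buckets = [[(None, 0)] * width for _ in range(depth)]

        def add(self, item):
            fp = hash(item)
            for row in range(self.depth):
                i = hash((self.seeds[row], item)) % self.width
                bfp, count = self.buckets[row][i]
                if bfp == fp:
                    self.buckets[row][i] = (fp, count + 1)
                elif count == 0:
                    self.buckets[row][i] = (fp, 1)
                else:
                    # count-with-exponential-decay: the larger the resident
                    # count, the less likely a colliding item erodes it
                    if random.random() < self.decay ** (-count):
                        count -= 1
                        if count == 0:
                            # resident item evicted; newcomer takes the bucket
                            self.buckets[row][i] = (fp, 1)
                        else:
                            self.buckets[row][i] = (bfp, count)

        def estimate(self, item):
            fp = hash(item)
            best = 0
            for row in range(self.depth):
                i = hash((self.seeds[row], item)) % self.width
                bfp, count = self.buckets[row][i]
                if bfp == fp:
                    best = max(best, count)
            return best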
[1]: https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch
[2]: https://www.usenix.org/system/files/conference/atc18/atc18-g...