Inside boost::unordered_flat_map (bannalia.blogspot.com)
118 points by ingve on Nov 19, 2022 | 44 comments



To anybody unfamiliar with recent progress: boost::unordered_flat_map is currently the fastest open-addressing hash map implementation available for C++. Released just a few months ago, it outperforms absl::flat_hash_map and the various other implementations benchmarked in https://martin.ankerl.com/2022/08/27/hashmap-bench-01/.


It's a little hard to take seriously some of the top performers in that table, like emhash8, when the implementations are only a couple of weeks old and the commits are things like "fix crash", "fix clear", and "fix overflow".


Uhh.. super irrelevant--being web scale is the most important attribute.


I think I need more context for what you’re trying to say here. I understand “web scale” in the marketing term Mongo sense, but I don’t know how that applies to this


> I understand “web scale” in the marketing term Mongo sense

That's the joke.

(And for anyone who hasn't seen it: https://www.youtube.com/watch?v=b2F-DItXtZs)


I’m pretty sure they’re being sarcastic.


That link doesn't seem to include boost::unordered_flat_map?


Imagine you could interact with an object that had the API of all data structures, and it looked like a hash map if you ran hash operations on it, or like a linked list if you interacted with it that way.

Behind the scenes, the compiler would analyse all your calls to that data structure and see which APIs you use to interact with it.

It can then decide which algorithms to use to optimise your interaction with that data structure.

If computers could do hot/cold analysis, the system could detect that you call a hash lookup in a hot for loop but only call the ordered-map iterator from the admin screen, outside any loop.

It would then JIT-compile the right data structure implementation from a bucket of available algorithms, so the data is kept optimised for its usage patterns.

Decisions such as whether to use tombstones, compaction, vacuuming, B-trees, LSM trees (with or without separate value storage), ordered or unordered layouts, a growable array, a std::list or a vector all depend on what your intended access patterns are.

It could also do clever materialized views. If a particular view of the data is only used rarely, it isn't efficient to maintain that view eagerly.
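A toy sketch of the idea (all names here are made up for illustration): a container that starts life as a flat vector of pairs, which is cheap to append to and scan, and converts itself to a hash map the first time a keyed lookup is requested. A real system would track access frequencies and convert in both directions.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical adaptive container: representation changes based on how
// it is actually used. This toy only converts one way (vector -> map).
class AdaptiveMap {
    using Vec = std::vector<std::pair<std::string, int>>;
    using Map = std::unordered_map<std::string, int>;
    std::variant<Vec, Map> repr_;

public:
    void append(std::string key, int value) {
        if (auto* v = std::get_if<Vec>(&repr_))
            v->emplace_back(std::move(key), value);
        else
            std::get<Map>(repr_).emplace(std::move(key), value);
    }

    // A keyed lookup triggers a one-time conversion to the hash layout.
    int* find(const std::string& key) {
        if (auto* v = std::get_if<Vec>(&repr_))
            repr_ = Map(v->begin(), v->end());
        auto& m = std::get<Map>(repr_);
        auto it = m.find(key);
        return it == m.end() ? nullptr : &it->second;
    }

    bool is_hashed() const { return std::holds_alternative<Map>(repr_); }
};
```

A JIT with profiling information could make the same decision at runtime instead of on first use.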


PHP's arrays are a bit like this. You can treat them either like an (ordered) hashmap or like a growable array. The underlying representation will adapt depending on whether the current state can be represented as an array.

And I hate it! I'm always unsure about an array's contents. I write functions that expect to operate on linear arrays and might misbehave if given a map, or vice versa. It's rare for the conflation to be helpful. Arrays and maps have different meanings and are used in different ways.

Maybe a very mild version of this would be helpful, like automatically converting an array to an array-backed deque—though even that case seems better handled by a linter making the suggestion, so the programmer can make an informed decision.

Going all the way would be terrible. Types are helpful. Even in a dynamically typed language they give you a vocabulary.


This sounds cool, but compilers are currently very bad at much easier transformations than this.


This is exactly how JavaScript arrays work.

I'm not convinced it's a good idea in general, because usually the author will have a good idea about how the data will be accessed and what kind of data there is, so if you can choose up front, you may as well.

Plus having runtime analysis like that can lead to weird and unpredictable performance characteristics.


I have tried Swiss tables, but I have been unable to make them match the performance of Robin Hood tables with a single-stage lookup. Fundamentally, the issue seems to be that Swiss tables require reading 2 or 3 cache lines per lookup, while the alternative requires only 1 (most of the time). The included benchmarks are also not that useful for inserting or looking up many items, because one can achieve much higher throughput using prefetching (well under 10 ns/op), but this sort of consideration never makes it into any graphs.
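The prefetching trick the parent mentions can be sketched on a plain array; the same idea applies to batched hash lookups, where you would compute the hash and prefetch the target slot for key i+k while probing key i. Note that __builtin_prefetch is a GCC/Clang extension, and the distance constant below is a made-up tuning value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Batched gather with software prefetch: while processing index i, issue
// a prefetch for the element needed kAhead iterations later, so the
// memory fetch overlaps with useful work instead of stalling each access.
int gather_sum(const std::vector<int>& data,
               const std::vector<std::size_t>& idx) {
    constexpr std::size_t kAhead = 8;  // prefetch distance (tunable)
    int sum = 0;
    for (std::size_t i = 0; i < idx.size(); ++i) {
        if (i + kAhead < idx.size())
            __builtin_prefetch(&data[idx[i + kAhead]]);
        sum += data[idx[i]];
    }
    return sum;
}
```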


I wish that such benchmarks included bad old std::map, too, so that I could beat my stubborn coworker with them until he finally started using hash maps by default.


There isn't one std::map though, each C++ stdlib has its own.


True enough!


std::map has very different characteristics though. Do you mean std::unordered_map?


Related: this talk https://www.youtube.com/watch?v=ncHmEUmJZf4 about the making of absl::flat_hash_map (a "competitor" to boost::unordered_flat_map) is one of my favorites.


This talk is also interesting, covering multiple approaches: https://www.youtube.com/watch?v=M2fKMP47slQ


I have a question about time complexity for hash maps. What is considered the complexity of inserting an initially unknown number n of items into the map?

My longstanding belief was that this complexity should be O(n log n) because a particular hash size will be chosen and then have to be repeatedly grown as the density of the map's population overwhelms the current size of the structure. An exponentially increasing size gives growth steps every log n items, and the growth step will require O(n) time.

Now everyone seems to believe that hash map insertion is O(1) time, with the only exception being a degenerate coincidence between the hash function and the hashed data that puts a large number of items into the same bucket.

Do current hash maps avoid O(n) complexity when growth is required? How?


Inserting n items is still O(n) overall, because sizes grow exponentially, so the amortized cost of rehashing per inserted element is a constant. Here's a theoretical analysis of this: https://www.cs.cornell.edu/courses/cs3110/2011sp/Lectures/le...
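One way to see this empirically is to count how often a std::unordered_map actually rehashes while inserting n elements. The exact count depends on the standard library's growth policy, but because capacity grows geometrically it is only O(log n) rehashes, and the total elements moved across all of them is O(n), giving amortized O(1) per insert.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Insert n keys and count how many times the table's bucket array
// changed size. Geometric growth keeps this count logarithmic in n.
int count_rehashes(std::size_t n) {
    std::unordered_map<int, int> m;
    int rehashes = 0;
    std::size_t buckets = m.bucket_count();
    for (std::size_t i = 0; i < n; ++i) {
        m.emplace(static_cast<int>(i), 0);
        if (m.bucket_count() != buckets) {
            ++rehashes;
            buckets = m.bucket_count();
        }
    }
    return rehashes;
}
```

With 100,000 inserts you should see on the order of 20 rehashes rather than anything proportional to n.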


In theory you are right, but sizing the hash map appropriately up front is one of the tuning knobs for achieving the desired O(1). In practice, the goal is that growth either doesn't happen or happens rarely enough that you actually get the close-to-O(1) behaviour you chose a hash map for.


Sometimes I wonder why these nifty data structures that someone already implemented in C++ don't immediately (or even 5+ years later) get ports to other languages, especially languages that are closer to the metal than average (e.g. Rust, Go, ...)


What sort of languages are you thinking of which are "closer to the metal" than C++ but support the features you'd want for these structures?

The closest I can think of is Rust, but it does have ports of such things. Rust's HashMap is a Swiss table: Abseil's flat_hash_map ported to Rust (via hashbrown). Haphazard is a Rust port of the Folly library's hazard pointers (which might some day be C++26 or C++29 Hazard Pointers)


Perhaps mkoubba meant “closer to the metal” than the average programming language, like, say, Swift or even C#, not “closer to the metal” than C++. Otherwise the comment makes little sense, as I would not even consider Rust “closer to the metal” than C++. Maybe C or x86 assembly?


Yes precisely. Closer than the average


There are different semantics for these data structures in different languages. These tend to exist for corner cases, but those corner cases constrain your design.

So you might not be able to replace the default Java hashmap with absl::flat_hash_map, but you can steal ideas and maybe do better in other ways where the C++ map would be constrained. However, doing that is work that someone has to pick up.

And releasing a (e.g.) Go library that breaks from the built in map semantics might not see much adoption in that community. C++ actually seems a bit unique in terms of how many widely used implementations there are of key data structures (e.g. strings, maps, trees, inline vectors). Not criticizing just making an observation.


This is a really good point. C++ developers seem to be more opinionated than other communities about data structure semantics, in my experience. Possibly because the language chose the wrong semantics for the original standard library, whereas other languages made more sensible choices.


There are some folks working on porting SwissTable to Go: https://github.com/golang/go/issues/54766

What’s holding it back currently is coming up with a way to do incremental rehash, which Go’s current map provides.
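The technique the parent mentions can be sketched in a few lines: during growth, keep the old and new bucket arrays side by side and migrate a fixed number of buckets on every operation, so no single insert pays the full O(n) rehash cost. A toy separate-chaining version (this is an illustration of the idea, not how Go's runtime actually implements it):

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

class IncrementalMap {
    using Bucket = std::vector<std::pair<int, int>>;
    std::vector<Bucket> old_, cur_;
    std::size_t migrate_pos_ = 0, size_ = 0;

    static std::size_t index(int key, std::size_t n) {
        return static_cast<std::size_t>(key) % n;
    }

    // Move a couple of old buckets per operation instead of all at once.
    void migrate_step() {
        for (int i = 0; i < 2 && migrate_pos_ < old_.size(); ++i, ++migrate_pos_) {
            for (auto& kv : old_[migrate_pos_])
                cur_[index(kv.first, cur_.size())].push_back(kv);
            old_[migrate_pos_].clear();
        }
        if (migrate_pos_ == old_.size()) old_.clear();
    }

public:
    IncrementalMap() : cur_(8) {}

    void insert(int key, int value) {  // toy: assumes unique keys
        migrate_step();
        if (old_.empty() && size_ > cur_.size()) {  // load factor > 1: grow
            old_ = std::move(cur_);
            cur_.assign(old_.size() * 2, {});
            migrate_pos_ = 0;
        }
        cur_[index(key, cur_.size())].emplace_back(key, value);
        ++size_;
    }

    const int* find(int key) {
        migrate_step();
        // During a grow, a key may still live in the old bucket array.
        for (auto* tab : {&cur_, &old_}) {
            if (tab->empty()) continue;
            for (auto& kv : (*tab)[index(key, tab->size())])
                if (kv.first == key) return &kv.second;
        }
        return nullptr;
    }
};
```

The cost is that every lookup during a grow must probe two tables, which is exactly the kind of overhead a flat, throughput-oriented design like SwissTable tries to avoid.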


Since the swisstable is available in C++ and C, everybody can trivially use it.

I'm more concerned about missing security policies. Like collision counting.


Reading boost:: gives me shivers and flashbacks of my Diploma thesis


If you want to see utter hell on earth, check out CGAL. Even in my peak C++ TMP days it was bad; being 10 years out of C++ completely, it’s vomit-inducing.



Yea, computational geometry library. What’s unfortunate is that it’s basically the only quality implementation for some pretty niche algorithms.


New to "serious" programming and C++. What causes it to be vomit-worthy?


I'd assume extensive arcane template tricks and preprocessor abuse.


Likewise, HRanDoMCApitals gives me overwhelmed-boredom flashbacks to my early teens making Windows GUI apps.

    help(int &&WhYIsThisLang **so, const auto *LaYereED, &&(f(auto h(int *)) []withBS) {

        auto otherLang = stopTryingToRepairC++ByMakingProgrammersAddMoreSymbolsToFixIt()

        return move(move(move(boost::boost::HPatienceInstance::Empty)))))));;
    }


I still point to the wide adoption of boost:: as a reason why C++ should never be used. If 4 pages of a single error message isn't enough to convince someone, nothing will.


Those 4 pages of errors are useful as hell though. Some languages are better at giving you just the important bits (Rust) but a lot of languages suffer from "simple errors obscuring complex problems" syndrome.

C++ compile errors, particularly overloading and template errors are normally just some fairly tiny succinct errors (could not find a suitable method/type/operator) with a whole bunch of debug information attached that tells you exactly what the compiler tried and how it came to the conclusion that it could not proceed.

I can take that information and work out exactly why my code is failing to compile 99% of the time but it's rather often in other languages that I get a succinct 1-3 line error that provides little to no context beyond where in the code it is. That leaves me to dig through stack overflow in the hopes that I can find someone with the same error and a similar code layout such that I can determine what is inducing the error.

Don't get me wrong, C++ has a whole lot of issues which justify why things like Rust should be preferred wherever possible but the errors aren't nearly as bad as people make them out to be. I'm much more concerned with the cancer that are C & C++ arithmetic rules(https://twitter.com/hackingcpp/status/1492242039110524935). Likewise for the numerous ways you can accidentally introduce undefined behavior when interacting with non "new C++" code (and even sometimes then as well).


The bigger issue is the template overloading in the first place that makes libraries like boost possible.

A language should not be able to do stuff like this https://blog.mattbierner.com/stupid-template-tricks-super-te.... This is on the same level as JSFuck with JavaScript.


Honestly, I have to disagree. There is nothing particularly special about what Super Template Tetris (STT) is doing.

At its core, template metaprogramming is just functional programming at compile time. STT is just a template and a runtime function which do the following:

1. Take an input via a compile-time flag (the `-D DIRECTION`).

2. Take a type input from an included header file containing the current state (`#include "current_game.h"`).

3. Via functional programming, compute the result of a single step of the game.

4. Specialise a single function using the results of step 3; this function prints the computed result to the screen and the computed game state to a file (`./current_game.h`).

5. gcc/clang exits; compilation is complete.

6. Call the compiled binary.

7. The binary runs the specialised function and prints the outputs.

Sure it's fucky and you shouldn't do that in production but what sane individual is writing a piece of code that at runtime (after compiling) seeks out one of its own source files and modifies that file?

To prevent this from being possible you'd have to remove runtime file IO from the language. The other potential solutions wouldn't work:

1. Remove templates entirely: Still would be possible using https://github.com/Hirrolot/metalang99 which solely uses the preprocessor. Given that the preprocessor is literally just term substitution (a glorified copy/paste engine), if you removed that as well, you'd have to accept no form of metaprogramming at all.

2. Remove the ability to #include other files: Could still be done by doing everything inline. `#include` is just copy-paste anyways so it's more an abstraction than anything else to the compiler and preprocessor, it's basically the same as if all the code was pasted into the same file.

That leaves you with removing file IO. Without IO a programming language is basically useless, particularly as a systems programming language.


Most production C++ codebases are very wary of using much boost. Two of the four places I worked at (wrote C++ at all of them), the use of boost was very minimal and only when absolutely needed. The first place where this was the rule, the only boost usage was boost.asio, at my current place boost.circular_buffer is the only thing we use.


These days I use way less boost than I did 10 years ago, simply because the language comes with more batteries. But I never regretted using boost.


Who cares about 4 pages of error messages? It's one double-click away in my IDE to go to the line in my code that has the issue, whether it's one line or 500. This is an absolute non-problem. And if you don't have an IDE, just ctrl-F the magic words "required from here" (with gcc) or "requested here" (with clang): the last one will point to the place in your code where the issue comes from.


Just because a solution to the problem exists doesn't make it not a problem in the first place.



