"Each benchmark was run multiple times, and I’m using the median to get rid of any potential outliers."
This is not how you should do benchmarks. Don't take the median, and you don't even need any "warming up".
Simply run each benchmark long enough and take only the best result. This is more reliable and correct.
Benchmarks are fine, but they will only be loosely correlated with the performance you actually measure in any specific use case.
There is still substantial performance to be gained by creating bespoke hashmap designs at every point of use in code. The high dimensionality of the algorithm optimization space makes it improbable that any specific hashmap algorithm implementation will optimally capture the characteristics of a use case or set of use cases. The variance can be relatively high.
It isn't uncommon to find several independent hashmap designs inside performance-engineered code bases. The sensitivity to small details makes it difficult to build excellent hashmap abstractions with broad scope.
The performance of hash tables and hash functions depends significantly on the data distribution, and they should be compared on real datasets.
I've covered it in my presentation: https://presentations.clickhouse.com/2017-hash_tables/
Not really comprehensive. Doesn't include my favorite https://github.com/greg7mdp/parallel-hashmap which adds thread safety on top of performance.
Note that this benchmark does not include boost::unordered_flat_map. This is an open addressing variant of boost::unordered_map which was only released in December 2022.
I wanted to mention this because boost::unordered_flat_map and boost::unordered_flat_set are among the fastest open addressing hash containers in C++ land. Internally, they use lots of cool SIMD tricks. If anyone is interested in the details, here's a nice blog post by the developer: https://bannalia.blogspot.com/2022/11/inside-boostunorderedf...