I've spent some brainpower on binary search and have not been able to beat this:
https://github.com/protocolbuffers/protobuf/blob/44025909eb7...
1. Check for dense list O(1) 2. Check upper bound 3. Constant trip count binary search
The constant trip count is great for the branch predictor, and the core loop is pretty tightly optimized for the target hardware, avoiding multiplies. Every attempt to get more clever made the loop worse and did not pay for itself. It's hard because it's an array-of-structs format with a size of 12, and mostly pretty small N.
I know protobuf code is extremely high quality, but I really can't stand the c-style naming conventions.
I know people train themselves into grokking this and reading and emitting this way, but it sounds like writing "bork bork bork bork" runes to me.
I'm glad Rust feels more like Ruby and Python and that method and field names are legible.
My eyes just glaze over:
UPB_API_INLINE
const struct upb_MiniTableField* upb_MiniTable_FindFieldByNumber(
const struct upb_MiniTable* m, uint32_t number) {
const uint32_t i = number - 1; // 0 wraps to UINT32_MAX
// Ideal case: index into dense fields
if (i < m->UPB_PRIVATE(dense_below)) {
UPB_ASSERT(m->UPB_ONLYBITS(fields)[i].UPB_ONLYBITS(number) == number);
return &m->UPB_ONLYBITS(fields)[i];
}
// Early exit if the field number is out of range.
uint32_t hi = m->UPB_ONLYBITS(field_count);
uint32_t lo = m->UPB_PRIVATE(dense_below);
UPB_ASSERT(hi >= lo);
uint32_t search_len = hi - lo;
if (search_len == 0 ||
number > m->UPB_ONLYBITS(fields)[hi - 1].UPB_ONLYBITS(number)) {
return NULL;
}
// Slow case: binary search
const struct upb_MiniTableField* candidate;
#ifndef NDEBUG
candidate = UPB_PRIVATE(upb_MiniTable_ArmOptimizedLowerBound)(
m, lo, search_len, number);
UPB_ASSERT(candidate ==
UPB_PRIVATE(upb_MiniTable_LowerBound)(m, lo, search_len, number));
#elif UPB_ARM64_ASM
candidate = UPB_PRIVATE(upb_MiniTable_ArmOptimizedLowerBound)(
m, lo, search_len, number);
#else
candidate = UPB_PRIVATE(upb_MiniTable_LowerBound)(m, lo, search_len, number);
#endif
return candidate->UPB_ONLYBITS(number) == number ? candidate : NULL;
}
For a pretty small N I've found that less clever can be quite a bit faster. I'd try a linear search - possibly SIMD if you can change the data format to struct-of-arrays. An adaptive approach that uses linear search up to a certain N can also yield some benefit.