Hacker News

Zigzag Decoding with AVX-512

115 points by luu last Wednesday at 5:43 PM | 21 comments

Comments

flohofwoe today at 9:28 AM

Worth mentioning that MeshOptimizer (https://github.com/zeux/meshoptimizer) has become one of a handful of 'hidden champion' pillar libraries that probably carry half of the gaming industry.

Basically the curl of asset pipelines ;)

https://github.com/zeux/meshoptimizer/discussions/986

MaskRay today at 7:44 AM

While we could use zigzag encoding, (i>>31) ^ (i<<1), to convert the SLEB128-encoded type/addend to ULEB128 instead, the generated code is inferior to or on par with SLEB128 for one-byte encodings on x86, AArch64, and RISC-V. I haven't tried wider values, but zigzag encoding is likely slower there as well.

  // One-byte case for SLEB128: 0..63 stay as-is, 64..127 map to -64..-1
  int64_t from_signext(uint64_t v) { return v < 64 ? v : v - 128; }

  // One-byte case for ULEB128 with zig-zag encoding
  int64_t from_zigzag(uint64_t z) { return (z >> 1) ^ -(z & 1); }
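For readers who want to check the mapping by hand, here is a minimal scalar sketch of a 64-bit zigzag round trip. The function names `zigzag_encode64` and `zigzag_decode64` are hypothetical, chosen for illustration; the encoder is the 64-bit analogue of the (i>>31) ^ (i<<1) formula above, written with unsigned arithmetic to sidestep undefined behaviour when shifting negative values.

```c
#include <stdint.h>

/* Hypothetical helper: zigzag-encode a signed 64-bit value so that
 * 0, -1, 1, -2, 2, ... map to 0, 1, 2, 3, 4, ... */
static uint64_t zigzag_encode64(int64_t i) {
    /* All-ones mask for negative inputs, zero otherwise. */
    uint64_t sign = (i < 0) ? UINT64_MAX : 0;
    /* Shift in the unsigned domain; left-shifting a negative int64_t is undefined. */
    return ((uint64_t)i << 1) ^ sign;
}

/* Hypothetical helper: inverse of zigzag_encode64. */
static int64_t zigzag_decode64(uint64_t z) {
    /* 0 - (z & 1) is all-ones when the low bit is set, zero otherwise.
     * The final cast assumes two's-complement conversion, which holds on
     * mainstream compilers. */
    return (int64_t)((z >> 1) ^ (0 - (z & 1)));
}
```

Small magnitudes of either sign land on small unsigned values, which is what lets ULEB128 store them in one byte; whether that beats plain SLEB128 sign extension in generated code is the question raised above.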

wood_spirit today at 1:57 PM

Zigzag encodings are a common compression scheme used in the Parquet format. It is fun to speculate that these kinds of tricks could be applied there, in something so commonly under the hood of a lot of data processing and analytics.

fc417fc802 today at 11:21 AM

Is the matrix for bit shifting upside down or am I momentarily making a really dumb mistake here? Edit: nvm I missed the footnote which clarifies that for whatever reason the instruction populates the matrix from bottom to top.

londons_explore today at 7:50 AM

This sort of analysis is great.

Now why can't compilers do this sort of thing automatically?

Almost any problem seems possible to speed up 1000x with AVX-512 plus days of thought, compared to the naive version written as a Python loop. If we could automate that whole process for big codebases, the performance gains could be huge.
