This is nice work, but I found the bug-finding example to be weird:
> One such bug was in the sign function for zigzag decoding of the datrs/varinteger library. On input Std.U64.MAX, the expression (value + 1) overflowed, causing crashes in debug mode and silent corruption in release mode—an edge case that testing and fuzzing would typically miss.
In what way would this boundary condition case be considered something that "testing [...] would typically miss"? It's certainly something that bad tests would miss or not think about, but I find that (a) careful people and (b) ML coding systems are actually really good at "oh, I should test the extreme values". Especially for things that parse user input.
I'm curious if they found other bugs that were more interesting, but found them too hard to explain quickly.
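For anyone who wants to see the overflow concretely, here is a minimal sketch of the pattern the quote describes. It is not the library's actual code: the function names are hypothetical, and `wrapping_add` stands in for what a plain `value + 1` does in a release build (in a debug build the plain addition would panic instead).

```rust
/// Hypothetical zigzag decode following the pattern in the quote.
/// `wrapping_add` mimics the release-mode behaviour of `value + 1`.
fn sign_wrapping(value: u64) -> i64 {
    if value & 1 != 0 {
        // For value == u64::MAX the sum wraps to 0, so the result is 0.
        -((value.wrapping_add(1) / 2) as i64)
    } else {
        (value / 2) as i64
    }
}

/// One overflow-free way to write the same mapping (standard zigzag decode):
/// shift the magnitude down, then flip all bits if the low bit was set.
fn sign_fixed(value: u64) -> i64 {
    ((value >> 1) as i64) ^ -((value & 1) as i64)
}
```

Under these assumptions `sign_wrapping(u64::MAX)` silently yields 0, while `sign_fixed(u64::MAX)` yields `i64::MIN`, which is the value that zigzag-encodes to `u64::MAX`. The two functions agree on small inputs such as 1, 2 and 3.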
Maybe it's not something they would "typically miss", but, as this bug's existence shows, it's something they sometimes miss.
It does speak to the benefits of using Lean in that you don't need to be clever about the different examples you test.
Yes, it's basic QA. If tests missed this kind of thing, they would be of much more limited use than we generally expect them to be. It raises questions about the authors' background.
Because this is garbage PR. That's it.
Every property-based testing system (invented ca. 1980) will explore boundary values. The semantics (or lack thereof) of C and C++ can make this difficult to actually test for because the compiler is allowed to say "test passed" to any input leading to UB.
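To illustrate the boundary-value point without pulling in a testing crate, here is a rough stdlib-only sketch of a round-trip check over a handful of extreme inputs. Everything in it is an assumption for illustration: the helper names are made up, the decoder only follows the `(value + 1)` pattern from the quote rather than the library's real code, and `checked_add` is used so the overflow shows up as `None` regardless of build profile.

```rust
/// Hypothetical zigzag encoder (standard formulation).
fn zigzag_encode(n: i64) -> u64 {
    ((n << 1) ^ (n >> 63)) as u64
}

/// Hypothetical decoder following the `(value + 1)` pattern from the quote.
/// Returns None when the addition would overflow.
fn zigzag_decode_checked(v: u64) -> Option<i64> {
    if v & 1 != 0 {
        v.checked_add(1).map(|x| -((x / 2) as i64))
    } else {
        Some((v / 2) as i64)
    }
}

/// Returns the boundary inputs whose encode/decode round trip fails.
fn failing_boundaries() -> Vec<i64> {
    // The sort of extremes a property-based tester tends to try early.
    let candidates = [0, 1, -1, i64::MAX, i64::MIN];
    candidates
        .iter()
        .copied()
        .filter(|&n| zigzag_decode_checked(zigzag_encode(n)) != Some(n))
        .collect()
}
```

With this sketch, `failing_boundaries()` reports `i64::MIN` as the only failing input from that list: it encodes to `u64::MAX`, which is exactly where the addition overflows.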
particularly "and fuzzing", yea. fuzzing generally does intentionally explore boundary values, from what I've seen. for an encoding library like this, I think it's fair to say that fuzzing is a baseline expectation for any decent code, and it almost certainly would've caught this in seconds.
--- edit
concretely, I made a very simple round-trip test with proptest, and got dozens of failures and this in less than a second: