What if you asked your favorite AI agent to produce mathematics at the level of Vladimir Voevodsky's Fields Medal-winning, foundation-shaking work, but directed toward something the legendary Nikolaj Bjørner (co-creator of Z3) could actually use?
Well, you'd get this embarrassing mess, apparently.
I miss the days when humans submitted things they had done to this site, instead of generating long slop articles in five minutes (‘LLM-based code synthesis—while mind-numbingly effective—’) about slop code they generated in five minutes (or, worse, in hours) with foolish prompts like ‘Produce mathematics at the level of Vladimir Voevodsky, Fields Medal-winning, foundation-shaking work’.
Should we even read this, or should we get an LLM to summarise it into a few bullet points again?
This bit was interesting in illuminating the human authors’ credulity (assuming they believe their own article):
‘The central move was elegant: stop asking only “is the system safe?”, start asking “how far is it from safety?”’
This ersatz profundity, couched in a false opposition, is common in generated text. Does it have anything at all to do with the code that was generated, or is it all just convincing bullshit?
At a very quick look, no evidence is given that the "bugs" found in requests are in fact reachable, i.e. not prevented by construction. And sure enough, the very first one is impossible because of a validating guard[1]: `address_in_network` only gets called after `is_valid_cidr`, which enforces the presence of a slash.
I think we should hold claims about effective static analysis and/or program verification to a higher standard than this.
[1]: https://github.com/psf/requests/blob/4bd79e397304d46dfccd76f...
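To make the point concrete, here is a minimal sketch of the guard pattern described above. The function names `is_valid_cidr` and `address_in_network` come from requests' `utils.py`; the bodies are simplified reconstructions, not the verbatim source, and the caller `bypasses_proxy` is a hypothetical stand-in for the real call site. The point is that `address_in_network` would indeed raise `ValueError` on a CIDR string without a slash, but the validating guard runs first, so that path is unreachable:

```python
import socket
import struct


def dotted_netmask(mask: int) -> str:
    """Convert a prefix length (e.g. 24) to a dotted netmask (255.255.255.0)."""
    bits = 0xFFFFFFFF ^ ((1 << (32 - mask)) - 1)
    return socket.inet_ntoa(struct.pack(">I", bits))


def is_valid_cidr(string_network: str) -> bool:
    """Validating guard: exactly one slash, mask in 1..32, parseable address."""
    if string_network.count("/") != 1:
        return False
    try:
        mask = int(string_network.split("/")[1])
    except ValueError:
        return False
    if mask < 1 or mask > 32:
        return False
    try:
        socket.inet_aton(string_network.split("/")[0])
    except OSError:
        return False
    return True


def address_in_network(ip: str, net: str) -> bool:
    """Would raise ValueError on a slash-less `net` -- but see the caller below."""
    ipaddr = struct.unpack(">I", socket.inet_aton(ip))[0]
    netaddr, bits = net.split("/")  # crashes only if the guard is skipped
    netmask = struct.unpack(">I", socket.inet_aton(dotted_netmask(int(bits))))[0]
    network = struct.unpack(">I", socket.inet_aton(netaddr))[0] & netmask
    return (ipaddr & netmask) == (netmask & network)


def bypasses_proxy(ip: str, proxy_net: str) -> bool:
    """Hypothetical call site: the guard enforces the slash by construction."""
    if is_valid_cidr(proxy_net):
        return address_in_network(ip, proxy_net)
    return False
```

Any static-analysis claim about the missing-slash "bug" would have to show a call path that reaches `address_in_network` without passing through the guard, which is exactly the evidence the article does not provide.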