logoalt Hacker News

kneel25today at 2:34 PM4 repliesview on HN

> After the initial translation, I ran multiple passes of adversarial review, asking different models to analyze the code for mistakes and bad patterns.

I feel like you just know it’s doomed. What this is saying is “I didn’t want to and cannot review the code it generated” asking models to find mistakes never works for me. It’ll find obvious patterns, a tendency towards security mistakes, but not deep logical errors.


Replies

zamadatixtoday at 7:10 PM

Somehow they did use this as part of their approach to get to 0 regressions across 65k tests + no performance regressions though + identical output for AST and bytecode though. How much manual review was part of the hundreds of rounds of prompt steering is not stated, but I don't think it's possible to say it couldn't find any deep logical errors along the way and still achieve those results.

The part that concerns me is whether this part will actually come in time or not:

> The Rust code intentionally mimics things like the C++ register allocation patterns so that the two compilers produce identical bytecode. Correctness is a close second. We know the result isn’t idiomatic Rust, and there’s a lot that can be simplified once we’re comfortable retiring the C++ pipeline. That cleanup will come in time.

Of course, it wouldn't be the first time Andreas delivered more than I expected :).

show 1 reply
herrkanintoday at 2:38 PM

Your argument is just as applicable on human code reviewers. Obviously having others review the code will catch issues you would never have thought of. This includes agents as well.

show 3 replies
u_samatoday at 2:54 PM

That is what the testing suite is there to check, no?

show 2 replies
cardanometoday at 3:05 PM

Yeah, I lost all interest in the ladybird project now that it is AI slop.

No one wants to work with this generated, ugly, unidiomatic ball of Rust. Other than other people using AI. So you dependency AI grows and grows. It is a vicious trap.