Hacker News

indymike, yesterday at 6:33 PM (2 replies)

Because of the scale of generated code, often it is the AI verifying the AI's work.


Replies

ptnpzwqd, yesterday at 10:26 PM

I of course cannot say what the future holds, but current frontier models are - in my experience - nowhere near good enough for such autonomy.

Even with other agents reviewing the code, good test coverage, etc., both smaller - and every now and then larger - mistakes make their way through, and the existence of such mistakes in the codebase tends to accelerate even more of them.

It for sure depends on many factors, but I have seen enough to feel confident that we are not there yet.

tartoran, yesterday at 6:37 PM

So who's verifying the AI doing the verifying, or is it yet another AI layer doing that? If something goes wrong, who's liable, the AI?
