Hacker News

sigmoid10 yesterday at 7:49 PM

Managing a McDonald's is a question of integration and modalities at this point. I don't think anyone still doubts that these models have the reasoning capability or world knowledge needed for the job. So it's less of a fundamental technical problem and more of a process engineering issue.


Replies

andy12_ yesterday at 8:38 PM

I disagree. Even frontier models still achieve far worse results than the human baseline in VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.

throw-the-towel yesterday at 7:50 PM

The capability they lack is being able to be sued.
