Hacker News

the_gipsy yesterday at 6:00 PM

With AIs, it seems like there is never a useful comparison.


Replies

theptip yesterday at 7:47 PM

You can build evals. Look at Harbor or Inspect. It’s just more work than most are interested in doing right now.

jascha_eng yesterday at 6:17 PM

Yup, it's all vibes. And Anthropic is still winning on those, in my book.