Hacker News

woeirua yesterday at 3:57 AM

That's such a huge delta that Anthropic might be onto something...


Replies

conception yesterday at 4:03 AM

Anthropic has been the only AI company that actually cares about AI safety. Here’s a dated benchmark, but it’s a trend I’ve never seen disputed: https://crfm.stanford.edu/helm/air-bench/latest/#/leaderboar...

LeoPanthera yesterday at 4:32 AM

This might also be why Gemini is generally considered to give better answers - except in the case of code.

Perhaps thinking about your guardrails all the time makes you think about the actual question less.

rahidz yesterday at 12:09 PM

Or Anthropic's models are intelligent/trained on enough misalignment papers, and are aware they're being tested.