Hacker News

what, last Saturday at 11:23 PM (2 replies)

What assumption?


Replies

sigmarule, last Saturday at 11:24 PM

The one I quoted, which contradicts Anthropic’s post and has no publicly available supporting evidence: that a jailbreak was found that accesses the model’s _raw_ capabilities. Anthropic has explained that this was not the case.

apstls, last Saturday at 11:39 PM

It is pretty clear, no? Anthropic claims that the jailbreaks they were made aware of did not access the model’s raw capability, explained that there are protections to mitigate the impact of successful jailbreaks, etc. Coming here and stating something to the contrary with zero explanation or actual evidence is the assumption.