Hacker News

j2kun yesterday at 11:24 PM

It's less that I think they would take the other side of the argument than that they would lend some credence to the content of the analysis. For example, I would not particularly trust a bunch of AI researchers to come up with a representative set of CTF tasks, which seems to be the basis of this analysis.


Replies

tptacek yesterday at 11:54 PM

Yeah, you might be right about this particular analysis! The sense I have from talking to people at the labs is that they're really just picking deliberately diverse and high-profile targets to see what the models are capable of.