Hacker News

bhouston, yesterday at 5:41 PM

Mythos was clear it was one agent per chunk. But these positive confirming results do not actually disprove anything about Mythos, because they are only one side of the discriminator challenge: you got positives, but we do not know your false positive rate or your false negative rate.
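To make the discriminator point concrete, here is a minimal sketch of how the two rates would be computed if labeled results were available. The function name and all counts are hypothetical, chosen only for illustration; none of them come from the article or from Mythos.

```python
def discriminator_rates(tp, fp, tn, fn):
    """Return (false_positive_rate, false_negative_rate) from confusion-matrix counts.

    tp/fn count real issues that were flagged / missed;
    fp/tn count non-issues that were flagged / correctly ignored.
    """
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return fpr, fnr


# Hypothetical counts: two runs that both report 8 true positives.
careful = discriminator_rates(tp=8, fp=5, tn=95, fn=2)
noisy = discriminator_rates(tp=8, fp=60, tn=40, fn=2)

print(careful)  # same true positives, low false positive rate
print(noisy)    # same true positives, high false positive rate
```

Both hypothetical runs "got positives", yet one flags far more non-issues than the other, which is why the positives alone say little about the discriminator.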


Replies

kennywinker, yesterday at 5:46 PM

In TFA they talk a fair bit about how different models perform wrt false positives:

“The results show something close to inverse scaling: small, cheap models outperform large frontier ones.”