Hacker News

ameliaquining · today at 1:39 AM · 1 reply

On the contrary, when a machine has been shown to outperform human judgment at a specific task, you should trust it over your own gut feeling, especially if you have no particular training or track record at the task.

There have been third-party evaluations of Pangram, e.g., https://bfi.uchicago.edu/wp-content/uploads/2025/09/BFI_WP_2.... I personally do not think I could achieve that rate of accuracy if you made me read a bunch of text samples and guess whether humans or AIs wrote them. Do you think you could?


Replies

bonsai_spool · today at 11:43 AM

One needs to be careful when citing papers: Pangram was not tested on the most recent models; the newest one in that report is Claude Opus 4. Notably, Pangram does worse on news reports than on other types of text-detection tasks, and its failures vary with the generating model in a way that suggests the detector is very sensitive to whatever sources were used for its training.

> Do you think you could?

That's not the right question. I am saying that this particular article, judging by its tendencies and this author's historical writings, is LLM-assisted if not wholly generated.