biophysboy • yesterday at 8:19 PM • 1 reply • view on HN

This matches my experience using LLMs for science. Out of curiosity, I downloaded a randomized study and the CONSORT checklist, and asked Claude Code to review the study against the checklist.

I was really impressed with how it parsed the structured checklist. I was not at all impressed by how it digested the paper. Lots of disguised errors.

Replies

baq • yesterday at 8:59 PM

try codex 5.3. it's dry and very obviously AI; if you allow a bit of anthropomorphisation, it's kind of high-functioning autistic. it isn't an oracle, it'll still be wrong, but it's a powerful tool, completely different from claude.
