Hacker News

creamyhorror, yesterday at 7:48 PM

I've only used 5.4 for 1 prompt (edit: 3@high now) so far (reasoning: extra high, took really long), and it was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clearly written, unlike 5.3-Codex. It feels very lucid and uses human phrasing.

It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.


Replies

pembrook, yesterday at 10:40 PM

The latest research these days is that including an AGENTS.md file only makes outcomes worse with frontier models.

sampton, yesterday at 9:52 PM

That's been my experience as well switching from Opus to Codex. Reasoning takes longer but answers are precise. Claude is sloppy in comparison.

irishcoffee, yesterday at 9:19 PM

> It might be my AGENTS.md requiring clearer, simpler language

If you gave me the exact same markdown file and I posted the exact same prompts as you, would I get the same results?
