creamyhorror • yesterday at 7:48 PM • 3 replies

I've only used 5.4 for one prompt so far (edit: three at high reasoning now). That first prompt ran at extra-high reasoning and took really long; it was to analyse my codebase and write an evaluation on a topic. But I found its writing and analysis thoughtful, precise, and surprisingly clear, unlike 5.3-Codex. It feels very lucid and uses human phrasing.

It might be my AGENTS.md requiring clearer, simpler language, but at least 5.4's doing a good job of following the guidelines. 5.3-Codex wasn't so great at simple, clear writing.

Replies

pembrook • yesterday at 10:40 PM

The latest research these days is that including an AGENTS.md file only makes outcomes worse with frontier models.

sampton • yesterday at 9:52 PM

That's been my experience as well switching from Opus to Codex. Reasoning takes longer but answers are precise. Claude is sloppy in comparison.

irishcoffee • yesterday at 9:19 PM

> It might be my AGENTS.md requiring clearer, simpler language

If you gave the exact same markdown file to me and I posted the exact same prompts as you, would I get the same results?
