XCSme • yesterday at 8:34 AM

Why not? I described this in more detail in other comments.

Even when using structured output, sometimes you want to define how the data should be displayed or formatted, especially for cases like chatbots, article writing, tool usage, calling external APIs, parsing documents, etc.

Most models get this right. Also, this is just one failure mode of Claude.

Replies

BoorishBears • yesterday at 7:25 PM

Like I said in the edit, when people want specific formatting they ask for well-known formats: Markdown, XML, JSON.

I don't even need to debate whether the benchmark is useful; it doesn't pass a sniff test. GPT-5.4 is not worse than Gemini 2.5 Flash in any way that matters to most users, yet in your benchmark it's meaningfully worse.
