GPT-5.4

586 points • by mudkipdev • yesterday at 6:08 PM • 512 comments

https://openai.com/index/gpt-5-4-thinking-system-card/

https://x.com/OpenAI/status/2029620619743219811

Comments

I was just testing this with my Unity automation tool, and the performance uplift from 5.2 seems to be substantial.

Even with the 1M context window, it looks like these models drop off significantly at about 256k. Hopefully improving that is a high priority for 2026.

nthypes • yesterday at 6:37 PM

$30/M input and $180/M output tokens is nuts. Ridiculously expensive for a not-that-great bump in intelligence compared to other models.

motza • yesterday at 9:12 PM

No doubt this was released early to ease the bad press

melbourne_mat • yesterday at 9:47 PM

Quick: let's release something new that gives the appearance that we're still relevant

vicchenai • yesterday at 7:00 PM

Honestly at this point I just want to know if it follows complex instructions better than 5.1. The benchmark numbers stopped meaning much to me a while ago - real usage always feels different.

ilaksh • yesterday at 6:36 PM

Remember when everyone was predicting that GPT-5 would take over the planet?

tmpz22 • yesterday at 6:33 PM

Does this improve Tomahawk Missile accuracy?

ltbarcly3 • yesterday at 10:55 PM

Not a single comparison between 5.4 and Gemini or Claude. OpenAI continues to fall further behind.

brcmthrowaway • yesterday at 10:28 PM

How much of LLM improvement comes from regular ChatGPT usage these days?

world2vec • yesterday at 6:37 PM

Benchmarks barely improved, it seems.

gigatexal • yesterday at 9:54 PM

Is it any good at coding?

fernst • yesterday at 9:32 PM

Now with more and improved domestic espionage capabilities

throwaway5752 • yesterday at 9:02 PM

Does this model autonomously kill people without human approval or perform domestic surveillance of US citizens?

koakuma-chan • yesterday at 8:18 PM

Anyone else getting artifacts when using this model in Cursor?

numerusformassistant to=functions.ReadFile մեկնաբանություն 天天爱彩票网站json {"path":

jesse_dot_id • yesterday at 7:55 PM

ChatMDK

thefounder • yesterday at 10:08 PM

Is it just me, or is the price for 5.4 Pro just insane?

OutOfHere • yesterday at 6:59 PM

What is with the absurdity of skipping "5.3 Thinking"?

lostmsu • yesterday at 6:52 PM

What is Pro exactly and is it available in Codex CLI?

HardCodedBias • yesterday at 6:50 PM

We'll have to wait a day or two, maybe a week or two, to determine if this is more capable in coding than 5.3, which seems to be the economically valuable capability at this time.

In terms of writing and research even Gemini, with a good prompt, is close to useable. That's likely not a differentiator.

wahnfrieden • yesterday at 6:41 PM

No Codex model yet

oytis • yesterday at 6:59 PM

Everyone is mindblown in 3...2...1

ignorantguy • yesterday at 6:06 PM

It shows a 404 as of now.

iamleppert • yesterday at 7:15 PM

I wouldn't trust any of these benchmarks unless they are accompanied by some sort of proof other than "trust me bro". Also not including the parameters the models were run at (especially the other models) makes it hard to form fair comparisons. They need to publish, at minimum, the code and runner used to complete the benchmarks and logs.

Not including the Chinese models is also obviously done to make it appear like they aren't as cooked as they really are.

simianwords • yesterday at 6:33 PM

What is the point of GPT Codex?

minimaxir • yesterday at 6:25 PM

More discussion here on the blog post announcement which has been confusingly penalized by Hacker News's algorithm: https://news.ycombinator.com/item?id=47265005

leftbehinds • yesterday at 6:59 PM