Hacker News

GPT-5.4

586 points by mudkipdev yesterday at 6:08 PM | 512 comments

https://openai.com/index/gpt-5-4-thinking-system-card/

https://x.com/OpenAI/status/2029620619743219811


Comments

bob1029 yesterday at 8:17 PM

I was just testing this with my Unity automation tool, and the performance uplift from 5.2 seems to be substantial.

swingboy yesterday at 6:46 PM

Even with the 1M context window, it looks like these models drop off significantly at about 256k. Hopefully improving that is a high priority for 2026.

nthypes yesterday at 6:37 PM

$30/M input and $180/M output tokens is nuts. Ridiculously expensive for a not-that-great bump in intelligence compared to other models.

motza yesterday at 9:12 PM

No doubt this was released early to ease the bad press

melbourne_mat yesterday at 9:47 PM

Quick: let's release something new that gives the appearance that we're still relevant

vicchenai yesterday at 7:00 PM

Honestly at this point I just want to know if it follows complex instructions better than 5.1. The benchmark numbers stopped meaning much to me a while ago - real usage always feels different.

ilaksh yesterday at 6:36 PM

Remember when everyone was predicting that GPT-5 would take over the planet?

tmpz22 yesterday at 6:33 PM

Does this improve Tomahawk Missile accuracy?

ltbarcly3 yesterday at 10:55 PM

Not a single comparison between 5.4 and Gemini or Claude. OpenAI continues to fall further behind.

brcmthrowaway yesterday at 10:28 PM

How much of LLM improvement comes from regular ChatGPT usage these days?

world2vec yesterday at 6:37 PM

Benchmarks barely improved, it seems.

gigatexal yesterday at 9:54 PM

Is it any good at coding?

fernst yesterday at 9:32 PM

Now with more and improved domestic espionage capabilities

throwaway5752 yesterday at 9:02 PM

Does this model autonomously kill people without human approval or perform domestic surveillance of US citizens?

koakuma-chan yesterday at 8:18 PM

Anyone else getting artifacts when using this model in Cursor?

numerusformassistant to=functions.ReadFile մեկնաբանություն 天天爱彩票网站json {"path":

jesse_dot_id yesterday at 7:55 PM

ChatMDK

thefounder yesterday at 10:08 PM

Is it just me, or is the price for 5.4 Pro just insane?

OutOfHere yesterday at 6:59 PM

What is with the absurdity of skipping "5.3 Thinking"?

lostmsu yesterday at 6:52 PM

What is Pro exactly and is it available in Codex CLI?

HardCodedBias yesterday at 6:50 PM

We'll have to wait a day or two, maybe a week or two, to determine whether this is more capable at coding than 5.3. Coding seems to be the economically valuable capability at this time.

In terms of writing and research, even Gemini, with a good prompt, is close to usable. That's likely not a differentiator.

wahnfrieden yesterday at 6:41 PM

No Codex model yet

oytis yesterday at 6:59 PM

Everyone is mindblown in 3...2...1

ignorantguy yesterday at 6:06 PM

It shows a 404 as of now.

iamleppert yesterday at 7:15 PM

I wouldn't trust any of these benchmarks unless they are accompanied by some sort of proof other than "trust me bro". Also not including the parameters the models were run at (especially the other models) makes it hard to form fair comparisons. They need to publish, at minimum, the code and runner used to complete the benchmarks and logs.

Not including the Chinese models is also obviously done to make it appear like they aren't as cooked as they really are.

simianwords yesterday at 6:33 PM

What is the point of GPT Codex?

minimaxir yesterday at 6:25 PM

More discussion here on the blog post announcement which has been confusingly penalized by Hacker News's algorithm: https://news.ycombinator.com/item?id=47265005

leftbehinds yesterday at 6:49 PM

some sloppy improvements

elmean yesterday at 7:25 PM

Wow insane improvements in targeting systems for military targets over children

woeirua yesterday at 8:57 PM

Feels incremental. Looks like OpenAI is struggling.
