Previewing GPT‑5.6 Sol: a next-generation model

1102 points • by minimaxir • yesterday at 5:06 PM • 713 comments

System card: https://deploymentsafety.openai.com/gpt-5-6-preview

Comments

I hate not being able to use the latest models. There needs to be a much faster resolution to whatever is happening with the federal government.

da_grift_shift • yesterday at 5:27 PM

    Flagged activity can also trigger account-level review across relevant conversations and risk signals, consistent with our terms and policies around content retention and review. Looking beyond a single conversation helps our systems distinguish persistent malicious behavior from legitimate dual-use security work, where similar technical concepts may appear in very different contexts.

Fascinating!

Every conversation you have with these "more capable" models will be monitored and joined up and then your entire account might one day be tagged as Distiller or Cyber Threat Actor or whatnot. When combined with identity verification (which isn't discussed in this press release), expect people to be falsely flagged and banned from ever using OpenAI models again.

Wish I could find the thread from last week where discussions of exactly this kind of thing were dismissed as daft and outlandish.

oofbey • yesterday at 10:32 PM

Another year, and OpenAI comes up with yet another naming scheme for their models. First it was integers (GPT2, GPT3). Then they added friendly names (remember Ada, Babbage, Curie, Davinci?), but decided against it. Instead we got dot integers (GPT3.5), then letter-number modifiers (o1), plus word modifiers like o1-pro, o3-mini, or -mini-high, or codex, codex-max, Pro, etc.

Now they've got friendly cosmic names. And they want us to believe that this time they're gonna stick to a naming convention? I'll believe it when they do 3 releases in a row without inventing a new naming scheme.

masonwan • yesterday at 6:34 PM

Guess it's just another price bump hidden behind output token speed.

wonkyfruit • yesterday at 7:07 PM

TLDR - It's not quite Mythos but it uses about 5 times fewer tokens, and those tokens are also cheaper?

https://pbs.twimg.com/media/HLwuJLvbwAAOfQZ?format=jpg&name=...

andrewlin247 • yesterday at 5:40 PM

they're trying to be Anthropic with these model names

ericyd • yesterday at 7:06 PM

whoa, a new model that surpasses benchmarks of other models? wild.

CurbStomper • yesterday at 7:04 PM

Could not care less.

johnnyApplePRNG • yesterday at 5:53 PM

Doesn't it strike anyone as strange that SOL, TERRA, and LUNA are all quasi-scam crypto tickers?

renoir • today at 7:09 AM

GPT 5.5 in Codex is so much worse than Opus, and sometimes worse than Sonnet. I don't think 5.6 Sol will be anywhere near Fable, let alone Mythos. Probably slightly better than Opus. Maybe not even.

throwitaway222 • yesterday at 7:22 PM

Time to create more LLM based startups.

  * House design plans from prompts
  * Government surveillance of public communication
  * Extracting world/spatial concepts from language models (do we really need world/spatial models now?)
  * Driverless city planning startups
  * Election vote rigging/harvesting startups
  * Video game NPC backstory startups (all NPCs in GTA 6 go to work, go home, shower, go to sleep now?)

Keep moving, don't doom.

JohnRoseDev • yesterday at 7:14 PM

I can’t help but think that these benchmarks are completely fake. Sam even posted a benchmark on X a couple days ago showing that the ‘complete version’ of 5.5 cyber was apparently already ahead of Mythos. This just feels like absolutely fake nonsense. The impact of Mythos on the industry was clear and in front of everyone’s eyes: the number of vulnerabilities Mozilla fixed, the vulnerabilities and exploits Anthropic showcased in that blog post about the Chrome sandbox escape, etc. And now we’re supposed to believe this 5.5 cyber is already ahead of Mythos, ok. And yeah, GPT 5.6 is even further ahead, alright.
