logoalt Hacker News

jmorse3yesterday at 2:34 PM2 repliesview on HN

https://www.sigops.org/2026/can-llms-model-real-world-system...

"Running leading LLMs across the eleven systems shows that LLMs are great at producing correct TLA+ syntax but struggle to ensure conformance and appropriate invariants."


Replies

zihotkiyesterday at 2:58 PM

I think that's what author implies by this sentence in the intro:

> It’s still your responsibility to understand your system and define what “correctness” means, and you need a high-level understanding of temporal logic.

show 1 reply
baqyesterday at 7:42 PM

This blog post made it more difficult than it should to find the actual tool (needed to click twice! abysmal, really): https://github.com/specula-org/Specula

Ran it on my tricky caching changes and it found bugs the standard agentic review flows missed on all frontier models. Good stuff.