logoalt Hacker News

fcarraldotoday at 1:10 AM6 repliesview on HN

It sounds like you’re running this mostly on a single machine? Temporal gets much more complex with scale. Cassandra isn’t fun to manage. Ringpop and TChannel are hard to debug when things go wrong. The SQL backend support doesn’t support horizontally scaled replicas (just single instance) due to consistency requirements. Depending on how your code is written, modifying code baked into workflows becomes complex, as anything that modifies the history event ordering breaks determinism in already-deployed workers.

We use it heavily and everyone who started on it doing simple scripting/automation all love it, everyone who built real production systems on top of it all hate it. Possibly operator error, but my experience hasn’t matched the rosy picture painted in these comments.


Replies

inglortoday at 8:04 AM

In the last two years, we built (with a team of 15, now 100) a billion dollar business on top of Temporal that performs business critical applications for fortune 500 companies. We couldn't be happier with temporal.

Determinism sucks, you do have to work hard and make everything idempotent in activities like we would for durable software anyway. The language we used was incorrect (Go) and has a lot of boilerplate compared to alternatives we later investigated (Python and TypeScript). Visibility can be slow and misses information. We needed to write our own APIs to work effectively with Agents for root-cause analysis of failures.

With all the caveats - Temporal is amazing, it feels much better than previous orchestrators I used like Prefect or Airflow. 100% would adopt again.

show 2 replies
taberiandtoday at 8:47 AM

> Depending on how your code is written, modifying code baked into workflows becomes complex, as anything that modifies the history event ordering breaks determinism in already-deployed workers.

I see this as Temporal surfacing inherent complexity of the domain in a way that forces the developer to consider it, rather than introducing extra complexity.

If it didn't make workflow determinism a strict requirement, the requirement would still exist - it would just hurt much worse in production when it's broken.

See also: Rust borrowing

show 1 reply
bitexplodertoday at 5:01 AM

Yep. Individual systems with yolo agents doing stuff in isolation. I could see how it can get complex. Most distributed systems are. No free lunches I guess. I am not sure what the alternatives are at scale.

cnlwsutoday at 1:35 PM

We are a huge production setup where it’s absolutely critical successfully but we use temporal cloud. Hosting it yourself is what makes it miserable. https://temporal.io/resources/on-demand/netflix

chrisss395today at 1:20 PM

"gets much more complex with scale" feels like the crux of it. Pick the right solution for your intended scale

That said, I appreciate this is hard in practice. We need to start small to manage the development rabbit hole risk, while also wanting to dream big. There is a tension there that I find hard to balance.

kumulotoday at 6:22 AM

Temporal feels massive, I tried it for a small workflow embedded on my system, and worked fine, but when thinking on scaling it, it just didn't made any sense for my use case.

I also have restate.dev on my reseearch list, which on paper should scale well and be definitely more lightweight and simple to setup, worth having a look.

show 1 reply