SQLite is all you need for durable workflows

629 points • by tomasol • yesterday at 5:54 PM • 336 comments

Comments

I started setting up my workflows using Temporal. It deploys as a relatively lightweight local app. For an isolated local installation it uses SQLite. It makes the process of dealing with API retries and organizing workflows and tasks really simple. I recommend giving it a try. It is, philosophically, exactly what this article is suggesting, but it adds an incredibly rich and flexible interface for agents to work with. Additionally, the web UI makes it very easy to inspect workflows, review agent execution, etc. Temporal also encodes much higher reliability into your system, almost for free. Distributed and reliable systems are hard; don't reinvent the wheel, IMO.

If you find yourself wanting things like an easy way to then introspect your SQLite database, figure out what is happening in the workflow, compose individual tasks, make workflows trivially callable, etc, give Temporal a look.

Alongside this, I have mostly moved away from files for agents. Markdown and JSON are great, but also feel like traps when building out smaller local apps. LLMs are great at SQLite and you can render anything you want out of it (Markdown, JSON, etc). It saves a lot of tokens when an agent can just query a specific row instead of having to fire up jq or grep through markdown. You get a nice, portable, self-contained data management system that encourages agents to be more disciplined about how they structure their data than a bunch of files would. It also continues to scale into MySQL/Postgres: if your little local projects start to outgrow SQLite or become more formal, you already have a schema and discipline around data.
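
A minimal sketch of the "query one row, render it however you like" idea, using Python's standard `sqlite3` module. The `notes` table, its columns, and the helper names are hypothetical, chosen only for illustration:

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory database; the `notes` table and its columns are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows can be converted to dicts
    conn.execute(
        "CREATE TABLE notes ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO notes (title, body) VALUES (?, ?)",
        [("retry policy", "Back off exponentially."), ("deploy", "Use systemd.")],
    )
    conn.commit()
    return conn


def get_note(conn, note_id):
    # Fetch exactly one row by primary key instead of scanning files.
    row = conn.execute(
        "SELECT id, title, body FROM notes WHERE id = ?", (note_id,)
    ).fetchone()
    return dict(row) if row is not None else None


def to_markdown(note):
    # Render the same row as Markdown when a human-readable form is wanted.
    return f"## {note['title']}\n\n{note['body']}\n"


conn = make_db()
note = get_note(conn, 2)
print(json.dumps(note))
print(to_markdown(note))
```

The same row feeds both the JSON and the Markdown rendering, so the database stays the single source of truth and the files become disposable views.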

utopiah • today at 7:56 AM

The cycle of expertise:

- what is X, I just do Y

- wow I can see so many limits of Y, now I do X

- I use X for literally everything

- now that I properly understand the limits of Y but also the heavy constraints of X ... maybe Y is enough

- I use Y for literally everything

Rinse and repeat. The thing is, with actual usage and actual context one does learn, and thus can get away with a much more "basic" solution, but it does require genuinely understanding the limits.

levkk • yesterday at 6:54 PM

I don't understand this obsession with SQLite for real, production apps. SQLite is an embedded database, completely unsuitable for managing concurrency. This is what database _servers_ are for, e.g., Postgres, MySQL, etc. Their entire job is to allow you to modify data from multiple processes, on different machines, at the same time.

This is a foundational principle of computer science. It seems to me that the "SQLite for everything" crowd is a little bit inexperienced.

faangguyindia • today at 1:45 AM

I've replaced all of these with Go + SQLite:

1. Intercom
2. Zendesk
3. Email marketing
4. Kanban
5. Todo
6. Our billing stack
7. Our issue tracker
8. Our forum
9. Uptime monitor
10. PagerDuty (clone)

I have dozens of products I sell, so I thought: why not build everything ourselves?

All of these run on the same server and use very little memory. I replaced all the SaaS tools we used with these.

I also moved to dedicated servers and dropped costs to about 1/10th of what we were paying for managed cloud solutions, while maintaining the same HA and even achieving lower latency (partly because noisy neighbors on VPSes were increasing tail latency).

We used to spend a ton on this stuff. These have now been in production for four months and have only needed minor updates.

Deployment is dirt simple. No Docker, no Kubernetes—just a systemd service and a binary built on the dev machine and deployed.

We also used to pay for services like MaxMind and IPData. I ended up hand-rolling my own IP geolocation service, which, in my tests, outperforms most existing solutions.

It all started with replacing Uptime Robot. Then I got more confident and replaced PagerDuty. After that, I replaced Intercom.

Finally, I had always heard people say, "Don't build your own billing stack." But I said YOLO, let me make that mistake myself. So I studied our existing billing solution, developed my own, and rolled it out. So far, we've had zero issues with it.

Caddy in front.

I found that we only use maybe 1–5% of the features most SaaS products offer, while the features we actually need keep getting buried deeper and deeper inside these "enterprise-grade" platforms, making our workflows more difficult.

I won't show my commercial products because our partners and clients probably wouldn't appreciate knowing how cheap I am—but I call it being resourceful.

I can show my free app, though, which has 20,000+ users and was launched recently: https://macrocodex.app/

It only uses the Zendesk clone. Email is handled through Cloudflare routing, so we pay almost nothing to run the app.

shukantpal • yesterday at 6:47 PM

SQLite is surprisingly performant for single-node applications, even when compared to Postgres. Postgres consumes a lot more memory and requires IO to hop through IPC, whereas you can keep everything in process in SQLite with a shared connection pool.

I've been testing different storage engines for my agent harness and I can get up to 7.5k concurrent sessions on a single vCPU with SQLite, whereas Postgres crashes or runs out of connections.

[0] https://github.com/impalasys/talon/pull/23#issuecomment-4577...

m2f2 • yesterday at 8:13 PM

There's a wide gap from files to multipartition databases. Sorry, but running databases in a container is not for me whenever real production stuff is on the table.

Personally, lots of ETL can just be taken care of locally without involving enterprise databases. In such cases, DuckDB is 5x-10x better than SQLite and orders of magnitude simpler/faster than spinning up a dedicated Postgres database.

For general scripting, a 20-line awk script is no match for a much cleaner, more robust, more maintainable equivalent SQL script based on DuckDB.

I just hope MotherDuck doesn't need to pump/dump for an IPO - it would be sad to lose that tool to the usual corporate greed.

Xeoncross • today at 3:46 PM

So is https://github.com/obeli-sk/obelisk the Rust version of https://github.com/temporalio/temporal (Go)? Can you guys add a comparison between them on the site?

prmph • yesterday at 9:23 PM

> Postgres ... is the right choice when you need higher availability, broader shared scalability, or other deployment properties that are better served by a network database. It is also the better fit when asynchronous replication to object storage is not the durability model you want... Many workflow systems do not need that on day one and should not start with more infrastructure than their state actually demands.

------

I see this kind of YAGNI thinking a lot, but in my view, it must be balanced against the effort you'd put into resolving any edge cases and adapting current architecture to your use case.

Imagine you deploy SQLite and, though it is fine by itself, you keep running into unforeseen challenges with the use to which you are putting it. You'd need to sink valuable time and effort into addressing those. Then, when you have outgrown it, you'd need to spend additional valuable time doing the same with Postgres.

This is why, when it comes to architecture, I increasingly find myself over-engineering a bit. Assuming there is a good chance you might need to upgrade your architecture in the not too distant future, that approach is actually quite efficient. I find that I am able to uncover a lot of potential gotchas, which feeds back into what the simplified current architecture should be, and helps me understand the roadmap I'm facing very well. I also avoid wasting too much time going too deep in directions that make sense now, but need a lot of plumbing to get right, when I can see that I'd likely have to throw it all out in a few years. Going from A -> B -> C -> D, where each step is the optimal good-enough-for-now architecture but requires a lot of work to stabilize and iron out the kinks, is much less efficient than exploring D well enough to know whether you should build A, B, or C now.

Basically, some over-engineering, if done right, is not wasted. It cuts right to the heart of what you are dealing with, efficiently, and allows you to make (maybe) simpler but informed choices now about how best to allocate your development resources.

bob1029 • today at 9:50 AM

> This is especially attractive for AI agents and AI-generated workflows. Those systems are often bursty, experimental, and easier to reason about when each agent or tenant has a small self-contained unit of state.

I am finding that the most important thing is one big, consistent data warehouse that is updated with the state of the business as close to real time as we can get.

SQLite is not really great at this particular problem. Something like Postgres or SQL Server would be much more suitable for an OLAP data warehouse that can serve clients (AI agents) while simultaneously merging massive record sets from upstream business systems. These products also offer intricate permissions control. You can prove to an auditor that your AI solution will never see tables or rows it's not supposed to. SQLite doesn't even have a concept of a user, role or login.

> The compute can stay cheap and disposable.

Again, hosted sql is better aligned. The alternative is DIY hosted sql (SQLite + some other magic) which immediately violates this rule.

throwaway58670 • today at 3:58 PM

Holy sticky header... It takes literally half the screen on mobile. Shit like this makes me wonder: do you even look at your website? At least sometimes?

psanford • today at 12:46 AM

I wrote a library[0] to let you concurrently update a SQLite db in S3 safely. It uses the little-known SQLite session extension plus S3 compare-and-swap on a small metadata file to make this work reasonably efficiently and safely. I have been enjoying it for a bunch of small projects where I want a lambda function to have a db for state but I don't want to pay for a full database instance.

[0]: https://github.com/psanford/s3db

stephenlf • yesterday at 7:44 PM

Can’t wait to see the next iteration of this idea with “Logs are all you need for durable workflows.”

freakynit • today at 11:06 AM

SQLite backed by RAID-10 NVMe disks and periodic backups to cloud storage is generally more than enough to run the majority of the production workloads of startups.

Writes are single-threaded, but you can still easily do thousands per second.
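
A minimal sketch of a common way to keep a single writer fast: put the whole batch in one transaction. It does not measure throughput, and the WAL and `synchronous=NORMAL` settings, the `events` table, and the `bulk_insert` helper are assumptions for illustration, not something the comment specifies:

```python
import os
import sqlite3
import tempfile


def bulk_insert(path, rows):
    """Insert all rows in a single transaction and return the row count."""
    conn = sqlite3.connect(path)
    try:
        # Assumed settings: WAL lets readers proceed while the writer works;
        # synchronous=NORMAL trades a little durability for fewer fsyncs.
        conn.execute("PRAGMA journal_mode=WAL")
        conn.execute("PRAGMA synchronous=NORMAL")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            " id INTEGER PRIMARY KEY,"
            " payload TEXT NOT NULL)"
        )
        # One transaction for the whole batch: a single commit instead of
        # one commit per row.
        with conn:
            conn.executemany("INSERT INTO events (payload) VALUES (?)", rows)
        return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    rows = [(f"event-{i}",) for i in range(5000)]
    count = bulk_insert(os.path.join(tmp, "events.db"), rows)
    print(count)
```

Committing once per batch rather than once per row is usually the biggest lever; actual numbers depend on the disk and the settings above.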

DuckDB offers similar qualities, on the OLAP side.

This is not to say it is the best combination for every case, but when you consider the simplicity of setup, usage, operations, and backups, plus the cost, it is indeed one of the best, if not the best.

golem14 • yesterday at 7:09 PM

Litestream releases 5.9 and newer have a bug that causes instances to sync an insane amount of data. A DB with <10K of data in it and practically no writes/reads causes something like 10GB of daily replication traffic. For my toy project that got needlessly expensive.

PUSH_AX • yesterday at 9:44 PM

I went from using the various big-player Postgres clusters to SQLite. We have an MAU in 7 figures, all backed by SQLite durable objects. We have to think differently about the access patterns, but the benefits have been worth it.

Thaxll • yesterday at 9:28 PM

I started using SQLite for a home project after years of reading about it, and coming from Postgres I was shocked at the poor type system. It is really inferior; not sure why it gets so much praise.

https://sqlite.org/datatype3.html

https://www.postgresql.org/docs/current/datatype.html

Working with date/time feels like using a 30-year-old database; nothing is enforced at insert. Really, someone needs to explain why so many people like it.
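
To make the complaint concrete, here is a small sketch of the default behavior and one opt-in workaround. The table names are hypothetical. The `CHECK` idiom below only accepts text that `date()` returns unchanged, i.e. `YYYY-MM-DD`; SQLite's `STRICT` tables (3.37+) are another option, but they enforce storage classes and still have no date type:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Default behavior: the declared type DATE is only an affinity hint,
# so arbitrary text is accepted and stored as-is.
conn.execute("CREATE TABLE loose (created_at DATE)")
conn.execute("INSERT INTO loose VALUES ('not a date')")
loose_value = conn.execute("SELECT created_at FROM loose").fetchone()[0]

# Opt-in enforcement: date() returns NULL for unparseable input, so the
# IS comparison is false and the CHECK constraint rejects the row.
conn.execute(
    "CREATE TABLE checked ("
    " created_at TEXT NOT NULL CHECK (created_at IS date(created_at)))"
)
conn.execute("INSERT INTO checked VALUES ('2024-01-15')")
try:
    conn.execute("INSERT INTO checked VALUES ('not a date')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(loose_value, rejected)
```

So enforcement is possible, but it has to be written per column, which is exactly the gap compared with Postgres' native date and timestamp types.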

jessmartin • today at 3:30 PM

Waiting for “JSONL is all you need for durable workflows.”

teravor • yesterday at 9:30 PM

if you have an application that needs to maintain state in a non-critical section or if you discover that using SQL is actually a good idea for some tasks (even in critical sections), SQLite is not only a good choice but it will save you a lot of time coming up with a brittle custom solution.

maintain an in-memory SQLite db and work it with SQL commands, and if you also want to preserve state across application restarts you can routinely save to disk or load from it: <https://www.sqlite.org/backup.html#example_1_loading_and_sav...>
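
A minimal sketch of that save/load cycle in Python, whose `sqlite3.Connection.backup` method wraps the same online backup API the link describes. The `kv` table and the helper names are hypothetical:

```python
import os
import sqlite3
import tempfile


def save_to_disk(mem, path):
    # Copy the whole in-memory database into a file on disk.
    disk = sqlite3.connect(path)
    try:
        mem.backup(disk)
    finally:
        disk.close()


def load_from_disk(path):
    # Copy a file database into a fresh in-memory database.
    mem = sqlite3.connect(":memory:")
    disk = sqlite3.connect(path)
    try:
        disk.backup(mem)
    finally:
        disk.close()
    return mem


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "state.db")

    mem = sqlite3.connect(":memory:")
    with mem:  # commit before taking the backup
        mem.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
        mem.execute("INSERT INTO kv VALUES ('step', 'done')")
    save_to_disk(mem, path)
    mem.close()  # simulate an application restart

    restored = load_from_disk(path)
    value = restored.execute("SELECT v FROM kv WHERE k = 'step'").fetchone()[0]
    restored.close()

print(value)
```

Anything written after the last save is lost on a crash, so how often to call the save step is a trade-off the application has to make.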

this also happens to be the most convenient file-format (aka. application-format) I ever worked with.

kubik369 • yesterday at 6:25 PM

Meta comment: This is a domain under my country's TLD (Slovakia), and it is one of the handful of words that form a word together with the TLD in my language and (coincidentally) also in English. Every now and then, I check a retrograde dictionary for domains that have this property, and the root of this particular domain used to have a Roundcube email server on it (can be checked on archive.org). After further checking, the local company actually named themselves Obeli s.r.o. (s.r.o. is Ltd), presumably so that they could use a domain that is a real word when said together with the TLD. (EDIT:) Forgot to write the thing I wanted to mention in the first place: it appears the domain must have lapsed and/or the author bought it from the company that was using it.

Another fascinating fact: our country's TLD was stolen Ocean's 11 style (I am not kidding). After Czechoslovakia split into the Czech Republic and the Slovak Republic, the newly created Slovak .sk TLD was under the care of people from the local university. The university also had some offices that it was leasing out. Someone leased this office space (EDIT: this is important, as it means they had the same physical address) and created a company with the same name as the NGO that was taking care of the domain, so e.g. the NGO was named "My Company o.z." and the perpetrator created a "My Company s.r.o." (our country's version of the American Ltd). This person then wrote to ICANN to change the address to the "My Company s.r.o.", presumably under the pretense that this was just an administrative error, and from this point they had functionally taken custody of the TLD. I was not able to find how they did it technically, but I presume they persuaded ICANN to point to their servers instead of the real ones. After this happened, it seems that no one noticed for some time. When they noticed, they tried taking it back, but they weren't able to. For some inexplicable reason, the government during that time (Šuster era, early 2000s) gave the new company a contract that was functionally uncancellable from the government side. Later governments made this even more uncancellable, and in 2017 the then Minister of IT (and as of this day president!) Pellegrini made the contract literally uncancellable. As a result, we have one of the most expensive domains around (18e/year, rising each year for no good reason). (EDIT:) The company running our country's TLD is now a foreign entity that the whole thing has been sold to (multiple owners over time), and we as a country have no control over it, if I understand it correctly.

I might have gotten some details wrong as I am writing this from my memory of researching it a couple of years back, but you get the idea, crazy stuff. Here is an article in Czech [0] that tells the story a bit better, but you have to translate it.

[0] https://www.root.cz/clanky/pribeh-domeny-sk-aneb-kradez-za-b...

// EDIT: I have found that the article actually links the movement to return the TLD back [1]. It also has a story tab [2], so they have something much more precise than the paraphrasing I wrote.

[1] https://www.nasadomena.sk/

[2] https://www.nasadomena.sk/historia/

halamadrid • today at 8:30 AM

Operators of Unmeshed here, which is basically a rewrite of Netflix Conductor. In this orchestrator we heavily use a uniquely scaled version of SQLite and also offer “managed” SQLite instances for managing user data. Combining the durable executions of Unmeshed with workflow primitives like sleep, workers, etc., you can actually build complex systems with a lot less code than ever.

Check it out here: https://unmeshed.io

jackzhuo • today at 12:07 PM

100% this. I used to default to Postgres for everything. But seeing SQLite handle concurrency so well now—plus having built-in BM25 search and vector support—it really is all you need for these kinds of architecture.

vixalien • today at 1:26 PM

Has anyone actually used PGLite[0]?

[0]: https://pglite.dev/

sgloutnikov • yesterday at 6:14 PM

It's close enough that DBOS does support SQLite. [0] The default for prototyping is SQLite, but sure, you could run it in production if you wanted.

Obligatory list of workflow engines and libraries because it's such a common need that a lot have rolled their own. [1]

[0] https://docs.dbos.dev/python/tutorials/database-connection

[1] https://github.com/meirwah/awesome-workflow-engines

Xcelerate • yesterday at 6:05 PM

Haha, I just started doing this on my own. Found it helps the agents preserve state better. I typically ask them to design a DAG first based on a set of specifications and then execute it (each step stores something in a SQLite DB). Iteration is pretty simple then because I just ask for a tweak to one or two steps of the DAG, and then to re-run.
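
A rough sketch of that pattern, under my own assumptions about its shape: each step's output is stored in a hypothetical `step_results` table, cached steps are skipped, and tweaking one step re-runs it plus everything downstream. The steps are assumed to be listed in topological order and to return JSON-serializable values:

```python
import json
import sqlite3


def run_dag(conn, steps, rerun=()):
    """Run steps in order, reusing stored outputs; return names executed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS step_results ("
        " name TEXT PRIMARY KEY,"
        " output TEXT NOT NULL)"
    )
    results = {}
    executed = []
    for name, deps, fn in steps:  # steps must be in topological order
        row = conn.execute(
            "SELECT output FROM step_results WHERE name = ?", (name,)
        ).fetchone()
        # A step is stale if it was tweaked or any dependency just re-ran.
        stale = name in rerun or any(d in executed for d in deps)
        if row is not None and not stale:
            results[name] = json.loads(row[0])
            continue
        results[name] = fn(*(results[d] for d in deps))
        with conn:  # persist each step's output as soon as it finishes
            conn.execute(
                "INSERT OR REPLACE INTO step_results (name, output) VALUES (?, ?)",
                (name, json.dumps(results[name])),
            )
        executed.append(name)
    return executed


steps = [
    ("load", (), lambda: [1, 2, 3]),
    ("double", ("load",), lambda xs: [x * 2 for x in xs]),
    ("total", ("double",), lambda xs: sum(xs)),
]
conn = sqlite3.connect(":memory:")
first = run_dag(conn, steps)                     # everything runs
second = run_dag(conn, steps)                    # everything is cached
third = run_dag(conn, steps, rerun={"double"})   # tweak one step
print(first, second, third)
```

Because every output lands in the database as soon as its step finishes, an interrupted run can resume from the last completed step instead of starting over.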

Funny how people are independently converging on similar patterns of "what works" here. Still feels like we're in the wild west with all these ad-hoc patterns of agent orchestration that people are coming up with.

yokoprime • yesterday at 7:08 PM

If you're just doing workflows from a single node, I guess it can be OK as long as there's a single writer. But for scaling across multiple servers, it clearly is not all you need.

oulipo2 • today at 4:20 PM

Obelisk also supports Postgres. When you're using it with Postgres, what are the differences from DBOS? Are there any that would be significant?

(I'm already using Postgres, so I don't really need a sqlite-based durable workflow engine, so looking to know how to choose between DBOS and Obelisk)

dev_l1x_be • today at 7:53 AM

I'm starting to think that SQLite is all I need to store data. When there is a chance of non-coordinated writes that I can distribute among servers (or even a range-based ID), SQLite is my first idea. With durable storage backups this works amazingly well.

aykutseker • today at 6:59 AM

Storage never ended up being the thing we worried about. The painful bits started once a workflow could touch external systems. Replaying state is one thing. Replaying a charge or an email is another. How are you dealing with that?
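
One common answer to this kind of question, offered as a sketch and not as what the article or any particular engine does, is to key every side effect with an idempotency key recorded in the database. The `effects` table, `run_once`, and `fake_charge` are hypothetical names:

```python
import sqlite3


def run_once(conn, key, effect):
    """Run effect(key) unless a completed result for key is already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS effects ("
        " key TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"
        " result TEXT)"
    )
    row = conn.execute(
        "SELECT status, result FROM effects WHERE key = ?", (key,)
    ).fetchone()
    if row is not None and row[0] == "done":
        return row[1]  # replay: reuse the recorded result, skip the call
    with conn:
        # Record intent first, so a crash leaves a visible 'pending' row.
        conn.execute(
            "INSERT OR IGNORE INTO effects (key, status) VALUES (?, 'pending')",
            (key,),
        )
    # The key is passed along so an external API that supports idempotency
    # keys can deduplicate a retry after a crash between call and update.
    result = effect(key)
    with conn:
        conn.execute(
            "UPDATE effects SET status = 'done', result = ? WHERE key = ?",
            (result, key),
        )
    return result


calls = []


def fake_charge(key):
    # Stand-in for a real payment call; records how often it was invoked.
    calls.append(key)
    return f"receipt-for-{key}"


conn = sqlite3.connect(":memory:")
a = run_once(conn, "order-42:charge", fake_charge)
b = run_once(conn, "order-42:charge", fake_charge)
print(a == b, len(calls))
```

The local table only prevents repeats after a recorded success. A crash between the external call and the update still leaves a `pending` row, so the remote system has to honor the same key for the retry to be safe.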

localhoster • yesterday at 7:01 PM

Idk if this article was vibe written or the author just "got adjusted", but it clearly reads that way, and it's unreadable. Man, this is becoming annoying.

mburaksayici • yesterday at 8:56 PM

Agreed on the point. I needed a NoSQL version for similar uses, so I used TinyDB: https://mburaksayici.com/blog/2024/09/21/easy-to-use-nosql-p...

skybrian • yesterday at 8:42 PM

Instead of "just use Litestream," I'd like to see a review of different object stores one could use and which ones work well with Litestream. Is there a nice object store I could run in another Linux VM? As a hobbyist, which services providing an S3-like API make the most sense?

vkaku • today at 1:21 AM

Back in the day, I wrote a simple job queue with SQLite

https://github.com/guilt/squeue

It did the job, was fairly easy to use.

emodendroket • today at 6:05 AM

SQLite is an underrated tool for how powerful it really is and probably people don't think of it often enough.

vultour • yesterday at 11:54 PM

The GitHub statistics for the project this website represents are insane. It has a sole author that has averaged approximately 20,000 lines of code every week in the past month. How do you even maintain that alone?

gunnarmorling • yesterday at 9:18 PM

Related piece I wrote some time ago: https://www.morling.dev/blog/building-durable-execution-engi...

sharts • today at 6:49 AM

For something so widely deployed you’d think it’d be included in Claude/Codex/etc for

dannypdx • today at 1:33 AM

All this SQLite hate from big vector db... leave SQLite alone!

flying_sheep • yesterday at 10:41 PM

Cloudflare Durable Objects are implemented with SQLite (or some variant of it)

orliesaurus • yesterday at 8:24 PM

Surprised no one has mentioned Turbopuffer yet [1] which natively supports dense vector similarity and BM25 keyword indexes out of the box

[1] https://turbopuffer.com/

simplestates • today at 1:57 AM

Good framing. SQLite is often enough when the main problem is making workflow state durable, inspectable and easy to recover.

0x59 • yesterday at 7:07 PM

Big complex data model with ambiguous query patterns? Postgres

Small, well defined, data model with known query patterns? Bespoke model

There is probably a place for SQLite; my project space just hasn't aligned well with it so far.

ryanisnan • today at 6:52 AM

Sweet I get to tell my team we can move off of dapr workflows

fathermarz • yesterday at 10:41 PM

Excellent write-up; it inspired me for our next IA design run. After reading Fly’s Litestream work, it makes me think this is a solid option.

delduca • today at 12:33 AM

No, mmap is all you need.

netik • yesterday at 7:19 PM

Until you scale past one machine…

bze12 • yesterday at 7:33 PM

Isn’t this very similar to cloudflare durable objects & workflows?

dnnddidiej • today at 2:53 AM

Butchering the Beatles song again.

nodesocket • yesterday at 9:56 PM

The biggest annoyance about SQLite for me is no ability to:

    ALTER TABLE users MODIFY COLUMN…

    ALTER TABLE users ALTER COLUMN…

    ALTER TABLE users ADD CONSTRAINT…

You have to create a new temporary table with correct schema, copy data into this new table, drop the old table, and then rename the temporary table.
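
The rebuild described above can be sketched as follows. This is a simplified illustration with hypothetical table contents: it uses a regular table named `users_new` as the staging table, and it ignores indexes, triggers, views, and foreign keys, which the fuller procedure in the SQLite documentation accounts for:

```python
import sqlite3

# isolation_level=None: transactions are managed explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# Goal: make email NOT NULL UNIQUE, which ALTER TABLE cannot do in place.
conn.execute("BEGIN")
try:
    # 1. Create the new table with the desired schema.
    conn.execute(
        "CREATE TABLE users_new ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL UNIQUE)"
    )
    # 2. Copy the data across.
    conn.execute("INSERT INTO users_new (id, email) SELECT id, email FROM users")
    # 3. Drop the old table.
    conn.execute("DROP TABLE users")
    # 4. Rename the new table into place.
    conn.execute("ALTER TABLE users_new RENAME TO users")
    conn.execute("COMMIT")
except Exception:
    conn.execute("ROLLBACK")  # schema changes are transactional in SQLite
    raise

print(conn.execute("SELECT id, email FROM users").fetchall())
```

Running all four steps inside one transaction means a failure partway through leaves the original table untouched.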

shevy-java • today at 7:25 AM

Hmmm. SQLite is great, but I remember years ago, at a university cluster, I had to populate a SQL database via tons of INSERT statements from genomic/meta-genomic workflows. PostgreSQL was so much faster just at that particular action (inserting data) that it convinced me that SQLite may be useful for many, many applications, but for "big data"(sets), PostgreSQL is simply better.

3dedb728-3f77 • yesterday at 10:25 PM

Is this just an AWS ad?
