We built a PDF processing tool and faced this exact question early on.
For our use case — merge, split, compress — we went fully stateless. Files are processed in memory and never stored. No database needed at all.
The only time a database becomes necessary is when you need user accounts, history, or async jobs for large files. For simple tools, a database is often just added complexity.
The real question isn't "do you need a database" but "do you need state" — and often the answer is no.
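To make the "processed in memory and never stored" idea concrete, here is a minimal Python sketch of a stateless handler: bytes in, bytes out, nothing written to disk or a database. It is only an illustration under assumptions, not the tool's actual code: gzip stands in for real PDF compression, and `compress_upload` is a hypothetical name.

```python
import gzip
import io


def compress_upload(data: bytes) -> bytes:
    """Compress an uploaded file entirely in memory and return the result.

    Assumption: gzip is a stand-in for whatever PDF-specific
    compression a real tool would apply.
    """
    buf = io.BytesIO()  # in-memory buffer, no temp file on disk
    # mtime=0 keeps the gzip header identical for identical input
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()


# The function keeps no state between calls: the same input always
# yields the same output, and nothing outlives the request.
payload = b"example document body " * 100
result = compress_upload(payload)
```

Because the handler holds no state, any number of identical workers can sit behind a load balancer without coordinating with each other.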
> The real question isn't "do you need a database" but "do you need state" — and often the answer is no.
This is a solid takeaway and applies to a lot of domains. Great observation.
> The real question isn't "do you need a database" but "do you need state" — and often the answer is no.
We have a bunch of these applications and they are a joy to work with.
Funny enough, even if you do have a database: if you're wondering whether you need caches that hold state in your application server, the answer is, kindly, fuck no. Really, really horrible scaling problems and bugs lie down that path.
There are use cases for storing expensive-to-compute state in Varnish (HTTP caching), memcached/Redis (expensive, complex data structures like a friendship graph), or Elasticsearch/OpenSearch (aggregated, expensive full-text search), but caching SQL results in an application server beyond a single transaction because the database is "slow" brings nothing but pain in the future. I've spent so much energy working around decisions born out of plain bad schema design and tuning...
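A small Python sketch of the kind of bug this leads to, using only the standard library. Everything here is an assumption made for illustration: the `AppServer` class, the `users` table, and simulating two server processes as two objects sharing one in-memory SQLite database. Each "server" caches query results locally, so a write through one leaves the other serving stale data.

```python
import sqlite3


class AppServer:
    """Hypothetical app server that caches SQL results in process memory."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}  # per-process cache, invisible to other servers

    def get_email(self, user_id):
        if user_id not in self.cache:
            row = self.conn.execute(
                "SELECT email FROM users WHERE id = ?", (user_id,)
            ).fetchone()
            self.cache[user_id] = row[0]
        return self.cache[user_id]

    def set_email(self, user_id, email):
        self.conn.execute(
            "UPDATE users SET email = ? WHERE id = ?", (email, user_id)
        )
        self.conn.commit()
        self.cache[user_id] = email  # only THIS server's cache is refreshed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'old@example.com')")
conn.commit()

server_a = AppServer(conn)
server_b = AppServer(conn)

server_b.get_email(1)                     # warms server B's cache
server_a.set_email(1, "new@example.com")  # write goes through server A

stale = server_b.get_email(1)  # served from B's cache
fresh = conn.execute("SELECT email FROM users WHERE id = 1").fetchone()[0]
print(stale, fresh)
```

The database has the new value while server B keeps answering with the old one, and which answer a user sees depends on which server the load balancer picks.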