> If it takes a long time to copy a database and it gets updated midway through, rsync may give me an invalid database file. The first half of the file is pre-update, the second half file is post-update, and they don’t match. When I try to open the database locally, I get an error
Of course! You can't copy the file of a running, active db receiving updates, that can only result in corruption.
For replicating sqlite databases safely there is
> You can't copy the file of a running, active db receiving updates, that can only result in corruption
To push back against "only" -- there is actually one scenario where this works. Copying a file or a subvolume on Btrfs or ZFS can be done atomically, so if it's an ACID database or an LSM tree, in the worst case it will just rollback. Of course, if it's multiple files you have to take care to wrap them in a subvolume so that all of them are copied in the same transaction, simply using `cp --reflink=always` won't do.
Possibly freezing the process with SIGSTOP would yield the same result, but I wouldn't count on that
The built-in .backup command is also intended as an official tool for making “snapshotted” versions of a live db that can be copied around.
While I run and love litestream on my own system, I also like that they have a pretty comprehensive guide on how to do something like this manually, via built-in tools: https://litestream.io/alternatives/cron/
Litestream is really cool! I'm planning to use it to backup and restore my SQLite in the container level, just like what that ex-google guy who started a startup of a small KVM and had a flood in his warehouse while on vacation did. If I'm not mistaken. I would link here the perfect guide he wrote but there's 0 chance I'll find it. If you understand the reference please post the link.
Litestream looks interesting but they are still in beta, and seem to have not had a release in over a year, although SQLite doesn't move that quickly.
Is Litestream still an active project?
>You can't copy the file of a running, active db receiving updates, that can only result in corruption
There is a slight 'well akshully' on this. A DB flush and FS snapshot where you copy the snapshotted file will allow this. MSSQL VSS snapshots would be an example of this.
Litestream is overkill for this; "VACUUM foo INTO foo.db;" is all you need.
"If it takes a long time to copy a database and it gets updated midway through, rsync may give me an invalid database file"
Wot? There are multiple ways of snapshotting/checkpointing, starting at the virty level and working on down the stack through the application level.
If the corruption is detectable and infrequent enough for your purposes, then it does work, with a simple “retry until success” loop. (That’s how TCP works, for example.)
> Of course! You can't copy the file of a running, active db receiving updates, that can only result in corruption
Do people really not understand how file storage works? I cannot rightly apprehend the confusion of ideas that would produce an attempt to copy a volatile database without synchronization and expect it to work.
"For replicating sqlite databases safely there is (litestream) ..."
A reminder that litestream can run over plain old SFTP[1] which means you can stream database replication to just about any UNIX endpoint over SSH.
I have a favorite[2] but any SFTP server will do ...
[1] https://github.com/benbjohnson/litestream/issues/140
[2] https://www.rsync.net/resources/notes/2021-q3-rsync.net_tech...