> Multiple copies of SQLite linked into the same application
I had recent SQLite corruption, and I suspect it was this - I was accessing an SQLite database from the same python process using both the builtin sqlite3 package, and also the third party apsw library
I've never corrupted an SQLite database _file_, but an FTS5 index is remarkably easy to mess up if you don't read and understand the docs to the letter and which parts of them apply to your exact use case. When you do a thing wrong with FTS5, you either get a generic ambiguous error, or silent corruption of the index. It's not a black-box extension to you can just enable with no understanding of how it works under the hood, I learned that the hard way.
Interesting that it doesn't specifically call out Anti-Virus scanning (which does occasionally result in at least one of these scenarios). I've seen many SQLite database become corrupted and the best you can do is have a backup.
I just finished building an open-source project using SQLite, and this makes me a little nervous. Is SQLite generally considered a reliable system? I've only had one experience with the DB becoming corrupted and that was because I used the Task Manager to force kill the process while it was running under a heavy load, and I felt that was understandable.
To make the corruption more efficient, store the file on a RAID.
Interesting title for official SQLite documentation :)
> One example of this occurred circa 2013-08-30 on the canonical repository for the Fossil DVCS. In that event, file descriptor 2 (standard error) was being erroneously closed (by stunnel, we suspect) prior to sqlite3_open_v2() so that the file descriptor used for the repository database file was 2. Later, an application bug caused an assert() statement to emit an error message by invoking write(2,...). But since file descriptor 2 was now connected to a database file, the error message overwrote part of the database.
Kind of crazy that this is an issue in modern operating systems. There are just so many ways to avoid this obvious footgun of an API design. stdin/out/err should be reserved file descriptors. In fact, why reuse file descriptors at all? Just count up.
Article date is Jan 2022. This changes when SQLite runs as WASM in a browser — a context that only became properly viable with OPFS synchronous access handles in mid-2022.
Related. Others?
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=41846796 - Oct 2024 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=33503555 - Nov 2022 (1 comment)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=31214131 - April 2022 (139 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=16579986 - March 2018 (10 comments)
How to Corrupt an SQLite Database File - https://news.ycombinator.com/item?id=6502229 - Oct 2013 (63 comments)