logoalt Hacker News

andersmurphytoday at 5:00 PM1 replyview on HN

I've updated the article with the correct rowid alias (integer not int) so the rowid version is now 715ms. I've also added an example of rowid and a secondary index UUID4, and that also seems to be bad for performance (as although it's not a clustered index it's still random inserts into a b-tree).


Replies

adityaathalyetoday at 6:53 PM

Well, I expect to never need WITHOUT ROWID. And even if such an arcane situation hits my system, WITHOUT ROWID has so many ifs and buts that I'll probably elect to eat the $$$ cost of running an un-optimised normie SQLite as far as possible.

cf. https://sqlite.org/withoutrowid.html

> The WITHOUT ROWID syntax is an optimization. It provides no new capabilities. Anything that can be done using a WITHOUT ROWID table can also be done in exactly the same way, and exactly the same syntax, using an ordinary rowid table. The only advantage of a WITHOUT ROWID table is that it can sometimes use less disk space and/or perform a little faster than an ordinary rowid table.

As of now, I am doing the following in my (Bitemporal data system) experiment (When will it see the light of day? Nobody knows.).

All data are globally uniquely identified by a UUIDv7. However all tables have `rowid` integer primary key asc (which is just an alias for SQLite's autoincrement int id). The `rowid` is the basis for joins, and is the foreign key reference. This lets me offload some useful disambiguation work to the DB as well as have it enforce global (across data systems) record uniqueness guarantees, while retaining local (within process) query efficiency by retaining the ability to use integer rowids.

While the idealised insert performance in your bench is indeed mind-boggling, the DB Schema isn't doing anything CPU-intensive during inserts (checks, constraints, triggers etc.). My schema / query pattern yields comparatively meagre throughput, but I am happy with the ballpark it has landed in, given all the work I'm making SQLite do for me on each `assert!` and `redact!`.

cf. my dirty-but-useful-enough bench, with production-like record content:

A poor man's napkin-mathy, append-only SQLite write/read benchmark

https://gist.github.com/adityaathalye/3c8195dc70626b33c23867...

Summary:

  ;; Okay, I think I can live with this...

  ;; - "facts" table: 12M+ records
  ;;     - single process writes to it
  ;;     - ~ 400 transactions/second
  ;;     - append-only table, enforced via SQLite "before" triggers
  ;; - "now" table: 
  ;;     - updates on every assert/redact on "facts" table, via triggers
  ;;     - currently at "limit case": for each read it is empty, or very small, because writes do back-to-back assert/redact of the same fact
  ;;     - gets reads from two reader threads (evenly split)
  ;;     - ~41,000 reads/second
  ;; - all reads are concurrent with writes (poor man's futures)