logoalt Hacker News

hedorayesterday at 10:47 PM0 repliesview on HN

I want to dig into this "free" workflow_error.sql. I'll assume 1024 byte workflow job descriptors, and the article's steady state of 10,000 jobs per second.

Possibility one: There is one index on the table, and it is the created_at TS. This query has to scan 10,000 jobs/sec * 60 seconds * 60 minutes * 24 hours * 31 days * 1024 bytes / job = 25,543 GB.

A KV store would scan exactly that much.

Possibility two: The primary key is refined to (state, timestamp). Assume a 1% failure rate. Now, we "only" scan and return 255 GB. A key value store would scan exactly that much. (This is probably the right physical design).

Possibility three: The primary key is (timestamp), and there's a secondary index on state. I guess we do an index join, where one side of the join is 25,543 GB, and the other side is one unsorted bucket with 255GB * number of months the system has been in operation in it.

A KV store wouldn't let you express that.

Now, what other ad hoc queries are we supposed to efficiently support over a one month lookback? Also, what does PG do if you tell it to scan 25TB at the same time as it's inserting 10MB/sec at 10K TPS? How is vacuuming configured?