
halfcat • yesterday at 11:50 PM • 1 reply

So if I understand this correctly, there are three main approaches:

1. SKIP LOCKED family

2. Partition-based + DROP old partitions (no VACUUM required)

3. TRUNCATE family (PgQue’s approach)

And the benefit of PgQue is the failure mode, when a worker gets stuck:

- Table grows indefinitely, instead of

- VACUUM-starved death spiral

And a table growing is easier to reason about operationally?

Replies

samokhvalov • today at 12:03 AM

Taxonomy is correct. But the benefit isn't "table grows indefinitely vs. vacuum-starved death spiral": in all three approaches, if the consumer falls behind, events accumulate.

The real distinction is cost per event under MVCC pressure. When the xmin horizon is held back (idle-in-transaction session, long-running writer, lagging logical slot, physical standby with hot_standby_feedback=on):

1. SKIP LOCKED systems: every DELETE or UPDATE creates a dead tuple that autovacuum can't reclaim while the xmin horizon is held back. Indexes bloat. Each subsequent FOR UPDATE SKIP LOCKED scan has to step over the accumulated dead tuples, so it keeps getting more expensive.

2. Partition + DROP (some SKIP LOCKED systems already support it, e.g. PGMQ): old partitions drop cleanly, but the active partition is still DELETE-based and accumulates dead tuples. It's the same pathology within the active window, just bounded by retention. Also, DROPping and attaching/detaching partitions is more painful than working with a few existing ones and using TRUNCATE.

3. PgQue / PgQ: active event table is INSERT-only. Each consumer remembers its own pointer (ID of last event processed) independently. CPU stays flat under xmin pressure.
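To make the "cost per event" point concrete, here is a toy in-memory model in Python. It is not Postgres and not PgQue's actual code. It assumes that a DELETE-based dequeue has to step over every dead tuple that vacuum hasn't removed yet, and that vacuum reclaims nothing while the xmin horizon is held. The names `DeleteQueue`, `AppendLog`, and `run` are made up for illustration, and table rotation / TRUNCATE is left out.

```python
from dataclasses import dataclass, field


@dataclass
class DeleteQueue:
    """Toy DELETE-based queue: every consumed row becomes a dead tuple."""
    live: list = field(default_factory=list)
    dead: int = 0            # dead tuples still physically present
    tuples_visited: int = 0  # rough proxy for scan cost

    def produce(self, event_id):
        self.live.append(event_id)

    def consume(self):
        # Assumption: the scan steps over all dead tuples before a live one.
        self.tuples_visited += self.dead + 1
        self.dead += 1
        return self.live.pop(0)

    def vacuum(self, xmin_held):
        # Assumption: vacuum reclaims everything unless the horizon is held.
        if not xmin_held:
            self.dead = 0


@dataclass
class AppendLog:
    """Toy INSERT-only log; each consumer keeps its own pointer."""
    events: list = field(default_factory=list)
    cursors: dict = field(default_factory=dict)  # consumer -> events read so far
    tuples_visited: int = 0

    def produce(self, event_id):
        self.events.append(event_id)

    def consume(self, consumer):
        pos = self.cursors.get(consumer, 0)
        self.tuples_visited += 1  # reads exactly the next event
        self.cursors[consumer] = pos + 1
        return self.events[pos]


def run(n_events, xmin_held, vacuum_every=100):
    """Return (tuples visited by DeleteQueue, tuples visited by AppendLog)."""
    queue, log = DeleteQueue(), AppendLog()
    for i in range(n_events):
        queue.produce(i)
        queue.consume()
        log.produce(i)
        log.consume("worker")
        if (i + 1) % vacuum_every == 0:
            queue.vacuum(xmin_held)
    return queue.tuples_visited, log.tuples_visited


print(run(1000, xmin_held=False))  # (50500, 1000)
print(run(1000, xmin_held=True))   # (500500, 1000)
```

In this model the append-only side visits one tuple per event regardless of the horizon, while the DELETE-based side goes from bounded work between vacuums to work that grows with every event consumed during the hold.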

I posted a few more benchmark charts on my LinkedIn and Twitter, and plan to post an article explaining all this with examples. Among them was a benchmark with xmin held for 30 min at 2000 ev/s: PgQue sustained the full producer rate at ~14% CPU; SKIP LOCKED queues were pinned at 55-87% CPU with throughput dropping 20-80%, and, what's even worse, after the xmin horizon was unblocked, not all of them recovered and caught up on consumption within the next 30 min.
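For a sense of scale, a back-of-the-envelope calculation from those numbers, as a rough sketch only. It assumes the throughput drop applies uniformly for the whole 30-minute hold, which the benchmark does not claim; the helper names are hypothetical.

```python
def events_during_hold(rate_per_s: int, hold_minutes: int) -> int:
    """Events produced while the xmin horizon is held."""
    return rate_per_s * hold_minutes * 60


def consumed_at_drop(produced: int, drop_percent: int) -> int:
    """Events consumed if throughput is down by drop_percent for the whole hold.

    Assumption: a constant drop over the window, used only to bound the range.
    """
    return produced * (100 - drop_percent) // 100


produced = events_during_hold(2000, 30)
print(produced)                         # 3600000
print(consumed_at_drop(produced, 20))   # 2880000
print(consumed_at_drop(produced, 80))   # 720000
```

So the producer emits 3.6 million events during the hold. Under the assumption above, a DELETE-based queue consumes roughly 0.72 to 2.88 million of them, each leaving at least one dead tuple that can't be reclaimed until the horizon moves, and the rest is backlog to catch up on afterwards.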
