logoalt Hacker News

swid05/04/20251 replyview on HN

The comment about = is mine and I thought about rewriting that part of the comment after the fact. It is soooo hard to explain these things without writing a book.

The reason = mean it runs once is because the outer query will only run once, and in this case that query, when using =, can only delete based on a single id. But if that outer query was subquery in a context where it could be run more than once, you are back to where you started. Hence me saying their fix was sort of incidental.


Replies

Diggsey05/04/2025

I think your logic is flawed (even if PostgreSQL may behave this way in practice).

First, let's make sure we're talking about the same two examples.

A:

    DELETE FROM task_queue
    WHERE id = (
      SELECT id FROM task_queue
      WHERE queue_group_id = 15
      LIMIT 1
      FOR UPDATE SKIP LOCKED
    )
B:

    DELETE FROM task_queue
    WHERE id IN (
      SELECT id FROM task_queue
      WHERE queue_group_id = 15
      LIMIT 1
      FOR UPDATE SKIP LOCKED
    )
You seem to be saying that B may exhibit the problem whilst A does not. (ie. there is a different between using `=` vs `IN`). I would like to see some documentation justifying that.

Here's my logic:

- In both cases, the outer query is run once.

- In both cases the outer WHERE clause is evaluated for each row in `task_queue`.

- In both cases, it is up to the optimizer whether the result of the subquery is materialized or not.

- In both cases, if the subquery is not materialized, multiple rows in the outer query may match the condition.

In practice, it may be that the optimizer always materializes uncorrelated subqueries on the RHS of an `=` expression. My contention is whether that is a formal guarantee.

show 1 reply