Subquery in PostgreSQL | PostgreSQL Tutorial

A subquery is a SELECT statement nested inside another SQL statement, enclosed in parentheses. PostgreSQL evaluates the inner query first and passes its result to the outer query as a condition, a derived table, or a scalar value. Subqueries let you break a complex retrieval problem into composable steps without creating intermediate tables or materialized views.

Syntax

The most common placement is inside a WHERE clause:

SELECT select_list
FROM table_name
WHERE column operator (
  SELECT column
  FROM other_table
  WHERE condition
);

Subqueries can also appear in the FROM clause (derived table), the SELECT list (scalar subquery), and inside HAVING clauses.

Practical Example

Create two tables — products and warehouses — that represent an inventory domain:

CREATE TABLE warehouses (
  warehouse_id  SERIAL PRIMARY KEY,
  region        VARCHAR(50) NOT NULL,
  capacity      INT         NOT NULL
);

CREATE TABLE products (
  product_id    SERIAL PRIMARY KEY,
  name          VARCHAR(100) NOT NULL,
  warehouse_id  INT REFERENCES warehouses (warehouse_id),
  stock_qty     INT         NOT NULL DEFAULT 0,
  unit_cost     NUMERIC(10, 2)
);

INSERT INTO warehouses (region, capacity) VALUES
  ('North', 500),
  ('South', 300),
  ('East',  800);

INSERT INTO products (name, warehouse_id, stock_qty, unit_cost) VALUES
  ('Widget A',    1, 120, 4.50),
  ('Widget B',    1,  80, 6.00),
  ('Gadget Pro',  2, 200, 22.00),
  ('Gadget Lite', 2,  40, 11.00),
  ('Component X', 3, 350, 1.75),
  ('Component Y', 3,  90, 3.25);

Single-value subquery with a comparison operator

Find all products stored in the highest-capacity warehouse without hard-coding its ID:

SELECT name, stock_qty
FROM products
WHERE warehouse_id = (
  SELECT warehouse_id
  FROM warehouses
  ORDER BY capacity DESC
  LIMIT 1
);

     name      | stock_qty
---------------+-----------
 Component X   |       350
 Component Y   |        90

PostgreSQL runs the inner query first (returning warehouse_id = 3), then filters the outer query with that value.

Multi-value subquery with IN

Find all products stored in warehouses with more than 400 units of capacity:

SELECT name, stock_qty, unit_cost
FROM products
WHERE warehouse_id IN (
  SELECT warehouse_id
  FROM warehouses
  WHERE capacity > 400
)
ORDER BY stock_qty DESC;

     name      | stock_qty | unit_cost
---------------+-----------+-----------
 Component X   |       350 |      1.75
 Widget A      |       120 |      4.50
 Component Y   |        90 |      3.25
 Widget B      |        80 |      6.00

The subquery returns a set of IDs; the outer query filters with IN.

Subquery Positions

Position	When to use	Constraint
`WHERE column = (subquery)`	Scalar comparison	Must return exactly one row and one column
`WHERE column IN (subquery)`	Match any of several values	Must return one column; multiple rows allowed
`FROM (subquery) AS alias`	Derived table for aggregation	Must have an alias
`SELECT (subquery)`	Scalar value per outer row	Must return exactly one row and one column
`HAVING aggregate op (subquery)`	Filter groups against an aggregate	Same as scalar comparison

Testing with Vela

Subqueries that reference large tables — especially correlated subqueries — can have unexpected performance characteristics in production once data grows. Vela’s database branching lets you clone production data to a branch, run EXPLAIN ANALYZE against the exact dataset, and tune the query before it ever touches the live database. Run the branch query during off-peak hours without any risk to production queries.

Production Tips

Use a scalar subquery (returns one row, one column) with =; use IN or ANY when the subquery may return multiple rows.
Correlated subqueries re-execute for every outer row. On large tables, replace them with a JOIN or a pre-aggregated CTE.
Profile all non-trivial subqueries with EXPLAIN ANALYZE to see actual row counts and execution times before deploying.
When a subquery appears more than once in a query, refactor it into a CTE (WITH name AS (...)) so PostgreSQL evaluates it once and the code is easier to maintain.
In NOT IN patterns, be careful of NULL values in the subquery result — a single NULL in the set causes the entire NOT IN condition to return NULL for every row. Prefer NOT EXISTS when NULLs are possible.

Frequently Asked Questions

What is a subquery in PostgreSQL?

A subquery (also called a nested query or inner query) is a SELECT statement enclosed in parentheses that appears inside another SQL statement. PostgreSQL evaluates the subquery first and uses its result to drive the outer query — as a filter value, a derived table, or a scalar expression.

Does a subquery lock the tables it reads?

A subquery acquires the same locks as a standalone SELECT — typically a ROW SHARE (or no lock in read-committed mode for plain reads). It does not block concurrent writes unless the outer statement itself is a DML command or you are inside a serializable transaction.

Can a subquery return multiple rows?

Yes. When a subquery returns multiple rows, use it with the IN operator (WHERE column IN (subquery)) or with the ANY/ALL operators. If you use a scalar comparison operator such as = or > with a subquery that returns more than one row, PostgreSQL raises an error.

Can I use IF EXISTS with a subquery?

There is no IF EXISTS guard for subqueries. To test whether a subquery returns any rows, use the EXISTS operator: WHERE EXISTS (SELECT 1 FROM ...). EXISTS returns true as soon as the first matching row is found and stops scanning.

What is the safest way to use subqueries in production?

Profile every non-trivial subquery with EXPLAIN ANALYZE before deploying. Correlated subqueries re-execute for every outer row and can become expensive on large tables. When the same subquery is referenced more than once, move it to a CTE (WITH clause) so PostgreSQL evaluates it once.