The EXPLAIN statement reveals the query execution plan PostgreSQL’s query planner generated for a statement without running it. The output is a tree of operations—scans, joins, sorts, aggregations—each annotated with cost estimates and expected row counts. Reading this output is the starting point for query optimization.
Syntax
EXPLAIN [ (option [, ...]) ] sql_statement;
Key options:
| Option | Effect |
|---|---|
ANALYZE | Executes the statement and adds real timing and actual row counts |
BUFFERS | Shows shared buffer hits, reads, and dirtied pages (requires ANALYZE) |
VERBOSE | Adds column-level output lists and per-node details |
COSTS | Includes cost estimates — on by default |
FORMAT | Output format: TEXT (default), JSON, XML, YAML |
Practical Example
Create a products and orders table to generate a realistic plan:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
sku VARCHAR(20) NOT NULL UNIQUE,
name VARCHAR(100) NOT NULL,
unit_price NUMERIC(10,2) NOT NULL
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(id),
quantity INTEGER NOT NULL,
placed_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Populate with sample data
INSERT INTO products (sku, name, unit_price)
SELECT 'SKU-' || i, 'Product ' || i, (random() * 200 + 5)::NUMERIC(10,2)
FROM generate_series(1, 500) AS i;
INSERT INTO orders (product_id, quantity, placed_at)
SELECT (random() * 499 + 1)::INT, (random() * 10 + 1)::INT,
now() - (random() * 365 || ' days')::INTERVAL
FROM generate_series(1, 10000);
A sequential scan on a small table is expected and efficient:
EXPLAIN SELECT * FROM products;
QUERY PLAN
------------------------------------------------------------------
Seq Scan on products (cost=0.00..12.00 rows=500 width=56)
Filter by primary key — the planner switches to an index scan:
EXPLAIN SELECT * FROM products WHERE id = 42;
QUERY PLAN
-----------------------------------------------------------------------------
Index Scan using products_pkey on products (cost=0.28..8.29 rows=1 width=56)
Index Cond: (id = 42)
Reading EXPLAIN Output
Each line is one plan node. Indentation shows parent-child relationships — child nodes feed their results up to parent nodes.
EXPLAIN
SELECT p.sku, p.name, o.quantity, o.placed_at
FROM orders o
JOIN products p ON p.id = o.product_id
WHERE o.placed_at >= now() - interval '30 days'
ORDER BY o.placed_at DESC;
A typical output with indexes on placed_at and product_id:
Sort (cost=245.80..248.30 rows=1000 width=72)
Sort Key: o.placed_at DESC
-> Hash Join (cost=14.25..195.50 rows=1000 width=72)
Hash Cond: (o.product_id = p.id)
-> Index Scan using idx_orders_placed_at on orders (cost=0.29..155.40 rows=1000 width=24)
Index Cond: (placed_at >= (now() - '30 days'::interval))
-> Hash (cost=9.00..9.00 rows=500 width=56)
-> Seq Scan on products (cost=0.00..9.00 rows=500 width=56)
Key fields to understand:
- Seq Scan — reads every row in the table, appropriate for small tables or when most rows match.
- Index Scan — uses an index to jump to matching rows; good for high-selectivity filters.
- Bitmap Index Scan + Bitmap Heap Scan — batches multiple index lookups before visiting the heap; useful for moderate-selectivity filters.
- cost=X..Y —
Xis startup cost (before first row),Yis total cost (all rows returned). - rows=N — planner’s estimate of matching rows; large gaps vs reality indicate stale statistics.
- width=N — estimated average row width in bytes.
Using FORMAT JSON for Visualization
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT p.name, SUM(o.quantity) AS total_sold
FROM orders o
JOIN products p ON p.id = o.product_id
GROUP BY p.name
ORDER BY total_sold DESC
LIMIT 10;
The JSON output can be pasted into tools like explain.depesz.com or pgMustard for visual plan inspection.
Testing with Vela
The best way to validate an index or query rewrite is to test it against a production-scale dataset without modifying production. Create a Vela branch cloned from your production database, add the candidate index with CREATE INDEX CONCURRENTLY, and run EXPLAIN ANALYZE on the branch. Compare the plan before and after — buffer reads, row estimates, and total cost — then promote the change to production only once the plan looks correct.
Production Tips
- A large gap between estimated and actual rows (visible in
EXPLAIN ANALYZE) usually means stale statistics. RunANALYZE table_nameto refresh them. - A sequential scan on a large table with a narrow filter is a signal for a missing index. A sequential scan on a small table (< a few hundred rows) is usually the right plan.
- Use
EXPLAIN (ANALYZE, BUFFERS)to distinguish I/O-bound queries (highshared blks read) from CPU-bound ones (high time without many reads). - Run
EXPLAIN ANALYZEat least twice on a warm database—the first run may include cold buffer-cache effects that inflate I/O numbers. - Wrap DML in a transaction (
BEGIN; EXPLAIN ANALYZE ...; ROLLBACK;) to measure real write performance without committing changes.