EXPLAIN in PostgreSQL | PostgreSQL Tutorial

Q: What does EXPLAIN show in PostgreSQL?

EXPLAIN shows the execution plan the query planner chose: which tables are scanned and how (sequential scan, index scan, bitmap index scan), how tables are joined (nested loop, hash join, merge join), and estimated startup and total costs alongside expected row counts. It does not execute the query.

Q: Does EXPLAIN lock the table or affect database performance?

EXPLAIN without ANALYZE does not execute the query and acquires no locks. EXPLAIN ANALYZE executes the query, so it does acquire the same locks the query normally would. Wrap DML statements in a transaction and roll back to avoid side effects.

Q: What is the difference between EXPLAIN and EXPLAIN ANALYZE?

EXPLAIN shows the estimated plan without executing the query. EXPLAIN ANALYZE actually runs the query and augments each plan node with real execution time and actual row counts, making it possible to spot where estimates diverge from reality.

Q: What does 'cost' mean in an EXPLAIN output?

Cost is an arbitrary unit based on sequential page reads (1.0 by default). The two numbers cost=X..Y are the startup cost (X, before the first row is returned) and the total cost (Y, to return all rows). Lower costs indicate a more efficient plan, but costs are estimates—use ANALYZE to see real timings.

Q: What is the safest way to EXPLAIN a DELETE or UPDATE in production?

Wrap the DML in a transaction: BEGIN; EXPLAIN ANALYZE your_delete_or_update; ROLLBACK; This executes the statement so ANALYZE can collect real statistics, but the ROLLBACK prevents any data from being modified.

The EXPLAIN statement reveals the query execution plan PostgreSQL’s query planner generated for a statement without running it. The output is a tree of operations—scans, joins, sorts, aggregations—each annotated with cost estimates and expected row counts. Reading this output is the starting point for query optimization.

Syntax

EXPLAIN [ (option [, ...]) ] sql_statement;

Key options:

Option	Effect
`ANALYZE`	Executes the statement and adds real timing and actual row counts
`BUFFERS`	Shows shared buffer hits, reads, and dirtied pages (requires ANALYZE)
`VERBOSE`	Adds column-level output lists and per-node details
`COSTS`	Includes cost estimates — on by default
`FORMAT`	Output format: `TEXT` (default), `JSON`, `XML`, `YAML`

Practical Example

Create a products and orders table to generate a realistic plan:

CREATE TABLE products (
  id         SERIAL PRIMARY KEY,
  sku        VARCHAR(20) NOT NULL UNIQUE,
  name       VARCHAR(100) NOT NULL,
  unit_price NUMERIC(10,2) NOT NULL
);

CREATE TABLE orders (
  id          SERIAL PRIMARY KEY,
  product_id  INTEGER REFERENCES products(id),
  quantity    INTEGER NOT NULL,
  placed_at   TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Populate with sample data
INSERT INTO products (sku, name, unit_price)
SELECT 'SKU-' || i, 'Product ' || i, (random() * 200 + 5)::NUMERIC(10,2)
FROM generate_series(1, 500) AS i;

INSERT INTO orders (product_id, quantity, placed_at)
SELECT (random() * 499 + 1)::INT, (random() * 10 + 1)::INT,
       now() - (random() * 365 || ' days')::INTERVAL
FROM generate_series(1, 10000);

A sequential scan on a small table is expected and efficient:

EXPLAIN SELECT * FROM products;

                           QUERY PLAN
------------------------------------------------------------------
 Seq Scan on products  (cost=0.00..12.00 rows=500 width=56)

Filter by primary key — the planner switches to an index scan:

EXPLAIN SELECT * FROM products WHERE id = 42;

                                  QUERY PLAN
-----------------------------------------------------------------------------
 Index Scan using products_pkey on products  (cost=0.28..8.29 rows=1 width=56)
   Index Cond: (id = 42)

Reading EXPLAIN Output

Each line is one plan node. Indentation shows parent-child relationships — child nodes feed their results up to parent nodes.

EXPLAIN
SELECT p.sku, p.name, o.quantity, o.placed_at
FROM orders o
JOIN products p ON p.id = o.product_id
WHERE o.placed_at >= now() - interval '30 days'
ORDER BY o.placed_at DESC;

A typical output with indexes on placed_at and product_id:

 Sort  (cost=245.80..248.30 rows=1000 width=72)
   Sort Key: o.placed_at DESC
   ->  Hash Join  (cost=14.25..195.50 rows=1000 width=72)
         Hash Cond: (o.product_id = p.id)
         ->  Index Scan using idx_orders_placed_at on orders  (cost=0.29..155.40 rows=1000 width=24)
               Index Cond: (placed_at >= (now() - '30 days'::interval))
         ->  Hash  (cost=9.00..9.00 rows=500 width=56)
               ->  Seq Scan on products  (cost=0.00..9.00 rows=500 width=56)

Key fields to understand:

Seq Scan — reads every row in the table, appropriate for small tables or when most rows match.
Index Scan — uses an index to jump to matching rows; good for high-selectivity filters.
Bitmap Index Scan + Bitmap Heap Scan — batches multiple index lookups before visiting the heap; useful for moderate-selectivity filters.
cost=X..Y — X is startup cost (before first row), Y is total cost (all rows returned).
rows=N — planner’s estimate of matching rows; large gaps vs reality indicate stale statistics.
width=N — estimated average row width in bytes.

Using FORMAT JSON for Visualization

EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON)
SELECT p.name, SUM(o.quantity) AS total_sold
FROM orders o
JOIN products p ON p.id = o.product_id
GROUP BY p.name
ORDER BY total_sold DESC
LIMIT 10;

The JSON output can be pasted into tools like explain.depesz.com or pgMustard for visual plan inspection.

Testing with Vela

The best way to validate an index or query rewrite is to test it against a production-scale dataset without modifying production. Create a Vela branch cloned from your production database, add the candidate index with CREATE INDEX CONCURRENTLY, and run EXPLAIN ANALYZE on the branch. Compare the plan before and after — buffer reads, row estimates, and total cost — then promote the change to production only once the plan looks correct.

Production Tips

A large gap between estimated and actual rows (visible in EXPLAIN ANALYZE) usually means stale statistics. Run ANALYZE table_name to refresh them.
A sequential scan on a large table with a narrow filter is a signal for a missing index. A sequential scan on a small table (< a few hundred rows) is usually the right plan.
Use EXPLAIN (ANALYZE, BUFFERS) to distinguish I/O-bound queries (high shared blks read) from CPU-bound ones (high time without many reads).
Run EXPLAIN ANALYZE at least twice on a warm database—the first run may include cold buffer-cache effects that inflate I/O numbers.
Wrap DML in a transaction (BEGIN; EXPLAIN ANALYZE ...; ROLLBACK;) to measure real write performance without committing changes.

Frequently Asked Questions

What does EXPLAIN show in PostgreSQL?

EXPLAIN shows the execution plan the query planner chose: which tables are scanned and how (sequential scan, index scan, bitmap index scan), how tables are joined (nested loop, hash join, merge join), and estimated startup and total costs alongside expected row counts. It does not execute the query.

Does EXPLAIN lock the table or affect database performance?

EXPLAIN without ANALYZE does not execute the query and acquires no locks. EXPLAIN ANALYZE executes the query, so it does acquire the same locks the query normally would. Wrap DML statements in a transaction and roll back to avoid side effects.

What is the difference between EXPLAIN and EXPLAIN ANALYZE?

EXPLAIN shows the estimated plan without executing the query. EXPLAIN ANALYZE actually runs the query and augments each plan node with real execution time and actual row counts, making it possible to spot where estimates diverge from reality.

What does 'cost' mean in an EXPLAIN output?

Cost is an arbitrary unit based on sequential page reads (1.0 by default). The two numbers cost=X..Y are the startup cost (X, before the first row is returned) and the total cost (Y, to return all rows). Lower costs indicate a more efficient plan, but costs are estimates—use ANALYZE to see real timings.

What is the safest way to EXPLAIN a DELETE or UPDATE in production?

Wrap the DML in a transaction: BEGIN; EXPLAIN ANALYZE your_delete_or_update; ROLLBACK; This executes the statement so ANALYZE can collect real statistics, but the ROLLBACK prevents any data from being modified.