EXPLAIN ANALYZE is the most powerful query-tuning tool in PostgreSQL. Unlike plain EXPLAIN, which only shows the estimated plan, EXPLAIN ANALYZE actually runs the query and reports real execution time and row counts at every node in the plan tree. Comparing “actual” numbers against “estimated” numbers quickly reveals where the optimizer is making wrong assumptions.
Syntax
EXPLAIN ANALYZE sql_statement;
-- With additional options
EXPLAIN (ANALYZE, BUFFERS) sql_statement;
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) sql_statement;
Practical Example
Set up an orders reporting query and analyze it:
CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
category VARCHAR(50)
);
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(id),
quantity INTEGER NOT NULL,
total_amount NUMERIC(10,2) NOT NULL,
placed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
status VARCHAR(20) NOT NULL DEFAULT 'pending'
);
-- Create an index for the date filter
CREATE INDEX idx_orders_placed_at ON orders (placed_at DESC);
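To get plans that resemble the sample output below, populate the tables with some data first. The row counts and value distributions here are illustrative, not prescriptive:

```sql
-- Seed sample data so the planner has realistic statistics to work with
INSERT INTO products (name, category)
SELECT 'Product ' || g, 'Category ' || (g % 10)
FROM generate_series(1, 400) AS g;

INSERT INTO orders (product_id, quantity, total_amount, placed_at, status)
SELECT (random() * 399 + 1)::int,                          -- random existing product
       (random() * 5 + 1)::int,
       round((random() * 500)::numeric, 2),
       now() - (random() * interval '365 days'),           -- spread over the past year
       (ARRAY['pending','completed','cancelled'])[(floor(random() * 3) + 1)::int]
FROM generate_series(1, 10000);

-- Refresh statistics so estimates reflect the new data
ANALYZE products;
ANALYZE orders;
```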
Run EXPLAIN ANALYZE on a join query:
EXPLAIN ANALYZE
SELECT
p.name,
SUM(o.total_amount) AS revenue,
COUNT(*) AS order_count
FROM orders o
JOIN products p ON p.id = o.product_id
WHERE o.placed_at >= now() - interval '90 days'
GROUP BY p.name
ORDER BY revenue DESC;
Sample output with actual statistics:
Sort  (cost=310.50..312.50 rows=800 width=72) (actual time=18.430..18.612 rows=350 loops=1)
  Sort Key: (sum(o.total_amount)) DESC
  Sort Method: quicksort  Memory: 55kB
  ->  HashAggregate  (cost=265.80..274.80 rows=800 width=72) (actual time=17.201..17.590 rows=350 loops=1)
        Group Key: p.name
        ->  Hash Join  (cost=14.25..215.30 rows=2050 width=40) (actual time=0.412..14.887 rows=2100 loops=1)
              Hash Cond: (o.product_id = p.id)
              ->  Index Scan using idx_orders_placed_at on orders  (cost=0.29..185.00 rows=2050 width=24) (actual time=0.083..12.304 rows=2100 loops=1)
              ->  Hash  (cost=9.00..9.00 rows=400 width=24) (actual time=0.291..0.291 rows=400 loops=1)
Planning Time: 0.687 ms
Execution Time: 18.904 ms
Key fields to interpret:
- actual time=X..Y — milliseconds from node startup (X) to last row returned (Y).
- rows=N — actual row count; compare to the estimate in cost=... rows=N to spot misestimates.
- loops=N — how many times this node ran; multiply actual time by loops for the node’s total cost.
- Planning Time — time the planner spent choosing the plan.
- Execution Time — total wall-clock time spent executing the query, not including planning.
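The loops field matters most in nested-loop joins, where the inner node runs once per outer row. In a hypothetical fragment like the one below, the per-loop time looks tiny until you multiply it out:

```sql
-- Hypothetical inner node of a Nested Loop (index name is illustrative):
--   ->  Index Scan using orders_product_id_idx on orders
--         (cost=0.29..4.50 rows=25 width=24)
--         (actual time=0.012..0.034 rows=25 loops=400)
--
-- Per-loop time is only 0.034 ms, but the node ran 400 times:
--   0.034 ms x 400 loops = ~13.6 ms of total work in this one node.
```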
Adding BUFFERS for I/O Analysis
EXPLAIN (ANALYZE, BUFFERS)
SELECT product_id, SUM(total_amount) AS revenue
FROM orders
WHERE placed_at >= now() - interval '30 days'
AND status = 'completed'
GROUP BY product_id
ORDER BY revenue DESC
LIMIT 20;
With BUFFERS, each node reports cache and disk activity:
Buffers: shared hit=820 read=42
- shared hit — pages found in the shared buffer cache (fast, in-memory).
- shared read — pages fetched from disk or OS page cache (slow).
- A high read count relative to hit means the working set is larger than shared_buffers; adding memory or a more selective index typically helps.
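To see how long those disk reads actually took, you can enable track_io_timing (a server setting that usually requires superuser privileges to change at the session level). With it on, BUFFERS output gains timing lines:

```sql
-- Adds I/O wait times to EXPLAIN (ANALYZE, BUFFERS) output
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT product_id, SUM(total_amount) AS revenue
FROM orders
WHERE placed_at >= now() - interval '30 days'
GROUP BY product_id;

-- Nodes that touched disk now report something like (exact label
-- varies by PostgreSQL version):
--   Buffers: shared hit=820 read=42
--   I/O Timings: read=3.512
```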
Diagnosing Row-Count Misestimates
When actual rows differ sharply from the planner’s estimate, refresh the statistics:
-- Refresh statistics on the orders table
ANALYZE orders;
-- Re-run to see updated estimates
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status = 'pending';
For skewed data distributions, increase the statistics target for the relevant column:
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
ANALYZE orders (status);
Then verify the estimates improved:
SELECT attname, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
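When misestimates come from correlated columns rather than a single skewed column (for example, status and placed_at moving together), per-column statistics are not enough. Multivariate statistics can capture the dependency; the statistics object name below is illustrative:

```sql
-- Teach the planner that status and placed_at are correlated
CREATE STATISTICS orders_status_placed (dependencies)
  ON status, placed_at FROM orders;

-- Extended statistics are only collected on the next ANALYZE
ANALYZE orders;
```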
Analyzing Write Statements Safely
EXPLAIN ANALYZE executes the statement, so wrap DML in a transaction to avoid committing changes:
BEGIN;
EXPLAIN ANALYZE
UPDATE orders
SET status = 'archived'
WHERE placed_at < now() - interval '2 years';
ROLLBACK; -- no rows are actually changed
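On PostgreSQL 13 and later, the same pattern can be combined with the WAL option to measure how much write-ahead-log volume a batch update would generate, which helps size archival jobs before running them for real:

```sql
BEGIN;
EXPLAIN (ANALYZE, WAL)
UPDATE orders
SET status = 'archived'
WHERE placed_at < now() - interval '2 years';
ROLLBACK;  -- changes are discarded, but WAL figures were reported

-- Nodes that wrote data gain a line like:
--   WAL: records=2100 fpi=12 bytes=184320
```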
Testing with Vela
EXPLAIN ANALYZE results are only meaningful at production scale. Use a Vela branch cloned from your live database to benchmark query changes against real data volumes and realistic buffer cache states. Create the branch, run EXPLAIN (ANALYZE, BUFFERS) on the candidate query, add or drop indexes, and compare plans side-by-side before deploying any schema or query change to production.
Production Tips
- Run
EXPLAIN ANALYZEat least twice—the first run may show cold cache effects that inflate buffer read counts. - A
loopsvalue greater than 1 in a nested-loop node multiplies the node cost; adding an index on the inner table’s join key typically eliminates repeated scanning. - Use
FORMAT JSONand a visualization tool such as explain.depesz.com to navigate complex multi-node plans visually. - High Planning Time on complex queries can sometimes be reduced by simplifying CTEs or splitting the query; look for queries where planning time rivals execution time.
- Combine with
pg_stat_statementsto find the highest-impact queries before investing time inEXPLAIN ANALYZEanalysis.
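With the pg_stat_statements extension enabled (it must be listed in shared_preload_libraries), a query like this surfaces the statements worth analyzing first. Column names are those used in PostgreSQL 13 and later; older versions use total_time and mean_time:

```sql
-- Top 10 queries by cumulative execution time
SELECT query,
       calls,
       round(mean_exec_time::numeric, 2)  AS mean_ms,
       round(total_exec_time::numeric, 2) AS total_ms
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```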