VACUUM Command in PostgreSQL

Learn how the PostgreSQL VACUUM command reclaims space from dead tuples, prevents transaction ID wraparound, and how autovacuum keeps tables healthy automatically.

5 min read · Back to overview

Quick Answer

VACUUM reclaims storage occupied by dead tuples—rows left behind by UPDATE and DELETE—and updates the visibility map used by index-only scans. PostgreSQL runs VACUUM automatically via autovacuum, but understanding it is essential for managing table bloat and preventing transaction ID wraparound.

Spin up a Postgres database in 20 seconds with Vela.

Try Vela Sandbox

PostgreSQL uses MVCC (multi-version concurrency control) to handle concurrent transactions: when a row is updated or deleted, the old version stays on disk as a “dead tuple” until no open transaction can see it. The VACUUM command reclaims that space, making it available for reuse. Without regular vacuuming, tables accumulate bloat and eventually face transaction ID wraparound—a critical condition that forces PostgreSQL into read-only mode.

Syntax

-- Standard VACUUM: marks dead-tuple space as reusable (no OS space returned)
VACUUM orders;

-- VACUUM ANALYZE: reclaim space and update planner statistics in one pass
VACUUM (ANALYZE) orders;

-- VACUUM VERBOSE: show detailed per-table progress
VACUUM (VERBOSE, ANALYZE) orders;

-- VACUUM FULL: rewrite table, returning space to OS (requires exclusive lock)
VACUUM FULL orders;

-- Vacuum all tables in the current database
VACUUM;

The critical distinction: standard VACUUM marks dead-tuple pages as reusable within PostgreSQL but does not shrink the file on disk. VACUUM FULL compacts the table to its minimum size and returns space to the OS, but holds an ACCESS EXCLUSIVE lock for the entire operation—blocking all reads and writes.

Practical Example

Simulate a high-churn table and observe bloat before and after vacuuming:

CREATE TABLE orders (
  id           SERIAL PRIMARY KEY,
  product_id   INTEGER NOT NULL,
  status       VARCHAR(20) NOT NULL DEFAULT 'pending',
  total_amount NUMERIC(10,2) NOT NULL,
  placed_at    TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Insert initial data
INSERT INTO orders (product_id, status, total_amount)
SELECT (random() * 999 + 1)::INT, 'pending', (random() * 500 + 10)::NUMERIC(10,2)
FROM generate_series(1, 100000);

-- Simulate updates that create dead tuples
UPDATE orders SET status = 'completed' WHERE id % 3 = 0;
UPDATE orders SET status = 'cancelled'  WHERE id % 7 = 0;

-- Check bloat before vacuum
SELECT
  relname             AS table_name,
  n_live_tup          AS live_rows,
  n_dead_tup          AS dead_rows,
  last_vacuum,
  last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'orders';

Run standard VACUUM ANALYZE and re-check:

VACUUM (ANALYZE) orders;

SELECT relname, n_live_tup, n_dead_tup, last_vacuum
FROM pg_stat_user_tables
WHERE relname = 'orders';

After VACUUM, n_dead_tup should drop to near zero and the reclaimed pages become available for new inserts.

Autovacuum vs Manual VACUUM

The autovacuum daemon monitors every table and triggers VACUUM automatically when dead tuples exceed a threshold. The default threshold is autovacuum_vacuum_scale_factor × row_count + autovacuum_vacuum_threshold (20% + 50 rows). For high-write tables, this may be too lenient:

-- Tune autovacuum per-table for a high-churn orders table
ALTER TABLE orders SET (
  autovacuum_vacuum_scale_factor  = 0.02,   -- 2% dead tuples triggers vacuum
  autovacuum_analyze_scale_factor = 0.01,   -- 1% row changes triggers analyze
  autovacuum_vacuum_threshold     = 100
);

Monitor autovacuum activity across all tables:

SELECT
  relname             AS table_name,
  n_dead_tup          AS dead_rows,
  last_autovacuum,
  last_autoanalyze,
  autovacuum_count,
  autoanalyze_count
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

VACUUM FULL Trade-offs

VACUUM FULL is a last resort for reclaiming disk space:

Standard VACUUMVACUUM FULL
Returns space to OSNoYes
Lock typeShareUpdateExclusiveLock (no writes blocked)AccessExclusiveLock (all access blocked)
Safe during peak hoursYesNo — schedule a maintenance window
Rewrites tableNoYes
Updates visibility mapYesYes
Rebuilds indexesNoYes

An alternative for online table compaction without a full lock is the pg_repack extension, which can compact a table and its indexes with only a brief lock at the end of the operation.

Transaction ID Wraparound

PostgreSQL transaction IDs are 32-bit counters. After roughly 2 billion transactions, IDs wrap around. VACUUM freezes old tuple versions before this happens by marking them as visible to all future transactions. Monitor wraparound risk:

SELECT
  relname,
  age(relfrozenxid)                AS xid_age,
  2147483648 - age(relfrozenxid)   AS xids_remaining
FROM pg_class
WHERE relkind = 'r'
ORDER BY xid_age DESC
LIMIT 20;

When xid_age exceeds ~1.5 billion, PostgreSQL emits warnings. Near 2 billion, it forces an aggressive autovacuum freeze. Tables with high xid_age should be vacuumed promptly to avoid forced downtime.

Testing with Vela

Before running VACUUM FULL or tuning autovacuum thresholds on a large production table, test the operation on a Vela branch cloned from production. You can measure how long VACUUM FULL takes on production-scale data, verify the space reclaimed, and confirm the table remains accessible during a standard VACUUM run—all without any risk to the live database.

Production Tips

  • Run VACUUM ANALYZE rather than VACUUM and ANALYZE separately—it updates statistics in the same pass, saving one full table scan.
  • Schedule manual VACUUM immediately after large bulk operations (DELETE, UPDATE) that autovacuum would not catch quickly enough due to its scale-factor threshold.
  • Long-running transactions prevent VACUUM from reclaiming dead tuples that the transaction might still need to see. Check pg_stat_activity for idle-in-transaction sessions and terminate them if they are stale.
  • The WAL overhead from VACUUM FULL can be significant. Monitor WAL generation rate during maintenance windows on replicated systems.
  • Use pg_stat_user_tables.n_dead_tup as a bloat indicator. A ratio of dead-to-live rows above 20% on a large table is a signal to investigate autovacuum configuration.

Continue in Performance: pg_stat_statements.

Related in this section: EXPLAIN · EXPLAIN ANALYZE · Indexes and Index-Only Scans

Frequently Asked Questions

What does VACUUM do in PostgreSQL?
VACUUM removes dead tuples—row versions left on disk after UPDATE and DELETE operations—and marks their storage space as reusable within the table. It also updates the visibility map, which is required for index-only scans to skip heap fetches efficiently.
What is the difference between VACUUM and VACUUM FULL?
Standard VACUUM marks dead-tuple space as reusable within PostgreSQL but does not shrink the physical file on disk. VACUUM FULL rewrites the entire table into a new file, returning space to the operating system, but requires an exclusive lock that blocks all reads and writes for the duration.
What is table bloat in PostgreSQL?
Table bloat is the accumulated disk space occupied by dead tuples that VACUUM has not yet reclaimed, or that standard VACUUM cannot return to the OS. In high-write workloads, tables can grow significantly larger than their live-row footprint, slowing sequential scans and wasting storage.
What is transaction ID wraparound and why does VACUUM prevent it?
PostgreSQL transaction IDs are 32-bit integers that eventually wrap around after roughly 2 billion transactions. VACUUM freezes old tuple versions, marking them as visible to all future transactions, preventing the database from exhausting its transaction ID space and being forced into read-only mode.
Does autovacuum run automatically in PostgreSQL?
Yes. The autovacuum daemon monitors all tables and triggers VACUUM and ANALYZE when the volume of dead tuples or changed rows exceeds configured thresholds. It is enabled by default. For high-write tables you can lower autovacuum_vacuum_scale_factor to trigger more frequent vacuuming.