PostgreSQL Backup & Recovery
Learn how PostgreSQL WAL works, why it is required for durability, and how it supports crash recovery, replication, and point-in-time recovery (PITR).
Write-ahead logging (WAL) is a logging mechanism in which changes are written to a log before they are applied to the database files.
PostgreSQL writes every change as WAL records before the corresponding data pages are flushed to data files. This ordering is what makes crash recovery reliable and enforces ACID durability.
A commit is considered durable only after its WAL has been flushed to stable storage, as governed by durability settings such as synchronous_commit and fsync. Data files can be written later because WAL replay can reconstruct the final consistent state.
During restart recovery, PostgreSQL replays WAL records from the redo point of the last completed checkpoint forward, through the end of the valid WAL, which brings the cluster back to a consistent state.
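The write-then-replay ordering can be illustrated with a toy model. The sketch below is not how PostgreSQL is implemented; the ToyDatabase class and all of its fields are hypothetical, and it assumes a key/value store where each WAL record simply carries the new value.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Toy model of write-ahead logging (illustrative only)."""
    wal: list = field(default_factory=list)          # durable log of (key, value) records
    data_files: dict = field(default_factory=dict)   # durable "data pages"
    cache: dict = field(default_factory=dict)        # volatile memory, lost on crash
    checkpoint_pos: int = 0                          # WAL position where replay starts

    def write(self, key, value):
        self.wal.append((key, value))  # 1. log the change first
        self.cache[key] = value        # 2. only then modify the in-memory page

    def checkpoint(self):
        self.data_files.update(self.cache)   # flush dirty pages to the data files
        self.checkpoint_pos = len(self.wal)  # older WAL is no longer needed for recovery

    def crash(self):
        self.cache = {}  # memory is lost; WAL and data files survive

    def recover(self):
        # Replay every record logged after the last checkpoint.
        for key, value in self.wal[self.checkpoint_pos:]:
            self.data_files[key] = value
        self.cache = dict(self.data_files)


db = ToyDatabase()
db.write("a", 1)
db.checkpoint()       # data files now hold {"a": 1}
db.write("b", 2)
db.write("a", 3)      # logged, but not yet flushed to the data files
db.crash()
db.recover()
print(db.data_files)  # {'a': 3, 'b': 2}
```

The two writes after the checkpoint never reached the data files before the crash, yet recovery restores them because they were logged first.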
WAL is mandatory for PITR. A base backup provides starting state, while WAL archives provide every change after that snapshot.
If archive shipping breaks for even a short interval and a segment never reaches the archive, replay stops at the gap, so any recovery target after it becomes unreachable in practice.
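A gap check over the archive listing is one way to catch this early. The sketch below assumes the default 16 MB segment size, uncompressed segment files with the standard 24-hex-digit names, and a per-timeline check that ignores timeline switches; find_gaps and segment_number are hypothetical helper names.

```python
import re

# With 16 MB segments, each 4 GB "log id" contains 256 segments.
SEGMENTS_PER_LOG_ID = 0x100000000 // (16 * 1024 * 1024)

SEGMENT_NAME = re.compile(r"[0-9A-F]{24}")


def segment_number(name):
    """Split a WAL segment file name into (timeline, absolute segment number)."""
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    seg_id = int(name[16:24], 16)
    return timeline, log_id * SEGMENTS_PER_LOG_ID + seg_id


def find_gaps(file_names):
    """Return (timeline, first_missing, last_missing) for each hole in the sequence."""
    by_timeline = {}
    for name in file_names:
        # Skip .history, .backup, and .partial files.
        if SEGMENT_NAME.fullmatch(name):
            timeline, segno = segment_number(name)
            by_timeline.setdefault(timeline, set()).add(segno)
    gaps = []
    for timeline, segnos in sorted(by_timeline.items()):
        ordered = sorted(segnos)
        for prev, cur in zip(ordered, ordered[1:]):
            if cur != prev + 1:
                gaps.append((timeline, prev + 1, cur - 1))
    return gaps


archive = [
    "000000010000000000000001",
    "000000010000000000000002",
    "000000010000000000000002.00000028.backup",
    "000000010000000000000004",  # segment 3 was never archived
]
print(find_gaps(archive))  # [(1, 3, 3)]
```

A non-empty result means no recovery target past the first missing segment can be reached from this archive alone.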
For production readiness, teams should pair WAL archiving with scheduled restore drills, not backup-job success metrics alone.
Physical streaming replication sends WAL from primary to standbys. Replica lag is directly tied to WAL transport and replay throughput.
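Logical replication pipelines have the same dependency: a slot holds back WAL removal until its consumer confirms it has processed the changes, so slot and consumer health determine whether required WAL can be safely removed.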
Mismanaged replication slots are a common reason for uncontrolled WAL growth and storage incidents.
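How much WAL a slot is holding back can be estimated from two log sequence numbers (LSNs). The sketch below assumes the inputs were already fetched, for example from pg_current_wal_lsn() and the restart_lsn column of pg_replication_slots; the function names and the alert threshold are hypothetical.

```python
def lsn_to_int(lsn):
    """Convert an LSN such as '16/B374D848' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn, restart_lsn):
    """Bytes of WAL between a slot's restart point and the current write position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


def slots_over_limit(current_lsn, slots, limit_bytes):
    """Return {slot_name: retained_bytes} for slots retaining more than limit_bytes.

    `slots` maps slot name to its restart_lsn string, or None when the slot
    has not reserved any WAL yet.
    """
    offenders = {}
    for name, restart_lsn in slots.items():
        if restart_lsn is None:
            continue
        retained = retained_wal_bytes(current_lsn, restart_lsn)
        if retained > limit_bytes:
            offenders[name] = retained
    return offenders


# Hypothetical slot names and positions.
slots = {"standby_1": "0/FFFFFF00", "cdc_consumer": "0/FF000000", "unused": None}
print(slots_over_limit("1/0", slots, limit_bytes=1024 * 1024))
# {'cdc_consumer': 16777216}
```

Alerting on retained bytes rather than on slot activity alone also catches consumers that are connected but falling behind.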
WAL is foundational for PITR and replication. Any gap in WAL retention can break restore guarantees.
Teams should monitor archive lag, replication lag, and WAL volume growth continuously.
In high-change systems, WAL volume can grow faster than expected; capacity planning must account for peak write bursts, not averages.
WAL volume often spikes during backfills, large transactions, index builds, or deploy windows with schema/data rewrites.
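The difference between sizing for averages and sizing for peaks is easy to quantify. The sketch below is a back-of-the-envelope calculation; the rates, the two-hour outage window, and the safety factor are made-up figures, and wal_headroom_bytes is a hypothetical helper.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def wal_headroom_bytes(wal_bytes_per_sec, max_outage_seconds, safety_factor=1.5):
    """Disk space needed to hold WAL while archiving or a replica is unavailable."""
    return int(wal_bytes_per_sec * max_outage_seconds * safety_factor)


outage = 2 * 60 * 60  # plan to survive a two-hour archive outage

average = wal_headroom_bytes(5 * MIB, outage)   # sized for average traffic
peak = wal_headroom_bytes(40 * MIB, outage)     # sized for a backfill burst

print(f"average-based: {average / GIB:.1f} GiB")  # average-based: 52.7 GiB
print(f"peak-based:    {peak / GIB:.1f} GiB")     # peak-based:    421.9 GiB
```

With these assumed figures, a volume sized from average traffic would fill roughly eight times sooner than planned during a sustained burst.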
WAL behavior is influenced by checkpoint cadence, retention thresholds, and archiving settings. Tuning these knobs changes both performance and recoverability characteristics.
Configuration should be validated against your target RPO/RTO and peak write patterns, not just average traffic.
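WAL is tightly linked to archive_mode, archive_command, and recovery targets: archive_mode turns archiving on, archive_command ships each completed WAL segment to the archive, and recovery target settings such as recovery_target_time decide where replay stops during a restore.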
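A simple pre-flight check can compare archiving settings against a recovery point objective. The sketch below assumes the settings were already read into a dictionary with archive_timeout normalized to seconds, treats archive_timeout as a rough upper bound on how long changes can sit unarchived, and ignores alternatives such as archive_library or streaming-based archiving; check_archiving is a hypothetical helper.

```python
def check_archiving(settings, rpo_seconds):
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []

    if settings.get("archive_mode", "off") not in ("on", "always"):
        problems.append("archive_mode is off, so no WAL is archived")

    if not settings.get("archive_command", "").strip():
        problems.append("archive_command is empty")

    timeout = int(settings.get("archive_timeout", 0))
    if timeout == 0:
        problems.append("archive_timeout is disabled; an idle segment may wait indefinitely")
    elif timeout > rpo_seconds:
        problems.append(
            f"archive_timeout ({timeout}s) exceeds the RPO ({rpo_seconds}s)"
        )

    return problems


# Hypothetical settings; the archive_command shown is only a placeholder.
settings = {
    "archive_mode": "on",
    "archive_command": "cp %p /mnt/archive/%f",
    "archive_timeout": 900,
}
for problem in check_archiving(settings, rpo_seconds=300):
    print(problem)
# archive_timeout (900s) exceeds the RPO (300s)
```

A check like this only validates configuration; it does not prove that archived segments are restorable, which still requires an actual restore drill.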
Teams evaluating tooling should also compare operational workflows in this PostgreSQL backup/restore tools guide.