PostgreSQL Backup & Recovery

Archiving (WAL Archiving)

A practical guide to WAL archiving in PostgreSQL, including archive pipeline design, validation checks, and common operational pitfalls.

Definition

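WAL archiving is the continuous shipping of completed write-ahead log (WAL) segments to durable storage, for use in backups and point-in-time recovery (PITR). It is controlled by the archive_mode and archive_command settings.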

What WAL Archiving Does

WAL archiving continuously copies completed WAL segments to durable external storage so recovery can move beyond the most recent base backup.

Without archiving, you can restore only to the moment the base backup was taken. With archiving, you can replay WAL on top of that backup and recover to any point in time covered by the archive.

Archiving is the bridge between backup-time state and incident-time recovery targets.

Required Building Blocks

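  • archive_mode enabled (this requires wal_level set to replica or higher)
  • Reliable archive_command that exits nonzero on any failure, so PostgreSQL keeps the segment and retries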
  • Durable storage target with retention policy
  • Base backup cadence aligned to RPO and restore practicality
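
As a rough illustration of the second building block, the sketch below shows what an archive_command helper might look like in Python. It is not an official tool: the script name archive_wal.py, the /mnt/wal_archive path, and the function names are hypothetical, and it assumes a POSIX filesystem where a rename within one directory is atomic. It returns zero only when the segment is stored and flushed, treats a pre-existing identical file as success, and refuses to overwrite a file with different contents.

```python
import filecmp
import os
import shutil
import sys

ARCHIVE_DIR = "/mnt/wal_archive"  # hypothetical mount point for durable storage


def archive_segment(src_path, file_name, archive_dir=ARCHIVE_DIR):
    """Copy one WAL segment into archive_dir; return 0 only on confirmed success."""
    dest = os.path.join(archive_dir, file_name)
    tmp = dest + ".tmp"
    try:
        if os.path.exists(dest):
            # A retry after a crash may resend a segment that is already stored.
            if filecmp.cmp(src_path, dest, shallow=False):
                return 0
            print(f"refusing to overwrite {dest}: contents differ", file=sys.stderr)
            return 1
        # Write to a temporary name first so a partial copy is never visible
        # under the final segment name.
        with open(src_path, "rb") as src, open(tmp, "wb") as out:
            shutil.copyfileobj(src, out)
            out.flush()
            os.fsync(out.fileno())
        os.rename(tmp, dest)
        # Flush the directory entry as well (POSIX-only assumption).
        dir_fd = os.open(archive_dir, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
    except OSError as exc:
        print(f"archive of {file_name} failed: {exc}", file=sys.stderr)
        return 1
    return 0


def main(argv):
    """Entry point: argv[1] is the segment path (%p), argv[2] its file name (%f)."""
    if len(argv) != 3:
        print("usage: archive_wal.py <path> <file_name>", file=sys.stderr)
        return 2
    return archive_segment(argv[1], argv[2])
```

Assuming the script ends with sys.exit(main(sys.argv)) and is installed at a hypothetical path, it could be wired in with archive_command = 'python3 /usr/local/bin/archive_wal.py "%p" "%f"'. PostgreSQL replaces %p with the path of the segment and %f with its file name, and it retries the segment whenever the command exits nonzero.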

Validation Checks Teams Miss

Successful command execution is not enough. You need end-to-end verification that archived WAL files are complete, ordered, and retrievable during restore.

Many restore failures trace back to teams that verify their backup jobs complete but never verify that those artifacts can actually be replayed.

  • Sample-restore recent WAL files in a drill environment
  • Alert on archive failure rate and lag
  • Verify that retained WAL forms an unbroken chain from the oldest retained base backup onward
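
The alerting check in the list above could be sketched as follows. The example assumes a monitoring job samples the pg_stat_archiver view at intervals and passes two consecutive samples to a pure function. The ArchiverStats class, the archive_alerts function, and the 15-minute threshold are illustrative choices, not part of PostgreSQL. Only the column names come from the view. The time-based lag check is only meaningful on a server with steady write traffic or with archive_timeout set, since an idle server produces no new segments.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class ArchiverStats:
    """Subset of the columns exposed by the pg_stat_archiver view."""
    archived_count: int
    failed_count: int
    last_archived_time: Optional[datetime]
    last_failed_time: Optional[datetime]


def archive_alerts(prev: ArchiverStats, curr: ArchiverStats, now: datetime,
                   max_lag: timedelta = timedelta(minutes=15)) -> List[str]:
    """Compare two samples and return human-readable alert messages."""
    alerts = []
    # New failures since the previous sample (a stats reset yields a negative
    # difference, which is ignored here).
    new_failures = curr.failed_count - prev.failed_count
    if new_failures > 0:
        alerts.append(f"{new_failures} archive failure(s) since last check")
    # Time since the last successfully archived segment.
    if curr.last_archived_time is None:
        alerts.append("no WAL segment has been archived yet")
    elif now - curr.last_archived_time > max_lag:
        alerts.append(f"last successful archive is older than {max_lag}")
    # The archiver is currently stuck if its latest failure is newer than
    # its latest success.
    if curr.last_failed_time is not None and (
        curr.last_archived_time is None
        or curr.last_failed_time > curr.last_archived_time
    ):
        alerts.append("most recent archive attempt failed")
    return alerts
```

An empty list means nothing looks wrong between the two samples. Any returned message would be forwarded to whatever alerting channel the team already uses.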

How It Connects to PITR

During recovery, PostgreSQL starts from a restored base backup and pulls archived WAL via restore_command, replaying it until it reaches the selected recovery target.
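
To make the recovery side more concrete, here is a small sketch that renders the settings such a recovery might use. It assumes PostgreSQL 12 or later, where these settings live in postgresql.conf and an empty recovery.signal file in the data directory requests recovery. It also assumes the archive is a plain directory reachable with cp. The function name and the example path are hypothetical.

```python
from datetime import datetime, timezone


def render_recovery_settings(archive_dir: str, target_time: datetime) -> str:
    """Return postgresql.conf lines for a time-based recovery target."""
    if target_time.tzinfo is None:
        raise ValueError("target_time must be timezone-aware")
    if any(ch in archive_dir for ch in "'\" "):
        raise ValueError("archive_dir must not contain quotes or spaces")
    lines = [
        # %f is the WAL file name to fetch, %p the path to copy it to.
        f"restore_command = 'cp {archive_dir}/%f \"%p\"'",
        f"recovery_target_time = '{target_time.isoformat(sep=' ')}'",
        # Pause at the target so the result can be inspected before promotion.
        "recovery_target_action = 'pause'",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    target = datetime(2024, 5, 1, 12, 30, tzinfo=timezone.utc)
    print(render_recovery_settings("/mnt/wal_archive", target), end="")
```

For the target shown, the second rendered line is recovery_target_time = '2024-05-01 12:30:00+00:00'. Pausing at the target is one possible design choice. Promoting immediately is another, and which fits depends on how much inspection the team wants before opening the database to writes.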

If any WAL segment in the required chain is missing or corrupt, replay cannot advance past the gap, so any recovery target beyond it is unreachable.
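
A gap of this kind can often be detected ahead of time from file names alone. The sketch below assumes the standard 24-hex-digit segment naming (8 digits of timeline, 8 of log number, 8 of segment number) and the default 16 MB segment size. It checks each timeline separately and does not validate timeline switches or file contents. The function names are illustrative.

```python
import re
from typing import Dict, Iterable, List, Set, Tuple

WAL_NAME = re.compile(r"[0-9A-F]{24}")
DEFAULT_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default wal_segment_size


def segment_number(name: str, wal_segment_size: int = DEFAULT_SEGMENT_SIZE) -> int:
    """Convert a WAL file name to a single increasing segment number."""
    segments_per_log = 0x100000000 // wal_segment_size
    return int(name[8:16], 16) * segments_per_log + int(name[16:24], 16)


def segment_name(timeline: str, number: int,
                 wal_segment_size: int = DEFAULT_SEGMENT_SIZE) -> str:
    """Inverse of segment_number for a given 8-digit timeline prefix."""
    segments_per_log = 0x100000000 // wal_segment_size
    return f"{timeline}{number // segments_per_log:08X}{number % segments_per_log:08X}"


def find_gaps(file_names: Iterable[str],
              wal_segment_size: int = DEFAULT_SEGMENT_SIZE) -> List[Tuple[str, int, int]]:
    """Return (timeline, first_missing, last_missing) for each hole in the archive."""
    by_timeline: Dict[str, Set[int]] = {}
    for name in file_names:
        # Skip history files, backup labels, partial segments, and the like.
        if WAL_NAME.fullmatch(name):
            by_timeline.setdefault(name[:8], set()).add(
                segment_number(name, wal_segment_size))
    gaps = []
    for timeline in sorted(by_timeline):
        numbers = sorted(by_timeline[timeline])
        for prev, nxt in zip(numbers, numbers[1:]):
            if nxt - prev > 1:
                gaps.append((timeline, prev + 1, nxt - 1))
    return gaps
```

Running find_gaps over a listing of the archive and converting each result with segment_name gives the exact file names to look for. A clean result shows only that the names are contiguous, so it complements a real restore drill and does not replace one.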

Frequently Asked Questions

Is WAL archiving required for PITR?
Yes. PITR requires base backups plus a continuous archived WAL chain from backup time to target time.
Can I rely on local WAL files without archiving?
Not safely for disaster recovery. Local WAL can be lost during node, disk, or region incidents.
How often should archiving be tested?
Monitor archiving continuously with alerts, and run scheduled restore drills at least monthly and after major infrastructure changes.
What is the most common WAL archiving failure?
Silent archive_command failures or lag that go unnoticed until restore time.
What storage should archived WAL use?
Use durable, versioned storage with lifecycle controls and documented retention aligned to compliance and RPO/RTO needs.
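
As one possible way to tie retention to base backups, the sketch below lists archived segments that precede the first segment needed by the oldest base backup still being kept. It assumes that starting segment name is already known (for example from the backup's label), and it compares names while ignoring the 8-digit timeline prefix, similar in spirit to what pg_archivecleanup does. The function name is hypothetical, and nothing is deleted here. The function only returns candidates.

```python
import re
from typing import Iterable, List

WAL_SEGMENT = re.compile(r"[0-9A-F]{24}")


def prunable_segments(archived_names: Iterable[str], oldest_needed: str) -> List[str]:
    """Return segment names older than the first one the oldest kept backup needs."""
    if not WAL_SEGMENT.fullmatch(oldest_needed):
        raise ValueError("oldest_needed must be a 24-character WAL segment name")
    cutoff = oldest_needed[8:]  # log and segment digits, timeline ignored
    return sorted(
        name for name in archived_names
        # Non-segment files (history, backup labels) are never candidates.
        if WAL_SEGMENT.fullmatch(name) and name[8:] < cutoff
    )
```

Keeping the decision (which files are prunable) separate from the action (deleting them) makes it easier to log and review the candidate list before anything is removed from durable storage.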