Replication itself doesn’t guarantee recovery; it’s the failover process that actually enacts disaster recovery.
Let’s watch a dataset replicate and then simulate a disaster to trigger a failover. Imagine we have a primary database server, db-primary.example.com, and a replica, db-replica.example.com. Our goal is to keep db-replica a near-real-time copy of db-primary so that if db-primary goes offline, we can quickly switch our applications to use db-replica.
Here’s a simplified PostgreSQL setup for streaming replication (parameter names as of PostgreSQL 13; older releases use wal_keep_segments instead of wal_keep_size). On db-primary, we’ll configure postgresql.conf:
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1024MB
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/wal_archive/%f'
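One thing the archive_command assumes is that the target directory already exists and is writable by the postgres OS user. A minimal sketch for creating it (path taken from the config above; owner/mode are typical defaults, adjust for your install):

```shell
# Create the WAL archive directory referenced by archive_command,
# owned by the postgres user so the server can write into it
sudo install -d -o postgres -g postgres -m 700 /var/lib/postgresql/wal_archive
```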
And pg_hba.conf:
host replication replicator db-replica.example.com/32 md5
We’ll create a replication user:
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'supersecret';
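One step that’s easy to miss: the standby must start from a physical copy of the primary’s cluster, not an empty data directory. Before configuring db-replica, seed it with pg_basebackup (a sketch; the data directory path is an assumption for a typical install, and the replicator password would be supplied via a .pgpass file or prompt):

```shell
# On db-replica: stop the server and replace its data directory with a
# streamed copy of the primary's cluster. -X stream also copies any WAL
# generated while the backup runs, so the copy is self-consistent.
sudo systemctl stop postgresql
sudo -u postgres pg_basebackup \
    -h db-primary.example.com -p 5432 -U replicator \
    -D /var/lib/postgresql/data -X stream -P
```

(pg_basebackup also has a -R flag that writes standby.signal and primary_conninfo for you; omit it if you prefer to write those settings by hand, as shown next.)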
On db-replica (PostgreSQL 12 and newer), we’ll place an empty standby.signal file in the data directory, which tells the server to start as a standby, and put the connection settings in postgresql.conf (older versions instead used a separate recovery.conf containing standby_mode = 'on'):
primary_conninfo = 'host=db-primary.example.com port=5432 user=replicator password=supersecret'
primary_slot_name = 'replication_slot_01'
And we’ll create a replication slot on the primary:
SELECT pg_create_physical_replication_slot('replication_slot_01');
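You can confirm the slot exists, and later whether the standby is actually using it, from the primary:

```sql
-- On db-primary: list replication slots; "active" flips to true
-- once db-replica connects and starts streaming
SELECT slot_name, slot_type, active FROM pg_replication_slots;
```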
Now, db-replica will connect to db-primary, stream WAL records, and apply them. If we insert data into db-primary, it will appear on db-replica within seconds.
-- On db-primary
CREATE TABLE users (id SERIAL PRIMARY KEY, username VARCHAR(50));
INSERT INTO users (username) VALUES ('alice'), ('bob');
-- Check on db-replica
SELECT * FROM users;
The output on db-replica will show 'alice' and 'bob'.
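Beyond spot-checking rows, the primary reports each connected standby’s streaming state and lag in pg_stat_replication, which is the view to watch before trusting a failover:

```sql
-- On db-primary: one row per connected standby.
-- The gap between sent_lsn and replay_lsn is how far behind the replica is.
SELECT client_addr, state, sent_lsn, replay_lsn,
       pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replay_lag
FROM pg_stat_replication;
```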
The critical part is the failover. This isn’t automatic. When db-primary fails, we have to tell the system to switch. This typically involves:
- Detecting the failure: Application monitoring, load balancer health checks, or manual alerts.
- Stopping writes to the primary: If the primary is still partially available, prevent new data from being written that won’t be replicated.
- Promoting the replica: Make db-replica a standalone, writable database. This is done by removing the standby.signal file (or equivalent configuration) and restarting PostgreSQL.
- Redirecting applications: Update connection strings or DNS to point to db-replica.
Let’s simulate db-primary failing by shutting it down:
# On db-primary
sudo systemctl stop postgresql
On db-replica, you’d execute:
# On db-replica
sudo systemctl stop postgresql # Ensure it's stopped cleanly
sudo rm /var/lib/postgresql/data/standby.signal # Or modify postgresql.conf
sudo systemctl start postgresql
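Deleting standby.signal and restarting works, but PostgreSQL also has a dedicated promotion mechanism that skips the restart entirely; a sketch, assuming the same data directory path as above (pg_promote() requires PostgreSQL 12+):

```shell
# On db-replica: promote the standby in place, no restart required
sudo -u postgres pg_ctl promote -D /var/lib/postgresql/data
# or equivalently, from a superuser SQL session on the replica:
#   SELECT pg_promote();
```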
Now, db-replica is the primary. Applications would be pointed to db-replica.example.com.
The most surprising true thing about replication is that the replica is fundamentally a read-only copy until it’s explicitly promoted. It doesn’t "become" the primary on its own; it waits for instructions. All the streaming and applying of WAL records are just preparations for this manual or orchestrated promotion.
Internally, the replica maintains a connection to the primary and requests WAL segments. If the primary becomes unavailable, the replica keeps retrying the connection and can catch up once the primary returns, provided the primary has retained the WAL the replica missed: because wal_keep_size is large enough, because archive_command preserved the segments, or, most reliably, because of the replication slot. The primary_slot_name is crucial here; it tells the primary to retain any WAL segments the replica still needs, preventing them from being recycled before the replica has received them. Without a slot, the primary might remove WAL files that the replica hasn’t yet fetched, and replication fails with an error like "requested WAL segment has already been removed".
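You can see exactly how much WAL a slot is forcing the primary to keep around:

```sql
-- On db-primary: bytes of WAL pinned by each slot, i.e. the distance
-- from the slot's restart point to the current write position
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained_wal
FROM pg_replication_slots;
```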
The exact levers you control are the replication parameters (wal_level, max_wal_senders, wal_keep_size, archive_command), the connection details (primary_conninfo), and the replication slot name. For failover, it’s about the process of stopping the old primary, promoting the replica, and re-pointing clients.
The one thing most people don’t know is how critical replication slots are for surviving transient network issues or planned primary downtime. If the primary recycles WAL files before the replica has consumed them, the replica can never catch up; the slot makes the primary hold onto those files until the replica confirms receipt. This prevents the broken replication that would otherwise occur if the replica fell too far behind. The trade-off is that a slot whose consumer never returns forces the primary to retain WAL indefinitely, which can eventually fill its disk, so orphaned slots need to be monitored and dropped.
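If you’re on PostgreSQL 13 or newer, you can cap that retention in postgresql.conf so a dead replica can’t fill the primary’s disk (the 10GB figure here is an arbitrary example, not a recommendation):

```
max_slot_wal_keep_size = 10GB   # slots falling further behind are invalidated
                                # instead of pinning WAL forever
```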
The next problem you’ll hit is managing the original primary once it comes back online.