The most surprising thing about database replication storage is that the "storage" isn’t really about storing data in the traditional sense, but about managing the flow of changes.
Imagine two servers, db-primary and db-replica, mirroring a PostgreSQL database. When an INSERT statement hits db-primary, it doesn’t just write to its own disk. It first records the change in a special log, PostgreSQL’s Write-Ahead Log (WAL), before the data files themselves are modified. This WAL is the heart of replication.
Here’s a simplified view of db-primary in action:
-- On db-primary
CREATE TABLE users (id SERIAL PRIMARY KEY, name VARCHAR(100));
INSERT INTO users (name) VALUES ('Alice');
As part of that INSERT, PostgreSQL writes a record to its WAL files. Conceptually, this record describes the change: "add a row with id=1 and name='Alice' to the 'users' table." (In reality WAL records describe low-level page modifications rather than SQL statements, but the effect is the same.)
Now, db-replica needs to get that WAL record. It does this by streaming the WAL from db-primary. Think of it like a continuous download of changes.
-- On db-primary (configuration snippet - postgresql.conf)
wal_level = replica
max_wal_senders = 5
wal_keep_size = 1024MB
-- On db-primary (pg_hba.conf)
-- (0.0.0.0/0 is permissive; restrict it to the replica's address in production)
host replication repl_user 0.0.0.0/0 md5
-- On db-replica (configuration snippet - postgresql.conf)
hot_standby = on
primary_conninfo = 'host=db-primary port=5432 user=repl_user password=mysecretpassword'
restore_command = 'cp /path/to/wal/archive/%f %p'
-- Also create an empty standby.signal file in the replica's data directory
-- so PostgreSQL (12+) starts up in standby mode.
db-replica connects to db-primary using a special replication protocol, authenticates as repl_user, and asks for the WAL records. It then applies these records to its own copy of the database, effectively replaying the INSERT statement.
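For that connection to work, the repl_user role has to exist on db-primary with the REPLICATION privilege. A minimal sketch (the password is a placeholder matching the config above):

```sql
-- On db-primary: create a role that is allowed to initiate streaming replication
CREATE ROLE repl_user WITH REPLICATION LOGIN PASSWORD 'mysecretpassword';
```

The replica's initial copy of the cluster is usually taken with pg_basebackup; its -R flag conveniently writes primary_conninfo and standby.signal for you, e.g. `pg_basebackup -h db-primary -U repl_user -D /var/lib/postgresql/data -R`.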
-- On db-replica (after starting up)
-- This would show the 'Alice' row
SELECT * FROM users;
The "storage" aspect comes into play on both sides. db-primary needs enough disk space to hold its WAL files until they are no longer needed by its replicas. If a replica falls behind, db-primary may not be able to simply delete those WAL files. db-replica, in turn, stores the WAL it receives in its own pg_wal directory before applying it; a standby only starts generating new WAL of its own once it is promoted to primary.
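You can watch how far behind each standby is from the primary via the pg_stat_replication view; pg_wal_lsn_diff reports the gap in bytes (column names as of PostgreSQL 10+):

```sql
-- On db-primary: per-standby replication lag in bytes
SELECT client_addr,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), sent_lsn)   AS send_lag_bytes,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;
```

A growing replay_lag_bytes with a healthy send_lag_bytes points at the standby being slow to apply WAL rather than at the network.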
The primary problem people hit with replication storage is that the primary node runs out of disk space because WAL files aren’t being removed fast enough. This happens when replicas are offline, slow, or the wal_keep_size (or wal_keep_segments in older versions) is too small to accommodate the lag.
A common cause is a replica becoming disconnected. Without a replication slot, the primary retains past WAL segments only up to wal_keep_size (wal_keep_segments in older versions); it does not track what a disconnected standby still needs. If a replica disconnects for an extended period and wal_keep_size is too small, the primary will recycle WAL files that the replica still needs. When the replica reconnects, it will find it’s missing crucial WAL data and will have to be re-initialized from a fresh base backup (e.g., with pg_basebackup).
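Replication slots exist precisely to close this gap: a slot makes the primary retain WAL until the standby has consumed it, at the cost of unbounded WAL growth if the standby never comes back. A sketch (the slot name is arbitrary):

```sql
-- On db-primary: create a physical replication slot for the standby
SELECT pg_create_physical_replication_slot('replica_slot');

-- On db-replica (postgresql.conf): point the walreceiver at the slot
-- primary_slot_name = 'replica_slot'

-- On PostgreSQL 13+, cap how much WAL a lagging slot may pin
-- (set in postgresql.conf) to protect the primary's disk:
-- max_slot_wal_keep_size = 10GB
```

The trade-off is explicit: a slot trades "replica may need re-initialization" for "primary may fill its disk", which is why max_slot_wal_keep_size is worth setting alongside it.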
Another cause is simply a busy primary generating a lot of WAL. If the primary is writing data very rapidly, the WAL files can accumulate faster than the replicas can consume them, even if they are online. This can fill up the primary’s disk if wal_keep_size isn’t sufficient or if there’s a bottleneck in WAL shipping.
A misconfigured wal_sender_timeout can also cause issues. If this is set too low on the primary, it may disconnect a standby whose walreceiver is merely slow to respond (for example, because the standby’s disk is saturated), leading to the same problem as a full disconnection. The default is 1 minute. If a replica consistently takes longer than this to acknowledge WAL, it will keep getting disconnected.
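If sender timeouts are the culprit, the value can be raised without editing postgresql.conf by hand; ALTER SYSTEM writes to postgresql.auto.conf and a reload picks it up (the 5-minute value here is just an illustration, not a recommendation):

```sql
-- On db-primary: give slow standbys more time before being disconnected
ALTER SYSTEM SET wal_sender_timeout = '5min';
SELECT pg_reload_conf();
```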
Finally, insufficient disk space on the replica itself can halt replication, though this usually manifests as an error on the replica, not the primary. The replica needs space for its own data, its WAL, and any temporary files it might use. If the replica runs out of space, it can’t apply incoming WAL, which then causes the primary’s WAL to accumulate.
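To confirm that WAL accumulation is what is actually eating the disk, you can sum the segment files in pg_wal directly from SQL (pg_ls_waldir is available from PostgreSQL 10):

```sql
-- On either server: segment count and total size of the pg_wal directory
SELECT count(*)                  AS segments,
       pg_size_pretty(sum(size)) AS total_size
FROM pg_ls_waldir();
```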
The key lever you control is wal_keep_size on the primary. This setting determines the minimum amount of WAL PostgreSQL will keep on disk, regardless of whether it has been backed up or sent to a standby. To fix a primary disk full error due to WAL accumulation, you’d typically increase this value, perhaps to several gigabytes, depending on your write load and expected replica lag.
-- Example: Increase wal_keep_size on primary to 2GB
-- Edit postgresql.conf and set:
-- wal_keep_size = 2048MB
-- Then reload PostgreSQL configuration:
SELECT pg_reload_conf();
This works because it tells the primary to hold onto WAL files for longer, giving slow or disconnected replicas more time to catch up before those WAL segments are purged.
If you’re using streaming replication, the primary_conninfo on the replica is critical. If this string is incorrect (wrong host, port, user, or password), the replica won’t be able to connect to the primary to stream WAL, and the primary’s WAL directory will fill up.
-- Example: Correcting primary_conninfo on replica
-- Edit postgresql.conf on the replica:
-- primary_conninfo = 'host=db-primary.example.com port=5432 user=repl_user password=correct_password sslmode=prefer'
-- Then restart the replica's PostgreSQL service.
This ensures the replica can establish and maintain its connection to the primary to receive the necessary WAL stream.
What most people don’t realize is how replay actually works on the standby. When db-replica receives WAL, its walreceiver process first flushes the records to the standby’s own pg_wal directory; a separate startup process then replays them into the data files. Because streaming replication is asynchronous by default, there is always a small window where the replica lags slightly behind the primary’s last committed transaction. If a commit on the primary must wait until a standby has received or applied it, you have to opt into synchronous replication via synchronous_standby_names and synchronous_commit.
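You can observe that window from the replica itself: pg_last_wal_receive_lsn shows what has been flushed to the standby’s pg_wal, pg_last_wal_replay_lsn shows what has actually been applied, and the difference is the replay backlog:

```sql
-- On db-replica: received vs. replayed WAL positions
SELECT pg_last_wal_receive_lsn()                 AS received,
       pg_last_wal_replay_lsn()                  AS replayed,
       pg_wal_lsn_diff(pg_last_wal_receive_lsn(),
                       pg_last_wal_replay_lsn()) AS replay_backlog_bytes,
       pg_last_xact_replay_timestamp()           AS last_replayed_commit_time;
```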
The next error you’ll hit after fixing disk space issues typically surfaces on the replica: "requested WAL segment ... has already been removed", meaning the primary recycled WAL the standby still needed, or, with max_slot_wal_keep_size set, that a lagging replication slot was invalidated.