Valkey can save its in-memory dataset to disk in two primary ways: RDB snapshots and AOF logging.
RDB Snapshots: Point-in-Time Copies
RDB (Redis Database) snapshots are point-in-time copies of your Valkey dataset. Think of a snapshot as a photograph of your data at a specific moment.
Here’s how it works: Valkey forks its main process. The child process then writes the entire dataset to a binary file (usually named dump.rdb). This is efficient because the main process can continue serving requests while the snapshot is being generated.
Configuration:
You control RDB snapshots via the save directive in your valkey.conf file. It takes two arguments: a number of seconds and a number of changes.
# Save if at least 1 key changed in 900 seconds (15 minutes)
save 900 1
# Save if at least 10 keys changed in 300 seconds (5 minutes)
save 300 10
# Save if at least 10000 keys changed in 60 seconds (1 minute)
save 60 10000
Example Snapshot:
Let’s say you have a valkey.conf with these settings and Valkey has been running for a while. If 15,000 changes have accumulated and at least 60 seconds have passed since the last successful save, the save 60 10000 rule fires and Valkey triggers an RDB save. A dump.rdb file will appear in your Valkey data directory.
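To make the save rules concrete, here is a minimal Python sketch of how a set of save points could be evaluated. The function name should_snapshot and the SAVE_POINTS list are hypothetical, and the sketch assumes a rule fires once its time window has elapsed since the last successful save and enough changes have accumulated. It is an illustration, not Valkey's actual code.

```python
from typing import List, Tuple

# Each save point is (seconds, changes), mirroring "save <seconds> <changes>".
SAVE_POINTS: List[Tuple[int, int]] = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(changes_since_save: int, seconds_since_save: int,
                    save_points: List[Tuple[int, int]] = SAVE_POINTS) -> bool:
    """Return True if any save point is satisfied.

    Assumption: a rule fires when at least `seconds` have elapsed since the
    last successful save AND at least `changes` writes have accumulated.
    """
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in save_points
    )
```

Any single matching rule is enough, so a busy server is caught by the 60-second rule while a nearly idle one still gets a snapshot from the 900-second rule.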
Why it’s useful: RDB is great for backups and for faster restarts. When Valkey restarts, it can load the dump.rdb file much faster than replaying AOF logs, especially for large datasets.
AOF Logging: Every Write Operation
AOF (Append Only File) logging records every write operation Valkey receives. Instead of taking a snapshot, it keeps a log of commands that modify the dataset.
How it works: Valkey appends each write command to an AOF file. When Valkey restarts, it replays these commands to rebuild the dataset.
Configuration:
You enable AOF by setting appendonly yes in valkey.conf. You also choose the appendfsync policy:
appendfsync always: Sync after every write command. Safest, but slowest.
appendfsync everysec: Sync every second. Good balance of safety and performance.
appendfsync no: Let the operating system decide when to sync. Fastest, but least safe.
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
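The difference between the three policies comes down to when fsync is called on the log file. The toy Python class below illustrates that idea. AppendLog and its fields are hypothetical names, and the everysec branch is a simplification: it checks the clock on each append rather than syncing from a background thread as a real server would.

```python
import os
import time


class AppendLog:
    """Toy append-only log that mimics the three appendfsync policies."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fsync_calls = 0
        self._last_sync = time.monotonic()
        self._file = open(path, "ab")

    def append(self, record: bytes) -> None:
        self._file.write(record)
        self._file.flush()  # hand the bytes to the OS page cache
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and time.monotonic() - self._last_sync >= 1.0:
            self._sync()
        # policy "no": never call fsync; the OS flushes when it chooses

    def _sync(self) -> None:
        os.fsync(self._file.fileno())  # force the data onto stable storage
        self.fsync_calls += 1
        self._last_sync = time.monotonic()

    def close(self) -> None:
        self._file.close()
```

With always, every append pays the cost of a disk sync. With everysec, a crash can lose whatever was written since the last sync. With no, the loss window depends entirely on the operating system.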
Example AOF Log:
If you run SET mykey "hello" and INCR counter, your appendonly.aof might look like this:
*2
$6
SELECT
$1
0
*3
$3
SET
$5
mykey
$5
hello
*2
$4
INCR
$7
counter
When Valkey restarts, it reads this file and executes these commands.
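As an illustration of replay, the Python sketch below parses the log shown above and applies it to a plain dictionary. The names parse_commands, replay, and SAMPLE are hypothetical, only SELECT, SET, and INCR are handled, and the dictionary stands in for a single database. Real AOF loading covers far more commands and edge cases.

```python
from typing import Dict, List

# The example log above, with the CRLF line endings the protocol uses.
SAMPLE = (
    b"*2\r\n$6\r\nSELECT\r\n$1\r\n0\r\n"
    b"*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nhello\r\n"
    b"*2\r\n$4\r\nINCR\r\n$7\r\ncounter\r\n"
)


def parse_commands(data: bytes) -> List[List[bytes]]:
    """Split protocol-encoded log content into a list of commands."""
    commands: List[List[bytes]] = []
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError(f"expected '*' at offset {pos}")
        end = data.index(b"\r\n", pos)
        argc = int(data[pos + 1:end])  # "*<number of arguments>"
        pos = end + 2
        args: List[bytes] = []
        for _ in range(argc):
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])  # "$<argument length in bytes>"
            start = end + 2
            args.append(data[start:start + length])
            pos = start + length + 2  # skip the argument and its trailing CRLF
        commands.append(args)
    return commands


def replay(commands: List[List[bytes]]) -> Dict[bytes, bytes]:
    """Apply the commands in order to an empty store and return it."""
    store: Dict[bytes, bytes] = {}
    for name, *args in commands:
        name = name.upper()
        if name == b"SELECT":
            continue  # single-database sketch: ignore database selection
        elif name == b"SET":
            store[args[0]] = args[1]
        elif name == b"INCR":
            store[args[0]] = str(int(store.get(args[0], b"0")) + 1).encode()
        else:
            raise ValueError(f"unsupported command in this sketch: {name!r}")
    return store
```

Calling replay(parse_commands(SAMPLE)) rebuilds a store in which mykey holds hello and counter holds 1. Reading arguments by their declared length, rather than splitting on line breaks, keeps values that contain CRLF intact.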
Why it’s useful: AOF provides better durability than RDB, especially with appendfsync everysec or always. If Valkey crashes, you lose at most a second’s worth of writes (with everysec), rather than potentially minutes of writes with RDB.
AOF Rewriting: Keeping the Log Lean
AOF logs can grow very large over time, which slows down restarts and consumes disk space. AOF rewriting is a process that creates a new, optimized AOF file by discarding redundant commands.
How it works: Valkey generates a new AOF file in the background, containing only the commands necessary to reconstruct the current state of the dataset. For example, if you SET mykey val1 and then SET mykey val2, the original AOF might have both commands. The rewritten AOF will only contain SET mykey val2.
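The effect of a rewrite can be sketched in a few lines of Python. The rewrite function below is hypothetical and handles only SET, DEL, and INCR. It also derives the final state by replaying a log because it has no live dataset to read from, whereas a real rewrite works from the current in-memory data.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def rewrite(log: List[Command]) -> List[Command]:
    """Return a minimal command list that rebuilds the same final state."""
    state: Dict[str, str] = {}
    for cmd in log:
        op = cmd[0].upper()
        if op == "SET":
            state[cmd[1]] = cmd[2]
        elif op == "DEL":
            state.pop(cmd[1], None)
        elif op == "INCR":
            state[cmd[1]] = str(int(state.get(cmd[1], "0")) + 1)
        else:
            raise ValueError(f"unsupported command in this sketch: {op}")
    # One SET per surviving key is enough to reconstruct the dataset.
    return [("SET", key, value) for key, value in state.items()]
```

A log containing SET mykey val1, SET mykey val2, two INCR counter commands, and a key that was set and then deleted collapses to two commands: SET mykey val2 and SET counter 2.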
Configuration:
You configure automatic AOF rewriting with auto-aof-rewrite-percentage and auto-aof-rewrite-min-size.
auto-aof-rewrite-min-size 64mb
# Rewrite when the AOF file is 100% larger than it was after the last rewrite
auto-aof-rewrite-percentage 100
When you might rewrite: If your appendonly.aof file was 50MB after the last rewrite and has since grown to 100MB (a 100% increase), and it’s also larger than the auto-aof-rewrite-min-size (64MB), Valkey will initiate a rewrite.
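The trigger condition is simple arithmetic, sketched here in Python. The function name should_rewrite and its parameter names are hypothetical. The sketch assumes growth is measured against the file size recorded after the previous rewrite, and that a percentage of 0 disables automatic rewriting.

```python
MB = 1024 * 1024


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF right now, in bytes
    base_size:    size of the AOF after the last rewrite, in bytes
    """
    if percentage == 0:
        return False  # a percentage of 0 turns automatic rewriting off
    if current_size <= min_size:
        return False  # small files are never rewritten automatically
    base = base_size if base_size > 0 else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth in percent
    return growth >= percentage
```

For the example above, should_rewrite(100 * MB, 50 * MB) is true: growth is 100% and the file exceeds 64MB. A file that tripled from 20MB to 60MB would not be rewritten, because it is still below the minimum size.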
Combining RDB and AOF
You can use both RDB and AOF simultaneously. This gives you the best of both worlds: RDB for compact periodic backups and AOF for better durability. On restart with AOF enabled, Valkey loads the AOF file, because it is normally the more complete record of your data; if AOF is not enabled, it loads the RDB file instead.
The most surprising truth about Valkey persistence is that the RDB snapshotting process doesn’t block writes to the main Valkey process, apart from the brief fork() call itself. After fork(), the parent process continues to handle client requests while the child process reads memory to create the snapshot. The operating system makes this work with copy-on-write (COW): parent and child initially share the same memory pages, and when the parent modifies a page, the kernel first copies it so the child still sees the original contents. The snapshot therefore stays consistent with the state at the time of the fork, at the cost of extra memory for every page the parent modifies while the snapshot is in progress.
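The fork behaviour can be observed with a small Python sketch, assuming a Unix-like system where os.fork is available. The function snapshot_with_fork and the JSON output format are hypothetical stand-ins for illustration; the real snapshot is a binary RDB file.

```python
import json
import os
import tempfile


def snapshot_with_fork(dataset: dict, path: str) -> int:
    """Fork, and let the child write the dataset as it looked at fork time.

    Returns the child's PID to the parent.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sees the parent's memory exactly as it was at fork().
        try:
            with open(path, "w") as f:
                json.dump(dataset, f)
        finally:
            os._exit(0)  # exit without running the parent's cleanup code
    return pid


data = {"mykey": "val1"}
path = os.path.join(tempfile.mkdtemp(), "dump.json")

child = snapshot_with_fork(data, path)
data["mykey"] = "val2"  # the parent keeps accepting writes
os.waitpid(child, 0)    # wait for the snapshot to finish

with open(path) as f:
    snapshot = json.load(f)
```

After this runs, snapshot still holds val1 while data holds val2: the parent's write after the fork never reaches the child's view of memory.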
The next concept you’ll likely encounter is how Valkey handles replication and how persistence interacts with it, particularly how data is synchronized from a primary to its replicas.