Valkey’s INFO command is less about monitoring Valkey and more about interrogating it, giving you a snapshot of its internal state and performance characteristics at a specific moment.

Let’s see it in action.

valkey-cli
127.0.0.1:6379> INFO memory
# Memory
used_memory:1048576
used_memory_human:1.00M
used_memory_rss:2097152
used_memory_rss_human:2.00M
mem_fragmentation_ratio:2.00
...

This output isn’t a live dashboard; it’s a report. You’re asking Valkey, "Hey, how are you doing right now?" and it’s telling you. The real power comes from understanding what it’s telling you, and how those numbers relate to the health and performance of your application.

The Core Problem: Valkey’s Black Box

Valkey is often treated as a simple key-value store, but it’s a complex system with internal caches, background processes, and intricate data structures. Without visibility, it’s easy to encounter performance degradation, memory exhaustion, or unexpected behavior, and have no idea why. The INFO command is your primary tool for peeling back the layers and understanding the machine’s internal monologue.

Diving into the Metrics

The INFO command can be segmented to provide specific categories of information. Here are the key ones and what they mean:

  • INFO memory: This is crucial for understanding resource consumption.

    • used_memory: The total amount of memory allocated by Valkey for its data structures and internal buffers. This is the number you want to keep an eye on for capacity planning.
    • used_memory_rss: The amount of memory Valkey has requested from the operating system. This can be higher than used_memory due to memory fragmentation or OS-level overhead.
    • mem_fragmentation_ratio: The ratio of used_memory_rss to used_memory. A ratio significantly above 1.0 (e.g., > 1.5) indicates fragmentation, meaning the OS has handed Valkey more memory than it is actually using for data. A ratio below 1.0 usually means part of Valkey’s memory has been swapped to disk by the OS — which is far worse for latency than fragmentation. High fragmentation can lead to increased memory usage and potentially slower memory allocation.

    To monitor memory over time, you’d typically run INFO memory periodically (e.g., every minute) and log the output, or use a monitoring tool that scrapes this metric.

  • INFO clients: Shows information about connected clients.

    • connected_clients: The number of currently connected clients. A sudden spike might indicate an application issue or a denial-of-service attack.
    • client_recent_max_output_buffer (client_longest_output_list in older releases): The largest client output buffer observed recently. A growing number here suggests a client isn’t consuming its responses fast enough, forcing Valkey to buffer large amounts of data in memory on its behalf.
    • blocked_clients: The number of clients currently blocked by commands like BLPOP, BRPOP, BRPOPLPUSH. A high number here isn’t necessarily bad, but it means a significant portion of your connections are waiting for something.
  • INFO persistence: Details about RDB and AOF persistence.

    • rdb_changes_since_last_save: Number of write operations since the last RDB snapshot. A steadily increasing number means the RDB file is getting stale.
    • aof_enabled: Whether AOF persistence is active.
    • aof_last_write_status: Status of the last AOF write operation. ok is good; err is a critical problem.
    • aof_last_bgrewrite_status: Status of the last AOF background rewrite. ok is good; err indicates a potential issue.
  • INFO stats: General statistics.

    • total_commands_processed: Total number of commands executed since the server started. Good for understanding overall workload.
    • instantaneous_ops_per_sec: The number of commands processed per second, averaged over a short rolling window of recent samples (on the order of a second or two, not minutes). A key indicator of current throughput.
    • keyspace_hits and keyspace_misses: The derived hit ratio (keyspace_hits / (keyspace_hits + keyspace_misses)) is crucial. A low ratio means clients are frequently requesting keys that don’t exist or have expired. This can indicate inefficient caching on the application side or issues with key expiration policies.
    • evicted_keys: Number of keys that were evicted due to memory limits. A non-zero and increasing number means Valkey is actively removing data to stay within memory constraints, which is usually undesirable.
  • INFO replication: If Valkey is used as a master or replica.

    • role: master or slave (slave being the legacy name for a replica).
    • master_repl_offset: The master’s current replication offset.
    • slave_repl_offset: The replica’s current replication offset. A significant lag between these numbers indicates replication is falling behind.
    • master_sync_in_progress: Reported on a replica; indicates that a full synchronization from the master is underway. While the replica loads the transferred snapshot it is effectively blocked, so prolonged syncs deserve attention.
  • INFO cpu: CPU usage information.

    • used_cpu_sys: Valkey’s CPU usage in system mode.
    • used_cpu_user: Valkey’s CPU usage in user mode.
    • used_cpu_sys_children: System-mode CPU used by child processes (e.g., for RDB saving or AOF rewrites).
    • used_cpu_user_children: User-mode CPU used by child processes. High CPU usage, especially used_cpu_user, directly correlates with how much work Valkey is doing, often due to a high volume of complex commands or large data operations.
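
All of these sections come back as one flat text blob, so in practice you’ll parse them before graphing or alerting. Here’s a minimal sketch in Python, assuming nothing beyond the "key:value" line format shown above (parse_info and hit_ratio are illustrative helper names, not part of any Valkey client library):

```python
def parse_info(raw: str) -> dict:
    """Parse INFO output ('# Section' headers, 'key:value' lines)
    into {section: {field: value}}, converting numbers where possible."""
    sections, current = {}, None
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            current = line.lstrip("# ").lower()
            sections[current] = {}
        elif ":" in line and current is not None:
            field, _, value = line.partition(":")
            try:
                value = float(value) if "." in value else int(value)
            except ValueError:
                pass  # keep non-numeric values like 'ok' as strings
            sections[current][field] = value
    return sections

def hit_ratio(stats: dict) -> float:
    """keyspace_hits / (hits + misses); 0.0 before any traffic."""
    hits = stats.get("keyspace_hits", 0)
    misses = stats.get("keyspace_misses", 0)
    return hits / (hits + misses) if (hits + misses) else 0.0
```

Feeding it the sample output from the top of the article, info["memory"]["used_memory_rss"] / info["memory"]["used_memory"] reproduces the mem_fragmentation_ratio of 2.0.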

The "Why" Behind the Numbers

Consider mem_fragmentation_ratio. When Valkey allocates memory for a key-value pair, it asks its allocator for a chunk. If the key (or value) is later deleted, that memory isn’t always perfectly reclaimed and can leave "holes." Over time, these holes accumulate, so the memory the OS holds resident for the process (used_memory_rss) grows much larger than the memory Valkey is actively using for data (used_memory). This isn’t a Valkey bug; it’s a characteristic of how memory allocators work. A high ratio means your Valkey process is consuming more RAM than strictly necessary for its data, potentially leading to OOM-killer situations or higher infrastructure costs. Restarting Valkey resets fragmentation by forcing fresh allocations, but Valkey also supports active defragmentation (the activedefrag configuration option), which compacts memory online without a restart.
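
The arithmetic is simple; using the sample numbers from the INFO memory output at the top of the article:

```python
# Sample values from the INFO memory output shown earlier.
used_memory = 1_048_576      # bytes Valkey is actively using for data
used_memory_rss = 2_097_152  # bytes the OS holds resident for the process

ratio = used_memory_rss / used_memory   # → 2.0
wasted = used_memory_rss - used_memory  # → 1_048_576 bytes of holes/overhead
```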

Another critical insight comes from instantaneous_ops_per_sec. If this number suddenly drops, it’s not because Valkey is "slowing down" in a general sense. It means the current workload is taking longer to process. This could be due to a single, very slow command (like KEYS * on a large dataset), a network bottleneck causing client commands to take longer to arrive, or Valkey being busy with a background task like an RDB save or AOF rewrite, which can consume CPU and I/O.
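
If you want a rate over a window you control, rather than the server’s short built-in window, you can difference total_commands_processed between two INFO samples yourself. A hedged sketch (ops_per_sec is our own helper name, not a Valkey API):

```python
def ops_per_sec(prev_total: int, curr_total: int, interval_s: float) -> float:
    """Derive throughput from two successive total_commands_processed
    readings taken interval_s seconds apart."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return (curr_total - prev_total) / interval_s

# Two INFO stats samples taken 10 seconds apart:
ops_per_sec(1_000_000, 1_050_000, 10.0)  # → 5000.0 ops/sec
```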

The Counterintuitive Truth: INFO is Synchronous

The INFO command itself blocks your Valkey instance while it collects and formats the data. On a very busy server with millions of operations per second, running INFO can introduce a tiny, but measurable, latency spike for other clients. For most use cases, this is negligible, but in ultra-low-latency environments, you might consider running INFO less frequently or using a dedicated monitoring client that’s less sensitive to these micro-pauses.
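
One way to keep dashboards from hammering a busy instance is to cache the snapshot and enforce a minimum interval between real INFO calls. A minimal sketch, assuming fetch is whatever callable your client library provides for issuing INFO (ThrottledInfo is an illustrative name, not a library class):

```python
import time

class ThrottledInfo:
    """Cache INFO output for min_interval seconds so many readers
    hit the cache instead of the server."""
    def __init__(self, fetch, min_interval: float = 5.0, clock=time.monotonic):
        self._fetch = fetch            # callable returning raw INFO text
        self._min_interval = min_interval
        self._clock = clock            # injected for testability
        self._cached = None
        self._fetched_at = float("-inf")

    def get(self):
        now = self._clock()
        if now - self._fetched_at >= self._min_interval:
            self._cached = self._fetch()
            self._fetched_at = now
        return self._cached
```

Injecting the clock keeps the throttling logic testable without real waits.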

The next step after understanding your current Valkey metrics is to correlate them with application behavior and set up automated alerting based on these values.
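
As a starting point for that alerting, a threshold check over a flat dict of parsed metrics might look like this (check_alerts is our own name, and the thresholds are illustrative — tune them for your workload):

```python
def check_alerts(metrics: dict) -> list:
    """Return human-readable alerts for metrics that cross thresholds."""
    alerts = []
    if metrics.get("mem_fragmentation_ratio", 1.0) > 1.5:
        alerts.append("high memory fragmentation")
    if metrics.get("evicted_keys", 0) > 0:
        alerts.append("keys are being evicted; memory limit reached")
    if metrics.get("aof_last_write_status", "ok") != "ok":
        alerts.append("AOF write failing")
    return alerts

check_alerts({"mem_fragmentation_ratio": 2.0, "evicted_keys": 10})
# → ["high memory fragmentation", "keys are being evicted; memory limit reached"]
```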

Want structured learning?

Take the full Valkey course →