Consistent hashing is a surprisingly effective way to distribute data across a changing set of servers without causing a cascade of reassignments.
Let’s see it in action. Imagine we have a cache with three servers: cache-01, cache-02, and cache-03. We want to store and retrieve items like user:123 or product:456.
import hashlib
import bisect


class ConsistentHash:
    def __init__(self, nodes=None, replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self.ring = {}             # hash value -> node name
        self.sorted_keys = []      # sorted hash values on the ring
        if nodes:
            for node in nodes:
                self.add_node(node)

    def add_node(self, node):
        # Place `replicas` virtual nodes for this server on the ring.
        for i in range(self.replicas):
            key = self._hash(f"{node}-{i}")
            self.ring[key] = node
            bisect.insort(self.sorted_keys, key)

    def remove_node(self, node):
        # Remove all of this server's virtual nodes from the ring.
        for i in range(self.replicas):
            key = self._hash(f"{node}-{i}")
            del self.ring[key]
            self.sorted_keys.remove(key)

    def get_node(self, key):
        if not self.ring:
            return None
        hash_key = self._hash(key)
        # Find the first virtual node clockwise from hash_key,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self.sorted_keys, hash_key)
        if idx == len(self.sorted_keys):
            idx = 0
        return self.ring[self.sorted_keys[idx]]

    def _hash(self, value):
        # MD5 is used here for its uniform distribution, not for security.
        return int(hashlib.md5(value.encode('utf-8')).hexdigest(), 16)
# --- Example Usage ---
nodes = ["cache-01", "cache-02", "cache-03"]
ch = ConsistentHash(nodes)

# Distribute keys
keys_to_distribute = ["user:123", "product:456", "session:abc", "image:xyz", "config:def"]
key_assignments = {}
for key in keys_to_distribute:
    assigned_node = ch.get_node(key)
    key_assignments[key] = assigned_node
    print(f"Key '{key}' assigned to: {assigned_node}")

print("\n--- Adding a new node ---")
ch.add_node("cache-04")

# See which keys are reassigned
reassignments = {}
for key in keys_to_distribute:
    new_assigned_node = ch.get_node(key)
    if key_assignments[key] != new_assigned_node:
        reassignments[key] = (key_assignments[key], new_assigned_node)
        print(f"Key '{key}' reassigned from {key_assignments[key]} to {new_assigned_node}")
        key_assignments[key] = new_assigned_node  # Update for subsequent checks
if not reassignments:
    print("No keys were reassigned!")

print("\n--- Removing a node ---")
ch.remove_node("cache-02")

# See which keys are reassigned again
reassignments_after_removal = {}
for key in keys_to_distribute:
    new_assigned_node = ch.get_node(key)
    if key_assignments[key] != new_assigned_node:
        reassignments_after_removal[key] = (key_assignments[key], new_assigned_node)
        print(f"Key '{key}' reassigned from {key_assignments[key]} to {new_assigned_node}")
        key_assignments[key] = new_assigned_node  # Update for subsequent checks
if not reassignments_after_removal:
    print("No keys were reassigned!")
The core problem consistent hashing solves is maintaining a relatively stable mapping of data to servers when servers are added or removed. Traditional hashing (like hash(key) % num_servers) would require almost every key to be remapped when num_servers changes, leading to massive data shuffling and downtime. Consistent hashing mitigates this by mapping both keys and servers onto a conceptual "ring" (often represented by a sorted list of hash values). When a server is added, it only affects keys that fall into its new segment on the ring. When a server is removed, its keys are redistributed only to the next node clockwise on the ring. The replicas parameter creates multiple virtual nodes for each physical server, which smooths out the distribution and ensures that a removed server's load is spread across many remaining servers rather than dumped entirely onto a single neighbor.
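To make the contrast concrete, here is a small sketch of the modulo approach's failure mode: it counts how many of 10,000 synthetic keys land on a different server when a fourth server joins a three-server cluster. (The key names and server counts are illustrative; MD5 is used rather than Python's built-in hash() because the latter is salted per process.)

```python
import hashlib

def bucket(key, num_servers):
    # Naive placement: hash the key, then take it modulo the server count.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % num_servers

keys = [f"user:{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 3) != bucket(k, 4))
print(f"{moved / len(keys):.0%} of keys moved")  # roughly 75% in practice
```

With a uniform hash, a key keeps its bucket only when h % 3 == h % 4, which happens for just 3 of every 12 hash values, so about 75% of keys move. Consistent hashing cuts that to roughly the 25% that the new server should own.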
Internally, the ConsistentHash class uses a dictionary (self.ring) to map hash values to actual node names and a sorted list (self.sorted_keys) to efficiently find the correct node on the ring. When add_node or remove_node is called, it iterates through a configurable number of virtual replicas for that node, hashing each replica and updating the ring and sorted keys. The get_node method hashes the incoming key and uses bisect_left to find the insertion point in the sorted list of hash values; that position corresponds to the first virtual node encountered clockwise on the ring, wrapping around to the start of the list if the key hashes past the last entry.
The number of virtual nodes (replicas) is a crucial tuning parameter. A higher number of replicas leads to a more even distribution of keys across the available nodes, but also increases the memory overhead of storing the ring and the computational cost of adding/removing nodes. Conversely, a lower number of replicas is more memory-efficient but can result in uneven load distribution, especially with a small number of physical nodes. The goal is to find a balance that provides good distribution without excessive overhead.
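The effect of the replicas parameter is easy to measure empirically. The sketch below (a self-contained, stripped-down version of the ring logic above, with illustrative key and node names) assigns 30,000 keys to three nodes at several replica counts and reports the per-node counts; the spread between the busiest and quietest node should shrink as replicas grows.

```python
import hashlib
import bisect
from collections import Counter

def build_ring(nodes, replicas):
    # Map each virtual node's hash to its physical node.
    ring = {}
    for node in nodes:
        for i in range(replicas):
            h = int(hashlib.md5(f"{node}-{i}".encode()).hexdigest(), 16)
            ring[h] = node
    return sorted(ring), ring

def lookup(sorted_keys, ring, key):
    # First virtual node clockwise from the key's hash, wrapping around.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    idx = bisect.bisect_left(sorted_keys, h)
    return ring[sorted_keys[idx % len(sorted_keys)]]

nodes = ["cache-01", "cache-02", "cache-03"]
keys = [f"key:{i}" for i in range(30_000)]
for replicas in (1, 10, 100, 500):
    sorted_keys, ring = build_ring(nodes, replicas)
    counts = Counter(lookup(sorted_keys, ring, k) for k in keys)
    spread = max(counts.values()) - min(counts.values())
    print(f"replicas={replicas:>3}: {dict(counts)}, spread={spread}")
```

With a single replica per node, one node can easily own half the ring; at a few hundred replicas the segments average out and the counts converge toward an even third each.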
A common misconception is that consistent hashing completely eliminates data movement. While it minimizes it, when a node is added, the keys that now fall within its responsibility must be moved from the node that was previously responsible for them. Similarly, when a node is removed, its keys are moved to its successor. The "minimal disruption" comes from the fact that only a fraction of the total keys are affected, rather than almost all of them.
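That fraction can be checked directly. The sketch below (again a compact, self-contained ring, with hypothetical node names matching the examples above) computes assignments for 10,000 keys before and after adding cache-04; every key that moves should move to the new node, and the moved fraction should sit near 1/4.

```python
import hashlib
import bisect

def md5_int(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

def assign(keys, nodes, replicas=100):
    # Build the ring, then map every key to its clockwise successor.
    ring = {md5_int(f"{n}-{i}"): n for n in nodes for i in range(replicas)}
    points = sorted(ring)
    def owner(key):
        idx = bisect.bisect_left(points, md5_int(key))
        return ring[points[idx % len(points)]]
    return {k: owner(k) for k in keys}

keys = [f"item:{i}" for i in range(10_000)]
before = assign(keys, ["cache-01", "cache-02", "cache-03"])
after = assign(keys, ["cache-01", "cache-02", "cache-03", "cache-04"])
moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of keys moved")  # expect roughly 1/4
```

Notice that no key moves between the three original nodes: the only reassignments are segments of the ring that cache-04 now claims, which is exactly the "minimal disruption" property.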
The next challenge is handling the actual data migration that occurs when keys are reassigned due to node additions or removals.