Phases/Working of Consistent Hashing

The following are the phases involved in the process of consistent hashing: 

  1. Hash Function Selection: The first step in consistent hashing is to choose the hash function that maps both keys and nodes onto a common range of values. The function must be deterministic, so that the same key always produces the same hash, and it should spread values uniformly so that collisions are rare (no hash function can guarantee a distinct value for every key). With such a function, keys are mapped to nodes consistently and predictably.
  2. Node Assignment: In this phase, keys are assigned to nodes based on their hash values. The range of hash values is treated as a circle (the hash ring), and each node is placed on it by hashing the node’s identifier. A key is assigned to the first node encountered when moving clockwise from the key’s hash value, wrapping around at the end of the range.
  3. Key Replication: In a distributed system, data should remain accessible even when nodes fail. To achieve this, each key can be copied to several nodes, commonly the owning node and the next few nodes clockwise on the ring. If one node fails, the key can still be served from one of the remaining copies.
  4. Node Addition/Removal: When a node is added to or removed from the network, some keys must be remapped. Consistent hashing limits this to a small portion of the keys: a new node takes over only the keys that fall between its predecessor and itself, and a removed node’s keys pass to its successor. With N nodes, roughly 1/N of the keys move on average.
  5. Load Balancing: Consistent hashing helps distribute the load among the network’s nodes. When a node is overloaded, portions of its keys can be remapped to other nodes to keep the system balanced and efficient.
  6. Failure Recovery: If a node fails, the keys assigned to it are remapped to other nodes in the network, typically its successor on the ring. Combined with replication, this keeps the data accessible even though a node has failed.
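
The phases above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the class name HashRing, the helper ring_hash, the use of the first 8 bytes of MD5 as the ring position, and the node names are all assumptions made for the example. It omits refinements such as virtual nodes.

```python
import bisect
import hashlib


def ring_hash(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical minimal hash ring: one position per node."""

    def __init__(self, nodes=()):
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        point = ring_hash(node)
        if point in self._owner:
            raise ValueError(f"position already taken: {node}")
        bisect.insort(self._points, point)
        self._owner[point] = node

    def remove_node(self, node: str) -> None:
        point = ring_hash(node)
        if self._owner.get(point) != node:
            raise KeyError(node)
        self._points.remove(point)
        del self._owner[point]

    def get_node(self, key: str) -> str:
        """Return the first node at or clockwise after the key's hash."""
        return self.get_replicas(key, 1)[0]

    def get_replicas(self, key: str, count: int) -> list:
        """Return the owner followed by the next nodes clockwise."""
        if not self._points:
            raise LookupError("ring is empty")
        n = len(self._points)
        start = bisect.bisect_left(self._points, ring_hash(key))
        return [self._owner[self._points[(start + i) % n]]
                for i in range(min(count, n))]


ring = HashRing(f"node-{i}" for i in range(1, 6))
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-6")  # phase 4: only keys claimed by node-6 move
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this sketch every key in moved ends up on node-6, and all other keys stay where they were. How many keys move depends on where the node names happen to hash, so a ring with a single position per node can be quite uneven.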

For example:

Let’s say we have 5 nodes in the ring and node 3 fails. The range of the next server node on the ring then widens to cover node 3’s range, and any request whose key falls in this wider range goes to that node. The other nodes keep their ranges, which shows that with consistent hashing only a small portion of the keys is affected.
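
The example can be checked with a small sketch. To keep the arithmetic easy to follow, it assumes a toy ring with only 100 positions and places the five nodes by hand at evenly spaced positions. The ring size, the node names, the positions, and the helper owner are hypothetical and chosen purely for illustration. In a real system the positions would come from the hash function.

```python
import bisect

RING_SIZE = 100  # hypothetical toy ring with positions 0..99


def owner(position, nodes):
    """Return the first node at or clockwise after `position`."""
    points = sorted(nodes.values())
    by_point = {point: name for name, point in nodes.items()}
    idx = bisect.bisect_left(points, position) % len(points)
    return by_point[points[idx]]


# Assumed placement: five nodes spaced evenly around the ring.
nodes = {"node1": 10, "node2": 30, "node3": 50, "node4": 70, "node5": 90}
before = {pos: owner(pos, nodes) for pos in range(RING_SIZE)}

# node3 fails: drop it and recompute the owner of every position.
survivors = {name: p for name, p in nodes.items() if name != "node3"}
after = {pos: owner(pos, survivors) for pos in range(RING_SIZE)}

moved = [pos for pos in range(RING_SIZE) if before[pos] != after[pos]]
print(len(moved), "of", RING_SIZE, "positions changed owner")  # 20 of 100
print({after[pos] for pos in moved})                           # {'node4'}
```

Under these assumptions only positions 31 to 50, the range node3 used to own, change owner, and all of them go to node4, the next node clockwise. The remaining 80 positions are untouched.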
