Collision Resolution with Hashing

Collision Resolution with Hashing

While there are many hash functions, new programmers are confused about which one to choose. There’s no such formula for choosing the right hash function. Clustering or collision is the most commonly occurring problems in hash functions and needs to be addressed appropriately.



Collision Resolution Techniques:

When one or more hash values compete with a single hash table slot, collisions occur. To resolve this, the next available empty slot is assigned to the current hash value. The most common methods are open addressing, chaining, probabilistic hashing, perfect hashing and coalesced hashing technique.

Let’s understand them in more detail:



a) Chaining:

This technique implements a linked list and is the most popular collision resolution techniques. Below is an example of a chaining process.



Here, since one slot has 3 elements – {50, 85, 92}, a linked list is assigned to include the other 2 items {85, 92}. When you use the chaining technique, inserting or deleting of items with the hash table is fairly simple and high performing. Likewise, a chain hash table inherits the pros and cons of a linked list. Alternatively, chaining can use dynamic arrays instead of linked lists.



b) Open Addressing:

This technique depends on space usage and can be done with linear or quadratic probing techniques. As the name says, this technique tries to find an available slot to store the record. It can be done in one of the 3 ways –


  • Linear probing – Here, the next probe interval is fixed to 1. It supports best caching but miserably fails at clustering.

  • Quadratic probing – the probe distance is calculated based on the quadratic equation. This is considerably a better option as it balances clustering and caching.

  • Double hashing – Here, the probing interval is fixed for each record by a second hashing function. This technique has poor cache performance although it does not have any clustering issues.

Below are some of the hashing techniques that can help in resolving collision.



c) Probabilistic hashing:

This is memory based hashing that implements caching. When collision occurs, either the old record is replaced by the new or the new record may be dropped. Although this scenario has a risk of losing data, it is still preferred due to its ease of implementation and high performance.



d) Perfect hashing:

When the slots are uniquely mapped, there is very less chances of collision. However, it can be done where there is a lot of spare memory.



e) Coalesced hashing:

This technique is a combo of open address and chaining methods. A chain of items are stored in the table when there is a collision. The next available table space is used to store the items to prevent collision.
Learn more about Data Structures and Algorithms for your next interview from Data Structures & Algorithms Interview Questions You’ll Most Likely Be Asked.
Learn more about Data Structures and Algorithms for your next interview from our :
Job Interview Question Book Series