Collision Resolution with Hashing
by Vibrant Publishers
While many hash functions exist, new programmers need clarification about which one to choose. There’s no formula available for choosing the right hash function. Clustering or collision is the most common problem in hash functions and must be addressed appropriately.
Let’s understand them in more detail:
To ace your interview, you can explore related topics. Learn about other data structures through the following blogs on our website:
Pave the route to your dream job by preparing for your interview with questions from our Job Interview Questions Book Series These provide a comprehensive list of questions for your interview regardless of your area of expertise. These books include:
Collision Resolution Techniques:
When one or more hash values compete with a single hash table slot, collisions occur. To resolve this, the next available empty slot is assigned to the current hash value. The most common methods are open addressing, chaining, probabilistic hashing, perfect hashing and coalesced hashing techniques.Let’s understand them in more detail:
a) Chaining:
This technique implements a linked list and is the most popular out of all the collision resolution techniques. Below is an example of a chaining process.
Since one slot here has 3 elements – {50, 85, 92}, a linked list is assigned to include the other 2 items {85, 92}. When you use the chaining technique, the insertion or deletion of items with the hash table is fairly simple and high-performing. Likewise, a chain hash table inherits the pros and cons of a linked list. Alternatively, chaining can use dynamic arrays instead of linked lists.
b) Open Addressing:
This technique depends on space usage and can be done with linear or quadratic probing techniques. As the name says, this technique tries to find an available slot to store the record. It can be done in one of the 3 ways –
- Linear probing – Here, the next probe interval is fixed to 1. It supports the best caching but miserably fails at clustering.
- Quadratic probing – the probe distance is calculated based on the quadratic equation. This is considerably a better option as it balances clustering and caching.
- Double hashing – Here, the probing interval is fixed for each record by a second hashing function. This technique has poor cache performance although it does not have any clustering issues.
c) Probabilistic hashing:
This is memory-based hashing that implements caching. When a collision occurs, either the old record is replaced by the new or the new record may be dropped. Although this scenario has a risk of losing data, it is still preferred due to its ease of implementation and high performance.
d) Perfect hashing:
When the slots are uniquely mapped, the chances of collision are minimal. However, it can be done where there is a lot of spare memory.
e) Coalesced hashing:
This technique is a combo of open address and chaining methods. A chain of items is are stored in the table when there is a collision. The next available table space is used to store the items to prevent collision.To ace your interview, you can explore related topics. Learn about other data structures through the following blogs on our website:
- The Basics of Stack Data Stucture
- Trees
- Basics of hash data structures
- Know your Queue Data Structures in 60 seconds
Pave the route to your dream job by preparing for your interview with questions from our Job Interview Questions Book Series These provide a comprehensive list of questions for your interview regardless of your area of expertise. These books include:
- HR Interview Questions You’ll Most Likely Be Asked (Third Edition)
- Innovative Interview Questions You’ll Most Likely Be Asked
- Leadership Interview Questions You’ll Most Likely Be Asked
Share