Collision Resolution with Hashing

by Vibrant Publishers
While many hash functions exist, new programmers need clarificaion about which one to choose. There’s no formula available for choosing the right hash function. Clustering or collision is the most addressed appropriately.

 

 

Collision Resolution Techniques:

When one or more hash values compete with a single hash table slot, collisions occur. To resolve this, the next available empty slot is assigned to the current hash value. The most common methods are open addressing, chaining, probabilistic hashing, perfect hashing and coalesced hashing techniques.

Let’s understand them in more detail:

 

 

a) Chaining:

This technique implements a linked list and is the most popular out of all the collision resolution techniques. Below is an example of a chaining process.

 

 

Since one slot here has 3 elements – {50, 85, 92}, a linked list is assigned to include the other 2 items {85, 92}. When you use the chaining technique, the insertion or deletion of items with the hash table is fairly simple and high-performing. Likewise, a chain hash table inherits the pros and cons of a linked list. Alternatively, chaining can use dynamic arrays instead of linked lists.

 

 

b) Open Addressing:

This technique depends on space usage and can be done with linear or quadratic probing techniques. As the name says, this technique tries to find an available slot to store the record. It can be done in one of the 3 ways –

 

  • Linear probing – Here, the next probe interval is fixed to 1. It supports the best caching but miserably fails at clustering.
  • Quadratic probing – the probe distance is calculated based on the quadratic equation. This is considerably a better option as it balances clustering and caching.
  • Double hashing – Here, the probing interval is fixed for each record by a second hashing function. This technique has poor cache performance although it does not have any clustering issues.
    Below are some of the hashing techniques that can help in resolving collision.

     

     

    c) Probabilistic hashing:

    This is memory-based hashing that implements caching. When a collision occurs, either the old record is replaced by the new or the new record may be dropped. Although this scenario has a risk of losing data, it is still preferred due to its ease of implementation and high performance.

     

     

    d) Perfect hashing:

    When the slots are uniquely mapped, the chances of collision are minimal. However, it can be done where there is a lot of spare memory.

     

     

    e) Coalesced hashing:

    This technique is a combo of open address and chaining methods. A chain of items is are stored in the table when there is a collision. The next available table space is used to store the items to prevent collision.
     

    Check out the books we have, which are designed to help you clear your interview with flying colors, no matter which field you are in. These include HR Interview Questions You’ll Most Likely Be Asked (Third Edition) and Innovative Interview Questions You’ll Most Likely Be Asked.