Basis of Hash Data Structures

by Vibrant Publishers

Hash Table is an abstract data structure that stores data in the key-value form. Here the key is the index and value is the actual data. Irrespective of the size of data, accessing them is relatively fast. Hence, hashing is used in search algorithms.

Hash Table:

A hash table uses an array to store data and hashing technique is used for generating an index.
A simple formula for hashing is (key)%(size), where the key is the element’s key from the key-value pair and size is the hash table size. This is done for mapping an item when storing in the hash table.

Below illustration shows how hashing works.

As shown above, we calculate the slot where the item can be stored using % modulo of the hash table size.

Applications:

Hash tables make indexing faster because of which they are used to store and search large volumes of data. Hash tables have 4 major functions: put, value pair and get, contains and remove. Caching is a real-time example of where hashing is used.

Hash tables are also used to store relational data.

Issues with Hashing:

Although hash tables are used in major search programs, they sometimes take a lot of time especially when they need to execute a lot of hash functions. Data is randomly distributed, so the index has to be searched first. There are many complex hash functions, hence they may be prone to errors. Unwarranted collision occurs if a function is poorly coded.

Types of Hashing:

Probabilistic hashing: Caching is implemented by hashing in the memory. The old record is either replaced by new records or the new record is ignored whenever there is a collision.
Perfect Hashing: When the number of items and the storage space is constant, the slots are uniquely mapped. This method is desirable, but not always guaranteed. This ideal scenario can be achieved by mapping large hash table with unique spots provided there is a lot of free memory. This technique is easy to implement and has lesser collision probability. It may adversely affect the performance as the entire hash table need to be traversed for search.
Coalesced Hashing: Whenever a collision occurs, the objects are stored in the same table as chain compared to storing in a separate table.
Extensible Hashing: Here the table bucket can be resized depending on the items. This is suitable to implement time-sensitive applications. Whenever a resize is needed, we can either recreate a bucket or a new bit is then added to the current index with all the additional items and mapping is updated.

For example:

Linear hashing:
Whenever an overflow occurs, the hash table is resized and the table will be rehashed. The overflow items are added to the bucket. When the number of overflow items change, the table is rehashed.

A new slot is then allocated to each item.

Get one step closer to your dream job!

Prepare for your programming interview with Data Structures & Algorithms Interview Questions You’ll Most Likely Be Asked. This book has a whopping 200 technical interview questions on lists, queues, stacks, and algorithms and 77 behavioral interview questions crafted by industry experts and interviewers from various interview panels.

Back to blog