Basics of Hash Data Structures
by Vibrant Publishers
A Hash Table is an abstract data structure that stores data in the key-value form. Here the key is the index and value is the actual data. Irrespective of the size of the data, accessing it is relatively fast. Hence, hashing is used in search algorithms.
A simple formula for hashing is (key)%(size), where the key is the element’s key from the key-value pair and size is the hash table size. This is done for mapping an item when storing in the hash table.
The below illustration shows how hashing works.
Hash Table:
A hash table uses an array to store data and hashing technique is used for generating an index.A simple formula for hashing is (key)%(size), where the key is the element’s key from the key-value pair and size is the hash table size. This is done for mapping an item when storing in the hash table.
The below illustration shows how hashing works.
As shown above, we calculate the slot where the item can be stored using % modulo of the hash table size.
Applications:
Hash tables make indexing faster, because of which they are used to store and search large volumes of data. Hash tables have 4 major functions: put, value pair and get, contains, and remove. Caching is a real-time example of where hashing is used.
Hash tables are also used to store relational data.
Issues with Hashing:
Although hash tables are used in major search programs, they sometimes take a lot of time especially when they need to execute a lot of hash functions. Data is randomly distributed, so the index has to be searched first. There are many complex hash functions, hence they may be prone to errors. An unwarranted collision occurs if a function is poorly coded.
Types of Hashing:
-
Probabilistic hashing: Caching is implemented by hashing in the memory. The old record is either replaced by a new record or the new record is ignored whenever there is a collision.
-
Perfect Hashing: When the number of items and the storage space is constant, the slots are uniquely mapped. This method is desirable, but not always guaranteed. This ideal scenario can be achieved by mapping a large hash table with unique spots provided there is a lot of free memory. This technique is easy to implement and has a lesser collision probability. It may adversely affect the performance as the entire hash table needs to be traversed for search.
-
Coalesced Hashing: Whenever a collision occurs, the objects are stored in the same table as the chain rather than being stored in a separate table.
- Extensible Hashing: Here the table bucket can be resized depending on the items. This is suitable for implementing time-sensitive applications. Whenever a resize is needed, we can either recreate a bucket or a new bit is then added to the current index with all the additional items and mapping is updated.
For example:
- Linear hashing:
A new slot is then allocated to each item.
To ace your interview, you can explore related topics. Learn about other data structures through the following blogs on our website:
- The Basics of Stack Data Structure
- Trees
- Know your Queue Data Structures in 60 seconds
- Basics of hash data structures
Get one step closer to your dream job!
Under our Job Interview Questions book series, we have books designed to help you clear your interview with flying colors, no matter which field you are in. These include:
- HR Interview Questions You’ll Most Likely Be Asked (Third Edition)
- Innovative Interview Questions You’ll Most Likely Be Asked
- Leadership Interview Questions You’ll Most Likely Be Asked
Share