Data Structures for Sets

Data Structures for Sets

A set is a group of unique, yet similar items related to each other where order doesn’t matter. For example, a set of balls as shown in the picture is a real-world example for a set.





Below represents a set of balls, but none of them are the same. We can also arrange them in any order. We cannot extend the set to add new items as they are finite.  ball={football,baseball,basketball,cricket ball, tennis ball}



How are sets implemented?

Set is an abstract data type that uses List, an Associative array or Bit array for its implementation. We can implement a simple data representation of similar items using Linear list. However, if you are representing Boolean results like True or False, Bit array comes handy as it requires very less storage space. Hash tables and binary trees are used to depict search algorithms and Associative arrays are very useful in those scenarios.





A null set contains no elements. Null sets are used when we are not sure if the elements will be populated as a result of a runtime operation. The values are dynamically populated in a null set.



How are sets implemented?

A set supports the following operations:


  • Subset:

    smaller set of the larger set. In the below example, A is a Subset of a larger portion B. A subset returns 1 if the first set has some elements which are a subset of B, else returns 0.


Here, A is a subset of B and includes A={4,2,1}


  • Union:

    returns a unique set of elements of both Set A and B. For example, pick all the musicians who play a band or who sing a chorus or who sing both and without including the duplicates.


  • Intersection:

    When you need a set of elements common to both sets, you can create an intersection. Below is a representation of intersection on sets A and B.





Types of sets:

  • A Mutable Set:

    Sets can be mutable if you want to create them during runtime. Size of the set can be  dynamically defined and elements can be added or removed during runtime. You can create, or add, delete or check the capacity when working with a mutable set.


  • An Immutable Set:

    When you are sure of the set size and the elements, immutable sets are best. These are static, so you cannot alter the set. You can check the size if empty, build or read the element of the immutable set.


  • A Disjoint Set:

    This set is a collection of subsets that do not overlap with each other.



    A very good way to represent these singleton sets is using a linked list. Elements can be added to the set by pointing to other sets. Undirected graphs are another way to use a disjoint set. Disjoint set supports find, union and subset operations.


    • A Sparse Set:

      This is similar to the Bit set with the only difference that the elements are indices of a larger array. As managing a very large array is a big hassle, sparse set lets you easily break them into multiple sets and reference the parent set.
      Below is an example of a sparse set.



    When you use a sparse set, the entire set is split into multiple coarse-grained arrays and each array stores the indices. This makes search faster and simpler. A sparse set supports Union and intersection operations.



    Get one step closer to your dream job!

    Prepare for your programming interview with Data Structures & Algorithms Interview Questions You’ll Most Likely Be Asked. This book has a whopping 200 technical interview questions on lists, queues, stacks, and algorithms and 77 behavioral interview questions crafted by industry experts and interviewers from various interview panels.