Graphs in Data Structures

In this blog on Graphs in Data Structures, we will learn about graph representation, operations, some graph-related algorithms, and their uses. The graph data structure is widely used in real-life applications. Its main implementation is seen in networking, whether it is on the internet or in transportation. A network of nodes is easily represented through a graph. Here, nodes can be people on a social media platform like Facebook friends or LinkedIn connections or some electric or electronic components in a circuit. Telecommunication and civil engineering networks also make use of the graph data structure.

Graphs in Data Structures:

Graphs in data structures consist of a finite number of nodes connected through a set of edges. Graphs visualize a relationship between objects or entities.

Nodes in graphs are entities. The relationship between them is defined through paths called edges. Nodes are also called Vertices of graphs. Edges represent a unidirectional or bi-directional relation between the vertices.

An undirected edge between two entities or nodes, A and B, represents a bi-directional relation between them.

The root node is the parent node of all other nodes in a graph, and it does not have a parent node itself. Every graph has a single root node. Leaf nodes have no child nodes but only contain parent nodes. There are no outgoing edges from leaf nodes.

Types of Graphs:

Undirected Graph: A graph with all bi-directional edges is an undirected graph. In them undirected graph below, we can traverse from node A to node B as well as from node B to node A as the edges are bi-directional.

Directed Graph: This graph consists of all unidirectional edges. All edges are pointing in a single direction. In the directed graph below, we can traverse from node A to node B but not from node B to node A as the edges are unidirectional.

A tree is a special type of undirected graph. Any two nodes in a tree are connected only by one edge. In this way, there are no closed loops or circles in a tree., whereas looped paths are present in graphs.

The maximum number of edges in a tree with n nodes is n-1. In graphs, nodes can have more than one parent node, but in trees, each node can only have one parent node except for the root node that has no parent node.

Graph representation:

Usually, graphs in data structures are represented in two ways. Adjacency Matrix and Adjacency List. First, let’s see: what is an adjacency in a graph? We say that a vertex or a node is adjacent to another vertex if they have an edge between them. Otherwise, the nodes are non-adjacent. In the graph below, vertex B is adjacent to C but non-adjacent to node D.

Adjacency Matrix:

In the adjacency matrix, a graph is represented as a 2-dimensional array. The two dimensions in the form of rows and columns represent the vertices and their value. The values in the form of 1 and 0 tell whether the edge is present between the two vertices or not. For example, if the value of M[A][D] is 1, it means there is an edge between the vertices A and D of the graph.

See the graph below and then its adjacency matrix representation for better understanding.

The adjacency matrix gives the benefit of an efficient edge lookup, that is, whether the two vertices have a connected edge or not. But at the same time, it requires more space to store all the possible paths between the edges. For fast results, we have to compromise on space.

Adjacency Lists:

In the adjacency list, an array of lists is used to represent a graph. Each list contains all edges adjacent to a particular vertex. For example, list LA will contain all the edges adjacent to the vertex A of a graph.

As the adjacency list stores information only for edges that are present in the graph, it proves to be a space-efficient representation. For cases such as sparse matrix where zero values are more, an adjacency list is a good option, whereas an adjacency matrix will take up a lot of unnecessary space.

The adjacency list of the above graph would be as,

Graph Operations:

Some common operations that can be performed on a graph are:

Element search
Graph traversal
Insertion in graph
Finding a path between two nodes

Graph Traversal Algorithms:

Visiting the vertices of a graph in a specific order is called graph traversal. Many graph applications require graph traversal according to its topology. Two common algorithms for graph traversal are Breadth-First Search and Depth First Search.

Breadth-First Search Graph Traversal:

The breadth-first search approach visits the vertices breadthwise. It covers all the vertices at one level of the graph before moving on to the next level. The breadth-first search algorithm is implemented using a queue.

The BFS algorithm includes three steps:

Pick a vertex and insert all its adjacent vertices in a queue. This will be level 0 of the graph.
Dequeue a vertex, mark it visited, and insert all its adjacent vertices in the queue. This will be level 1. If the vertex has no adjacent nodes, delete it.
Repeat the above two steps until the whole graph is covered or the queue is empty.

Consider the graph below where we have to search node E.

First, we will visit node A then move on to the next level and visit nodes B and C. When nodes at this level are finished we move on to the next level and visit nodes D and E.

Nodes once visited can be stored in an array, whereas adjacent nodes are stored in a queue.

The time complexity of the BFS graph is 0(V+E), where V is the number of vertices and E is the number of edges.

Depth First search Graph Traversal:

The depth-first search (DFS) algorithm works on the approach of backtracking. It keeps moving on to the next levels, or else it backtracks. DFS algorithms can be implemented using stacks. It includes the following steps.

Pick a node and push all its adjacent vertices onto a stack.

Pop a vertex, mark it visited, and push all its adjacent nodes onto a stack. If there are no adjacent vertices, delete the node.

Repeat the above two steps until the required result is achieved or the stack gets empty.

Consider the graph below:

To reach node E we start from node A and move downwards, as deep as we can go. We will reach node C. Now, we backtrack to node B, to go downwards towards another adjacent vertex of B. In this way, we reach node E.

IsReachable is a common problem scenario for graphs in Data Structures. The problem states to find whether there is a path between two vertices or not, that is whether a vertex is reachable from the other vertex or not. If the path is present, the function returns true else false.

This problem can be solved with both of the traversal approaches we have discussed above, that is BFS and DFS.

Consider a function with three input parameters. The graph and the two vertices we are interested in. Let’s call them u and v. Using a BFS algorithm, the solution consists of the following steps:

Initialize an array to store visited vertices of size equal to the size of the graph.
Create a queue and enqueue the first vertex, in this case, u = 1. Mark u as visited and store it in the array.
Add all the adjacent vertices of u in the queue.
Now, dequeue the front element from the queue. Enqueue all its adjacent vertices in the queue. If any vertex is the required vertex v, return true. Else continue visiting adjacent nodes and keep on adding them to the queue.

Repeat the above process in a loop. Since there is a path from vertex 1 to vertex 3 we will get a true value.

If there is no path, for example, from node 1 to node 8, our queue will run empty, and false will be returned, indicating that vertex u is not reachable to vertex v.

Below is an implementation of the IsReachable function for the above graph in Python programming language.

Weighted Graphs:

Weighted graphs or di-graphs have weights or costs assigned to the edges present in the graph. Such weighted graphs are used in problems such as finding the shortest path between two nodes on the graph. These graph implementations help in several real-life scenarios. For example, a weighted graph approach is used in computing the shortest route on the map or to trace out the shortest path for delivery service in the city. Using the cost on each edge we can compute the fastest path.

Weighted graphs lead us to an important application of graphs, namely finding the shortest path on the graph. Let’s see how we can do it.

Finding the Shortest Path on the Graph:

The goal is to find the path between two vertices such that the sum of the costs of their edges is minimal. If the cost or weight of each edge is one, then the shortest path can be calculated using a Breadth-first Search approach. But if costs are different, then different algorithms can be used to get the shortest path.

Three algorithms are commonly used for this problem.

Bellman Ford’s Algorithm
Dijkstra’s Algorithm
Floyd-Warshall’s Algorithm

This is an interesting read on shortest path algorithms.

Minimum Spanning Tree:

A spanning tree is a sub-graph of an undirected graph containing all the vertices of the graph with the minimum possible number of edges. If any vertex is missing it is not a spanning tree. The total number of spanning trees that can be formed from n vertices of a graph is n(n-2).

A type of spanning tree where the sum of the weights of the edges is minimum is called a Minimum Spanning Tree. Let’s elaborate on this concept through an example.

Consider the following graph.

The possible spanning trees for the above graph are:

The Minimum Spanning tree for the above graph is where the sum of edges’ weights = 7.

Incidence Matrix:

The Incidence matrix relates the vertices and edges of a graph. It stores vertices as rows and edges as columns. In contrast, to the adjacency matrix where both rows and columns are vertices.

An incidence matrix of an undirected graph is nxm matrix A where n is the number of vertices of the graph and m is the number of edges between them. If Ai,j =1, it means the vertex vi and edge ej are incident.

Incidence List:

Just like the adjacency list, there is also an incidence list representation of a graph. The incident list is implemented using an array that stores all the edges incident to a vertex.

For the following graph, the list Li shows the list of all edges incident to vertex A in the above graph.

Ending Note:

In this blog on Graphs in Data Structures, we discussed the implementation of graphs, their operations, and real-life applications. Graphs are widely used, especially in networking, whether it is over the internet or for mapping out physical routes. Hopefully, now you have enough knowledge regarding a data structure with extensive applications.