Chapter 11: Graphs

The Universal Connector

Throughout this course, we have seen how lists organize data in a line and how trees organize data in a hierarchy. However, many real-world relationships are far more complex than a simple "parent-child" or "next-in-line" structure. Think of a social network where any person can be friends with any other person, or a city map where streets connect various locations in a tangled web. A Graph is the most general and versatile data structure in computer science because it can model these arbitrary connections with ease. If you can describe a relationship between two things, you can represent it with a graph.

In formal terms, a graph consists of a set of Vertices (also called nodes) and a set of Edges that connect them. Vertices represent the entities, such as users in a social network or cities on a map, while edges represent the relationships between them, such as "is-friends-with" or "connected-by-road." Because graphs are so flexible, they are used to solve some of the most difficult problems in computing, from calculating the fastest GPS route to recommending new movies based on your viewing history.

Graph Anatomy and Types

Graphs come in many different flavors depending on the nature of the relationships they represent. An Undirected Graph has edges that are bidirectional; if there is a connection between node A and node B, you can travel in both directions. A Directed Graph (or Digraph), however, has edges with a specific direction, usually represented by an arrow. For example, a "Following" relationship on social media is directed: you can follow a celebrity without them following you back.

We also categorize graphs by whether their connections have costs associated with them. A Weighted Graph assigns a value (a weight) to each edge, representing a distance, time, or cost. An Unweighted Graph treats all connections as equal. Furthermore, a graph can be Cyclic if it contains a path that starts and ends at the same vertex, or Acyclic if such paths are impossible. The most famous acyclic graph is the "Directed Acyclic Graph" (DAG), which is used to model dependencies in software build systems and blockchain technologies.

Representing Graphs in Code

Because graphs are non-linear and potentially very dense, we need efficient ways to store them in memory. There are two primary methods used in industry: the Adjacency Matrix and the Adjacency List. Each has its own strengths and weaknesses, and choosing the right one is a key decision in algorithm design.

1. Adjacency Matrix: The Fast Lookup

An Adjacency Matrix uses a 2D array where the rows and columns represent the vertices. If there is an edge between vertex $i$ and vertex $j$ , the cell matrix[i][j] is set to 1 (or the weight of the edge); otherwise, it is 0. This representation is excellent for "Dense Graphs," where many nodes are connected to each other. It allows you to check if an edge exists between any two nodes in constant $O(1)$ time. However, it is very wasteful for "Sparse Graphs," as it always requires $O(V^2)$ space regardless of how few edges actually exist.

# Adjacency Matrix in Python
class MatrixGraph:
    def __init__(self, num_vertices):
        self.matrix = [[0] * num_vertices for _ in range(num_vertices)]
    
    def add_edge(self, u, v, weight=1):
        self.matrix[u][v] = weight
        # For undirected: self.matrix[v][u] = weight

graph = MatrixGraph(4)
graph.add_edge(0, 1) # Connection from node 0 to 1

2. Adjacency List: The Memory Saver

The Adjacency List is the most popular way to represent graphs, especially sparse ones like social networks. In this format, we maintain an array or a hash map where each index corresponds to a vertex. Each index stores a list of all the other vertices it is connected to. This approach is highly memory-efficient because it only stores the edges that actually exist, using $O(V + E)$ space. While checking for a specific edge is slightly slower ( $O(\text{degree of node})$ ), iterating through all the neighbors of a node is much faster than in a matrix.

// Adjacency List in JavaScript
class ListGraph {
    constructor() {
        self.adjList = new Map();
    }

    addVertex(v) {
        if (!self.adjList.has(v)) self.adjList.set(v, []);
    }

    addEdge(u, v) {
        // Ensure both vertices exist
        this.addVertex(u);
        this.addVertex(v);
        this.adjList.get(u).push(v);
    }
}

Modeling Relationships: Weighted Connections

In many applications, the mere existence of a connection isn't enough; we need to know the quality or cost of that connection. In a GPS app, the weight of an edge might be the distance between two intersections or the time it takes to drive between them. In an Adjacency List, we represent this by storing objects or pairs instead of just integers. For example, the list for node "New York" might contain {node: "Boston", weight: 215}. This allows algorithms like Dijkstra's to calculate the "Shortest Path" by finding the sequence of edges with the minimum total weight.

// Weighted Adjacency List in TypeScript
type Edge = { to: string; weight: number };

class WeightedGraph {
    private adj: Map<string, Edge[]> = new Map();

    addNode(node: string) {
        if (!this.adj.has(node)) this.adj.set(node, []);
    }

    addEdge(from: string, to: string, weight: number) {
        this.addNode(from);
        this.addNode(to);
        this.adj.get(from)?.push({ to, weight });
    }
}

Graph Connectivity and Degrees

We also measure nodes by their Degree, which is the number of edges connected to them. In a directed graph, we split this into In-degree (how many arrows point to the node) and Out-degree (how many arrows point away). A node with a high out-degree is often an "Influencer" or a central hub in a network. A graph is considered Connected if there is a path between every pair of vertices. In a directed graph, if you can reach every node from every other node, it is called Strongly Connected. If you can only reach them by ignoring the arrow directions, it is Weakly Connected.

Key Takeaway: Graphs are the most powerful tool for representing non-linear, non-hierarchical data. By choosing between a matrix for speed and a list for memory efficiency, you can model everything from social connections to electrical grids. In the next chapter, we will learn how to search through these complex webs using Graph Traversals.