Lesson 43 of 50 advanced

Graphs — Connecting the Dots

Explore the data structure that models friendships, maps, networks, and so much more!

Open interactive version (quiz + challenge)

Real-world analogy

Imagine a map of cities connected by roads. Each city is a dot (we call it a node or vertex), and each road is a line connecting two dots (we call it an edge). Some roads are one-way streets (directed edges) and some go both ways (undirected edges). A graph is just this: dots and lines! Your social media friends list is a graph — you are a dot, your friends are dots, and friendships are the lines connecting you. Once you see the world as a graph, you see graphs literally everywhere!

What is it?

A graph is a data structure consisting of nodes (vertices) connected by edges. It is used to model relationships and connections between objects. Graphs can be directed or undirected, weighted or unweighted. BFS and DFS are the two fundamental algorithms for traversing graphs, and they form the basis of almost all graph algorithms in competitive programming.

Real-world relevance

Graphs are everywhere in the real world! Google Maps uses graphs where cities are nodes and roads are edges to find the shortest route. Social networks like Facebook and Instagram are giant graphs where users are nodes and connections are edges. The internet itself is a graph of web pages connected by hyperlinks. Even your brain is a graph — neurons connected by synapses!

Key points

What is a Graph? — A graph is a collection of nodes (also called vertices) connected by edges. Think of nodes as cities and edges as roads between them. Graphs can represent anything with connections: friendships on social media, web pages linked together, cities connected by flights, or even moves in a chess game. They are one of the most important data structures in all of computer science!
Directed vs Undirected Graphs — In an undirected graph, edges go both ways — like a two-way road. If city A connects to city B, then B also connects to A. In a directed graph, edges have a direction — like a one-way street. A might connect to B, but B might not connect back to A. Twitter follows are directed (you can follow someone who does not follow you back). Facebook friendships are undirected (if you are friends, it goes both ways).
How to Store a Graph — Adjacency List — The most common way to store a graph in competitive programming is an adjacency list. For each node, you keep a list of its neighbors. In C++, we use vector> adj(n). If there is an edge from node 1 to node 3, we add 3 to adj[1]. For undirected graphs, we add both directions: adj[1].push_back(3) AND adj[3].push_back(1). This is memory-efficient and fast to traverse!
BFS — Breadth-First Search — BFS explores a graph level by level, like ripples spreading in a pond. Start at a node, visit all its direct neighbors first, then their neighbors, and so on. BFS uses a queue (FIFO) to keep track of which node to visit next. BFS is perfect for finding the shortest path in an unweighted graph because it always explores closer nodes before farther ones!
DFS — Depth-First Search — DFS explores a graph by going as deep as possible before backtracking, like exploring a maze by always turning left until you hit a dead end. DFS uses a stack (or recursion, which uses the call stack). DFS is great for detecting cycles, finding connected components, and topological sorting. It is the explorer who goes all the way to the end of one path before trying another.
BFS vs DFS — When to Use Each — Use BFS when you need the shortest path in an unweighted graph, or when you need to explore nodes level by level. Use DFS when you need to explore all paths, detect cycles, or do topological sorting. In competitive programming, BFS is used for shortest paths and DFS is used for connectivity and cycle detection. Both visit every node and edge exactly once, running in O(V + E) time.
Connected Components — A connected component is a group of nodes where you can reach any node from any other node in the group. Think of it as an island — everyone on the island can reach each other, but cannot reach people on other islands. To find all connected components, run BFS or DFS from each unvisited node. Each run finds one complete component. The number of runs equals the number of components!
Graph Vocabulary You Need to Know — Degree: the number of edges connected to a node. Path: a sequence of edges connecting two nodes. Cycle: a path that starts and ends at the same node. Tree: a connected graph with no cycles (N nodes, N-1 edges). Weighted graph: edges have numbers (distances or costs). These terms appear in almost every graph problem, so memorize them!
Graphs in Competitive Programming — Graphs are HUGE in CP — about 30% of contest problems involve graphs! Common graph problems include: shortest path (BFS, Dijkstra), cycle detection (DFS), minimum spanning tree (Kruskal, Prim), topological sorting (DFS on directed acyclic graphs), and flood fill (BFS/DFS on grids). Start with BFS and DFS — they are the foundation for everything else!

Code example

#include <bits/stdc++.h>
using namespace std;

// BFS — visits nodes level by level (closest first)
void bfs(int start, vector<vector<int>>& adj, int n) {
    vector<int> dist(n + 1, -1);   // -1 means not yet visited
    queue<int> q;

    dist[start] = 0;               // Starting node is 0 steps away
    q.push(start);

    cout << "BFS order: ";
    while (!q.empty()) {
        int node = q.front();      // Take the next node in line
        q.pop();
        cout << node << " ";

        for (int nb : adj[node]) { // Check each neighbor
            if (dist[nb] == -1) {  // Only visit if not seen before
                dist[nb] = dist[node] + 1;  // One step farther
                q.push(nb);
            }
        }
    }
    cout << "\n";
}

// DFS — goes as deep as possible before backtracking
void dfs(int node, vector<vector<int>>& adj, vector<bool>& visited) {
    visited[node] = true;          // Mark this node as seen
    cout << node << " ";

    for (int nb : adj[node]) {     // Try each neighbor
        if (!visited[nb]) {        // If not yet visited
            dfs(nb, adj, visited); // Go deeper!
        }
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n = 5;                              // 5 nodes in our graph
    vector<vector<int>> adj(n + 1);         // Adjacency list (1-indexed)

    // Add undirected edges: 1-2, 1-3, 2-4, 3-4, 4-5
    vector<pair<int,int>> edges = {{1,2}, {1,3}, {2,4}, {3,4}, {4,5}};
    for (auto [u, v] : edges) {
        adj[u].push_back(v);
        adj[v].push_back(u);                // Both directions for undirected
    }

    bfs(1, adj, n);                         // BFS from node 1

    vector<bool> visited(n + 1, false);
    cout << "DFS order: ";
    dfs(1, adj, visited);                   // DFS from node 1
    cout << "\n";

    return 0;
}

Line-by-line walkthrough

1. We use vector> adj(n + 1) to store the graph as an adjacency list. Index 0 is unused because our nodes are numbered 1 to n. Each adj[i] is a list of neighbors of node i.
2. To add an undirected edge between u and v, we push v into adj[u] AND u into adj[v]. Both directions! For directed graphs, you would only add one direction.
3. In BFS, we start by marking the start node as visited, setting its distance to 0, and pushing it into the queue. The queue ensures we process nodes in order of their distance from the start.
4. The BFS while loop pops the front node from the queue, then loops through all its neighbors. If a neighbor has not been visited, we mark it, set its distance to (current distance + 1), and push it into the queue.
5. BFS naturally computes shortest distances! dist[neighbor] = dist[node] + 1 because BFS explores level by level. Every node is discovered at the earliest possible time.
6. In DFS, we use recursion. Visit a node, mark it, then for each unvisited neighbor, recursively DFS into it. This goes deep before going wide — like exploring one branch of a tree all the way to the leaf.
7. countComponents() loops through all nodes 1 to n. Each time it finds an unvisited node, it starts a new component and runs iterative DFS (using a stack) to mark all reachable nodes. The count of DFS runs equals the number of components.
8. The iterative DFS in countComponents uses a stack instead of recursion. Push the start node, then repeatedly pop a node, and push its unvisited neighbors. This avoids stack overflow for very large graphs.
9. In main(), we build a graph with 6 nodes and 5 edges forming two components: {1,2,3,4} and {5,6}. BFS from node 1 finds distances to nodes in the first component, and reports -1 for unreachable nodes 5 and 6.
10. The output shows BFS visiting nodes in order 1, 2, 3, 4 (level by level) and DFS visiting 1, 2, 4, 3 (going deep first). Both visit the same nodes, just in different orders!

Spot the bug

#include <bits/stdc++.h>
using namespace std;

void bfs(int start, vector<vector<int>>& adj, int n) {
    vector<int> dist(n + 1, -1);
    queue<int> q;
    
    dist[start] = 0;
    q.push(start);
    
    while (!q.empty()) {
        int node = q.front();
        q.pop();
        
        for (int neighbor : adj[node]) {
            dist[neighbor] = dist[node] + 1;
            q.push(neighbor);
        }
    }
    
    for (int i = 1; i <= n; i++) {
        cout << "Distance to " << i << ": " << dist[i] << "\n";
    }
}

int main() {
    int n = 4;
    vector<vector<int>> adj(n + 1);
    adj[1].push_back(2); adj[2].push_back(1);
    adj[2].push_back(3); adj[3].push_back(2);
    adj[3].push_back(1); adj[1].push_back(3);
    
    bfs(1, adj, n);
    return 0;
}

Need a hint?

Look at the BFS loop carefully. What happens when a neighbor has already been visited? Does the code check for that? What would happen in a graph with cycles?

Show answer

The bug is that BFS never checks if a neighbor was already visited! Without checking if dist[neighbor] == -1 (unvisited), the algorithm will visit the same node multiple times and run forever in a graph with cycles. The fix is to add 'if (dist[neighbor] == -1)' before updating the distance and pushing to the queue. This ensures each node is processed exactly once. Without this check, BFS becomes an infinite loop on any graph that has a cycle!

Explain like I'm 5

Imagine you have a bunch of toy houses on the floor, and you connect some of them with strings. That is a graph! The houses are nodes and the strings are edges. Now imagine you are an ant starting at one house. BFS is like the ant visiting ALL houses one step away first, then ALL houses two steps away, like circles getting bigger and bigger. DFS is like the ant going as far as it can along one string, then coming back and trying another string. Both ants eventually visit every house, just in a different order!

Fun fact

The famous 'Six Degrees of Separation' theory says that any two people on Earth are connected by at most 6 friendships. Facebook actually tested this on their entire network of over 2 billion users and found the average is just 3.57! This is a graph theory result — social networks form a 'small world' graph where even though there are billions of nodes, the shortest path between any two is surprisingly short. Graph theory literally explains how we are all connected!

Hands-on challenge

Build the sample graph from the code and trace through BFS and DFS by hand on paper first. Then implement it in C++. Next, solve this: given a grid of '.' (empty) and '#' (wall), count the number of connected empty regions using BFS or DFS. This is called 'flood fill' and it is one of the most common graph problems in contests! Test with a 5x5 grid of your own design. Finally, modify BFS to print the actual shortest path (not just the distance) from node 1 to every other node. Hint: keep a parent[] array!

More resources

Graph Algorithms for Beginners — William Fiset (YouTube — William Fiset)
Graph Traversal — USACO Guide (USACO Guide)
BFS/DFS Practice Problems — CSES (CSES)

Open interactive version (quiz + challenge) ← Back to course: CP Zero to Hero