Lesson 43 of 50 advanced

Graphs — Connecting the Dots

Explore the data structure that models friendships, maps, networks, and so much more!

Open interactive version (quiz + challenge)

Real-world analogy

Imagine a map of cities connected by roads. Each city is a dot (we call it a node or vertex), and each road is a line connecting two dots (we call it an edge). Some roads are one-way streets (directed edges) and some go both ways (undirected edges). A graph is just this: dots and lines! Your social media friends list is a graph — you are a dot, your friends are dots, and friendships are the lines connecting you. Once you see the world as a graph, you see graphs literally everywhere!

What is it?

A graph is a data structure consisting of nodes (vertices) connected by edges. It is used to model relationships and connections between objects. Graphs can be directed or undirected, weighted or unweighted. BFS and DFS are the two fundamental algorithms for traversing graphs, and they form the basis of almost all graph algorithms in competitive programming.

Real-world relevance

Graphs are everywhere in the real world! Google Maps uses graphs where cities are nodes and roads are edges to find the shortest route. Social networks like Facebook and Instagram are giant graphs where users are nodes and connections are edges. The internet itself is a graph of web pages connected by hyperlinks. Even your brain is a graph — neurons connected by synapses!

Key points

Code example

#include <bits/stdc++.h>
using namespace std;

// BFS — visits nodes level by level (closest first)
void bfs(int start, vector<vector<int>>& adj, int n) {
    vector<int> dist(n + 1, -1);   // -1 means not yet visited
    queue<int> q;

    dist[start] = 0;               // Starting node is 0 steps away
    q.push(start);

    cout << "BFS order: ";
    while (!q.empty()) {
        int node = q.front();      // Take the next node in line
        q.pop();
        cout << node << " ";

        for (int nb : adj[node]) { // Check each neighbor
            if (dist[nb] == -1) {  // Only visit if not seen before
                dist[nb] = dist[node] + 1;  // One step farther
                q.push(nb);
            }
        }
    }
    cout << "\n";
}

// DFS — goes as deep as possible before backtracking
void dfs(int node, vector<vector<int>>& adj, vector<bool>& visited) {
    visited[node] = true;          // Mark this node as seen
    cout << node << " ";

    for (int nb : adj[node]) {     // Try each neighbor
        if (!visited[nb]) {        // If not yet visited
            dfs(nb, adj, visited); // Go deeper!
        }
    }
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    int n = 5;                              // 5 nodes in our graph
    vector<vector<int>> adj(n + 1);         // Adjacency list (1-indexed)

    // Add undirected edges: 1-2, 1-3, 2-4, 3-4, 4-5
    vector<pair<int,int>> edges = {{1,2}, {1,3}, {2,4}, {3,4}, {4,5}};
    for (auto [u, v] : edges) {
        adj[u].push_back(v);
        adj[v].push_back(u);                // Both directions for undirected
    }

    bfs(1, adj, n);                         // BFS from node 1

    vector<bool> visited(n + 1, false);
    cout << "DFS order: ";
    dfs(1, adj, visited);                   // DFS from node 1
    cout << "\n";

    return 0;
}

Line-by-line walkthrough

  1. 1. We use vector> adj(n + 1) to store the graph as an adjacency list. Index 0 is unused because our nodes are numbered 1 to n. Each adj[i] is a list of neighbors of node i.
  2. 2. To add an undirected edge between u and v, we push v into adj[u] AND u into adj[v]. Both directions! For directed graphs, you would only add one direction.
  3. 3. In BFS, we start by marking the start node as visited, setting its distance to 0, and pushing it into the queue. The queue ensures we process nodes in order of their distance from the start.
  4. 4. The BFS while loop pops the front node from the queue, then loops through all its neighbors. If a neighbor has not been visited, we mark it, set its distance to (current distance + 1), and push it into the queue.
  5. 5. BFS naturally computes shortest distances! dist[neighbor] = dist[node] + 1 because BFS explores level by level. Every node is discovered at the earliest possible time.
  6. 6. In DFS, we use recursion. Visit a node, mark it, then for each unvisited neighbor, recursively DFS into it. This goes deep before going wide — like exploring one branch of a tree all the way to the leaf.
  7. 7. countComponents() loops through all nodes 1 to n. Each time it finds an unvisited node, it starts a new component and runs iterative DFS (using a stack) to mark all reachable nodes. The count of DFS runs equals the number of components.
  8. 8. The iterative DFS in countComponents uses a stack instead of recursion. Push the start node, then repeatedly pop a node, and push its unvisited neighbors. This avoids stack overflow for very large graphs.
  9. 9. In main(), we build a graph with 6 nodes and 5 edges forming two components: {1,2,3,4} and {5,6}. BFS from node 1 finds distances to nodes in the first component, and reports -1 for unreachable nodes 5 and 6.
  10. 10. The output shows BFS visiting nodes in order 1, 2, 3, 4 (level by level) and DFS visiting 1, 2, 4, 3 (going deep first). Both visit the same nodes, just in different orders!

Spot the bug

#include <bits/stdc++.h>
using namespace std;

void bfs(int start, vector<vector<int>>& adj, int n) {
    vector<int> dist(n + 1, -1);
    queue<int> q;
    
    dist[start] = 0;
    q.push(start);
    
    while (!q.empty()) {
        int node = q.front();
        q.pop();
        
        for (int neighbor : adj[node]) {
            dist[neighbor] = dist[node] + 1;
            q.push(neighbor);
        }
    }
    
    for (int i = 1; i <= n; i++) {
        cout << "Distance to " << i << ": " << dist[i] << "\n";
    }
}

int main() {
    int n = 4;
    vector<vector<int>> adj(n + 1);
    adj[1].push_back(2); adj[2].push_back(1);
    adj[2].push_back(3); adj[3].push_back(2);
    adj[3].push_back(1); adj[1].push_back(3);
    
    bfs(1, adj, n);
    return 0;
}
Need a hint?
Look at the BFS loop carefully. What happens when a neighbor has already been visited? Does the code check for that? What would happen in a graph with cycles?
Show answer
The bug is that BFS never checks if a neighbor was already visited! Without checking if dist[neighbor] == -1 (unvisited), the algorithm will visit the same node multiple times and run forever in a graph with cycles. The fix is to add 'if (dist[neighbor] == -1)' before updating the distance and pushing to the queue. This ensures each node is processed exactly once. Without this check, BFS becomes an infinite loop on any graph that has a cycle!

Explain like I'm 5

Imagine you have a bunch of toy houses on the floor, and you connect some of them with strings. That is a graph! The houses are nodes and the strings are edges. Now imagine you are an ant starting at one house. BFS is like the ant visiting ALL houses one step away first, then ALL houses two steps away, like circles getting bigger and bigger. DFS is like the ant going as far as it can along one string, then coming back and trying another string. Both ants eventually visit every house, just in a different order!

Fun fact

The famous 'Six Degrees of Separation' theory says that any two people on Earth are connected by at most 6 friendships. Facebook actually tested this on their entire network of over 2 billion users and found the average is just 3.57! This is a graph theory result — social networks form a 'small world' graph where even though there are billions of nodes, the shortest path between any two is surprisingly short. Graph theory literally explains how we are all connected!

Hands-on challenge

Build the sample graph from the code and trace through BFS and DFS by hand on paper first. Then implement it in C++. Next, solve this: given a grid of '.' (empty) and '#' (wall), count the number of connected empty regions using BFS or DFS. This is called 'flood fill' and it is one of the most common graph problems in contests! Test with a 5x5 grid of your own design. Finally, modify BFS to print the actual shortest path (not just the distance) from node 1 to every other node. Hint: keep a parent[] array!

More resources

Open interactive version (quiz + challenge) ← Back to course: CP Zero to Hero