Graphs — Connecting the Dots
Explore the data structure that models friendships, maps, networks, and so much more!
Open interactive version (quiz + challenge)Real-world analogy
What is it?
A graph is a data structure consisting of nodes (vertices) connected by edges. It is used to model relationships and connections between objects. Graphs can be directed or undirected, weighted or unweighted. BFS and DFS are the two fundamental algorithms for traversing graphs, and they form the basis of almost all graph algorithms in competitive programming.
Real-world relevance
Graphs are everywhere in the real world! Google Maps uses graphs where cities are nodes and roads are edges to find the shortest route. Social networks like Facebook and Instagram are giant graphs where users are nodes and connections are edges. The internet itself is a graph of web pages connected by hyperlinks. Even your brain is a graph — neurons connected by synapses!
Key points
- What is a Graph? — A graph is a collection of nodes (also called vertices) connected by edges. Think of nodes as cities and edges as roads between them. Graphs can represent anything with connections: friendships on social media, web pages linked together, cities connected by flights, or even moves in a chess game. They are one of the most important data structures in all of computer science!
- Directed vs Undirected Graphs — In an undirected graph, edges go both ways — like a two-way road. If city A connects to city B, then B also connects to A. In a directed graph, edges have a direction — like a one-way street. A might connect to B, but B might not connect back to A. Twitter follows are directed (you can follow someone who does not follow you back). Facebook friendships are undirected (if you are friends, it goes both ways).
- How to Store a Graph — Adjacency List — The most common way to store a graph in competitive programming is an adjacency list. For each node, you keep a list of its neighbors. In C++, we use vector> adj(n). If there is an edge from node 1 to node 3, we add 3 to adj[1]. For undirected graphs, we add both directions: adj[1].push_back(3) AND adj[3].push_back(1). This is memory-efficient and fast to traverse!
- BFS — Breadth-First Search — BFS explores a graph level by level, like ripples spreading in a pond. Start at a node, visit all its direct neighbors first, then their neighbors, and so on. BFS uses a queue (FIFO) to keep track of which node to visit next. BFS is perfect for finding the shortest path in an unweighted graph because it always explores closer nodes before farther ones!
- DFS — Depth-First Search — DFS explores a graph by going as deep as possible before backtracking, like exploring a maze by always turning left until you hit a dead end. DFS uses a stack (or recursion, which uses the call stack). DFS is great for detecting cycles, finding connected components, and topological sorting. It is the explorer who goes all the way to the end of one path before trying another.
- BFS vs DFS — When to Use Each — Use BFS when you need the shortest path in an unweighted graph, or when you need to explore nodes level by level. Use DFS when you need to explore all paths, detect cycles, or do topological sorting. In competitive programming, BFS is used for shortest paths and DFS is used for connectivity and cycle detection. Both visit every node and edge exactly once, running in O(V + E) time.
- Connected Components — A connected component is a group of nodes where you can reach any node from any other node in the group. Think of it as an island — everyone on the island can reach each other, but cannot reach people on other islands. To find all connected components, run BFS or DFS from each unvisited node. Each run finds one complete component. The number of runs equals the number of components!
- Graph Vocabulary You Need to Know — Degree: the number of edges connected to a node. Path: a sequence of edges connecting two nodes. Cycle: a path that starts and ends at the same node. Tree: a connected graph with no cycles (N nodes, N-1 edges). Weighted graph: edges have numbers (distances or costs). These terms appear in almost every graph problem, so memorize them!
- Graphs in Competitive Programming — Graphs are HUGE in CP — about 30% of contest problems involve graphs! Common graph problems include: shortest path (BFS, Dijkstra), cycle detection (DFS), minimum spanning tree (Kruskal, Prim), topological sorting (DFS on directed acyclic graphs), and flood fill (BFS/DFS on grids). Start with BFS and DFS — they are the foundation for everything else!
Code example
#include <bits/stdc++.h>
using namespace std;
// BFS — visits nodes level by level (closest first)
void bfs(int start, vector<vector<int>>& adj, int n) {
vector<int> dist(n + 1, -1); // -1 means not yet visited
queue<int> q;
dist[start] = 0; // Starting node is 0 steps away
q.push(start);
cout << "BFS order: ";
while (!q.empty()) {
int node = q.front(); // Take the next node in line
q.pop();
cout << node << " ";
for (int nb : adj[node]) { // Check each neighbor
if (dist[nb] == -1) { // Only visit if not seen before
dist[nb] = dist[node] + 1; // One step farther
q.push(nb);
}
}
}
cout << "\n";
}
// DFS — goes as deep as possible before backtracking
void dfs(int node, vector<vector<int>>& adj, vector<bool>& visited) {
visited[node] = true; // Mark this node as seen
cout << node << " ";
for (int nb : adj[node]) { // Try each neighbor
if (!visited[nb]) { // If not yet visited
dfs(nb, adj, visited); // Go deeper!
}
}
}
int main() {
ios_base::sync_with_stdio(false);
cin.tie(NULL);
int n = 5; // 5 nodes in our graph
vector<vector<int>> adj(n + 1); // Adjacency list (1-indexed)
// Add undirected edges: 1-2, 1-3, 2-4, 3-4, 4-5
vector<pair<int,int>> edges = {{1,2}, {1,3}, {2,4}, {3,4}, {4,5}};
for (auto [u, v] : edges) {
adj[u].push_back(v);
adj[v].push_back(u); // Both directions for undirected
}
bfs(1, adj, n); // BFS from node 1
vector<bool> visited(n + 1, false);
cout << "DFS order: ";
dfs(1, adj, visited); // DFS from node 1
cout << "\n";
return 0;
}Line-by-line walkthrough
- 1. We use vector> adj(n + 1) to store the graph as an adjacency list. Index 0 is unused because our nodes are numbered 1 to n. Each adj[i] is a list of neighbors of node i.
- 2. To add an undirected edge between u and v, we push v into adj[u] AND u into adj[v]. Both directions! For directed graphs, you would only add one direction.
- 3. In BFS, we start by marking the start node as visited, setting its distance to 0, and pushing it into the queue. The queue ensures we process nodes in order of their distance from the start.
- 4. The BFS while loop pops the front node from the queue, then loops through all its neighbors. If a neighbor has not been visited, we mark it, set its distance to (current distance + 1), and push it into the queue.
- 5. BFS naturally computes shortest distances! dist[neighbor] = dist[node] + 1 because BFS explores level by level. Every node is discovered at the earliest possible time.
- 6. In DFS, we use recursion. Visit a node, mark it, then for each unvisited neighbor, recursively DFS into it. This goes deep before going wide — like exploring one branch of a tree all the way to the leaf.
- 7. countComponents() loops through all nodes 1 to n. Each time it finds an unvisited node, it starts a new component and runs iterative DFS (using a stack) to mark all reachable nodes. The count of DFS runs equals the number of components.
- 8. The iterative DFS in countComponents uses a stack instead of recursion. Push the start node, then repeatedly pop a node, and push its unvisited neighbors. This avoids stack overflow for very large graphs.
- 9. In main(), we build a graph with 6 nodes and 5 edges forming two components: {1,2,3,4} and {5,6}. BFS from node 1 finds distances to nodes in the first component, and reports -1 for unreachable nodes 5 and 6.
- 10. The output shows BFS visiting nodes in order 1, 2, 3, 4 (level by level) and DFS visiting 1, 2, 4, 3 (going deep first). Both visit the same nodes, just in different orders!
Spot the bug
#include <bits/stdc++.h>
using namespace std;
void bfs(int start, vector<vector<int>>& adj, int n) {
vector<int> dist(n + 1, -1);
queue<int> q;
dist[start] = 0;
q.push(start);
while (!q.empty()) {
int node = q.front();
q.pop();
for (int neighbor : adj[node]) {
dist[neighbor] = dist[node] + 1;
q.push(neighbor);
}
}
for (int i = 1; i <= n; i++) {
cout << "Distance to " << i << ": " << dist[i] << "\n";
}
}
int main() {
int n = 4;
vector<vector<int>> adj(n + 1);
adj[1].push_back(2); adj[2].push_back(1);
adj[2].push_back(3); adj[3].push_back(2);
adj[3].push_back(1); adj[1].push_back(3);
bfs(1, adj, n);
return 0;
}Need a hint?
Show answer
Explain like I'm 5
Fun fact
Hands-on challenge
More resources
- Graph Algorithms for Beginners — William Fiset (YouTube — William Fiset)
- Graph Traversal — USACO Guide (USACO Guide)
- BFS/DFS Practice Problems — CSES (CSES)