Graphs
Nodes + edges. BFS for shortest path, DFS for connectivity. Topological sort for dependencies.
Graphs
A graph is a set of nodes (vertices) and edges. A tree is a special graph — connected and acyclic. Not all graphs are trees.
Graph Types
- Directed: edges have direction (one-way streets)
- Undirected: edges are bidirectional
- Connected graph: path exists between every pair of vertices
- Acyclic: no cycles (a DAG is a Directed Acyclic Graph)
- Weighted: edges have numeric weights
Representations
Adjacency List — most common for interviews
graph = {
'A': ['B', 'C'],
'B': ['D'],
'C': ['D', 'E'],
}
- Space: O(V + E) — good for sparse graphs
- In undirected graphs, store each edge twice: A→B and B→A
- Iterating neighbours: O(degree of node)
Adjacency Matrix — NxN boolean (or weight) matrix
# matrix[i][j] = True means edge from i to j
- Space: O(V²) — good for dense graphs
- Fast edge existence check: O(1) via
matrix[A][B] - Undirected graph matrix is symmetric
- Downside: iterating neighbours requires scanning all N columns
| | Adjacency List | Adjacency Matrix | |--|--|--| | Space | O(V + E) | O(V²) | | Edge lookup | O(degree) | O(1) | | Best for | Sparse graphs | Dense graphs |
BFS — Shortest Path (unweighted)
from collections import deque
def bfs(graph, start, end):
if start == end: return True
visited = {start}
q = deque([start])
while q:
node = q.popleft()
if node == end: return True
for neighbour in graph.get(node, []):
if neighbour not in visited:
visited.add(neighbour)
q.append(neighbour)
return False
BFS is preferred when you need shortest path (unweighted). Track parent pointers to reconstruct the path.
DFS — Connected Components
def dfs(graph, node, visited):
visited.add(node)
for neighbour in graph.get(node, []):
if neighbour not in visited:
dfs(graph, neighbour, visited)
def count_components(graph):
visited = set()
count = 0
for node in graph:
if node not in visited:
dfs(graph, node, visited)
count += 1
return count
DFS is preferred when you want to visit every node or explore all paths.
Bidirectional BFS
Run simultaneous BFS from both source and target. Stop when the two frontiers collide.
Why it's faster: Instead of exploring to depth d (O(kᵈ) nodes), each side only goes to depth d/2 — total O(kᵈ/²) nodes. Huge win on large graphs.
Topological Sort (Kahn's Algorithm)
Only works on DAGs (directed acyclic graphs). Key insight: find the tasks with nothing blocking them, do those first.
from collections import deque
def topo_sort(nodes, dependencies):
graph = {n: [] for n in nodes}
in_degree = {n: 0 for n in nodes}
for before, after in dependencies:
graph[before].append(after)
in_degree[after] += 1
# Start with all nodes that have no prerequisites
q = deque([n for n in nodes if in_degree[n] == 0])
order = []
while q:
node = q.popleft()
order.append(node)
for neighbour in graph[node]:
in_degree[neighbour] -= 1
if in_degree[neighbour] == 0:
q.append(neighbour)
return order if len(order) == len(nodes) else None # None = cycle
# Time: O(V + E)
Use cases: build order, course prerequisites, task scheduling, CI/CD pipeline ordering.
Cycle Detection
Directed graph — DFS with 3 colors:
def has_cycle_directed(graph):
WHITE, GRAY, BLACK = 0, 1, 2
color = {n: WHITE for n in graph}
def dfs(node):
color[node] = GRAY # mark as in-progress
for nb in graph.get(node, []):
if color[nb] == GRAY: return True # back edge = cycle
if color[nb] == WHITE and dfs(nb): return True
color[node] = BLACK # fully processed
return False
return any(color[n] == WHITE and dfs(n) for n in graph)
Undirected graph: DFS but track the parent — don't flag the edge back to parent as a cycle.
Alternative: Topological sort — if result length < node count, there's a cycle.
Shortest Path Algorithms
| Algorithm | Handles | Time | Use when | |-----------|---------|------|----------| | BFS | Unweighted | O(V + E) | Equal-weight edges | | Dijkstra | Non-negative weights | O((V+E) log V) | Standard weighted graph | | Bellman-Ford | Negative weights | O(VE) | Negative edges, detect negative cycles | | Floyd-Warshall | All pairs | O(V³) | Small graphs, all-pairs needed |
import heapq
def dijkstra(graph, start):
"""graph[node] = [(neighbor, weight), ...]"""
dist = {node: float('inf') for node in graph}
dist[start] = 0
heap = [(0, start)]
while heap:
d, node = heapq.heappop(heap)
if d > dist[node]: continue
for neighbour, weight in graph.get(node, []):
new_dist = dist[node] + weight
if new_dist < dist[neighbour]:
dist[neighbour] = new_dist
heapq.heappush(heap, (new_dist, neighbour))
return dist
Union-Find (Disjoint Set Union)
Tracks connected components efficiently. Operations: Find (with path compression) + Union (by rank).
class UnionFind:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
rx, ry = self.find(x), self.find(y)
if rx == ry: return False # already same component
if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx
self.parent[ry] = rx
if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1
return True
Amortized O(α(n)) per operation — effectively constant. Used in: Kruskal's MST, cycle detection, connected components.
Minimum Spanning Tree (MST)
Kruskal's: Sort edges by weight, add edge if it doesn't create a cycle (use Union-Find). Time: O(E log E).
Prim's: Start from any node, greedily add the minimum-weight edge connecting a new node. Time: O(E log V) with a heap.
Key Interview Tips
- Always track visited to avoid infinite loops
- BFS → shortest path (unweighted); Dijkstra → weighted non-negative
- Topological sort only works on DAGs — check for cycles
- "Islands" grid problems: treat the 2D grid as an implicit graph, BFS/DFS from unvisited cells
- For cycle detection in undirected graphs, track parent to avoid false positives