Graphs

Nodes + edges. BFS for shortest path, DFS for connectivity. Topological sort for dependencies.

Graphs

A graph is a set of nodes (vertices) and edges. A tree is a special graph — connected and acyclic. Not all graphs are trees.

Graph Types

Directed: edges have direction (one-way streets)
Undirected: edges are bidirectional
Connected graph: path exists between every pair of vertices
Acyclic: no cycles (a DAG is a Directed Acyclic Graph)
Weighted: edges have numeric weights

Representations

Adjacency List — most common for interviews

graph = {
    'A': ['B', 'C'],
    'B': ['D'],
    'C': ['D', 'E'],
}

Space: O(V + E) — good for sparse graphs
In undirected graphs, store each edge twice: A→B and B→A
Iterating neighbours: O(degree of node)

Adjacency Matrix — NxN boolean (or weight) matrix

# matrix[i][j] = True means edge from i to j

Space: O(V²) — good for dense graphs
Fast edge existence check: O(1) via matrix[A][B]
Undirected graph matrix is symmetric
Downside: iterating neighbours requires scanning all N columns

| | Adjacency List | Adjacency Matrix | |--|--|--| | Space | O(V + E) | O(V²) | | Edge lookup | O(degree) | O(1) | | Best for | Sparse graphs | Dense graphs |

BFS — Shortest Path (unweighted)

from collections import deque

def bfs(graph, start, end):
    if start == end: return True
    visited = {start}
    q = deque([start])
    while q:
        node = q.popleft()
        if node == end: return True
        for neighbour in graph.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                q.append(neighbour)
    return False

BFS is preferred when you need shortest path (unweighted). Track parent pointers to reconstruct the path.

DFS — Connected Components

def dfs(graph, node, visited):
    visited.add(node)
    for neighbour in graph.get(node, []):
        if neighbour not in visited:
            dfs(graph, neighbour, visited)

def count_components(graph):
    visited = set()
    count = 0
    for node in graph:
        if node not in visited:
            dfs(graph, node, visited)
            count += 1
    return count

DFS is preferred when you want to visit every node or explore all paths.

Bidirectional BFS

Run simultaneous BFS from both source and target. Stop when the two frontiers collide.

Why it's faster: Instead of exploring to depth d (O(kᵈ) nodes), each side only goes to depth d/2 — total O(kᵈ/²) nodes. Huge win on large graphs.

Topological Sort (Kahn's Algorithm)

Only works on DAGs (directed acyclic graphs). Key insight: find the tasks with nothing blocking them, do those first.

from collections import deque

def topo_sort(nodes, dependencies):
    graph = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for before, after in dependencies:
        graph[before].append(after)
        in_degree[after] += 1

    # Start with all nodes that have no prerequisites
    q = deque([n for n in nodes if in_degree[n] == 0])
    order = []
    while q:
        node = q.popleft()
        order.append(node)
        for neighbour in graph[node]:
            in_degree[neighbour] -= 1
            if in_degree[neighbour] == 0:
                q.append(neighbour)

    return order if len(order) == len(nodes) else None  # None = cycle
# Time: O(V + E)

Use cases: build order, course prerequisites, task scheduling, CI/CD pipeline ordering.

Cycle Detection

Directed graph — DFS with 3 colors:

def has_cycle_directed(graph):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def dfs(node):
        color[node] = GRAY         # mark as in-progress
        for nb in graph.get(node, []):
            if color[nb] == GRAY: return True   # back edge = cycle
            if color[nb] == WHITE and dfs(nb): return True
        color[node] = BLACK        # fully processed
        return False

    return any(color[n] == WHITE and dfs(n) for n in graph)

Undirected graph: DFS but track the parent — don't flag the edge back to parent as a cycle.

Alternative: Topological sort — if result length < node count, there's a cycle.

Shortest Path Algorithms

| Algorithm | Handles | Time | Use when | |-----------|---------|------|----------| | BFS | Unweighted | O(V + E) | Equal-weight edges | | Dijkstra | Non-negative weights | O((V+E) log V) | Standard weighted graph | | Bellman-Ford | Negative weights | O(VE) | Negative edges, detect negative cycles | | Floyd-Warshall | All pairs | O(V³) | Small graphs, all-pairs needed |

import heapq

def dijkstra(graph, start):
    """graph[node] = [(neighbor, weight), ...]"""
    dist = {node: float('inf') for node in graph}
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]: continue
        for neighbour, weight in graph.get(node, []):
            new_dist = dist[node] + weight
            if new_dist < dist[neighbour]:
                dist[neighbour] = new_dist
                heapq.heappush(heap, (new_dist, neighbour))
    return dist

Union-Find (Disjoint Set Union)

Tracks connected components efficiently. Operations: Find (with path compression) + Union (by rank).

class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry: return False   # already same component
        if self.rank[rx] < self.rank[ry]: rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]: self.rank[rx] += 1
        return True

Amortized O(α(n)) per operation — effectively constant. Used in: Kruskal's MST, cycle detection, connected components.

Minimum Spanning Tree (MST)

Kruskal's: Sort edges by weight, add edge if it doesn't create a cycle (use Union-Find). Time: O(E log E).

Prim's: Start from any node, greedily add the minimum-weight edge connecting a new node. Time: O(E log V) with a heap.

Key Interview Tips

Always track visited to avoid infinite loops
BFS → shortest path (unweighted); Dijkstra → weighted non-negative
Topological sort only works on DAGs — check for cycles
"Islands" grid problems: treat the 2D grid as an implicit graph, BFS/DFS from unvisited cells
For cycle detection in undirected graphs, track parent to avoid false positives