Compute single-source shortest paths in a graph with non-negative edge weights by maintaining a priority queue of (tentative distance, vertex) pairs, repeatedly extracting the minimum, marking it final, and relaxing its outgoing edges.
1 worked example · 7 practice problems · 2 check problems
Worked example: Dijkstra on a 5-vertex graph
Problem. Run Dijkstra from source A on the undirected weighted graph:
A-B: 4 A-C: 2 B-C: 1 B-D: 5C-D: 8 C-E: 10 D-E: 2
Compute the shortest distance from A to every other vertex. Trace each priority-queue extraction by hand.
Diagnosis. Dijkstra greedily settles vertices in order of their shortest-distance from the source. The invariant: when we extract a vertex u with distance d from the priority queue, d is guaranteed to be the true shortest-path distance to u. We then "relax" each outgoing edge — checking whether routing through u gives a shorter path to its neighbors.
Predict before reading on: predict which vertex gets settled second (after A). Look at the edges from A: there are two, with weights 2 and 4. The minimum is 2.
Iteration trace. Maintain dist (current best estimate for each vertex), pq (priority queue of (dist, vertex) pairs), visited (settled set).
step
extracted
relaxations
dist after
0 (init)
—
—
A=0, others=∞
1
(0, A)
A→B: dist[B] = 4; A→C: dist[C] = 2
A=0, B=4, C=2
2
(2, C)
C→B: 2+1=3 < 4, dist[B]=3; C→D=10; C→E=12
A=0, B=3, C=2, D=10, E=12
3
(3, B)
B→D: 3+5=8 < 10, dist[D]=8
A=0, B=3, C=2, D=8, E=12
4
(4, B) stale
B already visited, skip
(unchanged)
5
(8, D)
D→E: 8+2=10 < 12, dist[E]=10
A=0, B=3, C=2, D=8, E=10
6
(10, E)
no improvements
(final)
Final distances.A=0,B=3,C=2,D=8,E=10. Note that the shortest path to B is A→C→B (length 3), not the direct edge A→B (length 4). Dijkstra discovers this by relaxing the edge C→B when C is settled.
Predict before reading on: at step 4 we extract (4, B), but B was already settled with distance 3 at step 3. Why is the stale entry still in the queue, and why is it safe to ignore?
Stale entries. A standard Python implementation doesn't update entries in heapq when distances improve — instead it pushes the better entry on top and ignores the worse one when popped. The "if already visited, skip" check is the standard way to handle this. Decrease-key data structures (Fibonacci heaps) avoid stale entries at the cost of much more complex implementation; for most practical purposes, push-and-skip on a binary heap is competitive.
Verification.
import heapqdef dijkstra(graph, src): dist = {v: float('inf') for v in graph} dist[src] = 0 pq = [(0, src)] visited = set() while pq: d, u = heapq.heappop(pq) if u in visited: continue visited.add(u) for v, w in graph[u]: if d + w < dist[v]: dist[v] = d + w heapq.heappush(pq, (d + w, v)) return distgraph = { 'A': [('B', 4), ('C', 2)], 'B': [('A', 4), ('C', 1), ('D', 5)], 'C': [('A', 2), ('B', 1), ('D', 8), ('E', 10)], 'D': [('B', 5), ('C', 8), ('E', 2)], 'E': [('C', 10), ('D', 2)],}print(dijkstra(graph, 'A')) # {'A': 0, 'B': 3, 'C': 2, 'D': 8, 'E': 10}
Articulate: state in one sentence the "greedy" invariant that makes Dijkstra work, and the property of edge weights that the invariant requires.
Practice problems
Seven problems across seven different settings. The algorithm doesn't change; the graph representation, the edge-weight transformation, and what "shortest path" means do.
P.1explicit weighted graph
Find shortest distances from vertex 1 to all others in this graph (undirected, weighted):
Find the analogue:
same algorithm as the worked example, larger graph. Identify the priority queue, the visited set, the relaxation step.
show answer
Distances:1→1=0,1→2=7,1→3=9,1→4=20,1→5=20,1→6=11.
Key trace points: vertex 6 is reached via 1→3→6 (distance 11), not the direct edge of weight 14. Vertex 5 is reached via 1→3→6→5 (distance 20) or via 1→3→4→5 (also 20) — a tie.
This is the classic Wikipedia "Dijkstra's algorithm" example graph. Run the same code as the worked example with this graph as input and confirm.
P.2grid shortest-path (implicit graph)
A 3×3 grid has movement cost equal to the value at each cell entered. You start at the top-left and may move to any 4-adjacent neighbor (up/down/left/right):
grid = [[1, 3, 1], [1, 5, 1], [4, 2, 1]]
What's the minimum total cost to reach the bottom-right corner? (Include both endpoints.)
Find the analogue:
no explicit graph — each cell is a vertex and each adjacent pair is an edge. The edge weight is the value of the cell being entered. Run Dijkstra with this implicit graph.
show answer
Answer: 7. Best path: (0,0)→(0,1)→(0,2)→(1,2)→(2,2) with values 1+3+1+1+1=7.
import heapqdef grid_shortest(grid): m, n = len(grid), len(grid[0]) dist = [[float('inf')]*n for _ in range(m)] dist[0][0] = grid[0][0] pq = [(grid[0][0], 0, 0)] while pq: d, i, j = heapq.heappop(pq) if d > dist[i][j]: continue for di, dj in [(1,0),(-1,0),(0,1),(0,-1)]: ni, nj = i+di, j+dj if 0 <= ni < m and 0 <= nj < n: nd = d + grid[ni][nj] if nd < dist[ni][nj]: dist[ni][nj] = nd heapq.heappush(pq, (nd, ni, nj)) return dist[m-1][n-1]print(grid_shortest([[1,3,1],[1,5,1],[4,2,1]])) # 7
Note: if movement were restricted to right and down only, this would be a 1D DP problem (P.2 of the coin-change exercises tackled a related variant). With 4-directional movement, the graph has cycles and Dijkstra's frontier-based shortest-path is the natural approach. The implicit-graph trick — letting cells be vertices without ever materializing the edge list — is what makes grid pathfinding fast.
P.3currency exchange via log-transform
Five currencies trade with these exchange rates (multiplicative — 1 unit of source × rate = units of target):
Find the route from USD to JPY that maximizes the final JPY amount. Hint: Dijkstra naturally minimizes a sum, not maximizes a product. Apply a transformation to fit.
Find the analogue:
shortest-path on multiplicative weights becomes shortest-path on additive weights via logarithms: maximizing ∏ri is the same as maximizing ∑logri is the same as minimizing ∑(−logri). As long as −logri≥0 (i.e., ri≤1), Dijkstra applies.
show answer
Transform: edge weight w(u,v)=−logr(u,v). Then minimizing ∑w maximizes ∏r.
For rates above 1 (like 145 for EUR→JPY), −log145<0 — Dijkstra requires non-negative weights. Two fixes: (i) shift all log-rates by adding a constant large enough to make them non-negative (note this changes which path is shortest only if all paths have the same length, which they don't); (ii) normalize all rates to relative ratios bounded by 1 first.
For this problem, a cleaner approach: work directly with the log-rates. Compute optimum routes for each currency pair via Bellman-Ford (which handles negative weights — no Dijkstra needed). On a graph this small, brute-force enumeration of the 3-edge paths works too: USD→EUR→JPY = 0.9 × 145 = 130.5; USD→GBP→JPY = 0.8 × 165 = 132; USD→EUR→CHF→JPY = 0.9 × 0.95 × 155 = 132.5; USD→GBP→CHF→JPY = 0.8 × 1.1 × 155 = 136.4 (best).
The lesson: Dijkstra works for any cost function that's monotone in path length and bounded below by zero. When negative edges appear (as they do after the log transformation), you need Bellman-Ford. The shape of "shortest path" generalizes; the algorithm sometimes doesn't.
P.4negative-edge counterexample
Construct a small graph (4 vertices is enough) where Dijkstra from the source produces an incorrect shortest-path estimate to some vertex when a negative-weight edge is present. Trace through the algorithm to show what goes wrong.
Find the analogue:
Dijkstra's correctness invariant ("a popped vertex's distance is final") requires that no later relaxation can improve it. Negative edges break this — a future edge could decrease the distance to an already-settled vertex.
show answer
Graph:
S → A: 1S → B: 4A → B: -10(directed)
True shortest distances from S: S=0,A=1,B=min(4,1+(−10))=−9.
Dijkstra's trace:
Extract S, settle with dist 0. Relax: A=1,B=4.
Extract A (smaller of A=1,B=4), settle with dist 1. Relax A→B: 1+(−10)=−9<4, so update B=−9.
Extract B with dist −9... but wait, the priority queue has B with old key 4. Even if we push the new (−9,B), the algorithm continues and would correctly find B=−9 on this small example.
So Dijkstra happens to work here. The real counterexample requires a vertex to be settled (popped from the queue) before a negative-weight back-edge can correct its distance. Try:
S → A: 5S → B: 1B → A: -3A → C: 1
True distances: S=0,B=1,A=1+(−3)=−2,C=A+1=−1.
Dijkstra: settle S (dist 0). Push A=5,B=1. Settle B (dist 1). Relax B→A: 1+(−3)=−2<5, update A=−2, push. Now the queue has (5,A) and (−2,A). Settle A from (−2,A): dist −2. Relax A→C: −2+1=−1, push. Settle C at −1.
Still correct! Push-and-pop handled it. The genuine failure happens with negative cycles: if you have A→B→A totaling negative, Dijkstra terminates with a finite (wrong) answer, while Bellman-Ford detects the cycle. Even without cycles, the visited-set variant of Dijkstra (which doesn't reprocess settled vertices) fails on this graph: if you mark A as visited at dist 5 in step 1, you'll never re-settle it at dist −2.
Bottom line: Dijkstra's correctness requires non-negative edge weights AND a no-revisit policy. Bellman-Ford handles arbitrary weights (in O(VE) instead of O(ElogV)) and detects negative cycles.
P.5single-pair early termination
You only need the shortest distance from A to one specific vertexT, not to every vertex. How can you modify Dijkstra to terminate as soon as T is settled? Show this is correct — explain why no further iterations can reduce dist[T].
Find the analogue:
Dijkstra's invariant says a popped vertex's distance is final. Once T is popped, we know its shortest-path distance — we can stop.
show answer
Modification: in the main loop, check if u == target: return d immediately after popping. No other change needed.
def dijkstra_pair(graph, src, target): dist = {v: float('inf') for v in graph} dist[src] = 0 pq = [(0, src)] while pq: d, u = heapq.heappop(pq) if u == target: return d # ← early exit if d > dist[u]: continue for v, w in graph[u]: if d + w < dist[v]: dist[v] = d + w heapq.heappush(pq, (d + w, v)) return float('inf')
Correctness: when T is popped with distance d, by Dijkstra's invariant d=dist∗(T). Any future relaxation through some not-yet-settled vertex u would route through u with dist∗(u)≥d (because u is unsettled and pops in increasing-distance order), so dist∗(u)+w(u,T)≥d when weights are non-negative. Nothing can improve dist[T].
Speedup is significant in practice: on a 1 million-vertex road graph with target a few miles away, early termination cuts the work by orders of magnitude. Map-routing systems exploit this aggressively, often combined with bidirectional search (running Dijkstra from both source and target simultaneously and meeting in the middle).
P.6BFS as unit-weight Dijkstra
Show that breadth-first search (BFS) is mathematically equivalent to Dijkstra on a graph with all edge weights equal to 1. Specifically:
(a) Why does the priority queue degenerate to a FIFO queue when all weights are 1?
(b) What's the running time of "Dijkstra with unit weights" if you replace the heap with a queue?
Find the analogue:
Dijkstra's priority queue orders by distance. When all edges have weight 1, distance equals BFS-depth, and the priority queue's "extract min" reduces to a FIFO's "extract front."
show answer
(a) With unit weights, every newly-relaxed neighbor has distance equal to the current popped vertex's distance + 1. So at any point, the priority queue contains at most two distinct distance values: d (current frontier) and d+1 (neighbors of current frontier). A FIFO that processes the d-frontier first naturally maintains this ordering. The priority queue degenerates to a FIFO queue.
(b) Replacing the heap (O(logV) per operation) with a queue (O(1) per operation) drops Dijkstra's O((V+E)logV) to O(V+E) — exactly BFS's complexity. This is why BFS is preferred for unweighted shortest-path: same algorithm, no log factor.
Generalization: a "0-1 BFS" uses a double-ended queue and gives O(V+E) on graphs with weights in {0,1} only. Pushes go to the front (for weight 0) or back (for weight 1). This is the natural data structure for shortest-path on a maze with both walkable cells and door cells.
P.7counting shortest paths
Modify Dijkstra to also count the number of distinct shortest paths from the source to each vertex. Apply to:
A—B: 1 A—C: 1B—D: 1 C—D: 1
How many shortest paths from A to D?
Find the analogue:
keep a parallel array count[v]. When a relaxation strictly improves dist[v], the path count resets to count[u] (we found a strictly better way, so old paths are abandoned). When relaxation matches dist[v], the count adds: there's another path of the same length.
show answer
def dijkstra_count(graph, src): dist = {v: float('inf') for v in graph} count = {v: 0 for v in graph} dist[src], count[src] = 0, 1 pq = [(0, src)] while pq: d, u = heapq.heappop(pq) if d > dist[u]: continue for v, w in graph[u]: nd = d + w if nd < dist[v]: dist[v] = nd count[v] = count[u] heapq.heappush(pq, (nd, v)) elif nd == dist[v]: count[v] += count[u] return dist, count
On the example graph: dist={A:0,B:1,C:1,D:2}, count={A:1,B:1,C:1,D:2}. Two shortest paths to D: A→B→D and A→C→D.
This generalizes: any function that's additive over paths and where "shortest" is well-defined (number of paths, sum of path-lengths, edge-disjoint paths, etc.) can ride on top of Dijkstra. The same machinery underlies betweenness centrality computations on networks (Brandes' algorithm), shortest-path Jacobians, and the Viterbi algorithm in HMMs.
Check problems
Check 1derivation
Prove Dijkstra's correctness invariant:
When a vertex u is popped from the priority queue with distance d, then d=dist∗(u), the true shortest-path distance from the source to u.
Assume all edge weights are non-negative. State explicitly where the non-negativity is used.
show solution sketch
Proof by induction on the order of pops.
Base case: the source s is popped first with distance 0. Trivially 0=dist∗(s) since no edge can have negative weight.
Inductive step: assume the invariant holds for all vertices popped before u. Suppose for contradiction d>dist∗(u). Then there's some shortest path P∗=s→…→u with total weight dist∗(u)<d.
Let y be the first vertex on P∗ that has not yet been popped at the moment we're popping u. (y exists because u itself is unpopped at this moment.) Let x be the vertex on P∗ immediately before y.
By construction, x was popped earlier, so by induction dist[x]=dist∗(x). When x was popped, the edge x→y was relaxed, so dist[y]≤dist∗(x)+w(x,y)=dist∗(y)≤dist∗(u) (the last inequality uses non-negativity: dist∗(y)≤dist∗(u) because the path y→…→u on P∗ has non-negative total weight).
So at the moment u is being popped, dist[y]≤dist∗(u)<d. But the priority queue pops in non-decreasing order, so y should have been popped before u. Contradiction. Therefore d=dist∗(u). ∎
Where non-negativity is used: in the inequality dist∗(y)≤dist∗(u), the "remaining path" from y to u on P∗ must have non-negative total weight. If negative edges existed, the remaining path could subtract, making dist∗(y)>dist∗(u) possible — breaking the contradiction. This is the formal reason Dijkstra requires non-negative weights.
Check 2diagnosis
The following Dijkstra implementation looks reasonable but produces incorrect results on certain graphs. Find the bug.
import heapqdef dijkstra_buggy(graph, src): dist = {v: float('inf') for v in graph} dist[src] = 0 pq = [(0, src)] while pq: d, u = heapq.heappop(pq) # ← no "if d > dist[u]" check here for v, w in graph[u]: if d + w < dist[v]: dist[v] = d + w heapq.heappush(pq, (d + w, v)) return dist
(a) Construct a small graph where this code returns wrong distances.
(b) Explain what's missing.
(c) Propose the smallest fix.
show solution sketch
(a) Counterexample. Graph: A→B:10,A→C:1,C→B:2,B→D:1. True distances from A: A=0,B=3,C=1,D=4.
Pop (10,B) — stale, but no check! Bug: the code treats this as B being at distance 10 and re-relaxes its outgoing edges with that stale distance, pushing (11,D) into the queue.
(11, D) is popped, but dist[D] is already 4, and the "if d + w < dist[v]" check correctly prevents corruption — so the final answer is right.
Wait — actually, the bug is subtler than "wrong answer." The "if d+w < dist[v]" check inside the relax loop saves the day for distance correctness. But the algorithm relaxes edges multiple times from stale entries, doing redundant work. On a graph with many improvable vertices, this can blow up to O(VElogV) instead of the correct O((V+E)logV).
(b) Missing: the if d > dist[u]: continue check after popping. Without it, stale popped entries cause redundant relaxations and degraded performance.
(c) Fix:
while pq: d, u = heapq.heappop(pq) if d > dist[u]: continue # ← skip stale entries for v, w in graph[u]: ...
Alternative fix: maintain a visited set and skip if already visited. Both produce correct algorithm with optimal complexity. The "d > dist[u]" check is more local and the conventional choice.
This is a real bug that appears in beginner Dijkstra implementations all the time — the algorithm seems to work on small inputs because the relaxation check protects distance correctness, but it's an order of magnitude slower than it should be on real-world graphs. Profiling reveals the issue: the priority queue grows to O(E) instead of O(V).