Recognize a problem as the search for the unique boundary of a monotonic predicate; maintain an invariant interval [lo, hi] with a known predicate sign at each end; halve the interval per step until the boundary is found in O(log n) time.
1 worked example · 7 practice problems · 2 check problems
Worked example: classic sorted-array search
Problem. Find 13 in the sorted array [1, 3, 5, 7, 9, 11, 13, 15]. Trace each iteration of binary search by hand. Report the final index.
Diagnosis. The array is sorted, so the predicate P(i)=(arr[i]≥target) is monotonic in i: it stays false up to some boundary index k, then is true from k onward. Binary search finds k in O(logn) by halving an interval where the boundary must live.
Predict before reading on: predict the number of iterations needed. The array has 8 elements; binary search halves the search range each step. Roughly how many halvings to isolate the answer?
Iteration trace. Maintain lo,hi as inclusive indices. At each step compute mid=(lo+hi)/2, compare with the target, and shrink the side that can't contain it.
iter
lo
hi
mid
arr[mid]
action
1
0
7
3
7
7<13, search right: lo=4
2
4
7
5
11
11<13, search right: lo=6
3
6
7
6
13
found at index 6
Three iterations. log2(8)=3, exactly the theoretical bound. Each iteration discards half the remaining candidates.
Predict before reading on: what would happen at iteration 2 if we accidentally wrote lo=mid instead of lo=mid+1? Trace one step and see.
The invariant. Throughout the loop, the target (if present) lies in [lo,hi]. The boundaries shrink monotonically — never moving past mid — so the loop terminates. The size of the interval halves each iteration, so total work is O(logn).
Verification.
def search(arr, target): lo, hi = 0, len(arr) - 1 while lo <= hi: mid = (lo + hi) // 2 if arr[mid] == target: return mid elif arr[mid] < target: lo = mid + 1 else: hi = mid - 1 return -1print(search([1, 3, 5, 7, 9, 11, 13, 15], 13)) # 6
Articulate: state in one sentence what binary search is "actually about" — not sorted arrays specifically, but the deeper property it exploits.
Practice problems
Seven problems, seven different surfaces. The unifying move is "find the boundary of a monotonic predicate by halving." The predicate changes between problems; the halving structure doesn't.
P.1classic sorted-array lookup, edge cases
In [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]:
(a) Where is 14? How many iterations?
(b) Search for 11. What does the algorithm return, and why?
(c) Search for 21. Trace the final iteration carefully.
Find the analogue:
same algorithm as the worked example, three different targets. The interesting cases are (b) and (c) where the target isn't in the array — what happens to lo and hi at termination?
show answer
(a) 14 is at index 6. Iteration 1: mid=4,arr[4]=10<14,lo=5. Iteration 2: mid=7,arr[7]=16>14,hi=6. Iteration 3: mid=5,arr[5]=12<14,lo=6. Iteration 4: mid=6,arr[6]=14, found.
(b) 11 isn't in the array. The loop exits with lo=5,hi=4 (lo passes hi), returns -1. At exit, lo points to where 11 would be inserted to keep the array sorted — a useful invariant.
(c) 21 is bigger than every element. After successive lo-bumps, lo goes past hi, terminates with lo=10,hi=9. Returns -1.
P.2lower-bound with duplicates
Find the first occurrence of 5 in [1, 3, 5, 5, 5, 7, 9, 11]. The vanilla search from the worked example returns some index of 5 (probably the middle one). Modify it to return the leftmost.
Find the analogue:
vanilla search uses the predicate arr[i]≥target. Lower-bound uses the same predicate but doesn't stop when it finds an equal element — it keeps shrinking hi to look for a smaller index that still satisfies it. Modify the comparison.
show answer
Index 2 (the first 5). Use a "predicate" search rather than equality:
def lower_bound(arr, target): lo, hi = 0, len(arr) # half-open [lo, hi) while lo < hi: mid = (lo + hi) // 2 if arr[mid] < target: lo = mid + 1 else: hi = mid # don't bracket past mid — could be leftmost return loprint(lower_bound([1, 3, 5, 5, 5, 7, 9, 11], 5)) # 2
Two structural changes from the worked example: (i) half-open interval (hi exclusive) so out-of-bounds insertion has a natural index; (ii) hi=mid not mid−1 — because the element at mid could still be the leftmost answer.
This pattern — find the first index where a monotonic predicate becomes true — is the most general form of binary search. Vanilla "find a target" is a degenerate case.
P.3integer square root via number theory
Compute ⌊50⌋ (the integer square root: the largest k with k2≤50) using binary search. Then generalize: ⌊n⌋ for arbitrary positive integer n.
Find the analogue:
no array here. What plays the role of the array? What's the monotonic predicate? Read off the answer at termination.
show answer
The "array" is the integers 0,1,2,…,n. The predicate is P(k)=(k2>n). P is monotonic in k: false for small k (their squares are still ≤n), true for large k. The largest k with ¬P(k) is the integer square root.
def isqrt(n): lo, hi = 0, n while lo <= hi: mid = (lo + hi) // 2 sq = mid * mid if sq == n: return mid elif sq < n: lo = mid + 1 else: hi = mid - 1 return hi # largest k with k² ≤ nprint(isqrt(50)) # 7 (since 7² = 49 ≤ 50 < 64 = 8²)print(isqrt(64)) # 8
Two iterations on n=50: mid=25,625>50,hi=24. mid=12,144>50,hi=11. mid=5,25<50,lo=6. mid=8,64>50,hi=7. mid=6,36<50,lo=7. mid=7,49<50,lo=8. Loop exits with hi=7.
This is the same algorithm as the worked example, but the "array" is virtual — defined by a function rather than stored explicitly. Anywhere a monotonic predicate lives on an ordered domain, binary search applies, whether the domain is in memory or not.
A sorted array has been rotated by an unknown amount: [4, 5, 6, 7, 0, 1, 2]. The array is no longer globally sorted, but it has a special structure (one rotation point). Find 0 in O(logn) time.
Find the analogue:
the array isn't sorted, so the vanilla predicate doesn't apply. But it has the property that at least one half of the array around any mid-point is sorted. Identify which half is sorted, then check whether the target lives in that half. If yes, search that half; if no, search the other.
show answer
Index 4. The algorithm:
def search_rotated(arr, target): lo, hi = 0, len(arr) - 1 while lo <= hi: mid = (lo + hi) // 2 if arr[mid] == target: return mid # Identify the sorted half if arr[lo] <= arr[mid]: # left [lo..mid] is sorted if arr[lo] <= target < arr[mid]: hi = mid - 1 else: lo = mid + 1 else: # right [mid..hi] is sorted if arr[mid] < target <= arr[hi]: lo = mid + 1 else: hi = mid - 1 return -1print(search_rotated([4, 5, 6, 7, 0, 1, 2], 0)) # 4
Trace on [4,5,6,7,0,1,2] with target 0: mid=3,arr[3]=7. Left half [4,5,6,7] is sorted (since arr[lo]=4≤7), but 0 is not in [4, 7), so search right: lo=4. mid=5,arr[5]=1,=0. Left half [0,1] is sorted, 0 is in [0, 1), so search left: hi=4. mid=4,arr[4]=0, found.
The key insight: the predicate "is target in [lo, mid] sorted half?" is monotonic given the side identification. Even when the array as a whole isn't sorted, the halving structure survives because each step preserves a sorted-half invariant.
P.5peak element — local monotonicity
Given an array [1, 3, 7, 8, 5, 4, 2], find any peak — an index i where arr[i−1]≤arr[i]≥arr[i+1] (using −∞ for out-of-bounds neighbors). The array isn't sorted — but binary search still works in O(logn). Why?
Find the analogue:
the array isn't sorted, so the monotonic-predicate-on-values formulation doesn't apply directly. But there's a local property: at any mid, look at the slope around it. If arr[mid+1]>arr[mid], the array is rising — there must be a peak to the right. If falling, a peak is to the left. Recurse on the half that contains a peak.
show answer
Peak at index 3 (value 8). The algorithm:
def find_peak(arr): lo, hi = 0, len(arr) - 1 while lo < hi: mid = (lo + hi) // 2 if arr[mid] < arr[mid + 1]: lo = mid + 1 # rising, peak is to the right else: hi = mid # falling or flat, peak is at mid or left return loprint(find_peak([1, 3, 7, 8, 5, 4, 2])) # 3
Why this works: at each step, the "is there a peak in [lo, hi]?" invariant is maintained. If the array rises at mid, the values continue to grow into the right half before they have to start falling — there must be a peak there. The halving is justified by this monotonic-direction predicate, even though the array as a whole has no global order.
The takeaway is the same as P.4: binary search isn't about sorted arrays specifically. It's about any situation where you can rule out half the search space in O(1) based on a property at the midpoint. Sortedness is one way to enable that ruling; local monotonicity is another.
P.6parametric search — binary search on the answer
You have packages with weights [1,2,3,4,5,6,7,8,9,10] and must ship them in 5 days, in order, on a conveyor with daily capacity C. What's the minimum C that works?
Find the analogue:
no array of capacities to search through — you have to compute the answer. But there's a monotonic predicate hiding: P(C)= "can we ship in 5 days at capacity C?" is false below some threshold and true above. Binary search on C.
show answer
The minimum capacity is 15. Construct the predicate as a greedy simulation, then binary search on C:
def can_ship(packages, days, cap): used, current = 1, 0 for w in packages: if w > cap: return False if current + w > cap: used += 1 current = w else: current += w return used <= daysdef min_capacity(packages, days): lo, hi = max(packages), sum(packages) while lo < hi: mid = (lo + hi) // 2 if can_ship(packages, days, mid): hi = mid else: lo = mid + 1 return loprint(min_capacity([1,2,3,4,5,6,7,8,9,10], 5)) # 15
Bounds: lo=max(packages) (any smaller and a single package wouldn't fit), hi=∑(packages) (one giant day always works). Binary search finds the smallest C where can_ship returns true.
This is "binary search on the answer." It applies whenever (i) the answer is a number in some range, (ii) you can check whether any candidate answer works in O(f(n)) time, and (iii) the "works"/"doesn't work" predicate is monotonic. Then the total cost is O(f(n)log(range)). Pattern shows up in capacity allocation, scheduling, resource provisioning, optimal-cut problems — a huge class of optimization problems.
P.7real-valued root finding
Find a positive root of f(x)=x3−x−1 to absolute tolerance 10−6. (There's one real root between 1 and 2.) Note that f(1)=−1<0 and f(2)=5>0, so the intermediate value theorem guarantees a root in between.
Find the analogue:
no discrete array; the "indices" are real numbers in [1,2]. The predicate P(x)=(f(x)>0) is monotonic on this interval because f is strictly increasing there. Binary search reduces to hi−lo→0.
show answer
Root: x≈1.32472.
def bisect_root(f, lo, hi, tol=1e-6): assert f(lo) * f(hi) < 0 # opposite signs while hi - lo > tol: mid = (lo + hi) / 2 if f(mid) == 0: return mid elif f(lo) * f(mid) < 0: hi = mid # root in left half else: lo = mid # root in right half return (lo + hi) / 2print(bisect_root(lambda x: x**3 - x - 1, 1.0, 2.0)) # ≈ 1.32472
This is the bisection method from numerical analysis. Same algorithm as integer binary search, but with continuous bookkeeping and a tolerance-based termination. Number of iterations to achieve tolerance ε on an interval of width W is log2(W/ε) — same O(log) as discrete binary search.
Practical note: bisection has linear convergence rate 1/2 per step. Newton's method converges quadratically when it works but can diverge; bisection always works given a sign-change bracket, but never beats linear. Hybrid methods (Brent's algorithm) combine the two — fast when Newton would work, robust when it wouldn't.
Check problems
Two problems that resist pattern-matching against the practice set. The first proves the running-time bound formally; the second is a diagnosis exercise on a notoriously bug-prone family of implementations.
Check 1derivation
Prove that binary search on a sorted array of n elements has worst-case running time O(logn), terminates on every input, and has the correctness property: if the target is present, it's found; if absent, the loop exits with lo>hi.
Specifically:
(a) Define an invariant that holds at the start of each iteration. State it precisely.
(b) Show the invariant is preserved by every iteration.
(c) Use the invariant to argue correctness on termination.
(d) Derive a recurrence for the size of [lo,hi] after k iterations. Use it to bound the number of iterations.
show solution sketch
(a) Invariant.If the target is in the array, then the target's index lies in the inclusive range [lo,hi]. Equivalently: arr[i]<target for all i<lo, and arr[i]>target for all i>hi.
(b) Preservation. Suppose the invariant holds at iteration start. We compute mid and compare:
If arr[mid]=target: return immediately. (No need to maintain invariant past return.)
If arr[mid]<target: by sortedness, arr[i]<target for all i≤mid. So the target's index, if it exists, is in [mid+1,hi]. Setting lo=mid+1 preserves the invariant.
If arr[mid]>target: symmetric, setting hi=mid−1 preserves the invariant.
(c) Correctness on termination. The loop ends when either (i) arr[mid]=target, which finds the target by direct check, or (ii) lo>hi, in which case the invariant gives "if the target is in the array, its index is in [lo,hi]" — but [lo,hi] is empty, so the target isn't present. Return -1. ✓
(d) Iteration bound. Let sk=hi−lo+1 be the size of the range at iteration k. The next iteration sets either lo→mid+1 or hi→mid−1. In both cases, sk+1≤⌊sk/2⌋ (because mid is roughly the midpoint and we discard half plus the midpoint itself).
Starting from s0=n, after k iterations sk≤n/2k. The loop exits when sk=0, which happens by k=⌈log2n⌉+1.
So the total number of iterations is at most ⌈log2n⌉+1=O(logn). Each iteration does O(1) work, so the total running time is O(logn). ✓
Check 2diagnosis
The following code is supposed to search for a target in a sorted array. It compiles, runs on most inputs, and looks correct at a glance. But there's a bug that produces incorrect results on specific inputs and may also cause an out-of-bounds access.
def search(arr, target): lo, hi = 0, len(arr) # ← note: hi = len(arr), not len(arr) - 1 while lo <= hi: # ← but uses <= mid = (lo + hi) // 2 if arr[mid] == target: return mid elif arr[mid] < target: lo = mid + 1 else: hi = mid - 1 return -1
(a) Find the bug. Describe it precisely (one sentence).
(b) Construct a small example input where the bug manifests. Trace the iteration that goes wrong.
(c) Propose the smallest possible fix. There are at least two valid options; name both.
show solution sketch
(a) The bug is an off-by-one inconsistency in the bracket convention. The initialization hi=len(arr) uses a half-open upper bound (hi is one past the last valid index), but the loop body treats hi as a closed upper bound by using lo≤hi and hi=mid−1. The two conventions are mixed.
(b) Example. Search for any target in [1, 2, 3], say target = 3. Iteration 1: lo=0,hi=3, mid=1, arr[1]=2<3, lo=2. Iteration 2: lo=2,hi=3, mid=2, arr[2]=3, returns 2. ✓ for this case. Now try a target larger than every element, like target = 5. Iteration 1: mid=1,2<5,lo=2. Iteration 2: lo=2,hi=3,mid=2,arr[2]=3<5,lo=3. Iteration 3: lo=3,hi=3,mid=3, but arr[3] is an out-of-bounds access. Python raises IndexError. (In C, this would be undefined behavior.)
(c) Two valid fixes (pick one):
Fix to closed-interval convention: change initialization to hi=len(arr)−1. The loop body's lo≤hi and hi=mid−1 are correct for this convention.
Fix to half-open convention: keep hi=len(arr), but change the loop to lo<hi and the upper update to hi=mid (not mid−1).
Pick one and stick with it; mixing them is exactly how off-by-one bugs in binary search get into production. The Knuth-attributed observation that "binary search is easy until you have to write it" comes from this: the algorithm is one paragraph of pseudocode, but the boundary convention has to be exact.