Binary Search: Theory, Implementation, and Edge Cases
A thorough guide to binary search: from the core invariant to correct implementations in Python, C++, and JavaScript, plus tricky variations.
Table of Contents
Binary Search: Theory, Implementation, and Edge Cases
Binary search is the algorithm most developers think they understand but often implement incorrectly. Getting it right requires understanding the invariant it maintains, not just memorizing a template.
Why O(log n)?
Binary search halves the search space on each step. Starting with $n$ elements:
$$n \to \frac{n}{2} \to \frac{n}{4} \to \cdots \to 1$$
This takes $\log_2 n$ steps. For $n = 10^9$: only 30 comparisons.
The Core Invariant
The target (if it exists) is always in the range $[lo, hi]$. Every update must maintain this invariant:
while lo <= hi:
mid = lo + (hi - lo) // 2 # avoids integer overflow vs (lo + hi) // 2
if arr[mid] == target: return mid
elif arr[mid] < target: lo = mid + 1
else: hi = mid - 1
return -1
Implementations
def binary_search(arr: list[int], target: int) -> int:
lo, hi = 0, len(arr) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2
if arr[mid] == target:
return mid
elif arr[mid] < target:
lo = mid + 1
else:
hi = mid - 1
return -1
Variations
Lower Bound (first position ≥ target)
def lower_bound(arr: list[int], target: int) -> int:
"""Returns the leftmost index where arr[i] >= target."""
lo, hi = 0, len(arr) # hi = len(arr), not len-1
while lo < hi: # not lo <= hi
mid = lo + (hi - lo) // 2
if arr[mid] < target:
lo = mid + 1
else:
hi = mid # don't exclude mid
return lo # lo == hi at termination
Upper Bound (first position > target)
def upper_bound(arr: list[int], target: int) -> int:
"""Returns the leftmost index where arr[i] > target."""
lo, hi = 0, len(arr)
while lo < hi:
mid = lo + (hi - lo) // 2
if arr[mid] <= target:
lo = mid + 1
else:
hi = mid
return lo
Binary Search on the Answer
Binary search isn't just for arrays — you can binary search on any monotone predicate. Example: find the minimum speed to ship all packages within D days:
def ship_within_days(weights: list[int], D: int) -> int:
def can_ship(capacity: int) -> bool:
days, load = 1, 0
for w in weights:
if load + w > capacity:
days += 1
load = 0
load += w
return days <= D
lo, hi = max(weights), sum(weights)
while lo < hi:
mid = lo + (hi - lo) // 2
if can_ship(mid):
hi = mid
else:
lo = mid + 1
return lo
Complexity Summary
| Operation | Time | Space |
|---|---|---|
| Standard binary search | O(log n) | O(1) |
| Lower/upper bound | O(log n) | O(1) |
| Binary search on answer | O(log(range) × check) | O(1) |
Binary search is deceptively simple. The key is maintaining your invariant precisely and choosing your boundary conditions carefully.