Top K Frequent Elements
Interview Prep
The problem
Given an integer array nums and an integer
k, return the k most frequent values. The
output order doesn't matter.
Pattern: count, then top-K via heap or bucket sort
Always start with Counter(nums). Then the question
becomes "give me the top K of this." Two ways to do that
cleanly: a size-K heap, or bucket-sort by frequency.
Solution 1 — size-K heap
from collections import Counter
import heapq
def top_k_frequent(nums: list[int], k: int) -> list[int]:
counts = Counter(nums)
# heapq.nlargest is O(n log k) — perfect for "top K" queries.
return [val for val, _ in heapq.nlargest(k, counts.items(), key=lambda kv: kv[1])]
O(n log k) time, O(n + k) space. The hidden trick:
heapq.nlargest internally maintains a heap of size
k rather than sorting all n entries.
Cleanest answer for an interview, and you should know
nlargest exists.
Solution 2 — bucket sort by frequency
def top_k_frequent(nums: list[int], k: int) -> list[int]:
counts: dict[int, int] = {}
for x in nums:
counts[x] = counts.get(x, 0) + 1
# Bucket index = frequency. There are at most n elements, so frequency
# is in [1, n]. Each bucket holds the values that occur that many times.
n = len(nums)
buckets: list[list[int]] = [[] for _ in range(n + 1)]
for val, freq in counts.items():
buckets[freq].append(val)
# Walk buckets from highest frequency down, taking values until we have k.
out: list[int] = []
for freq in range(n, 0, -1):
for val in buckets[freq]:
out.append(val)
if len(out) == k:
return out
return out
O(n) time, O(n) space — strictly linear. Every element appears
between 1 and n times, so we can use the frequency itself as a
bucket index. Walking from the high bucket down stops as soon as we
have k values.
The bucket version beats the heap on asymptotic complexity, but in
practice the heap is usually faster for moderate k
because of cache effects. Mention both in interviews if asked to
"improve" — it shows you know the tradeoff.
Edge cases
k == len(set(nums))— return every distinct value.- Tie at the boundary — the problem says "any K" suffices, so any tie-breaker is correct.
- Empty input is excluded by the problem constraints (k ≥ 1, n ≥ k).
Related
- Kth Largest Element in an Array. Same heap idea but on raw values, not counts.
- Sort Characters by Frequency. Same pattern, sorted output instead of "top K."