Generate Parentheses

Interview Prep

Warm-upBacktrackingbacktrackingrecursion

The problem

Given n, generate all well-formed strings of n open and n close parentheses. "Well formed" means at every prefix, the number of close parens never exceeds opens, and the totals balance.

Pattern: count remaining + validity invariant

The naive approach generates all 2^(2n) strings and filters — works for tiny n, but the Catalan-number answer count (2n choose n)/(n+1) grows much slower than 4^n, so almost all generated strings are wasted. The fix: build the string one character at a time and prune invalid prefixes at the source.

Two simple invariants make pruning safe:

Together they guarantee every emitted leaf is a valid sequence. The branching factor at each node is ≤ 2, and the tree has exactly the Catalan number of leaves — optimal.

Solution

def generate(n: int) -> list[str]:
    out, buf = [], []
    def back(opened: int, closed: int):
        if opened == n and closed == n:
            out.append(''.join(buf))
            return
        if opened < n:                       # open if we haven't used all opens
            buf.append('('); back(opened + 1, closed); buf.pop()
        if closed < opened:                  # close only if there's an unmatched open
            buf.append(')'); back(opened, closed + 1); buf.pop()
    back(0, 0)
    return out

Trace

n = 3.

back(0, 0)              ""
  '('                   "("
    back(1, 0)
      '('               "(("
        back(2, 0)
          '('           "((("
            back(3, 0)
              ')'       "((()"
                ')'     "((())"
                  ')'   "((()))" -> emit
          ')'           "(()"
            back(2, 1)
              '('       "(()("
                back(3, 1)
                  ')'   "(()()"
                    ')' "(()())" -> emit
              ')'       "(())"
                back(2, 2)
                  '('   "(())("
                    back(3, 2)
                      ')' "(())()" -> emit
      ')'               "()"
        back(1, 1)
          ... -> "()(())" and "()()()" eventually

result (5 = Catalan(3)):
  "((()))", "(()())", "(())()", "()(())", "()()()"

Brute force for contrast

def generate_brute(n):
    """Generate all 2^(2n) strings; filter to the well-formed ones."""
    def valid(s):
        bal = 0
        for c in s:
            bal += 1 if c == '(' else -1
            if bal < 0: return False
        return bal == 0
    return [s for s in (
        ''.join(bits) for bits in __import__('itertools').product('()', repeat=2 * n)
    ) if valid(s)]

Complexity

Variations worth knowing