Serialize and Deserialize Binary Tree

Interview Prep

LegendaryTreestreesdesigntraversal

The problem

Design a codec for an arbitrary binary tree: a function that encodes the tree to a string and another that decodes that string back into an identical tree. Both must run in O(n).

class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

Why pre-order + a None sentinel works

A standard fact: pre-order alone is not sufficient to reconstruct a binary tree, because two different shapes can produce the same pre-order sequence. The classical fix is to pair pre-order with in-order — but that takes O(n) additional space and the in-order/pre-order alignment is fiddly.

A cleaner trick: include explicit None markers (often # or null) for missing children during pre-order. With the missing children made visible, pre-order becomes a complete description: at each step the recursive deserialiser knows whether to build a subtree or stop. The grammar is simply tree := '#' | int tree tree.

Pre-order codec (recursive)

def serialize(root: Node | None) -> str:
    """Pre-order traversal with a sentinel for None."""
    parts = []
    def walk(node):
        if node is None:
            parts.append('#')
            return
        parts.append(str(node.val))
        walk(node.left)
        walk(node.right)
    walk(root)
    return ','.join(parts)

def deserialize(s: str) -> Node | None:
    it = iter(s.split(','))
    def build():
        tok = next(it)
        if tok == '#':
            return None
        node = Node(int(tok))
        node.left  = build()
        node.right = build()
        return node
    return build()

Trace

Tree:        1
            / \
           2   3
              / \
             4   5

Pre-order with None sentinels:
  visit 1     -> "1"
  visit 2     -> "2"
  2.left  None -> "#"
  2.right None -> "#"
  visit 3     -> "3"
  visit 4     -> "4"
  4.left  None -> "#"
  4.right None -> "#"
  visit 5     -> "5"
  5.left  None -> "#"
  5.right None -> "#"

serialized = "1,2,#,#,3,4,#,#,5,#,#"

deserialize:
  build() consumes "1" -> Node(1)
    left  = build() consumes "2" -> Node(2)
      left  = build() consumes "#" -> None
      right = build() consumes "#" -> None
    right = build() consumes "3" -> Node(3)
      left  = build() consumes "4" -> Node(4)
        ... (Nones)
      right = build() consumes "5" -> Node(5)
        ... (Nones)

BFS codec (level-order)

Equivalent power, different style: emit a BFS traversal with # in place of missing children. Reconstruct by reading pairs: the next two tokens are the children of the next non-null node in a queue. This is the encoding LeetCode uses to display trees, and is good to know for that reason.

from collections import deque

def serialize_bfs(root) -> str:
    if root is None: return ''
    q, parts = deque([root]), []
    while q:
        node = q.popleft()
        if node is None:
            parts.append('#')
        else:
            parts.append(str(node.val))
            q.append(node.left)
            q.append(node.right)
    return ','.join(parts)

def deserialize_bfs(s: str):
    if not s: return None
    tokens = s.split(',')
    root = Node(int(tokens[0]))
    q = deque([root]); i = 1
    while q and i < len(tokens):
        node = q.popleft()
        if tokens[i] != '#':
            node.left = Node(int(tokens[i]))
            q.append(node.left)
        i += 1
        if i < len(tokens) and tokens[i] != '#':
            node.right = Node(int(tokens[i]))
            q.append(node.right)
        i += 1
    return root

Complexity

Variations worth knowing