Serialize and Deserialize Binary Tree
Interview Prep
The problem
Design a codec for an arbitrary binary tree: a function that
encodes the tree to a string and another that decodes that string
back into an identical tree. Both must run in O(n).
class Node:
def __init__(self, val, left=None, right=None):
self.val = val
self.left = left
self.right = right Why pre-order + a None sentinel works
A standard fact: pre-order alone is not sufficient to
reconstruct a binary tree, because two different shapes can produce
the same pre-order sequence. The classical fix is to pair pre-order
with in-order — but that takes O(n) additional space
and the in-order/pre-order alignment is fiddly.
A cleaner trick: include explicit None markers
(often # or null) for missing children
during pre-order. With the missing children made visible, pre-order
becomes a complete description: at each step the recursive
deserialiser knows whether to build a subtree or stop. The grammar
is simply tree := '#' | int tree tree.
Pre-order codec (recursive)
def serialize(root: Node | None) -> str:
"""Pre-order traversal with a sentinel for None."""
parts = []
def walk(node):
if node is None:
parts.append('#')
return
parts.append(str(node.val))
walk(node.left)
walk(node.right)
walk(root)
return ','.join(parts)
def deserialize(s: str) -> Node | None:
it = iter(s.split(','))
def build():
tok = next(it)
if tok == '#':
return None
node = Node(int(tok))
node.left = build()
node.right = build()
return node
return build() Trace
Tree: 1
/ \
2 3
/ \
4 5
Pre-order with None sentinels:
visit 1 -> "1"
visit 2 -> "2"
2.left None -> "#"
2.right None -> "#"
visit 3 -> "3"
visit 4 -> "4"
4.left None -> "#"
4.right None -> "#"
visit 5 -> "5"
5.left None -> "#"
5.right None -> "#"
serialized = "1,2,#,#,3,4,#,#,5,#,#"
deserialize:
build() consumes "1" -> Node(1)
left = build() consumes "2" -> Node(2)
left = build() consumes "#" -> None
right = build() consumes "#" -> None
right = build() consumes "3" -> Node(3)
left = build() consumes "4" -> Node(4)
... (Nones)
right = build() consumes "5" -> Node(5)
... (Nones) BFS codec (level-order)
Equivalent power, different style: emit a BFS traversal with
# in place of missing children. Reconstruct by reading
pairs: the next two tokens are the children of the next non-null
node in a queue. This is the encoding LeetCode uses to display
trees, and is good to know for that reason.
from collections import deque
def serialize_bfs(root) -> str:
if root is None: return ''
q, parts = deque([root]), []
while q:
node = q.popleft()
if node is None:
parts.append('#')
else:
parts.append(str(node.val))
q.append(node.left)
q.append(node.right)
return ','.join(parts)
def deserialize_bfs(s: str):
if not s: return None
tokens = s.split(',')
root = Node(int(tokens[0]))
q = deque([root]); i = 1
while q and i < len(tokens):
node = q.popleft()
if tokens[i] != '#':
node.left = Node(int(tokens[i]))
q.append(node.left)
i += 1
if i < len(tokens) and tokens[i] != '#':
node.right = Node(int(tokens[i]))
q.append(node.right)
i += 1
return root Complexity
- Time:
O(n)for both serialize and deserialize. - Space:
O(n)output string. Recursion usesO(h)stack.
Variations worth knowing
- BST-specific codec: with the BST invariant you can serialise as pre-order without sentinels — values alone are enough, because the bounds at each recursion step decide whether the next token belongs in the left or right subtree. Smaller payload.
- N-ary tree: the same idea, but each node needs
its child count appended (e.g.
1#3#2#4#means "node 1 has 3 children..."). Or use a closing-paren style. - Tree as a Prüfer sequence: for labeled trees,
an
n − 2-long sequence uniquely encodes the tree. Beautiful theoretical encoding; cute for combinatorics interviews. - Compact binary encoding: if values are bounded integers, switch from comma-separated text to a binary format (a marker bit per node + fixed-width values). 5-10× smaller in practice.
- Streaming deserialise: the recursive pre-order codec consumes the input one token at a time, so it generalizes naturally to streamed/chunked input. Useful for sending large trees over the wire.