C Compiler in Python
A working compiler for a real, growing subset of C, written incrementally in Python. Each chapter handles slightly more of the language than the last, the compiler still compiles every source file from earlier chapters, and its output is real x86-64 assembly that you can hand to gcc and execute. The structure follows Nora Sandler's Writing a C Compiler book and the chibicc project: don't try to swallow C whole, grow into it.
Python is the host language for this track because it lets the algorithms breathe. String slicing, dictionaries, exceptions, no manual memory management — every line of the compiler is about the compiler, not about Python plumbing. Once the shape of each stage is clear, the C and Lisp and OCaml tracks are "how do I express the same idea here?" exercises.
This series is also where the site is testing a new exercise pattern. Every chapter ends with a block of mixed exercises in two flavors:
- Concept: comprehension questions such as "why does this work" or "explain X in your own words." Click to reveal the canonical answer.
- Problem: hands-on problems such as tracing a program, modifying the codegen, or predicting an output. Click to reveal the solution and explanation.
The exercises are not graded, because there's no runtime in the page yet. The discipline is on the reader: think first, click second. Future versions of these components may grow a sandboxed code runner so problems can be auto-checked.
Chapters
- The smallest C program: Lex, parse, and codegen for int main(void) { return INTEGER; }. End-to-end pipeline in under 100 lines of Python.
- Unary operators: -x, !x, ~x. The first AST node, the first recursive parser, and the "value lives in %eax" codegen invariant.
- Binary operators and precedence: + - * /. Recursive-descent precedence climbing.
- Local variables: Stack frames, the activation record, variable resolution.
- Conditionals: if/else, conditional branches.
- Loops: while, for, break, continue.
- Functions and the calling convention: Multiple functions, arguments, the System V ABI.
- Pointers and the address-of operator: *, &, addresses as values.
- Arrays: Stack-allocated arrays and indexing.
- Structs: Composite types, field offsets.
Upcoming chapters aren't written yet.
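To give a feel for the scale of the first chapter, here is a minimal sketch of a lex / parse / codegen pipeline for int main(void) { return INTEGER; }. It is not the chapter's actual code: the names lex, parse, codegen, and compile_c, the token kinds, and the dictionary-shaped AST are all assumptions made for illustration.

```python
import re

# Hypothetical token pattern: integer literal, word (keyword or name), punctuation.
TOKEN_RE = re.compile(r"\s*(?:(\d+)\b|([A-Za-z_]\w*)|([(){};]))")


def lex(source):
    """Split source text into a list of (kind, text) tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        if m.group(1) is not None:
            tokens.append(("number", m.group(1)))
        elif m.group(2) is not None:
            tokens.append(("word", m.group(2)))
        else:
            tokens.append(("punct", m.group(3)))
        pos = m.end()
    return tokens


def parse(tokens):
    """Accept exactly the one supported program shape and return a tiny AST."""
    # None marks the slot where the integer literal must appear.
    expected = ["int", "main", "(", "void", ")", "{", "return", None, ";", "}"]
    if len(tokens) != len(expected):
        raise SyntaxError("expected: int main(void) { return INTEGER; }")
    value = None
    for (kind, text), want in zip(tokens, expected):
        if want is None:
            if kind != "number":
                raise SyntaxError(f"expected an integer, got {text!r}")
            value = int(text)
        elif kind == "number" or text != want:
            raise SyntaxError(f"expected {want!r}, got {text!r}")
    return {"kind": "Return", "value": value}


def codegen(ast, entry="main"):
    """Emit AT&T-syntax x86-64 assembly that returns the constant in %eax."""
    return (
        f"    .globl {entry}\n"
        f"{entry}:\n"
        f"    movl ${ast['value']}, %eax\n"
        "    ret\n"
    )


def compile_c(source):
    """Run the three stages in order: text -> tokens -> AST -> assembly."""
    return codegen(parse(lex(source)))


if __name__ == "__main__":
    print(compile_c("int main(void) { return 2; }"))
```

Each stage is a plain function from one representation to the next, which is the shape the later chapters can extend one language feature at a time.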
Prerequisites
Comfort with C as a user — variables, control flow, functions, pointers — and with Python as the implementation language. Some exposure to assembly helps but isn't required; we introduce only as much x86-64 as each chapter needs. No compiler-construction background assumed: the standard "lex / parse / codegen" pipeline is built up explicitly in chapter 1.
A Linux box (or WSL) with gcc installed is the easiest way to run the compiled output. macOS works too; the only difference is that the linker wants _main instead of main, and chapter 1 calls this out where relevant.
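One way to handle the main versus _main difference is to pick the entry label from the host platform when emitting assembly. The sketch below assumes hypothetical helpers entry_symbol and emit_return; chapter 1 may handle this differently.

```python
import sys


def entry_symbol(platform=sys.platform):
    """Return the assembly label the linker expects for C's main."""
    # On macOS, sys.platform is "darwin" and C symbols carry a leading underscore.
    return "_main" if platform == "darwin" else "main"


def emit_return(value, platform=sys.platform):
    """Emit assembly for a main that returns a constant (illustrative only)."""
    label = entry_symbol(platform)
    return (
        f"    .globl {label}\n"
        f"{label}:\n"
        f"    movl ${value}, %eax\n"
        "    ret\n"
    )


if __name__ == "__main__":
    print(emit_return(2))
```

Saving the output to a file with a .s extension (for example out.s, a name chosen here for illustration) and running gcc out.s -o out should produce an executable whose exit status is the returned value.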