C Compiler in Python


A working compiler for a real, growing subset of C, written incrementally in Python. Each chapter handles slightly more of the language than the last, the compiler still compiles every source file from the earlier chapters, and the output is always real x86-64 assembly that you can hand to gcc to assemble, link, and run. The structure follows Nora Sandler's Writing a C Compiler book and the chibicc project: don't try to swallow C whole, grow into it.

Python is the host language for this track because it lets the algorithms breathe. String slicing, dictionaries, exceptions, no manual memory management — every line of the compiler is about the compiler, not about Python plumbing. Once the shape of each stage is clear, the C and Lisp and OCaml tracks are "how do I express the same idea here?" exercises.

This series is also where the site is testing a new exercise pattern. Every chapter ends with a block of mixed exercises in two flavors.

The exercises are not graded: there's no runtime in the page yet. The discipline is on the reader: think first, click second. Future versions of these components may grow a sandboxed code runner so problems can be auto-checked.


Chapters

  1. The smallest C program — Lex, parse, and codegen for int main(void) { return INTEGER; }. End-to-end pipeline in <100 lines of Python.
  2. Unary operators — -x, !x, ~x. The first AST node, the first recursive parser, and the "value lives in %eax" codegen invariant.
  3. Binary operators and precedence — + - * /. Recursive-descent precedence climbing.
  4. Local variables — Stack frames, the activation record, variable resolution.
  5. Conditionals — if / else, conditional branches.
  6. Loops — while, for, break, continue.
  7. Functions and the calling convention — Multiple functions, arguments, the System V ABI.
  8. Pointers and the address-of operator — *, &, addresses as values.
  9. Arrays — Stack-allocated arrays and indexing.
  10. Structs — Composite types, field offsets.
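
To make the chapter 1 entry above concrete, here is a minimal sketch of a lex / parse / codegen pipeline for int main(void) { return INTEGER; }. It is not the chapter's actual code: the names lex, parse, codegen, and compile_source, the token representation, and the exact assembly layout are all assumptions made for illustration. The output assumes AT&T syntax, matching the %eax notation used in the chapter list.

```python
import re

# Hypothetical token pattern: identifiers/keywords, integer literals, punctuation.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(\d+)|([(){};]))")


def lex(source):
    """Split source text into (kind, text) tuples."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character near offset {pos}")
        ident, number, punct = m.groups()
        if ident is not None:
            tokens.append(("ident", ident))
        elif number is not None:
            tokens.append(("int", number))
        else:
            tokens.append(("punct", punct))
        pos = m.end()
    return tokens


def parse(tokens):
    """Accept exactly `int main(void) { return INTEGER; }`; return the integer."""
    expected = [
        ("ident", "int"), ("ident", "main"), ("punct", "("),
        ("ident", "void"), ("punct", ")"), ("punct", "{"),
        ("ident", "return"), ("int", None), ("punct", ";"),
        ("punct", "}"),
    ]
    if len(tokens) != len(expected):
        raise SyntaxError("expected: int main(void) { return INTEGER; }")
    value = None
    for (kind, text), (want_kind, want_text) in zip(tokens, expected):
        if kind != want_kind or (want_text is not None and text != want_text):
            raise SyntaxError(f"unexpected token {text!r}")
        if kind == "int":
            value = int(text)
    return value


def codegen(value, entry="main"):
    """Emit assembly that leaves the return value in %eax and returns."""
    return "\n".join([
        f"    .globl {entry}",
        f"{entry}:",
        f"    movl ${value}, %eax",
        "    ret",
        "",
    ])


def compile_source(source):
    """Run the three stages end to end."""
    return codegen(parse(lex(source)))


print(compile_source("int main(void) { return 42; }"))
```

Saving the printed text to a .s file and passing that file to gcc should produce an executable whose exit status is the returned integer. The parser here only matches a fixed token sequence; a real chapter 1 would more likely build a small AST so later chapters have something to extend.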

Upcoming chapters aren't written yet.


Prerequisites

Comfort with C as a user — variables, control flow, functions, pointers — and with Python as the implementation language. Some exposure to assembly helps but isn't required; we introduce only as much x86-64 as each chapter needs. No compiler-construction background assumed: the standard "lex / parse / codegen" pipeline is built up explicitly in chapter 1.

A Linux box (or WSL) with gcc installed is the easiest way to run the compiled output. macOS works too; the only difference is that the linker expects the entry symbol to be named _main instead of main, and chapter 1 calls this out where relevant.
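
One way a code generator might handle the main versus _main difference is to pick the symbol name from the host platform. The helper below is a hypothetical sketch, not something taken from the chapters; the name entry_symbol and its parameters are assumptions. It relies on Python reporting macOS as "darwin" in sys.platform.

```python
import sys


def entry_symbol(name="main", platform=None):
    """Return the assembly-level symbol for a C function name.

    Assumption: on macOS ("darwin") C symbols carry a leading underscore;
    on other platforms the name is used unchanged.
    """
    if platform is None:
        platform = sys.platform
    return "_" + name if platform == "darwin" else name


# Explicit platforms, so the result does not depend on the host machine.
print(entry_symbol("main", "linux"))
print(entry_symbol("main", "darwin"))
```

Passing the platform explicitly keeps the function easy to test on any machine, while the default follows whatever system the compiler is running on.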