Pairs Trading and Cointegration

Pairs trading is the canonical statistical-arbitrage strategy. The idea: find two securities whose prices are linked by some economic relationship — same industry, same supply chain, share class duals, ETF and its constituents — so that their RATIO or LINEAR COMBINATION is approximately stationary even though each individual price is a random walk. When the spread deviates from its long-term mean, short the rich one, long the cheap one; close the position when the spread mean-reverts. The mathematical machinery is COINTEGRATION (Engle and Granger, 1987 — Granger's share of the 2003 Nobel cited it), and the trading rule is mean-reversion in the spread.

Pairs trading is also the cleanest entry point to a broader family of STATISTICAL ARBITRAGE strategies that look for short-term mean reversion in mispriced relationships. Many of these strategies were extremely profitable in the 1990s and early 2000s (Bamberger / D.E. Shaw / Renaissance), and have since been crowded by quant funds running similar logic. The basic methodology survives — the alpha has migrated to subtler signals, faster timescales, and more complex multi-asset combinations.

Stationarity and cointegration

A time series $\{X_{t}\}$ is (weakly) STATIONARY if its statistical properties (mean, variance, autocorrelation) are time-invariant. Asset prices are typically NOT stationary: they have unit roots, meaning $X_{t} = X_{t - 1} + ϵ_{t}$ with no mean-reverting force, so the variance of $X_{t}$ grows linearly with $t$. Stationary series are tradeable in a mean-reversion sense; non-stationary series are not.
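To make the variance claim concrete, here is a small simulation sketch. It is not part of the original text: the AR(1) coefficient `phi = 0.9`, the number of paths, and the seed are arbitrary illustrative choices. It compares the variance across many random-walk paths with that of a stationary AR(1) driven by the same shocks.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 2000, 400

# Random walk: X_t = X_{t-1} + eps_t (unit root).
eps = rng.normal(0.0, 1.0, size=(n_paths, n_steps))
random_walk = np.cumsum(eps, axis=1)

# Stationary AR(1): Y_t = phi * Y_{t-1} + eps_t with |phi| < 1.
phi = 0.9  # illustrative assumption
ar1 = np.zeros((n_paths, n_steps))
for t in range(1, n_steps):
    ar1[:, t] = phi * ar1[:, t - 1] + eps[:, t]

# Variance across paths at two dates: it keeps growing for the random
# walk and settles near 1 / (1 - phi**2) for the AR(1).
for t in (99, 399):
    print(f"t={t + 1:3d}  var(random walk)={random_walk[:, t].var():7.1f}"
          f"  var(AR(1))={ar1[:, t].var():5.2f}")
```

The random-walk variance should scale roughly in proportion to the number of steps, while the AR(1) variance stays flat.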

Two non-stationary series $\{A_{t}\}$ and $\{B_{t}\}$ are COINTEGRATED if there exists a coefficient $β$ such that $Z_{t} = A_{t} - β B_{t}$ IS stationary. Both $A$ and $B$ can wander freely, but their linear combination is anchored. Economically, this happens when the two assets are linked by an equilibrium relationship (no-arbitrage, hedging, common factor) that arbitrageurs enforce. As long as that relationship holds, the spread can deviate temporarily but is pulled back toward its mean.

The Engle-Granger test

The classical procedure (Engle and Granger 1987):

Regress $A_{t}$ on $B_{t}$ : $A_{t} = α + β B_{t} + ε_{t}$ . The coefficient $\hat{β}$ is the cointegrating coefficient (the hedge ratio); the COINTEGRATING VECTOR is $(1, - \hat{β})$ .
Form the residuals $\hat{Z}_{t} = A_{t} - \hat{α} - \hat{β} B_{t}$ .
Apply an Augmented Dickey-Fuller (ADF) unit-root test to the residuals. The null hypothesis is that $\hat{Z}_{t}$ has a unit root (non-stationary, no cointegration). The test regresses $Δ \hat{Z}_{t}$ on $\hat{Z}_{t - 1}$ (with optional lags); the $t$ -statistic of the lagged-level coefficient is compared to the DICKEY-FULLER critical values (NOT the usual Student t critical values, because under the null the regressor is non-stationary).

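Standard Dickey-Fuller critical values for a directly observed series (no constant, no trend, large samples): 1% = -2.58, 5% = -1.95, 10% = -1.62. Residuals from an estimated cointegrating regression need stricter values, because OLS chooses $\hat{β}$ to make the residuals look as stationary as possible: for two series with a constant, the MacKinnon values are approximately 1% = -3.90, 5% = -3.34, 10% = -3.04. If the observed $t$ is BELOW the critical value (more negative), reject the null and conclude COINTEGRATION. For pair-wise tests Engle-Granger is fine; in higher dimensions practitioners typically use the Johansen test (1991) instead.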
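The three steps can be sketched with plain NumPy. This is a minimal illustration rather than a reference implementation: the helper `adf_t_stat`, the synthetic data, and the constant `EG_CRIT_5PCT` (an approximate MacKinnon 5% value for two series with a constant) are assumptions introduced here. Unlike a production routine, it uses a fixed lag length and returns no p-value.

```python
import numpy as np

def adf_t_stat(x, lags=1, constant=False):
    """t-statistic on x_{t-1} in dx_t = rho*x_{t-1} + sum_i g_i*dx_{t-i} (+ c) + e_t."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)
    y = dx[lags:]                               # dx_t
    cols = [x[lags:-1]]                         # lagged level x_{t-1}
    for i in range(1, lags + 1):
        cols.append(dx[lags - i:len(dx) - i])   # lagged differences dx_{t-i}
    if constant:
        cols.append(np.ones_like(y))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[0] / np.sqrt(cov[0, 0])

# Synthetic pair sharing one random-walk trend (illustrative parameters).
rng = np.random.default_rng(1)
n = 1000
trend = np.cumsum(rng.normal(size=n))
B = trend + rng.normal(0, 0.5, n)
A = 2.0 + 0.8 * trend + rng.normal(0, 0.5, n)

# Step 1: cointegrating regression. Step 2: residuals.
slope, intercept = np.polyfit(B, A, 1)
resid = A - (intercept + slope * B)

# Step 3: ADF on the residuals (the constant is already absorbed in step 1).
EG_CRIT_5PCT = -3.34  # approximate, two series with a constant
t_resid = adf_t_stat(resid, lags=1)
t_walk = adf_t_stat(trend, lags=1, constant=True)  # a pure random walk, for contrast
print(f"residual t-stat: {t_resid:.2f}  (reject if below {EG_CRIT_5PCT})")
print(f"random-walk t-stat: {t_walk:.2f}")
```

The residual statistic should sit far below the critical value, while the statistic for the raw random walk should be much closer to zero.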

The spread as Ornstein-Uhlenbeck process

Once cointegration is established, model the spread as a mean-reverting Ornstein-Uhlenbeck process:

$$d Z_{t} = - θ (Z_{t} - μ)\, d t + σ\, d W_{t} .$$

The parameter $θ$ is the rate of mean reversion (inverse of the characteristic timescale); $μ$ is the long-term mean; $σ$ sets the volatility. Estimate by maximum likelihood or by regressing $Δ Z_{t}$ on $Z_{t - 1}$ : with sampling interval $Δt$ the slope is $e^{- θ Δt} - 1 ≈ - θ Δt$ . The CHARACTERISTIC HALF-LIFE of mean reversion is $t_{1/2} = \ln 2/ θ$ ; for a pair to be tradeable you typically want a half-life of days to weeks, not months.
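A short sketch of the regression estimator, under stated assumptions: the "true" parameters, the daily sampling interval `dt = 1`, and the seed are made up for illustration. The path is simulated with the exact discretisation of the OU process, and $θ$ is recovered from the slope via $θ = - \ln(1 + b)/Δt$ .

```python
import numpy as np

rng = np.random.default_rng(7)
theta_true, mu_true, sigma_true = 0.10, 2.0, 0.5   # illustrative values, per day
dt, n = 1.0, 5000

# Exact OU discretisation: Z_t = mu + (Z_{t-1} - mu) * exp(-theta*dt) + noise.
decay = np.exp(-theta_true * dt)
noise_sd = sigma_true * np.sqrt((1 - decay**2) / (2 * theta_true))
Z = np.empty(n)
Z[0] = mu_true
for t in range(1, n):
    Z[t] = mu_true + (Z[t - 1] - mu_true) * decay + noise_sd * rng.normal()

# Regress dZ_t on Z_{t-1}: dZ_t = a + b * Z_{t-1} + e_t, with b = exp(-theta*dt) - 1.
b, a = np.polyfit(Z[:-1], np.diff(Z), 1)
theta_hat = -np.log(1 + b) / dt
mu_hat = -a / b
half_life = np.log(2) / theta_hat

print(f"theta_hat = {theta_hat:.3f} (true {theta_true})")
print(f"mu_hat    = {mu_hat:.3f} (true {mu_true})")
print(f"half-life = {half_life:.1f} days (true {np.log(2) / theta_true:.1f})")
```

On short samples the estimate of $θ$ is noisy and biased toward faster reversion, so an estimated half-life is better treated as a rough screen than as a precise number.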

Trading the spread

Standardize the spread by its rolling mean and standard deviation:

$$z_{t} = \frac{Z_{t} - μ_{t}}{σ_{t}}$$

where $μ_{t}, σ_{t}$ are computed over a trailing window (e.g. 60 days). The classical rule:

If $z_{t} > 2$ : spread is too high → SHORT the spread (short $A$ , long $β$ units of $B$ ).
If $z_{t} < - 2$ : spread is too low → LONG the spread (long $A$ , short $β$ units of $B$ ).
EXIT when $∣ z_{t} ∣ < 0.5$ : spread has reverted; close the position.
STOP LOSS when $∣ z_{t} ∣ > 4$ : relationship may be breaking; exit at a loss rather than hold.

The thresholds (2 in, 0.5 out, 4 stop) are tunable parameters. Optimal values depend on the spread's mean-reversion speed and transaction costs; this is empirical work, and overfitting is the standard hazard.
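The four rules amount to a small state machine. The sketch below is one possible reading, with assumptions flagged: the function name `zscore_positions` is hypothetical, entries are blocked while $z$ is already beyond the stop level, and re-entry is allowed as soon as $z$ is back inside the entry band (a real system might add a cool-down after a stop).

```python
import numpy as np

def zscore_positions(z, entry=2.0, exit_=0.5, stop=4.0):
    """Return -1 (short spread), +1 (long spread) or 0 (flat) for each z-score."""
    pos = np.zeros(len(z), dtype=int)
    current = 0
    for t, zt in enumerate(z):
        if current == 0:
            if entry < zt < stop:          # spread too high: short it
                current = -1
            elif -stop < zt < -entry:      # spread too low: go long
                current = 1
        elif abs(zt) < exit_ or abs(zt) > stop:
            current = 0                    # reverted, or stopped out
        pos[t] = current
    return pos

# Hand-made z-score path: one normal round trip, then one stop-out.
z = np.array([0.0, 1.5, 2.3, 1.2, 0.4, -0.3, -2.1, -3.0, -4.2, -4.5, -1.0, 0.2])
print(zscore_positions(z))
```

On this path the short opened at 2.3 is closed on reversion at 0.4, and the long opened at -2.1 is stopped out when $z$ reaches -4.2.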

Code

# Pairs trading on a synthetic cointegrated pair:
#   1. Generate two series sharing a common stochastic trend.
#   2. Engle-Granger test: regress one on the other, ADF the residuals.
#   3. Build the spread, compute rolling z-score, and run a simple
#      mean-reversion strategy that enters at |z|>2 and exits at |z|<0.5.

import numpy as np

rng = np.random.default_rng(123)
n = 1000

# Two cointegrated series: shared random-walk trend plus idiosyncratic noise.
common  = np.cumsum(rng.normal(0, 1, n))
beta_true = 1.5
A = common + rng.normal(0, 0.5, n) + 10
B = beta_true * common + rng.normal(0, 0.5, n) + 5

# ─── Engle-Granger cointegration test ──────────────────────────────────
# Step 1: regress A on B (or vice versa).
b_est, a_est = np.polyfit(B, A, 1)
spread = A - (b_est * B + a_est)
print(f"Engle-Granger regression A_t = a + b * B_t + e_t:")
print(f"  b_hat = {b_est:.4f}  (true hedge ratio = 1 / beta_true = {1/beta_true:.4f})")

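# Step 2: ADF unit-root test on the residuals.
# Manual Dickey-Fuller regression (constant, no lagged differences): regress
# dx_t on x_{t-1} and return the t-statistic of the slope.
def adf_t_statistic(x):
    dx = np.diff(x)
    x_lag = x[:-1]
    X = x_lag - np.mean(x_lag)           # demeaning both sides = fitting an intercept
    Y = dx - np.mean(dx)
    rho = np.sum(X * Y) / np.sum(X * X)  # OLS slope on the lagged level
    e   = Y - rho * X
    se  = np.sqrt(np.sum(e**2) / (len(e) - 1) / np.sum(X * X))
    return rho / se

t_stat = adf_t_statistic(spread)
print(f"  ADF t-statistic on spread: {t_stat:.4f}")
# NOTE: -2.58 / -1.95 are the plain Dickey-Fuller values (no constant). For
# residuals of an estimated regression the Engle-Granger (MacKinnon) values are
# stricter, roughly -3.90 (1%) and -3.34 (5%).
print("  Critical values: 1% = -2.58, 5% = -1.95")
print("  Spread is " +
      ('STATIONARY (cointegrated)' if t_stat < -1.95 else 'non-stationary'))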

# ─── Mean-reversion trading strategy ──────────────────────────────────
window = 60
def rolling(x, w, fn):
    return np.array([fn(x[max(0, i-w):i+1]) for i in range(len(x))])
mu_s = rolling(spread, window, np.mean)
sd_s = rolling(spread, window, np.std)
z    = (spread - mu_s) / np.maximum(sd_s, 1e-8)

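# position: +1 = long the spread, -1 = short the spread, 0 = flat.
# `trades` counts entries; a position still open at the end of the sample is
# never closed, so its P&L is not booked. The hedge ratio is fitted on the
# full sample (in-sample backtest); no stop-loss or costs are modelled.
position = 0
trades = 0
pnl = 0.0
entry = 0.0
for t in range(window, n - 1):
    if position == 0:
        if z[t] > 2:        # spread rich: short it
            position, entry = -1, spread[t]
            trades += 1
        elif z[t] < -2:     # spread cheap: buy it
            position, entry = 1, spread[t]
            trades += 1
    elif abs(z[t]) < 0.5:   # reverted: close and book P&L
        pnl += position * (spread[t] - entry)
        position = 0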

print(f"\nMean-reversion strategy on the spread (|z|>2 entry, |z|<0.5 exit):")
print(f"  Number of round-trip trades: {trades}")
print(f"  Total PnL on the spread: {pnl:.2f}")

Output:

Engle-Granger regression A_t = a + b * B_t + e_t:
  b_hat = 0.6642  (true hedge ratio = 1 / beta_true = 0.6667)
  ADF t-statistic on spread: -30.8160
  Critical values: 1% = -2.58, 5% = -1.95
  Spread is STATIONARY (cointegrated)

Mean-reversion strategy on the spread (|z|>2 entry, |z|<0.5 exit):
  Number of round-trip trades: 41
  Total PnL on the spread: 59.00

Three things to read off. (1) The Engle-Granger regression recovers the true hedge ratio: $\hat{b} = 0.66$ is very close to the analytical value $1/ β_{true} = 0.67$ (the regression coefficient is the ratio that makes the residuals stationary). (2) The ADF $t$ -statistic is $- 30.8$ , far beyond the 1% critical value of $- 2.58$ and also beyond the stricter residual-based Engle-Granger value of about $- 3.90$ . Strong rejection of the unit-root null; cointegration is conclusively present (as designed). (3) The simple $∣ z ∣ > 2$ entry / $∣ z ∣ < 0.5$ exit rule opens 41 trades over the 1000-observation sample for a total P&L of 59.00 in spread units, before transaction costs: modest but positive. In real applications the spread would be normalized to dollar P&L per unit position, and you'd benchmark the Sharpe ratio rather than absolute P&L.

The Kalman filter extension

A static hedge ratio $β$ works when the relationship between $A$ and $B$ is stable. In practice the relationship can drift — corporate actions, structural breaks, evolving fundamentals. The KALMAN FILTER tracks a TIME-VARYING hedge ratio:

$$A_{t} = α_{t} + β_{t} B_{t} + ε_{t}, \qquad (α_{t}, β_{t}) \text{ evolves as a random walk} .$$

The state-space form is:

$$x_{t} = x_{t - 1} + η_{t}, \qquad A_{t} = (1, B_{t})\, x_{t} + ε_{t},$$

with $x_{t} = (α_{t}, β_{t})^{T}$ . The Kalman filter gives the optimal recursive estimate of $x_{t}$ from the observations, and the FILTERED residuals are the trading spread. This is widely used in practice — many "pairs trading" production systems are really Kalman filters on cointegrating relationships.
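A minimal NumPy sketch of this filter follows. It assumes a known observation-noise variance `r`, a state-noise covariance `q * I`, and a unit prior covariance; these are tuning choices made up for the illustration, as are the function name `kalman_hedge_ratio` and the synthetic data with a slowly drifting $β_{t}$ . The spread returned is the filtered residual; the one-step-ahead forecast error `err` is a common alternative because it does not use the current observation.

```python
import numpy as np

def kalman_hedge_ratio(A, B, q=1e-6, r=0.25):
    """Filter x_t = (alpha_t, beta_t) in A_t = alpha_t + beta_t * B_t + eps_t."""
    n = len(A)
    x = np.zeros(2)              # state estimate (alpha, beta)
    P = np.eye(2)                # prior state covariance (assumption)
    Q = q * np.eye(2)            # random-walk state noise (assumption)
    states = np.zeros((n, 2))
    spread = np.zeros(n)
    for t in range(n):
        P = P + Q                            # predict: random walk keeps the mean
        H = np.array([1.0, B[t]])            # observation row (1, B_t)
        err = A[t] - H @ x                   # one-step-ahead forecast error
        S = H @ P @ H + r                    # forecast-error variance
        K = P @ H / S                        # Kalman gain
        x = x + K * err                      # update state
        P = P - np.outer(K, H) @ P           # update covariance: (I - K H) P
        states[t] = x
        spread[t] = A[t] - H @ x             # filtered residual
    return states, spread

# Synthetic pair whose hedge ratio drifts from 1.0 to 1.5 (illustrative).
rng = np.random.default_rng(3)
n = 1500
B = 100.0 + np.cumsum(rng.normal(0.0, 0.5, n))
beta_path = np.linspace(1.0, 1.5, n)
A = 0.5 + beta_path * B + rng.normal(0.0, 0.5, n)

states, spread = kalman_hedge_ratio(A, B)
print(f"true beta at end: {beta_path[-1]:.3f}, filtered beta: {states[-1, 1]:.3f}")
```

The ratio `q / r` controls how quickly the hedge ratio adapts: larger values track drift faster but also chase noise, which shrinks the very spread deviations the strategy tries to trade.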

What goes wrong in practice

Cointegration breaks. The relationship that held in-sample stops working out-of-sample. Causes: industry consolidation, corporate restructuring, regulatory change. Continually monitor the spread's stationarity (rolling ADF) and exit if cointegration fails.
Crowded trade. If everyone runs the same pair, the spread mean-reverts faster (good) but profits per trade shrink (bad). The textbook pairs (Coke vs Pepsi, Royal Dutch vs Shell pre-merger) made money for years before being arbitraged away.
Transaction costs. Two-leg trades cross two bid-ask spreads on entry and two more on exit, and high-frequency rebalancing eats P&L. The strategy works much better when both legs are liquid, because liquid names have tight spreads that keep the round-trip cost manageable.
Overfitting in pair selection. Search all $N(N - 1)/2 ≈ N^{2} /2$ possible pairs in a universe of $N$ stocks, and you'll find many "cointegrated" relationships by chance (the multiple-testing problem). Hold out a true validation window before deploying.
Asymmetric tail risk. Pairs trading produces small profits most of the time and occasional large losses when the relationship breaks (LTCM-style). Sharpe ratios look great until the breakdown event hits.
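The scale of the multiple-testing problem in pair selection is easy to quantify. The sketch below assumes a hypothetical universe of 500 stocks and a 5% test level, and counts how many pairs would pass by chance if no pair were truly cointegrated. The Bonferroni threshold shown is one conservative correction, not the only option.

```python
from math import comb

def pair_screen_stats(n_stocks, alpha=0.05):
    """Pairs tested, expected chance rejections, and Bonferroni per-test level."""
    n_pairs = comb(n_stocks, 2)              # N * (N - 1) / 2
    expected_false = alpha * n_pairs         # if no pair is truly cointegrated
    bonferroni_alpha = alpha / n_pairs
    return n_pairs, expected_false, bonferroni_alpha

n_pairs, expected_false, bonferroni_alpha = pair_screen_stats(500)
print(f"pairs tested:             {n_pairs}")
print(f"expected false positives: {expected_false:.1f}")
print(f"Bonferroni per-test level: {bonferroni_alpha:.2e}")
```

With thousands of spurious candidates expected, an in-sample cointegration test alone says little; the held-out validation window does the real filtering.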

Related topics

Mean-variance portfolio optimization: the static portfolio counterpart to dynamic stat-arb.
Value at Risk: the risk number that pairs strategies must respect.
Statistics & inference: the underlying theory of stationarity tests and Kalman filtering.