GARCH and Realized Volatility


The Black-Scholes model assumes constant volatility. Real volatility VARIES — and it does so in a structured way: periods of high volatility cluster together, periods of low volatility cluster together, and the underlying process appears to have a persistent state that drives the local variance. This phenomenon, VOLATILITY CLUSTERING, is one of the strongest stylised facts in financial time series. The GARCH family (Generalized Autoregressive Conditional Heteroskedasticity, Bollerslev 1986, extending Engle's 1982 ARCH) is the standard parametric framework for modelling it. Engle won the 2003 Nobel Prize for ARCH; GARCH and its extensions remain the workhorse of econometric volatility modelling forty years on.

Complementary to GARCH is REALIZED VOLATILITY — the model-free estimator of integrated variance from high-frequency intraday returns. The two together cover the full toolkit: GARCH for FORECASTING volatility from past returns, realized vol for MEASURING volatility from intraday data. Modern volatility-trading desks use both: GARCH-family models for next-day forecasts, realized vol for evaluating the accuracy of those forecasts and for trading volatility products (variance swaps, VIX futures).

The GARCH(1,1) model

Returns $r_{t}$ with conditional variance $σ_{t}^{2}$ :

r_{t} = σ_{t} ε_{t}, ε_{t} \sim N (0, 1), σ_{t}^{2} = ω + α r_{t - 1}^{2} + β σ_{t - 1}^{2} .

Three parameters with clean interpretations:

$ω > 0$ : the baseline term that sets the LONG-RUN variance level once it is divided by $1 - α - β$ .
$α \geq 0$ : how strongly TODAY's squared shock feeds into tomorrow's variance forecast. Captures "vol responds to large moves."
$β \geq 0$ : how much of yesterday's variance forecast PERSISTS in today's. Captures "vol is sticky."

Covariance stationarity requires $α + β < 1$ , and the unconditional variance is then $ω / (1 - α - β)$ . In practice for equity indices, $α + β$ is typically between 0.97 and 0.99, which is extremely persistent. A high-volatility day is followed by a high-volatility forecast that decays only slowly. The HALF-LIFE of volatility shocks is $\ln 2 / | \ln (α + β) |$ ; for $α + β = 0.98$ this is about 34 days, so a shock to variance takes roughly seven trading weeks to lose half its effect.
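The persistence arithmetic above can be checked in a few lines. This is a minimal sketch: the function names `garch11_summary` and `forecast_variance` are hypothetical helpers introduced here, and the multi-step forecast uses the standard GARCH(1,1) result that the gap to the long-run variance shrinks by a factor $α + β$ per step.

```python
import math

def garch11_summary(omega, alpha, beta):
    """Unconditional variance and shock half-life of a stationary GARCH(1,1)."""
    persistence = alpha + beta
    if not 0 < persistence < 1:
        raise ValueError("need 0 < alpha + beta < 1 for a finite unconditional variance")
    uncond_var = omega / (1.0 - persistence)
    half_life = math.log(2.0) / abs(math.log(persistence))
    return uncond_var, half_life

def forecast_variance(sigma2_next, omega, alpha, beta, horizon):
    """h-step-ahead variance forecast, given the one-step forecast sigma2_next.

    Uses E_t[sigma^2_{t+h}] = vbar + (alpha + beta)^(h-1) * (sigma^2_{t+1} - vbar),
    where vbar is the unconditional variance.
    """
    uncond_var, _ = garch11_summary(omega, alpha, beta)
    return uncond_var + (alpha + beta) ** (horizon - 1) * (sigma2_next - uncond_var)

# Illustrative parameters with alpha + beta = 0.98.
omega, alpha, beta = 1e-6, 0.08, 0.90
vbar, hl = garch11_summary(omega, alpha, beta)
print(f"unconditional variance = {vbar:.2e}, half-life = {hl:.1f} days")

# Start from a one-step forecast at 4x the long-run variance and watch it decay.
for h in (1, 10, 35, 100):
    ratio = forecast_variance(4 * vbar, omega, alpha, beta, h) / vbar
    print(f"h = {h:3d}   forecast / long-run variance = {ratio:.3f}")
```

The forecast never overshoots: it moves monotonically from the current level toward the unconditional variance, and the closer $α + β$ is to 1, the slower that approach.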

Maximum-likelihood estimation

With Gaussian innovations $ε_{t}$ , the log-likelihood of a sample $\{r_{t}\}_{t = 1}^{T}$ conditional on $σ_{1}^{2}$ is, up to an additive constant,

ℓ (ω, α, β) = - \frac{1}{2} \sum_{t = 1}^{T} \left[ \ln σ_{t}^{2} (ω, α, β) + \frac{r_{t}^{2}}{σ_{t}^{2} (ω, α, β)} \right],

where $σ_{t}^{2}$ is recursively defined by the GARCH equation starting from some initial value (typically the sample variance). MAXIMIZE over $(ω, α, β)$ subject to positivity and stationarity constraints. The objective is non-convex (multiple local optima possible) but in practice well-behaved for typical data — Nelder-Mead or L-BFGS-B works.

Standard QUASI-MAXIMUM LIKELIHOOD: even when the innovations $ε_{t}$ are NOT Gaussian (e.g., t-distributed), the estimator that maximizes the Gaussian likelihood is still consistent, provided the conditional mean and variance are correctly specified (Bollerslev-Wooldridge 1992). For asymptotic inference (standard errors, tests), the covariance matrix then needs the robust sandwich form.
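For reference, the sandwich form can be written out as follows. The symbols $θ = (ω, α, β)$ , $\hat{θ}$ , $θ_{0}$ , $A$ , $B$ and the per-observation contribution $ℓ_{t}$ are notation introduced here for the sketch, not taken from the text above.

```latex
\sqrt{T}\,(\hat{\theta} - \theta_0) \xrightarrow{d} N\!\left(0,\; A^{-1} B A^{-1}\right),
\qquad
A = -\,\mathbb{E}\!\left[\frac{\partial^2 \ell_t}{\partial\theta\,\partial\theta'}\right],
\quad
B = \mathbb{E}\!\left[\frac{\partial \ell_t}{\partial\theta}\,\frac{\partial \ell_t}{\partial\theta'}\right],
\qquad
\ell_t(\theta) = -\tfrac{1}{2}\left[\ln \sigma_t^2(\theta) + \frac{r_t^2}{\sigma_t^2(\theta)}\right].
```

When the innovations really are Gaussian, $A = B$ and the covariance collapses to the usual $A^{-1}$ . With fat-tailed innovations the two differ, and using $A^{-1}$ alone typically understates the standard errors.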

Code: GARCH simulation and estimation

# Simulate a GARCH(1,1) process; fit it by MLE; demonstrate vol clustering;
# compare to realized variance from high-frequency returns.

import numpy as np
from scipy.optimize import minimize

# True parameters: high persistence (typical for equity indices).
omega_t, alpha_t, beta_t = 1e-6, 0.08, 0.90
N = 2000

def simulate_garch(N, omega, alpha, beta, seed=7):
    rng = np.random.default_rng(seed)
    r  = np.zeros(N)
    s2 = np.zeros(N)
    s2[0] = omega / (1 - alpha - beta)         # stationary variance
    for t in range(1, N):
        s2[t] = omega + alpha * r[t-1]**2 + beta * s2[t-1]
        r[t]  = np.sqrt(s2[t]) * rng.standard_normal()
    return r, s2

r, s2_true = simulate_garch(N, omega_t, alpha_t, beta_t)

# Vol clustering: ACF of |r| and r² should be positive at multiple lags.
print(f"Simulated GARCH(1,1): omega={omega_t}, alpha={alpha_t}, beta={beta_t}")
print(f"  ACF of |r| at lag 5:  {np.corrcoef(np.abs(r[:-5]), np.abs(r[5:]))[0,1]:.4f}")
print(f"  ACF of  r² at lag 5:  {np.corrcoef(r[:-5]**2,  r[5:]**2 )[0,1]:.4f}")
print(f"  (Positive autocorrelation in absolute returns — the GARCH signature.)")

# MLE estimation
def neg_log_likelihood(params, r):
    omega, alpha, beta = params
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 0.9999:
        return 1e10
    s2 = np.zeros(len(r))
    s2[0] = np.var(r)
    for t in range(1, len(r)):
        s2[t] = omega + alpha * r[t-1]**2 + beta * s2[t-1]
    return 0.5 * np.sum(np.log(s2) + r**2 / s2)

res = minimize(neg_log_likelihood, x0=[1e-5, 0.1, 0.85], args=(r,),
               method='Nelder-Mead', options={'xatol': 1e-7, 'maxiter': 5000})
omega_h, alpha_h, beta_h = res.x
print(f"\nMLE estimates from {N} simulated observations:")
print(f"  omega = {omega_h:.3e}  (true {omega_t:.3e})")
print(f"  alpha = {alpha_h:.4f}  (true {alpha_t})")
print(f"  beta  = {beta_h:.4f}  (true {beta_t})")
print(f"  alpha+beta = {alpha_h + beta_h:.4f}  (true {alpha_t + beta_t})")

# Realized variance: sum of squared high-frequency returns over a window.
# Here the last 78 simulated returns stand in for one day's 5-minute returns.
# Compare to the summed true variance; RV is an unbiased estimator of it.
print(f"\nRealized variance (sum r² over the last 78 observations):")
day_returns = r[-78:]
realized_var = np.sum(day_returns**2)
true_int_var = np.sum(s2_true[-78:])
print(f"  Realized variance:        {realized_var:.5e}")
print(f"  Integrated true variance: {true_int_var:.5e}")
print(f"  Ratio (should be ~1):     {realized_var/true_int_var:.4f}")

Output:

Simulated GARCH(1,1): omega=1e-06, alpha=0.08, beta=0.9
  ACF of |r| at lag 5:  0.2575
  ACF of  r² at lag 5:  0.2769
  (Positive autocorrelation in absolute returns — the GARCH signature.)

MLE estimates from 2000 simulated observations:
  omega = 6.048e-07  (true 1.000e-06)
  alpha = 0.0633  (true 0.08)
  beta  = 0.9232  (true 0.9)
  alpha+beta = 0.9864  (true 0.98)

Realized variance (sum r² over the last 78 observations):
  Realized variance:        2.24663e-03
  Integrated true variance: 2.38680e-03
  Ratio (should be ~1):     0.9413

Three things to read off. (1) Vol clustering is visible: autocorrelation of $∣ r ∣$ at lag 5 is 0.26, well above zero. Plain returns $r$ have near-zero autocorrelation (efficient markets); their ABSOLUTE values do not. This is the empirical signature that ARCH-family models capture. (2) MLE recovers parameters reasonably: persistence $α + β$ is recovered to within 1% (0.986 vs true 0.98). Individual $α, β$ are harder to identify separately, because their estimates are strongly negatively correlated (here $α$ comes out low and $β$ high), but their SUM is well-pinned. (3) Realized variance ≈ integrated true variance: the ratio of 0.94 is within the sampling error of a single 78-observation window. RV is unbiased for IV in this model.

The GARCH zoo

GARCH(1,1) is the workhorse; extensions address known shortcomings.

EGARCH (Exponential GARCH, Nelson 1991): models $ln σ_{t}^{2}$ instead of $σ_{t}^{2}$ — guarantees positivity without constraints, and allows ASYMMETRIC response to good vs bad news (the "leverage effect": negative returns increase vol more than positive returns of the same size do).
GJR-GARCH (Glosten-Jagannathan-Runkle 1993): adds an extra coefficient on the squared return when the return is negative (an indicator, or dummy, term). Captures the leverage effect with less overhead than EGARCH.
TGARCH (Threshold GARCH): models $σ_{t}$ directly rather than $σ_{t}^{2}$ .
IGARCH (Integrated GARCH): special case where $α + β = 1$ , so vol shocks persist FOREVER in the forecast and the unconditional variance is not finite. Estimated persistence for equity indices is often close to this boundary.
FIGARCH (Fractionally Integrated): long-memory generalization — power-law decay of vol persistence rather than exponential.
Multivariate GARCH (BEKK, DCC): joint modelling of multiple assets, with time-varying conditional COVARIANCE matrices.
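To make the leverage effect in the list above concrete, here is a minimal sketch of one step of a GJR-GARCH(1,1) variance recursion. The function name `gjr_next_variance`, the symbol `gamma` for the asymmetry coefficient, and all parameter values are assumptions chosen for illustration, not estimates from any data set.

```python
import numpy as np

def gjr_next_variance(r_prev, sigma2_prev, omega, alpha, gamma, beta):
    """One step of a GJR-GARCH(1,1) variance recursion.

    sigma2_t = omega + (alpha + gamma * 1[r_{t-1} < 0]) * r_{t-1}^2 + beta * sigma2_{t-1}
    """
    r_prev = np.asarray(r_prev, dtype=float)
    negative = (r_prev < 0).astype(float)      # indicator for a negative return
    return omega + (alpha + gamma * negative) * r_prev**2 + beta * sigma2_prev

# Illustrative values only.
omega, alpha, gamma, beta = 1e-6, 0.03, 0.10, 0.90
sigma2_prev = 1e-4                             # yesterday's variance (1% daily vol)

shocks = np.array([-0.03, -0.01, 0.0, 0.01, 0.03])
next_var = gjr_next_variance(shocks, sigma2_prev, omega, alpha, gamma, beta)
for r, v in zip(shocks, next_var):
    print(f"r = {r:+.2f}  ->  next-day vol = {np.sqrt(v):.4%}")
```

Plotting next-day variance against the shock gives the "news impact curve": a parabola for plain GARCH, and a parabola with a steeper left arm once `gamma > 0`.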

Realized volatility

A complementary approach: rather than FORECASTING variance from a parametric model, MEASURE variance directly from high-frequency intraday returns. Realized variance is

RV_{t} = \sum_{i = 1}^{M} r_{t, i}^{2},

where $r_{t, i}$ are $M$ high-frequency log-returns within day $t$ (typically 5-minute returns over a 6.5-hour trading day, so $M = 78$ ). As $M \to \infty$ (sampling frequency increasing), $RV_{t}$ converges in probability to the INTEGRATED VARIANCE

IV_{t} = \int_{t - 1}^{t} σ_{s}^{2} d s .

This is the realized-volatility theorem (Andersen-Bollerslev-Diebold-Labys 2001 onward). It works because the QUADRATIC VARIATION of an Itô process is precisely $IV_{t}$ , and the discrete RV is the natural estimator.
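A small simulation illustrates the convergence. This is a sketch under stated assumptions: a driftless diffusion with a deterministic U-shaped intraday volatility pattern (the cosine shape and the 1% daily vol level are made up for illustration), simulated on a one-second grid and then aggregated to coarser returns. `realized_variance` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)

n_fine = 23_400                                  # one-second grid over a 6.5-hour day
s = (np.arange(n_fine) + 0.5) / n_fine           # interval midpoints, as fractions of the day
sigma0 = 0.01                                    # assumed daily vol level
sigma = sigma0 * (1.0 + 0.5 * np.cos(2 * np.pi * s))   # assumed U-shape: high at open and close

# Integrated variance over the day (Riemann sum of sigma^2 ds on the same grid).
iv = np.sum(sigma**2) / n_fine

# One-second log-returns of a driftless diffusion with that volatility path.
r_fine = sigma * np.sqrt(1.0 / n_fine) * rng.standard_normal(n_fine)

def realized_variance(returns, m):
    """RV from m equal-length intraday returns, built by summing the fine returns."""
    block = len(returns) // m
    coarse = returns[: block * m].reshape(m, block).sum(axis=1)
    return np.sum(coarse**2)

for m in (13, 78, 390, 23_400):
    rv = realized_variance(r_fine, m)
    print(f"M = {m:6d}   RV / IV = {rv / iv:.3f}")
```

The ratio wanders around 1 for small $M$ and tightens as $M$ grows; for constant volatility the relative standard error of RV is roughly $\sqrt{2 / M}$ , which is about 16% at $M = 78$ .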

The microstructure noise problem

Naive RV with infinitely-frequent sampling would converge perfectly, but real prices have MICROSTRUCTURE NOISE (bid-ask bounce, discrete tick size, jitter from order arrivals) that adds spurious squared returns. As you sample more finely, the noise contribution grows linearly in $M$ , while the signal stays bounded. The optimal sampling frequency is a trade-off: too coarse (say, hourly) and you lose signal; too fine (1-second) and noise dominates.
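The linear growth of the noise term is easy to see in a simulation. The sketch below assumes the simplest noise model: the observed log-price is the efficient log-price plus i.i.d. Gaussian noise, independent of the price. Under that assumption the expected RV is $IV + 2 M ω_{noise}^{2}$ , where $ω_{noise}$ is the noise standard deviation (a symbol introduced here, unrelated to the GARCH $ω$ ). The noise size of 1 basis point and the helper name `rv_at_step` are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 23_400                      # one-second grid over a 6.5-hour day
sigma_day = 0.01                # assumed constant daily vol, so IV = sigma_day**2
noise_sd = 1e-4                 # assumed noise std dev: 1 basis point of log-price

# Efficient log-price (random walk) and the observed price with additive i.i.d. noise.
increments = sigma_day / np.sqrt(n) * rng.standard_normal(n)
efficient = np.concatenate([[0.0], np.cumsum(increments)])
observed = efficient + noise_sd * rng.standard_normal(n + 1)

def rv_at_step(log_price, step):
    """Realized variance from returns sampled every `step` grid points."""
    sampled = log_price[::step]
    return np.sum(np.diff(sampled) ** 2)

iv = sigma_day**2
grid = [(1, "1 sec"), (5, "5 sec"), (60, "1 min"), (300, "5 min"), (1800, "30 min")]
for step, label in grid:
    m = n // step
    ratio = rv_at_step(observed, step) / iv
    approx = 1 + 2 * m * noise_sd**2 / iv        # E[RV] / IV under i.i.d. noise
    print(f"{label:>6}: M = {m:5d}   RV/IV = {ratio:6.3f}   (expected about {approx:.3f})")
```

Plotting RV against sampling frequency in this way is usually called a volatility signature plot: the curve is flat at coarse frequencies and blows up as the sampling interval shrinks.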

Standard cures:

5-minute sampling. The empirical default — coarse enough to be largely noise-free, fine enough to capture most of the volatility signal. Used in nearly every realized-vol paper.
Subsampling and averaging (Zhang-Mykland-Aït-Sahalia 2005): compute RV on multiple overlapping subsamples (one-minute returns starting at second 0, 5, 10, ...) and average. Reduces noise contribution while keeping the sampling fine.
Realized kernel (Barndorff-Nielsen et al. 2008): weight cross-products of returns by a kernel function that cancels noise contributions. The state-of-the-art estimator.
Pre-averaging (Jacod et al. 2009): average a short window of returns before squaring. Functions similarly to a kernel.
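The subsampling idea from the list above can be sketched as follows. This is only the averaging step: it computes sparse-grid RV on several shifted grids and takes the mean. It omits the bias correction that the full two-scales estimator of Zhang-Mykland-Aït-Sahalia applies, and the function name `subsampled_rv`, its parameters, and the simulated noisy price are all assumptions for illustration.

```python
import numpy as np

def subsampled_rv(log_price, step, offset_stride):
    """Average of sparse-grid RVs over shifted sampling grids.

    log_price     : log prices on a fine, evenly spaced grid
    step          : sparse sampling interval, in fine-grid points
    offset_stride : spacing between the starting points of successive subgrids
    """
    estimates = []
    for offset in range(0, step, offset_stride):
        sampled = log_price[offset::step]
        # Shifted grids cover slightly less than the full day (up to one interval less).
        estimates.append(np.sum(np.diff(sampled) ** 2))
    return float(np.mean(estimates))

# Simulated noisy one-second prices: random walk plus i.i.d. noise (illustrative values).
rng = np.random.default_rng(3)
n, sigma_day, noise_sd = 23_400, 0.01, 1e-4
efficient = np.concatenate([[0.0], np.cumsum(sigma_day / np.sqrt(n) * rng.standard_normal(n))])
observed = efficient + noise_sd * rng.standard_normal(n + 1)

iv = sigma_day**2
plain_5min = np.sum(np.diff(observed[::300]) ** 2)
averaged = subsampled_rv(observed, step=300, offset_stride=30)   # 10 shifted 5-minute grids
print(f"plain 5-min RV / IV      = {plain_5min / iv:.3f}")
print(f"subsampled 5-min RV / IV = {averaged / iv:.3f}")
```

Each subgrid is as sparse as ordinary 5-minute sampling, so each carries little noise, but averaging over the shifted grids uses far more of the data and so tends to reduce the sampling variability of the estimate.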

Using GARCH and realized vol together

Modern volatility modelling combines them. Three standard frameworks:

HEAVY model (Shephard-Sheppard 2010): GARCH-like recursion where the input is REALIZED variance rather than squared returns. Much higher-quality conditioning information → better forecasts. $σ_{t}^{2} = ω + α RV_{t - 1} + β σ_{t - 1}^{2}$ .
Realized GARCH (Hansen-Huang-Shek 2012): joint model of returns and RV. Estimate the parameters of both from data simultaneously, with a measurement equation linking them.
HAR-RV (Heterogeneous Autoregressive, Corsi 2009): regress today's RV on daily, weekly, and monthly RV lags. Simpler than GARCH, often beats it on forecast accuracy, conceptually appealing (different market participants operate on different timescales).
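Of the frameworks above, HAR-RV is the simplest to write down, since it is an ordinary least-squares regression. The sketch below assumes the common convention of 5-day and 22-day averages for the weekly and monthly terms; the helper names `har_design` and `fit_har` are hypothetical, and the RV series is synthetic (a persistent log-AR(1) process made up for illustration), not real data.

```python
import numpy as np

def har_design(rv):
    """HAR regressors: constant, yesterday's RV, 5-day mean, 22-day mean. Target: today's RV."""
    rv = np.asarray(rv, dtype=float)
    rows, targets = [], []
    for t in range(22, len(rv)):
        rows.append([1.0, rv[t - 1], rv[t - 5:t].mean(), rv[t - 22:t].mean()])
        targets.append(rv[t])
    return np.array(rows), np.array(targets)

def fit_har(rv):
    """OLS fit of the HAR-RV regression; returns (const, b_day, b_week, b_month)."""
    X, y = har_design(rv)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, persistent RV series purely for illustration.
rng = np.random.default_rng(2)
log_rv = np.zeros(1500)
for t in range(1, len(log_rv)):
    log_rv[t] = 0.97 * log_rv[t - 1] + 0.25 * rng.standard_normal()
rv = 1e-4 * np.exp(log_rv)

coef = fit_har(rv)
x_next = np.array([1.0, rv[-1], rv[-5:].mean(), rv[-22:].mean()])
print("const, b_day, b_week, b_month =", coef)
print("next-day RV forecast:", x_next @ coef)
```

In applied work the regression is often run on $\ln RV$ or $\sqrt{RV}$ instead of RV itself to tame the heavy right tail; that is a modelling choice this sketch leaves out.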

Where this matters

Volatility forecasting for derivatives pricing. The Black-Scholes-input $σ$ is typically a GARCH-style forecast or an implied vol; comparing them tells you whether options are rich or cheap relative to your model.
Volatility trading. Variance swaps pay realized variance minus a strike. A HAR or GARCH forecast of realized variance is therefore a direct view on whether the quoted strike is too high or too low.
Risk management. VaR with time-varying volatility uses GARCH (or implied-vol scaling). Without it, VaR systematically misses regime changes.
Asset allocation. Volatility timing strategies use GARCH-based forecasts to adjust portfolio leverage day-to-day.
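As one concrete case from the list above, a GARCH(1,1) one-step forecast plugs straight into a parametric VaR. This is a minimal sketch assuming zero mean return and Gaussian innovations (the same assumptions as the model section); the function name `garch_var_one_day` and the parameter values are illustrative, and a production risk system would typically use a fatter-tailed quantile.

```python
from scipy.stats import norm

def garch_var_one_day(r_today, sigma2_today, omega, alpha, beta, level=0.99):
    """One-day parametric VaR, as a positive loss in return units.

    Uses the GARCH(1,1) one-step variance forecast and a Gaussian quantile.
    """
    sigma2_next = omega + alpha * r_today**2 + beta * sigma2_today
    return norm.ppf(level) * sigma2_next**0.5

# Illustrative parameters; today's variance is set at its long-run level.
omega, alpha, beta = 1e-6, 0.08, 0.90
sigma2_today = omega / (1 - alpha - beta)

calm = garch_var_one_day(0.001, sigma2_today, omega, alpha, beta)
shocked = garch_var_one_day(-0.04, sigma2_today, omega, alpha, beta)
print(f"99% one-day VaR after a quiet day:  {calm:.2%}")
print(f"99% one-day VaR after a -4% move:   {shocked:.2%}")
```

A constant-volatility VaR would report the same number on both days, which is the regime-blindness the risk-management item describes.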

Volatility surfaces and calibration — implied volatility from option prices; GARCH and realized vol are the time-series complement.
Value at Risk — incorporates GARCH-based time-varying volatility for regime-aware risk estimates.
Heston model — continuous-time stochastic volatility, the natural derivative-pricing extension of GARCH.