Kelly Criterion and Position Sizing

The Kelly criterion is the answer to a question every trader, gambler, and investor must answer: GIVEN a profitable opportunity, how much of your bankroll should you stake on it? Bet too little and you sacrifice long-term growth. Bet too much and you risk ruin — even with a positive expectancy. Kelly (J. L. Kelly Jr., 1956) showed that there is a UNIQUE optimal fraction that maximizes the LONG-RUN GEOMETRIC GROWTH RATE of wealth, and that betting MORE than twice this fraction is RUINOUS regardless of how favorable the odds.

The result is one of the cleanest in quantitative finance — closed-form, intuitive, and broadly applicable. It is used (in fractional form) by serious gamblers, hedge funds, market makers, and anyone optimizing long-run wealth under risk. It's also widely misunderstood: the "naive" application leads to recommended bet sizes far larger than most institutions can stomach, and the use of FRACTIONAL Kelly (typically half or quarter Kelly) reflects practical reality about parameter uncertainty rather than a rejection of the theory.

The binary case

The cleanest statement. Repeatedly bet a fraction $f$ of your bankroll on a wager that pays $b$ dollars per unit staked with probability $p$ , and loses the stake with probability $q = 1 - p$ . After one bet, wealth is multiplied by either $(1 + f b)$ or $(1 - f)$ . After $n$ independent bets:

W_{n} = W_{0} \cdot (1 + f b)^{N_{W}} \cdot (1 - f)^{N_{L}}, \qquad N_{W} + N_{L} = n ,

where $N_{W}$ and $N_{L}$ count the wins and losses among the $n$ bets.

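Taking logs, dividing by $n$, and letting $n \to \infty$ (so that the fraction of wins converges to $p$), the LOG-WEALTH growth rate per bet is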

G (f) = E [ln (W_{k + 1} / W_{k})] = p ln (1 + f b) + q ln (1 - f) .

Maximize: $d G / df = 0$ gives

f^{*} = \frac{p b - q}{b} = \frac{p ( b + 1 ) - 1}{b} .
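
The intermediate algebra, written out as a short sketch using only the symbols already defined ($p + q = 1$ is used in the third line):

```latex
\begin{aligned}
\frac{dG}{df} &= \frac{p b}{1 + f b} - \frac{q}{1 - f} = 0 \\
p b \, (1 - f) &= q \, (1 + f b) \\
p b - q &= f b \, (p + q) = f b \\
f^{*} &= \frac{p b - q}{b}
\end{aligned}
```

The second derivative, $-p b^{2} / (1 + f b)^{2} - q / (1 - f)^{2}$, is negative for every $f$ in $[0, 1)$, so the stationary point is the unique maximum.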

A few sanity checks. (1) $f^{*} > 0$ requires $p b > q$, i.e. positive expectancy. With $p = 0.55, b = 1$: expected return per unit staked is $0.55 - 0.45 = 0.10$, so the edge is 10% and Kelly says bet exactly 10%. The formula reads "edge over odds": $f^{*} = (p b - q) / b$. (2) Larger edge → larger fraction. (3) At a fixed edge $p b - q$, a higher payoff $b$ LOWERS $f^{*}$, because long-odds bets win rarely and losing streaks run longer; at a fixed win probability $p$, raising $b$ increases $f^{*} = p - q / b$, but with diminishing effect and never beyond $p$. (4) $f^{*} = 1$ (bet everything) is only optimal in the degenerate case $q = 0$ (sure thing).
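
These checks can be confirmed numerically. The sketch below is illustrative only: the names `growth_rate` and `kelly_fraction` are hypothetical helpers, not part of any library. It maximizes $G(f)$ on a grid and compares the result with the closed form, then holds the edge $p b - q$ fixed at 10% while varying $b$ to show how $f^{*}$ responds.

```python
import numpy as np

def growth_rate(f, p, b):
    """Expected log-growth per bet: G(f) = p ln(1 + f b) + (1 - p) ln(1 - f)."""
    return p * np.log1p(f * b) + (1.0 - p) * np.log1p(-f)

def kelly_fraction(p, b):
    """Closed-form maximizer of G: f* = (p b - q) / b."""
    return (p * b - (1.0 - p)) / b

# Brute-force grid search; it should land on the closed-form answer.
p, b = 0.55, 1.0
grid = np.linspace(0.0, 0.99, 9901)          # step of 1e-4, stays below f = 1
f_grid = grid[np.argmax(growth_rate(grid, p, b))]
print(f"grid argmax = {f_grid:.4f}, closed form = {kelly_fraction(p, b):.4f}")

# Hold the edge p*b - q at 10% and vary the payoff b.
edge = 0.10
fixed_edge = {}
for payoff in (1.0, 2.0, 4.0):
    p_win = (1.0 + edge) / (payoff + 1.0)    # solves p*b - (1 - p) = edge
    fixed_edge[payoff] = kelly_fraction(p_win, payoff)
    print(f"b = {payoff:.0f}, p = {p_win:.4f}, f* = {fixed_edge[payoff]:.4f}")
```

At a fixed edge the Kelly fraction is simply the edge divided by $b$, so doubling the payoff halves the stake.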

The continuous case

For a risky asset whose price $S$ follows geometric Brownian motion (log-normally distributed prices), $d S / S = μ d t + σ d B_{t}$, the analysis is even simpler. With a fraction $f$ of wealth $W$ in the risky asset and the rest at the risk-free rate $r$, the portfolio follows

\frac{d W}{W} = (r + f (μ - r)) d t + f σ d B_{t} .

The long-run log-growth rate (using Itô) is

g (f) = r + f (μ - r) - \frac{1}{2} f^{2} σ^{2} .
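
The Itô step, written out as a sketch: applying Itô's lemma to $\ln W$ with $(d B_{t})^{2} = d t$ gives

```latex
\begin{aligned}
d \ln W &= \frac{dW}{W} - \frac{1}{2} \left( \frac{dW}{W} \right)^{2} \\
        &= \bigl( r + f (\mu - r) \bigr) \, dt + f \sigma \, dB_{t} - \frac{1}{2} f^{2} \sigma^{2} \, dt \\
        &= \Bigl( r + f (\mu - r) - \tfrac{1}{2} f^{2} \sigma^{2} \Bigr) \, dt + f \sigma \, dB_{t} .
\end{aligned}
```

The $d B_{t}$ term has zero mean, so the drift of $\ln W$ is the growth rate $g(f)$; the $-\frac{1}{2} f^{2} σ^{2}$ term is the volatility drag.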

Maximize: $g^{'} (f) = 0$ gives

f^{*} = \frac{μ - r}{σ ^{2}} .

Continuous Kelly: bet a fraction equal to the EXCESS RETURN over the VARIANCE. With $μ = 8\%$, $σ = 20\%$, $r = 0$: $f^{*} = 0.08/0.04 = 2.0$, so Kelly says LEVER 2× into the market. With $μ = 10\%$, $σ = 30\%$: $f^{*} = 0.10/0.09 = 1.11$, so Kelly says lever only slightly. Because $f^{*}$ scales as $1/σ^{2}$, halving $σ$ quadruples the Kelly fraction.
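
As a rough check on the closed form, the sketch below simulates a portfolio that is rebalanced to the fraction $f$ at every step and compares its average log-growth with $g(f)$. Everything here is an assumption made for illustration: the Euler discretization with 252 steps per year, the path count, and the helper names `log_growth` and `simulated_log_growth`.

```python
import numpy as np

def log_growth(f, mu, sigma, r=0.0):
    """Closed-form growth rate g(f) = r + f (mu - r) - 0.5 f^2 sigma^2."""
    return r + f * (mu - r) - 0.5 * f**2 * sigma**2

def simulated_log_growth(f, mu, sigma, r=0.0, years=10, steps_per_year=252,
                         n_paths=2000, seed=0):
    """Average annual log-growth of a portfolio rebalanced to fraction f each step."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / steps_per_year
    z = rng.standard_normal((n_paths, years * steps_per_year))
    # Euler step of dW/W = (r + f (mu - r)) dt + f sigma dB
    step_returns = (r + f * (mu - r)) * dt + f * sigma * np.sqrt(dt) * z
    return np.log1p(step_returns).sum(axis=1).mean() / years

mu, sigma = 0.08, 0.20
f_star = mu / sigma**2
for f in (0.5 * f_star, f_star, 2.0 * f_star):
    print(f"f = {f:.2f}: formula {log_growth(f, mu, sigma):+.4f}, "
          f"simulated {simulated_log_growth(f, mu, sigma):+.4f}")
```

The simulated values carry Monte Carlo noise of a few tenths of a percentage point per year at these settings, so agreement with the formula is approximate.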

Note: continuous Kelly often gives $f^{*} > 1$ , meaning LEVERAGE. Whether you can actually borrow at the risk-free rate at the level Kelly demands is a separate (real-world) question. Most institutional investors find Kelly's prescription too aggressive in practice — see fractional Kelly below.

The ruinous over-betting region

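The growth-rate function $g (f)$ is concave with a unique maximum at $f^{*}$. Beyond $f^{*}$, growth DECREASES. Beyond roughly $2 f^{*}$, growth falls below what the bankroll would earn sitting in cash; with $r = 0$ it is NEGATIVE: even though every individual bet has positive expected value, the long-run growth rate is negative and the bankroll trends to zero almost surely. For continuous Kelly the crossover is exact: $g (f) - r = f (μ - r) - \frac{1}{2} f^{2} σ^{2}$ vanishes at $f = 2 (μ - r) / σ^{2} = 2 f^{*}$. For the binary case the crossover is only approximately $2 f^{*}$: with $p = 0.55, b = 1$, $G (2 f^{*}) = 0.55 ln (1.2) + 0.45 ln (0.8) ≈ -0.00014$, so growth is already slightly negative at twice Kelly.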

This is the cruellest fact in position sizing. A positive-EV opportunity, sized too large, doesn't just give worse-than-Kelly returns; it actively destroys wealth. The intuition: large bets create wild swings, and the GEOMETRIC mean of wild swings is below their ARITHMETIC mean (Jensen's inequality on log). A 50% loss followed by a 50% gain leaves you at 75% of where you started, and recovering from a 50% loss takes a 100% gain, so the larger the swings, the further the compounded result lags the average bet.
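
The gap between arithmetic and geometric growth can be tabulated directly from the formulas. This is a minimal sketch; the function names are hypothetical, and the parameters are the $p = 0.55, b = 1$ and $μ = 8\%, σ = 20\%, r = 0$ examples used in the text.

```python
import math

def binary_growth(f, p, b):
    """Exact expected log-growth per bet for the binary wager."""
    return p * math.log1p(f * b) + (1.0 - p) * math.log1p(-f)

def arithmetic_return(f, p, b):
    """Expected (arithmetic) return per bet at stake fraction f."""
    return f * (p * b - (1.0 - p))

def continuous_excess_growth(f, mu, sigma, r=0.0):
    """g(f) - r for the continuous model."""
    return f * (mu - r) - 0.5 * f**2 * sigma**2

p, b = 0.55, 1.0
f_star = (p * b - (1.0 - p)) / b
print("multiple of Kelly | arithmetic return | log-growth")
for multiple in (0.5, 1.0, 1.5, 2.0, 2.5):
    f = multiple * f_star
    print(f"{multiple:>17.1f} | {arithmetic_return(f, p, b):>17.4f} | "
          f"{binary_growth(f, p, b):>10.6f}")

# Continuous case: excess growth vanishes at exactly twice Kelly.
print(continuous_excess_growth(2.0 * 0.08 / 0.20**2, 0.08, 0.20))
```

The arithmetic return keeps rising with the stake while the log-growth peaks at $f^{*}$ and then falls through zero, which is the whole over-betting story in two columns.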

Code: growth rate vs bet fraction

# Kelly criterion: the fraction of bankroll that maximizes long-run
# logarithmic growth rate. Three cases:
#   1. Discrete binary bet
#   2. Continuous (Gaussian) returns
#   3. Monte Carlo verification of the growth-rate maximum

import numpy as np

def kelly_binary(p, b):
    """Optimal fraction for a binary bet: probability p of winning b dollars
    per unit staked, probability 1-p of losing the unit stake."""
    return (p*b - (1-p)) / b

def kelly_continuous(mu, sigma, r=0.0):
    """Optimal fraction for log-normal returns: f* = (mu - r) / sigma²."""
    return (mu - r) / sigma**2

print("Binary bet examples:")
for p, b in [(0.55, 1.0), (0.60, 1.0), (0.51, 2.0), (0.40, 3.0)]:
    f = kelly_binary(p, b)
    print(f"  p={p:.2f}, b={b:.1f}:  f* = {f:.4f}  (bet {100*f:.1f}% of bankroll)")

print("\nContinuous Kelly: f* = mu / sigma² (r = 0):")
for mu, sigma in [(0.05, 0.15), (0.08, 0.20), (0.10, 0.30)]:
    print(f"  mu={mu:.2f}, sigma={sigma:.2f}:  f* = {kelly_continuous(mu, sigma):.4f}")

# Monte Carlo: simulate growth rate vs bet fraction, show Kelly is the optimum.
def simulate_growth(p, b, f, n=500, n_paths=10000, seed=0):
    rng = np.random.default_rng(seed)
    wins = rng.random((n_paths, n)) < p
    log_returns = np.where(wins, np.log(1 + f*b), np.log(1 - f))
    return np.mean(log_returns.sum(axis=1)) / n

p, b = 0.55, 1.0
f_kelly = kelly_binary(p, b)
print(f"\nGrowth rate vs bet fraction (p={p}, b={b}, Kelly = {f_kelly:g}):")
print(f"  {'f':>8s}  {'E[ln(W)/n]':>14s}  {'comment':>12s}")
for f in [0.05, 0.08, f_kelly, 0.15, 0.20, 0.25]:
    g = simulate_growth(p, b, f)
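    # Tag Kelly itself; flag RUIN only when growth is negative beyond Monte Carlo noise (~1e-4 here).
    tag = '← Kelly' if abs(f - f_kelly) < 1e-3 else ('  RUIN' if g < -1e-3 else '')
    print(f"  {f:>8.4f}  {g:>14.6f}  {tag:>12s}")
# At 2*Kelly growth is approximately zero; beyond 2*Kelly it is negative (ruinous).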

Output:

Binary bet examples:
  p=0.55, b=1.0:  f* = 0.1000  (bet 10.0% of bankroll)
  p=0.60, b=1.0:  f* = 0.2000  (bet 20.0% of bankroll)
  p=0.51, b=2.0:  f* = 0.2650  (bet 26.5% of bankroll)
  p=0.40, b=3.0:  f* = 0.2000  (bet 20.0% of bankroll)

Continuous Kelly: f* = mu / sigma² (r = 0):
  mu=0.05, sigma=0.15:  f* = 2.2222
  mu=0.08, sigma=0.20:  f* = 2.0000
  mu=0.10, sigma=0.30:  f* = 1.1111

Growth rate vs bet fraction (p=0.55, b=1.0, Kelly = 0.1):
         f      E[ln(W)/n]      comment
    0.0500        0.003766
    0.0800        0.004829
    0.1000        0.005036      ← Kelly
    0.1500        0.003776
    0.2000       -0.000083
    0.2500       -0.006659         RUIN

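The growth rate is maximized at $f = 0.10$ (Kelly), at a value of about 0.0050 per bet in log terms. Over 500 bets that compounds to $500 × 0.0050 ≈ 2.5$ in log-wealth, i.e. a typical bankroll multiple of about $e^{2.5} ≈ 12$. At 5% bet size: 75% of Kelly's growth. At 15% bet size: also 75% of Kelly's growth (approximately symmetric around $f^{*}$ ). At 20% (2× Kelly): essentially zero growth, within Monte Carlo noise. At 25%: NEGATIVE growth. The typical path loses money despite each bet having positive expected value; the arithmetic average of wealth still rises, but it is carried by a shrinking minority of lucky paths.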
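
To see what these per-bet growth rates mean for final wealth, the sketch below samples the terminal bankroll multiple $W_{n} / W_{0}$ after 500 bets by drawing the number of wins from a binomial distribution. The function name `terminal_multiples`, the path count, and the seed are illustrative assumptions.

```python
import numpy as np

def terminal_multiples(p, b, f, n_bets=500, n_paths=100_000, seed=0):
    """Sample W_n / W_0 by drawing the number of wins from a binomial."""
    rng = np.random.default_rng(seed)
    wins = rng.binomial(n_bets, p, size=n_paths)
    log_wealth = wins * np.log1p(f * b) + (n_bets - wins) * np.log1p(-f)
    return np.exp(log_wealth)

for f in (0.10, 0.25):
    w = terminal_multiples(0.55, 1.0, f)
    print(f"f = {f:.2f}: median {np.median(w):.3g}, mean {w.mean():.3g}, "
          f"share of paths below start {np.mean(w < 1.0):.1%}")
```

The median tracks $e^{G n}$, while the mean is pulled upward by rare large outcomes; at $f = 0.25$ the two point in opposite directions.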

Fractional Kelly

In practice, almost no one bets full Kelly. The two main reasons:

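(1) Parameter uncertainty. Kelly assumes $p, b$ (or $μ, σ$ ) are KNOWN. They aren't: they're estimated from historical data with substantial uncertainty. With $σ = 20\%$, every percentage point of error in the estimate $\hat{μ}$ moves the computed Kelly fraction by 0.25, and an estimate of 8% when the true drift is 4% gives a computed fraction of 2 against an actual optimum of 1, which puts the bettor exactly at the zero-growth point $2 f^{*}$. Wealth growth is much more sensitive to OVER-estimating Kelly (entering the ruinous region) than UNDER-estimating it: betting double the true optimum wipes out all growth, while betting half of it keeps 75%. The asymmetry argues for using a conservative fraction.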

(2) Variance reduction. At full Kelly, the path of wealth has very high variance. Half-Kelly gives 75% of Kelly's growth rate with HALF the volatility (a quarter of the variance) of log-wealth. Most institutional investors prefer this trade. Two-thirds Kelly retains 89% of the growth rate.
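
Both points follow from the continuous-model growth rate. Substituting $f = c f^{*}$ into $g (f) - r$ gives $c (2 - c)$ times the maximum, and the volatility of log-wealth scales with $c$. The sketch below, with hypothetical helper names and $r = 0$ assumed, tabulates that trade-off and then evaluates the growth actually earned when the drift estimate is wrong.

```python
def growth_share(c):
    """Share of full-Kelly excess growth kept when betting c * f*: c (2 - c)."""
    return c * (2.0 - c)

def realized_excess_growth(mu_true, mu_hat, sigma, c=1.0):
    """Excess growth earned when sizing with the estimate mu_hat (r = 0 assumed)."""
    f = c * mu_hat / sigma**2
    return f * mu_true - 0.5 * f**2 * sigma**2

print("fraction of Kelly | growth kept | volatility kept")
for c in (0.25, 0.5, 2.0 / 3.0, 1.0, 1.5, 2.0):
    print(f"{c:>17.2f} | {growth_share(c):>11.1%} | {c:>15.1%}")

# True drift 4%, estimated drift 8%, sigma 20%: full Kelly on the estimate
# sits at twice the true optimum, half Kelly lands on the true optimum.
for c in (1.0, 0.5):
    g = realized_excess_growth(mu_true=0.04, mu_hat=0.08, sigma=0.20, c=c)
    print(f"c = {c:.1f}: realized excess growth {g:.4f}")
```

Under these assumptions, halving the stake costs a quarter of the growth when the estimate is right and rescues all of it when the estimate is twice the truth.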

The standard practitioner's choice is HALF KELLY ( $f = f^{*} /2$ ), a convention found among quant hedge funds, professional sports bettors, and the Kelly literature alike. Renaissance Technologies' Medallion Fund reportedly uses a fraction in the 25-50% Kelly range; gambler Edward Thorp explicitly advocated 0.5× Kelly in his books on blackjack and trading.

Multi-asset Kelly

With $n$ risky assets having mean returns $μ$ (a vector), risk-free rate $r$, and covariance $Σ$, the multi-dimensional Kelly is:

f^{*} = Σ^{- 1} (μ - r 1) ,

where $1$ is the vector of ones, so that $μ - r 1$ is the vector of mean excess returns.

Note the connection to the tangency portfolio: $f^{*} / 1^{T} f^{*}$ is the tangency portfolio (max-Sharpe direction), and the magnitude $1^{T} f^{*} = 1^{T} Σ^{- 1} (μ - r 1)$ is the LEVERAGE (assuming it is positive). Multi-asset Kelly is "tangency portfolio + Kelly leverage": the natural decomposition of position sizing in a multi-asset world.
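
A small numerical sketch of this decomposition follows. The three-asset inputs are hypothetical numbers chosen only for illustration, and `growth` is an illustrative helper for the excess log-growth rate $f^{T} m - \frac{1}{2} f^{T} Σ f$, where $m$ denotes the vector of mean excess returns.

```python
import numpy as np

# Hypothetical inputs, chosen only for illustration.
excess = np.array([0.04, 0.06, 0.03])      # mean returns above the risk-free rate
vols = np.array([0.10, 0.20, 0.08])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.2],
                 [0.1, 0.2, 1.0]])
cov = np.outer(vols, vols) * corr

def growth(f):
    """Excess log-growth rate of the allocation f."""
    return f @ excess - 0.5 * f @ cov @ f

# Solve cov @ f = excess rather than forming the inverse explicitly.
f_star = np.linalg.solve(cov, excess)
leverage = f_star.sum()                    # 1' f*
tangency = f_star / leverage               # weights summing to one

print("Kelly allocation:", np.round(f_star, 3))
print("leverage:", round(float(leverage), 3))
print("tangency weights:", np.round(tangency, 3))
print("growth at f*, 0.5 f*, 1.5 f*:",
      [round(float(growth(c * f_star)), 4) for c in (1.0, 0.5, 1.5)])
```

With inputs like these the implied leverage is several times wealth, which is one reason the full multi-asset prescription is rarely used as is.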

The same parameter-uncertainty caveat applies, even more severely. The covariance matrix $Σ$ is noisy and ill-conditioned; $Σ^{- 1}$ amplifies that noise; the resulting multi-asset Kelly recommends large positions along the smallest-eigenvalue directions of $Σ$ , which are exactly the noisiest. Practitioners use SHRINKAGE on $Σ$ , FACTOR MODELS for the covariance, or simply scale down to fractional Kelly even more aggressively.
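
The effect of shrinkage can be sketched with a deliberately short sample. The example below shrinks the sample covariance linearly toward a scaled identity with a hand-picked intensity `delta = 0.5`. That intensity, the simulated data, and the helper name `shrink_covariance` are all assumptions for illustration; this is not a data-driven estimator such as Ledoit-Wolf.

```python
import numpy as np

def shrink_covariance(sample_cov, delta):
    """Linear shrinkage toward a scaled identity: (1 - delta) S + delta m I."""
    n = sample_cov.shape[0]
    m = np.trace(sample_cov) / n           # average variance
    return (1.0 - delta) * sample_cov + delta * m * np.eye(n)

# Hypothetical setup: 5 correlated assets, only 24 observations.
rng = np.random.default_rng(0)
n_assets = 5
true_corr = np.full((n_assets, n_assets), 0.6)
np.fill_diagonal(true_corr, 1.0)
true_cov = 0.20**2 * true_corr
returns = rng.multivariate_normal(np.zeros(n_assets), true_cov, size=24)

sample_cov = np.cov(returns, rowvar=False)
shrunk_cov = shrink_covariance(sample_cov, delta=0.5)
excess = np.full(n_assets, 0.05)           # assumed mean excess returns

for name, cov in (("sample", sample_cov), ("shrunk", shrunk_cov)):
    f = np.linalg.solve(cov, excess)
    print(f"{name}: condition number {np.linalg.cond(cov):.1f}, "
          f"gross exposure {np.abs(f).sum():.2f}")
```

Shrinking toward a scaled identity pulls every eigenvalue toward the average, so the condition number can only fall; how much shrinkage is appropriate is a separate estimation question.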

When Kelly fails

Path-dependent risk. Kelly maximizes long-run log-wealth but says nothing about drawdowns or short-term volatility. A full-Kelly bettor should expect deep drawdowns even with favorable bets: in the continuous model with $r = 0$, the probability of wealth ever falling to half its current level is 50%. Investors with mark-to-market constraints (margin requirements, redemption pressure, regulators) cannot tolerate that path even if the destination is good.
Non-stationary parameters. The probabilities $p$ , the means and variances, can shift over time. Kelly with stale parameters can recommend the wrong fraction by a lot.
Wealth-dependent utility. Kelly assumes pure log-wealth maximization. If utility is more risk-averse than log (e.g., power utility with $γ > 1$ ), the optimal fraction is $f^{*} / γ$ — i.e., always less than Kelly. Real institutional preferences are typically more risk-averse than log.
Transaction costs. Kelly ignores costs. Higher-frequency rebalancing toward Kelly amplifies transaction friction; the optimal fraction shrinks as costs rise.
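
Two of these points reduce to one-line formulas in the continuous model, sketched below with hypothetical helper names. The first is the $f^{*} / γ$ rule for power utility. The second assumes $r = 0$ and continuous rebalancing, under which the hitting probability of a Brownian motion with drift gives the chance of wealth ever falling to a fraction $x$ of its current level as $x^{2/c - 1}$ when betting $c$ times Kelly.

```python
def risk_averse_fraction(mu, sigma, r=0.0, gamma=1.0):
    """Optimal risky fraction under power utility: (mu - r) / (gamma sigma^2)."""
    return (mu - r) / (gamma * sigma**2)

def prob_ever_falls_to(x, c=1.0):
    """P(wealth ever drops to fraction x of its current level) at c * f*, r = 0."""
    return x ** (2.0 / c - 1.0)

for gamma in (1.0, 2.0, 4.0):
    f = risk_averse_fraction(0.08, 0.20, gamma=gamma)
    print(f"gamma = {gamma:.0f}: risky fraction {f:.2f}")

for c in (1.0, 0.5):
    print(f"c = {c:.1f}: P(ever down 50%) = {prob_ever_falls_to(0.5, c):.3f}")
```

Setting $γ = 2$ reproduces half Kelly, which ties the practitioner's rule of thumb to a specific level of risk aversion.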

See also

Sharpe and backtest evaluation: the standard performance metrics that complement Kelly sizing.
Mean-variance portfolio optimization: the cross-sectional counterpart; multi-asset Kelly is tangency portfolio × Kelly leverage.
Value at Risk: the institutional constraint that typically forces sub-Kelly position sizing.