Atomic basis functions
Quantum Chemistry
A 1s Slater orbital is the right shape for a hydrogenic atom — it has a cusp at the nucleus and exponential decay at long range, both physically correct. The problem isn't shape, it's integration. The moment a Slater orbital is involved in a multi-centre integral (anything that touches more than one nucleus, which is everything in molecular quantum chemistry past the H atom), the integral becomes a slow, numerically awkward beast. There's no closed form for two electrons in two Slater orbitals on different centres interacting through ; you grind it out numerically and pay for it.
Boys's idea, in 1950: replace Slater orbitals with Gaussians. Lose the physically correct shape, gain analytic integrals. Then make up for the lost shape by using several Gaussians per Slater orbital — a contracted basis function. This is the trade every practical quantum-chemistry code makes. We make it now.
The Gaussian advantage: a single magic identity
A primitive 1s Gaussian centred at is
The product of two such Gaussians is itself a Gaussian, centred at a weighted midpoint:
This is the Gaussian product theorem. Two Gaussians on different centres collapse to one Gaussian on a single centre, with a known prefactor and a known new centre . Multi-centre Gaussian integrals reduce to single-centre Gaussian integrals, which all have closed forms. Slater orbitals don't have an analogous identity. That's the whole reason chemistry uses Gaussians.
The cost: shape
Gaussians are wrong about hydrogen in two specific ways. They have no cusp at — a primitive Gaussian is smooth there, whereas the exact 1s wavefunction has a sharp peak. And their tails decay too fast: falls off much quicker than . Both errors matter for integrals near the nucleus and far from it.
The fix is to use a fixed linear combination of several primitive Gaussians chosen so that the combined function approximates a Slater orbital. STO-G is exactly this: primitive Gaussians with prefactors fitted to a Slater orbital with . STO-3G uses three primitives. The fit is least-squares over the radial coordinate; the resulting contraction coefficients are tabulated and copy-pasted into every quantum chemistry code on earth.
For hydrogen the published values are:
The contraction coefficients are fixed — they do not change during a calculation. The basis function is as a whole; the primitives just live inside it. From the rest of the program's point of view, STO-3G provides one basis function per atom, just parametrized internally as a sum of three Gaussians.
The integrals
Three primitive integrals are all we need for the H-atom case (one centre, one nucleus). Closed forms, derived by direct integration:
The Hamiltonian and overlap matrices are then 3×3, with element obtained by plugging into the formulas above. With those, the energy of the contracted basis function is the Rayleigh quotient
The code (new this page)
Three integral functions, one constants block (the STO-3G parameters), one function that assembles the matrices and returns the energy.
import numpy as np
# ---- primitive 1s Gaussian integrals (same-centre, same-nucleus) ----
# All three closed forms below are derived by direct integration of
# Gaussian × Gaussian on r in 3D, with normalization N = (2 alpha / pi)**0.75.
def overlap_g(a, b):
"""<g_a | g_b> for normalized 1s primitive Gaussians at the origin."""
return (4 * a * b / (a + b) ** 2) ** 0.75
def kinetic_g(a, b):
"""<g_a | -1/2 nabla^2 | g_b> = 3 a b / (a+b) * <g_a|g_b>."""
return 3 * a * b / (a + b) * overlap_g(a, b)
def nuclear_attraction_g(a, b, Z=1):
"""<g_a | -Z/r | g_b> for nucleus at the origin."""
return -Z * 2 * np.sqrt((a + b) / np.pi) * overlap_g(a, b)
# ---- STO-3G contraction for hydrogen (zeta = 1.0) ----
# Three primitive 1s Gaussians whose normalized linear combination
# approximates a 1s Slater orbital with zeta = 1. Coefficients and
# exponents are tabulated in Pople's STO-nG papers.
STO_3G_H_ALPHAS = np.array([3.42525091, 0.62391373, 0.16885540])
STO_3G_H_COEFFS = np.array([0.15432897, 0.53532814, 0.44463454])
def hydrogen_energy_sto3g():
"""Energy of the H atom in the STO-3G basis.
Builds the 3x3 overlap and Hamiltonian matrices over the three
primitive Gaussians, then evaluates <psi|H|psi>/<psi|psi> for the
contracted 1s function psi = sum_i d_i g_i.
"""
a = STO_3G_H_ALPHAS
d = STO_3G_H_COEFFS
n = len(a)
S = np.array([[overlap_g(a[i], a[j]) for j in range(n)] for i in range(n)])
T = np.array([[kinetic_g(a[i], a[j]) for j in range(n)] for i in range(n)])
V = np.array([[nuclear_attraction_g(a[i], a[j]) for j in range(n)] for i in range(n)])
H = T + V
return (d @ H @ d) / (d @ S @ d) Run it
Exact (Slater, zeta = 1.0000): E = -0.500000 Hartree
STO-3G (3 primitive Gaussians): E = -0.466582 Hartree
Exact ground state of H: E = -0.500000 Hartree
STO-3G error: +0.033418 Hartree The STO-3G energy is Hartree, about 33 mHa above the exact . This is the variational principle holding strictly: our trial space (three fixed-exponent Gaussians with fixed contraction coefficients) doesn't contain the exact ground state, so the minimum we can reach inside it is strictly above . The 33 mHa gap is what we paid for the analytic-integral convenience.
Two ways to tighten this. (1) Use more primitives — STO-6G would cut the error to ~5 mHa. (2) Optimize the basis — let the 's vary. Both are real chemistry-code techniques, but for our H2 goal STO-3G is enough; we'd rather spend the complexity budget on the H2-specific machinery (Slater determinants, the SCF loop, two-centre integrals).
The full file
"""qc.py — Computational QC from scratch.
Step 3: replace the analytic Slater orbital with a Gaussian basis (STO-3G).
"""
import numpy as np
from scipy.optimize import minimize_scalar
# ---- analytic Slater 1s integrals (kept from steps 1-2 as a reference) ----
def kinetic_1s(zeta):
return 0.5 * zeta ** 2
def nuclear_attraction_1s(zeta):
return -zeta
def hydrogen_energy(zeta):
return kinetic_1s(zeta) + nuclear_attraction_1s(zeta)
def optimise_hydrogen():
result = minimize_scalar(hydrogen_energy, bracket=(0.5, 1.5))
return result.x, result.fun
# ---- primitive 1s Gaussian integrals (same-centre, nucleus at origin) ----
def overlap_g(a, b):
"""<g_a | g_b> for normalized 1s primitive Gaussians at the origin."""
return (4 * a * b / (a + b) ** 2) ** 0.75
def kinetic_g(a, b):
"""<g_a | -1/2 nabla^2 | g_b> = 3 a b / (a+b) * <g_a|g_b>."""
return 3 * a * b / (a + b) * overlap_g(a, b)
def nuclear_attraction_g(a, b, Z=1):
"""<g_a | -Z/r | g_b> for nucleus at the origin."""
return -Z * 2 * np.sqrt((a + b) / np.pi) * overlap_g(a, b)
# ---- STO-3G contraction for hydrogen ----
STO_3G_H_ALPHAS = np.array([3.42525091, 0.62391373, 0.16885540])
STO_3G_H_COEFFS = np.array([0.15432897, 0.53532814, 0.44463454])
def hydrogen_energy_sto3g():
"""Energy of the H atom in the STO-3G basis."""
a = STO_3G_H_ALPHAS
d = STO_3G_H_COEFFS
n = len(a)
S = np.array([[overlap_g(a[i], a[j]) for j in range(n)] for i in range(n)])
T = np.array([[kinetic_g(a[i], a[j]) for j in range(n)] for i in range(n)])
V = np.array([[nuclear_attraction_g(a[i], a[j]) for j in range(n)] for i in range(n)])
H = T + V
return (d @ H @ d) / (d @ S @ d)
if __name__ == "__main__":
zeta_opt, E_slater = optimise_hydrogen()
E_sto3g = hydrogen_energy_sto3g()
print(f"Exact (Slater, zeta = {zeta_opt:.4f}): E = {E_slater:+.6f} Hartree")
print(f"STO-3G (3 primitive Gaussians): E = {E_sto3g:+.6f} Hartree")
print(f"Exact ground state of H: E = {-0.5:+.6f} Hartree")
print(f"STO-3G error: {E_sto3g - (-0.5):+.6f} Hartree") Why fix the contraction coefficients instead of optimizing them?
Because if you let the contraction coefficients vary, you've effectively turned three primitives into three independent basis functions, which is a different and slightly larger basis. The STO-G recipe deliberately constrains the 's to fitted values so the basis size stays at one function per atom. This makes book-keeping simple and matches how chemists report basis-set sizes — when someone says "STO-3G," they mean one basis function per H, two per C (1s and 2s+2p bundle), etc. If the contraction coefficients were optimized, the count and the standard wouldn't make sense.
Where do the STO-3G numbers come from?
Pople and co-workers (1969) least-squares-fit the three pairs to a Slater orbital with , minimizing the radial deviation. The result is a fixed, atom-independent recipe: scale the 's by to match a different Slater orbital, but the relative magnitudes of the 's and the values of are universal. The numbers in the code are the published canonical values for hydrogen.
Why is the STO-3G energy strictly above the exact value?
Variational principle. The exact 1s wavefunction (a Slater orbital with ) is not exactly representable as a linear combination of three fixed Gaussians; it lives outside our trial space. Any wavefunction inside our trial space is therefore not the exact ground state, so its energy is strictly above . The 33 mHa gap is the energetic cost of leaving the exact answer behind for analytic integrability — and is the kind of error that compounds in bigger systems, which is why people invest in better basis sets.