Reference measures, Langevin dynamics, and stochastic quantization —
one equation that runs from coin-flip samplers to the Yang-Mills frontier, with four live
models you can poke.
I've been writing up a set of working notes on a single organizing idea, and this post is the
interactive version. The idea is almost embarrassingly simple to state: a sampler is a
machine for constructing a measure. Choose a state space. Choose a reference measure
on it. Choose a dimensionless action relative to that reference measure. Then build a
transition rule that preserves the resulting target. That one discipline runs unbroken from
textbook Metropolis-Hastings all the way to stochastic quantization of gauge theory —
and most of the famous subtleties along the way are what happens when one of those choices is
made sloppily.
This post does not claim a new construction of an interacting continuum quantum field theory,
and it doesn't introduce a new algorithm. The contribution is organizational: one framework,
one equation, and — because this is my blog and not a journal — four live
simulations so you can watch the measures get built.
The one equation
The central object is not a bare density but a measure:
π(dΨ) = Z⁻¹ e^(−𝒜(Ψ)) μ₀(dΨ)
Here Ψ is a state — a particle configuration, a parameter vector, a lattice field, a set
of group-valued links — μ₀ is an explicit reference measure on
the state space, and 𝒜 is a dimensionless action. For a thermal system,
𝒜 = βH. For Euclidean field theory, 𝒜 = S_E
in units where ℏ = 1.
The point that everything else hangs on: a density is not invariant data until the
reference measure is specified. In Euclidean parameter space the reference is
Lebesgue measure. In canonical phase space it's Liouville measure. On a Riemannian manifold
it's Riemannian volume. On a lattice scalar field it's a product measure over sites; on a
lattice gauge field, product Haar measure over links; in function-space sampling, often a
Gaussian. These choices are not cosmetic. They determine the actual probability law being
sampled — and demo 3 below makes that difference something you can see.
A transition rule is a Markov kernel K(Ψ, dΦ). The measure
π is stationary if πK = π: feed the kernel a state drawn from
π and it hands back a state drawn from π. Convergence to π from an
arbitrary start is a stronger statement and needs extra assumptions (irreducibility,
aperiodicity, suitable recurrence, non-explosion). The framework's job is to make
stationarity itself routine to verify.
Metropolis-Hastings as a measure-preserving rule
The universal correction layer. Draw a proposal Φ from a kernel with density
q(Φ | Ψ) with respect to the same reference measure μ₀, and accept
it with probability
α(Ψ, Φ) = min(1, [e^(−𝒜(Φ)) q(Ψ | Φ)] / [e^(−𝒜(Ψ)) q(Φ | Ψ)])
If rejected, stay put. The normalizing constant Z cancels from the ratio, which is the
practical superpower of the whole method: you never need to know it. The resulting kernel
satisfies detailed balance with respect to π, which implies stationarity (though detailed
balance is sufficient, not necessary — non-reversible chains can preserve the same
target while carrying equilibrium probability currents, sometimes deliberately, for faster
mixing).
One correction worth making loudly: the folklore summary ("accept downhill moves
automatically, uphill moves with probability e^(−Δ𝒜)") is only true
when the proposal is symmetric. For an asymmetric proposal, a downhill move can be
rejected and an uphill move can be accepted with probability one, if the proposal ratio says
so. Cancellation is something you verify, not something you assume.
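To make the asymmetric case concrete, here is a minimal sketch on a three-state space with counting measure as the reference. The action values, the proposal matrix `q`, and the helper `mh_kernel` are all made up for illustration; nothing here comes from the live demos. It builds the full transition matrix, checks πK = π, and shows what happens when the proposal ratio is dropped.

```python
import numpy as np

def mh_kernel(action, q, use_proposal_ratio=True):
    """Metropolis-Hastings transition matrix on a finite state space.

    action[i]: dimensionless action of state i (reference = counting measure).
    q[i, j]:   probability of proposing j from i (rows sum to 1).
    """
    n = len(action)
    K = np.zeros((n, n))
    alpha = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j or q[i, j] == 0.0:
                continue
            ratio = np.exp(-(action[j] - action[i]))
            if use_proposal_ratio:
                ratio *= q[j, i] / q[i, j]   # the term the folklore rule forgets
            alpha[i, j] = min(1.0, ratio)
            K[i, j] = q[i, j] * alpha[i, j]
        K[i, i] = 1.0 - K[i].sum()           # rejected proposals stay put
    return K, alpha

# Illustrative (hypothetical) target and deliberately asymmetric proposal.
action = np.array([0.0, 1.0, 0.5])
pi = np.exp(-action) / np.exp(-action).sum()
q = np.array([[0.0, 0.1, 0.9],
              [0.8, 0.0, 0.2],
              [0.5, 0.5, 0.0]])

K, alpha = mh_kernel(action, q)
K_naive, _ = mh_kernel(action, q, use_proposal_ratio=False)

print("stationary with proposal ratio:", np.allclose(pi @ K, pi))
print("stationary without it:         ", np.allclose(pi @ K_naive, pi))
print("downhill move 1 -> 0 accepted with probability", alpha[1, 0])
print("uphill move   0 -> 1 accepted with probability", alpha[0, 1])
```

With these numbers the downhill move 1 → 0 is accepted only about a third of the time, the uphill move 0 → 1 is always accepted, and the kernel that ignores the proposal ratio no longer preserves π.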
Live model 01 — Metropolis-Hastings
Watching a measure get built
A single walker samples a tilted double-well action
𝒜(x) = 2(x²−1)² + 0.25x with a symmetric Gaussian proposal. Top: the
action landscape and the walker, with each proposal flashed cyan (accepted) or red
(rejected). Bottom: the histogram of visited states converging onto the exact target density
e^(−𝒜) drawn in white. The walker starts in the shallower well — watch the
measure pull it across. Try the timid and wild step sizes to see why tuning matters: both
preserve the target, but mixing is another story.
The acceptance rule min(1, e^(−Δ𝒜)) needs no normalizing constant. Timid
steps accept nearly everything and explore nothing; wild steps mostly reject. Either way the
stationary measure is the same — the chain just takes longer to reveal it.
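For readers who want the same experiment offline, here is a rough sketch of a single-walker random-walk Metropolis chain on the demo's action. It is not the code behind the live model; the function names, step sizes, chain length, burn-in, and the quadrature grid are my own assumptions.

```python
import numpy as np

def action(x):
    # Tilted double well from the demo: A(x) = 2 (x^2 - 1)^2 + 0.25 x
    return 2.0 * (x**2 - 1.0)**2 + 0.25 * x

def random_walk_metropolis(n_steps, step, x0=1.0, seed=0):
    """Symmetric Gaussian proposal, so the acceptance rule is min(1, e^(-dA))."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, step, size=n_steps)
    log_u = np.log(rng.uniform(size=n_steps))
    x = x0                                   # x0 = 1 is the shallower well
    chain = np.empty(n_steps)
    n_accept = 0
    for t in range(n_steps):
        y = x + noise[t]
        if log_u[t] < action(x) - action(y):
            x = y
            n_accept += 1
        chain[t] = x                         # a rejection repeats the old state
    return chain, n_accept / n_steps

def exact_left_share():
    # Share of the target measure in x < 0, by brute-force quadrature.
    grid = np.linspace(-4.0, 4.0, 80_001)
    w = np.exp(-action(grid))
    return w[grid < 0].sum() / w.sum()

chain, acc = random_walk_metropolis(50_000, step=1.0)
measured = np.mean(chain[5_000:] < 0)        # drop a burn-in segment
print(f"left-well share: measured {measured:.3f}, exact {exact_left_share():.3f}")
print(f"acceptance rate at step 1.0: {acc:.2f}")
```

Rerunning with `step=0.02` or `step=5.0` reproduces the timid and wild regimes: the acceptance rate moves toward one or toward zero, while the target stays the same.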
Euclidean Langevin dynamics
The continuous-time counterpart in ℝᵈ is the overdamped Langevin diffusion:
dXₜ = −∇𝒜(Xₜ) dt + √2 dBₜ
The drift transports probability downhill in action; the diffusion spreads it. The
Fokker-Planck equation is ∂ₜρ = ∇·(ρ∇𝒜 + ∇ρ), and the
stationary density with respect to Lebesgue measure is
ρ_∞ = Z⁻¹ e^(−𝒜). The verification is a one-line
zero-current calculation: at ρ_∞, the two probability fluxes cancel
exactly. The stationary state is not "everything at the minimum"; it is the distribution
where downhill transport and diffusive spreading balance.
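Spelled out, the zero-current calculation reads as follows. The current J is just the Fokker-Planck equation above rewritten as a continuity equation; no new assumptions are involved.

```latex
\begin{aligned}
\partial_t \rho &= -\nabla\cdot J, \qquad
  J = -\bigl(\rho\,\nabla\mathcal{A} + \nabla\rho\bigr),\\
\rho_\infty = Z^{-1}e^{-\mathcal{A}}
  \;&\Longrightarrow\; \nabla\rho_\infty = -\rho_\infty\,\nabla\mathcal{A},\\
J[\rho_\infty] &= -\bigl(\rho_\infty\,\nabla\mathcal{A} - \rho_\infty\,\nabla\mathcal{A}\bigr) = 0 .
\end{aligned}
```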
That distinction — optimization finds minima, Langevin finds measures
— is exactly what the noise toggle in the next demo shows. Kill the noise and the
particle cloud collapses into the two minima and freezes. Restore it and the cloud relaxes
back to e^(−𝒜), with the deeper well holding predictably more mass.
Live model 02 — Langevin diffusion
Drift versus noise, fighting to a draw
Six hundred independent particles integrate
dX = −∇𝒜 dt + √2 dB in a two-dimensional tilted double-well.
The faint red shading is the exact target e^(−𝒜); the HUD compares the measured share of
particles in the deeper left well against the value predicted by integrating the measure.
Respawn the whole cloud in the shallow right well and watch it relax to equilibrium —
then switch the noise off and watch sampling degenerate into gradient descent.
The stationary cloud is the configuration in which downhill drift and
diffusive spreading cancel as probability fluxes. With noise off, the same drift is just an
optimizer: every particle finds a minimum and dies there.
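A stripped-down version of the noise toggle can be sketched with an unadjusted Euler-Maruyama integrator. This is an assumption-laden stand-in, not the demo's code: it is one-dimensional, it reuses the action from live model 01 rather than the demo's two-dimensional well, and the step size, run time, and particle count are arbitrary choices.

```python
import numpy as np

def grad_action(x):
    # A(x) = 2 (x^2 - 1)^2 + 0.25 x  =>  A'(x) = 8 x (x^2 - 1) + 0.25
    return 8.0 * x * (x**2 - 1.0) + 0.25

def langevin(n_particles=2000, dt=0.005, t_final=20.0, noise=True, seed=0):
    """Euler-Maruyama for dX = -A'(X) dt + sqrt(2) dB (small O(dt) bias)."""
    rng = np.random.default_rng(seed)
    x = np.full(n_particles, 1.0)            # respawn everyone in the shallow well
    for _ in range(int(round(t_final / dt))):
        x = x - grad_action(x) * dt
        if noise:
            x = x + np.sqrt(2.0 * dt) * rng.normal(size=n_particles)
    return x

# Predicted share of the measure in the deeper left well, by quadrature.
grid = np.linspace(-4.0, 4.0, 80_001)
w = np.exp(-(2.0 * (grid**2 - 1.0)**2 + 0.25 * grid))
predicted = w[grid < 0].sum() / w.sum()

x_noisy = langevin(noise=True)
x_frozen = langevin(noise=False)
print(f"noise on : left share {np.mean(x_noisy < 0):.3f} (predicted {predicted:.3f})")
print(f"noise off: left share {np.mean(x_frozen < 0):.3f}, spread {np.ptp(x_frozen):.1e}")
```

With the noise off, every particle slides to the nearest minimum and the measured left share stays at zero, because nothing ever crosses the barrier.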
Riemannian Langevin and the volume correction
Move the state space to a Riemannian manifold (𝒮, g) and the reference-measure principle
stops being pedantry and starts changing answers. Intrinsic manifold Langevin dynamics,
best defined generator-first as
Lf = −⟨grad_g 𝒜, grad_g f⟩_g + Δ_g f
with Δ_g the Laplace-Beltrami operator, samples e^(−𝒜) with respect to
Riemannian volume, not coordinate volume. In a chart, the intrinsic target is
π(dx) ∝ e^(−𝒜(x)) √|g(x)| dx
whereas the coordinate-density target is e^(−𝒜(x)) dx.
Whenever |g| is not constant, those are two different target measures, full stop. The Itô
form of the intrinsic diffusion carries an extra drift term,
|g|^(−1/2) ∂_j(√|g| g^(ij)), equivalently −g^(jk) Γ^i_(jk), and dropping it (using g⁻¹ as a mere
preconditioner in a Euclidean-style update) generally breaks the stationary measure. If you
actually want the coordinate-density target, you can have it: shift the action by
½ log |g| and sample that intrinsically. The framework doesn't forbid either
target; it forces you to say which one you mean.
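The effect of the extra drift term shows up already in one dimension. The sketch below is a toy under loud assumptions: the metric g(x) = 1 + x² and the action 𝒜(x) = x²/2 are invented for illustration, and the integrator is plain Euler-Maruyama with its usual step-size bias. In one dimension the correction |g|^(−1/2) ∂(√|g| g⁻¹) reduces to ½ (g⁻¹)′.

```python
import numpy as np

def g(x):            # hypothetical 1D metric, chosen only for illustration
    return 1.0 + x**2

def d_action(x):     # A(x) = x^2 / 2
    return x

def d_ginv(x):       # derivative of g^(-1) = 1 / (1 + x^2)
    return -2.0 * x / (1.0 + x**2)**2

def simulate(correct, n=5000, dt=0.02, t_final=40.0, seed=0):
    """Euler-Maruyama for the Ito SDE with diffusion coefficient g^(-1)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    for _ in range(int(round(t_final / dt))):
        ginv = 1.0 / g(x)
        drift = -ginv * d_action(x)
        if correct:
            drift = drift + 0.5 * d_ginv(x)  # the volume-correction drift
        x = x + drift * dt + np.sqrt(2.0 * ginv * dt) * rng.normal(size=n)
    return x

# Second moment of the intrinsic target e^(-A) sqrt(g) dx, by quadrature.
grid = np.linspace(-10.0, 10.0, 200_001)
w_volume = np.exp(-grid**2 / 2.0) * np.sqrt(g(grid))
exact_m2 = np.sum(grid**2 * w_volume) / np.sum(w_volume)

m2_corrected = np.mean(simulate(correct=True)**2)
m2_dropped = np.mean(simulate(correct=False)**2)
print(f"E[x^2] intrinsic target (quadrature): {exact_m2:.3f}")
print(f"E[x^2] with correction drift:         {m2_corrected:.3f}")
print(f"E[x^2] with correction dropped:       {m2_dropped:.3f}")
```

In this toy the preconditioner-only update settles on a visibly different second moment: a zero-flux calculation shows it samples e^(−𝒜) g dx instead of e^(−𝒜) √g dx.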
The cleanest place to see the distinction needs no dynamics at all — just the
sphere, where the metric factor is sin θ.
Live model 03 — Reference measures
"Uniform" is not a density. It's a measure.
Both modes below sample with the same constant density, "1" —
the same 𝒜 = 0 — but against two different reference measures on the sphere.
Coordinate measure dθ dφ piles points up at the poles, because equal intervals of θ near
the pole own almost no area. Riemannian volume sin θ dθ dφ spreads them
uniformly over the surface. Same density. Different measure. Different physics.
For a truly uniform surface distribution, 13.4% of points should fall within
30° of either pole. Under the coordinate measure it's 33.3%. The √|g| = sin θ volume
factor is the entire difference — and the same factor, in higher dimensions, is the
Riemannian Langevin correction term.
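Those two percentages are easy to check by direct sampling. A small sketch, assuming θ is the polar angle in [0, π]; the azimuth φ is uniform under both measures and plays no role, so it is left out. Sample size and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Coordinate measure d(theta) d(phi): theta itself is uniform on [0, pi].
theta_coord = np.pi * rng.uniform(size=n)

# Riemannian volume sin(theta) d(theta) d(phi): cos(theta) is uniform on [-1, 1].
theta_vol = np.arccos(1.0 - 2.0 * rng.uniform(size=n))

def polar_fraction(theta, cap_deg=30.0):
    """Fraction of points within cap_deg of either pole."""
    cap = np.deg2rad(cap_deg)
    return np.mean((theta < cap) | (theta > np.pi - cap))

print(f"coordinate measure: {polar_fraction(theta_coord):.3f} (exact 1/3)")
print(f"Riemannian volume:  {polar_fraction(theta_vol):.3f} "
      f"(exact {1.0 - np.cos(np.pi / 6):.3f})")
```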
The same discipline governs discretization. An unadjusted Langevin step has a stationary
distribution that's biased by an amount controlled by the step size; the Metropolis-adjusted
version (MALA) restores exactness by treating each step as a proposal — but only
if the proposal density entering the acceptance ratio is the density of the move
actually being simulated, with all asymmetry, Jacobian, and volume terms included, computed
against one consistent reference measure. Geometric proposals built from the exponential map
are the same story: exact geodesic random walks from isotropic tangent noise can be genuinely
symmetric with respect to Riemannian volume, but non-radial covariance, approximate
retractions, coordinate-expressed densities, boundaries, and cut-locus preimages all break
the symmetry. Adding an accept/reject step does not by itself fix your target.
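In the Euclidean case the bookkeeping is short enough to write down in full. The sketch below is a minimal MALA implementation on the one-dimensional tilted double well from live model 01, with Lebesgue measure as the single reference for both the target and the proposal density. Running many independent chains in parallel, the step size `h`, and the chain counts are my choices for illustration.

```python
import numpy as np

def action(x):
    return 2.0 * (x**2 - 1.0)**2 + 0.25 * x

def grad_action(x):
    return 8.0 * x * (x**2 - 1.0) + 0.25

def log_q(y, x, h):
    # Log density (up to a constant) of proposing y from x:
    # y ~ N(x - h A'(x), 2h), written with respect to Lebesgue measure.
    return -(y - x + h * grad_action(x))**2 / (4.0 * h)

def mala(n_chains=2000, n_steps=2000, h=0.02, seed=0):
    rng = np.random.default_rng(seed)
    x = np.full(n_chains, 1.0)
    n_accept = 0
    for _ in range(n_steps):
        y = x - h * grad_action(x) + np.sqrt(2.0 * h) * rng.normal(size=n_chains)
        # Target ratio times proposal ratio q(x | y) / q(y | x); the drift makes
        # the proposal asymmetric, so the second bracket does not cancel.
        log_alpha = (action(x) - action(y)) + (log_q(x, y, h) - log_q(y, x, h))
        accept = np.log(rng.uniform(size=n_chains)) < log_alpha
        x = np.where(accept, y, x)
        n_accept += int(accept.sum())
    return x, n_accept / (n_chains * n_steps)

grid = np.linspace(-4.0, 4.0, 80_001)
w = np.exp(-action(grid))
predicted = w[grid < 0].sum() / w.sum()

x_final, rate = mala()
print(f"left share {np.mean(x_final < 0):.3f} (predicted {predicted:.3f}), "
      f"acceptance {rate:.2f}")
```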
Field configurations and reference measures
Euclidean field theory writes its target as
π(dφ) ∝ e^(−S_E[φ]) 𝒟φ, and the formal symbol 𝒟φ
is exactly where the framework earns its keep: a continuum path-integral measure is not
automatically a well-defined object. It is given meaning through one of a few explicit
constructions:
Lattice product measure. On a finite lattice, the field is just a
high-dimensional vector, the reference is ∏ₓ dφₓ, and
Metropolis, MALA, or HMC apply directly. This is an ordinary probability measure with no
philosophical asterisk.
Gaussian reference measure. In infinite dimensions there is no
translation-invariant Lebesgue measure. One starts instead from a Gaussian free-field
measure μ_G and defines the interacting theory by a Radon-Nikodym density:
π(dφ) = Z⁻¹ e^(−V(φ)) μ_G(dφ). And here's a trap the equation
guards against: if μ_G already contains the free quadratic part of the action,
writing the full action in the exponential double-counts it. The reference
measure determines what belongs in the action.
This is also why function-space MCMC works the way it does. Preconditioned Crank-Nicolson
proposals are engineered to preserve the Gaussian reference exactly, so only the relative
density enters the accept/reject step — which is why pCN survives mesh refinement that
kills naive random-walk proposals. The algorithm respects the reference measure instead of
fighting it.
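A finite-dimensional caricature shows the mechanism. In the sketch below the reference μ_G is a standard Gaussian N(0, I) on ℝ⁵⁰ and the relative potential is V(φ) = |φ|²/2; both are hypothetical choices, picked so that the target is N(0, I/2) and the answer can be checked. The dimension, the pCN parameter `beta`, and the run length are arbitrary.

```python
import numpy as np

def potential(phi):
    # Hypothetical V: with mu_G = N(0, I), e^(-V) mu_G is N(0, I/2).
    return 0.5 * np.dot(phi, phi)

def pcn(n_steps=10_000, dim=50, beta=0.2, seed=0):
    rng = np.random.default_rng(seed)
    phi = rng.normal(size=dim)               # start from a draw of the reference
    v = potential(phi)
    mean_sq = np.empty(n_steps)
    n_accept = 0
    for t in range(n_steps):
        xi = rng.normal(size=dim)            # fresh draw from the reference
        prop = np.sqrt(1.0 - beta**2) * phi + beta * xi
        v_prop = potential(prop)
        # The proposal is reversible with respect to mu_G, so only the
        # relative density e^(-V) enters the acceptance probability.
        if np.log(rng.uniform()) < v - v_prop:
            phi, v = prop, v_prop
            n_accept += 1
        mean_sq[t] = np.mean(phi**2)
    return mean_sq, n_accept / n_steps

mean_sq, rate = pcn()
estimate = mean_sq[2_000:].mean()            # discard burn-in
print(f"per-coordinate variance {estimate:.3f} (target 0.5), acceptance {rate:.2f}")
```

Note what is absent: the quadratic form of the reference never appears in the accept/reject step. Putting it there as well would be the double-counting trap described above.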
Stochastic quantization is field-space Langevin
Parisi and Wu's stochastic quantization is, in this framework, nothing exotic: it is the
infinite-dimensional row of the same table. Run a fictitious-time Langevin equation whose
invariant measure is intended to be the Euclidean field-theory measure:
∂_τ φ = −δS_E/δφ + √2 η
The gradient becomes a variational derivative, Brownian motion becomes space-time white noise
η, and "run the sampler to equilibrium" becomes a construction route for the measure itself.
But the formal equation only means something after specifying a regulator, a reference
measure, a state space, and a topology — and the continuum limit ε → 0 is
where renormalization enters.
In singular theories the naive drift is not even well-defined. For dynamical Φ⁴ in three
dimensions the field is distribution-valued and φ³ has no classical meaning; the correct
equation is a renormalized limit of cutoff equations with counterterms:
∂_τ φ_ε = Δφ_ε − m²φ_ε − λφ_ε³ + C_ε φ_ε + √2 η_ε
The structural point: the invariant measure and the stochastic dynamics must be
renormalized together. Hairer's regularity structures and the
Gubinelli-Imkeller-Perkowski paracontrolled calculus are the two major frameworks that make
such singular SPDEs meaningful — and extending them to manifolds and vector bundles
drags in the bundle connection, noise covariance, and curvature-dependent renormalization:
the infinite-dimensional echo of the finite-dimensional volume correction above.
Stationarity also has output beyond samples. If L generates the process and π is
invariant, then ∫ LF dπ = 0 for suitable test functions — an
integration-by-parts identity that, in field notation, reads
E_π[ δF/δφ(x) ] = E_π[ F(φ) · δS_E/δφ(x) ]
These are the Schwinger-Dyson identities: the field equations in expectation form. The
dynamics samples the measure; the stationarity of the measure is the correlation
structure of the quantum field theory.
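The identity can be sanity-checked in "zero dimensions", where the field is a single real number and the functional derivative is an ordinary derivative: E[F′(x)] = E[F(x) 𝒜′(x)]. The sketch below does this by quadrature for the double-well action of live model 01, with two test functions I picked arbitrarily (F = x and F = x³). It is a toy check, not a field-theory computation.

```python
import numpy as np

def action(x):
    return 2.0 * (x**2 - 1.0)**2 + 0.25 * x

def grad_action(x):
    return 8.0 * x * (x**2 - 1.0) + 0.25

# Normalized quadrature weights for pi(dx) = e^(-A(x)) dx / Z.
grid = np.linspace(-4.0, 4.0, 200_001)
weights = np.exp(-action(grid))
weights /= weights.sum()

def expect(values):
    return float(np.sum(values * weights))

# F(x) = x:    E[1]     should equal E[x A'(x)]
# F(x) = x^3:  E[3 x^2] should equal E[x^3 A'(x)]
lhs_1, rhs_1 = 1.0, expect(grid * grad_action(grid))
lhs_3, rhs_3 = expect(3.0 * grid**2), expect(grid**3 * grad_action(grid))

print(f"F = x   : {lhs_1:.6f} vs {rhs_1:.6f}")
print(f"F = x^3 : {lhs_3:.6f} vs {rhs_3:.6f}")
```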
Gauge fields and Haar reference measure
Gauge theory makes the reference-measure question unavoidable, because the state space stops
being a vector space. On a lattice, the configuration is a set of group-valued link variables
U_e ∈ G on edges, and the natural reference is product Haar
measure. The Wilson target is
π(dU) = Z⁻¹ e^(−S_W[U]) ∏_e dU_e. The base
measure here is not a detail bolted on after the action; it is the geometry of the
sample space, encoding which configurations exist and what "uniform" means on them.
In the continuum the state-space problem deepens: fields are connections, observables must
respect gauge symmetry, and the honest state space involves gauge orbits. The recent
rigorous milestones live exactly here — Chandra, Chevyrev, Hairer, and Shen constructed
the state space and Markov process for the 2D Yang-Mills Langevin dynamic, and built
local-in-time solutions and a canonical Markov process on gauge orbits for 3D
Yang-Mills-Higgs. In gauge theory the problem is not only the action: it is the state space,
the quotient geometry, the reference measure, the observable algebra, and the renormalized
dynamics, all at once.
Algorithms as transition-rule upgrades
Read through this lens, the standard algorithm zoo is one ladder, each rung a smarter proposal
wrapped in the same correction layer:
MH. The universal correction layer. Any proposal works if its density is
known relative to the correct reference measure.
MALA. Discretized Langevin step as proposal. Asymmetric because of the drift,
so the proposal-density ratio is mandatory.
pCN. Preserves a Gaussian reference exactly; only the relative density enters
the acceptance. Stable under mesh refinement.
HMC. Extends the state space with momenta and Liouville measure; geometry
proposes long moves, Metropolis keeps the target exact.
HMC deserves its own sentence: it was introduced in lattice field theory as hybrid
Monte Carlo, a molecular-dynamics-guided sampler — the symplectic expression of the
transition-rule philosophy. Use geometry to travel far; use Metropolis-Hastings to preserve
the exact measure.
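As a concrete instance of "travel far, then correct", here is a minimal HMC sketch on the one-dimensional double well from live model 01. The unit mass, the leapfrog step `eps`, the trajectory length `n_leap`, and the batch of parallel chains are all illustrative assumptions. The extended target is e^(−H) with H = 𝒜(x) + p²/2, taken against Liouville measure dx dp.

```python
import numpy as np

def action(x):
    return 2.0 * (x**2 - 1.0)**2 + 0.25 * x

def grad_action(x):
    return 8.0 * x * (x**2 - 1.0) + 0.25

def leapfrog(x, p, eps, n_leap):
    """Volume-preserving, time-reversible integrator for H = A(x) + p^2 / 2."""
    p = p - 0.5 * eps * grad_action(x)
    for i in range(n_leap):
        x = x + eps * p
        if i < n_leap - 1:
            p = p - eps * grad_action(x)
    p = p - 0.5 * eps * grad_action(x)
    return x, p

def hmc(n_chains=2000, n_iters=300, eps=0.1, n_leap=10, seed=0):
    rng = np.random.default_rng(seed)
    x = np.full(n_chains, 1.0)
    for _ in range(n_iters):
        p = rng.normal(size=n_chains)        # refresh momenta from N(0, 1)
        h_old = action(x) + 0.5 * p**2
        x_new, p_new = leapfrog(x, p, eps, n_leap)
        h_new = action(x_new) + 0.5 * p_new**2
        # Metropolis step on the energy error keeps the target exact.
        accept = np.log(rng.uniform(size=n_chains)) < h_old - h_new
        x = np.where(accept, x_new, x)
    return x

grid = np.linspace(-4.0, 4.0, 80_001)
w = np.exp(-action(grid))
predicted = w[grid < 0].sum() / w.sum()

x_final = hmc()
print(f"left share {np.mean(x_final < 0):.3f} (predicted {predicted:.3f})")
```

The acceptance step needs no proposal-density ratio here because the leapfrog map is reversible and preserves dx dp. That is a property to verify for whatever integrator is actually used, not one to assume.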
A lattice you can run
The notes end with a companion numerical program, and it would be against the spirit of this
blog not to just run it. Take two-dimensional scalar φ⁴ theory on a periodic lattice,
with lattice action S[φ] and target π(dφ) ∝ e^(−S[φ]) ∏ₓ dφₓ, which is an ordinary,
finite-dimensional measure. The transition rule below is the simplest one on the ladder: a
local Metropolis update, proposing a small change at one site and accepting with probability
min(1, e^(−ΔS)). The observables are magnetization
M = |Λ|⁻¹ Σₓ φₓ and its fluctuations; near the critical coupling the
correlation length blows up, the mass gap closes, and you can watch order-parameter domains
the size of the whole lattice breathe in and out. This simulation is not the continuum
theorem — but it is a fully controlled, finite-dimensional instance of every idea
above.
Live model 04 — Lattice φ⁴, the companion program
A quantum field measure under local Metropolis
A 96×96 periodic lattice, λ = 0.5, running thousands of
local Metropolis updates per frame. Amber is φ > 0, cyan is φ < 0.
Step m₀² across the phase transition: deep in the symmetric phase the field is short-range
noise; near the critical point (m₀² ≈ −1.27 for this λ) huge correlated domains
form and die; in the broken phase one sign wins and the magnetization trace at the bottom
locks on. Switching m₀² mid-run quenches the system live.
Reference measure: ∏ₓ dφₓ. Action: lattice φ⁴. Transition rule: local
Metropolis. Invariant measure: the lattice field theory. Away from criticality correlations
decay like e^(−mr) — estimating that decay is direct numerical contact with the idea of
a mass gap.
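For an offline version, here is a rough sketch of lattice φ⁴ under local Metropolis. Several things in it are assumptions rather than the demo's code. The normalization S = Σₓ [½ Σ_μ (φ_{x+μ} − φ_x)² + ½ m₀² φ_x² + (λ/4) φ_x⁴] is one common convention and may differ from the one used in the live model, so the critical m₀² quoted above need not carry over. Sites are updated in a checkerboard pattern (all even sites, then all odd sites) instead of one at a time, which is valid for a nearest-neighbour action on an even-sized lattice. The lattice size, step, sweep count, and cold start are arbitrary.

```python
import numpy as np

def local_action(phi_site, nn_sum, m0sq, lam):
    # Part of S that depends on one site, given the sum of its 4 neighbours
    # (assumed normalization, see the note above the code).
    return ((2.0 + 0.5 * m0sq) * phi_site**2
            - phi_site * nn_sum
            + 0.25 * lam * phi_site**4)

def sweep(phi, m0sq, lam, step, rng, masks):
    for mask in masks:                       # even sublattice, then odd
        nn_sum = (np.roll(phi, 1, 0) + np.roll(phi, -1, 0) +
                  np.roll(phi, 1, 1) + np.roll(phi, -1, 1))
        proposal = phi + step * rng.uniform(-1.0, 1.0, size=phi.shape)
        delta_s = (local_action(proposal, nn_sum, m0sq, lam)
                   - local_action(phi, nn_sum, m0sq, lam))
        accept = mask & (np.log(rng.uniform(size=phi.shape)) < -delta_s)
        phi = np.where(accept, proposal, phi)
    return phi

def run(m0sq, lam=0.5, size=16, n_sweeps=400, step=1.0, seed=0):
    """Return the magnetization M = mean(phi) after each sweep (size must be even)."""
    rng = np.random.default_rng(seed)
    ix, iy = np.indices((size, size))
    even = (ix + iy) % 2 == 0
    phi = np.ones((size, size))              # cold start
    mags = np.empty(n_sweeps)
    for s in range(n_sweeps):
        phi = sweep(phi, m0sq, lam, step, rng, (even, ~even))
        mags[s] = phi.mean()
    return mags

symmetric = run(m0sq=1.0)
broken = run(m0sq=-4.0)
print(f"symmetric phase: mean |M| = {np.abs(symmetric[-200:]).mean():.3f}")
print(f"broken phase:    mean  M  = {broken[-200:].mean():.3f}")
```

Deep in either phase the magnetization does what the panel shows: it hovers near zero for m₀² = 1 and locks onto one sign for m₀² = −4. Locating the critical point would take larger lattices, longer runs, and a finite-size analysis that this sketch does not attempt.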
The four-dimensional frontier
The framework is not a shortcut around constructive field theory; it's a way to state exactly
what must be constructed. For standard positive-coupling scalar φ⁴ in four dimensions, the
news is sobering: Aizenman and Duminil-Copin proved marginal triviality of the scaling limits
of critical 4D Ising-type and lattice-regularized φ⁴ models — the usual route does not
produce the interacting continuum theory one might hope for. Four-dimensional Yang-Mills is
different: asymptotic freedom means the ultraviolet problem is not the same triviality
mechanism, and a complete construction with a positive mass gap is, of course, a Clay
Millennium problem.
So read the ladder honestly: it's a route into the terrain, not a proof of the summit. What
it identifies is the full list of objects that have to be controlled — reference
measure, action, transition rule, invariant measure, continuum limit, renormalization, gauge
quotient, mass-gap observables. The guiding principle the whole way up:
Choose the reference measure. Choose the action. Construct dynamics that preserves
the resulting measure.
This principle does not solve the hard continuum problems by itself. It tells you exactly
where the hard problems live. And as the four panels above hopefully show, every rung of it
below the frontier is something you can actually watch run.
G. Parisi and Y.-S. Wu, "Perturbation theory without gauge fixing," Scientia
Sinica 24:483–496, 1981.
M. Hairer, "A theory of regularity structures," Inventiones Mathematicae
198:269–504, 2014.
M. Gubinelli, P. Imkeller, N. Perkowski, "Paracontrolled distributions and singular
PDEs," Forum of Mathematics, Pi 3:e6, 2015.
A. Chandra, I. Chevyrev, M. Hairer, H. Shen, "Langevin dynamic for the 2D Yang-Mills
measure," Publ. Math. IHES 136:1–147, 2022; and "Stochastic
quantisation of Yang-Mills-Higgs in 3D," arXiv:2201.03487, 2022.
S. L. Cotter, G. O. Roberts, A. M. Stuart, D. White, "MCMC methods for functions:
modifying old algorithms to make them faster," Statistical Science
28(3):424–446, 2013.
S. Duane, A. D. Kennedy, B. J. Pendleton, D. Roweth, "Hybrid Monte Carlo,"
Physics Letters B 195(2):216–222, 1987.
M. Aizenman and H. Duminil-Copin, "Marginal triviality of the scaling limits of
critical 4D Ising and φ⁴₄ models," Annals of Mathematics 194(1):163–235,
2021; corrigendum, 199(1):479, 2024.
M. Hairer and H. Singh, "Regularity structures on manifolds and vector bundles,"
arXiv:2308.05049, 2023.
A. Jaffe and E. Witten, "Quantum Yang-Mills theory," Clay Mathematics Institute
Millennium Problem statement, 2000.
K. G. Wilson, "Confinement of quarks," Physical Review D 10:2445–2459,
1974.
M. Girolami and B. Calderhead, "Riemann manifold Langevin and Hamiltonian Monte Carlo
methods," JRSS B 73(2):123–214, 2011.