Transition Rules as Measure Constructors

Reference measures, Langevin dynamics, and stochastic quantization — one equation that runs from coin-flip samplers to the Yang-Mills frontier, with four live models you can poke.

I've been writing up a set of working notes on a single organizing idea, and this post is the interactive version. The idea is almost embarrassingly simple to state: a sampler is a machine for constructing a measure. Choose a state space. Choose a reference measure on it. Choose a dimensionless action relative to that reference measure. Then build a transition rule that preserves the resulting target. That one discipline runs unbroken from textbook Metropolis-Hastings all the way to stochastic quantization of gauge theory — and most of the famous subtleties along the way are what happens when one of those choices is made sloppily.

This post does not claim a new construction of an interacting continuum quantum field theory, and it doesn't introduce a new algorithm. The contribution is organizational: one framework, one equation, and — because this is my blog and not a journal — four live simulations so you can watch the measures get built.

The one equation

The central object is not a bare density but a measure:

π(dΨ) = Z⁻¹ e^(−𝒜(Ψ)) μ₀(dΨ)

Here Ψ is a state (a particle configuration, a parameter vector, a lattice field, a set of group-valued links), μ₀ is an explicit reference measure on the state space, and 𝒜 is a dimensionless action. For a thermal system, 𝒜 = βH. For Euclidean field theory, 𝒜 = S_E in units where ℏ = 1.

The point that everything else hangs on: a density is not invariant data until the reference measure is specified. In Euclidean parameter space the reference is Lebesgue measure. In canonical phase space it's Liouville measure. On a Riemannian manifold it's Riemannian volume. On a lattice scalar field it's a product measure over sites; on a lattice gauge field, product Haar measure over links; in function-space sampling, often a Gaussian. These choices are not cosmetic. They determine the actual probability law being sampled — and demo 3 below makes that difference something you can see.

A transition rule is a Markov kernel K(Ψ, dΦ). The measure π is stationary if πK = π: feed the kernel a state drawn from π and it hands back a state drawn from π. Convergence to π from an arbitrary start is a stronger statement and needs extra assumptions (irreducibility, aperiodicity, suitable recurrence, non-explosion). The framework's job is to make stationarity itself routine to verify.

Metropolis-Hastings as a measure-preserving rule

The universal correction layer. Draw a proposal Φ from a kernel with density q(Φ | Ψ) with respect to the same reference measure μ₀, and accept it with probability

a(Ψ, Φ) = min[ 1, e^(−[𝒜(Φ)−𝒜(Ψ)]) · q(Ψ|Φ) / q(Φ|Ψ) ]

If rejected, stay put. The normalizing constant Z cancels from the ratio — the practical superpower of the whole method: you never need to know it. The resulting kernel satisfies detailed balance with respect to π, which implies stationarity (though detailed balance is sufficient, not necessary — non-reversible chains can preserve the same target while carrying equilibrium probability currents, sometimes deliberately, for faster mixing).

One correction worth making loudly: the folklore summary, "accept downhill moves automatically, uphill moves with probability e^(−Δ𝒜)," is only true when the proposal is symmetric. For an asymmetric proposal, a downhill move can be rejected and an uphill move can be accepted with probability one, if the proposal ratio says so. Cancellation is something you verify, not something you assume.
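A tiny numeric illustration of that point, as a sketch. The function name `mh_accept_prob` and the proposal densities 0.9 and 0.1 are made-up values chosen only to make the asymmetry visible.

```python
import math

def mh_accept_prob(action_old, action_new, q_forward, q_reverse):
    """Metropolis-Hastings acceptance probability.

    q_forward = q(new | old) and q_reverse = q(old | new), both taken
    as densities with respect to the same reference measure.
    """
    ratio = math.exp(-(action_new - action_old)) * q_reverse / q_forward
    return min(1.0, ratio)

# Symmetric proposal: the folklore rule holds.
downhill_sym = mh_accept_prob(1.0, 0.0, q_forward=0.5, q_reverse=0.5)
uphill_sym = mh_accept_prob(0.0, 1.0, q_forward=0.5, q_reverse=0.5)

# Asymmetric proposal (hypothetical densities): a downhill move that is
# hard to reverse can be rejected...
downhill_asym = mh_accept_prob(1.0, 0.0, q_forward=0.9, q_reverse=0.1)
# ...and an uphill move that is easy to reverse is always accepted.
uphill_asym = mh_accept_prob(0.0, 1.0, q_forward=0.1, q_reverse=0.9)
```

With these numbers the downhill move is accepted only about 30% of the time, while the uphill move is accepted with probability one.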

Live model 01 — Metropolis-Hastings

Watching a measure get built

A single walker samples a tilted double-well action 𝒜(x) = 2(x²−1)² + 0.25x with a symmetric Gaussian proposal. Top: the action landscape and the walker, with each proposal flashed cyan (accepted) or red (rejected). Bottom: the histogram of visited states converging onto the exact target density e^(−𝒜) drawn in white. The walker starts in the shallower well — watch the measure pull it across. Try the timid and wild step sizes to see why tuning matters: both preserve the target, but mixing is another story.

The acceptance rule min(1, e^(−Δ𝒜)) needs no normalizing constant. Timid steps accept nearly everything and explore nothing; wild steps mostly reject. Either way the stationary measure is the same — the chain just takes longer to reveal it.
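For readers who want the same experiment without the animation, here is a minimal sketch of a random-walk Metropolis chain on the action quoted above. It is not the demo's source code; the function names, the default step sizes, and the quadrature bounds are my own assumptions.

```python
import math
import random

def action(x):
    """Tilted double-well action A(x) = 2(x^2 - 1)^2 + 0.25 x."""
    t = x * x - 1.0
    return 2.0 * t * t + 0.25 * x

def metropolis_chain(n_steps, step_size, x0=1.0, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal.

    Returns the visited states and the acceptance rate. The walker starts
    at x0 = 1, in the shallower (right) well.
    """
    rng = random.Random(seed)
    x, a = x0, action(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step_size)
        a_new = action(y)
        # Symmetric proposal, so the ratio is just exp(-(A(y) - A(x))).
        if rng.random() < math.exp(min(0.0, a - a_new)):
            x, a = y, a_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps

def exact_left_mass(lo=-4.0, hi=4.0, n=8000):
    """Share of exp(-A) sitting at x < 0, by midpoint quadrature."""
    h = (hi - lo) / n
    total = left = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w = math.exp(-action(x))
        total += w
        if x < 0.0:
            left += w
    return left / total

def left_fraction(samples):
    return sum(1 for x in samples if x < 0.0) / len(samples)
```

Calling `metropolis_chain(200000, 0.8)` and comparing `left_fraction` of the result with `exact_left_mass()` is the text-mode version of watching the histogram settle onto the white curve. A step size of 0.05 plays the timid walker and 5.0 the wild one.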

Euclidean Langevin dynamics

The continuous-time counterpart in ℝ^d is the overdamped Langevin diffusion:

dXₜ = −∇𝒜(Xₜ) dt + √2 dBₜ

The drift transports probability downhill in action; the diffusion spreads it. The Fokker-Planck equation is ∂_t ρ = ∇·(ρ∇𝒜 + ∇ρ), and the stationary density with respect to Lebesgue measure is ρ = Z⁻¹ e^(−𝒜). The verification is a one-line zero-current calculation: at this ρ, ∇ρ = −ρ∇𝒜, so the drift flux and the diffusive flux cancel exactly. The stationary state is not "everything at the minimum"; it is the distribution where downhill transport and diffusive spreading balance.
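Written out, the zero-current calculation looks like this. The symbol J for the probability current is my notation, not something defined earlier in the post.

```latex
\begin{aligned}
\partial_t \rho &= -\nabla\cdot J(\rho),
  \qquad J(\rho) = -\bigl(\rho\,\nabla\mathcal{A} + \nabla\rho\bigr),\\
\rho = Z^{-1}e^{-\mathcal{A}}
  \;&\Longrightarrow\; \nabla\rho = -\rho\,\nabla\mathcal{A},\\
J(\rho) &= -\bigl(\rho\,\nabla\mathcal{A} - \rho\,\nabla\mathcal{A}\bigr) = 0 .
\end{aligned}
```

The current vanishes pointwise, which is stronger than having zero divergence: this diffusion is reversible with respect to its stationary measure.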

That distinction (optimization finds minima, Langevin finds measures) is exactly what the noise toggle in the next demo shows. Kill the noise and the particle cloud collapses into the two minima and freezes. Restore it and the cloud relaxes back to e^(−𝒜), with the deeper well holding predictably more mass.

Live model 02 — Langevin diffusion

Drift versus noise, fighting to a draw

Six hundred independent particles integrate dX = −∇𝒜 dt + √2 dB in a two-dimensional tilted double-well. The faint red shading is the exact target e^(−𝒜); the HUD compares the measured share of particles in the deeper left well against the value predicted by integrating the measure. Respawn the whole cloud in the shallow right well and watch it relax to equilibrium — then switch the noise off and watch sampling degenerate into gradient descent.

The stationary cloud is the configuration in which downhill drift and diffusive spreading cancel as probability fluxes. With noise off, the same drift is just an optimizer: every particle finds a minimum and dies there.
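The noise toggle is easy to reproduce offline. The sketch below is a one-dimensional stand-in for the panel, assuming the same tilted double-well as model 01 rather than the demo's two-dimensional landscape, and a plain Euler-Maruyama discretization. The names `evolve` and `left_fraction` are mine.

```python
import math
import random

def grad_action(x):
    """Derivative of A(x) = 2(x^2 - 1)^2 + 0.25 x."""
    return 8.0 * x * (x * x - 1.0) + 0.25

def evolve(xs, n_steps, dt, noise, rng):
    """Euler-Maruyama steps of dX = -A'(X) dt + sqrt(2) dB for each particle.

    With noise=False the same update is plain gradient descent.
    """
    amp = math.sqrt(2.0 * dt) if noise else 0.0
    for _ in range(n_steps):
        xs = [x - grad_action(x) * dt + amp * rng.gauss(0.0, 1.0) for x in xs]
    return xs

def left_fraction(xs):
    """Share of particles in the deeper (left) well."""
    return sum(1 for x in xs if x < 0.0) / len(xs)
```

Starting 400 particles at x = 1 and running `evolve(cloud, 1600, 0.005, True, rng)` should leave roughly 60% of them in the left well, up to sampling noise and a small step-size bias. Feeding that cloud back in with `noise=False` drives every particle onto one of the two minima, where the gradient vanishes.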

Riemannian Langevin and the volume correction

Move the state space to a Riemannian manifold (𝒮, g) and the reference-measure principle stops being pedantry and starts changing answers. Intrinsic manifold Langevin dynamics is best defined generator-first, as Lf = −⟨grad_g 𝒜, grad_g f⟩_g + Δ_g f with Δ_g the Laplace-Beltrami operator, and it samples e^(−𝒜) with respect to Riemannian volume, not coordinate volume. In a chart,

dvol_g(x) = √|g(x)| dx     ⟹     p(x) = Z⁻¹ e^(−𝒜(x)) √|g(x)|

Whenever |g| is not constant, those are two different target measures, full stop. The Ito form of the intrinsic diffusion carries an extra drift term, |g|^(−1/2) ∂_j(√|g| g^{ij}), equivalently −g^{jk} Γ^i_{jk}, and dropping it (using g⁻¹ as a mere preconditioner in a Euclidean-style update) generally breaks the stationary measure. If you actually want the coordinate-density target, you can have it: add ½ log |g| to the action and sample that intrinsically. The framework doesn't forbid either target; it forces you to say which one you mean.
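As a worked instance of the extra drift term, take the unit sphere in coordinates (θ, φ) with 𝒜 = 0. The symbol b^i for the correction drift is my shorthand.

```latex
\begin{aligned}
g &= \begin{pmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{pmatrix},
\qquad \sqrt{|g|} = \sin\theta,
\qquad g^{\theta\theta} = 1,\quad g^{\varphi\varphi} = \frac{1}{\sin^2\theta},\\[4pt]
b^{\theta} &= \frac{1}{\sin\theta}\,\partial_\theta\bigl(\sin\theta \cdot 1\bigr) = \cot\theta,
\qquad
b^{\varphi} = \frac{1}{\sin\theta}\,\partial_\varphi\!\left(\frac{\sin\theta}{\sin^2\theta}\right) = 0,\\[4pt]
-g^{jk}\Gamma^{\theta}_{jk} &= -g^{\varphi\varphi}\,\Gamma^{\theta}_{\varphi\varphi}
 = -\frac{1}{\sin^2\theta}\,\bigl(-\sin\theta\cos\theta\bigr) = \cot\theta .
\end{aligned}
```

So Brownian motion on the sphere has dθ = cot θ dt + √2 dB in the polar angle, and the stationary density of θ is proportional to sin θ. Drop the cot θ drift and θ diffuses toward a flat coordinate density instead.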

The cleanest place to see the distinction needs no dynamics at all — just the sphere, where the metric factor is sin θ.

Live model 03 — Reference measures

"Uniform" is not a density. It's a measure.

Both modes below sample with the same constant density, "1" — the same 𝒜 = 0 — but against two different reference measures on the sphere. Coordinate measure dθ dφ piles points up at the poles, because equal intervals of θ near the pole own almost no area. Riemannian volume sin θ dθ dφ spreads them uniformly over the surface. Same density. Different measure. Different physics.

For a truly uniform surface distribution, 13.4% of points should fall within 30° of either pole (1 − cos 30° ≈ 0.134). Under the coordinate measure it's 33.3% (60° out of the 180° range of θ). The √|g| = sin θ volume factor is the entire difference, and the same factor, in higher dimensions, is the Riemannian Langevin correction term.
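Those two percentages are easy to check by direct sampling. In this sketch only the polar angle θ matters, since φ is uniform in both cases; the function names are illustrative.

```python
import math
import random

def theta_coordinate(rng):
    """Constant density against d(theta): theta uniform on [0, pi]."""
    return rng.uniform(0.0, math.pi)

def theta_riemannian(rng):
    """Constant density against sin(theta) d(theta): cos(theta) uniform on [-1, 1]."""
    return math.acos(rng.uniform(-1.0, 1.0))

def polar_cap_fraction(sample_theta, n, cap_deg=30.0, seed=0):
    """Fraction of n sampled points within cap_deg of either pole."""
    rng = random.Random(seed)
    cap = math.radians(cap_deg)
    hits = 0
    for _ in range(n):
        theta = sample_theta(rng)
        if theta < cap or theta > math.pi - cap:
            hits += 1
    return hits / n
```

`polar_cap_fraction(theta_coordinate, 100000)` should come out near 1/3 and `polar_cap_fraction(theta_riemannian, 100000)` near 0.134, up to Monte Carlo noise.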

The same discipline governs discretization. An unadjusted Langevin step has a stationary distribution that's biased by an amount controlled by the step size; the Metropolis-adjusted version (MALA) restores exactness by treating each step as a proposal — but only if the proposal density entering the acceptance ratio is the density of the move actually being simulated, with all asymmetry, Jacobian, and volume terms included, computed against one consistent reference measure. Geometric proposals built from the exponential map are the same story: exact geodesic random walks from isotropic tangent noise can be genuinely symmetric with respect to Riemannian volume, but non-radial covariance, approximate retractions, coordinate-expressed densities, boundaries, and cut-locus preimages all break the symmetry. Adding an accept/reject step does not by itself fix your target.
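A minimal MALA sketch on the one-dimensional double-well from model 01 shows where the proposal-density ratio enters. It assumes flat space with Lebesgue reference, so no Jacobian or volume terms appear; the step size h and all function names are my choices.

```python
import math
import random

def action(x):
    t = x * x - 1.0
    return 2.0 * t * t + 0.25 * x

def grad_action(x):
    return 8.0 * x * (x * x - 1.0) + 0.25

def log_q(y, x, h):
    """Log density, up to a constant, of the Langevin proposal
    y | x ~ Normal(x - h * A'(x), variance 2h)."""
    mean = x - h * grad_action(x)
    return -((y - mean) ** 2) / (4.0 * h)

def mala_chain(n_steps, h=0.05, x0=1.0, seed=0):
    """Metropolis-adjusted Langevin: one Euler-Maruyama step as the proposal."""
    rng = random.Random(seed)
    x = x0
    samples, accepted = [], 0
    for _ in range(n_steps):
        y = x - h * grad_action(x) + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
        # The drift makes the proposal asymmetric, so both directions of q appear.
        log_alpha = -(action(y) - action(x)) + log_q(x, y, h) - log_q(y, x, h)
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps
```

Deleting the two `log_q` terms turns this into a different, incorrect kernel: the chain still runs and still accepts and rejects, but it no longer targets e^(−𝒜).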

Field configurations and reference measures

Euclidean field theory writes its target as π(dφ) ∝ e^(−S_E[φ]) 𝒟φ, and the formal symbol 𝒟φ is exactly where the framework earns its keep: a continuum path-integral measure is not automatically a well-defined object. It is given meaning through one of a few explicit constructions:

  • Lattice product measure. On a finite lattice, the field is just a high-dimensional vector, the reference is ∏ₓ dφₓ, and Metropolis, MALA, or HMC apply directly. This is an ordinary probability measure with no philosophical asterisk.
  • Gaussian reference measure. In infinite dimensions there is no translation-invariant Lebesgue measure. One starts instead from a Gaussian free-field measure μ_G and defines the interacting theory by a Radon-Nikodym density: π(dφ) = Z⁻¹ e^(−V(φ)) μ_G(dφ). And here's a trap the equation guards against: if μ_G already contains the free quadratic part of the action, writing the full action in the exponential double-counts it. The reference measure determines what belongs in the action.

This is also why function-space MCMC works the way it does. Preconditioned Crank-Nicolson proposals are engineered to preserve the Gaussian reference exactly, so only the relative density enters the accept/reject step — which is why pCN survives mesh refinement that kills naive random-walk proposals. The algorithm respects the reference measure instead of fighting it.
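Here is a finite-dimensional sketch of pCN, assuming a standard Gaussian reference N(0, I) as a stand-in for the free-field measure and a toy potential that touches only the first coordinate. The function names and the value of β are illustrative.

```python
import math
import random

def pcn_chain(potential, dim, n_steps, beta, seed=0):
    """Preconditioned Crank-Nicolson for pi(dx) ∝ exp(-V(x)) N(0, I)(dx).

    The proposal y = sqrt(1 - beta^2) x + beta * xi preserves N(0, I)
    exactly, so only the potential V appears in the acceptance ratio.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    v = potential(x)
    shrink = math.sqrt(1.0 - beta * beta)
    samples = []
    for _ in range(n_steps):
        y = [shrink * xi + beta * rng.gauss(0.0, 1.0) for xi in x]
        v_new = potential(y)
        # No quadratic term here: the Gaussian reference already carries it.
        if rng.random() < math.exp(min(0.0, v - v_new)):
            x, v = y, v_new
        samples.append(x)
    return samples

def toy_potential(x):
    """Hypothetical interaction acting on the first coordinate only."""
    return 0.5 * x[0] * x[0]
```

With `toy_potential`, the first coordinate should end up with variance 1/2 (reference times e^(−V)) while every other coordinate keeps variance 1. Putting the Gaussian's own quadratic term into `potential` as well would be exactly the double-counting trap described above.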

Stochastic quantization is field-space Langevin

Parisi and Wu's stochastic quantization is, in this framework, nothing exotic: it is the infinite-dimensional row of the same table. Run a fictitious-time Langevin equation whose invariant measure is intended to be the Euclidean field-theory measure:

∂_τ φ = −δS_E/δφ + √2 η

The gradient becomes a variational derivative, Brownian motion becomes space-time white noise η, and "run the sampler to equilibrium" becomes a construction route for the measure itself. But the formal equation only means something after specifying a regulator, a reference measure, a state space, and a topology — and the continuum limit ε → 0 is where renormalization enters.

In singular theories the naive drift is not even well-defined. For dynamical Φ⁴ in three dimensions the field is distribution-valued and φ³ has no classical meaning; the correct equation is a renormalized limit of cutoff equations with counterterms:

∂_τ φ_ε = Δφ_ε − m²φ_ε − λφ_ε³ + C_ε φ_ε + √2 η_ε

The structural point: the invariant measure and the stochastic dynamics must be renormalized together. Hairer's regularity structures and the Gubinelli-Imkeller-Perkowski paracontrolled calculus are the two major frameworks that make such singular SPDEs meaningful — and extending them to manifolds and vector bundles drags in the bundle connection, noise covariance, and curvature-dependent renormalization: the infinite-dimensional echo of the finite-dimensional volume correction above.

Stationarity also has output beyond samples. If L generates the process and π is invariant, then ∫ LF dπ = 0 for suitable test functions — an integration-by-parts identity that, in field notation, reads

E_π[ δF/δφ(x) ] = E_π[ F(φ) · δS_E/δφ(x) ]

These are the Schwinger-Dyson identities: the field equations in expectation form. The dynamics samples the measure; the stationarity of the measure is the correlation structure of the quantum field theory.
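The identity can be checked numerically in the simplest possible setting: a single real variable, which is a zero-dimensional "field theory". The action, the test function, and the quadrature bounds below are my choices for illustration.

```python
import math

def action(x):
    """Toy single-site action S(x) = x^2/2 + x^4/4."""
    return 0.5 * x * x + 0.25 * x ** 4

def d_action(x):
    """S'(x) = x + x^3."""
    return x + x ** 3

def expectation(f, lo=-8.0, hi=8.0, n=20000):
    """E_pi[f] for pi ∝ exp(-S) dx, by midpoint quadrature."""
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w = math.exp(-action(x))
        num += f(x) * w
        den += w
    return num / den

# Schwinger-Dyson with F(x) = x^3:  E[F'(x)] = E[F(x) S'(x)].
lhs = expectation(lambda x: 3.0 * x * x)
rhs = expectation(lambda x: x ** 3 * d_action(x))
```

The two sides agree to quadrature precision, which is just integration by parts against e^(−S). Choosing F(x) = x instead gives the relation E[x² + x⁴] = 1 between the second and fourth moments.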

Gauge fields and Haar reference measure

Gauge theory makes the reference-measure question unavoidable, because the state space stops being a vector space. On a lattice, the configuration is a set of group-valued link variables U_e ∈ G on edges, and the natural reference is product Haar measure: the Wilson target is π(dU) = Z⁻¹ e^(−S_W[U]) ∏_e dU_e. The base measure here is not a detail bolted on after the action; it is the geometry of the sample space, encoding which configurations exist and what "uniform" means on them.
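To make the Haar reference concrete, here is a sketch for G = SU(2), where Haar measure is the uniform measure on unit quaternions. It uses a toy single-link action S(U) = −β · ½ tr U, which is an assumption for illustration and not the Wilson plaquette action; all names are mine.

```python
import math
import random

def haar_su2(rng):
    """Haar-distributed SU(2) element as a unit quaternion (a0, a1, a2, a3),
    where U = a0 * I + i * (a1, a2, a3) . sigma and (1/2) tr U = a0."""
    while True:
        q = [rng.gauss(0.0, 1.0) for _ in range(4)]
        norm = math.sqrt(sum(c * c for c in q))
        if norm > 0.0:
            return tuple(c / norm for c in q)

def single_link_chain(beta, n_steps, seed=0):
    """Independence Metropolis for pi(dU) ∝ exp(beta * (1/2) tr U) dU.

    Proposals are drawn from Haar measure itself, so the proposal density
    relative to the reference is constant and drops out of the ratio.
    """
    rng = random.Random(seed)
    u = haar_su2(rng)
    half_traces = []
    for _ in range(n_steps):
        v = haar_su2(rng)
        delta_s = -beta * (v[0] - u[0])
        if rng.random() < math.exp(min(0.0, -delta_s)):
            u = v
        half_traces.append(u[0])
    return half_traces

def exact_half_trace(beta, n=20000):
    """E[(1/2) tr U] by quadrature: a0 = cos(alpha), Haar weight sin^2(alpha)."""
    h = math.pi / n
    num = den = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h
        w = math.exp(beta * math.cos(alpha)) * math.sin(alpha) ** 2
        num += math.cos(alpha) * w
        den += w
    return num / den
```

The sin² weight in `exact_half_trace` is the Haar measure showing up in coordinates: the same constant-density-versus-reference-measure point as on the sphere, one dimension up.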

In the continuum the state-space problem deepens: fields are connections, observables must respect gauge symmetry, and the honest state space involves gauge orbits. The recent rigorous milestones live exactly here — Chandra, Chevyrev, Hairer, and Shen constructed the state space and Markov process for the 2D Yang-Mills Langevin dynamic, and built local-in-time solutions and a canonical Markov process on gauge orbits for 3D Yang-Mills-Higgs. In gauge theory the problem is not only the action: it is the state space, the quotient geometry, the reference measure, the observable algebra, and the renormalized dynamics, all at once.

Algorithms as transition-rule upgrades

Read through this lens, the standard algorithm zoo is one ladder, each rung a smarter proposal wrapped in the same correction layer:

  • MH. The universal correction layer. Any proposal works if its density is known relative to the correct reference measure.
  • MALA. Discretized Langevin step as proposal. Asymmetric because of the drift, so the proposal-density ratio is mandatory.
  • pCN. Preserves a Gaussian reference exactly; only the relative density enters the acceptance. Stable under mesh refinement.
  • HMC. Extends the state space with momenta and Liouville measure; geometry proposes long moves, Metropolis keeps the target exact.

HMC deserves its own sentence: it was introduced in lattice field theory as hybrid Monte Carlo, a molecular-dynamics-guided sampler — the symplectic expression of the transition-rule philosophy. Use geometry to travel far; use Metropolis-Hastings to preserve the exact measure.
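A compact sketch of that recipe on the one-dimensional double-well from model 01, assuming unit mass and a standard Gaussian momentum. The step size, trajectory length, and function names are my choices, not anything prescribed by the original hybrid Monte Carlo paper.

```python
import math
import random

def action(x):
    t = x * x - 1.0
    return 2.0 * t * t + 0.25 * x

def grad_action(x):
    return 8.0 * x * (x * x - 1.0) + 0.25

def leapfrog(x, p, eps, n_leap):
    """Leapfrog integration of H(x, p) = A(x) + p^2 / 2.

    The map is volume-preserving and reversible under p -> -p.
    """
    p -= 0.5 * eps * grad_action(x)
    for step in range(n_leap):
        x += eps * p
        if step < n_leap - 1:
            p -= eps * grad_action(x)
    p -= 0.5 * eps * grad_action(x)
    return x, p

def hmc_chain(n_steps, eps=0.1, n_leap=10, x0=1.0, seed=0):
    """HMC: fresh momentum, leapfrog trajectory, Metropolis test on H."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        p = rng.gauss(0.0, 1.0)
        h_old = action(x) + 0.5 * p * p
        y, p_new = leapfrog(x, p, eps, n_leap)
        h_new = action(y) + 0.5 * p_new * p_new
        # Accept or reject on the energy error of the integrator.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = y
        samples.append(x)
    return samples
```

Because leapfrog preserves Liouville measure and is reversible, the acceptance ratio needs only the change in H. An integrator without those two properties would need extra Jacobian or proposal-ratio terms to keep the target exact.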

A lattice you can run

The notes end with a companion numerical program, and it would be against the spirit of this blog not to just run it. Take two-dimensional scalar φ⁴ theory on a periodic lattice:

S[φ] = Σₓ [ ½ Σ_μ (φ_{x+μ̂} − φₓ)² + (m₀²/2) φₓ² + λ φₓ⁴ ]

with target π(dφ) ∝ e^(−S[φ]) ∏ₓ dφₓ, an ordinary, finite-dimensional measure. The transition rule below is the simplest one on the ladder: a local Metropolis update, proposing a small change at one site and accepting with probability min(1, e^(−ΔS)). The observables are the magnetization M = |Λ|⁻¹ Σₓ φₓ and its fluctuations; near the critical coupling the correlation length blows up, the mass gap closes, and you can watch order-parameter domains the size of the whole lattice breathe in and out. This simulation is not the continuum theorem, but it is a fully controlled, finite-dimensional instance of every idea above.

Live model 04 — Lattice φ⁴, the companion program

A quantum field measure under local Metropolis

A 96×96 periodic lattice, λ = 0.5, running thousands of local Metropolis updates per frame. Amber is φ > 0, cyan is φ < 0. Step m₀² across the phase transition: deep in the symmetric phase the field is short-range noise; near the critical point (m₀² ≈ −1.27 for this λ) huge correlated domains form and die; in the broken phase one sign wins and the magnetization trace at the bottom locks on. Switching m₀² mid-run quenches the system live.

Reference measure: ∏ₓ dφₓ. Action: lattice φ⁴. Transition rule: local Metropolis. Invariant measure: the lattice field theory. Away from criticality correlations decay like e^(−mr) — estimating that decay is direct numerical contact with the idea of a mass gap.
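For anyone who wants to run this outside the browser, here is a small pure-Python sketch of the same local Metropolis rule. It is not the panel's source code: the lattice size, proposal width, and function names are my assumptions, and it is far slower than the live demo.

```python
import math
import random

def total_action(field, m0sq, lam):
    """Lattice phi^4 action on a periodic L x L grid (each bond counted once)."""
    L = len(field)
    s = 0.0
    for i in range(L):
        for j in range(L):
            p = field[i][j]
            right, down = field[i][(j + 1) % L], field[(i + 1) % L][j]
            s += 0.5 * ((right - p) ** 2 + (down - p) ** 2)
            s += 0.5 * m0sq * p * p + lam * p ** 4
    return s

def neighbour_sum(field, i, j):
    L = len(field)
    return (field[(i + 1) % L][j] + field[(i - 1) % L][j]
            + field[i][(j + 1) % L] + field[i][(j - 1) % L])

def local_action(p, nn_sum, m0sq, lam):
    """All terms of S that depend on one site value p, given its neighbours."""
    return (2.0 + 0.5 * m0sq) * p * p - p * nn_sum + lam * p ** 4

def metropolis_sweep(field, m0sq, lam, step, rng):
    """L*L single-site updates with a symmetric uniform proposal."""
    L = len(field)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        old = field[i][j]
        new = old + rng.uniform(-step, step)
        nn = neighbour_sum(field, i, j)
        d_s = local_action(new, nn, m0sq, lam) - local_action(old, nn, m0sq, lam)
        if rng.random() < math.exp(min(0.0, -d_s)):
            field[i][j] = new

def run_lattice(L, m0sq, lam, n_sweeps, step=1.0, start=0.0, seed=0):
    """Run n_sweeps from a constant start and return the final magnetization."""
    rng = random.Random(seed)
    field = [[start] * L for _ in range(L)]
    for _ in range(n_sweeps):
        metropolis_sweep(field, m0sq, lam, step, rng)
    return sum(sum(row) for row in field) / (L * L)
```

Only the terms of S that touch the updated site enter ΔS, which is what makes the update local and cheap. Deep in the broken phase (say m₀² = −4 at λ = 0.5) an ordered start stays ordered, while at m₀² = +1 the magnetization hovers near zero.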

The four-dimensional frontier

The framework is not a shortcut around constructive field theory; it's a way to state exactly what must be constructed. For standard positive-coupling scalar φ⁴ in four dimensions, the news is sobering: Aizenman and Duminil-Copin proved marginal triviality of the scaling limits of critical 4D Ising-type and lattice-regularized φ⁴ models — the usual route does not produce the interacting continuum theory one might hope for. Four-dimensional Yang-Mills is different: asymptotic freedom means the ultraviolet problem is not the same triviality mechanism, and a complete construction with a positive mass gap is, of course, a Clay Millennium problem.

So read the ladder honestly: it's a route into the terrain, not a proof of the summit. What it identifies is the full list of objects that have to be controlled — reference measure, action, transition rule, invariant measure, continuum limit, renormalization, gauge quotient, mass-gap observables. The guiding principle the whole way up:

Choose the reference measure. Choose the action. Construct dynamics that preserves the resulting measure.

This principle does not solve the hard continuum problems by itself. It tells you exactly where the hard problems live. And as the four panels above hopefully show, every rung of it below the frontier is something you can actually watch run.

  1. G. Parisi and Y.-S. Wu, "Perturbation theory without gauge fixing," Scientia Sinica 24:483–496, 1981.
  2. M. Hairer, "A theory of regularity structures," Inventiones Mathematicae 198:269–504, 2014.
  3. M. Gubinelli, P. Imkeller, N. Perkowski, "Paracontrolled distributions and singular PDEs," Forum of Mathematics, Pi 3:e6, 2015.
  4. A. Chandra, I. Chevyrev, M. Hairer, H. Shen, "Langevin dynamic for the 2D Yang-Mills measure," Publ. Math. IHES 136:1–147, 2022; and "Stochastic quantisation of Yang-Mills-Higgs in 3D," arXiv:2201.03487, 2022.
  5. S. L. Cotter, G. O. Roberts, A. M. Stuart, D. White, "MCMC methods for functions: modifying old algorithms to make them faster," Statistical Science 28(3):424–446, 2013.
  6. S. Duane, A. D. Kennedy, B. J. Pendleton, D. Roweth, "Hybrid Monte Carlo," Physics Letters B 195(2):216–222, 1987.
  7. M. Aizenman and H. Duminil-Copin, "Marginal triviality of the scaling limits of critical 4D Ising and φ⁴₄ models," Annals of Mathematics 194(1):163–235, 2021; corrigendum, 199(1):479, 2024.
  8. M. Hairer and H. Singh, "Regularity structures on manifolds and vector bundles," arXiv:2308.05049, 2023.
  9. A. Jaffe and E. Witten, "Quantum Yang-Mills theory," Clay Mathematics Institute Millennium Problem statement, 2000.
  10. K. G. Wilson, "Confinement of quarks," Physical Review D 10:2445–2459, 1974.
  11. M. Girolami and B. Calderhead, "Riemann manifold Langevin and Hamiltonian Monte Carlo methods," JRSS B 73(2):123–214, 2011.
