Mixing, Expressivity, and the Barren-Plateau Bridge

The thermodynamic trainability quantity $Q$ dies the way a quantum loss landscape goes barren — same shape, different machine — and saying exactly how far that analogy reaches is the whole point of this entry.

This is a synthesis entry, not a new result. It assembles two artifacts — synthesis/trainability-theorem.md (the corollary block) and synthesis/parent-bridges.md (the cross-parent map) — into one statement: the Mixing–Expressivity Tradeoff (MET) is a product of independently-fatal factors, and that product is the EBM cousin of the quantum barren plateau. Here we keep the claim-status discipline visible.

The question

Given the factorization $Q \asymp (\gamma K / 2)\cdot R(\theta)$ — gradient SNR-squared as effective sample size times signal-to-slow-fluctuation ratio — when does $Q \to 0$ ? And: is "the gradient landscape went flat" the same failure that quantum machine learning calls a barren plateau, or only a rhyme?

The setup / method

We work from one reverse-process EBM layer of a Denoising Thermodynamic Model, sampled by a reversible Gibbs kernel $P_\theta$ . The operational object is $Q_{op} := \|g\|^2 / E\|\hat g - g\|^2$ — true-gradient power over estimator MSE. The corollary reads the factorization as a Ragone-shaped expression: $Q \to 0$ (the plateau) iff any factor collapses.

The result — three independently-fatal factors

The MET, stated as a theorem-shape, is the product of:

$\gamma \to 0$ exponentially — spectral-gap collapse, $\sigma_2 \to 1$ . This is the MET named as a mechanism: a monolithic EBM expressive enough to fit the data becomes exponentially slow to sample (the DTM paper's core motivation, §I–II + App. B). Quantum analogue: exponential $\dim(\mathfrak{g})$ blow-up (reachability/expressivity), where $\mathfrak{g}$ is the QML dynamical Lie algebra — not the gradient $g$ .
$R \to 0$ — the gradient signal is swamped by the equilibrium fluctuations of $\partial_a E$ along the slow modes; the signal is not where the sampler can resolve it. Quantum analogue: the vanishing g-purity product $P_{\mathfrak{g}}(\rho)\cdot P_{\mathfrak{g}}(O)$ .
Budget starvation — too few total Gibbs steps. Small burn-in $B \lesssim 1/\gamma$ leaves an $O(1)$ bias ( $\sigma_2^B = (1-\gamma)^B$ ); small window $K$ leaves $\mathrm{ESS} = \gamma K / (2 w_a)$ too small. Via the DTM energy model $E = T\cdot K_{mix}\cdot L^2 \cdot E_{cell}$ (App. E), this total-steps axis is literally the energy-per-sample axis. No quantum analogue.

Each factor is independently sufficient to kill $Q$ ; all three are needed for trainability. The corollary's tag is conjectured.

After exp1+exp2 (experiments/exp1-exact-diag/, experiments/exp2-thrml-smoke/) the first two factors are read in their observable-projected form: factor 1 with $\gamma \to \gamma_{eff}$ over the gradient-relevant mode set $C^*(O)$ , factor 2 with $R \to R_{eff}$ . The three-factor shape is unchanged; the quantities are the corrected, projected ones.

The bridge — what transfers, what does not

The cross-parent table maps factor-by-factor:

| Quantum (QML) | Classical EBM/DTM (here) | |---|---| | g-purity product $P_{\mathfrak{g}}(\rho)\cdot P_{\mathfrak{g}}(O)$ | $R(\theta)$ — signal-to-slow-fluctuation | | $1/\dim(\mathfrak{g})$ — reachability | spectral gap $\gamma = 1 - \sigma_2$ | | barren plateau $\mathrm{Var}[\partial C]\to 0$ | thermodynamic plateau $Q \to 0$ |

So $Q \asymp (\gamma K/2)\cdot R$ is the EBM analogue of Ragone's loss-variance $\mathrm{Var}[\ell_\theta] = \sum_j P_{g_j}(\rho)P_{g_j}(O)/\dim(\mathfrak{g}_j)$ .

What DOES transfer is exactly two things: the phenomenon (a gradient SNR that survives or collapses exponentially with system size) and the theorem shape (a product of independently-fatal factors).

Scope & caveats — the sharpest contrast

It does NOT literally transfer. There is no Lie algebra in an EBM: $P_{\mathfrak{g}}(\rho)$ , $\dim(\mathfrak{g})$ , and the DLA decomposition are quantum-circuit objects with no EBM counterpart. Do not write EBM quantities as if they were g-purities.

The mechanism is different. The quantum barren plateau is signal extinction: the true gradient variance itself vanishes, $\mathrm{Var}[\partial C]\to 0$ . The thermodynamic plateau is an estimation failure: $Q_{op}$ collapses because the estimator MSE swamps a still-nonzero $g$ — the recoverability of $g$ dies, not $g$ . Same phenomenology (SNR collapse), different machine. The Q-program's honesty hinges on keeping that distinction visible.

This entry asserts no tag flip. The corollary is conjectured; the bridge is explicitly a shape analogy, not a literal transfer, per parent-bridges.md. Nothing here is proven-here or validated.

What this feeds: the observable-projected predictor $Q_{struct}^{\perp}$ — the corrected, differentiable-in- $J$ target that turns this shape into something computable without training to convergence.

Sources

Jelinčič et al. 2025, Denoising Thermodynamic Models, arXiv:2510.23972 (Eq. 14, §IV, Fig. 5b, App. B/E/G).
Ragone et al. 2024, the loss-variance factorization (DLA dimension · state purity · observable purity) the corollary is shaped after.