A zero-compute, exactly-solvable RBM family shows the reversible kernel's observable gap collapsing as multimodality turns on, with reversibilization itself adding a multimodality-tracking slowdown — the mechanism behind exp4's , computed with no chain and no burn-in.
The question
Exp4 left two readings of its scaling unresolved: reading-(1), a genuine plateau, versus reading-(2), merely inadequate burn-in. Both are consistent with a chain that fails to decorrelate at scale. The discriminating move is to remove the chain entirely: build a substrate small enough to exact-diagonalize, read and off the spectrum and resolvent, and ask whether the collapse survives when there is no burn-in left to blame.
The setup
Substrate: a bipartite RBM with energy , planted multimodality of patterns (the exp2 substrate, the DTM-native block-Gibbs structure). Gradient observables are the couplings , which are -even.
The grid (frozen at b20b2cc, with the pre-run cluster-anchor correction 13d120f; ran 2026-06-03, laptop CPU, float64, 250 s, no GPU):
- P1–P3 (primary): reversible single-site random-scan joint Gibbs, exact-diag, () plus a ferrocontrol = 60 cells; an () top-k scaling check (4 cells, not pass-gated).
- P4 (secondary): cross-kernel: non-reversible block (resolvent ) vs reversible (resolvent and exact-diag), = 16 cells.
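The primary pipeline (enumerate every joint state, build the single-site random-scan Gibbs kernel, diagonalize it) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the frozen grid code: the energy form `-beta * v^T W h` with ±1 units and no biases, the Hebbian planting rule, the pattern values, and the names `joint_states`, `hebbian_coupling`, `boltzmann`, `gibbs_kernel`, and `spectrum` are all illustrative choices.

```python
import itertools
import numpy as np

def joint_states(n):
    """All 2**n spin configurations in {-1, +1}; the first spin varies slowest."""
    return np.array(list(itertools.product([-1.0, 1.0], repeat=n)))

def hebbian_coupling(xi, eta):
    """ASSUMED planting rule: W = sum_mu outer(xi_mu, eta_mu) / n_visible."""
    xi, eta = np.asarray(xi, dtype=float), np.asarray(eta, dtype=float)
    return xi.T @ eta / xi.shape[1]

def boltzmann(W, beta):
    """Exact pi(v, h) proportional to exp(beta * v^T W h); ASSUMED energy, no biases."""
    nv, nh = W.shape
    s = joint_states(nv + nh)
    logp = beta * np.einsum("si,ij,sj->s", s[:, :nv], W, s[:, nv:])
    p = np.exp(logp - logp.max())
    return p / p.sum(), s

def gibbs_kernel(pi, n):
    """Random-scan single-site Gibbs: pick one of n spins uniformly, resample it."""
    idx = np.arange(pi.size)
    P = np.zeros((pi.size, pi.size))
    for i in range(n):
        flipped = idx ^ (1 << (n - 1 - i))      # index of the state with spin i flipped
        P[idx, flipped] = pi[flipped] / (pi + pi[flipped]) / n
    P[idx, idx] = 1.0 - P.sum(axis=1)           # probability of staying put
    return P

def spectrum(P, pi):
    """Eigenvalues (ascending) of the symmetrized kernel D^(1/2) P D^(-1/2)."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]
    return np.linalg.eigvalsh(0.5 * (S + S.T))

# Two hypothetical planted pattern pairs on 3 visible + 3 hidden units (64 states).
W = hebbian_coupling(xi=[[1, 1, 1], [1, -1, 1]], eta=[[1, 1, 1], [1, 1, -1]])
pi, s = boltzmann(W, beta=2.0)
P = gibbs_kernel(pi, n=s.shape[1])

row_err = np.abs(P.sum(axis=1) - 1.0).max()     # row-stochastic
stat_err = np.abs(pi @ P - pi).max()            # pi-stationary
flow = pi[:, None] * P
rev_err = np.abs(flow - flow.T).max()           # detailed-balance residual
lam = spectrum(P, pi)
raw_gap = 1.0 - lam[-2]
print(f"row {row_err:.1e}  stat {stat_err:.1e}  rev {rev_err:.1e}  raw gap {raw_gap:.4g}")
```

Because the kernel is a dense `2**n x 2**n` matrix, this approach is only practical for the very small unit counts the grid uses; the three residuals printed here correspond in spirit to the sanity guards, though the thresholds and exact definitions in the frozen run may differ.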
Sanity guards all passed: to ; stochastic/-stationary to ; reversibility residual – for (non-reversible) and for (reversible). The decisive validation: resolvent vs exact-diag on agree to (Guard b). The slow_obs_overlap – confirms is -odd / observable-orthogonal — which is why the pre-run anchor was corrected from raw to the observable-relevant .
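The distinction between the raw gap and an observable-relevant gap can be illustrated by projecting an observable onto the kernel's eigenmodes. The sketch below is illustrative only: the energy `-beta * v^T W h`, the coupling matrix, the reading of the parity as a global spin flip, the weight tolerance `1e-10`, and the names `build_chain` and `spectral_weights` are assumptions, and the cluster construction used in the actual run is not reproduced.

```python
import itertools
import numpy as np

def build_chain(W, beta):
    """Exact pi and random-scan single-site Gibbs kernel (ASSUMED energy -beta v^T W h)."""
    nv, nh = W.shape
    n = nv + nh
    s = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    logp = beta * np.einsum("si,ij,sj->s", s[:, :nv], W, s[:, nv:])
    pi = np.exp(logp - logp.max())
    pi /= pi.sum()
    idx = np.arange(pi.size)
    P = np.zeros((pi.size, pi.size))
    for i in range(n):
        j = idx ^ (1 << (n - 1 - i))            # state with spin i flipped
        P[idx, j] = pi[j] / (pi + pi[j]) / n
    P[idx, idx] = 1.0 - P.sum(axis=1)
    return pi, P, s

def spectral_weights(P, pi, f):
    """Eigenvalues (ascending) and the share of Var(f) carried by each eigenmode."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]
    lam, U = np.linalg.eigh(0.5 * (S + S.T))
    fc = f - pi @ f                             # centered observable
    c = U.T @ (d * fc)                          # overlap with each mode
    return lam, c**2 / (pi @ fc**2)

W = np.array([[2, 2, 0], [0, 0, 2], [2, 2, 0]]) / 3.0   # hypothetical two-pattern coupling
pi, P, s = build_chain(W, beta=2.0)
f = s[:, 0] * s[:, 3]                           # coupling observable v_0 * h_0
flip = pi.size - 1 - np.arange(pi.size)         # index of the globally flipped state

lam, w = spectral_weights(P, pi, f)
gamma_raw = 1.0 - lam[-2]                       # gap to the slowest non-trivial mode
slow_obs_overlap = w[-2]                        # weight of f on that slowest mode
relevant = w[:-1] > 1e-10                       # modes the observable actually sees
gamma_obs = 1.0 - lam[:-1][relevant].max()      # observable-relevant gap
print(f"raw gap {gamma_raw:.4g}  overlap {slow_obs_overlap:.2e}  observable gap {gamma_obs:.4g}")
```

By construction the observable-relevant gap can never be smaller than the raw gap; the two coincide only when the observable has weight on the slowest mode. In a flip-symmetric model, an odd slow mode has zero overlap with an even observable such as `v_0 * h_0`, which is the situation the anchor correction addresses.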
The result
P1 — gap collapse: NOT PASSED (strict); mechanism confirmed. collapses 1–2 orders as turns on at :
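| m | β | γ_eff (M=1) | γ_eff (M=2) | γ_eff (M=3) | γ_eff (M=4) | Csize (M=1→2→3→4) |
|---|---|---|---|---|---|---|
| 4 | 3.0 | 0.648 | 0.050 | 0.010 | 0.658 | 78→1→2→26 |
| 5 | 3.0 | 0.648 | 0.006 | 0.025 | 0.629 | 80→1→2→47 |
| 6 | 3.0 | 0.647 | 0.076 | 0.405 | 0.076 | 120→1→14→2 |
| 6 | 2.0 | 0.626 | 0.194 | 0.416 | 0.229 | 120→2→15→3 |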
The deepest γ_eff is 0.006 at m5-3-2, a collapse of roughly two orders of magnitude from the M=1 value of 0.648. It is physical, not a cluster artifact: the raw observable gap collapses identically (m5-3: ). Csize reads out the mechanism: M=1 gives a large cluster (Csize 78–120, all or nearly all modes; unimodal rank-1 coupling, no slow mode), while the collapsed cells give Csize=1–2, a clean A7-separated slow cluster.
Strict P1 fails for one well-understood reason: at M=4 in small m (rank-4 coupling in 4–6 dims, near full rank) the planted patterns over-saturate model capacity, basins merge, the slow mode vanishes, and γ_eff recovers, which breaks the registered monotone-in-M criterion (, not monotone). The check corroborates this decisively: with more capacity, γ_eff stays collapsed (, Csize=2) rather than recovering. The registered alternative outcome (" stays ") is also rejected.
P2 — off-cluster subdominance (O5.a): NOT PASSED (strict, 1/8); confirmed 7/8. With iff the observable's concentrates in the slow cluster: of the 8 cells with , 7 confirm (–). The one failure is again the saturation cell (m6-3-4, , dev ), where A7 itself degrades (observable mass spread over modes, not the 2-mode cluster). Half-Sokal convention throughout ().
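The integrated autocorrelation time used in P2 can be computed two independent ways on an exactly enumerable chain: from a single linear solve against the resolvent, and from the eigendecomposition. The sketch below is an illustration under assumptions: "half-Sokal" is read here as `tau_int = 1/2 + sum_{t>=1} rho(t)`, the energy is `-beta * v^T W h`, and the coupling matrix and the function names are hypothetical.

```python
import itertools
import numpy as np

def build_chain(W, beta):
    """Exact pi and random-scan single-site Gibbs kernel (ASSUMED energy -beta v^T W h)."""
    nv, nh = W.shape
    n = nv + nh
    s = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    logp = beta * np.einsum("si,ij,sj->s", s[:, :nv], W, s[:, nv:])
    pi = np.exp(logp - logp.max())
    pi /= pi.sum()
    idx = np.arange(pi.size)
    P = np.zeros((pi.size, pi.size))
    for i in range(n):
        j = idx ^ (1 << (n - 1 - i))
        P[idx, j] = pi[j] / (pi + pi[j]) / n
    P[idx, idx] = 1.0 - P.sum(axis=1)
    return pi, P, s

def tau_resolvent(P, pi, f):
    """tau_int from one linear solve; needs stationarity only, not reversibility."""
    fc = f - pi @ f
    A = np.eye(pi.size) - P + np.outer(np.ones(pi.size), pi)
    g = np.linalg.solve(A, fc)                  # g = sum_{t>=0} P^t fc
    return (pi @ (fc * g)) / (pi @ fc**2) - 0.5

def tau_spectral(P, pi, f):
    """tau_int from the eigendecomposition of the symmetrized reversible kernel."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]
    lam, U = np.linalg.eigh(0.5 * (S + S.T))
    fc = f - pi @ f
    w = (U.T @ (d * fc)) ** 2 / (pi @ fc**2)    # variance share per mode
    terms = w[:-1] / (1.0 - lam[:-1])           # drop the stationary mode (lambda = 1)
    return terms.sum() - 0.5, terms.max() / terms.sum()

W = np.array([[2, 2, 0], [0, 0, 2], [2, 2, 0]]) / 3.0   # hypothetical two-pattern coupling
pi, P, s = build_chain(W, beta=2.0)
f = s[:, 0] * s[:, 3]                           # coupling observable v_0 * h_0
tau_res = tau_resolvent(P, pi, f)
tau_spec, share = tau_spectral(P, pi, f)
print(f"tau (resolvent) {tau_res:.6f}  tau (spectral) {tau_spec:.6f}  top-mode share {share:.3f}")
```

The `share` value is the fraction of the autocorrelation sum contributed by the single largest mode; a value near 1 corresponds to the observable concentrating on one slow mode, while a spread-out value is the kind of degradation described for the saturation cell.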
P3 — sweet-spot map (no pass gate). With , every primary cell has (deepest: m5-3-2 ESS ). So even at deepest collapse a feasible reaches ESS (, marginal); never strictly closes in the frozen grid.
P4 — A2-vs-mixing: PASSED. The reversible is slower than the non-reversible in 16/16 cells ( ratio –), monotone-increasing in at (), and largest in the cleanest-multimodal cell (, , ratio , Csize=1). Reversibilization adds a mixing penalty that tracks the multimodal slow-mode structure, reproducing the exp3 (fast, non-reversible) versus exp4 (slow, reversible) contrast on an exactly-characterizable model. Resolvent vs exact-diag cross-check: max rel-err .
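A cross-kernel comparison of this kind can be sketched by building a deterministic-scan block Gibbs kernel, reversibilizing it, and comparing integrated autocorrelation times through the resolvent. Everything specific here is an assumption: the energy `-beta * v^T W h`, the coupling matrix, the additive reversibilization `(P + P*)/2`, and the function names. Under the additive choice the ordering of the two times is guaranteed by a standard operator inequality, so the sketch shows only the direction and size of the penalty; the kernel pair actually used in P4 may differ.

```python
import itertools
import numpy as np

def spins(n):
    return np.array(list(itertools.product([-1.0, 1.0], repeat=n)))

def block_gibbs(W, beta):
    """Deterministic-scan block Gibbs (h | v, then v | h); ASSUMED energy -beta v^T W h."""
    V, H = spins(W.shape[0]), spins(W.shape[1])
    logq = beta * V @ W @ H.T
    Q = np.exp(logq - logq.max())
    Q /= Q.sum()                                        # Q[v, h] = pi(v, h)
    h_given_v = Q / Q.sum(axis=1, keepdims=True)        # [v, h'] = p(h' | v)
    v_given_h = Q / Q.sum(axis=0, keepdims=True)        # [v', h'] = p(v' | h')
    T = np.einsum("vk,wk->vwk", h_given_v, v_given_h)   # T[v, v', h']
    Nv, Nh = Q.shape
    # The transition does not depend on the current h, so broadcast over it.
    P = np.broadcast_to(T[:, None, :, :], (Nv, Nh, Nv, Nh)).reshape(Nv * Nh, Nv * Nh)
    f = np.outer(V[:, 0], H[:, 0]).reshape(-1)          # observable v_0 * h_0
    return P, Q.reshape(-1), f

def additive_reversibilization(P, pi):
    P_star = P.T * pi[None, :] / pi[:, None]            # time reversal of P under pi
    return 0.5 * (P + P_star)

def tau_int(P, pi, f):
    """1/2 + sum_{t>=1} rho(t) via the resolvent; valid without reversibility."""
    fc = f - pi @ f
    A = np.eye(pi.size) - P + np.outer(np.ones(pi.size), pi)
    g = np.linalg.solve(A, fc)                          # g = sum_{t>=0} P^t fc
    return (pi @ (fc * g)) / (pi @ fc**2) - 0.5

def db_residual(P, pi):
    flow = pi[:, None] * P
    return np.abs(flow - flow.T).max()                  # 0 iff detailed balance holds

W = np.array([[2, 2, 0], [0, 0, 2], [2, 2, 0]]) / 3.0   # hypothetical two-pattern coupling
P_blk, pi, f = block_gibbs(W, beta=2.0)
P_rev = additive_reversibilization(P_blk, pi)
tau_blk, tau_rev = tau_int(P_blk, pi, f), tau_int(P_rev, pi, f)
print(f"residual blk {db_residual(P_blk, pi):.2e}  rev {db_residual(P_rev, pi):.2e}")
print(f"tau blk {tau_blk:.4f}  tau rev {tau_rev:.4f}  ratio {tau_rev / tau_blk:.3f}")
```

The resolvent route is used for both kernels because it needs only stationarity, which is why it applies to the non-reversible block kernel where a symmetric eigendecomposition does not.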
Scope and caveats
This is mechanism-confirming, NOT a proof the DTM checkpoint sits in the plateau. Because here is exact-from-spectrum with no chain and no burn-in possible, the collapse cannot be a burn-in artifact — so the controlled substrate supports reading-(1) over reading-(2). But the honest ceiling holds: bottoms at (not ), the sweet spot stays marginally open (ESS ), and the A2 penalty is – (not the DTM's unbounded ). The substrate is small (), controlled and planted, not a trained DTM-MNIST conditional — so this is mechanism, not prevalence or depth. Tracking is near-tautological on small families; no factorization claim is made. The -knob is non-monotone in effective multimodality at small (capacity saturation); a clean monotone sweep needs larger , a feasible follow-up not re-run here. Propagation is Risk 5 / Risk 1 risk-ledger sharpening only — the operational claim stays [conjectured], with no tag flip in any outcome.
What this feeds: the literal reading-(1)-vs-(2) verdict on the real DTM chain still needs the at-scale earlier/less-trained-checkpoint -sweep (GPU, credit-gated); exp5 sharpens the risk ledger and confirms the experiments/exp4 mechanism without reaching its depth.