Exp 1 — Exact Diagonalization: Risk-1 Fires · Thermodynamic Machine Learning

The experiment built to decide Risk 1 returned a verdict the pre-commitment did not anticipate: the single-$\gamma$ factorization fails for a structural reason before it fails for a cluster reason.

The question

Risk 1 of the trainability theorem asks whether the conjectured gradient-SNR factorization $Q \asymp (\gamma K / 2)\, R$ survives exact computation. The structural ratio is

$R = \frac{\lVert g \rVert^2}{\sum_a w_a \operatorname{Var}_\pi[f_a]}, \qquad w_a = \frac{\hat f_{a,2}^2}{\operatorname{Var}_\pi[f_a]},$

anchored to the slowest Gibbs mode $\phi_2$ (eigenvalue $\sigma_2$). The pre-committed question: in isolated spectra, does $Q_{struct}$ track the operational SNR $Q_{op}$, and in cluster spectra, does single-$\gamma$ over-predict? See experiments/exp1-exact-diag/.
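Substituting the weights into the denominator makes the dependence on the slow-mode overlap explicit. This is only an algebraic restatement of the definition of $R$, assuming $\hat f_{a,2}$ denotes the overlap of $f_a$ with $\phi_2$ and every $\operatorname{Var}_\pi[f_a]$ is non-zero:

$\sum_a w_a \operatorname{Var}_\pi[f_a] = \sum_a \hat f_{a,2}^2 \quad\Longrightarrow\quad R = \frac{\lVert g \rVert^2}{\sum_a \hat f_{a,2}^2}.$

Under this reading, $R$ stays finite only if at least one observable $f_a$ has non-zero overlap with $\phi_2$.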

The setup

Frozen pre-registered run (experiment.py, 80 cells, 264 s, pure numpy/scipy on a laptop) plus an exploratory follow-up (followup.py, 96 cells). Families: Curie–Weiss, SK, and planted Hopfield at $M$ patterns, swept over $\beta \times \delta \times K$ with $N \le 14$. Constants per the pre-commitment: $\tau = 3$, $c = 3$, $m = 3$, burn-in $B = \lceil 5/\gamma \rceil$, seed $0$. The kernel is a reversible single-site random-scan Gibbs sampler; we diagonalize it exactly to get $\sigma_2$, the slow manifold, and the exact $Q_{op}$.
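As a rough illustration of the exact-diagonalization step, the sketch below builds a random-scan single-site Gibbs kernel on all $2^N$ spin states and reads off its spectrum. It is not the code in experiment.py: the function names `spin_states`, `gibbs_kernel`, and `spectrum`, the energy convention $E(x) = -\tfrac12 x^\top J x - b^\top x$, the Curie–Weiss coupling $J_{ij} = 1/N$, and the reading of the gap as $1 - \sigma_2$ are all assumptions made for the example.

```python
import numpy as np

def spin_states(N):
    """All 2^N configurations in {-1,+1}^N; bit i of the row index encodes spin i."""
    k = np.arange(2**N)
    bits = (k[:, None] >> np.arange(N)[None, :]) & 1
    return (2 * bits - 1).astype(float)

def gibbs_kernel(J, b, beta):
    """Random-scan single-site Gibbs kernel for E(x) = -x.J.x/2 - b.x (assumed convention)."""
    N = J.shape[0]
    X = spin_states(N)
    energy = -0.5 * np.einsum("si,ij,sj->s", X, J, X) - X @ b
    w = np.exp(-beta * (energy - energy.min()))
    pi = w / w.sum()
    P = np.zeros((2**N, 2**N))
    rows = np.arange(2**N)
    for i in range(N):
        flipped = rows ^ (1 << i)                 # state with spin i flipped
        p_flip = pi[flipped] / (pi[rows] + pi[flipped])
        P[rows, flipped] += p_flip / N            # choose site i w.p. 1/N, resample to the other value
        P[rows, rows] += (1.0 - p_flip) / N       # or resample to the current value
    return X, pi, P

def spectrum(pi, P):
    """Eigenpairs of a reversible kernel via the symmetrized matrix D^{1/2} P D^{-1/2}."""
    s = np.sqrt(pi)
    S = s[:, None] * P / s[None, :]
    sigma, V = np.linalg.eigh(0.5 * (S + S.T))    # symmetrize away rounding noise
    order = np.argsort(sigma)[::-1]               # sigma_1 >= sigma_2 >= ...
    return sigma[order], V[:, order] / s[:, None] # modes normalized in L2(pi)

N = 6
J = (np.ones((N, N)) - np.eye(N)) / N             # Curie-Weiss couplings (assumed scaling)
X, pi, P = gibbs_kernel(J, np.zeros(N), beta=1.5)
sigma, phi = spectrum(pi, P)
gap = 1.0 - sigma[1]                              # assumed definition of the spectral gap
print("leading eigenvalues:", sigma[:4], "gap:", gap)
```

The symmetrization step is only valid because the kernel is reversible; for a non-reversible sampler a general eigensolver would be needed instead.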

Validity checks all pass. Detailed balance $\pi_x P_{xy} = \pi_y P_{yx}$ holds to residual $\le 5\times10^{-18}$; stationarity $\pi P = \pi$ holds to $\le 10^{-16}$; $\sigma_1 = 1$ is asserted in every cell. The headline-method check (exact bias+window MSE against a genuinely non-stationary MC chain with uniform start, real burn-in, and $400$ seeds) matches within $0.7$–$3.3\%$ across 5 cells, so the stationary-window approximation is sound.
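The first three checks can be reproduced in a few lines on a toy cell. The following is a minimal sketch, not the project's test suite: the random symmetric couplings, the small field, the seed, and the variable names `db_residual`, `stat_residual`, and `sigma_1` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, beta = 5, 1.0
A = rng.normal(size=(N, N)) / np.sqrt(N)
J = np.triu(A, 1)
J = J + J.T                                        # symmetric couplings, zero diagonal
b = 0.1 * rng.normal(size=N)                       # small field (illustrative)

idx = np.arange(2**N)
X = (2 * ((idx[:, None] >> np.arange(N)) & 1) - 1).astype(float)
energy = -0.5 * np.einsum("si,ij,sj->s", X, J, X) - X @ b
pi = np.exp(-beta * (energy - energy.min()))
pi /= pi.sum()

# Random-scan single-site Gibbs kernel on the full state space.
P = np.zeros((2**N, 2**N))
for i in range(N):
    flipped = idx ^ (1 << i)
    p_flip = pi[flipped] / (pi[idx] + pi[flipped])
    P[idx, flipped] += p_flip / N
    P[idx, idx] += (1.0 - p_flip) / N

flow = pi[:, None] * P                             # flow[x, y] = pi_x P_xy
db_residual = np.abs(flow - flow.T).max()          # detailed balance
stat_residual = np.abs(pi @ P - pi).max()          # stationarity
sigma_1 = np.linalg.eigvals(P).real.max()          # top eigenvalue, expected to be 1

print(db_residual, stat_residual, sigma_1)
```

Residuals near machine precision are what one would expect here, since the Gibbs transition probabilities are built directly from ratios of $\pi$.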

The result

Risk 1 is confirmed and sharpened — not closed. Two findings.

1. Symmetry mis-anchoring (the discovery). With field $b = 0$ these EBMs carry $Z_2$ spin-flip symmetry. The slowest mode $\phi_2$ is odd; the pairwise gradient observables $f_{ij} = -x_i x_j$ are even, so their overlap with $\phi_2$ under the flip-symmetric $\pi$ vanishes by parity. Hence $\hat f_{a,2}^2 \approx 0$ (confirmed numerically at the $10^{-24}$ to $10^{-29}$ floor) and the single-$\gamma$ ratio $R$ blows up. $Q_{struct}$ over-predicts $Q_{op}$ by $10^{26}$ to $10^{30}$: a divide-by-symmetry-zero, not a finite error. The gradient SNR is set by the slowest observable-overlapping (even) mode, not by $\sigma_2$.

2. Cluster in the observable-relevant sector. Once anchored to the modes the gradient actually overlaps, multimodal Hopfield cells show a genuine cluster of slow even modes. A single relevant gap is insufficient (tracks $Q_{op}$ in $23/48$ cells); the multi-mode cluster correction restores tracking ($45/48$). Both fixes are necessary at $b = 0$.
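The parity argument in finding 1 can be checked on a toy model. The sketch below is an assumed setup rather than the experiment's own code: it takes a small Curie–Weiss cell at $b = 0$, measures the parity of $\phi_2$ under a global spin flip, and computes the squared overlap of one pairwise observable $f_{01} = -x_0 x_1$ with every mode. The overlap tolerance `1e-12` and the name `relevant` for the slowest mode the observable actually overlaps are illustrative choices.

```python
import numpy as np

N, beta = 6, 1.5
J = (np.ones((N, N)) - np.eye(N)) / N              # Curie-Weiss couplings (assumed scaling)

idx = np.arange(2**N)
X = (2 * ((idx[:, None] >> np.arange(N)) & 1) - 1).astype(float)
energy = -0.5 * np.einsum("si,ij,sj->s", X, J, X)  # zero field, so pi is flip-symmetric
pi = np.exp(-beta * (energy - energy.min()))
pi /= pi.sum()

P = np.zeros((2**N, 2**N))
for i in range(N):
    flipped = idx ^ (1 << i)
    p_flip = pi[flipped] / (pi[idx] + pi[flipped])
    P[idx, flipped] += p_flip / N
    P[idx, idx] += (1.0 - p_flip) / N

# Exact spectrum through the symmetrized kernel (valid because P is reversible).
s = np.sqrt(pi)
S = s[:, None] * P / s[None, :]
sigma, V = np.linalg.eigh(0.5 * (S + S.T))
order = np.argsort(sigma)[::-1]
sigma, phi = sigma[order], V[:, order] / s[:, None]

# A global spin flip maps state index k to (2^N - 1) - k, i.e. reverses the ordering.
parity_2 = np.sum(pi * phi[:, 1] * phi[::-1, 1])   # close to -1 for odd, +1 for even

f01 = -X[:, 0] * X[:, 1]                           # even pairwise observable
overlap_sq = (phi.T @ (pi * f01)) ** 2             # squared pi-weighted overlap with each mode
relevant = 1 + int(np.argmax(overlap_sq[1:] > 1e-12))

print("parity of phi_2:", parity_2)
print("overlap^2 with phi_2:", overlap_sq[1])
print("slowest overlapping mode index:", relevant, "sigma:", sigma[relevant])
```

In this toy cell the overlap with $\phi_2$ sits at the floating-point floor, while the first mode with appreciable overlap lies further down the spectrum, which is the sense in which the relevant gap is not the one set by $\sigma_2$.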

The pre-registered predicate verdicts:

P1 (isolated regime tracks) is NULL: no dense cell anywhere had an isolated $\sigma_2$, so the premise never occurs and P1 is not testable as framed.

P2 (cluster size monotone in $M$) FAILS as a law: raw $|C_3|$ is non-monotone (e.g. at $N=8$, $\beta=2$, $M=1\ldots4 \to [1,3,1,1]$).

P3 (single-$\gamma$ over-predicts in cluster cells) passes literally ($31/31$) but degenerately: the over-prediction is the $10^{26}$–$10^{30}$ blowup (right direction, wrong mechanism).

P4 ($\gamma_{eff} + R^C$ repair) is PARTIAL ($13/31$), because its cluster set was anchored to the symmetry-odd $\sigma_2$.

The clean finite Risk-1 mechanism does appear: in the symmetry-broken $b \ne 0$ run the naive predictor becomes well-defined and over-predicts by a finite $\sim 8$–$9\times$ in cluster cells, while tracking ($\sim 0.5\times$) in isolated cells.

Scope and caveats

This is construction-confirmed, not validated, so there is no tag flip. The observable-relevant predictor and the $b \ne 0$ run are post-hoc/exploratory, defined after seeing the degeneracy; they upgrade no artifact tag. The single-site random-scan kernel carries a weak-coupling degenerate manifold at $\sigma = 1 - 1/N$ unrelated to the energy landscape; DTM uses block-Gibbs, which exp2 must check. With small $N \le 14$ and controlled planted/random families, this reveals the mechanism, not asymptotic prevalence in trained DTMs. exp1 identifies a necessary correction to the factorization; it does not prove the corrected form.
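The origin of the $\sigma = 1 - 1/N$ manifold is easiest to see in the zero-coupling limit, where the random-scan kernel has eigenvalues $1 - k/N$ with multiplicity $\binom{N}{k}$ regardless of any energy function. The sketch below illustrates that limiting case only; it is an assumed toy check, not part of the experiment, and does not show how the manifold splits at finite coupling.

```python
import numpy as np
from math import comb

N = 6
idx = np.arange(2**N)

# At beta = 0 every conditional is uniform: the chosen site lands on either value w.p. 1/2.
P = np.zeros((2**N, 2**N))
for i in range(N):
    P[idx, idx ^ (1 << i)] += 0.5 / N
    P[idx, idx] += 0.5 / N

sigma = np.sort(np.linalg.eigvalsh(P))[::-1]       # P is symmetric here (uniform pi)

# Count eigenvalues sitting at 1 - k/N for the first few k.
for k in range(3):
    count = int(np.sum(np.isclose(sigma, 1 - k / N)))
    print(f"sigma = 1 - {k}/N: multiplicity {count}, binomial {comb(N, k)}")

multiplicity = int(np.sum(np.isclose(sigma, 1 - 1 / N)))
```

The $N$-fold degeneracy at $1 - 1/N$ comes purely from the one-site-at-a-time update rule, which is why a block update scheme could behave differently.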

What this feeds

Risk 1 in the trainability theorem moves [open] → [open — sharpened]: the factorization requires (a) an observable projection (the relevant gap is to the slowest mode $g$ overlaps, not $\sigma_2$ ) and (b) a $\gamma_{eff}$ /multi-mode cluster correction. The conjectured $Q \asymp (\gamma K/2)R$ stays conjectured. The [solid] variance leg's $2w_a$ constant is annotated: it presumes a single observable-relevant slow mode, now known insufficient and, under symmetry, mis-anchored.

Next, exp2 tests whether the degenerate manifold and the symmetry picture survive a block-Gibbs (THRML) update scheme.