The experiment built to decide Risk 1 returned a verdict the pre-commitment did not anticipate: the single- factorization fails for a structural reason before it fails for a cluster reason.
The question
Risk 1 of the trainability theorem asks whether the conjectured gradient-SNR factorization survives exact computation. The structural ratio is
anchored to the slowest Gibbs mode (eigenvalue ). The pre-committed question: in isolated spectra does track the operational SNR , and in cluster spectra does single- over-predict? See experiments/exp1-exact-diag/.
The setup
Frozen pre-registered run (experiment.py, 80 cells, 264 s, pure numpy/scipy on a laptop) plus an exploratory follow-up (followup.py, 96 cells). Families: Curie–Weiss, SK, and planted Hopfield at patterns, swept over with . Constants per the pre-commitment: , , , burn-in , seed . The kernel is a reversible single-site random-scan Gibbs sampler; we diagonalize it exactly to get , the slow manifold, and the exact .
All validity checks pass. Detailed balance holds to residual ; stationarity to ; both are asserted in every cell. The headline-method check, an exact bias+window MSE compared against a genuinely non-stationary MC chain (uniform start, real burn-in, seeds), matches within – across 5 cells, so the stationary-window approximation is sound.
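For readers who want to see what the exact construction and its validity checks involve, here is a minimal sketch, not the code in experiment.py. It builds the random-scan single-site Gibbs transition matrix for a small Curie–Weiss model, checks detailed balance and stationarity, and diagonalizes the symmetrized kernel. The function name `gibbs_kernel`, the energy convention E(s) = -1/2 sᵀJs - h·s, and the values of `n` and `beta` are illustrative assumptions.

```python
import itertools
import numpy as np

def gibbs_kernel(J, h, beta):
    """Exact random-scan single-site Gibbs kernel on {-1,+1}^n.

    Assumed convention: E(s) = -1/2 s^T J s - h.s, J symmetric, zero diagonal.
    Returns (states, pi, P) with pi the Gibbs distribution and P row-stochastic.
    """
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    weights = 2 ** np.arange(n - 1, -1, -1)   # row index = binary code of the state
    rows = np.arange(2**n)
    P = np.zeros((2**n, 2**n))
    for i in range(n):
        local = states @ J[i] + h[i]                       # local field on spin i
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * local))   # P(s_i = +1 | rest)
        for value, prob in ((1, p_up), (-1, 1.0 - p_up)):
            target = states.copy()
            target[:, i] = value
            cols = ((target + 1) // 2) @ weights
            P[rows, cols] += prob / n                      # site i chosen w.p. 1/n
    energy = -0.5 * np.einsum("ki,ij,kj->k", states, J, states) - states @ h
    pi = np.exp(-beta * energy)
    return states, pi / pi.sum(), P

# Illustrative parameters, not the pre-committed constants.
n, beta = 6, 1.2
J = (np.ones((n, n)) - np.eye(n)) / n   # Curie-Weiss couplings
h = np.zeros(n)
states, pi, P = gibbs_kernel(J, h, beta)

# Detailed balance: pi_k P_kl == pi_l P_lk. Stationarity: pi P == pi.
flow = pi[:, None] * P
db_residual = np.abs(flow - flow.T).max()
stat_residual = np.abs(pi @ P - pi).max()

# A reversible kernel is similar to the symmetric matrix D^{1/2} P D^{-1/2},
# so its spectrum is real and can be computed with a symmetric eigensolver.
d = np.sqrt(pi)
S = d[:, None] * P / d[None, :]
evals = np.linalg.eigvalsh(0.5 * (S + S.T))   # ascending order
lam2 = evals[-2]                              # slowest non-trivial eigenvalue

print(f"detailed-balance residual: {db_residual:.2e}")
print(f"stationarity residual:     {stat_residual:.2e}")
print(f"lambda_2 = {lam2:.6f}, spectral gap = {1.0 - lam2:.6f}")
```

The dense 2ⁿ × 2ⁿ matrix limits this approach to small n, which is the regime the note works in.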
The result
Risk 1 is confirmed and sharpened — not closed. Two findings.
1. Symmetry mis-anchoring (the discovery). With field these EBMs carry spin-flip symmetry. The slowest mode is odd; the pairwise gradient observables are even. So — confirmed numerically at the to floor — and the single- ratio blows up. over-predicts by to : a divide-by-symmetry-zero, not a finite error. The gradient SNR is set by the slowest observable-overlapping (even) mode, not by .
2. Cluster in the observable-relevant sector. Once anchored to the modes the gradient actually overlaps, multimodal Hopfield cells show a genuine cluster of slow even modes. A single relevant gap is insufficient (tracks in cells); the multi-mode cluster correction restores tracking (). Both fixes are necessary at .
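The symmetry argument in finding 1 can be checked on a toy case. The sketch below is an illustration under stated assumptions, not the experiment's code: it uses a zero-field Curie–Weiss model with made-up `n` and `beta`, the hypothetical helper `gibbs_kernel`, and the energy convention E(s) = -1/2 sᵀJs - h·s. It measures the flip parity of the slowest mode, its overlap with one pairwise observable, and then splits the spectrum into flip-even and flip-odd sectors to read off the slowest even eigenvalue.

```python
import itertools
import numpy as np

def gibbs_kernel(J, h, beta):
    """Random-scan single-site Gibbs kernel; E(s) = -1/2 s^T J s - h.s (assumed)."""
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    weights = 2 ** np.arange(n - 1, -1, -1)
    rows = np.arange(2**n)
    P = np.zeros((2**n, 2**n))
    for i in range(n):
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * (states @ J[i] + h[i])))
        for value, prob in ((1, p_up), (-1, 1.0 - p_up)):
            target = states.copy()
            target[:, i] = value
            P[rows, ((target + 1) // 2) @ weights] += prob / n
    energy = -0.5 * np.einsum("ki,ij,kj->k", states, J, states) - states @ h
    pi = np.exp(-beta * energy)
    return states, pi / pi.sum(), P

n, beta = 6, 1.5                          # illustrative values
J = (np.ones((n, n)) - np.eye(n)) / n     # Curie-Weiss, zero field
states, pi, P = gibbs_kernel(J, np.zeros(n), beta)
N = len(pi)

d = np.sqrt(pi)
S = d[:, None] * P / d[None, :]
S = 0.5 * (S + S.T)
evals, vecs = np.linalg.eigh(S)
f = vecs[:, -2] / d                       # slowest mode, unit norm in L2(pi)

# Rows are in binary order, so the spin-flipped partner of row k is row N-1-k.
parity = np.sum(pi * f * f[::-1])         # -1 for an odd mode, +1 for an even one

# One pairwise (even) observable, centered, and its correlation with f.
g = states[:, 0] * states[:, 1]
g = g - pi @ g
overlap = np.sum(pi * f * g) / np.sqrt(pi @ g**2)

# Orthonormal bases of the flip-even and flip-odd subspaces.
k = np.arange(N // 2)
B_even = np.zeros((N, N // 2))
B_odd = np.zeros((N, N // 2))
B_even[k, k] = B_even[N - 1 - k, k] = 1.0 / np.sqrt(2.0)
B_odd[k, k] = 1.0 / np.sqrt(2.0)
B_odd[N - 1 - k, k] = -1.0 / np.sqrt(2.0)
even = np.linalg.eigvalsh(B_even.T @ S @ B_even)
odd = np.linalg.eigvalsh(B_odd.T @ S @ B_odd)

print(f"parity of slowest mode: {parity:+.6f}")
print(f"overlap with s_0 s_1:   {overlap:.2e}")
print(f"slowest odd eigenvalue:  {odd[-1]:.6f}")
print(f"slowest even eigenvalue: {even[-2]:.6f}  (even[-1] is the trivial 1)")
```

Even parity is necessary but not sufficient for overlap with the pairwise observables, so in general the slowest even mode still has to be checked for non-zero overlap before it is used as the anchor.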
The pre-registered predicate verdicts:
P1 (isolated regime tracks) is NULL: no dense cell anywhere had an isolated , so the premise never occurs and P1 is not testable as framed.
P2 (cluster size monotone in ) FAILS as a law: raw is non-monotone (e.g. : ).
P3 (single- over-predicts in cluster cells) passes literally () but degenerately: the over-prediction is the – blowup, right direction, wrong mechanism.
P4 ( repair) is PARTIAL (), because its cluster set was anchored to the symmetry-odd .
The clean finite Risk-1 mechanism does appear: in the symmetry-broken run the naive predictor becomes well-defined and over-predicts by a finite – in cluster cells, tracking () in isolated cells.
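To illustrate why breaking the symmetry makes a slowest-mode predictor well-defined, the following sketch compares the overlap between the slowest mode and a pairwise observable with and without a uniform field. It assumes that the symmetry-broken run uses a non-zero field. The field value 0.1, the Curie–Weiss couplings, `n`, `beta`, and the helper names are all illustrative choices, not the run's settings.

```python
import itertools
import numpy as np

def gibbs_kernel(J, h, beta):
    """Random-scan single-site Gibbs kernel; E(s) = -1/2 s^T J s - h.s (assumed)."""
    n = len(h)
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    weights = 2 ** np.arange(n - 1, -1, -1)
    rows = np.arange(2**n)
    P = np.zeros((2**n, 2**n))
    for i in range(n):
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * (states @ J[i] + h[i])))
        for value, prob in ((1, p_up), (-1, 1.0 - p_up)):
            target = states.copy()
            target[:, i] = value
            P[rows, ((target + 1) // 2) @ weights] += prob / n
    energy = -0.5 * np.einsum("ki,ij,kj->k", states, J, states) - states @ h
    pi = np.exp(-beta * energy)
    return states, pi / pi.sum(), P

def slowest_mode_overlap(J, h, beta, i=0, j=1):
    """Return (lambda_2, correlation in L2(pi) of the slowest mode with s_i s_j)."""
    states, pi, P = gibbs_kernel(J, h, beta)
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]
    evals, vecs = np.linalg.eigh(0.5 * (S + S.T))
    f = vecs[:, -2] / d                   # unit norm in L2(pi); sign is arbitrary
    g = states[:, i] * states[:, j]
    g = g - pi @ g                        # center the observable
    return evals[-2], np.sum(pi * f * g) / np.sqrt(pi @ g**2)

n, beta = 6, 1.5                          # illustrative values
J = (np.ones((n, n)) - np.eye(n)) / n
lam_sym, ov_sym = slowest_mode_overlap(J, np.zeros(n), beta)
lam_brk, ov_brk = slowest_mode_overlap(J, 0.1 * np.ones(n), beta)

print(f"zero field:  lambda_2 = {lam_sym:.6f}, |overlap| = {abs(ov_sym):.2e}")
print(f"field = 0.1: lambda_2 = {lam_brk:.6f}, |overlap| = {abs(ov_brk):.2e}")
```

At zero field the overlap sits at round-off level, so any ratio that divides by it is numerically meaningless; with the field on, the overlap is finite and a predictor anchored to the slowest mode can at least be evaluated.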
Scope and caveats
This is construction-confirmed, not validated — no tag flip. The observable-relevant predictor and the run are post-hoc/exploratory, defined after seeing the degeneracy; they upgrade no artifact tag. The single-site random-scan kernel carries a weak-coupling degenerate manifold at unrelated to the energy landscape — DTM uses block-Gibbs, which exp2 must check. Small , controlled planted/random families: this reveals the mechanism, not asymptotic prevalence in trained DTMs. exp1 identifies a necessary correction to the factorization; it does not prove the corrected form.
What this feeds
Risk 1 in the trainability theorem moves [open] → [open — sharpened]: the factorization requires (a) an observable projection (the relevant gap is to the slowest mode overlaps, not ) and (b) a /multi-mode cluster correction. The conjectured stays conjectured. The [solid] variance leg's constant is annotated: it presumes a single observable-relevant slow mode, now known insufficient and, under symmetry, mis-anchored.
Downstream, exp2 tests whether the degenerate manifold and the symmetry picture survive a block-Gibbs (THRML) update scheme.