// The research notebook — experiments, proofs & negative results

The Notebook

The complete record behind thermodynamic trainability: every experiment, derivation, and honest negative result, organized by theme.

Foundations

Foundations · 18.VI.MMXXVI · Read 6 min

The Trainability Theorem (the spine)

The central artifact: what the trainability quantity actually is, why the single-gap form is superseded as written, and the two-tier tag that separates a proven conditional from an unvalidated operational claim.

Foundations · 13.VI.MMXXVI · Read 4 min

Mixing, Expressivity, and the Barren-Plateau Bridge

Recasting the Mixing–Expressivity Tradeoff as a Ragone-shaped product of independently-fatal factors — and mapping it, shape-for-shape, onto the quantum barren plateau without pretending an EBM has a Lie algebra.

Experiments

Experiment · 20.VI.MMXXVI · Read 4 min

Exp 19 — Hotter Top: Decorrelates, Then the Cost Wall

A hotter top finally decorrelates the hot end — but the cheapest equal-acceptance ladder now needs 136 reversible rungs against a 96-rung budget. The obstruction did not vanish; it relocated to a thermodynamic-length cost wall.

Experiment · 18.VI.MMXXVI · Read 5 min

Exp 15-recheck — The Init-Weight Bug, Isolated

A code verification pass found that exp15 built its PT local kernels from INIT weights while swapping on TRAINED energies. Repair only that, re-run at real $t=200$, and the trained DTM does not mix: pooled swap acceptance maxes out at 0.0099, ~15× below the floor.

Experiment · 18.VI.MMXXVI · Read 4 min

Exp 18 — PT Replica Feasibility: PT-MARGINAL

Finer uniform spacing does not saturate cold-edge acceptance below the floor — it drives it ×33 through the whole band. The wall is ladder placement plus an un-mixing hot end, not adjacent overlap.

Experiment · 17.VI.MMXXVI · Read 4 min

Exp 16 — Operational Validation (Withdrawn by Erratum)

The first at-scale operational validation of $Q_{op}\approx Q_{struct}^{\perp}$ returned an F4-fail — but it reused exp15's buggy alpha builder, so the central reading was an init-weight artifact and the F4/P4 table is withdrawn.

Experiment · 16.VI.MMXXVI · Read 3 min

Exp 15 — GPU DTM PT P0 (and Its Erratum)

The first GPU DTM-MNIST parallel-tempering feasibility probe read P0-RESOLVED — but a later code audit found an init-weight PT-kernel bug that faked the mixing. Lead with the erratum: the scientific reading is withdrawn.

Experiment · 15.VI.MMXXVI · Read 5 min

Exp 12 — PT vs P_sym: Outcome F (Measurement-Limited)

A reversible parallel-tempering mixture cut the slow-cluster τ_max by 14–22×, but the independently estimated verdict T_O never stabilized inside the registered window — so the gate fired and the verdict is unreadable.

Experiment · 15.VI.MMXXVI · Read 4 min

Exp 13 — Measurement Repair: T_O Is Calibratable

Calibrate one doubling-stable T_O once on a long exact-π trajectory, then read the operational windows against it: the windows pass at both 20 and 50 τ̂*. exp12's failure was a window-dependent estimator, not residual long-memory.

Experiment · 15.VI.MMXXVI · Read 5 min

Exp 14 — Operational Re-read: A Positive Small-Family Precursor

Re-read against exp13's frozen calibration: the primary reversible PT kernel cuts the calibrated operational time by at least 2× at R4 (+21%, robust) and R6 (corroborating). A positive precursor on the controlled family, with no tag flip.

Experiment · 14.VI.MMXXVI · Read 4 min

Exp 11 — Larger-λ Sweep: The Dose-Response Is the Result

Across a five-rung λ ladder the Q-ratio climbs to 101× while the task guard collapses to 0/4. No dose both steers and keeps the task — and the one eligible arm's genuine-channel gate is idle.

Experiment · 11.VI.MMXXVI · Read 4 min

Exp 9 — The Q Objective Demo: It Doesn’t Steer

First in-loop use of the pooled $Q_{struct}^{\perp}$ as a differentiable training objective. It is feasible, finite, and FD-exact — and at the verdict $\lambda$ it does not steer: median matched-crossing ratio 1.125, far below the 1.5 demo bar.

Experiment · 10.VI.MMXXVI · Read 4 min

Exp 8 — Hypernetwork Retrofit: Building the Missing Substrate

The scan-recommended $W=g_\phi(u)$ hypernetwork, frozen before implementation, CONSTRUCTS the G1b gate and finds a non-empty R3 sweet spot — with a deflated-resolvent autodiff mirror that survived adjacent eigenvalue gaps of $10^{-16}$.

Experiment · 8.VI.MMXXVI · Read 4 min

Exp 7 — Barrier/Conductance: A Scoped Green Light

A zero-compute Cheeger/Kramers crossability test on the controlled RBM family finds a non-empty, observable-relevant sweet spot — 25 of 64 cells — but it is scoped existence on a tiny RBM, never a fundamentality verdict.

Experiment · 4.VI.MMXXVI · Read 5 min

Exp 6 — Checkpoint τ-Sweep: Slow From the Start

A four-checkpoint τ-sweep on the reversible kernel: the chain fails to equilibrate at every probed training depth, not only for the converged model. The A2↔A6 antagonism is present from very early training, not just at convergence.

Experiment · 3.VI.MMXXVI · Read 4 min

Exp 4 — Reversible Kernel: τ̂ Unresolved (P0-HALT)

The A2-valid reversible kernel won't equilibrate: τ̂ grows dead-linearly in trajectory length, the doubling-stability rule refuses to resolve it, and A6 becomes unreachable — registered outcome P0-HALT, no tag flip.

Experiment · 3.VI.MMXXVI · Read 5 min

Exp 5 — Gap vs Multimodality: The Mechanism

On an exactly diagonalizable RBM family, the observable gap collapses ~100× as multimodality turns on, and A2-reversibilization adds a slowdown that grows with M. This confirms exp4's mechanism; it does not prove the DTM sits in the plateau.

Experiment · 31.V.MMXXVI · Read 4 min

Exp 3 — At Scale: The Equilibration Limit

First GPU run of the at-scale $Q_{op}\approx Q_{struct}^{\perp}$ test. The real MNIST DTM mixes so slowly that the diagnostic is equilibration-limited: one clean PASS, no tag flip, factorization stays conjectured.

Experiment · 27.V.MMXXVI · Read 4 min

Exp 1 — Exact Diagonalization: Risk-1 Fires

On $N\le 14$ exact-diag EBMs the single-gap gradient-SNR predictor does not just err — it divides by a symmetry zero, over-predicting by $10^{26}$ to $10^{30}$. Risk 1 confirmed and sharpened, no tag flip.

Experiment · 27.V.MMXXVI · Read 4 min

Exp 2 — Block-Gibbs RBM: The Findings Survive

Move exp1 from a fully-connected Ising chain to a bipartite RBM with 2-block-Gibbs: the symmetry/orthogonality result is model-driven and survives; the weak-coupling degeneracy is kernel-driven and lifts. No tag flip.

Proofs

Proof · 1.VI.MMXXVI · Read 4 min

O1: Projection in L²(π), Not State-Space Conditioning

The first terminal proof in the wiki: the observable projection is an orthogonal projection in function space, not dynamical conditioning, and the stationary SNR is invariant to the projection-vs-quotient choice.

Proof · 1.VI.MMXXVI · Read 4 min

The Proof Program: O1–O6 and the Conditional Factorization

Six obligations, nine assumptions, six falsifiers: the bookkeeping that turns $Q_{op}\approx Q_{struct}^{\perp}$ into a conditional [solid] assembly over a single proven-here lemma, with the operational claim still [conjectured].

Negative Results

Negative Result · 20.VI.MMXXVI · Read 5 min

What Didn’t Work — and Why the Negative Results Matter

Four sharp negatives — the A2↔A6 obstruction, the thermodynamic-length wall, Goodharting the proxy, and PT failing by schedule — and why each well-instrumented failure is the contribution.

Methodology

Methodology · 18.VI.MMXXVI · Read 4 min

How This Research Is Run: Pre-registration, Claim-Status, MEASURE-ONLY

The discipline that makes the results trustworthy: frozen pre-commitments, four claim-status tags, MEASURE-ONLY runs that cannot self-authorize a verdict, and a bug that passed every pre-flight yet moved zero tags.
