Thermodynamic Machine Learning · MMXXVI
// The stack

Tools & runtime

Open libraries · pinned versions · rented compute

The exact toolchain behind every run — the open libraries, pinned versions, and rented accelerators that turn a claim into a result anyone can re-run.

The program runs on a frozen pre-registration discipline: a run’s steps, constants, seeds, and stop conditions are content-hashed to a git commit before any data is seen, and no threshold is relaxed afterward. Every claim carries one of four status tags — solid, conjectured, proven-here, validated — that move only by explicit conferral. Many runs are MEASURE-ONLY: they emit numbers but issue no verdict, so the code can never self-authorize a conclusion.
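
The freeze described above can be pictured with a small sketch. It is only an illustration under stated assumptions: the program hashes its run to a git commit, whereas this example uses SHA-256 over canonical JSON as a stand-in, and every field name and value in `spec` (as well as the helpers `spec_digest` and `check_unchanged`) is hypothetical.

```python
import hashlib
import json

def spec_digest(spec: dict) -> str:
    """Return a SHA-256 hex digest of a run spec in canonical JSON form."""
    # sort_keys and fixed separators make the byte string independent of
    # dict insertion order and whitespace, so equal specs hash equally.
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical run spec: field names and values are illustrative only.
spec = {
    "steps": ["sample", "estimate_tau", "compare"],
    "constants": {"n_sweeps": 4096, "threshold": 0.05},
    "seeds": [0, 1, 2],
    "stop_conditions": {"max_gpu_hours": 2.0},
    "mode": "MEASURE-ONLY",
}

# Digest recorded alongside the commit before any data is seen.
frozen = spec_digest(spec)

def check_unchanged(spec: dict, frozen_digest: str) -> None:
    """Refuse to proceed if the spec no longer matches the frozen digest."""
    if spec_digest(spec) != frozen_digest:
        raise RuntimeError("run spec differs from the pre-registered one")

check_unchanged(spec, frozen)  # passes: nothing was edited

# Relaxing a threshold after the fact changes the digest and is rejected.
relaxed = {**spec, "constants": {**spec["constants"], "threshold": 0.10}}
try:
    check_unchanged(relaxed, frozen)
except RuntimeError as err:
    print("rejected:", err)
```

Because the digest covers constants, seeds, and stop conditions together, any later edit to one of them is visible as a mismatch rather than a silent change.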

The toolchain

Python
Host language. Every run is launched as python3 <script>.py, a script that wraps JAX, thrml, and the dtm-replication substrate.
JAX
Autodiff + XLA, pinned in two release lines: 0.9.1 (x64 / CPU) for the exact differentiable-Q work in float64, and 0.10.1 (CUDA 12) for GPU sampling and energy on the real DTM. The JAX path is also checked for value agreement against NumPy (≈1e-14).
NumPy / SciPy
Exact diagonalization (eigh) and resolvent baselines — the ground-truth reference that gates the JAX path.
thrml 0.1.3
Extropic’s Ising sampling library; builds the per-replica parallel-tempering kernels (AnnealingIsingSamplingProgram) at DTM scale.
Equinox
Immutable parameter-tree surgery (eqx.tree_at) to refresh the sampler’s interactions to trained weights — the fix for the exp15 init-weight bug.
dtm-replication @ 7c22d19
The DTM-MNIST 60_12 substrate codebase, git-pinned, behind every GPU run.
Weights & Biases
Every run is logged and version-tracked (offline, deliberately non-verdict-bearing instrumentation). The workspace is public on W&B.
Rented NVIDIA GPUs
H100 80GB (exp3) → H200, all on Lightning AI Studio — gated behind a CPU small-family authorization and hard GPU-hour caps.
Git
Frozen pre-commitment: every run is content-hashed to a commit before any data is seen, so no threshold can be relaxed after the fact.
Laptop CPU (float64)
Exact-diagonalization and small-family runs.
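
The NumPy reference path in the list above (exact diagonalization with eigh, used as a gate on another code path) can be sketched on a toy chain. This is a minimal illustration, not the program's instrument: the double-well energy, the nearest-neighbour Metropolis chain, the helper names, and the 1e-10 tolerance are all assumptions, and `np.linalg.eigvals` merely stands in for the candidate path that the real pipeline checks (JAX).

```python
import numpy as np

def metropolis_chain(energy, beta):
    """Nearest-neighbour Metropolis chain on states 0..n-1. Returns (P, pi)."""
    n = energy.size
    P = np.zeros((n, n))
    for i in range(n):
        for j in (i - 1, i + 1):
            if 0 <= j < n:
                # Propose each neighbour with probability 1/2, then accept/reject.
                P[i, j] = 0.5 * min(1.0, np.exp(-beta * (energy[j] - energy[i])))
        P[i, i] = 1.0 - P[i].sum()  # rejected and out-of-range proposals stay put
    pi = np.exp(-beta * energy)
    return P, pi / pi.sum()

def exact_spectrum(P, pi):
    """Eigenvalues of a pi-reversible P, in descending order, via eigh."""
    d = np.sqrt(pi)
    S = d[:, None] * P / d[None, :]  # S = D^{1/2} P D^{-1/2}, symmetric if reversible
    S = 0.5 * (S + S.T)              # scrub round-off asymmetry before eigh
    w, _ = np.linalg.eigh(S)
    return w[::-1]

# Toy double-well energy on 12 states (illustrative values only).
x = np.linspace(-1.5, 1.5, 12)
energy = 2.0 * (x**2 - 1.0) ** 2
P, pi = metropolis_chain(energy, beta=1.0)

lam = exact_spectrum(P, pi)  # reference spectrum; lam[0] is the stationary mode
gap = 1.0 - lam[1]           # spectral gap set by the slowest non-trivial mode
t_rel = 1.0 / gap            # relaxation time in steps

# Gate: the candidate path must agree with the reference within tolerance.
candidate = np.sort(np.linalg.eigvals(P).real)[::-1]
max_dev = float(np.max(np.abs(candidate - lam)))
gate_ok = max_dev < 1e-10
print(f"gap={gap:.4g}  t_rel={t_rel:.4g}  max_dev={max_dev:.1e}  gate_ok={gate_ok}")
```

The symmetrization step is what makes eigh applicable: a reversible transition matrix is similar to a symmetric one, so its spectrum is real and can be computed with the stable symmetric solver.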

What each experiment added

The program is a ladder: each experiment introduces one new method, tool, or capability on top of the runs before it. Every entry links to its full write-up; the verdicts live there.

EXP 1 · Exact diagonalization
First exact-diag ground-truth instrument: true Q_op + slow-mode spectrum.
NumPy/SciPy · CPU · W&B

EXP 2 · Block-Gibbs RBM
Ports the mechanism to a block-Gibbs RBM; adds thrml as a fidelity check.
NumPy · thrml · JAX 0.10.1 · W&B

EXP 3 · At scale: the equilibration limit
First GPU-scale run on a real trained DTM-MNIST.
rented H100 · Lightning AI Studio · dtm-replication · DTM-MNIST · W&B

EXP 4 · Reversible kernel
A2-valid reversible 4-block-Gibbs kernel + doubling-stability τ̂ estimator.
rented H200 · Lightning AI Studio · thrml + patch · vmap · W&B

EXP 5 · Gap vs multimodality
Zero-compute exactly-diagonalizable RBM family; reversible-vs-non head-to-head.
NumPy/SciPy · CPU · W&B

EXP 6 · Checkpoint τ-sweep
Four-checkpoint τ̂ sweep along one training trajectory.
rented H200 · Lightning AI Studio · DTM-MNIST · W&B

EXP 7 · Barrier / conductance
Non-chain crossability test (Cheeger/Kramers, MFPTs).
CPU exact-diag · W&B

EXP 8 · Hypernetwork retrofit
Differentiable hypernetwork + deflated-resolvent autodiff (no eigh).
JAX 0.9.1 · CPU · W&B

EXP 9 · The Q objective demo
First in-loop Q as a differentiable training objective.
JAX 0.9.1 · Adam · CPU · W&B

EXP 11 · Larger-λ sweep
Frozen 5-rung λ ladder dose-response.
JAX 0.9.1 · CPU · W&B (offline)

EXP 12 · PT vs P_sym
Reversible parallel-tempering kernel K=½(LS+SL), the mixing-speed axis.
CPU · MEASURE-ONLY · W&B

EXP 13 · Measurement repair
Decouples T_O calibration from the operational read.
CPU · reuses exp12 kernel · W&B

EXP 14 · Operational re-read
Compute-normalized speedup verdict on held-out seeds (confers GPU authorization).
CPU · W&B

EXP 15 · GPU DTM PT (P0)
First GPU reversible-PT on the real DTM (A6 reachability probe).
rented H200 · Lightning AI Studio · CUDA 12 · JAX 0.10.1 · thrml · W&B

EXP 15r · The init-weight bug, isolated
Isolates/repairs the init-weight bug via eqx.tree_at.
rented H200 · Lightning AI Studio · Equinox · JAX 0.10.1 · thrml · W&B

EXP 16 · Operational validation
First at-scale operational-validation read of Q_op ≈ Q_struct^⊥.
rented H200 · Lightning AI Studio · JAX 0.10.1 · W&B

EXP 18 · PT replica feasibility
Uniform-Δα reversible-PT swept over replica count R.
rented H200 · Lightning AI Studio · JAX 0.10.1 · W&B

EXP 19 · Hotter top
Hotter top (α_top→0) + thermodynamic-length cost frontier.
rented H200 · Lightning AI Studio · JAX 0.10.1 · W&B
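
The symmetrized kernel K=½(LS+SL) listed under EXP 12 can be illustrated on a five-state toy. The sketch below is not the exp12 kernel: it assumes L and S are two kernels that are each reversible with respect to the same target π (here a local nearest-neighbour Metropolis move and a global reflection move, both hypothetical stand-ins), and it checks numerically that the ordered product LS generally breaks detailed balance while the symmetrized average restores it.

```python
import numpy as np

def metropolis(pi, proposal):
    """Metropolis kernel for target pi built from a symmetric proposal matrix."""
    n = pi.size
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and proposal[i, j] > 0:
                P[i, j] = proposal[i, j] * min(1.0, pi[j] / pi[i])
        P[i, i] = 1.0 - P[i].sum()  # rejected proposals stay put
    return P

def is_reversible(P, pi, tol=1e-12):
    """Detailed balance: pi_i P_ij == pi_j P_ji for all i, j."""
    flow = pi[:, None] * P
    return bool(np.allclose(flow, flow.T, atol=tol, rtol=0.0))

# Illustrative target distribution on 5 states (values are arbitrary).
pi = np.array([0.10, 0.40, 0.20, 0.05, 0.25])
n = pi.size

# Local proposal: step to a nearest neighbour on a path, probability 1/2 each.
local = np.zeros((n, n))
for i in range(n - 1):
    local[i, i + 1] = local[i + 1, i] = 0.5

# Global proposal: reflect state i to n-1-i (a symmetric involution).
reflect = np.zeros((n, n))
for i in range(n):
    if i != n - 1 - i:
        reflect[i, n - 1 - i] = 1.0

L = metropolis(pi, local)    # stand-in for the "L" factor
S = metropolis(pi, reflect)  # stand-in for the "S" factor

LS = L @ S
SL = S @ L
K = 0.5 * (LS + SL)          # symmetrized kernel

print("L reversible: ", is_reversible(L, pi))
print("S reversible: ", is_reversible(S, pi))
print("LS reversible:", is_reversible(LS, pi))
print("K reversible: ", is_reversible(K, pi))
```

The reason the average works is that, for π-reversible L and S, the π-adjoint of LS is SL, so ½(LS+SL) equals its own adjoint. The price is that each step of K costs an application of both factors.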