Exp 19 — Hotter Top: Decorrelates, Then the Cost Wall

A hotter top cures the temperature obstruction exp18 left open, but the cheapest decorrelating ladder costs more rungs than we are allowed — so the wall moves from overlap to thermodynamic length.

This is the complete technical record for experiments/exp19-hotter-top-first/. Here we keep the grid, the gates, and the claim-status discipline. Outcome HOT-TOP-DECORRELATES-COSTWALL. MEASURE-ONLY — no tag moves.

The question

exp18 left two suspects for why a reversible parallel-tempering ladder would not mix the trained DTM: a hot end that did not decorrelate, and mis-placement of the rungs. exp19 asks both at once on the corrected (trained-weight-refreshed) 60_12, $t=200$ MNIST DTM at INPUT_IDX=0, single-input conditional $\pi_\theta(\cdot\,|\,x_0)$ : does a hotter top ( $\beta_{top}\to 0$ , i.e. $\alpha_{top}\to 0$ ) make the single-replica reversible 4-block-Gibbs kernel decorrelate ( $\tau \le \tau_{TARGET}=25$ ), and if so, can an equal-acceptance ladder span the decorrelating regime within the conferred budget $R_{MAX}=96$ rungs?

The setup

Two analytic-plus-empirical frontiers over a 10-point $\alpha_{top}$ grid floored at $0.01$ . (a) A decorrelation frontier: single-replica integrated autocorrelation $\tau(\alpha_{top})$ at flat $L_{LADDER}=2000$ (Sokal-resolved throughout). (b) An analytic cost frontier: $R^*(\alpha_{top}) = 1 + \mathrm{round}\big(\beta \cdot \int_{\alpha_{top}}^{1}\sqrt{C}\,d\alpha \,/\, \delta^*\big)$ with $\delta^*=1.683$ — the rung count an equal-acceptance ladder needs to span $[\alpha_{top}, 1]$ . Gated before any scored compute, all PASS: patch_live=True with the A2 self-adjoint cert PASS ( $\Pi$ -reversible to $<10^{-10}$ ); provenance proven (opt_counts=[12200,12200] $=200\times61$ , weights_hash $\ne$ init_hash); and the trained-weight refresh guard that invalidated exp15/16/17 now PROVEN clean (constructor_was_stale=True, refreshed_vs_trained=0.0, refresh_ok=True) — so this is a valid trained read, not the stale-INIT bug. $\beta_{target}$ MEASURED at runtime $=1.0$ . Ran on a rented H200, SSH-driven, $1.194$ of the $4.0$ GPU-h cap.

The result

A hotter top DOES cure the decorrelation — a temperature effect, not local-kernel non-ergodicity. $\tau(\alpha)$ is non-monotone, rising to $\sim 330$ near $\alpha=0.18$ before collapsing:

| $\alpha_{top}$ | 0.5 | 0.18 | 0.08 | 0.03 | 0.02 | 0.01 | |---|---|---|---|---|---|---| | $\tau$ | 305.5 | 329.5 | 250.3 | 79.3 | 15.8 | 2.2 | | $\tau\le 25$ ? | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |

So the decorrelating set is $S=\{0.02, 0.01\}$ and the coldest decorrelating top (the cheapest span) is $\alpha_{top}^*=0.02$ . The anchor $\tau(0.5)=305.5$ reproduces exp18's $\tau_{hot}\approx 318$ — regression intact. But the analytic cost frontier does not overlap it:

| $\alpha_{top}$ | 0.5 | 0.25 | 0.18 | 0.05 | 0.02 | 0.01 | |---|---|---|---|---|---|---| | $\int\sqrt{C}$ | 105.5 | 156.5 | 172.7 | 217.7 | 227.5 | 231.0 | | $R^*$ | 64 | 94 | 104 | 130 | 136 | 138 |

Decorrelation requires $\alpha_{top}\le 0.02$ ; tractability requires $\alpha_{top}\ge\sim 0.25$ ( $R^*(0.25)=94$ ). There is no $\alpha_{top}$ that BOTH decorrelates AND has $R^*\le 96$ : $\min_{\alpha\in S} R^* = R^*(0.02)=136 > 96$ . The thermodynamic length $\beta\cdot\int\sqrt{C}\approx 227$ over the span $[0.02, 1.0]$ is the wall. Cost-aware decisive STOP at Stage A — no Stage-B ladder built, the $R^2$ -faithful round-trip cert never approached (budget_hit=False, clean completion).

This matched the pre-commitment: $S$ nonempty $\wedge$ $\min R^*=136>R_{MAX}=96$ routed correctly to a Stage-A STOP. The prediction-precision check confirmed direction (sublinear $R^*$ , $>64$ , costwall) but found the magnitude under-estimated — the crossing landed deeper ( $0.02$ , not the forecast $[0.02,0.08]$ band edge) and on-DTM $C$ ran higher than the exp18 extrapolation, so the cost wall is worse than forecast ( $R^*\approx 30\text{–}45\%$ above the predicted $\sim 90\text{–}100$ ).

Scope and caveats

Config-scoped: 60_12, SEED=0, $t=200$ , INPUT_IDX=0, single-input conditional, the reversible 4-block-Gibbs kernel, the equal-acceptance ladder family, the 10-point grid floored at $0.01$ . HOT-TOP-DECORRELATES-COSTWALL means "this kernel on this landscape needs $R^*=136$ for a single-span ladder," never "reversible PT cannot mix DTMs," and never a fundamentality verdict about $Q_{op}\approx Q_{struct}^{\perp}$ . $R^*(\alpha_{top})$ is a Gaussian-overlap LOWER BOUND for a multimodal landscape, so the true rung count could be higher — it corroborates the wall, it never rules mixing in. The grid floor at $0.01$ gives "deepest tested," not "provably deepest." Burn-in init (no exact $\pi_r$ ) means a STOP carries no robustness asymmetry. This sharpens spine Risk 5 (A2 $\leftrightarrow$ A6 antagonism), quantitatively: the reversible/self-adjoint kernel the factorization theorem requires (A2) needs $\alpha\approx 0.02$ to decorrelate, but a theorem-compatible tractable ladder exists only down to $\alpha\approx 0.25$ — and $R^*=136$ vs $R_{MAX}=96$ is the measured size of that gap on the real trained DTM. The conditional factorization stays [solid]; the operational claim stays [conjectured] ( $K\gg\tau_{int}$ , A7 open). No tag moves.

What this feeds: this is the keystone behind the A2 $\leftrightarrow$ A6 cost-wall framing — the obstruction is relocated, not removed, so the indicated pivot is to switch families (hierarchical PT / simulated tempering / population annealing to amortize the 136-rung length), or to fix the measurement with a $\tau_{int}$ -robust $Q_{op}$ estimator, rather than build a still-larger single-span ladder or raise the cap again.