Turning the regularizer knob harder does not buy steering — it buys a steeper trade between an inflating diagnostic and a collapsing task.
This is the complete technical record for experiments/exp11-htdml-objective-lambda-sweep/. Here we keep the numbers, the gates, and the claim-status discipline.
The question
exp9 showed that, at the setting it tested, the trainability objective under-steers. The obvious rejoinder: push λ higher. So exp11 asks two things at G1-level objective-usability (never G2): does a larger λ steer past the bar, and if it does, does it steer through the genuine channel (faster mixing shrinking the gradient-noise denominator) rather than through anti-convergence or a -shrink artifact?
The setup
A frozen five-rung ladder over 4 training cells (, 4 teachers at ), 20 runs total, Adam step cap 2000. Pre-committed at gate-1 (d8073cc, before any implementation); runner frozen at gate-2 (dfd1bc6) after rehearsal and a 4-lens adversarial audit (0 MAJOR). The verdict basis is per-arm: an arm is verdict-eligible only if it passes the held-c25 task guard and has . Steering needs Leg 1 (median matched-crossing pooled- ratio ) and Leg 2 (genuine-channel gate at ). Ran 2026-06-14, laptop CPU, 2711 s wall; JAX 0.9.1 (x64). wandb ran offline (instrumentation-only, non-verdict-bearing).
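The per-arm verdict basis described above can be sketched as a small decision function. This is a minimal illustration, not the frozen runner: the names `ArmResult` and `arm_verdict` are hypothetical, every threshold is a caller-supplied placeholder rather than a registered value, and two points are assumptions made only for the sketch: that eligibility requires every cell to pass the guard, and that Leg 2 passes when the channel statistic is at or below its bar.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence


@dataclass
class ArmResult:
    name: str
    guard_passes: int                  # cells that pass the held task guard
    n_cells: int                       # training cells in the arm
    crossing_ratios: Sequence[float]   # matched-crossing ratios vs. baseline
    channel_stat: Optional[float]      # Leg 2 genuine-channel statistic, if computed


def arm_verdict(arm, leg1_bar, leg2_bar, min_crossings=1):
    """Evaluate one arm; all bars are placeholders, not the registered values."""
    # Assumption: an arm is eligible only if every cell passes the guard
    # and it has at least `min_crossings` matched crossings.
    eligible = (arm.guard_passes == arm.n_cells
                and len(arm.crossing_ratios) >= min_crossings)
    if not eligible:
        # Ineligible arms carry no verdict on either leg.
        return {"eligible": False, "leg1": None, "leg2": None, "steers": False}
    # Leg 1: the median matched-crossing ratio must reach the magnitude bar.
    leg1 = median(arm.crossing_ratios) >= leg1_bar
    # Leg 2 (assumed direction): the channel statistic must be at or below its bar.
    leg2 = arm.channel_stat is not None and arm.channel_stat <= leg2_bar
    return {"eligible": True, "leg1": leg1, "leg2": leg2, "steers": leg1 and leg2}


# Toy inputs with made-up numbers and made-up bars, for illustration only.
toy_ok_guard = ArmResult("toy_a", 4, 4, [1.0, 1.2, 1.1, 1.3], 1.01)
toy_bad_guard = ArmResult("toy_b", 2, 4, [2.5, 3.0], None)
result_a = arm_verdict(toy_ok_guard, leg1_bar=2.0, leg2_bar=0.9)
result_b = arm_verdict(toy_bad_guard, leg1_bar=2.0, leg2_bar=0.9)
```

Returning both legs separately, rather than stopping at the first failure, mirrors how the record reports Leg 1 and Leg 2 side by side for an eligible arm, and leaves ineligible arms with no leg verdict at all.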
The result
P1 — feasibility + fidelity: PASS (construction/formula-level, never "empirically validated"). All 20 runs finite (zero non-finite events); the value-agreement gate held across 26 comparisons at max (gate ); the FD battery passed at all 4 cells. The factored resolvent ran in-loop across ~32 000 pooled- evaluations with gradients — the objective is computable, differentiable, and in-loop-usable at every .
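For readers unfamiliar with the term, an FD battery is in general form a finite-difference gradient check: the analytic gradient is compared against a central-difference estimate at chosen points. The sketch below shows that general technique on a toy two-parameter function in plain Python. The toy objective, the step size `h`, and the tolerance are illustrative assumptions and have nothing to do with the exp11 objective or its registered gates.

```python
import math


def central_fd_grad(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x (a list of floats)."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def max_rel_err(a, b, eps=1e-12):
    """Largest componentwise relative difference between two vectors."""
    return max(abs(u - v) / max(abs(u), abs(v), eps) for u, v in zip(a, b))


def toy_objective(x):
    # Toy stand-in; NOT the exp11 objective.
    return math.log(1.0 + x[0] ** 2) + math.sin(x[1]) * x[0]


def toy_grad(x):
    # Hand-derived gradient of toy_objective.
    return [2.0 * x[0] / (1.0 + x[0] ** 2) + math.sin(x[1]),
            math.cos(x[1]) * x[0]]


x0 = [0.7, -1.3]
fd_err = max_rel_err(central_fd_grad(toy_objective, x0), toy_grad(x0))
fd_pass = fd_err < 1e-6  # illustrative tolerance
```

In a JAX code base the hand-derived gradient would typically be replaced by an autodiff gradient, with the same comparison against the finite-difference estimate.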
P2 — steering: Outcome 2, does NOT steer in this range (scoped negative). The matched-crossing pooled- ratio rises steeply while the guard-pass fraction collapses:

| arm | λ | guard | eligible | | median ratio | Leg 1 | Leg 2 () |
|---|---|---|---|---|---|---|---|
| baseline | 0 | 4/4 | — | — | — | — | — |
| lam0p1 | 0.1 | 4/4 | ✓ | 8 | 1.125 | FAIL () | FAIL (1.010 ) |
| lam0p3 | 0.3 | 2/4 | ✗ | 4 | 2.744 | — | — |
| lam1p0 | 1.0 | 1/4 | ✗ | 2 | 101.37 | — | — |
| lam3p0 | 3.0 | 0/4 | ✗ | 0 | — | — | — |

The ratio climbs (1.125 → 2.744 → 101.37) as the guard falls (4/4 → 2/4 → 1/4 → 0/4). The sole eligible arm, lam0p1 (λ = 0.1), fails Leg 1 (median 1.125, reproducing exp9 exactly). The large ratios at λ = 0.3 and λ = 1.0 are non-verdict quantities: those arms fail the arm-level task guard, so the inflation is pure anti-convergence, which the matched/held-crossing design excludes. At several arms re-cross their KL thresholds upward (e.g. ends at ; at ), and the held-at-stop rule (D5) correctly excludes those hollow crossings.
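The held-at-stop idea can be illustrated with a short sketch. It assumes one reading of rule D5: a threshold crossing counts only if the KL is still at or below that threshold at the final recorded step, so a run that dips under the threshold and then climbs back above it contributes no crossing. The function names and the toy traces are hypothetical; the registered rule may differ in detail.

```python
def first_crossing(kl_trace, threshold):
    """Index of the first step whose KL is at or below threshold, else None."""
    for step, kl in enumerate(kl_trace):
        if kl <= threshold:
            return step
    return None


def held_crossing(kl_trace, threshold):
    """First crossing step, but only if the threshold is still held at stop.

    Assumption for this sketch: 'held at stop' means the last recorded KL
    is at or below the threshold. Hollow crossings return None.
    """
    step = first_crossing(kl_trace, threshold)
    if step is None:
        return None          # threshold never reached
    if kl_trace[-1] > threshold:
        return None          # re-crossed upward: hollow crossing, excluded
    return step


# Toy traces with made-up values, for illustration only.
converging = [1.0, 0.5, 0.2, 0.1]
hollow = [1.0, 0.2, 0.6]
held_step = held_crossing(converging, threshold=0.3)
hollow_step = held_crossing(hollow, threshold=0.3)
```

Under this reading, only crossings that survive to the stop step enter a matched comparison, which is what keeps anti-converging runs from contributing to the ratio.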
Even the eligible arm's genuine channel is idle: median , so holding gradient-mass overlap fixed, faster mixing did not shrink the denominator. The movement is signal-side (numerator ratios 0.96–2.05) plus -shrink (where the denominator fell, e.g. c25 at 0.866, it came via , not ) — exactly the artifact the gate exists to exclude.
P3 — KL guard: clean monotone collapse, .
P4 — R4 carry-over: verified on 288 cells (verify-or-HALT, no HALT; max A2 residual ).
P5 — descriptive: baseline reproduces exp8 bitwise (every matched rel-diff exactly ); median over .
Scope and caveats
Demo-level only. 4 cells, , one seed table, one teacher set, five rungs — "does not steer / does not engage " does not generalize beyond this preregistered grid, and never to HTDML at scale. This is a predictor, not an estimator: exp11 moves and measures exactly; no sampling, no . The conditional tier stays [solid], the operational tier [conjectured] (, A7 open at scale). This is not "cannot steer" — Outcome 2 is a scoped negative; exp7's small-family crossable region keeps a genuine positive plausible elsewhere. No tag moves; G2 untouched.
What this feeds: the RBM-retrofit G1 column and the spine Risk-2 / HTDML-property annotations now record that no registered dose met both steering magnitude and task compatibility. This closes the larger-λ frontier at demo level while leaving the operational factorization tier exactly where it was.