calibration_multimodal_from_synthetic - laser-base/laser-measles GitHub Wiki
Experiment V2: Two-Phase Spatial Probe for Gravity Mixing Calibration
Date: 2026-04-23
Figure: experiment_v2_summary.png
Scripts: generate_reference_v2.py, calibrate_v2.py, experiment_v2_summary.py
Background and Motivation
Experiment V1 attempted to calibrate a three-parameter spatial measles model (transmission rate β, gravity mixing scale k, and distance exponent c) using an SIA (supplemental immunization activity) in cluster_B as a perturbation to help identify k. The design failed for k: the SIA fired before the epidemic reached cluster_B, so cluster_B's pre-existing immunity was set entirely by the SIA regardless of how much natural cross-cluster transmission had occurred. The loss surface was flat in the k direction (identifiability ratio = 0.000), and the best-fit k was 480% off the true value.
The core diagnosis: k only matters if cross-cluster coupling has had time to leave a trace. An SIA that fires before any infection arrives erases the evidence before it can be observed.
Experimental Design
The V2 design keeps the same scenario (50 patches split into two clusters of 25, ~2.3 M agents, no vital dynamics) but repositions the SIA to fire mid-invasion.
Timeline
| Phase | Ticks | What happens |
|---|---|---|
| Burn-in | 0 to 199 | Fully susceptible population |
| Phase 1 | 200 to 265 | Import seeds cluster_A (5 largest patches, rate 3/day for 4 days); epidemic spreads within cluster_A and leaks into cluster_B at a rate controlled by k |
| SIA fires | tick 265 | 70%-efficacy campaign targets all of cluster_B |
| Phase 2 | 266 to end | Within-cluster epidemic in cluster_B spreads from the patches already seeded before the SIA; cluster_A continues to burn out |
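The tick boundaries in the timeline can be written down as a small lookup, which is handy when slicing time series by phase. This is a minimal sketch: the constant and function names are illustrative and are not taken from the experiment scripts.

```python
# Tick boundaries copied from the timeline table; names are hypothetical.
IMPORT_START_TICK = 200  # first tick of Phase 1 (import seeds cluster_A)
SIA_TICK = 265           # SIA fires on the last tick of Phase 1


def phase_of(tick: int) -> str:
    """Return the experiment phase that a simulation tick belongs to."""
    if tick < 0:
        raise ValueError("tick must be non-negative")
    if tick < IMPORT_START_TICK:
        return "burn-in"
    if tick <= SIA_TICK:
        return "phase-1"
    return "phase-2"
```

For example, `phase_of(265)` returns `"phase-1"` and `phase_of(266)` returns `"phase-2"`, matching the table.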
True parameters: β = 0.50, k = 0.050, c = 1.50
SIA efficacy: 70% (raises cluster_B effective immune fraction from ~0% to 70%, leaving R_eff ≈ 1.2, enough to sustain local spread from seeded patches but not saturating)
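To make the roles of k and c concrete, the sketch below uses a generic gravity kernel in which the coupling weight scales linearly with k and falls off as distance raised to the power c. The functional form, the function name, and the use of destination population are assumptions for illustration; the exact kernel used by the model is not stated on this page.

```python
def gravity_coupling(pop_dest: float, distance: float, k: float, c: float) -> float:
    """Generic gravity weight k * pop_dest / distance**c (assumed form)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_dest / distance ** c


# Illustrative inputs only: a destination of 1000 people at two distances.
near = gravity_coupling(pop_dest=1000, distance=100, k=0.05, c=1.5)
far = gravity_coupling(pop_dest=1000, distance=400, k=0.05, c=1.5)
ratio = near / far  # distance grew 4x, so the weight shrinks by 4**1.5
```

Under this assumed form, k rescales every cross-patch weight uniformly, while c controls how quickly coupling decays with distance, which is why the two parameters can be hard to separate once both clusters are infected.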
The key observable is the patch invasion count: how many of the 25 cluster_B patches have any recorded infection before tick 265. At the true k this is 11/25. The count is sensitive to k across the plausible range: k = 0.001 yields ~0 to 1 patches invaded; k = 0.05 yields ~11; k = 0.15 yields ~20+. A complementary, coarser signal, cluster_B's cumulative recovered fraction R(t_SIA), rises monotonically with k and is visible to a deterministic compartmental model, making it useful for Stage 1 calibration.
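The patch invasion count described above could be computed along these lines. This is a sketch under stated assumptions: the input is taken to be per-patch infectious counts indexed as `[patch][tick]` and restricted to cluster_B patches, which may not match the actual layout of `I_by_patch.npy`; the function names are hypothetical.

```python
from typing import Optional, Sequence


def first_infection_tick(series: Sequence[int]) -> Optional[int]:
    """Tick of the first nonzero infectious count for one patch, or None."""
    for tick, count in enumerate(series):
        if count > 0:
            return tick
    return None


def invasion_count(i_by_patch: Sequence[Sequence[int]], sia_tick: int) -> int:
    """Number of patches first infected strictly before sia_tick."""
    invaded = 0
    for series in i_by_patch:
        first = first_infection_tick(series)
        if first is not None and first < sia_tick:
            invaded += 1
    return invaded


# Toy data: four patches, five ticks, SIA at tick 3.
toy = [
    [0, 1, 2, 0, 0],  # first infected at tick 1 (counts)
    [0, 0, 0, 1, 0],  # first infected at tick 3 (not strictly before)
    [0, 0, 0, 0, 0],  # never infected
    [0, 0, 1, 0, 0],  # first infected at tick 2 (counts)
]
toy_count = invasion_count(toy, sia_tick=3)
```

In the toy data two patches qualify, so `toy_count` is 2. Applied to the reference run with `sia_tick=265`, the same logic would be expected to reproduce the reported 11/25 if the array layout assumption holds.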
Reference Simulation
The reference dataset is a single ABM run at true parameters.
| Metric | Value |
|---|---|
| Global peak infectious | tick 299 |
| cluster_A final attack rate | 98.0% |
| cluster_B final attack rate | 79.3% |
| SIA baseline immunity | 70% |
| k-attributable excess AR in cluster_B | +9.3 pp |
| cluster_B patches invaded before SIA | 11 / 25 |
| Inter-cluster arrival lag | 41.4 ticks |
The 9.3 pp gap between the SIA-floor (70%) and the realized cluster_B attack rate (79.3%) is the integrated signal of cross-cluster coupling during Phase 1. It is small enough to be k-sensitive but large enough to survive the noise of the stochastic reference.
Figure Walk-through
All nine panels are in the figure experiment_v2_summary.png.
Row 0: Reference and Design
- (0,0) Experiment timeline schematic. The yellow band marks Phase 1 (the k-signal window); the blue band marks Phase 2 (post-SIA within-cluster epidemic). The schematic highlights that the import seeds cluster_A, the gravity model leaks infections into cluster_B during Phase 1, and the SIA interrupts that process at tick 265.
- (0,1) Reference ABM global I(t). A single clean epidemic peak at tick 299. Phase 1 is the rising edge; Phase 2 is the declining tail after the SIA removes a large fraction of cluster_B's remaining susceptibles.
- (0,2) Cluster-level R(t). cluster_A (blue) saturates near 98%; cluster_B (red) reaches 79.3%, annotated with the 9.3 pp gap above the SIA-floor baseline. The gap is visible as a clear separation between the 70% dashed line and the cluster_B final value.
Row 1: k Identifiability
- (1,0) Per-patch invasion timeline. Each horizontal bar is one cluster_B patch, sorted by its first-infection tick. The 11 patches (red) that were seeded strictly before tick 265 are the Phase-1 k signal. The 14 gray bars represent patches that were first infected by within-cluster spread after the SIA or were never reached. This is the ABM-only observable: the compartmental model cannot resolve individual patch invasions.
- (1,1) Loss surface β×k. The 8×8 sweep of the compartmental model (CMP), with c fixed at 1.50, shows a clear, tight minimum near the true k = 0.05 (lime crosshairs). The orange diamond marks the Optuna CMP best (k = 0.052). Contrast with V1, where this panel showed a horizontal stripe with no variation in the k direction.
- (1,2) CMP-visible k signal. Extracted from the same sweep: cluster_B's cumulative R at tick 265 as a function of k, at β ≈ 0.50. The monotonic rise from ~60% at k = 0 to ~80%+ at k = 0.20 is the mechanism that gives the CMP leverage on k. Without this signal the loss surface is flat; with it, k acquires a well-defined gradient.
Row 2: Calibration Results
- (2,0) Global I(t) fit. The CMP best (blue dashed) closely tracks the ABM reference (black) in peak height and timing. The ABM best (red dash-dot) is also a reasonable fit globally, though it over-estimates the tail duration.
- (2,1) Cluster R(t) fit (CMP). The CMP recovers both cluster curves accurately, including the final 9.3 pp gap. This confirms that the loss function correctly captures the cross-cluster coupling signal.
- (2,2) Parameter recovery: V1 vs V2. Bar chart of relative error for each parameter, comparing both experiments. The headline result is the k column: CMP error dropped from +481% (V1) to +4% (V2). ABM k error fell from +184% to +98%, a meaningful improvement but not yet converged. β and c are recovered well in both versions.
Calibration Results
Stage 1: Compartmental (Optuna, 100 trials)
| Parameter | True | Best-fit | Error |
|---|---|---|---|
| β | 0.500 | 0.548 | +9.6% |
| k | 0.050 | 0.052 | +3.8% |
| c | 1.500 | 1.690 | +12.6% |
Loss: 0.006 (V1 reported ~20.0 because every trial landed in invalid parameter space)
Stage 2: ABM warm-start (Optuna, 40 trials × 3 seeds)
| Parameter | True | Best-fit | Error |
|---|---|---|---|
| β | 0.500 | 0.519 | +3.7% |
| k | 0.050 | 0.099 | +98% |
| c | 1.500 | 2.544 | +70% |
Loss: 0.856
Identifiability Ratios (8×8 CMP sweep)
| Parameter | Ratio | Verdict |
|---|---|---|
| k | 0.836 | Well-identified |
| c | 0.910 | Well-identified |
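The definition of the identifiability ratio is not given on this page. One plausible construction, shown here purely as an assumption and not as the computation in `calibrate_v2.py`, divides the range of the profile loss along one parameter by the range of the whole loss grid. A flat direction then scores 0, consistent with the V1 value of 0.000 for k.

```python
from typing import Sequence


def identifiability_ratio(loss_grid: Sequence[Sequence[float]], axis: int) -> float:
    """Range of the profile loss along `axis` divided by the full loss range.

    axis=0 profiles the row parameter, axis=1 the column parameter.
    Returns 0.0 when the whole grid is flat.
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1")
    rows = [list(r) for r in loss_grid]
    flat = [v for r in rows for v in r]
    total_range = max(flat) - min(flat)
    if total_range == 0:
        return 0.0
    if axis == 0:
        profile = [min(r) for r in rows]            # best loss per row value
    else:
        profile = [min(col) for col in zip(*rows)]  # best loss per column value
    return (max(profile) - min(profile)) / total_range


# Toy grid: rows index beta, columns index k; loss depends on beta only.
flat_in_k = [
    [1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0],
    [4.0, 4.0, 4.0],
]
ratio_k = identifiability_ratio(flat_in_k, axis=1)
ratio_beta = identifiability_ratio(flat_in_k, axis=0)
```

In the toy grid the loss never changes along the k columns, so `ratio_k` is 0.0, while `ratio_beta` is 1.0.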
Discussion
What worked
The V2 redesign solves the k identifiability problem at the compartmental level. By positioning the SIA mid-invasion rather than pre-epidemic, the experiment creates a monotonic relationship between k and the cluster_B R(t_SIA) observable. The CMP can detect this relationship and converge to within 4% of the true k in 100 Optuna trials. This is the surrogate-calibration paradigm in practice: a fast deterministic model navigates parameter space using cluster-level signals, then hands off to the expensive stochastic model for patch-level refinement.
What still needs work
The ABM Stage 2 recovers k with ~98% error, better than V1's 184% but far from the CMP's 4%. Two likely causes:
- Noisy invasion count. The patch invasion count (the ABM-only k term) is stochastic: across three seeds with the same parameters, the count varies by ±2 to 4 patches. With W_INVASION = 3.0 this term dominates the loss but is too noisy to guide the search cleanly at 40 trials. More seeds, more trials, or a smoother k observable would help.
- c-k correlation in Phase 2. After the SIA, the within-cluster epidemic in cluster_B involves both within-cluster transmission (influenced by c through patch-to-patch distances) and continued cross-cluster seeding (influenced by k). The ABM loss function may be trading k against c in Phase 2, since both affect the realized cluster_B final AR. Adding a loss term that specifically isolates the Phase-1 period (e.g., cluster_B I(t) for t < SIA_TICK only) could break this correlation.
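The Phase-1-only loss term suggested above might look like the following. It is a sketch, not part of the existing loss function: the mean-squared-error form, the function name, and the assumption that both inputs are cluster_B I(t) series indexed by tick are all illustrative choices.

```python
from typing import Sequence

SIA_TICK = 265  # from the experiment timeline


def phase1_loss(
    sim_i: Sequence[float],
    ref_i: Sequence[float],
    sia_tick: int = SIA_TICK,
) -> float:
    """Mean squared error between two I(t) series for ticks t < sia_tick."""
    n = min(len(sim_i), len(ref_i), sia_tick)
    if n <= 0:
        raise ValueError("no ticks before the SIA to compare")
    return sum((s - r) ** 2 for s, r in zip(sim_i[:n], ref_i[:n])) / n


# Toy series with the SIA at tick 3: only ticks 0, 1, 2 contribute.
sim = [0.0, 1.0, 2.0, 9.0, 9.0]
ref = [0.0, 1.0, 4.0, 0.0, 0.0]
toy_loss = phase1_loss(sim, ref, sia_tick=3)  # (0 + 0 + 4) / 3
```

Because everything at or after the SIA tick is ignored, large post-SIA differences (ticks 3 and 4 in the toy series) cannot be traded off between k and c in this term.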
Design principle
The V1 to V2 progression illustrates a general lesson: the experiment must be designed so that the target parameter leaves a trace in the observables before any intervention erases it. The intervention itself (the SIA) is useful precisely because it creates a contrast, but only if it fires after the parameter-dependent process has run long enough to be measurable. Optimal experiment design for spatial calibration should therefore ask: for each parameter, what is the earliest observable signature, and does the intervention window allow that signature to be read?
Files
| File | Description |
|---|---|
| generate_reference_v2.py | Runs the reference ABM, saves abm_reference_v2/ |
| calibrate_v2.py | Identifiability sweep + CMP + ABM calibration |
| experiment_v2_summary.py | Generates experiment_v2_summary.png |
| experiment_v2_summary.png | 3×3 panel report figure |
| sweep_identifiability_v2.png | 10×10 sweep (from calibrate_v2.py) |
| calibration_diagnostics_v2.png | Per-patch AR, arrival histograms, convergence |
| abm_reference_v2/ | Reference data: I_by_patch.npy, R_by_patch.npy, patch_summary.csv |