nato-io-hmm

A synthetic hurricane-evacuation household-behavior data-generating process (DGP), designed to be paired with an Input-Output Hidden Markov Model (IO-HMM) fit. The DGP simulates N households over T hourly steps before landfall. Households move through five latent behavioral states (UA → AW → PR → ER → SH) under exogenous forecasts and warning orders, plus two endogenous feedbacks (network congestion, peer-departure share). Three observation channels are emitted each step: a noisy departure indicator, a displacement track, and a communication-activity count.

Pipeline

A single command sequence carries a run from synthesis to bands: simulate writes a four-file bundle (observations, population, timeline, config) for one scenario and seed. fit runs the IO-HMM EM loop with random restarts and emits its own four-file fit bundle. diagnose recovery aligns fitted states to truth via the Hungarian algorithm and writes state- and parameter-recovery metrics. report renders the smoke-check figures from either bundle.

sweep run repeats the simulation across every predefined scenario under a common seed; report sweep-all produces the cross-scenario comparison plots. The final stage — bootstrap fit followed by bootstrap shift-sweep — refits the IO-HMM on household resamples (joblib parallelism, ~minutes per replicate at N = 10000) and runs the warning-shift sweep, yielding quantile bands on failed-evacuation count as a function of warning timing.

Baseline simulator

Stacked area chart of state occupancy shares over time, with vertical lines marking voluntary order, mandatory order, and landfall.
Fig. 3 — Occupancy. Stacked area of state shares over time. UA depletes monotonically; AW, PR, and ER each bulge and drain in sequence; SH absorbs most of the population by landfall. Dashed verticals mark the voluntary and mandatory orders (read from the timeline, not hardcoded); the solid vertical is landfall.
Cumulative share of households that have ever departed, plotted against time. Approximately S-shaped.
Fig. 4 (single scenario) — Cumulative departures. Roughly S-shaped curve, near zero before the voluntary order, inflecting near the mandatory order, with an asymptote below 1 (some households shelter in place without ever departing). "Departure" here is the first hour at which a household enters the ER latent state — not the noisy emission flag.
Multi-panel plot showing per-household forecast intensity, latent state path as a step plot, departure marker, and normalized displacement track for two or three households.
Fig. 2 — Per-household trajectories. For 2–3 sampled households: forecast intensity overlay, latent state path as a step plot over UA → AW → PR → ER → SH, a single departure marker, and a normalized displacement track. The default household IDs are [0, 1, 2]; the CLI accepts an override.
Grouped bar chart of per-state means of departure, displacement, and communication count across the five latent states.
Emissions sanity check. Per-state means of departure, displacement, and comm_count. Departure mean spikes in ER (~0.95) and is near zero in UA; displacement is highest in ER and smaller in SH; communication counts rise from UA through PR then fall once households are en route or sheltered.

Recovery diagnostics

Confusion matrix between fitted IO-HMM states and ground-truth latent states after Hungarian alignment.
State recovery confusion. Confusion matrix between the fitted IO-HMM's Viterbi-decoded states and the simulator's ground-truth latent states, after permutation alignment by the Hungarian algorithm. Mass concentrated on the diagonal means the fit recovered the simulator's behavioral mode structure.
Scatter plot of fitted IO-HMM parameter values against their ground-truth values used to generate the synthetic data.
Parameter recovery. Fitted parameter values scattered against the ground-truth values that drove the DGP. Points should hug the identity line for transition and emission parameters identifiable from the bundle at the chosen N, T, and seed.

Scenario sweep

Cross-scenario cumulative departure curves overlaid on a single axis.
Fig. 4 (cross-scenario) — Cumulative departures by scenario. Cumulative departure share for each predefined scenario, overlaid on a common axis under a shared seed. Lets you read off how warning timing and intensity shift the departure curve relative to the baseline.
Two-by-two panel of network-coupling metrics across scenarios.
Fig. 5 — Network metrics. 2×2 panel of network-coupling metrics across scenarios. Definitions live in docs/network.md; the figure is the canonical view of how endogenous network congestion shapes departures differently from a pure exogenous-forecast model.

Parametric bootstrap

Failed-evacuation count plotted against warning shift, with shaded quantile bands from the parametric bootstrap.
Fig. 6 — Bootstrap bands on warning shift. Failed-evacuation count (households still in ER at landfall) as a function of warning-order shift, with quantile bands from the parametric bootstrap over household resamples. Reads as a decision-relevant uncertainty envelope around the policy counterfactual.

License

AGPL-3.0-only · Copyright © 2026 SWGY, Inc. Distributed without warranty; see LICENSE for the full text.