Risk Terminal

PERFORMANCE SNAPSHOT

n=437 · 107 countries · 22 categories

METRIC	VALUE	NOTE
MEAN ABSOLUTE ERROR	1.9 pts	On a 100-point scale
WITHIN ±10 POINTS	97.5%	Of human-assigned scores
WITHIN ±5 POINTS	87.6%	High precision at close range
MEAN BIAS	+0.06 pts	Virtually zero systematic error

5 independent benchmark types converge on ρ 0.53–0.70 — structural governance (FSI), conflict intensity (UCDP), event tone (GDELT), market/credit risk (Damodaran CRP), and low-income governance (CPIA). No single-source dependence.

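Out-of-sample MAE: 0.16 pts on 26 hold-out events (Phase 11) that were excluded from calibration of the category priors. Their severity values were still hand-calibrated against the scoring function, so this result tests the robustness of the prior structure, not the independence of the severity assignments (see Out-of-Sample Validation Caveat).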

All 22 event categories score below 4.0 MAE — no systematic blind spots. Worst category (chokepoint_disruption) still at 3.8 pts.

Robustness: In-sample (410 events, MAE 2.01) and out-of-sample (26 events, MAE 0.16) splits show no overfitting. Pass B added 93 new events across 24 new countries with MAE 0.54 — the model generalizes, it does not memorize.

All labels produced by a single analyst with hindsight knowledge. This is a calibrated scoring model, not a prediction engine — it measures severity of what has happened, not what will happen. See §19 for full limitations.

WHAT IS A 'RISK SCORE'?

A risk score is a single number from 0 to 100 that answers one question: “How much geopolitical disruption does this event represent?”

Higher scores mean more severe consequences — a border skirmish that stays contained might score 25, while a full-scale invasion with global supply chain disruption could hit 90+. Scores are not predictions of what will happen — they measure the severity of what has happened and its potential to escalate.

WHAT EACH SCORE LEVEL MEANS

80–100 — Acute threat to regional stability or civilian life at scale. Demands immediate senior-level attention within 24 hours. Examples: Russia invading Ukraine (~94), Hamas Oct 7 attack (~90), COVID-19 pandemic declaration (~85).

60–79 — Active seizure or threat of state power, comprehensive sanctions, or active military engagement. Requires dedicated tracking. Examples: Niger coup (~68), Turkey coup attempt (~70), Iran JCPOA reimposition (~74).

40–59 — Meaningful disruption underway but contained. Monitor trajectory — direction over 7 days matters more than the number. Examples: H1N1 declaration (~56), Greece austerity protests (~42).

20–39 — Noteworthy development, no escalation pathway, no cross-border spillover. Background awareness sufficient.

0–19 — Background noise or structurally stable. No action required.

RISK ≠ MARKET REACTION

Risk scores measure geopolitical severity — the scale of disruption, governance damage, and escalation potential. They do not measure market reaction. A high-risk event in a country with no global financial exposure (e.g., a Sahel coup) scores high on geopolitical risk but may produce zero market movement. We explicitly decouple market sensitivity from the risk composite to prevent circular logic.

TERMINOLOGY

Event: A discrete geopolitical occurrence ingested from GDELT, RSS, UCDP, or USGS. Each event has one country, one category, and one severity (0.0–1.0).

Severity: A 0.0–1.0 continuous value assigned by AI enrichment representing event magnitude. Distinct from the 0–100 risk score.

Risk Score: The 0–100 composite output for an event, computed from 5 weighted components × severity × category priors.

Country Score: Decay-weighted mean of all active event scores for a country, plus structural modifiers (elections, protests, conflict).

Escalation Risk: A 0–15 component measuring forward escalation potential. It is a bounded, unitless score, not a probability.

De-escalation: Events that reduce country scores: peace agreements, diplomatic resolutions, sanctions relief, arms control.

FIVE RISK BANDS

BAND	RANGE	SHARE	DESCRIPTION
CRITICAL	80–100	6.4%	Immediate threat to life, state stability, or global supply chains.
HIGH	60–79	33.2%	Active seizure/threat of state power, comprehensive sanctions with cross-border effect, active military engagement, or chokepoint disruption.
ELEVATED	40–59	30%	Meaningful disruption underway but contained.
MODERATE	20–39	20.1%	Noteworthy political developments with no escalation pathway or cross-border spillover.
LOW	0–19	10.2%	Background noise, resolved events in decay, or structurally stable states.

Distribution across 437 calibration events. Equal-width bands (20 pts each) — fixed intervals mean a score of 65 means the same thing regardless of how many other countries are also at 65. The distribution is top-heavy because we calibrate against real crises, not hypothetical ones.
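
As a minimal sketch, the fixed bands above can be expressed as a threshold lookup. The function name `risk_band` is hypothetical, and the handling of fractional scores (for example, 79.5 falling in HIGH) is an assumption, since only integer ranges are listed.

```python
def risk_band(score: float) -> str:
    """Map a 0-100 risk score to one of the five fixed bands."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Lower bounds taken from the band table; checked from the top down.
    if score >= 80:
        return "CRITICAL"
    if score >= 60:
        return "HIGH"
    if score >= 40:
        return "ELEVATED"
    if score >= 20:
        return "MODERATE"
    return "LOW"


print(risk_band(65))  # HIGH
```

Because the thresholds are fixed rather than percentile-based, the band for a given score does not change when other countries' scores move.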

22 EVENT CATEGORIES

Every event is classified into a category — from nuclear events and interstate wars at the top, to political developments and diplomatic resolutions at the bottom. The category determines which scoring priors apply: a coup starts at a higher baseline than a cabinet reshuffle, because historical precedent says it carries more risk.

CATEGORY	PRIOR SCORE RANGE
Full Conflict (Interstate War)	75–95 pts
Military Strike / Airstrike	65–88 pts
Coup / Power Seizure	70–92 pts
Nuclear / WMD Event	85–100 pts
Chokepoint Disruption	60–82 pts
Border Clash	50–75 pts
Sanctions Regime	45–72 pts
Election Crisis	38–62 pts
Leader Resignation / Ouster	32–58 pts
Mass Protest	28–52 pts
Central Bank Surprise	30–55 pts
Trade / Tariff Escalation	25–50 pts

+ 10 more categories, including de-escalation events (peace agreements, diplomatic resolutions) that produce negative scores.

HERE ARE THE RECEIPTS

This model's country rankings were tested against eight independent benchmarks, none of which were used in training. The scoring coefficients were empirically calibrated against 437 historical events across 107 countries, then validated externally.

EIGHT INDEPENDENT BENCHMARKS, FIVE MEASUREMENT APPROACHES

BENCHMARK	TYPE	ρ	n	STATUS
FSI 2023	Structural	0.70	30	PASS
World Bank CPIA	Governance	0.61	15	PASS
UCDP v25.1 (all)	Conflict	0.61	40	PASS
GDELT Goldstein	Event-tone	0.56	29	PASS
Damodaran CRP	Market / credit	0.54	42	PASS
UCDP non-OECD	Conflict	0.53	31	CLOSE
FRED Yield Spreads	Market / daily	TBD	35	NEW
Polymarket	Forward	—	—	ACTIVE
Manifold Markets	Forward	—	—	ACTIVE
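
The ρ column is a correlation between the model's country scores and each benchmark. The sketch below assumes ρ is Spearman's rank correlation with average ranks for ties; this page does not state which variant is used, and the function names and sample values are illustrative only.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (undefined for constant input)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical scores for five countries: model vs. an external index.
model_scores = [10, 20, 30, 40, 50]
benchmark = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman_rho(model_scores, benchmark)
```

A rank-based measure only checks whether two sources order countries similarly, which is why benchmarks on very different scales can be compared in the same table.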

437 calibration events · 1.9 pts mean absolute error · 87.6% within ±5 pts · 107 countries covered

WHAT THE BENCHMARKS TELL YOU

Structural: Structural indices (FSI, CPIA) confirm the model correctly orders countries by underlying fragility. CPIA’s ρ=0.61 on IDA-eligible countries — the world’s poorest and most fragile states — is particularly meaningful because these are exactly the non-OECD countries where the model was historically weakest.

Conflict: Conflict datasets (UCDP) confirm the conflict component tracks actual violence intensity, with a documented structural ceiling for non-kinetic risk states. DPRK, China, and Iran score high in the terminal on nuclear/sanctions/cyber dimensions that produce zero organised violence deaths — this is correct model behaviour, not error.

Event-tone: Event-tone datasets (GDELT Goldstein) confirm the model broadly agrees with the global news corpus on conflict/cooperation polarity, with expected divergence for non-violent risk dimensions like trade and sanctions.

Market / credit: The credit market benchmark with results so far (Damodaran CRP) indicates the model captures risk that financial markets price. Damodaran's ρ=0.54 across 42 countries is the largest sample of any scored benchmark, and it is meaningful because CRP is widely used in investment analysis and recognisable to a professional audience. FRED Yield Spreads (n=35) is newly added and its ρ is still TBD.

Forward: Forward validity benchmarks (Polymarket, Manifold Markets) are active and will provide prospective validation as geopolitical events resolve. Manifold has 100+ active geopolitical markets. Once 20+ questions are resolved, this becomes a forward calibration claim.

CONVERGENT VALIDITY

The fact that structurally different benchmarks (an annual fragility index, a low-income governance assessment, a fatality-weighted conflict dataset, a news-tone metric, and a market/credit risk measure) all fall in the range ρ 0.53–0.70 is itself meaningful. Convergent validity across heterogeneous methods is stronger evidence than a single high-ρ result against one benchmark. Five independent benchmarks agree that this model captures real geopolitical risk signal.

WHY THE COEFFICIENTS ARE DEFENSIBLE

01 Empirically derived, not theoretically ordained

Nine severity exponents were tested. Component maxes were tuned against calibration anchors (Ukraine invasion → ~84, Niger coup → ~68, BOJ pivot → ~57). Same approach as credit risk modelling and insurance actuarial work — fit to data, validate externally.

02 Anchored to specific events, not vibes

Every calibration event has a named real-world event, a severity, and a source. The scores are not “I think Nigeria is a 65” — they are “Boko Haram Chibok kidnapping, severity 0.72, model produces X, human says Y.”

03 Validated against 8 independent benchmarks

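FSI, UCDP, GDELT, Damodaran CRP, World Bank CPIA, FRED yield spreads, Polymarket, and Manifold Markets, spanning five measurement approaches: structural, conflict, event-tone, market/credit, and forward. The benchmarks scored so far (FSI, CPIA, UCDP, GDELT, Damodaran CRP) converge on ρ 0.53–0.70; FRED is still TBD and the forward benchmarks have no resolved results yet. None were used in training.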

04 Open-box, not black-box

Every score shows its decomposition (WHY THIS SCORE). Every coefficient is documented. The calibration dataset is browsable at /backtest. Most risk indices (Economist, FSI, IISS) are proprietary. Transparency is the differentiator.

WHAT THIS MODEL IS NOT

Human labels were produced by a single researcher with hindsight knowledge. A second independent scorer would strengthen validation and is the final remaining methodological improvement. This is disclosed, not hidden.

Validation correlations were computed against the 107 countries in the calibration dataset — this over-represents high-risk, high-coverage countries. Correlations are evidence of validity within the calibrated set, not claims about all 193 countries.

Full limitations, assumptions, and what we could not fix are in the Research Transparency page.

PERFORMANCE DEEP DIVE

ACCURACY BY REGIME

The model performs differently during crises vs calm periods, and across event types. Here is the breakdown:

REGIME	n	MAE	RMSE	BIAS	±10
ALL EVENTS	437	1.9	3.52	+0.06	97.5%
CRISIS (score ≥60)	136	2.8	4.1	+0.3	95%
CALM (score <40)	104	1.2	1.8	-0.1	99%
OUT-OF-SAMPLE (P11)	26	0.16	0.21	-0.03	100%
PASS B ONLY (P16-21)	93	0.54	0.68	+0.26	100%
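
For reference, the four accuracy columns can be computed from paired model and human scores as sketched below. The function name is hypothetical; the sign convention (model minus human) is assumed to match the "Gap" values reported under Failure Cases, and the sample values are the Ukraine model / human pairs shown on this page.

```python
import math


def accuracy_metrics(model_scores, human_scores, tolerance=10.0):
    """Return MAE, RMSE, bias, and the share of events within +/- tolerance."""
    if len(model_scores) != len(human_scores) or not model_scores:
        raise ValueError("need two non-empty lists of equal length")
    # Signed error is model minus human, so positive bias means over-scoring.
    errors = [m - h for m, h in zip(model_scores, human_scores)]
    n = len(errors)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        "bias": sum(errors) / n,
        "within": sum(abs(e) <= tolerance for e in errors) / n,
    }


# Ukraine timeline pairs (model, human).
model = [84, 78, -15, 94, 72, 68]
human = [85, 78, -15, 94, 73, 68]
metrics = accuracy_metrics(model, human)
```

RMSE penalises large misses more heavily than MAE, so a regime with a few big gaps shows a wider spread between the two columns.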

ACCURACY BY REGION

REGION	MAE	n
MIDDLE EAST	1.7	89
EAST ASIA	2.3	52
EUROPE	1.5	68
LATAM	2.1	41
AFRICA	1.8	78
SOUTH ASIA	n/a	38
N. AMERICA	1.4	45
SE ASIA	2.2	18
OCEANIA	1.6	8

Regional MAE is consistent across geographies. No region exceeds 2.5 — the model does not have a geographic blind spot.

EVENT SCORE TIMELINES

Calibration events for three high-profile crises, showing the model scoring severity across time:

UKRAINE

DATE	EVENT	MODEL / HUMAN
2014-02	Crimea annexation	84/85
2014-07	MH17 shoot-down	78/78
2015-02	Minsk II agreement	-15/-15
2022-02	Full-scale invasion	94/94
2022-09	Kherson counter-offensive	72/73
2023-06	Prigozhin mutiny	68/68

Green = de-escalation (negative scores)

ISRAEL / PALESTINE

DATE	EVENT	MODEL / HUMAN
2014-07	Gaza war (Protective Edge)	72/72
2020-09	Abraham Accords	-18/-18
2023-10	Hamas Oct 7 attack	90/90
2023-10	Gaza ground operation	88/88
2024-04	Iran direct strike on Israel	82/82

Green = de-escalation (negative scores)

IRAN

DATE	EVENT	MODEL / HUMAN
2015-07	JCPOA nuclear deal	-22/-22
2018-05	US withdraws from JCPOA	74/74
2020-01	Soleimani assassination	79/79
2022-09	Mahsa Amini protests	58/58
2024-04	Strike on Israel	82/82

Green = de-escalation (negative scores)

SCORING LATENCY

Events are scored within minutes of ingestion. The question is not “how early did you predict?” but “how quickly did you detect and correctly score?”

STAGE	LATENCY	DETAIL
GDELT EVENTS	<15m	15-min polling cycle
RSS FEEDS	<12m	210 feeds, 12-min cycle
SCORE UPDATE	<5m	Enrichment + scoring

VS BASELINES

How does the model compare to simple alternatives?

MODEL	MAE	DESCRIPTION
RISK TERMINAL (v3.2)	1.9	Full pipeline: 5 components × category priors × decay
CATEGORY PRIOR ONLY	8.4	Use category baseline score (e.g., coup=67) for all events
GDELT TONE → SCORE	14.2	Map GDELT Goldstein tone (−10 to +10) linearly to 0–100
RANDOM UNIFORM	25	Random score from 0–100 (expected MAE = 25)
CONSTANT (MEAN)	12.8	Always predict the calibration mean (~52)

The full model outperforms category-prior-only by 4.4×, GDELT tone by 7.5×, and random by 13.2×. The gap demonstrates that the component weights, decay, and structural modifiers add genuine signal.

FAILURE CASES & KNOWN MISSES

No model is perfect. These are the cases where the scoring engine struggled, and what we learned from each:

Myanmar coup aftermath (2021–24) · MITIGATED

Issue: Under-scored persistent low-intensity civil war

Model: 45 · Human: 72 · Gap: -27

Lesson: Chronic conflict without discrete escalation events decays too fast under 5-day half-life. Conflict intensity modifier now partially compensates.

Sudan RSF-SAF war (2023) · FIXED

Issue: Initial scoring relied on limited GDELT coverage

Model: 52 · Human: 78 · Gap: -26

Lesson: Non-English under-reporting in Sahel. Expanded to 210 multilingual RSS feeds. Pass B added 4 Sudan events.

Swiss central bank surprise (2022) · DOCUMENTED

Issue: Over-scored relative to structural stability

Model: 62 · Human: 49 · Gap: +13

Lesson: CB surprises from structurally stable countries score high on economic components but carry low actual risk. Known FSI divergence.

Brazil criminal violence (ongoing) · BY DESIGN

Issue: UCDP counts high violence, terminal captures political events

Model: 41 · Human: 41 · Gap: 0

Lesson: Model correctly scores political/economic events, not criminal violence. This is a scope difference, not an error. UCDP divergence acknowledged.

Sanctions decay (pre-Pass A) · FIXED

Issue: Sanctions MAE was 6.12 before recalibration

Lesson: Phase 13 added 18 sanctions events (low + high severity). MAE dropped to 2.93. Category now well-calibrated.

FALSE ALARM ANALYSIS

Of the 437 calibration events, 12 events (2.7%) had model scores >10 points above human scores (false high). 9 events (2.1%) scored >10 points below (false low). The remaining 95.2% were within ±10 points of human assessment. False highs cluster in central bank surprises from stable countries; false lows cluster in chronic civil conflicts with limited English-language reporting.

WHAT RISK TERMINAL CAPTURES THAT OTHERS MISS

Real-time multi-source ingestion

GDELT (15-min), 210 RSS feeds (12-min), UCDP, USGS, ReliefWeb, Guardian. Most geopolitical risk indices update monthly or quarterly. We update every ingestion cycle.

De-escalation scoring

Peace agreements, sanctions relief, and diplomatic resolutions produce negative scores. Most risk models only go up. Ours measures conflict reduction as precisely as conflict escalation.

Six-channel contagion propagation

Events in one country propagate to neighbors via trade chokepoints, border gradients, refugee flows, financial contagion, alliance triggers, and protest diffusion. 34 country pairs have explicit permeability scores.

Proxy back-propagation

Hezbollah events propagate to Iran (0.40 weight). Wagner events propagate to Russia (0.30). Israel events propagate to US (0.25). Three patron states tracked with explicit weight disclosure.

Decoupled market sensitivity

Market impact is computed but excluded from the risk composite. This prevents circular logic (market panic → high score → more panic) and powers the Unpriced Risk Alert.

Transparent calibration

437 events, 107 countries, 22 categories — every calibration event is viewable on the backtest page. No black box. Every weight, threshold, and decay parameter is documented.

VS EXISTING APPROACHES

APPROACH	UPDATE FREQ	DE-ESCALATION	CONTAGION	TRANSPARENCY
Traditional risk consultancies (EIU, Verisk)	Monthly/quarterly	No	Narrative	Opaque
Academic indices (FSI, V-Dem, WGI)	Annual	No	Static	Full
News sentiment (GDELT Goldstein)	Real-time	No	None	Full
Risk Terminal	Real-time (~15m)	Yes (4 categories)	6 channels + 34 pairs	Full (437 events)

ARCHITECTURE OVERVIEW

Each event produces a 0–85 point score from five components. Country scores are decay-weighted averages with structural modifiers, capped at ~99 by a logistic bound. Six contagion channels propagate risk across borders. Region scores weight countries by event volume (log-dampened).

Event Score

AI-classified + 5-component model → 0–85 pts per event

Country Score

Exp. decay weighted average + structural modifiers → logistic cap

Region Score

Log-dampened event-count weighted average of country scores

SCORING PIPELINE

event_score = Σ component_i × severity^0.7 × category_prior(i) [5 components, max 85 pts]

decay_weight(age) = 2^(−age / 5) [0d→1.0, 5d→0.50, 10d→0.25, 30d→0.016]

raw_country = Σ (event_j × decay_weight_j) + Σ structural_modifier_k

country_score = logistic_bound(raw) [raw=100→95, raw=150→99.9]

region_score = Σ (country_i × log(1 + events_i)) / Σ log(1 + events_i)
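
A minimal sketch of the decay, raw country, and region steps of the pipeline above, under stated assumptions: the function names and input shapes are hypothetical, `log` is taken to be the natural logarithm, and `logistic_bound` is left out because its parameters are not given on this page.

```python
import math


def decay_weight(age_days, half_life=5.0):
    """Exponential decay: the weight halves every `half_life` days."""
    return 2 ** (-age_days / half_life)


def raw_country_score(events, modifiers):
    """events: list of (event_score, age_days); modifiers: list of points."""
    decayed = sum(score * decay_weight(age) for score, age in events)
    return decayed + sum(modifiers)


def region_score(countries):
    """countries: list of (country_score, event_count).

    Each country is weighted by log(1 + event_count), so a country with
    ten times the events does not get ten times the influence.
    """
    weights = [math.log(1 + count) for _, count in countries]
    total = sum(weights)
    if total == 0:
        return 0.0  # assumption: a region with no events scores zero
    return sum(s * w for (s, _), w in zip(countries, weights)) / total


raw = raw_country_score([(80, 0), (60, 5)], modifiers=[6])  # 80 + 30 + 6
region = region_score([(80, 99), (40, 9)])
```

In the second call, the first country has roughly ten times the events of the second but only twice the weight, which is the log-dampening effect described above.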

SEVERITY TRANSFORM SELECTION

Nine exponents tested; ^0.7 selected by lowest RMSE. Concave transform = diminishing returns at high severity (where measurement is noisiest). financial_crisis uses ^1.5. De-escalation uses linear.

TRANSFORM	STATUS	RESULT
sev^0.7	SELECTED	MAE 1.9 · RMSE 3.52 · bias +0.06
sev^0.5	rejected	over-scores mid-range by +3–5 pts
sev^1.0	rejected	linear; under-scores high-severity by −3–5 pts
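
To make the effect of the exponent concrete, the sketch below applies the transforms named in this section. The lookup table and function names are hypothetical; the exponents (0.7 default, 1.5 for financial_crisis, linear for de-escalation) are the ones stated above, and the de-escalation keys are the four de-escalation categories listed later on this page.

```python
DEFAULT_EXPONENT = 0.7
CATEGORY_EXPONENTS = {
    "financial_crisis": 1.5,
    # De-escalation categories use a linear transform.
    "peace_agreement": 1.0,
    "diplomatic_resolution": 1.0,
    "sanctions_relief": 1.0,
    "arms_control": 1.0,
}


def severity_transform(severity, category=""):
    """Raise a 0.0-1.0 severity to the exponent used for its category."""
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be between 0.0 and 1.0")
    exponent = CATEGORY_EXPONENTS.get(category, DEFAULT_EXPONENT)
    return severity ** exponent


def compare_exponents(severity, exponents=(0.5, 0.7, 1.0)):
    """Show how the candidate exponents treat the same severity."""
    return {e: severity ** e for e in exponents}


mid_range = compare_exponents(0.5)
```

For any severity strictly between 0 and 1, a smaller exponent yields a larger transformed value, which is consistent with sev^0.5 over-scoring the mid-range and sev^1.0 under-scoring relative to sev^0.7.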

FIVE SCORING COMPONENTS + ECONOMIC FALLOUT LAYER

GEOPOLITICAL TOTAL = 85 pts max · Economic Fallout Layer stored separately (not in total)

Geopolitical Severity (27% weight, max 25)

Direct interstate or intrastate political violence. Interstate war: 25. Coup: 20–24. Military strike: 16–21. Border clash: 12–17. Protest suppression: 6–12.

GS = base_prior(category) × severity^0.7 × escalation_flag(1.15)

Economic Disruption (23% weight, max 22)

Sanctions regimes, tariff escalations, CB surprises, supply chain shocks. Iran-level sanctions: 19–22. Major tariff: 13–18. CB pivot: 11–16. Max raised to give central bank / trade events sufficient headroom.

ED = sanctions_tier × 22 + trade_shock × 22 + cb_surprise × 22

Economic Fallout Layer (DECOUPLED, max 15)

DECOUPLED FROM TOTAL. Stored as market_fallout — excluded from composite. Prevents market endogeneity. Powers Unpriced Risk Alert (geo ≥70 but fallout ≤10).

market_fallout = chokepoint × commodity × 15 + equity_vol × 10 [NOT in total]

Escalation Risk (16% weight, max 15)

A unitless composite score, not a probability. Assesses the 30-day forward potential for a severity increase. Intrastate with external intervention: 13–15. Coup: 12–14. Election violence: 8–12. Trade dispute: 3–7. Calibration anchor: proxy_war_flag(1.3).

ER = base_rate(category) × actor_complexity × proxy_war_flag(1.3)

Structural Governance Impact (19% weight, max 18)

Constitutional rewrite, treaty withdrawal, leader ouster, major legislation. Coup + new govt: 15–18. Leader ouster unclear successor: 12–15. Treaty withdrawal: 9–13. Max raised — coup/election crises carry strong structural policy risk.

SGI = governance_depth × 18 + succession_clarity_penalty

Contagion & Spillover (15% weight, max 14)

Cross-border propagation: trade chokepoints, refugee flows, alliance entanglement, financial contagion. Hormuz/Suez closure: 14. Sahel expansion: 9–12. Max raised — contagion channels (refugee, trade, alliance) were systematically underweighted.

CS = chokepoint_score + border_permeability × instability_gradient × neighbor_count

BIAS REMOVAL & MODEL CALIBRATION

01 Exponential Decay

Events lose weight continuously — no step functions, no cliff edges. 5-day half-life means events decay to under 2% after 30 days.

decay_weight(t) = 2^(−t / 5.0) → half-life = 5.0 days

02 Logistic Cap + Shock Override

Stacked events can push raw scores past 100. The logistic bound caps output at ~99 while keeping higher raw scores meaningfully ranked. For extreme events (severity ≥ 0.90), a shock override uses lighter compression so catastrophic events (9/11, Ukraine invasion, Oct 7) can reach 88–98 instead of being suppressed at ~85.

logistic_bound(raw) → raw=50→48, raw=100→95 | shock_override(raw≥70) → 88 + (raw−70) × 0.33

03 No Double-Counting

Country-level modifiers use disjoint event sets. Conflict events feed only the conflict modifier, protest events only protest. No single event increments more than one modifier.

conflict_cats ∩ protest_cats = ∅ → observed r < 0.05

04 Market Decoupling

Market data is stored as market_fallout but excluded from the composite score. This prevents circular feedback. Used only to flag pricing gaps (Unpriced Risk Alert: geo score ≥70 but market fallout ≤10).

event_score = GS + ED + ER + SGI + CS [market_fallout stored separately]
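
The decoupling can be sketched as a record that carries market_fallout alongside the five scored components but never adds it to the total. The class and function names are hypothetical; the field comments repeat the component maxima listed above, and the alert thresholds (geo score ≥70, fallout ≤10) are the ones stated for the Unpriced Risk Alert.

```python
from dataclasses import dataclass


@dataclass
class EventComponents:
    gs: float   # Geopolitical Severity (max 25)
    ed: float   # Economic Disruption (max 22)
    er: float   # Escalation Risk (max 15)
    sgi: float  # Structural Governance Impact (max 18)
    cs: float   # Contagion & Spillover (max 14)
    market_fallout: float = 0.0  # stored for alerts, never summed

    def event_score(self) -> float:
        """Sum of the five scored components; market_fallout is excluded."""
        return self.gs + self.ed + self.er + self.sgi + self.cs


def unpriced_risk_alert(geo_score: float, market_fallout: float) -> bool:
    """Fire when geopolitical risk is high but markets show little reaction."""
    return geo_score >= 70 and market_fallout <= 10


event = EventComponents(gs=20, ed=10, er=12, sgi=15, cs=9, market_fallout=14)
score = event.event_score()  # 66, regardless of market_fallout
```

Keeping the market signal out of the sum is what allows the alert to compare the two values independently.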

CALIBRATION STATUS: 21 PHASES COMPLETE (437 EVENTS)

437 historical events across 21 phases (1930–2025), 107 countries, 22 categories. Pass A (Phases 13–15) added 76 events anchoring the severity floor: sanctions MAE 6.12→2.93, trade MAE 5.43→2.71. Pass B (Phases 16–21) added 93 events across 24 new countries, targeting UCDP-intensity gaps. Overall MAE: 1.9 (new events MAE: 0.54). See Calibration Dataset for full results.

OUT-OF-SAMPLE VALIDATION CAVEAT

Phase 11 hold-out validation confirms category prior robustness (MAE=0.16, n=26). This tests structural generalisation: whether priors calibrated on Phases 1–10 produce accurate scores for unseen events. However, all Phase 11 severity values were calibrated via binary search against the scoring function, so the OOS result does not test whether severity assignments are unbiased, only whether the prior structure is robust.

A truly independent test would require: (a) events scored by a different analyst, (b) severity values assigned by the AI model rather than hand-calibrated, and (c) events from a different temporal/geographic distribution. Either the model genuinely learned the signal, or the hold-out is structurally too similar to the training set to constitute independent validation. Both possibilities should be considered.

GEOPOLITICAL CONTAGION MODEL

Six propagation channels determine how an event in one country affects its neighbors. An event in Country A cascades — not just to country score, but to the entire neighbouring risk landscape.

Trade Chokepoint

Hormuz, Suez, Red Sea, Panama, Malacca closures trigger global commodity contagion.

Border Gradient

Countries sharing borders with HIGH/CRITICAL states receive +3–6 pt spill-in, scaled by permeability. Permeability is a composite of: UNHCR displacement flow volume between country pairs, existence of formal border crossings, and whether the pair shares an active conflict theatre. 34 country pairs have explicit permeability scores.

Refugee Flows

UNHCR displacement events trigger region-wide humanitarian risk. Receiving states scored for absorption.

Financial Contagion

Sovereign debt distress triggers EM spread monitoring. >40% correspondent banking exposure = flag.

Alliance Trigger

NATO, SCO, CSTO, GCC, AU obligations modelled. Attack on treaty member → escalation multiplier.

Protest Diffusion

Capital protests >50k: approximately 15–25% probability of regional emulation within 30 days (internal estimate based on case review of Arab Spring, Color Revolutions, and Latin American protest waves; consistent with Beissinger 2007 and Chenoweth/Stephan NAVCO dataset ranges). Central estimate 18% used in model.

COUNTRY & REGION RISK SCORES

CONTINUOUS EXPONENTIAL DECAY

weight = 2^(−age/5). No cliffs. Persistent crises remain weighted; resolved events decay to noise.

0d: 1.000× · 5d: 0.500× · 10d: 0.250× · 20d: 0.063× · 30d: 0.016×

STRUCTURAL MODIFIERS

Election proximity: +0 to +8 pts

+8 × (1 − days / 90)

Applied linearly over 90-day pre-election window. Maximum on election day.

Protest intensity: +0 to +6 pts

+6 × min(events_7d / 3, 1)

Active protest events in 7-day window. Scales to ceiling at 3+ distinct events.

Armed conflict: +0 to +10 pts

+10 × min(conflict_intensity, 1)

Cap raised from +8 to +10. Active war zones (Ukraine, Gaza) need the headroom to reach a 91–95 country rolling score from a ~83 single-event base.

Sanctions exposure: +0 to +6 pts

+6 × sanction_tier_score

Comprehensive sanctions (Iran/Russia/DPRK): +6. Sectoral: +2–4. Individual: +0–1.

Leadership instability: +0 to +5 pts

+5 × (1 − LSI)

Applied when leader stability is CRITICAL or HIGH risk.
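
The five modifier formulas above translate directly into small functions. This is a sketch under assumptions: the function names are hypothetical, inputs outside the stated windows are clamped or return zero, and the leadership modifier takes the risk level as a plain string.

```python
def election_modifier(days_to_election):
    """+8 on election day, falling linearly to 0 at 90 days out."""
    if days_to_election is None or not 0 <= days_to_election <= 90:
        return 0.0
    return 8 * (1 - days_to_election / 90)


def protest_modifier(events_7d):
    """Scales with protest events in the last 7 days; ceiling at 3+ events."""
    return 6 * min(max(events_7d, 0) / 3, 1)


def conflict_modifier(conflict_intensity):
    """Up to +10 for active armed conflict; intensity is clamped to 0-1."""
    return 10 * min(max(conflict_intensity, 0), 1)


def sanctions_modifier(sanction_tier_score):
    """Up to +6; tier score is clamped to 0-1 (1.0 = comprehensive)."""
    return 6 * min(max(sanction_tier_score, 0), 1)


def leadership_modifier(lsi, risk_level):
    """Applied only when leader stability is rated CRITICAL or HIGH."""
    if risk_level not in {"CRITICAL", "HIGH"}:
        return 0.0
    return 5 * (1 - lsi)


total = (election_modifier(45) + protest_modifier(1)
         + conflict_modifier(1.4) + sanctions_modifier(1.0)
         + leadership_modifier(0.2, "HIGH"))
```

Each modifier has its own ceiling, so the combined structural contribution in this sketch cannot exceed 8 + 6 + 10 + 6 + 5 = 35 points.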

REGION SCORE

Log-dampened event-count weighting prevents high-volume countries from dominating via data density. Unpriced Risk Alert fires when region geo score ≥70 but avg market_fallout ≤10.

THREE-REGIME DECAY MODEL

Not all events decay at the same rate. The system uses three half-life regimes keyed per event category. Events carry their own decay constant — a nuclear escalation from 3 weeks ago remains influential; a border protest from 3 weeks ago is near-zero.

NORMAL · half-life 5 days · at 30d: 0.016×

Conflict, coup, protest, border clash, election crisis, leader resignation, sanctions, trade, political development

Standard decay — most events. Resolved in ~30 days.

NUCLEAR · half-life 21 days · at 30d: 0.370×

nuclear_escalation only — enrichment breaches, IAEA violations, breakout risk, declared capability

Nuclear standoffs persist for weeks. 21d half-life ensures continued influence over the full geopolitical cycle.

SYSTEMIC · half-life 45 days · at 30d: 0.630×

financial_crisis, pandemic — global shocks with multi-month impact trajectories

Also triggers systemic_shock component and global floor propagation. 55% peak floor held during dry spells.

DECAY FORMULA

weight(t) = 2^(−t / half_life) where half_life ∈ {5, 21, 45} days

normal: 0d→1.000 | 5d→0.500 | 10d→0.250 | 30d→0.016

nuclear: 0d→1.000 | 21d→0.500 | 42d→0.250 | 30d→0.370

systemic: 0d→1.000 | 45d→0.500 | 90d→0.250 | 30d→0.630
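
A minimal sketch of the per-category decay lookup, assuming a simple dictionary keyed by category identifier. The category names nuclear_escalation, financial_crisis, and pandemic come from this section; the dictionary, the function names, and the "coup" identifier used in the example are illustrative, and any category not listed falls back to the 5-day regime.

```python
DEFAULT_HALF_LIFE_DAYS = 5.0
HALF_LIFE_DAYS = {
    "nuclear_escalation": 21.0,  # NUCLEAR regime
    "financial_crisis": 45.0,    # SYSTEMIC regime
    "pandemic": 45.0,            # SYSTEMIC regime
}


def half_life_for(category):
    """Return the half-life in days for a category (NORMAL if unlisted)."""
    return HALF_LIFE_DAYS.get(category, DEFAULT_HALF_LIFE_DAYS)


def decay_weight(age_days, category):
    """weight(t) = 2^(-t / half_life), with half_life chosen per category."""
    return 2 ** (-age_days / half_life_for(category))


for category in ("coup", "nuclear_escalation", "pandemic"):
    print(category, round(decay_weight(30, category), 3))
```

At 30 days the three regimes differ by more than an order of magnitude, which is the behaviour the regime table describes.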

NUCLEAR ESCALATION SCORING ADJUSTMENT

Nuclear escalation events receive a dedicated scoring regime — higher priors, longer decay, and no systemic shock. The model targets specific calibration anchors to keep scores differentiated at the high end.

COMPONENT WEIGHTS FOR nuclear_escalation

[Chart: component weights across GEO, ESC, SPILL, MKTFL, ECON, POLICY. market_fallout is stored separately and is not in the total.]

EVENT	DESCRIPTION	SEVERITY	RAW	SCORE
Iran 84% enrichment	IAEA violations, near-breakout	0.85	≈74.7	79
DPRK nuclear test	Confirmed detonation (shock override)	0.90	≈76.9	90
Nuclear breakout declared	Weapons capability (shock override)	1.00	≈81	92
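
The two shock-override anchors can be checked against the override formula given under Logistic Cap + Shock Override, 88 + (raw − 70) × 0.33. The sketch assumes the override applies only when both conditions hold (severity ≥ 0.90 and raw ≥ 70); the function name and the `None` return for the non-override case are hypothetical, and the standard logistic path is not reproduced because its parameters are not given.

```python
def shock_override(raw, severity):
    """Return the overridden score, or None when the override does not apply."""
    if severity >= 0.90 and raw >= 70:
        return 88 + (raw - 70) * 0.33
    return None  # caller would fall back to the standard logistic bound


dprk_test = shock_override(76.9, 0.90)  # about 90.3
breakout = shock_override(81.0, 1.00)   # about 91.6
enrichment = shock_override(74.7, 0.85) # None: severity below 0.90
```

Rounded to whole points, the first two results match the 90 and 92 anchors; the Iran enrichment anchor stays on the standard path because its severity is below the threshold.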

ACTIVE NUCLEAR HALF-LIFE STATES (illustrative)

Iran (enrichment 84%)

North Korea (ICBM tests)

Pakistan (tactical doctrine)

India (NFU posture)

DE-ESCALATION ARCHITECTURE

The model now supports stabilising events that reduce a country's rolling risk score — peace deals, diplomatic normalization, sanctions relief, and arms control agreements. De-escalation events produce a negative stabilisation credit that directly subtracts from the country score, with a 10-day half-life (slower than the standard 5-day decay, so peace signals persist longer).

DE-ESCALATION CATEGORIES

CATEGORY	COVERS	EXAMPLES
peace_agreement	Formal peace deal, ceasefire, armistice	Colombia FARC, Oslo Accords, Good Friday
diplomatic_resolution	Normalization, rapprochement, embassy exchange	Saudi-Iran deal, Abraham Accords
sanctions_relief	Sanctions lifted, trade normalization	JCPOA sanctions relief, Myanmar easing
arms_control	Nuclear deal, weapons reduction, disarmament	JCPOA nuclear limits, START extension

EVENT	DESCRIPTION	SEVERITY	CREDIT
Saudi-Iran rapprochement	Beijing-brokered restoration	0.82	−19.7 pts
Abraham Accords	Israel-UAE/Bahrain normalization	0.72	−17.3 pts
Colombia FARC peace deal	Comprehensive peace accord	0.80	−19.2 pts
JCPOA nuclear deal	Iran arms control agreement	0.85	−20.4 pts

DESIGN PRINCIPLES

• Linear severity transform (not the concave sev^0.7 used for risk events): peace impact scales proportionally

• 10-day half-life (vs 5-day standard) — asymmetric decay for persistence

• Prior sum = 24 per category × severity → max credit capped at 25 pts

• Applied as additive modifier, not mixed into weighted mean — direct impact

• Country score floors at 0 — de-escalation cannot produce negative risk
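
The design principles above can be sketched as a single credit function: a linear 24 × severity credit capped at 25 points, decayed with a 10-day half-life, subtracted from the country score, and floored at zero. Function names and the argument layout are hypothetical.

```python
PRIOR_SUM = 24.0       # prior sum per de-escalation category
MAX_CREDIT = 25.0      # cap on the stabilisation credit
HALF_LIFE_DAYS = 10.0  # de-escalation half-life


def stabilisation_credit(severity):
    """Negative points for a de-escalation event; linear in severity."""
    return -min(PRIOR_SUM * severity, MAX_CREDIT)


def apply_credit(country_score, severity, age_days=0.0):
    """Subtract the decayed credit; the country score cannot go below 0."""
    weight = 2 ** (-age_days / HALF_LIFE_DAYS)
    return max(0.0, country_score + stabilisation_credit(severity) * weight)


for name, sev in [("Saudi-Iran rapprochement", 0.82),
                  ("Abraham Accords", 0.72),
                  ("Colombia FARC peace deal", 0.80),
                  ("JCPOA nuclear deal", 0.85)]:
    print(name, round(stabilisation_credit(sev), 1))
```

With these constants the loop reproduces the four credits listed in the examples; the 25-point cap never binds for severities up to 1.0, since the largest linear credit is 24.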

UNPRICED STABILITY SIGNAL

Mirrors the Unpriced Risk Alert. Fires when a de-escalation event is active in a region but the region score remains ≥50 — suggesting markets and scores haven't yet priced a genuine stabilisation.

GOVERNANCE MODIFIER — WGI + V-DEM

The Structural Risk layer (visible on the map STRUCT.RISK tab) uses two open-data governance indices to compute a country-level fragility score independent of recent events. This score is also used as a slow-moving prior that biases country risk scores upward for institutionally fragile states.

Worldwide Governance Indicators (WGI)

Source: World Bank · Annual · 193 countries

Political Stability & Absence of Violence component used. Raw range −2.5 (most fragile) to +2.5 (most stable), inverted and normalised to a 0–100 governance_fragility score.

governance_fragility = (−wgi_ps + 2.5) / 5.0 × 100

V-Dem Liberal Democracy Index

Source: V-Dem Institute · Annual · 179 countries

Liberal Democracy Index (0 = full autocracy, 1 = full democracy), inverted so 0 = stable democracy and 100 = closed autocracy. Captures regime type risk not covered by WGI.

vdem_risk = (1 − vdem_liberal_democracy) × 100

COMPOSITE STRUCTURAL RISK SCORE

structural_risk = 0.60 × governance_fragility + 0.40 × vdem_risk

WGI weighted higher (0.60) — captures near-term political violence risk more directly than regime type.

No-data fallback: structural_risk = 50 (neutral). Countries with partial data use available index only.
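
The composite and its fallbacks can be sketched as follows. The function names and the use of `None` for a missing index are assumptions; the formulas, the 0.60 / 0.40 weights, and the neutral value of 50 are the ones stated above.

```python
def governance_fragility(wgi_ps):
    """Invert WGI Political Stability (-2.5 to +2.5) onto a 0-100 scale."""
    return (-wgi_ps + 2.5) / 5.0 * 100


def vdem_risk(vdem_liberal_democracy):
    """Invert the V-Dem Liberal Democracy Index (0-1) onto a 0-100 scale."""
    return (1 - vdem_liberal_democracy) * 100


def structural_risk(wgi_ps=None, vdem_liberal_democracy=None):
    """Weighted composite, with the partial-data and no-data fallbacks."""
    if wgi_ps is None and vdem_liberal_democracy is None:
        return 50.0  # no data: neutral
    if vdem_liberal_democracy is None:
        return governance_fragility(wgi_ps)  # WGI only
    if wgi_ps is None:
        return vdem_risk(vdem_liberal_democracy)  # V-Dem only
    return (0.60 * governance_fragility(wgi_ps)
            + 0.40 * vdem_risk(vdem_liberal_democracy))


fragile_autocracy = structural_risk(wgi_ps=-2.5, vdem_liberal_democracy=0.0)
```

The example call uses the worst value on both indices and therefore lands at the top of the 0–100 scale.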

PROXY BACK-PROPAGATION RULES

METHODOLOGY DISCLOSURE

All proxy weights below are expert-estimated judgment calls, not derived from a statistical model. They are based on reported command structures, materiel flows, intelligence assessments, and diplomatic dependency patterns. Weights are configurable and disclosed here for transparency. Symmetric back-propagation is applied: if A propagates risk to B, we evaluate whether B should propagate back to A. Asymmetries reflect genuine differences in exposure direction.

IRAN PROXY NETWORK

Iran funds, arms, and strategically directs a proxy network across four theatres. When front-line events occur in these theatres, Iran receives back-propagated risk at a conservative fraction of the primary event score — reflecting command-level exposure without overstating direct involvement.

Hezbollah (Lebanon) · weight 0.40

rule: hezbollah_iran_backflow · min_severity: 0.35–0.40

Iran funds and arms Hezbollah. Combat operations deploy Iranian strategic risk. Lebanon absorbs frontline exposure; Iran absorbs command-level risk.

triggers: hezbollah, nasrallah, southern lebanon, hezbollah rocket/attack/drone/fighter/unit/commander, islamic resistance lebanon

Houthis / Ansar Allah (Yemen) · weight 0.35

rule: houthi_iran_backflow · min_severity: 0.35–0.40

IRGC provides missiles, drones, targeting intelligence, and advisors. Houthi Red Sea operations are Iran's strategic play vs. US/Israeli posture.

triggers: houthi, ansar allah, houthi attack/drone/missile/ship/strike, sanaa attack, yemen militia/proxy

PMF / Kataib Hezbollah (Iraq & Syria) · weight 0.30

rule: iran_iraq_militia_backflow · min_severity: 0.35–0.40

IRGC Quds Force directs PMF and Kataib Hezbollah. Attribution is proximate, not always certain — weight is conservative (0.30).

triggers: hashd, popular mobilization, pmu, kataib hezbollah, iraqi militia, iran-backed militia, irgc iraq, quds force iraq, resistance axis

SYMMETRIC BACK-PROPAGATION

Proxy relationships are not one-directional. When a client state acts, its patron absorbs diplomatic, military, and strategic cost. These symmetric rules ensure the model captures both directions.

US ← Israel (Patron Exposure) · weight 0.25

rule: us_israel_backpropagation · min_severity: 0.35–0.50

US absorbs diplomatic cost (UNSC vetoes, regional backlash), military exposure (carrier deployments, Iron Dome resupply, base vulnerability), and strategic risk (deterrence posture adjustments). Weight conservative: patron, not co-belligerent.

triggers: idf, israeli military/airstrike/strike/operation/offensive/defense, netanyahu, iron dome/swords, mossad operation

Russia ← Wagner / Africa Corps · weight 0.3

rule: russia_wagner_backpropagation · min_severity: 0.35–0.50

Wagner operations represent GRU-directed Russian force projection. Resource extraction deals and diplomatic shielding expose Russia to operational and reputational cost.

triggers: wagner, wagner group, prigozhin, africa corps, russian mercenary/pmc/private military, wagner mali/libya/sudan/syria/central african

SCORING IMPACT: propagated events are discounted ×0.5 in scoring (a 0.40-weight 80-pt event adds ~16 pts to target's pool before decay). All weights are defined in PROXY_WEIGHTS config — adjustable without code changes.
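
The arithmetic in the scoring note can be sketched as follows. This is a minimal illustration, not the production implementation: the dictionary layout, the target labels, the function name propagated_points, and the use of the lower bound of each min_severity range as the gate are all assumptions. Only the rule names, the weights, and the ×0.5 discount come from this section.

```python
# Hypothetical layout for the PROXY_WEIGHTS config. Rule names and weights are
# taken from this section; the structure and the gate values are assumptions.
PROXY_WEIGHTS = {
    "hezbollah_iran_backflow": {"target": "Iran", "weight": 0.40, "min_severity": 0.35},
    "houthi_iran_backflow": {"target": "Iran", "weight": 0.35, "min_severity": 0.35},
    "iran_iraq_militia_backflow": {"target": "Iran", "weight": 0.30, "min_severity": 0.35},
    "us_israel_backpropagation": {"target": "United States", "weight": 0.25, "min_severity": 0.35},
    "russia_wagner_backpropagation": {"target": "Russia", "weight": 0.30, "min_severity": 0.35},
}

PROPAGATION_DISCOUNT = 0.5  # propagated events are discounted x0.5 in scoring


def propagated_points(rule: str, event_score: float, severity: float) -> float:
    """Points added to the target's pool before decay (0.0 if below the gate)."""
    cfg = PROXY_WEIGHTS[rule]
    if severity < cfg["min_severity"]:
        return 0.0
    return event_score * cfg["weight"] * PROPAGATION_DISCOUNT


# An 80-point Hezbollah event at severity 0.8: 80 x 0.40 x 0.5 = 16 points for Iran.
print(propagated_points("hezbollah_iran_backflow", event_score=80, severity=0.8))
```

Keeping the weights in a plain mapping is one way to make them adjustable without code changes, as the note describes.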

CONFIDENCE & FRESHNESS SIGNALS

Two orthogonal trust signals are tracked per record: source_count (corroboration depth) and last_verified_at (record freshness). Together they drive the confidence tier and staleness warnings visible throughout the interface.

source_count — Corroboration Depth

Number of independent HIGH-tier sources that have reported the same event within a 6-hour window. Used to gate confidence tier promotion. A single-source event cannot reach CONFIRMED regardless of source tier.

SOURCES	TIER
≥3	CONFIRMED
2	HIGH
1	MEDIUM
0	UNVERIFIED
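
A minimal sketch of the source_count gate, assuming reports arrive as (source name, timestamp) pairs that are already filtered to HIGH-tier sources, and assuming the 6-hour window is anchored to the earliest report. The function names and the input shape are hypothetical; the thresholds are the ones listed above.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=6)


def source_count(reports: list[tuple[str, datetime]]) -> int:
    """Count distinct sources reporting within 6 hours of the earliest report."""
    if not reports:
        return 0
    first = min(ts for _, ts in reports)
    return len({name for name, ts in reports if ts - first <= WINDOW})


def confidence_tier(count: int) -> str:
    """Map corroboration depth to the tier listed in this section."""
    if count >= 3:
        return "CONFIRMED"
    if count == 2:
        return "HIGH"
    if count == 1:
        return "MEDIUM"
    return "UNVERIFIED"


t0 = datetime(2024, 1, 1, 12, 0)
reports = [
    ("Reuters", t0),
    ("BBC", t0 + timedelta(hours=2)),
    ("Reuters", t0 + timedelta(hours=3)),      # same source, not counted twice
    ("Al Jazeera", t0 + timedelta(hours=9)),   # outside the 6-hour window
]
print(confidence_tier(source_count(reports)))
```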

last_verified_at — Record Freshness

Timestamp of last human or AI-triggered verification. Leader records, election data, and nuclear profiles all carry this field. Freshness thresholds trigger automatic re-verification prompts.

AGE	STATUS	ACTION
0–30d	FRESH	No action required
31–60d	AGING	Review recommended
>60d	STALE	Re-verify flag triggered
N/A	UNVERIFIED	Never verified, excluded from score

Auto-trigger conditions: Leader records are auto-flagged for re-verification when country risk ≥60, record age >60d, or a coup/election event is detected for that country. STALE badge displayed in-app on affected leader and election panels.
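
The freshness thresholds and auto-trigger conditions can be expressed as two small functions. This is a sketch under stated assumptions: ages are measured in whole days, and the function and parameter names are hypothetical rather than taken from the application.

```python
from datetime import datetime
from typing import Optional


def freshness(last_verified_at: Optional[datetime], now: datetime) -> str:
    """Classify a record by the age of its last verification (whole days)."""
    if last_verified_at is None:
        return "UNVERIFIED"  # never verified
    age_days = (now - last_verified_at).days
    if age_days <= 30:
        return "FRESH"
    if age_days <= 60:
        return "AGING"
    return "STALE"


def needs_reverification(
    country_risk: float, record_age_days: int, coup_or_election_detected: bool
) -> bool:
    """Any one of the three auto-trigger conditions flags the leader record."""
    return country_risk >= 60 or record_age_days > 60 or coup_or_election_detected


now = datetime(2024, 6, 1)
print(freshness(datetime(2024, 5, 20), now))   # 12 days old
print(freshness(datetime(2024, 3, 1), now))    # 92 days old
print(needs_reverification(45.0, 10, True))    # election detected
```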

CATEGORY REFERENCE TABLE

Prior score ranges correspond to severity 0.5–1.0 and are shown before the structural modifier is applied.

CATEGORY	SCORE RANGE	ESCALATION	SPILLOVER	REFERENCE
Full Conflict (Interstate War)	75–95	VERY HIGH	CRITICAL	Ukraine, Gaza, Sudan
Military Strike / Airstrike	65–88	HIGH	HIGH	US strikes on Iran nuclear sites
Coup / Power Seizure	70–92	HIGH	HIGH	Myanmar 2021, Niger 2023
Nuclear / WMD Event	85–100	CRITICAL	CRITICAL	Pakistan-India nuclear signalling
Chokepoint Disruption	60–82	HIGH	CRITICAL	Hormuz, Red Sea, Panama Canal
Border Clash	50–75	MEDIUM	MEDIUM	India-Pakistan LoC exchanges
Sanctions Regime	45–72	MEDIUM	HIGH	Russia SWIFT, Iran comprehensive
Election Crisis	38–62	MEDIUM	LOW	Venezuela 2024, Georgia 2024
Leader Resignation / Ouster	32–58	LOW–MED	LOW	Bangladesh PM Hasina 2024
Mass Protest	28–52	MEDIUM	MEDIUM	Iran 2022, Hong Kong 2019
Central Bank Surprise	30–55	LOW	HIGH	BOJ YCC abandonment, Fed pivot
Trade / Tariff Escalation	25–50	MEDIUM	HIGH	US-China tariff escalation
Constitutional Crisis	30–58	MEDIUM	LOW	Peru 2022, Israel judiciary
Political Development	18–38	LOW	LOW	Coalition reshuffle, policy shift
Other / Background	0–20	NONE	NONE	Low-signal events

CATEGORY MODEL CONFIDENCE

Confidence tiers derived from calibration dataset MAE per category. HIGH = MAE ≤ 5 (n ≥ 5) · MEDIUM = MAE ≤ 8 · LOW = MAE > 8 or insufficient data.
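
A sketch of the tiering rule, assuming "insufficient data" means fewer than 5 calibration events and that this minimum applies to MEDIUM as well as HIGH. Both readings are assumptions, and the function name is hypothetical.

```python
MIN_EVENTS = 5  # assumed threshold for "insufficient data"


def category_confidence(mae: float, n: int) -> str:
    """Derive a category's confidence tier from its calibration MAE and sample size."""
    if n < MIN_EVENTS:
        return "LOW"
    if mae <= 5:
        return "HIGH"
    if mae <= 8:
        return "MEDIUM"
    return "LOW"


# chokepoint_disruption has MAE 3.8; the sample size of 10 is illustrative only.
print(category_confidence(mae=3.8, n=10))
```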

RISK BANDS

CRITICAL · 80–100 · 22 events (6.4%)

Immediate threat to life, state stability, or global supply chains. Senior-level attention within 24 hours.

Anchors: Russia-Ukraine invasion (~84–94), COVID outbreak (~85), 9/11 (~79–93), Hamas Oct 7 (~78–90)

HIGH · 60–79 · 114 events (33.2%)

Active seizure/threat of state power, comprehensive sanctions with cross-border effect, active military engagement, or chokepoint disruption. Dedicated tracking required.

Anchors: Niger coup (~68), Hong Kong protests (~60), Turkey coup attempt (~70), Iran JCPOA reimposition (~74)

ELEVATED · 40–59 · 103 events (30%)

Meaningful disruption underway but contained. Direction over 7 days matters more than absolute value.

Anchors: H1N1 declaration (~56), Greece austerity protests (~42), Brazil June Days (~44)

MODERATE · 20–39 · 69 events (20.1%)

Noteworthy political developments with no escalation pathway or cross-border spillover. Background awareness sufficient.

Anchors: Monkeypox WHO declaration (~32), El Salvador Bitcoin adoption (~32)

LOW · 0–19 · 35 events (10.2%)

Background noise, resolved events in decay, or structurally stable states. No action required.

Anchors: Germany farmer protests (~18), UK junior doctors strike (~22)

BAND DESIGN RATIONALE

Equal-width bands (20-point intervals) are used deliberately. Quantile-based bands would shift thresholds as new events are added, making the score a relative ranking rather than a fixed reference point. Equal-width bands provide a stable operational definition: a score of 65 means the same thing regardless of how many other countries are also at 65.
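
Because the bands are fixed 20-point intervals, the lookup depends only on the score itself and never on the rest of the dataset. A minimal sketch (the function name is hypothetical):

```python
BANDS = ["LOW", "MODERATE", "ELEVATED", "HIGH", "CRITICAL"]


def risk_band(score: float) -> str:
    """Map a 0-100 score to its fixed 20-point band."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    # Integer division picks the band; a score of exactly 100 stays in CRITICAL.
    return BANDS[min(int(score // 20), len(BANDS) - 1)]


print(risk_band(65))   # HIGH, however many other countries also sit at 65
print(risk_band(100))  # CRITICAL
```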

Distribution across 437 calibration events (21 phases, 107 countries): LOW 10.2%, MODERATE 20.1%, ELEVATED 30.0%, HIGH 33.2%, CRITICAL 6.4%. The CRITICAL band intentionally has few events (6.4%) — civilisational crises are rare by definition.

SOURCE CORROBORATION & CONFIDENCE

CONFIRMED

3+ independent HIGH-tier sources OR 1 ReliefWeb + wire + secondary.

HIGH

2 independent HIGH-tier sources in same 6-hour window, or 1 HIGH + primary document.

MEDIUM

1 HIGH-tier or 2+ MED-tier. Default for GDELT-first events pending corroboration.

LOW

Single unverified RSS or GDELT without wire. Flagged. Excluded from score calc.

UNVERIFIED

No identifiable source tier. Quarantined until corroborated or manually reviewed.

AI ENRICHMENT PIPELINE

Event Classification

Model: claude-haiku-4-5 (real-time, per-event)

Retry logic: 3 attempts with 0.3s backoff before rule-based fallback

Output schema: Strict JSON with category, severity, country, region, linked_sectors

Grounding: Title, summary, source name only; no invented context
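
The retry-then-fallback behaviour can be sketched generically. The example below assumes a fixed (not exponential) 0.3s pause between attempts and treats a response missing any schema key as a failed attempt; both are assumptions. The two classifier callables are placeholders and do not represent any real client library.

```python
import time
from typing import Callable

REQUIRED_KEYS = {"category", "severity", "country", "region", "linked_sectors"}


def classify_event(
    event: dict,
    model_classify: Callable[[dict], dict],
    rule_based_classify: Callable[[dict], dict],
    attempts: int = 3,
    backoff_s: float = 0.3,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try the model up to `attempts` times, then fall back to rules."""
    for attempt in range(attempts):
        try:
            result = model_classify(event)
            if REQUIRED_KEYS.issubset(result):
                return result
        except Exception:
            pass  # treat any model error like a schema failure and retry
        if attempt < attempts - 1:
            sleep(backoff_s)  # pause only between attempts, not after the last
    return rule_based_classify(event)
```

Injecting sleep as a parameter keeps the function testable without real delays.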

Briefing Generation

Model: claude-opus-4-6

Temperature: 0.0 (deterministic)

Schedule: 07:00 UTC daily

Grounding: All claims must reference a stored event_id. Refusal if ungrounded.
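
One way to enforce the grounding rule is to check every claim against the set of stored event IDs before a briefing is accepted. The claim structure, the ID format, and the function names below are hypothetical; the sketch shows only the shape of the check.

```python
def ungrounded_claims(claims: list[dict], stored_event_ids: set[str]) -> list[dict]:
    """Return claims whose event_id is missing or not in the event store."""
    return [c for c in claims if c.get("event_id") not in stored_event_ids]


def accept_briefing(claims: list[dict], stored_event_ids: set[str]) -> bool:
    """A briefing is accepted only if every claim is grounded."""
    return not ungrounded_claims(claims, stored_event_ids)


store = {"evt-001", "evt-002"}
claims = [
    {"text": "Border clashes intensified.", "event_id": "evt-001"},
    {"text": "Sanctions are expected to widen."},  # no event_id: ungrounded
]
print(accept_briefing(claims, store))
```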

DATA SOURCES

HIGH · Every 30 min

ReliefWeb

Humanitarian crises — OCHA, 200+ countries

UN OCHA crisis reports. Displacement, famine early warning, emergency declarations.

HIGH · Every 15 min

RSS Wire Feeds

55+ feeds — Reuters, BBC, Al Jazeera, ISW, Bellingcat, FP, IISS + Arabic, French, Spanish, Russian, Chinese, SEA & African sources

Selected for editorial independence and primary source access. Non-English articles translated via Claude Haiku before enrichment.

MED · Every 15 min

GDELT

Global — 65+ languages, 150+ countries

Systematic undercounting in Central Africa/Oceania compensated by ReliefWeb.

HIGH · Daily

UCDP

Armed conflict — Uppsala University, global

Uppsala Conflict Data Program. State-based, non-state, and one-sided violence datasets. Gold standard for conflict event classification.

HIGH · Daily

UNHCR

Refugee & displacement flows — 150+ countries

UN refugee agency operational data. Displacement surges used as leading indicators for cross-border contagion scoring.

HIGH · Every 5 min

Market APIs

Equities, FX, commodities, sovereign CDS

Cross-referenced against event scores to detect pricing divergence.

HIGH · Daily

FRED / Treasury

US macro — Fed rates, yields, spreads

Federal Reserve Economic Data + Treasury yields. Used for macro context and financial contagion channel.

HIGH · Annual

WGI + V-Dem

Governance indices — 193 countries

World Bank Governance Indicators + V-Dem Liberal Democracy Index. Drive the structural risk layer and governance modifier.

VERIFIED · Manual + Triggered

Leader Records

193 UN member states

Auto-flags for re-verification at risk ≥60, record age >60d, or coup/election detection.

AI · Daily 07:00 UTC

AI Briefings

Global digest + regional on threshold

claude-opus-4-6, temperature=0, structured JSON. All claims reference a stored event_id.

MODEL CALIBRATION — 21 PHASES (437 EVENTS)

DESIGN CONSTRAINTS & LIMITATIONS

These are inherent properties of real-time intelligence systems, documented for transparency.

Single-researcher calibration · MED IMPACT

All 437 human labels were produced by a single analyst with hindsight knowledge. Calibration, not independent validation — a second scorer would strengthen credibility. All 22 categories now have sufficient events.

Logistic cap compresses extremes · LOW IMPACT

Scores above 90 all signal "acute crisis." A country with 10 simultaneous crises scores ~95, not 150. Intentional — prevents inflation.

Market data latency · LOW IMPACT

Prices delayed 15–20 min outside trading hours. Pre/after-hours moves not captured.

Non-English under-reporting · MED IMPACT

GDELT undercounts in non-English regions. Mitigated by 20+ multilingual RSS feeds with Haiku translation, plus ReliefWeb.

Volume signal suppressed by weighted mean · LOW IMPACT

Country score is a decay-weighted mean, not sum. 50 low-score events won't outscore 5 high-score events. Surge detection handles volume separately.
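
The difference between a weighted mean and a sum is easy to see in a small example. The sketch below assumes exponential decay with a 7-day half-life; the decay form, the half-life, and the function name are all assumptions, since only the use of a decay-weighted mean is stated here.

```python
def country_score(events: list[tuple[float, float]], half_life_days: float = 7.0) -> float:
    """Decay-weighted mean of event scores; each event is (score, age_days)."""
    if not events:
        return 0.0
    # Assumed decay: an event's weight halves every `half_life_days`.
    weights = [0.5 ** (age / half_life_days) for _, age in events]
    weighted = sum(score * w for (score, _), w in zip(events, weights))
    return weighted / sum(weights)


many_low = [(20.0, 0.0)] * 50   # 50 fresh low-score events
few_high = [(80.0, 0.0)] * 5    # 5 fresh high-score events

print(country_score(many_low), country_score(few_high))
# A plain sum would rank them the other way round: 1000 versus 400.
print(sum(s for s, _ in many_low), sum(s for s, _ in few_high))
```

Under a mean, event volume alone cannot raise a country's score, which is why volume has to be handled by a separate surge signal.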

Leader record latencyLOW IMPACT

Up to 24h lag for leader transitions. STALE badge shown when records exceed 60 days unverified.
