The WorldMonitor Country Resilience Index (CRI) scores every country in the world on a 0-100 scale, combining long-run structural capacity with current operational stress to produce an actionable resilience metric. Rather than relying on static country risk ratings, the CRI updates every 6 hours from official and authoritative sources and exposes full provenance, coverage, and imputation context so analysts can see exactly why a score moved and how much of it is real data versus imputed. This document is the v1.0 reference for the live product. A planned v2.0 upgrade will rebuild the top-level shape into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, and ship an annual Reference Edition at citation quality. That work is tracked in a separate reference-grade upgrade plan and is not yet shipped; everything documented below describes the current shipping behavior.

In the dashboard

CRI is surfaced across three places in the product, all driven from the same v1.0 score described below:
  • Resilience widget — a standalone panel (component: src/components/ResilienceWidget.ts) that ranks countries by resilience score with filter and search affordances. Reach it from Cmd+K by typing resilience.
  • Country Deep-Dive — inside the per-country drill-down panel, CRI appears alongside CII (Country Instability Index) as a structural complement to the short-horizon stress signal. CII and CRI are intentionally not interchangeable: CII answers “how much stress is on this country right now?”; CRI answers “how well-positioned is this country to absorb and recover from shocks?”
  • Map choropleth — the resilience score drives a country-level choropleth layer on the main map. Toggle it from the map’s layer panel or via Cmd+K.
All three surfaces are free to view. The underlying data served at /api/resilience/v1/* is public; see Resilience service for the HTTP contract.

Overview

The WorldMonitor Country Resilience Index scores 222 countries on a 0-100 scale across 5 domains and 13 dimensions. It combines structural baseline indicators (governance quality, health infrastructure, fiscal capacity) with real-time stress signals (cyber threats, conflict events, shipping disruption) to produce a single resilience score updated every 6 hours. Data is sourced from official and authoritative providers: World Bank, IMF, WHO, WTO, OFAC, UNHCR, UCDP, BIS, IEA, FAO, Reporters Sans Frontieres, and the Institute for Economics and Peace, among others.

Domains and Weights

The index is organized into 5 domains. Each domain weight reflects its relative contribution to overall national resilience.
| Domain | ID | Weight | Dimensions |
|---|---|---|---|
| Economic | economic | 0.22 | Macro-Fiscal, Currency & External, Trade & Sanctions |
| Infrastructure | infrastructure | 0.20 | Cyber & Digital, Logistics & Supply, Infrastructure |
| Energy | energy | 0.15 | Energy |
| Social & Governance | social-governance | 0.25 | Governance, Social Cohesion, Border Security, Information |
| Health & Food | health-food | 0.18 | Health & Public Service, Food & Water |
Weights sum to 1.00.

Dimensions and Indicators

Each dimension is scored from 0-100 using a weighted blend of its sub-metrics. Below is the complete indicator registry.

Economic Domain (weight 0.22)

Macro-Fiscal

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| govRevenuePct | Government revenue as % of GDP (IMF GGR_G01_GDP_PT) | Higher is better | 5 - 45 | 0.50 | IMF | Annual |
| debtGrowthRate | Annual debt growth rate | Lower is better | 20 - 0 | 0.20 | National debt data | Annual |
| currentAccountPct | Current account balance as % of GDP (IMF) | Higher is better | -20 - 20 | 0.30 | IMF | Annual |

Currency & External

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| fxVolatility | Annualized BIS real effective exchange rate volatility | Lower is better | 50 - 0 | 0.60 | BIS | Monthly |
| fxDeviation | Absolute deviation of BIS real EER from equilibrium (100) | Lower is better | 35 - 0 | 0.25 | BIS | Monthly |
| fxReservesAdequacy | Total reserves in months of imports (World Bank FI.RES.TOTL.MO) | Higher is better | 1 - 12 | 0.15 | World Bank | Annual |
For non-BIS countries (~160 countries), a fallback chain applies: (1) IMF inflation + World Bank reserves proxy, (2) IMF inflation alone, (3) reserves alone, (4) conservative imputation (score 50, certainty 0.3).

Trade & Sanctions

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| sanctionCount | OFAC sanctions entity count; piecewise normalization | Lower is better | 200 - 0 | 0.45 | OFAC | Daily |
| tradeRestrictions | WTO trade restrictions count (IN_FORCE weighted 3x) | Lower is better | 30 - 0 | 0.15 | WTO | Weekly |
| tradeBarriers | WTO trade barrier notifications count | Lower is better | 40 - 0 | 0.15 | WTO | Weekly |
| appliedTariffRate | Applied tariff rate, weighted mean, all products (World Bank TM.TAX.MRCH.WM.AR.ZS) | Lower is better | 20 - 0 | 0.25 | World Bank | Annual |
Sanctions use piecewise normalization: 0 entities = score 100, 1-10 = 90-75, 11-50 = 75-50, 51-200 = 50-25, 201+ tapers toward 0.
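The bands above can be read as a small piecewise function. The following is a minimal sketch, not the scorer's actual code: it assumes integer entity counts and linear interpolation inside each band, and it uses a hypothetical exponential taper (with an illustrative `TAPER_SCALE`) for 201+ entities, since the text only says the score "tapers toward 0". The names `SANCTION_BANDS`, `TAPER_SCALE`, and `sanctionScore` are made up for illustration.

```typescript
// One band of the piecewise mapping: counts in [lo, hi] map to scores [from, to].
interface Band {
  lo: number;
  hi: number;
  from: number;
  to: number;
}

// Band edges and scores are taken from the text above.
const SANCTION_BANDS: readonly Band[] = [
  { lo: 1, hi: 10, from: 90, to: 75 },
  { lo: 11, hi: 50, from: 75, to: 50 },
  { lo: 51, hi: 200, from: 50, to: 25 },
];

// ASSUMPTION: decay constant for the 201+ taper; the real taper shape is not documented here.
const TAPER_SCALE = 200;

function sanctionScore(entityCount: number): number {
  if (entityCount <= 0) return 100; // no sanctioned entities
  for (const band of SANCTION_BANDS) {
    if (entityCount <= band.hi) {
      // ASSUMPTION: linear interpolation within the band.
      const t = Math.max(0, (entityCount - band.lo) / (band.hi - band.lo));
      return band.from + t * (band.to - band.from);
    }
  }
  // Beyond the last band: decay from 25 toward 0 (hypothetical form).
  return 25 * Math.exp(-(entityCount - 200) / TAPER_SCALE);
}
```

Under these assumptions the function is monotonically non-increasing in the entity count and drops fastest across the first few entities, which is the behavior the piecewise design is meant to capture.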

Infrastructure Domain (weight 0.20)

Cyber & Digital

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| cyberThreats | Severity-weighted cyber threat count (critical 3x, high 2x, medium 1x, low 0.5x) | Lower is better | 25 - 0 | 0.45 | Cyber threat feeds | Daily |
| internetOutages | Internet outage penalty (total 4x, major 2x, partial 1x) | Lower is better | 20 - 0 | 0.35 | Outage monitoring | Realtime |
| gpsJamming | GPS jamming hex penalty (high 3x, medium 1x) | Lower is better | 20 - 0 | 0.20 | GPSJam | Daily |

Logistics & Supply

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| roadsPavedLogistics | Paved roads as % of total road network (World Bank IS.ROD.PAVE.ZS) | Higher is better | 0 - 100 | 0.50 | World Bank | Annual |
| shippingStress | Global shipping stress score | Lower is better | 100 - 0 | 0.25 | Supply-chain monitor | Daily |
| transitDisruption | Mean transit corridor disruption | Lower is better | 30 - 0 | 0.25 | Transit summaries | Daily |

Infrastructure

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| electricityAccess | Access to electricity, % of population (World Bank EG.ELC.ACCS.ZS) | Higher is better | 40 - 100 | 0.40 | World Bank | Annual |
| roadsPavedInfra | Paved roads as % of total road network (World Bank IS.ROD.PAVE.ZS) | Higher is better | 0 - 100 | 0.35 | World Bank | Annual |
| infraOutages | Internet outage penalty (shared source with Cyber & Digital) | Lower is better | 20 - 0 | 0.25 | Outage monitoring | Realtime |
Note on the paved-roads indicator. The same World Bank series (IS.ROD.PAVE.ZS) feeds two dimensions inside the Infrastructure domain: roadsPavedLogistics under Logistics & Supply (weight 0.50 within the dimension) and roadsPavedInfra here under Infrastructure (weight 0.35 within the dimension). This is deliberate source reuse, not accidental double counting: Logistics & Supply uses paved-road coverage as a proxy for transit viability, while Infrastructure uses it as a proxy for baseline public capital stock. The two dimensions legitimately care about the same signal for different reasons, and each dimension’s contribution to the domain is further mediated by the dimension weight in coverage-weighted mean aggregation (see the Scoring Formula section). The v2.0 reference-grade upgrade plan is expected to consolidate shared upstream signals into a single indicator registry so this kind of reuse is documented at the source level rather than per-dimension; for v1.0 the two separate metric rows are preserved for backward compatibility.

Energy Domain (weight 0.15)

Energy

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| energyImportDependency | IEA energy import dependency (% of supply from imports) | Lower is better | 100 - 0 | 0.25 | IEA | Annual |
| gasShare | Natural gas share of energy mix | Lower is better | 100 - 0 | 0.12 | Energy mix data | Annual |
| coalShare | Coal share of energy mix | Lower is better | 100 - 0 | 0.08 | Energy mix data | Annual |
| renewShare | Renewable energy share of energy mix | Higher is better | 0 - 100 | 0.05 | Energy mix data | Annual |
| gasStorageStress | Gas storage fill stress: (80 - fillPct) / 80, clamped [0,1] | Lower is better | 100 - 0 | 0.10 | GIE AGSI+ | Daily |
| energyPriceStress | Mean absolute energy price change across commodities | Lower is better | 25 - 0 | 0.10 | Energy prices | Daily |
| electricityConsumption | Per-capita electricity consumption (kWh/year, World Bank EG.USE.ELEC.KH.PC) | Higher is better | 200 - 8000 | 0.30 | World Bank | Annual |

Social & Governance Domain (weight 0.25)

Governance

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| wgiVoiceAccountability | World Bank WGI: Voice and Accountability | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
| wgiPoliticalStability | World Bank WGI: Political Stability | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
| wgiGovernmentEffectiveness | World Bank WGI: Government Effectiveness | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
| wgiRegulatoryQuality | World Bank WGI: Regulatory Quality | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
| wgiRuleOfLaw | World Bank WGI: Rule of Law | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
| wgiControlOfCorruption | World Bank WGI: Control of Corruption | Higher is better | -2.5 - 2.5 | 1/6 | World Bank WGI | Annual |
All six WGI indicators are equally weighted.

Social Cohesion

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| gpiScore | Global Peace Index score | Lower is better | 3.6 - 1.0 | 0.55 | IEP | Annual |
| displacementTotal | UNHCR total displaced persons (log10 scale) | Lower is better | 7 - 0 | 0.25 | UNHCR | Annual |
| unrestEvents | Severity-weighted unrest events + sqrt(fatalities) | Lower is better | 20 - 0 | 0.20 | Unrest monitoring | Realtime |

Border Security

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| ucdpConflict | UCDP armed conflict: eventCount*2 + typeWeight + sqrt(deaths) | Lower is better | 30 - 0 | 0.65 | UCDP | Realtime |
| displacementHosted | UNHCR hosted displaced persons (log10 scale) | Lower is better | 7 - 0 | 0.35 | UNHCR | Annual |

Information & Cognitive

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| rsfPressFreedom | RSF press freedom score | Higher is better | 0 - 100 | 0.55 | RSF | Annual |
| socialVelocity | Reddit social velocity (log10(velocity+1)) | Lower is better | 3 - 0 | 0.15 | Reddit intelligence | Realtime |
| newsThreatScore | AI news threat severity (critical 4x, high 2x, medium 1x, low 0.5x) | Lower is better | 20 - 0 | 0.30 | News threat analysis | Daily |

Health & Food Domain (weight 0.18)

Health & Public Service

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| uhcIndex | WHO Universal Health Coverage service coverage index | Higher is better | 40 - 90 | 0.45 | WHO | Annual |
| measlesCoverage | Measles immunization coverage among 1-year-olds (%) | Higher is better | 50 - 99 | 0.35 | WHO | Annual |
| hospitalBeds | Hospital beds per 1,000 people | Higher is better | 0 - 8 | 0.20 | WHO | Annual |

Food & Water

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| ipcPeopleInCrisis | IPC/FAO people in food crisis (log10 scale) | Lower is better | 7 - 0 | 0.45 | FAO/IPC | Annual |
| ipcPhase | IPC food crisis phase (1-5) | Lower is better | 5 - 1 | 0.15 | FAO/IPC | Annual |
| aquastatWaterStress | FAO AQUASTAT water stress/withdrawal/dependency (%) | Lower is better | 100 - 0 | 0.25 | FAO AQUASTAT | Annual |
| aquastatWaterAvailability | FAO AQUASTAT water availability (m3/capita) | Higher is better | 0 - 5000 | 0.15 | FAO AQUASTAT | Annual |

Recovery Domain (weight 1.0)

This domain forms the recovery-capacity pillar. It measures a country’s ability to bounce back from an acute shock along fiscal, monetary, trade, institutional, and energy dimensions.

Fiscal Space

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryGovRevenue | Government revenue as % of GDP (IMF GGR_G01_GDP_PT) | Higher is better | 5 - 45 | 0.40 | IMF | Annual |
| recoveryFiscalBalance | General government net lending/borrowing as % of GDP (IMF GGXCNL_G01_GDP_PT) | Higher is better | -15 - 5 | 0.30 | IMF | Annual |
| recoveryDebtToGdp | General government gross debt as % of GDP (IMF GGXWDG_NGDP_PT) | Lower is better | 150 - 0 | 0.30 | IMF | Annual |

Reserve Adequacy

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryReserveMonths | Total reserves in months of imports (World Bank FI.RES.TOTL.MO) | Higher is better | 1 - 18 | 1.00 | World Bank | Annual |

External Debt Coverage

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryDebtToReserves | Short-term external debt to reserves ratio (World Bank DT.DOD.DSTC.CD / FI.RES.TOTL.CD) | Lower is better | 5 - 0 | 1.00 | World Bank | Annual |

Import Concentration

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryImportHhi | Herfindahl-Hirschman Index of import partner concentration (UN Comtrade HS2 bilateral) | Lower is better | 5000 - 0 | 1.00 | UN Comtrade | Annual |

State Continuity

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryWgiContinuity | Mean WGI score as institutional durability proxy | Higher is better | -2.5 - 2.5 | 0.50 | World Bank | Annual |
| recoveryConflictPressure | UCDP conflict metric inverted to state continuity | Lower is better | 30 - 0 | 0.30 | UCDP | Realtime |
| recoveryDisplacementVelocity | UNHCR displacement as state continuity signal | Lower is better | 7 - 0 | 0.20 | UNHCR | Annual |
State continuity is a derived dimension: it reads from existing WGI, UCDP, and displacement keys rather than a dedicated seeder.

Fuel Stock Days

| Indicator | Description | Direction | Goalposts (worst-best) | Weight | Source | Cadence |
|---|---|---|---|---|---|---|
| recoveryFuelStockDays | Days of fuel stock cover (IEA Oil Stocks / EIA Weekly Petroleum Status) | Higher is better | 0 - 120 | 1.00 | IEA/EIA | Monthly |
Fuel stock days is an Enrichment-tier signal (coverage ~45 countries, IEA/OECD members). Countries without fuel stock data are imputed with the unmonitored class.

Normalization

All indicators are normalized to a 0-100 scale using goalpost scaling (also called min-max normalization with domain-specific anchors). For “higher is better” indicators:
score = clamp((value - worst) / (best - worst) * 100, 0, 100)
For “lower is better” indicators:
score = clamp((worst - value) / (worst - best) * 100, 0, 100)
Goalposts are hand-picked from empirical data ranges rather than derived from percentiles. A score of 100 means the country's value is at or beyond the “best” goalpost; 0 means it is at or beyond the “worst” goalpost. Exception: sanctions use piecewise normalization to capture the non-linear impact of sanctions counts (the first few sanctions matter more than additional ones in already-sanctioned countries).
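As an illustration of goalpost scaling, here is a minimal sketch. It relies on the observation that the two formulas above are algebraically the same expression: when worst is greater than best ("lower is better"), numerator and denominator both change sign. The helper names `clamp` and `goalpostScore` are hypothetical, not names from the codebase.

```typescript
function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// Works for both directions: pass the goalposts in (worst, best) order
// exactly as they appear in the indicator tables.
function goalpostScore(value: number, worst: number, best: number): number {
  return clamp(((value - worst) / (best - worst)) * 100, 0, 100);
}

// Higher is better: govRevenuePct with goalposts 5 - 45.
const revenueScore = goalpostScore(25, 5, 45); // halfway between the goalposts

// Lower is better: fxVolatility with goalposts 50 - 0.
const volatilityScore = goalpostScore(25, 50, 0); // also halfway

// Values beyond a goalpost are clamped to the 0-100 range.
const cappedScore = goalpostScore(60, 5, 45);
```

Passing goalposts in worst-best order keeps the direction implicit in the data, so a registry row can be scored without a separate "direction" branch.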

Scoring Formula

Dimension Score

Each dimension score is the weighted blend of its sub-metric scores:
dimensionScore = sum(metricScore_i * metricWeight_i) / sum(metricWeight_i)
Only metrics with available data participate in the blend. Missing metrics are excluded from both the numerator and denominator, so the score reflects what is known rather than penalizing for absent data.
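A minimal sketch of this blend follows. The `MetricResult` shape and the choice to return `null` when no metric has data are assumptions made for illustration; the actual scorer may represent missing data differently.

```typescript
interface MetricResult {
  score: number | null; // null means no data for this metric (assumed representation)
  weight: number;
}

function dimensionScore(metrics: MetricResult[]): number | null {
  let weightedSum = 0;
  let weightSum = 0;
  for (const m of metrics) {
    if (m.score === null) continue; // dropped from numerator AND denominator
    weightedSum += m.score * m.weight;
    weightSum += m.weight;
  }
  // ASSUMPTION: a dimension with no data at all yields null rather than 0.
  return weightSum > 0 ? weightedSum / weightSum : null;
}

// Macro-Fiscal weights from the table above, with debtGrowthRate missing.
// The metric scores themselves are illustrative values.
const macroFiscal = dimensionScore([
  { score: 60, weight: 0.5 }, // govRevenuePct
  { score: null, weight: 0.2 }, // debtGrowthRate, no data
  { score: 40, weight: 0.3 }, // currentAccountPct
]);
// (60 * 0.5 + 40 * 0.3) / (0.5 + 0.3) = 52.5
```

Because the missing metric is removed from the denominator as well, the remaining weights are effectively renormalized to sum to 1.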

Domain Score

Each domain score is the coverage-weighted mean of its dimensions:
domainScore = sum(dimensionScore_i * dimensionCoverage_i) / sum(dimensionCoverage_i)
Coverage weighting ensures that dimensions with sparse data (low coverage) contribute proportionally less, so a poorly observed dimension cannot pull the domain average strongly in either direction.
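The effect of coverage weighting is easiest to see with two dimensions of very different coverage. This is a sketch under assumed names (`DimensionResult`, `domainScore`), with illustrative scores.

```typescript
interface DimensionResult {
  score: number; // 0-100 dimension score
  coverage: number; // 0.0-1.0 weighted certainty of the dimension's data
}

function domainScore(dimensions: DimensionResult[]): number | null {
  let weightedSum = 0;
  let coverageSum = 0;
  for (const d of dimensions) {
    weightedSum += d.score * d.coverage;
    coverageSum += d.coverage;
  }
  // ASSUMPTION: a domain with zero total coverage yields null.
  return coverageSum > 0 ? weightedSum / coverageSum : null;
}

// A well-observed dimension (coverage 1.0) and a sparse one (coverage 0.25).
const domain = domainScore([
  { score: 80, coverage: 1.0 },
  { score: 20, coverage: 0.25 },
]);
// (80 * 1.0 + 20 * 0.25) / 1.25 = 68, versus 50 for an unweighted mean
```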

Overall Score

The overall score is a domain-weighted sum:
overallScore = sum(domainScore_i * domainWeight_i)
Each domain’s weight is defined in the configuration. The weights sum to 1.0, so the overall score is a straightforward weighted average of domain scores. This is the post-PR #2847 formula; an earlier multiplicative form (baseline * (1 - stressFactor)) over-penalized every country and was reverted. See the Changelog for the full version history.
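The domain-weighted sum can be sketched with the weights from the Domains and Weights table. The constant and function names are hypothetical; only the IDs and weights come from this document.

```typescript
// Domain IDs and weights as listed in the Domains and Weights table.
const DOMAIN_WEIGHTS = {
  economic: 0.22,
  infrastructure: 0.2,
  energy: 0.15,
  "social-governance": 0.25,
  "health-food": 0.18,
} as const;

type DomainId = keyof typeof DOMAIN_WEIGHTS;

function overallScore(domainScores: Record<DomainId, number>): number {
  let total = 0;
  for (const id of Object.keys(DOMAIN_WEIGHTS) as DomainId[]) {
    total += domainScores[id] * DOMAIN_WEIGHTS[id];
  }
  return total;
}

// Illustrative domain scores for a single country.
const overall = overallScore({
  economic: 70,
  infrastructure: 60,
  energy: 50,
  "social-governance": 80,
  "health-food": 65,
});
// 15.4 + 12 + 7.5 + 20 + 11.7 = 66.6
```

Because the weights sum to 1.0, a country scoring the same value in every domain receives that value as its overall score.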

Resilience Level Classification

| Score Range | Level |
|---|---|
| 70-100 | High |
| 40-69 | Medium |
| 0-39 | Low |

Missing Data Handling

Coverage Tracking

Each dimension carries a coverage value (0.0-1.0) representing the weighted certainty of its data. Real observed data contributes certainty 1.0. Imputed data contributes partial certainty. Absent data contributes 0.
coverage = sum(metricWeight_i * certainty_i) / sum(metricWeight_i)
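A short sketch of the coverage computation follows. Note the contrast with the dimension score: absent metrics stay in the denominator here (with certainty 0), so missing data lowers coverage even though it does not lower the score. The type and function names, and the example certainties, are illustrative assumptions.

```typescript
interface MetricCertainty {
  weight: number;
  certainty: number; // 1.0 observed, partial when imputed, 0 when absent
}

function dimensionCoverage(metrics: MetricCertainty[]): number {
  let weighted = 0;
  let weightSum = 0;
  for (const m of metrics) {
    weighted += m.weight * m.certainty;
    weightSum += m.weight; // absent metrics still count toward the total weight
  }
  return weightSum > 0 ? weighted / weightSum : 0;
}

// Illustrative dimension: one observed metric, one imputed, one absent.
const coverage = dimensionCoverage([
  { weight: 0.5, certainty: 1.0 }, // observed
  { weight: 0.3, certainty: 0.3 }, // imputed with certainty 0.3
  { weight: 0.2, certainty: 0 }, // absent
]);
// (0.5 * 1.0 + 0.3 * 0.3 + 0.2 * 0) / 1.0 = 0.59
```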

Imputation Taxonomy

When data is absent, the system tags it with one of four classes so downstream consumers can distinguish “nothing is happening” from “we do not know” from “the upstream is down” from “the dimension does not apply to this country.” The taxonomy is defined in server/worldmonitor/resilience/v1/_dimension-scorers.ts as an exported ImputationClass type.
| Class | Meaning | Typical score | Certainty | Example sources |
|---|---|---|---|---|
| stable-absence | The source publishes globally. Country is not listed, which means the tracked phenomenon is not happening. Strong positive signal. | 85 to 88 | 0.6 to 0.7 | IPC food crisis, UNHCR displacement, UCDP conflict events |
| unmonitored | The source is a curated list that may not cover every country. Absence is ambiguous; penalized conservatively. | 50 to 60 | 0.3 to 0.4 | BIS exchange rates and credit, WTO trade data, OECD ICU capacity |
| source-failure | The upstream API was unavailable at seed time. Detected from seed-meta failedDatasets. Should be rare and transient. | inherits from the source being substituted | 0.3 to 0.5 | any source listed in failedDatasets during a seed run |
| not-applicable | The dimension is structurally N/A for this country. For example, a landlocked country has no maritime exposure. | neutral, by definition | 1.0 (by definition) | reserved for future dimensions that need structural N/A handling |
The generic imputation entries are declared in the IMPUTATION table and shared across dimensions. Per-metric overrides live in the IMPUTE table with their own score and certainty values, and inherit or override the class tag. Every entry is regression-tested in tests/resilience-dimension-scorers.test.mts to prevent silent drift.
| Concrete imputation entry | Class | Score | Certainty | Notes |
|---|---|---|---|---|
| crisis_monitoring_absent (IPC, UCDP, UNHCR general) | stable-absence | 85 | 0.7 | Used when the global crisis feed has no entry for the country |
| curated_list_absent (BIS, WTO general) | unmonitored | 50 | 0.3 | Used when a curated list does not cover the country |
| ipcFood (food-specific crisis monitoring) | stable-absence | 88 | 0.7 | Slightly higher score because no IPC data strongly implies food security |
| wtoData (trade-specific curated list) | unmonitored | 60 | 0.4 | Slightly higher than the generic curated list default |
| unhcrDisplacement (displacement-specific crisis monitoring) | stable-absence | 85 | 0.6 | Lower certainty than IPC because displacement is noisier |
| bisEer and bisCredit | unmonitored | 50 | 0.3 | Shared reference to curated_list_absent; same tag |
The source-failure class is reserved for the runtime path that consults seed-meta.failedDatasets and re-tags affected imputations; that wiring lands with a later Phase 1 task and is not yet represented in the table above. The not-applicable class is reserved for future dimensions and has no current call site.

Low Confidence Flag

A score is flagged as lowConfidence when either:
  • Average dimension coverage falls below 0.55, or
  • Imputation share (imputed weight / total weight) exceeds 0.40.

Grey-Out Threshold

Countries with overall coverage below 0.40 are greyed out in the UI and excluded from rankings. Their scores are too data-sparse to be meaningful.
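The two confidence gates above can be expressed as simple predicates. This is a sketch: the thresholds come from this section, while the function names and the strictness of the comparisons ("falls below" read as `<`, "exceeds" read as `>`) are assumptions.

```typescript
// Thresholds as stated in the Low Confidence Flag and Grey-Out Threshold sections.
const LOW_CONFIDENCE_MIN_COVERAGE = 0.55;
const LOW_CONFIDENCE_MAX_IMPUTATION_SHARE = 0.4;
const GREY_OUT_MIN_COVERAGE = 0.4;

// Either condition alone is enough to flag the score.
function isLowConfidence(averageCoverage: number, imputationShare: number): boolean {
  return (
    averageCoverage < LOW_CONFIDENCE_MIN_COVERAGE ||
    imputationShare > LOW_CONFIDENCE_MAX_IMPUTATION_SHARE
  );
}

// Greyed-out countries are also excluded from rankings.
function isGreyedOut(overallCoverage: number): boolean {
  return overallCoverage < GREY_OUT_MIN_COVERAGE;
}

const flagged = isLowConfidence(0.6, 0.45); // true: imputation share exceeds 0.40
const ranked = !isGreyedOut(0.5); // true: coverage is above the grey-out line
```

A country can therefore be ranked and still carry the lowConfidence flag: coverage between 0.40 and 0.55 keeps it in the ranking but marks the score as uncertain.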

Imputation Share

The API response includes imputationShare (0.0-1.0), representing the fraction of total indicator weight that came from imputed (synthetic) data rather than observed data. This allows consumers to assess data provenance.

Data Sources

| Source | Indicators | Cadence | Scope |
|---|---|---|---|
| IMF (WEO/IFS) | Government revenue, current account, inflation | Annual | Global |
| World Bank (WDI) | Electricity access, paved roads, reserves, tariffs, electricity consumption | Annual | Global |
| World Bank (WGI) | 6 governance indicators | Annual | Global |
| BIS | Real effective exchange rates | Monthly | ~60 countries |
| OFAC | Sanctions entity counts | Daily | Global |
| WTO | Trade restrictions, trade barriers | Weekly | ~50 reporters |
| WHO | UHC index, measles coverage, hospital beds | Annual | Global |
| FAO (IPC) | People in food crisis, crisis phase | Annual | Affected countries |
| FAO (AQUASTAT) | Water stress, water availability | Annual | Global |
| IEA | Energy import dependency | Annual | Global |
| IEP | Global Peace Index | Annual | Global |
| RSF | Press freedom score | Annual | Global |
| UNHCR | Displaced persons, hosted refugees | Annual | Affected countries |
| UCDP | Armed conflict events, fatalities | Realtime | Global |
| Cyber threat feeds | Severity-weighted cyber threats | Daily | Global |
| Outage monitoring | Internet outages | Realtime | Global |
| GPSJam | GPS jamming incidents | Daily | Global |
| Supply-chain monitor | Shipping stress, transit disruption | Daily | Global |
| Unrest monitoring | Severity-weighted civil unrest events | Realtime | Global |
| Reddit intelligence | Social velocity scores | Realtime | Global |
| News threat analysis | AI-scored news threat severity | Daily | Global |
| Energy mix data | Gas, coal, renewable shares | Annual | Global |
| GIE AGSI+ | Gas storage fill levels | Daily | European countries |
| Energy prices | Commodity price changes | Daily | Global |
| National debt data | Debt-to-GDP growth rate | Annual | Global |

Supplementary Fields

The API response includes additional context fields that are informational and not part of the primary ranking:
  • baselineScore: Coverage-weighted mean of baseline and mixed dimensions. Reflects structural capacity (governance, health, infrastructure, fiscal strength). Informational only, not used in overallScore.
  • stressScore: Coverage-weighted mean of stress and mixed dimensions. Reflects current threat environment (cyber, conflict, sanctions, supply disruption). Informational only, not used in overallScore.
  • trend: Direction of score movement over the last 30 days (improving, stable, or declining), based on daily score history.
  • change30d: Numeric score change over 30 days.
  • imputationShare: Fraction of indicator weight from imputed (synthetic) data.
  • lowConfidence: Boolean flag when data coverage or imputation thresholds are breached.

Versioning

Cache keys include a versioned suffix that is bumped on formula changes. This invalidates stale caches and ensures all scores reflect the updated methodology. Score cache TTL is 6 hours.

Reproducibility Appendix

The CRI is designed to be auditable end-to-end: given the Redis snapshot at any point in time, a reader should be able to reproduce any published country score from the documented formulas without running the live service.

Redis keys used by the scorer

| Key | Type | TTL | Written by | Read by |
|---|---|---|---|---|
| resilience:score:v9:{countryCode} | JSON | 6 hours | buildResilienceScore in server/worldmonitor/resilience/v1/_shared.ts | getResilienceScore handler |
| resilience:ranking:v9 | JSON | 6 hours | buildResilienceRanking, only when all countries are scored | getResilienceRanking handler |
| resilience:history:v4:{countryCode} | sorted set | indefinite, trimmed to 30 days | appendHistory during scoring | trend and change30d computation |
| resilience:intervals:v1:{countryCode} | JSON | 6 hours | scripts/seed-resilience-intervals.mjs | getResilienceScore (optional scoreInterval field) |
| seed-meta:resilience:static | JSON | 2 hours | scripts/seed-resilience-static.mjs at the end of each successful seed run | scorer for dataVersion population, health checks |
| resilience:static:{countryCode} | JSON | 400 days | scripts/seed-resilience-static.mjs | scorer for all baseline signals (WGI, WHO, FAO, GPI, RSF, and so on) |
| resilience:static:index:v1 | JSON | 400 days | scripts/seed-resilience-static.mjs | warmup path to enumerate countries |

dataVersion semantics

The dataVersion field on every GetResilienceScoreResponse is the ISO date of the fetchedAt timestamp stored in seed-meta:resilience:static. It reflects the most recent successful run of the Railway static-seed job; the widget renders it in the footer as Data YYYY-MM-DD.
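As a rough illustration of that derivation, the sketch below assumes fetchedAt is stored as epoch milliseconds and that the ISO date is taken in UTC; neither detail is confirmed by this document. The `SeedMeta` shape and both function names are hypothetical.

```typescript
// ASSUMPTION: minimal shape of the seed-meta:resilience:static payload.
interface SeedMeta {
  fetchedAt: number; // assumed to be epoch milliseconds
}

// Take the calendar date (UTC) of the fetchedAt timestamp as YYYY-MM-DD.
function dataVersionFromSeedMeta(meta: SeedMeta): string {
  return new Date(meta.fetchedAt).toISOString().slice(0, 10);
}

// Footer text in the "Data YYYY-MM-DD" form described above.
function footerLabel(meta: SeedMeta): string {
  return `Data ${dataVersionFromSeedMeta(meta)}`;
}

const exampleMeta: SeedMeta = { fetchedAt: Date.UTC(2026, 3, 9, 23, 30) };
const label = footerLabel(exampleMeta); // "Data 2026-04-09"
```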

Reproducing a score by hand

Given a Redis snapshot at time T:
  1. Read seed-meta:resilience:static for the dataVersion.
  2. Read resilience:static:{cc} for the country’s baseline record (WGI, WHO, GPI, RSF, FAO, IEA, and so on).
  3. Read the live-signal keys (UCDP, UNHCR, OFAC, outages, cyber threats, prices, shipping stress, and so on) for the country’s slice.
  4. For each of the 13 dimensions, apply the formulas in the Scoring Formula section with the goalposts from the Dimensions and Indicators tables. For missing signals, consult the Imputation Taxonomy table in this document.
  5. Aggregate dimension scores into domain scores via coverage-weighted mean.
  6. Aggregate domain scores into the overall score via domain-weighted sum.
A reference Python notebook under docs/methodology/country-resilience-index/reference-edition/ is tracked as a future deliverable and will regenerate every published score from the snapshot manifest.

Changelog

v1.0 (April 2026)

Baseline. Scored on domain-weighted average of 5 domains and 13 dimensions.
  • PR #2821: added the baseline-vs-stress engine and the dataVersion field on the response.
  • PR #2847: reverted the overall-score formula from baseline * (1 - stressFactor) (which over-penalized every country) to a domain-weighted sum; fixed the RSF press-freedom direction (0 means free, scored higher is better).
  • PR #2858: seed script now computes missing country scores directly via the scorer import path instead of relying on a separate ranking writer.

v1.1 (April 2026) — Phase 1 reference-grade upgrade

Previous published version. Phase 1 of the reference-grade upgrade plan (docs/internal/country-resilience-upgrade-plan.md). Methodology surface reorganized for full reproducibility without changing the top-line domain weights or scoring formula.
  • T1.1 (#2941): regression test pins the Norway/US top-of-ranking ordering after an origin-document claim of a 100-point ceiling did not reproduce. Failing-then-passing test guards the invariant.
  • T1.2 (#2847, #2858): pre-existing fixes from the 2026-04-07 and 2026-04-09 origin-doc reviews that were already in main at the start of Phase 1. Re-verified no additional action needed.
  • T1.3 (#2945): methodology page promoted to .mdx at CII parity with the required sections (Framework / Domains / Dimensions / Normalization / Weighting / Missing-data / Confidence / Ranking / Reproducibility appendix).
  • T1.4 (#2943): dataVersion field wired end-to-end from seed-resilience-static:v7.dataVersion through the scorer to the widget footer so analysts see the exact ISO date of the underlying source data.
  • T1.5 (#2947 foundation, #2961 propagation): three-level staleness classifier (fresh, aging, stale) driven by the per-indicator cadence in the registry. Propagated through scoreAllDimensions and exposed as ResilienceDimension.freshness.{lastObservedAtMs, staleness} on the response.
  • T1.6 (#2949 scaffold, #2962 full grid): per-dimension confidence grid in the widget. The full grid adds an imputation-class icon column (consuming T1.7 schema) and a freshness-badge column (consuming T1.5 propagation). 5-column layout with mobile responsive breakpoint.
  • T1.7 (#2944 foundation, #2959 schema, #2964 source-failure wiring): four-class imputation taxonomy stable-absence / unmonitored / source-failure / not-applicable exposed on ResilienceDimension.imputationClass. The scorer aggregation pass consults seed-meta:resilience:static.failedDatasets and re-tags imputed dimensions as source-failure when the underlying adapter fetch failed. Deleted the last absence-based return branch in scoreCurrencyExternal so the taxonomy is the single source of truth for every imputed path.
  • T1.8 (#2946): methodology doc linter enforces dimension parity between this document and _indicator-registry.ts. CI fails if any dimension drifts.
  • T1.9 (this PR): cache-key / health-registry sync regression test so future version bumps in _shared.ts cannot silently break health probes. No cache keys were bumped in Phase 1 because every schema addition was additive with default fallbacks on the existing resilience:score:v7 and resilience:ranking:v9 keys.
What did not change in v1.1: the domain-weighted aggregation formula, the 5 domain structure, the 13 dimensions, the goalpost ranges, the per-dimension weights. Phase 2 owns the structural three-pillar rebuild; v1.1 is the methodology-surface and observability lift only.

Scorecard (v1.1 self-assessment)

Self-assessed against the standard composite-indicator review axes on a 0-10 scale. This is the Phase 1 acceptance gate defined in the upgrade plan (Methodology ≥7.5, Explainability ≥7.5). An external expert review (Phase 3 T3.8b) will supersede these self-ratings once it completes.
| Axis | Score | Rationale |
|---|---|---|
| Methodology | 7.5 | Every dimension has a named source, direction, goalpost range, weight, cadence, and imputation class. Missing-data rules are explicit and tagged with a 4-class taxonomy. The aggregation formula is a simple domain-weighted average, auditable from first principles. Gap: the overall-score formula is still single-axis compensatory (a strong institutional score can wash out a weak exposure score), which Phase 2 replaces with a partly non-compensatory three-pillar form. |
| Explainability | 7.5 | Per-dimension confidence grid in the widget shows coverage %, imputation class, and freshness for every dimension on every country. Tooltip text is generated from the taxonomy so analysts can click through to the meaning without reading this document. Gap: no waterfall chart of individual signal contributions yet, that lands in Phase 3 T3.3. |
| Reproducibility | 8.0 | Every dimension’s sourceKey, cadence, and goalpost lives in _indicator-registry.ts and is linted against this doc. Cache keys are versioned (resilience:score:v7, ranking:v8, history:v4). dataVersion is written by the seed and plumbed to the widget footer. Gap: the benchmark and backtest scripts do not yet run on a CI cron; those land in Phase 2 T2.7. |
| Source quality | 7.0 | World Bank, IMF, WHO, IEA, UNHCR, UCDP, IPC, BIS, FAO, RSF, GPI: all authoritative. Gap: curated-list sources (BIS ~40 economies, WTO) do not cover the full WorldMonitor country set, which is why the unmonitored imputation class exists. Phase 2 T2.9 adds language-normalized information signal to reduce English-press bias. |
| Timeliness | 6.5 | Structural sources are annual (WGI, GPI, RSF, WHO, IMF macro) and dominate the total weight of the index. BIS EER is monthly. The Freshness classifier (T1.5) surfaces this at the dimension level so users can see which parts of a country score are 12 months old. Thirteen stress-side indicators already run at realtime or daily cadence via the cross-source stack (ucdpConflict, internetOutages, infraOutages, unrestEvents, socialVelocity at realtime; sanctionCount, cyberThreats, gpsJamming, shippingStress, transitDisruption, gasStorageStress, energyPriceStress, newsThreatScore at daily). Gap: the live-shock pillar relies on those signals but the structural pillar is still capped by annual sources; Phase 2 T2.2 adds FX volatility at daily cadence to narrow the cadence gap on the currency-external dimension and the Phase 3 reference-edition split will formalize annual vs rolling cadences per pillar. |
| Sensitivity | 7.0 | Weight-perturbation Monte Carlo sensitivity (#2823) exists in the backtesting layer. Phase 1 did not add new sensitivity work. Gap: per-dimension p5/p95 intervals are computed and exposed (#2877, #2885) but the widget does not render them yet, Phase 3 T3.3 waterfall chart. |
Phase 1 acceptance gate status: met. Both required thresholds (Methodology ≥7.5, Explainability ≥7.5) are satisfied with honest rationales. The gap flagged on each of these two axes is tracked against Phase 2 and Phase 3 tasks in the upgrade plan.

v2.0 (April 2026) — Phase 2 structural rebuild

Current published version. Phase 2 of the reference-grade upgrade plan (docs/internal/country-resilience-upgrade-plan.md). Rebuilds the top-level shape from five flat domains into three pillars (structural readiness, live shock exposure, recovery capacity) with a partly non-compensatory aggregation, adds a recovery capacity pillar with six new dimensions, and ships a full validation suite (cross-index benchmark, outcome backtest, sensitivity analysis).
  • T2.1 (#2977): Three-pillar schema added to proto and OpenAPI. schemaVersion: "2.0" feature flag introduced with backward-compatible "1.0" fallback path for one release cycle. Response now carries a pillars array alongside existing domains.
  • T2.2a (#2979): Signal tiering registry committed. Every indicator tagged Core, Enrichment, or Experimental with per-signal coverage percentage and license audit status. Registry enforced by CI linter.
  • T2.2b (#2987): Recovery capacity pillar with 6 new dimensions across a new recovery domain: fiscal space (debt service ratio), reserve adequacy (months of imports), short-term external debt coverage, import concentration (HHI), hospital surge capacity, and state continuity composite (WGI subset). Five new seeders following Railway gold-standard pattern (3 real data sources, 2 stubs pending source configuration). Cache key bumped to the current version.
  • T2.3 (#2990): Three-pillar aggregation using a penalized weighted mean. Pillar weights: structural readiness 0.40, live shock exposure 0.35, recovery capacity 0.25. Penalty factor: 1 - alpha * max(0, pillar_gap / 100), with alpha = 0.5, where pillar_gap is the gap between the strongest and weakest pillar scores. Domain-weighted scores feed into pillar scores; pillar-weighted scores feed into the overall score, with the penalty applied when that gap exceeds a threshold.
  • T2.4 (#2985): Cross-index benchmark script validates each pillar against four established indices (INFORM Risk Index, ND-GAIN, WorldRiskIndex, Fragile States Index) via Spearman and Pearson correlation with per-pillar directional hypotheses. Results stored in resilience:benchmark:external:v1 and committed as validation artifacts.
  • T2.5 (#2986): Outcome backtest framework covering 7 event families (FX stress, sovereign stress, power outages, food-crisis escalation, refugee surges, sanctions shocks, conflict spillover). Each family has a binary event definition, a 2024-2025 hold-out window, and an AUC release gate of 0.75 or higher.
  • T2.6/T2.8 (#2991): Sensitivity suite v2 with 4-pass perturbation (weight, goalpost, imputation, alpha), alpha-curve analysis, and ceiling-effect detection. Release gate: no single-axis perturbation moves a top-50 country by more than 5 rank positions; overall dimension failure rate must be 20% or lower.
  • T2.7 (#2988): Railway cron service wired for weekly benchmark, backtest, and sensitivity runs. Results published to Redis with health monitoring integration.
  • T2.9 (#2992): Language and source-density normalization for the informationCognitive dimension. RSF press freedom and social velocity scores are weighted by language coverage of the source set to correct for English-press bias. The dimension is promoted back to Core tier after normalization.
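As an illustration of the T2.5 release gate above, the sketch below computes AUC for one event family and compares it with the 0.75 threshold. The `BacktestRow` shape, the function names, and the choice of `100 - resilience` as the risk predictor are assumptions made for this example; the changelog only states that each family has a binary event definition, a hold-out window, and an AUC gate.

```typescript
// Hypothetical row shape: one country, its score before the hold-out window,
// and whether the binary event occurred during the window.
interface BacktestRow {
  resilience: number; // 0-100 resilience score
  eventOccurred: boolean;
}

// AUC as the probability that a randomly chosen event country has a higher
// risk value than a randomly chosen non-event country (ties count as half).
// Assumption: risk is taken as 100 - resilience.
function auc(rows: BacktestRow[]): number {
  const pos: number[] = rows.filter((r) => r.eventOccurred).map((r) => 100 - r.resilience);
  const neg: number[] = rows.filter((r) => !r.eventOccurred).map((r) => 100 - r.resilience);
  if (pos.length === 0 || neg.length === 0) {
    throw new Error("AUC needs at least one event and one non-event row");
  }
  let wins = 0;
  for (const p of pos) {
    for (const n of neg) {
      if (p > n) wins += 1;
      else if (p === n) wins += 0.5;
    }
  }
  return wins / (pos.length * neg.length);
}

const AUC_GATE = 0.75; // threshold stated in T2.5

function passesGate(rows: BacktestRow[]): boolean {
  return auc(rows) >= AUC_GATE;
}

// Made-up sample data, for illustration only.
const sample: BacktestRow[] = [
  { resilience: 20, eventOccurred: true },
  { resilience: 35, eventOccurred: true },
  { resilience: 60, eventOccurred: false },
  { resilience: 30, eventOccurred: false },
  { resilience: 80, eventOccurred: false },
];

const sampleAuc = auc(sample); // 5 of 6 pairs ordered correctly, about 0.83
const sampleGate = passesGate(sample); // true
```

The pairwise loop is quadratic in the number of rows, which is acceptable at country scale; a rank-based formulation would give the same value for larger inputs.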
What changed from v1.1: The five-domain flat structure is preserved as the inner aggregation layer, but a new three-pillar outer layer groups domains into structural readiness, live shock exposure, and recovery capacity. The overall score formula changes from a pure domain-weighted sum to a penalized weighted mean that prevents a strong institutional score from fully compensating for severe live-shock exposure. Six new dimensions are added under the recovery capacity pillar. The cache key is bumped to the current version. The schemaVersion field is set to "2.0" by default (setting the env var RESILIENCE_SCHEMA_V2_ENABLED=false provides a rollback path).
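The penalized weighted mean can be sketched as follows. Only the pillar weights, alpha = 0.5, and the penalty expression come from the changelog; the type and function names are hypothetical. The threshold value is not stated in this document, so the sketch applies the formula literally, which penalizes any positive gap between the strongest and weakest pillar.

```typescript
// Hypothetical identifiers for the three pillars.
type PillarId = "structuralReadiness" | "liveShockExposure" | "recoveryCapacity";

// Pillar weights as listed under T2.3.
const PILLAR_WEIGHTS: Record<PillarId, number> = {
  structuralReadiness: 0.4,
  liveShockExposure: 0.35,
  recoveryCapacity: 0.25,
};

const ALPHA = 0.5; // initial value per the changelog

function penalizedOverallScore(
  pillars: Record<PillarId, number>,
  alpha: number = ALPHA,
): number {
  const ids = Object.keys(PILLAR_WEIGHTS) as PillarId[];
  // Plain weighted mean of the 0-100 pillar scores (weights sum to 1).
  const weightedMean = ids.reduce(
    (sum, id) => sum + PILLAR_WEIGHTS[id] * pillars[id],
    0,
  );
  // Gap between the strongest and weakest pillar, in score points.
  const values = ids.map((id) => pillars[id]);
  const pillarGap = Math.max(...values) - Math.min(...values);
  // Penalty factor: 1 - alpha * max(0, pillar_gap / 100).
  const penalty = 1 - alpha * Math.max(0, pillarGap / 100);
  return weightedMean * penalty;
}

// Illustrative inputs, not real country data.
const balanced = penalizedOverallScore({
  structuralReadiness: 70,
  liveShockExposure: 70,
  recoveryCapacity: 70,
}); // no gap, so the score stays at about 70

const lopsided = penalizedOverallScore({
  structuralReadiness: 90,
  liveShockExposure: 30,
  recoveryCapacity: 70,
}); // weighted mean 64, gap 60, penalty 0.7, so about 44.8
```

In the second case a pure weighted sum would report 64; the penalty pulls it to roughly 44.8, which is the partly non-compensatory behavior the pillar layer is meant to provide.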

Scorecard (v2.0 self-assessment)

Self-assessed against the standard composite-indicator review axes on a 0-10 scale. This is the Phase 2 acceptance gate defined in the upgrade plan (Validation >= 8.0, Data >= 9.0, Architecture >= 9.0). An external expert review (Phase 3 T3.8b) will supersede these self-ratings once it completes.
| Axis | Score | Rationale |
| --- | --- | --- |
| Validation | 8.0 | Cross-index benchmark against 4 established indices with per-pillar hypotheses. Outcome backtest across 7 event families with AUC release gates. Sensitivity suite with 4-pass perturbation and ceiling detection. Gap: external expert review (Phase 3 T3.8b) not yet complete. |
| Data | 9.0 | 19 dimensions across 6 domains, 47+ indicators. Recovery capacity pillar adds 6 new dimensions with global Core-tier coverage (3 real seeders, 2 stubs pending source configuration). Signal tiering registry tags every indicator Core/Enrichment/Experimental with coverage + license audit. Gap: 2 stub seeders (import HHI, fuel stocks) need real data source integration. |
| Architecture | 9.0 | Three-pillar schema with schemaVersion feature flag for backward compat. Penalized weighted mean aggregation with documented alpha. Domain-weighted pillar scores. Cache-key versioning (bumped per schema change). Language normalization corrects English-press bias. Gap: alpha tuning is initial (0.5), needs backtest-driven refinement after live data accumulates. |
| Methodology | 8.5 | Every dimension has a named source, direction, goalpost, weight, cadence, imputation class, AND tier. Four-class imputation taxonomy live end-to-end. Freshness classifier surfaces staleness at the dimension level. Methodology doc linter enforces parity. Gap: three-pillar weight rationale is defensible but not yet empirically optimized. |
| Explainability | 8.0 | Per-dimension confidence grid with imputation icon + freshness badge. Pillar structure makes the index decomposable (structural vs live-shock vs recovery). Gap: no waterfall chart yet (Phase 3 T3.3), no change attribution (Phase 3 T3.5). |
| Timeliness | 7.0 | 13 stress-side indicators at realtime/daily cadence. Language normalization corrects for source-density bias. Recovery capacity adds monthly reserve + debt signals. Gap: structural sources still annual (WGI/GPI/RSF/WHO). Phase 3 reference-edition split formalizes annual vs rolling cadences per pillar. |
Phase 2 acceptance gate status: met. All three required thresholds (Validation >= 8.0, Data >= 9.0, Architecture >= 9.0) are satisfied. The gaps flagged in each axis are tracked against Phase 3 tasks in the upgrade plan.
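The gate check above reduces to a per-axis comparison against the required thresholds. A minimal sketch, using the self-assessed scores and Phase 2 thresholds from this section; the type and function names are hypothetical and not part of the shipped tooling:

```typescript
type Axis =
  | "Validation"
  | "Data"
  | "Architecture"
  | "Methodology"
  | "Explainability"
  | "Timeliness";

// Self-assessed v2.0 scores from the scorecard.
const selfAssessment: Record<Axis, number> = {
  Validation: 8.0,
  Data: 9.0,
  Architecture: 9.0,
  Methodology: 8.5,
  Explainability: 8.0,
  Timeliness: 7.0,
};

// Phase 2 gate: only these three axes carry a required minimum.
const phase2Gate: Partial<Record<Axis, number>> = {
  Validation: 8.0,
  Data: 9.0,
  Architecture: 9.0,
};

// Returns the gated axes whose score falls below the required minimum.
function failingAxes(
  scores: Record<Axis, number>,
  gate: Partial<Record<Axis, number>>,
): Axis[] {
  return (Object.keys(gate) as Axis[]).filter(
    (axis) => scores[axis] < (gate[axis] ?? 0),
  );
}

const failures = failingAxes(selfAssessment, phase2Gate);
const gateMet = failures.length === 0; // true: every gated axis meets its minimum
```

All three gated axes sit exactly on their thresholds, so the result depends on the comparison being inclusive (>=).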

Editorial notes

  • This document is maintained at parity with OECD/JRC composite-indicator standards: every dimension has a named source, direction, goalpost range, weight rationale, cadence, and imputation class. A methodology doc linter (Phase 1 T1.8) validates that the list of dimensions in the indicator registry matches the list documented here and fails CI if they drift.
  • For questions about an individual country’s score, the widget footer shows the dataVersion, the confidence label, and the 30-day delta; the deep-dive panel exposes per-dimension breakdowns so an analyst can see which component moved. The full proto schema lives in docs/api/ResilienceService.openapi.yaml.
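The parity check performed by the methodology doc linter can be pictured as a two-way set difference between the registry and the documented dimensions. The sketch below is an assumption about its shape, not the shipped linter: the function name is hypothetical, and apart from informationCognitive the dimension ids are illustrative.

```typescript
interface DriftReport {
  missingFromDocs: string[]; // in the registry but not documented
  missingFromRegistry: string[]; // documented but not in the registry
}

// Compares the two dimension lists in both directions.
function dimensionDrift(registry: string[], documented: string[]): DriftReport {
  const registrySet = new Set(registry);
  const documentedSet = new Set(documented);
  return {
    missingFromDocs: registry.filter((d) => !documentedSet.has(d)),
    missingFromRegistry: documented.filter((d) => !registrySet.has(d)),
  };
}

// A CI step would fail when either list is non-empty.
function hasDrift(report: DriftReport): boolean {
  return report.missingFromDocs.length > 0 || report.missingFromRegistry.length > 0;
}

// Illustrative ids only.
const report = dimensionDrift(
  ["informationCognitive", "currencyExternal", "fiscalSpace"],
  ["informationCognitive", "currencyExternal"],
);
const shouldFailCi = hasDrift(report); // true: fiscalSpace is undocumented
```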