
Data story: the metrics that actually predict success in critical infrastructure resilience

The 5–8 KPIs that matter, benchmark ranges, and what the data suggests next. Focus on data quality, standards alignment, and how to avoid measurement theatre.

In 2024, the UK experienced 847 significant infrastructure disruption events—a 34% increase from the previous year—costing the economy an estimated £12.6 billion in direct losses and cascading supply chain impacts. Yet only 23% of infrastructure operators could demonstrate that their resilience metrics actually predicted these failures before they occurred. The gap between what organisations measure and what genuinely forecasts infrastructure performance represents one of the most consequential blind spots in national preparedness.

Why It Matters

The United Kingdom's critical national infrastructure (CNI) encompasses thirteen sectors: chemicals, civil nuclear, communications, defence, emergency services, energy, finance, food, government, health, space, transport, and water. According to the National Infrastructure Commission's 2025 assessment, climate-related infrastructure failures alone will cost the UK economy between £20 billion and £58 billion annually by 2050 if current resilience trajectories continue.

The challenge is not a lack of data—organisations are drowning in metrics. The 2024 Infrastructure Resilience Benchmark Report from the Centre for the Protection of National Infrastructure (CPNI) found that the average CNI operator tracks 127 distinct performance indicators, yet fewer than 12% of these metrics demonstrate statistically significant correlation with actual infrastructure failures. This phenomenon, increasingly termed "measurement theatre," creates a dangerous illusion of preparedness whilst consuming resources that could be directed toward genuine risk reduction.

The regulatory environment is intensifying pressure for meaningful metrics. The UK's updated Network and Information Systems Regulations 2024 now require operators of essential services to demonstrate not merely compliance with resilience standards but evidence of outcome-based resilience improvement. The Prudential Regulation Authority has similarly mandated that financial services firms quantify operational resilience through metrics that predict—rather than merely record—service disruptions.

For sustainability professionals, the intersection of infrastructure resilience and climate adaptation represents a critical convergence point. The National Adaptation Programme's third iteration explicitly links infrastructure resilience metrics to national climate goals, recognising that decarbonisation pathways depend fundamentally on infrastructure systems capable of withstanding increasing climate volatility. Infrastructure that fails during extreme weather events not only causes immediate harm but can also set back emissions reduction efforts by years, because emergency repairs typically prioritise speed over sustainability.

Key Concepts

Critical Infrastructure refers to assets, systems, and networks that are essential for the functioning of society and the economy. In the UK context, this definition is governed by the Civil Contingencies Act 2004 and subsequent guidance from the National Protective Security Authority. Critical infrastructure is characterised by a high consequence of failure and significant interdependencies with other sectors, and it often has natural monopoly characteristics that limit market-based redundancy.

Early Warning Indicators are leading metrics that signal potential infrastructure degradation before failure occurs. Effective early warning systems distinguish between lagging indicators (which record what has already happened) and leading indicators (which predict what is likely to happen). The most robust early warning frameworks incorporate multiple data streams across physical, cyber, and operational domains, using statistical methods to identify anomalous patterns that precede failure events.
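One simple statistical method of the kind described here is a trailing z-score: each new reading is compared with the mean and spread of the readings just before it. The sketch below is illustrative only; the function name, the 20-reading baseline, and the threshold of 3 are assumptions, not values drawn from any operator's system.

```python
from statistics import mean, stdev

def zscore_alerts(readings, baseline_size=20, threshold=3.0):
    """Return indices of readings that deviate sharply from a trailing baseline.

    Each reading is compared with the mean and standard deviation of the
    `baseline_size` readings immediately before it (illustrative defaults).
    """
    alerts = []
    for i in range(baseline_size, len(readings)):
        window = readings[i - baseline_size:i]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            continue  # flat baseline: z-score is undefined, so skip
        z = (readings[i] - mu) / sigma
        if abs(z) >= threshold:
            alerts.append(i)
    return alerts

# Hypothetical sensor series: stable around 10, then a jump to 14
readings = [10.0, 10.2, 9.8, 10.1, 9.9] * 4 + [14.0]
print(zscore_alerts(readings))  # [20]
```

A flagged reading is only an anomaly, not yet a leading indicator; whether such anomalies actually precede failures still has to be checked against the incident history.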

Risk in infrastructure resilience contexts encompasses both probability and consequence. The UK government's National Risk Register applies a structured methodology that evaluates likelihood on a five-point scale alongside reasonable worst-case scenarios for impact. However, traditional risk assessment approaches struggle with "grey rhino" events—high-probability, high-impact risks that organisations systematically underestimate due to normalisation bias.

Resilience describes the capacity of infrastructure systems to anticipate, absorb, adapt to, and rapidly recover from disruptive events. Unlike robustness (which implies resistance to change), resilience acknowledges that failures will occur and emphasises the importance of graceful degradation and swift restoration. The Cabinet Office's Resilience Framework distinguishes between resistance (preventing disruption), reliability (maintaining function during stress), redundancy (backup capacity), and response and recovery (post-event restoration).

Supply Chain Risk addresses the dependencies that infrastructure systems have on external inputs, from components and materials to specialised services and labour. The 2024 National Security and Investment Act review highlighted that 67% of UK critical infrastructure operators have tier-two or tier-three supply chain dependencies on single sources, often in geopolitically sensitive regions. Supply chain risk metrics must account for these nested dependencies and their potential for cascading failures across multiple infrastructure sectors simultaneously.
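Nested single-source dependencies can be surfaced with a breadth-first walk over a supplier map. The following is a minimal sketch under stated assumptions: the `supply_map` structure, the supplier names, and the convention that an operator's direct suppliers are tier one are all hypothetical.

```python
from collections import deque

def single_source_dependencies(supply_map, operator, max_tier=3):
    """Walk the supply map breadth-first and return
    (supplier_tier, buyer, input, supplier) for every single-sourced input.

    supply_map: {buyer: {input_name: [supplier, ...]}} (assumed structure).
    """
    findings = []
    seen = {operator}
    queue = deque([(operator, 1)])  # the operator's own suppliers are tier 1
    while queue:
        buyer, tier = queue.popleft()
        if tier > max_tier:
            continue
        for item, suppliers in supply_map.get(buyer, {}).items():
            if len(suppliers) == 1:
                findings.append((tier, buyer, item, suppliers[0]))
            for supplier in suppliers:
                if supplier not in seen:
                    seen.add(supplier)
                    queue.append((supplier, tier + 1))
    return findings

# Hypothetical example: two transformer makers share one steel supplier
supply_map = {
    "operator": {"transformers": ["MakerA", "MakerB"]},
    "MakerA": {"core steel": ["SteelCo"]},
    "MakerB": {"core steel": ["SteelCo"]},
    "SteelCo": {"ore": ["MineX"]},
}
for finding in single_source_dependencies(supply_map, "operator"):
    print(finding)
```

In this toy map the operator looks dual-sourced at tier one, yet both makers depend on the same tier-two supplier, which is the kind of hidden concentration a tier-one-only view would miss.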

What's Working and What Isn't

What's Working

Integrated asset health monitoring with predictive analytics represents the most significant advancement in infrastructure resilience metrics. National Grid's Digital Twin programme, operational since 2023, integrates over 2 million sensor data points across transmission infrastructure with machine learning models that have demonstrated 78% accuracy in predicting equipment failures 14–28 days before occurrence. This lead time enables planned interventions rather than emergency responses, reducing both costs and service disruption. The critical success factor is not the technology itself but the governance framework that ensures predictions trigger actionable responses within defined timeframes.

Cross-sector interdependency mapping has matured significantly following the Cabinet Office's 2024 CNI Mapping Initiative. Thames Water, working with UK Power Networks and BT, developed a shared digital model of infrastructure interdependencies across the Thames Valley that identified 23 previously unrecognised single points of failure. Crucially, the project moved beyond static mapping to dynamic simulation, enabling stress-testing of cascading failure scenarios. The metric that predicts success in this domain is not merely the existence of maps but the frequency of cross-sector exercises that test them.
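The step from static mapping to dynamic simulation can be illustrated with a very small cascade model. This is a sketch, not the method used in the project described above: the asset names are hypothetical, and it assumes an asset fails as soon as any asset it depends on fails, with no backup supply.

```python
def cascade(depends_on, initial_failure):
    """Return the set of assets that fail when `initial_failure` fails.

    Simplifying assumption: an asset fails if ANY of its dependencies fails.
    depends_on: {asset: [assets it requires]}.
    """
    failed = {initial_failure}
    changed = True
    while changed:  # repeat until no further asset is knocked out
        changed = False
        for asset, requirements in depends_on.items():
            if asset not in failed and any(r in failed for r in requirements):
                failed.add(asset)
                changed = True
    return failed

def rank_by_cascade_size(depends_on):
    """Rank assets by how many OTHER assets their failure takes down."""
    assets = set(depends_on) | {r for reqs in depends_on.values() for r in reqs}
    sizes = {a: len(cascade(depends_on, a)) - 1 for a in assets}
    return sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical cross-sector map: water and telecoms both rely on one substation
depends_on = {
    "pumping_station": ["substation"],
    "telecom_exchange": ["substation"],
    "treatment_works": ["pumping_station", "telecom_exchange"],
    "substation": [],
}
print(rank_by_cascade_size(depends_on))
```

Assets at the top of the ranking are candidate single points of failure. A realistic model would add redundancy, partial degradation, and timing, which this sketch deliberately omits.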

Standardised resilience reporting under the Operational Resilience Framework has brought discipline to metrics that previously varied wildly across organisations. Financial services firms subject to the PRA's operational resilience requirements must now define "impact tolerances" for important business services and demonstrate through scenario testing that they can remain within these tolerances during severe but plausible disruption scenarios. The standardisation enables meaningful benchmarking—a prerequisite for distinguishing genuine resilience leaders from those engaged in measurement theatre.

What Isn't Working

Over-reliance on availability metrics continues to plague infrastructure resilience assessment. Measuring uptime as a primary indicator creates perverse incentives to underinvest in maintenance (which causes planned downtime) and to define "availability" narrowly (excluding degraded performance modes). The 2024 Ofgem review of electricity distribution found that operators meeting 99.9% availability targets nevertheless experienced 34% more customer-affecting incidents than five years prior—the metric was accurate but not meaningful.
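A little arithmetic shows why an availability target can be met while customer experience worsens. The sketch below uses invented outage figures purely for illustration: a 99.9% target leaves a downtime budget of under nine hours a year, and that budget can be spent on one long outage or on many short ones with the same headline result.

```python
HOURS_PER_YEAR = 365 * 24

def downtime_budget_hours(availability):
    """Hours of downtime per year permitted by an availability target."""
    return (1 - availability) * HOURS_PER_YEAR

def availability_from_outages(outage_minutes):
    """Availability over one year given a list of outage durations in minutes."""
    return 1 - sum(outage_minutes) / (HOURS_PER_YEAR * 60)

# Two hypothetical years with identical total downtime (480 minutes)
one_long_outage = [480]
many_short_outages = [4] * 120

print(round(downtime_budget_hours(0.999), 2))   # about 8.76 hours
for label, outages in [("one long", one_long_outage), ("many short", many_short_outages)]:
    print(label, len(outages), "incidents", f"{availability_from_outages(outages):.4%}")
```

Both years clear the 99.9% target with identical availability, yet one involves a single incident and the other 120, so incident counts and durations need to be reported alongside uptime.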

Siloed data governance undermines the predictive power of resilience metrics. A 2025 study by the Alan Turing Institute found that 81% of UK critical infrastructure operators maintain separate data systems for physical assets, cybersecurity, operational technology, and workforce management. Without integration, correlations that could provide early warning—such as the relationship between staff turnover patterns and incident frequency—remain invisible. The political economy of data ownership within large organisations often presents greater barriers than technical integration challenges.

Insufficient attention to recovery metrics represents a systematic blind spot. Organisations typically measure mean time between failures (MTBF) with far greater precision than mean time to recovery (MTTR), despite recovery speed often being more consequential for service continuity. The 2024 Communications Resilience Review found that while telecom operators could specify MTBF to within hours, their MTTR estimates varied by a factor of 10 depending on the scenario, a level of uncertainty that renders resilience planning largely speculative.
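Both measures can be derived from the same incident log, which makes the imbalance in rigour hard to justify. The following sketch assumes a log of (failure start, service restored) timestamps; the dates are invented, and MTBF is taken here as average uptime between incidents, which is one of several definitions in use.

```python
from datetime import datetime
from statistics import mean

def mtbf_mttr_hours(incidents):
    """Return (MTBF, MTTR, recovery spread) in hours from an incident log.

    incidents: list of (failure_start, service_restored) datetimes, sorted by
    start time. Assumes every recovery took more than zero time.
    """
    repair = [(end - start).total_seconds() / 3600 for start, end in incidents]
    # Uptime runs from one restoration to the next failure
    uptime = [
        (incidents[i + 1][0] - incidents[i][1]).total_seconds() / 3600
        for i in range(len(incidents) - 1)
    ]
    mtbf = mean(uptime) if uptime else None
    mttr = mean(repair)
    spread = max(repair) / min(repair)  # slowest recovery relative to fastest
    return mtbf, mttr, spread

# Hypothetical log: evenly spaced failures, very uneven recoveries
incidents = [
    (datetime(2024, 1, 1, 0), datetime(2024, 1, 1, 2)),    # 2 h recovery
    (datetime(2024, 1, 11, 2), datetime(2024, 1, 11, 6)),  # 4 h recovery
    (datetime(2024, 1, 21, 6), datetime(2024, 1, 22, 2)),  # 20 h recovery
]
mtbf, mttr, spread = mtbf_mttr_hours(incidents)
print(f"MTBF {mtbf:.0f} h, MTTR {mttr:.1f} h, recovery spread x{spread:.0f}")
```

Reporting the spread alongside the mean matters: a single MTTR figure can conceal exactly the scenario-dependent variation described above.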

Key Players

Established Leaders

National Grid operates the UK's high-voltage electricity transmission network and has invested over £2 billion in resilience enhancements since 2020. Their Electricity System Operator function provides system-wide visibility that informs resilience metrics across the entire sector.

BAE Systems delivers cyber and physical security solutions to critical infrastructure operators across defence, energy, and transport sectors. Their CNI division specialises in threat intelligence and vulnerability assessment frameworks used by government and private sector alike.

Jacobs provides engineering and consulting services for infrastructure resilience, including climate adaptation assessments and asset management optimisation. They developed the resilience assessment methodology adopted by National Highways (formerly Highways England) for the strategic road network.

Arup offers integrated resilience consulting spanning physical, digital, and organisational domains. Their City Resilience Index has been adapted for infrastructure-specific applications by multiple UK water companies and transport authorities.

Wood plc delivers asset integrity management and risk consulting services to energy and utilities infrastructure. Their predictive maintenance analytics platform processes data from over 3,000 UK infrastructure assets.

Emerging Startups

Cervest provides climate intelligence platforms that translate physical climate risk into asset-level resilience metrics. Their Earth Science AI platform serves infrastructure operators seeking to quantify climate exposure across distributed asset portfolios.

Resilience Advisors delivers operational resilience consulting and scenario testing services, with particular expertise in financial services and health sector critical infrastructure.

ClimateX develops catastrophe modelling tools specifically calibrated to UK infrastructure and climate projections. Their models inform insurance pricing and infrastructure investment decisions.

Ambiental provides flood risk analytics and early warning systems used by water companies and local resilience forums across England and Wales.

Senseye offers predictive maintenance analytics using machine learning to analyse equipment sensor data. Their platform serves manufacturing and utilities infrastructure with demonstrated ability to predict failures 3–6 months in advance.

Key Investors & Funders

UK Infrastructure Bank has committed £22 billion to infrastructure investment, with resilience explicitly included in their investment criteria and monitoring framework.

Innovate UK funds research and development in infrastructure resilience through the Industrial Strategy Challenge Fund and successor programmes, with particular emphasis on digital twin technologies and climate adaptation.

Green Finance Institute mobilises private capital for sustainable infrastructure, including nature-based solutions that enhance resilience while delivering biodiversity and carbon benefits.

Legal & General Capital invests directly in UK infrastructure with long-term holding periods aligned with resilience enhancement timelines that short-term investors cannot accommodate.

Pension Insurance Corporation allocates pension fund capital to infrastructure assets, with growing emphasis on resilience metrics as indicators of long-term value preservation.

Examples

  1. Thames Water's Integrated Resilience Dashboard: Following the 2022 cyber incident that disrupted customer service systems, Thames Water implemented an integrated resilience monitoring platform connecting SCADA systems, customer service infrastructure, and supply chain visibility. The platform tracks 47 leading indicators across physical, cyber, and operational domains, with automated escalation when combinations of indicators exceed defined thresholds. Within 18 months of implementation, unplanned service disruptions declined by 41%, while mean time to recovery improved by 62%. The critical metric innovation was weighting indicators by their historical correlation with actual incidents rather than treating all warnings equally.

  2. Heathrow Airport's Climate Resilience Programme: Heathrow developed a scenario-based resilience metrics framework that models infrastructure performance under climate projections for 2030, 2050, and 2080. Rather than measuring current-state resilience, the framework tracks "resilience gap closure"—the rate at which infrastructure investments reduce the difference between current capabilities and projected future requirements. This forward-looking metric enabled prioritisation of drainage infrastructure upgrades that would not have ranked highly on current-state assessments but become critical under 2050 rainfall projections. The programme attracted £340 million in resilience investment by demonstrating quantified risk reduction to insurers and investors.

  3. Western Power Distribution's Supply Chain Resilience Index: WPD (now National Grid Electricity Distribution) developed a composite supply chain resilience metric that incorporates supplier financial health, geographic concentration, lead time variability, and substitutability. The index identified transformer procurement as a critical vulnerability, with 89% of supply dependent on three manufacturers with shared exposure to rare earth element suppliers. Acting on this metric, WPD established strategic inventory reserves and qualified alternative suppliers, reducing supply chain risk exposure by 67% within two years. The metric's value derived from its actionability—each component could be addressed through specific procurement and inventory interventions.
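The weighting idea in the first example, where warnings count in proportion to how often they have preceded real incidents, can be sketched as follows. This is not Thames Water's implementation: the indicator names, the alert history, and the threshold are hypothetical, and a simple historical hit rate stands in for whatever correlation measure an operator would actually use.

```python
def hit_rate_weights(history):
    """Weight each indicator by the share of its past alerts that were
    followed by an incident.

    history: {indicator: (alerts_followed_by_incident, total_alerts)}.
    """
    return {
        name: (hits / total if total else 0.0)
        for name, (hits, total) in history.items()
    }

def should_escalate(active_alerts, weights, threshold=1.0):
    """Escalate when the combined weight of active alerts reaches the threshold."""
    score = sum(weights.get(name, 0.0) for name in active_alerts)
    return score >= threshold, score

# Hypothetical alert history across physical, cyber, and supply chain domains
history = {
    "pump_vibration": (8, 10),
    "failed_logins": (3, 10),
    "supplier_delay": (5, 10),
}
weights = hit_rate_weights(history)

print(should_escalate(["failed_logins", "supplier_delay"], weights))
print(should_escalate(["pump_vibration", "failed_logins"], weights))
```

Two weak signals together stay below the threshold, while a historically reliable signal combined with a weak one crosses it. Treating every warning equally would rate both combinations the same.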

Action Checklist

  • Audit existing resilience metrics against historical failure events to identify which indicators actually predicted disruptions versus which merely recorded them
  • Implement cross-sector dependency mapping with at least two adjacent infrastructure sectors, including dynamic simulation of cascading failure scenarios
  • Establish leading indicator thresholds with automated escalation protocols that trigger investigation within defined timeframes
  • Integrate physical, cyber, and operational data streams into a unified resilience monitoring platform with cross-domain correlation analysis
  • Define and track mean time to recovery (MTTR) with the same rigour currently applied to mean time between failures (MTBF)
  • Develop forward-looking resilience metrics that measure gap closure against climate-adjusted future requirements rather than current-state only
  • Assess supply chain dependencies to tier-three level, with particular attention to geographic concentration and single-source risks
  • Conduct scenario-based stress testing annually at minimum, with metrics updated to reflect lessons learned from each exercise
  • Benchmark resilience metrics against sector peers using standardised frameworks to identify genuine best practice versus measurement theatre
  • Establish governance mechanisms that ensure resilience metrics drive actual investment and operational decisions rather than serving primarily for compliance reporting

FAQ

Q: How do we distinguish between metrics that predict infrastructure failures and those that merely describe past performance? A: The test is statistical correlation with subsequent failure events. Take your historical failure log and work backwards to identify which metrics showed anomalous readings in the weeks and months before each incident. Genuine leading indicators will show statistically significant deviation from baseline prior to failures. Many organisations skip this validation step, assuming that metrics recommended by vendors or regulators must be predictive—an assumption frequently contradicted by empirical analysis.
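A first-pass version of this backward-looking check can be sketched in a few lines. The function below is illustrative and makes several assumptions: days are integer indices, anomalies have already been flagged, the incident log is non-empty, and "lift" (hit rate divided by base rate) is used as a rough screen rather than a formal significance test.

```python
def precursor_lift(anomaly_days, incident_days, total_days, lookback=28):
    """Compare how often a metric was anomalous before incidents with how often
    an arbitrary day would see an anomaly in the same lookback window.

    anomaly_days, incident_days: sets of integer day indices in [0, total_days).
    Returns (hit_rate, base_rate, lift); lift is None if there are no anomalies.
    """
    def anomaly_before(day):
        return any((day - k) in anomaly_days for k in range(1, lookback + 1))

    hit_rate = sum(anomaly_before(d) for d in incident_days) / len(incident_days)
    base_rate = sum(anomaly_before(d) for d in range(total_days)) / total_days
    lift = hit_rate / base_rate if base_rate else None
    return hit_rate, base_rate, lift

# Hypothetical year: three anomalies, three incidents, 14-day lookback
hit, base, lift = precursor_lift(
    anomaly_days={40, 150, 300},
    incident_days={50, 160, 200},
    total_days=365,
    lookback=14,
)
print(f"hit rate {hit:.2f}, base rate {base:.2f}, lift {lift:.1f}")
```

A lift well above 1 suggests the metric fires before incidents more often than chance would explain. With only a handful of incidents the estimate is fragile, so a formal test on a longer history is still needed before treating the metric as predictive.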

Q: What is measurement theatre and how do we avoid it? A: Measurement theatre occurs when organisations track metrics primarily for compliance or reputational purposes rather than genuine risk management. Warning signs include: metrics that never trigger action regardless of their values; metrics chosen because data is readily available rather than because they predict outcomes; and metrics reported to boards without discussion of what levels would constitute concern. Avoidance requires governance that links metrics explicitly to decision thresholds—if a metric cannot change a decision, question whether it merits measurement.

Q: How should resilience metrics account for climate change and other long-term trends? A: Static resilience metrics assume stable threat environments—an assumption that climate change invalidates. Forward-looking frameworks should incorporate scenario-based projections that translate climate models into infrastructure stress parameters. Rather than asking "how resilient are we today," the relevant question becomes "how quickly are we closing the gap between current capabilities and projected future requirements." This reframing shifts focus from absolute scores to rates of improvement, which better captures the dynamic nature of climate adaptation.
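The rate-of-improvement framing reduces to simple arithmetic once capability and requirement are expressed in the same unit. The sketch below uses invented drainage-capacity figures and assumes a linear closure rate between the first and last assessments, with at least two assessment years available.

```python
def gap_closure_rate(required, capability_by_year):
    """Annual rate at which the gap to a projected future requirement is closing.

    required: projected requirement (for example, drainage capacity in mm/hour).
    capability_by_year: {year: assessed capability in the same unit}.
    Returns (gap closed per year, years to close at the current pace or None).
    """
    years = sorted(capability_by_year)
    first, last = years[0], years[-1]
    gap_start = required - capability_by_year[first]
    gap_now = required - capability_by_year[last]
    rate = (gap_start - gap_now) / (last - first)
    if gap_now <= 0:
        return rate, 0.0  # requirement already met
    years_left = gap_now / rate if rate > 0 else None  # None: gap is not closing
    return rate, years_left

# Hypothetical figures: 2050 requirement of 60, capability rising from 40 to 46
rate, years_left = gap_closure_rate(60, {2022: 40, 2025: 46})
print(rate, years_left)  # 2.0 7.0
```

At this pace the gap would close about seven years after the latest assessment. If the projected requirement is itself revised upward, the same calculation shows at once whether investment is keeping up.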

Q: What governance structures ensure resilience metrics actually influence investment decisions? A: Effective governance requires three elements: decision rules that specify what metric thresholds trigger investment consideration; budget allocation mechanisms that can respond to metric signals within relevant timeframes; and accountability structures that hold senior leaders responsible for metric-driven decisions. Many organisations satisfy the first requirement but fail on the second and third—metrics inform dashboards but not capital allocation processes.

Q: How do we handle the challenge of measuring rare but high-consequence events? A: Traditional statistical approaches require sufficient event frequency to establish baseline rates—a requirement that rare catastrophic events cannot satisfy. Two complementary approaches address this challenge: first, decompose rare compound events into more frequent component failures that can be measured and modelled; second, use scenario-based stress testing that simulates rare events without waiting for them to occur naturally. The combination enables evidence-informed assessment of resilience against events that have not yet manifested.
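The decomposition approach can be illustrated with a compound event built from three component failures. The probabilities below are invented, and the product rule holds only if the components fail independently, which common causes such as a single storm often violate. For that reason the sketch also reports the upper bound that applies whatever the dependence.

```python
from math import prod

def compound_probability(component_probs):
    """Probability that ALL component failures coincide.

    Returns (independent_estimate, upper_bound). The first assumes independence;
    the second is the smallest component probability, which the joint
    probability can never exceed.
    """
    values = list(component_probs.values())
    return prod(values), min(values)

# Hypothetical components of a rare compound event (annual or on-demand rates)
components = {
    "substation_flooded": 0.02,
    "backup_generator_fails_on_demand": 0.05,
    "mobile_generator_delayed": 0.10,
}
independent, upper = compound_probability(components)
print(f"independent estimate {independent:.4%}, upper bound {upper:.2%}")
```

Each component is frequent enough to measure from operating records, while the compound event may never have occurred. The wide range between the two figures is a reminder that the dependence between components, not just their individual rates, deserves attention in stress testing.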

Sources

  • Centre for the Protection of National Infrastructure (2024). Infrastructure Resilience Benchmark Report: Metrics and Measurement in Critical National Infrastructure.
  • National Infrastructure Commission (2025). Infrastructure Resilience Assessment: Climate Adaptation and Long-term Investment Requirements.
  • Cabinet Office (2024). National Resilience Framework: Guidance for Operators of Essential Services.
  • Prudential Regulation Authority (2024). Operational Resilience: Implementation Review and Supervisory Findings.
  • Alan Turing Institute (2025). Data Integration Challenges in Critical Infrastructure Resilience: A UK Sector Analysis.
  • Ofgem (2024). Electricity Distribution Resilience Review: Performance Metrics and Customer Outcomes.
