Case study: Critical infrastructure resilience — a startup-to-enterprise scale story
A concrete implementation with numbers, lessons learned, and what to copy/avoid. Focus on data quality, standards alignment, and how to avoid measurement theater.
Between the 1970s and 2010s, recorded economic losses from climate-related disasters increased roughly seven- to eightfold (from $198 billion to $1.6 trillion), with infrastructure assets bearing a significant portion of that burden. In the United States alone, 387 natural disasters between 1980 and mid-2024 each caused at least $1 billion in losses, with hurricanes averaging $22.5 billion per event. Europe now faces €3.4 billion per year in infrastructure damage, a figure projected to multiply tenfold by 2100 under current warming trajectories. These numbers underscore a fundamental truth: the gap between infrastructure design assumptions and climate reality is widening faster than most organizations can adapt.
Yet within this crisis lies opportunity. McKinsey projects climate resilience technology will represent a $1 trillion market by 2030, while the Coalition for Disaster Resilient Infrastructure (CDRI) estimates that every dollar invested in disaster-resilient infrastructure yields 7–12x returns over asset lifecycles. The question facing sustainability leaders is no longer whether to invest in resilience, but how to scale solutions from promising pilots to enterprise-grade deployments—a journey littered with measurement theater, data quality failures, and misaligned incentives.
This case study examines the mechanics of that scaling journey, drawing on real-world implementations from 2024–2025 to identify what works, what fails, and what the next generation of resilience infrastructure demands.
Why It Matters
Infrastructure failure cascades through economies in ways that traditional risk assessments systematically underestimate. When a power substation fails during a heatwave, the direct repair cost represents only a fraction of total economic damage. Hospital equipment goes offline. Food spoils in warehouses. Manufacturing lines halt. A 2024 study by the Global Association of Risk Professionals found that current financial risk models "heavily underestimate" indirect and cascading consequences, with supply chain interconnections potentially amplifying GDP losses by up to 30 times compared to direct asset damage alone.
The fiscal implications are equally severe. Climate-related damage to U.S. paved roads alone could cost up to $20 billion to repair by century's end, with upgrades to withstand changing conditions adding another $5.8–$10 billion. U.S. drinking water infrastructure requires approximately $625 billion over the next two decades for maintenance, plus an additional $448–$944 billion through 2050 for climate adaptation measures. These costs fall disproportionately on state and local governments, which issue bonds for roughly two-thirds of U.S. infrastructure projects. As climate impacts intensify, credit rating agencies may lower government ratings, increasing borrowing costs precisely when investment needs peak.
The human dimension compounds these economic pressures. Some 99.5% of U.S. congressional districts experienced at least one federally declared disaster for extreme weather between 2023 and 2024. Least Developed Countries and Small Island Developing States face 10–30 times greater exposure to climate-related disasters than OECD nations, with fewer resources to respond. The OECD estimates that $6.9 trillion in annual investment is required by 2030 to meet global climate and development infrastructure objectives, a target that demands a scale and speed that traditional procurement cannot deliver.
Key Concepts
Resilience Metrics That Matter
Effective infrastructure resilience demands metrics that capture both acute shock resistance and chronic stress adaptation. The most operationally useful frameworks distinguish between:
Return Period Analysis: Traditional engineering uses historical data to estimate event frequencies (e.g., "100-year flood"). Climate change invalidates these baselines. Leading organizations now use dynamic return periods that adjust projections based on emissions scenarios and observed trend acceleration.
Time to Recovery (TTR): How quickly does infrastructure return to operational capacity after disruption? This metric matters more than pure damage resistance for most economic applications. A road that floods but drains within hours causes less economic damage than one that remains impassable for days.
Graceful Degradation: Does infrastructure fail safely? A power grid that sheds load incrementally preserves critical services; one that cascades to blackout amplifies harm. Top-performing systems design explicit degradation pathways for each failure mode.
Interdependency Mapping: Infrastructure systems depend on each other. A water treatment plant requires electricity; data centers require cooling water. Effective resilience assessment maps these dependencies and identifies single points of failure that create systemic risk.
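As a rough illustration of interdependency mapping, the sketch below walks a small dependency graph and ranks assets by how many others would go down with them. The asset names and the `DEPENDS_ON` map are hypothetical, and the model assumes no redundancy: losing any one supplier takes the dependent asset offline.

```python
from collections import defaultdict

# Hypothetical dependency map: each asset lists the assets it needs to operate.
DEPENDS_ON = {
    "substation_a": [],
    "substation_b": [],
    "water_plant": ["substation_a"],
    "data_center": ["substation_a", "water_plant"],
    "hospital": ["substation_a", "water_plant"],
    "pump_station": ["substation_b"],
}


def downstream_impact(depends_on):
    """Return, for each asset, the set of assets that fail if it fails."""
    # Invert the map so we can walk from a supplier to its dependents.
    dependents = defaultdict(set)
    for asset, suppliers in depends_on.items():
        for supplier in suppliers:
            dependents[supplier].add(asset)

    impact = {}
    for asset in depends_on:
        seen, stack = set(), [asset]
        while stack:
            for child in dependents[stack.pop()]:
                if child not in seen:  # also guards against cycles
                    seen.add(child)
                    stack.append(child)
        impact[asset] = seen
    return impact


impact = downstream_impact(DEPENDS_ON)
ranked = sorted(impact.items(), key=lambda item: len(item[1]), reverse=True)
for asset, affected in ranked:
    print(f"{asset}: {len(affected)} downstream -> {sorted(affected)}")
```

Assets at the top of the ranking are candidate single points of failure. A fuller model would add redundancy (an asset that survives on either of two feeds) and weight each downstream asset by consequence rather than counting them equally.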
Early Warning Systems
The value of early warning compounds nonlinearly with lead time. A 24-hour flood warning enables evacuation and asset protection. A 72-hour warning enables supply chain rerouting and production rescheduling. A 7-day warning enables financial hedging and demand management.
Modern early warning systems integrate satellite observations, IoT sensor networks, and AI-driven predictive models. Companies like Silurian (Y Combinator-backed) now offer earth simulation models that power weather prediction, wildfire risk assessment, and energy grid load forecasting. Pano AI's wildfire detection platform serves over 250 first responder agencies, while Technosylva's catastrophic weather simulation tools attracted investment from General Atlantic's BeyondNetZero fund in late 2024.
Adaptive Infrastructure
The concept of adaptive infrastructure shifts design philosophy from static resilience (building to withstand predicted extremes) to dynamic resilience (building systems that learn and adjust). Key mechanisms include:
Modular Design: Infrastructure components that can be upgraded, relocated, or replaced without rebuilding entire systems. This approach reduces stranded asset risk when climate projections shift.
Nature-Based Solutions: Green infrastructure that provides co-benefits while adapting autonomously to changing conditions. WSP's work on New York City's $3.5 billion Green Infrastructure Program demonstrates how permeable surfaces, bioswales, and urban forests reduce flooding while requiring less maintenance than gray infrastructure.
Digital Twins: Virtual replicas of physical infrastructure that enable scenario testing and predictive maintenance. Leading utilities now simulate thousands of climate scenarios against digital twins before committing capital to physical upgrades.
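The scenario-testing idea behind digital twins can be sketched with a toy Monte Carlo model. This is a stand-in, not a digital twin: it draws annual peak water levels from a Gumbel distribution and counts how often each candidate flood-wall height is overtopped. The scenario parameters, wall heights, and function names are all illustrative assumptions.

```python
import math
import random


def sample_peak_level(rng, location_m, scale_m):
    """Draw one annual peak water level (m) from a Gumbel distribution."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return location_m - scale_m * math.log(-math.log(u))


def overtopping_rate(crest_m, location_m, scale_m, n_runs=10_000, seed=0):
    """Fraction of simulated years in which the peak exceeds the wall crest."""
    rng = random.Random(seed)
    hits = sum(
        sample_peak_level(rng, location_m, scale_m) > crest_m
        for _ in range(n_runs)
    )
    return hits / n_runs


# Hypothetical scenarios: (label, Gumbel location in m, Gumbel scale in m).
SCENARIOS = [
    ("baseline", 2.0, 0.40),
    ("mid_warming", 2.3, 0.45),
    ("high_warming", 2.6, 0.50),
]
DESIGN_CRESTS_M = [3.5, 4.0]  # candidate wall heights to compare

results = {
    (label, crest): overtopping_rate(crest, location, scale)
    for label, location, scale in SCENARIOS
    for crest in DESIGN_CRESTS_M
}

for (label, crest), rate in results.items():
    print(f"{label:>12} | crest {crest} m | overtopped in {rate:.1%} of years")
```

Reusing the same seed for every run means each design faces the same random draws, so differences in the output come from the design and scenario rather than from sampling noise.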
What's Working and What Isn't
What's Working
Successful Startup-to-Enterprise Scaling Patterns
Organizations achieving production-grade resilience infrastructure share common patterns that transcend sector and geography.
Pattern 1: Narrow Scope, Deep Integration
The most successful scaling stories begin with ruthlessly narrow problem definitions. Form Energy didn't attempt to solve all energy storage challenges—it focused specifically on multi-day duration storage that enables renewable integration during extended low-production periods. This narrow focus allowed deep integration with existing grid operations rather than requiring greenfield deployment.
Similarly, Nira's grid mapping tool addresses a single, high-value pain point: showing available capacity at interconnection points. By solving this specific problem exceptionally well, Nira attracted enterprise customers including AES and Cypress Creek before expanding scope.
Pattern 2: Observability-First Architecture
Organizations that scale successfully instrument their systems heavily from day one. This means logging not just outcomes but decision processes, confidence scores, and environmental conditions. When Jacobs deployed digital operations and maintenance systems for California wastewater infrastructure in 2024, comprehensive observability enabled rapid diagnosis and iteration when performance deviated from expectations.
A typical enterprise deployment now generates 2–5 MB of logs per operational hour, roughly four times the 2023 norm. Storage costs are real, but diagnostic capability determines whether problems become lessons or recurring failures.
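One way to log decision processes alongside outcomes is a structured record written as one JSON line per decision. The sketch below is a minimal example of that idea. The `DecisionRecord` class, its field names, and the sample values are assumptions for illustration, not a schema from any deployment described here.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    asset_id: str
    decision: str
    confidence: float         # model confidence in [0, 1]
    inputs: dict              # signals the decision was based on
    environment: dict         # ambient conditions at decision time
    outcome: str = "pending"  # updated once the result is known
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        """Serialize to a single line suitable for an append-only log file."""
        return json.dumps(asdict(self), sort_keys=True)


record = DecisionRecord(
    asset_id="pump-17",
    decision="reduce_flow",
    confidence=0.82,
    inputs={"inflow_m3_s": 4.1, "wet_well_level_m": 2.7},
    environment={"air_temp_c": 38.5, "rain_mm_h": 0.0},
)
line = record.to_json_line()
print(line)
```

Keeping confidence and environmental conditions in the same record as the decision makes it possible to ask later whether failures cluster at low confidence or under particular conditions.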
Pattern 3: Hybrid Autonomy
Pure automation rarely works at enterprise scale. The highest-performing deployments use tiered architectures: fully autonomous handling for high-confidence, low-stakes decisions; human-in-the-loop for medium-confidence or high-stakes situations; human-on-the-loop for aggregate pattern monitoring. This hybrid approach achieves both efficiency gains and appropriate oversight.
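The tiering described above can be expressed as a small routing rule. In this sketch the thresholds and names are illustrative assumptions. It also sends low-confidence decisions to a human, which the tiers above do not spell out.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"
    HUMAN_IN_THE_LOOP = "human_in_the_loop"


# Illustrative thresholds only; real values would come from operating history.
HIGH_CONFIDENCE = 0.90
HIGH_STAKES_USD = 50_000


def route_decision(confidence: float, impact_usd: float) -> Tier:
    """Automate only when confidence is high AND the stakes are low."""
    if confidence >= HIGH_CONFIDENCE and impact_usd < HIGH_STAKES_USD:
        return Tier.AUTONOMOUS
    return Tier.HUMAN_IN_THE_LOOP


def autonomy_share(tiers) -> float:
    """Aggregate figure for human-on-the-loop review of overall patterns."""
    if not tiers:
        return 0.0
    return sum(tier is Tier.AUTONOMOUS for tier in tiers) / len(tiers)


# Hypothetical (confidence, estimated impact in USD) pairs.
decisions = [(0.97, 1_200), (0.95, 250_000), (0.72, 3_000), (0.93, 800)]
tiers = [route_decision(conf, impact) for conf, impact in decisions]
print([tier.value for tier in tiers])
print(f"Handled autonomously: {autonomy_share(tiers):.0%}")
```

A sudden shift in the autonomous share is the kind of aggregate signal a human-on-the-loop reviewer would investigate, since it can indicate drifting model confidence or a change in the mix of incoming events.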
What Isn't Working
Measurement Theater
Many organizations report impressive resilience metrics that collapse under scrutiny. Common failure modes include:
Counting Preparedness as Resilience: Having a plan is not the same as having capability. Organizations frequently measure plan completion rates, training attendance, and asset inventories while ignoring actual performance during stress events.
Self-Reported Metrics Without Verification: Resilience assessments often rely on infrastructure operators evaluating their own performance. Without independent verification or sampling-based audits, optimistic bias inflates reported capabilities by 20–40% compared to actual performance.
Ignoring Tail Cases: Resilience programs often achieve 95%+ success rates on common scenarios while catastrophically failing on rare events. These tail failures—comprising 2–5% of events—can destroy more value than routine successes create. The 2024 Pew Charitable Trusts analysis found most U.S. cities remain in "reactive mode" rather than proactively addressing low-probability, high-consequence scenarios.
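To check self-reported metrics, one option is to audit a random sample of sites independently and compare the two sets of scores. The sketch below assumes a hypothetical 0–100 readiness score per site. The site names and scores are made up, and in practice `FIELD_VERIFIED` would only exist for the sites auditors actually visit.

```python
import random
from statistics import mean

# Hypothetical readiness scores (0-100) reported by operators themselves.
SELF_REPORTED = {"site_a": 90, "site_b": 80, "site_c": 96, "site_d": 70, "site_e": 84}
# What an independent audit would find at each site (made-up values).
FIELD_VERIFIED = {"site_a": 75, "site_b": 64, "site_c": 80, "site_d": 56, "site_e": 70}


def draw_audit_sample(site_ids, sample_size, seed=0):
    """Pick sites to audit at random so operators cannot choose their best ones."""
    rng = random.Random(seed)
    return rng.sample(sorted(site_ids), sample_size)


def reporting_inflation(self_reported, verified):
    """Mean relative gap between self-reported and verified scores."""
    gaps = [
        (self_reported[site] - verified[site]) / verified[site]
        for site in verified
    ]
    return mean(gaps)


sample = draw_audit_sample(SELF_REPORTED, sample_size=3)
verified = {site: FIELD_VERIFIED[site] for site in sample}
inflation = reporting_inflation(SELF_REPORTED, verified)
print(f"Audited {sample}: self-reports run {inflation:.0%} above verified scores")
```

The estimated inflation can then be applied as a correction to unaudited sites, or used to decide whether a larger audit is warranted.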
Data Quality Failures
Effective resilience requires accurate baseline data about infrastructure condition, exposure, and interdependencies. Common data quality failures include:
Outdated Asset Inventories: Many utilities operate with asset databases that haven't been ground-truthed in decades. Digital transformation initiatives frequently discover 15–25% discrepancies between recorded and actual infrastructure configurations.
Missing Interdependency Documentation: Infrastructure systems evolved independently, and their interconnections often exist in tribal knowledge rather than formal documentation. When key personnel retire or transfer, this knowledge disappears.
Incompatible Data Standards: Different infrastructure sectors use incompatible data formats, coordinate systems, and classification schemes. Integrating electric grid data with water system data with transportation network data requires extensive manual reconciliation that introduces errors and delays.
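Ground-truthing an asset inventory comes down to comparing the register against a field survey and counting the disagreements. This sketch uses a handful of made-up water-network assets keyed by ID. The IDs, attributes, and the `reconcile` helper are hypothetical, and the resulting rate reflects toy data only.

```python
# Hypothetical records: asset ID -> (asset type, nominal diameter in mm).
RECORDED = {
    "V-001": ("valve", 300),
    "V-002": ("valve", 450),
    "P-010": ("pipe", 600),
    "H-100": ("hydrant", 150),
}
SURVEYED = {
    "V-001": ("valve", 300),
    "V-002": ("valve", 400),    # diameter differs from the register
    "P-010": ("pipe", 600),
    "H-205": ("hydrant", 150),  # found in the field, absent from the register
}


def reconcile(recorded, surveyed):
    """Compare an asset register against a field survey, keyed by asset ID."""
    recorded_ids, surveyed_ids = set(recorded), set(surveyed)
    missing_in_field = sorted(recorded_ids - surveyed_ids)
    unrecorded_in_field = sorted(surveyed_ids - recorded_ids)
    attribute_mismatch = sorted(
        asset_id
        for asset_id in recorded_ids & surveyed_ids
        if recorded[asset_id] != surveyed[asset_id]
    )
    total = len(recorded_ids | surveyed_ids)
    discrepancies = (
        len(missing_in_field) + len(unrecorded_in_field) + len(attribute_mismatch)
    )
    return {
        "missing_in_field": missing_in_field,
        "unrecorded_in_field": unrecorded_in_field,
        "attribute_mismatch": attribute_mismatch,
        "discrepancy_rate": discrepancies / total if total else 0.0,
    }


report = reconcile(RECORDED, SURVEYED)
for key, value in report.items():
    print(f"{key}: {value}")
```

Separating the three discrepancy types matters because each points to a different process failure: decommissioning that was never recorded, installations that were never registered, and modifications that were never updated.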
Premature Scaling
Organizations frequently scale resilience solutions based on pilot success without recognizing that pilot conditions differ from production: curated scenarios, motivated teams, extra attention from developers. Production environments surface failure modes that pilots miss. Research indicates that 67% of scaled deployments underperform their pilots by at least 20% on key metrics.
Key Players
Established Leaders
Jacobs Solutions — Global engineering firm with extensive climate resilience practice. Recent projects include Metro Vancouver's 5.3-mile water supply tunnel for climate adaptation, San José-Santa Clara's $129 million wastewater facility upgrade, and digital operations systems for California infrastructure. Revenue exceeded $16 billion in 2024.
WSP Global — Engineering consultancy involved in 74 of Canada's top 100 megaprojects in 2024. WSP's nature-based solutions practice leads projects including NYC's $3.5 billion Green Infrastructure Program, the Thames TEAM2100 flood defense system protecting 1.4 million people, and Florida's Everglades restoration. Approximately 65% of revenues tied to UN Sustainable Development Goals.
Black & Veatch — Infrastructure solutions provider specializing in power, water, and telecommunications resilience. The firm's adaptive infrastructure practice focuses on grid modernization, water security, and community resilience planning across North America and Asia-Pacific.
AECOM — Global infrastructure consulting firm with dedicated climate resilience services spanning transportation, water, buildings, and energy systems. AECOM's digital capabilities enable scenario planning and risk quantification for public and private sector clients.
Emerging Startups
Form Energy — Developer of multi-day energy storage systems using iron-air battery chemistry. The company's technology enables grid resilience during extended renewable production gaps, addressing a critical infrastructure vulnerability as renewable penetration increases.
Pano AI — AI-powered wildfire detection platform serving 250+ first responder agencies. Pano's camera network and machine learning systems provide early warning that reduces fire spread and infrastructure damage.
Technosylva — Catastrophic weather simulation and response platform. The company's wildfire behavior models enable utilities to preemptively de-energize lines, reducing ignition risk while minimizing service disruption. Received investment from General Atlantic's BeyondNetZero fund in November 2024.
HyLight — Developer of autonomous airships for infrastructure monitoring. HyLight's hydrogen-powered platforms detect methane leaks and power line defects across pipeline and transmission networks, with 10-hour flight endurance enabling comprehensive asset surveillance.
Silurian — Earth simulation platform providing weather prediction, wildfire risk assessment, and grid load forecasting. Y Combinator-backed startup applying frontier AI to climate and infrastructure applications.
Key Investors & Funders
Breakthrough Energy Ventures — Bill Gates-backed fund focusing on climate solutions including energy storage, grid resilience, and sustainable infrastructure. Portfolio includes Form Energy and other resilience-focused companies.
Lightsmith Group — Climate resilience-focused private equity firm with $186 million under management. Investments target climate adaptation solutions including off-grid water systems, agricultural supply chain resilience, and AI-powered infrastructure monitoring.
Convective Capital — Venture fund exclusively focused on wildfire prevention, mitigation, suppression, and recovery technologies. The fund's narrow focus enables deep sector expertise and value-add for portfolio companies.
Energy Impact Partners — Climate-focused venture and growth equity fund with over $2.5 billion in assets under management. EIP invests across the energy value chain from generation through distribution and consumption.
Invesco — Launched $500 million climate adaptation fund in 2024 targeting both private and public sector resilience infrastructure investments.
Examples
Jacobs and Metro Vancouver Water Resilience
In December 2024, Jacobs was selected to design Metro Vancouver's Coquitlam Lake Water Supply Project, a 5.3-mile water supply tunnel intended to enhance climate resilience for the region's water system. The project exemplifies enterprise-scale infrastructure resilience: it addresses chronic drought risk while providing redundancy against acute seismic events. Key success factors included Jacobs' prior regional experience (reducing knowledge transfer friction), explicit climate scenario integration in design specifications, and phased delivery that allows course correction as climate projections evolve. The project builds on Jacobs' earlier work on the Iona Wastewater Treatment Plant, which incorporated coastal resilience improvements against projected sea level rise.
Pano AI Wildfire Detection Scale-Up
Pano AI's trajectory from startup to enterprise adoption illustrates successful resilience technology scaling. Founded to address wildfire detection gaps exposed by California's devastating 2017–2020 fire seasons, Pano deployed an AI-powered camera network that identifies smoke signatures within minutes of ignition. By 2024, the company had raised $89 million and served over 250 first responder agencies. Critical scaling decisions included focusing on early warning (where value compounds nonlinearly with lead time) rather than suppression, partnering with existing emergency management structures rather than displacing them, and building observability systems that demonstrate ROI through documented prevented spread. Pano's technology has contributed to multiple early-stage fire containments that avoided infrastructure damage estimated in the hundreds of millions of dollars.
WSP Thames TEAM2100 Flood Defense
WSP's ongoing work on the Thames Estuary Asset Management 2100 (TEAM2100) program demonstrates resilience at urban-region scale. The program manages flood defenses protecting 1.4 million people and £396 billion ($500 billion) in property across 330 kilometers of the River Thames. Rather than static barrier construction, TEAM2100 implements adaptive pathways that adjust interventions based on observed sea level rise and storm surge trends. The program's hybrid approach—combining gray infrastructure (barriers, gates) with green infrastructure (wetland restoration, floodplain management)—provides redundancy while generating co-benefits for biodiversity and recreation. WSP's role spans engineering design, program management, and long-term asset optimization, illustrating how established firms translate startup innovations into enterprise-scale deployment.
Action Checklist
- Conduct comprehensive asset inventory with ground-truth verification, documenting discrepancies between records and reality
- Map infrastructure interdependencies explicitly, identifying single points of failure that create cascading risk
- Establish dynamic return period analysis that incorporates climate projections rather than historical baselines
- Implement observability systems that log not just outcomes but decision processes, confidence levels, and environmental conditions
- Design explicit graceful degradation pathways for each critical infrastructure system, tested through tabletop exercises
- Deploy early warning integrations that maximize decision lead time for operators, supply chains, and financial hedging
- Create sampling-based verification protocols that independently audit self-reported resilience metrics
- Establish hybrid autonomy frameworks that match decision authority to consequence magnitude and confidence levels
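For the dynamic return period item in the checklist above, the underlying arithmetic is short. A "T-year" event has an annual exceedance probability of 1/T, and the chance of seeing at least one such event over a planning horizon follows from multiplying the yearly survival probabilities. The sketch below assumes years are independent, and the rising probability path in the second case is a made-up illustration rather than a climate projection.

```python
def exceedance_probability(return_period_years, horizon_years):
    """Chance of at least one exceedance over the horizon (stationary climate)."""
    annual_p = 1.0 / return_period_years
    return 1.0 - (1.0 - annual_p) ** horizon_years


def nonstationary_exceedance(annual_probs):
    """Same quantity when the annual probability changes from year to year."""
    survival = 1.0
    for p in annual_probs:
        survival *= 1.0 - p
    return 1.0 - survival


HORIZON_YEARS = 30

# Historical baseline: a "100-year" flood stays a 1%-per-year event.
static = exceedance_probability(100, HORIZON_YEARS)

# Hypothetical trend: annual probability rises linearly from 1% to 2%.
rising_probs = [
    0.01 + 0.01 * year / (HORIZON_YEARS - 1) for year in range(HORIZON_YEARS)
]
drifting = nonstationary_exceedance(rising_probs)

print(f"Static baseline over {HORIZON_YEARS} years:   {static:.1%}")
print(f"Drifting baseline over {HORIZON_YEARS} years: {drifting:.1%}")
```

Even under the static assumption, a 100-year event has roughly a 26% chance of occurring at least once in 30 years, which is why design lifetimes matter as much as the headline return period.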
FAQ
Q: How do we justify resilience investment when benefits are probabilistic and long-term?
A: Frame resilience as insurance with compounding returns. CDRI data shows disaster-resilient infrastructure adds 5–15% to upfront costs but yields 7–12x returns over asset lifecycles. Additionally, calculate avoided losses from recent near-miss events to make probabilistic benefits tangible. Many organizations find that a single avoided disruption pays for years of resilience investment.
Q: What's the minimum viable monitoring infrastructure for meaningful resilience assessment?
A: Start with the intersection of high-consequence assets and measurable exposure. Prioritize sensors that capture leading indicators (groundwater levels, soil moisture, grid frequency) rather than lagging indicators (damage reports). A network of 50–100 strategically placed sensors often provides more actionable intelligence than thousands of poorly positioned ones. Complement physical sensors with satellite observation subscriptions for broad coverage.
Q: How do we handle the tension between standardized resilience frameworks and context-specific implementation?
A: Use frameworks for comparability and communication while customizing implementation for local conditions. Adopt established standards (ISO 14090 for climate adaptation, ISO 22301 for business continuity) as baseline language, then develop context-specific metrics that capture what actually matters for your infrastructure and community. Document deviations from standards with explicit rationale to maintain auditability.
Q: What's the typical timeline for resilience technology to move from pilot to enterprise deployment?
A: Expect 18–36 months from successful pilot to production-grade deployment, with significant variation based on regulatory environment, integration complexity, and organizational change management capacity. The pilot-to-scale transition typically requires 3–5x the resources of the original pilot, primarily for integration, training, and process redesign rather than technology costs.
Q: How should we evaluate emerging resilience startups versus established engineering firms?
A: Startups typically offer innovation velocity and specialized capability; established firms offer integration capacity and operational credibility. The optimal approach often involves partnering arrangements where startups provide point solutions that established firms integrate into comprehensive programs. Evaluate startups on problem-solution fit and team capability; evaluate established firms on relevant reference projects and assigned team experience.
Sources
- OECD, "Infrastructure for a Climate-Resilient Future," April 2024
- Coalition for Disaster Resilient Infrastructure (CDRI), "Global Infrastructure Risk 2025 Africa Working Paper," January 2025
- Pew Charitable Trusts, "Climate Change Poses Risks to Neglected Public Transportation and Water Systems," September 2024
- Global Association of Risk Professionals (GARP), "Climate-Related Infrastructure Failure Has Complex and Far-Reaching Economic Impacts," August 2024
- McKinsey & Company, "Climate Resilience Technology: An Inflection Point for New Investment," 2024
- PwC, "State of Climate Tech 2024"
- U.S. Department of Transportation, "2024–2027 Climate Adaptation Plan," June 2024
- European Environment Agency, "European Climate Risk Assessment," 2024