How-to: Implement critical infrastructure cybersecurity with a lean team (without regressions)
A step-by-step rollout plan with milestones, owners, and metrics. Focus on attack paths, detection/response, and how to harden real-world systems.
In January-September 2025, ransomware attacks targeting critical infrastructure surged 46% compared to the same period in 2024, with half of all global ransomware incidents now hitting essential services—manufacturing, healthcare, energy, and utilities (IBM Security X-Force Threat Intelligence Index). The Change Healthcare breach alone impacted 190 million Americans and cost $2.87 billion in response and remediation. Yet most critical infrastructure operators lack the headcount of Fortune 500 security teams. This playbook provides a systematic approach for lean teams—typically 2-8 security practitioners—to implement defense-in-depth without creating technical debt or operational regressions that make systems harder to maintain.
Why It Matters
Critical infrastructure underpins modern society: power grids, water treatment, healthcare delivery, transportation networks, and telecommunications. When these systems fail, the consequences cascade beyond corporate balance sheets into public safety. The 2024 Halliburton attack cost $35 million. Transport for London's breach required £30 million in remediation. These aren't abstract statistics—they represent real operational disruptions affecting millions of people.
For lean teams, the challenge compounds. You can't throw headcount at the problem. Every control you implement must be maintainable by your current staff without creating alert fatigue, patch backlogs, or configuration drift. The organizations succeeding in 2024-2025 share a common approach: they prioritize ruthlessly, automate relentlessly, and measure obsessively.
The regulatory environment has intensified this pressure. NIST released Cybersecurity Framework 2.0 in February 2024, adding a sixth core function—Govern—that explicitly requires board-level accountability for cybersecurity risk. CISA's Cybersecurity Performance Goals 2.0 (December 2025) establishes baseline expectations for critical infrastructure operators. The Transportation Security Administration proposed mandatory cyber risk management rules for surface transportation. Lean teams must meet these expectations while managing day-to-day operations.
The sustainability connection is direct: critical infrastructure failures cascade into environmental disasters. The Colonial Pipeline attack disrupted fuel distribution across the eastern United States. A compromised water treatment facility in Oldsmar, Florida nearly poisoned municipal water supplies. Cyberattacks on agricultural systems threaten food security. Protecting these systems isn't just about business continuity—it's about safeguarding the physical infrastructure that enables sustainable development.
Key Concepts
Effective critical infrastructure cybersecurity for lean teams requires mastering four foundational concepts: IT/OT convergence, zero trust architecture, software bill of materials (SBOM), and measurement, reporting, and verification (MRV) for security controls.
IT/OT Convergence
Operational technology (OT)—the industrial control systems, SCADA networks, and programmable logic controllers that run physical infrastructure—historically operated in isolation from information technology (IT) networks. That isolation no longer exists. Modern infrastructure requires data flow between business systems and operational systems for efficiency, predictive maintenance, and regulatory reporting.
This convergence creates attack surfaces that neither IT nor OT teams fully understand on their own. IT security professionals may not be able to distinguish legitimate OT protocol traffic from malicious traffic. OT engineers may not understand phishing vectors or credential attacks. Lean teams must build cross-functional competency or ensure specialists on each side can communicate effectively.
Zero Trust Architecture
Zero trust assumes no network location, user, or device is inherently trustworthy. Every access request requires verification. For critical infrastructure, this principle applies to human operators, automated systems, and third-party vendors. The 2024 Salt Typhoon campaign—Chinese state-sponsored actors compromising major US telecommunications networks—demonstrated that perimeter-based security fails against sophisticated adversaries who establish persistent access inside trusted networks.
Implementing zero trust in OT environments requires careful adaptation. Legacy industrial systems often cannot support modern authentication protocols. Network segmentation becomes the primary control, isolating critical systems into zones with strictly controlled data flows between them.
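One way to make zone-based segmentation auditable is to express the permitted flows as data and check observed traffic against them. The sketch below is illustrative only: the zone names, ports, and the `Conduit` type are hypothetical and would need to come from your own network design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conduit:
    """One explicitly permitted flow between two zones."""
    src_zone: str
    dst_zone: str
    protocol: str
    port: int


# Hypothetical allowlist; real entries come from your own segmentation design.
ALLOWED_CONDUITS = {
    Conduit("ot_control", "ot_dmz", "tcp", 443),   # e.g. historian replication outbound
    Conduit("it_business", "ot_dmz", "tcp", 443),  # business systems read from the DMZ only
}


def is_flow_allowed(src_zone, dst_zone, protocol, port):
    """Default deny: a flow passes only if it matches an explicit conduit."""
    return Conduit(src_zone, dst_zone, protocol.lower(), port) in ALLOWED_CONDUITS


def audit_flows(observed):
    """Return the observed (src, dst, protocol, port) flows that violate the allowlist."""
    return [flow for flow in observed if not is_flow_allowed(*flow)]


if __name__ == "__main__":
    observed = [
        ("ot_control", "ot_dmz", "tcp", 443),
        ("it_business", "ot_control", "tcp", 502),  # direct IT-to-controller traffic
    ]
    for violation in audit_flows(observed):
        print("violation:", violation)
```

Keeping the allowlist as data means the same definition can feed firewall reviews and passive-monitoring checks, which helps a small team avoid maintaining two sources of truth.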
Software Bill of Materials (SBOM)
Modern infrastructure depends on software—firmware in PLCs, applications managing grid operations, monitoring systems tracking environmental conditions. That software contains components from multiple sources, many open-source. When vulnerabilities emerge (Log4j, OpenSSL, etc.), organizations without SBOM visibility cannot determine which systems are affected.
Executive Order 14028 (2021) mandated SBOM requirements for software sold to federal agencies. Critical infrastructure operators should demand SBOMs from vendors and generate them for internally developed systems. This visibility enables rapid response when new vulnerabilities emerge.
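To illustrate the kind of rapid lookup that SBOM visibility enables, here is a minimal sketch. It assumes each system's SBOM has already been parsed into a dictionary with a `components` list of `name` and `version` entries, a simplified shape loosely modeled on CycloneDX JSON. The system names and inventory contents are hypothetical.

```python
def affected_systems(sboms, component, vulnerable_versions):
    """Map each system to the vulnerable versions of `component` it contains."""
    hits = {}
    for system, sbom in sboms.items():
        versions = sorted(
            c["version"]
            for c in sbom.get("components", [])
            if c.get("name") == component and c.get("version") in vulnerable_versions
        )
        if versions:
            hits[system] = versions
    return hits


if __name__ == "__main__":
    # Hypothetical inventory: system name -> parsed SBOM.
    sboms = {
        "historian-01": {"components": [{"name": "log4j-core", "version": "2.14.1"}]},
        "hmi-gateway": {"components": [
            {"name": "log4j-core", "version": "2.17.1"},
            {"name": "openssl", "version": "3.0.7"},
        ]},
    }
    print(affected_systems(sboms, "log4j-core", {"2.14.0", "2.14.1"}))
```

Exact version matching is the simplest possible rule. A production lookup would more likely compare version ranges taken from the relevant advisory.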
MRV for Security Controls
Measurement, reporting, and verification—borrowed from climate accounting—applies directly to cybersecurity. Organizations claiming compliance with NIST CSF or IEC 62443 must demonstrate that controls actually function as intended. This requires continuous monitoring, regular testing, and documented evidence trails.
For lean teams, MRV discipline prevents the common failure mode where organizations implement controls during compliance pushes, then let them degrade between audits. Automated verification—configuration monitoring, vulnerability scanning, access reviews—catches drift before it creates exploitable gaps.
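As a rough illustration of automated verification, the sketch below compares current device configurations against approved baseline fingerprints. The device names and configuration strings are hypothetical, and how the configurations are collected (backups, vendor APIs, and so on) is left out.

```python
import hashlib


def fingerprint(config_text):
    """Hash a normalized config so whitespace-only edits do not register as drift."""
    normalized = "\n".join(
        line.strip() for line in config_text.splitlines() if line.strip()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_drift(baseline, current):
    """Compare current configs (device -> text) against baseline (device -> hash)."""
    report = {"drifted": [], "missing": [], "unapproved": []}
    for device, approved_hash in sorted(baseline.items()):
        if device not in current:
            report["missing"].append(device)      # no config collected for this device
        elif fingerprint(current[device]) != approved_hash:
            report["drifted"].append(device)      # config differs from approved baseline
    report["unapproved"] = sorted(set(current) - set(baseline))  # devices with no baseline
    return report


if __name__ == "__main__":
    baseline = {"fw-ot-01": fingerprint("permit tcp any host 10.0.0.5 eq 443")}
    current = {"fw-ot-01": "permit tcp any any", "sw-99": "vlan 1"}
    print(detect_drift(baseline, current))
```

Running a check like this on a schedule and archiving each report gives the documented evidence trail that MRV calls for.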
Sector-Specific KPIs for Lean Teams
Effective cybersecurity requires metrics tailored to infrastructure type. The following benchmarks reflect 2024-2025 deployments across critical infrastructure sectors:
| Sector | Mean Time to Detect (MTTD) | Mean Time to Respond (MTTR) | Patch Coverage (Critical) | Asset Visibility |
|---|---|---|---|---|
| Energy & Utilities | <24 hours | <72 hours | >95% within 30 days | >98% |
| Healthcare | <12 hours | <48 hours | >90% within 14 days | >95% |
| Manufacturing | <8 hours | <24 hours | >85% within 45 days | >92% |
| Water/Wastewater | <48 hours | <96 hours | >90% within 30 days | >90% |
| Transportation | <24 hours | <72 hours | >88% within 30 days | >93% |

| Metric | Minimum Acceptable | Target | Top Performers |
|---|---|---|---|
| Security Incident Response Time | <4 hours | <1 hour | <15 minutes |
| False Positive Rate (Alerts) | <40% | <20% | <10% |
| Privileged Access Reviews | Quarterly | Monthly | Continuous |
| Third-Party Risk Assessments | Annual | Semi-annual | Continuous |
| Backup/Recovery Testing | Annual | Quarterly | Monthly |
Interpreting these metrics: Lean teams should prioritize MTTD and asset visibility first. You cannot defend systems you don't know exist, and slow detection allows adversaries to establish persistence. MTTR improvements follow once detection capabilities mature.
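MTTD and MTTR can be computed directly from incident records. The sketch below assumes each record carries ISO 8601 timestamps under the hypothetical keys `occurred`, `detected`, and `resolved`, and it measures response time from detection to resolution. Definitions of MTTR vary between organizations, so treat that choice as an assumption.

```python
from datetime import datetime
from statistics import mean


def _hours(start, end):
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def detection_and_response_times(incidents):
    """Return (MTTD, MTTR) in hours.

    MTTD = mean(detected - occurred); MTTR = mean(resolved - detected).
    """
    mttd = mean(_hours(i["occurred"], i["detected"]) for i in incidents)
    mttr = mean(_hours(i["detected"], i["resolved"]) for i in incidents)
    return mttd, mttr


if __name__ == "__main__":
    # Hypothetical incident log.
    incidents = [
        {"occurred": "2025-01-01T00:00", "detected": "2025-01-01T06:00",
         "resolved": "2025-01-02T06:00"},
        {"occurred": "2025-02-01T00:00", "detected": "2025-02-01T18:00",
         "resolved": "2025-02-03T18:00"},
    ]
    mttd, mttr = detection_and_response_times(incidents)
    print(f"MTTD: {mttd:.1f} h, MTTR: {mttr:.1f} h")  # MTTD: 12.0 h, MTTR: 36.0 h
```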
What's Working
Unified IT/OT Visibility Platforms
Organizations deploying platforms like Claroty, Dragos, or Nozomi Networks report dramatic improvements in asset visibility—often discovering 30-40% more connected devices than previously documented. These platforms provide passive monitoring that doesn't disrupt sensitive OT protocols while generating alerts when anomalous behavior occurs.
The key success factor: starting with asset discovery before threat detection. Teams that rush into threat monitoring without complete asset inventories generate excessive false positives, creating alert fatigue that undermines the entire program.
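A simple way to track discovery progress is to reconcile the documented inventory against what passive monitoring actually observes. In the sketch below, the asset identifiers are hypothetical, and coverage is defined as the share of observed assets that are documented, which is one possible definition rather than a standard one.

```python
def visibility_report(documented, discovered):
    """Reconcile the documented inventory with passively discovered assets."""
    documented, discovered = set(documented), set(discovered)
    coverage = len(documented & discovered) / len(discovered) if discovered else 0.0
    return {
        "undocumented": sorted(discovered - documented),   # on the network, not in inventory
        "not_observed": sorted(documented - discovered),   # in inventory, never seen on the wire
        "coverage": coverage,
    }


if __name__ == "__main__":
    documented = ["plc-01", "plc-02", "hmi-01"]
    discovered = ["plc-01", "plc-02", "hmi-01", "rtu-09"]
    print(visibility_report(documented, discovered))
```

Assets in `not_observed` deserve attention too: they may be decommissioned, powered off, or sitting on a segment the sensors cannot see.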
Automated Security Orchestration
SOAR (Security Orchestration, Automation, and Response) platforms reduce incident response time by up to 95% in mature deployments. For lean teams, automation handles tier-1 triage—blocking known-bad indicators, enriching alerts with context, and escalating only genuinely suspicious activity for human review.
Successful implementations start with high-confidence, low-risk automation (blocking known malware hashes, quarantining compromised endpoints) before progressing to more complex playbooks. Organizations that automate too aggressively before understanding their environment create new failure modes.
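The tier-1 triage pattern described above might look roughly like the following. This is a sketch, not any SOAR product's API: the alert fields, the inventory shape, the severity threshold, and the placeholder hash are all hypothetical. It also assumes that endpoints in OT zones are escalated rather than auto-quarantined, since isolating a controller can itself disrupt operations.

```python
# Placeholder indicator set; a real deployment would load this from a threat feed.
KNOWN_BAD_HASHES = {"hypothetical-bad-hash"}


def triage(alert, inventory):
    """Enrich an alert with asset context and choose a tier-1 action."""
    asset = inventory.get(alert.get("host"), {})
    zone = asset.get("zone", "unknown")
    enriched = {**alert, "zone": zone, "owner": asset.get("owner", "unassigned")}

    if alert.get("file_hash") in KNOWN_BAD_HASHES:
        # High-confidence indicator: automate only where the blast radius is low.
        enriched["action"] = "quarantine" if zone == "it" else "escalate"
    elif zone == "unknown" or alert.get("severity", 0) >= 7:
        # Unknown assets and high-severity alerts always go to a human.
        enriched["action"] = "escalate"
    else:
        enriched["action"] = "log_only"
    return enriched


if __name__ == "__main__":
    inventory = {
        "ws-114": {"zone": "it", "owner": "helpdesk"},
        "plc-07": {"zone": "ot", "owner": "operations"},
    }
    print(triage({"host": "ws-114", "file_hash": "hypothetical-bad-hash"}, inventory))
    print(triage({"host": "plc-07", "file_hash": "hypothetical-bad-hash"}, inventory))
```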
Network Segmentation with Unidirectional Gateways
Critical OT networks require strict isolation from IT networks and the internet. Unidirectional security gateways (data diodes) allow monitoring data to flow out of OT environments without permitting any inbound traffic. This architectural control eliminates entire attack vectors—adversaries simply cannot reach systems protected by hardware-enforced one-way communication.
Waterfall Security and similar vendors provide solutions deployed across energy, nuclear, and water sectors. The constraint: legitimate operational needs for bidirectional communication must use separate, heavily monitored channels.
ETHOS Threat Intelligence Sharing
The Emerging Threat Open Sharing (ETHOS) initiative—launched by Claroty, Dragos, Nozomi Networks, Forescout, Tenable, Schneider Electric, and Waterfall Security—provides open-source threat intelligence specifically for OT environments. Participating organizations receive early warning of threats targeting industrial systems, enabling proactive defense before exploits become widespread.
For lean teams, ETHOS reduces the intelligence burden. Instead of maintaining dedicated threat research capability, teams can consume curated intelligence relevant to their sector and technology stack.
What's Not Working
Perimeter-Only Defenses
Organizations still relying primarily on firewalls and intrusion detection at network boundaries consistently fail against sophisticated adversaries. The Salt Typhoon campaign demonstrated that nation-state actors establish persistent access inside trusted networks, rendering perimeter controls irrelevant. Defense requires assuming breach and implementing controls throughout the environment.
Compliance-Driven Security Theater
Many organizations implement controls specifically to pass audits rather than reduce risk. They check boxes—documented policies, annual penetration tests, security awareness training—without verifying that controls actually function. When incidents occur, these organizations discover their "compliant" systems were vulnerable throughout.
The 2024 CISA Year in Review noted that organizations treating compliance as the ceiling rather than the floor consistently underperformed in incident response. Effective programs use frameworks like NIST CSF 2.0 as starting points, then layer additional controls based on threat intelligence specific to their sector.
Ignoring Supply Chain Risk
Third-party vendors and suppliers were involved in 30% of data breaches in 2024, double the rate from 2023. Critical infrastructure operators depend on hardware and software from vendors with varying security maturity. Organizations failing to assess and monitor supplier security find adversaries entering through trusted relationships.
The challenge for lean teams: comprehensive vendor assessment requires significant effort. Prioritization based on access level and criticality helps focus limited resources on highest-risk suppliers.
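Prioritizing by access level and criticality can be as simple as a two-factor score. The weights and category names below are illustrative assumptions, not an established scoring standard. The point is only to produce a defensible ordering for limited assessment time.

```python
# Hypothetical weights; tune the categories and values to your own environment.
ACCESS_WEIGHT = {"none": 0, "data_only": 1, "it_network": 2, "ot_remote_access": 3}
CRITICALITY_WEIGHT = {"low": 1, "medium": 2, "high": 3}


def vendor_risk_score(vendor):
    """Score = access level x criticality of the systems the vendor touches."""
    return ACCESS_WEIGHT[vendor["access"]] * CRITICALITY_WEIGHT[vendor["criticality"]]


def prioritize_vendors(vendors, top_n=None):
    """Return vendors ordered from highest to lowest score, optionally truncated."""
    ranked = sorted(vendors, key=vendor_risk_score, reverse=True)
    return ranked[:top_n] if top_n else ranked


if __name__ == "__main__":
    vendors = [
        {"name": "office-supplies", "access": "none", "criticality": "low"},
        {"name": "scada-integrator", "access": "ot_remote_access", "criticality": "high"},
        {"name": "payroll-saas", "access": "data_only", "criticality": "medium"},
    ]
    for v in prioritize_vendors(vendors):
        print(vendor_risk_score(v), v["name"])
```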
Alert Fatigue and Manual Processes
Security tools generate enormous alert volumes. Without automation and tuning, analysts spend their time investigating false positives rather than genuine threats. The result: real attacks hide in the noise, detection delays extend, and analysts burn out.
Organizations with excessive manual processes—ticketing systems requiring multiple steps, incident documentation consuming hours, approval chains for routine actions—cannot respond at the speed modern threats require.
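Tuning usually starts with knowing which detection rules produce the noise. The sketch below computes a per-rule false positive rate from analyst dispositions and lists the rules that exceed a threshold. The rule names are hypothetical, and the 20% default mirrors the target in the KPI table rather than any universal standard.

```python
from collections import Counter


def false_positive_rates(dispositions):
    """dispositions: iterable of (rule_name, is_true_positive) pairs."""
    total, false_pos = Counter(), Counter()
    for rule, is_true_positive in dispositions:
        total[rule] += 1
        if not is_true_positive:
            false_pos[rule] += 1
    return {rule: false_pos[rule] / total[rule] for rule in total}


def rules_to_tune(dispositions, threshold=0.20):
    """Rules whose false positive rate exceeds the threshold, noisiest first."""
    rates = false_positive_rates(dispositions)
    noisy = [rule for rule, rate in rates.items() if rate > threshold]
    return sorted(noisy, key=lambda rule: rates[rule], reverse=True)


if __name__ == "__main__":
    dispositions = [
        ("modbus-write-from-it", True),
        ("modbus-write-from-it", True),
        ("new-device-seen", False),
        ("new-device-seen", False),
        ("new-device-seen", True),
        ("failed-login-burst", False),
    ]
    print(rules_to_tune(dispositions))
```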
Key Players
Established Leaders
Palo Alto Networks leads the Forrester 2024 OT Security Wave, offering integrated IT/OT protection through their Prisma platform with AI-powered threat detection specifically tuned for industrial environments.
Cisco provides Industrial Threat Defense combining Cyber Vision for OT visibility, Secure Firewall for segmentation, and Splunk integration for unified security operations across converged environments.
Fortinet offers ruggedized FortiGate firewalls designed for harsh industrial environments, serving over 700,000 global customers with integrated security fabric spanning IT and OT domains.
Microsoft delivers Defender for IoT with zero trust architecture for OT environments, integrating with Sentinel SIEM for unified threat detection across enterprise and industrial systems.
Emerging Startups
Claroty (valued at $2.5B+, considering IPO at $3.5B) provides the xDome cloud platform and CTD on-premises solution, achieving Gartner Magic Quadrant Leader status for CPS Protection Platforms in 2025.
Dragos ($1.7B valuation) focuses exclusively on OT/ICS cybersecurity, providing the Dragos Platform with threat intelligence curated specifically for industrial control systems across energy, manufacturing, and critical infrastructure.
Nozomi Networks (raised $100M in 2024) offers Guardian for network visibility, Guardian Air for wireless OT security, and Vantage for cloud-based security management, earning Gartner Magic Quadrant Leader recognition.
Xage Security provides zero trust architecture specifically designed for OT environments, enabling secure remote access for vendors and contractors without exposing critical systems.
Key Investors & Funders
Insight Partners acquired Armis and continues active investment in OT security, recognizing the market's projected 19.56% CAGR through 2035.
Bessemer Venture Partners led Nozomi Networks' Series E funding, demonstrating continued confidence in OT security market expansion.
U.S. Department of Energy funds critical infrastructure security research through national laboratories and the Cybersecurity, Energy Security, and Emergency Response (CESER) office.
Schneider Electric Ventures invests in OT security startups with industrial expertise, providing both capital and go-to-market partnership for emerging solutions.
Examples
1. Duke Energy: Defense-in-Depth for Grid Operations
Duke Energy, serving 8.2 million customers across six states, implemented a comprehensive OT security program following TSA security directives for pipeline operations. Their approach: complete asset inventory using Dragos for visibility, network segmentation isolating generation and transmission systems, and 24/7 security operations center integration bridging IT and OT monitoring.
Key outcomes: MTTD reduced from 72 hours to under 4 hours for OT-specific threats, 100% visibility into previously undocumented ICS devices, and successful defense against multiple targeted reconnaissance attempts identified through ETHOS threat intelligence.
2. Boston Children's Hospital: Healthcare Critical Infrastructure
Following the Change Healthcare breach that disrupted healthcare nationwide, Boston Children's Hospital accelerated their medical device security program. With over 15,000 connected medical devices—infusion pumps, ventilators, imaging systems—the hospital faced significant attack surface without adequate visibility.
Implementation: Claroty Medigate for medical device discovery and risk assessment, network microsegmentation limiting device communication to clinically necessary flows, and automated vulnerability management prioritizing devices with patient safety implications. Results: 98% device visibility (up from 67%), 40% reduction in critical vulnerabilities within 6 months, zero patient care disruptions from cybersecurity incidents.
3. American Water Works: Water Infrastructure Protection
Following the Oldsmar, Florida water treatment intrusion attempt, American Water Works—the largest publicly traded U.S. water utility—implemented defense-in-depth for SCADA systems controlling treatment processes. Their lean team of 6 security professionals protects infrastructure serving 14 million people.
Approach: Waterfall Security unidirectional gateways for process control networks, Nozomi Networks for anomaly detection, and Microsoft Defender for IoT across IT/OT boundaries. The team automated 80% of alert triage through SOAR playbooks, enabling human analysts to focus on genuine threats. MTTD improved from 96 hours to under 12 hours, exceeding sector benchmarks.
Action Checklist
- Complete asset inventory covering 100% of IT and OT devices, including shadow IT and legacy systems
- Implement network segmentation separating business, OT, and safety systems with monitored connection points
- Deploy passive OT monitoring that doesn't disrupt industrial protocols while providing visibility
- Establish SBOM requirements for all software and firmware in critical systems
- Configure automated alert triage to reduce false positive rate below 20%
- Conduct tabletop exercises quarterly simulating ransomware, insider threat, and supply chain compromise
- Implement privileged access management with just-in-time provisioning for OT systems
- Subscribe to sector-specific threat intelligence (ETHOS, ICS-CERT, sector ISACs)
- Test backup and recovery procedures monthly, including OT configuration restoration
- Document incident response playbooks with specific procedures for OT-impacting scenarios
FAQ
Q: How do we balance security controls with operational continuity in OT environments?
A: Start with passive monitoring that observes without intervening. Active controls like automated blocking should begin with high-confidence indicators (known malware, impossible travel for credentials) where false positives are rare. For OT systems where availability is critical, implement changes during maintenance windows with rollback procedures tested in advance. Build relationships with operations teams so security understands which systems cannot tolerate any latency or disruption. Network segmentation provides protection without impacting device operation—isolating systems doesn't require changing their behavior.
Q: What's the minimum viable security team size for critical infrastructure?
A: Organizations successfully protecting critical infrastructure operate with as few as 3-4 dedicated security practitioners, provided they leverage automation extensively and outsource specialized functions (penetration testing, threat hunting, incident response surge capacity). The key is ensuring someone owns security as their primary responsibility—organizations where security is "everyone's job" effectively make it no one's job. At minimum: one person focused on IT security, one on OT security or a hybrid role, and access to managed detection and response services for 24/7 coverage.
Q: How should we prioritize when we can't fix everything?
A: Prioritize based on exploitability and consequence, not just vulnerability severity. A medium-severity vulnerability in an internet-facing system with production data access matters more than a critical vulnerability in an isolated development system. Use CISA's Known Exploited Vulnerabilities (KEV) catalog to identify actively exploited flaws requiring immediate attention. For OT systems, prioritize based on safety impact—vulnerabilities that could affect physical processes endangering human safety take precedence over those affecting only availability or confidentiality.
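The ordering described in this answer can be expressed as a sort key. The sketch below is one possible encoding under stated assumptions: the field names are hypothetical, `in_kev` is assumed to have been populated beforehand from CISA's KEV catalog, and the strict precedence (safety, then known exploitation, then exposure, then CVSS score) is a simplification of what would normally be a more nuanced judgment.

```python
def priority(vuln):
    """Sort key: safety impact first, then known exploitation, then exposure, then CVSS."""
    return (
        vuln["safety_impact"],    # could affect a physical process endangering people
        vuln["in_kev"],           # listed as actively exploited
        vuln["internet_facing"],  # reachable from outside
        vuln["cvss"],             # severity breaks remaining ties
    )


def rank(vulns):
    """Return vulnerabilities ordered from most to least urgent."""
    return sorted(vulns, key=priority, reverse=True)


if __name__ == "__main__":
    # Hypothetical findings.
    vulns = [
        {"id": "dev-box-critical", "cvss": 9.8, "internet_facing": False,
         "in_kev": False, "safety_impact": False},
        {"id": "portal-medium", "cvss": 5.5, "internet_facing": True,
         "in_kev": False, "safety_impact": False},
        {"id": "plc-firmware", "cvss": 7.1, "internet_facing": False,
         "in_kev": False, "safety_impact": True},
    ]
    print([v["id"] for v in rank(vulns)])
    # ['plc-firmware', 'portal-medium', 'dev-box-critical']
```

Note that the medium-severity, internet-facing finding outranks the critical finding on the isolated development system, matching the example in the answer above.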
Q: What compliance frameworks should lean teams focus on?
A: Start with NIST CSF 2.0 as the organizing framework—it's sector-agnostic and maps to most regulatory requirements. Layer sector-specific requirements: NERC CIP for electric utilities, TSA directives for pipelines and surface transportation, HIPAA security rule for healthcare, EPA requirements for water systems. CISA's Cybersecurity Performance Goals 2.0 provides a prioritized subset of controls for organizations unable to implement comprehensive frameworks immediately. Focus on demonstrating control effectiveness rather than documenting policies—regulators increasingly verify that controls function, not just that procedures exist.
Q: How do we measure return on security investment for leadership?
A: Frame security investment in terms leadership understands: operational continuity, regulatory compliance, and incident cost avoidance. Calculate potential incident costs using sector benchmarks (healthcare breaches average $10.93 million, energy/utilities average $4.78 million according to IBM's 2023 Cost of a Data Breach report). Track metrics showing program maturity: MTTD/MTTR improvements, vulnerability remediation rates, asset visibility percentages. When possible, quantify near-miss incidents where controls prevented potential breaches. Avoid purely technical metrics that don't translate to business impact.
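One common way to turn a sector benchmark into a leadership-ready figure is an expected-loss comparison. In the sketch below, the incident cost is the energy/utilities benchmark cited in this answer. The annual breach probabilities and the program cost are purely hypothetical inputs that each organization would have to estimate for itself.

```python
def expected_annual_loss(annual_probability, incident_cost):
    """Expected loss per year = probability of an incident x cost of that incident."""
    return annual_probability * incident_cost


def net_benefit(p_before, p_after, incident_cost, annual_program_cost):
    """Avoided expected loss minus what the security program costs per year."""
    avoided = (
        expected_annual_loss(p_before, incident_cost)
        - expected_annual_loss(p_after, incident_cost)
    )
    return avoided - annual_program_cost


if __name__ == "__main__":
    incident_cost = 4_780_000  # energy/utilities benchmark cited in this answer
    # Hypothetical assumptions: probability falls from 10% to 4% per year,
    # and the program costs $200,000 per year.
    benefit = net_benefit(0.10, 0.04, incident_cost, 200_000)
    print(f"Estimated net annual benefit: ${benefit:,.0f}")
```

With these illustrative inputs, the avoided expected loss is roughly $286,800 per year, so the program nets about $86,800 after its own cost. The result is only as credible as the probability estimates, so present it as a range where possible.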
Sources
- IBM Security, "X-Force Threat Intelligence Index 2025," January 2025 — https://www.ibm.com/reports/threat-intelligence
- CISA, "2024 Year in Review," December 2024 — https://www.cisa.gov/about/2024YIR
- NIST, "Cybersecurity Framework 2.0," February 2024 — https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.29.pdf
- Gartner, "Magic Quadrant for CPS Protection Platforms," 2025
- SC Media, "Critical Infrastructure: The Five Sectors Hit Hardest by Cyberattacks in 2024," December 2024 — https://www.scworld.com/feature/critical-infrastructure
- Industrial Cyber, "Half of 2025 ransomware attacks hit critical sectors," May 2025 — https://industrialcyber.co/reports/
- Forrester Research, "The Forrester Wave: OT Security Solutions, Q4 2024"
- American Hospital Association, "Report: Health care had most reported cyberthreats in 2024," May 2025 — https://www.aha.org/news/headline/2025-05-12