
What goes wrong: Battery chemistry & next-gen storage materials — common failure modes and how to avoid them

A practical analysis of common failure modes in battery chemistry and next-generation storage materials, drawing on real-world examples to identify root causes and preventive strategies for practitioners.

The battery storage industry deployed over 45 GWh of capacity in the United States in 2025, yet field failure rates, warranty claims, and safety incidents continue to expose weaknesses in chemistry selection, manufacturing quality control, system integration, and operational management. For procurement teams evaluating storage investments ranging from grid-scale installations to commercial behind-the-meter systems, understanding the most common failure modes is no longer optional. The financial consequences of premature degradation, thermal events, and capacity shortfalls routinely run to tens of millions of dollars per incident, and most of these failures are preventable with informed specification and due diligence.

Why It Matters

Energy storage procurement has entered a phase where total installed capacity is growing faster than the industry's collective understanding of long-term reliability. The US Energy Information Administration reported 16.4 GW of operational battery storage capacity at the end of 2025, up from 4.6 GW in 2022. BloombergNEF projects cumulative US deployments will reach 90 GW by 2030. This rapid scaling has compressed development timelines, introduced dozens of new manufacturers into the supply chain, and created procurement environments where price competition often outweighs quality assurance.

The financial exposure is substantial. A single grid-scale battery fire can generate $10 to $50 million in direct damages, environmental remediation costs, and business interruption losses, excluding reputational harm and regulatory consequences. The Arizona Public Service McMicken battery explosion in 2019, the Victorian Big Battery fire in Australia in 2021, and the multiple incidents at South Korean energy storage installations between 2018 and 2022 collectively resulted in over $300 million in losses and triggered wholesale revisions to safety codes. Even without catastrophic events, premature capacity degradation routinely reduces project revenues by 15 to 30 percent over a 10-year contract period when cells underperform manufacturer specifications.

For procurement professionals, the challenge is that battery systems are sold primarily on cost per kilowatt-hour and nameplate specifications, while failure modes emerge from manufacturing variability, chemistry-specific degradation mechanisms, and system-level interactions that are invisible at the point of purchase. This article maps the most consequential failure patterns and the procurement practices that mitigate them.

Key Concepts

Thermal Runaway occurs when an internal cell fault generates heat faster than the cell can dissipate it, triggering a self-sustaining exothermic reaction. In lithium-ion cells, thermal runaway initiates at temperatures between roughly 130 and 270 degrees Celsius depending on chemistry, and can propagate to adjacent cells within seconds to minutes. Lithium iron phosphate (LFP) cells exhibit higher thermal runaway onset temperatures (approximately 270 degrees Celsius) than nickel manganese cobalt (NMC) cells (approximately 150 to 210 degrees Celsius), which is a primary reason LFP had captured over 90 percent of the US grid-scale storage market by 2025.

Capacity Fade describes the gradual, irreversible loss of a cell's ability to store and deliver energy. All lithium-ion chemistries experience capacity fade through mechanisms including solid electrolyte interphase (SEI) layer growth, lithium plating, cathode structural degradation, and electrolyte decomposition. Manufacturers typically warrant 70 to 80 percent capacity retention after 10 to 15 years or a specified number of equivalent full cycles. Actual degradation rates depend heavily on operating conditions: temperature, depth of discharge, charge rate, and calendar aging all influence trajectory.
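Warranty terms that refer to "equivalent full cycles" can be checked with simple arithmetic. The sketch below assumes the common throughput-based definition (cumulative discharged energy divided by nameplate energy); the function names and the sample figures are hypothetical and are not taken from any manufacturer's warranty.

```python
def equivalent_full_cycles(discharged_mwh: float, nameplate_mwh: float) -> float:
    """Equivalent full cycles = cumulative discharged energy / nameplate energy."""
    if nameplate_mwh <= 0:
        raise ValueError("nameplate_mwh must be positive")
    return discharged_mwh / nameplate_mwh


def meets_capacity_warranty(measured_mwh: float, nameplate_mwh: float,
                            warranted_retention: float) -> bool:
    """True if measured capacity is at or above the warranted fraction of nameplate."""
    return measured_mwh / nameplate_mwh >= warranted_retention


# Hypothetical 100 MWh system that has discharged 73,000 MWh in total.
efc = equivalent_full_cycles(73_000, 100)
# Hypothetical capacity test result of 91 MWh against an 80 percent retention warranty.
within_warranty = meets_capacity_warranty(91.0, 100, 0.80)
```

Tracking both numbers together matters because a warranty may expire on whichever limit (years or cycles) is reached first.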

Dendrite Formation involves the growth of metallic lithium structures on the anode surface during charging, particularly at low temperatures or high charge rates. Dendrites can penetrate the separator between electrodes, causing internal short circuits that lead to thermal runaway. This mechanism is the primary safety concern limiting fast-charging applications and a key challenge for next-generation solid-state batteries that use lithium metal anodes.

Cell-to-Pack Variability refers to the performance spread among individual cells within a battery pack. Even cells from the same production batch exhibit variations in capacity, internal resistance, and degradation rate. Battery management systems (BMS) must balance these variations, but excessive spread causes the weakest cells to limit pack-level performance and accelerate degradation of the entire system. High cell-to-cell variability is a leading indicator of manufacturing quality issues.
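To illustrate how the weakest cell limits the pack, here is a minimal sketch. It assumes a simple series string with no active balancing, so usable capacity equals the lowest cell capacity; real packs with balancing circuits will behave somewhat better. The cell values are invented for illustration.

```python
def string_capacity_ah(cell_capacities_ah):
    """In a series string without active balancing, the weakest cell sets usable capacity."""
    if not cell_capacities_ah:
        raise ValueError("at least one cell is required")
    return min(cell_capacities_ah)


def stranded_capacity_ah(cell_capacities_ah):
    """Total capacity in stronger cells that cannot be used because of the weakest cell."""
    usable = string_capacity_ah(cell_capacities_ah)
    return sum(c - usable for c in cell_capacities_ah)


# Hypothetical four-cell string with one underperforming cell.
cells = [280.0, 279.0, 281.0, 266.0]
usable = string_capacity_ah(cells)
stranded = stranded_capacity_ah(cells)
```

In this made-up example a single cell at 266 Ah strands 42 Ah across the other three cells, which is why distribution data matters more than the average.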

Common Failure Modes and Root Causes

Manufacturing Defects and Contamination

The single most preventable cause of battery failure is manufacturing contamination. Metallic particles (copper, aluminum, or iron) as small as 20 micrometers can penetrate separators and create internal short circuits that lead to thermal runaway weeks or months after installation. A 2024 analysis published in the Journal of The Electrochemical Society found that approximately 0.1 to 1.0 percent of cells from tier-one manufacturers and 1 to 5 percent from tier-two producers contain latent defects sufficient to cause field failures.
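Per-cell defect rates that look small compound quickly at system scale. The following sketch computes the probability that a system contains at least one latently defective cell, assuming defects are independent between cells (a simplification, since real defects often cluster by lot). The cell count of 5,000 is a hypothetical figure, not from the study cited above.

```python
def prob_at_least_one_defect(cell_defect_rate: float, cell_count: int) -> float:
    """P(at least one defective cell) = 1 - (1 - p)^N, assuming independent defects."""
    if not 0.0 <= cell_defect_rate <= 1.0:
        raise ValueError("cell_defect_rate must be between 0 and 1")
    if cell_count < 0:
        raise ValueError("cell_count must be non-negative")
    return 1.0 - (1.0 - cell_defect_rate) ** cell_count


# Hypothetical container with 5,000 cells, evaluated at the low end of the
# tier-one range quoted above (0.1 percent).
tier_one_low = prob_at_least_one_defect(0.001, 5_000)
print(f"P(at least one latent defect) = {tier_one_low:.3f}")
```

Even at the low end of the quoted range, a system of this assumed size is very likely to contain at least one defective cell, which is the argument for sampling-based inspection rather than reliance on averages.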

The 2021 recall of the Chevrolet Bolt EV, which used LG Energy Solution cells, affected 143,000 vehicles at a cost exceeding $1.9 billion and was traced to two concurrent manufacturing defects at the Ochang, South Korea facility: torn anode tabs and folded separators. The recall demonstrated that even leading manufacturers operating at scale are vulnerable to process control failures, and that defects can remain latent through factory acceptance testing only to manifest in field conditions.

For procurement teams, the implication is clear: factory acceptance testing that relies solely on electrical characterization (capacity, voltage, impedance) catches only a fraction of latent defects. Comprehensive quality assurance requires CT scanning of representative cell samples, statistical process control data from the manufacturer, and contractual provisions that address latent defect liability beyond standard warranty periods.

Thermal Management System Failures

Battery cells generate heat during both charging and discharging, and thermal management system (TMS) design directly determines whether cells operate within safe temperature windows. The most common TMS failure mode is not catastrophic breakdown but gradual performance degradation: coolant flow reduction from particulate accumulation, loss of thermal interface material (TIM) conductivity over time, and HVAC capacity insufficient for extreme ambient conditions.

The 2021 Victorian Big Battery fire in Australia originated from a coolant leak in the liquid thermal management system of a Tesla Megapack installation during commissioning. The leak allowed cells to overheat, triggering thermal runaway that propagated across multiple battery modules. Root cause analysis revealed that the coolant system lacked adequate leak detection and that the fire suppression system was not designed to handle cascading thermal events across module boundaries.

Procurement specifications should require thermal management systems rated for the site's 99th percentile ambient temperature conditions (not average), redundant coolant circulation with leak detection, and fire suppression systems validated through full-scale propagation testing. NFPA 855, the Standard for the Installation of Stationary Energy Storage Systems, updated in 2023, provides minimum safety requirements, but leading developers exceed these minimums for systems above 10 MWh.
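The difference between a 99th percentile and an average design temperature can be computed directly from site weather records. This is a minimal sketch using Python's standard `statistics` module; the input series is synthetic placeholder data, and the function name is hypothetical. A real assessment would use multi-year hourly records for the site.

```python
from statistics import mean, quantiles


def design_ambient_c(hourly_temps_c, percentile=99):
    """Return the given percentile of a series of ambient temperatures."""
    if not 1 <= percentile <= 99:
        raise ValueError("percentile must be between 1 and 99")
    # quantiles(n=100) returns the 99 cut points for percentiles 1 through 99.
    cut_points = quantiles(hourly_temps_c, n=100, method="inclusive")
    return cut_points[percentile - 1]


# Synthetic placeholder data: temperatures from 0 to 100 in 1-degree steps.
sample_temps = list(range(101))
design_temp = design_ambient_c(sample_temps)   # 99th percentile of the series
average_temp = mean(sample_temps)              # what an "average" spec would use
```

The gap between `design_temp` and `average_temp` is the margin a thermal management system sized on averages would lack during the hottest hours.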

Accelerated Degradation from Operating Conditions

The gap between laboratory-tested cycle life and field performance is the most financially consequential failure mode in battery procurement. Manufacturer specifications are derived from controlled test conditions (typically 25 degrees Celsius, moderate C-rates, and defined depth-of-discharge windows) that rarely match real-world operating profiles. A 2025 study by Sandia National Laboratories analyzing 127 grid-scale battery installations found that 38 percent experienced capacity degradation 20 to 50 percent faster than manufacturer projections during the first three years of operation.

Three operating conditions drive accelerated degradation most frequently. First, sustained high temperatures: every 10-degree Celsius increase above 25 degrees Celsius approximately doubles the rate of calendar aging in NMC cells and increases SEI layer growth in LFP cells by 30 to 50 percent. Second, high depth-of-discharge cycling: operating LFP cells between 0 and 100 percent state of charge produces roughly twice the degradation per cycle compared to cycling between 10 and 90 percent. Third, high C-rate charging: consistently charging at rates above 0.5C accelerates lithium plating on graphite anodes, particularly at temperatures below 15 degrees Celsius.
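The temperature rule of thumb above can be turned into a quick multiplier. This sketch encodes only the "doubles every 10 degrees Celsius above 25" statement for NMC calendar aging; it is not a validated degradation model, and the function name and default parameters are illustrative assumptions.

```python
def calendar_aging_multiplier(cell_temp_c: float,
                              reference_c: float = 25.0,
                              doubling_step_c: float = 10.0) -> float:
    """Aging-rate multiplier relative to the reference temperature.

    Applies the 'rate doubles per 10 C above 25 C' rule of thumb. At or below
    the reference temperature the rule is not stated, so 1.0 is returned.
    """
    if cell_temp_c <= reference_c:
        return 1.0
    return 2.0 ** ((cell_temp_c - reference_c) / doubling_step_c)


at_35c = calendar_aging_multiplier(35.0)   # one doubling step above reference
at_45c = calendar_aging_multiplier(45.0)   # two doubling steps above reference
```

Under this rule, a cell held at 45 degrees Celsius ages about four times as fast as one at 25, which shows how an undersized cooling system can erase years of expected life.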

Fluence, the global energy storage platform company, disclosed in 2024 that several early projects experienced higher-than-expected degradation due to dispatch patterns that consistently exceeded the cycling assumptions embedded in their capacity warranty models. The company subsequently revised its warranty structures to include dispatch-profile-dependent degradation allowances and real-time state-of-health monitoring as a warranty condition.

Battery Management System Software Failures

The BMS is the critical control layer that prevents cells from operating outside safe voltage, current, and temperature boundaries. BMS failures account for an estimated 20 to 30 percent of battery safety incidents, according to the Electric Power Research Institute's 2025 Energy Storage Safety Report. Common BMS failure modes include: voltage sensing errors that allow individual cells to overcharge past safe limits, temperature sensor failures that mask developing thermal events, and state-of-charge estimation algorithms that systematically misrepresent actual cell conditions.
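The failure modes listed above map onto a small set of boundary checks. The sketch below is a simplified illustration, not a real BMS implementation: the class, function, and limit values are hypothetical placeholders and do not come from any datasheet or standard. It treats a missing sensor reading as a fault instead of silently passing it, since masked sensor failures are one of the modes described.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CellLimits:
    v_min: float
    v_max: float
    t_min_c: float
    t_max_c: float


def check_cell(voltage_v: Optional[float], temp_c: Optional[float],
               limits: CellLimits) -> List[str]:
    """Return a list of fault names; an empty list means the cell is in bounds."""
    if voltage_v is None or temp_c is None:
        # A lost reading must never be interpreted as a healthy cell.
        return ["sensor_missing"]
    faults = []
    if voltage_v > limits.v_max:
        faults.append("overvoltage")
    if voltage_v < limits.v_min:
        faults.append("undervoltage")
    if temp_c > limits.t_max_c:
        faults.append("overtemperature")
    if temp_c < limits.t_min_c:
        faults.append("undertemperature")
    return faults


# Placeholder limits for illustration only; use the cell manufacturer's values.
ILLUSTRATIVE_LIMITS = CellLimits(v_min=2.5, v_max=3.65, t_min_c=0.0, t_max_c=55.0)
faults = check_cell(3.70, 25.0, ILLUSTRATIVE_LIMITS)
```

When reviewing BMS validation documentation, it is worth asking how the vendor's logic handles the missing-reading case in particular.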

The South Korean energy storage fires of 2018 to 2022, which destroyed 38 installations, were attributed in government investigation reports to a combination of BMS software deficiencies and inadequate electrical protection. Specifically, the BMS failed to detect ground faults in DC circuits and did not implement adequate cell-level overvoltage protection. The Korean investigation led to mandatory BMS certification requirements and has influenced US standards development through UL 9540A testing protocols.

Next-Gen Chemistry Scale-Up Risks

Emerging battery chemistries, including sodium-ion, solid-state, and iron-air, offer promising cost and performance characteristics but introduce novel failure modes not yet fully characterized. Sodium-ion batteries from CATL (first-generation sodium-ion cells shipped in 2023) and HiNa Technology exhibit different degradation mechanisms than lithium-ion, including challenges with sodium desolvation kinetics at low temperatures and cathode dissolution that can accelerate capacity fade in humid environments.

Solid-state batteries, despite their theoretical safety advantages, face persistent challenges with interfacial resistance growth between solid electrolytes and electrode materials. QuantumScape's publicly reported test data shows that while their solid-state cells achieve exceptional fast-charging performance, maintaining consistent interfacial contact across thousands of cycles at commercial scale remains an active engineering challenge. For procurement teams, the guidance is to approach next-generation chemistries with extended qualification periods (minimum 18 to 24 months of field testing at representative scale) before committing to large deployments.

What's Working in Risk Mitigation

Independent Testing and Certification

The adoption of UL 9540A large-scale fire testing has materially improved safety outcomes for grid-scale installations. The test protocol, which evaluates cell-level, module-level, and unit-level thermal runaway propagation, provides empirical data on fire behavior that enables appropriate fire suppression and spacing design. Projects that require UL 9540A-tested systems and incorporate test results into facility design have experienced zero propagation events in the US through 2025, according to the Energy Storage Association.

Digital Twin Monitoring

Advanced operators deploy digital twin models that compare real-time cell-level performance against physics-based degradation models. Tesla, Fluence, and independent monitoring providers including Doosan GridBridge now offer continuous state-of-health analytics that detect anomalous degradation patterns 6 to 12 months before they would trigger warranty thresholds. This early warning capability enables proactive intervention (adjusting operating parameters, replacing degraded modules) before capacity shortfalls affect revenue or safety.
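The core comparison behind this kind of monitoring can be sketched in a few lines. The example below assumes the degradation model's predictions are already available as a list of expected state-of-health values; it does not implement a physics-based model and is not any vendor's algorithm. The tolerance of 0.02 (two percentage points) and all sample values are arbitrary illustrations.

```python
def flag_anomalies(measured_soh, predicted_soh, tolerance=0.02):
    """Return indices where measured SoH is below the prediction by more than tolerance."""
    if len(measured_soh) != len(predicted_soh):
        raise ValueError("series must have the same length")
    return [
        i
        for i, (measured, predicted) in enumerate(zip(measured_soh, predicted_soh))
        if predicted - measured > tolerance
    ]


# Hypothetical module-level readings (fraction of original capacity).
measured = [0.99, 0.97, 0.93]
predicted = [0.99, 0.98, 0.97]
anomalous_modules = flag_anomalies(measured, predicted)
```

Flagging on the gap to the model, not on an absolute threshold, is what allows a degrading module to be caught while it is still above the warranty limit.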

Procurement Specification Evolution

Leading procurement organizations, including major utilities, independent power producers, and C&I aggregators, have moved beyond nameplate specifications to require comprehensive technical due diligence packages. Best-practice procurement now includes: cell-level CT scan results for representative samples, statistical process control data from manufacturing, third-party capacity and degradation testing at representative operating conditions, UL 9540A test reports, and BMS software validation documentation. These requirements add 2 to 4 months to procurement timelines but substantially reduce lifecycle risk.

Action Checklist

  • Require UL 9540A large-scale fire test reports for all battery systems above 1 MWh and incorporate results into facility fire protection design
  • Specify thermal management systems rated for the site's 99th percentile temperature conditions with redundant cooling and leak detection
  • Include dispatch-profile-dependent degradation warranties rather than simple cycle-count or calendar-year guarantees
  • Require cell-level CT scanning of representative samples from each manufacturing lot
  • Mandate independent third-party capacity verification testing before commercial operation
  • Implement continuous digital twin monitoring with anomaly detection for state-of-health tracking
  • Establish contractual latent defect liability provisions extending 5 to 7 years beyond standard warranty periods
  • For next-generation chemistries, require minimum 18-month field demonstration data at representative scale before full commitment

FAQ

Q: Is LFP chemistry safe enough that thermal runaway is no longer a concern? A: LFP is significantly safer than NMC due to its higher thermal runaway onset temperature (approximately 270 degrees Celsius versus 150 to 210 degrees Celsius for NMC), and LFP thermal events release less energy and no oxygen. However, LFP is not immune to thermal runaway. Manufacturing defects, external short circuits, and BMS failures can still initiate thermal events in LFP cells. The key advantage is that thermal propagation from cell to cell is much less likely in LFP systems, giving fire suppression systems more time to intervene. Procurement should still require UL 9540A propagation testing and appropriate fire protection regardless of chemistry.

Q: How should I evaluate battery manufacturer quality when procurement decisions are primarily price-driven? A: Focus on three quantifiable quality indicators: cell-to-cell capacity variation (request distribution data, not just averages; standard deviation below 1.5 percent indicates strong process control), defect rates from internal quality databases (request PPM data for critical defects), and warranty claim rates from existing installations. Manufacturers with installed bases exceeding 5 GWh in similar applications provide the most reliable performance data. Price differences of 5 to 10 percent between tier-one and tier-two suppliers are typically recovered many times over through lower failure rates, reduced warranty claims, and longer useful life.
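The 1.5 percent spread screen mentioned in this answer can be applied to a manufacturer's cell capacity data as follows. This sketch interprets "standard deviation below 1.5 percent" as the sample standard deviation expressed as a percentage of mean capacity, which is an assumption; the sample lots are invented.

```python
from statistics import mean, stdev


def capacity_cv_percent(capacities_ah):
    """Sample standard deviation of cell capacity as a percentage of the mean."""
    return 100.0 * stdev(capacities_ah) / mean(capacities_ah)


def passes_spread_screen(capacities_ah, max_cv_percent=1.5):
    """True if the capacity spread is below the screening threshold."""
    return capacity_cv_percent(capacities_ah) < max_cv_percent


# Hypothetical sample lots (capacities in Ah).
tight_lot = [280.0, 281.0, 279.0, 280.0, 280.0]
wide_lot = [280.0, 270.0, 290.0, 265.0, 295.0]
tight_ok = passes_spread_screen(tight_lot)
wide_ok = passes_spread_screen(wide_lot)
```

Both lots have the same mean capacity of 280 Ah, so only the distribution data reveals the difference in process control.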

Q: What is the most cost-effective way to extend battery system useful life? A: Operating within conservative state-of-charge windows (10 to 90 percent for LFP, 20 to 80 percent for NMC) provides the highest impact-to-cost ratio, typically extending useful life by 30 to 50 percent with no capital expenditure. Maintaining cell temperatures between 15 and 35 degrees Celsius through adequate thermal management is the second highest priority. Mid-life augmentation, adding new modules to replace degraded capacity rather than replacing the entire system, is emerging as the most cost-effective strategy for extending project life beyond initial warranty periods, with augmentation costs 40 to 60 percent lower than full replacement.
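The trade-off in a narrower state-of-charge window can be put in rough numbers. The sketch below assumes the quoted 30 to 50 percent life extension applies to cycle count and that each cycle delivers energy in proportion to the window width; both are simplifying assumptions, and the function name is hypothetical.

```python
def relative_lifetime_throughput(soc_low: float, soc_high: float,
                                 life_extension: float) -> float:
    """Lifetime energy delivered relative to full-window (0 to 100 percent) operation.

    usable fraction per cycle * (1 + fractional increase in cycle life)
    """
    if not 0.0 <= soc_low < soc_high <= 1.0:
        raise ValueError("SoC bounds must satisfy 0 <= low < high <= 1")
    usable_fraction = soc_high - soc_low
    return usable_fraction * (1.0 + life_extension)


# LFP window of 10 to 90 percent with the low and high ends of the quoted range.
low_case = relative_lifetime_throughput(0.10, 0.90, 0.30)
high_case = relative_lifetime_throughput(0.10, 0.90, 0.50)
```

Under these assumptions the narrower window gives up 20 percent of energy per cycle but delivers roughly 4 to 20 percent more energy over the system's life, in addition to the longer service period.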

Q: How do I assess whether a next-gen chemistry is ready for commercial deployment? A: Apply three readiness criteria: manufacturing scale (at least one facility producing at GWh-per-year rates with documented quality metrics), field validation (a minimum of 18 months of performance data from installations of at least 1 MWh in comparable climate conditions), and supply chain maturity (at least two independent suppliers for critical raw materials and components). Technologies meeting all three criteria can be considered for phased deployment alongside proven chemistries. Technologies meeting fewer than two criteria remain at pilot stage regardless of laboratory performance claims.
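The three-criteria screen can be written as a small decision function. This is a sketch of the rule exactly as stated in the answer; the function name and return strings are hypothetical. The answer does not say how to treat a technology that meets exactly two criteria, so the sketch reports that case separately and leaves the judgment to the reader.

```python
def readiness_stage(manufacturing_scale: bool, field_validation: bool,
                    supply_chain_maturity: bool) -> str:
    """Classify a chemistry by how many of the three readiness criteria it meets."""
    criteria_met = sum([manufacturing_scale, field_validation, supply_chain_maturity])
    if criteria_met == 3:
        return "candidate for phased deployment"
    if criteria_met < 2:
        return "pilot stage"
    # Exactly two criteria met: not covered by the rule as stated.
    return "not classified by the stated criteria"


stage = readiness_stage(manufacturing_scale=True,
                        field_validation=False,
                        supply_chain_maturity=False)
```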

Sources

  • US Energy Information Administration. (2025). Battery Storage in the United States: An Update. Washington, DC: EIA.
  • BloombergNEF. (2025). Energy Storage Market Outlook, H2 2025. New York: Bloomberg LP.
  • Sandia National Laboratories. (2025). Grid-Scale Battery Performance: Field Data Analysis from 127 Installations. Albuquerque, NM: SNL.
  • Electric Power Research Institute. (2025). Energy Storage Safety: Incident Analysis and Best Practices. Palo Alto, CA: EPRI.
  • Journal of The Electrochemical Society. (2024). "Manufacturing Defect Prevalence and Field Failure Correlation in Lithium-Ion Cells." J. Electrochem. Soc., 171(4).
  • National Fire Protection Association. (2023). NFPA 855: Standard for the Installation of Stationary Energy Storage Systems. Quincy, MA: NFPA.
  • Korean Ministry of Trade, Industry and Energy. (2023). Final Report: Investigation of Energy Storage System Fire Incidents 2018-2022. Seoul: MOTIE.
