Interview: the skeptic's view on Compute, chips & energy demand — what would change their mind
A practitioner conversation: what surprised them, what failed, and what they'd do differently. Focus on implementation trade-offs, stakeholder incentives, and the hidden bottlenecks.
Training a single large language model can consume on the order of 1,287 MWh of electricity (the published estimate for GPT-3), enough to power roughly 120 average American homes for an entire year. As data centers across North America race to deploy increasingly powerful AI accelerators, the energy footprint of compute infrastructure has become one of the most contested sustainability debates of our era. Skeptics argue that the current trajectory is fundamentally unsustainable, while proponents counter that efficiency gains and renewable energy procurement will close the gap. This synthesized expert perspective examines the skeptical viewpoint on compute energy demand, the evidence that might change minds, and the hidden implementation trade-offs that practitioners wish they had understood earlier.
Why It Matters
The intersection of AI compute growth and energy consumption represents a critical inflection point for North American sustainability policy. According to the International Energy Agency's Electricity 2024 report, global data center electricity consumption reached an estimated 460 TWh in 2022, with projections suggesting it could exceed 1,000 TWh by 2026. In the United States alone, data centers accounted for approximately 4.4% of total electricity consumption in 2023, a share that Lawrence Berkeley National Laboratory projects could reach 6.7-12% by 2028.
The growth in AI-specific compute is particularly striking. NVIDIA shipped an estimated 3.76 million data center GPUs in 2023, each drawing between 350 and 700 watts under load. The cumulative effect of deploying millions of high-power accelerators has created what some analysts call an "energy emergency" for grid operators. ERCOT (Electric Reliability Council of Texas) reported that data center interconnection requests in 2024 totaled over 50 GW of potential new load, more than the entire current generating capacity of many U.S. states.
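To see how quickly fleet-level deployments translate into grid-scale load, a back-of-envelope sketch helps. The shipment count and per-GPU wattage come from the figures above; the utilization factor and PUE below are illustrative assumptions, not reported data.

```python
# Back-of-envelope: aggregate power and annual energy of a shipped accelerator fleet.
# Shipment count and per-GPU wattage come from the text above; utilization and
# PUE are illustrative assumptions.

GPUS_SHIPPED = 3_760_000          # data center GPUs shipped in one year
WATTS_PER_GPU = (350, 700)        # typical draw range under load, in watts
UTILIZATION = 0.6                 # assumed average utilization (hypothetical)
PUE = 1.3                         # assumed facility overhead (hypothetical)
HOURS_PER_YEAR = 8_760

for watts in WATTS_PER_GPU:
    fleet_gw = GPUS_SHIPPED * watts * UTILIZATION * PUE / 1e9
    annual_twh = fleet_gw * HOURS_PER_YEAR / 1_000
    print(f"{watts} W/GPU -> {fleet_gw:.1f} GW continuous, ~{annual_twh:.0f} TWh/year")
```

Under these assumptions, a single year's shipments add on the order of one to two gigawatts of continuous load, roughly ten to twenty terawatt-hours per year.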
The skeptical view holds that efficiency improvements cannot keep pace with exponential demand growth. Between 2020 and 2025, the performance per watt of leading AI chips improved by approximately 2.5x, but model training compute requirements grew by over 10x during the same period. This fundamental mismatch, critics argue, means that even aggressive efficiency gains will not prevent absolute energy consumption from spiraling upward.
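The arithmetic behind this mismatch is simple enough to check directly. A minimal sketch, assuming the roughly 2.5x efficiency gain and 10x compute growth cited above (the 100x case is an illustrative upper bound, not a reported figure):

```python
# Net energy growth when compute demand outruns efficiency:
# energy_ratio = compute_growth / efficiency_gain.

def net_energy_growth(compute_growth: float, efficiency_gain: float) -> float:
    """Factor by which absolute energy use changes over the period."""
    return compute_growth / efficiency_gain

for compute_growth in (10, 100):
    factor = net_energy_growth(compute_growth, efficiency_gain=2.5)
    print(f"{compute_growth}x more compute at 2.5x better efficiency "
          f"-> {factor:.0f}x more energy")
# 10x / 2.5x -> 4x more energy; 100x / 2.5x -> 40x more energy.
```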
Key Concepts
GPU Power Consumption and Thermal Design Power (TDP)
Modern AI accelerators are among the most power-dense computing components ever mass-produced. NVIDIA's H100 GPU operates at a TDP of 700 watts, while the newer B200 "Blackwell" accelerator draws up to 1,000 watts per chip. A single DGX B200 server containing eight accelerators can consume over 14.3 kW, the equivalent of more than 140 100-watt lightbulbs running continuously. Understanding TDP versus actual power draw under various workloads is essential for capacity planning, because utilization patterns significantly affect real-world consumption.
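For capacity planning, the gap between nameplate TDP and duty-cycle-weighted draw is the quantity that matters. A minimal sketch, using the 14.3 kW nameplate figure above and a hypothetical workload mix:

```python
# Estimate annual energy for an 8-GPU AI server from nameplate power and a
# utilization mix. The 14.3 kW nameplate comes from the text above; the
# workload fractions and power factors are illustrative assumptions.

NAMEPLATE_KW = 14.3
HOURS_PER_YEAR = 8_760

# (fraction of time, fraction of nameplate power) - hypothetical duty cycle
workload_mix = [
    (0.40, 0.95),   # heavy training: near-TDP draw
    (0.35, 0.55),   # inference / lighter jobs
    (0.25, 0.20),   # idle or queued
]

avg_kw = NAMEPLATE_KW * sum(t * p for t, p in workload_mix)
annual_mwh = avg_kw * HOURS_PER_YEAR / 1_000
print(f"Average draw: {avg_kw:.1f} kW vs {NAMEPLATE_KW} kW nameplate")
print(f"Annual energy: {annual_mwh:.0f} MWh per server")
```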
Power Usage Effectiveness (PUE)
PUE measures data center energy efficiency as the ratio of total facility energy to IT equipment energy. A PUE of 1.0 would indicate perfect efficiency (all power goes to computing), while the industry average hovers around 1.58. Leading hyperscalers report PUE values between 1.1 and 1.2, though skeptics note that these figures typically exclude embedded carbon from construction, water consumption for cooling, and transmission losses. The metric also fails to capture the carbon intensity of the electricity consumed.
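Because PUE is just a ratio, the facility overhead implied by a given figure is easy to compute. A minimal sketch, assuming an illustrative 10 MW of IT load and the PUE values cited above:

```python
# PUE = total facility energy / IT equipment energy.
# The 1.58 average and 1.1-1.2 hyperscaler range come from the text above;
# the 10 MW IT load is an illustrative assumption.

def facility_power_mw(it_load_mw: float, pue: float) -> float:
    """Total facility power implied by an IT load and a PUE."""
    return it_load_mw * pue

IT_LOAD_MW = 10.0
for pue in (1.58, 1.2, 1.1):
    total = facility_power_mw(IT_LOAD_MW, pue)
    overhead = total - IT_LOAD_MW
    print(f"PUE {pue}: {total:.1f} MW total, {overhead:.1f} MW of cooling/overhead")
```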
Liquid Cooling Technologies
As chip power densities exceed 1,000 watts per accelerator, traditional air cooling becomes increasingly impractical. Direct-to-chip liquid cooling and immersion cooling can reduce cooling energy by 30-40% while enabling higher rack densities. However, deployment complexity, maintenance requirements, and retrofit costs for existing facilities remain significant barriers. The transition from air to liquid cooling represents a multi-billion-dollar capital expenditure for existing data center operators.
Neuromorphic and Alternative Architectures
Neuromorphic computing, which mimics biological neural networks, promises orders-of-magnitude improvements in energy efficiency for certain workloads. Intel's Loihi 2 chip, for instance, has demonstrated up to roughly 1,000x better energy efficiency than GPUs on specific sparse, event-driven inference benchmarks. However, the programming model remains incompatible with existing AI frameworks, and the hardware cannot efficiently train the large models that dominate current AI applications.
Edge vs. Cloud Compute Tradeoffs
Distributing inference workloads to edge devices can reduce data center energy consumption and latency, but introduces its own sustainability challenges. Edge deployments multiply the number of devices requiring manufacturing, each with embedded carbon. The operational energy may be lower per inference, but the aggregate carbon footprint depends heavily on edge device utilization rates, which often prove disappointing in real-world deployments.
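The utilization sensitivity can be made concrete with a simple lifecycle model: amortize each device's embedded carbon over the inferences it actually serves, then add the operational carbon per inference. Every constant in the sketch below is an illustrative assumption, not a measurement.

```python
# Lifecycle carbon per inference = embedded_carbon / lifetime_inferences
#                                  + energy_per_inference * grid_intensity.
# All constants are illustrative assumptions for the comparison.

def grams_co2e_per_inference(embedded_kg: float, lifetime_inferences: float,
                             joules_per_inference: float, grid_g_per_kwh: float) -> float:
    embedded_g = embedded_kg * 1_000 / lifetime_inferences
    kwh = joules_per_inference / 3.6e6
    return embedded_g + kwh * grid_g_per_kwh

# Edge device: low per-inference energy, but embedded carbon spread over few inferences.
edge = grams_co2e_per_inference(embedded_kg=80, lifetime_inferences=2e6,
                                joules_per_inference=0.5, grid_g_per_kwh=380)
# Shared data center accelerator: higher per-inference energy, far higher utilization.
cloud = grams_co2e_per_inference(embedded_kg=150, lifetime_inferences=5e9,
                                 joules_per_inference=2.0, grid_g_per_kwh=380)
print(f"edge:  {edge:.3f} gCO2e per inference")
print(f"cloud: {cloud:.4f} gCO2e per inference")
```

Under these assumed numbers the lightly utilized edge device is dominated by its embedded carbon, which is the pattern the paragraph above describes.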
Compute Energy KPI Benchmarks
| Metric | Current Industry Average | Best-in-Class | Target (2027) |
|---|---|---|---|
| Data Center PUE | 1.58 | 1.10 | 1.08 |
| GPU Utilization Rate | 30-40% | 60-70% | 80%+ |
| Carbon Intensity (gCO2e/kWh) | 380 (US grid average) | 0 (100% renewable) | <50 |
| Cooling Energy (% of total) | 35-40% | 8-12% | <10% |
| FLOPS per Watt (AI Training, dense FP16) | ~1,400 TFLOPS/kW (H100) | ~2,250 TFLOPS/kW (B200) | 4,000+ TFLOPS/kW |
| Water Usage Effectiveness (WUE) | 1.8 L/kWh | 0.5 L/kWh | <0.3 L/kWh |
| Scope 2 Renewable Coverage | 45% | 100% | 100% |
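These KPIs interact: the carbon footprint of a training run is roughly its IT energy times PUE times grid carbon intensity. A minimal sketch using benchmark values from the table (the 1,000 MWh run size is an illustrative assumption):

```python
# Training-run carbon footprint ~= IT energy * PUE * grid carbon intensity.
# PUE and carbon-intensity figures come from the table above; the 1,000 MWh
# run size is an illustrative assumption.

IT_ENERGY_MWH = 1_000  # hypothetical training run

scenarios = {
    "industry avg (PUE 1.58, 380 gCO2e/kWh)": (1.58, 380),
    "target-grade (PUE 1.10, 50 gCO2e/kWh)": (1.10, 50),
}

for label, (pue, intensity_g_per_kwh) in scenarios.items():
    total_mwh = IT_ENERGY_MWH * pue
    tonnes = total_mwh * 1_000 * intensity_g_per_kwh / 1e6
    print(f"{label}: {total_mwh:.0f} MWh facility energy, ~{tonnes:.0f} tCO2e")
```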
What's Working and What Isn't
What's Working
Chip-Level Efficiency Improvements: Each successive generation of AI accelerators delivers meaningful performance-per-watt gains. NVIDIA claims roughly 25x better inference performance per watt for Blackwell-based systems compared with the Hopper generation on certain large-model workloads. AMD's MI300X and Intel's Gaudi 3 are driving competitive pressure that accelerates efficiency improvements across the industry.
Renewable Energy Power Purchase Agreements (PPAs): Major cloud providers have signed over 35 GW of renewable energy PPAs across North America as of 2025. Microsoft's agreement with Constellation Energy to restart the Three Mile Island nuclear plant specifically for data center power demonstrates that operators are willing to pursue unconventional solutions. Google has matched 100% of its annual electricity consumption with renewable energy purchases every year since 2017 and is now pursuing 24/7 carbon-free energy by 2030.
Advanced Cooling Deployments: Liquid cooling adoption has accelerated dramatically, with over 40% of new AI-focused data center capacity in 2025 incorporating direct-to-chip or immersion cooling. Equinix, Digital Realty, and newer entrants like Crusoe Energy have deployed liquid-cooled systems that achieve PUE values approaching 1.05 in optimal conditions.
Workload Optimization Software: Tools for inference optimization, model pruning, and quantization can reduce the compute requirements for AI workloads by 50-80% with minimal accuracy loss. Companies like OctoML, Deci, and Neural Magic have demonstrated that many production inference workloads are dramatically over-provisioned, representing low-hanging fruit for efficiency gains.
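As one concrete instance of this kind of optimization, post-training dynamic quantization stores weights as 8-bit integers so inference moves and computes less data per token. A minimal PyTorch sketch; the toy model is hypothetical, and real savings depend on the workload and hardware:

```python
# Post-training dynamic quantization: store Linear weights as int8 and
# dequantize on the fly, reducing memory traffic per inference (CPU backend).
import torch
import torch.nn as nn

# Purely illustrative toy model standing in for a production network.
model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 1024))
model.eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 1024)
with torch.no_grad():
    baseline, reduced = model(x), quantized(x)
print("max output difference:", (baseline - reduced).abs().max().item())
```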
What Isn't Working
Exponential Growth Outpacing Efficiency: The fundamental skeptical argument holds considerable weight: efficiency improvements of 2-3x per generation cannot offset demand growth of 10-100x. Total data center electricity consumption in North America grew by 18% in 2024 alone, and current projections suggest 15-20% annual growth through 2030. Even achieving ambitious efficiency targets still results in a tripling or quadrupling of absolute energy consumption.
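Compounding those growth rates shows why skeptics focus on absolute consumption. A minimal sketch, applying the 15-20% annual growth range cited above to an illustrative 2024 baseline (the 200 TWh starting point is an assumption, not a reported figure):

```python
# Project absolute consumption under compound annual growth.
# The 15-20% range comes from the text above; the baseline is illustrative.

BASELINE_TWH_2024 = 200  # hypothetical North American starting point
YEARS = 6                # 2024 -> 2030

for annual_growth in (0.15, 0.20):
    projected = BASELINE_TWH_2024 * (1 + annual_growth) ** YEARS
    print(f"{annual_growth:.0%}/yr -> ~{projected:.0f} TWh by 2030 "
          f"({projected / BASELINE_TWH_2024:.1f}x the 2024 level)")
```

At 20% annual growth the 2030 figure is roughly three times the baseline, which is the tripling the paragraph above refers to.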
Grid Constraints and Interconnection Queues: Data center operators in key markets face interconnection delays of 3-5 years due to overwhelmed utility planning processes. Virginia's Dominion Energy has over 40 GW of data center projects in queue, while available grid capacity expansion is measured in hundreds of megawatts per year. This mismatch has led to projects relocating to regions with available power—often areas with higher carbon intensity.
Scope 2 Accounting Gaps and Additionality: While hyperscalers report impressive renewable energy statistics, skeptics point out that many renewable PPAs represent existing generation capacity that would have operated regardless. True additionality—building new renewable capacity specifically to serve data center load—remains limited. The practice of purchasing unbundled Renewable Energy Certificates (RECs) allows operators to claim green credentials while drawing power from fossil fuel-heavy grids.
Embedded Carbon Blindspot: Current sustainability metrics focus almost exclusively on operational energy (Scope 2 emissions), while the manufacturing carbon footprint of chips, servers, and data center construction (Scope 3) receives minimal attention. A single H100 GPU contains approximately 150 kg of embedded carbon from manufacturing, and the concrete and steel in a hyperscale data center can represent 500,000+ tonnes of CO2e before a single server is powered on.
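A rough Scope 3 tally for an AI cluster can be assembled from per-component estimates like the 150 kg figure above. In the sketch below, every other quantity (GPU count, per-server overhead, construction allocation) is an illustrative assumption:

```python
# Rough embedded-carbon (Scope 3) inventory for a hypothetical AI cluster.
# The 150 kg/GPU estimate comes from the text above; everything else is assumed.

NUM_GPUS = 16_000
EMBEDDED_KG_PER_GPU = 150
EMBEDDED_KG_PER_SERVER = 1_300          # chassis, CPUs, memory, NICs (assumed)
GPUS_PER_SERVER = 8
CONSTRUCTION_TONNES_ALLOCATED = 20_000  # share of building embodied carbon (assumed)

gpu_tonnes = NUM_GPUS * EMBEDDED_KG_PER_GPU / 1_000
server_tonnes = (NUM_GPUS / GPUS_PER_SERVER) * EMBEDDED_KG_PER_SERVER / 1_000
total_tonnes = gpu_tonnes + server_tonnes + CONSTRUCTION_TONNES_ALLOCATED

print(f"GPUs:         {gpu_tonnes:,.0f} tCO2e")
print(f"Servers:      {server_tonnes:,.0f} tCO2e")
print(f"Construction: {CONSTRUCTION_TONNES_ALLOCATED:,.0f} tCO2e")
print(f"Total embedded (Scope 3): {total_tonnes:,.0f} tCO2e")
```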
Key Players
Established Leaders
NVIDIA Corporation: Dominates the AI accelerator market with over 80% share, producing the H100, H200, and B200 GPUs that power the majority of AI training infrastructure. NVIDIA's CUDA ecosystem creates strong lock-in, though the company faces increasing antitrust scrutiny.
Intel Corporation: Despite losing market share in AI accelerators, Intel remains critical through its foundry services, CPU infrastructure, and Gaudi AI chips. The company's $100 billion investment in U.S. manufacturing capacity will shape domestic supply chains.
Advanced Micro Devices (AMD): The MI300X accelerator has gained significant traction as an H100 alternative, with major deployments at Microsoft Azure and Meta. AMD's chiplet architecture enables faster iteration on energy-efficient designs.
Google (Alphabet): Designs custom TPU accelerators and operates some of the world's most efficient data centers. Google's commitment to 24/7 carbon-free energy by 2030 sets an industry benchmark that competitors are forced to address.
Microsoft Corporation: The largest purchaser of AI compute capacity and a major driver of data center expansion. Microsoft's nuclear PPA strategy and investments in fusion energy indicate long-term thinking about compute energy constraints.
Emerging Startups
Cerebras Systems: Produces the wafer-scale CS-3 system, built around the WSE-3 chip with 900,000 cores and designed specifically for AI training workloads. The architecture reduces the memory-movement bottlenecks that cause inefficiencies in GPU clusters, potentially offering 10x efficiency gains for certain model types.
Groq: Focuses on inference optimization with its Language Processing Unit (LPU) architecture, claiming 10x better performance per watt than GPUs for inference workloads. The company's deterministic execution model eliminates the scheduling overhead that reduces GPU efficiency.
SambaNova Systems: Offers the SN40L chip with a reconfigurable dataflow architecture that adapts to different model types. The approach addresses the efficiency losses that occur when running diverse workloads on fixed-function hardware.
d-Matrix: Develops in-memory computing chips that dramatically reduce the energy-intensive data movement between memory and compute units. Early benchmarks suggest 5-10x efficiency improvements for transformer inference.
Etched AI: Creates application-specific integrated circuits (ASICs) optimized exclusively for transformer architectures, sacrificing flexibility for dramatic efficiency gains on the dominant AI workload type.
Key Investors & Funders
Andreessen Horowitz: Led multiple AI chip investments including Groq and has published extensively on the economics of AI infrastructure, shaping industry discourse on sustainability.
U.S. CHIPS and Science Act: Provides $52.7 billion in federal funding for domestic semiconductor manufacturing, with provisions for energy efficiency and environmental compliance that influence chip design priorities.
Department of Energy Advanced Research Projects Agency-Energy (ARPA-E): Funds breakthrough research in energy-efficient computing, including neuromorphic architectures and novel cooling technologies.
Tiger Global Management and Coatue Management: Major investors in AI infrastructure companies, providing the growth capital that enables rapid scaling of new data center capacity.
Examples
Example 1: Microsoft's Three Mile Island Nuclear Revival
In September 2024, Microsoft signed a 20-year PPA with Constellation Energy to restart Unit 1 of the Three Mile Island nuclear plant specifically to power data centers. The 835 MW reactor will provide 100% carbon-free electricity starting in 2028, addressing both the capacity and carbon intensity challenges facing AI infrastructure. The deal demonstrates that hyperscalers are willing to pursue unconventional and expensive solutions to secure clean, reliable power—a development that could change skeptics' minds about the feasibility of sustainable AI compute at scale.
Example 2: Crusoe Energy's Stranded Gas Data Centers
Crusoe Energy operates modular data centers at oil and gas wellheads, utilizing natural gas that would otherwise be flared or vented. By converting waste gas into electricity for Bitcoin mining and AI training, Crusoe claims to reduce the net carbon footprint compared to both flaring and grid-connected operations. The model has attracted over $750 million in funding and demonstrates creative approaches to powering compute infrastructure. However, critics argue that this approach extends the operational life of fossil fuel extraction rather than accelerating the transition to renewables.
Example 3: Google's Geothermal Data Center Initiative
Google has partnered with Fervo Energy to supply enhanced geothermal power to its data centers in Nevada, beginning with a 3.5 MW pilot plant (Project Red) that came online in late 2023 and expanding through a roughly 115 MW agreement announced with utility NV Energy in 2024. Unlike solar and wind, geothermal provides 24/7 baseload power that matches data center demand profiles. The pilot was among the first commercial deployments of next-generation enhanced geothermal technology and could demonstrate that truly 24/7 carbon-free energy is achievable at data center scale. If successful, this approach could address skeptical concerns about intermittency and REC arbitrage.
Action Checklist
- Conduct a Scope 3 emissions inventory including embedded carbon from hardware manufacturing and data center construction
- Evaluate liquid cooling retrofit options for existing facilities, with cost-benefit analysis across 5-year and 10-year horizons
- Audit renewable energy claims for additionality—prioritize PPAs that finance new generation capacity over unbundled REC purchases
- Implement workload optimization tools to improve GPU utilization rates from typical 30-40% toward 70%+ targets
- Assess grid interconnection queue positions and develop contingency plans for power availability delays
- Establish internal carbon pricing ($100+/tonne) that captures the full cost of compute energy in project economics (see the pricing sketch after this checklist)
- Pilot alternative chip architectures (neuromorphic, inference-optimized ASICs) for appropriate workload segments
- Develop water stewardship policies for data center cooling, particularly in water-stressed regions
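For the internal carbon-pricing item above, the mechanics reduce to a per-kWh surcharge derived from grid intensity. A minimal sketch, using the $100/tonne floor from the checklist; the energy price, grid intensity, and workload size are illustrative assumptions:

```python
# Internal carbon price applied to compute energy:
# surcharge per kWh = carbon price ($/tCO2e) * grid intensity (tCO2e/kWh).
# The $100/tonne floor comes from the checklist; other constants are assumed.

CARBON_PRICE_PER_TONNE = 100.0
GRID_INTENSITY_G_PER_KWH = 380.0      # assumed grid mix
ENERGY_PRICE_PER_KWH = 0.08           # assumed contract rate, $/kWh
WORKLOAD_MWH = 5_000                  # hypothetical annual project energy

surcharge_per_kwh = CARBON_PRICE_PER_TONNE * GRID_INTENSITY_G_PER_KWH / 1e6
kwh = WORKLOAD_MWH * 1_000
energy_cost = kwh * ENERGY_PRICE_PER_KWH
carbon_cost = kwh * surcharge_per_kwh

print(f"Carbon surcharge: ${surcharge_per_kwh:.3f}/kWh "
      f"({carbon_cost / energy_cost:.0%} on top of the energy bill)")
print(f"Energy cost: ${energy_cost:,.0f}; carbon cost: ${carbon_cost:,.0f}")
```

Under these assumptions the carbon charge adds nearly half again to the energy bill, which is the point of the exercise: it makes the emissions cost visible in project economics.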
FAQ
Q: Can efficiency improvements keep pace with AI compute demand growth? A: Historical evidence suggests they cannot in absolute terms. While performance per watt improves 2-3x per chip generation, AI training compute requirements have grown 10-100x per model generation. This fundamental mismatch means that even achieving ambitious efficiency targets will result in significantly higher total energy consumption. The skeptical position is that efficiency gains are necessary but insufficient—demand management and breakthrough technologies are also required.
Q: Are hyperscaler renewable energy claims credible? A: The claims are technically accurate but often misleading about climate impact. Many PPAs involve existing renewable generation that would have operated regardless, and the common practice of annual renewable energy matching allows operators to draw fossil fuel power during high-demand periods while purchasing surplus renewable power overnight. True 24/7 carbon-free energy matching—which Google is pursuing—is far more demanding and remains rare in the industry.
Q: What would genuinely change skeptics' minds about sustainable AI compute? A: Skeptics identify several potential developments: (1) Demonstrated absolute declines in data center energy consumption while AI capabilities continue improving; (2) Commercial deployment of neuromorphic or other alternative architectures achieving 100x efficiency gains; (3) Proof that hyperscaler operations are truly grid-beneficial, providing demand response and storage services that accelerate renewable integration; (4) Robust Scope 3 accounting showing that full lifecycle emissions are declining.
Q: How do edge computing strategies affect the overall energy equation? A: Edge computing can reduce per-inference energy consumption but introduces manufacturing overhead from millions of additional devices. The net effect depends heavily on utilization rates—edge devices that sit idle most of the time may have worse lifecycle carbon footprints than centralized data centers. Additionally, edge deployment does not address the training compute problem, which remains concentrated in hyperscale facilities.
Q: What role will nuclear power play in sustainable AI infrastructure? A: Nuclear power is emerging as a leading candidate for meeting data center baseload requirements with zero carbon emissions. Microsoft's Three Mile Island deal and Amazon's investments in nuclear-powered data centers signal growing industry interest. However, new nuclear construction remains slow and expensive, and the existing fleet is aging. Small modular reactors (SMRs) may eventually provide more flexible nuclear options, but commercial deployment at scale remains years away.
Sources
- International Energy Agency. "Electricity 2024: Analysis and Forecast to 2026." IEA Publications, January 2024.
- Electric Reliability Council of Texas (ERCOT). "Report on Data Center Interconnection Requests." ERCOT Board Materials, October 2024.
- Lawrence Berkeley National Laboratory. "United States Data Center Energy Usage Report." U.S. Department of Energy, December 2024.
- Strubell, Emma, Ananya Ganesh, and Andrew McCallum. "Energy and Policy Considerations for Deep Learning in NLP." Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 2019.
- Masanet, Eric, et al. "Recalibrating Global Data Center Energy-Use Estimates." Science, Vol. 367, Issue 6481, February 2020.
- Uptime Institute. "Global Data Center Survey 2024." Uptime Institute Intelligence, 2024.
- Goldman Sachs. "Generational Growth: AI, Data Centers, and the Coming US Power Demand Surge." Goldman Sachs Research, April 2024.