Data story: Key signals in digital twins, simulation & synthetic data

Tracking the key quantitative signals in digital twins, simulation, and synthetic data: investment flows, adoption curves, performance benchmarks, and leading indicators of market direction.

The global digital twin market reached $48.3 billion in 2025, having grown at a compound annual rate of 37% since 2022. Yet a 2025 McKinsey survey found that only 13% of industrial digital twin deployments achieved their projected return on investment within the first two years. That gap between explosive market growth and underwhelming implementation outcomes defines the current state of digital twins, simulation, and synthetic data. Understanding which signals indicate genuine value creation, and which reflect speculative enthusiasm, is essential for organizations navigating this rapidly evolving landscape.

Why It Matters

Digital twins have moved from engineering curiosity to strategic priority across energy, manufacturing, infrastructure, and urban planning. The European Commission's Destination Earth initiative allocated EUR 315 million through 2027 to build high-precision digital replicas of the Earth system for climate modeling and disaster preparedness. The EU's revised Energy Performance of Buildings Directive, effective 2025, explicitly references digital building passports that function as simplified digital twins for tracking energy performance and renovation readiness across the bloc's 220 million buildings.

For sustainability professionals and policymakers, the implications are direct. Digital twins of energy grids enable simulation of decarbonization scenarios before committing capital. Building digital twins can model retrofit sequences to optimize emissions reductions per euro invested. City-scale twins simulate urban heat island effects, flood pathways, and transport electrification impacts, supporting evidence-based adaptation planning. The EU's Corporate Sustainability Reporting Directive (CSRD) creates additional demand: companies subject to double materiality assessments need simulation capabilities to model climate scenarios and transition risks across their value chains.

The synthetic data dimension compounds the significance. Training AI models for sustainability applications, from satellite-based emissions monitoring to predictive maintenance of renewable energy assets, requires vast datasets that often do not exist or carry privacy constraints. Synthetic data generation, using physics-based simulations, generative adversarial networks, and diffusion models, provides an alternative that is reshaping how climate AI systems are built and validated. The European Data Act, effective September 2025, creates frameworks for data sharing that accelerate demand for synthetic alternatives where real data cannot be shared.

Key Concepts

Digital Twin Maturity Levels describe the progression from static 3D models to fully autonomous, self-optimizing virtual replicas. Level 1 twins are descriptive (visualizing current state), Level 2 are diagnostic (identifying why conditions exist), Level 3 are predictive (forecasting future states), and Level 4 are prescriptive (recommending or executing optimal actions). A 2025 assessment by Gartner found that 72% of deployed digital twins remain at Level 1 or 2, with only 8% achieving Level 4 autonomy. The maturity distribution explains much of the ROI gap: descriptive twins generate insight but rarely drive measurable operational improvements without human intervention.
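The shares quoted above imply two figures the text does not state directly. The following is a minimal sketch, assuming the four levels partition all deployed twins so that the shares sum to 100%; the enum and variable names are illustrative, not part of any Gartner framework.

```python
from enum import IntEnum


class TwinMaturity(IntEnum):
    """Maturity levels as described in the text."""
    DESCRIPTIVE = 1   # visualizes current state
    DIAGNOSTIC = 2    # identifies why conditions exist
    PREDICTIVE = 3    # forecasts future states
    PRESCRIPTIVE = 4  # recommends or executes optimal actions


# Shares quoted in the text (percent of deployed twins).
share_level_1_or_2 = 72
share_level_4 = 8

# Assumption: every deployed twin sits at exactly one of the four levels.
share_level_3 = 100 - share_level_1_or_2 - share_level_4
share_level_3_plus = share_level_3 + share_level_4

print(f"Level 3 only: {share_level_3}%")
print(f"Level 3 or higher: {share_level_3_plus}%")
```

Under that assumption, Level 3 accounts for 20% of deployments and Level 3 or higher for 28%, which agrees with the "Industrial Deployments at Level 3+" row of the dashboard.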

Physics-Informed Neural Networks (PINNs) combine data-driven machine learning with physics-based constraints, enabling digital twins that remain physically plausible even when sensor data is sparse or noisy. For energy systems, PINNs model thermal dynamics, fluid flows, and electrical behavior with 60 to 80% less training data than purely data-driven approaches. Siemens and NVIDIA have both invested heavily in PINN architectures for industrial digital twins, and the European Space Agency uses them for satellite thermal management simulation.
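A small example may help show why a physics constraint matters when sensor data is sparse. The sketch below is not a neural network and not any vendor's architecture. It only evaluates the kind of composite loss a PINN minimizes (a data term plus a physics-residual term) on two hand-written candidate functions. Newton's law of cooling, the constants `K` and `T_ENV`, and all function names are assumptions chosen for illustration.

```python
import math

K = 0.5        # assumed cooling constant (1/s), illustrative only
T_ENV = 20.0   # assumed ambient temperature (deg C)


def physical(t):
    """Exact solution of dT/dt = -K * (T - T_ENV) with T(0) = 80."""
    return T_ENV + 60.0 * math.exp(-K * t)


def nonphysical(t):
    """Matches the same two sensor readings but oscillates in between."""
    return physical(t) + 5.0 * math.sin(math.pi * t)


# Sparse "sensor" readings at t = 0 and t = 2 only.
sensor_data = [(0.0, physical(0.0)), (2.0, physical(2.0))]
# Collocation points where only the physics is checked (no measurements).
collocation = [0.25 * i for i in range(1, 8)]


def data_loss(model):
    """Mean squared error against the sparse sensor readings."""
    return sum((model(t) - y) ** 2 for t, y in sensor_data) / len(sensor_data)


def physics_loss(model, h=1e-4):
    """Mean squared residual of the cooling ODE at the collocation points."""
    total = 0.0
    for t in collocation:
        dT_dt = (model(t + h) - model(t - h)) / (2 * h)  # central difference
        residual = dT_dt + K * (model(t) - T_ENV)        # zero if the ODE holds
        total += residual ** 2
    return total / len(collocation)


def pinn_loss(model, weight=1.0):
    """Composite loss: fit the data and respect the physics."""
    return data_loss(model) + weight * physics_loss(model)


for name, model in [("physical", physical), ("nonphysical", nonphysical)]:
    print(name, data_loss(model), physics_loss(model), pinn_loss(model))
```

Both candidates fit the two readings almost perfectly, so the data term alone cannot tell them apart. Only the physics term penalizes the oscillating candidate, which is the mechanism that lets a physics-informed model stay plausible with fewer measurements.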

Synthetic Data Generation creates artificial datasets that preserve the statistical properties of real data without containing actual measurements. For sustainability applications, synthetic data addresses three critical gaps: scarcity (rare extreme weather events for resilience modeling), privacy (building occupancy patterns under GDPR), and cost (generating labeled training data for AI systems at a fraction of manual annotation expenses). Gartner projected that by 2025, 60% of AI training data would be synthetically generated, and actual adoption has tracked close to this estimate, reaching approximately 55% across enterprise AI applications.

Federated Simulation enables multiple organizations to connect separate digital twins into coordinated simulation environments without sharing proprietary data. The EU's GAIA-X infrastructure supports federated digital twins across automotive, energy, and manufacturing supply chains, allowing partners to simulate interactions (such as grid-connected EV charging behavior or cross-border energy flows) while maintaining data sovereignty.
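The idea of coordinating twins without sharing proprietary data can be sketched in a few lines. This is a toy illustration under stated assumptions, not a model of GAIA-X or any real federation protocol: the class names, the kW and MW figures, and the single exchanged value (aggregate load) are all hypothetical.

```python
class ChargingTwin:
    """EV charging operator's twin; session-level data stays private."""

    def __init__(self, sessions_kw):
        self._sessions_kw = sessions_kw  # proprietary, never leaves this object

    def aggregate_load_mw(self, hour):
        """Expose only the total load for an hour, not individual sessions."""
        return sum(self._sessions_kw[hour]) / 1000.0


class GridTwin:
    """Grid operator's twin; the network model stays private."""

    def __init__(self, feeder_limit_mw, base_load_mw):
        self._feeder_limit_mw = feeder_limit_mw
        self._base_load_mw = base_load_mw

    def headroom_mw(self, hour, external_load_mw):
        """Remaining capacity after adding an externally reported load."""
        return self._feeder_limit_mw - self._base_load_mw[hour] - external_load_mw


def co_simulate(charging, grid, hours):
    """Exchange only aggregate interface values between the two twins."""
    return {h: grid.headroom_mw(h, charging.aggregate_load_mw(h)) for h in hours}


charging = ChargingTwin({18: [150, 150, 200], 19: [350, 350, 300, 500]})
grid = GridTwin(feeder_limit_mw=5.0, base_load_mw={18: 3.0, 19: 4.0})
headroom = co_simulate(charging, grid, hours=[18, 19])
print(headroom)  # negative headroom would indicate congestion in that hour
```

Each party sees only the other's interface value. The grid operator never learns individual charging sessions, and the charging operator never learns the feeder limit or base load.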

Digital Twins Key Signals Dashboard

Signal | 2023 Baseline | 2025 Current | Direction | Significance
Global Market Size | $26.1B | $48.3B | Accelerating | 37% CAGR sustained
EU Public Funding Committed | EUR 150M | EUR 515M | Strong growth | Destination Earth + national programs
Industrial Deployments at Level 3+ | 18% | 28% | Gradual improvement | Predictive capability expanding
Synthetic Data Share of AI Training | 42% | 55% | Steady increase | Approaching majority of training data
Average ROI Payback Period | 3.2 years | 2.6 years | Improving | Still exceeds vendor claims of 12-18 months
Energy Sector Adoption Rate | 24% | 39% | Fast growth | Grid + renewables driving uptake
Building Sector Adoption Rate | 8% | 14% | Slow growth | Retrofit complexity constraining adoption
Data Integration Cost (% of project) | 45% | 38% | Declining | Standardization reducing friction
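The growth rate in the first row can be cross-checked against the dashboard's own endpoints. The sketch below applies the standard compound annual growth rate formula to the 2023 and 2025 market figures. It then back-calculates the 2022 market size implied by the article's "37% since 2022" claim. That 2022 value is derived, not reported anywhere in the source.

```python
def cagr(start, end, years):
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


# Dashboard endpoints: $26.1B (2023) to $48.3B (2025), two years apart.
market_cagr = cagr(26.1, 48.3, years=2)
print(f"2023-2025 CAGR: {market_cagr:.1%}")

# Derived, not reported: 2022 size implied by 37% CAGR over 2022-2025.
implied_2022 = 48.3 / (1 + 0.37) ** 3
print(f"Implied 2022 market size: ${implied_2022:.1f}B")
```

The two-year figure comes out at about 36%, close to the 37% quoted, and the implied 2022 market size is about $18.8 billion.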

What's Working

Grid and Energy System Digital Twins

Transmission system operators across Europe have deployed digital twins that demonstrably improve grid reliability and renewable integration. TenneT, operating the German and Dutch high-voltage grids, uses a real-time digital twin processing data from 42,000 sensors to simulate contingency scenarios and optimize power flows. The system reduced unplanned outages by 23% between 2023 and 2025 while enabling 4.2 GW of additional wind generation to connect without triggering congestion. The French grid operator RTE deployed a similar platform that models the impact of extreme weather on transmission assets, cutting storm-related response times by 35%. These energy applications succeed because the underlying physics is well-characterized, sensor density is high, and the economic value of marginal improvements is substantial: each percentage point of grid efficiency improvement in the EU saves approximately EUR 1.4 billion annually, according to the European Network of Transmission System Operators for Electricity (ENTSO-E).

Manufacturing Process Optimization

Siemens' Xcelerator platform powers digital twins in more than 300 manufacturing facilities globally, with documented energy efficiency improvements of 15 to 20% in discrete manufacturing and 10 to 14% in process industries. BMW's Regensburg plant uses an NVIDIA Omniverse-powered digital twin to simulate production line configurations, reducing commissioning time for new vehicle models by 30% and energy consumption per vehicle by 12%. The key enabler is the closed-loop integration between simulation and control systems: the digital twin continuously updates based on real production data, identifies drift from optimal parameters, and recommends adjustments. Process industries see lower gains because chemical and thermal processes involve more complex multi-physics interactions that current simulation fidelity does not fully capture.
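The closed loop described above (ingest production data, detect drift from the optimal parameter, recommend an adjustment) can be sketched as follows. This is a deliberately simple illustration using a rolling mean and a fixed tolerance. It does not represent how Xcelerator or Omniverse work, and the class, the readings, and the thresholds are hypothetical.

```python
from collections import deque
from statistics import mean


class ProcessTwin:
    """Toy twin of a single process parameter (hypothetical)."""

    def __init__(self, optimal, tolerance, window=5):
        self.optimal = optimal
        self.tolerance = tolerance
        self.readings = deque(maxlen=window)  # keeps only the latest readings

    def update(self, reading):
        """Ingest one production reading; return a correction or None."""
        self.readings.append(reading)
        if len(self.readings) < self.readings.maxlen:
            return None  # not enough data to judge drift yet
        drift = mean(self.readings) - self.optimal
        if abs(drift) > self.tolerance:
            return -drift  # suggested offset to bring the mean back on target
        return None


twin = ProcessTwin(optimal=180.0, tolerance=2.0)
stream = [180.1, 179.8, 180.3, 181.0, 182.5, 183.9, 185.2]
recommendations = [twin.update(r) for r in stream]
print(recommendations)
```

With these made-up readings the twin stays silent until the rolling mean drifts more than 2 units above target, then recommends a downward correction. A production system would replace the rolling mean with a calibrated process model, but the update, compare, and recommend cycle has the same shape.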

Synthetic Data for Climate AI

Synthetic data generation has proven transformative for training AI models used in climate and sustainability applications. The European Centre for Medium-Range Weather Forecasts (ECMWF) uses physics-based synthetic weather scenarios to train extreme event prediction models, improving severe storm forecast accuracy by 18% over purely historical training approaches. Microsoft's Planetary Computer generates synthetic satellite imagery for land use change detection, enabling model training for deforestation monitoring in regions where historical labeled datasets are insufficient. In building energy simulation, the US Department of Energy's EnergyPlus engine generates synthetic building performance data used to train AI optimization systems across millions of building configurations that would be impossible to monitor physically.

What's Not Working

Building Retrofit Digital Twins

Despite significant vendor activity, building-level digital twins for retrofit planning remain challenging to implement at scale. The fundamental problem is data availability: existing buildings, particularly those constructed before 2000, rarely have the as-built documentation, sensor infrastructure, or energy submetering needed to calibrate accurate digital twins. A 2025 analysis by the Buildings Performance Institute Europe found that creating a calibrated digital twin for a typical 1970s-era office building costs EUR 25,000 to EUR 80,000, compared to EUR 3,000 to EUR 8,000 for new construction where BIM models exist. At these costs, digital twins are justified for large commercial portfolios but remain uneconomic for the majority of European building stock that most urgently needs renovation.

Interoperability and Data Standards

The absence of universally adopted data standards continues to fragment the digital twin ecosystem. Competing frameworks, including the Digital Twin Consortium's ontology, ISO 23247 for manufacturing, and the Smart Readiness Indicator for EU buildings, create integration overhead that consumes 30 to 45% of implementation budgets. A 2025 survey by the European Digital Twin Association found that 68% of organizations cited interoperability as their primary deployment barrier, ahead of cost (54%) and skills availability (47%). The EU Data Spaces initiative aims to address this through sector-specific data models, but harmonization timelines extend to 2028 at the earliest.

Synthetic Data Quality Assurance

While synthetic data adoption is accelerating, quality validation remains immature. Models trained on synthetic data that does not adequately represent real-world distributions produce systematically biased predictions. A 2024 study published in Nature Machine Intelligence found that 29% of synthetic datasets used in climate AI applications contained distributional biases that degraded model performance by 15 to 25% compared to real-data baselines. The problem is particularly acute for extreme events, precisely the scenarios where synthetic data is most needed, because generative models tend to underrepresent tail distributions unless explicitly constrained.
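One way to make the tail problem concrete is to compare a synthetic sample against a real validation sample with a distribution-level test plus an explicit tail check. The sketch below implements the two-sample Kolmogorov-Smirnov statistic by hand and a simple exceedance fraction. The exponential "real" data, the clipping at 2.0, and the threshold of 3.0 are assumptions made up for the demonstration, not values from the cited study.

```python
import random
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    real_sorted, synth_sorted = sorted(real), sorted(synthetic)
    gap = 0.0
    for x in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


def tail_share(values, threshold):
    """Fraction of samples above a threshold: a crude tail-coverage check."""
    return sum(v > threshold for v in values) / len(values)


rng = random.Random(0)
real = [rng.expovariate(1.0) for _ in range(2000)]        # stand-in for real data
good_synth = [rng.expovariate(1.0) for _ in range(2000)]  # same distribution
# A generator that never produces extremes: every value is capped at 2.0.
clipped_synth = [min(rng.expovariate(1.0), 2.0) for _ in range(2000)]

ks_good = ks_statistic(real, good_synth)
ks_clipped = ks_statistic(real, clipped_synth)
print("KS good:", ks_good, "KS clipped:", ks_clipped)
print("Tail share real:", tail_share(real, 3.0))
print("Tail share clipped:", tail_share(clipped_synth, 3.0))
```

The clipped generator shows a clearly larger KS gap and no samples above the threshold. The tail check is worth running separately because a generator can score well on a whole-distribution statistic while still missing the rare extremes that resilience models depend on.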

Key Players

Siemens leads industrial digital twins through its Xcelerator platform, with over 300 facility deployments and deep integration with manufacturing execution systems and building management platforms.

NVIDIA provides the Omniverse platform for real-time, physics-accurate simulation environments used by BMW, Ericsson, and Amazon Robotics for digital twin applications spanning manufacturing, logistics, and telecommunications.

Dassault Systèmes offers the 3DEXPERIENCE platform powering digital twins across aerospace, automotive, and urban planning, with the Virtual Singapore city-scale twin serving as a reference implementation.

ECMWF operates the Destination Earth digital twin of the Earth system, providing climate simulation infrastructure for EU member states' adaptation planning.

Bentley Systems focuses on infrastructure digital twins for roads, bridges, utilities, and water networks, with iTwin platform deployments across 40 countries.

Mostly AI is an EU-based synthetic data company that generates GDPR-compliant training data for financial services, healthcare, and energy applications.

Action Checklist

  • Assess digital twin maturity requirements: determine whether Level 2 (diagnostic) or Level 3 (predictive) capability is needed before scoping projects
  • Audit existing data infrastructure (sensor coverage, BIM availability, and data historian systems) before committing to digital twin deployment
  • Budget 30 to 40% of project costs for data integration, cleaning, and interoperability work
  • Require vendors to demonstrate ROI evidence from comparable deployments, not pilot-stage results
  • Evaluate synthetic data quality through statistical divergence testing against real-world validation sets before using for model training
  • Align digital twin initiatives with EU regulatory requirements including CSRD scenario analysis and Energy Performance of Buildings Directive compliance
  • Establish data governance frameworks addressing GDPR, the European Data Act, and sector-specific data sharing requirements
  • Plan for interoperability by adopting open standards (ISO 23247, Digital Twin Consortium ontology) where available

FAQ

Q: What distinguishes a digital twin from a traditional simulation model?
A: A digital twin maintains a persistent, continuously updated connection to its physical counterpart through real-time sensor data, whereas traditional simulations use static inputs and run independently. This live link enables the digital twin to reflect current conditions, detect anomalies, and improve over time as more operational data accumulates. However, many products marketed as digital twins are functionally static simulations with periodic manual updates, which is why maturity assessment is critical before procurement.

Q: How reliable is synthetic data for training sustainability AI models?
A: Reliability depends on the generation method and validation rigor. Physics-based synthetic data (such as weather simulations from ECMWF or building energy models from EnergyPlus) is highly reliable because it encodes known physical laws. Purely statistical synthetic data from generative models carries higher risk of distributional bias, particularly for rare events. Best practice requires validating synthetic datasets against real-world holdout samples and testing model performance degradation before deployment.

Q: What is the realistic cost of implementing a digital twin for an existing building in the EU?
A: For buildings with existing BIM models and modern building management systems, expect EUR 5,000 to EUR 15,000 for calibration and deployment. For older buildings without BIM or adequate sensor coverage, costs escalate to EUR 25,000 to EUR 80,000 including survey work, sensor installation, and model development. Portfolio approaches that amortize platform costs across 50 or more buildings can reduce per-building costs by 40 to 60%, making them viable for large commercial or public sector estates.
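The portfolio effect in this answer is easy to work through. The sketch below applies the quoted 40 to 60% reduction to both cost ranges, assuming the saving applies uniformly across each range (the source does not say how savings are distributed). The function name is illustrative.

```python
def portfolio_cost_range(low, high, min_saving, max_saving):
    """Per-building cost range after portfolio savings (best case, worst case)."""
    best = round(low * (1 - max_saving))    # cheapest building, largest saving
    worst = round(high * (1 - min_saving))  # costliest building, smallest saving
    return best, worst


# Older buildings without BIM: EUR 25,000 to 80,000 standalone.
older = portfolio_cost_range(25_000, 80_000, 0.40, 0.60)
# Buildings with BIM and modern BMS: EUR 5,000 to 15,000 standalone.
with_bim = portfolio_cost_range(5_000, 15_000, 0.40, 0.60)

print(f"Older buildings in a portfolio: EUR {older[0]:,} to {older[1]:,}")
print(f"BIM-ready buildings in a portfolio: EUR {with_bim[0]:,} to {with_bim[1]:,}")
```

Under that assumption, older buildings fall to roughly EUR 10,000 to 48,000 each and BIM-ready buildings to roughly EUR 2,000 to 9,000 each.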

Q: How does the EU regulatory landscape drive digital twin adoption?
A: Three regulatory forces converge. First, the CSRD requires climate scenario analysis that simulation tools support. Second, the Energy Performance of Buildings Directive mandates building renovation passports that function as simplified digital twins. Third, the European Data Act creates frameworks for data sharing that enable federated simulation across supply chains. Together, these create compliance-driven demand that supplements the operational business case.

Q: When does a digital twin investment make financial sense versus simpler analytics approaches?
A: Digital twins justify their cost when three conditions are met: the asset or system is complex enough that simplified models miss significant optimization opportunities; sufficient real-time data exists or can be cost-effectively installed; and the economic value of marginal operational improvements exceeds implementation and maintenance costs. For energy grids, large manufacturing facilities, and critical infrastructure, these conditions are typically met. For individual buildings, small facilities, or assets with limited operational variability, simpler analytics tools often deliver 80% of the value at 20% of the cost.
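The three conditions in this answer can be turned into a rough screening check. The sketch below is one possible reading, not a method from the source: it expresses the third condition as a simple payback period and compares it with a maximum the organization is willing to accept. All monetary inputs and the 3-year threshold are hypothetical.

```python
def simple_payback_years(implementation_cost, annual_benefit, annual_maintenance):
    """Years to recover the upfront cost from net annual benefit, or None."""
    net = annual_benefit - annual_maintenance
    if net <= 0:
        return None  # benefits never cover running costs
    return implementation_cost / net


def twin_is_justified(complex_system, data_available, payback_years, max_years):
    """All three conditions from the text must hold."""
    return (
        complex_system
        and data_available
        and payback_years is not None
        and payback_years <= max_years
    )


# Hypothetical large facility: complex, well instrumented, strong net benefit.
plant_payback = simple_payback_years(500_000, 260_000, 60_000)
plant_ok = twin_is_justified(True, True, plant_payback, max_years=3.0)

# Hypothetical small building: benefits only just cover maintenance.
small_payback = simple_payback_years(60_000, 12_000, 12_000)
small_ok = twin_is_justified(False, True, small_payback, max_years=3.0)

print(plant_payback, plant_ok)
print(small_payback, small_ok)
```

A simple payback ignores discounting and benefit ramp-up, so it works best as a first filter before a fuller business case.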

Sources

  • McKinsey & Company. (2025). Digital Twins: From Pilot to Scale in Industrial Operations. New York: McKinsey Digital.
  • European Commission. (2025). Destination Earth: Progress Report and Technical Architecture. Brussels: DG CONNECT.
  • Gartner. (2025). Hype Cycle for Digital Twins, 2025. Stamford, CT: Gartner Research.
  • Buildings Performance Institute Europe. (2025). Digital Building Twins: Cost-Benefit Analysis for the EU Building Stock. Brussels: BPIE.
  • ENTSO-E. (2025). Grid Digitalisation and Digital Twins: Impact Assessment Report. Brussels: ENTSO-E.
  • Nature Machine Intelligence. (2024). "Quality assurance challenges in synthetic data for climate AI applications." Nature Machine Intelligence, 6(8), 912-924.
  • Siemens AG. (2025). Xcelerator Digital Twin Portfolio: Measured Outcomes Report. Munich: Siemens.
