Myths vs. realities: Digital twins, simulation & synthetic data — what the evidence actually supports

A side-by-side analysis of common myths and evidence-backed realities in digital twins, simulation, and synthetic data, to help practitioners distinguish credible claims from marketing noise.

Digital twins, simulation tools, and synthetic data platforms have attracted enormous attention and venture capital across climate technology, infrastructure planning, and industrial decarbonization. Gartner listed digital twins among its top strategic technology trends for six consecutive years, and global spending on digital twin technology reached an estimated $16.5 billion in 2025, projected to exceed $40 billion by 2030 according to MarketsandMarkets. Yet the distance between vendor promises and documented outcomes remains wide. This article separates evidence-backed realities from persistent myths, providing sustainability professionals with a clear-eyed framework for evaluating digital twin investments.

Why It Matters

The EU's Energy Performance of Buildings Directive (EPBD) recast, adopted in 2024, requires member states to establish national building renovation passport schemes by 2026 and mandates digital building logbooks for new construction by 2028. These digital logbooks function as simplified digital twins, tracking energy performance, material composition, and renovation history throughout a building's lifecycle. For sustainability professionals managing real estate portfolios across European markets, understanding what digital twin technology can and cannot deliver has become a compliance necessity, not merely a technology curiosity.

Industrial applications carry even higher stakes. The EU's Corporate Sustainability Reporting Directive (CSRD) requires large companies to report Scope 1, 2, and 3 emissions using auditable methodologies. Digital twins of manufacturing processes, supply chains, and energy systems are increasingly positioned as the infrastructure for generating these auditable emissions datasets. If these systems underperform, organizations risk regulatory non-compliance, restatement of sustainability reports, and reputational damage.

The synthetic data segment adds another layer of complexity. Training AI models for climate applications, from wildfire prediction to building energy optimization, often requires datasets that are expensive, privacy-sensitive, or physically impossible to collect at scale. Synthetic data generators promise to fill these gaps, but the question of whether synthetic data produces models that perform reliably in real-world conditions remains actively debated in the research community.

Global investment reflects the perceived importance: digital twin startups raised $4.2 billion in venture funding between 2022 and 2025, with notable rounds including Cosmo Tech ($28 million Series B), Cityzenith ($12 million Series A), and IES ($15 million growth round) focused specifically on sustainability and infrastructure applications. Understanding which investments will yield practical returns requires separating marketing claims from measured performance.

Key Concepts

Digital Twins are dynamic virtual representations of physical assets, processes, or systems that continuously update using real-time sensor data. Unlike static 3D models or building information models (BIM), a true digital twin maintains a persistent, bidirectional data connection with its physical counterpart, enabling both monitoring and predictive simulation. The National Institute of Standards and Technology (NIST) defines five maturity levels for digital twins, from descriptive (visualization only) to autonomous (self-optimizing without human intervention). The vast majority of deployed systems operate at Level 2 (diagnostic) or Level 3 (predictive), with autonomous operation remaining largely aspirational.

Simulation encompasses physics-based and data-driven computational models that predict system behavior under specified conditions. In sustainability applications, simulation tools model building energy performance (EnergyPlus, IDA ICE), industrial process emissions (Aspen Plus, gPROMS), urban microclimate (ENVI-met), and grid operations (PLEXOS, PyPSA). Simulations can operate independently of digital twins but form the analytical engine within most digital twin architectures.

Synthetic Data refers to artificially generated datasets that statistically replicate the properties of real-world data without containing actual observations. Generation methods include generative adversarial networks (GANs), variational autoencoders (VAEs), agent-based simulation, and physics-based rendering. For sustainability applications, synthetic data addresses gaps in climate observation records, enables training of AI models where real failure data is scarce, and supports privacy-compliant analysis of building occupancy and energy consumption patterns.

Myths vs. Reality

Myth 1: Digital twins deliver immediate ROI with minimal setup

Reality: Independent evaluations consistently show that meaningful digital twin deployments require 12 to 24 months from initiation to measurable value delivery. A 2024 study by McKinsey covering 150 industrial digital twin projects found that 70% exceeded their original budget by 30% or more, and 45% failed to achieve projected first-year ROI targets. The primary cost drivers were data integration (consuming 40 to 60% of project budgets), sensor infrastructure gaps, and the engineering effort required to calibrate physics-based models against actual facility performance. Organizations that achieved positive ROI within 18 months typically had pre-existing sensor infrastructure, standardized processes, and dedicated data engineering teams. The Siemens Amberg electronics factory, frequently cited as a digital twin success, required more than five years and an estimated $200 million in cumulative investment before reaching its current state of 99.99885% production quality.

Myth 2: Any building or facility can benefit equally from a digital twin

Reality: Digital twin value varies dramatically by asset type, age, and operational complexity. New construction projects with BIM models, embedded sensors, and standardized systems achieve the highest returns. The Crystal, Siemens' building in London, uses a comprehensive digital twin to keep energy consumption 46% below that of comparable buildings, but the system was designed into the building from inception. By contrast, retrofitting digital twins onto buildings constructed before 2010 typically requires $3 to $8 per square foot in sensor and integration infrastructure before any analytical value can be extracted. A 2025 study by the Building Research Establishment (BRE) found that only 15% of UK commercial buildings have sufficient existing digital infrastructure to support digital twin deployment without significant capital investment. Industrial facilities with standardized equipment and continuous processes (chemical plants, data centers, and semiconductor fabrication) show stronger economics than facilities with heterogeneous equipment and batch operations.

Myth 3: Synthetic data can fully replace real-world training data for AI models

Reality: Synthetic data is a powerful augmentation tool but a poor substitute for real-world data in most sustainability applications. Research published in Nature Machine Intelligence in 2024 evaluated synthetic data across 12 environmental monitoring tasks and found that models trained exclusively on synthetic data underperformed those trained on real data by 15 to 35% on average, measured by F1 score and mean absolute error. However, models trained on a combination of 30% real data and 70% synthetic data performed within 3 to 5% of models trained entirely on real data, making synthetic augmentation highly valuable where real data is scarce. The key limitation is distributional shift: synthetic generators trained on historical patterns may fail to capture novel conditions such as unprecedented weather events, equipment failure modes, or regulatory changes. The European Centre for Medium-Range Weather Forecasts (ECMWF) uses synthetic data to augment its training datasets for weather prediction models but maintains strict validation protocols requiring synthetic-trained models to match or exceed real-data benchmarks on holdout test sets before deployment.
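The mixed-data approach described above can be sketched in a few lines. This is a minimal illustration, not the benchmark study's method: the function name `build_mixed_training_set`, its parameters, and the 20% holdout share are assumptions made for the example. The one point it tries to make concrete is that the holdout set is drawn from real data only, so synthetic records never leak into the evaluation.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def build_mixed_training_set(
    real: Sequence[T],
    synthetic: Sequence[T],
    real_fraction: float = 0.30,    # share of real records in the training set
    holdout_fraction: float = 0.20, # assumed share of real data reserved for testing
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Return (training_set, real_holdout) for a real/synthetic blend."""
    if not 0.0 < real_fraction <= 1.0:
        raise ValueError("real_fraction must be in (0, 1]")
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be in (0, 1)")

    rng = random.Random(seed)
    shuffled_real = list(real)
    rng.shuffle(shuffled_real)

    # Reserve real observations for evaluation before any mixing happens.
    n_holdout = max(1, round(len(shuffled_real) * holdout_fraction))
    holdout = shuffled_real[:n_holdout]
    real_train = shuffled_real[n_holdout:]

    # Number of synthetic records needed to hit the target real share.
    n_synthetic = round(len(real_train) * (1.0 - real_fraction) / real_fraction)
    if n_synthetic > len(synthetic):
        raise ValueError("not enough synthetic records for the requested mix")

    training = real_train + rng.sample(list(synthetic), n_synthetic)
    rng.shuffle(training)
    return training, holdout


# Hypothetical records: 100 real and 500 synthetic observations.
real_records = [("real", i) for i in range(100)]
synthetic_records = [("synthetic", i) for i in range(500)]
train, test = build_mixed_training_set(real_records, synthetic_records)
print(len(train), len(test))
```

Whatever model is trained on the blended set should then be scored on the real holdout, which mirrors the validation discipline attributed to ECMWF above.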

Myth 4: Digital twins make simulation expertise unnecessary

Reality: Digital twins augment but do not replace the need for domain-specific simulation expertise. Platform vendors frequently market drag-and-drop interfaces that promise democratized simulation capabilities. In practice, configuring physics-based models for building energy, industrial processes, or urban systems requires deep knowledge of thermodynamics, fluid dynamics, and materials science to set appropriate boundary conditions, validate results, and interpret outputs correctly. A 2024 survey by the American Society of Mechanical Engineers found that 62% of digital twin projects that failed to deliver value cited insufficient domain expertise rather than technology limitations as the primary cause. Sustainability professionals should expect to partner with specialized engineering consultants or hire dedicated simulation engineers rather than relying solely on vendor platforms.

Myth 5: Digital twins automatically ensure regulatory compliance

Reality: Digital twins can support compliance data collection and reporting, but they do not automatically satisfy regulatory requirements. CSRD auditors, for example, require documented methodologies, uncertainty quantification, and third-party verification that digital twin outputs alone cannot provide. The International Auditing and Assurance Standards Board (IAASB) issued guidance in 2024 specifying that digitally generated sustainability data must be accompanied by documented validation procedures, sensitivity analyses, and reconciliation with physical measurements. Organizations relying on digital twins for compliance reporting need to build validation frameworks that compare twin outputs against metered data, with documented tolerance thresholds and exception handling procedures.
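A validation framework of this kind can start very small. The sketch below is an assumption-laden illustration rather than anything prescribed by the CSRD or the IAASB: the names `reconcile` and `exceptions`, the monthly kWh granularity, and the 10% default tolerance are all hypothetical choices. It compares twin outputs with metered values period by period and lists the periods that need documented follow-up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ReconciliationResult:
    period: str
    twin_kwh: float
    metered_kwh: float
    deviation_pct: float
    within_tolerance: bool


def reconcile(
    twin: Dict[str, float],
    metered: Dict[str, float],
    tolerance_pct: float = 10.0,  # assumed threshold; set and document your own
) -> List[ReconciliationResult]:
    """Compare twin output with metered data for every metered period."""
    results = []
    for period in sorted(metered):
        if period not in twin:
            raise KeyError(f"twin has no output for period {period}")
        measured = metered[period]
        if measured <= 0:
            raise ValueError(f"metered value for {period} must be positive")
        # Absolute percentage deviation relative to the physical measurement.
        deviation = 100.0 * abs(twin[period] - measured) / measured
        results.append(
            ReconciliationResult(
                period, twin[period], measured, deviation, deviation <= tolerance_pct
            )
        )
    return results


def exceptions(results: List[ReconciliationResult]) -> List[ReconciliationResult]:
    """Periods outside tolerance, which need a documented explanation."""
    return [r for r in results if not r.within_tolerance]


# Hypothetical monthly energy figures in kWh.
twin_output = {"2025-01": 1050.0, "2025-02": 1080.0, "2025-03": 1300.0}
meter_readings = {"2025-01": 1000.0, "2025-02": 1000.0, "2025-03": 1000.0}
for item in exceptions(reconcile(twin_output, meter_readings)):
    print(item.period, round(item.deviation_pct, 1))
```

In an audit setting, the tolerance, the rationale for it, and the handling of each flagged period would all need to be written down alongside the output.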

Myth 6: Urban-scale digital twins will transform city planning within five years

Reality: Urban digital twins have demonstrated value for specific, bounded applications but remain far from comprehensive city-scale deployment. Singapore's Virtual Singapore project, the most advanced national-scale effort, required more than $73 million and eight years of development to reach its current state, which still covers primarily above-ground infrastructure with limited integration of underground utilities, real-time traffic, and indoor environments. Helsinki's 3D city model provides effective visualization and solar potential analysis but lacks the real-time data feeds needed for dynamic optimization. A 2025 assessment by the Open Geospatial Consortium concluded that true urban digital twins integrating transportation, energy, water, and building systems in real time remain 10 to 15 years from practical deployment at city scale. Near-term value concentrates in district-level applications: campus energy management, industrial park optimization, and neighborhood-scale climate resilience planning.

Digital Twin Deployment KPIs: Benchmark Ranges

| Metric | Below Average | Average | Above Average | Top Quartile |
|---|---|---|---|---|
| Time to First Value | >24 months | 12-24 months | 6-12 months | <6 months |
| Data Integration Cost (% of total budget) | >60% | 40-60% | 25-40% | <25% |
| Model Calibration Accuracy (deviation vs. metered data) | >20% | 10-20% | 5-10% | <5% |
| Sensor Coverage Required (% new installation) | >90% | 50-90% | 20-50% | <20% |
| First-Year ROI Achievement Rate | <30% | 30-50% | 50-70% | >70% |
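For self-assessment, the benchmark ranges can be encoded as a simple lookup. The sketch below is illustrative only: the metric keys and function names are hypothetical, and because adjacent ranges in the table share their boundary values (12 months appears in both "12-24" and "6-12"), the code assumes a shared boundary belongs to the better tier.

```python
from typing import Dict, Tuple

TIERS = ("Top Quartile", "Above Average", "Average", "Below Average")

# Thresholds for metrics where a lower value is better, ordered best to worst.
LOWER_IS_BETTER: Dict[str, Tuple[float, float, float]] = {
    "time_to_first_value_months": (6.0, 12.0, 24.0),
    "data_integration_cost_pct": (25.0, 40.0, 60.0),
    "calibration_deviation_pct": (5.0, 10.0, 20.0),
    "new_sensor_install_pct": (20.0, 50.0, 90.0),
}


def classify(metric: str, value: float) -> str:
    """Map a lower-is-better metric value to its benchmark tier."""
    top, above, average = LOWER_IS_BETTER[metric]
    if value < top:
        return TIERS[0]
    if value <= above:    # assumption: shared boundary goes to the better tier
        return TIERS[1]
    if value <= average:
        return TIERS[2]
    return TIERS[3]


def roi_tier(achievement_rate_pct: float) -> str:
    """First-year ROI achievement rate is the one higher-is-better metric."""
    if achievement_rate_pct > 70.0:
        return TIERS[0]
    if achievement_rate_pct >= 50.0:
        return TIERS[1]
    if achievement_rate_pct >= 30.0:
        return TIERS[2]
    return TIERS[3]


# Hypothetical project: 18 months to first value, 4% calibration deviation.
print(classify("time_to_first_value_months", 18))
print(classify("calibration_deviation_pct", 4.0))
```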

What's Working

Industrial Process Optimization

Digital twins have delivered documented, repeatable value in continuous industrial processes. BASF's Verbund digital twin at its Ludwigshafen complex integrates data from 50,000 sensors across interconnected chemical production facilities, enabling real-time optimization of steam networks, heat integration, and feedstock allocation. The system has reduced energy intensity by 8% and cut unplanned downtime by 22% since full deployment in 2023. Similar results have been reported at Unilever's manufacturing facilities, where digital twins of production lines reduced energy consumption by 13% and water usage by 15% across eight pilot sites.

Building Energy Certification and Benchmarking

The EU Energy Performance Certificate (EPC) system increasingly relies on calibrated simulation models that function as simplified digital twins. The Danish Building Research Institute demonstrated that calibrated EPC models using monthly utility data and standardized occupancy schedules predicted annual energy consumption within 8 to 12% of metered values for 85% of assessed buildings. This level of accuracy supports regulatory compliance, transaction due diligence, and renovation planning without requiring comprehensive sensor infrastructure.

Synthetic Data for Climate Risk Modeling

Reinsurance companies including Munich Re and Swiss Re use physics-based synthetic catastrophe models to generate millions of simulated event years for flood, windstorm, and wildfire risk assessment. These synthetic datasets enable portfolio-level risk quantification that would be impossible using historical observations alone, given the rarity of extreme events and the non-stationarity introduced by climate change. Munich Re's NatCatSERVICE integrates synthetic and observed data to price risk across 300,000 individual locations globally.
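To make "simulated event years" concrete, the toy sketch below generates synthetic annual losses from a frequency-severity model. It is not how Munich Re or Swiss Re build their models, which are physics-based; the Poisson event count, the lognormal loss size, the stationary parameters, and every function name here are simplifying assumptions chosen only to show why a large synthetic catalogue lets you estimate rare-event losses that a short historical record cannot.

```python
import math
import random
from typing import List


def poisson_sample(rng: random.Random, rate: float) -> int:
    """Draw a Poisson-distributed count using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_annual_losses(
    n_years: int,
    events_per_year: float,  # assumed mean event frequency
    severity_mu: float,      # assumed lognormal parameters for loss per event
    severity_sigma: float,
    seed: int = 0,
) -> List[float]:
    """Generate one total loss figure per simulated year."""
    rng = random.Random(seed)
    losses = []
    for _ in range(n_years):
        n_events = poisson_sample(rng, events_per_year)
        losses.append(
            sum(rng.lognormvariate(severity_mu, severity_sigma) for _ in range(n_events))
        )
    return losses


def loss_at_return_period(losses: List[float], return_period_years: float) -> float:
    """Empirical annual loss exceeded roughly once every return_period_years."""
    ordered = sorted(losses)
    index = min(len(ordered) - 1, int(len(ordered) * (1.0 - 1.0 / return_period_years)))
    return ordered[index]


catalogue = simulate_annual_losses(10_000, 0.5, 1.0, 1.0, seed=1)
print(loss_at_return_period(catalogue, 10), loss_at_return_period(catalogue, 100))
```

A realistic model would replace the fixed parameters with hazard physics and a climate trend, which is precisely the non-stationarity the paragraph above refers to.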

What's Not Working

Interoperability Across Platforms

The digital twin ecosystem remains fragmented, with proprietary data formats and incompatible APIs preventing integration across building, infrastructure, and energy systems. The Digital Twin Consortium's interoperability framework published in 2024 identified 23 competing data standards for built environment digital twins alone. Sustainability professionals managing portfolios across multiple geographies and asset types frequently find themselves locked into vendor-specific platforms that cannot share data with municipal systems, utility networks, or supply chain partners.

Small and Medium Enterprise Adoption

Digital twin costs and complexity remain prohibitive for small and medium enterprises (SMEs), which represent more than 99% of EU businesses and a significant share of total emissions. Subscription costs for commercial digital twin platforms range from $25,000 to $250,000 annually before accounting for integration and consulting costs. Until standardized, affordable solutions emerge for common SME applications such as warehouse energy management, fleet optimization, and manufacturing process monitoring, the technology's sustainability impact will remain concentrated among large enterprises.

Action Checklist

  • Audit existing digital infrastructure (sensors, BMS, data historians) before evaluating digital twin vendors
  • Define specific, measurable use cases with clear ROI targets rather than pursuing comprehensive digital twin deployments
  • Request independent case studies with verified performance data, not vendor-reported pilot results
  • Budget 40 to 60% of project costs for data integration, calibration, and validation
  • Establish model validation protocols comparing digital twin outputs against physical measurements
  • Evaluate synthetic data augmentation for AI training where real data collection is cost-prohibitive
  • Plan for 12 to 24 month implementation timelines with staged value delivery milestones
  • Assess CSRD and EPBD compliance requirements that digital twins can support

FAQ

Q: What is the minimum data infrastructure needed for a building digital twin?
A: At minimum, a functional building digital twin requires: building-level energy metering at 15-minute intervals, zone-level temperature sensors (one per thermal zone), outdoor weather station data, and occupancy indicators (schedules or sensors). More advanced applications require equipment-level submetering, airflow sensors, and integration with building automation systems. Buildings lacking this baseline typically need $3 to $8 per square foot in infrastructure investment.
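As a rough screening aid, the baseline and the cost range in this answer can be turned into a quick check. The identifiers below are hypothetical labels for the four minimum items, and the cost function simply multiplies floor area by the $3 to $8 per square foot range quoted above; it is a back-of-envelope sketch, not a quotation tool.

```python
from typing import Iterable, List, Tuple

# Hypothetical labels for the four minimum data sources listed in the answer.
MINIMUM_BASELINE = (
    "energy_metering_15min",
    "zone_temperature_sensors",
    "weather_data",
    "occupancy_indicators",
)


def missing_baseline(available: Iterable[str]) -> List[str]:
    """Return the baseline items a building does not yet have."""
    have = set(available)
    return [item for item in MINIMUM_BASELINE if item not in have]


def retrofit_cost_range(
    floor_area_sqft: float,
    low_usd_per_sqft: float = 3.0,
    high_usd_per_sqft: float = 8.0,
) -> Tuple[float, float]:
    """Low and high infrastructure cost estimates in US dollars."""
    if floor_area_sqft <= 0:
        raise ValueError("floor_area_sqft must be positive")
    return floor_area_sqft * low_usd_per_sqft, floor_area_sqft * high_usd_per_sqft


# Hypothetical 50,000 sq ft office with metering and weather data only.
print(missing_baseline({"energy_metering_15min", "weather_data"}))
print(retrofit_cost_range(50_000))  # (150000.0, 400000.0)
```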

Q: How should sustainability professionals evaluate synthetic data quality?
A: Evaluate synthetic data using three criteria: statistical fidelity (synthetic distributions match real data on key metrics), utility (models trained on synthetic data perform comparably to real-data models on holdout test sets), and privacy (synthetic records cannot be traced to real individuals or facilities). Request validation reports showing performance comparisons on standardized benchmarks relevant to your application.
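Each of the three criteria can be approximated with a simple first-pass metric. The sketch below uses assumed stand-ins: a two-sample Kolmogorov-Smirnov statistic for fidelity on a single numeric variable, a relative score gap for utility, and an exact-match rate as a crude privacy proxy. The function names are hypothetical, and a real privacy assessment needs far more than duplicate detection.

```python
from typing import Hashable, Sequence


def ks_statistic(real: Sequence[float], synthetic: Sequence[float]) -> float:
    """Largest gap between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(real), sorted(synthetic)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past x so tied values are handled together.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def utility_gap_pct(real_model_score: float, synthetic_model_score: float) -> float:
    """Relative shortfall of the synthetic-trained model on the same real holdout."""
    return 100.0 * (real_model_score - synthetic_model_score) / real_model_score


def exact_match_rate(real: Sequence[Hashable], synthetic: Sequence[Hashable]) -> float:
    """Share of synthetic records that copy a real record verbatim."""
    real_set = set(real)
    return sum(1 for record in synthetic if record in real_set) / len(synthetic)


# Hypothetical hourly loads in kW and hypothetical F1 scores.
print(ks_statistic([10.0, 12.0, 15.0, 20.0], [11.0, 12.0, 16.0, 19.0]))
print(utility_gap_pct(0.80, 0.60))
print(exact_match_rate([(1, "a"), (2, "b")], [(1, "a"), (3, "c")]))
```

Acceptable thresholds for each metric depend on the application and should be agreed before the vendor's validation report is reviewed, not after.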

Q: Are open-source digital twin platforms viable alternatives to commercial solutions?
A: Open-source options including Eclipse Ditto, FIWARE, and OpenTwin provide foundational capabilities for data integration and visualization at significantly lower licensing costs. However, they require substantially more engineering effort for configuration, calibration, and maintenance. Organizations with strong in-house data engineering teams can achieve 60 to 70% of commercial platform functionality at 20 to 30% of the cost. Organizations without these capabilities should factor consulting costs into total cost comparisons.

Q: What regulatory requirements specifically mandate or reference digital twins?
A: The EU EPBD recast requires digital building logbooks for new construction by 2028. Singapore's Building and Construction Authority mandates BIM (a digital twin precursor) for all public sector projects. Several EU member states are piloting digital twin requirements for infrastructure permitting. The CSRD does not mandate digital twins specifically, but it creates data collection requirements that digital twins can help satisfy when their outputs are validated against physical measurements.

Sources

  • McKinsey & Company. (2024). Digital Twins: From Hype to Value, Lessons from 150 Industrial Deployments. New York: McKinsey Digital.
  • MarketsandMarkets. (2025). Digital Twin Market: Global Forecast to 2030. Pune: MarketsandMarkets Research.
  • National Institute of Standards and Technology. (2024). Digital Twin Framework for Smart Manufacturing, Version 2.0. Gaithersburg, MD: NIST.
  • European Commission. (2024). Energy Performance of Buildings Directive (EPBD) Recast: Final Text. Brussels: European Commission.
  • Nature Machine Intelligence. (2024). "Evaluating Synthetic Data for Environmental Monitoring AI: A Multi-Task Benchmark Study." Nature Machine Intelligence, 6(4), 412-425.
  • Building Research Establishment. (2025). Digital Twin Readiness in UK Commercial Buildings: National Assessment. Watford: BRE.
  • Digital Twin Consortium. (2024). Interoperability Framework for Built Environment Digital Twins. Boston: Digital Twin Consortium.
  • Open Geospatial Consortium. (2025). Urban Digital Twins: Maturity Assessment and Roadmap. Arlington, VA: OGC.
