
Trend watch: Digital twins, simulation & synthetic data in 2026 — signals, winners, and red flags

Signals to watch, potential winners, and red flags for Digital twins, simulation & synthetic data heading into 2026 and beyond.

The global digital twin market reached $17.7 billion in 2025 and is projected to exceed $110 billion by 2030, expanding at a compound annual growth rate above 35%. For executives in energy, infrastructure, and manufacturing, 2026 marks the year digital twins move from isolated pilot projects to enterprise-wide operating systems, fueled by three converging forces: the maturation of physics-informed machine learning, the explosive growth of synthetic data for training AI models, and a new generation of real-time simulation platforms that can model entire cities, supply chains, and power grids at unprecedented fidelity.

Why It Matters

Digital twins are no longer a futuristic concept or a marketing buzzword. They are becoming foundational infrastructure for decision-making in sectors where the cost of physical experimentation is prohibitive, dangerous, or slow. A digital twin of a wind farm can simulate 10,000 maintenance scenarios overnight, identifying failure modes months before they occur. A synthetic data pipeline can generate millions of labeled training images for autonomous vehicle perception systems without a single real-world photograph. A city-scale simulation can test flood resilience strategies across decades of climate projections in hours rather than years.

The economic logic is straightforward. Physical prototyping, real-world testing, and trial-and-error optimization are expensive and time-consuming. McKinsey estimates that digital twins reduce product development cycles by 20% to 50% and cut defect rates by up to 25% in manufacturing. Siemens reports that virtual commissioning of factory automation systems using digital twins reduces on-site commissioning time by 60% to 70%, saving millions in delayed production starts.

The convergence with generative AI and foundation models is accelerating this trend dramatically. NVIDIA's Omniverse platform, which provides the physics simulation backbone for industrial digital twins, now processes over 100 million rendering tasks per month. Google DeepMind's GenCast weather model outperforms traditional numerical weather prediction across 97% of evaluation targets, demonstrating that AI-driven simulation is surpassing physics-only approaches in speed and accuracy. Microsoft's collaboration with Siemens on industrial copilots integrates large language models with digital twin data, enabling natural language queries against complex engineering simulations.

For sustainability specifically, digital twins offer a unique capability: they allow organizations to test decarbonization strategies, resource efficiency improvements, and climate adaptation measures in virtual environments before committing capital. The ability to simulate the carbon impact of 50 building retrofit scenarios or model how a grid will perform with 80% renewable penetration is not merely convenient. It is becoming a competitive necessity.

Signals to Watch

Physics-Informed ML Replaces Pure Data-Driven Models

Traditional machine learning models trained on historical data alone struggle with rare events, regime changes, and out-of-distribution scenarios, precisely the conditions that matter most for climate resilience and infrastructure planning. Physics-informed neural networks (PINNs), which embed physical laws directly into model architectures, are emerging as the standard for high-fidelity digital twins. DeepMind's weather forecasting models incorporate atmospheric physics constraints to deliver 15-day forecasts in under a minute, compared to hours on traditional supercomputers. Ansys, Altair, and Siemens are integrating physics-informed ML into their commercial simulation platforms, enabling engineers without PhD-level expertise to build accurate digital twins. Watch for announcements of physics-informed ML modules becoming default features in major CAD and simulation software suites, signaling mainstream adoption.
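To make the physics-informed loss concrete, here is a deliberately tiny sketch in Python (a toy of our own construction, not any vendor's implementation): a cubic surrogate is fit to three noisy observations of exponential decay while a penalty term enforces the governing ODE dy/dx + y = 0 at dense collocation points. Because both loss terms are quadratic in the coefficients, the toy admits an exact normal-equations solve; a real PINN would use a neural network and automatic differentiation instead.

```python
# Toy physics-informed fit (a sketch, not a production PINN): fit a cubic
# surrogate y(x) = c0 + c1*x + c2*x^2 + c3*x^3 to sparse noisy data while
# penalising violations of the governing ODE  dy/dx + y = 0.
import math
import random

random.seed(0)
DEG = 3          # polynomial degree of the surrogate
LAM = 1.0        # weight of the physics penalty

def f(x):        # features of y(x)
    return [x ** i for i in range(DEG + 1)]

def g(x):        # features of the ODE residual dy/dx + y
    return [1.0 if i == 0 else i * x ** (i - 1) + x ** i
            for i in range(DEG + 1)]

# Sparse, noisy "sensor" data from the true solution y = exp(-x) ...
xs_data = [0.0, 0.4, 0.8]
ys_data = [math.exp(-x) + random.gauss(0.0, 0.01) for x in xs_data]
# ... plus dense collocation points where only the physics is enforced.
xs_phys = [i / 20 for i in range(21)]
w = math.sqrt(LAM / len(xs_phys))

# Stack data rows (f(x) -> target) and physics rows (w * g(x) -> 0).
rows = [(f(x), t) for x, t in zip(xs_data, ys_data)]
rows += [([w * gi for gi in g(x)], 0.0) for x in xs_phys]

# Normal equations M c = v for the combined least-squares problem.
n = DEG + 1
M = [[sum(a[i] * a[j] for a, _ in rows) for j in range(n)] for i in range(n)]
v = [sum(a[i] * t for a, t in rows) for i in range(n)]

def solve(M, v):                 # Gaussian elimination, partial pivoting
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            fac = A[r][col] / A[col][col]
            for k in range(col, n + 1):
                A[r][k] -= fac * A[col][k]
    c = [0.0] * n
    for i in reversed(range(n)):
        c[i] = (A[i][n] - sum(A[i][j] * c[j] for j in range(i + 1, n))) / A[i][i]
    return c

c = solve(M, v)
def y(x):
    return sum(ci * x ** i for i, ci in enumerate(c))

print(f"surrogate y(1.0) = {y(1.0):.3f}, true exp(-1) = {math.exp(-1):.3f}")
```

The takeaway is the loss structure, not the solver: a data-misfit term plus a physics-residual term, with LAM trading off trust in the sensors against trust in the governing equations.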

Synthetic Data Becomes the Default Training Pipeline

Real-world data is expensive, biased, incomplete, and increasingly subject to privacy regulations. Synthetic data, generated through simulation and procedural generation, is filling the gap at scale. Gartner projects that by 2030, synthetic data will completely overshadow real data in AI model training. In 2026, the shift is already visible. NVIDIA's Omniverse Replicator generates physically accurate synthetic images for robotics and autonomous systems training. Datagen creates synthetic human data for computer vision without privacy concerns. Synthesis AI generates synthetic face and body data for healthcare and retail applications. The signal to watch is enterprise procurement: when Fortune 500 companies begin listing synthetic data generation as a line item in their AI budgets rather than a research experiment, the market has crossed the commercialization threshold.
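For intuition on what "generated through simulation" can mean for tabular data, here is a minimal sketch (a plain Gaussian approximation with hypothetical column names, far simpler than what commercial generators actually ship): estimate the mean and covariance of a real table, then sample new rows that preserve those statistics without copying any real record.

```python
# Minimal synthetic-tabular-data sketch: capture first- and second-order
# statistics of a "real" table, then sample fresh rows from a multivariate
# normal built with a hand-rolled Cholesky factorisation.
import math
import random

random.seed(1)

# Tiny illustrative "real" dataset: (age, income_k, energy_use_kwh)
real = [(34, 62, 310), (45, 80, 405), (29, 48, 250),
        (52, 95, 470), (38, 70, 340), (61, 105, 520),
        (27, 44, 230), (49, 88, 450)]

d, n = len(real[0]), len(real)
mean = [sum(r[j] for r in real) / n for j in range(d)]
cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in real) / (n - 1)
        for j in range(d)] for i in range(d)]

def cholesky(a):                 # lower-triangular L with L @ L.T == a
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (math.sqrt(a[i][i] - s) if i == j
                       else (a[i][j] - s) / L[j][j])
    return L

L = cholesky(cov)

def synth_row():                 # mean + L @ z for standard-normal z
    z = [random.gauss(0.0, 1.0) for _ in range(d)]
    return [mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(d)]

synthetic = [synth_row() for _ in range(1000)]
syn_mean = [sum(r[j] for r in synthetic) / len(synthetic) for j in range(d)]
print("real mean:     ", [round(m, 1) for m in mean])
print("synthetic mean:", [round(m, 1) for m in syn_mean])
```

Real generators go far beyond this (copulas, GANs, diffusion models, differential privacy guarantees), but the core contract is the same: reproduce the statistical structure of the source table without exposing any original record.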

City-Scale and Grid-Scale Twins Move to Production

Individual asset twins (a single turbine, a single building) are well established. The frontier in 2026 is system-of-systems digital twins that model interactions across entire urban environments or energy networks. Singapore's Virtual Singapore platform, developed with Dassault Systemes, models the entire city-state for urban planning, emergency response, and sustainability analysis. In the energy sector, GE Vernova's GridOS platform creates digital twins of transmission and distribution networks, enabling utilities to simulate the impact of distributed energy resource integration before physical deployment. The UK's National Digital Twin Programme is developing interoperability standards that allow twins from different vendors to communicate, a prerequisite for system-scale deployment. Track the number of utilities and municipalities publishing digital twin procurement RFPs as the leading indicator.

Generative AI Accelerates Twin Creation

Building a digital twin traditionally required months of manual modeling, sensor integration, and calibration. Generative AI is compressing this timeline dramatically. Autodesk's Forma platform uses AI to generate building design options and energy performance simulations from natural language descriptions. NVIDIA's Earth-2 initiative applies generative AI to create high-resolution climate simulations at the city block level. Companies like Matterport use AI to convert 3D scans of buildings into digital twins in hours rather than weeks. The signal is the democratization of twin creation: when mid-market companies with limited engineering staff can deploy useful digital twins, the addressable market expands by an order of magnitude.

Winners and Red Flags

Winners

Platform providers with physics simulation engines and AI integration are positioned to dominate. NVIDIA's Omniverse, Siemens Xcelerator, and Dassault Systemes' 3DEXPERIENCE combine physics-based simulation with AI capabilities and cloud-scale compute, creating sticky enterprise relationships. These companies benefit from network effects as more digital twins on their platforms generate more data to improve their AI models.

Synthetic data companies targeting regulated industries will capture premium margins. Healthcare, autonomous vehicles, defense, and financial services require massive training datasets but face strict privacy and safety regulations that limit real-world data collection. Companies like Synthesis AI, Rendered.ai, and MOSTLY AI are building specialized synthetic data pipelines that meet regulatory requirements while reducing data acquisition costs by 90% or more.

Engineering services firms that can bridge the IT-OT gap are essential for enterprise digital twin deployments. Accenture, Cognizant, and specialized firms like Bentley Systems' engineering consultancy understand both the information technology infrastructure and the operational technology (sensors, PLCs, SCADA systems) that digital twins must integrate. This domain expertise is difficult to replicate and commands premium billing rates.

Red Flags

Digital twin vendors without a clear data ingestion and interoperability strategy risk building isolated systems that cannot communicate with the broader enterprise technology stack. Proprietary data formats, closed APIs, and vendor lock-in are common in the current market. Executives should demand compliance with emerging standards such as the Digital Twin Consortium's interoperability framework and the Asset Administration Shell specification from the Industrial Digital Twin Association.

Organizations pursuing digital twins without adequate sensor infrastructure are building simulations on foundations of sand. A digital twin is only as accurate as its real-time data feeds. Companies that invest millions in simulation software while neglecting IoT sensor deployment, data quality governance, and network connectivity will produce digital twins that diverge from physical reality within months.
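A minimal version of the synchronization check this implies might look like the following sketch (our illustrative rule of thumb, not an industry standard): compare each incoming sensor reading against the twin's prediction and alert when the rolling error drifts past a calibrated threshold.

```python
# Sketch of a twin-vs-reality divergence check: flag the twin as stale when
# the rolling mean absolute error between its predictions and the measured
# sensor values exceeds a calibration threshold.
from collections import deque

def drift_monitor(pairs, window=10, threshold=2.0):
    """Yield (step, rolling_mae, drifting) for (predicted, measured) pairs."""
    residuals = deque(maxlen=window)
    for step, (pred, meas) in enumerate(pairs):
        residuals.append(abs(pred - meas))
        mae = sum(residuals) / len(residuals)
        yield step, mae, mae > threshold

# Simulated feed: the twin tracks well for 20 steps, then the physical asset
# degrades (say, a drifting sensor or worn component) and diverges steadily.
feed = [(100.0, 100.5) for _ in range(20)]
feed += [(100.0, 100.0 + 0.5 * k) for k in range(1, 21)]

alerts = [step for step, mae, drifting in drift_monitor(feed) if drifting]
print("first drift alert at step:", alerts[0] if alerts else None)
```

The window and threshold are the governance decision: too loose, and the twin quietly diverges for months; too tight, and operators learn to ignore the alerts.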

Synthetic data startups without domain-specific validation frameworks face credibility challenges. Generating photorealistic images or plausible tabular data is insufficient if the synthetic distributions do not accurately represent real-world edge cases. The gap between "looks realistic" and "trains models that perform reliably in production" is significant, and companies without rigorous validation methodologies will lose enterprise contracts as buyers become more sophisticated.

Sector-Specific KPI Benchmarks

Sector | KPI | Laggard | Average | Leader | Notes
Manufacturing | Virtual commissioning time savings | <20% | 40-50% | >65% | Siemens reporting 60-70% at scale
Energy (Grid) | Outage prediction accuracy | <60% | 75-82% | >90% | Requires real-time sensor feeds
Construction | Design iteration cycle reduction | <15% | 30-40% | >55% | BIM-integrated twins outperform
Autonomous Systems | Synthetic vs. real data ratio in training | <30% | 50-70% | >90% | Leaders using synthetic-first pipelines
Smart Cities | Infrastructure asset coverage (%) | <10% | 25-40% | >75% | Singapore and Helsinki leading
Healthcare | Synthetic patient data regulatory approval rate | <30% | 50-60% | >80% | FDA guidance evolving rapidly

What's Working

Virtual commissioning is delivering measurable ROI in manufacturing. Siemens reported that its Tecnomatix platform reduced physical commissioning time by 60% to 70% for Maserati's Grecale production line, saving approximately 12 weeks of factory downtime. BMW uses digital twins of its entire Regensburg plant to simulate production changes, reducing new model integration time by 30%. These are not pilot results. They are production deployments at scale, generating hundreds of millions of dollars in avoided costs.

Weather and climate digital twins are outperforming legacy systems. The European Centre for Medium-Range Weather Forecasts (ECMWF) launched its Destination Earth initiative to build a full-resolution digital twin of the Earth's climate system. Google DeepMind's GenCast model achieves superior forecast accuracy to the ECMWF's operational system while running 1,000 times faster. The UK Met Office, the U.S. National Oceanic and Atmospheric Administration (NOAA), and several national meteorological agencies are integrating AI-driven simulation alongside traditional numerical weather prediction, creating ensemble approaches that improve forecast reliability.

Synthetic data is accelerating autonomous vehicle development. Waymo generates over 20 million simulated driving miles daily, compared to approximately 20,000 real-world miles. This 1,000-to-1 ratio allows Waymo to test rare and dangerous scenarios, such as pedestrians emerging from behind vehicles at night, that would be impractical or unethical to stage in reality. Cruise, Aurora, and Motional use similar synthetic-first approaches to validate perception and planning systems before road testing.

What Isn't Working

Interoperability between digital twin platforms remains fragmented. Despite standards efforts from the Digital Twin Consortium, the Industrial Digital Twin Association, and ISO TC 184, most enterprise digital twins are siloed within single vendor ecosystems. A Siemens twin of a factory floor cannot easily communicate with a Bentley Systems twin of the surrounding infrastructure or an ESRI twin of the local environment. This fragmentation prevents the system-of-systems integration that would unlock the greatest value and forces enterprises into costly custom integration projects.

Data quality and sensor coverage gaps undermine twin accuracy. A 2025 survey by ARC Advisory Group found that 43% of industrial digital twin projects failed to achieve their stated ROI targets, with poor data quality cited as the primary cause. Many facilities lack the sensor density, network bandwidth, or data governance practices to maintain real-time synchronization between physical assets and their virtual counterparts. The problem is particularly acute in brownfield industrial sites where retrofitting sensors onto legacy equipment is expensive and disruptive.

Synthetic data validation remains immature. While synthetic data generation technology has advanced rapidly, methods for validating that synthetic datasets produce models that generalize to real-world conditions are still developing. Several autonomous vehicle companies have reported performance gaps between models trained on synthetic data and models trained on real-world data, particularly for rare edge cases involving unusual lighting, weather, or road surface conditions. The field lacks standardized benchmarks and certification frameworks that would give enterprise buyers confidence in synthetic data quality.
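One building block such benchmarks would need to standardize is a distribution-distance check. The sketch below hand-rolls the classical two-sample Kolmogorov-Smirnov statistic (available off the shelf in libraries such as SciPy) to flag a synthetic marginal that has drifted from the real one; production validation frameworks layer per-slice checks, correlation tests, and downstream-task evaluation on top of this.

```python
# Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
# empirical CDFs of a real sample and a synthetic sample.  A large D means
# the synthetic marginal distribution has drifted from the real one.
import random

def ks_statistic(a, b):
    """Max |ECDF_a(x) - ECDF_b(x)| over all observed x."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:   # advance past ties on both sides
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d

random.seed(2)
real     = [random.gauss(0.0, 1.0) for _ in range(2000)]
good_syn = [random.gauss(0.0, 1.0) for _ in range(2000)]   # matches real
bad_syn  = [random.gauss(0.8, 1.0) for _ in range(2000)]   # mean-shifted

print("D(real, good synthetic) =", round(ks_statistic(real, good_syn), 3))
print("D(real, bad synthetic)  =", round(ks_statistic(real, bad_syn), 3))
```

Passing a marginal test like this is necessary but nowhere near sufficient: it says nothing about joint structure, rare edge cases, or whether models trained on the synthetic data actually perform in production.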

Key Players

Established Leaders

  • NVIDIA dominates the simulation infrastructure layer through Omniverse, which provides physics-based rendering, AI training, and digital twin hosting. Its Earth-2 climate simulation initiative extends the platform to environmental applications.
  • Siemens offers the most comprehensive industrial digital twin portfolio through Xcelerator, spanning product design, factory simulation, building management, and grid modeling. Its strategic investment in Bentley Systems strengthened its infrastructure twin capabilities.
  • Dassault Systemes powers virtual twins across aerospace, automotive, life sciences, and urban planning through its 3DEXPERIENCE platform. The Virtual Singapore project is its flagship city-scale deployment.
  • Microsoft Azure Digital Twins provides cloud infrastructure and integration with industrial IoT, coupled with Copilot AI capabilities through its partnership with Siemens.

Emerging Challengers

  • Cosmo Tech develops decision-intelligence digital twins that combine simulation, optimization, and AI for supply chain and energy system planning.
  • MOSTLY AI leads in privacy-compliant synthetic tabular data generation for financial services, healthcare, and telecommunications.
  • Rendered.ai provides a synthetic data pipeline platform specifically for computer vision and geospatial applications, used by defense and intelligence agencies.
  • Invicara offers digital twin middleware that aggregates data from multiple BIM, IoT, and enterprise systems into unified building operations twins.

Key Investors and Funders

  • European Commission allocated over 315 million euros to the Destination Earth initiative for building a high-fidelity digital twin of Earth's climate system.
  • SoftBank Vision Fund, Andreessen Horowitz, and Tiger Global have collectively invested over $2 billion in digital twin and synthetic data startups since 2022.
  • U.S. Department of Energy funds digital twin research for grid modernization and advanced manufacturing through the Advanced Research Projects Agency-Energy (ARPA-E) and national laboratory programs.

Action Checklist

  • Identify three to five high-value use cases where digital twins could reduce physical testing, improve asset utilization, or accelerate decision-making, prioritizing applications where simulation accuracy can be validated against known outcomes
  • Audit existing sensor infrastructure and data quality practices to determine whether current IoT deployments can support real-time digital twin synchronization at the required fidelity
  • Evaluate platform options from NVIDIA, Siemens, Dassault, and Microsoft against interoperability requirements, ensuring any selected platform supports open standards and API-based integration with existing enterprise systems
  • Launch a synthetic data pilot for one AI training use case, comparing model performance between synthetic-only, real-only, and blended training datasets to establish baseline quality metrics
  • Establish a data governance framework that defines ownership, access controls, versioning, and quality standards for digital twin data streams before scaling beyond pilot deployments
  • Engage with the Digital Twin Consortium or equivalent industry body to stay current on interoperability standards and best practices, avoiding vendor lock-in through standards-compliant architecture decisions
  • Build internal capability by training engineering and operations teams on digital twin platforms, recognizing that technology adoption without workforce readiness consistently leads to underperforming deployments
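The synthetic data pilot in the checklist above can be sketched as a three-way harness (toy data and a toy nearest-centroid model here; substitute your own datasets and estimator): train identical models on synthetic-only, real-only, and blended data, then score all three on a held-out slice of real data.

```python
# Three-way pilot harness: same model, three training regimes, one real-world
# holdout.  The harness shape is the point, not the toy numbers.
import random

random.seed(3)

def make_real(n):        # two well-separated 2-D Gaussian classes
    out = []
    for _ in range(n):
        label = random.random() < 0.5
        mu = (2.0, 2.0) if label else (-2.0, -2.0)
        out.append(((random.gauss(mu[0], 1.0), random.gauss(mu[1], 1.0)),
                    int(label)))
    return out

def make_synthetic(n):   # imperfect generator: biased means, inflated noise
    out = []
    for _ in range(n):
        label = random.random() < 0.5
        mu = (2.5, 1.5) if label else (-1.5, -2.5)
        out.append(((random.gauss(mu[0], 1.4), random.gauss(mu[1], 1.4)),
                    int(label)))
    return out

def train_centroids(data):           # nearest-centroid "model"
    cent = {}
    for lbl in (0, 1):
        pts = [x for x, l in data if l == lbl]
        cent[lbl] = tuple(sum(p[i] for p in pts) / len(pts) for i in (0, 1))
    return cent

def accuracy(cent, data):
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    hits = sum(1 for x, l in data
               if min(cent, key=lambda k: dist2(x, cent[k])) == l)
    return hits / len(data)

real_train, real_test = make_real(200), make_real(1000)
syn_train = make_synthetic(200)

for name, train in [("real-only", real_train),
                    ("synthetic-only", syn_train),
                    ("blended", real_train + syn_train)]:
    print(f"{name:15s} accuracy on real holdout: "
          f"{accuracy(train_centroids(train), real_test):.3f}")
```

Whatever the absolute scores, the gap between the synthetic-only and real-only rows is the baseline quality metric the checklist calls for: it tells you how much fidelity the generator is losing before you bet production systems on it.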

FAQ

Q: How much does it cost to deploy an enterprise digital twin? A: Costs vary enormously by scope and complexity. A single-asset twin (one turbine or one building) can cost $50,000 to $500,000 including sensors, software, and integration. A factory-scale or campus-scale twin typically runs $2 million to $15 million. City-scale and grid-scale twins require $50 million to $300 million or more. The critical metric is not implementation cost but return on investment: leaders report 5x to 15x ROI over five years through reduced downtime, optimized maintenance, and accelerated design cycles.

Q: Is synthetic data reliable enough for production AI systems? A: For many applications, yes. Waymo, Tesla, and major autonomous vehicle developers rely on synthetic data for the majority of their perception system training. Healthcare and financial services companies use synthetic tabular data that passes statistical fidelity tests. However, validation remains essential. Organizations should run systematic domain adaptation testing, comparing model performance on synthetic versus real-world evaluation sets, before deploying synthetic-trained models in production environments.

Q: Which industries benefit most from digital twins in 2026? A: Manufacturing, energy and utilities, aerospace, and construction are the most mature adopters. Healthcare (patient-specific treatment simulation), agriculture (crop and soil modeling), and smart cities (urban planning and emergency response) are the fastest-growing segments. Any industry with expensive physical assets, complex operational processes, or high consequences of failure stands to benefit significantly.

Q: How do digital twins contribute to sustainability goals? A: Digital twins enable organizations to simulate and optimize energy consumption, material usage, and emissions across operations before committing to physical changes. Building twins can model retrofit scenarios to identify the highest-impact energy efficiency improvements. Grid twins can simulate renewable integration strategies to minimize curtailment. Supply chain twins can identify emission hotspots and test alternative logistics configurations. The ability to run thousands of "what if" scenarios virtually, rather than through costly physical pilots, accelerates decarbonization by compressing decision cycles from months to days.


