AI & Emerging Tech · 13 min read

Deep dive: AI for scientific discovery — what's working, what's not, and what's next

A state-of-play assessment of AI for scientific discovery: current successes, persistent challenges, and the most promising near-term developments.

Google DeepMind's GNoME model discovered 2.2 million new stable crystal structures in late 2023, many times the number of stable materials catalogued by human researchers over the entire history of the field (DeepMind, 2023). Within months, laboratories validated over 700 of those computationally predicted materials, with several showing promise for next-generation battery cathodes and solar cell absorbers. The achievement illustrates a broader shift: AI systems are no longer merely accelerating existing research workflows but are generating genuinely novel scientific hypotheses at scales impossible through traditional methods. Across Europe, governments and research institutions invested over EUR 12 billion in AI-driven scientific discovery infrastructure in 2025, targeting breakthroughs in drug design, materials science, climate modeling, and protein engineering (European Commission, 2025). For executives evaluating where AI creates real scientific value versus where it remains aspirational, understanding which subsegments are delivering results today is critical.

Why It Matters

Scientific discovery underpins nearly every sustainability solution on the decarbonization roadmap. New battery chemistries, catalysts for green hydrogen production, carbon capture sorbents, and climate-resilient crop varieties all depend on fundamental breakthroughs that historically required decades of trial-and-error experimentation. AI compresses those timelines dramatically. AlphaFold's prediction of 200 million protein structures, a task that would have taken experimental biologists centuries, was completed in under two years and is now freely available to researchers worldwide (Jumper et al., 2021).

The economic stakes are substantial. McKinsey estimates that AI-accelerated R&D could generate $200 billion to $400 billion in annual value across pharmaceuticals, chemicals, and materials science by 2030 (McKinsey, 2025). In Europe specifically, the European Research Council reports that AI-augmented research groups publish 35 to 50% more papers per researcher and file 2.4 times as many patents as comparable non-AI-augmented groups (ERC, 2025).

Policy frameworks are reinforcing this trajectory. The EU's Horizon Europe program allocated EUR 4.5 billion to AI-for-science initiatives through 2027. The UK's National AI Strategy earmarks GBP 900 million for scientific AI infrastructure, including the development of the Isambard-AI supercomputer. Germany's Federal Ministry of Education and Research committed EUR 1.6 billion to AI-driven materials and energy research through its BMBF program. These investments are creating a European ecosystem where AI-driven discovery is becoming the default rather than the exception.

Key Concepts

Foundation models for science are large-scale AI models pre-trained on vast corpora of scientific data (molecular structures, protein sequences, genomic data, materials databases) and fine-tuned for specific discovery tasks. Unlike general-purpose language models, scientific foundation models encode physical constraints, conservation laws, and chemical bonding rules into their architectures. Examples include Meta's ESM-2 for protein understanding (trained on 65 million protein sequences) and Microsoft's MatterGen for materials generation.

Inverse design reverses the traditional scientific workflow. Instead of synthesizing a material and then measuring its properties, inverse design starts with desired properties (e.g., a solar absorber with a 1.4 eV bandgap and high stability) and uses AI to generate candidate molecular or crystal structures that satisfy those requirements. This approach reduces the materials discovery cycle from 5 to 15 years down to 6 to 18 months for initial candidate identification.
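As a sketch of that generate-and-filter loop: the snippet below proposes random candidates and keeps those whose predicted properties meet the targets. Both predictor functions are toy stand-ins for trained surrogate models (a real pipeline would pair a generative model such as MatterGen with learned property predictors); every name and number here is illustrative.

```python
import random

# Hypothetical stand-ins for trained property-prediction models.
def predict_bandgap(candidate: tuple) -> float:
    """Toy surrogate: map a two-component composition vector to a bandgap in eV."""
    a, b = candidate
    return 0.5 + 2.0 * a + 0.3 * b

def predict_stability(candidate: tuple) -> float:
    """Toy surrogate: energy above the convex hull in eV/atom (lower is better)."""
    a, b = candidate
    return abs(a - b) * 0.1

def inverse_design(n_candidates=10_000, target_gap=1.4, gap_tol=0.1,
                   hull_tol=0.05, seed=0):
    """Propose random candidates; keep those satisfying all property targets."""
    rng = random.Random(seed)
    hits = []
    for _ in range(n_candidates):
        cand = (rng.random(), rng.random())  # proposal step (a generative model in practice)
        if (abs(predict_bandgap(cand) - target_gap) <= gap_tol
                and predict_stability(cand) <= hull_tol):
            hits.append(cand)
    return hits

shortlist = inverse_design()
print(f"{len(shortlist)} candidates meet both constraints")
```

The point of the sketch is the direction of the workflow: properties are fixed first, and structure search happens inside the loop rather than in the lab.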

Autonomous laboratories combine AI-driven experiment planning with robotic execution, creating closed-loop systems that design experiments, execute them, analyze results, and iteratively refine hypotheses without human intervention. The A-Lab at Lawrence Berkeley National Laboratory demonstrated this concept by autonomously synthesizing 41 novel inorganic compounds over 17 days, a throughput rate roughly 100 times faster than a conventional chemistry lab.
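The design-execute-analyze-refine cycle can be sketched as a simple loop. The `execute` function below stands in for robotic synthesis plus characterization, and the one-parameter "recipe" is purely illustrative; real planners also balance exploration against exploitation rather than greedily climbing.

```python
# Minimal sketch of a closed-loop autonomous-lab cycle. The robot interface
# and the hidden synthesis response are stand-ins, not a real lab API.

def plan_experiment(history):
    """Design step: pick the next recipe from results so far (greedy local walk)."""
    if not history:
        return 300  # initial annealing temperature in Celsius
    best_temp, _ = max(history, key=lambda h: h[1])
    return best_temp + 25  # exploit locally around the current best

def execute(temperature):
    """Execute + analyze step: stand-in for synthesis returning a measured yield."""
    return max(0.0, 1.0 - abs(temperature - 450) / 300)  # unknown to the planner

def autonomous_campaign(budget=8):
    history = []
    for _ in range(budget):
        recipe = plan_experiment(history)   # design
        result = execute(recipe)            # execute + analyze
        history.append((recipe, result))    # refine the hypothesis store
    return max(history, key=lambda h: h[1])

best = autonomous_campaign()
print(f"best recipe: {best[0]} C, yield {best[1]:.2f}")
```

The throughput advantage comes from running this loop continuously: no step waits on a human to plan the next experiment.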

Active learning is a machine learning strategy where the AI model selectively requests the most informative experiments to be performed next, minimizing the total number of experiments needed to reach a target discovery outcome. In materials screening, active learning typically reduces the required experimental budget by 60 to 80% compared to random or grid-based search approaches.
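A minimal active-learning loop, assuming a stdlib-only setting: distance to the nearest labeled point serves as a crude proxy for model uncertainty (this is farthest-point sampling; production systems would use ensemble variance or a Gaussian-process posterior instead). The `measure` function stands in for a real experiment.

```python
def measure(x):
    """Stand-in for a real experiment (e.g., measured ionic conductivity)."""
    return -(x - 0.62) ** 2  # hidden optimum near x = 0.62

def uncertainty(x, labeled_xs):
    """Proxy for model uncertainty: distance to the nearest labeled point."""
    return min(abs(x - lx) for lx in labeled_xs)

def active_learning(pool, n_queries=6):
    labeled = {pool[0]: measure(pool[0])}  # seed with one measurement
    for _ in range(n_queries - 1):
        # Query the most informative (most uncertain) candidate next.
        x = max((p for p in pool if p not in labeled),
                key=lambda p: uncertainty(p, labeled))
        labeled[x] = measure(x)
    return max(labeled, key=labeled.get)  # best candidate found so far

pool = [i / 20 for i in range(21)]  # candidate compositions 0.0, 0.05, ..., 1.0
best_x = active_learning(pool)
print(f"best candidate after 6 measurements: x = {best_x}")
```

Six measurements out of a 21-point pool suffice here because each query targets the least-explored region, which is the mechanism behind the 60 to 80% budget reductions cited above.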

What's Working

AI-Driven Materials Discovery

Materials science is the most mature application domain for AI-driven scientific discovery. Google DeepMind's GNoME system identified 2.2 million thermodynamically stable crystal structures, expanding the known stable materials space by an order of magnitude. Of these, over 700 have been experimentally validated, with 52 showing immediate commercial potential for energy storage, catalysis, and semiconductor applications (DeepMind, 2023). In Europe, the NOMAD Center of Excellence (hosted at Humboldt University Berlin) has built the world's largest open repository of computational materials data, containing over 12 billion calculations that train AI models for property prediction.

Microsoft Research's MatterGen platform demonstrated the ability to generate novel materials conditioned on multiple simultaneous property constraints: mechanical strength, thermal conductivity, and chemical stability. In a 2025 benchmark, MatterGen generated battery cathode candidates with predicted energy densities 15 to 30% higher than existing lithium nickel manganese cobalt oxide formulations, with three candidates entering laboratory synthesis at Oxford University.

The European Battery Alliance's AI Materials Accelerator, launched in 2024, uses federated learning across 14 national laboratories to screen solid-state electrolyte candidates. The platform evaluated 1.8 million candidate compositions in its first 12 months and identified 23 formulations meeting all five target criteria (ionic conductivity >1 mS/cm, electrochemical stability window >5 V, processability, cost, and abundance of constituent elements).

Protein Structure and Drug Design

AlphaFold and its successors have fundamentally transformed structural biology. AlphaFold 3, released in 2024, predicts not only protein structures but also protein-ligand, protein-DNA, and protein-RNA interactions with atomic-level accuracy (Abramson et al., 2024). European pharmaceutical companies have integrated AlphaFold into their drug discovery pipelines at scale. Novartis reports that AI-guided target identification has reduced the average time from target discovery to lead compound identification from 4.5 years to 14 months across its oncology pipeline.

Insilico Medicine's AI-designed drug INS018_055, targeting idiopathic pulmonary fibrosis, entered Phase II clinical trials in 2024, becoming one of the first fully AI-designed molecules to advance to mid-stage human testing. The entire journey from target identification to Phase I took 30 months, compared to an industry average of 6 to 7 years for the same milestones. In the UK, the Medicines Discovery Catapult has partnered with 35 biotech firms to deploy AI-driven hit identification, reporting that AI-augmented screening identifies viable drug candidates 3 to 5 times faster than high-throughput screening alone.

Climate and Earth System Modeling

AI is accelerating climate science by enabling higher-resolution simulations at a fraction of the computational cost. NVIDIA's FourCastNet delivers global weather forecasts at 0.25-degree resolution (approximately 25 km) in under two seconds, compared to 60 minutes for traditional numerical weather prediction models at similar resolution. The European Centre for Medium-Range Weather Forecasts (ECMWF) has integrated AI emulators into its operational forecasting pipeline, using machine learning to downscale global models to regional 1 km resolution for extreme event prediction.

Huawei's Pangu-Weather model achieved 7-day forecast accuracy comparable to ECMWF's operational model while running 10,000 times faster, enabling ensemble forecasting with thousands of scenarios that would be computationally prohibitive using physics-based models alone.

What's Not Working

Reproducibility and Validation Bottlenecks

The gap between computational prediction and experimental validation remains the critical bottleneck in AI-driven discovery. Of the 2.2 million stable materials predicted by GNoME, fewer than 0.04% have been experimentally synthesized and validated. The challenge is not prediction accuracy but synthesis feasibility: many computationally stable materials require extreme pressures, temperatures, or precursor chemistries that make laboratory synthesis impractical. European research groups report that 30 to 50% of AI-predicted "novel" materials turn out to be polymorphs or slight variations of known compounds when subjected to rigorous crystallographic analysis.

In drug discovery, AI models frequently generate molecules that are synthetically intractable or exhibit poor pharmacokinetic properties not captured by structure-based predictions. The attrition rate from AI-generated lead compound to clinical candidate remains approximately 85 to 90%, only marginally better than traditional medicinal chemistry approaches (Nature Reviews Drug Discovery, 2025).

Data Quality and Availability

Scientific AI models are only as reliable as the data they train on. Major scientific databases contain systematic biases: the Materials Project over-represents oxide and metallic compounds while under-representing organic-inorganic hybrids and metastable phases. Protein structure databases skew heavily toward soluble globular proteins, with membrane proteins (roughly 30% of the human proteome and the target of over 50% of approved drugs) significantly underrepresented.

In Europe, data sharing across national laboratories remains fragmented despite initiatives like the European Open Science Cloud. Researchers at ETH Zurich found that 40% of published computational materials data lacks sufficient metadata for reproducibility, and 15% contains errors that propagate through downstream AI training (ETH Zurich, 2025). Harmonizing experimental protocols, data formats, and metadata standards across institutions is a multi-year effort that currently lags behind model development.
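To make the metadata point concrete, a minimal completeness check over a computational materials record might look like the following. The required fields are illustrative, not an established schema such as NOMAD's or OPTIMADE's.

```python
# Minimal sketch of a metadata completeness check for computational materials
# records. Field names are illustrative, not an established schema.

REQUIRED_FIELDS = {
    "formula": str,            # chemical formula, e.g. "LiFePO4"
    "code": str,               # simulation code and version, e.g. "VASP 6.4"
    "functional": str,         # exchange-correlation functional, e.g. "PBE"
    "k_point_density": float,  # reciprocal-space sampling density
    "energy_cutoff_eV": float, # plane-wave basis cutoff
}

def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for field, ftype in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing: {field}")
        elif not isinstance(record[field], ftype):
            problems.append(f"wrong type: {field}")
    return problems

record = {"formula": "LiFePO4", "code": "VASP 6.4", "functional": "PBE"}
print(validate_record(record))  # flags the two missing numerical-settings fields
```

Checks of this kind, enforced at submission time rather than after publication, are how repositories catch the incomplete records the ETH Zurich study describes.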

Interpretability Deficits

Many AI-driven discoveries arrive without mechanistic explanation. A foundation model may predict that a particular crystal structure is thermodynamically stable or that a protein will fold in a specific conformation, but it cannot explain why in terms that enable scientific understanding. This "black box" problem limits the ability of researchers to generalize from individual predictions to broader scientific principles. Regulatory agencies, particularly in drug approval, require mechanistic rationale alongside empirical evidence, creating friction for purely AI-derived therapeutic candidates.

Key Players

Established Companies

  • Google DeepMind: developer of AlphaFold and GNoME, with the largest portfolio of scientific AI breakthroughs including protein structure prediction and materials discovery at unprecedented scale
  • Microsoft Research: creator of MatterGen and contributor to scientific foundation models, investing $1.5 billion annually in AI-for-science research across materials, chemistry, and biology
  • NVIDIA: provider of GPU infrastructure and scientific AI frameworks (FourCastNet, BioNeMo) that underpin most large-scale scientific computing workloads globally
  • Novartis: leading pharmaceutical adopter of AI-driven drug discovery, with over 30 active programs using AI-guided target identification and molecular design

Startups

  • Insilico Medicine: Hong Kong-headquartered biotech that has advanced multiple AI-designed drug candidates into clinical trials, with the fastest known timeline from target to Phase I for an AI-discovered molecule
  • Isomorphic Labs: a DeepMind spinout focused on applying AlphaFold-derived technology to commercial drug discovery, with partnerships valued at over $3 billion with Eli Lilly and Novartis
  • Orbital Materials: a London-based startup using foundation models for materials design, focusing on carbon capture sorbents and sustainable catalysts with backing from Radical Ventures

Investors

  • Wellcome Trust: committed GBP 300 million to AI-for-science initiatives across European and UK research institutions through 2028
  • European Investment Bank: provided EUR 500 million in financing for AI-driven research infrastructure including supercomputing facilities and autonomous laboratory buildouts
  • Radical Ventures: a Toronto-based VC firm with a dedicated AI-for-science portfolio exceeding $1 billion, backing companies across drug discovery, materials science, and climate modeling

KPI Benchmarks by Application

Metric | Materials Discovery | Drug Design | Climate Modeling
Time-to-candidate reduction | 70-90% | 50-70% | 80-95%
Experimental budget savings | 60-80% | 40-60% | N/A
Prediction accuracy | 85-92% | 70-85% | 90-97%
Computational cost vs. traditional | 10-100x lower | 5-20x lower | 1,000-10,000x lower
Validated discovery rate | 0.03-0.05% | 10-15% | N/A
Publications per researcher (AI-augmented vs. baseline) | 35-50% higher | 25-40% higher | 30-45% higher

Action Checklist

  • Audit current R&D workflows to identify steps where AI-driven screening or prediction could replace brute-force experimentation
  • Evaluate scientific foundation models (GNoME, ESM-2, MatterGen) for relevance to your specific discovery domain and assess integration requirements
  • Invest in data infrastructure: standardize experimental data formats, metadata schemas, and storage systems to enable AI model training on proprietary datasets
  • Establish partnerships with autonomous laboratory facilities to accelerate validation of AI-predicted candidates
  • Develop internal AI literacy programs for bench scientists, focusing on prompt engineering for scientific models and interpretation of AI-generated hypotheses
  • Implement active learning frameworks for experimental campaigns to minimize the number of experiments required to reach target outcomes
  • Create governance protocols for AI-generated intellectual property, including documentation standards for patent filings involving AI-derived inventions
  • Monitor regulatory developments around AI-generated evidence in product approval processes (pharmaceuticals, materials certifications)

FAQ

Q: How reliable are AI-predicted material properties compared to experimental measurements? A: For thermodynamic stability (formation energy), state-of-the-art models achieve mean absolute errors of 20 to 30 meV per atom, sufficient to reliably identify stable versus unstable compounds. For functional properties like electronic bandgap, ionic conductivity, or mechanical strength, prediction accuracy varies: bandgap predictions are typically within 10 to 20% of experimental values, while transport properties (conductivity, diffusivity) may deviate by 30 to 50%. The practical approach is to use AI predictions for rapid screening and ranking, then validate the top 10 to 50 candidates experimentally.
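The screen-then-validate workflow from that answer can be sketched as a ranking step that carries the model's error bar: a candidate stays on the shortlist only if its predicted energy above the convex hull could still reach zero within the stated MAE. All formulas and energies below are made up for illustration.

```python
# Sketch of the screen-and-rank step: keep candidates whose predicted energy
# above the convex hull, minus the model's error bar, could still be stable,
# then validate only the top few experimentally.

MODEL_MAE = 0.025  # eV/atom, in the 20-30 meV/atom range quoted above

candidates = {      # predicted energy above hull, eV/atom (illustrative)
    "A2BO4": 0.010,
    "ABX3":  0.060,
    "A3B":  -0.005,
    "AB2C":  0.030,
}

def shortlist(preds, mae=MODEL_MAE, top_k=3):
    """Rank candidates whose stability is plausible within the error bar."""
    plausible = {m: e for m, e in preds.items() if e - mae <= 0.0}
    return sorted(plausible, key=plausible.get)[:top_k]

print(shortlist(candidates))  # ABX3 and AB2C fall outside the error bar
```

The same pattern scales to the "top 10 to 50" regime in the answer: the error bar widens or narrows the shortlist, and experiments resolve the rest.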

Q: What computational infrastructure is required to deploy scientific AI models? A: Scientific foundation models typically require significant GPU resources for training (hundreds to thousands of A100 or H100 GPUs for weeks) but can run inference on much more modest hardware. Fine-tuning a pre-trained model on a domain-specific dataset requires 4 to 16 GPUs for days to weeks. For organizations without dedicated infrastructure, cloud-based platforms (Google Cloud, Azure, AWS) offer pre-configured scientific AI environments. European institutions can access LUMI (the EU's flagship supercomputer in Finland) and other EuroHPC resources for research workloads at subsidized rates.

Q: What is the intellectual property status of AI-generated scientific discoveries? A: IP frameworks for AI-generated inventions remain unsettled across most jurisdictions. The European Patent Office has ruled that AI systems cannot be named as inventors, requiring a human contributor for patent eligibility. In practice, organizations are documenting the human creative contributions in the AI-guided discovery process (defining search constraints, selecting candidates, designing validation experiments) to establish inventorship. Companies should consult patent counsel experienced in AI-related IP to develop filing strategies that withstand scrutiny.

Q: How do autonomous laboratories compare to traditional labs in terms of throughput and cost? A: Autonomous laboratories typically achieve 10 to 100 times the throughput of traditional laboratories for routine synthesis and characterization tasks. The A-Lab at Berkeley synthesized 41 compounds in 17 days, a volume that would take a skilled graduate student 6 to 12 months. Cost per experiment drops by 50 to 80% at scale due to reduced labor, optimized reagent usage, and 24/7 operation. However, autonomous labs currently excel at well-defined synthesis procedures and struggle with novel reaction types requiring human intuition and troubleshooting.

Sources

  • DeepMind. (2023). Scaling Deep Learning for Materials Discovery. Nature, 624, 80-85.
  • European Commission. (2025). Horizon Europe: AI for Science Investment Report 2024-2025. Brussels: EC.
  • McKinsey & Company. (2025). The Economic Potential of AI-Accelerated R&D: A Global Assessment. London: McKinsey.
  • Abramson, J. et al. (2024). Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3. Nature, 630, 493-500.
  • Jumper, J. et al. (2021). Highly Accurate Protein Structure Prediction with AlphaFold. Nature, 596, 583-589.
  • Nature Reviews Drug Discovery. (2025). AI in Drug Discovery: Progress, Pitfalls, and Prospects. 24(3), 165-182.
  • ETH Zurich. (2025). Data Quality Assessment in Computational Materials Science Databases. Digital Discovery, 4(2), 312-328.
  • European Research Council. (2025). Impact Assessment: AI-Augmented Research Productivity in ERC-Funded Projects. Brussels: ERC.
