AI ≠ Alchemy: Why Data Still Decides the Winners in Drug Discovery

Pietro Gatti
May 21, 2025

TL;DR

AI looks invincible when you zoom in on AlphaFold 3 or Insilico’s Phase II trials. But zoom out, and you’ll see a pattern: pure-play AI drug discovery valuations are falling, while hybrid companies that own the robots as well as the models are raising nine-figure rounds. The story isn’t that AI is broken. It’s that amplifying AI with fresh, proprietary data is the only model that scales.

The hype isn’t being discarded. It’s being rebuilt on a foundation that actually works.

It All Looked Solved

For a while, it felt like AI had finally cracked pharma’s code.

AlphaFold 3 was predicting protein-ligand complexes with the flick of a GPU. Insilico Medicine dosed a fully AI-designed drug in Phase II, just four years after identifying the target (source). Isomorphic Labs, Alphabet’s AI-first drug venture, raised $600 million in April 2025, marking one of the largest funding rounds for a UK AI company (Financial Times).

If you were just reading headlines, as plenty of investors apparently were, the message was clear: the age of data-driven drug discovery had arrived, and biology was finally hackable.

The narrative was clean. Models would design molecules. Pipelines would collapse from ten years to three. And wet labs? Optional at best.

Then Reality Logged In

But under the hood, something else was happening. The companies actually making progress weren’t abandoning wet labs. They were doubling down on them — and wiring them to the cloud. Take Recursion. They didn’t just build a model. They built a 23-petabyte dataset of cellular images using high-throughput robotics, and piped it directly into NVIDIA’s BioNeMo platform.

Or take Strateos and Emerald Cloud Lab (ECL): labs where biotech teams can run live assays remotely, scripting experiments the way developers push code. Their automation platforms are quietly underpinning some of the most productive hybrid biotech teams out there.

What became clear: owning data isn’t optional. Generating it continuously and programmatically is the actual moat (McKinsey).

Hype Meets the Cashflow Discount

BenevolentAI — once Europe’s AI drug darling — announced layoffs, missed key milestones, and delisted from Euronext in 2025 (Euronext notice). Atomwise, after raising $174M, swapped CEOs and pivoted strategy amid stalled deal momentum (source). At the same time, Recursion announced a $50 million investment from NVIDIA and lit up BioHive-2, a 2000-PFLOP supercomputer designed not just to run inference but to fuel its own data-feedback loop (source).

VC sentiment is catching up fast. Investment trends in early 2025 showed strong investor confidence in AI-driven drug discovery companies that integrate computational and experimental platforms, and a noticeably colder reception for companies offering generic AI models alone (Galen Growth, TechFunding News).

Gartner’s 2024 Hype Cycle for Artificial Intelligence tells the same story from 30,000 feet: Generative AI has already slid past the Peak of Inflated Expectations, and the spotlight is shifting to “composite AI” architectures that marry models with disciplined, continuously refreshed data flows.

Knowledge graphs, Gartner’s poster child on the Slope of Enlightenment, are a case in point: they only create value when someone is shovelling high-quality experimental data into them. In other words, capital is rewarding proof of cycle-time compression, not model cleverness in isolation.

From Hype to Hybrid

This shift isn’t a comedown for AI. It’s a rebuild of the operating system underneath it.

Generative models and foundation models still matter, especially as platforms like BioNeMo mature and large foundation models become more capable in chemistry and biology.

Amplify Thesis: from model-only to AI+Lab

But the teams getting ahead are those who’ve linked AI to a proprietary experimental loop. They’re not tossing data at models. They’re feeding it like fuel — fresh, structured, and traceable.

This is the Amplify Thesis: The edge comes from building AI on top of data-generating infrastructure that never sleeps.

Let’s make that tangible:

Automate the feed – Platforms like Strateos and ECL turn assay workflows into scripts. Clients queue up experiments at 10pm. Results land in a dashboard at breakfast.
Own the loop – Recursion’s 23-petabyte dataset didn’t come from scraping. It came from controlled, reproducible, robot-run assays.
Change your metric – Track cost per validated mechanism (CpVM), not “how many molecules you generated.” That’s what boards now care about.
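The CpVM metric in the last point can be made concrete with a little arithmetic. The sketch below is only an illustration, not a standard industry formula: it assumes CpVM is total program spend divided by the number of mechanisms confirmed in wet-lab assays. The `Campaign` class, its fields, and every figure in it are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical record of one discovery campaign (illustrative fields only)."""
    name: str
    total_cost_usd: float       # compute + wet-lab + staff spend
    molecules_generated: int    # the headline number CpVM is meant to replace
    validated_mechanisms: int   # mechanisms confirmed by experiment


def cost_per_validated_mechanism(c: Campaign) -> float:
    """Assumed definition: total spend divided by validated mechanisms."""
    if c.validated_mechanisms == 0:
        return float("inf")  # nothing validated yet, so the cost is unbounded
    return c.total_cost_usd / c.validated_mechanisms


# Made-up numbers, purely for illustration.
model_only = Campaign("model-only", 4_000_000, 250_000, 1)
hybrid = Campaign("AI+lab", 6_000_000, 40_000, 4)

for c in (model_only, hybrid):
    print(f"{c.name}: CpVM = ${cost_per_validated_mechanism(c):,.0f}")
```

With these invented figures, the hybrid campaign spends more in total and generates far fewer molecules, yet comes out cheaper per validated mechanism. That is the comparison the metric is meant to surface.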

The implication: The winners are those who run the tightest, most defensible learning loops — not just the flashiest demos.

Rebuilding the Operating System

Pharma isn’t being disrupted so much as recompiled. The once-monolithic, 10-to-15-year conveyor belt from target ID to approval is being refactored into a four-layer stack: robotic wet-lab hardware, a live data fabric, continuously retraining AI models, and, crucially, the decision layer that turns all that streaming evidence into real go/no-go calls. That last layer matters because boards, regulators, and clinicians won’t green-light a drug on “cool predictions” alone.

The first three layers shorten cycle time: robots generate fresh assays, data pipelines clean and feed them upward, models learn on the fly. The fourth is where the value shows up: decision dashboards surface hard metrics so leaders can act in weeks, not quarters.
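To show how the four layers hand off to one another, here is a toy sketch of a single closed loop. Everything in it is an assumption made for illustration: the "robot" is a simulated noisy dose-response readout, the model is a one-parameter least-squares fit, and the function names, doses, and go/no-go threshold are invented. It does not describe any real platform.

```python
import random

rng = random.Random(0)  # fixed seed so the simulation is repeatable


def run_assay(dose: float) -> float:
    """Layer 1 (stand-in for robotic hardware): simulated noisy readout."""
    return 2.0 * dose + rng.gauss(0, 0.5)


def is_clean(record: tuple[float, float]) -> bool:
    """Layer 2 (data fabric): keep only physically plausible readings."""
    _, response = record
    return response >= 0.0


def fit_slope(data: list[tuple[float, float]]) -> float:
    """Layer 3 (model): least-squares slope of a line through the origin."""
    num = sum(d * r for d, r in data)
    den = sum(d * d for d, _ in data)
    return num / den if den else 0.0


def decide(slope: float, target_dose: float, threshold: float) -> str:
    """Layer 4 (decision): turn the model's prediction into a go/no-go call."""
    return "go" if slope * target_dose >= threshold else "no-go"


store: list[tuple[float, float]] = []
for cycle in range(1, 4):  # three experiment cycles
    batch = [(d, run_assay(d)) for d in (1.0, 2.0, 3.0)]
    store.extend(r for r in batch if is_clean(r))  # feed cleaned data upward
    slope = fit_slope(store)                       # refit on everything so far
    call = decide(slope, target_dose=5.0, threshold=8.0)
    print(f"cycle {cycle}: n={len(store)} slope={slope:.2f} decision={call}")
```

The point of the structure is that each cycle adds data, refits the model, and refreshes the decision in one pass, so the go/no-go call is always based on the latest evidence rather than a quarterly snapshot.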

AI serves as the power bus of the stack—energising wet-lab robots, feeding the data fabric, sharpening live models, and lighting up decision dashboards so signals accelerate rather than fade on their way to the boardroom.

For founders, the play is to own that bus: architect loops where every experiment pushes fresh data uphill and turns strategic decisions from quarterly debates into weekly code releases.

So the questions to ask aren’t “Do I use AI?” but:

Can I generate the data my model needs faster than others can buy it?
Can I test, learn, and refine in a week instead of a quarter?
Can I deliver outputs that inform faster decisions?

That’s what amplifying AI looks like. That’s the direction the capital is already flowing. And that’s the difference between demo-grade innovation and an actual drug.

Insight Over Instinct: Making Informed Decisions

In a field where bold claims often outpace results, the real challenge lies in discerning enduring value from transient hype. Understanding the true potential of an idea—its scalability, resilience, and future trajectory—requires more than enthusiasm; it demands critical evaluation and strategic foresight. Engaging with experts who can navigate these complexities is crucial to avoid missteps and to ensure that innovation translates into meaningful, sustainable progress.

Triple Helix