AI drug discovery models: Physics falls short

In a thought-provoking twist for computational chemistry and biomedicine, researchers at the University of Basel (Switzerland) have found that even the most advanced AI models used for drug design may not truly understand the physics of molecular binding; they appear to be pattern-matching rather than reasoning.

When learning isn’t the same as understanding

The study, reported today, describes how deep-learning "co-folding" models, systems designed to predict how a protein and a potential drug molecule will fit together, fail to uphold basic physical laws when deliberately tested under challenging conditions.

In one striking set of experiments, researchers mutated or blocked binding sites on proteins, or edited ligands so they would no longer bind. Yet the AI models frequently predicted a binding pose anyway, as though the disruption had not occurred. In more than half the cases, the AI output ignored the alterations.

The authors argue that these models rely on statistical correlations, the shapes and sequence patterns they’ve observed in training, rather than truly modeling the underlying physics of electrostatics, steric factors (the crowding of atoms), hydrogen bonds, and so on.

Why this matters for drug discovery

The implications are significant. The promise of AI in drug discovery is enormous: finding new molecules more quickly, predicting how they will bind, and shortening the time to a viable therapeutic. However, as the Basel team notes, if a model doesn't truly understand what makes a ligand bind to a protein, its predictions for novel, unseen targets (a key objective) may be unreliable.

As Prof. Markus Lill of the University of Basel states, "When they see something completely new, they quickly fall short, but that is precisely where the key to new drugs lies."

In other words, models trained on known protein-ligand pairs may perform well "within sample," but when faced with novel challenges, they may revert to "safe guesses" rather than principled predictions. This tempers much of the current hype surrounding AI drug design.

Key findings include:

The deep-learning co-folding models were exposed to adversarial examples, including mutating binding sites, altering ligand charge distributions, and blocking binding pockets.
Despite physically implausible or impossible binding configurations (for instance, a ligand carrying the wrong charge, or binding-site residues replaced by sterically blocking amino acids), the models still often predicted good binding poses.
From this, the authors conclude these models do not reliably respect physical constraints (e.g., electrostatics, hydrogen‐bonding networks, steric hindrance), and they fail to generalize when faced with new types of protein/ligand systems.
The paper argues for integrating physical and chemical priors into future models, making sure that machine‐learning models are not simply “black-box” pattern matchers, but respect the underlying molecular science.
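
The adversarial test described in these findings can be illustrated with a minimal sketch. It is not the Basel team's code. It assumes each predicted ligand pose is available as a list of (x, y, z) coordinates in angstroms, with atoms in the same order and both predictions aligned on the protein. The function names and the 2.0 angstrom threshold are illustrative assumptions, not values taken from the study.

```python
import math

def ligand_rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matched atom order."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))

def ignored_perturbation(original_pose, perturbed_pose, threshold=2.0):
    """True if the ligand barely moved after the binding site was disrupted.

    The threshold (in angstroms) is an assumed cutoff for "same pose".
    """
    return ligand_rmsd(original_pose, perturbed_pose) <= threshold

# Made-up coordinates: the prediction after a disruptive mutation is
# almost identical to the prediction for the unmodified protein.
original = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
after_mutation = [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0)]
print(ignored_perturbation(original, after_mutation))
```

If the binding site has been blocked or the ligand altered so that binding is no longer physically possible, a result of True here would suggest the model reproduced a memorized pose instead of responding to the change.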

A cautious but curious tone on where to go next

The news from Basel isn’t a refutation of AI in drug research; rather, it is a clarion call for more nuance and care. AI models have already changed what’s possible: predicting protein folds, accelerating docking predictions, and broadly expanding the realm of computational chemistry. Yet this research suggests there’s still an important gap between “predicting what we know” and “reasoning about what we don’t know.”

Going forward, several directions look promising:

Hybrid modeling: combining data‐driven deep learning with traditional physics‐based modeling (electrostatics, molecular mechanics, quantum effects) might strengthen reliability.
Benchmarking on novel/rare systems: rather than just “hold‐out” samples similar to the training data, models should be challenged with radically new proteins or ligands to test generalization.
Transparent AI: understanding not just the output but the reasoning of models (why did they predict binding despite physically implausible input?).
Experimental validation remains crucial: even the most sophisticated prediction needs lab and computational cross-checks that consider real chemistry and physics.
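
One simple form of the physics-based cross-check mentioned above is a steric clash filter applied to a predicted complex. The sketch below is an assumption-laden illustration, not a method from the paper. It treats atoms as points, takes coordinates in angstroms, and uses a hypothetical 2.0 angstrom minimum distance. A real check would use element-specific radii and a proper force field.

```python
import math

def count_clashes(protein_atoms, ligand_atoms, min_distance=2.0):
    """Count protein-ligand atom pairs closer than min_distance (angstroms)."""
    return sum(
        1
        for p in protein_atoms
        for l in ligand_atoms
        if math.dist(p, l) < min_distance
    )

def passes_clash_check(protein_atoms, ligand_atoms, max_clashes=0):
    """True if the predicted pose has no more than max_clashes close contacts."""
    return count_clashes(protein_atoms, ligand_atoms) <= max_clashes

# Made-up coordinates: one ligand atom sits 1.0 angstrom from a protein atom.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ligand = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(count_clashes(protein, ligand), passes_clash_check(protein, ligand))
```

A filter like this would not make a model reason physically, but it could flag predicted poses that place a ligand inside a pocket that has been sterically blocked.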

Bottom line

In summary, the study conducted by the University of Basel presents a compelling assessment: while current AI models demonstrate remarkable capabilities, they may primarily rely on patterns derived from historical data rather than accurately simulating molecular interactions. This disparity is particularly significant in drug discovery, where novel targets and unforeseen chemical phenomena are commonplace. Therefore, bridging this gap is essential. Moving forward, a focus on integrating machine learning with physics-based insights holds the key to advancing the development of innovative therapeutics.
