Wednesday, 28 January 2026

Meet AlphaGenome, the AI that could predict your cancer

Google DeepMind’s new super-fast model can identify how changes to DNA lead to disease, allowing scientists to simulate conditions and develop cures

Why do some people succumb to certain diseases and others don’t? Because of the make-up of their DNA. How can we identify which parts of our genome are responsible? That’s a harder question.

But we are now closer to an answer thanks to AlphaGenome, an AI model that will help biologists crunch through DNA sequences to understand the genetic basis for diseases.

In a paper in the journal Nature, UK-based researchers at Google DeepMind detail how their tool can analyse very long DNA sequences and make detailed predictions of how changing part of the sequence may lead to diseases such as cancer.

By using the open-source model, scientists will be able to better understand how genes affect our bodies, identify potential drugs more quickly and even design entirely new DNA sequences for gene therapy treatments.

Similar so-called DNA foundation models exist, including Evo 2, created by the Nvidia-backed Arc Institute. They are all built using huge biological databases, the result of decades of research by public bodies.

But the researchers claim AlphaGenome represents a breakthrough. It follows Google DeepMind’s AlphaFold, a similar model that analyses proteins and delivered a Nobel prize for its researchers in 2024.

Robert Goldstone, head of genomics at the Francis Crick Institute, said AlphaGenome was “a major milestone in the field of genomic AI”.

The level of detail the tool can analyse, he added, “is a breakthrough that moves the technology from theoretical interest to practical utility, allowing scientists to programmatically study and simulate the genetic roots of complex disease”.

The model has been tested by academics at the Wellcome Sanger Institute near Cambridge. Professor Ben Lehner, the institute’s head of generative and synthetic genomics, said AlphaGenome was “a great example of how AI is accelerating biological discovery and the development of therapeutics”.

“Identifying the precise differences in our genomes that make us more or less likely to develop thousands of diseases is a key step towards developing better therapeutics,” he said.

DNA is the double-helix molecule that carries our genetic code. The human genome is the full set of about 3bn base pairs of DNA found in most of our cells. It is often compared to an instruction manual, with the DNA bases as the letters written inside.

It’s a strange instruction manual: only 2% of the genome forms genes that actually deliver instructions for the body to do anything. The remaining 98% of so-called junk DNA, also known as the “dark genome”, seems to just sit there. Except that it doesn’t: mutations anywhere in the genome can change how genes deliver their instructions, and these mutations can cause disease.

Analysing these mutations is hard. Cancer cells can have thousands of mutations, forcing researchers to slog through them to establish which ones are causing the disease.

“While the Human Genome Project gave us the book of life, reading it remained a challenge,” said Google DeepMind researcher Pushmeet Kohli. “We have the text but we are still deciphering the semantics. Understanding the grammar of this genome, what is encoded in our DNA and how it governs life is the next critical frontier for research.”

AI researchers have previously built tools that each look for specific elements of DNA. AlphaGenome combines these tasks in a single model, trained on human and mouse genomes, and vastly increases the length of sequence that can be analysed at once.

“It can process a sequence of 1m DNA letters, roughly the scope required to understand the full regulatory environment of a single gene,” said researcher Žiga Avsec.

The Nature paper shows AlphaGenome can simultaneously predict 5,930 human or 1,128 mouse genetic signals.
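
As a rough illustration of what simulating a DNA change involves, the sketch below cuts a window of up to 1m letters around a variant, swaps a single letter, and compares a predicted signal before and after. It is not AlphaGenome's interface: every function name is hypothetical, and the "model" is a toy stand-in that simply measures the fraction of G and C letters.

```python
WINDOW = 1_000_000  # sequence length quoted in the article


def window_bounds(variant_pos, chrom_length, window=WINDOW):
    """Return (start, end) of a window centred on the variant,
    shifted as needed so it stays inside the chromosome."""
    half = window // 2
    start = max(0, variant_pos - half)
    end = min(chrom_length, start + window)
    start = max(0, end - window)
    return start, end


def apply_snv(sequence, offset, alt):
    """Return a copy of the sequence with one letter replaced."""
    if len(alt) != 1 or alt not in "ACGT":
        raise ValueError("alt must be a single letter from A, C, G, T")
    if not 0 <= offset < len(sequence):
        raise IndexError("offset is outside the sequence")
    return sequence[:offset] + alt + sequence[offset + 1:]


def toy_signal(sequence):
    """Hypothetical stand-in for a trained model: the GC fraction.
    A real model would return many predicted signals, not one number."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def variant_effect(sequence, offset, alt, predict=toy_signal):
    """Predicted signal for the altered sequence minus the reference."""
    return predict(apply_snv(sequence, offset, alt)) - predict(sequence)


if __name__ == "__main__":
    print(window_bounds(5_000_000, 100_000_000))
    print(variant_effect("AAAA", 1, "G"))
```

The comparison of altered and reference predictions is the general idea; swapping `toy_signal` for a real predictor is left as an assumption about how such a tool would be plugged in.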

But there are limitations, Lehner warned. “AI models are only as good as the data used to train them,” he said. “Most existing data in biology is not very suitable for AI – the datasets are too small and not well standardised.”

The challenge is to find ways of generating data for future models in a fast and cost-effective way, he added.

Rivka Isaacson, professor of molecular biophysics in the department of chemistry at King’s College London, said: “This work is an exciting step forward in illuminating the dark genome. We still have a long way to go in understanding the lengthy sequences of our DNA that don’t directly encode the protein machinery whose constant whirring keeps us healthy.

“There are so many interwoven possibilities and complex feedback mechanisms that I doubt the whole thing will ever be fully untangled. AlphaGenome gives scientists whole new and vast datasets to sift and scavenge for clues.”

Photograph by Carolyn Van Houten/The Washington Post via Getty Images
