Helsinki researchers use ML to unlock the genomic code in clinical cancer samples

A new paper from the University of Helsinki presents a method for accurately analyzing genomic data from archival cancer biopsies. The tool uses machine learning to correct for damaged DNA and reveal the true mutational processes in tumor samples, helping to unlock the tremendous medical value held in millions of archival cancer samples.

Molecular-based diagnosis helps to match the right patient with the right cancer treatment. Researchers have therefore taken a particular interest in DNA profiling of archival clinical cancer samples.

"This invaluable source is currently not being used for molecular diagnosis due to the poor DNA quality. Formalin causes severe damage to DNA, which is an inevitable challenge when analysing cancer genomes in preserved tissues," says lead author Qingli Guo from the University of Helsinki.

Analyzing mutational processes in cancer genomes can aid early cancer detection, support accurate diagnosis, and reveal why some cancers become resistant to treatment. The new method could dramatically accelerate the development of clinical applications that directly affect future cancer patient care.

The new method predicted more than 90% of developing cancer processes

Lead author Qingli Guo, working in close collaboration with scientists from The Institute of Cancer Research (ICR), London, and Queen Mary University of London, developed a machine learning tool, named FFPEsig, to unravel exactly how formalin mutates DNA.

"Our results show that normally nearly half of the cancer processes will be missed without noise correction. However, using FFPEsig, more than 90% of them were accurately predicted," says Qingli Guo.

Cancer evolves gradually. Profiling mutational processes in longitudinal samples helps to identify clinically informative predictors and to diagnose each tumor stage.

"Our finding enables the characterization of clinically relevant signatures from preserved tumor biopsies stored at room temperature for decades. With a deep understanding of how formalin impacts the cancer genome, our study opens a huge opportunity to transform the developed signature detection assays using large, cost-effective archival samples."

The researchers pointed out that the method currently does not completely remove artifacts in FFPE (formalin-fixed, paraffin-embedded) samples that show batch effects, and that the tool's performance varies by cancer type, so care must be taken when interpreting any findings. They are also interested in applying their methods to a much broader spectrum of archival samples in the future.

The research was funded by Cancer Research UK, the University of Helsinki, and in part by the Academy of Finland. This project is co-led by senior authors Prof. Ville Mustonen (University of Helsinki) and Prof. Trevor Graham (the ICR).