Identifying the drivers amongst the passengers

A new method developed by Rory Johnson, leader of the GOLD (Genomics of Long noncoding RNA and Disease) Lab at the University of Bern, can identify cancer genes in noncoding regions of DNA. The statistical method is the first to be specifically designed to identify cancer driver long noncoding RNAs (lncRNAs) from tumour genome cohorts.

Cancer begins with a series of genetic mutations that enable a cell to escape the normal constraints on its growth and migration. The big challenge in genetic cancer research is to identify which “driver genes” are the targets of these mutations – such genes represent new targets for therapy. While much effort has been put into identifying conventional protein-coding genes connected to cancer development, the noncoding regions that make up the majority of DNA have until now been neglected. Notably, long noncoding RNAs (lncRNAs) represent a vast unexplored genetic space that may hold missing drivers, but few such “driver lncRNAs” have been identified.

Johnson and former colleagues from Barcelona – he has only recently set up the GOLD Lab in Bern – have now developed a statistical method specifically designed to identify cancer driver lncRNAs from tumour genome cohorts. The software, called ExInAtor, is intended to discover cancer driver lncRNAs within and across tumour types using mutation data generated by projects such as ICGC (International Cancer Genome Consortium).

“It is only very recently that we have the possibility to scan entire genomes and can therefore search a full catalogue of mutations”, says Johnson. This is accomplished by sequencing the entire genomes of matched pairs of normal and tumour samples, and then comparing them to identify tumour mutations. The group used ExInAtor to predict drivers from the GENCODE annotation across 1112 entire genomes from 23 cancer types. Using a stratified approach, the group identified 15 high-confidence candidates: 9 novel and 6 known cancer-related genes, including well-known driver lncRNAs such as MALAT1, NEAT1 and SAMMSON. They also showed, for the first time, that driver lncRNAs are distinguished by elevated gene length, evolutionary conservation and expression.
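
The comparison of matched normal and tumour samples described above can be pictured as a set difference: variants found in the tumour but not in the same patient's normal tissue are treated as tumour (somatic) mutations. The following is a minimal sketch under that simplifying assumption; the `Variant` record, the function name and the example coordinates are hypothetical and invented for illustration, and real pipelines additionally model sequencing error and coverage.

```python
from typing import Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """A single nucleotide variant (hypothetical record for illustration)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_variants(tumour: Iterable[Variant], normal: Iterable[Variant]) -> Set[Variant]:
    """Return variants seen in the tumour but absent from the matched normal."""
    return set(tumour) - set(normal)


# Made-up example: one inherited variant shared by both samples,
# and one variant present only in the tumour.
normal = {Variant("chr11", 100, "A", "G")}
tumour = {Variant("chr11", 100, "A", "G"), Variant("chr11", 250, "C", "T")}
somatic = somatic_variants(tumour, normal)
```

Only the variant unique to the tumour sample survives the comparison; the shared one is assumed to be inherited and is discarded.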

ExInAtor identifies genes with an excess load of somatic single nucleotide variants (SNVs). Though signals in noncoding regions of the DNA are relatively weak and noisy, Johnson believes that the group has the problem of false positives under control. “But there are probably many false negatives”, acknowledges Johnson, “that is something we would like to improve.” Although ExInAtor was designed with lncRNAs in mind, it makes no use of functional impact predictions and hence is agnostic to the protein-coding potential of the genes it analyses. The group took advantage of this versatility to further test ExInAtor’s precision, by comparing predictions to the gold-standard catalogue of the Cancer Gene Census (CGC) – with positive results.
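
To give a feel for what an “excess load” of SNVs means statistically, the sketch below tests whether a gene carries more mutations than expected from a surrounding background region. It is an illustration only and not the published ExInAtor algorithm: it assumes a uniform mutation rate per base and a one-sided binomial test, and the function names and all numbers are hypothetical.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def excess_load_pvalue(gene_mut: int, gene_len: int, bg_mut: int, bg_len: int) -> float:
    """One-sided p-value for an excess of mutations in a gene.

    Assumption: under the null hypothesis every base is equally likely to be
    mutated, so each of the observed mutations falls in the gene with
    probability gene_len / (gene_len + bg_len).
    """
    total_mut = gene_mut + bg_mut
    p_gene = gene_len / (gene_len + bg_len)
    return binomial_tail(gene_mut, total_mut, p_gene)


# Made-up numbers: a 2 kb gene with a 20 kb local background.
candidate = excess_load_pvalue(gene_mut=12, gene_len=2000, bg_mut=20, bg_len=20000)
neutral = excess_load_pvalue(gene_mut=2, gene_len=2000, bg_mut=20, bg_len=20000)
```

With these invented counts, the first gene carries about four times as many mutations as its length would predict and receives a small p-value, while the second matches the background rate and does not. Comparing against a local background rather than a genome-wide average is a design choice made here so that regional differences in mutation rate do not by themselves produce apparent drivers.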

The distinguishing features of cancer-related lncRNAs are reminiscent of similar findings for protein-coding genes. Evolutionary conservation and high steady-state RNA levels are generally interpreted in this context as evidence for functionality of lncRNAs. It remains unclear how many lncRNA drivers remain to be discovered, and which have tumour-specific or pan-cancer activity. “We expect that future studies will yield many more candidate lncRNAs than produced here: although the datasets we have used represent a large proportion of all presently available tumour genomes, future projects will be larger and produce mutation calls of better quality,” says Johnson.

At present, the group are using ExInAtor to hunt for lncRNAs in an international collaboration called PCAWG (Pan-Cancer Analysis of Whole Genomes) that has sequenced thousands of entire tumour genomes. So far the results are promising, with dozens of new driver lncRNAs identified. “We plan to spend the next couple of years improving the sensitivity of ExInAtor to a point where we hope to identify essentially all the cancer driver lncRNAs that may be out there.”

Lanzós A. et al. (2017) Scientific Reports 7, 41544

By Roland Fischer