New OncodriveFM release 0.3.2

Some time ago Abel wrote about how to identify cancer drivers from tumor somatic mutations, and presented OncodriveFM. Nuria also posted a nice poster explaining it together with TransFIC.
Initially, OncodriveFM was written by Abel as a Perl script and distributed through our website. Later on I had to implement the analysis workflows of IntOGen SM, which use it intensively. While doing so, we realized that the performance of the OncodriveFM code could be significantly improved, as one part of the analysis may take quite a long time depending on the input data. This is why I decided to implement it again in Python, starting from a prototype written by Abel. Read the rest of this entry »

New IntOGen Somatic Mutations Analysis version available

We are proud to announce the brand new version of the IntOGen Somatic Mutations Analysis (IntOGen SM) pipeline. We call it version 2.0.0 as it has been completely rewritten from scratch with a strong focus on quality, efficiency and scalability.

The IntOGen SM pipeline addresses the challenge of identifying which somatic mutations are important for the development of tumors. The input for the analysis is a list of somatic mutations detected in a cohort of tumors. Read the rest of this entry »

How to identify cancer drivers from tumor somatic mutations?

Cartoon representing genomic alterations in a tumor cell. Image from NCI.

I have recently seen several presentations by groups that systematically explore alterations in cancer genomes, and they all deliver the same message: one of the main challenges faced by their projects is to identify the genes and pathways involved in tumor development (drivers). Very good methods based on the recurrence of somatic mutations have been developed to identify cancer drivers (see, for example, MutSig and the Significantly Mutated Gene (SMG) test from MuSiC). They rely on the assumption that genes that exhibit more mutations than expected by chance are putative drivers. Even though these methods are successful in identifying clear cancer drivers, they also face some known challenges. For instance, the background mutation rate is hard to estimate accurately, and important genes that are mutated only in a small number of tumors may be overlooked. Besides, these methods treat all mutations that may affect the protein sequence equally, when their impact on protein function clearly differs.
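
To make the recurrence idea concrete, here is a minimal sketch of a "more mutations than expected by chance" test. It is not how MutSig or MuSiC work internally; it assumes a single uniform background mutation rate and a Poisson model, and the function names and the numbers in the usage lines are hypothetical. It mainly illustrates how sensitive the result is to the background rate estimate.

```python
from math import exp


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    term = exp(-lam)          # P(X = 0)
    prob_below_k = 0.0
    for i in range(k):
        prob_below_k += term  # add P(X = i)
        term *= lam / (i + 1) # move on to P(X = i + 1)
    return max(0.0, 1.0 - prob_below_k)


def recurrence_pvalue(observed, rate_per_base, gene_length, n_samples):
    """P-value for seeing at least `observed` mutations in one gene.

    Assumption (illustrative only): mutations arrive uniformly at
    `rate_per_base` per base and per sample, so the expected count is
    rate * gene length * number of samples.
    """
    expected = rate_per_base * gene_length * n_samples
    return poisson_upper_tail(observed, expected)


# Same gene and same observed count, two different background rate estimates.
p_low_rate = recurrence_pvalue(3, 1e-6, 1500, 100)
p_high_rate = recurrence_pvalue(3, 5e-6, 1500, 100)
print(p_low_rate, p_high_rate)
```

With everything else fixed, the higher background rate gives the larger p-value, so a misestimated rate can turn the same gene from significant into unremarkable or the other way around.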


Some time ago we thought that a good way to address these challenges would be to use the Functional Impact Bias (FM bias) observed in genes across a cohort of tumor samples. In other words, we wanted to estimate how the accumulation of mutations with high functional impact on each gene deviates from the average observed in all tumor samples.
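
As a rough illustration of this idea, the sketch below compares the mean functional impact score of the mutations in one gene against means of random samples drawn from all mutations in the cohort. It is a simplified sketch, not the actual OncodriveFM implementation: it assumes each mutation already has a numeric functional impact score (higher means more damaging), and the function name, parameters and example scores are hypothetical.

```python
import random
from statistics import mean


def fm_bias_pvalue(gene_scores, all_scores, n_samplings=10000, seed=0):
    """Empirical p-value for a bias towards high-impact mutations in a gene.

    gene_scores: functional impact scores of the mutations in one gene.
    all_scores:  scores of all mutations observed across the cohort,
                 used here as the background (an assumption of this sketch).
    """
    rng = random.Random(seed)
    observed = mean(gene_scores)
    k = len(gene_scores)
    hits = 0
    for _ in range(n_samplings):
        # Draw as many background scores as the gene has mutations.
        sample = rng.choices(all_scores, k=k)
        if mean(sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_samplings + 1)


# Hypothetical cohort: most mutations have low impact, a few have high impact.
background = [0.1] * 95 + [0.95] * 5
print(fm_bias_pvalue([0.95, 0.95, 0.95], background))  # gene with only high-impact mutations
print(fm_bias_pvalue([0.1, 0.1, 0.1], background))     # gene with only low-impact mutations
```

Because the test depends on the impact of the mutations and not on how many there are, a gene mutated in only a few tumors can still stand out, and no background mutation rate needs to be estimated.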

Read the rest of this entry »