Mutual exclusion statistics and data events in Gitools

We’re pleased to announce another incremental release of Gitools, version 2.2. Amongst the many improvements (listed at the bottom of this post), we’d like to highlight the effort we put into improving performance, specifically with genomic data: mutual exclusion and co-occurrence statistics coupled with a new feature called “data events”, which helps you get a quick grasp of the data.


Low expression data events ordered by mutual exclusion


My slides on: Identification of cancer drivers across tumor types

Yesterday I gave a talk at the PRBB Computational Genomics Seminars Series. In that talk I summarized our lab's work this year: we have developed methods to identify cancer driver genes and applied them to thousands of resequenced tumor genomes. Here are the slides; I summarize the talk below.



How to identify cancer drivers from tumor somatic mutations?

Cartoon representing genomic alterations in a tumor cell. Image from NCI.

I have recently seen several presentations, by groups that systematically explore alterations in cancer genomes, that deliver the same message: one of the main challenges their projects face is to identify the genes and pathways involved in tumor development (drivers). Very good methods based on the recurrence of somatic mutations have been developed to identify cancer drivers (see, for example, MutSig and the Significantly Mutated Gene (SMG) test from MuSiC). They rely on the assumption that genes that exhibit more mutations than expected by chance are putative drivers. Even though these methods are successful in identifying clear cancer drivers, they also face some known challenges. For instance, the background mutation rate is hard to estimate accurately, and important genes that are mutated in only a small number of tumors may be overlooked. Besides, these methods treat all mutations that may affect protein sequence equally, when their impact on protein function clearly differs.


Some time ago we thought that a good way to address these challenges would be to use the Functional Impact Bias (FM bias) observed in genes across a cohort of tumor samples. In other words, we wanted to estimate how the accumulation of mutations with high functional impact on each gene deviates from the average observed in all tumor samples.
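To make the idea concrete, here is a minimal sketch of one way such a bias could be tested. It is an illustration under stated assumptions, not the method described in the full post: it assumes every mutation already carries a numeric functional impact score (higher means more damaging), and it compares the mean score of a gene's mutations against random draws from all mutations in the cohort. The function name `fm_bias_pvalue`, the gene names and all scores are hypothetical.

```python
import random
from statistics import mean


def fm_bias_pvalue(gene_scores, all_scores, n_permutations=2000, seed=0):
    """Empirical p-value that a gene's mean impact score exceeds the background.

    Hypothetical helper: the null distribution is built by drawing, with
    replacement, as many scores as the gene has mutations from the pool of
    all mutation scores in the cohort.
    """
    rng = random.Random(seed)
    observed = mean(gene_scores)
    k = len(gene_scores)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.choices(all_scores, k=k)
        if mean(sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up impact scores per gene (assumption: scaled to the range 0..1).
mutations = {
    "GENE_A": [0.90, 0.95, 0.85, 0.92],
    "GENE_B": [0.10, 0.50, 0.30, 0.20],
    "GENE_C": [0.20, 0.40, 0.30, 0.50, 0.10, 0.60,
               0.35, 0.25, 0.45, 0.15, 0.30, 0.40],
}
all_scores = [s for scores in mutations.values() for s in scores]

p_a = fm_bias_pvalue(mutations["GENE_A"], all_scores)
p_b = fm_bias_pvalue(mutations["GENE_B"], all_scores)
print(f"GENE_A p={p_a:.4f}  GENE_B p={p_b:.4f}")
```

In this toy data, GENE_A accumulates only high-impact mutations and gets a small p-value, while GENE_B does not stand out from the background. A real analysis would also need multiple-testing correction across genes.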


Exploring multiple cancer genomics alterations with Gitools.

Cancer genomics projects produce multi-dimensional data sets. Gitools lets you browse all that data at once.

A typical cancer genomics project nowadays screens the cancer genome, epigenome and transcriptome of a cohort of patients and identifies various types of alterations: copy number changes, somatic mutations, gene expression changes and others. This is the case for projects framed within The Cancer Genome Atlas or the International Cancer Genome Consortium, as well as many others. Each of these types of alterations is represented in a different data format, and it remains a challenge to integrate them into a unified view of the process of alterations that leads to tumorigenesis. In Gitools it is possible to explore and analyze multi-value matrices in the form of interactive heatmaps, making it possible to work with various data dimensions at once.