IntOGen-mutations analysis pipeline 2.4.0 released

We recently published the paper describing IntOGen-mutations (see this post for more information). We are now happy to announce a new release of the IntOGen-mutations pipeline (version 2.4.0), which contains several improvements and fixes some errors:

  • Quality control: We have added a report that shows how mutations and genes move through the different steps of the pipeline (see screenshot at the end of the post). With this report, users can monitor the number of mutations in the input file and how many of them were correctly parsed and correctly mapped to genome annotations (by Variant Effect Predictor). This helps to detect errors in the list of somatic mutations used as input. The report also gives the number of genes analyzed by OncodriveFM and OncodriveCLUST, and how many of them appear significant at different q-value cutoffs. This information can be used to check that the pipeline ran correctly and to fine-tune its parameters, for instance the minimum number of mutations required for a gene to enter the OncodriveFM and OncodriveCLUST analyses, or the q-value cutoff used to consider a gene a candidate driver.
  • MAF parser: We have fixed the parser for mutations in MAF files, which was not working properly. This in turn allowed us to replace the TCGA datasets with the original MAF files, correcting errors in mutations occurring in genes encoded on the reverse strand.
  • Parsing indels: We have fixed a problem in the parsing of some insertions, in which the position was shifted by one base.
  • Functional impact calculation: We have slightly changed how functional impacts are calculated. In particular, we now consider splice donor and splice acceptor variants as high impact, in the same way as stop gains, stop losses and frameshifts.
  • Stability: This new version also improves stability.
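
As a rough illustration of the functional impact change listed above, the following sketch flags a mutation as high impact when any of its consequence terms belongs to a fixed set. The term names follow the Sequence Ontology terms reported by Variant Effect Predictor, which is an assumption about how the consequences are encoded. The names `HIGH_IMPACT_TERMS` and `is_high_impact` are hypothetical and are not taken from the pipeline's code.

```python
# Hypothetical sketch, not the pipeline's actual implementation.
# Consequence types treated as high impact in this release, written as
# Sequence Ontology terms (an assumption about how they are encoded).
HIGH_IMPACT_TERMS = frozenset({
    "stop_gained",
    "stop_lost",
    "frameshift_variant",
    "splice_donor_variant",
    "splice_acceptor_variant",
})


def is_high_impact(consequences):
    """Return True if any consequence term is in HIGH_IMPACT_TERMS."""
    return any(term in HIGH_IMPACT_TERMS for term in consequences)


# A splice donor variant is flagged; a plain missense variant is not.
print(is_high_impact(["splice_donor_variant", "intron_variant"]))
print(is_high_impact(["missense_variant"]))
```

Keeping the terms in a single set makes a change like this one a one-line edit: the two splice terms are simply added next to the stop and frameshift terms.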

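The gene-level part of the quality control report can be pictured with a small sketch that counts how many genes pass a minimum-mutation threshold and how many of those are significant at several q-value cutoffs. The input layout (gene, mutation count, q-value), the function name `summarize_genes`, and the example values are all made up for illustration; the pipeline's real report format and default parameters may differ.

```python
def summarize_genes(genes, min_mutations, qvalue_cutoffs):
    """Count analyzed genes and significant genes per q-value cutoff.

    genes: iterable of (gene_name, mutation_count, qvalue) tuples
           (hypothetical layout, not the pipeline's file format).
    """
    # Only genes with enough mutations enter the analysis.
    analyzed = [g for g in genes if g[1] >= min_mutations]
    # For each cutoff, count the analyzed genes whose q-value is below it.
    significant = {
        cutoff: sum(1 for _, _, q in analyzed if q < cutoff)
        for cutoff in qvalue_cutoffs
    }
    return {"analyzed": len(analyzed), "significant": significant}


# Made-up example values, for illustration only.
example = [
    ("GENE_A", 12, 0.001),
    ("GENE_B", 7, 0.04),
    ("GENE_C", 3, 0.0005),  # below the mutation threshold, not analyzed
    ("GENE_D", 9, 0.2),
]
report = summarize_genes(example, min_mutations=5, qvalue_cutoffs=[0.01, 0.05, 0.1])
print(report)
```

Comparing the counts across cutoffs and thresholds in this way gives a quick sense of how sensitive the list of candidate drivers is to each parameter.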
 

Note that we have re-analyzed all 4623 tumor genomes with the new version of the pipeline, so the results have changed slightly, mainly due to the corrections mentioned above.

Related posts:

IntOGen-mutations: the analysis of cancer genomes published in Nature Methods