trimAl: Trimming, assessing and reformatting of Multiple Sequence Alignments.

Salvador Capella Gutierrez.

Comparative Genomics Group. Bioinformatics & Genomics Programme. CRG

May 11th at 11:30 am, room 470.2

Multiple sequence alignments (MSAs) are central to many areas of bioinformatics, and the quality of downstream analyses depends on their accuracy. For instance, it has been shown that removing conflicting regions from an alignment helps to increase the quality of phylogenetic analyses. In large-scale analyses there is a need to determine the most appropriate trimming parameters for each alignment. Here, we present trimAl, a tool for automated alignment trimming, which is especially suited for large-scale analyses. trimAl can consider several parameters, alone or in multiple combinations, for selecting the most reliable positions in the alignment. These include the proportion of sequences with a gap, the level of amino acid similarity and, if several alignments for the same set of sequences are provided, the level of consistency across different alignments. Moreover, trimAl can automatically select the parameters to be used in each specific alignment so that the signal-to-noise ratio is optimized. Besides implementing the assessment and trimming of alignments, trimAl is an efficient format-conversion tool that eases the concatenation of different alignment-based analyses. In this tutorial, I will go over the main capabilities of trimAl, showing selected practical examples.
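To give a feel for the simplest of these criteria, the proportion of sequences with a gap, here is a minimal Python sketch. It is not trimAl's code or its command-line interface: the function names, the toy alignment and the 0.5 threshold are all assumptions chosen for illustration.

```python
def gap_fraction(column):
    """Return the fraction of gap characters ('-') in one alignment column."""
    return column.count("-") / len(column)


def trim_gappy_columns(sequences, max_gap_fraction=0.5):
    """Drop every column whose gap fraction exceeds max_gap_fraction.

    `sequences` is a list of equal-length aligned strings. The 0.5 default
    is an arbitrary illustrative value, not a trimAl default.
    """
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(*sequences))
    kept = [col for col in columns if gap_fraction(col) <= max_gap_fraction]
    if not kept:
        return ["" for _ in sequences]
    # Transpose the kept columns back into one string per sequence.
    return ["".join(row) for row in zip(*kept)]


# Hypothetical toy alignment: three sequences, eight columns.
alignment = [
    "MK-LV--A",
    "MKTLV--A",
    "MK-LI-GA",
]
trimmed = trim_gappy_columns(alignment, max_gap_fraction=0.5)
for sequence in trimmed:
    print(sequence)
```

Columns 3, 6 and 7 of the toy alignment are gaps in at least two of the three sequences, so they are the ones dropped. A real trimming run would combine this criterion with similarity and consistency scores, as described above.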

trimAl can be accessed at trimal.cgenomics.org


The true story behind the annotation of a pathway

Giovanni M. Dall’Olio
Instituto de Biologia Evolutiva, UPF-CEXS
Room 468, Tuesday April 5th at 11.30

In 2010, our group worked on the annotation of the Asparagine N-Glycosylation pathway in Reactome, a public database of biological pathways. This talk will describe that process from the dual perspective of an annotator and an end user.
Since this pathway is fairly well established in the scientific literature, it is a good case study for comparing the status and annotation practices of other scientific databases, such as Gene Ontology, STRING, KEGG Pathway and UniProt.
More importantly, this will be an opportunity to discuss how to use and interpret the data from a public scientific database, and what to do when you encounter unclear or erroneous data in one.
- http://precedings.nature.com/documents/5561/version/2


Computational Analysis of Biological Networks

Giovanni Scardoni, University of Verona (Italy)

Friday Feb 18 2011 11:00 pm at room 473.10

Complex biological networks, such as intracellular signaling networks, are shaped by evolution to accomplish a variety of regulatory functions. This is achieved by controlling the overall topology of the network, which is in turn determined by the modular architecture of proteins and which affects the network's dynamic behavior. The most significant mathematical properties of biological networks will be presented. In particular, centralities are parameters that score nodes according to their individual topological relevance. Centrality parameters and their biological meaning will be considered, and a centrality-based analysis of the human kino-phosphatome will be presented as an example of integrated analysis of topological and experimental data.
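As a rough illustration of what scoring nodes by topological relevance means, the sketch below computes two common centralities, degree and closeness, on a small undirected network using only the Python standard library. The toy network and function names are assumptions made for this example, and closeness is taken here as the reciprocal of the summed shortest-path distances; the tool presented in the talk may define or normalize these values differently.

```python
from collections import deque


def degree_centrality(graph):
    """Number of direct neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def shortest_path_lengths(graph, source):
    """Breadth-first search distances from `source` to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def closeness_centrality(graph):
    """Reciprocal of the summed distances to all reachable nodes."""
    scores = {}
    for node in graph:
        total = sum(shortest_path_lengths(graph, node).values())
        scores[node] = 1.0 / total if total else 0.0
    return scores


# Hypothetical toy network with edges A-B, B-C, B-D and D-E.
network = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B"},
    "D": {"B", "E"},
    "E": {"D"},
}
degree = degree_centrality(network)
closeness = closeness_centrality(network)
for node in sorted(network):
    print(node, degree[node], round(closeness[node], 3))
```

Node B comes out on top for both measures in this toy network, which matches the intuition that it sits at the hub of the graph.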

Centrality interference, a new centrality-based notion, will also be introduced. Centrality interference allows identifying nodes that are more influenced by a particular node in the network and can be used to model processes where nodes are added to or removed from a network. An example of interference analysis will also be discussed.
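One way to read the interference idea is to compare each node's share of the total centrality before and after a given node is removed. The sketch below does this with degree centrality on a toy network. The formula (relative value before removal minus relative value after), the choice of degree as the centrality and all names are assumptions of this sketch, not a definition taken from the abstract.

```python
def remove_node(graph, removed):
    """Return a copy of the graph without `removed` and its edges."""
    return {
        node: {n for n in neighbours if n != removed}
        for node, neighbours in graph.items()
        if node != removed
    }


def relative_degree(graph):
    """Each node's degree as a fraction of the sum of all degrees."""
    total = sum(len(neighbours) for neighbours in graph.values())
    return {
        node: (len(neighbours) / total if total else 0.0)
        for node, neighbours in graph.items()
    }


def interference(graph, removed):
    """Change in relative degree of every other node when `removed` is deleted.

    Positive values mean the node loses relative centrality once `removed`
    is gone, i.e. it depended on that node (assumed sign convention).
    """
    before = relative_degree(graph)
    after = relative_degree(remove_node(graph, removed))
    return {node: before[node] - after[node] for node in after}


# Hypothetical toy network with edges A-B, B-C, B-D and D-E.
network = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B"},
    "D": {"B", "E"},
    "E": {"D"},
}
scores = interference(network, "D")
for node in sorted(scores):
    print(node, scores[node])
```

In this example node E has the only positive score: it is connected to the rest of the network exclusively through D, so removing D isolates it.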

References: Scardoni G, Petterlini M, Laudanna C. Analyzing biological network parameters with CentiScaPe. Bioinformatics. 2009 Nov 1;25(21):2857-9. Epub 2009 Sep 2.
PubMed PMID: 19729372; PubMed Central PMCID: PMC2781755.

Posted in Tools

Analysis and manipulation of phylogenomic data using ETE

Wed, Jan 26, 12:00 pm at room 470 sem 2

Jaime Huerta Cepas. Comparative Genomics Group. Bioinformatics & Genomics Programme. CRG

ETE is a Python programming library that assists in the automated manipulation, analysis and visualization of phylogenetic trees. It allows users to read trees in Newick format and operate on them as intuitive Python objects, providing advanced methods to locate nodes, browse tree topology, annotate branches, or manipulate node connections. In addition, ETE provides a fully customizable system for tree visualization. Users can visualize trees interactively or write their own Python functions to create tree images in PDF or SVG format.
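For readers who have not met the Newick format, the following standard-library sketch shows what reading a Newick string into a Python object involves. It is deliberately not ETE's API: `parse_newick` and `leaf_names` are hypothetical helpers written for this example, the tree string is made up, and the parser only handles plain labels and branch lengths (no quoting or comments).

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts with name, length, children."""
    text = text.strip()
    if not text.endswith(";"):
        raise ValueError("Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # consume '('
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # consume ','
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ')'
        # Optional node label.
        start = pos
        while text[pos] not in ",():;":
            pos += 1
        name = text[start:pos]
        # Optional branch length after ':'.
        length = None
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    if text[pos] != ";":
        raise ValueError(f"unexpected character at position {pos}")
    return root


def leaf_names(node):
    """Collect leaf labels in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# Hypothetical example tree with branch lengths and one named internal node.
tree = parse_newick("((human:0.1,chimp:0.1)primates:0.3,mouse:0.5);")
print(leaf_names(tree))
```

A library such as ETE wraps this kind of structure in tree objects with methods for searching, annotating and drawing, so you would not normally write a parser like this yourself.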

Posted in Bioconductor, Library, python, Sequencing, Tools

Analyzing ChIP-Seq mapped reads with Pyicos and bash: Command-line real time examples.

October, 20th 11:00 AM at PRBB room 173.06-183.01 (Xipre)

Juan Gonzalez-Vallinas
Regulatory Genomics Group, GRIB

When provided with some files with mapped reads coming from a ChIP-Seq experiment, lots of the work has already been done. Extracting the biological information from them should be an easy job, right? Surprisingly, lots of bioinformaticians are finding that the methods and software packages proposed for the analysis of this kind of data don't fit their particular needs. Because these experiments have gone through a long process and commonly target a particular protein-DNA interaction, this last step normally takes longer than expected. Moreover, the technical difficulties of dealing with read files that can be on the gigabyte scale, the different formats used by different laboratories and tools, and the novelty of the field are extra headaches for the researcher performing this kind of analysis. This seminar is designed as an introduction where I will work in real time with a sample dataset, showing how to use bash and Pyicos, a novel toolbox for the analysis of mapped reads coming from deep sequencing experiments.

Posted in python, Sequencing, Statistics, Tools

BIBEX, high performance exploration for bibliographic databases

June, 15th 10:30 room 473.10

Bibliographic exploration is important for the management, preservation and creation of IP. Questions like “who is the most significant authority in a specific knowledge area?”, “who is a suitable reviewer for a given paper?”, “what is the seminal paper in a specific topic?” or “what does the social network of a researcher look like?” are questions that many researchers and managers from academia and industry would like to have answered in a few seconds. This would save them time for other more creative and productive tasks within their organizations.

BIBEX is innovative software for the exploration of bibliographic and document repositories. It answers those queries, and other more sophisticated ones, quickly and concisely over huge amounts of data. BIBEX stems from the last five years of research and development at UPC, which turned our product into a marketable technology commercialized by Sparsity Technologies, a spin-off company of UPC. Our technology has been supported and is being used by the Ministry of Science and Innovation of Spain, and by the Generalitat de Catalunya.

BIBEX solves, for the first time, those complex queries with very fast execution times for billions of objects in a database. It is based on a technology also developed at UPC, the DEX graph database management system, which has also been used in many other application areas, such as fraud detection, cancer analysis, social network analysis and recommendation systems.

Speakers: JOSEP LLUIS LARRIBA PEY (DAMA-UPC) / PERE BALETA FERRER (SPARSITY TECHNOLOGIES)

More information on www.dama.upc.edu


Biana: a software framework for compiling biological interactions and analyzing networks

May, 5th at room 470. Javier García-García, UPF

The analysis and usage of biological data is hindered by the spread of information across multiple repositories and the difficulties posed by different nomenclature systems and storage formats. In particular, there is an important need for data unification in the study and use of protein-protein interactions. Without good integration strategies, it is difficult to analyze the whole set of available data and its properties. We introduce BIANA (Biologic Interactions and Network Analysis), a tool for biological information integration and network management. BIANA is a Python framework designed to achieve two major goals: i) the integration of multiple sources of biological information, including biological entities and their relationships, and ii) the management of biological information as a network where entities are nodes and relationships are edges. Moreover, BIANA uses properties of proteins and genes to infer latent biomolecular relationships by transferring edges to entities sharing similar properties. BIANA is also provided as a plugin for Cytoscape, which allows users to visualize and interactively manage the data. A web interface to BIANA providing basic functionalities is also available. The software can be downloaded under GNU GPL license from http://sbi.imim.es/web/BIANA.php. BIANA’s approach to data unification solves many of the nomenclature issues common to systems dealing with biological data. BIANA can easily be extended to handle new specific data repositories and new specific data types. The unification protocol allows BIANA to be a flexible tool suitable for different user requirements: non-expert users can use a suggested unification protocol while expert users can define their own specific unification rules.
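The idea of transferring edges to entities sharing similar properties can be sketched in a few lines of plain Python. This is not BIANA's API: the function names, the protein identifiers and the domain labels are hypothetical, and "similar" is assumed here to mean "shares at least one property label".

```python
def similar_entities(properties):
    """Map each entity to the set of entities sharing at least one property."""
    similar = {entity: {entity} for entity in properties}
    for a in properties:
        for b in properties:
            if a != b and properties[a] & properties[b]:
                similar[a].add(b)
    return similar


def transfer_edges(edges, properties):
    """Infer new edges by copying each known edge onto similar entities.

    `edges` is an iterable of (entity, entity) pairs and `properties` maps
    each entity to a set of property labels. Returns only the inferred
    edges, as frozensets, so that direction is ignored.
    """
    similar = similar_entities(properties)
    known = {frozenset(edge) for edge in edges}
    inferred = set()
    for u, v in edges:
        for u2 in similar.get(u, {u}):
            for v2 in similar.get(v, {v}):
                candidate = frozenset((u2, v2))
                if u2 != v2 and candidate not in known:
                    inferred.add(candidate)
    return inferred


# Hypothetical entities and properties, for illustration only.
properties = {
    "P1": {"kinase_domain"},
    "P2": {"kinase_domain"},
    "P3": {"SH2_domain"},
    "P4": {"zinc_finger"},
}
known_edges = [("P1", "P3")]
inferred = transfer_edges(known_edges, properties)
print(sorted(tuple(sorted(edge)) for edge in inferred))
```

Here P2 shares a property with P1, so the known P1-P3 interaction is transferred to suggest a latent P2-P3 relationship, while P4 shares nothing and gains no edges.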

Posted in python, Tools

Integration of biological annotations and networks using Cytoscape (hands-on tutorial)

Cytoscape is widely used open-source software for the visualization and analysis of networks. A wide range of plugins is available to facilitate the analysis of biological networks (protein-protein interaction networks, biological pathways, etc.). This hands-on tutorial will give a short introduction to Cytoscape and will focus on the integration of biological annotations with networks.

Date and place: April 6th, 15h in room 470 of PRBB

Speaker: Anna Bauer-Mehren Integrative Biomedical Informatics Laboratory

Requirements: People should bring their laptops with Cytoscape version 2.6.3 installed. Cytoscape can be downloaded here.

The slides will be available after the talk. Material needed for the tutorial will be available shortly before the tutorial on http://ibi.imim.es.

Posted in Tools

A brief introduction to LaTeX for writing papers

LaTeX is an open-source document markup language widely used among scientists. We propose a two-hour hands-on tutorial to show the basic features of this language. All you need is your laptop!
If your laptop runs Linux, check that you have the LaTeX compiler and one of the most commonly used editors (kile, texmaker, lyx).

If your laptop is a Mac, you can get everything you need to follow the tutorial here: http://guides.macrumors.com/Installing_LaTeX_on_a_Mac.

If you are a Windows user, please send an email to aledda@imim.es and we will provide you with a live CD with all the tools.

Any further questions can be addressed to aledda@imim.es.

The workshop will be given by Alice Ledda, PhD student at the Evolutionary Genomics lab, and Inma Tur, PhD student in Functional Genomics, GRIB.

Download the course material.


GEM hands-on

February 2nd at 11h in room 473.10 of PRBB, Paolo Ribeca from Genome Bioinformatics, CRG.

GEM is a set of tools for mapping and analysing next-generation sequencing data.

This seminar will be a more practical continuation, with use cases, of the tutorial presented in the previous seminar.

You can download binaries from http://gemlibrary.sourceforge.net

Posted in Sequencing, Tools