Abstract No.: ThP-LB5
Session: LATE-BREAKING/Bioinformatics
Presentation date: Thu, Aug 31, 2006
Presentation time: 09:50 – 11:20

Proline: The Proteomics Pipeline

Andreas De Stefani1, Michael Clarke2, Gerard Cagney2, Matthew Sullivan2

1 Siemens, Dublin, Ireland
2 University College Dublin, Dublin, Ireland

Correspondence address: Andreas De Stefani, UCD/Siemens, BioInformatics Lab, Belfield UCD, Dublin, D4 Ireland.

Keywords: Data Analysis; Database; MS/MS; Protein Identification.

Novel aspect: Automation of proteomics data analysis and storage based on XML workflow descriptions. A collection of proteomics data visualization and mining tools.


Proline is a complete software platform that automates the analysis of proteomics mass spectrometry data. It provides spectra quality validation, protein identification (employing both database searching and de novo sequencing), identification validation, relative abundance estimation, and detailed cross-sample analysis. Proline integrates with numerous third-party packages to provide these services. The system provides workflow management to define the task execution order and data transformation.
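As a rough illustration of how a workflow description can define task execution order, the following Python sketch parses a small XML document and derives a valid order from the declared dependencies. The element and attribute names (`workflow`, `task`, `id`, `depends`) and the task names are hypothetical and are not taken from Proline's actual schema; the sketch assumes Python 3.9 or later for `graphlib`.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Hypothetical workflow description. The element and attribute names are
# illustrative assumptions, not Proline's real XML format.
WORKFLOW_XML = """
<workflow name="identify">
  <task id="validate_spectra"/>
  <task id="database_search" depends="validate_spectra"/>
  <task id="de_novo" depends="validate_spectra"/>
  <task id="validate_ids" depends="database_search de_novo"/>
  <task id="store_results" depends="validate_ids"/>
</workflow>
"""


def execution_order(xml_text):
    """Return task ids in an order that respects every declared dependency."""
    root = ET.fromstring(xml_text.strip())
    # Map each task id to the set of task ids it must wait for.
    graph = {
        task.get("id"): set(task.get("depends", "").split())
        for task in root.iter("task")
    }
    # TopologicalSorter raises CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(WORKFLOW_XML)
print(order)
```

The database search and de novo steps do not depend on each other, so either may come first; a distributed system could run them in parallel.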

Proline has two distinct parts: the Core and the Portal.

The Proline Core is a distributed life sciences workflow and task execution system. Analysis tools can be integrated into Proline, customised workflows can be built to automate tasks, and a flexible data transformation system manages the flow and storage of data between the integrated components. The Core is preconfigured to automate tasks common to proteomics analysis and to capture the relevant data. The following third-party applications have been integrated into the Core:
*X! Tandem
*PeptideProphet and ProteinProphet (from the Trans-Proteomic Pipeline)

The Core has a configurable storage mechanism for generated data. Proline has its own proteomics database schema that can be implemented in any RDBMS (MySQL by default), and external third-party storage repositories can also be defined.

The Proline Portal is a collection of user interfaces and software tools that provide different methods to view and manipulate Proline.

Proline Results Viewer is a Portal view that presents protein identification results in both protein-centric and peptide-centric views.

Quantum is a view that examines relative protein abundance across multiple analyses, using spectral counting techniques.
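A minimal Python sketch of the general spectral counting idea follows; it is not Quantum's implementation. It assumes each analysis is given as a list of protein identifiers, one per identified MS/MS spectrum, and compares proteins by their share of all identified spectra in each analysis. The function name, the input format, and the pseudocount are illustrative assumptions.

```python
from collections import Counter


def relative_abundance(analysis_a, analysis_b, pseudocount=0.5):
    """Estimate per-protein abundance ratios (A / B) by spectral counting.

    Each argument is an iterable of protein ids, one entry per identified
    spectrum. Counts are normalised by the total number of identified
    spectra in each analysis so that runs of different size are comparable.
    The pseudocount (an assumption of this sketch) avoids division by zero
    when a protein is absent from one analysis.
    """
    counts_a, counts_b = Counter(analysis_a), Counter(analysis_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("each analysis needs at least one identified spectrum")
    ratios = {}
    for protein in set(counts_a) | set(counts_b):
        share_a = (counts_a[protein] + pseudocount) / total_a
        share_b = (counts_b[protein] + pseudocount) / total_b
        ratios[protein] = share_a / share_b
    return ratios


# Toy data: P1 accounts for 6 of 10 spectra in run A and 3 of 10 in run B.
run_a = ["P1"] * 6 + ["P2"] * 4
run_b = ["P1"] * 3 + ["P2"] * 7
ratios = relative_abundance(run_a, run_b, pseudocount=0)
print(ratios)
```

With a pseudocount of zero, a protein missing from the second analysis would cause a division by zero, which is why the default is non-zero.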

Pepper is a view that calculates peptide distributions for specified proteins. Pepper plots the peptide distribution from different analyses and provides a visual means for identifying different patterns in the distributions.

The Portal includes the utility and security features needed to manage access to the data within Proline, including user authentication and access control lists. It also provides methods for submitting data for analysis by the Core and for monitoring system load and workflow execution.

Together, the Core and Portal provide the researcher with a system powerful enough to manipulate proteomics data, yet flexible enough to be tailored to the researcher's particular needs and methods of practice.

Proline is freely available for general use at http://proteomics-portal.ucd.ie:8088. This installation provides de novo sequence analysis and database searching, using PepNovo and X! Tandem, respectively.