Abstract No.: ThP-027
Session: Bioinformatics
Presentation date: Thu, Aug 31, 2006
Presentation time: 09:50 – 11:20

Strategies for More Reliable and Sensitive Identification of Endogenous Peptides

Maria Falth1, Karl Skold1, Marcus Svensson1, Anna Nilsson1, David Fenyo2, Per E. Andren1

1 Uppsala University, Uppsala, Sweden
2 The Rockefeller University, New York, United States

Correspondence address: Maria Falth, Uppsala University, Lab for Biological and Medical Mass Spectrometry, Husargatan 3, Box 583, Uppsala, 75123 Sweden.

Keywords: Data Analysis; Database; Mass Spectrometry; Neuropeptides.

Novel aspect: Improved automatic identification of endogenous peptides.

 

Introduction
Searching large databases with unspecific cleavage is time consuming and yields poor results. In a typical MS experiment on a brain tissue sample, about 500 peptides are detected with DeCyder MS (GE Healthcare), but only 50 to 100 of them are identified. The aim of this study is to investigate how to optimize the identification of endogenous peptides from mass spectrometry analyses. To this end, a database that reflects the content of the sample was created. Four different databases were used together with two search engines, X!Tandem (www.thegpm.org) and Mascot (Matrix Science), to investigate how the search results depend on how well the database reflects the sample.

Materials and Methods
This study was performed using two search engines, X!Tandem and Mascot, and four databases: IPI_Mouse, known peptide precursors from SwePep (www.swepep.org), known peptides from SwePep, and a database of peptides predicted from IPI_Mouse.
For the first three databases, unspecific cleavage was used; for the prediction database, no cleavage was used because it consists of pre-cleaved peptides. All peptides identified with a score above the significance threshold were considered true positives.
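
To illustrate why unspecific cleavage enlarges the search space compared with a database of pre-cleaved peptides, the following minimal Python sketch counts the candidate peptides that a single protein contributes when every contiguous subsequence is allowed. The function name and the length limits (5 to 50 residues) are illustrative assumptions, not settings taken from X!Tandem or Mascot.

```python
def count_unspecific_candidates(protein_length, min_len=5, max_len=50):
    """Count contiguous subsequences of a protein within a length window.

    With unspecific cleavage, any start and end position is a valid
    peptide boundary, so a protein of length n contributes (n - k + 1)
    candidates for every allowed peptide length k.
    The length window is a hypothetical choice for illustration.
    """
    longest = min(max_len, protein_length)
    return sum(protein_length - k + 1 for k in range(min_len, longest + 1))


if __name__ == "__main__":
    for n in (100, 500, 1000):
        print(n, count_unspecific_candidates(n))
```

A pre-cleaved database, by contrast, contributes exactly one candidate per entry, which is why no cleavage rule is needed when searching it.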

Prediction database
Specific protein precursors are processed by proprotein convertases to form biologically active peptides. The most common cleavage site is the C-terminal side of two adjacent basic amino acids. These sites were used to predict possible biologically active peptides from IPI_Mouse. The database contains many false positives, that is, predicted peptides that are not actually produced, but since the database is used for the identification of tandem MS data, these false positives will not match any spectra.
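
The cleavage rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the procedure used to build the prediction database: it assumes that the basic residues are lysine (K) and arginine (R), that only fragments between neighbouring cleavage sites are kept, and that the basic residues remain attached to the C-terminus of each fragment. The function name, the minimum length and the example sequence are hypothetical.

```python
BASIC_RESIDUES = "KR"  # assumption: lysine and arginine count as basic


def predict_peptides(precursor, min_len=3):
    """Split a precursor on the C-terminal side of each dibasic pair.

    Returns the fragments between neighbouring cleavage sites, in
    sequence order. Fragments shorter than min_len are dropped
    (min_len is an illustrative choice).
    """
    cuts = {0, len(precursor)}
    for i in range(len(precursor) - 1):
        if precursor[i] in BASIC_RESIDUES and precursor[i + 1] in BASIC_RESIDUES:
            cuts.add(i + 2)  # cut after the second basic residue
    ordered = sorted(cuts)
    fragments = [precursor[a:b] for a, b in zip(ordered, ordered[1:])]
    return [f for f in fragments if len(f) >= min_len]


if __name__ == "__main__":
    # Hypothetical precursor sequence, for illustration only
    print(predict_peptides("MASKRGGFLKKYGGFM"))
```

Applying such a rule to every entry in a protein database necessarily produces many fragments that are never formed in vivo, which is the source of the false positives mentioned above.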

Data sets
Two datasets, both from mouse brain tissue, were used for the database searches. The first was a collection of 98 tandem mass spectra derived from 70 peptides previously identified by X!Tandem. The second was a tissue sample from mouse striatum, for which all 2867 generated tandem mass spectra were used. All samples were analyzed using nanoLC ESI-LTQ MS/MS.

Results
The highest number of positively identified peptides was obtained with the peptide precursor database and Mascot. Compared with X!Tandem, Mascot was more sensitive (fewer false negatives), while X!Tandem was more specific (fewer false positives).
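
For reference, sensitivity and specificity relate to the four outcome counts as in the following minimal Python sketch. The counts in the usage example are invented purely to show the arithmetic and are not results from this study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real peptides that were identified: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of wrong candidates that were rejected: TN / (TN + FP)."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    print(sensitivity(true_pos=80, false_neg=20))
    print(specificity(true_neg=90, false_pos=10))
```

Fewer false negatives raise sensitivity, and fewer false positives raise specificity.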
The cut-off score for a significant identification in Mascot increased with the size of the database, which generated many more false negatives. Conversely, with a smaller database the cut-off score decreased and the number of false positives increased.
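
The dependence of the cut-off on database size can be illustrated with a rough Python sketch. It assumes a score of the form -10·log10(P) and a Bonferroni-style correction in which the significance level is divided by the number of candidate peptides considered for a spectrum. This is an approximation for illustration, not a confirmed description of Mascot's exact calculation, and the function name and candidate counts are hypothetical.

```python
import math


def approximate_cutoff(num_candidates, alpha=0.05):
    """Approximate significance cut-off for a -10*log10(P) style score.

    Assumption: the per-spectrum significance level alpha is divided by
    the number of candidate peptides tested, so more candidates (a
    larger database) push the cut-off upwards.
    """
    return -10.0 * math.log10(alpha / num_candidates)


if __name__ == "__main__":
    # Hypothetical candidate counts, for illustration only
    for n in (10, 1000, 100000):
        print(n, round(approximate_cutoff(n), 1))
```

Under this assumption, every tenfold increase in the number of candidates raises the cut-off by 10 score units, so a weak but correct match is more likely to be rejected in a large database.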
During this study, novel peptides not previously identified in mouse brain tissue were discovered. The present study demonstrates the importance of using different databases when identifying endogenous peptides.