Submitted on July 22, 2008
Revised on October 27, 2008
Accepted on October 28, 2008
Predicting protein post-translational modifications using meta-analysis of proteome-scale data sets
Daniel Schwartz, Michael F. Chou, and George M. Church
Genetics Department, Harvard Medical School, Boston, MA 02115
Corresponding Author: dschwartz{at}hms.harvard.edu
Protein post-translational modifications (PTMs) are an important biological regulatory mechanism, and the rate of their discovery using high-throughput techniques is rapidly increasing. To make use of this wealth of sequence data, we introduce a new general strategy designed to predict a variety of PTMs in several organisms. We used the motif-x program to determine phosphorylation motifs in yeast, fly, mouse, and man, and lysine acetylation motifs in man. These motifs were then scanned against proteomic sequence data using a newly developed tool called scan-x to globally predict other potential modification sites within these organisms. Ten-fold cross-validation was used to determine the sensitivity and minimum specificity for each set of predictions, all of which showed improvement over other available tools for phospho-prediction. New motif discovery is a byproduct of this approach, and the phosphorylation motif analyses provide strong evidence of evolutionary conservation of both known and novel kinase motifs.
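The scanning step described above can be illustrated with a minimal sketch. This is not the scan-x algorithm, whose scoring scheme is not described in this abstract; it only shows the general idea of matching a sequence motif against every protein in a proteome and reporting the position of the candidate modified residue. The function name `scan_motif`, the toy sequences, and the R-x-x-S pattern are assumptions chosen for illustration, not data or motifs taken from this study.

```python
import re

def scan_motif(sequence, motif, center_offset):
    """Return 1-based positions of the candidate modified residue
    for every (possibly overlapping) occurrence of `motif`.

    `motif` is a regular expression; `center_offset` is the 0-based
    index of the modified residue within the motif.
    """
    # A zero-width lookahead reports overlapping matches as well.
    pattern = re.compile(f"(?=({motif}))")
    return [m.start() + center_offset + 1 for m in pattern.finditer(sequence)]

# Hypothetical toy "proteome" (not real protein sequences).
proteome = {
    "protA": "MKRRASLPEERRGSVQ",
    "protB": "MSTPLKAAA",
}

# Illustrative motif: arginine, two arbitrary residues, then serine.
# The serine (offset 3 within the motif) is the candidate phosphosite.
predictions = {
    name: scan_motif(seq, "R..S", center_offset=3)
    for name, seq in proteome.items()
}

for name, sites in predictions.items():
    print(name, sites)
```

A plain pattern match like this treats every hit as equally likely; a practical predictor would score each hit, and would estimate sensitivity and specificity by cross-validation as the abstract describes.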