Dukka B KC
North Carolina A&T State University, USA
Posters & Accepted Abstracts: J Comput Sci Syst Biol
Protein phosphorylation, mediated by protein kinases, is one of the most important post-translational modifications in eukaryotes. By modulating protein function via the addition of a negatively charged phosphate group to a serine, threonine or tyrosine residue, phosphorylation regulates many cellular processes, including signal transduction, gene expression, the cell cycle, cytoskeletal regulation and apoptosis. An estimated 30% of the proteins in the human proteome are regulated by phosphorylation. Over the years, experimental methods such as tandem mass spectrometry (MS/MS) have been used to identify phosphorylation sites in proteins. Identifying phosphorylation sites with MS/MS poses challenges, however: the instruments are very expensive, the work is labor-intensive, and it requires specialized technical knowledge. As a consequence, phosphosite prediction algorithms, which predict whether a residue of interest is likely to be phosphorylated under cellular conditions, represent potentially valuable tools for annotating the entire phosphoproteomes of a wide variety of species. In this study, we describe RF-Phos, our random forest-based phosphorylation site prediction tool. RF-Phos uses random forest classifiers and a variety of sequence-driven features to identify putative phosphorylation sites across many protein families. In side-by-side comparisons based on 10-fold cross-validation and an independent dataset, RF-Phos performs comparably to or better than existing phosphosite prediction methods such as PhosphoSVM, GPS2.1 and Musite.
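To make the idea of sequence-driven features concrete, the following is a minimal sketch of how candidate sites might be turned into numeric vectors for a classifier. It is not the RF-Phos feature set, which the abstract does not specify: the window size, the padding character, the amino acid composition feature and the function names `candidate_windows` and `composition_features` are all assumptions chosen for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
PHOSPHO_RESIDUES = {"S", "T", "Y"}      # serine, threonine, tyrosine


def candidate_windows(sequence, flank=4, pad="-"):
    """Yield (position, window) for every S/T/Y residue.

    The window holds `flank` residues on each side of the candidate site;
    positions beyond the termini are filled with `pad` (an assumed convention).
    """
    padded = pad * flank + sequence + pad * flank
    for i, residue in enumerate(sequence):
        if residue in PHOSPHO_RESIDUES:
            # Index i in `sequence` sits at index i + flank in `padded`.
            yield i, padded[i : i + 2 * flank + 1]


def composition_features(window):
    """Return the fraction of each standard amino acid in the window,
    ignoring padding characters."""
    counts = Counter(ch for ch in window if ch in AMINO_ACIDS)
    total = sum(counts.values())
    return [counts[aa] / total if total else 0.0 for aa in AMINO_ACIDS]


# Toy sequence, for illustration only (not real data).
sequence = "MKRSPTLYA"
windows = list(candidate_windows(sequence, flank=2))
features = [composition_features(window) for _, window in windows]
```

Feature vectors of this kind, paired with known phosphorylated and non-phosphorylated labels, could then be passed to a random forest implementation (for example, scikit-learn's `RandomForestClassifier`) and evaluated with 10-fold cross-validation.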
Email: dbkc@ncat.edu