These approaches perform differently because they combine different protein features, training datasets, training strategies, and machine learning algorithms. Prediction of protein subcellular multisite localization. A comparative study on feature extraction from protein. Here, we investigate the extent of utilization of human cellular localization mechanisms by viral. Protein subcellular localization prediction of eukaryotes. A multiple information fusion method for predicting. A Bayesian method for predicting protein subcellular localization. What is the pathway by which membrane proteins reach their proper subcellular destination in bacteria? Since cellular functions are often localized in specific compartments, predicting the subcellular localization of unknown proteins may be used to obtain useful information about their functions and. Bacterial protein subcellular localizations for several major marine bacterial groups were predicted using genomic, metagenomic and metatranscriptomic data sets following modification of MetaP software for use with partial.
The study of protein subcellular localization (PSL) is important for elucidating protein functions involved in various cellular processes. Most of the functions critical to the cell's survival are performed by proteins located in its different organelles, usually called subcellular locations. Computational methods aiming at predicting subcellular localization of. Many methods have been described to predict subcellular location from sequence information.
A multi-label classifier for predicting the subcellular. One of the central problems in computational biology is protein function identification in an automated fashion. It is widely recognized that much of the information for determining the final subcellular localization of proteins is found in their amino acid sequences. We present an approach to predict subcellular localization for Gram-negative bacteria.
The MetaP program for predicting protein subcellular localization for metagenomic sequences is a consensus algorithm, and its accuracy depends on that of the multiple predictors incorporated into the final algorithm. Because a protein's function is usually related to its subcellular localization, the ability to predict subcellular localization directly from protein sequences will be useful for inferring protein functions. Bacteria consume dissolved organic matter (DOM) through hydrolysis, transport and intracellular metabolism, and these activities occur in distinct subcellular localizations. Subcellular localization prediction of bacterial proteins is an important component of bioinformatics, with great importance for drug design and other applications. Several methods have been developed for predicting the subcellular location of eukaryotic, prokaryotic (Gram-negative bacterial) and human proteins, but no method is available for mycobacterial proteins. The subcellular localization of APases has significant ecological implications for marine biota but is largely unknown.
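The consensus idea mentioned above, where a final call is derived from several individual predictors, can be sketched as a weighted vote. This is only an illustration under stated assumptions: the predictor names (`tool_a`, `tool_b`, `tool_c`), the weighting scheme, and the alphabetical tie-break are hypothetical and are not taken from MetaP or any other published tool.

```python
from collections import defaultdict


def consensus_localization(predictions, weights=None):
    """Combine per-predictor localization calls by weighted voting.

    predictions: dict mapping a (hypothetical) predictor name to its label.
    weights: optional dict mapping predictor name to a vote weight
             (default weight is 1.0 for every predictor).
    Returns (winning_label, fraction_of_total_weight_supporting_it).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    weights = weights or {}
    scores = defaultdict(float)
    for predictor, label in predictions.items():
        scores[label] += weights.get(predictor, 1.0)
    total = sum(scores.values())
    # Highest score wins; ties are broken alphabetically so the result
    # is deterministic (an arbitrary choice made for this sketch).
    best = min(scores, key=lambda label: (-scores[label], label))
    return best, scores[best] / total


if __name__ == "__main__":
    calls = {"tool_a": "periplasm", "tool_b": "periplasm", "tool_c": "cytoplasm"}
    print(consensus_localization(calls))
    # Trusting tool_c more than the others can flip the outcome.
    print(consensus_localization(calls, weights={"tool_c": 3.0}))
```

Because the vote only redistributes the base predictors' outputs, the consensus cannot be more informative than its inputs, which matches the statement that the accuracy depends on the incorporated predictors.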
Recent tools and an experience report can be found in a recent paper by Meinken and Min (2012). PSLpred is a hybrid-approach-based method that integrates PSI-BLAST and three SVM modules based on compositions of residues, dipeptides and physicochemical properties. Advances in predicting subcellular localization of multi. We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machine (SVM) model and. Our experimental results indicate that the proposed method can resolve inconsistency problems in subcellular localization prediction for both Gram-negative and Gram-positive bacterial proteins. Many methods for microbial protein subcellular localization (SCL) prediction exist. The smallest unit of life is a cell, which contains numerous protein molecules. Genome-wide prediction of protein subcellular localization is an important type of evidence used for inferring protein function. Many computational tools for predicting protein subcellular localization have been developed in recent decades. The subcellular location of a protein can provide valuable information about its function. However, existing computational approaches have the following disadvantages. Genome-wide protein localization prediction strategies for.
With the rapid increase of protein sequences in the post-genomic age, the need for an automated and accurate tool to predict protein subcellular localization becomes increasingly important. Brinkman senior supervisor professor, department of. Predict subcellular localization of Gram-positive bacterial proteins by. Currently available methods have several limitations that make them inadequate for genome-scale predictions.
Information on the subcellular localization of Gram-negative bacterial proteins is of great significance for studying the pathogenesis of certain diseases and for drug design and discovery. Evidence that subcellular localization of a bacterial. Ab initio methods that predict subcellular localization for any protein sequence using only the native amino acid sequence and features predicted from the native sequence have shown the most remarkable improvements. In our efforts to predict useful vaccine targets against Gram-negative bacteria, we noticed that misannotated start codons frequently lead to wrongly assigned SCLs.
The user can choose from a large selection of genomes. In its original version, all of the three base predictors. Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. The recent worldwide spread of pneumonia-causing viruses, such as coronaviruses (COVID-19) and H1N1, has been endangering human lives all around the world. Expert system for predicting protein localization sites in. The eukaryotic cell is a highly ordered structure where nucleus-encoded proteins are synthesized in the cytoplasm and all non-cytosolic proteins are transported to their destined subcellular locations. In addition, we incorporate phylogenetic profiles and Gene Ontology (GO) terms derived from the protein sequence to. A new method for predicting the subcellular localization of. Thus there was a need to develop a dedicated method for predicting subcellular localization of mycobacterial proteins. Predicting the subcellular localization of viral proteins. Subcellular localization is a key functional attribute of a protein. Prediction of protein subcellular localization.
A multi-label classifier for predicting the subcellular localization of. In the current study, we use multi-label theory to develop a new sequence-based method for predicting the subcellular localization of Gram-negative bacterial proteins with both single and multiple locations, aimed at improving its absolute-true and absolute-false rates, the two most important and harshest metrics for a multi-label. The entire classifier thus established is called iLoc-Gneg, which can be used to predict the subcellular localization of both singleplex and multiplex Gram-negative bacterial proteins. After homology reduction to 25% sequence identity and filtering out protein. This and other problems in SCL prediction, such as the relatively high false-positive and false-negative rates of some tools. There is a scarcity of efficient computational methods for predicting protein subcellular localization in eukaryotes.
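The absolute-true and absolute-false rates named above can be illustrated with a short sketch. The text does not give their formulas, so the definitions below are an assumption based on the usual multi-label formulation: absolute-true is the fraction of proteins whose predicted location set matches the true set exactly, and absolute-false is the average number of wrongly assigned or missed labels divided by the total number of possible locations. The function names and the example with eight locations are illustrative only.

```python
def absolute_true(true_sets, pred_sets):
    """Fraction of samples whose predicted label set equals the true set."""
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")
    exact = sum(1 for t, p in zip(true_sets, pred_sets) if set(t) == set(p))
    return exact / len(true_sets)


def absolute_false(true_sets, pred_sets, num_labels):
    """Average per-sample label disagreement, normalized by num_labels.

    For each sample the disagreement is |T union P| - |T intersect P|,
    i.e. the count of labels that are either missed or wrongly predicted.
    """
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for t, p in zip(true_sets, pred_sets):
        t, p = set(t), set(p)
        total += (len(t | p) - len(t & p)) / num_labels
    return total / len(true_sets)


if __name__ == "__main__":
    truth = [{"cytoplasm"}, {"periplasm", "outer membrane"}]
    preds = [{"cytoplasm"}, {"periplasm"}]
    print(absolute_true(truth, preds))                   # exact-match rate
    print(absolute_false(truth, preds, num_labels=8))    # normalized error
```

Absolute-true is harsh because a multiplex protein with one of two locations predicted correctly still counts as a complete miss, which is why it is typically reported alongside softer measures.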
Many efforts have been made to predict protein subcellular localization. Assessing the precision of high-throughput computational and laboratory approaches for the genome-wide identification of protein subcellular localization in bacteria. Prediction of subcellular localization of bacterial proteins, Bioinformatics 21(10). Based on a study last performed in 2010, PSORTb v3. Prokaryotic protein subcellular localization prediction and genome-scale comparative analysis, examining committee. The PSO-LocBact method is a PSO method for combining the results of multiple classifiers for the prediction of protein subcellular localization in both Gram-negative and Gram-positive bacteria. Computational methods for predicting protein subcellular localization, which take some input information about a protein and output its subcellular localization, are much more reliable. Therefore, predicting the subcellular location of protein sequences is a key step to understand the biological functions of protein sequences.
A key step to achieve this is predicting to which subcellular location the protein belongs, since protein localization correlates closely with its function. We developed a web server, PSLpred, for predicting subcellular localization of Gram-negative bacterial proteins with an overall accuracy of 91. For subcellular localization of Gram-negative bacterial proteins, Table 2 shows that when the decomposition scale is 3, the highest prediction accuracy is 94. List of protein subcellular localization prediction tools.
Identifying protein subcellular localization (SCL) is important for deducing protein function, annotating newly sequenced genomes, and guiding experimental designs. Protein subcellular localization prediction for Gram-negative bacteria using amino acid subalphabets and a combination of multiple support vector machines. Many methods for predicting protein subcellular localization were based on the AAC-discrete model; see, e. Prediction subcellular localization of Gram-negative.
Protein subcellular localization prediction based on compartment. An adaptive boosting method for predicting subchloroplast localization of plant proteins. In this study, we developed a new feature extraction method based on the pK value and frequencies of amino acids to represent a protein as a real-valued vector. Automated prediction of bacterial protein subcellular localization is an important tool for genome annotation and drug discovery. Recent decades have witnessed remarkable progress in bacterial protein subcellular localization by computational approaches. Bacteria lack an endoplasmic reticulum, a Golgi apparatus, and transport vesicles and yet are capable of sorting and delivering integral membrane proteins to particular sites within the cell with high precision. Nakai (2000), based on the observation that sequences targeted to specific locations rely on the N-terminal sorting or signal sequences. Estimation of subcellular proteomes in bacterial species. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools that identify their subcellular localization in a timely and effective manner based on sequence information alone. Protein subcellular localization closely relates to the protein function.
Subcellular localization, PSSM, PseAAC, linear dimensionality reduction, PCA, LDA. However, much less effort has been made to address those proteins which may simultaneously exist at, or move. This program can predict 11 distinct locations each in plant and animal species. In order to perform a comprehensive survey of prediction methods, we selected only methods that accepted large batches of protein sequences, were publicly available, and were able to predict localization to at least nine of the major subcellular locations: nucleus, cytosol, mitochondrion, extracellular region, plasma membrane, Golgi apparatus, endoplasmic reticulum (ER), peroxisome. The input information that we are talking about are the related.
Methods for predicting bacterial protein subcellular localization. PSORTb (for "bacterial PSORT") is a high-precision localization prediction method for bacterial proteins. The computational prediction of the subcellular localization of bacterial proteins is an important step in genome annotation and in the search for novel vaccine or drug targets. The extensive metagenomic sequence databases from the Global Ocean Sampling expedition provide an opportunity to address this question. Breakdown of the Gram-negative bacterial protein benchmark. Subcellular localization of marine bacterial alkaline. However, this is not the case for viruses whose proteins are often involved in extensive interactions at various subcellular localizations with host proteins. Subcellular localization of proteins, Scholars Research Library. Evaluation and comparison of mammalian subcellular. However, most of these methods either rely on global sequence properties or use a set of.
It can be used to infer potential functions for a protein, to either design or support the results of particular experimental approaches and, in the case of surface-exposed proteins, to quickly identify potential drug. PSORT has been one of the most widely used computational methods for such bacterial protein analysis. MetaP consensus algorithm for subcellular localization prediction of fragmentary sequences.
Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, 200240, China. A wide variety of methods for protein subcellular localization prediction have been proposed over recent years. Various methods for predicting subcellular localization of protein sequences have been. This method is capable of generating final localization predictions based on protein. In this paper, by introducing the multi-label scale and hybridizing the information of Gene Ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular. Comparison with the state-of-the-art method in predicting plant protein subcellular localization a. SCLpred: protein subcellular localization prediction by N-to-1 neural networks. When the decomposition scale level is 4, the highest overall average prediction accuracy is 97. PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. Therefore these predictors cannot estimate the correct subcellular localization if the N-terminus of proteins is absent.
The prediction of a bacterial protein's subcellular localization can be of considerable aid to microbiological research. Protein subcellular localization, consequent to protein sorting or protein trafficking, is a key functional characteristic of proteins. This method uses support vector machines trained on multiple feature vectors based on n-peptide compositions. However, determining the localization sites of a protein through wet-lab experiments can be time-consuming and labor-intensive. The prediction accuracy of these methods has increased by over 30% in the past decade. SherLoc2 is a comprehensive high-accuracy subcellular localization prediction system. We present a software package and a web server for predicting the subcellular localization of protein sequences based on the ngLOC method. Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n-peptide compositions. The subcellular localization (SCL) of proteins provides important clues to their function in a cell.
The predictor developed via the aforementioned procedures is called pLoc-mGneg, where "pLoc" stands for "predict subcellular localization" and "mGneg" for "multi-label Gram-negative bacterial proteins". It is applicable to animal, fungal, and plant proteins and covers all main eukaryotic subcellular locations. Knowing the subcellular location of proteins is important for understanding their functions. Thus, computational approaches become highly desirable. Protein subcellular localization prediction using artificial. The bioinformatic prediction of protein subcellular localization has been extensively studied for prokaryotic and eukaryotic organisms. Predicting subcellular localization of Gram-negative bacterial proteins by linear dimensionality reduction method. RNApredator is a web server for the prediction of bacterial sRNA targets. Multi-location Gram-positive and Gram-negative bacterial. Prediction subcellular localization of Gram-negative.
Gardy and others published Methods for predicting bacterial protein subcellular localization. In some cases, these tools may yield inconsistent and conflicting prediction results. We have addressed this question by using green fluorescent protein. Using the nonlinear dimensionality reduction method for. Computational prediction of subcellular localization. While a variety of computational tools have been developed for this purpose, errors in the gene models and the use of protein sorting signals that are not recognized by the more commonly accepted tools can diminish the accuracy of their output. SherLoc2 integrates several sequence-based features as well as text-based features. Bacterial alkaline phosphatases (APases) are important enzymes in organophosphate utilization in the ocean. Here, we present a new prediction method, pTARGET, that can predict proteins targeted to nine different subcellular.
It has been shown, however, that protein subcellular localization can be predicted by training an SVM on the amino acid composition of a whole protein (Hua and Sun, 2001). Predicting subcellular localization of Gram-negative bacterial proteins by linear dimensionality reduction method, Protein and Peptide Letters 17(1). Since the 1991 release of PSORT I, the first comprehensive algorithm to predict bacterial protein localization, many other localization prediction tools have been developed. Predicted protein subcellular localization in dominant.
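The composition features referred to in this text (the amino acid composition of a whole protein, and the dipeptide composition used by some SVM modules) can be sketched as follows. This is a minimal illustration, not the feature pipeline of any cited tool: the function names are hypothetical, and skipping non-standard residue codes is an assumption made here for simplicity. The resulting vectors are the kind of fixed-length input a classifier such as an SVM would be trained on.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue frequencies.

    Non-standard characters (e.g. X, gaps) are ignored; this is an
    assumption of the sketch, not a rule taken from the source text.
    """
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of adjacent-pair frequencies.

    Only pairs in which both residues are standard amino acids count.
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not valid:
        raise ValueError("sequence contains no standard dipeptides")
    counts = {dp: 0 for dp in DIPEPTIDES}
    for p in valid:
        counts[p] += 1
    return [counts[dp] / len(valid) for dp in DIPEPTIDES]


if __name__ == "__main__":
    toy = "MKKLLPTAAAGLLLLAAQPAMA"  # toy sequence for illustration only
    aac = amino_acid_composition(toy)
    dpc = dipeptide_composition(toy)
    print(len(aac), len(dpc))
```

Both vectors have the same length for every protein regardless of sequence length, which is what makes them usable as classifier input; the cost is that they discard residue order beyond adjacent pairs.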
Thus the prediction of protein localization sites is of both theoretical and practical interest. To provide an intuitive picture, a flowchart is provided in Fig. A list of published protein subcellular localization prediction tools. Previously developed methods for localization prediction in bacteria exhibit poor predictive performance and are not conducive to high-throughput analysis. Predicting the subcellular localization of a protein is a critical step in processes ranging from genome annotation to drug and vaccine target discovery. Computational prediction of bacterial protein subcellular localization (SCL) provides a quick and inexpensive means for gaining insight into protein function, verifying experimental results, annotating newly sequenced bacterial genomes, detecting potential cell surface/secreted drug targets, as well as identifying biomarkers for microbes. Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the rapid increase of sequenced genomic data, the need for an automated and accurate tool to predict subcellular localization becomes increasingly important. Several computational approaches for predicting subcellular localization have been developed and proposed. Predicting subcellular localization of Gram-negative. In this article, two efficient multi-label predictors, Gpos-ECC-mPLoc and Gneg-ECC-mPLoc, are developed to predict the subcellular locations of multi-label Gram-positive and Gram-negative bacterial proteins respectively. The CELLO method enables prediction of five subcellular localizations in Gram-negative bacteria: cytoplasm, inner membrane, periplasm, outer.