Predicting Metal-Binding Sites of Protein Residues
Metal ions in proteins are critical to protein function, structure and stability. For this reason, accurate prediction of metal-binding sites in proteins is important. Here, we present a study on predicting metal-binding sites for histidines (HIS) and cysteines (CYS) from protein sequence. Three methods are applied to this task: Support Vector Machine (SVM), Naive Bayes and variable-length Markov chain. All three use only sequence information to classify a residue as metal-binding or not. Several feature sets are employed to evaluate their impact on the prediction results. We predict metal-binding sites for these amino acids at 35% precision and 75% recall with Naive Bayes, at 25% precision and 23% recall with SVM, and at 0.05% precision and 60% recall with the variable-length Markov chain. We observe significant differences in performance depending on the selected feature set. The results show that Naive Bayes is competitive for metal-binding site detection.
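As a rough illustration of the sequence-only setup described in the abstract, the sketch below extracts a fixed-width window around each histidine or cysteine, trains a small position-wise Naive Bayes classifier on those windows, and computes precision and recall. It is not the authors' implementation: the window width, the padding symbol, the Laplace smoothing constant, and all names (`candidate_windows`, `WindowNaiveBayes`, `precision_recall`) are assumptions made for illustration, and the abstract does not say which feature sets were actually used.

```python
from collections import Counter, defaultdict
import math

PAD = "-"  # assumed padding symbol for positions beyond the sequence ends


def candidate_windows(seq, half_width=3):
    """Return (index, window) for every histidine (H) or cysteine (C) in seq."""
    padded = PAD * half_width + seq + PAD * half_width
    out = []
    for i, aa in enumerate(seq):
        if aa in "HC":
            # seq[i] sits at padded[i + half_width], so this slice is centred on it
            out.append((i, padded[i:i + 2 * half_width + 1]))
    return out


class WindowNaiveBayes:
    """Position-wise categorical Naive Bayes with Laplace smoothing (hypothetical)."""

    def __init__(self, alpha=1.0, alphabet_size=21):
        self.alpha = alpha                  # smoothing constant (assumed)
        self.alphabet_size = alphabet_size  # 20 amino acids + padding symbol

    def fit(self, windows, labels):
        self.class_counts = Counter(labels)
        self.counts = defaultdict(Counter)  # (label, position) -> residue counts
        for window, label in zip(windows, labels):
            for pos, aa in enumerate(window):
                self.counts[(label, pos)][aa] += 1
        return self

    def predict(self, window):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            score = math.log(n_label / total)  # log prior
            for pos, aa in enumerate(window):
                num = self.counts[(label, pos)][aa] + self.alpha
                den = n_label + self.alpha * self.alphabet_size
                score += math.log(num / den)   # log likelihood per position
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive (metal-binding) class, labelled 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if p and not t)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy windows and labels, invented purely to exercise the code.
    train_windows = ["ACD", "GHG", "KCK", "LHL"]
    train_labels = [1, 1, 0, 0]
    model = WindowNaiveBayes().fit(train_windows, train_labels)
    preds = [model.predict(w) for w in train_windows]
    print(precision_recall(train_labels, preds))
```

Reporting precision and recall separately, as the abstract does, matters here because metal-binding residues are typically a small minority of all histidines and cysteines, so a method can reach high recall while still producing many false positives.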