FEATURE for Metals
Metals play a variety of roles in biological processes, and hence their presence in a protein structure can yield vital functional information. Because the residues that coordinate a metal often undergo conformational changes upon binding, detection of binding sites based on simple geometric criteria in proteins without bound metal is difficult. However, aspects of the physicochemical environment around a metal binding site are often conserved even when this structural rearrangement occurs. We have developed a Bayesian classifier using known zinc binding sites as positive training examples and nonmetal binding regions that nonetheless contain residues frequently observed in zinc sites as negative training examples. In order to allow variation in the exact positions of atoms, we average a variety of biochemical and biophysical properties in six concentric spherical shells around the site of interest. At a specificity of 99.8%, this method achieves 75.5% sensitivity in unbound proteins at a positive predictive value of 73.6%. We also test its accuracy on predicted protein structures obtained by homology modeling using templates with 30%-50% sequence identity to the target sequences. At a specificity of 99.8%, we correctly identify at least one zinc binding site in 65.5% of modeled proteins. Thus, in many cases, our model is accurate enough to identify metal binding sites in proteins of unknown structure for which no high sequence identity homologs of known structure exist.
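The shell-averaging step described above can be illustrated with a short sketch. This is not the FEATURE source code: the function name `shell_features`, the shell width of 1.25 Å, the use of a plain per-atom property matrix, and the choice to report zeros for empty shells are all assumptions made for illustration. Only the idea of averaging properties over six concentric spherical shells comes from the description.

```python
import numpy as np

def shell_features(site, coords, props, n_shells=6, shell_width=1.25):
    """Average per-atom properties in concentric shells around a candidate site.

    site        : (3,) coordinates of the point of interest
    coords      : (n_atoms, 3) atom coordinates
    props       : (n_atoms, n_props) per-atom property values (hypothetical inputs)
    n_shells    : number of concentric shells (six, as in the description)
    shell_width : shell thickness in angstroms (assumed value, not from the source)
    """
    site = np.asarray(site, dtype=float)
    coords = np.asarray(coords, dtype=float)
    props = np.asarray(props, dtype=float)

    # Distance of every atom from the site, then the index of the shell it falls in.
    dist = np.linalg.norm(coords - site, axis=1)
    shell_idx = np.floor(dist / shell_width).astype(int)

    # One row per shell, one column per property; empty shells stay at zero (assumption).
    features = np.zeros((n_shells, props.shape[1]))
    for s in range(n_shells):
        mask = shell_idx == s
        if mask.any():
            features[s] = props[mask].mean(axis=0)
    return features

# Toy example: four atoms, two properties each; the last atom lies outside all shells.
coords = [[0.5, 0, 0], [1.0, 0, 0], [2.0, 0, 0], [100.0, 0, 0]]
props = [[1.0, 0.0], [3.0, 1.0], [5.0, 1.0], [9.0, 9.0]]
features = shell_features([0, 0, 0], coords, props)
```

Because each feature is an average over a shell rather than a value at an exact atomic position, small shifts of individual atoms within a shell leave the feature vector largely unchanged, which is the tolerance to positional variation that the description aims for.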
FEATURE Zinc-binding domain 2007
Feb 19, 2014

Training data for machine learning models to predict zinc-binding domains.

Download Links

Feb 19, 2014
495 KB
Source code
Naive Bayes machine learning system for predicting metal ion cofactors

Feb 19, 2014
80 MB
Machine learning training data for zinc-binding domain


Ebert, J.C. and Altman, R.B. (2008). Robust recognition of zinc binding sites in proteins. Protein Science, 17(1), 54-65.