TY - JOUR
T1 - The relationship between the sequence identities of alpha helical proteins in the PDB and the molecular similarities of their ligands
AU - Mitchell, John Blayney Owen
PY - 2001
Y1 - 2001
N2 - This paper considers the relationship between the percentage sequence identities of protein chains and the molecular similarities of the ligands they bind. Among a set of alpha helical proteins from the PDB, it is found that related proteins tend to bind similar ligands. Furthermore, the property of binding similar ligands can be used to define the categories of "like" and "unlike" pairs of protein chains, separated by an approximate cutoff at a sequence identity of, or somewhat above, 45%. Similarly, the property of binding related protein chains can be used to define "low" and "high" similarity pairs of ligand residues, with a cutoff at a Tanimoto score of 0.70. The ligands bound to two "like" protein chains are five times more likely to be of high similarity than would be expected if protein sequence identity and ligand molecular similarity were independent variables. Nonetheless, the nature of the PDB means that it is unclear whether the same conclusions would be reached with a data set representing an unbiased sample of all protein-ligand complexes in a living cell. The construction of an appropriate data set for such a study represents a significant challenge.
KW - CLASSIFICATION
KW - DESCRIPTORS
KW - VALIDATION
KW - EVOLUTION
UR - http://www.scopus.com/inward/record.url?scp=0035498342&partnerID=8YFLogxK
M3 - Article
SN - 0095-2338
VL - 41
SP - 1617
EP - 1622
JO - Journal of Chemical Information and Computer Sciences
JF - Journal of Chemical Information and Computer Sciences
IS - 6
ER -