Abstract
Combining models for protein target prediction with large databases of toxicological information on individual molecules allows the derivation of toxicological profiles, i.e., the extent to which molecules of known toxicity are predicted to interact with a set of protein targets. To predict protein targets of drug-like and toxic molecules, we built a computational multiclass model using the Winnow algorithm, based on a dataset of protein targets derived from the MDL Drug Data Report. A 15-fold Monte Carlo cross-validation, using 50% of each class for training and the remaining 50% for testing, provided an assessment of the accuracy of that model. We retained the three top-ranking predictions and found that in 82% of all cases the correct target was among them. The first prediction was the correct one in almost 70% of cases. A model built on the whole protein target dataset was then used to predict the protein targets for 150 000 molecules from the MDL Toxicity Database. For each experimentally determined toxicity class, we analysed how frequently its molecules were predicted to hit each target in the panel. This allowed us to identify clusters of proteins related by their toxicological profiles, as well as toxicities that are related to one another. Literature-based evidence is provided for some specific clusters to show the relevance of the relationships identified.
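
The validation protocol described in the abstract (repeated random splits with half of each class used for training, scored by whether the true target appears among the top-ranked predictions) can be illustrated with a minimal sketch. Everything named here is hypothetical: `fit_overlap` and `rank_overlap` are a simple feature-counting stand-in, not the Winnow model used in the paper, and the toy "molecules" are made-up feature sets rather than MDL Drug Data Report data.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, rng):
    """Return (train_idx, test_idx) with half of each class in training."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train, test = [], []
    for idxs in by_class.values():
        idxs = idxs[:]          # copy so the caller's data is untouched
        rng.shuffle(idxs)
        half = len(idxs) // 2
        train.extend(idxs[:half])
        test.extend(idxs[half:])
    return train, test


def top_k_hit_rate(ranked_predictions, true_labels, k):
    """Fraction of cases whose true label is among the first k predictions."""
    hits = sum(1 for ranked, y in zip(ranked_predictions, true_labels)
               if y in ranked[:k])
    return hits / len(true_labels)


def monte_carlo_cv(features, labels, fit, rank, n_rounds=15, k=3, seed=0):
    """Average top-1 and top-k hit rates over n_rounds random 50/50 splits."""
    rng = random.Random(seed)
    top1, topk = [], []
    for _ in range(n_rounds):
        train, test = stratified_half_split(labels, rng)
        model = fit([features[i] for i in train], [labels[i] for i in train])
        ranked = [rank(model, features[i]) for i in test]
        truth = [labels[i] for i in test]
        top1.append(top_k_hit_rate(ranked, truth, 1))
        topk.append(top_k_hit_rate(ranked, truth, k))
    return sum(top1) / n_rounds, sum(topk) / n_rounds


# Hypothetical stand-in classifier (NOT Winnow): count how often each
# feature occurs in each class, then rank classes by summed counts.
def fit_overlap(X, y):
    counts = defaultdict(lambda: defaultdict(int))
    for feats, label in zip(X, y):
        for f in feats:
            counts[label][f] += 1
    return counts


def rank_overlap(model, feats):
    scores = {c: sum(fc.get(f, 0) for f in feats) for c, fc in model.items()}
    return sorted(scores, key=lambda c: (-scores[c], c))


# Made-up toy data: three "targets", four "molecules" each.
toy_features = [{"a"}] * 4 + [{"b"}] * 4 + [{"c"}] * 4
toy_labels = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

top1, top3 = monte_carlo_cv(toy_features, toy_labels, fit_overlap, rank_overlap)
print(f"top-1 hit rate: {top1:.2f}, top-3 hit rate: {top3:.2f}")
```

Reporting both the top-1 and the top-3 hit rates mirrors the two figures quoted in the abstract; the number of rounds, the value of `k`, and the random seed are exposed as parameters so the protocol can be varied.
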
Original language | English |
---|---|
Pages (from-to) | 225-234 |
Number of pages | 10 |
Journal | Toxicology and Applied Pharmacology |
Volume | 231 |
Issue number | 2 |
Publication status | Published - 1 Sept 2008 |
Keywords
- protein target prediction
- Winnow algorithm
- computational toxicology
- toxicity
- ESTROGEN-RECEPTOR
- GENE-EXPRESSION
- DRUG DISCOVERY
- PARATHYROID-HORMONE
- CHEMICAL-STRUCTURE
- STEROID SULFATASE
- BONE-RESORPTION
- BREAST-CANCER
- K+-ATPASE
- ER-BETA