Does Drug-target Have a Likeness?

H. Xu; Y. Fang; L. Yao; Y. Chen; X. Chen

doi:10.1160/ME0425

Methods Inf Med 2007; 46(03): 360-366
DOI: 10.1160/ME0425

paper

Schattauer GmbH

Authors

H. Xu
¹College of Life Science, Zhejiang University, Hangzhou, Zhejiang, P. R. China

Y. Fang
²Department of Biomedical Informatics, Columbia University, New York, NY, USA

L. Yao
²Department of Biomedical Informatics, Columbia University, New York, NY, USA

Y. Chen
³Department of Computational Science, National University of Singapore, Singapore

X. Chen
¹College of Life Science, Zhejiang University, Hangzhou, Zhejiang, P. R. China

Further Information

Publication History

Publication Date: 20 January 2018 (online)

Summary

Objective: Discovering new targets that are sufficiently robust to yield marketable therapeutics is an enormous challenge. Conventional target identification approaches are disease-dependent and require a heavy experimental workload and comprehensive domain knowledge. In this work, we propose that a disease-independent property of proteins, “drug-target likeness”, can be explored to facilitate genome-scale target screening in the post-genomic age.

Methods: A Support Vector Machine (SVM) classifier was trained to recognize target and non-target protein sequences compiled from the Therapeutic Target Database, DrugBank, and Pfam. Protein sequences were encoded by the composition, transition and distribution features of their residues, and a Gaussian kernel function was used in SVM classification.
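The encoding and kernel named in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not say which residue properties were used, so the three-class hydrophobicity grouping, the function names `ctd_features` and `gaussian_kernel`, and the example peptide are all assumptions made for illustration.

```python
import math

# Three-class hydrophobicity grouping often used with
# composition/transition/distribution descriptors. ASSUMPTION: the
# summary does not state which residue properties the authors used.
GROUPS = {
    "1": "RKEDQN",   # polar
    "2": "GASTPHY",  # neutral
    "3": "CLVIMFW",  # hydrophobic
}
RESIDUE_CLASS = {aa: cls for cls, members in GROUPS.items() for aa in members}


def ctd_features(sequence):
    """Return 21 numbers: 3 composition, 3 transition, 15 distribution."""
    classes = [RESIDUE_CLASS[aa] for aa in sequence.upper() if aa in RESIDUE_CLASS]
    n = len(classes)
    if n == 0:
        raise ValueError("sequence has no recognised residues")

    # Composition: fraction of residues that fall in each class.
    composition = [classes.count(c) / n for c in "123"]

    # Transition: fraction of adjacent residue pairs that switch
    # between two given classes, in either direction.
    pairs = list(zip(classes, classes[1:]))
    transition = []
    for a, b in (("1", "2"), ("1", "3"), ("2", "3")):
        switches = sum(1 for p in pairs if p in ((a, b), (b, a)))
        transition.append(switches / len(pairs) if pairs else 0.0)

    # Distribution: relative chain position of the first, 25%, 50%, 75%
    # and last residue of each class.
    distribution = []
    for c in "123":
        positions = [i + 1 for i, x in enumerate(classes) if x == c]
        if not positions:
            distribution.extend([0.0] * 5)
            continue
        for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
            k = max(1, math.floor(frac * len(positions)))
            distribution.append(positions[k - 1] / n)

    return composition + transition + distribution


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical peptides, used only to exercise the functions.
features_a = ctd_features("MKVLAAGIVRKEDQ")
features_b = ctd_features("GASTPHYCLVIMFW")
similarity = gaussian_kernel(features_a, features_b, sigma=1.0)
```

Here `sigma` plays the role of the kernel width that the Results describe as fine-tuned. A smaller width makes the kernel value fall off faster as two feature vectors move apart.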

Results: The SVM with a fine-tuned kernel width achieved a sensitivity of 66.4 ± 5.1% and a specificity of 97.2 ± 0.6%, corresponding to an overall target prediction accuracy of 94.4 ± 0.8%.
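The three reported measures are related through the confusion matrix, as the short sketch below shows. The counts are hypothetical and chosen only for illustration, since the summary does not give the class sizes. They show one way in which an overall accuracy near 94% can coexist with a sensitivity near 66% when non-targets greatly outnumber targets.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # targets correctly recognised
    specificity = tn / (tn + fp)               # non-targets correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# HYPOTHETICAL counts (100 targets, 1000 non-targets), not data from the paper.
sens, spec, acc = confusion_metrics(tp=66, fn=34, tn=972, fp=28)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Because accuracy is a weighted average of sensitivity and specificity, with weights set by the class proportions, it is dominated by the specificity when most proteins are non-targets.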

Conclusions: Though preliminary, these results suggest that, similar to the “drug likeness” of small chemicals, their binding partners, the drug targets, also display shared features that are reflected in their sequences and can be captured by statistical learning approaches. Further research on how to measure a protein's likeness to a drug target accurately and interpretably is promising. As in “drug likeness” studies, advances in protein descriptors and statistical learning algorithms, together with more comprehensive and accurate gold-standard data sets from disease biology research, may help to further define the “drug-target likeness” property of proteins.

Keywords

Drug target - human genome - statistical learning - support vector machine
