Computational Prediction of Compound–Protein Interactions for Orphan Targets Using CGBVS
Abstract
1. Introduction
2. Materials and Methods
2.1. CGBVS
2.2. Virtual Orphan GPCR Model
2.3. Model Validation
2.4. Applicability Index
3. Results and Discussion
3.1. Analysis of Prediction Accuracy
3.2. Applicability of CGBVS for Orphan Targets
4. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Gene Name | Accession | Active | Inactive | Protein Name |
---|---|---|---|---|
ADRA1A | P35348 | 1800 | 244 | Alpha-1A adrenergic receptor |
ADRA1B | P35368 | 1425 | 302 | Alpha-1B adrenergic receptor |
ADRA1D | P25100 | 1369 | 248 | Alpha-1D adrenergic receptor |
ADRB1 | P08588 | 1021 | 539 | Beta-1 adrenergic receptor |
ADRB2 | P07550 | 1542 | 1832 | Beta-2 adrenergic receptor |
ADRB3 | P13945 | 1472 | 215 | Beta-3 adrenergic receptor |
AGTR1 | P30556 | 1167 | 599 | Type-1 angiotensin II receptor |
AGTR2 | P50052 | 900 | 113 | Type-2 angiotensin II receptor |
CCKBR | P32239 | 1014 | 516 | Gastrin/cholecystokinin type B receptor |
CCR2 | P41597 | 1379 | 287 | C-C chemokine receptor type 2 |
CCR5 | P51681 | 1749 | 333 | C-C chemokine receptor type 5 |
CHRM1 | P11229 | 1768 | 1088 | Muscarinic acetylcholine receptor M1 |
CHRM2 | P08172 | 1493 | 663 | Muscarinic acetylcholine receptor M2 |
CHRM3 | P20309 | 1666 | 605 | Muscarinic acetylcholine receptor M3 |
CHRM4 | P08173 | 751 | 522 | Muscarinic acetylcholine receptor M4 |
CHRM5 | P08912 | 510 | 651 | Muscarinic acetylcholine receptor M5 |
CRHR1 | P34998 | 1648 | 239 | Corticotropin-releasing factor receptor 1 |
CXCR3 | P49682 | 1099 | 184 | C-X-C chemokine receptor type 3 |
EDNRA | P25101 | 1195 | 257 | Endothelin-1 receptor |
FFAR1 | O14842 | 774 | 300 | Free fatty acid receptor 1 |
GHSR | Q92847 | 1541 | 191 | Growth hormone secretagogue receptor type 1 |
GLP1R | P43220 | 3452 | 94,774 | Glucagon-like peptide 1 receptor |
GNRHR | P30968 | 1217 | 96 | Gonadotropin-releasing hormone receptor |
GPR119 | Q8TDV5 | 1234 | 110 | Glucose-dependent insulinotropic receptor |
GPR55 | Q9Y2T6 | 153 | 553 | G-protein coupled receptor 55 |
HCRTR1 | O43613 | 2200 | 783 | Orexin receptor type 1 |
HCRTR2 | O43614 | 2611 | 725 | Orexin receptor type 2 |
HRH1 | P35367 | 999 | 406 | Histamine H1 receptor |
HRH3 | Q9Y5N1 | 3395 | 212 | Histamine H3 receptor |
HRH4 | Q9H3N8 | 903 | 318 | Histamine H4 receptor |
HTR1A | P08908 | 3532 | 480 | 5-hydroxytryptamine receptor 1A |
HTR1B | P28222 | 932 | 190 | 5-hydroxytryptamine receptor 1B |
HTR1D | P28221 | 1078 | 133 | 5-hydroxytryptamine receptor 1D |
HTR2A | P28223 | 3540 | 676 | 5-hydroxytryptamine receptor 2A |
HTR2B | P41595 | 1337 | 381 | 5-hydroxytryptamine receptor 2B |
HTR2C | P28335 | 2588 | 756 | 5-hydroxytryptamine receptor 2C |
HTR6 | P50406 | 2925 | 306 | 5-hydroxytryptamine receptor 6 |
HTR7 | P34969 | 1532 | 248 | 5-hydroxytryptamine receptor 7 |
MC4R | P32245 | 2311 | 857 | Melanocortin receptor 4 |
MCHR1 | Q99705 | 3116 | 524 | Melanin-concentrating hormone receptor 1 |
NPY5R | Q15761 | 1038 | 100 | Neuropeptide Y receptor type 5 |
OPRD1 | P41143 | 3180 | 2086 | Delta-type opioid receptor |
OPRK1 | P41145 | 3743 | 1197 | Kappa-type opioid receptor |
OPRL1 | P41146 | 1305 | 128 | Nociceptin receptor |
OPRM1 | P35372 | 3797 | 2033 | Mu-type opioid receptor |
P2RY12 | Q9H244 | 912 | 237 | P2Y purinoceptor 12 |
PTGDR2 | Q9Y5Y4 | 2541 | 143 | Prostaglandin D2 receptor 2 |
S1PR1 | P21453 | 2165 | 379 | Sphingosine 1-phosphate receptor 1 |
TACR1 | P25103 | 2334 | 223 | Substance-P receptor |
TACR2 | P21452 | 794 | 227 | Substance-K receptor |
TACR3 | P29371 | 788 | 143 | Neuromedin-K receptor |
TSHR | P16473 | 1140 | 15,271 | Thyrotropin receptor |
Descriptor | Class | SVM Kernel | Equation |
---|---|---|---|
alvaDesc | compound | Gaussian | |
ECFP | compound | Tanimoto | |
PROFEAT2016 | protein | Gaussian | |
ProtVec | protein | Gaussian | |
MSA | protein | linear | |
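
The three kernel types named in the table above have well-known textbook forms. The sketch below assumes those standard definitions: Gaussian as exp(-gamma * ||x - y||^2), Tanimoto as <x, y> / (<x, x> + <y, y> - <x, y>), and linear as <x, y>. The function names, the `gamma` parameter, and the toy fingerprints are illustrative assumptions, and the article's own parametrization (for example, how the Gaussian width is written) may differ.

```python
import math


def gaussian_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance).

    `gamma` is a hypothetical width parameter chosen for this sketch.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def tanimoto_kernel(x, y):
    """Tanimoto kernel: <x, y> / (<x, x> + <y, y> - <x, y>).

    For binary fingerprints this is shared bits over bits set in either vector.
    """
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y) - dot
    # Two all-zero vectors have no defined similarity; this sketch returns 0.0.
    return dot / denom if denom else 0.0


def linear_kernel(x, y):
    """Linear kernel: plain dot product."""
    return sum(a * b for a, b in zip(x, y))


# Toy binary fingerprints (made-up values, not real ECFP output).
fp_a = [1, 1, 0, 1]
fp_b = [1, 0, 0, 1]
print(tanimoto_kernel(fp_a, fp_b))  # 2 shared bits / 3 bits set in either
print(gaussian_kernel([0.2, 0.5], [0.1, 0.7], gamma=0.5))
print(linear_kernel([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
```

The Tanimoto form is a natural fit for sparse binary fingerprints such as ECFP, whereas the Gaussian form suits dense real-valued descriptor vectors; in practice the descriptors would usually be scaled before a Gaussian kernel is applied.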
Compound Descriptor | Protein Descriptor | 0–1 | 1–10 | 10–30 | 30–50 |
---|---|---|---|---|---|
alvaDesc | PROFEAT | 13 | 23 | 14 | 2 |
ECFP | PROFEAT | 14 | 28 | 9 | 1 |
alvaDesc | MSA | 10 | 14 | 19 | 9 |
ECFP | MSA | 8 | 19 | 22 | 3 |
alvaDesc | ProtVec | 17 | 20 | 14 | 1 |
ECFP | ProtVec | 15 | 26 | 11 | 0 |
Compound Descriptor | Protein Descriptor | Spearman’s Corr. | r | |
---|---|---|---|---|
alvaDesc | PROFEAT | 0.4466 | 79.73 | 0.6264 |
ECFP | PROFEAT | 0.3949 | 94.18 | 0.6385 |
alvaDesc | MSA | 0.7792 | 12.77 | 0.4365 |
ECFP | MSA | 0.8047 | 11.04 | 0.4564 |
alvaDesc | ProtVec | −0.0362 | 39.87 | 0.5801 |
ECFP | ProtVec | 0.1759 | 89.28 | 0.8000 |
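
Spearman's rank correlation, reported in the table above, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that standard calculation in plain Python. The function names and the toy numbers are assumptions for illustration only; they are not the article's data, and the sketch does not presume which two quantities the article correlates.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: a hypothetical per-target score against a per-target accuracy.
score = [0.10, 0.35, 0.35, 0.80, 0.95]
accuracy = [0.55, 0.60, 0.72, 0.70, 0.90]
print(spearman(score, accuracy))
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic rescaling of either variable, which makes it a reasonable choice when the relationship is expected to be monotonic but not necessarily linear.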
Compound Descriptor | Protein Descriptor | | PPV | Accuracy | p-Value |
---|---|---|---|---|---|
alvaDesc | PROFEAT | 6.490 | 0.6521 | 0.7115 | |
ECFP | PROFEAT | 7.486 | 0.7272 | 0.7500 | |
alvaDesc | MSA | 8.702 | 0.8710 | 0.8846 | |
ECFP | MSA | 8.999 | 0.8889 | 0.8846 | |
alvaDesc | ProtVec | 7.671 | 0.8333 | 0.6153 | |
ECFP | ProtVec | −10.79 | 0.8000 | 0.6731 | |
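
PPV (positive predictive value) and accuracy, the two metrics reported in the table above, are both derived from a binary confusion matrix: PPV = TP / (TP + FP) and accuracy = (TP + TN) / (TP + FP + TN + FN). The sketch below assumes these standard definitions. The function names and the toy labels are hypothetical and are not taken from the article.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fp, tn, fn) for binary labels, where 1 means positive."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ppv(tp, fp):
    """Positive predictive value: fraction of predicted positives that are correct."""
    # With no positive predictions the ratio is undefined; return NaN.
    return tp / (tp + fp) if (tp + fp) else float("nan")


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


# Toy labels: 1 = positive class, 0 = negative class (made-up values).
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(ppv(tp, fp), accuracy(tp, fp, tn, fn))
```

PPV ignores negatives entirely, so it answers "how trustworthy is a positive call?", while accuracy weighs both classes and can look favorable on imbalanced data even when positive calls are unreliable.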
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kanai, C.; Kawasaki, E.; Murakami, R.; Morita, Y.; Yoshimori, A. Computational Prediction of Compound–Protein Interactions for Orphan Targets Using CGBVS. Molecules 2021, 26, 5131. https://doi.org/10.3390/molecules26175131