Next Article in Journal
Polymers Enhancing Bioavailability in Drug Delivery
Previous Article in Journal
Gold-Polypyrrole-Loaded Eosin in Photo-Mediated Treatment of Hidradenitis Suppurativa: In Vivo Trans-Epidermal Permeation Study and Clinical Case Report
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Cocrystal Prediction of Bexarotene by Graph Convolution Network and Bioavailability Improvement

1
State Key Laboratory of Drug Research and Drug Discovery and Design Center, Pharmaceutical Analytical & Solid-State Chemistry Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
2
School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
3
University of Chinese Academy of Sciences, Beijing 100049, China
*
Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Pharmaceutics 2022, 14(10), 2198; https://doi.org/10.3390/pharmaceutics14102198
Submission received: 22 September 2022 / Revised: 9 October 2022 / Accepted: 13 October 2022 / Published: 16 October 2022
(This article belongs to the Section Pharmaceutical Technology, Manufacturing and Devices)

Abstract

:
Bexarotene (BEX) was approved by the FDA in 1999 for the treatment of cutaneous T-cell lymphoma (CTCL). The poor aqueous solubility causes the low bioavailability of the drug and thereby limits the clinical application. In this study, we developed a GCN-based deep learning model (CocrystalGCN) for in-silico screening of the cocrystals of BEX. The results show that our model obtained high performance relative to baseline models. The top 30 of 109 coformer candidates were scored by CocrystalGCN and then validated experimentally. Finally, cocrystals of BEX-pyrazine, BEX-2,5-dimethylpyrazine, BEX-methyl isonicotinate, and BEX-ethyl isonicotinate were successfully obtained. The crystal structures were determined by single-crystal X-ray diffraction. Powder X-ray diffraction, differential scanning calorimetry, and thermogravimetric analysis were utilized to characterize these multi-component forms. All cocrystals present superior solubility and dissolution over the parent drug. The pharmacokinetic studies show that the plasma exposures (AUC0−8h) of BEX-pyrazine and BEX-2,5-dimethylpyrazine are 1.7 and 1.8 times that of the commercially available BEX powder, respectively. This work sets a good example for integrating virtual prediction and experimental screening to discover the new cocrystals of water-insoluble drugs.

Graphical Abstract

1. Introduction

BEX (Targretin ®) was approved by the FDA in 1999 for the treatment of cutaneous T-cell lymphoma (CTCL) [1] and is a member of the rexinoids family. As a selective agonist of the retinoid X receptor, it induces cell differentiation and apoptosis and is effective as an oral treatment for mycosis fungoides and Sézary syndrome [2], as documented in the national guidelines of the UK [3]. The clinic dose of BEX is 300 mg/m2 once daily, and the main toxicities of BEX include xeroderma, central hypothyroidism, and elevation of cholesterol, which can be managed with dose attenuation [4]. The clinical utilization of BEX was limited by the poor aqueous solubility, which led to malabsorption with an absolute bioavailability of less than 20% [5]. In order to increase the absorption and potentially reduce the dose, new solid forms of BEX with higher bioavailability should be developed.
Cocrystals are multicomponent crystals of fixed stoichiometry that contain different interactions such as hydrogen bonds, halogen bonds, π-π stacking, and van der Waals interactions [6]. Supramolecular synthons play a crucial role in cocrystal design and assembly. The hydrogen bonding between unlike functional groups is called heterosynthon, which can often be served as a structural design unit for cocrystal synthesis [7]. As a new type of crystalline material, pharmaceutical cocrystals possess the potential to alter the physical and chemical properties of active pharmaceutical ingredients (API), such as melting point [8], chemical stability [9,10], solubility [11,12], bioavailability [13,14], and so on.
Traditional cocrystal screening methods such as solution crystallization [15], liquid-assisted grinding [16], dry grinding [17], and anti-solvent addition [18] are generally time-consuming, labor-intensive, and expensive economically. Thus, various computational approaches have been developed for in-silico cocrystal former (coformer) screening. Due to the abundant cocrystal data in the Cambridge Structural Database (CSD) [19], a series of knowledge-based methods including hydrogen bonding propensity [20,21], hydrogen bond motif searches [22,23], and molecular complementarity [24] are proposed. In addition, network-based methods [25,26] and ab initio methods such as molecular electrostatic potential surfaces (MEPS) [27,28], conductor-like screening models for real solvents (COSMO-RS) [29,30] are also available for cocrystal prediction.
With the development of computer science and the accumulation of data, machine learning-based approaches have been rapidly and widely applied in polymorphism and cocrystal prediction, such as crystal property prediction [31,32,33], crystal structure analysis [34,35], cocrystal former screening [36,37], and cocrystal formation prediction [38,39]. For example, Mswahili et al. [40] extracted descriptors from SMILES and built several machine learning-based models for cocrystal prediction. The result showed that the artificial neural network outperformed other models. Our group also developed cocrystal prediction models based only on 2D structures obtained from CSD, which achieved high prediction performance and was also verified by a cocrystal screening experiment [41]. It cannot be ignored that traditional machine learning models can usually only handle fixed-size inputs, and feature extraction relies on feature engineering, which will lead to information loss problems [42]. Over the past few years, graph convolution networks (GCN) have become a practical and powerful tool for non-Euclidean structured data [43]. A molecule can naturally be considered a graph, whose atoms and bonds can be represented by nodes and edges. Therefore, it is more suitable to apply graphs to molecular representation and cocrystal prediction. Devogelaer et al. reported an ensemble approach combining a molecular fingerprint-based model and a molecular graph-based model for the prediction of cocrystal formation [44]. Recently, a powerful graph neural network (GNN) framework named CCGNet was developed for quickly predicting coformers in diverse cocrystal materials [45]. The results showed that CCGNet exhibited superior performance against seven competitive models and three challenging independent test sets, and one new energetic cocrystal predicted by CCGNet was successfully synthesized.
In previous studies, the datasets used to build the cocrystal prediction model were unbalanced (fewer negative samples), and the negative samples in the balanced dataset were usually from artificially generated data, which would result in false negative samples. Additionally, most models pay less attention to the noncovalent interactions between API and coformers. The aim of this study was first to collect a balanced cocrystal dataset in which negative samples included experimental sources and artificially generated samples, and then to develop a graph convolution network-based deep learning model, CocrystalGCN, for cocrystal prediction. After that, the CocrystalGCN was used to conduct virtual cocrystal screening of BEX from 109 coformers. We selected 30 top-scored coformers for experimental validation. The results showed that 4 out of 30 coformers successfully formed cocrystals with BEX. The physicochemical properties of these cocrystals were characterized using powder X-ray diffraction (PXRD), differential scanning calorimetry (DSC), thermogravimetric analysis (TGA), and so on. Finally, the pharmacokinetic study in rats was conducted to evaluate the in vivo absorption of BEX cocrystals and the free drug.

2. Materials and Methods

2.1. Data Collection and Processing

Datasets from various sources have been used to develop models in previous studies, most of which are limited data collected from the literature. The Cambridge Structural Database (CSD), a comprehensive database of organic and metal-organic molecule crystal structures, contains published cocrystal structures around the world [19]. Our group previously collected two-component crystal structures from CSD as positive samples to develop machine-learning classification models [41]. Here, the new version of CSD (version 2022.1.0., Cambridge Crystallographic Data Centre, Cambridge, UK) is filtered based on the same rules as reference [41] to obtain cocrystal data. Finally, 8016 cocrystal structures were obtained as positive samples (denoted a csd_pos). The same method as in reference [41], where different molecules are randomly combined into molecular pairs, was employed to generate 8016 negative samples (denoted as gen_neg). It is worth noting that false negative samples may be included in artificially generated negative samples. Recently, Jiang and partners collected 1052 negative samples generated by experiments from 186 literatures to minimize false negative samples [45]. Therefore, we obtained 1012 negative samples after removing duplicates, and then replaced the same number of artificially generated negative samples with the negative samples from experimental source (denoted as exp_neg) according to specific rules, to further minimize the false negative and build high-quality balanced dataset. A small subset was deleted from gen_neg in terms of the following steps:
  • Count the frequency of each compound in csd_pos and gen_neg;
  • Calculate the difference between the frequencies of each compound in gen_neg and csd_pos, rank them from large to small;
  • If the frequencies of top 1 compound in gen_neg and csd_pos are M and N, respectively, and M > N, use the MaxMin method [46] to select N molecule pairs from the M molecule pairs of top 1 compound in gen_neg, and delete the remaining molecules;
  • Repeat step 3 to top 2, top 3, …, top n compounds until 1012 artificially generated negative samples are removed, and then add the exp_neg.
8016 negative samples including artificial or experimental samples were obtained. The whole diagram of dataset processing is shown in Figure 1.

2.2. Cocrystal Representation

For pharmaceutical cocrystals, the API interacts with coformer in a certain stoichiometric ratio by hydrogen bonds, π-π stacking, van der Waals, or other noncovalent bonds, rather than covalent bonds. Inspired by the work of Cho et al. [47], which utilizes two graphs to represent protein-ligand complex and are encoded as covalent and noncovalent adjacency matrices, we also apply the matrix strategy to cocrystal representation. Two-component cocrystal is first converted into an unweighted molecular graph in which each vertex represents an atom and each edge represents a bond. The adjacency matrix A is an m × m matrix, where A i j is either 0 or 1. A i j = 1 if i th and j th vertices are connected, and 0 if not. A can be divided into four smaller adjacency matrices, which encode the connection information of API-API ( A API : API ), API-CF ( A API : CF ), CF-API ( A CF : API ), and CF-CF ( A CF : CF ) atoms, respectively.
A = [ A 11 A 1 m A m 1 A m m ]     =     [ A API : API A API : CF A CF : API A CF : CF ]
m: total number of atoms in the cocrystal; API: active pharmaceutical ingredient; CF: cocrystal former (Coformer).
Specifically, covalent and noncovalent adjacency matrix strategies were employed for cocrystal. As shown in Figure 2, the covalent and noncovalent adjacency matrices are in green and blue boxes, denoted as A C   and A NC , respectively. A C represents the bond connectivity in API or coformer molecular graphs, in which A API : API and A CF : CF matrices are filled with 1 if there is a covalent bond between atoms, and 0 otherwise, A API : CF and A CF : API matrices are filled with 0. The A NC represents all possible intermolecular noncovalent interactions between API and coformer, in which A API : CF and A CF : API matrices are filled with 1, and A API : API and A CF : CF matrices are filled with 0.

2.3. Architecture of the CocrystalGCN

Graph can be represented as G = ( V ,   E ,   A ) , where V is a set of nodes, E is a set of edges, and A is a weighted or unweighted adjacency matrix. For m-nodes graph, the adjacency matrix,   A   m × m , denotes the following:
A i j =   { a i j ,         i f   e i j   E     0 ,               o t h e r w i s e  
where a i j = 1 if the A is unweighted adjacency matrix. GNN has shown powerful performance in many drug discovery tasks [48]. GCN is one of GNN architecture variants, which apply the convolution operator to non-Euclidean structured data, such as molecule, social networks, and also have achieved state-of-the-art performance in classification and regression tasks [42].
To better predict whether API and coformer can form cocrystal, a classification model named CocrystalGCN was developed. Similar to InteractionNet, CocrystalGCN contains five modules, among which two graph convolution modules adopt covalent ( A C ) and noncovalent ( A NC ) adjacency matrix strategies mentioned above respectively. The whole framework of CocrystalGCN is shown in Figure 3. From the Figure 3, the five modules of CocrystalGCN are node embedding layers, graph convolutional layers with A C (GCAc), graph convolutional layers with A NC (GCAnc), graph pooling layer (GP), and fully connected layers (FC). In addition, there are two residual connection operations between graph convolutional layers.
Let X i n denote atomic feature matrix (detailed atom features are shown in Table S1), after the cocrystal molecular graphs are input into CocrystalGCN, the X i n is first updated as node embedding h N E through node embedding layers,
h N E = σ ( b 2 + σ ( b 1 + X i n W 1 ) W 2 )
Here and elsewhere in this paper, σ represents the activation function, is the Rectified Linear Units (ReLU). b x and W x are bias and the trainable parameter matrix, respectively. Once the node embedding layers propagation is finished, GCAc and GCAnc layers receive node embedding h N E for local information aggregation. For each convolution step, the residual connection operations are used to updates representation additively. The detailed mathematics formulas of graph convolution are as follows:
h c = h N E + σ ( A C h N E W C )
h N C = h c + σ ( A N C h c W N C )
Cocrystal prediction belongs to graph-level classification task, so node-level atomic feature embedding need to be aggregated and generate a molecular vector in a permutation-invariant way. Therefore, h N C is aggregated into molecular vector h G P ,
h G P = 1 T h N C
After global pooling step, the molecular vector h G P is fed into fully connection layers and eventually outputs the probability of cocrystal formation between API and coformer.
h 0 = f ( σ ( b F C 2 + σ ( b F C 1 + h G P W F C 1 ) W F C 2 ) )
The 1 in Equation (6) stands for all-ones vector, function f ( ) in Equation (7) is sigmoid activation function.

2.4. Performance Evaluation

To evaluate the effectiveness of various models, three predictive performance metrics are calculated between observed and predicted category labels according to true positive (TP), false positive (FP), true negative (TN), and false negative (FN) as follows: Precision, Recall, and Accuracy (Acc). The detailed calculation formulas are as follows:
Recall = TP TP + FN
Peicision = TP TP + FP
Acc = TP + TN TP + TN + FP + FN
Another metric, the area under the receiver operating characteristic curve (AUC), is also employed to measure the binary classification capability. AUC value represents the performance of the model, the closer AUC value is to 1, the better the classification performance of the model, when AUC equals 0.5, the classification model is equivalent to a random classifier.

2.5. Model Training and Interpretation

The model attempts to differentiate positive coformers from whole coformer candidates, which can be regarded as a binary classification problem. Therefore, we adopted the binary cross-entropy shown below as loss function of the model, as follows:
L o s s = 1 N   i = 1 N [ y i · log y ^ i   + ( 1 y i ) · log ( 1 y ^ i ) ]
where N is the total number of samples, y i represents true label, and y ^ i represents the predicted probability of the ith sample. Adam [49] optimizer was used to minimize the loss function. Here, we implemented 10-fold cross-validation to evaluate prediction performance of the model. Dataset was randomly split into training set, validation set, and test set at the radio of 8:1:1. Hyperparameters searching was performed on the validation set. Table S2 shows the predefined hyperparameters search range. CocrystalGCN were implemented by using TensorFlow (version 2.8.0, Google, Mountain View, CA, America) [50].
Understanding why a model makes a certain decision can provide researchers with new and useful knowledge and make the model more acceptable [51,52]. In this work, we used a method named layer-wise relevance propagation (LRP) based decomposition [47,53] to explore the interpretation of CocrystalGCN model. The LRP method assumes that classifier can be decomposed into several layers, and the relevance of each neuron is calculated through reversely propagating of network (from output to input layer). The contribution of a neuron to the prediction result can be represented by relevance. Here, we can obtain the contribution of each atom in the compound to the cocrystal score predicted by CocrystalGCN through LRP analysis and generate the heat map of atomic contribution. By analyzing the heat map and the corresponding experimental results, we can gain a deeper understanding of the basis for the model predictions.

2.6. Prediction of Cocrystal of BEX

Due to the large number of coformers, the cocrystal screening experimental methods are expensive and time-consuming. In the present study, we applied the established CocrystalGCN model to predict coformer of BEX.
In total, we collected 109 coformer candidates based on the generally recognized as safe (GRAS) list and in-house coformer database. The compound pairs consisting of BEX and coformer candidates were fed into CocrystalGCN model. The cocrystal probability generated by CocrystalGCN model was ranked from large to small, and the top 30 coformers were selected for cocrystal screening experiment.

2.7. Materials

The commercially available BEX was purchased from Shanghai Titan Scientific Co., Ltd. (Shanghai, China). The analytical grade of solvents used for crystallization experiments was obtained from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China) and used as received.

2.8. Preparation of BEX Cocrystals

The cocrystals of BEX were synthesized by the method of liquid-assisted grinding (LAG). BEX (0.1 mmol) and stoichiometric amounts of the four corresponding coformers were ground for about 20 min at 30 Hz after adding 10 μL of methanol. A Retsch MM200 device was utilized to conduct the grinding experiments. Then, the resulting samples were detected by PXRD to confirm the formation of cocrystals.

2.9. Preparation of Single Crystals

The saturated solution of API was obtained by adding the excess BEX to the 2 mL of ethanol solvent and it was stirred for 12 h. Then the BEX-saturated solution was filtered and the excess coformers were added to the solution to stir overnight to produce the cocrystals. After that, the resulting samples were filtered and the single crystals were obtained by cooling the final filtrate at 4 °C for about a week.

2.10. Powder X-ray Diffraction (PXRD)

A Bruker D8 Advance X-ray diffractometer was utilized to obtain the PXRD patterns. The voltage was set to 40 kV and the current was set to 40 mA. Data were collected in the 3–40° 2θ range at a scan rate of 5°/min at ambient temperature. Data were imaged and integrated using RINT Rapid, and peaks were analyzed using Rigaku’s Jade 6.0. A corindon standard was used to calibrate the instrument.

2.11. Single Crystal X-ray Diffraction (SCXRD)

SCXRD experiments were performed on an APEX II CCD diffractometer (Bruker) with Mo Kα radiation (λ = 0.71073 Å). Mounting a suitable crystal on the loop and diffraction was performed at 170 K. The integration and scaling of the intensity data were operated on the SAINT program, and SADABS was used to correct the data for absorption effects. The direct methods on SHELXTL were used to solve and full matrix least squares were used to refine the crystal structure by OLEX2. The CCDC numbers are 2,203,843—2,203,846 for BEX-2,5-dimethylpyrazine, BEX-pyrazine, BEX-methyl isonicotinate, BEX-ethyl isonicotinate., respectively.

2.12. Thermal Analyses

A DSC TA Q2000 instrument was utilized to conduct the DSC experiments with a nitrogen flow purged at 50 mL min−1. Solid powder weighing 4−8 mg was heated in sealed aluminum pans, and the heating rate is 10 °C min−1. A Netzsch TG 209F3 apparatus was utilized to conduct the Thermogravimetric analysis, and the samples were heated at 10 °C min−1 to 400 °C.

2.13. Fourier Transformation Infrared (FTIR)

A Nicolet-Magna FT-IR 750 spectrometer was utilized to obtain the FTIR spectra in the range from 350 to 4000 cm−1, with a resolution of 4 cm−1 at ambient conditions.

2.14. High-Performance Liquid Chromatography

The concentration of BEX samples was detected by the Agilent 1260 Series Infinite HPLC. The Eclipse XDB-C18 column (4.6 × 150 mm, 5 μm) was used to separate the analytes. The column temperature was set at 30 °C and the injection volume was 10 μL. The mobile phase was composed of 0.1% trifluoroacetic acid in water and acetonitrile (10:90, v/v) pumped at 1.0 mL/min. The 260 nm was set as the UV-vis detection wavelength.

2.15. Solubility and Powder Dissolution

Excess amounts (10 mg) of BEX and four cocrystals samples were added to the 4 mL of pH 2.0, 4.5, and 6.8 buffer solutions respectively, with the presence of 0.05% CTAB (hexadecyl trimethyl ammonium bromide). Then the saturated solutions were stirred at 300 rpm for one day at 37 °C. The equilibrium solubility of BEX was determined by HPLC after filtering the solutions.
The solid samples of BEX and four cocrystals were sieved through 100-mesh sieves. In total, 20 mg of BEX and corresponding cocrystal powders (n = 3) were added to pH 6.8 phosphate buffer with the presence of 0.05% CTAB at 37 °C, rotating at 100 rpm. The samples were collected at 5, 10, 15, 20, 30, 40, 50, and 60 min. The concentration was measured by HPLC.

2.16. Pharmacokinetics in Rats

The PK studies were performed according to the Guide for Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committee of Shanghai Institute of Materia Medica. There were two groups of male Sprague−Dawley rats and the number of rats is six in each group. The 0.5% CMC-Na suspension formulations of BEX bulk powder, BEX-pyrazine, and BEX-2,5-dimethylpyrazine were prepared for the drug administration. The dose is 10 mg/kg BEX. The animals had free access to water and fasted overnight before dosing. After the administration, the 600 μL blood sample was collected at 0.5, 1, 2, 3, 4, 6, and 8 h. Plasma was separated by centrifugation (14,000 rpm, 10 min) and stored at −80 °C. In total, 200 μL of rat plasma sample was mixed with 600 μL acetonitrile through a vortex mixer for 20 min and centrifuged for 5 min (14,000 rpm). The BEX concentration was measured by the SCIEX Triple Quad™ 4500 LC-MS instrument. The analytes were separated by the Supelco Ascentis Express C18 column (2.1 × 50 mm, 2.7 μm). The column temperature was set at 35 °C. The mobile phase was composed of 0.2% formic acid in water (solvent A) and acetonitrile (solvent B) pumped at 0.6 mL/min. The gradient elution program started with 55% B and increased linearly to 90% B at 1.5 min and held for 0.5 min. It returned to the 55% B at the 3 min and held for 2 min. The injection volume was 10 μL, and the UV-vis detection wavelength was set at 260 nm. The negative ion detection mode was operated in the mass spectrometer. Multiple reaction monitoring modes (MRM) was used to perform the detection and quantification, with m/z 347.3→303.0 for BEX. The DAS 2.0 program was used to perform the PK analysis.

3. Results and Discussion

3.1. Dataset Analysis

To illustrate the advantages of the composition of the negative dataset, we visualized the chemical space distribution of exp_neg (blue), gen_neg (orange), and csd_pos (green) datasets based on 1024 bits extended-connectivity fingerprint (ECFP) [54] by the t-distribution stochastic neighbor embedding (t-SNE) method [55] in Figure 4. From the figure, it is observed that the chemical space distribution of compounds in gen_neg is similar to the csd_pos, which can easily be explained by the generation of gen_neg based on csd_pos. In contrast, the chemical space distribution of compounds in exp_neg is not completely contained in gen_neg and csd_pos, and the different points are mainly concentrated in the red box. This phenomenon shows that negative samples generated by experiments can extend the chemical diversity of artificially generated negative samples; additionally, it may also reduce the number of false negative samples. Therefore, the exp_neg and gen_neg were combined to form the final negative sample dataset. The ECFP fingerprint was calculated by the RDKit library [56] and t-SNE was implemented by Scikit-learn (version 1.0.2) [57].

3.2. Comparison of Prediction Performance between CocrystalGCN and Baselines

In order to compare the performance of CocrystalGCN, we also constructed several models based on both machine learning and deep learning methods. Three machine learning methods are the following: support vector machine (SVM) [58], random forest (RF) [59], and eXtreme gradient boosting (XGB) [60]. The latter two methods belong to the category of ensemble learning methods, which have been widely used in various tasks and achieved great results [61]. Deep neural networks (DNN) [62] and DeepDDS [63] are deep learning methods. A DNN typically includes an input layer, hidden layers, and an output layer, each of which consists of a different number of neurons. These classical machine learning-based models and DNN were implemented by using Scikit-learn (version 1.0.2), and predefined hyperparameters are shown in Table S3. The DeepDDS model was created for drug-drug synergy prediction, and its input is the embedding of two compounds and the gene expression profile of a cell line. Here, we removed the MLP module in the DeepDDS that encodes cell line features, leaving the rest unchanged. The difference between CocrystalGCN and revised DeepDDS is the way in which the representation of compounds is encoded. The embedding of two compounds in DeepDDS is encoded by two equivalent GCNs and then concatenated together, while in CocrystalGCN, the adjacency matrix of two compounds is concatenated together through different strategies and then input into a GCN layer with a feature matrix. Additionally, we also performed ablation experiments of CocrystalGCN by removing the GCAc (CocrystalGCN_C) and GCAnc (CocrystalGCN_C) layers separately to observe their influences on model performance.
We conducted 10-fold cross-validation to evaluate the predictive performance of each model. The whole dataset was randomly divided into training, validation, and test sets in an 8:1:1 ratio in each cross-validation experiment. The average predictive performance metrics of 10-fold were obtained as a final result of the model. Table 1 shows the statistical results of CocrystalGCN with its variants and baselines on the validation set and test set. As shown in Table 1, CocrystalGCN achieved pretty great performance in the Acc and Precision metrics both on the validation set and test set. DeepDDS achieved the best AUC value (0.871) on the test set, while RF achieved the best AUC value (0.884) on the validation set. It is worth noting that other performance metrics of RF also achieved competitive results compared with other models, which indicates the powerful predictive ability of RF. A previous study [41] on cocrystal prediction also found that the prediction performance of RF was better than other classical machine learning methods. Overall, the predictive performance of CocrystalGCN generally outperforms baselines.
To further explore how different adjacency matrix strategies affect the predictive ability of CocrystalGCN, we conducted an ablation experiment. CocrystalGCN_C and CocrystalGCN_NC represent the GCN layer of CocrystalGCN, containing GCAc and GCAnc, respectively, while CocrystalGCN contains both. From Table 1, we can find that CocrystalGCN and CocrystalGCN_NC outperform CocrystalGCN_C both on the validation set and the test set, which indicates that graph convolution modules adopting a noncovalent adjacency matrix strategy (GCAnc layer) is more important than those with a covalent adjacency matrix strategy (GCAc layer) for model performance. In other words, the GCAnc layer can well simulate the interactions between API and coformer, and this noncovalent interaction is necessary for the cocrystal formation of API and coformer. In short, our CocrystalGCN model generally outperforms its variants.

3.3. Interpretation of the CocrystalGCN

For exploring the interpretation of the CocrystalGCN, two cocrystals with CSD code GUCQUQ and NUBHAW from the test set were sampled, and conducted to LRP analysis. The intermolecular interaction mode in two cocrystals from the CSD database and the heat map obtained from trained CocrystalGCN are shown in Figure 5. For the cocrystal GUCQUQ, the three-dimensional intermolecular interaction mode (Figure 5a) shows a hydrogen bond formed between the hydroxyl group of 4-cyclohexene-1,2-diol and the amino group of 1,2-Cyclohexanediamine. The corresponding heat map of atomic contribution (Figure 5c) shows that the oxygen atom of the hydroxyl group and nitrogen atom of the amino group of the above two compounds are assigned the deepest red colors, respectively, indicating that the two atoms have the largest positive contribution to the predicted score, which is consistent with the experimental result. A similar result was obtained for another cocrystal NUBHAW. The interaction mode (Figure 5b) shows a hydrogen bond formed between the hydroxyl group contained in the carboxyl group of 2-chloro-5-nitrobenzoic acid and nitrogen on pyridine in nicotinamide, and the corresponding heat map of atomic contribution (Figure 5d) shows the hydroxyl oxygen on the carboxyl group and the nitrogen atom on the pyridine in a deeper red color. The other nitrogen atom in nicotinamide is assigned a blue color and generates a negative influence on prediction. The high correlation between LRP analysis and knowledge-based results indicates that our CocrystalGCN model can learn some knowledge related to cocrystal formation, and the LRP method applied in CocrystalGCN can help us understand why the model makes a decision.

3.4. Results of Virtual Screening Based on CocrystalGCN

The solubility of BEX is poor. We expect to improve its solubility with the cocrystal method. The optimal CocrystalGCN model trained on the whole dataset was used to predict the cocrystal formation of BEX. The score of cocrystal formation with BEX of 109 coformer candidates was predicted. The corresponding scores of the top 55 coformer candidates are displayed as a bar chart in Figure 6 because of space limitations. From the figure, the top 10 coformers and corresponding scores are pyrazine (0.565), p-acetophenetidide (0.450), n-phenylacetamide (0.432), glycocholic acid (0.426), ethyl isonicotinate (0.327), sorbic acid (0.311), benzene sulfonamide (0.269), p-toluenesulfonic acid (0.230), 2,5-dimethyl pyrazine (0.217), and methyl isonicotinate (0.214).
The prediction results show that the coformers after the top 30 all have scores below 0.1. Therefore, in order to quickly discover new cocrystals of BEX, the top 30 coformers were selected to perform cocrystal experimental screening through the LAG method. From the results of LAG, four coformers formed cocrystals with BEX, which are pyrazine, ethyl isonicotinate, 2,5-dimethyl pyrazine, and methyl isonicotinate. Other coformers did not form cocrystals with BEX, and the resulting samples were the physical mixtures of BEX and coformers. The chemical structures of four coformers with their score bars in yellow are displayed in Figure 6. The four coformers are ranked first, fifth, ninth, and tenth in the scoring list, respectively, demonstrating that the prediction performance of CocrystalGCN is highly accurate. Particularly, we also observed that pyrazine and 2,5-dimethyl pyrazine, ethyl isonicotinate, and methyl isonicotinate are two groups of homologues, and their structures are highly similar. This phenomenon of similarity can be explained by the fact that our model, in addition to learning the natural crystallization mechanism, also takes into account the contribution of structural similarity of coformers to cocrystal formation. Springuel et al. [64] pointed out that molecules having relatively similar chemical structures are more likely to form cocrystals with the same coformer in their study.

3.5. PXRD and Thermal Analysis of Cocrystals

All of the samples obtained by grinding were detected by PXRD to determine the BEX new phases (Figure S1). The patterns of the resulting samples are significantly different from those of BEX raw materials or coformers, confirming the formation of cocrystal phases. Take BEX-pyrazine as an example. The characteristic peaks of BEX at 2θ = 4.81°, 11.51°, 12.55°, 14.58°, 14.81°, 18.82°, 23.22°, 24.46°, and the characteristic peaks of pyrazine at 2θ = 17.53°, 18.32°, 19.03°, 25.23°, 28.30°, and 31.30° disappeared, while the characteristic peaks of cocrystal appeared at 2θ = 5.76°, 11.54°, 13.78°, 14.72°, 15.82°, 16.08°, 17.52°. The measured peaks of all cocrystals are in good agreement with the simulated patterns, indicating the formation of high-phase purity cocrystals.
The DSC curves (Figure 7) show that the commercially available BEX powder possesses a high crystallinity with a melting point of 224.4 °C. For the cocrystal of BEX-pyrazine, it represents a small endothermic peak at about 113.9 °C, corresponding to the release of pyrazine from the crystal structures during the heating process. This is also evidenced by the TG plot in Figure 8a, where a weight loss of 9.97%, equivalent to half a pyrazine molecule, is observed. Once the coformer is released from the cocrystal structure, the remaining solid is the starting material BEX, accounting for the appearance of the endothermic peak at 224.4 °C on the DSC curve of BEX-pyrazine. The cocrystal of BEX-2,5-dimethylpyrazine, it experienced the same thermal event as that of BEX-pyrazine and showed a mass loss of 23.69% in TG (Figure 8b), indicating a 1:1 molar ratio of BEX to 2,5-dimethylpyrazine. For the cocrystals of BEX-methyl isonicotinate and BEX-ethyl isonicotinate, the coformers are lost at 122.7 °C and 90.8 °C, respectively. Additionally, according to the TG results, the stoichiometric ratio of API and conformer in these two cocrystals is 1:1.

3.6. Crystal Structure Analysis

Crystallographic data of cocrystals are listed in Table S4. BEX-pyrazine (Figure 9) was solved in the triclinic P-1 space group with one BEX molecule and half a molecule of pyrazine in the asymmetric unit. BEX connects with pyrazine molecules (O1−H1⋯N1, 2.723 Å) to form the centrosymmetric structure. Two adjacent BEX molecules are separated by another antiparallel BEX molecule to form the 2D buckle structure. The 2D layers are alternatively packed along the a-axis to form the 3D structure via interlayer π-π interactions.
The cocrystal of BEX-2,5-dimethylpyrazine belongs to the monoclinic C 2/c space group, and the asymmetric unit contains a molecule of BEX and a molecule of 2,5-dimethylpyrazine. The carboxyl groups in BEX molecules form an intermolecular hydrogen bond with 2,5-dimethylpyrazine molecules via O1−H1···N1 (d = 2.726 Å). In the ac plane, the hydrophilic part is composed of 2,5-dimethylpyrazine molecules stacked along the c-axis. Additionally, BEX molecules are interspersed on both sides of the hydrophilic part to form the 2D structure. Through the π-π interactions, the 2D structures extend along the b-axis to form 3D structures. The cocrystals of BEX-methyl isonicotinate and BEX-ethyl isonicotinate stack in a similar manner as that of BEX-2,5-dimethylpyrazine.

3.7. Solubility and Powder Dissolution

The solubility is one of the most important physicochemical properties for a drug to achieve a desired concentration in systemic circulation [65]. As a BCS Class II compound, the absorption of BEX is limited by its low solubility. Additionally, the equilibrium solubility of BEX and four cocrystals was measured in the presence of 0.05% CTAB in buffer solutions at pH 2.0, 4.5, and 6.8, as shown in Figure 10a. The solubility of BEX was less than 10 ug/mL at pH 2.0, indicating it was almost insoluble in the stomach. However, the cocrystal BEX-2,5-dimethylpyrazine presented a 2-fold superiority (20 ug/mL at pH 2.0) over the parent BEX bulk powder. Being a weakly acid drug, the solubility of BEX increased with the increasing pH. Therefore, the solubility of BEX was slightly higher than 10 ug/mL at pH 4.5 and it increased to about 30 ug/mL at pH 6.8. For the four cocrystals, the solubility did not show a significant difference with BEX raw materials at pH 4.5. However, all cocrystals presented about a 1.5-fold advantage compared to the parent BEX at pH 6.8. The PXRD patterns of residues after equilibrium solubility studies were shown in Figure S3, and the pH values of the final solutions were shown in Table S5. The results indicated that all cocrystals dissociated and the remaining solid was the raw material BEX. The pH values of the solutions after the solubility experiments did not change much.
The powder dissolution profiles (Figure 10b) indicated that BEX exhibited poor dissolution performance, with a measured solubility of about 30 ug/mL after 60 min at pH 6.8. All cocrystals presented higher solubility than BEX in all periods. The spring-parachute effect was observed in the dissolution of BEX-pyrazine and BEX-methyl isonicotinate, which showed the highest solubility at 40 and 20 min, respectively. The maximum solubility values of BEX-pyrazine, BEX-2,5-dimethylpyrazine, BEX-methyl isonicotinate, and BEX-ethyl isonicotinate were 1.8, 1.4, 1.6, and 1.4 times higher than those of BEX, respectively. The dissolution advantage of cocrystals over the parent BEX under the condition of pH 6.8 indicated that the absorption of BEX in the posterior part of the small intestine could be improved through the cocrystal formation. The PXRD patterns of residues after powder dissolution were shown in Figure S4 and the pH values of the final solutions were shown in Table S6. The conditions of the cocrystal dissociation were the same as those of the solubility experiments.

3.8. PK Studies in Rats

Owing to the improved solubility and dissolution of BEX by forming cocrystals, we hypothesize that the in vivo absorption can also be enhanced. Additionally, the edible cocrystals of BEX-pyrazine and BEX-2,5-dimethylpyrazine were chosen to perform the pharmacokinetic study in rats. The plasma concentration−time profiles of BEX and cocrystals are presented in Figure 11, and the mean pharmacokinetic parameters are listed in Table 2. The results reveal that the Cmax, AUC0−8h, and AUCinf of cocrystals are statistically different from those of free BEX. The Cmax of BEX-pyrazine and BEX-2,5-dimethylpyrazine are 1.8 and 2.1 times, respectively, as that of pure BEX, and the plasma exposures (AUC0−8h) of BEX-pyrazine and BEX-2,5-dimethylpyrazine show a 1.7 and 1.8-fold improvement compared to the commercially available BEX powder. The PK studies demonstrate that the oral absorption of BEX can be indeed enhanced, originating from the improved solubility and dissolution of these two cocrystals.

4. Conclusions

In summary, a balanced dataset containing 8016 positive and 8016 negative samples was collected. CocrystalGCN was trained for cocrystal prediction based on the dataset. The AUC, Acc, Precision, and Recall values of the CocrystalGCN classification model were 0.866, 0.811, 0.802, and 0.830, respectively. After that, we utilized the optimum CocrystalGCN model for virtually screening the cocrystals of BEX. Based on the combination of virtual and experimental screening of BEX, we successfully obtained four new cocrystals of BEX. The PXRD patterns and the thermal behaviors of different solid forms were investigated. The intermolecular interactions in the single-crystal structures were clarified. All cocrystals presented superior solubility and dissolution over the parent BEX in the buffer solutions. Additionally, the PK studies indicate that the bioavailability of BEX-pyrazine and BEX-2,5-dimethylpyrazine is 1.7 and 1.8 times, respectively, that of the commercially available BEX powder. It indicates that our predictive model is effective to guide the cocrystal design and the absorption issues of water-insoluble drugs can be addressed through the cocrystal strategy.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pharmaceutics14102198/s1, Figure S1: PXRD patterns of solid forms of BEX: (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX-methyl isonicotinate, and (d) BEX-ethyl isonicotinatetitle; Figure S2: FTIR spectra of (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX- methyl isonicotinate, and (d) BEX-ethyl isonicotinate; Figure S3: XRPD patterns of residues after equilibrium solubility studies: (a) pH 2.0, (b) pH 4.5, and (c) pH 6.8. The XRPD pattern from top to bottom in each figure is BEX-pyrazine, BEX-2,5-dimethylpyrazine, BEX-methyl isonicotinate, BEX- ethyl isonicotinate, and BEX raw materials, respectively; Figure S4: XRPD patterns of residues after powder dissolution: The XRPD pattern from top to bottom is BEX-pyrazine, BEX-2,5-dimethylpyrazine, BEX-methyl isonicotinate, BEX-ethyl isonicotinate, and BEX raw materials, respectively; Table S1: Atom features used in CocrystalGCN; Table S2: The search spaces of hyperparameters for CocrystalGCN; Table S3: The search spaces of hyperparameters for baseline models; Table S4: Crystallographic Data of BEX cocrystals; Table S5: The pH values of Bex and cocrystal solutions after equilibrium solubility experiments (n = 3); Table S6: The pH values of Bex and cocrystal solutions after powder dissolution at pH 6.8 (n = 3).

Author Contributions

Conceptualization, F.X., Y.C., K.C., X.M. and X.L.; investigation, F.X., Y.C. and J.-R.W.; methodology, F.X., Y.C., D.W. and Y.Z.; writing—original draft preparation, F.X. and Y.C.; writing—review and editing, X.M. and X.L.; supervision, K.C., X.M. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was financially supported by the National Natural Science Foundation of China (Grant No. 82130108), the Natural Science Foundation of Shanghai (Grant No. 22ZR147390), and the Shanghai Science and Technology Innovation Action Plan (Grant No. 20142201800).

Institutional Review Board Statement

The animal study protocol was approved by the Ethics Committee of Shanghai Institute of Materia Medica (protocol code: 2021-06-MXF-04, date: 25-06-2021).

Informed Consent Statement

Not applicable.

Data Availability Statement

The results obtained for all experiments performed are shown in the manuscript and Supplementary Materials, the raw data will be provided upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Assaf, C.; Bagot, M.; Dummer, R.; Duvic, M.; Gniadecki, R.; Knobler, R.; Ranki, A.; Schwandt, P.; Whittaker, S. Minimizing adverse side-effects of oral bexarotene in cutaneous T-cell lymphoma: An expert opinion. Br. J. Dermatol. 2006, 155, 261–266. [Google Scholar] [CrossRef] [PubMed]
  2. Panchal, M.R.; Scarisbrick, J.J. The utility of bexarotene in mycosis fungoides and Sezary syndrome. Oncotargets Ther. 2015, 8, 367–373. [Google Scholar] [CrossRef] [Green Version]
  3. Whittaker, S.J.; Marsden, J.R.; Spittle, M.; Jones, R.R. Joint British Association of Dermatologists and UK Cutaneous Lymphoma Group guidelines for the management of primary cutaneous T-cell lymphomas. Br. J. Dermatol. 2003, 149, 1095–1107. [Google Scholar] [CrossRef] [PubMed]
  4. Farol, L.T.; Hymes, K.B. Bexarotene: A clinical review. Expert Rev. Anticancer. Ther. 2004, 4, 180–188. [Google Scholar] [CrossRef] [PubMed]
  5. Chen, L.J.; Wang, Y.J.; Zhang, J.Z.; Hao, L.L.; Guo, H.J.; Lou, H.X.; Zhang, D.R. Bexarotene nanocrystal-Oral and parenteral formulation development, characterization and pharmacokinetic evaluation. Eur. J. Pharm. Biopharm. 2014, 87, 160–169. [Google Scholar] [CrossRef] [PubMed]
  6. Aitipamula, S.; Banerjee, R.; Bansal, A.K.; Biradha, K.; Cheney, M.L.; Choudhury, A.R.; Desiraju, G.R.; Dikundwar, A.G.; Dubey, R.; Duggirala, N.; et al. Polymorphs, Salts and Cocrystals: What’s in a Name? Cryst. Growth Des. 2012, 12, 2147–2152. [Google Scholar] [CrossRef]
  7. Bolla, G.; Sarma, B.; Nangia, A.K. Crystal Engineering of Pharmaceutical Cocrystals in the Discovery and Development of Improved Drugs. Chem. Rev. 2022, 122, 11514–11603. [Google Scholar] [CrossRef]
  8. Zheng, L.L.; Zhu, B.; Wu, Z.R.; Fang, X.X.; Hong, M.H.; Liu, G.X.; Li, W.H.; Ren, G.B.; Tang, Y. Strategy for Efficient Discovery of Cocrystals via a Network-Based Recommendation Model. Cryst. Growth Des. 2020, 20, 6820–6830. [Google Scholar] [CrossRef]
  9. Arabiani, M.R.; Bhunia, S.; Teja, P.K.; Lodagekar, A.; Chavan, R.B.; Shastri, N.R.; Reddy, C.M.; Shelat, P.; Dave, D. Brexpiprazole-catechol cocrystal: Structure elucidation, excipient compatibility and stability. CrystEngComm 2019, 21, 6703–6708. [Google Scholar] [CrossRef]
  10. Hong, M.; Li, S.; Ji, W.; Qi, M.-H.; Ren, G.-b. Cocrystals of Lenvatinib with Sulfamerazine and Salicylic Acid: Crystal Structure, Equilibrium Solubility, Stability Study, and Anti-Hepatoma Activity. Cryst. Growth Des. 2021, 21, 3714–3727. [Google Scholar] [CrossRef]
  11. Surov, A.O.; Manin, A.N.; Churakov, A.V.; Perlovich, G.L. New Solid Forms of the Antiviral Drug Arbidol: Crystal Structures, Thermodynamic Stability, and Solubility. Mol. Pharm. 2015, 12, 4154–4165. [Google Scholar] [CrossRef] [PubMed]
  12. Chen, J.Y.; Wu, H.; Guo, C.Y.; Zhu, B.; Ren, G.B. Enhancing the solubility of natural compound xanthotoxin by modulating stability via cocrystallization engineering. Int. J. Pharm. 2019, 572, 118776. [Google Scholar] [CrossRef] [PubMed]
  13. Chen, Y.; Li, L.; Yao, J.; Ma, Y.Y.; Chen, J.M.; Lu, T.B. Improving the Solubility and Bioavailability of Apixaban via Apixaban-Oxalic Acid Cocrystal. Cryst. Growth Des. 2016, 16, 2923–2930. [Google Scholar] [CrossRef]
  14. Drozd, K.V.; Manin, A.N.; Boycov, D.E.; Perlovich, G.L. Simultaneous Improvement of Dissolution Behavior and Oral Bioavailability of Antifungal Miconazole via Cocrystal and Salt Formation. Pharmaceutics 2022, 14, 1107. [Google Scholar] [CrossRef]
  15. Childs, S.L.; Rodríguez-Hornedo, N.; Reddy, L.S.; Jayasankar, A.; Maheshwari, C.; McCausland, L.; Shipplett, R.; Stahly, B.C. Screening strategies based on solubility and solution composition generate pharmaceutically acceptable cocrystals of carbamazepine. CrystEngComm 2008, 10, 856–864. [Google Scholar] [CrossRef]
  16. Puneeth Jayram, P.S. Pharmaceutical Co-crystals: A Systematic Review. Int. J. Pharm. Investig. 2020, 10, 246–252. [Google Scholar] [CrossRef]
  17. Bysouth, S.R.; Bis, J.A.; Igo, D. Cocrystallization via planetary milling: Enhancing throughput of solid-state screening methods. Int. J. Pharm. 2011, 411, 169–171. [Google Scholar] [CrossRef]
  18. Chun, N.-H.; Lee, M.-J.; Song, G.-H.; Chang, K.-Y.; Kim, C.-S.; Choi, G.J. Combined anti-solvent and cooling method of manufacturing indomethacin-saccharin (IMC-SAC) co-crystal powders. J. Cryst. Growth 2014, 408, 112–118. [Google Scholar] [CrossRef]
  19. Groom, C.R.; Bruno, I.J.; Lightfoot, M.P.; Ward, S.C. The Cambridge Structural Database. Acta Crystallogr. Sect. B Struct. Sci. Cryst. Eng. Mater. 2016, 72, 171–179. [Google Scholar] [CrossRef]
  20. Galek, P.T.A.; Allen, F.H.; Fábián, L.; Feeder, N. Knowledge-based H-bond prediction to aid experimental polymorph screening. CrystEngComm 2009, 11, 2634–2639. [Google Scholar] [CrossRef]
  21. Delori, A.; Galek, P.T.A.; Pidcock, E.; Patni, M.; Jones, W. Knowledge-based hydrogen bond prediction and the synthesis of salts and cocrystals of the anti-malarial drug pyrimethamine with various drug and GRAS molecules. CrystEngComm 2013, 15, 2916–2928. [Google Scholar] [CrossRef]
  22. Wood, P.A.; Feeder, N.; Furlow, M.; Galek, P.T.A.; Groom, C.R.; Pidcock, E. Knowledge-based approaches to co-crystal design. CrystEngComm 2014, 16, 5839–5848. [Google Scholar] [CrossRef]
  23. Kumar, A.; Nanda, A. In-silico methods of cocrystal screening: A review on tools for rational design of pharmaceutical cocrystals. J. Drug Deliv. Sci. Technol. 2021, 63, 102527. [Google Scholar] [CrossRef]
  24. Fábián, L. Cambridge Structural Database Analysis of Molecular Complementarity in Cocrystals. Cryst. Growth Des. 2009, 9, 1436–1443. [Google Scholar] [CrossRef]
  25. Devogelaer, J.-J.; Brugman, S.J.T.; Meekes, H.; Tinnemans, P.; Vlieg, E.; de Gelder, R. Cocrystal design by network-based link prediction. CrystEngComm 2019, 21, 6875–6885. [Google Scholar] [CrossRef] [Green Version]
  26. Devogelaer, J.J.; Charpentier, M.D.; Tijink, A.; Dupray, V.; Coquerel, G.; Johnston, K.; Meekes, H.; Tinnemans, P.; Vlieg, E.; Ter Horst, J.H.; et al. Cocrystals of Praziquantel: Discovery by Network-Based Link Prediction. Cryst. Growth Des. 2021, 21, 3428–3437. [Google Scholar] [CrossRef]
  27. Musumeci, D.; Hunter, C.A.; Prohens, R.; Scuderi, S.; McCabe, J.F. Virtual cocrystal screening. Chem. Sci. 2011, 2, 883–890. [Google Scholar] [CrossRef]
  28. McKenzie, J.; Feeder, N.; Hunter, C.A. H-bond competition experiments in solution and the solid state. CrystEngComm 2016, 18, 394–397. [Google Scholar] [CrossRef] [Green Version]
  29. Klamt, A. Solvent-screening and co-crystal screening for drug development with COSMO-RS. J. Cheminform. 2012, 4, 1–2. [Google Scholar] [CrossRef]
  30. Loschen, C.; Klamt, A. Cocrystal Ternary Phase Diagrams from Density Functional Theory and Solvation Thermodynamics. Cryst. Growth Des. 2018, 18, 5600–5608. [Google Scholar] [CrossRef]
  31. Bhardwaj, R.M.; Johnston, A.; Johnston, B.F.; Florence, A.J. A random forest model for predicting the crystallisability of organic molecules. CrystEngComm 2015, 17, 4272–4275. [Google Scholar] [CrossRef] [Green Version]
  32. Pillong, M.; Marx, C.; Piechon, P.; Wicker, J.G.P.; Cooper, R.I.; Wagner, T. A publicly available crystallisation data set and its application in machine learning. CrystEngComm 2017, 19, 3737–3745. [Google Scholar] [CrossRef]
  33. Ma, Y.; Gao, Z.; Shi, P.; Chen, M.; Wu, S.; Yang, C.; Wang, J.; Cheng, J.; Gong, J. Machine learning-based solubility prediction and methodology evaluation of active pharmaceutical ingredients in industrial crystallization. Front. Chem. Sci. Eng. 2021, 16, 523–535. [Google Scholar] [CrossRef]
  34. Bhardwaj, R.M.; Reutzel-Edens, S.M.; Johnston, B.F.; Florence, A.J. A random forest model for predicting crystal packing of olanzapine solvates. CrystEngComm 2018, 20, 3947–3950. [Google Scholar] [CrossRef] [Green Version]
  35. Yang, J.; Li, N.; Li, S. The interplay among molecular structures, crystal symmetries and lattice energy landscapes revealed using unsupervised machine learning: A closer look at pyrrole azaphenacenes. CrystEngComm 2019, 21, 6173–6185. [Google Scholar] [CrossRef]
  36. Galek, P.T.A.; Pidcock, E.; Wood, P.A.; Bruno, I.J.; Groom, C.R. One in half a million a solid form informatics study of a pharmaceutical. CrystEngComm 2012, 14, 2335–2596. [Google Scholar] [CrossRef]
  37. Wicker, J.G.P.; Crowley, L.M.; Robshaw, O.; Little, E.J.; Stokes, S.P.; Cooper, R.I.; Lawrence, S.E. Will they co-crystallize? CrystEngComm 2017, 19, 5336–5340. [Google Scholar] [CrossRef] [Green Version]
  38. Przybyłek, M.; Jeliński, T.; Słabuszewska, J.; Ziółkowska, D.; Mroczyńska, K.; Cysewski, P. Application of Multivariate Adaptive Regression Splines (MARSplines) Methodology for Screening of Dicarboxylic Acid Cocrystal Using 1D and 2D Molecular Descriptors. Cryst. Growth Des. 2019, 19, 3876–3887. [Google Scholar] [CrossRef]
  39. Chabalenge, B.; Korde, S.; Kelly, A.L.; Neagu, D.; Paradkar, A. Understanding Matrix-Assisted Continuous Co-crystallization Using a Data Mining Approach in Quality by Design (QbD). Cryst. Growth Des. 2020, 20, 4540–4549. [Google Scholar] [CrossRef]
  40. Mswahili, M.E.; Lee, M.-J.; Martin, G.L.; Kim, J.; Kim, P.; Choi, G.J.; Jeong, Y.-S. Cocrystal Prediction Using Machine Learning Models and Descriptors. Appl. Sci. 2021, 11, 1323. [Google Scholar] [CrossRef]
  41. Wang, D.; Yang, Z.; Zhu, B.; Mei, X.; Luo, X. Machine-Learning-Guided Cocrystal Prediction Based on Large Data Base. Cryst. Growth Des. 2020, 20, 6610–6621. [Google Scholar] [CrossRef]
  42. Sun, M.; Zhao, S.; Gilvary, C.; Elemento, O.; Zhou, J.; Wang, F. Graph convolutional networks for computational drug development and discovery. Brief. Bioinform. 2019, 21, 919–935. [Google Scholar] [CrossRef] [PubMed]
  43. Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph convolutional networks: A comprehensive review. Comput. Soc. Netw. 2019, 6, 1–23. [Google Scholar] [CrossRef] [Green Version]
  44. Devogelaer, J.J.; Meekes, H.; Tinnemans, P.; Vlieg, E.; de Gelder, R. Co-crystal Prediction by Artificial Neural Networks. Angew. Chem. Int. Ed. 2020, 59, 21711–21718. [Google Scholar] [CrossRef] [PubMed]
  45. Jiang, Y.; Yang, Z.; Guo, J.; Li, H.; Liu, Y.; Guo, Y.; Li, M.; Pu, X. Coupling complementary strategy to flexible graph neural network for quick discovery of coformer in diverse co-crystal materials. Nat. Commun. 2021, 12, 5950. [Google Scholar] [CrossRef]
  46. Ashton, M.; Barnard, J.; Casset, F.; Charlton, M.; Downs, G.; Gorse, D.; Holliday, J.; Lahana, R.; Willett, P. Identification of Diverse Database Subsets using Property-Based and Fragment-Based Molecular Descriptions. Quant. Struct. Act. Relatsh. 2002, 21, 598–604. [Google Scholar] [CrossRef] [Green Version]
  47. Cho, H.; Lee, E.K.; Choi, I.S. Layer-wise relevance propagation of InteractionNet explains protein-ligand interactions at the atom level. Sci. Rep. 2020, 10, 21155. [Google Scholar] [CrossRef]
  48. Mercado, R.; Rastemo, T.; Lindelöf, E.; Klambauer, G.; Engkvist, O.; Chen, H.; Jannik Bjerrum, E. Graph networks for molecular design. Mach. Learn. Sci. Technol. 2021, 2, 025023. [Google Scholar] [CrossRef]
  49. Kingma, D.P.; Ba, J. A Method For Stochastic Optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
  50. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv 2016, arXiv:1603.04467. [Google Scholar] [CrossRef]
  51. Polishchuk, P. Interpretation of Quantitative Structure-Activity Relationship Models: Past, Present, and Future. J. Chem. Inf. Model 2017, 57, 2618–2639. [Google Scholar] [CrossRef]
  52. Jiménez-Luna, J.; Grisoni, F.; Schneider, G. Drug discovery with explainable artificial intelligence. Nat. Mach. Intell. 2020, 2, 573–584. [Google Scholar] [CrossRef]
  53. Bach, S.; Binder, A.; Montavon, G.; Klauschen, F.; Muller, K.R.; Samek, W. On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation. PLoS ONE 2015, 10, e0130140. [Google Scholar] [CrossRef] [Green Version]
  54. Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742–754. [Google Scholar] [CrossRef] [PubMed]
  55. Laurens van der Maaten, G.H. Visualizing Data using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605. [Google Scholar]
  56. Glennon, R.A.; Dukat, M. Structure-Activity Relationships of Synthetic Cathinones. Neuropharmacol. New Psychoact. Subst. 2017, 32, 19–47. [Google Scholar] [CrossRef]
  57. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. Available online: https://scikit-learn.sourceforge.net (accessed on 10 September 2021).
  58. Corinna Cortes; Vapnik, V. Support Vector Networks. Mach. Learn. 1995, 20, 273–297. [Google Scholar] [CrossRef]
  59. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  60. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  61. González, S.; García, S.; Del Ser, J.; Rokach, L.; Herrera, F. A practical tutorial on bagging and boosting based ensembles for machine learning: Algorithms, software tools, performance study, practical perspectives and opportunities. Inf. Fusion 2020, 64, 205–237. [Google Scholar] [CrossRef]
  62. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef]
  63. Wang, J.; Liu, X.; Shen, S.; Deng, L.; Liu, H. DeepDDS: Deep graph neural network with attention mechanism to predict synergistic drug combinations. Brief. Bioinform. 2021, 23, bbab390. [Google Scholar] [CrossRef] [PubMed]
  64. Springuel, G.; Norberg, B.; Robeyns, K.; Wouters, J.; Leyssens, T. Advances in Pharmaceutical Co-crystal Screening: Effective Co-crystal Screening through Structural Resemblance. Cryst. Growth Des. 2011, 12, 475–484. [Google Scholar] [CrossRef]
  65. Cheng, Y.; Rong, X.; Xia, M.; Zhang, Z.; Wang, J.-R.; Mei, X. Conformational polymorphs of isotretinoin and their impact on physicochemical and biological properties. Int. J. Pharm. 2021, 610, 121222. [Google Scholar] [CrossRef]
Figure 1. The whole workflow for dataset processing.
Figure 1. The whole workflow for dataset processing.
Pharmaceutics 14 02198 g001
Figure 2. Schematic representation of covalent ( A C ) and noncovalent ( A NC ) adjacency matrix.
Figure 2. Schematic representation of covalent ( A C ) and noncovalent ( A NC ) adjacency matrix.
Pharmaceutics 14 02198 g002
Figure 3. Illustration of CocrystalGCN architecture.
Figure 3. Illustration of CocrystalGCN architecture.
Pharmaceutics 14 02198 g003
Figure 4. Chemical space distribution of exp_neg (blue), gen_neg (orange), and csd_pos (green) dataset by t-SNE method using ECFP fingerprint. The red box mainly highlights blue points that do not overlap with orange and green points.
Figure 4. Chemical space distribution of exp_neg (blue), gen_neg (orange), and csd_pos (green) dataset by t-SNE method using ECFP fingerprint. The red box mainly highlights blue points that do not overlap with orange and green points.
Pharmaceutics 14 02198 g004
Figure 5. The 3D interaction modes and heat maps of two cocrystals: (a) Interaction mode diagram of cocrystal with CSD code GUCQUQ, (b) interaction mode diagram of cocrystal with CSD code NUBHAW. The gray, blue, red, green, and white represent carbon, nitrogen, oxygen, chlorine, and hydrogen atoms, respectively. The cyan lines represent hydrogen bonds, (c) Heat map for atomic contribution generated by LRP method, GUCQUQ, (d) Heat map for atomic contribution generated by LRP method, NUBHAW. The red, blue, and white represent positive influence, negative influence, and zero influence, respectively. Color intensity represent different contributions, the contribution increases with the darker color.
Figure 5. The 3D interaction modes and heat maps of two cocrystals: (a) Interaction mode diagram of cocrystal with CSD code GUCQUQ, (b) interaction mode diagram of cocrystal with CSD code NUBHAW. The gray, blue, red, green, and white represent carbon, nitrogen, oxygen, chlorine, and hydrogen atoms, respectively. The cyan lines represent hydrogen bonds, (c) Heat map for atomic contribution generated by LRP method, GUCQUQ, (d) Heat map for atomic contribution generated by LRP method, NUBHAW. The red, blue, and white represent positive influence, negative influence, and zero influence, respectively. Color intensity represent different contributions, the contribution increases with the darker color.
Pharmaceutics 14 02198 g005
Figure 6. Score ranking of top 55 coformers forming cocrystals with bexarotene by the CocrystalGCN. Four experimentally verified coformers capable of forming cocrystals with BEX are marked in yellow, which are pyrazine, ethyl isonicotinate, 2,5-dimethyl pyrazine, and methyl isonicotinate, respectively.
Figure 6. Score ranking of top 55 coformers forming cocrystals with bexarotene by the CocrystalGCN. Four experimentally verified coformers capable of forming cocrystals with BEX are marked in yellow, which are pyrazine, ethyl isonicotinate, 2,5-dimethyl pyrazine, and methyl isonicotinate, respectively.
Pharmaceutics 14 02198 g006
Figure 7. DSC profiles of solid forms of BEX: (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX-methyl isonicotinate, and (d) BEX-ethyl isonicotinate.
Figure 7. DSC profiles of solid forms of BEX: (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX-methyl isonicotinate, and (d) BEX-ethyl isonicotinate.
Pharmaceutics 14 02198 g007
Figure 8. TG patterns of solid forms of BEX: (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX-methyl isonicotinate, and (d) BEX-ethyl isonicotinate.
Figure 8. TG patterns of solid forms of BEX: (a) BEX-pyrazine, (b) BEX-2,5-dimethylpyrazine, (c) BEX-methyl isonicotinate, and (d) BEX-ethyl isonicotinate.
Pharmaceutics 14 02198 g008
Figure 9. Packing modes of BEX-pyrazine (left) and BEX-2,5-dimethylpyrazine (right).
Figure 9. Packing modes of BEX-pyrazine (left) and BEX-2,5-dimethylpyrazine (right).
Pharmaceutics 14 02198 g009
Figure 10. (a) Equilibrium solubility of BEX and cocrystals. (b) Powder dissolution of BEX and cocrystals at pH 6.8. The 0.05% CTAB was added in the buffer solution.
Figure 10. (a) Equilibrium solubility of BEX and cocrystals. (b) Powder dissolution of BEX and cocrystals at pH 6.8. The 0.05% CTAB was added in the buffer solution.
Pharmaceutics 14 02198 g010
Figure 11. The PK profiles of BEX and cocrystals in rats.
Figure 11. The PK profiles of BEX and cocrystals in rats.
Pharmaceutics 14 02198 g011
Table 1. Performance comparison of CocrystalGCN and baselines on validation and test set.
Table 1. Performance comparison of CocrystalGCN and baselines on validation and test set.
ModelsValidation SetTest Set
AUCAccPrecisionRecallAUCAccPrecisionRecall
RF0.8840.8010.7970.8080.7950.7950.7930.801
SVM0.8520.7870.7700.8170.7830.7830.7670.818
XGBoost0.8550.7810.7700.8020.7800.7800.7750.792
DNN0.8390.7660.7570.7820.7590.7600.7390.806
DeepDDS0.8790.8100.7890.8440.8710.8050.7870.838
CocrystalGCN_C 10.8550.8030.7820.8290.8540.7940.7840.812
CocrystalGCN_NC 20.8530.8160.7950.8540.8550.8060.7850.840
CocrystalGCN 30.8660.8180.8020.8450.8660.8110.8020.830
1 Only contains GCAc layer in CocrystalGCN; 2 Only contains GCAnc layer in CocrystalGCN; 3 Containing both GCAc and GCAnc layers in CocrystalGCN. The best performance is in bold.
Table 2. Pharmacokinetic parameters of BEX and cocrystals in rats.
Table 2. Pharmacokinetic parameters of BEX and cocrystals in rats.
ParametersBEXBEX-PyrazineBEX-2,5-
Dimethylpyrazine
T1/2 (h)2.98 ± 1.182.59 ± 0.101.74 ± 0.39
Cmax (μg/L)1682.67 ± 559.272960.33 ± 248.02 *3577.00 ± 387.34 *
AUC0−8h (h·μg/L)7215.17 ± 810.6112561.45 ± 919.13 *12702.38 ± 978.30 *
AUCinf (h·μg/L)8792.75 ± 1076.3614656.33 ± 1085.66 *13513.29 ± 1358.85 *
* p < 0.05 compared with BEX. Data are means ± SD.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Xiao, F.; Cheng, Y.; Wang, J.-R.; Wang, D.; Zhang, Y.; Chen, K.; Mei, X.; Luo, X. Cocrystal Prediction of Bexarotene by Graph Convolution Network and Bioavailability Improvement. Pharmaceutics 2022, 14, 2198. https://doi.org/10.3390/pharmaceutics14102198

AMA Style

Xiao F, Cheng Y, Wang J-R, Wang D, Zhang Y, Chen K, Mei X, Luo X. Cocrystal Prediction of Bexarotene by Graph Convolution Network and Bioavailability Improvement. Pharmaceutics. 2022; 14(10):2198. https://doi.org/10.3390/pharmaceutics14102198

Chicago/Turabian Style

Xiao, Fu, Yinxiang Cheng, Jian-Rong Wang, Dingyan Wang, Yuanyuan Zhang, Kaixian Chen, Xuefeng Mei, and Xiaomin Luo. 2022. "Cocrystal Prediction of Bexarotene by Graph Convolution Network and Bioavailability Improvement" Pharmaceutics 14, no. 10: 2198. https://doi.org/10.3390/pharmaceutics14102198

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop