AI-Aided Design of Novel Targeted Covalent Inhibitors against SARS-CoV-2

Tang, Bowen; He, Fengming; Liu, Dongpeng; He, Fei; Wu, Tong; Fang, Meijuan; Niu, Zhangming; Wu, Zhen; Xu, Dong

doi:10.3390/biom12060746

by Bowen Tang^1,2,3, Fengming He^2, Dongpeng Liu^1, Fei He^1,4, Tong Wu^1,5, Meijuan Fang^2, Zhangming Niu^3, Zhen Wu^2,* and Dong Xu^1,*

^1 Department of Electrical Engineering and Computer Science, Informatics Institute, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA

^2 Fujian Provincial Key Laboratory of Innovative Drug Target Research, School of Pharmaceutical Sciences, Xiamen University, Xiamen 361000, China

^3 MindRank AI Ltd., Hangzhou 310000, China

^4 School of Information Science and Technology, Northeast Normal University, Changchun 130117, China

^5 Department of Epidemiology and Statistics, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, Beijing 100006, China

^* Authors to whom correspondence should be addressed.

Biomolecules 2022, 12(6), 746; https://doi.org/10.3390/biom12060746

Submission received: 14 April 2022 / Revised: 17 May 2022 / Accepted: 20 May 2022 / Published: 25 May 2022

(This article belongs to the Special Issue Machine Learning Approach to Protein Structure, Dynamics, and Function)

Abstract:

The drug repurposing of known approved drugs (e.g., lopinavir/ritonavir) has failed to treat SARS-CoV-2-infected patients. Therefore, it is important to generate new chemical entities against this virus. As a critical enzyme in the lifecycle of the coronavirus, the 3C-like main protease (3CLpro or Mpro) is the most attractive target for antiviral drug design. Based on a recently solved structure (PDB ID: 6LU7), we developed a novel advanced deep Q-learning network with a fragment-based drug design (ADQN–FBDD) for generating potential lead compounds targeting SARS-CoV-2 3CLpro. We obtained a series of derivatives from the lead compounds based on our structure-based optimization policy (SBOP). All of the 47 lead compounds obtained directly with our AI model and related derivatives based on the SBOP are accessible in our molecular library. These compounds can be used as potential candidates by researchers to develop drugs against SARS-CoV-2.

Keywords:

COVID; SARS-CoV-2; 3C-like main protease; drug design; deep Q-learning network

1. Introduction

The emerging coronavirus SARS-CoV-2 has caused an outbreak of coronavirus disease 2019 (COVID-19) worldwide [1]. By the end of May 2022, the Johns Hopkins Coronavirus map tracker had reported more than 527 million SARS-CoV-2 infections and over six million deaths [2]. The number of infections and deaths is still increasing. To deal with the threat of SARS-CoV-2, it is necessary to develop new inhibitors or drugs. Unfortunately, since the outbreak of severe acute respiratory syndrome (SARS) 18 years ago, there has been no approved treatment for SARS-associated coronavirus (SARS-CoV) [3], which is similar to SARS-CoV-2. SARS-CoV-2-infected patients have also failed to respond to repurposed drugs, such as lopinavir and ritonavir [4]. Structure-based antiviral drug design with a novel artificial intelligence algorithm may be an effective approach to developing SARS-CoV-2-targeted inhibitors or drugs. Owing to the efforts of many researchers, several important pieces of information about the viral genome and protein structures are currently available. We know that non-structural protein 5 (Nsp5), a cysteine protease, is the main protease (M^pro) of SARS-CoV-2, also known as “3C-like protease” (3CL^pro). Moreover, we know that the 3D structure of SARS-CoV-2 3CL^pro is very similar to that of SARS-CoV 3CL^pro, with a sequence identity of >96% and a 3D structure superposition RMSD_Cα of 0.44 Å, as shown in Supplementary Figures S1 and S2.

Overall, 3CL^pro has been reported as an attractive target for developing anti-coronavirus drugs for the following reasons: (1) this protease is highly conserved in both sequences and 3D structures [5], (2) 3CL^pro is a crucial enzyme for the replication of related viruses (including SARS and SARS-CoV-2), and (3) it only exists in viruses and not in humans. Developing specific antiviral drugs that target the 3CL^pro of viruses has shown significant success; for example, both lopinavir and ritonavir (approved HIV drugs) can completely occupy the substrate-binding site of 3CL^pro, thereby disrupting the replication of human immunodeficiency virus (HIV). However, due to the large difference between HIV and SARS-CoV-2 3CL^pro, lopinavir and ritonavir are ineffective for inhibiting SARS-CoV-2 [4]. On the other hand, the substrate-binding site of 3CL^pro is almost the same between SARS-CoV-2 and SARS, as shown in Supplementary Figure S3. Therefore, the developed potential inhibitors and drug design experience for SARS-3CL^pro may also apply to SARS-CoV-2. For example, the recently solved structure of SARS-CoV-2 3CL^pro (PDB ID: 6LU7) demonstrated that the developed inhibitor N3 [6], which is a covalent inhibitor derived from non-covalent inhibitors against SARS, can also bind to SARS-CoV-2 3CL^pro with a similar binding conformation (Supplementary Figure S4).

The available information from previous research can aid the design of novel targeted covalent inhibitors (TCIs) [7] against SARS-CoV-2. A successful TCI against 3CL^pro must first be able to fit in the binding site of 3CL^pro with an appropriate pose that keeps its reactive groups close enough to Cys145, which then undergoes a chemical step (nucleophilic attack by Cys145), leading to the formation of a stable covalent bond, as presented in the scheme below:

\mathrm{3CL^{pro}\text{-}Cys145} + \mathrm{TCI} \underset{k_{2}}{\overset{k_{1}}{\rightleftharpoons}} \mathrm{3CL^{pro}\text{-}Cys145} \cdot \mathrm{TCI} \underset{k_{4}}{\overset{k_{3}}{\rightleftharpoons}} \mathrm{3CL^{pro}\text{-}Cys145\text{-}TCI}

In theory, TCIs usually have a longer target residence time compared with that of non-covalent inhibitors, given the following: (1) for inhibition, k₁ must be larger than k₂, and non-covalent binding is determined by the equilibrium constant k₁/k₂; (2) TCIs undergo a chemical reaction with the target, where usually k₃ >> k₄; and (3) for TCIs, the binding process is controlled by k₁ k₃/(k₂ k₄), which is larger than k₁/k₂ for non-covalent inhibitors. In some extreme cases, k₄ = 0; thus, these inhibitors covalently bind until the target is no longer detectable [7,8].
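As a worked illustration of point (3), the overall constant can be obtained by multiplying the constants of the two steps. This is a sketch that assumes both steps of the scheme are at equilibrium; the symbols E (3CL^pro), I (the TCI), K_{nc}, K_{cov}, and K_{overall} are introduced here for illustration and do not appear in the original text.

```latex
K_{nc} = \frac{[E \cdot I]}{[E][I]} = \frac{k_{1}}{k_{2}}, \qquad
K_{cov} = \frac{[E\text{-}I]}{[E \cdot I]} = \frac{k_{3}}{k_{4}}

K_{overall} = \frac{[E\text{-}I]}{[E][I]} = K_{nc} \cdot K_{cov} = \frac{k_{1} k_{3}}{k_{2} k_{4}}
```

Whenever k₃ > k₄, K_{overall} exceeds k₁/k₂, and as k₄ approaches 0 the ratio grows without bound, which corresponds to the effectively irreversible case mentioned above.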

Considering that the inhibitors of SARS 3CL^pro may exert bioactivity against SARS-CoV-2, we have created a molecular library that contains all of the reported SARS-3CL^pro inhibitors (284 molecules) [9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]. We will also add new validated molecular structures to this library as our research progresses. To date, there are no clinically approved vaccines or drugs specifically targeting SARS-CoV-2. Therefore, to discover novel candidate drugs targeting SARS-CoV-2, we combine artificial intelligence (AI) with fragment-based drug design (FBDD) to accelerate the generation of potential lead compounds and design TCIs.

AI, especially deep learning, has been used to predict molecular properties [30,31,32,33] and design novel molecules [34]. Table 1 lists some deep-learning-based tools and their features. Among these tools, the one based on recurrent neural networks relies on a large number of training molecules with known structures to learn the context information that serves as the basis for molecule generation. Over time, various advanced generative tools were proposed, including GENTRL, ORGAN, and ORGANIC. These tools were expected to increase success rates by introducing posterior distributions and adversarial attacks as regularizers. However, using generative models still does not guarantee that their outputs are valid molecules for drug discovery; to overcome this problem, Molecule Deep Q-Networks (MolDQNs) with reinforcement learning (RL) were proposed. MolDQN works iteratively at the atomic level to generate molecules that satisfy all predefined constraints. Hence, all of its outputs are chemically reasonable molecules.

Considering the advantages of MolDQN, we combined the framework with more fragment-based rules to optimize the generated molecules in this study. In contrast to earlier deep-learning molecular design methods that add a single atom at a time [30,34], our approach explores new molecules by adding meaningful molecular fragments one by one, which is computationally more efficient and chemically more reasonable. For our AI model, we initially prepared the molecular fragment library by using the collected SARS-CoV 3CL^pro inhibitors (284 molecules) targeting SARS-CoV-2 3CL^pro, as shown in Figure 1. Then we split this set of molecules into fragments with a molecular weight of no more than 200 Da. Both the collected inhibitors and the fragments can be found at [40]. Then we applied an advanced deep Q-learning network with fragment-based drug design (ADQN–FBDD) to generate potential lead compounds. Researchers who have sufficient experience and internal lead compounds or biased fragments can manually add them to the corresponding files of the ADQN–FBDD. Using fragments taken directly from existing bioactive molecules, our ADQN–FBDD agent could easily access the potential chemical space for the 3CL^pro of SARS-CoV-2.

After the ADQN–FBDD automatically generated novel compounds targeting 3CL^pro, we obtained a covalent lead compound library with 4922 unique valid structures. A total of 47 of these compounds were selected with high scores with our AI model’s reward function. These molecules were further evaluated through docking studies. Among the 47 lead compounds, compound #46, with a high covalent docking score, attracted our attention, showing a small difference between the non-covalent and covalent docking poses. After carefully examining the interaction mode of lead #46 with 3CL^pro, we believe that there is still much space for optimization. Subsequently, we designed a series of derivatives from compound #46 based on our chemical biology knowledge and the structure-based optimization policy. All of the generated molecular structures are published in our code library, https://github.com/tbwxmu/2019-nCov [40]. We would encourage researchers interested in finding a potential treatment for COVID-19 to synthesize and evaluate some of these molecules.

2. Results

Integrating the double-dueling deep-Q-learning network with fixed q-targets and prioritized experience replay allowed our agent to be stable and efficient when learning from the chemical environment. With a combination of the state-of-the-art AI algorithm and FBDD, as shown in Figure 2, the approach was flexible and efficient in accessing the focused chemical space for SARS-CoV-2 3CL^pro. Based on the configurations targeting SARS-CoV-2 3CL^pro, the ADQN–FBDD generated a lead library containing 4922 unique molecular structures (Supplementary Materials dataset in [40]). To narrow our focus to a smaller set of molecules for analysis, we defined the filter rules (QED > 0.1 and DRL-reward ≥0.6); detailed information on the rules can be found in the Methods section. A total of 47 unique molecules (Supplementary Table S1) were selected for non-covalent docking and covalent docking evaluation. These 47 virtual leads exhibited an appropriate 3D complexity with typical characteristics of peptidomimetics and protein–protein interaction (PPI) inhibitors. They were mainly ranked according to the covalent docking scores, considering that covalent docking scores also include the scores of non-covalent docking [41].
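The filter rules above (QED > 0.1 and DRL-reward ≥ 0.6) can be sketched as follows. This is a minimal illustration, not the authors' code: the `Candidate` record, the `select_leads` function, and the use of exact string matching to remove duplicates are assumptions made here, and the QED and reward values are expected to be computed elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one generated molecule."""
    smiles: str        # structure identifier (any unique string works here)
    qed: float         # quantitative estimate of drug-likeness, in [0, 1]
    drl_reward: float  # value of the DRL-reward function for this molecule


def select_leads(candidates, qed_min=0.1, reward_min=0.6):
    """Keep unique candidates with QED > qed_min and DRL-reward >= reward_min."""
    seen = set()
    kept = []
    for cand in candidates:
        if cand.smiles in seen:
            continue  # assumption: duplicates are identified by identical strings
        seen.add(cand.smiles)
        if cand.qed > qed_min and cand.drl_reward >= reward_min:
            kept.append(cand)
    return kept
```

Note that the QED threshold is strict while the reward threshold is inclusive, matching the inequalities stated in the text.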

To analyze the common features of these generated molecules, we used the Canvas Similarity and Clustering tool. As shown in Supplementary Table S3, the 47 leads were clustered into five clusters according to the clustering metric (shape). Among the five clusters, all of the molecules in Cluster #1 contained R-shaped pyrrolidin-2-one structures and may be considered optimal segments of the S2 subsite or S1 subsite. In addition, they contained some substituents with hydrophobic cyclic groups, such as aromatic rings, alicyclic rings, and aliphatic hydrocarbons. The molecules in Cluster #2 included 2-pyrrolidone in their structures, including R-shaped, S-shaped, and S’-shaped. Furthermore, their side chains linked to covalent targets contained large substituents (e.g., aromatic rings) and small substituents (e.g., aliphatic hydrocarbons or aliphatic amines). Cluster #3 comprised several straight chain-like molecules with substituted or non-substituted pyrrolidin-2-one, which may also be regarded as optimal segments of the S2 subsite or S1 subsite. In addition, their α-oxo aldehyde group can form a covalent bond with Cys145. Cluster #4 mainly consisted of 2-pyrrolidone, heterocyclic rings, and heteroaromatic rings, whose polarities were considerably higher than those of the molecules from other clusters, which may reduce the affinities of the protein active pockets S1 and S2. Their covalent targets included α, β-unsaturated ketones. Each molecule in Cluster #5 had four extension directions with different covalent targets, matching four sub-pockets of active protein-binding sites. This could contribute to a high binding affinity; hence, molecules in Cluster #5 were the best candidates for further analysis. We also considered the RMSD difference between the covalent and non-covalent binding poses based on all heavy atoms. 
Finally, we selected lead molecule #46 (Figure 3), which had a good covalent docking score and a small RMSD value (docking affinity: −8.722 kcal/mol; KabschRmsd: 1.71 Å; see Supplementary Table S1). This lead was then further optimized with the FBDD approach to obtain a series of derivatives.
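For readers unfamiliar with the KabschRmsd metric, the following sketch computes an RMSD after optimal rigid-body superposition with the Kabsch algorithm. It assumes that the two poses are supplied as matched arrays of heavy-atom coordinates in the same atom order; the function name `kabsch_rmsd` is introduced here, and the text does not specify how the authors' tool handles atom matching or superposition.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove translation by centering both sets on their centroids.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Correct for a possible reflection so that R is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate P onto Q and measure the remaining deviation.
    P_rot = P0 @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q0) ** 2, axis=1))))
```

A small value indicates that the covalent and non-covalent poses occupy the binding site in nearly the same way, which is the criterion used above.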

Molecule #46 was ranked first based on the covalent docking score, and its interaction model with the binding site was carefully examined, as shown in Figure 3. Although compound #46 had the best covalent docking score, it had an aldehyde group. Interestingly, upon a detailed examination of #46, it was found that its interactions were similar to those of the inhibitor 6XHO (PF-00835231) designed by Pfizer, thus demonstrating the potential of our proposed method. Moreover, 6XHO was reported to have modest levels of irreversible inhibition, which allowed co-crystallization in complex with SARS-CoV-2 3CL [42]. Based on PF-00835231, Pfizer further developed Lufotrelvir (PF-07304814). The phosphate group of PF-07304814 contributed to improved solubility and was cleaved in vitro to release the active antiviral PF-00835231. Similar to remdesivir, PF-07304814 was administered via intravenous infusion. Moreover, Pfizer developed the orally active 3C-like protease inhibitor Nirmatrelvir (PF-07321332), crystallized as 7VH8. We superimposed lead #46 with 6XHO and 7VH8 and compared their 3D pose and interactions, as shown in Figure 4. Most of their binding sites were aligned with their SARS-CoV-2 3CL^pro ligands. Considering that there is still much space for compound #46 to fill in the S1′ subsite and that α-ketoamides may be suitable to fit into the oxyanion hole (Figure 5A) of 3CL^pro [3], we replaced the aldehyde with formamide and also replaced the 1,4 Michael acceptors with α-ketoamides. Therefore, we optimized compound #46 to obtain compound 46–14–1 (Figure 5A,B).

The non-bonding interactions between compound 46–14–1 and SARS-CoV-2 3CL^pro are mainly hydrogen bonds (H-bonds, five in total). The carbonyl group of the covalent scaffold α-ketoamide forms H-bonds with Leu141 and Gly143 as a hydrogen acceptor or a hydrogen donor. The hydrogen on the nitrogen atom of the triazole ring forms an H-bond with Glu166, and Glu166 forms an H-bond with the carbonyl group on the main chain. The oxygen of the β-lactam ring forms an H-bond with His41. To enhance the polarity of the compounds, sulfonic groups were introduced to replace the ketone carbonyl groups on the main chain, and sulfonamides 46–14–2 were obtained. The covalent docking model of compound 46–14–2 with SARS-CoV-2 3CL^pro is shown in Figure 5C,D.

To make compound 46–14–2 fit the active pocket with a higher affinity, we added a carbon atom to the sulfonic acid group of the original molecule, extending the carbon chain to increase the molecule’s flexibility, and obtained another optimized compound, 46–14–3. Its mode of covalent docking with SARS-CoV-2 3CL^pro is shown in Figure 6. Owing to the added carbon atom and the enhanced molecular flexibility, the β-lactam ring can be inserted deeper into the S2 pocket, and other fragments of the compound can better adapt to the S1, S1′, and S3 subsites. The α-carbonyl carbon on the α-ketoamide of compound 46–14–3 forms a covalent bond with the key residue Cys145 on the protease; however, the primary non-bonding interactions are still H-bonds (indicated by yellow dashes). The triazole ring mainly forms H-bonds with the Phe140 and Glu166 residues in the S1 pocket. The α-ketoamide covalent binding fragment mainly forms H-bonds with the key amino acid residues Cys145, Gly143, and Ser144 in the S1′ subsite, which form the oxyanion hole marked by the red circle in Figure 5. The β-lactam side chain mainly forms H-bonds with residues Tyr54 and Asp187 in the S2 pocket. The chromone scaffold mainly forms H-bonds with the key residues Thr190 and Gln192 in the S3 pocket. In addition, the oxygen on the sulfonyl group of the main chain forms an H-bond with Glu166 in the S1 pocket.

The optimized compounds (46–14–1, 46–14–2, and 46–14–3) are shown in Figure 7. They may be further evaluated by molecular dynamics simulation to determine the binding free energy and by quantum chemical calculation to determine the reaction energy barrier. Furthermore, 46–14–1, 46–14–2, and 46–14–3 were chosen as candidates for chemical synthesis and anti-SARS-CoV-2 activity testing, which is ongoing.

3. Discussion

Computational approaches are particularly important for emerging diseases, given the need to provide timely solutions. In this study, our robust and efficient computational method and pipeline for designing compounds can provide useful drug candidates for treating SARS-CoV-2 infections. More information about our AI model-generated leads and derivatives can be found at [40]. These candidates or their variants have a high probability of producing valid leads of anti-COVID-19 drugs. Nevertheless, the computational design requires experimental validation. Although we are currently conducting experimental validations, we hope that other researchers could use these candidates to accelerate the development of anti-COVID-19 drugs, given the urgency of treating the disease.

FBDD has become a key technology in the pharmaceutical industry for early stage drug discovery and development in the last two decades. Molecular fragments are always small in size compared with intact molecules. Several fragments are especially favorable to some protein binding pockets. For example, the α-ketoamides can form a covalent bond with CYS145 of 3CL^pro in SARS-CoV-2 and fit into the oxyanion hole, as Figure 6B presented. These fragments with high-quality interactions with protein targets can more likely grow up or integrate into lead compounds. During this growing or integrating process, ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties could be easier to control than working directly on the intact molecule. Based on the above advantages of FBDD, we believe more and more machine-learning- or deep-learning-based molecular generative models will use FBDD to design molecules in the future.

Compared with other deep RL methods, the ADQN–FBDD has several notable features: (1) It directly modifies and generates molecular structures without format conversion problems; other tools, such as Insilico Medicine’s GENTRL (https://insilico.com), may generate invalid SMILES. (2) Most generative models require pretraining on a specific dataset and produce molecules with high similarities to a given training set. For example, the best molecule from Insilico Medicine for the target DDR1 is highly similar to the kinase inhibitor Iclusig (ponatinib) [43]. The ADQN–FBDD does not require any pretraining and is capable of generating new molecules. (3) The process of generating molecules is highly efficient, as the ADQN–FBDD is a molecular-fragment-based model with knowledge of chemical reactions, whereas the other models are atom-based, with no rules for chemical reactions [30,38,44]. (4) The ADQN–FBDD is highly flexible and user-friendly for medicinal chemists, who can easily introduce their drug discovery experience into the reward function to guide novel molecule generation. Our ADQN–FBDD and related pipeline may be used not only for designing anti-COVID-19 drugs but also for other structure-based drug discoveries, especially for emerging infectious diseases that require timely treatment.

4. Methods

4.1. Markov Decision Process (MDP) for Molecule Generation

Intuitively, the problem of chemical structure graph generation can be formulated as training a reinforcement learning agent that performs discrete actions of chemical-reaction-based fragment addition or removal in a chemistry-aware MDP. Formally, the MDP is defined as M = {S, Ac, P, R, γ}, where each term is defined as follows:

S = \{s_{t}\} denotes the state space containing all possible intermediate and final generated molecular graphs. Each s_{t} is a tuple (s, t), where s represents a valid molecular structure and t is the time step. For the initial state, s_{0}, the structure may be a specific core, as demonstrated in Figure 2C, or may be randomly chosen from the prepared fragment library at time t = 0. We limited the maximum number of time steps, T, in the fragment-based MDP, which defines the set of terminal states as \{s_{t} | t = T\}, i.e., all states in which the number of steps has reached the maximum allowed value T.

Ac = {A_t} denotes a set of actions that describe the modification made on the current molecular structure at each time step, t. Each action can be classified into three categories: fragment addition, fragment deletion, and no modification.

P = p(s_{t+1} | s_{t}, …, s_{0}) = p(s_{t+1} | s_{t}, a_{t}) expresses the basic (Markov) assumption of the MDP: the state transition probability specifies the next possible state given the current state and the action at time step t. Here, we defined the state transition to be deterministic. For example, for the transition from s_{0} to s_{1} (Figure 2C), adding a 1H-1,2,3-triazol-4-yl fragment to s_{0} yields, with a probability of 1, the next state s_{1}, which is the new structure containing the added 1H-1,2,3-triazol-4-yl.

R is the reward function that specifies the reward after reaching state s_{t}, and γ ∈ (0, 1] is the discount factor; typically, γ = 0.9 in our study. In our framework, the state always had a valid and complete chemical structure at each step, as shown in Figure 2C. A reward was given not just at the terminal states but also at each step. Both intermediate rewards and final rewards were used to guide the behavior of the RL agent. Therefore, there was no delayed or sparse reward issue as experienced with many other reinforcement learning frameworks [36,45]. Additionally, to ensure that the last state was rewarded the most, we used γ^{T−t} to discount the value of the rewards at state s_{t}. Our reward function can directly integrate the experience of medicinal chemists. For example, given a core of interest, medicinal chemists may add biased fragments of interest. Their input can be used to design a reward function that gives high reward signal values to biased fragments so that the ADQN–FBDD may have a better chance of generating the desired structures.
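The effect of the γ^{T−t} weighting can be illustrated with a short sketch. It assumes that an episode is given as a list of per-step rewards indexed t = 0, …, T; the function name `discounted_rewards` is introduced here for illustration only.

```python
def discounted_rewards(rewards, gamma=0.9):
    """Weight the reward of step t by gamma**(T - t), with T the last step index."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    T = len(rewards) - 1
    return [gamma ** (T - t) * r for t, r in enumerate(rewards)]
```

With equal raw rewards, the weights increase toward the end of the episode and the terminal step keeps its full reward (weight γ⁰ = 1), which is the stated intent of rewarding the last state the most.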

4.2. Chemical Environment Design

In our RL framework, the chemical environment receives an action, a_{t}, from the agent and emits a scalar reward, r_{t}, and a state, s_{t}^{'}, to the agent, as shown in Figure 2A. Notably, our definition of the environment state differs from the general approach, in which the environment state consists only of the environment’s private representation and is invisible to the agent. We defined the state of the chemical environment, s_{t}^{'}, as the intermediate molecular structure generated at time step t, which is fully observable by the RL agent. In other words, the environment’s state, s_{t}^{'}, is equivalent to the agent’s next state, s_{t+1}.

For the task of molecule generation, the environment can incorporate the rules of chemistry. In our study, chemistry rules were not only about chemical valency but also about adding and removing the fragments of known inhibitors derived from chemical reactions. The detailed information of 45 defined chemical reaction rules is presented in Supplementary Table S2.
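The interaction loop described in this section can be sketched with a toy environment. This is not the authors' implementation: the class `ToyFragmentEnv` is hypothetical, structures are represented as tuples of fragment names rather than molecular graphs, and the 45 reaction rules are replaced by a simple check against the list of valid actions.

```python
class ToyFragmentEnv:
    """Toy deterministic environment with add, remove, and keep actions."""

    def __init__(self, core, fragments, reward_fn, max_steps):
        self.core = core
        self.fragments = tuple(fragments)
        self.reward_fn = reward_fn  # maps a structure to a scalar reward
        self.max_steps = max_steps  # T, the maximum number of time steps
        self.reset()

    def reset(self):
        self.structure = (self.core,)  # the core followed by attached fragments
        self.t = 0
        return self.structure, self.t

    def valid_actions(self):
        actions = [("keep", None)]
        actions += [("add", f) for f in self.fragments]
        # Only attached fragments can be removed; the core always stays.
        actions += [("remove", f) for f in sorted(set(self.structure[1:]))]
        return actions

    def step(self, action):
        if action not in self.valid_actions():
            raise ValueError(f"invalid action: {action!r}")
        kind, frag = action
        if kind == "add":
            self.structure = self.structure + (frag,)
        elif kind == "remove":
            # Remove the most recently attached copy of the fragment.
            idx = len(self.structure) - 1 - self.structure[::-1].index(frag)
            self.structure = self.structure[:idx] + self.structure[idx + 1:]
        self.t += 1
        reward = self.reward_fn(self.structure)  # a reward is emitted at every step
        done = self.t >= self.max_steps
        return (self.structure, self.t), reward, done
```

Because each action maps the current structure to exactly one next structure, the transition is deterministic, and the returned state is fully visible to the agent, as described above.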

4.3. Agent Design

As shown in Figure 2A, the basic model of our ADQN–FBDD was an advanced Q-network. The goal of molecule generation was equivalent to fitting a Q-function Q(s_t, a_t) such that the agent chooses the action a_t at state s_t that maximizes the expected γ-discounted cumulative reward under policy π. Mathematically, given the agent’s policy π, the value of the state-action pair Q^π(s_t, a_t) and the value of the state V^π(s_t) can be defined as follows:

Q^{π}(s_{t}, a_{t}) = E_{a_{t} \sim π(s_{t})} [\sum_{n = t}^{T} γ^{T - n} \cdot R(s_{n}, a_{n})] = E_{a_{t} \sim π(s_{t})} [R(s_{t}, a_{t}) + γ \cdot E_{a_{t + 1} \sim π(s_{t + 1})} (Q^{π}(s_{t + 1}, a_{t + 1}))]

(1)

V^{π} (s_{t}) = E_{a_{t} \sim π (s_{t})} [Q^{π} (s_{t}, a_{t})]

(2)

where E_{a_{t} \sim π(s_{t})} is the expectation under policy π at state s_t with action a_t, and R(s_{n}, a_{n}) denotes the reward at step n. The value function Q^π(s_t, a_t) measures the value of taking action a_t at state s_t, whereas V^π(s_t) is the value of being at state s_t. V^π(s_t) can be seen as one part of Q^π(s_t, a_t); the remaining part is the so-called advantage function A^π(s_t, a_t) [43], defined as follows:

A^{π} (s_{t}, a_{t}) = Q^{π} (s_{t}, a_{t}) - V^{π} (s_{t})

(3)

Intuitively, the advantage value shows how advantageous selecting an action is relative to the other actions at the same state. Substituting Equation (3) into Equation (2) yields Equation (4):

V^{π} (s_{t}) = E_{a_{t} \sim π (s_{t})} [V^{π} (s_{t}) + A^{π} (s_{t}, a_{t})] = V^{π} (s_{t}) + E_{a_{t} \sim π (s_{t})} [A^{π} (s_{t}, a_{t})]

(4)

Obviously, E_{a_{t} \sim π(s_{t})} [A^{π}(s_{t}, a_{t})] = 0. To avoid the issue of identifiability, we subtracted the mean advantage value from the prediction, and the Q-function of the dueling DQN can be defined as follows:

Q^{π} (s_{t}, a_{t}; Θ, α, β) = V^{π} (s_{t}; Θ, β) + (A^{π} (s_{t}, a_{t}; Θ, α) - \frac{1}{|Ac|} \sum_{a_{t}^{'}} A^{π} (s_{t}, a_{t}^{'}; Θ, α))

(5)
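The aggregation in Equation (5) can be sketched numerically as follows. In the actual model the state value and the advantages are outputs of two network streams; here they are passed in as plain numbers, and the function name `dueling_q_values` is introduced for illustration only.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as in Equation (5): Q = V + (A - mean(A))."""
    if not advantages:
        raise ValueError("advantages must contain one value per action")
    mean_adv = sum(advantages) / len(advantages)  # (1/|Ac|) * sum over actions
    return [state_value + (a - mean_adv) for a in advantages]
```

Subtracting the mean advantage makes the split between V and A unique: adding a constant to every advantage leaves the resulting Q-values unchanged, and the mean of the Q-values over actions equals V.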

Note that Θ, α, and β are the parameters of the dueling Q-network, as shown in Figure 8. Moreover, |Ac| is the size of the action space, and a_{t}^{'} ∈ Ac. To make the learning of our RL agent more robust and stable and to handle the overestimation of Q-values, the double Q-network [46] and fixed Q-targets [47] were also incorporated:

TD = Q^{π}(s_{t}, a_{t}; Θ, α, β) - [R(s_{t}, a_{t}) + γ \cdot Q_{tar}^{π}(s_{t + 1}, \arg\max_{a_{t + 1}} Q^{π}(s_{t + 1}, a_{t + 1}; Θ, α, β); Θ^{'}, α^{'}, β^{'})]

(6)

where TD is the temporal difference, and Q_{tar}^{π} is another dueling DQN that serves as the target network; its parameters (Θ^{'}, α^{'}, β^{'}) were held fixed and copied from the dueling DQN Q^{π} every m steps (m = 20). To update the parameters (Θ, α, β) of the dueling DQN shown in Figure 8, we trained our RL agent by minimizing the loss function:

l(Θ, α, β) = E[f_{l}(TD)]

(7)

where E is the expectation. Because the L2 loss tends to be dominated by outliers, we used the Huber loss as the loss function f_{l}:

f_{l}(x) = \begin{cases} |x| - 0.5 & \text{if } |x| \geq 1 \\ 0.5 \cdot x^{2} & \text{if } |x| < 1 \end{cases}

(8)
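Equations (6) and (8) can be sketched for a single transition as follows. The Q-values are supplied as plain lists indexed by action rather than being produced by networks, the function names are introduced here for illustration, and the handling of terminal states is omitted for brevity.

```python
def huber(x):
    """Huber loss of Equation (8): quadratic near zero, linear for |x| >= 1."""
    ax = abs(x)
    return ax - 0.5 if ax >= 1.0 else 0.5 * x * x


def double_dqn_td(q_online_sa, reward, q_online_next, q_target_next, gamma=0.9):
    """Temporal difference of Equation (6) for one transition.

    q_online_sa:   online-network value Q(s_t, a_t)
    q_online_next: online-network values Q(s_{t+1}, a) for every action a
    q_target_next: target-network values Q_tar(s_{t+1}, a) for every action a
    """
    # The online network selects the action; the target network evaluates it.
    best = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    target = reward + gamma * q_target_next[best]
    return q_online_sa - target
```

Selecting the action with one network and evaluating it with the other is what reduces the overestimation of Q-values; the loss for the transition is then `huber(double_dqn_td(...))`.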

4.4. Prioritized Experience Replay

Prioritized experience replay [48] is a technique that enables the RL agent to remember and reuse past experiences and to replay important transitions more frequently. It is particularly useful for replaying less frequent experiences. Here, we used the “Prioritized Replay Buffer” code from Open AI Gym (version 0.15.4) [49]. Finally, our RL agent was trained with the double-dueling deep Q-learning network with fixed Q-targets and prioritized experience replay.
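To make the idea concrete, the following sketch implements proportional prioritization in its simplest form. It is not the “Prioritized Replay Buffer” code used in this study: the class `SimplePrioritizedBuffer` is hypothetical, sampling is linear in the buffer size instead of using a sum tree, importance weights are normalized within the sampled batch, and the values of alpha, beta, and eps are illustrative assumptions.

```python
import random


class SimplePrioritizedBuffer:
    """Replay buffer that samples transition i with probability p_i**alpha / sum."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6, seed=None):
        self.capacity = capacity
        self.alpha = alpha  # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps      # keeps priorities positive when the TD error is zero
        self.items = []
        self.priorities = []
        self.pos = 0
        self.rng = random.Random(seed)

    def add(self, transition):
        # New transitions receive the current maximum priority.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            self.items[self.pos] = transition  # overwrite the oldest entry
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def probabilities(self):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        return [s / total for s in scaled]

    def sample(self, batch_size, beta=0.4):
        probs = self.probabilities()
        n = len(self.items)
        idx = self.rng.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        weights = [(n * probs[i]) ** (-beta) for i in idx]
        max_w = max(weights)
        return idx, [self.items[i] for i in idx], [w / max_w for w in weights]

    def update(self, indices, td_errors):
        # After a learning step, priorities follow the magnitude of the TD error.
        for i, td in zip(indices, td_errors):
            self.priorities[i] = abs(td) + self.eps
```

Transitions with large temporal-difference errors are replayed more often, while the importance weights scale down their contribution to the loss so that the update remains approximately unbiased.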

4.5. Fragment Library Design

The fragment-based approach to drug discovery (FBDD) has been established as an efficient tool in searching for new drugs [50]. The idea of FBDD is that proper optimization of each unique interaction in the binding site and subsequent incorporation into a single molecular entity should produce a compound with a binding affinity that is the sum of the individual interactions. However, widely used fragment libraries, such as the ZINC fragment database, consider only the diversity of fragments. Therefore, there is a very low probability of achieving the desired bioactivity for a given protein.

To combine FBDD with our RL framework, we first collected and built a SARS-CoV-2 3CL^pro inhibitor dataset containing 284 reported molecules. We adopted the improved BRICS algorithm [46] to split these molecules and obtain the fragment library targeting SARS-CoV-2 3CL^pro, as demonstrated in the flowchart in Figure 1 (yellow box). An elaborate filtering cascade was applied together with manual inspection, and the rules can be changed based on the needs of different studies. Our fragment library contained 316 fragments with molecular weights of <200 Da and minimum and maximum numbers of non-hydrogen atoms of 1 and 25, respectively. Because the fragments come directly from existing inhibitors based on chemical retrosynthetic rules, they are always true substructures and may have high bioactivity in targeting 3CL^pro. The quality of the designed fragment library can directly affect the properties of the chemical environment of the ADQN–FBDD.
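The size constraints on the library can be sketched as a simple filter. This is an illustration only: it assumes that the molecular weight and the number of non-hydrogen atoms have already been computed for each fragment with a cheminformatics toolkit, the function names are hypothetical, and the manual inspection step is not represented.

```python
def keep_fragment(mol_weight, heavy_atoms, max_weight=200.0, min_heavy=1, max_heavy=25):
    """Return True if a fragment satisfies the size limits stated in the text."""
    return mol_weight < max_weight and min_heavy <= heavy_atoms <= max_heavy


def build_library(fragments):
    """Build a duplicate-free library from (identifier, mol_weight, heavy_atoms) tuples."""
    library = {}
    for identifier, mol_weight, heavy_atoms in fragments:
        if keep_fragment(mol_weight, heavy_atoms):
            # The first occurrence of an identifier is kept; repeats are ignored.
            library.setdefault(identifier, (mol_weight, heavy_atoms))
    return library
```

The strict inequality follows the “<200 Da” wording of this section; it would need to be relaxed if fragments of exactly 200 Da should be kept.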

4.6. Core Selection

Previous studies have identified various scaffolds or core structures with advantageous characteristics in terms of the activity for a particular target [51,52,53]. Core structure selection is the starting point in scaffold-based drug discovery. However, choosing or designing a proper initial scaffold is not a simple task, and medicinal chemists may require considerable experience. Nevertheless, there are several reported core structures targeting SARS M^pro [54]. We chose 4-aminopent-2-enal and 3-amino-2-oxobutanal as the starting cores (Figure 9) because both cores have been validated to generate covalent bonds with the Cys145 of SARS or SARS-CoV-2 3CL^pro.

4.7. Reward Design

Most reported RL methods use the complete structural information of a positive drug or inhibitor as the template [38,44]. They design a reward function for the RL agent to learn to regenerate the template structure or to generate structures highly similar to the template. This approach may be useful in testing the performance of RL methods but may not be suitable in real-world drug design because the complete structural information of the novel molecule is unknown. A more practical approach is to learn the structural features of existing drugs or inhibitors in the focused chemical space for a specific protein target. Instead of focusing only on the diversity of molecular structures, we explored the possibility of generating novel molecules based on existing knowledge. Here, we designed a deep reinforcement learning reward (DRL-reward) function consisting of a final property score, a specific-fragment (CSF) score, and a pharmacophore score as follows:

R(s) = w_{pro} \cdot f_{pro}(s) + w_{con} \cdot f_{csf}(s) + w_{pha} \cdot f_{pha}(s)

(9)

f_{pro}(s) = \begin{cases} 1 & \text{if } QED(s) > 0.1 \\ 0 & \text{otherwise} \end{cases}

(10)

where w_{pro} is the weight for the quantitative estimate of drug-likeness (QED) term, with a default value of 0.1, and f_{pro} is the QED-based property score. QED values range from 0 (all properties are unfavorable) to 1 (all properties are favorable) and are calculated from eight molecular properties [55]. The score function f_{csf} of the CSF is as follows:

f_{csf}(s) = \begin{cases} F_{csf}(s) & \text{if } F_{csf}(s) > 0.9 \\ 0 & \text{otherwise} \end{cases}

(11)

F_{csf}(s) = \frac{n_{match}}{N_{total}}

(12)

The binding site of SARS-CoV-2 3CL^pro (Figure 10) is commonly divided into the catalytic activity center (His41 and Cys145; S1′) and several subsites, defined as S1 (His163, Glu166, Phe140, Leu141, and Asn142), S1′ (His41, Cys145, Gly143, and Ser144), S2 (Tyr54, Asp187, His41, Arg188, His164, and Met49), and S3 (Thr190, Gln192, Glu166, Met165, Leu167, and Gln189). Each subsite may have its own favorable binding fragments. When generated structures (including intermediates) contain these favorable fragments, f_{csf} gives an additional reward to our RL agent. Moreover, w_{con} controls the contribution of the biased fragments to the reward signal, with a default value of 0.6; n_{match} is the number of biased fragments matched in one generated structure; N_{total} is the number of biased fragments defined based on our knowledge from related work; and f_{pha} is the score function of the pharmacophore, which mainly depends on the ligand–protein interaction mode from the crystal structure (PDB ID: 6LU7):

f_{pha}(s) = \begin{cases} 1 & \text{if } s \text{ matches the defined pharmacophores} \\ 0 & \text{otherwise} \end{cases}

(13)

The pharmacophore plot is shown in Supplementary Figure S5. Moreover, w_{pha} controls the contribution of the pharmacophore score to the reward, with a default value of 0.4.
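To make Equations (9)–(13) concrete, the following minimal Python sketch evaluates the reward from precomputed inputs. It is not the authors' implementation: the function names are hypothetical, and the QED value, the fragment match counts, and the pharmacophore match flag are assumed to be computed elsewhere (for example, by a cheminformatics toolkit). Only the thresholds and default weights are taken from the text.

```python
def property_score(qed, threshold=0.1):
    """Equation (10): 1 if QED exceeds the threshold, otherwise 0."""
    return 1.0 if qed > threshold else 0.0


def csf_score(n_match, n_total, threshold=0.9):
    """Equations (11)-(12): matched fraction, kept only if it exceeds 0.9."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    fraction = n_match / n_total
    return fraction if fraction > threshold else 0.0


def pharmacophore_score(matches_pharmacophore):
    """Equation (13): 1 if the defined pharmacophores are matched, otherwise 0."""
    return 1.0 if matches_pharmacophore else 0.0


def drl_reward(qed, n_match, n_total, matches_pharmacophore,
               w_pro=0.1, w_con=0.6, w_pha=0.4):
    """Equation (9): weighted sum of the three scores with the default weights."""
    return (w_pro * property_score(qed)
            + w_con * csf_score(n_match, n_total)
            + w_pha * pharmacophore_score(matches_pharmacophore))


# A structure matching every biased fragment and the pharmacophore.
full = drl_reward(qed=0.5, n_match=4, n_total=4, matches_pharmacophore=True)
# A structure that only passes the QED threshold.
partial = drl_reward(qed=0.5, n_match=2, n_total=4, matches_pharmacophore=False)
print(full, partial)
```

With the default weights, a structure can reach R(s) ≥ 0.6 only if the CSF term contributes, since the QED and pharmacophore terms together sum to at most 0.5.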

4.8. Molecule Generation and Selection

As discussed in Section 4.7, our reward function considered the molecular descriptor threshold (QED > 0.1), the defined pharmacophore mode, and biased fragments. Using the ADQN–FBDD, a total of 4922 unique valid structures, all matching the defined rules, were generated automatically without the pre-training required by other methods [30,36,45,56,57]. Next, all molecules with a high deep-reinforcement-learning score (DRL score: R(s) ≥ 0.6) were selected (47 molecules), as they were more drug-like, contained more favorable fragments of SARS-CoV-2 3CL^pro, and covered more pharmacophore modes than the other candidates obtained from multiple learning steps. These 47 unique molecules were prepared to generate at least one conformation each with local energy minimization, using the OPLS-2005 force field in the “ligand prepare” module of the Schrödinger 2015 software. The 47 unique molecules yielded a total of 163 3D conformations, which were docked into the substrate-binding site of SARS-CoV-2 3CL^pro. To balance precision against calculation time, standard-precision (SP) Glide [58] was initially used to predict the possible non-covalent binding poses in this binding site, with the binding site grid centered on the original ligand N3 [59] and 20 Å buffer dimensions. Following non-covalent docking, we also determined the covalent docking poses and scores for the 47 molecules. We reordered the docking results mainly based on the covalent docking score and the RMSD difference between the covalent and non-covalent poses (Supplementary Table S1).
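The selection and reordering steps above can be illustrated with a short Python sketch. The text states only that the ranking relied mainly on the covalent docking score and the pose RMSD; the lexicographic sort used here, the `Candidate` class, and all numerical values are assumptions made for illustration, including the convention that a more negative docking score is better.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one generated molecule."""
    name: str
    drl_score: float        # R(s) from the reward function
    covalent_score: float   # covalent docking score (assumed: lower is better)
    pose_rmsd: float        # RMSD in Å between covalent and non-covalent poses


def select_leads(candidates, min_score=0.6):
    """Keep molecules whose DRL score meets the R(s) >= 0.6 cut-off."""
    return [c for c in candidates if c.drl_score >= min_score]


def rerank(candidates):
    """Order by covalent docking score first, then by pose RMSD (assumed scheme)."""
    return sorted(candidates, key=lambda c: (c.covalent_score, c.pose_rmsd))


# Made-up values for illustration only.
pool = [
    Candidate("mol_a", 1.1, -7.2, 1.5),
    Candidate("mol_b", 0.5, -9.0, 0.4),   # fails the DRL cut-off
    Candidate("mol_c", 0.7, -8.1, 2.3),
    Candidate("mol_d", 0.7, -8.1, 0.9),   # same score as mol_c, smaller RMSD
]
ranked = rerank(select_leads(pool))
print([c.name for c in ranked])
```

A small RMSD between the two poses suggests that the non-covalent pose already places the warhead near Cys145, which is why it serves as a secondary criterion in this sketch.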

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/biom12060746/s1 [40]. Figure S1: Sequence alignment between SARS-CoV-2 3CL^pro and SARS-CoV 3CL^pro; Figure S2: Structure superposition between SARS-CoV-2 3CL^pro (PDB ID: 6LU7, shown in magenta) and SARS-CoV 3CL^pro (PDB ID: 3D62, shown in cyan) with an RMSD of 0.44 Å; Figure S3: Substrate-binding site superimposition of SARS-CoV-2 3CL^pro (PDB ID: 6LU7, magenta cartoon) and SARS-CoV 3CL^pro (PDB ID: 2HOB, cyan cartoon); Figure S4: Comparison of the ligand conformations between SARS-CoV-2 3CL^pro (PDB ID: 6LU7) and SARS-CoV 3CL^pro (PDB ID: 2HOB); Figure S5: The pharmacophore model embedded in the R function; Table S1: Results of non-covalent and covalent docking; Table S2: 45 rules of different chemical reactions; Table S3: Five clusters for the 47 generated leads.

Author Contributions

B.T. and D.X. designed the study; B.T., F.H. (Fengming He) and D.L. developed the AI-aided methods and wrote the manuscript. F.H. (Fei He), T.W., M.F., Z.N. and Z.W. contributed to the interpretation of the results. All authors reviewed and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the program of China Scholarships Council No. 201806310017 to B.T., the US National Institutes of Health grant R35-GM126985 to D.X., and Development Project of Jilin Province of China grant number 20210101174JC to F.H. (Fei He). We would like to thank Dongqing Wei’s group and the PCL lab for their generous support of high-performance computing resources.

Data Availability Statement

The code for the ADQN–FBDD and related data in this paper will be available at https://github.com/tbwxmu/2019-nCov [40].

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Gorbalenya, A.E. Severe acute respiratory syndrome-related coronavirus–The species and its viruses, a statement of the Coronavirus Study Group. BioRxiv 2020, 937862.
2. Coronavirus COVID-19 Global Cases by Johns Hopkins CSSE. Available online: https://coronavirus.jhu.edu/ (accessed on 19 May 2022).
3. Zhang, L.; Lin, D.; Kusov, Y.; Nian, Y.; Ma, Q.; Wang, J.; von Brunn, A.; Leyssen, P.; Lanko, K.; Neyts, J.; et al. Alpha-ketoamides as broad-spectrum inhibitors of coronavirus and enterovirus replication Structure-based design, synthesis, and activity assessment. J. Med. Chem. 2020, 63, 4562–4578.
4. Jun, C.; Yun, L.; Xiuhong, X.; Ping, L.; Feng, L.; Tao, L.; Zhiyin, S.; Mei, W.; Yinzhong, S.; Hongzhou, L. Efficacy study of lopinavir, ritonavir and abirater for the treatment of COVID-19. Chin. J. Anim. Infect. Dis. 2020, 38, 86–89.
5. Liu, X.; Wang, X.-J. Potential inhibitors for 2019-nCoV coronavirus M protease from clinically approved medicines. J. Genet. Genom. 2020, 47, 119–121.
6. Yang, H.T.; Xie, W.Q.; Xue, X.Y.; Yang, K.L.; Ma, J.; Liang, W.X.; Zhao, Q.; Zhou, Z.; Pei, D.Q.; Ziebuhr, J.; et al. Design of wide-spectrum inhibitors targeting coronavirus main proteases. PLoS Biol. 2005, 3, e428.
7. Singh, J.; Petter, R.C.; Baillie, T.A.; Whitty, A. The resurgence of covalent drugs. Nat. Rev. Drug Discov. 2011, 10, 307–317.
8. Tuley, A.; Fast, W. The taxonomy of covalent inhibitors. Biochemistry 2018, 57, 3326–3337.
9. Jain, R.P.; Pettersson, H.I.; Zhang, J.; Aull, K.D.; Fortin, P.D.; Huitema, C.; Eltis, L.D.; Parrish, J.C.; James, M.N.G.; Wishart, D.S.; et al. Synthesis and evaluation of keto-glutamine analogues as potent inhibitors of severe acute respiratory syndrome 3CLpro. J. Med. Chem. 2004, 47, 6113–6116.
10. Wu, C.-Y.; Jan, J.-T.; Ma, S.-H.; Kuo, C.-J.; Juan, H.-F.; Cheng, Y.-S.E.; Hsu, H.-H.; Huang, H.-C.; Wu, D.; Brik, A.; et al. Small molecules targeting severe acute respiratory syndrome human coronavirus. Proc. Natl. Acad. Sci. USA 2004, 101, 10012–10017.
11. Ghosh, A.K.; Xi, K.; Ratia, K.; Santarsiero, B.D.; Fu, W.; Harcourt, B.H.; Rota, P.A.; Baker, S.C.; Johnson, M.E.; Mesecar, A.D. Design and Synthesis of Peptidomimetic Severe Acute Respiratory Syndrome Chymotrypsin-like Protease Inhibitors. J. Med. Chem. 2005, 48, 6767–6771.
12. Shie, J.-J.; Fang, J.-M.; Kuo, C.-J.; Kuo, T.-H.; Liang, P.-H.; Huang, H.-J.; Yang, W.-B.; Lin, C.-H.; Chen, J.-L.; Wu, A.Y.-T.; et al. Discovery of Potent Anilide Inhibitors against the Severe Acute Respiratory Syndrome 3CL Protease. J. Med. Chem. 2005, 48, 4469–4473.
13. Shie, J.-J.; Fang, J.-M.; Kuo, T.-H.; Kuo, C.-J.; Liang, P.-H.; Huang, H.-J.; Wu, Y.-T.; Jan, J.-T.; Cheng, Y.-S.E.; Wong, C.-H. Inhibition of the severe acute respiratory syndrome 3CL protease by peptidomimetic α, β-unsaturated esters. Bioorganic Med. Chem. 2005, 48, 4469–4473.
14. Al-Gharabli, S.I.; Shah, S.T.A.; Weik, S.; Schmidt, M.F.; Mesters, J.R.; Kuhn, D.; Klebe, G.; Hilgenfeld, R.; Rademann, J. An efficient method for the synthesis of peptide aldehyde libraries employed in the discovery of reversible SARS coronavirus main protease (SARS-CoV Mpro) inhibitors. ChemBioChem 2006, 7, 1048–1055.
15. Lu, I.-L.; Mahindroo, N.; Liang, P.-H.; Peng, Y.-H.; Kuo, C.-J.; Tsai, K.-C.; Hsieh, H.-P.; Chao, Y.-S.; Wu, S.-Y. Structure-Based Drug Design and Structural Biology Study of Novel Nonpeptide Inhibitors of Severe Acute Respiratory Syndrome Coronavirus Main Protease. J. Med. Chem. 2006, 49, 5154–5161.
16. Tsai, K.-C.; Chen, S.-Y.; Liang, P.-H.; Lu, I.-L.; Mahindroo, N.; Hsieh, H.-P.; Chao, Y.-S.; Liu, L.; Liu, D.; Lien, W. Discovery of a novel family of SARS-CoV protease inhibitors by virtual screening and 3D-QSAR studies. J. Med. Chem. 2006, 49, 3485–3495.
17. Wu, C.-Y.; King, K.-Y.; Kuo, C.-J.; Fang, J.-M.; Wu, Y.-T.; Ho, M.-Y.; Liao, C.-L.; Shie, J.-J.; Liang, P.-H.; Wong, C.-H. Stable Benzotriazole Esters as Mechanism-Based Inactivators of the Severe Acute Respiratory Syndrome 3CL Protease. Chem. Biol. 2006, 13, 261–268.
18. Akaji, K.; Konno, H.; Onozuka, M.; Makino, A.; Saito, H.; Nosaka, K. Evaluation of peptide-aldehyde inhibitors using R188I mutant of SARS 3CL protease as a proteolysis-resistant mutant. Bioorganic Med. Chem. 2008, 16, 9400–9408.
19. Ghosh, A.K.; Gong, G.; Grum-Tokars, V.; Mulhearn, D.C.; Baker, S.C.; Coughlin, M.; Prabhakar, B.S.; Sleeman, K.; Johnson, M.E.; Mesecar, A.D. Design, synthesis and antiviral efficacy of a series of potent chloropyridyl ester-derived SARS-CoV 3CLpro inhibitors. Bioorganic Med. Chem. Lett. 2008, 18, 5684–5688.
20. Shao, Y.-M.; Yang, W.-B.; Kuo, T.-H.; Tsai, K.-C.; Lin, C.-H.; Yang, A.-S.; Liang, P.-H.; Wong, C.-H. Design, synthesis, and evaluation of trifluoromethyl ketones as inhibitors of SARS-CoV 3CL protease. Bioorganic Med. Chem. 2008, 16, 4652–4660.
21. Kuo, C.-J.; Liu, H.-G.; Lo, Y.-K.; Seong, C.-M.; Lee, K.-I.; Jung, Y.-S.; Liang, P.-H. Individual and common inhibitors of coronavirus and picornavirus main proteases. FEBS Lett. 2009, 583, 549–555.
22. Ramajayam, R.; Tan, K.-P.; Liu, H.-G.; Liang, P.-H. Synthesis and evaluation of pyrazolone compounds as SARS-coronavirus 3C-like protease inhibitors. Bioorganic Med. Chem. 2010, 18, 7849–7854.
23. Ryu, Y.B.; Jeong, H.J.; Kim, J.H.; Kim, Y.M.; Park, J.-Y.; Kim, D.; Naguyen, T.T.H.; Park, S.-J.; Chang, J.S.; Park, K.H. Biflavonoids from Torreya nucifera displaying SARS-CoV 3CLpro inhibition. Bioorganic Med. Chem. 2010, 18, 7940–7947.
24. Akaji, K.; Konno, H.; Mitsui, H.; Teruya, K.; Shimamoto, Y.; Hattori, Y.; Ozaki, T.; Kusunoki, M.; Sanjoh, A. Structure-Based Design, Synthesis, and Evaluation of Peptide-Mimetic SARS 3CL Protease Inhibitors. J. Med. Chem. 2011, 54, 7962–7973.
25. Jacobs, J.; Grum-Tokars, V.; Zhou, Y.; Turlington, M.; Saldanha, S.A.; Chase, P.; Eggler, A.; Dawson, E.S.; Baez-Santos, Y.M. Discovery, synthesis, and structure-based optimization of a series of N-(tert-butyl)-2-(N-arylamido)-2-(pyridin-3-yl) acetamides (ML188) as potent noncovalent small molecule inhibitors of the severe acute respiratory syndrome coronavirus (SARS-CoV) 3CL protease. J. Med. Chem. 2013, 56, 534–546.
26. Ren, Z.; Yan, L.; Zhang, N.; Guo, Y.; Yang, C.; Lou, Z.; Rao, Z. The newly emerged SARS-like coronavirus HCoV-EMC also has an “Achilles’ heel”: Current effective inhibitor targeting a 3C-like protease. Protein Cell 2013, 4, 248.
27. Thanigaimalai, P.; Konno, S.; Yamamoto, T.; Koiwai, Y.; Taguchi, A.; Takayama, K.; Yakushiji, F.; Akaji, K.; Chen, S.-E. Development of potent dipeptide-type SARS-CoV 3CL protease inhibitors with novel P3 scaffolds: Design, synthesis, biological evaluation, and docking studies. Eur. J. Med. Chem. 2013, 68, 372–384.
28. Turlington, M.; Chun, A.; Tomar, S.; Eggler, A.; Grum-Tokars, V.; Jacobs, J.; Daniels, J.S.; Dawson, E.; Saldanha, A.; Chase, P. Discovery of N-(benzo [1,2,3] triazol-1-yl)-N-(benzyl) acetamido) phenyl) carboxamides as severe acute respiratory syndrome coronavirus (SARS-CoV) 3CLpro inhibitors: Identification of ML300 and noncovalent nanomolar inhibitors with an induced-fit binding. Bioorganic Med. Chem. Lett. 2013, 23, 6172–6177.
29. Kumar, V.; Shin, J.S.; Shie, J.-J.; Ku, K.B.; Kim, C.; Go, Y.Y.; Huang, K.-F.; Kim, M.; Liang, P.-H. Identification and evaluation of potent Middle East respiratory syndrome coronavirus (MERS-CoV) 3CLPro inhibitors. Antivir. Res. 2017, 141, 101–106.
30. Jin, W.; Barzilay, R.; Jaakkola, T. Junction Tree Variational Autoencoder for Molecular Graph Generation. In Artificial Intelligence in Drug Discovery; RSC Publishing: London, UK, 2021; Volume 11, pp. 228–249.
31. Wu, Z.; Ramsundar, B.; Feinberg, E.N.; Gomes, J.; Geniesse, C.; Pappu, A.S.; Leswing, K.; Pande, V. MoleculeNet: A benchmark for molecular machine learning. Chem. Sci. 2017, 9, 513–530.
32. Wang, X.; Li, Z.; Jiang, M.; Wang, S.; Zhang, S.; Wei, Z. Molecule Property Prediction Based on Spatial Graph Embedding. J. Chem. Inf. Model. 2019, 59, 3817–3828.
33. Liu, K.; Sun, X.; Jia, L.; Ma, J.; Xing, H.; Wu, J.; Gao, H.; Sun, Y.; Boulnois, F.; Fan, J. Chemi-Net: A Molecular Graph Convolutional Network for Accurate Drug Property Prediction. Int. J. Mol. Sci. 2019, 20, 3389.
34. You, J.X.; Liu, B.W.; Ying, R.; Pande, V.; Leskovec, J. Graph Convolutional Policy Network for Goal-Directed Molecular Graph Generation. Adv. Neural Inf. Processing Syst. 2018, 31.
35. Grisoni, F.; Moret, M.; Lingwood, R.; Schneider, G. Bidirectional Molecule Generation with Recurrent Neural Networks. J. Chem. Inf. Model. 2020, 60, 1175–1183.
36. Zhavoronkov, A.; Ivanenkov, Y.A.; Aliper, A.; Veselov, M.S.; Aladinskiy, V.A.; Aladinskaya, A.V.; Terentiev, V.A.; Polykovskiy, D.A.; Kuznetsov, M.D.; Asadulaev, A.; et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 2019, 37, 1038–1040.
37. Guimaraes, G.L.; Sánchez-Lengeling, B.; Farias, P.L.C.; Aspuru-Guzik, A. Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models. arXiv 2017, arXiv:1705.10843.
38. Zhou, Z.; Kearnes, S.; Li, L.; Zare, R.N.; Riley, P. Optimization of Molecules via Deep Reinforcement Learning. Sci. Rep. 2019, 9, 10752.
39. Tang, B.; Kramer, S.T.; Fang, M.; Qiu, Y.; Wu, Z.; Xu, D. A self-attention based message passing neural network for predicting molecular lipophilicity and aqueous solubility. J. Cheminform. 2020, 12, 1–9.
40. ADQN–FBDD. Available online: https://github.com/tbwxmu/2019-nCov (accessed on 19 May 2022).
41. Zhu, K.; Borrelli, K.W.; Greenwood, J.R.; Day, T.; Abel, R.; Farid, R.S.; Harder, E. Docking Covalent Inhibitors: A Parameter Free Approach To Pose Prediction and Scoring. J. Chem. Inf. Model. 2014, 54, 1932–1940.
42. Hoffman, R.L.; Kania, R.S.; Brothers, M.A.; Davies, J.F.; Ferre, R.A.; Gajiwala, K.S.; He, M.; Hogan, R.J.; Kozminski, K.; Li, L.Y.; et al. Discovery of Ketone-Based Covalent Inhibitors of Coronavirus 3CL Proteases for the Potential Therapeutic Treatment of COVID-19. J. Med. Chem. 2020, 63, 12725–12747.
43. Walters, W.P.; Murcko, M. Assessing the impact of generative AI on medicinal chemistry. Nat. Biotechnol. 2020, 38, 143–145.
44. Olivecrona, M.; Blaschke, T.; Engkvist, O.; Chen, H. Molecular de-novo design through deep reinforcement learning. J. Cheminform. 2017, 9, 1–14.
45. Popova, M.; Isayev, O.; Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 2018, 4, eaap7885.
46. Van Hasselt, H.; Guez, A.; Silver, D. Deep Reinforcement Learning with Double Q-Learning. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; pp. 2094–2100.
47. Simonini, T. Improvements in Deep Q Learning: Dueling Double DQN, Prioritized Experience Replay, and Fixed Q-Targets. 2018. Available online: https://www.freecodecamp.org/news/improvements-in-deep-q-learning-dueling-double-dqn-prioritized-experience-replay-and-fixed-58b130cc5682/ (accessed on 6 July 2018).
48. Schaul, T.; Quan, J.; Antonoglou, I.; Silver, D. Prioritized Experience Replay. arXiv 2015, arXiv:1511.05952.
49. Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W. OpenAI Gym. arXiv 2016, arXiv:1606.01540.
50. Speck-Planche, A. Recent advances in fragment-based computational drug design: Tackling simultaneous targets/biological effects. Futur. Med. Chem. 2018, 10, 2021–2024.
51. Varin, T.; Schuffenhauer, A.; Ertl, P.; Renner, S. Mining for Bioactive Scaffolds with Scaffold Networks: Improved Compound Set Enrichment from Primary Screening Data. J. Chem. Inf. Model. 2011, 51, 1528–1538.
52. Schuffenhauer, A.; Ertl, P.; Roggo, S.; Wetzel, S.; Koch, M.A.; Waldmann, H. The Scaffold Tree − Visualization of the Scaffold Universe by Hierarchical Scaffold Classification. J. Chem. Inf. Model. 2006, 47, 47–58.
53. Reis, J.; Gaspar, A.; Milhazes, N.; Borges, F. Chromone as a Privileged Scaffold in Drug Discovery: Recent Advances. J. Med. Chem. 2017, 60, 7941–7957.
54. Pillaiyar, T.; Manickam, M.; Namasivayam, V.; Hayashi, Y.; Jung, S.H. An Overview of Severe Acute Respiratory Syndrome-Coronavirus (SARS-CoV) 3CL Protease Inhibitors: Peptidomimetics and Small Molecule Chemotherapy. J. Med. Chem. 2016, 59, 6595–6628.
55. Bickerton, G.R.; Paolini, G.V.; Besnard, J.; Muresan, S.; Hopkins, A.L. Quantifying the chemical beauty of drugs. Nat. Chem. 2012, 4, 90–98.
56. Elton, D.C.; Boukouvalas, Z.; Fuge, M.; Chung, P.W. Deep learning for molecular design—a review of the state of the art. Mol. Syst. Des. Eng. 2019, 4, 828–849.
57. Yang, X.; Wang, Y.; Byrne, R.; Schneider, G.; Yang, S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem. Rev. 2019, 119, 10520–10594.
58. Friesner, R.A.; Murphy, R.B.; Repasky, M.P.; Frye, L.L.; Greenwood, J.R.; Halgren, T.A.; Sanschagrin, P.C.; Mainz, D.T. Extra Precision Glide: Docking and Scoring Incorporating a Model of Hydrophobic Enclosure for Protein−Ligand Complexes. J. Med. Chem. 2006, 49, 6177–6196.
59. Jin, Z.; Du, X.; Xu, Y.; Deng, Y.; Liu, M.; Zhao, Y.; Zhang, B.; Li, X.; Zhang, L.; Duan, Y.; et al. Structure-based drug design, virtual screening and high-throughput screening rapidly identify antiviral leads targeting COVID-19. bioRxiv 2020, 964882.

Figure 1. Art for SARS-CoV-2 3CL^pro lead compound generation.

Figure 2. Framework of the ADQN–FBDD. (A) The ADQN–FBDD consists of a reinforced agent, a prioritized experience replay algorithm, and a chemical environment to perform chemical structure generation. The agent selects an action (insertion, deletion, or none) for the intermediate molecular fragment at each step to generate a new molecule that can maximize the cumulative rewards. The prioritized experience replay algorithm allows the agent to repeat the molecule generation based on the updated maximization of rewards. The chemical environment assesses the agent’s actions according to the predefined chemical rules and provides rewards. (B) Example of fragment-based actions. (C) The solid lines represent actions taken, including the addition or deletion of different fragments during an episode. The dashed lines represent actions that the RL agent considered but did not take. An exploratory action is represented by the red dashed line, which was taken even though another sibling action, the one leading to S*, was ranked higher. The exploratory action did not result in any learning; however, other actions did, resulting in updates, as shown by the curved arrows, where estimated values moved up the tree from later nodes to earlier nodes.

Figure 3. Lead compound #46 generated by AI. (A) Non-covalent docking model of SARS-CoV-2 3CL^pro (brown surface) with the bound lead compound #46 (magenta sticks). The triazole ring binds to the S1 subsite of the active catalytic center, the covalent fragment of the α, β-unsaturated aldehyde binds to the S1′ subsite, the β-lactam ring binds to the S2 subsite, and 5,7-dihydroxy chromone binds to the S3 subsite. (B) Two-dimensional (2D) view of the non-bonding interactions of lead compound #46 in complex with 3CL protease based on non-covalent docking. The triazole ring, ketoamide group, and phenolic hydroxyl group form hydrogen bonds (H-bonds) with His163, Glu166, and Thr190, respectively. (C) Covalent docking model of compound #46 (green sticks) with 3CL protease (brown surface), similar to the non-covalent docking model. (D) Two-dimensional view of ligand interactions between compound #46 and protease under covalent docking. The triazole ring forms an H-bond with His163. The α, β-unsaturated aldehyde forms a covalent bond with Cys145, i.e., the key residue in the catalytic center of the protease, resulting in covalent inhibition. The aldehyde carbonyl group forms an H-bond with His41. The hydroxyl group of chromone at position 7 forms an H-bond with Thr190.

Figure 4. Co-crystal structure of the covalent adduct of (A) 6XHO (cyan) and (B) 7VH8 (magenta) bound to SARS-CoV-2 3CL^pro together with the alignment of (C) lead compound #46 (yellow), and the detailed interactions of the complex: (D) 6LU7_#46, (E) 6XHO, and (F) 7VH8.

Figure 5. Covalent binding models of compounds 46–14–1 and 46–14–2 in complex with SARS-CoV-2 3CL^pro. (A) Covalent docking model of compound 46–14–1 (green sticks) with 3CL protease (yellow-orange surface). The oxyanion hole formed by the segment of α-ketoamide is shown in the red circle. (B) Detailed view of the interactions between compound 46–14–1 and 3CL^pro. (C) Covalent docking model of compound 46–14–2 with 3CL protease. Molecule 46–14–2 is shown as green sticks, and the protein is shown as a brown surface. (D) Two-dimensional view of the interactions between compound 46–14–1 and 3CL^pro.

Figure 6. Covalent binding model of compound 46–14–3 in SARS-CoV-2 3CL^pro. (A) Surface representation of SARS-CoV-2 3CL^pro (brown) complexed with 46–14–3 (green sticks). (B) Stereoscopic view of 46–14–3 in the substrate-binding pocket of SARS-CoV-2 3CL^pro at 4 Å. Molecule 46–14–3 is shown as green sticks. Residues forming H-bonds are shown as yellow sticks. Yellow dashes represent the H-bonds, and the oxyanion hole is in the red circle.

Figure 7. Structures of optimized compounds.

Figure 8. Architecture of the applied deep Q-learning network (ADQN). TD is the temporal difference error.

Figure 9. Structure of the chosen cores.

Figure 10. Binding site of SARS-CoV-2 3CL^pro. (A) Surfaces of subsites that complement the substrate-binding pocket that are labeled as S1 (green), S1′ (red), S2 (magenta), and S3 (cyan). (B) Key residues of the binding site that are presented as green, red, magenta, and cyan lines (S1, S1′, S2, and S3). Images of the binding site were generated by using PyMol (http://www.pymol.org/).

Table 1. Comparisons of existing deep-learning-based molecule design tools.

Representative Tool	Method	Training/Performance
RNNs [35]	Recurrent neural networks	Extensive training data are required to learn the context of the atom composition; may generate many invalid SMILES
GENTRL [36]	Variational-autoencoder-based models	Large training data are required to learn the atom distribution from training SMILES; may still produce some invalid SMILES
ORGAN [37], ORGANIC [38]	Generative adversarial networks	Use adversarial attacks to enhance the success rate; however, this still cannot guarantee valid SMILES
MolDQN [39]	Reinforcement-learning-based models	Employ self-learning without training data and all generated SMILES are valid

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

