Article

Performance Evaluation of Ingenious Crow Search Optimization Algorithm for Protein Structure Prediction

Ahmad M. Alshamrani, Akash Saxena, Shalini Shekhawat, Hossam M. Zawbaa and Ali Wagdy Mohamed
1 Statistics and Operations Research Department, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
2 Department of Electrical Engineering, Central University of Haryana, Mahendergarh 123031, Haryana, India
3 Department of Mathematics, Swami Keshvanand Institute of Technology, Management and Gramothan, Jaipur 302017, Rajasthan, India
4 CeADAR Ireland’s Center for Applied AI, Technological University Dublin, D7 EWV4 Dublin, Ireland
5 Operations Research Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza 12613, Egypt
6 Applied Science Research Center, Applied Science Private University, Amman 11937, Jordan
* Author to whom correspondence should be addressed.
Processes 2023, 11(6), 1655; https://doi.org/10.3390/pr11061655
Submission received: 7 April 2023 / Revised: 14 May 2023 / Accepted: 17 May 2023 / Published: 29 May 2023

Abstract
Protein structure prediction is an important aspect of dealing with critical diseases, and an early prediction of protein folding helps in clinical diagnosis. In recent years, the application of metaheuristic algorithms to this problem has increased substantially because it is computationally complex and time-consuming. Metaheuristics have proven to be adequate tools for dealing with complex problems, offering higher computational efficiency than conventional tools. The work presented in this paper is the development and testing of the Ingenious Crow Search Algorithm (ICSA). First, the algorithm is tested on standard mathematical functions with known properties. Then, the newly developed ICSA is applied to protein structure prediction. The efficacy of the algorithm is tested on a bench of artificial proteins and real proteins of medium length. A comparative analysis of the optimization performance is carried out against some of the leading variants of the crow search algorithm (CSA). The statistical comparison of the results shows the supremacy of the ICSA for almost all protein sequences.

1. Introduction

Proteins are essential macromolecules of living organisms. Due to their complicated structure and importance in bioinformatics, protein structure prediction attracts diverse researchers. Proteins are chains of amino acids of different types and can fold into various states. These folded structures are known as the 3D structures of proteins and play several important roles, acting as catalysts in various reactions, as structural units, as signaling agents and as transport channels in living organisms. Understanding the 3D structure is also helpful in treating various diseases, such as Alzheimer’s disease and cystic fibrosis.
The basic experimental methods of determining protein structure are X-ray crystallography and NMR spectroscopy. However, these methods require excessive amounts of money and time and, hence, are less widely adopted. The prediction of protein structure via the relation between the linear sequence of amino acids and the protein’s 3D structure was conducted in [1,2]. In later years, protein structure prediction was based on the fact that the most stable folding of a protein is the one with minimum free energy [2,3,4,5]. In mathematical terms, the free energy reflects different types of bonding between protein molecules, such as hydrophilic, solvent, hydrogen and entropic effects. As this function is nonconvex, protein structure prediction can be considered a global optimization problem. The prediction is performed in two stages: first, a physical model of the protein is assumed, and second, the energy function is minimized by an optimization algorithm. The HP model [3] and the AB off-lattice model [4] are two physical models widely used for the first stage. In this way, protein structure prediction has been converted from a bioinformatics problem into an optimization problem and has been solved using various nature-inspired algorithms [5,6,7,8].
An algorithm selection process based on fitness landscape analysis is presented in [5]. The artificial bee colony (ABC) algorithm and its improved versions have been applied to the protein folding problem [6,7,8], proving that metaheuristic approaches are good alternatives for solving this NP-hard optimization problem. Different variants of the differential evolution (DE) algorithm were applied to PSP in [9,10,11]. The list of metaheuristic algorithms successfully applied to PSP also includes the improved harmony search [12], the gradient gravitational search algorithm [13], particle swarm optimization [14], ant colony optimization [15], genetic tabu search [16], an adaptive differential evolution algorithm [17], the chaotic grasshopper algorithm [18] and many more. This literature review indicates the importance of the protein structure problem and also suggests that newly developed algorithms can be applied to discover new results. Recently, algorithms inspired by nature and animal behavior have revolutionized the field with their problem-solving capabilities [5,6,7,8]. The crow search algorithm is one of them and has been successfully applied to many problems, such as the frequency modulation synthesis problem, model order reduction and other design problems [19]. Furthermore, some recent applications and facts reported in references [20,21,22,23,24] motivated the authors to conduct detailed investigations into developing a new bridging mechanism in the existing CSA. Some interesting approaches regarding the integration of neural networks with real-life problems, such as bidding strategy planning and response prediction, have been demonstrated very prominently in references [25,26,27]. Similarly, a deep-learning-based approach was employed for protein structure prediction in reference [28]. Likewise, the gradient-based gravitational search algorithm has been employed for conformational searches of the basic building blocks of proteins [29]. In reference [30], essential proteins were identified using chemical reaction optimization and machine learning. Inspired by these possibilities, we propose an ingenious crow search algorithm (ICSA) to solve the aforementioned protein folding problem. This work is an extension of the work reported previously by the authors, in which we replace the cosine function with an exponential function. The following are the main contributions of this manuscript:
  • The protein folding problem is discussed, and the problem is formulated considering the AB off-lattice model.
  • The newly proposed ICSA is applied to a predefined bench of mathematical functions and proteins, and an evaluation of the algorithm is conducted.
  • A meaningful comparison between the performance of various crow search variants and the crow search algorithm itself is conducted on the basis of statistical attribute analysis, box plot analysis and execution time analysis.
The remaining part of the paper is organized into several sections: Section 2 presents the problem formulation of protein structure prediction with energy minimization. Section 3 depicts the development steps of the ICSA and the basic details of the implemented algorithm. Section 4 presents the results of the simulation on the conventional benchmark functions and protein benches. Section 5 concludes the research work in this paper with suggestions for the future direction of research work.

2. Problem Formulation

In this paper, we use the AB off-lattice model, which is a generalized form of the HP model. According to this model, particles are connected to each other by chemical bonds of unit length and then fold into a 3D structure. The structure with the lowest energy is the most stable among all possible structures. This energy is formed by two types of interactions: one is intermolecular (between the protein and solvent molecules) and the other is intramolecular (between any two protein residues). In this way, the AB off-lattice model considers atomic interactions in the energy function that are left out of the HP model. Instead of 20 different types of amino acids, this model only considers two residues, named A (hydrophobic) and B (hydrophilic). Any protein sequence of length r consists of a total of (r − 2) bend angles $\phi_2, \phi_3, \ldots, \phi_{r-1}$. A bend angle is the angle between two successive amino acid bonds, and its direction can be random, i.e., clockwise as well as anticlockwise. The energy and other terms of the protein structure problem are given mathematically by the following equations:
$E = \frac{1}{4}\sum_{i=2}^{r-1}\left(1-\cos\phi_i\right) + 4\sum_{i=1}^{r-2}\sum_{j=i+2}^{r}\left[d_{ij}^{-12} - I(\tau_i,\tau_j)\,d_{ij}^{-6}\right]$  (1)
$d_{ij} = \left\{\left[1+\sum_{k=i+1}^{j-1}\cos\left(\sum_{l=i+1}^{k}\phi_l\right)\right]^2 + \left[\sum_{k=i+1}^{j-1}\sin\left(\sum_{l=i+1}^{k}\phi_l\right)\right]^2\right\}^{1/2}$  (2)
$I(\tau_i,\tau_j) = \frac{1}{8}\left[1+\tau_i+\tau_j+5\,\tau_i\tau_j\right]$  (3)
where E is the energy function to be minimized, and $d_{ij}$ is the distance between residues i and j. $I(\tau_i,\tau_j)$ represents the bonding between residues, with $\tau_i = 1$ for an A residue and $\tau_i = -1$ for a B residue, so that $I(\tau_i,\tau_j) = 1$ for an AA bond, $I(\tau_i,\tau_j) = 0.5$ for a BB bond and $I(\tau_i,\tau_j) = -0.5$ for an AB or BA bond.
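To make the formulation concrete, the following Python sketch evaluates Equations (1)–(3) for a vector of bend angles and a binary residue encoding. It is a minimal illustration, not the authors' code: the function names are ours, a planar chain consistent with Equation (2) is assumed, and τ = +1/−1 encodes A/B residues.

```python
import numpy as np

def interaction(tau_i, tau_j):
    """Equation (3): 1 for AA, 0.5 for BB, -0.5 for AB/BA."""
    return 0.125 * (1 + tau_i + tau_j + 5 * tau_i * tau_j)

def ab_energy(phi, tau):
    """AB off-lattice energy (Equations (1)-(2)) for a sequence of length r.

    phi : array of r-2 bend angles in radians (phi[0] corresponds to phi_2)
    tau : array of r residue labels, +1 for A (hydrophobic), -1 for B (hydrophilic)
    """
    r = len(tau)
    # backbone (bending) term: (1/4) * sum_{i=2}^{r-1} (1 - cos phi_i)
    bending = 0.25 * np.sum(1.0 - np.cos(phi))

    # residue coordinates on a unit-length chain, consistent with Eq. (2)
    angles = np.concatenate(([0.0], np.cumsum(phi)))          # absolute direction of each bond
    steps = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    pos = np.vstack([[0.0, 0.0], np.cumsum(steps, axis=0)])   # r residue positions

    # non-bonded term: 4 * sum_{i=1}^{r-2} sum_{j=i+2}^{r} [d^-12 - I(tau_i,tau_j) d^-6]
    nonbonded = 0.0
    for i in range(r - 2):
        for j in range(i + 2, r):
            d = np.linalg.norm(pos[j] - pos[i])
            nonbonded += d ** -12 - interaction(tau[i], tau[j]) * d ** -6
    return bending + 4.0 * nonbonded
```

For example, the artificial sequence Asm1 (ABAB) from Table 3 would be encoded as tau = [1, -1, 1, -1], leaving two bend angles to optimize.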

3. Ingenious Crow Search Algorithm

The crow is considered a genius bird among other species. The study in [20] shows that its brain is large in comparison to other birds of its size. Crows show superior behavior in hiding and stealing food, mimicking voices and in the mirror test. Crows live in flocks and present a natural example of an optimization process: searching for food, hiding it from others and following each other to learn the location of food. Askarzadeh proposed the crow search algorithm (CSA) inspired by these characteristics, and it became very popular due to its simple structure and small number of parameters. Many researchers have presented improved variants of the CSA and applied them to real engineering problems [19,22,24]. Chaotic variants of the CSA were proposed to solve the feature selection problem in [21]. A modified version of the CSA was applied in [22] to solve the economic load dispatch problem. In [23], the authors employed an improved crow search algorithm (ImCSA) for energy problems. The optimal selection of conductors in a radial distribution network was addressed in [24]. In [19], the authors presented an intelligent CSA incorporating two modifications, namely opposition-based learning and a cosine-based position updating rule, and verified the variant on real engineering problems such as model order reduction and structural design. In this work, an exponential function-based mechanism replaces the cosine-based one, yielding the exponential function-based ingenious crow search algorithm (ICSA). The two constituents of the ICSA are as follows:
  • The first is opposition-based learning, which is used in the initialization phase when the crows generate their positions. Half of the crows generate their positions randomly, and the remaining half generate them according to the following definition (a code sketch of this initialization step is given after this list).
Definition 1:
Let $z = (z_1, z_2, \ldots, z_r)$ be a point in an r-dimensional space, where each $z_i$ is a real number and $z_i \in [a_i, b_i]$ for $i \in \{1, 2, \ldots, r\}$. Then the opposite point of z is given as $\bar{z} = (\bar{z}_1, \bar{z}_2, \ldots, \bar{z}_r)$, where
$\bar{z}_i = a_i + b_i - z_i$  (4)
  • The second modification is an acceleration factor based on an exponential function, which acts as a bridging mechanism between the exploration and exploitation stages of the optimization process. In comparison with a linear function, the exponential function provides better results because it has a high gradient in the exploration stage, which means that a larger area is explored, helping the algorithm to find promising solutions. Later, when the gradient is low, the search area shrinks during the second half, which helps to avoid trapping in local minima. This acceleration factor can be given mathematically as
$AF = 1 - \exp(-t/T)$  (5)
where t is the current iteration and T is the maximum number of iterations.
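A minimal sketch of the opposition-based initialization (Equation (4)) referred to in the first item above is given below, assuming box constraints per dimension; the helper name initialize_flock and its interface are our own, not from the paper.

```python
import numpy as np

def initialize_flock(n_crows, dim, lower, upper, rng=None):
    """Half the crows are placed randomly; the other half are their opposites (Eq. (4))."""
    if rng is None:
        rng = np.random.default_rng()
    half = n_crows // 2
    random_half = rng.uniform(lower, upper, size=(half, dim))
    opposite_half = lower + upper - random_half                       # z_bar_i = a_i + b_i - z_i
    rest = rng.uniform(lower, upper, size=(n_crows - 2 * half, dim))  # handles odd flock sizes
    return np.vstack([random_half, opposite_half, rest])
```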
Let the total number of crows in a flock be N, and let $H_i^j$ denote the location of the food hidden by crow i, which is considered the best position found by crow i, for $i \in \{1, \ldots, N\}$. $U_i^j$ is the position of the ith crow at the jth iteration. Initially, half of the population of crows generates its positions randomly and the other half by Equation (4). Suppose crow i follows crow y at iteration j; then two cases are possible: either crow y knows that crow i is following it and tries to fool it by changing its position swiftly, or crow y does not know that it is being followed. These two cases, combined with the abovementioned modifications, can be represented mathematically as
$U_i^{j+1} = \begin{cases} U_i^j + AF \cdot R_i \cdot L_i \cdot (H_i^j - U_i^j), & \text{if } R_i \geq AP_{i,j} \\ \text{a random position}, & \text{otherwise} \end{cases}$  (6)
where $L_i$ is the flight length of the ith crow, $R_i$ is a random number with $R_i \in [0, 1]$, and $AP_{i,j}$ is the awareness probability of the crow, which helps to create a balance between the exploitation and exploration stages.
In every iteration, the crow updates the location of its food by the following equation:
$H_i^{j+1} = \begin{cases} U_i^{j+1}, & \text{if } fn(U_i^{j+1}) \text{ is better than } fn(H_i^j) \\ H_i^j, & \text{otherwise} \end{cases}$  (7)
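The update rules in Equations (5)–(7) can be sketched as a single ICSA iteration as follows. This is an illustrative implementation under our own assumptions: Equation (6) is taken literally (each crow is guided by its memorised position $H_i$), the awareness probability is treated as a scalar, and positions are clipped to the variable bounds.

```python
import numpy as np

def icsa_iteration(U, H, fitness_fn, t, T, flight_length, awareness_prob,
                   lower, upper, rng=None):
    """One ICSA iteration: position update (Eq. (6)) and memory update (Eq. (7))."""
    if rng is None:
        rng = np.random.default_rng()
    n_crows, dim = U.shape
    AF = 1.0 - np.exp(-t / T)   # exponential acceleration factor, Eq. (5) (assumed form)
    for i in range(n_crows):
        R = rng.random()
        if R >= awareness_prob:
            # guided move towards the memorised food location, as written in Eq. (6)
            U[i] = U[i] + AF * R * flight_length * (H[i] - U[i])
        else:
            # the followed crow is aware: the follower ends up at a random position
            U[i] = rng.uniform(lower, upper, size=dim)
        U[i] = np.clip(U[i], lower, upper)
        # memory update, Eq. (7): keep whichever position has the better (lower) fitness
        if fitness_fn(U[i]) < fitness_fn(H[i]):
            H[i] = U[i].copy()
    return U, H
```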
For the easy understanding of readers, a flow chart of the ICSA [19] is given in Figure 1.
Furthermore, the implementation details of the ICSA have been depicted in the following algorithm.
  • For implementing any real-life optimization problem, the designer must identify the variable composition. In this structure prediction problem, the dimension of the solution string is calculated from the sequence length.
  • As the folding angles lie within [−180°, 180°], the upper and lower bounds of the variables are assigned according to these boundary conditions. Hence, the initialization of the number of crows, along with their search directions, can be finalized with the help of the sequence size and the range of bend angles.
  • For further implementation of the algorithm, the energy function is evaluated in every iteration of the ICSA, and the memory and fitness function values are stacked in an array. Then, the optimal values are retained, and further processing to improve the solution quality is carried out with the help of the position update equation.
Furthermore, as per the stopping criterion of the ICSA, this process is stopped, and optimal values of energy function and corresponding angle values are stored.
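Bringing the sketches above together, a hypothetical driver for the PSP problem could look as follows: the dimension equals the sequence length minus two, and the bend angles are bounded by [−180°, 180°] (used in radians here). The flock size and iteration budget mirror the settings stated in Section 4, while the flight length and awareness probability are typical CSA defaults assumed by us, not values reported in the paper.

```python
import numpy as np

sequence = "ABBABBABABBAB"                        # Am1 from Table 3
tau = np.array([1 if c == "A" else -1 for c in sequence])
dim = len(sequence) - 2                           # one bend angle per interior residue
lower, upper = -np.pi, np.pi                      # [-180, 180] degrees expressed in radians

n_crows, max_iter = 30, 500                       # settings stated in Section 4
U = initialize_flock(n_crows, dim, lower, upper)
H = U.copy()                                      # initial memory = initial positions

energy = lambda phi: ab_energy(phi, tau)
for t in range(1, max_iter + 1):
    U, H = icsa_iteration(U, H, energy, t, max_iter, flight_length=2.0,
                          awareness_prob=0.1, lower=lower, upper=upper)

best = min(H, key=energy)
print("minimum free energy found:", energy(best))
```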

4. Simulation and Results

For proving the efficacy of the proposed ICSA, a detailed investigation has been conducted in this section. First, the performance is evaluated for some standard functions, and then, the performance is evaluated for some artificial protein benches for the determination of the optimal structure.

4.1. Evaluation of ICSA on Conventional Benchmark Functions

Table 1 shows the diverse characteristics of the benchmark functions, along with their dimensions and bounds. From this table, it is observed that the functions used in this experimentation have two characteristics, i.e., unimodal (with one minimum) and multimodal (with multiple minima, including global and local ones), and possess complex landscapes. Both types of landscape are required to evaluate the exploration and exploitation virtues of an algorithm.
Metaheuristics are based on the generation of random numbers within the bounds of the given variables and explore the search space in a very effective manner; hence, the results obtained from these algorithms differ from run to run, and reporting them requires a depiction of statistical attributes. In this experimentation, our aim is to run various improved versions of the crow search algorithm along with the proposed one and to evaluate their optimization properties on the basis of these statistical attributes. The following attributes are chosen for the depiction of the optimization results (a small sketch of how they can be collected is given after the following list).
  • Mean of the fitness values obtained from 20 independent runs.
  • Maximum fitness values obtained from 20 independent runs (Worst value as minimization is performed).
  • Minimum fitness values obtained from 20 independent runs (Best value as minimization is performed).
  • Standard Deviation of the fitness values obtained from 20 independent runs.
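As referenced above, the following sketch shows one way such attributes could be collected over independent runs; run_optimizer is a hypothetical stand-in for any of the compared algorithms and is not part of the paper.

```python
import numpy as np

def collect_statistics(run_optimizer, n_runs=20):
    """Collect the four statistical attributes reported in the tables for one algorithm.

    run_optimizer is a placeholder callable that performs one full optimization run
    and returns the best fitness value found (hypothetical helper, not from the paper).
    """
    best_values = np.array([run_optimizer(seed=s) for s in range(n_runs)])
    return {"Mean": best_values.mean(),
            "Max": best_values.max(),        # worst value, since minimization is performed
            "Min": best_values.min(),        # best value
            "SD": best_values.std(ddof=1)}   # sample standard deviation
```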
Several crow search variants are included in the analysis, and all optimization algorithms are run for minimization. For the optimization environment, the number of search agents (30), the maximum number of iterations (500) and the total number of independent runs (20) are kept constant for all the algorithms. Figure 2 shows the convergence property analysis of these algorithms. It can be observed from these plots that, with the exponential modification, the convergence of the ICSA is faster than that of the parent algorithm and other prominent variants of the CSA.
The results of the statistics attribute analysis have been showcased in Table 2. The following points are observed from this analysis.
  • From this table, it is observed that for the unimodal functions (F1–F7), the ICSA performs well, and the proposed exponential function-based mechanism helps the algorithm to converge. Unimodal functions possess only one minimum in the given search space; hence, it can be concluded that the proposed exponential-driven mechanism helps the algorithm to locate the minimum efficiently.
  • In addition, for several multimodal functions (F8 to F12), the performance is not compromised. Hence, it can be observed that the exploration and exploitation virtues of the ICSA are enhanced by the inculcation of opposition-based learning and the proposed exponential-driven function. The optimal standard deviation and fitness values obtained by the ICSA are shown in boldface and depict the superior quality of the optimization by the proposed ICSA. The proposed mechanisms help the ICSA to avoid stagnation in local minima and provide a big leap in the position updating phase (due to the exponential function).

4.2. Application of ICSA to Protein Structure Prediction

In this section, the simulation results of the ICSA on the protein bench are discussed. It was shown in the previous subsection that the ICSA outperforms the original CSA and some of its leading variants on standard benchmark functions. Furthermore, to test the efficacy of the algorithm, benches of proteins are considered here. The characteristics of these protein benches are shown in Table 3.
A. Statistical Attribute Analysis (SAA)
As metaheuristics instill some degree of uncertainty in the results of the optimization process, it is an established practice to report the results in terms of the mean, maximum, minimum and standard deviation values of independent runs. Adhering to this practice, these statistical attributes are exhibited in Table 4.
  • The results depicted in Table 4 are calculated over 20 independent runs. To make the competition fair, the maximum number of function evaluations is kept constant for all participating algorithms. The following points can be observed from these results:
  • The bench of protein is divided into three major parts, namely very small, small and medium length. Along with this, a real sequence has also been considered. From the observation table, we can conclude that the algorithms gave almost the same values of free energy for Asm1 and Asm2 when compared; however, the values of standard deviation of the results are optimal for the ICSA. These results are depicted in boldface.
  • Inspecting the mean values for As1 and As2, it can be clearly observed that these values are optimal for the ICSA. Along with this, for As2, the standard deviation is also optimal. These results are considered affirmative, and it can be concluded that the ICSA works well for these proteins.
On further inspection, for the medium-size and real protein sequence, we have observed that the mean values are optimal in case of the ICSA, and the algorithm shows promising results. Hence, it can be concluded that acceleration factor-driven bridging and opposition-based learning substantially enhance the performance of the algorithm.
B. Iterative Time Analysis (ITA)
It is a known fact that the execution time of the algorithm is quite important while dealing with complex engineering problems. Unlike classical problems, protein structure prediction is a complex problem, and the execution time for the identification of protein structure is an essential requirement to judge the performance of the algorithm. Taking this fact into consideration, the execution times for independent runs have been calculated, and mean values for the algorithms are depicted in Table 5.
By inspecting the values of mean execution time, it can be easily concluded that the ICSA gives fast and optimal results. The execution time for different protein sequences is optimal for the ICSA and depicted in boldface in Table 5.
C. Box Plot Analysis (BPA)
To compare the optimization performance of the competitors, BPA is also conducted. Diagrams are plotted for the Am1, Am2 and Rs1 sequences. These are depicted in Figure 3, Figure 4 and Figure 5. From these, one can observe that the mean values are optimal and the interquartile range of the ICSA is satisfactory as compared to other participating algorithms. From this analysis, the supremacy of the proposed variant over the CSA and other variants is confirmed.
D. Rank-sum Test Analysis
Figure 6 shows the rank-sum test analysis results in terms of p-values comparing the ICSA with its competitors. The Wilcoxon rank-sum test is conducted to establish the statistical significance of the algorithm compared with the others, since metaheuristics instill uncertainty in their results.
From the figure, it can be seen that the CCSA is significantly different from the ICSA, as the p-values associated with this comparison are less than 0.05 for all the sequences. It is also worth mentioning that a significant difference exists between the ICSA and the CSA for some of the sequences. However, in the case of the ImCSA, the performance of the ICSA is comparable.
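For reference, such a Wilcoxon rank-sum comparison between the per-run results of two algorithms can be computed with SciPy as sketched below; the two samples are synthetic placeholders generated only for illustration, not values from the paper.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)
# Illustrative placeholder samples: per-run best energies of two algorithms on one sequence.
# In the actual study these would be the 20 recorded values for the ICSA and a competitor.
icsa_runs = rng.normal(loc=-1.1, scale=0.5, size=20)
competitor_runs = rng.normal(loc=-0.7, scale=0.5, size=20)

stat, p_value = ranksums(icsa_runs, competitor_runs)
print(f"p-value = {p_value:.4f}, significant at the 5% level: {p_value < 0.05}")
```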
E. Extended Experiments on Real Protein Sequences
To extend the analysis of the proposed algorithm, we tested it on the prediction of the structures of the following real proteins. The details of these proteins are shown in Table 6; they are taken from [31], and more details are available at the Protein Data Bank (PDB, http://www.rcsb.org/pdb/home/home.do, accessed on 15 April 2023) [32].
For evaluating the performance of the ICSA, statistical attributes (SA), such as mean, maximum, minimum and standard deviation, of the fitness function over 29 independent runs are reported in Table 7. The optimal values of the fitness function mean have been shown in boldface in the table. It is observed that the fitness values of the proposed ICSA are optimal for many proteins. Since the mean value of the optimization run is an important parameter to depict the algorithm performance, this has been chosen to showcase the efficacy of the ICSA. It is worth mentioning here that while dealing with long-length protein sequences, the algorithm showed sluggish behavior and took more time for convergence. Hence, a convergence improvement scheme may be employed in the future.
From Table 7, it has been observed that the proposed ICSA exhibits a better response in terms of mean values of SA. Hence, to verify this, convergence curves of RP-2, RP-5, RP-6, RP-7, RP-8 and RP-9 are plotted in Figure 7. From the figure, it has been observed that the ICSA exhibits a slightly better convergence property as compared with other variants of the CSA and the CSA itself. From this point of view, the proposed modification appears more meaningful for the PSP problem.

5. Conclusions

The ingenious crow search algorithm (ICSA) is proposed with a new exponential bridging operator for solving the protein structure prediction problem. Opposition-based learning is implemented in the initialization phase, along with the exponential-driven position update mechanism. The ICSA has been tested on standard benchmark functions and applied to the protein structure prediction problem. The following are the major conclusions of this work:
  • Before experimenting on complex protein sequences, the ICSA was tested on conventional benchmark functions whose minima and search ranges are known a priori. The comparative analysis with some published versions of the CSA shows that the algorithm is substantially improved by the application of the new exponential-driven factor and opposition-based learning. A detailed statistical analysis of the fitness was carried out to exhibit the efficacy of the proposed ICSA.
  • A bench of various protein sequences is considered for testing the efficacy of the ICSA and some of the leading versions of the crow search algorithm and its variants. The bench consists of real and artificial sequences of protein.
  • An extended analysis of the algorithm has been conducted with the help of a real protein bench. The bench consists of a real protein sequence of medium length. The algorithm is evaluated with other opponents on the basis of convergence and SA.
  • Optimization performance has been compared with the help of various analyses, namely SAA, ITA and statistical significance evaluation with the rank-sum test. We observed that the ICSA provides the optimal solution in less computation time, and in some cases, a degree of uniqueness exists in the obtained results.
  • Convergence curves for different conventional functions have been plotted to showcase the optimization efficacy of the ICSA.
For future experimentation, a local search algorithm for enhancing the accuracy of the prediction will be proposed and tested on artificial as well as real protein sequences. In addition, rigorous analysis of some long-length sequences will be executed in the future by the authors.

Author Contributions

Methodology, A.S.; Software, S.S.; Formal analysis, A.W.M.; Resources, A.M.A.; Data curation, H.M.Z.; Writing—original draft, A.M.A., A.S., S.S., H.M.Z. and A.W.M.; Writing—review & editing, A.S., S.S., H.M.Z. and A.W.M.; Funding acquisition, A.M.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the Researchers Supporting Program at King Saud University (RSPD2023R533).

Data Availability Statement

All data sources have been cited in the manuscript.

Acknowledgments

The authors present their appreciation to King Saud University for funding the publication of this research through the Researchers Supporting Program (RSPD2023R533), King Saud University, Riyadh, Saudi Arabia.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Anfinsen, C.B.; Haber, E.; Sela, M.; White, F.H., Jr. The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl. Acad. Sci. USA 1961, 47, 1309. [Google Scholar] [CrossRef] [Green Version]
  2. Anfinsen, C.B. Principles that govern the folding of protein chains. Science 1973, 181, 223–230. [Google Scholar] [CrossRef] [Green Version]
  3. Dill, K.A.; Bromberg, S.; Yue, K.; Chan, H.S.; Ftebig, K.M.; Yee, D.P.; Thomas, P.D. Principles of protein folding—A perspective from simple exact models. Protein Sci. 1995, 4, 561–602. [Google Scholar] [CrossRef] [Green Version]
  4. Stillinger, F.H.; Head-Gordon, T.; Hirshfeld, C.L. Toy model for protein folding. Phys. Rev. E 1993, 48, 1469. [Google Scholar] [CrossRef] [Green Version]
  5. Jana, N.D.; Sil, J.; Das, S. Selection of appropriate metaheuristic algorithms for protein structure prediction in AB off-lattice model: A perspective from fitness landscape analysis. Inf. Sci. 2017, 391, 28–64. [Google Scholar] [CrossRef]
  6. Li, B.; Gong, L.G.; Yang, W.L. An improved artificial bee colony algorithm based on balance-evolution strategy for unmanned combat aerial vehicle path planning. Sci. World J. 2014, 2014, 232704. [Google Scholar] [CrossRef] [Green Version]
  7. Li, B.; Chiong, R.; Lin, M. A balance-evolution artificial bee colony algorithm for protein structure optimization based on a three-dimensional AB off-lattice model. Comput. Biol. Chem. 2015, 54, 1–12. [Google Scholar] [CrossRef] [PubMed]
  8. Vargas Benítez, C.M.; Lopes, H.S. Parallel Artificial Bee Colony Algorithm Approaches for Protein Structure Prediction Using the 3dhp-sc Model. In Intelligent Distributed Computing IV: Proceedings of the 4th International Symposium on Intelligent Distributed Computing-IDC 2010, Tangier, Morocco, September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 255–264. [Google Scholar]
  9. Kalegari, D.H.; Lopes, H.S. A differential evolution approach for protein structure optimisation using a 2D off-lattice model. Int. J. Bio-Inspired Comput. 2010, 2, 242–250. [Google Scholar] [CrossRef]
  10. Kalegari, D.H.; Lopes, H.S. An improved parallel differential evolution approach for protein structure prediction using both 2D and 3D off-lattice models. In Proceedings of the 2013 IEEE Symposium on Differential Evolution (SDE), Singapore, 16–19 April 2013; IEEE: Piscataway, NJ, USA, 2013. [Google Scholar]
  11. Bošković, B.; Brest, J. Protein folding optimization using differential evolution extended with local search and component reinitialization. Inf. Sci. 2018, 454, 178–199. [Google Scholar] [CrossRef] [Green Version]
  12. Jana, N.D.; Sil, J.; Das, S. An improved harmony search algorithm for protein structure prediction using 3D off-lattice model. In Proceedings of the International Conference on Harmony Search Algorithm Springer, Singapore, 22–24 February 2017; pp. 304–314. [Google Scholar]
  13. Dash, T.; Sahu, P.K. Gradient gravitational search: An efficient metaheuristic algorithm for global optimization. J. Comput. Chem. 2015, 36, 1060–1068. [Google Scholar] [CrossRef]
  14. Chen, X.; Lv, M.; Zhao, L.; Zhang, X. An improved particle swarm optimization for protein folding prediction. Int. J. Inf. Eng. Electron. Bus. 2011, 3, 1. [Google Scholar] [CrossRef]
  15. Shmygelska, A.; Hoos, H.H. An ant colony optimisation algorithm for the 2D and 3D hydrophobic polar protein folding problem. BMC Bioinform. 2005, 6, 30. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  16. Zhang, X.; Wang, T.; Luo, H.; Yang, J.Y.; Deng, Y.; Tang, J.; Yang, M.Q. 3D Protein structure prediction with genetic tabu search algorithm. BMC Syst. Biol. 2010, 4, 1–9. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  17. Venske, S.M.; Gonçalves, R.A.; Benelli, E.M.; Delgado, M.R. ADEMO/D: An adaptive differential evolution for protein structure prediction problem. Expert Syst. Appl. 2016, 56, 209–226. [Google Scholar] [CrossRef]
  18. Saxena, A. A comprehensive study of chaos embedded bridging mechanisms and crossover operators for grasshopper optimisation algorithm. Expert Syst. Appl. 2019, 132, 166–188. [Google Scholar] [CrossRef]
  19. Shekhawat, S.; Saxena, A. Development and applications of an intelligent crow search algorithm based on opposition-based learning. ISA Trans. 2020, 99, 210–230. [Google Scholar] [CrossRef]
  20. Rincon, P. Science/nature|crows and jays top bird IQ scale. BBC News, 22 February 2005. [Google Scholar]
  21. Sayed, G.I.; Hassanien, A.E.; Azar, A.T. Feature selection via a novel chaotic crow search algorithm. Neural Comput. Appl. 2019, 31, 171–188. [Google Scholar] [CrossRef]
  22. Mohammadi, F.; Abdi, H. A modified crow search algorithm (MCSA) for solving economic load dispatch problem. Appl. Soft Comput. 2018, 71, 51–65. [Google Scholar] [CrossRef]
  23. Díaz, P.; Pérez-Cisneros, M.; Cuevas, E.; Avalos, O.; Gálvez, J.; Hinojosa, S.; Zaldivar, D. An improved crow search algorithm applied to energy problems. Energies 2018, 11, 571. [Google Scholar] [CrossRef] [Green Version]
  24. Abdelaziz, A.Y.; Fathy, A. A novel approach based on crow search algorithm for optimal selection of conductor size in radial distribution networks. Eng. Sci. Technol. Int. J. 2017, 20, 391–402. [Google Scholar] [CrossRef]
  25. Gupta, E.; Saxena, A. Robust generation control strategy based on grey wolf optimizer. J. Electr. Syst. 2015, 11, 174–188. [Google Scholar]
  26. Kałużyński, P.; Mucha, W.; Capizzi, G.; Lo Sciuto, G. Chemiresistor gas sensors based on conductive copolymer and ZnO blend–prototype fabrication, experimental testing, and response prediction by artificial neural networks. J. Mater. Sci. Mater. Electron. 2022, 33, 26368–26382. [Google Scholar] [CrossRef]
  27. Jain, K.; Saxena, A. Simulation on supplier side bidding strategy at day-ahead electricity market using ant lion optimizer. J. Comput. Cogn. Eng. 2023, 2, 17–27. [Google Scholar]
  28. Yang, K.; Huang, H.; Vandans, O.; Murali, A.; Tian, F.; Yap, R.H.; Dai, L. Applying deep reinforcement learning to the HP model for protein structure prediction. Phys. A Stat. Mech. Its Appl. 2023, 609, 128395. [Google Scholar] [CrossRef]
  29. Pradhan, R.; Panigrahi, S.; Sahu, P.K. Conformational Search for the Building Block of Proteins Based on the Gradient Gravitational Search Algorithm (ConfGGS) Using Force Fields: CHARMM, AMBER, and OPLS-AA. J. Chem. Inf. Model. 2023, 63, 670–690. [Google Scholar] [CrossRef] [PubMed]
  30. Inzamam-Ul-Hossain, M.; Islam, M.R. Identification of Essential Protein Using Chemical Reaction Optimization and Machine Learning Technique. IEEE/ACM Trans. Comput. Biol. Bioinform. 2023. [Google Scholar] [CrossRef]
  31. Jana, N.D.; Das, S.; Sil, J. A Metaheuristic Approach to Protein Structure Prediction: Algorithms and Insights from Fitness Landscape Analysis; Springer: Berlin/Heidelberg, Germany, 2018; Volume 31. [Google Scholar]
  32. RCSB Protein Data Bank (RCSB PDB). Available online: http://www.rcsb.org/pdb/home/home.do (accessed on 6 March 2023).
Figure 1. Flow chart of the ingenious crow search algorithm.
Figure 2. Convergence characteristics of conventional benchmark functions.
Figure 3. BPA for Rs1.
Figure 4. BPA for Am1.
Figure 5. BPA for Am2.
Figure 6. Rank-sum Test Analysis.
Figure 7. Convergence property analysis.
Table 1. Definition of standard benchmark functions.
Function | Dim | Range | Min. Value
$F_1(x) = \sum_{i=1}^{n} x_i^2$ | 30 | [−100, 100] | 0
$F_2(x) = \sum_{i=1}^{n} |x_i| + \prod_{i=1}^{n} |x_i|$ | 30 | [−10, 10] | 0
$F_3(x) = \sum_{i=1}^{n} \left(\sum_{j=1}^{i} x_j\right)^2$ | 30 | [−100, 100] | 0
$F_4(x) = \max_i \{ |x_i|, 1 \leq i \leq n \}$ | 30 | [−100, 100] | 0
$F_5(x) = \sum_{i=1}^{n-1}\left[100\,(x_{i+1}-x_i^2)^2 + (x_i-1)^2\right]$ | 30 | [−30, 30] | 0
$F_6(x) = \sum_{i=1}^{n} (\lfloor x_i + 0.5 \rfloor)^2$ | 30 | [−100, 100] | 0
$F_7(x) = \sum_{i=1}^{n} i\,x_i^4 + \mathrm{random}[0,1)$ | 30 | [−1.28, 1.28] | 0
$F_8(x) = \sum_{i=1}^{n} -x_i \sin\left(\sqrt{|x_i|}\right)$ | 30 | [−500, 500] | −418.9829 × 5
$F_9(x) = \sum_{i=1}^{n}\left[x_i^2 - 10\cos(2\pi x_i) + 10\right]$ | 30 | [−5.12, 5.12] | 0
$F_{10}(x) = -20\exp\left(-0.2\sqrt{\frac{1}{n}\sum_{i=1}^{n} x_i^2}\right) - \exp\left(\frac{1}{n}\sum_{i=1}^{n}\cos(2\pi x_i)\right) + 20 + e$ | 30 | [−32, 32] | 0
$F_{11}(x) = \frac{1}{4000}\sum_{i=1}^{n} x_i^2 - \prod_{i=1}^{n}\cos\left(\frac{x_i}{\sqrt{i}}\right) + 1$ | 30 | [−600, 600] | 0
$F_{12}(x) = \frac{\pi}{n}\left\{10\sin^2(\pi y_1) + \sum_{i=1}^{n-1}(y_i-1)^2\left[1+10\sin^2(\pi y_{i+1})\right] + (y_n-1)^2\right\} + \sum_{i=1}^{n} u(x_i,10,100,4)$, where $y_i = 1 + \frac{x_i+1}{4}$ and $u(x_i,a,k,m) = \begin{cases} k(x_i-a)^m, & x_i > a \\ 0, & -a \leq x_i \leq a \\ k(-x_i-a)^m, & x_i < -a \end{cases}$ | 30 | [−50, 50] | 0
Table 2. Statistical attribute analysis on conventional benchmark functions.
Function | Parameter | CCSA [21] | ICSA | ImCSA [23] | CSA | Function | Parameter | CCSA [21] | ICSA | ImCSA [23] | CSA
F1Mean6.8620460.031499371.70695.606232F7Mean0.0587420.0586670.002090.031414
Max13.020880.6036662978.31512.38207Max0.0982070.1095240.0069880.074844
Min2.3104261.68E−082.01E−061.688793Min0.0235070.0219477.16E−050.003457
SD3.3407160.134802806.73952.928814SD0.0201860.0253290.0016710.017421
F2Mean3.1709290.0683850.2863782.62875F8Mean−6709.28−3805.25−4814.16−6657.98
Max4.8102440.5750823.6581143.987891Max−5012.01−3140.88−2626.53−5498.49
Min1.4052134.61E−050.000421.515934Min−8371.41−4417.09−9016.28−7919.51
SD0.9107250.1733230.8449950.740529SD907.5612453.1582107.081650.6592
F3Mean323.0571797.50475011.694260.3749F9Mean33.984020.34010648.6874719.87135
Max531.16651733.9110026.44541.7801Max54.011394.837234158.726640.31614
Min196.5237256.5388649.746277.18435Min19.83778.43E−090.0001020.500648
SD99.44027408.31962623.501122.8966SD10.064631.11365948.367219.957675
F4Mean6.3082530.1884670.0029764.844995F10Mean4.3002760.0334430.7204543.476109
Max8.1402641.7371540.0184457.114288Max7.3233630.6212339.0067874.978548
Min3.9815777.91E−058.24E−051.400564Min2.8938332.89E−050.0021341.547676
SD1.1604480.4820950.0042751.564157SD1.2187950.1387432.0515220.861077
F5Mean324.0462903.36642.9959247.6814F11Mean1.0768780.0978222.0823081.032996
Max555.564112040.525802.284638.3758Max1.1381950.525478.2598491.08654
Min188.2174271.724528.701128.65027Min1.0245210.0073961.68E−060.938069
SD89.85753460.4031538.028127.9758SD0.0336680.144152.3049790.045092
F6Mean8.5120820.180113242.27615.755481F12Mean4.955293.21E−1010.177592.026635
Max18.402421.7038081797.64413.4591Max10.475271.97E−0919.874674.821795
Min3.8767122.59E−082.48E−061.820985Min1.1124884.59E−115.52E−060.161102
SD3.5381090.412308465.78082.824979SD2.5473754.63E−105.219751.325975
Table 3. Evaluation bench for PSP problem.
S. No. | Name | Length | Sequence
1 | Asm1 | 4 | ABAB
2 | Asm2 | 4 | AAAA
3 | As1 | 5 | AAAAB
4 | As2 | 5 | AAAAA
5 | Am1 | 13 | ABBABBABABBAB
6 | Am2 | 17 | ABABBAABBBAAABABA
7 | Rs1 (1BXP) | 13 | ABBBBBBABBBAB
Table 4. Statistical attribute analysis.
PS | SA | CSA | CCSA | ImCSA | ICSA
Asm1Mean−0.64938−0.64876−0.64935−0.64938
Minimum−0.64938−0.64934−0.64938−0.64938
Maximum−0.64938−0.64628−0.64885−0.64938
Standard Deviation1.99E−160.0006930.0001171.92E−16
Asm2Mean−1.67633−1.67219−1.67178−1.67633
Minimum−1.67633−1.67597−1.67633−1.67633
Maximum−1.67633−1.66024−1.58531−1.67633
Standard Deviation4.86E−160.0043270.0203524.61E−16
As1Mean−1.57712−1.51277−1.54829−1.57822
Minimum−1.58944−1.57024−1.58944−1.58944
Maximum−1.4772−1.46993−1.32764−1.4772
Standard Deviation0.0344750.0287380.0716960.034547
As2Mean−2.76032−2.71044−2.78057−2.80884
Minimum−2.84828−2.83731−2.84828−2.84828
Maximum−2.46639−2.59715−2.45111−2.46639
Standard Deviation0.1124350.0636950.0939850.090723
Am1Mean−0.763090.284905−0.6902−1.11339
Minimum−2.1577−0.14584−1.56744−1.69817
Maximum−0.012210.589463−0.01221−0.40284
Standard Deviation0.6640840.2501810.5041680.522875
Am2Mean−2.9870.90737−2.58315−3.23951
Minimum−4.615110.030558−4.52953−4.99724
Maximum−1.192161.910105−0.79697−1.15554
Standard Deviation1.1074740.4824121.0292531.064372
Rs1Mean−0.68660.204919−0.7064−0.94874
Minimum−1.62337−0.0458−1.45093−1.68243
Maximum−0.091480.432196−0.09148−0.09148
Standard Deviation0.4801410.1458140.4751260.577632
Table 5. Iterative Time Analysis.
PS | CSA | CCSA | ImCSA | ICSA
Asm1 | 0.001632 | 0.002855 | 0.008275 | 0.001577
Asm2 | 0.001651 | 0.002378 | 0.008295 | 0.001604
As1 | 0.00268 | 0.00323 | 0.009296 | 0.0026
As2 | 0.00274 | 0.003219 | 0.009372 | 0.002648
Am1 | 0.040414 | 0.040534 | 0.047735 | 0.040257
Am2 | 0.086517 | 0.087693 | 0.093003 | 0.086233
Rs1 | 0.040848 | 0.041592 | 0.047092 | 0.040451
Table 6. Bench of real proteins.
S. No. | Nomenclature of Protein [31] (Length of Sequence) | Sequence Considered
RP-1 | 2ZNF (18) | ABABBAABBABAABBABA
RP-2 | 1CB3 (13) | BABBBAABBAAAB
RP-3 | 1BX1 (16) | ABAABBAAAAABBABB
RP-4 | 1EDP (17) | ABABBAABBBAABBABA
RP-5 | 1EDN (21) | ABABBAABBBAABBABABAAB
RP-6 | 1SP7 (24) | AAAAAAAABAAABAABBAAAABBB
RP-7 | 2H3S (25) | AABBAABBBBBABBBABAABBBBBB
RP-8 | 1FYG (25) | ABAAABAABBAABBAABABABBABA
RP-9 | 1T2Y (25) | ABAAABAABBABAABAABABBAABB
RP-10 | 2KPA (26) | ABABABBBAAAABBBBABABBBBBBA
RP-11 | 1ARE (29) | BBBAABAABBABABBBAABBBBBBBBBBB
RP-12 | 1K48 (29) | BAAAAAABBAAAABABBAAABABBAAABB
Table 7. Evaluation of ICSA on real protein bench.
Sequence | SA | CSA | CCSA | ImCSA | ICSA | Sequence | SA | CSA | CCSA | ImCSA | ICSA
RP-1Mean−2.3791931−0.7886972−1.7198855−2.4001344RP-7Mean−1.3065694−0.3117043−0.5407107−1.4724554
Max0.02673680.2064326−0.61763130.0267386Max0.00644620.49848380.00631010.0063656
Min−5.0663166−2.6839571−3.4270418−4.3176683Min−3.5274819−1.7353797−2.0208762−2.8963254
SD1.33755920.8557510.77662581.2731138SD1.36017460.70188310.65789561.0617478
RP-2Mean−0.9197507−0.1272179−0.5434708−1.1124384RP-8Mean−3.6225088−2.2060117−2.5448055−3.6272946
Max0.13938050.2355230.13938020.1393803Max−0.5703019−1.1118433−1.554185−1.1893901
Min−3.1515092−1.7230257−2.9664274−3.027823Min−5.730837−3.6879032−3.6678783−5.5395026
SD1.22884560.4572330.9380321.1736009SD1.17025850.89626530.67906311.5134546
RP-3Mean−4.061345−2.0771745−2.6960923−4.073643RP-9Mean−3.9554999−1.5579017−1.9065653−3.9567796
Max−1.3752103−0.9003451−1.010259−2.391513Max0.00424790.57401430.00397360.0040509
Min−6.2571965−3.4880659−6.3372388−5.9779379Min−6.7932439−3.7696626−5.0893686−6.1911524
SD1.30040850.76430661.23787570.881331SD1.51682581.37382491.05192911.6046706
RP-4Mean−1.6090671−0.3213995−1.0254915−1.277614RP-10Mean−3.2471186−1.0720838−2.4729678−2.8468528
Max−0.45288520.3969437−0.1595380.1053464Max−0.8728514−0.0972439−1.070647−0.3651848
Min−3.1296681−1.408081−1.9635722−3.2492808Min−5.1931067−2.1187554−4.9041739−5.848072
SD0.91392860.55801950.54328351.0252792SD1.16306490.65465351.00422021.2853246
RP-5Mean−1.5179187−0.472523−1.4763197−1.6676075RP-11Mean−1.7560662−0.3524027−1.3875453−1.3565324
Max0.07455480.41672090.07451650.0745474Max−0.16023640.2812715−0.1905263−0.1585026
Min−3.9066747−3.0887209−4.3472688−4.7164015Min−3.3050487−1.662606−2.931173−3.3910932
SD1.25627950.97785510.98568351.4249971SD1.01608310.64266340.76220981.2438222
RP-6Mean−8.9429712−4.9479063−6.4622347−9.0543114RP-12Mean−5.8367209−2.8724377−3.7033946−4.8984339
Max−5.9611104−1.285182−3.7284173−5.4037268Max−2.5869554−0.4310465−0.2096519−0.2091465
Min−11.30546−9.0011959−10.247179−13.325058Min−8.8135235−4.9994968−6.6499219−7.9420245
SD1.47206031.77027741.6889181.8708891SD2.03373751.41329171.8780151.765799
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
