
Machine Learning Modeling of Aerobic Biodegradation for Azo Dyes and Hexavalent Chromium

State Key Laboratory of Water Resources and Hydropower Engineering Science, Wuhan University, Wuhan 430072, China
Faculty of Civil Engineering, Technische Universität Dresden, 01069 Dresden, Germany
Department of Automation, Obuda University, 1034 Budapest, Hungary
Department of Mathematics, J. Selye University, 94501 Komarno, Slovakia
Department of Informatics, J. Selye University, 94501 Komarno, Slovakia
Department of Environmental Sciences, PMAS-Arid Agriculture University, Rawalpindi 46300, Pakistan
Institute of Research and Development, Duy Tan University, Da Nang 550000, Vietnam
Author to whom correspondence should be addressed.
Mathematics 2020, 8(6), 913;
Submission received: 9 March 2020 / Revised: 30 May 2020 / Accepted: 3 June 2020 / Published: 4 June 2020
(This article belongs to the Special Issue Advances in Machine Learning Prediction Models)


The present study examines the efficacy of the biosurfactant-producing bacterial strain Klebsiella sp. KOD36 in the biodegradation of azo dyes and hexavalent chromium, individually and in a simultaneous system. The strain exhibited considerable potential for biodegradation of chromium and azo dyes in single and combined systems (maxima of 97% and 94%, respectively). Simultaneous aerobic biodegradation of azo dyes and hexavalent chromium (SBAHC) was modeled using machine learning methods: gene expression programming, random forest, support vector regression, and support vector regression coupled with the fruit fly optimization algorithm. The correlation coefficient, the scattered index, and the Willmott agreement index were employed as statistical metrics to assess the performance of each model separately. In addition, a Taylor diagram was used to compare the methods further. The support vector regression-fruit fly optimization algorithm (SVR-FOA), with a correlation coefficient (CC) of 0.644, scattered index (SI) of 0.374, and Willmott's index of agreement (WI) of 0.607, performed better than the standalone support vector regression (SVR), gene expression programming (GEP), and random forest (RF) methods. The standalone SVR model, with CC of 0.146, SI of 0.473, and WI of 0.408, ranked second. In summary, the hybrid SVR-FOA method gave the most accurate SBAHC estimates of the models tested, and FOA proved to be an effective optimization algorithm for increasing the accuracy of the SVR method.

1. Introduction

Many salts, in particular azo dyes and chromium sulphate, are among the most commonly used chemicals in the leather tanning industry [1]. Azo dyes are regarded as xenobiotics because they are highly resistant to degradation: their chemical structure is designed to resist discoloration when exposed to sunlight, sweat, water, microbes, or other chemical agents [2]. These dyes tend to persist for a long time, circulate, and accumulate in the food chain [3]. Improperly disposed liquid wastes containing azo dyes and their derivatives are visually unpleasant in aquatic ecosystems and reduce the penetration of solar radiation, thereby reducing photosynthesis and the dissolved oxygen (DO) concentration, which ultimately leads to serious toxic effects on marine life and serious environmental damage [4]. In humans, these dyes also have many toxic effects, including eye infections, tumors, allergies, cancer, and respiratory diseases [5].
Wastewater from the textile industry containing organic pollutants often also contains free or complexed metal ions [6]. The activity of microorganisms can be affected by high concentrations of lead, chromium, copper, zinc, arsenic, and cadmium, which are known as heavy metals [7]. The composition of the microbial community may influence its decomposition properties [8]. Chromium exists in two biologically active forms that are stable under their respective redox conditions. Hexavalent chromium (Cr(VI)) is about 100 times more toxic than trivalent chromium (Cr(III)) and is also mutagenic, carcinogenic, and teratogenic [9]. Owing to its high solubility and its ability to permeate the outer layer of the cell, Cr(VI) enters cells, where it is reduced to Cr(III). Cr(III) binds more strongly to DNA, forming DNA-Cr and Cr-protein complexes that tend to alter the structure and function of enzymes and cause mutations. In exposed humans, this leads to skin and respiratory infections and eye irritation, and can even be carcinogenic [10].
Co-contamination with azo dyes and chromium is a complex problem for biodegradation. Physico-chemical and bioremediation technologies are currently available to reduce the more toxic hexavalent chromium to the less toxic Cr(III) in industrial effluents [11] and to degrade azo dyes, but these processes are costly, are less effective in removing combined pollutants, or cause secondary pollution. Several microorganisms, in particular biosurfactant-producing bacteria, are able to grow in a medium amended with hexavalent chromium and reduce it to trivalent chromium. However, such a process is limited by the small number of suitable microorganisms. This limitation can be overcome by the exogenous application of biosurfactant-producing bacteria and a nutrient medium, but doing so requires optimization of the biological process to develop an effective system for biodegradation of azo dyes and Cr(VI).
Statistical methods have been used to optimize medium parameters in some previous studies [11,12,13]. These include the traditional one-factor-at-a-time method, where one factor is varied and the others are kept constant [14,15]; however, this process is time-consuming. Taguchi design [15,16,17], Plackett–Burman design [18,19,20], response surface methodology (RSM) with central composite design (CCD), and n-factor design [21,22,23,24] can be considered the conventional tools for experimental design. However, a second-order polynomial cannot always cope with nonlinear biological interactions [25,26]. This has led researchers to employ computational intelligence techniques to predict and estimate the optimal parameters of simultaneous bioreduction of azo dyes and Cr(VI).
Gene expression programming (GEP) is a metaheuristic approach with a decision tree (DT)-like structure that combines the properties of other data-driven approaches [27]. Likewise, the random forest (RF) technique, which comprises a number of decision trees and naturally integrates feature selection and interaction into the learning process, is a good way to estimate factors that influence bioprocesses [28]. Support vector regression (SVR) is a powerful machine learning (ML) technique developed from statistical learning theory. SVR is popular due to its high generalization performance and its capability to handle nonlinear models by means of kernel techniques [29]. The fruit fly optimization algorithm (FOA) was recently introduced by Pan [30] as an evolutionary optimization algorithm. Its advantages include being easy to understand and less time-consuming. As a new intelligent optimization algorithm, it has attracted attention and has been successfully employed in the autonomous control of surface vessels [31] and the optimization of control bioprocesses [32].
The present study is novel in that it describes simultaneous biodegradation of azo dyes and chromium (VI) using biosurfactant-producing bacteria. These are advantageous over traditional bacteria because biosurfactant production results in micelle formation and enhances the solubility of compounds in aqueous systems. Moreover, the application of SVR-FOA to estimate simultaneous biodegradation of azo dyes and chromium (VI) is a novel strategy; compared with previously employed algorithms, SVR-FOA has fewer parameters and is easy to program. It tunes the regression model with a search procedure inspired by the way fruit flies locate food through smell and vision. The main contribution of the study is a modeling approach based on machine learning for estimating and predicting optimal parameters for SBAHC by the newly isolated bacterial strain Klebsiella sp. KOD36.

2. Methodology

2.1. Culture Medium, Chemicals, and Microorganism

Reactive black dyes and di-potassium chromate were used in this study to determine possible degradation by the bacterial strain. All solutions were prepared in distilled water and sterilized at 121 °C for 20 min. Working solutions of 15, 100, and 150 mg L−1 azo dyes and 2, 5, and 10 mg L−1 chromium were prepared from the stock solutions. The diphenylcarbazide reagent was used to determine the chromium concentration spectrophotometrically. Mineral salts medium (MSM) containing NaCl (1.0 g/L), CaCl2 (0.1 g/L), KH2PO4 (1.0 g/L), MgSO4·7H2O (0.5 g/L), Na2HPO4 (1.0 g/L), and yeast extract (4.0 g/L), amended with various concentrations of Cr(VI) and azo dyes, was used for the degradation studies and for the purification and streaking of bacterial strains already reported to produce biosurfactants. The strain used was isolated previously [33] and identified as Klebsiella sp. through 16S rRNA gene sequencing. The sequence of the identified strain was submitted to GenBank under accession number KT364873 [33].

2.2. Evaluation of Klebsiella sp. KOD36 Potential for Biodegradation of Azo Dyes and Chromium (VI) in Single and Combined System

The biosurfactant-producing strain Klebsiella sp. KOD36 was investigated for its capability as a bioremediation agent in the biodegradation of azo dyes and hexavalent chromium in single and combined aerobic systems. For this purpose, MSM broth was supplemented with various concentrations of chromium (0, 5, 10 mg L−1) and azo dyes (0, 100, 150 mg L−1) in single or combined systems (100 mg L−1 azo dyes and 5 mg L−1 Cr, 100 mg L−1 azo dyes and 10 mg L−1 Cr, 150 mg L−1 azo dyes and 5 mg L−1 Cr, 150 mg L−1 azo dyes and 10 mg L−1 Cr) and inoculated with a uniform Klebsiella sp. KOD36 culture of 2.3 × 10⁷ colony forming units (CFU) mL−1. The glass vials were placed in a shaking, temperature-controlled incubator at 150 rpm for 24 h. Aliquots were taken at regular intervals from each vial to determine the concentration of reduced species. The degradation potential was measured using the technique described by Desai et al. [34].

2.3. Estimation of Optimal Growth Factors for Simultaneous Bioremoval of Azo Dyes and Chromium (VI) in Single and Combined System

Various parameters for the simultaneous biodegradation of azo dyes and chromium were evaluated using the Taguchi model, as shown in Table 1. The measured values of simultaneous biodegradation of azo dyes and chromium are given in Table 2, while the machine learning workflow employed is depicted in Figure 1.

2.4. Methods of Analysis

Azo Dye and Chromium Concentration Measurement

Chromium and azo dye concentrations were measured using the technique described by Desai et al. [34]. In short, 1 mL of sample was drawn from a tube and centrifuged at 10,000 rev/min for 20 min. The chromium concentration was determined by complexing the chromium with the diphenylcarbazide reagent; the pink color that developed was measured at a wavelength of 540 nm. The concentration of each azo dye was likewise determined with a spectrophotometer at 570 nm.
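The text does not spell out how a percent degradation is derived from the measured concentrations. A minimal sketch follows, assuming the common convention of expressing the removed amount as a fraction of the initial concentration; the function name and the readings are hypothetical and are not data from the study.

```python
def percent_removal(initial_mg_per_l: float, residual_mg_per_l: float) -> float:
    """Percent of the initial concentration removed (assumed convention)."""
    if initial_mg_per_l <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * (initial_mg_per_l - residual_mg_per_l) / initial_mg_per_l


# Hypothetical readings, for illustration only (not data from the study).
example = percent_removal(initial_mg_per_l=10.0, residual_mg_per_l=2.0)
print(f"{example:.1f}% removed")
```

If the protocol of Desai et al. [34] normalizes against an uninoculated control instead, the control reading would take the place of the initial concentration.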

2.5. Machine Learning Methodologies

In the present study, our major aim was to design a methodology to estimate and predict the simultaneous biodegradation of azo dyes and chromium (VI) in single and combined systems by means of machine learning. The acquired data were used to train the GEP, RF, SVR, and SVR-FOA models. Gene expression programming is a recent algorithm that describes the relationship between given input variables and the resulting output variables through evolved programs [35].
GEP combines features of the genetic algorithm (GA) and genetic programming (GP) [35,36] and is well suited to solving regression problems. Support vector regression (SVR) is valued for its performance in solving regression problems, particularly nonlinear ones [37]. The selection of basic parameters is an important aspect of the SVR technique; the main choices are C, ε, and the kernel function [38]. A drawback of the technique is that, in certain cases, poorly chosen values of these parameters may result in under- or over-fitting. Therefore, an optimal value of each parameter should be chosen during the SVR training phase. Various methodologies have been adopted to select these values. The fruit fly optimization algorithm (FOA), introduced by Pan [30], is one such technique; it employs the foraging strategy of a fruit fly to search for the optimal value of each SVR parameter. These methods are described as follows.

2.5.1. Gene Expression Programming (GEP)

Gene expression programming is a specialized genetic programming technique that solves optimization problems by evolving an expression tree (ET). The basis of the technique is a tree-like structure that, like a living organism, adapts by changing its shape, size, and composition. Gene expression programs, similar to living organisms, are encoded on fixed-length chromosomes. GEP is therefore a genotype-phenotype system that uses a simple linear genetic code (the genotype) to express a tree-shaped program (the phenotype). Each chromosome contains several genes, each of which encodes a sub-expression tree. The sub-expression trees are connected to one another at a root, and the functions used within them include division, multiplication, subtraction, and addition [39]. Although the genes have a fixed length, the trees they encode vary in size and shape, which is what allows GEP to adapt and evolve. The portion of a gene that is actually expressed is known as the open reading frame (ORF); it provides the solution when decoded into an expression tree [40]. Like other evolutionary algorithms, GEP operates on a population of chromosomes. A fitness function is used to evaluate each chromosome in the population and assign it a value. Different fitness functions have been used with GEP previously [35]. Suitable chromosomes are selected for the next generation and are then modified by genetic operators. The process of selection and modification continues until a satisfactory solution is attained [35,41].
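The genotype-to-phenotype step described above can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes single-character symbols, the four arithmetic functions named in the text, a breadth-first (level-by-level) decoding of the gene, and a protected division that returns 1.0 when the divisor is zero. The gene string and the input values are hypothetical.

```python
import operator
from collections import deque


def protected_div(a, b):
    # Assumption: return 1.0 on division by zero so every tree stays evaluable.
    return a / b if b != 0 else 1.0


# symbol -> (arity, implementation); the four functions named in the text.
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "/": (2, protected_div),
}


def decode(gene):
    """Read a gene left to right and fill the expression tree level by level."""
    symbols = iter(gene)
    root = [next(symbols), []]  # node = [symbol, children]
    queue = deque([root])
    while queue:
        node = queue.popleft()
        arity = FUNCTIONS[node[0]][0] if node[0] in FUNCTIONS else 0
        for _ in range(arity):
            child = [next(symbols), []]
            node[1].append(child)
            queue.append(child)
    return root


def evaluate(node, inputs):
    """Evaluate the tree recursively; terminals are looked up in `inputs`."""
    symbol, children = node
    if symbol in FUNCTIONS:
        func = FUNCTIONS[symbol][1]
        return func(*(evaluate(child, inputs) for child in children))
    return inputs[symbol]


inputs = {"a": 1.0, "b": 2.0, "c": 5.0, "d": 3.0}
gene = "*+-abcdab"  # the last two symbols lie outside the ORF and are never read
print(evaluate(decode(gene), inputs))  # evaluates (a + b) * (c - d)
```

Decoding stops as soon as every function node has its arguments, so the unread tail of the fixed-length gene plays the role of the non-coding region: a mutation in the head can lengthen or shorten the ORF without changing the gene length.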

2.5.2. Support Vector Regression

SVR is normally used for solving regression problems and has been used as an estimation and prediction tool in biological systems [42,43,44]. It is a supervised learning method based on structural risk minimization (SRM), in contrast to the empirical risk minimization (ERM) used in conventional neural networks. ERM reduces the error on the training dataset only, whereas SRM also limits model complexity and thereby aims to reduce the generalization error. Therefore, SVR has the potential to reduce the errors of commonly practiced neural networks [37,38]. The main idea of SVR is to map the input data into a higher-dimensional space by means of a kernel function, so that a nonlinear regression problem can be handled as a linear one in that space; yet not every function is adequate as a kernel, particularly for more complex regression problems. Each kernel function must have two features, symmetry and compliance with the Cauchy-Schwarz criterion. These features ensure that the new space is well defined by the kernel. In general, the performance of support vector regression depends on its parameters, which include ε (the width of the insensitive zone used to fit the training data), C (the trade-off between relative error and smoothness), and γ (the kernel parameter). To optimize the various SVR parameters, different algorithms have been developed.
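To make the roles of ε and γ concrete, the sketch below shows the two building blocks in isolation. It assumes a radial basis function (RBF) kernel, which the text does not name explicitly but which is the usual kernel carrying a γ parameter, together with the standard ε-insensitive loss; C then weights the summed loss against the model-smoothness term in the training objective. The function names are illustrative.

```python
import math


def rbf_kernel(x, z, gamma):
    """k(x, z) = exp(-gamma * ||x - z||^2); assumed kernel, not stated in the text."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(observed, predicted, epsilon):
    """Zero inside the epsilon tube around the observation, linear outside it."""
    return max(0.0, abs(observed - predicted) - epsilon)


def kernel_is_symmetric(points, gamma):
    """Check k(p, q) == k(q, p) for every pair of points."""
    return all(
        math.isclose(rbf_kernel(p, q, gamma), rbf_kernel(q, p, gamma))
        for p in points
        for q in points
    )


points = [(30.0, 7.0), (35.0, 6.5), (25.0, 8.0)]  # hypothetical (temperature, pH)
print(kernel_is_symmetric(points, gamma=0.05))
print(epsilon_insensitive_loss(observed=0.90, predicted=0.93, epsilon=0.05))
```

A larger γ makes the kernel more local, so the fitted function can bend more sharply; a larger ε widens the tube within which residuals cost nothing.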

2.5.3. FOA (Fruit Fly Optimization Algorithm)

The FOA is a biologically inspired algorithm modeled on the food-searching behavior of the Drosophila fruit fly, introduced by Pan [30]. The fruit fly has senses of smell and vision that are superior to those of other insects. It first uses smell to move toward a food source and then employs vision when approaching it. This information determines the route leading to the identified food source [45]. The schematic diagram of the SVR-FOA methodology, adapted from Nabipour et al. [46], is shown in Figure 2.
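The smell-then-vision search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: each tuned parameter is given its own (x, y) swarm coordinate, the candidate value is the reciprocal of the distance to the origin as in Pan's formulation, and `toy_error` is a hypothetical stand-in for the validation error of an SVR trained with the candidate parameters. Swarm size, step length, and iteration count are arbitrary.

```python
import math
import random


def foa_minimize(objective, dim, n_flies=20, n_iter=100, step=1.0, seed=0):
    """Minimize objective(s), where s holds one positive candidate per parameter."""
    rng = random.Random(seed)
    # Initial swarm location: one (x, y) pair per tuned parameter.
    x_axis = [rng.uniform(0.5, 1.5) for _ in range(dim)]
    y_axis = [rng.uniform(0.5, 1.5) for _ in range(dim)]

    def smell_concentration(xs, ys):
        # S_j = 1 / Dist_j, with Dist_j the distance of fly coordinate j to the origin.
        return [1.0 / max(math.hypot(x, y), 1e-12) for x, y in zip(xs, ys)]

    best_s = smell_concentration(x_axis, y_axis)
    best_val = objective(best_s)
    for _ in range(n_iter):
        candidates = []
        for _ in range(n_flies):
            # Smell phase: each fly takes a random step from the swarm location.
            xs = [x + rng.uniform(-step, step) for x in x_axis]
            ys = [y + rng.uniform(-step, step) for y in y_axis]
            s = smell_concentration(xs, ys)
            candidates.append((objective(s), s, xs, ys))
        val, s, xs, ys = min(candidates, key=lambda c: c[0])
        if val < best_val:
            # Vision phase: the swarm relocates to the best fly found so far.
            best_val, best_s = val, s
            x_axis, y_axis = xs, ys
    return best_s, best_val


def toy_error(params):
    # Hypothetical stand-in for a validation error over three SVR parameters.
    target = (0.5, 0.8, 0.3)
    return sum((p - t) ** 2 for p, t in zip(params, target))


best_params, best_error = foa_minimize(toy_error, dim=3)
print(best_params, best_error)
```

Because the candidate is a reciprocal distance, it is always positive, which suits parameters such as C, γ, and ε; in practice the candidates would still need rescaling to each parameter's plausible range.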

2.6. Modeling Methodologies

The simultaneous biodegradation of azo dyes and chromium was estimated and modeled using DT techniques (e.g., GEP, SVR, RF, and SVR-FOA). DT techniques present a simple interpretation of results by handling nonlinear and nonparametric variables. The performance of the mentioned techniques was evaluated by comparing results with the empirical relationships presented by other researchers.
Simultaneous degradation of azo dyes and chromium (%) = 3.87091 Nitrogen + Incubation Nitrogen + 1.28408 Nitrogen Carbon 0.929413 + pH + pH Carbon Temperature + Nitrogen (Carbon Nitrogen pH + Temperature)
In the above formulation, the categorical inputs are entered as numeric codes: the nitrogen source is yeast, ammonium sulfate, or urea, and the carbon source is glucose, sucrose, or starch.

Parameters for Evaluation of Models’ Performance

The performance (predictive) of the recommended model was evaluated as CC, SI, and WI. These statistics are presented as follows [47,48]:
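I: CC expressed as:
$$CC = \frac{\sum_{i=1}^{n} O_i P_i - \frac{1}{n}\sum_{i=1}^{n} O_i \sum_{i=1}^{n} P_i}{\sqrt{\left(\sum_{i=1}^{n} O_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} O_i\right)^2\right)\left(\sum_{i=1}^{n} P_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} P_i\right)^2\right)}}$$
II: SI follows as:
$$SI = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - O_i\right)^2}}{\bar{O}}$$
III: WI expressed as:
$$WI = 1 - \left[\frac{\sum_{i=1}^{n}\left(O_i - P_i\right)^2}{\sum_{i=1}^{n}\left(\left|P_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2}\right]$$
The targeted and predicted values are denoted as $O_i$ and $P_i$, respectively; $\bar{O}$ is the mean of the targeted values and $n$ is the number of samples.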
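The three statistics can be computed directly from paired lists of targeted and predicted values. The sketch below is one possible implementation using only the standard library; the function names and the sample values are illustrative and are not taken from the study's data.

```python
import math


def correlation_coefficient(observed, predicted):
    """Pearson correlation between targeted (O) and predicted (P) values."""
    n = len(observed)
    sum_o, sum_p = sum(observed), sum(predicted)
    numerator = sum(o * p for o, p in zip(observed, predicted)) - sum_o * sum_p / n
    spread_o = sum(o * o for o in observed) - sum_o ** 2 / n
    spread_p = sum(p * p for p in predicted) - sum_p ** 2 / n
    return numerator / math.sqrt(spread_o * spread_p)


def scattered_index(observed, predicted):
    """Root mean square error divided by the mean of the targeted values."""
    n = len(observed)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)
    return rmse / (sum(observed) / n)


def willmott_index(observed, predicted):
    """Willmott's index of agreement; 1 indicates perfect agreement."""
    mean_o = sum(observed) / len(observed)
    numerator = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    denominator = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(observed, predicted)
    )
    return 1.0 - numerator / denominator


# Hypothetical degradation percentages, for illustration only.
targeted = [60.0, 70.0, 80.0, 90.0]
estimated = [62.0, 69.0, 83.0, 88.0]
print(correlation_coefficient(targeted, estimated))
print(scattered_index(targeted, estimated))
print(willmott_index(targeted, estimated))
```

Higher CC and WI and lower SI indicate a better model, which is the sense in which the models are ranked in the results.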

3. Results and Discussion

In the present study, the potential of biosurfactant-producing strain Klebsiella sp. KOD36 was tested for its simultaneous reduction of chromium and reactive black-5 azo dyes (RB-5). Additionally, optimization of environmental and nutritional parameters during simultaneous biodegradation of chromium and azo dyes was assessed using GEP, SVR, RF, and SVR-FOA. Table 3 presents the statistical analysis parameters of the used data.

3.1. Evaluation of Klebsiella sp. KOD36 for Biodegradation of Azo Dyes and Chromium (VI)

3.1.1. In Single System

The results presented in Figure 3 show that for chromium at 5 mg L−1, a 98% reduction relative to the control was observed 24 h after inoculation with the bacterial strain Klebsiella sp. KOD36, while for 10 mg L−1 of chromium up to 80% reduction occurred, and for chromium (5 mg L−1) 80% reduction of chromium was observed compared to control (Figure 4).
For azo dyes at 100 mg L−1, a maximum of 90% degradation was observed, while at 150 mg L−1 a maximum of 87% degradation was observed after 24 h. Inoculation with the biosurfactant-producing strain Klebsiella sp. KOD36 significantly enhanced the biodecolorization of reactive black dyes compared with the sample lacking the bacterial strain. In a previous study, B. circulans BWL1061 was shown to decolorize azo dyes [37], with biodecolorization attributed to enhancement of the enzymes responsible for dye degradation. The significant improvement in biodegradation of azo dyes and chromium (VI) by biosurfactant at the critical micelle concentration (CMC) and by biosurfactant-producing bacteria indicates that electrostatic attraction forces and the hydrophobic part of the biosurfactant play a vital role in the biodegradation of dyes. A similar finding and mechanism were described by Liu et al. [49], who reported that the isolated strain BWL1061 exhibited degradation potential for azo dyes, likely owing to interaction between the hydrophobic moiety of the biosurfactant and the dyes. Similarly, Thacker and Madamwar [50] described the capability of the biosurfactant-producing bacterial strains Ochrobactrum sp. and Bacillus sp. to reduce hexavalent chromium in a batch experiment.

3.1.2. In Combined System

Microorganisms are important in the way that they are excellent bioremediation agents for heavy metal contamination (soil and water). Microorganisms showing a significant resistance to heavy metals have potential as remediation agent in detoxification of these heavy metals. However, in certain cases, more specifically under co-contamination, their efficiency is hindered, as heavy metals cause toxicity to microorganisms and lower their efficiency for biodegradation of azo dyes. In this scenario, the following investigation was designed to investigate the efficacy of biosurfactant and biosurfactant-producing bacteria for decolorization of azo dyes with various concentrations of chromium (VI).
In the simultaneous degradation experiments with azo dyes at 100 mg L−1 and Cr at 5 mg L−1, 89% degradation relative to the control was observed 24 h after inoculation with Klebsiella sp. KOD36 (Figure 5). For 100 mg L−1 azo dyes and 10 mg L−1 Cr, up to 94% simultaneous degradation was observed after 24 h, relative to the control. The degradation for 150 mg L−1 azo dyes with 5 mg L−1 Cr and for 150 mg L−1 azo dyes with 10 mg L−1 Cr was 91% and 82%, respectively, after 24 h, relative to the control. Similar effects were observed by Halmi et al. [51], who isolated a novel strain capable of decolorizing four different dyes, namely amaranth dye, Biebrich scarlet, direct blue, and metanil yellow, under aerobic conditions. The isolated bacterial strain decolorized a maximum of 52% of the dyes (initial concentration 150 ppm potassium dichromate) in nutrient broth medium after 24 h of incubation with shaking at 150 rpm. The enhanced biodecolorization is likely due to biosurfactants reducing the toxicity of hexavalent chromium by entrapping it in micelles, which lowers its bioavailability to microorganisms and thereby enhances bacterial decolorization of azo dyes.
The presence of organic and inorganic compounds gives rise to mixed pollution in industrial zones [38]. Conventional wastewater contains various types of organic and inorganic contaminants, which require immediate attention. Chang et al. [52] studied the high salinity-tolerant bacterial strains A12 and L for their ability to biodegrade sulfamethoxazole (SMX). They found that, under both aerobic and anaerobic conditions, strains A12 and L significantly degraded SMX in batch experiments with milkfish culture pond sediment. Biosurfactant plays a vital role in this regard. Micelles formed by the biosurfactant entrap the heavy metal (chromium) in their core, reducing its bioavailability and thus protecting the bacterial cells from chromium [30]. Consequently, the bacterial cells efficiently decolorized azo dyes in the presence of Cr(VI). The above findings show that Klebsiella sp. KOD36 is a suitable choice for reclaiming sites contaminated with azo dyes and Cr(VI).

3.2. Modeling Outcomes

There is no fixed rule for splitting data into training and testing sets. In this study, the data were divided into training (67%) and testing (33%) sets to develop the GEP, RF, SVR, and SVR-FOA models for SBAHC estimation. SVR-FOA optimized the default SVR parameter values to increase the accuracy of predictions; the default and optimized values for SVR and SVR-FOA are presented in Table 4. The functional parameters of the GEP model are shown in Table 5. With these default and optimized parameters, the defined scenarios for the SVR, SVR-FOA, GEP, and RF models are shown in Table 6.
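A 67%/33% partition of this kind can be sketched as below. The text does not say whether the split was random or sequential, so the shuffle, the seed, and the sample count of 27 are assumptions made purely for illustration.

```python
import random


def split_indices(n_samples, train_fraction=0.67, seed=42):
    """Shuffle the row indices and cut them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


# 27 is a hypothetical number of experimental runs, not a figure from the study.
train_idx, test_idx = split_indices(n_samples=27)
print(len(train_idx), len(test_idx))
```

Fixing the seed keeps the partition identical across the four models, so differences in the test metrics reflect the models rather than the split.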
It is clear from Table 6 that SVR-FOA showed the best estimation performance. SVR-FOA, with a CC of 0.644, SI of 0.374, and WI of 0.607, estimated SBAHC more accurately than the other models considered and was hence chosen as the best model, followed by SVR with a CC of 0.146, SI of 0.473, and WI of 0.408. Although the CC of SVR was low, its lower SI error suggests that it may be more appropriate than the GEP and RF models. Of the GEP and RF models, GEP showed the weaker performance, with a CC of 0.387, SI of 0.647, and WI of 0.456. Furthermore, it can be concluded from Table 6 that, relative to SVR, GEP, and RF, SVR-FOA increased CC by 341.1%, 66.4%, and 57.1%, respectively, reduced SI by 20.9%, 42.2%, and 27.9%, respectively, and increased WI by 48.8%, 33.1%, and 19.7%, respectively. Although GEP had lower accuracy in predicting simultaneous biodegradation, the model could still be used for SBAHC. The GEP formulation is presented below.
SBAHC = 3.87091 Nitrogen + Incubation Nitrogen + 1.28408 Nitrogen Carbon 0.929413 + pH + pH Carbon Temperature + Nitrogen (Carbon Nitrogen pH + Temperature)
In the above formulation, the following codes should be used: for nitrogen, yeast = 1, ammonium = 2, and urea = 3; for carbon, glucose = 1 and sucrose = 2.
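Before the formulation can be evaluated, the categorical nitrogen and carbon sources have to be turned into the numeric codes just listed. A small lookup such as the following would do; the dictionary and function names are illustrative, and only the codes stated in the text are included.

```python
# Codes as listed in the text for the GEP formulation.
NITROGEN_CODES = {"yeast": 1, "ammonium": 2, "urea": 3}
CARBON_CODES = {"glucose": 1, "sucrose": 2}


def encode_sources(nitrogen, carbon):
    """Map source names to the numeric codes used in the GEP formulation."""
    try:
        return NITROGEN_CODES[nitrogen], CARBON_CODES[carbon]
    except KeyError as err:
        raise ValueError(f"no code listed in the text for {err.args[0]!r}") from None


print(encode_sources("urea", "sucrose"))
```

Note that such integer codes impose an ordering on the sources, so the fitted formula should only be evaluated at the listed code values.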
The performance parameters of the various models used in the present study are also shown as a bar chart (Figure 6). The chart shows that SVR-FOA had the highest potential for predicting and estimating SBAHC.
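The percentage gains of SVR-FOA over SVR and GEP can be re-derived from the CC, SI, and WI values quoted in the text, as the short check below does. RF is omitted because its metric values appear only in Table 6; the helper names are illustrative.

```python
# Metric values as quoted in the text.
REPORTED = {
    "SVR-FOA": {"CC": 0.644, "SI": 0.374, "WI": 0.607},
    "SVR": {"CC": 0.146, "SI": 0.473, "WI": 0.408},
    "GEP": {"CC": 0.387, "SI": 0.647, "WI": 0.456},
}


def percent_change(reference, improved):
    """Absolute change from the reference value, as a percentage of the reference."""
    return 100.0 * abs(improved - reference) / reference


def improvement_over(baseline):
    """Percent gain of SVR-FOA over a baseline (increase in CC/WI, decrease in SI)."""
    best = REPORTED["SVR-FOA"]
    return {
        metric: round(percent_change(REPORTED[baseline][metric], best[metric]), 1)
        for metric in best
    }


for model in ("SVR", "GEP"):
    print(model, improvement_over(model))
```

The large CC gain over SVR reflects the very small baseline CC of 0.146 more than it reflects a high absolute correlation for SVR-FOA.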
The SBAHC predictions of the various models are also shown in Figure 7, from which it can be observed that SVR-FOA performed better than the other models considered. Furthermore, Figure 8 presents scatter plots of the predicted SBAHC values for the SVR-FOA, SVR, GEP, and RF models. The less-scattered points of SVR-FOA clearly indicate that its predictions were more accurate than those of the standalone SVR, GEP, and RF models.
Furthermore, Taylor diagrams (TD) were employed to examine the standard deviation (SD) and CC values for the SVR-FOA, SVR, GEP, and RF models. Figure 9 presents the TD for all models. It can be seen from Figure 9 that SVR-FOA (the grey point), owing to its shorter distance from the observed (green) point, provided relatively precise predictions of SBAHC values. In conclusion, SVR-FOA with optimized values (C = 1.7242, γ = 0.0517, and ε = 0.0468) and input parameters of temperature, pH, incubation period (IP), and shaking is more capable of accurate SBAHC estimation than the standalone SVR and GEP models, and may be recommended for further implementations [36,53,54].
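The distance between a model point and the observed point on a Taylor diagram has a concrete meaning: it equals the centered root mean square difference, which is tied to the two standard deviations and the correlation by the law of cosines. The sketch below computes these quantities and checks the identity numerically; the function name and the sample values are hypothetical.

```python
import math


def taylor_statistics(observed, predicted):
    """Return (SD of observed, SD of predicted, CC, centered RMS difference)."""
    n = len(observed)
    mean_o, mean_p = sum(observed) / n, sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    sd_o = math.sqrt(sum(d * d for d in dev_o) / n)
    sd_p = math.sqrt(sum(d * d for d in dev_p) / n)
    cc = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * sd_o * sd_p)
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return sd_o, sd_p, cc, crmsd


# Hypothetical degradation percentages, for illustration only.
sd_o, sd_p, cc, crmsd = taylor_statistics(
    [60.0, 70.0, 80.0, 90.0], [62.0, 69.0, 83.0, 88.0]
)
# Law of cosines behind the diagram: E'^2 = SD_o^2 + SD_p^2 - 2 * SD_o * SD_p * CC
print(math.isclose(crmsd ** 2, sd_o ** 2 + sd_p ** 2 - 2 * sd_o * sd_p * cc))
```

A point close to the observed reference therefore combines a high CC with a standard deviation close to that of the observations.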
The need for robust models that can handle a large number of input variables is evident today. In another study, Amato et al. [55] illustrated that, in a well-defined geographic area, the use of online social networks may increase real-time detection efficiency and alert diffusion. They proposed a multicomplex big data system that uses clustering event detection techniques along with multimedia content and biologically inspired programming to develop alerts.

4. Conclusions

Klebsiella sp. KOD36 significantly boosted the biodegradation of azo dyes. Additionally, the capabilities of the SVR, SVR-FOA, GEP, and RF models in estimating SBAHC values were inspected. The performance of the studied methods was examined using the CC, SI, and WI parameters, and Taylor diagrams were used for further assessment. The results indicated that SVR-FOA, with CC of 0.644, SI of 0.374, and WI of 0.607, performed better than the standalone SVR, GEP, and RF methods. The standalone SVR model ranked second, with CC of 0.146, SI of 0.473, and WI of 0.408. This can be verified by the presented Taylor diagram. Because the points of SVR-FOA were less scattered, it can be concluded that its estimates were more accurate than those of the other models studied. The fruit fly optimization algorithm had a remarkable impact in reducing the prediction errors of the standalone SVR method and can be recommended for SBAHC estimation. However, the main drawback of SVR-based models is that training time becomes very high for large datasets or large populations, and the determination of parameters depends mainly on the researcher's experience.

Author Contributions

Conceptualization, Z.A., H.Z., M.S., A.K. and S.M.; formal analysis, H.Z., M.S., H.S., A.K. and S.M.; investigation, H.Z. and A.K.; methodology, Z.A., A.M. and H.S.; project administration, H.S. and S.M.; resources, Z.A., H.Z. and M.S.; software, A.M.; supervision, A.M.; validation, N.N.; visualization, M.S. and N.N.; writing—original draft, Z.A., and A.M.; writing—review and editing, Z.A., A.M. and N.N. All authors have read and agreed to the published version of the manuscript.


This research was supported by the National Natural Science Foundation of China (51629901) and Major Science and Technology Program for Water Pollution Control and Treatment of China (2017ZX07108-001).


We acknowledge the financial support of this work by the Hungarian State and the European Union under the EFOP-3.6.1-16-2016-00010 project and the 2017-1.3.1-VKE-2017-00025 project. This research has been supported by the Project: “Support of research and development activities of the J. Selye University in the field of Digital Slovakia and creative industry” of the Research and Innovation Operational Programme (ITMS code: NFP313010T504) co-funded by the European Regional Development Fund.

Conflicts of Interest

The authors declare no conflict of interest.


SBAHC: Simultaneous aerobic biodegradation of azo dyes and hexavalent chromium
GEP: Gene expression programming
RF: Random forest
SVR: Support vector regression
FOA: Fruit fly optimization algorithm
SRM: Structural risk minimization
ERM: Empirical risk minimization
CC: Correlation coefficient
IP: Incubation period
DO: Dissolved oxygen
DT: Decision tree
ML: Machine learning
SI: Scattered index
WI: Willmott agreement index


References

  1. Sandrin, T.R.; Maier, R.M. Impact of metals on the biodegradation of organic pollutants. Environ. Health Perspect. 2003, 111, 1093–1100.
  2. Vishwakarma, S.K.; Singh, M.P.; Srivastava, A.K.; Pandey, V.K. Azo dye (direct blue 14) decolorization by immobilized extracellular enzymes of Pleurotus species. Cell. Mol. Biol. 2012, 58, 21–25.
  3. Simmons, J.E.; Yang, R.S.H.; Berman, E. Evaluation of the nephrotoxicity of complex mixtures containing organics and metals: Advantages and disadvantages of the use of real-world complex mixtures. Environ. Health Perspect. 1995, 103 (Suppl. 1), 67–71.
  4. Saratale, R.G.; Saratale, G.D.; Chang, J.S.; Govindwar, S. Bacterial decolorization and degradation of azo dyes: A review. J. Taiwan Inst. Chem. Eng. 2011, 42, 138–157.
  5. Ali, N.; Hameed, A.; Siddiqui, M.; Ghumro, P.; Ahmed, S. Application of Aspergillus niger SA1 for the enhanced bioremoval of azo dyes in simulated textile effluent. Afr. J. Biotechnol. 2009, 8, 3839–3845.
  6. Moawad, W.M.; El-Rahim, W.M.A.; Khalafallah, M. Evaluation of biotoxicity of textile dyes using two bioassays. J. Basic Microbiol. 2003, 43, 218–229.
  7. Baath, E. Thymidine incorporation into macromolecules of bacteria extracted from soil by homogenization centrifugation. Soil Biol. Biochem. 1992, 24, 1157–1165.
  8. Bader, J.L.; González, G.; Goodell, P.C.; Ali, A.S.; Pillai, S.D. Chromium-resistant bacterial populations from a site heavily contaminated with hexavalent chromium. Water Air Soil Pollut. 1999, 109, 263–276.
  9. Huq, S.I. Critical environmental issues relating to tanning industries in Bangladesh. In Proceedings of the ACIAR, Coimbatore, India, 31 January–4 February 1998; pp. 22–28.
  10. Mahmood, S.; Khalid, A.; Mahmood, T.; Arshad, M.; Ahmad, R. Potential of newly isolated bacterial strains for simultaneous removal of hexavalent chromium and reactive black-5 azo dye from tannery effluent. J. Chem. Technol. Biotechnol. 2012, 88, 1506–1513.
  11. Ho, W.S.W.; Poddar, T.K. New membrane technology for removal and recovery of chromium from waste waters. Environ. Prog. 2001, 20, 44–52.
  12. Onwosi, O.; Odibo, F.J.C. Use of response surface design in the optimization of starter cultures for enhanced rhamnolipid production by Pseudomonas nitroreducens. Afr. J. Biotechnol. 2013, 12, 2611–2617.
  13. Abbasi, H.; Hamedi, M.M.; Lotfabad, T.B.; Zahiri, H.S.; Sharafi, H.; Masoomi, F.; Moosavi-Movahedi, A.A.; Ortiz, A.; Amanlou, M.; Noghabi, K.A. Biosurfactant-producing bacterium, Pseudomonas aeruginosa MA01 isolated from spoiled apples: Physicochemical and structural characteristics of isolated biosurfactant. J. Biosci. Bioeng. 2012, 113, 211–219.
  14. Lotfy, W.A.; Ghanem, K.M.; El-Helow, E.R. Citric acid production by a novel Aspergillus niger isolate: II. Optimization of process parameters through statistical experimental designs. Bioresour. Technol. 2007, 98, 3470–3477.
  15. Tanyildizi, S.; Ozer, D.; Elibol, M. Optimization of α-amylase production by Bacillus sp. using response surface methodology. Process Biochem. 2005, 40, 2291–2296.
  16. Locner, H.; Matar, J.E. Designing for Quality; Productivity Press: New York, NY, USA, 1990.
  17. Yang, W.; Tarng, Y. Design optimization of cutting parameters for turning operations based on the Taguchi method. J. Mater. Process. Technol. 1998, 84, 122–129.
  18. Kalyani, L.T.; Sireesha, G.N.; Aditya, A.K.G.; Sankar, G.G.; Prabhakar, T. Production optimization of rhamnolipid biosurfactant by Streptomyces coelicoflavus (NBRC 15399T) using Plackett–Burman design. Eur. J. Biotechnol. Biosci. 2014, 1, 7–13.
  19. Zhao, F.; Mandlaa, M.; Hao, J.; Liang, X.; Shi, R.; Han, S.; Zhang, Y. Optimization of culture medium for anaerobic production of rhamnolipid by recombinant Pseudomonas stutzeri Rhl for microbial enhanced oil recovery. Lett. Appl. Microbiol. 2014, 59, 231–237.
  20. Mabrouk, M.E.; Youssif, E.M.; Sabry, S.A. Biosurfactant production by a newly isolated soft coral-associated marine Bacillus sp. E34: Statistical optimization and characterization. Life Sci. J. 2014, 11, 756–768.
  21. Amodu, O.S.; Ntwampe, S.K.; Ojumu, T.V. Optimization of biosurfactant production by Bacillus licheniformis STK 01 grown exclusively on Beta vulgaris waste using response surface methodology. BioResources 2014, 9, 5045–5065.
  22. Chandankere, R.; Yao, J.; Masakorala, K.; Jain, A.K.; Kumar, R. Enhanced production and characterization of biosurfactant produced by a newly isolated Bacillus amyloliquefaciens USTBb using response surface methodology. Int. J. Curr. Microbiol. Appl. Sci. 2014, 3, 66–80.
  23. Sen, R.; Swaminathan, T. Application of response-surface methodology to evaluate the optimum environmental conditions for the enhanced production of surfactin. Appl. Microbiol. Biotechnol. 1997, 47, 358–363.
  24. Mnif, I.; Sahnoun, R.; Ellouze-Chaabouni, S.; Ghribi, D. Evaluation of B. subtilis SPB1 biosurfactants’ potency for diesel-contaminated soil washing: Optimization of oil desorption using Taguchi design. Environ. Sci. Pollut. Res. 2014, 21, 851–861.
  25. Mutalik, S.R.; Vaidya, B.K.; Joshi, R.M.; Desai, K.M.; Nene, S.N. Use of response surface optimization for the production of biosurfactant from Rhodococcus spp. MTCC 2574. Bioresour. Technol. 2008, 99, 7875–7880.
  26. Chen, H.C. Optimizing the concentrations of carbon, nitrogen and phosphorous in a citric acid fermentation with response surface method. Food Biotechnol. 1996, 10, 13–27.
  27. Zounemat-Kermani, M.; Rajaee, T.; Ramezani-Charmahineh, A.; Adamowski, J.F. Estimating the aeration coefficient and air demand in bottom outlet conduits of dams using GEP and decision tree methods. Flow Meas. Instrum. 2017, 54, 9–19.
  28. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  29. Cheng, K.; Lu, Z. Adaptive sparse polynomial chaos expansions for global sensitivity analysis based on support vector regression. Comput. Struct. 2018, 194, 86–96.
  30. Pan, W.T. A new fruit fly optimization algorithm: Taking the financial distress model as an example. Knowl.-Based Syst. 2012, 26, 69–74.
  31. Abidin, Z.; Hamzah, M.; Arshad, M.; Ngah, U. A calibration framework for swarming ASVs’ system design. Indian J. Mar. Sci. 2012, 41, 581–588.
  32. Liu, Y.; Wang, X.; Li, Y. A modified fruit-fly optimization algorithm aided PID controller designing. In Proceedings of the Intelligent Control and Automation (WCICA), 10th World Congress on IEEE, Beijing, China, 6–8 July 2012; pp. 233–238.
  33. Ahmad, Z.; Arshad, M.; Asghar, H.N.; Sheikh, M.A.; Crowley, D.E. Isolation, screening and functional characterization of biosurfactant producing bacteria isolated from crude oil contaminated site. Int. J. Agric. Biol. 2016, 18, 542–548.
  34. Desai, C.; Jain, K.; Patel, B.; Madamwar, D. Efficacy of bacterial consortium-AIE2 for contemporaneous Cr (VI) and azo dye bioremediation in batch and continuous bioreactor systems, monitoring steady-state bacterial dynamics using qPCR assays. Biodegradation 2009, 20, 813.
  35. Ferreira, C. Gene Expression Programming in Problem Solving, in Soft Computing and Industry; Springer: New York, NY, USA, 2002; pp. 635–653.
  36. Baghban, A.; Mosavi, A. Insight into the antiviral activity of synthesized schizonepetin derivatives: A theoretical investigation. Sci. Rep. 2020, 25, 1–5.
  37. Suykens, J.; Vandewalle, J. Least squares support vector machine classifiers. Neural Process. Lett. 1999, 9, 293–300.
  38. Karballaeezadeh, N.; Mohammadzadeh, S.D.; Shamshirband, S.; Hajikhodaverdikhan, P.; Mosavi, A.; Chau, K.W. Prediction of remaining service life of pavement using an optimized support vector machine (case study of Semnan–Firuzkuh road). Eng. Appl. Comput. Fluid Mech. 2019, 13, 188–198.
  39. Ebtehaj, I.; Bonakdari, H.; Zaji, A.H.; Azimi, H.; Sharifi, A. Gene expression programming to predict the discharge coefficient in rectangular side weirs. Appl. Soft Comput. 2015, 35, 618–628.
  40. Lopes, H.S.; Weinert, W.R. EGIPSYS: An enhanced gene expression programming approach for symbolic regression problems. Int. J. Appl. Math. Comput. Sci. 2004, 14, 375–384.
  41. Ferreira, C. Gene Expression Programming: Mathematical Modeling by an Artificial Intelligence; Springer: New York, NY, USA, 2006; Volume 21.
  42. Yadav, B.; Ch, S.; Mathur, S.; Adamowski, J. Estimation of in-situ bioremediation system cost using a hybrid Extreme Learning Machine (ELM)-particle swarm optimization approach. J. Hydrol. 2016, 543, 373–385.
  43. Cao, Q.; Leung, K.M. Prediction of chemical biodegradability using support vector classifier optimized with differential evolution. J. Chem. Inf. Model. 2014, 54, 2515–2523.
  44. Tang, L.; Zhang, M.; Wen, L. Support vector machine classification of seismic events in the Tianshan orogenic belt. J. Geophys. Res. Solid Earth 2020, 125, e2019JB018132.
  45. Xiao, C.; Hao, K.; Ding, Y. An improved fruit fly optimization algorithm inspired from cell communication mechanism. Math. Probl. Eng. 2015, 2015, 492195.
  46. Nabipour, N.; Daneshfar, R.; Rezvanjou, O.; Mohammadi-Khanaposhtani, M.; Baghban, A.; Xiong, Q.; Li, L.K.; Habibzadeh, S.; Doranehgard, M.H. Estimating biofuel density via a soft computing approach based on intermolecular interactions. Renew. Energy 2020, 152, 1086–1098.
  47. Qasem, S.N.; Samadianfard, S.; Kheshtgar, S.; Jarhan, S.; Kisi, O.; Shamshirband, S.; Chau, K.W. Modeling monthly pan evaporation using wavelet support vector regression and wavelet artificial neural networks in arid and humid climates. Eng. Appl. Comput. Fluid Mech. 2019, 13, 177–187.
  48. Samadianfard, S.; Majnooni-Heris, A.; Qasem, S.N.; Kisi, O.; Shamshirband, S.; Chau, K.W. Daily global solar radiation modeling using data-driven techniques and empirical equations in a semi-arid climate. Eng. Appl. Comput. Fluid Mech. 2019, 13, 142–157.
  49. Liu, W.; Liu, C.; Liu, L.; You, Y.; Jiang, J.; Zhou, Z.; Dong, Z. Simultaneous decolorization of sulfonated azo dyes and reduction of hexavalent chromium under high salt condition by a newly isolated salt-tolerant strain Bacillus circulans BWL1061. Ecotoxicol. Environ. Saf. 2017, 141, 9–16.
  50. Thacker, U.; Madamwar, D. Reduction of toxic chromium and partial localization of chromium reductase activity in bacterial isolate DM1. World J. Microbiol. Biotechnol. 2005, 21, 891–899.
  51. Halmi, M.I.E.; Abdullah, S.R.S.; Shukor, M.Y. Characterization of Chromate Reducing Pseudomonas Aeruginosa Strain Mie3 Isolated from Juru River Sludge and its Potential on Azo Dye Decolorization. J. Chem. Pharmac. Sci. 2017, 10, 522–526.
  52. Chang, B.V.; Chao, W.L.; Yeh, S.L.; Kuo, D.L.; Yang, C.W. Biodegradation of Sulfamethoxazole in Milkfish (Chanos chanos) Pond Sediments. Appl. Sci. 2019, 9, 4000.
  53. Shamshirband, S.; Hadipoor, M.; Baghban, A.; Mosavi, A.; Bukor, J.; Várkonyi-Kóczy, A.R. Developing an ANFIS-PSO model to predict mercury emissions in combustion flue gases. Mathematics 2019, 7, 965.
  54. Nabipour, N.; Mosavi, A.; Baghban, A.; Shamshirband, S.; Felde, I. Extreme learning machine-based model for Solubility estimation of hydrocarbon gases in electrolyte solutions. Processes 2020, 8, 92.
  55. Amato, F.; Moscato, V.; Picariello, A.; Sperlì, G. Extreme events management using multimedia social networks. Future Gener. Comput. Syst. 2019, 94, 444–452.
Figure 1. Schematic diagram of simultaneous biodegradation of azo dyes and Cr(VI) and their estimation through machine learning programming.
Figure 2. The schematic diagram of SVR-FOA adapted by Nabipour et al. [46].
Figure 3. Biodegradation of azo dyes at various concentrations (15, 100, and 150 mg L−1) by biosurfactant-producing Klebsiella sp. KOD36.
Figure 4. Bioreduction of Cr(VI) at various concentrations (2, 5, and 10 mg L−1) by biosurfactant-producing Klebsiella sp. KOD36.
Figure 5. Time course study of simultaneous reduction (%) of azo dyes and chromium at concentration (a) (100 mg L−1, 5 mg L−1), (b) (100 mg L−1, 10 mg L−1), (c) (150 mg L−1, 5 mg L−1), (d) (150 mg L−1, 10 mg L−1) by Klebsiella sp. KOD36.
Figure 6. Three-dimensional bar graphs of the statistical parameters.
Figure 7. Estimated and observed values comparison for simultaneous degradation of azo dyes and chromium (%) of various models studied.
Figure 8. The estimated and observed values for simultaneous degradation of azo dyes and chromium (%) by various models used in the present study.
Figure 9. Taylor diagrams of estimated values of simultaneous degradation of azo dyes and chromium (%).
Table 1. Parameters and their levels.
Independent Variables | Unit | −1 | 0 | 1
pH | Unitless | 3 | 6 | 9
Carbon source | 20 mg/L | Glucose | Sucrose | Starch
Nitrogen source | 2 mg/L | Yeast | Ammonium sulfate | Urea
Table 2. Measured average value of simultaneous reduction of azo dye and chromium by biosurfactant-producing bacteria with their dependent variables.
Temperature (°C) | pH | IP (Hours) | Source of Carbon (20 mg/L) | Source of Nitrogen (2 mg/L) | Shaking (rpm) | Simultaneous Reduction of Azo Dye and Chromium (%)
Table 3. Statistical analysis parameter values of the used data.
Variable | Mean | Minimum | Maximum | Standard Deviation | Coefficient of Variation | Skewness | Correlation with Simultaneous Degradation of Azo Dyes and Chromium (%)
Shaking (rpm)200.0150.0250.
IP (h)−0.210
Temperature (C)
Simultaneous degradation of azo dyes and chromium (%)56.810.−0.081
Table 4. Parameters of the SVR and SVR-FOA models.
Table 5. Parameters of the GEP model.
Head size | 8
Linking Function | Addition (+)
Number of Genes | 3
Mutation Rate | 0.044
Inversion Rate | 0.1
One-Point RR | 0.3
Two-Point RR | 0.3
Gene RR | 0.1
Gene Transposition Rate | 0.1
Used functions | +, −, ×, ÷, power
Table 6. Various model evaluation performance parameters’ comparison.

Share and Cite

MDPI and ACS Style

Ahmad, Z.; Zhong, H.; Mosavi, A.; Sadiq, M.; Saleem, H.; Khalid, A.; Mahmood, S.; Nabipour, N. Machine Learning Modeling of Aerobic Biodegradation for Azo Dyes and Hexavalent Chromium. Mathematics 2020, 8, 913.

Chicago/Turabian Style

Ahmad, Zulfiqar, Hua Zhong, Amir Mosavi, Mehreen Sadiq, Hira Saleem, Azeem Khalid, Shahid Mahmood, and Narjes Nabipour. 2020. "Machine Learning Modeling of Aerobic Biodegradation for Azo Dyes and Hexavalent Chromium" Mathematics 8, no. 6: 913.
