# Evaluation of River Water Quality Index Using Remote Sensing and Artificial Intelligence Models


## Abstract


## 1. Introduction

#### 1.1. Literature Review

R^{2} (coefficient of determination) values were 0.991, 0.933, 0.937, 0.93, and 0.934 for turbidity, TSS, COD, BOD, and DO, respectively. Arias-Rodriguez et al. [26] used extreme learning machine (ELM), support vector regression (SVR), and linear regression (LR) models to estimate Chl-a, turbidity, total suspended matter (TSM), and Secchi disk depth (SDD) from the national water quality monitoring system. They applied data from four lakes in Mexico together with Landsat-8 OLI, Sentinel-3 OLCI, and Sentinel-2 MSI imagery, and found that the ELM model estimated water quality somewhat better than the other machine learning models.

^{2+}, Na^{+}, Mg^{2+}, pH, COD, BOD, DO, electrical conductivity (EC), total hardness (TH), phosphate (PO_{4}^{3−}), nitrate (NO_{3}^{−}), fecal coliform (FC), turbidity (Tur), and ammonium (NH_{4}^{+}) for the Karun River, Iran. They obtained monthly temperature values from Landsat-7 satellite images. The MT model performed best in estimating the water quality index (WQI) compared with multivariate adaptive regression spline (MARS), gene expression programming (GEP), and evolutionary polynomial regression (EPR).

(R^{2}) of 0.916 and 0.890 for the estimation of optically active and non-optically active parameters, respectively. Hong et al. [19] employed four deep neural network (DNN) architectures (ResNet-18, ResNet-101, GoogLeNet, and Inception v3) to relate drone-captured hyperspectral imagery to in situ measurements and meteorological data in order to monitor Chl-a, phycocyanin (PC), and turbidity in the Daechung Dam reservoir, South Korea. Their results showed that ResNet-18 was the most accurate of the DNN models tested.

#### 1.2. Objectives and Research Organization

## 2. Overview of Case Study and Water Quality Data Description

_{4}^{2−}), sodium (Na^{+}), potassium (K^{+}), hardness, fluoride (F^{−}), dissolved oxygen (DO), chloride (Cl^{−}), arsenic (AS), alkalinity, pH, nitrate (NO_{3}^{−}), and magnesium (Mg^{2+}), whose statistical properties are listed in Table 1. These parameters were measured at the Hudson River monitoring site near Poughkeepsie, NY (latitude 41.72176015, longitude 73.94069299) from 14 March 2021 to 16 June 2021. The water quality data were taken from https://waterdata.usgs.gov (accessed on 11 August 2022). Some WQPs associated with sewage discharge (e.g., COD and phosphorus) were not available on the dates of the satellite images, whereas nitrate was included.

^{+}), Figure 2f (NO_{3}^{−}), Figure 2g (Mg^{2+}), Figure 2h (hardness), Figure 2j (Cl^{−}), and Figure 2k (AS). Moreover, the frequency distributions of SO_{4}^{2−} and Na^{+} (Figure 2b,c) were right-skewed, whereas those of pH (Figure 2e), Alk (Figure 2l), and DO (Figure 2m) followed a symmetrical pattern. Figure 2i indicates that the frequency distribution of F^{−} has no particular pattern.

## 3. Data Preparation and Methods

#### 3.1. Preparation of Satellite Images

#### 3.1.1. Conversion of Digital Number to Spectral Radiance

^{2} × ster × μm), DN is the pixel value, and Gain and Offset are the sensor calibration coefficients.
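As a rough illustration of the conversion described above, the sketch below assumes the common linear form L = Gain × DN + Offset, which matches the symbols defined in the text. The function name and the sample calibration coefficients are hypothetical and are not values from this study.

```python
def dn_to_radiance(dn, gain, offset):
    """Convert a digital number (DN) to spectral radiance.

    Assumes the linear calibration L = gain * DN + offset.
    """
    return gain * dn + offset


# Hypothetical pixel values and calibration coefficients, for illustration only.
pixels = [8000, 9000, 10000]
radiance = [dn_to_radiance(dn, gain=0.012, offset=-60.0) for dn in pixels]
print(radiance)
```

In practice the gain and offset are read per band from the image metadata rather than typed in by hand.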

#### 3.1.2. Conversion of Spectral Radiation to Spectral Reflectance

#### 3.1.3. Separation of Water from Other Parts of Satellite Images

#### 3.2. Correlation between Spectral Bands and WQPs

_{1}, b_{2}, b_{3}, …, b_{11}) was examined in SPSS software as the first step in this section of the study. The correlation coefficient (R) between the water quality parameters of the Hudson River and spectral bands is shown in Table S1 (see Supplementary Materials). The results of the Pearson correlation demonstrated that the highest correlation coefficients were DO with b_{10} (R = −0.914), pH with b_{11} (R = 0.916), Mg^{2+} with b_{11} (R = 0.864), Na^{+} with b_{9} (R = −0.866), SO_{4}^{2−} with b_{2} (R = −0.933), hardness with b_{1} (R = −0.871), Alk with b_{3} (R = 0.776), AS with b_{10} (R = 0.914), F^{−} with b_{11} (R = 0.728), K^{+} with b_{1} (R = 0.827), Cl^{−} with b_{10} (R = −0.883), Tur with b_{11} (R = −0.841), and NO_{3}^{−} with b_{6} (R = −0.854).
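The band screening described above can be sketched in a few lines. This is a minimal illustration, assuming the Pearson coefficient is computed between one WQP series and each band series and that the band with the largest absolute R is retained; the function names and the sample numbers are hypothetical, and the study itself used SPSS.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def best_band(wqp, bands):
    """Return (band name, R) for the band with the largest |R| against wqp."""
    scores = {name: pearson_r(values, wqp) for name, values in bands.items()}
    name = max(scores, key=lambda key: abs(scores[key]))
    return name, scores[name]


# Hypothetical DO samples and band values, for illustration only.
do_samples = [14.1, 12.0, 10.5, 8.2, 7.5]
band_values = {
    "b10": [276.2, 280.1, 283.0, 290.4, 293.7],
    "b6": [0.016, 0.030, 0.021, 0.038, 0.027],
}
print(best_band(do_samples, band_values))
```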

_{6}/b_{5}, pH with b_{5}/b_{7}, and Mg^{2+} with b_{9}/b_{10} and b_{9}/b_{11} have the highest correlation coefficients.
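Band ratios such as those above, and normalized differences such as the (b_{7} − b_{5})/(b_{7} + b_{5}) term that appears in the MLR equation for NO_{3}^{−}, can be computed per pixel as in the following minimal sketch. The function names are hypothetical, and the input values are the average band values reported in the band statistics table, used here only as sample numbers.

```python
def ratio_index(b_i, b_j):
    """Ratio index (RI) of two band values: b_i / b_j."""
    return b_i / b_j


def normalized_difference_index(b_i, b_j):
    """Normalized difference index (NDI): (b_i - b_j) / (b_i + b_j)."""
    return (b_i - b_j) / (b_i + b_j)


# Average band values from the band statistics table, used as sample inputs.
b5, b6, b7 = 0.034, 0.027, 0.026
print(ratio_index(b6, b5))
print(normalized_difference_index(b7, b5))
```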

#### 3.3. Correlation between WQPs and Spectral Indices

_{i} denotes the single bands and spectral indices with a high correlation coefficient, and k is the number of bands; A_{0} and A_{1} are empirical regression coefficients obtained from in situ data observations. By applying the resulting relationships to Landsat-8 images, the value of each pixel was converted to a simulated WQP value. This study used MLR analysis to provide an empirical equation between WQPs and spectral indices. It was important to consider both single bands (i.e., b_{1}, b_{2}, b_{3}, …, b_{11}) and ratios of bands. Table 4 lists the MLR equations that relate WQPs to spectral indices. As seen in Table 4, the most strongly correlated MLR equation (R = 0.954) was obtained for Cl^{−}, which was approximated using b_{10}, b_{11}, and b_{2}/b_{6}, whereas the approximation of Na^{+} had a lower correlation (R = 0.756) with spectral indices (b_{9}, b_{11}, and b_{3}/b_{11}) than the other WQPs. The MLR equations estimating Cl^{−} (0.954), pH (0.939), F^{−} (0.937), AS (0.936), and Alk (0.920) ranked highest in terms of accuracy.
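To make the MLR step concrete, the sketch below fits a model of the form WQP = A_0 + Σ A_i·X_i by ordinary least squares and applies it to a new pixel. It is a minimal, dependency-free illustration: the function names and the data are hypothetical, and solving the normal equations by Gauss-Jordan elimination is a choice made here for self-containment, not the procedure used in the study.

```python
def fit_mlr(X, y):
    """Least-squares fit of y = A0 + sum(A_i * X_i); returns [A0, A1, ..., Ak]."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Normal equations (R^T R) a = R^T y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict_mlr(coeffs, x):
    """Evaluate the fitted equation for one sample x = [X_1, ..., X_k]."""
    return coeffs[0] + sum(a * v for a, v in zip(coeffs[1:], x))


# Hypothetical predictors (e.g., a band and a band ratio) and a synthetic target.
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y_demo = [1 + 2 * a + 3 * b for a, b in X_demo]
coeffs_demo = fit_mlr(X_demo, y_demo)
print(coeffs_demo, predict_mlr(coeffs_demo, [2, 2]))
```

Applying `predict_mlr` to every water pixel of an image is what turns a fitted equation into a WQP map.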

#### 3.4. WQI Calculation

#### 3.5. Definition of Statistical Indices

_{Pre} denotes the WQI values predicted by the AI models, WQI_{Obs} denotes the WQI values computed with the CCME guideline, $\overline{WQI}$ is the average value of WQI, and U is the number of WQI samples.
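The indices named in this study (IOA, RMSE, MAE, and SI) can be sketched as follows. The formulas below are the commonly used definitions, with Willmott's index of agreement for IOA and RMSE divided by the observed mean for SI; they are assumptions made for illustration and may differ in detail from the equations of this section. The function name is hypothetical.

```python
import math


def evaluation_metrics(pre, obs):
    """Return IOA, RMSE, MAE, and SI for predicted (pre) and observed (obs) WQI."""
    u = len(obs)
    mean_obs = sum(obs) / u
    sq_err = sum((p - o) ** 2 for p, o in zip(pre, obs))
    rmse = math.sqrt(sq_err / u)
    mae = sum(abs(p - o) for p, o in zip(pre, obs)) / u
    si = rmse / mean_obs  # scatter index, assumed as RMSE over the observed mean
    potential = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for p, o in zip(pre, obs)
    )
    ioa = 1.0 - sq_err / potential  # Willmott's index of agreement
    return {"IOA": ioa, "RMSE": rmse, "MAE": mae, "SI": si}


# Hypothetical WQI values, for illustration only.
print(evaluation_metrics(pre=[88.0, 90.5, 95.0], obs=[87.5, 91.0, 96.25]))
```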

## 4. Implementation of Soft Computing Models

#### 4.1. Model Tree

_{1}, a_{2}, a_{3}, …, a_{11} were a set of weighting coefficients related to Equation (12). The MT results indicated that six input variables were used to feed the M5MT. In this way, two multivariate linear equations were obtained as follows:

_{6} ≤ 0.021,

_{6} was the splitting variable, and the corresponding splitting value was 0.021. Moreover, Equations (13) and (14) were obtained after smoothing and pruning the trees.
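The structure of such a two-leaf model tree can be sketched as below: one split on a single variable at 0.021, with a separate linear equation in each leaf. The splitting value is taken from the text; treating the splitting variable as band b_{6}, and every intercept and weight in the leaf models, are hypothetical placeholders and are not the study's Equations (13) and (14).

```python
SPLIT_BAND = "b6"    # assumed name of the splitting variable
SPLIT_VALUE = 0.021  # splitting value reported in the text

# Hypothetical leaf models (intercept plus band weights), for illustration only.
LEAF_LOW = {"intercept": 90.0, "weights": {"b6": -100.0}}
LEAF_HIGH = {"intercept": 85.0, "weights": {"b6": 50.0}}


def linear_leaf(leaf, sample):
    """Evaluate one leaf's multivariate linear equation for a sample."""
    return leaf["intercept"] + sum(
        weight * sample[name] for name, weight in leaf["weights"].items()
    )


def predict_wqi_mt(sample):
    """Route the sample to a leaf by the split rule, then apply that leaf's equation."""
    leaf = LEAF_LOW if sample[SPLIT_BAND] <= SPLIT_VALUE else LEAF_HIGH
    return linear_leaf(leaf, sample)


print(predict_wqi_mt({"b6": 0.020}), predict_wqi_mt({"b6": 0.030}))
```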

#### 4.2. Multivariate Adaptive Regression Spline

_{i}, C_{0}, and N were the weighting coefficients (WCs) computed with the least squares (LS) technique, the constant coefficient (or bias), and the number of basis functions, respectively.

_{1}, b_{2}, b_{4}, b_{7}, and b_{11}) were used to approximate the WQI values, whereas the other spectral indices played no role. Additionally, the total number of effective parameters and the GCV value were 28.5 and 3.568, respectively. The WC values were adjusted with particle swarm optimization (PSO) within 70 iterations, giving a mean square error (MSE) of 7.484.
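A MARS prediction is a constant plus a weighted sum of hinge-type basis functions. The sketch below evaluates three of the basis functions listed in the basis-function table (BF_{3}, BF_{4}, and BF_{5}, with their knots copied from that table); the constant and the weights passed in the example are hypothetical, since the fitted coefficients are not reproduced here.

```python
def hinge(value):
    """Hinge function max(0, value) used by MARS basis functions."""
    return max(0.0, value)


def basis_functions(b):
    """BF3, BF4, and BF5 from the basis-function table for a dict of band values."""
    bf3 = hinge(0.1261 - b["b7"])
    bf4 = bf3 * hinge(b["b2"] - 0.07603)
    bf5 = bf3 * hinge(0.07603 - b["b2"])
    return [bf3, bf4, bf5]


def mars_predict(b, c0, weights):
    """Evaluate c0 + sum(weight_i * BF_i) for one sample."""
    return c0 + sum(w * bf for w, bf in zip(weights, basis_functions(b)))


# Hypothetical constant and weights, for illustration only.
sample = {"b2": 0.055, "b7": 0.026}
print(mars_predict(sample, c0=85.0, weights=[20.0, -300.0, 150.0]))
```

Because BF_{4} and BF_{5} share the knot 0.07603 on b_{2}, at most one of them is non-zero for any given pixel.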

#### 4.3. Gene Expression Programming

#### 4.4. Evolutionary Polynomial Regression

^{x} + e^{−x})] and tangent hyperbolic [tanh x = (e^{x} − e^{−x})/(e^{x} + e^{−x})] could provide more complicated EPR expressions than those given by the natural logarithm and exponential functions. In many cases, engineers prefer the least complicated expression, even if its accuracy is marginally lower than that of other EPR expressions. In this study, the EPR expressions built with the exponential function gave the least accurate predictions in the training stage (MSE = 1.603) compared with the EPR expressions developed with no function (MSE = 1.519), secant hyperbolic (MSE = 1.423), and tangent hyperbolic (MSE = 1.477). EPR models built with the natural logarithm provided far less complex equations than those given by the secant and tangent hyperbolic functions, at the cost of slightly less accurate predictions (MSE = 1.507).

Additionally, 11 logarithm expressions were produced during the training stage of the EPR model. As seen in Table 9, each equation included six algebraic terms, and the natural logarithm was employed as an inner function to approximate the WQI values. Because pollution in natural streams is generally a complicated process, a more complex expression can improve prediction accuracy compared with a simple regression equation. Another important setting concerns the multi-objective genetic algorithm (MOGA), which is applied within the EPR model to optimize the number of algebraic terms, the number of variables used in the EPR model, and the exponent assigned to each variable. Moreover, Table 10 presents the setting parameters of all EPR expressions. According to Table 9, Model.8 yielded the most accurate prediction of WQI (MSE = 1.507) among the expressions. Hence, Model.8 was selected for further analysis in the training and testing phases and for comparison with related works.

## 5. Results and Discussion

#### 5.1. Statistical Performance of Soft Computing Techniques

_{0}). Accordingly, the null hypothesis of the F-test is accepted when F_{0} < F_{α,γ,λ}, in which α is the significance level (0.05), γ is the number of spectral indices (γ = 11), and λ denotes U − γ − 1 (95 − 11 − 1 = 83). In addition, F_{0} is calculated as MS_{R}/MS_{E}, where MS_{R} [SSR/(γ−1)] and MS_{E} [SSE/(U−γ−1)] denote the mean square regression and the mean square error, respectively. SSR and SSE denote the sum of squares regression and the sum of squares error, respectively, and are computed as follows:

_{0.05,12,83} is roughly equal to 2.112. Table 12 presents the results of the F-test for all AI models. As inferred from Table 13, MARS (F_{0} = 0.327), MT (F_{0} = 1.513), and GEP (F_{0} = 0.8771) accept the hypothesis of the F-test, whereas the EPR model does not satisfy the hypothesis (F_{0} = 5.639).
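The F-test bookkeeping can be sketched as follows. The mean squares use the divisors quoted in the text, SSR/(γ−1) and SSE/(U−γ−1); the expressions for SSR and SSE are the usual textbook sums of squares and are an assumption here. The decision helper simply compares F_{0} with the critical value of 2.112 quoted above, in line with the reported outcomes. Function names are hypothetical.

```python
def f_statistic(pre, obs, n_inputs):
    """F0 = MS_R / MS_E with MS_R = SSR/(gamma-1) and MS_E = SSE/(U-gamma-1)."""
    u = len(obs)
    mean_obs = sum(obs) / u
    ssr = sum((p - mean_obs) ** 2 for p in pre)          # assumed form of SSR
    sse = sum((o - p) ** 2 for p, o in zip(pre, obs))    # assumed form of SSE
    ms_r = ssr / (n_inputs - 1)
    ms_e = sse / (u - n_inputs - 1)
    return ms_r / ms_e


def accepts_null(f0, f_critical=2.112):
    """True when F0 is below the critical value quoted in the text."""
    return f0 < f_critical


# F0 values reported in the text for the four models.
reported = {"MARS": 0.327, "MT": 1.513, "GEP": 0.8771, "EPR": 5.639}
print({model: accepts_null(f0) for model, f0 in reported.items()})
```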

#### 5.2. Complexity of AI Model-Derived Expressions

y_{1} = Sum[a_{i}·x_{1}·x_{2}·f(x_{1}·x_{2})], y_{2} = Sum[a_{i}·f(x_{1}·x_{2})], and y_{3} = Sum[a_{i}·x_{1}·x_{2}·f(x_{1})·f(x_{2})]. This study employed y_{1} to obtain more accurate predictions of WQI, although using y_{1} increased the complexity of the EPR expressions compared with y_{2} and y_{3}.
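The three EPR structures can be compared side by side with a small sketch. It assumes two input variables, the natural logarithm as the inner function f, and one pair of exponents per term so that the terms of a sum differ from each other; the exponents, coefficients, and function names are hypothetical and only illustrate why y_{1} yields longer expressions than y_{2} and y_{3}.

```python
import math


def epr_y1(terms, x1, x2, f=math.log):
    """y1 = sum(a_i * X_i * f(X_i)), with X_i = x1**e1 * x2**e2."""
    return sum(a * (x1**e1 * x2**e2) * f(x1**e1 * x2**e2) for a, e1, e2 in terms)


def epr_y2(terms, x1, x2, f=math.log):
    """y2 = sum(a_i * f(X_i)), with X_i = x1**e1 * x2**e2."""
    return sum(a * f(x1**e1 * x2**e2) for a, e1, e2 in terms)


def epr_y3(terms, x1, x2, f=math.log):
    """y3 = sum(a_i * X_i * f(x1**e1) * f(x2**e2))."""
    return sum(a * (x1**e1 * x2**e2) * f(x1**e1) * f(x2**e2) for a, e1, e2 in terms)


# Hypothetical terms (coefficient, exponent of x1, exponent of x2).
demo_terms = [(2.0, 1, 1), (-0.5, 2, 0)]
print(epr_y1(demo_terms, 1.5, 2.0), epr_y2(demo_terms, 1.5, 2.0), epr_y3(demo_terms, 1.5, 2.0))
```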

#### 5.3. Variation of WQI Values by AI Models

_{a} is the standard normal variable at the 5% significance level. To compare the uncertainty values given by the AI models in this study, the $C{L}_{e}^{\pm}$ values at the 5% significance level for all datasets (i.e., training and testing datasets) are provided in Table 13. From Table 13, it is clear that the AI models overestimate the WQI values (${\mu}_{e}>0$): GEP (0.0710), EPR (0.2252), MARS (0.1732), and M5MT (0.2997). Additionally, the lowest estimation uncertainty is given by the EPR model with an uncertainty band of 0.0710, whereas the M5MT generates the highest level of uncertainty (0.2997). Overall, the findings in Table 13 demonstrate that the EPR expression performs best among the AI models applied in the current study.
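An uncertainty band of this kind can be sketched as below. The sketch assumes that the prediction error is e = WQI_{Pre} − WQI_{Obs}, that μ_{e} and S_{e} are its mean and sample standard deviation, and that the band is μ_{e} ± Z·S_{e} with a two-sided standard normal quantile; these are assumptions for illustration, and the study's exact definitions may differ. The function name and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def uncertainty_band(pre, obs, significance=0.05):
    """Return (mu_e, (lower, upper)) for the errors e = pre - obs."""
    errors = [p - o for p, o in zip(pre, obs)]
    mu_e = mean(errors)
    s_e = stdev(errors)
    z = NormalDist().inv_cdf(1 - significance / 2)  # about 1.96 at the 5% level
    return mu_e, (mu_e - z * s_e, mu_e + z * s_e)


# Hypothetical WQI values, for illustration only.
mu, band = uncertainty_band(pre=[88.4, 90.1, 95.9], obs=[88.0, 90.0, 95.5])
print(mu, band)
```

A positive μ_{e} under this sign convention corresponds to overestimation, which is how the mean errors above are read.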

#### 5.4. Comparisons of the Present Study with the Literature

_{2}, b_{5}, b_{6}, and b_{7} as input parameters. The linear regression equation by Arias-Rodriguez et al. [26] had lower performance than the multivariate linear equation by MT (R = 0.969 and RMSE = 10.85). The present study applied AI models (i.e., GEP, EPR, MARS, and MT) to provide an accurate and less complicated model compared with the ELM and SVM models by Arias-Rodriguez et al. [26], who proved that a remote sensing-based GP model had acceptable capability to predict TP with R = 0.761. They used MODIS satellite images, which could not retrieve WQPs as accurately or at as fine a spatial resolution as Landsat-8.

## 6. Conclusions

- The correlation coefficients of WQP with single bands revealed that a considerable number of parameters were highly correlated with Landsat-8 bands 10 and 11;
- The correlation between spectral data and WQP improved when spectral indices (RI and NDI) were utilized. In addition, the use of spectral indices in some cases increased the value of R^{2} in the MLR models;
- The WQI values computed from the observed water quality data varied from 84.25 to 96.25 in the Hudson River. The observed WQI values given by the CCME guidelines indicated a good state of water quality;
- The WQI values were predicted with AI models, for which four robust expressions were provided based on eight bands of Landsat-8 images. All the AI models were developed along with the optimum selection of the setting parameters;
- Statistical measures (i.e., IOA, RMSE, MAE, and SI) quantified the satisfactory performance of the non-linear multivariate expressions given by AI models (i.e., EPR, GEP, and MARS) and the linear regression model (MT) in the prediction of WQI values for both training and testing stages. In addition, the results of the F-test and AUC confirmed the quantitative performance, and the qualitative efficiency of the AI models was examined with violin graphs. Moreover, the uncertainty results indicated that EPR and MT had the lowest and highest degrees of uncertainty, respectively;
- AI models could efficiently detect both spatial and temporal variations of the WQI values for the studied reach of the Hudson River. Additionally, the comparisons of the present results with the literature were done in terms of the accuracy levels of AI models, the structural complexity of AI models, and the typical use of satellite images. According to R and RMSE criteria, the results of the present AI models (i.e., EPR, MT, GEP, and MARS) as white-box models were comparable with studies performed with SVM, RF, ANN, RT, and GBM models (introduced as black-box models).

## Supplementary Materials

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

- Liyanage, C.; Yamada, K. Impact of Population Growth on the Water Quality of Natural Water Bodies. Sustainability **2017**, 9, 1405.
- Karn, S.K.; Harada, H. Surface Water Pollution in Three Urban Territories of Nepal, India, and Bangladesh. J. Environ. Manag. **2001**, 28, 483–496.
- Najafzadeh, M.; Homaei, F.; Farhadi, H. Reliability assessment of water quality index based on guidelines of national sanitation foundation in natural streams: Integration of remote sensing and data-driven models. Artif. Intell. Rev. **2021**, 54, 4619–4651.
- Wang, X.; Zhang, F.; Ding, J. Evaluation of water quality based on a machine learning algorithm and water quality index for the Ebinur Lake Watershed, China. Sci. Rep. **2017**, 7, 12858.
- Horton, R.K. An index number system for rating water quality. J. Water Pollut. Control Fed. **1965**, 37, 300–306.
- Brown, R.M.; McClelland, N.I.; Deininger, R.A.; Tozer, R.G. A water quality index-do we dare? Water Sew. Work. **1970**, 117, 339–343.
- Hassan, G.; Goher, M.E.; Shaheen, M.E.; Taie, S.A. Hybrid Predictive Model for Water Quality Monitoring Based on Sentinel-2A L1C Data. IEEE Access **2021**, 9, 65730–65749.
- Peterson, K.; Sidike, P.; Sloan, J.M. Deep learning-based water quality estimation and anomaly detection using Landsat-8/Sentinel-2 virtual constellation and cloud computing. GISci. Remote Sens. **2022**, 57, 510–525.
- Ritchie, J.C.; Zimba, P.V.; Everitt, J.H. Remote Sensing Techniques to Assess Water Quality. Photogramm. Eng. Remote Sens. **2003**, 69, 695–704.
- Caballero, I.; Román, A.; Tovar-Sánchez, A.; Navarro, G. Water quality monitoring with Sentinel-2 and Landsat-8 satellites during the 2021 volcanic eruption in La Palma (Canary Islands). Sci. Total Environ. **2022**, 822, 153433.
- Pahlevan, N.; Smith, B.; Alikas, K.; Anstee, J.; Barbosa, C.; Binding, C.; Bresciani, M.; Cremella, B.; Giardino, C.; Gurlin, D.; et al. Simultaneous retrieval of selected optical water quality indicators from Landsat-8, Sentinel-2, and Sentinel-3. Remote Sens. Environ. **2022**, 270, 112860.
- Barrett, D.C.; Frazier, A.E. Automated Method for Monitoring Water Quality Using Landsat Imagery. Water **2016**, 8, 257.
- Niroumand-Jadidi, M.; Bovolo, F.; Bresciani, M.; Gege, P.; Giardino, C. Quality Retrieval from Landsat-9 (OLI-2) Imagery and Comparison to Sentinel-2. Remote Sens. **2022**, 14, 4596.
- Sagan, V.; Peterson, K.T.; Maimaitijiang, M.; Sidike, P.; Sloan, J.; Greeling, B.A.; Maalouf, S.; Adams, C. Monitoring inland water quality using remote sensing: Potential and limitations of spectral indices, bio-optical simulations, machine learning, and cloud computing. Earth Sci. Rev. **2020**, 205, 103187.
- Hou, X.; Feng, L.; Duan, H.; Chen, X.; Sun, D.; Shi, K. Fifteen-year monitoring of the turbidity dynamics in large lakes and reservoirs in the middle and lower basin of the Yangtze River, China. Remote Sens. Environ. **2017**, 190, 107–121.
- Su, T.C. A study of a matching pixel by pixel (MPP) algorithm to establish an empirical model of water quality mapping, as based on unmanned aerial vehicle (UAV) images. Int. J. Appl. Earth Obs. Geoinf. **2017**, 58, 213–224.
- Yang, Z.; Reiter, M.; Munyei, N. Estimation of chlorophyll—A concentrations in diverse water bodies using ratio-based NIR/Red indices. Remote Sens. Appl. Soc. Environ. **2017**, 6, 52–58.
- Shuchman, R.A.; Leshkevich, G.; Sayers, M.J.; Johengen, T.H.; Brooks, C.N.; Pozdnyakov, D. algorithm to retrieve chlorophyll, dissolved organic carbon, and suspended minerals from Great Lakes satellite data. J. Great Lakes Res. **2013**, 39, 14–33.
- Li, N.; Ning, Z.; Chen, M.; Wu, D.; Hao, C.; Zhang, D.; Bai, R.; Liu, H.; Chen, X.; Li, W.; et al. Satellite and Machine Learning Monitoring of Optically Inactive Water Quality Variability in a Tropical River. Remote Sens. **2022**, 14, 5466.
- Ahmed, M.; Mumtaz, R.; Anwar, Z.; Shaukat, A.; Arif, O.; Shafait, F. A Multi–Step Approach for Optically Active and Inactive Water Quality Parameter Estimation Using Deep Learning and Remote Sensing. Water **2022**, 14, 2112.
- Zhang, F.; Chan, N.W.; Liu, C.; Wang, X.; Shi, J.; Kung, H.T.; Li, X.; Guo, T.; Wang, W.; Cao, N. Water Quality Index (WQI) as a Potential Proxy for Remote Sensing Evaluation of Water Quality in Arid Areas. Water **2021**, 13, 3250.
- Chebud, Y.; Naja, G.M.; Rivero, R.G.; Melesse, A.M. Water Quality Monitoring Using Remote Sensing and an Artificial Neural Network. Water Air Soil Poll. **2012**, 223, 4875–4887.
- Chang, N.B.; Xuan, Z.; Yang, Y.J. Exploring spatiotemporal patterns of phosphorus concentrations in a coastal bay with MODIS images and machine learning models. Remote Sens. Environ. **2013**, 134, 100–110.
- Kim, Y.H.; Im, J.; Ha, H.K.; Choi, J.K.; Ha, S. Machine learning approaches to coastal water quality monitoring using GOCI satellite data. GISci. Remote Sens. **2014**, 51, 158–174.
- Sharaf El Din, E.; Zhang, Y.; Suliman, A. Mapping concentrations of surface water quality parameters using a novel remote sensing and artificial intelligence framework. Int. J. Remote Sens. **2017**, 38, 1023–1042.
- Arias-Rodriguez, L.F.; Duan, Z.; Díaz-Torres, J.D.J.; Basilio Hazas, M.; Huang, J.; Kumar, B.U.; Tuo, Y.; Disse, M. Integration of Remote Sensing and Mexican Water Quality Monitoring System Using an Extreme Learning Machine. Sensors **2021**, 21, 4118.
- Chen, P.; Wang, B.; Wu, Y.; Wang, Q.; Huang, Z.; Wang, C. Urban River water quality monitoring based on self-optimizing machine learning method using multi-source remote sensing data. Ecol. Indic. **2023**, 146, 109750.
- Alparslan, E.; Aydöner, C.; Tufekci, V.; Tüfekci, H. Water quality assessment at Ömerli Dam using remote sensing techniques. Environ. Monit. Assess. **2007**, 135, 391–398.
- Wei, Z.; Wei, L.; Yang, H.; Wang, Z.; Xiao, Z.; Li, Z.; Yang, Y.; Xu, G. Water Quality Grade Identification for Lakes in Middle Reaches of Yangtze River Using Landsat-8 Data with Deep Neural Networks (DNN) Model. Remote Sens. **2022**, 14, 6238.
- Brezonik, P.L.; Olmanson, L.G.; Finlay, J.C.; Bauer, M.E. Factors affecting the measurement of CDOM by remote sensing of optically complex inland waters. Remote Sens. Environ. **2015**, 157, 199–215.
- Sòria-Perpinyà, X.; Vicente, E.; Urrego, P.; Pereira-Sandoval, M.; Ruíz-Verdú, A.; Delegido, J.; Soria, J.M.; Moreno, J. Remote sensing of cyanobacterial blooms in a hypertrophic lagoon (Albufera of València, Eastern Iberian Peninsula) using multitemporal Sentinel-2 images. Sci. Total Environ. **2020**, 698, 134305.
- McFeeters, S.K. The use of the Normalized Difference Water Index (NDWI) in the delineation of open water features. Int. J. Remote Sens. **1996**, 17, 1425–1432.
- Song, K. Water quality monitoring using Landsat Thematic Mapper data with empirical algorithms in Chagan Lake, China. J. Appl. Remote Sens. **2011**, 5, 053506.
- Vincent, R.K.; Qin, X.M.; McKay, R.M.L.; Miner, J.; Czajkowski, K.; Savino, J.; Bridgeman, T. Phycocyanin detection from Landsat TM data for mapping cyanobacterial blooms in Lake Erie. Remote Sens. Environ. **2004**, 89, 381–392.
- Kachroud, M.; Trolard, F.; Kefi, M.; Jebari, S.; Bourrié, G. Water Quality Indices: Challenges and Application Limits in the Literature. Water **2019**, 11, 361.
- Willmott, C.J. On the validation of models. Phys. Geogr. **1981**, 2, 184–194.
- Patricio-Valerio, L.; Schroeder, T.; Devlin, M.J.; Qin, Y.; Smithers, S. A Machine Learning Algorithm for Himawari-8 Total Suspended Solids Retrievals in the Great Barrier Reef. Remote Sens. **2022**, 14, 3503.
- Mehraein, M.; Mohanavelu, A.; Naganna, S.R.; Kulls, C.; Kisi, O. Monthly Streamflow Prediction by Metaheuristic Regression Approaches Considering Satellite Precipitation Data. Water **2022**, 14, 3636.
- Singh, V.K.; Singh, B.P.; Kisi, O.; Kushwaha, D.P. Spatial and multi-depth temporal soil temperature assessment by assimilating satellite imagery, artificial intelligence and regression based models in arid area. Comput. Electron. Agric. **2018**, 150, 205–219.
- Quinlan, J.R. Learning with Continuous Classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, Tasmania, Australia, 16–18 November 1992; pp. 343–348.
- Bayatvarkeshi, M.; Imteaz, M.; Kisi, O.; Zarei, M.; Yaseen, Z.M. Application of M5 model tree optimized with Excel Solver Platform for water quality parameter estimation. Environ. Sci. Pollut. Res. **2021**, 28, 7347–7364.
- Kim, S.; Alizamir, M.; Zounemat-Kermani, M.; Kisi, O.; Singh, V.P. Assessing the biochemical oxygen demand using neural networks and ensemble tree approaches in South Korea. J. Environ. Manag. **2020**, 270, 110834.
- Keshtegar, B.; Heddam, S.; Kisi, O.; Zhu, S.P. Modeling total dissolved gas (TDG) concentration at Columbia River basin dams: High-order response surface method (H-RSM) vs. M5Tree, LSSVM, and MARS. Arab. J. Geosci. **2019**, 12, 544.
- Friedman, J.H. Multivariate Adaptive Regression Splines. Ann. Stat. **1991**, 19, 1–67.
- Shiau, J.; Lai, V.Q.; Keawsawasvong, S. Multivariate adaptive regression splines analysis for 3D slope stability in anisotropic and heterogenous clay. J. Rock Mech. Geotech. Eng. **2022**, 15, 1052–1064.
- Ferreira, C. Gene expression programming: A new adaptive algorithm for solving problems. Int. J. Complex Syst. **2001**, 13, 87–129. Available online: https://www.gene-expression-programming.com/ (accessed on 11 August 2022).
- Borrelli, A.; De Falco, I.; Della Cioppa, A.; Nicodemi, M.; Trautteur, G. Performance of genetic programming to extract the trend in noisy data series. Phys. A Stat. Mech. Appl. **2006**, 370, 104–108.
- Najafzadeh, M.; Oliveto, G.; Saberi-Movahed, F. Estimation of Scour Propagation Rates around Pipelines While Considering Simultaneous Effects of Waves and Currents Conditions. Water **2022**, 14, 1589.
- Afrasiabian, B.; Eftekhari, M. Prediction of mode I fracture toughness of rock using linear multiple regression and gene expression programming. J. Rock Mech. Geotech. Eng. **2022**, 14, 1421–1432.
- Giustolisi, O.; Savic, D.A. A symbolic data-driven technique based on evolutionary polynomial regression. J. Hydroinformat. **2006**, 8, 207–222.
- Savic, D.; Giustolisi, O.; Berardi, L.; Shepherd, W.; Djordjevic, S.; Saul, A. Modelling sewer failure by evolutionary computing. Proc. Inst. Civ. Eng.-Water Manag. **2006**, 159, 111–118.
- Savic, D.A.; Giustolisi, O.; Laucelli, D. Asset deterioration analysis using multi-utility data and multi-objective data mining. J. Hydroinformat. **2009**, 11, 211–224.
- Fiore, A.; Marano, G.C.; Laucelli, D.; Monaco, P. Evolutionary Modeling to Evaluate the Shear Behavior of Circular Reinforced Concrete Columns. Adv. Civ. Eng. **2014**, 2014, 684256.
- Balacco, G.; Laucelli, D. Improved air valve design using evolutionary polynomial regression. Water Supply **2019**, 19, 2036–2043.
- Nahm, F.S. Receiver operating characteristic curve: Overview and practical use for clinicians. Korean J. Anesthesiol. **2022**, 75, 25–36.
- Fleiss, J.L. Statistical Methods for Rates and Proportions, 2nd ed.; John Wiley and Sons: Hoboken, NJ, USA, 1981; pp. 38–46.
- Ahmadianfar, I.; Jamei, M.; Karbasi, M.; Gharabaghi, B. A novel boosting ensemble committee-based model for local scour depth around non-uniformly spaced pile groups. Eng. Comput. **2022**, 38, 3439–3461.

**Figure 2.** Histogram of WQPs measured in Hudson River: (**a**) tur, (**b**) SO_{4}^{2−}, (**c**) Na^{+}, (**d**) K^{+}, (**e**) pH, (**f**) NO_{3}^{−}, (**g**) Mg^{2+}, (**h**) hardness, (**i**) F^{−}, (**j**) Cl^{−}, (**k**) AS, (**l**) Alk, and (**m**) DO.

**Figure 3.** Performance of AI models in the prediction of WQI for (**a**) training and (**b**) testing phases.

**Figure 5.** Spatial variations of WQI predicted by AI models for 12/03/2021: (**a**) MT, (**b**) MARS, (**c**) GEP, and (**d**) EPR.

**Figure 6.** Temporal variations of WQI predicted by MT for all dates of satellite images: (**a**) 12/03/2021, (**b**) 13/04/2021, (**c**) 20/04/2021, (**d**) 06/05/2021, (**e**) 15/05/2021, and (**f**) 07/06/2021.

Parameter | Unit | Max | Min | Average | Standard Deviation |
---|---|---|---|---|---|
Tur | NTU | 28.69 | 1.12 | 16.67 | 6.3 |
SO_{4}^{2−} | mg/L | 16.7 | 9.02 | 11.66 | 2.34 |
Na^{+} | mg/L | 31.4 | 14.1 | 20.64 | 5.02 |
K^{+} | mg/L | 1.44 | 0.79 | 1.14 | 0.32 |
pH | --- | 7.9 | 7.5 | 7.57 | 0.14 |
NO_{3}^{−} | mg/L | 0.76 | 0.34 | 0.48 | 0.12 |
Mg^{2+} | mg/L | 5.76 | 3.56 | 4.64 | 0.68 |
Hardness | mg/L | 103 | 65 | 83.1 | 10.94 |
F^{−} | mg/L | 0.1 | 0.1 | 0.1 | 1.9 × 10^{−16} |
Cl^{−} | mg/L | 56.3 | 23.6 | 35.25 | 10.16 |
AS | mg/L | 53 × 10^{−3} | 27 × 10^{−3} | 36 × 10^{−3} | 8.12 |
Alk | mg/L | 76.7 | 52.4 | 65.7 | 6.88 |
DO | mg/L | 14.1 | 7.5 | 10.9 | 2.18 |

Bands | Wavelength (μm) | Resolution (m) |
---|---|---|
Band 1—Coastal aerosol | 0.43–0.45 | 30 |
Band 2—Blue | 0.45–0.51 | 30 |
Band 3—Green | 0.53–0.59 | 30 |
Band 4—Red | 0.64–0.67 | 30 |
Band 5—Near Infrared (NIR) | 0.85–0.88 | 30 |
Band 6—SWIR 1 | 1.57–1.65 | 30 |
Band 7—SWIR 2 | 2.11–2.29 | 30 |
Band 8—Panchromatic | 0.50–0.68 | 15 |
Band 9—Cirrus | 1.36–1.38 | 30 |
Band 10—Thermal Infrared (TIRS) 1 | 10.6–11.19 | 100 |
Band 11—Thermal Infrared (TIRS) 2 | 11.50–12.51 | 100 |

Image Acquisition Date | Image ID | Range of Image Usage |
---|---|---|
12 March 2021 | LC80130312021071LGN00 | 12 March 2021 |
13 April 2021 | LC80140312021110LGN00 | 13 March 2021–13 April 2021 |
20 April 2021 | LC80140312021126LGN00 | 14 April 2021–20 April 2021 |
6 May 2021 | LC80130312021135LGN00 | 21 April 2021–5 May 2021 |
15 May 2021 | LC80140312021158LGN00 | 7 May 2021–15 May 2021 |
7 June 2021 | LC80130312021167LGN00 | 16 May 2021–7 June 2021 |

Parameters | Multivariate Linear Regression Equation | $\mathit{R}$ |
---|---|---|
Tur | $969.3-1.5468\times {\mathrm{b}}_{11}+2.07\times \frac{{\mathrm{b}}_{5}}{{\mathrm{b}}_{2}}$ | 0.873 |
SO_{4}^{2−} | $-285+2824\times {\mathrm{b}}_{2}+91\times \frac{{\mathrm{b}}_{4}}{{\mathrm{b}}_{3}}-548\times {\mathrm{b}}_{6}$ | 0.867 |
Na^{+} | $477+10,066\times {\mathrm{b}}_{9}-17.8\times {\mathrm{b}}_{11}-34,776\times \frac{{\mathrm{b}}_{3}}{{\mathrm{b}}_{11}}$ | 0.756 |
K^{+} | $1.4643-0.217\times \frac{{\mathrm{b}}_{2}}{{\mathrm{b}}_{6}}-0.1186\times \frac{{\mathrm{b}}_{5}}{{\mathrm{b}}_{6}}+0.0786\times \frac{{\mathrm{b}}_{5}}{{\mathrm{b}}_{7}}$ | 0.849 |
pH | $-2.03+0.912\times {\mathrm{b}}_{11}$ | 0.939 |
NO_{3}^{−} | $0.299-0.894\times \frac{{\mathrm{b}}_{7}-{\mathrm{b}}_{5}}{{\mathrm{b}}_{7}+{\mathrm{b}}_{5}}-28.17\times {\mathrm{b}}_{6}-1.31\times \frac{{\mathrm{b}}_{6}}{{\mathrm{b}}_{5}}$ | 0.868 |
Mg^{2+} | $8.063-183,590\times \frac{{\mathrm{b}}_{9}}{{\mathrm{b}}_{11}}+156,519\times \frac{{\mathrm{b}}_{9}}{{\mathrm{b}}_{11}}$ | 0.888 |
Hardness | $-755+1745\times {\mathrm{b}}_{1}+705\times \frac{{\mathrm{b}}_{1}}{{\mathrm{b}}_{8}}+404\times \frac{{\mathrm{b}}_{4}}{{\mathrm{b}}_{3}}$ | 0.801 |
F^{−} | $1.597-0.00508\times {\mathrm{b}}_{11}-174\times \frac{{\mathrm{b}}_{9}}{{\mathrm{b}}_{10}}$ | 0.937 |
Cl^{−} | $277+0.0001\times {\mathrm{b}}_{10}-0.807\times {\mathrm{b}}_{11}-3.835\times \frac{{\mathrm{b}}_{2}}{{\mathrm{b}}_{6}}$ | 0.954 |
AS | $0.38939-0.02497\times \frac{{\mathrm{b}}_{6}}{{\mathrm{b}}_{5}}+0.04198\times \frac{{\mathrm{b}}_{2}}{{\mathrm{b}}_{6}}$ | 0.936 |
Alk | $-18,804+27,173\times \frac{{\mathrm{b}}_{11}-{\mathrm{b}}_{2}}{{\mathrm{b}}_{11}+{\mathrm{b}}_{2}}-8290\times \frac{{\mathrm{b}}_{11}-{\mathrm{b}}_{1}}{{\mathrm{b}}_{11}+{\mathrm{b}}_{1}}-0.0009\times \frac{{\mathrm{b}}_{11}}{{\mathrm{b}}_{4}}$ | 0.920 |
DO | $103.16-0.3289\times {\mathrm{b}}_{11}$ | 0.917 |

Class | Threshold Value | Water Quality States |
---|---|---|

I | 95–100 | Excellent |

II | 80–94 | Good |

III | 60–79 | Fair |

IV | 45–59 | Marginal |

V | 0–44 | Poor |
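
The classification in this table can be expressed as a simple lookup. The sketch below is a minimal version with a hypothetical function name; it assumes that the lower bound of each class acts as the cutoff, so that non-integer scores such as 94.5 (which the table does not cover explicitly) fall into the lower class.

```python
def classify_wqi(wqi):
    """Map a WQI score in [0, 100] to its (class, water quality state)."""
    if not 0 <= wqi <= 100:
        raise ValueError("WQI is expected to lie in [0, 100]")
    # (lower bound, class, state), ordered from best to worst.
    bands = [
        (95, "I", "Excellent"),
        (80, "II", "Good"),
        (60, "III", "Fair"),
        (45, "IV", "Marginal"),
        (0, "V", "Poor"),
    ]
    for lower, cls, state in bands:
        if wqi >= lower:
            return cls, state


# Maximum, minimum, and average WQI from the descriptive-statistics table.
for score in (96.25, 84.25, 88.11):
    print(score, classify_wqi(score))
```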

Parameter | Max | Min | Average | Standard Deviation |
---|---|---|---|---|

b_{1} | 0.107 | 0.034 | 0.055 | 0.02 |

b_{2} | 0.09 | 0.037 | 0.055 | 0.02 |

b_{3} | 0.072 | 0.029 | 0.048 | 0.012 |

b_{4} | 0.09 | 0.029 | 0.057 | 0.021 |

b_{5} | 0.052 | 0.025 | 0.034 | 0.007 |

b_{6} | 0.038 | 0.016 | 0.027 | 0.007 |

b_{7} | 0.053 | 0.011 | 0.026 | 0.011 |

b_{8} | 0.171 | 0.036 | 0.064 | 0.012 |

b_{9} | 0.004 | 0.0009 | 0.002 | 0.001 |

b_{10} | 293.7 | 276.22 | 283.34 | 6.13 |

b_{11} | 292.9 | 275.71 | 282.67 | 5.99 |

WQI | 96.25 | 84.25 | 88.11 | 3.68 |

Basis Function | Formulation |
---|---|

BF_{1} | $\mathrm{max}\left(0,{\mathrm{b}}_{1}-276.41\right)$ |

BF_{2} | $\mathrm{max}\left(0,0.02367-{\mathrm{b}}_{1}\right)$ |

BF_{3} | $\mathrm{max}\left(0,0.1261-{\mathrm{b}}_{7}\right)$ |

BF_{4} | $\mathrm{max}\left(0,0.1261-{\mathrm{b}}_{7}\right)\times \mathrm{max}(0,{\mathrm{b}}_{2}-0.07603)$ |

BF_{5} | $\mathrm{max}\left(0,0.1261-{\mathrm{b}}_{7}\right)\times \mathrm{max}(0,0.07603-{\mathrm{b}}_{2})$ |

BF_{6} | $\mathrm{max}\left(0,{\mathrm{b}}_{1}-276.41\right)\times \mathrm{max}(0,{\mathrm{b}}_{4}-0.08869)$ |

BF_{7} | $\mathrm{max}\left(0,0.02367-{\mathrm{b}}_{1}\right)\times \mathrm{max}(0,{\mathrm{b}}_{4}-0.0891)$ |

BF_{8} | $\mathrm{max}\left(0,{\mathrm{b}}_{1}-276.41\right)\times \mathrm{max}(0,287.98-{\mathrm{b}}_{11})$ |

BF_{9} | $\mathrm{max}\left(0,0.02994-{\mathrm{b}}_{7}\right)\times \mathrm{max}(0,{\mathrm{b}}_{1}-0.05777)$ |

BF_{10} | $\mathrm{max}(0,{\mathrm{b}}_{2}-283.42)$ |

BF_{11} | $\mathrm{max}\left(0,0.02994-{\mathrm{b}}_{7}\right)\times \mathrm{max}(0,{\mathrm{b}}_{4}-0.08869)$ |
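
Each MARS basis function above is a hinge term, max(0, x − c) or max(0, c − x), or a product of two such terms. The sketch below evaluates BF_{3}, BF_{4}, and BF_{5} exactly as tabulated. The helper names are hypothetical, the inputs are the average b_{2} and b_{7} values from the descriptive-statistics table, and the coefficients that combine the basis functions into a WQI estimate are not listed here, so only the basis functions themselves are computed.

```python
def hinge(x):
    """Hinge function used by MARS: max(0, x)."""
    return max(0.0, x)


def bf3(b7):
    # BF3 = max(0, 0.1261 - b7)
    return hinge(0.1261 - b7)


def bf4(b2, b7):
    # BF4 = max(0, 0.1261 - b7) * max(0, b2 - 0.07603)
    return hinge(0.1261 - b7) * hinge(b2 - 0.07603)


def bf5(b2, b7):
    # BF5 = max(0, 0.1261 - b7) * max(0, 0.07603 - b2)
    return hinge(0.1261 - b7) * hinge(0.07603 - b2)


b2, b7 = 0.055, 0.026  # average values, used only as example inputs
print(bf3(b7), bf4(b2, b7), bf5(b2, b7))
```

BF_{4} and BF_{5} form a mirrored pair around the knot b_{2} = 0.07603, so at most one of them is non-zero for any given input.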

Parameters | Values |
---|---|

Number of chromosomes | 30 |

Linking function | + |

Mutation | 0.00138 |

Fixed-Root Mutation | 0.00068 |

Gene-Recombination | 0.00068 |

Gene Transposition | 0.00277 |

One-Point Recombination | 0.00277 |

Best fitness function | 419.5948 |

Stop condition | R-Square Threshold |

Maximum depth of subtree | 7 |

Mathematical operators and functions | ±, ×, /, Ln(x), exp(x), Average(x_{1}, x_{2}) |

Model No. | Formulation | MSE |
---|---|---|

1 | $\mathrm{WQI}=0.0016231\times \frac{1}{{\mathrm{b}}_{6}^{2}}+3.54\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\times {\mathrm{b}}_{10}^{2}\right)+5.4618\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}^{2}}\right)\phantom{\rule{0ex}{0ex}}+0.012942\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{1}{{\mathrm{b}}_{7}^{1.5}}\right)+5141.924\times \frac{{\mathrm{b}}_{3}^{0.5}}{{\mathrm{b}}_{11}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)\phantom{\rule{0ex}{0ex}}+63.1082\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{6}^{0.5}\right)+128.9703$ | 1.706 |

2 | $\mathrm{WQI}=0.014943\times \frac{1}{{\mathrm{b}}_{6}^{2}}+1.8054\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\times {\mathrm{b}}_{10}^{2}\right)+5.114\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)\phantom{\rule{0ex}{0ex}}+0.010792\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{7}^{1.5}}\right)+4814.2172\times \frac{{\mathrm{b}}_{3}^{0.5}}{{\mathrm{b}}_{11}}\times \mathrm{Ln}({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10})+\mathrm{57,955}\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\right)+124.0877$ | 1.588 |

3 | $\mathrm{WQI}=0.0014518\times \frac{1}{{\mathrm{b}}_{6}^{2}}+1.8395\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\times {\mathrm{b}}_{10}^{2}\right)+5.103\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}\times {\mathrm{b}}_{11}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{11}^{2}}\right)\phantom{\rule{0ex}{0ex}}+0.010619\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{7}^{1.5}}\right)+280.4126\times \frac{{\mathrm{b}}_{3}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)\phantom{\rule{0ex}{0ex}}+58.3367\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\right)+93.3195$ | 1.656 |

4 | $\mathrm{WQI}=1.8322\times \mathrm{Ln}\left({\mathrm{b}}_{9}^{0.5}\right)+5.4262\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.024901\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+\phantom{\rule{0ex}{0ex}}295.5584\times \frac{{\mathrm{b}}_{3}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+280.1486\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+\phantom{\rule{0ex}{0ex}}70,121.1045\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+125.822$ | 1.585 |

5 | $\mathrm{WQI}=0.61063\times \mathrm{Ln}\left({\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)+5.024\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.018268\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+\phantom{\rule{0ex}{0ex}}2077.6743\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+314.7386\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+\phantom{\rule{0ex}{0ex}}76,\mathrm{488.31.78}\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+129.5538$ | 1.585 |

6 | $\mathrm{WQI}=5.0562\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.75124\times \mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)+0.017321\times {\mathrm{b}}_{11}\times \phantom{\rule{0ex}{0ex}}\mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+2103.1284\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+314.0978\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \phantom{\rule{0ex}{0ex}}\mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+71,395.4675\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+133.1729$ | 1.521 |

7 | $\mathrm{WQI}=4.852\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.78368\times \mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{2}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)+0.016237\times \phantom{\rule{0ex}{0ex}}{\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+2022.4638\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+306.1721\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \phantom{\rule{0ex}{0ex}}\mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+66,319.6416\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+133.5218$ | 1.58 |

8 | $\mathrm{WQI}=4.9538\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.017425\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+4.8373\times {\mathrm{b}}_{6}^{0.5}\times \phantom{\rule{0ex}{0ex}}\mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{2}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}\right)+2061.653\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}^{0.5}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+303.5518\times {\mathrm{b}}_{1}^{0.5}\times \phantom{\rule{0ex}{0ex}}{\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+61,065.493\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+132.2393$ | 1.499 |

9 | $\mathrm{WQI}=4.6907\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.015951\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)+5.7518\times {\mathrm{b}}_{7}^{0.5}\times \phantom{\rule{0ex}{0ex}}\mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{2}^{0.5}\times {\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)+34,537.1755\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)+\phantom{\rule{0ex}{0ex}}336.6602\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)+75,604.5548\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+133.6116$ | 1.602 |

10 | $\mathrm{WQI}=4.7174\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.01511\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)\phantom{\rule{0ex}{0ex}}+6.1317\times {\mathrm{b}}_{7}^{0.5}\times \mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{2}^{0.5}\times {\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)\phantom{\rule{0ex}{0ex}}+36,315.0134\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)\phantom{\rule{0ex}{0ex}}+334.1929\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)\phantom{\rule{0ex}{0ex}}+299,003.26\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+136.9156$ | 1.562 |

11 | $\mathrm{WQI}=4.6505\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{3}^{2}\times {\mathrm{b}}_{11}^{0.5}}{{\mathrm{b}}_{7}^{0.5}\times {\mathrm{b}}_{10}^{2}}\right)+0.014827\times {\mathrm{b}}_{11}\times \mathrm{Ln}\left(\frac{{\mathrm{b}}_{10}}{{\mathrm{b}}_{6}}\right)\phantom{\rule{0ex}{0ex}}+6.1112\times {\mathrm{b}}_{7}^{0.5}\times \mathrm{Ln}\left({\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{2}^{0.5}\times {\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{9}\times {\mathrm{b}}_{10}^{2}\right)\phantom{\rule{0ex}{0ex}}+35,910.17\times \frac{{\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{0.5}}{{\mathrm{b}}_{11}}\times \mathrm{Ln}\left({\mathrm{b}}_{7}^{2}\times {\mathrm{b}}_{10}\right)\phantom{\rule{0ex}{0ex}}+335.2054\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{7}\times \mathrm{Ln}\left({\mathrm{b}}_{1}\times {\mathrm{b}}_{6}^{0.5}\right)\phantom{\rule{0ex}{0ex}}+\mathrm{303,430.10.54}\times {\mathrm{b}}_{1}^{0.5}\times {\mathrm{b}}_{3}^{0.5}\times {\mathrm{b}}_{6}^{1.5}\times {\mathrm{b}}_{7}+123.4055$ | 1.507 |

Inner Function | Natural Logarithm |
---|---|

Range of exponents | [−2, −1.5, −1, −0.5, 0, 0.5, 1, 1.5, 2] |

Number of terms | 6 |

Expression structure | Sum(a_{i} × x_{1} × x_{2} × f(x_{1} × x_{2})) + bias |

Regression method | Non-negative least squares |

Optimum number of generations | [10 40] |

Fitness function | Mean Square Error |

AI Models | IOA | RMSE | MAE | SI |
---|---|---|---|---|
**Training Phase** | | | | |
MT | 0.969 | 1.287 | 0.0091 | 0.0146 |
MARS | 0.992 | 0.64 | 0.0059 | 0.0073 |
GEP | 0.964 | 1.383 | 0.0104 | 0.0157 |
EPR | 0.973 | 1.194 | 0.0076 | 0.0135 |
**Testing Phase** | | | | |
MT | 0.978 | 1.085 | 0.0084 | 0.0146 |
MARS | 0.975 | 1.165 | 0.0088 | 0.0129 |
GEP | 0.978 | 1.052 | 0.0093 | 0.0109 |
EPR | 0.977 | 1.123 | 0.0083 | 0.0135 |
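
For readers who want to compute comparable statistics on their own predictions, the following sketch implements the four measures using common textbook definitions: Willmott's index of agreement for IOA, and SI taken as RMSE divided by the mean observed value. These definitions are assumptions, since the formulas are not restated alongside this table and the authors' normalisation (particularly for MAE) may differ. The `evaluate` function and the sample values are hypothetical.

```python
from math import sqrt


def evaluate(observed, predicted):
    """Return IOA, RMSE, MAE, and SI for paired observed/predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sum_sq_err = sum(e * e for e in errors)

    rmse = sqrt(sum_sq_err / n)
    mae = sum(abs(e) for e in errors) / n
    # Willmott's index of agreement (assumed definition of IOA).
    potential_err = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2
        for o, p in zip(observed, predicted)
    )
    ioa = 1.0 - sum_sq_err / potential_err
    # Scatter index: RMSE normalised by the mean observation (assumed definition).
    si = rmse / mean_obs
    return {"IOA": ioa, "RMSE": rmse, "MAE": mae, "SI": si}


# Made-up WQI values, for illustration only.
observed = [84.25, 86.0, 88.0, 90.5, 96.25]
predicted = [85.0, 86.5, 87.0, 91.0, 95.0]
metrics = evaluate(observed, predicted)
print({name: round(value, 4) for name, value in metrics.items()})
```

The assumed SI definition is at least consistent with the tabulated training values: for MT, 1.287 / 88.11 ≈ 0.0146, taking 88.11 as the mean WQI from the descriptive-statistics table.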

AI Models | SSR | SSE | MSR | MSE | F_{0} | Hypothesis States |
---|---|---|---|---|---|---|

GEP | 170.861 | 1213.8 | 13.143 | 14.985 | 0.877 | Accept |

MARS | 60.549 | 1150.9 | 4.657 | 14.208 | 0.327 | Accept |

EPR | 12462 | 13770 | 958.587 | 169.995 | 5.639 | Reject |

M5MT | 308.959 | 1272.20 | 23.766 | 15.706 | 1.513 | Accept |
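
The mean squares and F_{0} statistic in this table follow from SSR and SSE once the degrees of freedom are fixed. The sketch below reproduces that arithmetic; the values of 13 and 81 degrees of freedom are not stated in the table but are inferred from the ratios SSR/MSR and SSE/MSE, so they should be read as an assumption. The function name is hypothetical, and the critical F value behind the accept/reject decision is not given here, so it is not reproduced.

```python
DF_REGRESSION = 13  # inferred from SSR / MSR in the table (assumption)
DF_ERROR = 81       # inferred from SSE / MSE in the table (assumption)


def f_statistic(ssr, sse, df_reg=DF_REGRESSION, df_err=DF_ERROR):
    """Return (MSR, MSE, F0) with MSR = SSR/df_reg, MSE = SSE/df_err, F0 = MSR/MSE."""
    msr = ssr / df_reg
    mse = sse / df_err
    return msr, mse, msr / mse


# SSR and SSE values as listed in the table.
rows = [
    ("GEP", 170.861, 1213.8),
    ("MARS", 60.549, 1150.9),
    ("EPR", 12462, 13770),
    ("M5MT", 308.959, 1272.20),
]
for name, ssr, sse in rows:
    msr, mse, f0 = f_statistic(ssr, sse)
    print(f"{name:5s} MSR={msr:9.3f} MSE={mse:8.3f} F0={f0:.3f}")
```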

AI Models | μ_{e} | S_{e} | $\mathit{C}{\mathit{L}}_{\mathit{e}}^{+}$ | $\mathit{C}{\mathit{L}}_{\mathit{e}}^{-}$ | Uncertainty Band $(\mathit{C}{\mathit{L}}_{\mathit{e}}^{+}-\mathit{C}{\mathit{L}}_{\mathit{e}}^{-})$ |
---|---|---|---|---|---|

GEP | 0.0710 | 0.8152 | 0.1419 | 0.0000003 | 0.1419 |

MARS | 0.1732 | 1.5702 | 0.2755 | 0.0710 | 0.2046 |

EPR | 0.2252 | 1.9852 | 0.2771 | 0.1732 | 0.1039 |

M5MT | 0.2997 | 2.3734 | 0.3741 | 0.2252 | 0.1489 |

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Najafzadeh, M.; Basirian, S.
Evaluation of River Water Quality Index Using Remote Sensing and Artificial Intelligence Models. *Remote Sens.* **2023**, *15*, 2359.
https://doi.org/10.3390/rs15092359
