Article

Archimedes Optimization Algorithm-Based Feature Selection with Hybrid Deep-Learning-Based Churn Prediction in Telecom Industries

1
Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
2
Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, P.O. Box 22459, Riyadh 11495, Saudi Arabia
3
Department of Financial and Banking Sciences, Applied College at Muhail Aseer, King Khalid University, Abha 61413, Saudi Arabia
4
Department of Computer Science, University of the People, Pasadena, CA 91101, USA
5
Department of Computer Science, College of Post-Graduated Studies, Sudan University of Science and Technology, Khartoum 11111, Sudan
6
Research Center, Future University in Egypt, New Cairo 11835, Egypt
*
Author to whom correspondence should be addressed.
Biomimetics 2024, 9(1), 1; https://doi.org/10.3390/biomimetics9010001
Submission received: 17 October 2023 / Revised: 5 December 2023 / Accepted: 8 December 2023 / Published: 19 December 2023

Abstract

Customer churn prediction (CCP) refers to the use of data analytics and machine learning (ML) tools to forecast churning customers, i.e., customers who are likely to cancel their subscriptions, thus allowing companies to apply targeted customer retention approaches and reduce the customer attrition rate. This predictive methodology improves active customer management, increases customer satisfaction, and supports sustained business profits. By recognizing and prioritizing the relevant features, such as usage patterns and customer interactions, and by leveraging the capability of deep learning (DL) algorithms, telecom companies can develop robust predictive models that anticipate and mitigate customer churn and strengthen retention approaches. Against this background, the current study presents the Archimedes optimization algorithm-based feature selection with a hybrid deep-learning-based churn prediction (AOAFS-HDLCP) technique for telecom companies. To mitigate high-dimensionality problems, the AOAFS-HDLCP technique uses the AOAFS approach to choose an optimal set of features. The convolutional neural network with autoencoder (CNN-AE) model is then used for the churn prediction process. Finally, the thermal equilibrium optimization (TEO) technique is employed for hyperparameter selection of the CNN-AE algorithm, which in turn helps in achieving improved classification performance. An extensive experimental analysis was conducted to illustrate the enhanced performance of the AOAFS-HDLCP algorithm. The experimental outcomes show the high efficiency of the AOAFS-HDLCP approach over other techniques, with a maximum accuracy of 94.65%.

1. Introduction

Telecommunications has become one of the largest industries in developed countries. Technological developments and the large number of operators increase the range of challenges faced by the industry [1]. Companies follow several approaches in order to survive in this competitive market [2]. To generate high revenues, three key policies are followed: gaining new customers, upselling to existing customers, and increasing the retention time of customers. Comparing these policies in terms of return on investment (RoI), the third policy is the most profitable [3], since retaining an existing customer costs considerably less than gaining a new one. It is also regarded as a simpler task than upselling. To implement the third policy, companies need to reduce the likelihood of customer churn [4]. Predicting which customers are likely to leave the network helps in retaining them and can yield a substantial increase in profit if it is done at an early stage [5]. Various studies have established that machine learning (ML) techniques are highly effective in predicting churning customers, as they build on knowledge gained from prior data [6].
Big data tasks can be performed with the help of artificial intelligence (AI) technology without much effort from the sales and customer support teams [7]. It is therefore crucial to incorporate AI into business activities such as social marketing, sales, and customer relationship management (CRM) in order to attract customers effectively and gain their trust. Since AI is a significant part of social networks and other electronic marketing sites, it is important to understand how to utilize, adapt, and operate these sites efficiently [8]. Customer behavior analysis strongly affects the social networking and other marketing actions of a company by enabling highly customized and predictive marketing activities. By analyzing customer data, companies gain insight into what resonates with their audience [9]. Businesses employ such data to run more efficient social media and marketing activities, which in turn can result in better customer support and higher conversion rates. Deep learning (DL) techniques can also help companies optimize and automate their promotional activities, thus saving resources and time while enhancing the firm’s overall effectiveness. Recently, metaheuristic algorithms [10] have been widely used for hyperparameter tuning of DL models. Examples include monarch butterfly optimization (MBO) [11], slime mold algorithm (SMA) [12], moth search algorithm (MSA) [13], hunger games search (HGS) [14], Runge Kutta method (RUN) [15], colony predation algorithm (CPA) [16], weighted mean of vectors (INFO) [17], Harris hawks optimization (HHO) [18], and rime optimization algorithm (RIME) [19].
Against this background, the current study introduces the Archimedes optimization algorithm-based feature selection with hybrid deep-learning-based churn prediction (AOAFS-HDLCP) technique for telecom companies. The objective of the proposed AOAFS-HDLCP method is to predict churning customers so as to support customer retention activities in the telecom industry. In the presented AOAFS-HDLCP technique, the AOAFS approach is used to choose an optimal set of features. Its benefits are a fast convergence rate and a fine balance between local and global search capacity when solving continuous problems. The current study uses the convolutional neural network with an autoencoder (CNN-AE) model for churn prediction. Further, the thermal equilibrium optimization (TEO) technique is applied for hyperparameter tuning to improve the outcomes of the CNN-AE model. An extensive experimental analysis was conducted to illustrate the enhanced performance of the AOAFS-HDLCP method. Briefly, the major contributions of this research are given below:
  • An intelligent AOAFS-HDLCP method including AOAFS, CNN-AE classification, and TEO-based hyperparameter tuning is introduced for churn prediction. The AOAFS-HDLCP method does not exist in the literature to the best of the authors’ knowledge.
  • The AOAFS method is designed to detect the essential attributes from the telecom industry’s complex datasets, thus enhancing the efficiency and effectiveness of the churn prediction process.
  • The CNN-AE model is employed for the churn prediction process. It can capture intricate patterns and relationships in the data, thus potentially improving the accuracy of churn prediction compared with traditional approaches.
  • A TEO technique has been developed to fine-tune the model parameters of the CNN-AE model in an effective manner so as to optimize the performance in terms of predicting customer churn.

2. Related Works

The authors of [20] introduced the AI with Jaya optimization algorithm (JOA)-based churn prediction for data exploration (AIJOA-CPDE) method. In this method, feature selection was performed first, using the JOA approach to select the feature sets. A bidirectional LSTM (BLSTM) algorithm was then used for churn prediction. Finally, the chicken swarm optimization (CSO) method was applied for hyperparameter optimization. Kozak et al. [21] considered customer churn management to validate the efficiency of swarm intelligence machine learning (SIML) techniques. The aim of this study was two-fold: to examine the particular features and objectives of customer churn management, and to validate whether the adapted SIML technique increased the efficiency of churn-related segmentation and decision-making. Saha et al. [22] studied ensemble learning approaches, namely xgboost (XGB), bagging and stacking, Adaboost, gradient boosting (GBM), extremely randomized tree (ERT), and random forest (RF); standard classification algorithms, such as logistic regression (LR), artificial neural network (ANN), decision tree (DT), and k-nearest neighbors (KNN); and the DL-CNN approach, in order to select the best method for developing the CCP technique.
In [23], the authors developed the dynamic customer churn prediction (CCP) method for business intelligence by applying text analytics with a metaheuristic optimizer (CCPBI-TAMO). The LSTM with stacked AE (LSTM-SAE) algorithm was used to classify the feature-reduced data. Faritha Banu et al. [24] suggested the AI-based CCP for Telecommunication Business Markets (AICCP-TBM) method, in which the chaotic SSO-based FS (CSSO-FS) algorithm was utilized for selecting the superior feature set. Additionally, the fuzzy-rule-based classifier (FRC) was used to differentiate non-churn customers from churners. The quantum behaved particle swarm optimization (QPSO) approach was applied to select the membership functions for the FRC algorithm.
In an earlier study [25], stacked bidirectional LSTM (SBLSTM) and RNN models were developed with AOA for CCP. The aim of the presented approach was to forecast customer churn in an insurance company. First, the AOA approach preprocessed the data to convert the raw data into a usable format. The SBLSTM-RNN algorithm was then used to distinguish churn from non-churn customers. In [26], the authors created an ML approach that forecasts churn for telecom companies. The outcomes can be used to apply marketing retention approaches that retain customers over time. In this method, the authors employed recent databases, applied preprocessing such as bivariate and univariate analyses, and used data visualization methods to understand the data correctly. Alshamari [27] analyzed and measured user approval of the services rendered by the Saudi Telecom Company (STC), Mobily, and Zain. This kind of sentiment analysis (SA) has been a dominant parameter and has been used to make significant business decisions for enhancing the satisfaction and loyalty of customers. The author established new DL-based approaches for analyzing the percentage of customer satisfaction using the openly accessible database AraCust.
The existing literature on CCP has made significant strides in leveraging both ML and DL techniques to identify the potential churners. However, a notable research gap persists in adequately addressing the critical aspects of feature selection and hyperparameter tuning within this context. Though comprehensive studies have been conducted earlier on individual aspects of CCP, the simultaneous consideration of feature selection and hyperparameter tuning remains an underexplored territory. Feature selection plays an important role in improving the efficacy of the model by detecting the most informative variables, thus reducing both noise and computation. At the same time, hyperparameter tuning is crucial for fine-tuning the model’s performance and generalization. The synergy between these two crucial aspects can potentially yield highly efficient and accurate churn prediction methods. However, the existing research often overlooks this synergy, thus resulting in suboptimal predictive abilities. Bridging this research gap is a vital element to unlock the maximum potential of CCP algorithms. This can further offer the businesses highly efficient mechanisms for customer retention and improved decision-making processes in extremely competitive industries.

3. The Proposed Model

In this article, the AOAFS-HDLCP system has been proposed for churn prediction in the telecom industry. The objective of the AOAFS-HDLCP method is to obtain churn prediction so as to increase the customer retention in the telecom industry. In the presented AOAFS-HDLCP technique, the AOAFS approach, CNN-AE classification, and TEO-based hyperparameter tuning are introduced. Figure 1 exhibits the working procedure of the AOAFS-HDLCP approach.

3.1. Stage I: Feature Selection Using AOA

In this study, the AOA is used to choose the optimum feature set. AOA is based on Archimedes’ physical law of buoyancy [28]. It is an effective model for optimization since it balances the tradeoff between the exploration and exploitation phases, which makes it suitable for difficult and multidimensional search spaces. Inspired by Archimedes’ principle of buoyancy, the AOA method adapts its search mechanism to the fitness landscape, thus enabling effective convergence towards the optimal solution. It is highly adaptable, is able to escape local minima, and is well suited to real-world problems across various domains. Since feature selection aims to identify highly relevant features, the AOA’s adaptability and its capacity to discern informative features from a multitude of possibilities are valuable. By dynamically adjusting the search process to the dataset characteristics, the AOA performs well in detecting optimum feature subsets. This results in improved model interpretability, reduced computational complexity, and improved generalization performance.
AOA is a recent metaheuristic algorithm derived from Archimedes’ principle. Like other population-based metaheuristic techniques, the AOA technique begins its search with an initial population of objects, each with a random volume, density, and acceleration. The steps of the AOA method are listed below.
Step 1. Initialize the population location, volume, density, and acceleration using the following Equation (1):
$$
\begin{aligned}
X_i &= lb_i + rand \times (ub_i - lb_i), && i = 1, 2, \ldots, N, \\
acc_i &= lb_i + rand \times (ub_i - lb_i), && i = 1, 2, \ldots, N, \\
den_i &= rand(N, D), \\
vol_i &= rand(N, D),
\end{aligned} \tag{1}
$$
where the population size and the dimension of the search range are $N$ and $D$, respectively. The $i$th object in the population of $N$ objects is $X_i$. The lower and upper limits of the search range are $lb_i$ and $ub_i$, respectively. An $N \times D$ matrix of random values generated by the system function is denoted by $rand(N, D)$. The volume, density, and acceleration of the $i$th object are $vol_i$, $den_i$, and $acc_i$, respectively. Next, the individual $X_{best}$ with the optimum fitness value and the respective $acc_{best}$, $den_{best}$, and $vol_{best}$ are chosen [29].
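Step 1 can be illustrated with a minimal Python sketch. The function name `init_population`, the per-dimension bound lists, and the fixed seed are illustrative assumptions and are not taken from the article; densities and volumes are drawn from [0, 1) following the $rand(N, D)$ notation of Equation (1).

```python
import random

def init_population(n, dim, lb, ub, rng):
    """Eq. (1): positions and accelerations in [lb, ub]; densities and volumes in [0, 1)."""
    def uniform_vec():
        return [lb[j] + rng.random() * (ub[j] - lb[j]) for j in range(dim)]
    positions = [uniform_vec() for _ in range(n)]
    accelerations = [uniform_vec() for _ in range(n)]
    densities = [[rng.random() for _ in range(dim)] for _ in range(n)]
    volumes = [[rng.random() for _ in range(dim)] for _ in range(n)]
    return positions, accelerations, densities, volumes

rng = random.Random(0)  # fixed seed (assumption), only to make the sketch reproducible
X, acc, den, vol = init_population(n=5, dim=3, lb=[-1.0] * 3, ub=[1.0] * 3, rng=rng)
```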
Step 2. Update the density and volume of the $i$th object for iteration $t + 1$ as given below.
$$
den_i^{t+1} = den_i^{t} + rand \times \left(den_{best} - den_i^{t}\right), \qquad vol_i^{t+1} = vol_i^{t} + rand \times \left(vol_{best} - vol_i^{t}\right), \tag{2}
$$
In Equation (2), the global optimum values of density and volume are denoted by $den_{best}$ and $vol_{best}$, respectively.
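A short sketch of the update in Equation (2) is given here. Drawing an independent random number for every dimension is an assumption of this sketch; the article does not state whether $rand$ is a scalar or a vector.

```python
import random

def update_density_volume(den_i, vol_i, den_best, vol_best, rng):
    """Eq. (2): pull the density and volume of one object towards the best object's values."""
    new_den = [d + rng.random() * (db - d) for d, db in zip(den_i, den_best)]
    new_vol = [v + rng.random() * (vb - v) for v, vb in zip(vol_i, vol_best)]
    return new_den, new_vol

rng = random.Random(1)  # illustrative seed
den_next, vol_next = update_density_volume(
    den_i=[0.2, 0.8], vol_i=[0.5, 0.5],
    den_best=[0.6, 0.4], vol_best=[0.9, 0.1], rng=rng)
```

Because $rand$ lies in [0, 1), every updated value stays between the current value and the corresponding best value.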
Step 3. Compute the density decline factor $d$ and the parameter $TF$, which together balance the global and local convergence capability of the AOA method.
$$
TF = \exp\left(\frac{t - t_{max}}{t_{max}}\right), \tag{3}
$$
In Equation (3), the maximum and the current iterations are denoted by $t_{max}$ and $t$, respectively. Here, $TF$ rises with the iteration number until $TF = 1$.
$$
d^{t+1} = \exp\left(\frac{t_{max} - t}{t_{max}}\right) - \left(\frac{t}{t_{max}}\right), \tag{4}
$$
In Equation (4), as the iteration number increases, $d$ decreases and the search is confined to the promising region that has already been detected [30].
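The two schedules of Equations (3) and (4) are easy to inspect numerically. The sketch below uses hypothetical function names and an arbitrary iteration budget of 100, and locates the iteration at which $TF$ first exceeds the 0.5 threshold used in the following steps.

```python
import math

def transfer_factor(t, t_max):
    """Eq. (3): TF = exp((t - t_max) / t_max); rises towards 1 as t approaches t_max."""
    return math.exp((t - t_max) / t_max)

def density_factor(t, t_max):
    """Eq. (4): d = exp((t_max - t) / t_max) - t / t_max; decreases as t grows."""
    return math.exp((t_max - t) / t_max) - t / t_max

t_max = 100  # illustrative iteration budget (assumption)
# First iteration at which the search switches from exploration to exploitation.
switch_iter = next(t for t in range(1, t_max + 1) if transfer_factor(t, t_max) > 0.5)
```

Since $TF > 0.5$ is equivalent to $t / t_{max} > 1 - \ln 2 \approx 0.307$, the switch occurs at iteration 31 for this budget, so roughly the first 30% of the run is spent in exploration.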
Step 4. When $TF \le 0.5$, the objects collide and the search is in the exploration stage. The acceleration is updated using the following equation.
$$
acc_i^{t+1} = \frac{den_{mr} + vol_{mr} \times acc_{mr}}{den_i^{t+1} + vol_i^{t+1}}, \qquad mr = rand, \tag{5}
$$
In Equation (5), the acceleration, volume, and density of the $i$th individual at the $(t+1)$th iteration are denoted by $acc_i^{t+1}$, $vol_i^{t+1}$, and $den_i^{t+1}$, respectively. The acceleration, density, and volume of a randomly selected individual are denoted by $acc_{mr}$, $den_{mr}$, and $vol_{mr}$, respectively.
When $TF > 0.5$, there is no collision between the objects and the search is in the exploitation stage. The acceleration is then updated as given below.
$$
acc_i^{t+1} = \frac{den_{best} + vol_{best} \times acc_{best}}{den_i^{t+1} + vol_i^{t+1}}, \tag{6}
$$
Next, the acceleration is normalized using the following equation.
$$
acc_{i,norm}^{t+1} = u \times \frac{acc_i^{t+1} - \min(acc)}{\max(acc) - \min(acc)} + l, \tag{7}
$$
In Equation (7), $u$ and $l$ define the range of normalization and are fixed at 0.9 and 0.1, respectively. $acc_{i,norm}^{t+1}$ determines the percentage of the step by which each agent changes. When the object $i$ is far from the global optimum, the value of $acc_{i,norm}^{t+1}$ is high, which implies that the object is in the exploration stage; otherwise, it is in the exploitation stage.
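The acceleration update and its normalization can be sketched as follows. The sketch follows Equations (5) to (7) as they are printed in this article; other presentations of AOA may define the denominator differently. Taking the minimum and maximum over the whole population (rather than per dimension) and the guard against a zero span are assumptions of the sketch.

```python
def raw_acceleration(den_src, vol_src, acc_src, den_i, vol_i):
    """Eqs. (5)/(6): the source is a random object (exploration) or the best object (exploitation)."""
    return [(ds + vs * a) / (di + vi)
            for ds, vs, a, di, vi in zip(den_src, vol_src, acc_src, den_i, vol_i)]

def normalize(acc, u=0.9, l=0.1):
    """Eq. (7): rescale all accelerations into [l, u + l] using the population min and max."""
    flat = [a for row in acc for a in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # assumption: avoid division by zero for a degenerate population
    return [[u * (a - lo) / span + l for a in row] for row in acc]

# Hypothetical accelerations of two objects in two dimensions.
acc_norm = normalize([[0.5, 2.0], [1.0, 4.0]])
```

With $u = 0.9$ and $l = 0.1$, the normalized accelerations lie in [0.1, 1.0], so no object ever has a zero step.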
Step 5. When $TF \le 0.5$, the location of the population $X$ is updated using the equation below.
$$
X_i^{t+1} = X_i^{t} + C_1 \times rand \times acc_{i,norm}^{t+1} \times d \times \left(X_{rand} - X_i^{t}\right), \tag{8}
$$
In Equation (8), $C_1$ is a constant equal to 2. Otherwise, when $TF > 0.5$, the location of the population $X$ is updated using Equation (9):
$$
X_i^{t+1} = X_{best}^{t} + F \times C_2 \times rand \times acc_{i,norm}^{t+1} \times d \times \left(T \times X_{best} - X_i^{t}\right), \tag{9}
$$
Here, $C_2$ is a constant equal to 6, and $T = C_3 \times TF$, so $T$ rises with time. The parameter $F$ changes the direction of the movement and is evaluated by Equation (10):
$$
P = 2 \times rand - C_4, \qquad F = \begin{cases} +1, & \text{if } P \le 0.5, \\ -1, & \text{if } P > 0.5, \end{cases} \tag{10}
$$
where $C_3$ and $C_4$ are constants that balance the direction of the movements, adjusting the capability of the model to escape local optima.
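The position update of Step 5 can be sketched in a few lines. The values $C_1 = 2$ and $C_2 = 6$ come from the text; the defaults $C_3 = 2$ and $C_4 = 0.5$ are assumptions, since the article does not state them, and the function names are hypothetical.

```python
import random

def direction_flag(c4, rng):
    """Eq. (10): F = +1 if P <= 0.5, otherwise -1, with P = 2 * rand - C4."""
    p = 2.0 * rng.random() - c4
    return 1.0 if p <= 0.5 else -1.0

def update_position(x_i, x_rand, x_best, acc_norm_i, tf, d, rng,
                    c1=2.0, c2=6.0, c3=2.0, c4=0.5):
    """Eq. (8) when TF <= 0.5 (exploration); Eq. (9) otherwise (exploitation)."""
    if tf <= 0.5:
        return [x + c1 * rng.random() * a * d * (xr - x)
                for x, xr, a in zip(x_i, x_rand, acc_norm_i)]
    big_t = c3 * tf              # T = C3 * TF
    f = direction_flag(c4, rng)  # one direction flag per object (assumption)
    return [xb + f * c2 * rng.random() * a * d * (big_t * xb - x)
            for x, xb, a in zip(x_i, x_best, acc_norm_i)]

rng = random.Random(0)  # illustrative seed
moved = update_position([1.0, 2.0], [5.0, 5.0], [3.0, 3.0], [0.5, 0.5],
                        tf=0.3, d=1.5, rng=rng)
```

When $d$ reaches zero at the end of the run, the exploration update leaves an object where it is and the exploitation update places it exactly on $X_{best}$, which matches the narrowing search described in Step 3.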
Step 6. Evaluation. Based on the updated population, the individual with the optimal fitness is selected together with its acceleration, density, and volume. The procedure is repeated until the maximum number of iterations is reached [31].
The fitness function (FF) of the AOA-FS technique considers both the classification outcome and the number of features selected. It aims to reduce the size of the selected feature set while improving the classification outcome. Hence, the following FF is used for evaluating the individual solutions:
$$
Fitness = \alpha \times ErrorRate + (1 - \alpha) \times \frac{\#SF}{\#All\_F}, \tag{11}
$$
In Equation (11), $ErrorRate$ is the classifier error rate obtained with the selected features. It is computed as the ratio of incorrect classifications to the total number of classifications made and lies in the range [0, 1]. $\#SF$ is the number of features selected and $\#All\_F$ is the total number of features in the original dataset. $\alpha$ controls the relative importance of classification quality and subset length, and is fixed at 0.9 in the current study.
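A worked example makes the trade-off in Equation (11) concrete. The error rate of 0.08 and the feature counts used below are hypothetical values chosen for illustration, not results from this study.

```python
def fs_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Eq. (11): alpha * ErrorRate + (1 - alpha) * (#SF / #All_F); lower is better."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    return alpha * error_rate + (1.0 - alpha) * n_selected / n_total

# Same (hypothetical) error rate with all 20 features versus a subset of 10.
full = fs_fitness(error_rate=0.08, n_selected=20, n_total=20)
subset = fs_fitness(error_rate=0.08, n_selected=10, n_total=20)
```

Here `full` is about 0.172 and `subset` is about 0.122, so at equal error rate the smaller subset is preferred. With $\alpha = 0.9$, one percentage point of error rate weighs as much as removing 9% of the features.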

3.2. Stage II: Churn Prediction Using CNN-AE Model

The CNN-AE model is used for churn prediction. The CNN is a DL method and one of the state-of-the-art techniques for computer vision (CV) applications [32]. Its primary benefit is feature learning, i.e., it can extract and learn relevant features, and its deep architecture allows it to learn from large datasets. Feature extraction is a central and challenging problem in pattern prediction, and the features are essential since they represent the image properties. A component of the encoded vector does not necessarily encode a single feature; the decoding network has a large number of parameters, and combinations of components can encode and reconstruct a vast number of features. Thus, the CNN-AE technique is used to implement unsupervised learning for dimension reduction and feature extraction. Distances between vectors are much faster to compute because the features are projected into a lower-dimensional space. Figure 2 demonstrates the architecture of the CNN-AE model.
The convolutional autoencoder (CAE) has a structure similar to that of a CNN and comprises pooling layers and convolutional filters. The difference between the CNN and the CAE is that the input and output nodes have equal dimensions in the CAE. The reconstructed data are compared with the input data, so the learning method does not rely on a labeled dataset. The CNN-AE is therefore an unsupervised learning method, while the CNN is a DL method with multiple convolutional layers that is primarily used for feature extraction and image processing tasks [33]. The CAE uses a convolution operator to encode the input features and to replicate them at the output with a minimal reconstruction error. With $m$ convolution kernels, the CAE produces $m$ feature maps. The input feature maps are generated from the input layer, where $n$ corresponds to the number of input channels. The hidden representation of the $k$th feature map in the encoder is described by Equation (12), where $\sigma$ denotes the activation function and $*$ indicates the 2D convolution. In the decoder, the reconstruction is described by Equation (13), where $H$ denotes the set of hidden feature maps and $c$ denotes the bias per input channel [34].
$$
h^{k} = \sigma\left(x * W^{k} + b^{k}\right), \tag{12}
$$
$$
y = \sigma\left(\sum_{k \in H} h^{k} * \tilde{W}^{k} + c\right). \tag{13}
$$
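To make Equations (12) and (13) concrete, the following dependency-free sketch runs one forward pass of a single-channel CAE. It is an illustration under several assumptions, not the authors' TensorFlow/Keras implementation: a sigmoid activation, square kernels, no pooling, cross-correlation in place of flipped-kernel convolution (as is common in DL frameworks), and zero "full" padding in the decoder so that the output has the same size as the input.

```python
import math

def conv2d(x, w, pad=0):
    """2D cross-correlation of x with kernel w after zero-padding x by `pad` on every side."""
    kh, kw = len(w), len(w[0])
    h, wd = len(x) + 2 * pad, len(x[0]) + 2 * pad
    xp = [[0.0] * wd for _ in range(h)]
    for i, row in enumerate(x):
        for j, v in enumerate(row):
            xp[i + pad][j + pad] = v
    return [[sum(xp[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
             for j in range(wd - kw + 1)]
            for i in range(h - kh + 1)]

def sigmoid_map(m, bias):
    """Apply sigma(v + bias) element-wise."""
    return [[1.0 / (1.0 + math.exp(-(v + bias))) for v in row] for row in m]

def cae_forward(x, kernels, biases, dec_kernels, c):
    """Eq. (12) for every feature map, then Eq. (13): sum the decoded maps and activate."""
    hidden = [sigmoid_map(conv2d(x, w), b) for w, b in zip(kernels, biases)]
    pad = len(kernels[0]) - 1  # 'full' padding restores the input size (square kernels)
    decoded = [conv2d(h, wt, pad) for h, wt in zip(hidden, dec_kernels)]
    summed = [[sum(d[i][j] for d in decoded) for j in range(len(x[0]))]
              for i in range(len(x))]
    return hidden, sigmoid_map(summed, c)

# Hypothetical 4x4 input and one 3x3 kernel, reused in the decoder for brevity.
x = [[float(i + j) for j in range(4)] for i in range(4)]
k = [[[0.1] * 3 for _ in range(3)]]
hidden, y = cae_forward(x, k, [0.0], k, 0.0)
```

The hidden map is 2 × 2 and the reconstruction `y` is 4 × 4, matching the statement that the input and output of the CAE have equal dimensions. Training would adjust the kernels to minimize the reconstruction error between `y` and `x`.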

3.3. Stage III: Parameter Tuning Using the TEO Method

Finally, the TEO is applied in the current study to fine-tune the hyperparameters of the CNN-AE architecture. Hyperparameter selection is critical for configuring the CNN-AE technique: the chosen hyperparameters considerably affect the effectiveness of the model as well as its capability to capture complex features and to generalize. By applying the TEO technique, the aim is to search the hyperparameter space efficiently and enhance the capabilities of the CNN-AE in the context of CCP within the telecom industry. The TEO method is inspired by the principles of thermal equilibrium in physical systems, which enables a robust exploration of the hyperparameter space. It offers several benefits in the optimization process, particularly in hyperparameter tuning for DL models: it balances the exploration and exploitation phases, and it can navigate complex solution spaces, provide good convergence and solution quality, and be combined with local and global search approaches. This versatility and efficiency make the TEO method a favorable choice for fine-tuning the hyperparameters of architectures such as the CNN-AE for CCP in the telecom industry.
TEO is an optimization technique based on Newton’s law of cooling, which states that the rate of heat loss of an object is directly proportional to the temperature difference between the object and its surrounding environment [35]. In the current research work, some search agents are represented as reference nodes, while others are represented as recognized nodes (cooling objects). Unrecognized nodes, including NLOS nodes, are represented as the environment. The heat exchange between the environment and the cooling objects is mathematically modelled as follows:
$$
T_i^{x,env} = \left(1 - \left(cv_1 + cv_2 \times (1 - NCI)\right) \times rnd\right) \times T_i^{p,env}, \tag{14}
$$
$$
NCI = \frac{CIN}{MaxIter}. \tag{15}
$$
$T_i^{p,env}$ and $T_i^{x,env}$ represent the earlier and the modified temperatures of the environment’s objects, respectively, with $cv_1$ and $cv_2$ being the variables used for controlling the prediction or localization operations [36]. Furthermore, $CIN$ and $MaxIter$ refer to the current and the maximum iteration counts. In addition, the initial phase of the TEO technique updates the temperature of the objects and their surrounding environments as given below.
$$
T_i^{new,env} = T_i^{x,env} + \left(T_i^{old,env} - T_i^{x,env}\right) \times e^{-\beta \times NCI}, \tag{16}
$$
$$
\beta = \frac{Cosine_{NCI}(Ob_i)}{Cosine_{NCI}(Worst\_Obj)}. \tag{17}
$$
Next, the $rnd$ value is compared with the predefined threshold in order to decide whether a single, randomly selected dimension of the $i$th search agent has its value regenerated according to Equation (18):
$$
T_{i,j} = T_{j,Min} + rnd \times \left(T_{j,Max} - T_{j,Min}\right). \tag{18}
$$
In Equation (18), $T_{i,j}$ represents the $j$th variable of the $i$th search agent, with $T_{j,Min}$ and $T_{j,Max}$ indicating the lower and upper bounds of the $j$th variable, respectively [37]. Fitness selection is an essential component of the TEO methodology. An encoded solution is used to evaluate each candidate solution, and the accuracy value is the main criterion used in designing the FF.
$$
Fitness = \max\left(\frac{TP}{TP + FP}\right). \tag{19}
$$
Here, the true and false positive values are denoted by $TP$ and $FP$, respectively. As written, the ratio in Equation (19) coincides with the precision measure defined in Equation (20).
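The three TEO update rules can be sketched as small Python functions. The function names are hypothetical, $\beta$ is passed in as a precomputed ratio (Equation (17)), and the negative sign in the exponent of Equation (16) is assumed from Newton's law of cooling because signs were not recoverable from the extracted text. Drawing one random number per dimension in Equation (14) is a further assumption.

```python
import math
import random

def environment_temperature(t_prev_env, nci, cv1, cv2, rng):
    """Eq. (14): perturb the previous environment temperature; the perturbation shrinks as NCI -> 1."""
    return [(1.0 - (cv1 + cv2 * (1.0 - nci)) * rng.random()) * t for t in t_prev_env]

def cool_object(t_old, t_env, beta, nci):
    """Eq. (16): cooling of an object towards its environment temperature."""
    decay = math.exp(-beta * nci)
    return [te + (to - te) * decay for to, te in zip(t_old, t_env)]

def reset_dimension(t_i, t_min, t_max, rng):
    """Eq. (18): re-sample one randomly chosen dimension uniformly within its bounds."""
    j = rng.randrange(len(t_i))
    out = list(t_i)
    out[j] = t_min[j] + rng.random() * (t_max[j] - t_min[j])
    return out

rng = random.Random(3)                 # illustrative seed
nci = 40 / 100                         # Eq. (15) with CIN = 40, MaxIter = 100 (hypothetical)
env = environment_temperature([2.0, 4.0], nci, cv1=1.0, cv2=1.0, rng=rng)
cooled = cool_object([3.0, 5.0], env, beta=0.8, nci=nci)
```

With $\beta = 0$ an object keeps its old temperature, whereas a large $\beta \times NCI$ moves it almost exactly onto the environment temperature, so better (lower-$\beta$) objects change less.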

4. Results and Discussion

The proposed method was implemented and validated in Python 3.8.5 on a PC with an i5-8600k CPU, a GeForce 1050Ti 4 GB GPU, 16 GB RAM, a 250 GB SSD, and a 1 TB HDD. Several Python packages were used, namely opencv-python, numpy, matplotlib, tensorflow (GPU-CUDA enabled), keras, pickle, sklearn, and pillow. The CCP performance of the AOAFS-HDLCP technique was investigated using the Customer Churn Prediction: Telecom Churn Dataset [38], which includes 3333 data instances with 21 attributes, as described in Table 1. The dataset was downloaded from the Kaggle repository.
The measures used for examining the classification outcomes are accuracy ($accu_y$), precision ($prec_n$), recall ($reca_l$), and F-score ($F_{score}$).
$$
Prec_n = \frac{TP}{TP + FP} \tag{20}
$$
Precision measures the proportion of true positive instances among all the instances that are predicted as positive.
$$
Reca_l = \frac{TP}{TP + FN} \tag{21}
$$
Recall measures the proportion of actual positive samples that are correctly classified.
$$
Accu_y = \frac{TP + TN}{TP + TN + FP + FN} \tag{22}
$$
Accuracy measures the proportion of correctly classified samples (positive and negative) among all the samples classified.
$$
F_{score} = \frac{2TP}{2TP + FP + FN} \tag{23}
$$
The F-score is the harmonic mean of $prec_n$ and $reca_l$.
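The four measures can be computed directly from the entries of a binary confusion matrix. The counts in the sketch below are hypothetical and serve only to check the formulas, including the identity between Equation (23) and the harmonic mean of precision and recall.

```python
def classification_metrics(tp, tn, fp, fn):
    """Eqs. (20)-(23) from the four entries of a binary confusion matrix."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f_score": 2 * tp / (2 * tp + fp + fn),
    }

# Hypothetical counts, not taken from the article's confusion matrices.
m = classification_metrics(tp=40, tn=50, fp=10, fn=0)
harmonic = 2 * m["precision"] * m["recall"] / (m["precision"] + m["recall"])
```

For these counts, precision is 0.8, recall is 1.0, accuracy is 0.9, and both `m["f_score"]` and `harmonic` are 8/9, which is approximately 0.889.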
The confusion matrices generated by the AOAFS-HDLCP method on the 90:10 and 80:20 training set/testing set (TRS/TSS) splits are shown in Figure 3. The outcomes show that the proposed model recognizes the churn and non-churn samples effectively on all the class labels.
The CCP outcomes of the AOAFS-HDLCP method under the 90:10 and 80:20 TRS/TSS splits are shown in Table 2. The simulation values demonstrate that the AOAFS-HDLCP method categorized the churn and non-churn samples effectively. With 90% TRS, the AOAFS-HDLCP model provided an average $accu_y$ of 93.58%, $prec_n$ of 96.63%, $reca_l$ of 93.58%, $F_{score}$ of 95.03%, and an $AUC_{score}$ of 93.58%. In addition, with 10% TSS, the AOAFS-HDLCP technique offered an average $accu_y$ of 90.59%, $prec_n$ of 94.89%, $reca_l$ of 90.59%, $F_{score}$ of 92.59%, and an $AUC_{score}$ of 90.59%. Also, with 80% TRS, the AOAFS-HDLCP model yielded an average $accu_y$ of 90.62%, $prec_n$ of 93.88%, $reca_l$ of 90.62%, $F_{score}$ of 92.15%, and an $AUC_{score}$ of 90.62%. Finally, with 20% TSS, the AOAFS-HDLCP method accomplished an average $accu_y$ of 92.01%, $prec_n$ of 94.34%, $reca_l$ of 92.01%, $F_{score}$ of 93.13%, and an $AUC_{score}$ of 92.01%.
The confusion matrices generated by the AOAFS-HDLCP system on 60:40 and 70:30 TRS/TSS datasets are illustrated in Figure 4. The outcomes indicate the effectual prediction of the proposed model in terms of churn and non-churn samples under all the classes.
The CCP outcomes of the AOAFS-HDLCP system at 60:40 and 70:30 TRS/TSS datasets are shown in Table 3. The achieved outcomes indicate that the proposed AOAFS-HDLCP technique categorized the churn and non-churn samples in an effective manner. With 60% TRS, the AOAFS-HDLCP method provided an average $accu_y$ of 87.18%, $prec_n$ of 96.83%, $reca_l$ of 87.18%, $F_{score}$ of 91.21%, and an $AUC_{score}$ of 87.18%. In addition, with 40% TSS, the AOAFS-HDLCP method yielded an average $accu_y$ of 91.58%, $prec_n$ of 97.70%, $reca_l$ of 91.58%, $F_{score}$ of 94.33%, and an $AUC_{score}$ of 91.58%. Also, with 70% TRS, the AOAFS-HDLCP method produced an average $accu_y$ of 93.09%, $prec_n$ of 96.64%, $reca_l$ of 93.09%, $F_{score}$ of 94.76%, and an $AUC_{score}$ of 93.08%. Finally, with 30% TSS, the AOAFS-HDLCP method accomplished an average $accu_y$ of 94.65%, $prec_n$ of 96.92%, $reca_l$ of 94.65%, $F_{score}$ of 95.74%, and an $AUC_{score}$ of 94.65%.
The $TR\_accu_y$ and $VL\_accu_y$ outcomes of the AOAFS-HDLCP methodology for the 70:30 TRS/TSS dataset are illustrated in Figure 5. $TR\_accu_y$ is obtained by evaluating the AOAFS-HDLCP system on the TR data, while $VL\_accu_y$ is obtained by evaluating the proposed method on the test data. The simulation values show that both $TR\_accu_y$ and $VL\_accu_y$ increase with the number of epochs. Hence, the effectiveness of the AOAFS-HDLCP method improves on both the TR and TS data as the number of epochs increases.
The $TR\_loss$ and $VR\_loss$ outcomes of the AOAFS-HDLCP model under the 70:30 TRS/TSS dataset are shown in Figure 6. $TR\_loss$ represents the error between the predictions and the original values on the TR dataset, while $VR\_loss$ represents the corresponding error of the AOAFS-HDLCP method on the validation dataset. The simulation values demonstrate that both $TR\_loss$ and $VR\_loss$ tend to decrease as the number of epochs increases, which indicates the ability of the AOAFS-HDLCP algorithm to produce accurate classifications. The low $TR\_loss$ and $VR\_loss$ values reveal the high efficiency of the AOAFS-HDLCP system in capturing patterns and correlations.
A precision-recall (PR) analysis of the AOAFS-HDLCP model on the 70:30 TRS/TSS dataset was conducted, and the results are shown in Figure 7. The simulation values indicate that the AOAFS-HDLCP approach produced high PR values and attained high PR performance in all the classes.
Figure 8 shows the ROC curve achieved by the AOAFS-HDLCP algorithm on the 70:30 TRS/TSS dataset. The figure indicates that the AOAFS-HDLCP system achieved high ROC values. The curve illustrates the tradeoff between the true positive rate (TPR) and the false positive rate (FPR) across decision thresholds and thus summarizes how well the presented technique separates the classes.
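As a small worked example of the TPR/FPR tradeoff, the sketch below computes the area under a ROC curve that has a single operating point, i.e., the curve of a classifier that outputs hard labels. In that special case the area equals the mean of the two class recalls. How the reported $AUC_{score}$ was computed is not stated in this section, so this is an assumption; the inputs are the churn and non-churn recall values from Table 3 for the 30% testing phase.

```python
def trapezoid_auc(points):
    """Area under a piecewise-linear curve given (fpr, tpr) points sorted by fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def single_point_auc(tpr, fpr):
    """AUC of the ROC curve (0,0) -> (fpr,tpr) -> (1,1) of a hard-label classifier."""
    return trapezoid_auc([(0.0, 0.0), (fpr, tpr), (1.0, 1.0)])


# Recall values from Table 3, testing phase (30%).
tpr = 0.9000  # recall on the churn (positive) class
tnr = 0.9930  # recall on the non-churn (negative) class
fpr = 1.0 - tnr

auc = single_point_auc(tpr, fpr)
balanced_accuracy = (tpr + tnr) / 2.0
print(f"AUC = {auc:.4f}, balanced accuracy = {balanced_accuracy:.4f}")
```

The result, 0.9465, coincides with the 94.65% average reported in Table 3, which is consistent with (though not proof of) a single-operating-point computation.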
Table 4 compares the proposed AOAFS-HDLCP method with existing methods [20,39,40]. The experimental values indicate that the DT and LR models exhibited the poorest results, whereas the SVM, SGD, and RMSProp approaches achieved moderately better performance.

5. Conclusions

This study introduced the AOAFS-HDLCP technique for churn prediction in the telecom industry, with the aim of supporting customer retention. The technique combines three components: the AOAFS approach, which selects an optimal set of features; the CNN-AE model, which performs the churn prediction; and the TEO technique, which tunes the hyperparameters of the CNN-AE model. An extensive experimental analysis showed that the AOAFS-HDLCP approach outperforms the compared techniques, reaching a maximum accuracy of 94.65%. Future studies can focus on outlier removal and on handling class-imbalanced data.
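To make the class-imbalance issue concrete, the sketch below quantifies it from the class counts in Table 1. The majority-class baseline and the inverse-frequency class weights are standard illustrations and are not part of the AOAFS-HDLCP technique.

```python
# Class counts copied from Table 1.
class_counts = {"Churn": 483, "Non-Churn": 2850}
total = sum(class_counts.values())

# Share of each class in the dataset.
shares = {name: count / total for name, count in class_counts.items()}

# Number of non-churn samples per churn sample.
imbalance_ratio = class_counts["Non-Churn"] / class_counts["Churn"]

# Accuracy of a trivial model that always predicts the majority class.
majority_baseline = max(shares.values())

# Inverse-frequency class weights: n_samples / (n_classes * class_count).
# A common heuristic for imbalanced data, shown here only as an illustration.
class_weights = {
    name: total / (len(class_counts) * count)
    for name, count in class_counts.items()
}

print(f"churn share = {shares['Churn']:.2%}, imbalance ratio = {imbalance_ratio:.2f}")
print(f"majority-class baseline accuracy = {majority_baseline:.2%}")
print({name: round(weight, 3) for name, weight in class_weights.items()})
```

Because always predicting "Non-Churn" already scores about 85.5% plain accuracy on this dataset, per-class recall and F-score are more informative than overall accuracy when judging a churn model.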

Author Contributions

Conceptualization, H.A.M. and N.A.; methodology, H.A.M.; software, C.S.; validation, N.A., F.K. and A.M.; formal analysis, A.M.; investigation, H.A.M.; resources, E.S.A.E.; data curation, E.S.A.E.; writing—original draft preparation, H.A.M., N.A., F.K., C.S. and A.M.; writing—review and editing, E.S.A.E.; visualization, E.S.A.E.; supervision, H.A.M.; project administration, H.A.M.; funding acquisition, H.A.M., F.K. and N.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Large Group Research Project under grant number (RGP2/48/44); to the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number (PNURSP2023R114), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia; and to the Research Supporting Project, number (RSPD2023R608), King Saud University, Riyadh, Saudi Arabia. This study was partially funded by the Future University in Egypt (FUE).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data sharing does not apply to this article as no datasets were generated during the current study.

Conflicts of Interest

The authors declare that they have no conflicts of interest. The manuscript was written through the contributions of all authors. All authors have approved the final version of the manuscript.

References

  1. Saha, L.; Tripathy, H.K.; Gaber, T.; El-Gohary, H.; El-Kenawy, E.-S.M. Deep Churn Prediction Method for Telecommunication Industry. Sustainability 2023, 15, 4543.
  2. Amin, A.; Adnan, A.; Anwar, S. An adaptive learning approach for customer churn prediction in the telecommunication industry using evolutionary computation and Naïve Bayes. Appl. Soft Comput. 2023, 137, 110103.
  3. Abdulsalam, S.O.; Arowolo, M.O.; Saheed, Y.K.; Afolayan, J.O. Customer Churn Prediction in Telecommunication Industry Using Classification and Regression Trees and Artificial Neural Network Algorithms. Indones. J. Electr. Eng. Inform. (IJEEI) 2022, 10, 431–440.
  4. Singh, K.D.; Singh, P.D.; Bansal, A.; Kaur, G.; Khullar, V.; Tripathi, V. Exploratory Data Analysis and Customer Churn Prediction for the Telecommunication Industry. In Proceedings of the 3rd International Conference on Advances in Computing, Communication, Embedded and Secure Systems, Kochi, India, 18 May 2023; IEEE: Piscataway, NJ, USA; pp. 197–201.
  5. Teoh, J.S.; Samad, B.S.A. Developing Machine Learning and Deep Learning Models for Customer Churn Prediction in the Telecommunication Industry. In Proceedings of the International Conference on Artificial Life and Robotics; ALife Robotics Co., Ltd.: Oita, Japan, 2022; Volume 27, pp. 533–539.
  6. Gupta, V.; Jatain, A. Artificial Intelligence-Based Predictive Analysis of Customer Churn. Formosa J. Comput. Inf. Sci. 2023, 2, 95–110.
  7. Ramesh, P.; Emilyn, J.J.; Vijayakumar, V. Hybrid Artificial Neural Networks Using Customer Churn Prediction. Wirel. Pers. Commun. 2022, 124, 1695–1709.
  8. Samuel, A.I.; David, M.; Salihu, B.A.; Usman, A.U.; Abdullahi, I.M. Pastoralist Optimization Algorithm Approach for Improved Customer Churn Prediction in the Telecom Industry; Schools of Engineering Technology, Federal University of Technology Minna: Minna, Nigeria, 2023.
  9. Patil, K.; Patil, S.; Danve, R.; Patil, R. Machine Learning and Neural Network Models for Customer Churn Prediction in Banking and Telecom Sectors. In Proceedings of Second International Conference on Advances in Computer Engineering and Communication Systems, ICACECS 2021; Springer Nature: Singapore, 2022; pp. 241–253.
  10. Eltamaly, A.M.; Rabie, A.H. A Novel Musical Chairs Optimization Algorithm. Arab. J. Sci. Eng. 2023, 48, 10371–10403.
  11. Wang, G.G.; Deb, S.; Cui, Z. Monarch butterfly optimization. Neural Comput. Appl. 2019, 31, 1995–2014.
  12. Li, S.; Chen, H.; Wang, M.; Heidari, A.A.; Mirjalili, S. Slime mould algorithm: A new method for stochastic optimization. Future Gener. Comput. Syst. 2020, 111, 300–323.
  13. Wang, G.-G. Moth search algorithm: A bio-inspired metaheuristic algorithm for global optimization problems. Memetic Comput. 2018, 10, 151–164.
  14. Yang, Y.; Chen, H.; Heidari, A.A.; Gandomi, A.H. Hunger games search: Visions, conception, implementation, deep analysis, perspectives, and towards performance shifts. Expert Syst. Appl. 2021, 177, 114864.
  15. Butcher, J.C. On the implementation of implicit Runge-Kutta methods. BIT Numer. Math. 1976, 16, 237–240.
  16. Tu, J.; Chen, H.; Wang, M.; Gandomi, A.H. The Colony Predation Algorithm. J. Bionic Eng. 2021, 18, 674–710.
  17. Ahmadianfar, I.; Asghar Heidari, A.; Noshadian, S.; Chen, H.; Gandomi, A.H. INFO: An Efficient Optimization Algorithm based on Weighted Mean of Vectors. Expert Syst. Appl. 2022, 195, 116516.
  18. Heidari, A.A.; Mirjalili, S.; Faris, H.; Aljarah, I.; Mafarja, M.; Chen, H. Harris hawks optimization: Algorithm and applications. Future Gener. Comput. Syst. 2019, 97, 849–872.
  19. Su, H.; Zhao, D.; Heidari, A.A.; Liu, L.; Zhang, X.; Mafarja, M.; Chen, H. RIME: A physics-based optimization. Neurocomputing 2023, 532, 183–214.
  20. Abdullaev, I.; Prodanova, N.; Ahmed, M.A.; Lydia, E.L.; Shrestha, B.; Joshi, G.P.; Cho, W. Leveraging metaheuristics with artificial intelligence for customer churn prediction in telecom industries. Electron. Res. Arch. 2023, 31, 4443–4458.
  21. Kozak, J.; Kania, K.; Juszczuk, P.; Mitręga, M. Swarm intelligence goal-oriented approach to data-driven innovation in customer churn management. Int. J. Inf. Manag. 2021, 60, 102357.
  22. Pustokhina, I.V.; Pustokhin, D.A.; Rh, A.; Jayasankar, T.; Jeyalakshmi, C.; Díaz, V.G.; Shankar, K. Dynamic customer churn prediction strategy for business intelligence using text analytics with evolutionary optimization algorithms. Inf. Process. Manag. 2021, 58, 102706.
  23. Banu, J.F.; Neelakandan, S.; Geetha, B.; Selvalakshmi, V.; Umadevi, A.; Martinson, E.O. Artificial Intelligence Based Customer Churn Prediction Model for Business Markets. Comput. Intell. Neurosci. 2022, 2022, 1–14.
  24. Jajam, N.; Challa, N.P.; Prasanna, K.S.L.; Deepthi, C.H.V.S. Arithmetic Optimization with Ensemble Deep Learning SBLSTM-RNN-IGSA Model for Customer Churn Prediction. IEEE Access 2023, 11, 93111–93128.
  25. Pandithurai, O.; Ahmed, H.H.; Sriman, B.; Seetha, R. Telecom Customer Churn Prediction Using Supervised Machine Learning Techniques. In Proceedings of the International Conference on Advances in Computing, Communication and Applied Informatics (ACCAI), Chennai, India, 25–26 May 2023; IEEE: Piscataway, NJ, USA, 2023.
  26. Alshamari, M.A. Evaluating User Satisfaction Using Deep-Learning-Based Sentiment Analysis for Social Media Data in Saudi Arabia’s Telecommunication Sector. Computers 2023, 12, 170.
  27. Hashim, F.A.; Hussain, K.; Houssein, E.H.; Mabrouk, M.S.; Al-Atabany, W. Archimedes optimization algorithm: A new metaheuristic algorithm for solving optimization problems. Appl. Intell. 2021, 51, 1531–1551.
  28. Houssein, E.H.; Helmy, B.E.-D.; Rezk, H.; Nassef, A.M. An enhanced Archimedes optimization algorithm based on Local escaping operator and Orthogonal learning for PEM fuel cell parameter identification. Eng. Appl. Artif. Intell. 2021, 103, 104309.
  29. Desuky, A.S.; Hussain, S.; Kausar, S.; Islam, A.; El Bakrawy, L.M. EAOA: An Enhanced Archimedes Optimization Algorithm for Feature Selection in Classification. IEEE Access 2021, 9, 120795–120814.
  30. Zhang, L.; Wang, J.; Niu, X.; Liu, Z. Ensemble wind speed forecasting with multi-objective Archimedes optimization algorithm and sub-model selection. Appl. Energy 2021, 301, 117449.
  31. Saponara, S.; Elhanashi, A.; Zheng, Q. Recreating Fingerprint Images by Convolutional Neural Network Autoencoder Architecture. IEEE Access 2021, 9, 147888–147899.
  32. Bedi, P.; Gole, P. Plant disease detection using hybrid model based on convolutional autoencoder and convolutional neural network. Artif. Intell. Agric. 2021, 5, 90–101.
  33. Wen, T.; Zhang, Z. Deep Convolution Neural Network and Autoencoders-Based Unsupervised Feature Learning of EEG Signals. IEEE Access 2018, 6, 25399–25410.
  34. Khan, S. Short-Term Electricity Load Forecasting Using a New Intelligence-Based Application. Sustainability 2023, 15, 12311.
  35. Yue, G.; Hong, S.; Liu, S.-H. Process hazard assessment of energetic ionic liquid with kinetic evaluation and thermal equilibrium. J. Loss Prev. Process. Ind. 2023, 81, 104972.
  36. Liu, S.; Ahmadi-Senichault, A.; Levet, C.; Lachaud, J. Experimental investigation on the validity of the local thermal equilibrium assumption in ablative-material response models. Aerosp. Sci. Technol. 2023, 141, 108516.
  37. Available online: https://www.kaggle.com/code/mnassrib/customer-churn-prediction-telecom-churn-dataset/notebook (accessed on 12 June 2023).
  38. Lalwani, P.; Mishra, M.K.; Chadha, J.S.; Sethi, P. Customer churn prediction system: A machine learning approach. Computing 2021, 104, 271–294.
  39. Pustokhina, I.V.; Pustokhin, D.A.; Nguyen, P.T.; Elhoseny, M.; Shankar, K. Multi-objective rain optimization algorithm with WELM model for customer churn prediction in telecommunication sector. Complex Intell. Syst. 2021, 9, 3473–3485.
  40. Dalli, A. Impact of Hyperparameters on Deep Learning Model for Customer Churn Prediction in Telecommunication Sector. Math. Probl. Eng. 2022, 2022, 1–11.
Figure 1. Overall procedure of the AOAFS-HDLCP system.
Figure 2. Structure of CNN-AE.
Figure 3. Confusion matrices of (a,b) 90:10 of TR set (TRS)/TS set (TSS) and (c,d) 80:20 of TRS/TSS.
Figure 4. Confusion matrices of (a,b) 60:40 of TRS/TSS and (c,d) 70:30 of TRS/TSS.
Figure 5. $Accu_y$ curve of AOAFS-HDLCP method on 70:30 of TRS/TSS.
Figure 6. Loss curve of the AOAFS-HDLCP model under 70:30 of TRS/TSS.
Figure 7. PR analysis of the AOAFS-HDLCP methodology under 70:30 of TRS/TSS.
Figure 8. ROC of AOAFS-HDLCP model under 70:30 of TRS/TSS.
Table 1. Details of the database.

| Class | No. of Samples |
|---|---|
| Churn | 483 |
| Non-Churn | 2850 |
| Total Samples | 3333 |

Table 2. CCP outcomes of the AOAFS-HDLCP method on 90:10 and 80:20 of TRS/TSS datasets.

| Class | $Accu_y$ | $Prec_n$ | $Reca_l$ | $F_{score}$ | $AUC_{score}$ |
|---|---|---|---|---|---|
| Training Phase (90%) | | | | | |
| Churn | 87.90 | 95.30 | 87.90 | 91.45 | 93.58 |
| Non-Churn | 99.26 | 97.96 | 99.26 | 98.60 | 93.58 |
| Average | 93.58 | 96.63 | 93.58 | 95.03 | 93.58 |
| Testing Phase (10%) | | | | | |
| Churn | 82.22 | 92.50 | 82.22 | 87.06 | 90.59 |
| Non-Churn | 98.96 | 97.28 | 98.96 | 98.11 | 90.59 |
| Average | 90.59 | 94.89 | 90.59 | 92.59 | 90.59 |
| Training Phase (80%) | | | | | |
| Churn | 82.69 | 90.65 | 82.69 | 86.49 | 90.62 |
| Non-Churn | 98.55 | 97.10 | 98.55 | 97.82 | 90.62 |
| Average | 90.62 | 93.88 | 90.62 | 92.15 | 90.62 |
| Testing Phase (20%) | | | | | |
| Churn | 85.42 | 91.11 | 85.42 | 88.17 | 92.01 |
| Non-Churn | 98.60 | 97.57 | 98.60 | 98.08 | 92.01 |
| Average | 92.01 | 94.34 | 92.01 | 93.13 | 92.01 |

Table 3. CCP outcomes of the AOAFS-HDLCP method on 60:40 and 70:30 of TRS/TSS datasets.

| Class | $Accu_y$ | $Prec_n$ | $Reca_l$ | $F_{score}$ | $AUC_{score}$ |
|---|---|---|---|---|---|
| Training Phase (60%) | | | | | |
| Churn | 74.65 | 97.70 | 74.65 | 84.63 | 87.18 |
| Non-Churn | 99.71 | 95.96 | 99.71 | 97.80 | 87.18 |
| Average | 87.18 | 96.83 | 87.18 | 91.21 | 87.18 |
| Testing Phase (40%) | | | | | |
| Churn | 83.42 | 98.22 | 83.42 | 90.22 | 91.58 |
| Non-Churn | 99.74 | 97.17 | 99.74 | 98.43 | 91.58 |
| Average | 91.58 | 97.70 | 91.58 | 94.33 | 91.58 |
| Training Phase (70%) | | | | | |
| Churn | 86.88 | 95.51 | 86.88 | 90.99 | 93.09 |
| Non-Churn | 99.30 | 97.77 | 99.30 | 98.53 | 93.09 |
| Average | 93.09 | 96.64 | 93.09 | 94.76 | 93.09 |
| Testing Phase (30%) | | | | | |
| Churn | 90.00 | 95.45 | 90.00 | 92.65 | 94.65 |
| Non-Churn | 99.30 | 98.39 | 99.30 | 98.84 | 94.65 |
| Average | 94.65 | 96.92 | 94.65 | 95.74 | 94.65 |

Table 4. Comparison analysis outcomes of the AOAFS-HDLCP technique with other approaches [20,39,40].

| Methods | $Accu_y$ | $Prec_n$ | $Reca_l$ | $F_{score}$ | $AUC_{score}$ |
|---|---|---|---|---|---|
| AOAFS-HDLCP | 94.65 | 96.92 | 94.65 | 95.74 | 94.65 |
| AIJOA-CPDE | 91.28 | 95.52 | 91.29 | 94.08 | 91.29 |
| Logistic Regression | 80.53 | 79.31 | 80.44 | 79.05 | 82.18 |
| Decision Tree | 76.67 | 56.78 | 75.68 | 64.97 | 78.25 |
| ISMOTE-OWELM | 90.48 | 91.65 | 89.39 | 89.64 | 89.85 |
| SVM Model | 84.29 | 84.54 | 83.99 | 85.59 | 83.98 |
| SGD Model | 84.41 | 86.10 | 85.81 | 84.32 | 84.80 |
| RMSProp Model | 87.35 | 85.18 | 85.19 | 85.07 | 86.27 |

As Table 4 shows, the AIJOA-CPDE approach delivered reasonable outcomes, with an $accu_y$ of 91.28%, $prec_n$ of 95.52%, $reca_l$ of 91.29%, $F_{score}$ of 94.08%, and $AUC_{score}$ of 91.29%. However, the AOAFS-HDLCP technique achieved the best performance, with an $accu_y$ of 94.65%, $prec_n$ of 96.92%, $reca_l$ of 94.65%, $F_{score}$ of 95.74%, and $AUC_{score}$ of 94.65%. Therefore, the AOAFS-HDLCP technique can be applied for an accurate CCP process.
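The size of the improvement can be read directly from Table 4. The short sketch below, using only the accuracy column of that table, computes the margin of AOAFS-HDLCP over each compared method in percentage points; the variable names are illustrative.

```python
# Accuracy values (%) copied from Table 4.
accuracy = {
    "AOAFS-HDLCP": 94.65,
    "AIJOA-CPDE": 91.28,
    "Logistic Regression": 80.53,
    "Decision Tree": 76.67,
    "ISMOTE-OWELM": 90.48,
    "SVM Model": 84.29,
    "SGD Model": 84.41,
    "RMSProp Model": 87.35,
}

proposed = "AOAFS-HDLCP"

# Margin of the proposed method over each compared method, in percentage points.
margins = {
    name: round(accuracy[proposed] - value, 2)
    for name, value in accuracy.items()
    if name != proposed
}

# List the compared methods from the closest competitor to the weakest.
for name, gap in sorted(margins.items(), key=lambda item: item[1]):
    print(f"{name}: +{gap:.2f} percentage points")
```

The closest competitor is AIJOA-CPDE, at 3.37 percentage points below the proposed method, while the Decision Tree trails by 17.98 percentage points.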
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Mengash, H.A.; Alruwais, N.; Kouki, F.; Singla, C.; Abd Elhameed, E.S.; Mahmud, A. Archimedes Optimization Algorithm-Based Feature Selection with Hybrid Deep-Learning-Based Churn Prediction in Telecom Industries. Biomimetics 2024, 9, 1. https://doi.org/10.3390/biomimetics9010001

Chicago/Turabian Style

Mengash, Hanan Abdullah, Nuha Alruwais, Fadoua Kouki, Chinu Singla, Elmouez Samir Abd Elhameed, and Ahmed Mahmud. 2024. "Archimedes Optimization Algorithm-Based Feature Selection with Hybrid Deep-Learning-Based Churn Prediction in Telecom Industries" Biomimetics 9, no. 1: 1. https://doi.org/10.3390/biomimetics9010001
