Machine Learning-Based Adaptive Synthetic Sampling Technique for Intrusion Detection

Zakariah, Mohammed; AlQahtani, Salman A.; Al-Rakhami, Mabrook S.

doi:10.3390/app13116504


by Mohammed Zakariah ¹, Salman A. AlQahtani ²,* and Mabrook S. Al-Rakhami ³

¹ Department of Computer Science, College of Computer and Information Science, King Saud University, Riyadh 11495, Saudi Arabia

² New Emerging Technologies and 5G Network and Beyond Research Chair, Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia

³ Department of Information Systems, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia

* Author to whom correspondence should be addressed.

Appl. Sci. 2023, 13(11), 6504; https://doi.org/10.3390/app13116504

Submission received: 3 May 2023 / Revised: 21 May 2023 / Accepted: 23 May 2023 / Published: 26 May 2023


Abstract:

Traditional firewalls and data encryption techniques can no longer meet the demands of current IoT network security because of the rising number and variety of network threats. Intrusion detection solutions have been proposed to manage IoT network risks. Although machine learning (ML) supports the intrusion detection techniques in wide use today, these algorithms struggle with low detection rates and the requirement for extensive feature engineering. This study proposes a deep learning model for IoT network intrusion detection as a method for traffic anomaly detection. It combines an attention mechanism with a Long Short-Term Memory (LSTM) network to extract the sequence properties of data flow through a CNN. The method uses adaptive synthetic sampling (ADASYN) to increase the number of minority-class samples. The proposed models demonstrated acceptable precision and recall for each class in binary classification, indicating their stability and capacity to identify all classes correctly. The MLP classifier’s accuracy, precision, recall, and F1 value were 87%, 89%, 87%, and 89%, respectively, with an AUC score of 0.88. Overall, the proposed models performed well. The attack and all-class models exhibited good AUCs and macro metrics, as did the proposed MLP classifier, which had an F1 score of 83% and an AUC score of 0.94. Additionally, the MLP classifier was trained with the ADAM optimizer and the categorical cross-entropy loss function for all-class classification. With an AUC value of 94%, it achieved 84% accuracy, 87% precision, 84% recall, and an 83% F1 score. The hybrid model regularly outperformed the MLP model, a further indication of its ability to combine the benefits of both models to improve overall performance. 
This model’s accuracy and F1 score are better than those of earlier comparable algorithms, according to experimental results using the publicly accessible benchmark dataset for network intrusion detection (NSL–KDD).

Keywords:

network intrusion detection; NSL–KDD; deep learning; adaptive synthetic sampling; network security; machine learning; classification

1. Introduction

The number of services that are available to consumers globally has increased due to internet access as a result of the quick expansion of communications and computing infrastructure. The Internet of Things (IoT) has dramatically altered how devices communicate with one another and with humans. However, the quantity and variety of cyberattacks, including malicious attacks, malicious eavesdropping, IoT network viruses, and others, are gravely threatening the protection of people’s information and the preservation of their property. As a result, communication and information IoT security are now essential for both people and society as a whole [1]. One primary type of IoT security which is frequently used is firewalls. However, in units that need strong protection (such as state buildings, etc.), it is no longer appropriate because of the problems with personal settings and the latency for new types of crimes [2]. The need for robust network security measures has become critical in recent years, as the number of connected devices has increased. While traditional security measures such as firewalls and data encryption techniques remain useful, they have proven ineffective against the evolving and sophisticated nature of network threats [3]. This has resulted in the development of more sophisticated security solutions, such as intrusion detection systems (IDSs), which are specifically designed to detect and prevent malicious network activity. IDSs work by analyzing network traffic and detecting patterns of behavior that are indicative of potential security threats such as unauthorized access attempts, malware infections, and data breaches [4,5].

Because of the increasing complexity and diversity of network threats, security experts are finding it increasingly difficult to detect and respond to these incidents manually. IDSs help to alleviate this problem by automating the threat detection and response process with machine learning algorithms and artificial intelligence (AI). Furthermore, IDSs are capable of detecting new and emerging threats that traditional security measures may miss [6]. They can also provide valuable insights into network traffic patterns and potential vulnerabilities, allowing network administrators to take proactive measures to prevent security incidents from occurring.

Machine learning (ML) has gained popularity in intrusion detection systems (IDSs) due to its ability to analyze large amounts of data and detect patterns that traditional methods may not be able to detect. ML-based intrusion detection systems (IDSs) use algorithms to learn from data and identify suspicious behavior, which can improve detection rates and response times [7]. However, ML-based intrusion detection systems face a number of challenges, including low detection rates and the need for extensive feature engineering. Low detection rates can occur when the IDS is not properly trained or when the data being analyzed do not accurately represent real-world scenarios. Furthermore, ML-based intrusion detection systems require large amounts of data to learn effectively, which can be costly and time-consuming to acquire [8]. The process of selecting and extracting relevant features from data that can be used to train the ML algorithm is known as feature engineering. In intrusion detection systems, feature engineering entails selecting and extracting relevant network traffic data to train the ML algorithm to detect potential security threats. This process, however, can be difficult and requires expert knowledge of both network security and machine learning techniques. To address these challenges, researchers have developed a variety of techniques to improve the performance of ML-based intrusion detection systems. Deep learning, for example, can automatically extract relevant features from raw data, reducing the need for extensive feature engineering. To improve detection rates, another approach is to use hybrid systems that combine different ML algorithms, such as supervised and unsupervised learning [9]. Furthermore, it is critical to ensure that the data being analyzed are representative of real-world scenarios in order to improve the performance of ML-based IDSs. 
This can be accomplished by employing data augmentation techniques, such as the generation of synthetic data or the use of multiple data sources, to produce a more diverse dataset.

Intrusion detection typically uses ML techniques, but ML algorithms frequently depend heavily on feature selection. Massive intrusion data therefore present a supervised learning challenge, since high-dimensional data cause feature selection problems that lead to subpar detection performance [10]. DL has advanced significantly in computer vision, natural language processing, and other fields, and it is now being applied more widely across AI; DL methods for IDS, for example, have recently become more popular. DL techniques can autonomously identify high-level latent information without human intervention, in contrast to chi-square feature selection and PCA, which depend mainly on human expertise and manual feature extraction [11]. Furthermore, traditional ML technology has limited capacity to cope with the high dimensionality, massive volume, and complex structure of network activity. Learning non-linear dynamic correlations in massive data therefore remains an open problem.

Figure 1 displays IDS classification methods divided into data sources and detection methods. Anomaly detection, abuse detection, IoT network-based IDS, and host-based IDS are some of the further divisions. Among these source-based methodologies, additional detection techniques are divided as classification components. More broadly, the log-based, flow-based, session-based, and packet-based detection methods all use the ML methodology.

Conventional ML methods such as NB, decision trees, SVM, and others try to work past the limitations imposed by detection methods. However, these systems require professional involvement and experience, which significantly improve detection accuracy, in order to interpret enormous amounts of data. These techniques, usually referred to as shallow classifiers, look for algorithms that enable computers to learn on their own without help from humans. However, as more features are added, they may eventually not function as well for multiclass problems. Self-learning intrusion detection systems that recognize and categorize known and zero-day intrusions have evolved due to research; these methods enable proactive efforts to spot and stop malicious IoT network traffic. Finally, deep learning is defined as a complicated model, or an advanced subset, of machine learning techniques that overcomes some of the drawbacks of external IoT networks.

Each network leverages the correlation with traffic attributes, and network data are distributed randomly. To address the remaining deficiencies mentioned above, this work offers a DLNID model that uses ADASYN to oversample imbalanced datasets and revised stacked auto-encoders to reduce the dimensionality of the data. To train and assess the performance of the DLNID model, computational testing using the NSL–KDD dataset is conducted. The major contributions of this study are as follows:

A DLNID model that combines bi-LSTM and attention mechanisms is proposed. This DLNID model can precisely categorize network traffic data;
By employing ADASYN to oversample the minority classes, the problem of unbalanced data is addressed, resulting in a substantially equal number of samples in each class and enabling the system to learn successfully;
To improve information fusion, the dimensionality of the data is reduced by employing a more compact stacked auto-encoder.

The remainder of this paper is structured as follows: Section 1 is the introduction. Section 2 reviews the related work. Section 3 describes the dataset and the proposed methodology, covering the recommended approach in detail in its subsections and briefly explaining the classifier models used. Section 4 presents the method’s effectiveness. Section 5 discusses the results. Finally, Section 6 concludes the paper.

2. Literature Review

IoT network security researchers have created a revolutionary technique for managing and quickly identifying erroneous connections. IDS has been shown to be one of the most successful and promising tactics. By keeping an eye on computer system traffic data, it can identify recognized dangers and malicious activity, and, when this happens, notifications are sent out [12,13,14,15]. The following are the primary types of intrusion detection systems:

An NIDS can comprise software (a console) and hardware (sensors) to regulate and watch over network traffic packets sent to and from various places in case of an intrusion or other anomaly;
Only activity on the host system, a particular computer or server, is tracked by HIDS. It is more capable than NIDS while restricted to a single system, since it can access encrypted information transmitted across a network, such as configuration management databases, configuration files, and file attributes;
The host, network, and fog layers make up a cloud IDS. Secure authorization for demand-based accessibility to a collaborative effort or API is offered via the cloud layer. Similarly, it builds a link between current IDS and hypervisors.

There are two methods for keeping an eye out for bad behavior. The first is comparable to antivirus software’s signature-based detection, whereas the second is a heuristic diagnosis, which requires comparison with conventional traffic to obtain a conclusion. Before gathering intrusion features, characterization is necessary. R2L, DoS, U2R, and probing attacks were the four categories into which Stolfo et al. subdivided network attacks in the KDD99 dataset [15,16,17]. Many academics advise combining ML and IDS to recognize IoT network dangers by creating precise models. In [17], the authors evaluated naive Bayes and decision trees to determine anomalous networks. By utilizing SVM and the evolutionary technique to optimize the sampling, variables, and model parameters of SVM attributes, the authors in [18] boosted the precision of network attack identification. The authors of [19] developed a multi-level RF model to identify IoT network irregularities. The researchers in [20] combined the KNN classifier and K-MEANS clustering to enhance detection precision. The performance of the existing KNN classifier was boosted by doing this. The researchers of [21] proposed a new IDS technique that first divided the data on the IoT network into more controllable subsets for the algorithm of decision trees, and then generated multiple SVM algorithms for the subsets. As the rate of detecting unidentified attacks rises, temporal complexity is reduced [22].

The focus of Classic ML algorithms, on the other hand, is primarily on feature engineering, which uses a lot of processing power and typically only learns surface-level features, producing detection results that are less satisfactory. Many researchers have focused on the current deep learning trend to skip the feature selection step and directly input IoT network traffic data into the model. The authors of one study [23] suggested a model structure based on DBNs and PNNs to categorize the data using PNNs, which are superior to traditional PNNs, and to employ DBNs to decrease the dimension of the data. Other researchers [24] suggested an identification method based on CNNs which eliminated the requirement for explicitly defining characteristics by transforming traffic data into images. Another study [25,26] by researchers utilized the performance of RNN networks on spectral parameters to detect botnet irregularities, effectively increasing classification accuracy.

On the subject of IDS using deep learning and ML approaches, there are numerous research trends. Several of these related works are explained below.

A paradigm that has sensors that pull the necessary data from the data source was suggested by Ding et al. [18]. The analysis was performed using a detector. The model was constructed using many sensors and detectors. It gathered data in real-time from all the sources that were accessible, which were then analyzed by a detector. The detector recognized previous intrusions; it then picked up on new types of intrusions and acted in response to events, sounding an alarm if necessary. The intrusion and back-propagation were visualized using numerous self-organizing maps. For the classification of intrusions, a neural network was utilized. When an intrusion was found, the IDS notified the administrator and requested their approval before allowing the intruder to carry on with their operations. The IDS was designed to learn from the incident if authorized and if it turned out to be a false alarm; otherwise, the system was shut down, and the user exited the operation. The simulation produced a false alarm rate of 2.9% and a detection rate of 97%. Based on files that approach a system, Jiang et al. [20] suggested a model that would routinely analyze the system’s vulnerabilities and weaknesses. The model built a hybrid intrusion detection system combining statistical and machine learning methods. The suggested method distinguished between viruses based on the definition, which consisted of characters or a string of characters gleaned from viruses in files. This model’s primary goal was to create an IDS that could continuously monitor a system’s or network’s vulnerabilities based on the files that approached the system via the network. The system’s vulnerabilities were carefully monitored to determine if an intrusion was occurring. As a result, 92.65% of true positives were correctly identified, according to the statistics, and 7.35 percent of false alarms were generated.

Hindy et al. [26] suggested a model based on a DL technique to develop a functional, practical, adaptable, and trustworthy NIDS. The proposed model included a soft-max regression-based NIDS and a sparse autoencoder. The NSL–KDD dataset, which can be used to assess anomaly detection accuracy, served as the benchmark dataset for network intrusion. When evaluated on the test data, the model was able to identify and categorize normal and anomalous connections [27]. Applying strategies such as stacked autoencoding, a deep belief net version of sparse autoencoding, for unsupervised feature learning and NB-Tree, Random Tree, or J48 for additional categorization boosted performance even more [28]. The efficiency was both admirable and respectable. A DL technique can improve the performance of an IDS, according to research by Khraisat et al. [29], who developed a strategy for creating an IDS that takes advantage of DL models. A DNN with a multi-layer feed-forward neural network and 400 hidden layers was chosen. Soft-max, rectifiers, and shallow models made up the output layer [30]. The suggested model’s detection rate was 99.994% when tested against other classifiers using the benchmark dataset KDD 99 Cup. Reference [31] employed five distinct kinds of ML algorithms (RF, SVM, CART, NB, and J48) as the foundation for the approach proposed by Dwivedi to create a precise and effective technique. The NSL–KDD dataset, which comprises 41 features, was analyzed using this method, and it was effective in recognizing attacks and detecting anomalies with a high degree of detection accuracy. Following the advancement of DL technology, numerous authors have attempted to build systems that combine DL with NIDS for SDN. A DL-based NID technique using the DNN algorithm was created by Zhu and Wang [32] for the SDN environment. Six features from the NSL–KDD dataset were used. The authors compared the outcomes of this method with those of ML classifiers. 
This work demonstrated the viability and potential of using DL to create SDN-compatible network intrusion detection systems [33,34]. The method outperformed the ML classifier approach and showed good detection accuracy.

According to Gauthama et al. in [35], a hybrid DL approach was utilized to increase accuracy. Two different DL classifier types were employed in this method. The authors combined a recurrent neural network and a gated recurrent unit to create a hybrid approach known as GRU–RNN. It was based on the NSL–KDD dataset and employed six features to train the classifier [17]. The hybrid approach showed superiority over the prior method and demonstrated accuracy gains of 89%. Its implementation in the SDN working environment was also made simple and adaptable. Building intrusion detection systems for SDN was the aim of a separate study in [36]. The researchers used ML and DL systems to compare the results. Six different types of attacks were classified using a gentle approach. The approach used a DL algorithm (GRU), which outperformed ML classifiers in terms of accuracy and performance. More than one dataset was used in training and comparison; the approach was very successful, indicating that DL can be applied efficiently in NIDS for SDN.

The author of [37] presented a hybrid ML system (decision trees combined with SVM techniques) to boost the system’s accuracy. The many different kinds of known attacks were categorized using the decision tree method [38,39]. The normal data were categorized using the SVM algorithm. This model was developed using data from the NSL–KDD dataset and achieved an accuracy of 96.4%. Ashiku et al. [40] suggested using a GA and SVM to find intrusion packets. Features were selected by combining SVM and GA. Researchers utilized SVM to address classification and regression issues. The detection accuracy for the KDD Cup 1999 dataset used in this experiment was 97.3% [41]. The researchers created a network intrusion detection method using the NSL–KDD dataset and an RNN methodology [42]. The accuracy of binary classification in this work was 83.28%, and that of multiclass classification was 81.29%. A CNN was used as the suggested method in [17] to identify network breaches. The KDD Cup 1999 dataset was used to generate new datasets, and test data for the CNN–IDS model were converted to a two-dimensional form. This model had a detection rate of 97.7%.

Researchers in [43] used ANN for network IDS with the KDD Cup 1999 dataset. The data were normalized using the min–max approach, and the number of attributes was reduced using PCA [44]. In this study, ANN was designed using FFNN and LM back-propagation with mean squared error as the loss function. This model achieved an accuracy of 97.97%. DNN with NSL–KDD datasets was utilized in the system proposed in [45]. The authors suggested using label encoders, min–max normalization, and an auto-encoder network for pre-processing and training the DL layers. Five categories were used to develop this model. The DoS attack had the highest detection accuracy, at 97.7%, versus 89.8% for the probe. The KDD Cup 1999 dataset was used to analyze and validate the DNN-based AI IDS proposed by the researchers in [46]. Four hidden layers were utilized to build the neural network, with the Adam optimizer used for back-propagation training and ReLU employed as the activation function in the forward pass [47,48]. The accuracy of the classification, which could either be normal or an attack, was 99.08%. In [38], a DNN system using a DARPA 1999 dataset was suggested. The output layer had only two neurons, and ReLU was the hidden layers’ non-linear activation function. The accuracy rate was 93%. DNN is employed in the method proposed in this paper to identify network intrusion packets, since it is both pertinent and efficient. This algorithm assigns a weight to each feature in the input layer and uses those weights to make decisions. As a result, it is better suited to novel signatures than rule-based methods. The one-hot encoder method and z-score normalization were both used in the pre-processing phase of the data in this paper. ReLU was used as the non-linear activation function, ADAM served as the optimizer, and cross-entropy was employed as the loss function to train the DNN. 
The approaches for output classification used both binary and multiclass categorization.

Figure 2 illustrates the two primary categories of machine learning established by researchers: unsupervised and supervised. The availability of labeled data is a critical success factor in supervised learning, especially for classification tasks. Supervised learning algorithms may struggle to generalize to new, unlabeled data if there are insufficient labeled data. Obtaining labeled data, on the other hand, can be a difficult and time-consuming process. Unsupervised learning approaches, conversely, provide a solution to the problem of obtaining labeled data. Unsupervised learning can simplify the process of gathering training data by extracting useful feature information from unlabeled data. While unsupervised learning has shown promise in some applications, it typically fails to outperform supervised learning methods in detection tasks. As a result, the choice between supervised and unsupervised learning methods is determined by the task at hand as well as the availability of labeled data. In some cases, unsupervised learning may be sufficient for certain applications, whereas supervised learning may be required to achieve the desired performance.

Table 1 provides a thorough summary of the methodology and results, as well as a list of previous reference research. It is a powerful tool for quickly grasping the research’s essential details, providing a clear and concise snapshot of the study.

It is important to note that our contribution extends beyond accuracy alone. The proposed approaches incorporate advanced deep learning architectures, attention mechanisms, and synthetic sampling techniques to address the limitations of existing methods, such as low detection rates and the need for extensive feature engineering. This study’s proposed approach fills several research gaps in the field of IoT network security. First, it addresses the shortcomings of traditional intrusion detection techniques, which struggle with low detection rates and necessitate extensive feature engineering. Second, for traffic anomaly detection, it employs a deep learning model that combines an attention mechanism with an LSTM network to extract the sequence properties of data flow via a CNN. Third, it uses ADASYN to increase the size of minority-class samples, which improves the model’s performance. Finally, the proposed approach outperforms previous comparable algorithms in terms of accuracy and F1 scores, indicating its ability to improve overall performance by combining the advantages of both models.

3. Material & Methods

Figure 3 presents a clear overview of the proposed approach. The proposed approach starts with the NSL–KDD dataset, which forms the foundation of the intrusion detection system. Several pre-processing steps are used to prepare the data for modeling. These include data cleaning to remove inconsistencies or errors, one-hot encoding to represent categorical features, normalization to bring numerical features to a uniform scale, and target balancing to address class imbalance. Visualization techniques such as t-SNE and Principal Component Analysis (PCA) are used to obtain insights into the distribution of attacks within the dataset. These visualizations aid in comprehending the patterns and relationships between various kinds of attacks. Three methodologies are used for the classification task: binary classification, four-class classification, and all-class classification. Each technique attempts to categorize network traffic records, separating regular traffic from various sorts of attacks. This enables more granular data analysis and the detection of specific attack types. XGBoost, a gradient boosting technique, is used to discover the most significant features for classification. It ranks the features according to their importance and selects the top features that contribute most to the classification process. During the model building and training phase, two approaches are used: an MLP (Multi-Layer Perceptron) model and a hybrid model that combines a CNN and an LSTM. The MLP model is a standard feed-forward neural network, whereas the hybrid model uses CNN for feature extraction and LSTM for capturing temporal dependencies in data. Finally, the output of the trained models is analyzed and evaluated. To evaluate the performance of the models, metrics such as accuracy, precision, recall, and F1 score are computed.
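As an illustration of the evaluation metrics named above, the following minimal sketch computes accuracy, precision, recall, and F1 score for a binary (normal versus attack) task. The function name `binary_metrics` and the toy label vectors are hypothetical and are not taken from the study; the formulas are the standard definitions based on true/false positives and negatives.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 1 = attack, 0 = normal traffic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

For the multi-class settings, the same quantities would typically be computed per class and then averaged (for example, macro-averaged), which is an assumption about the setup rather than a detail stated here.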

3.1. Dataset

The NSL–KDD dataset [49] contains the internet traffic records of a basic intrusion detection system. These records are traces of the traffic that a real IDS would have encountered. Each record has 41 attributes describing the traffic input, along with one label indicating the type of attack. The four types of attack that make up the target variable “Class” are denial of service (DoS), probe (surveillance or a similar type of probing), U2R, and R2L. One of the most severe weaknesses of the KDD dataset is its high volume of redundant records, which pushes learning algorithms to prioritize frequent records over less frequent ones, even though the less frequent records, such as R2L and U2R attacks, are often the most harmful to systems.

The data in Figure 4 provide a breakdown of traffic records, both legitimate and malicious. The majority of the records, at 53.4%, correspond to regular, non-malicious traffic, while only a small percentage correspond to specific types of attacks. For example, probe attacks account for just 9.25% of the records, R2L attacks make up only 0.79%, and U2R attacks are the least common, at 0.041%. Normal traffic refers to legitimate network traffic that includes data packets transmitted between network nodes for regular communication purposes such as web browsing, email, and file transfer. Conversely, DoS attacks are malicious attempts to disrupt the normal functioning of a network or website by overwhelming it with a flood of traffic or requests, making it unavailable to its intended users. Probe attacks are focused on gathering information about a target system by sending a series of probes or requests to identify open ports, operating system information, and other details that can be used in further attacks. R2L attacks, also known as “remote-to-local” attacks, involve an attacker attempting to gain unauthorized access to a local system from a remote location by targeting vulnerabilities in remote access protocols or software applications. Finally, U2R attacks are attempts by an attacker to gain elevated privileges or “root” access on a target system by exploiting vulnerabilities in software applications or operating system components.

3.2. Data Pre-Processing

Data pre-processing prepares a dataset for analysis through cleansing, converting, and formatting. It is a crucial phase in any ML process because it can significantly impact the accuracy and dependability of the analysis findings. To remove duplicates and irrelevant records and to handle missing values, the NSL–KDD dataset required data cleaning. To convert categorical data to a numeric format, one-hot encoding was used, and numerical features were normalized. To improve the accuracy of detecting minority classes, target balancing was performed by oversampling minority classes with ADASYN. All these steps are described below.

3.2.1. Data Cleaning

The NSL–KDD dataset is notorious for containing a large number of redundant records, which can be detrimental to the learning algorithms. As a result, it is critical to remove duplicates and irrelevant records from the dataset. Also, any missing values in the dataset must be identified and handled appropriately, either by filling them with appropriate values or by removing the missing value records.
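The cleaning step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the toy records, their four-field layout, and the helper names `drop_duplicates` and `drop_missing` are hypothetical, and the sketch removes records with missing values rather than imputing them, which is only one of the two options mentioned.

```python
# Hypothetical toy records: (duration, protocol_type, src_bytes, label).
records = [
    (0, "tcp", 491, "normal"),
    (0, "udp", 146, "normal"),
    (0, "tcp", 491, "normal"),    # exact duplicate of the first record
    (2, "tcp", None, "neptune"),  # missing src_bytes value
]


def drop_duplicates(rows):
    """Keep the first occurrence of each record, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def drop_missing(rows):
    """Remove records that contain a missing (None) value."""
    return [row for row in rows if all(value is not None for value in row)]


cleaned = drop_missing(drop_duplicates(records))
print(cleaned)
```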

3.2.2. One-Hot Encoding

Categorical data encoding is the process of transforming categorical (i.e., non-numeric) data into a numeric format that machine learning algorithms can use. One-hot encoding assigns each category of each categorical feature its own binary column.

The dataset contains a total of 41 features, which are a combination of categorical, binary, and numerical values. These features provide information about different aspects of network connections, such as protocol types, service types, connection flags, bytes sent by the source and destination, duration, etc.

The categorical features among these 41 are encoded using this method, which produces a new feature for each category and enables deep learning networks to handle them. Following one-hot encoding, the final dataset has a total of 128 features.
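As a minimal sketch of the encoding step described above, the following pure-Python function replaces one categorical column with one binary column per category. The record layout and the field names (`duration`, `protocol_type`) are assumptions chosen for illustration, not the authors' code.

```python
def one_hot_encode(records, column):
    """Replace one categorical column with one binary column per category."""
    categories = sorted({row[column] for row in records})
    encoded = []
    for row in records:
        # Keep every other field unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records; the field names are illustrative only.
records = [
    {"duration": 0, "protocol_type": "tcp"},
    {"duration": 2, "protocol_type": "udp"},
    {"duration": 0, "protocol_type": "icmp"},
]
encoded = one_hot_encode(records, "protocol_type")
```

Each distinct category adds one column, which is why a small number of categorical features can expand the feature count considerably.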

3.2.3. Normalization

Normalization is a data pre-processing technique that narrows the scale of values for numerical features in a dataset without altering their correlations or fluctuations.

The numerical features in the dataset do not fit a Gaussian distribution and have well-known bounds. For these reasons, the numerical attributes have been normalized using the min–max method to the range [0, 1]:

X_{\mathrm{processed}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}

(1)

Here, X_min and X_max are the minimum and maximum values of a numerical feature.
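Equation (1) can be sketched for a single feature as follows. The handling of a constant feature (where the denominator would be zero) is an assumption added for safety, and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of numbers to [0, 1] following Equation (1)."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Constant feature: the denominator would be zero, so map to 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


src_bytes = [0, 250, 500, 1000]  # hypothetical feature values
scaled = min_max_normalize(src_bytes)
```

In practice, the minimum and maximum would usually be computed on the training split only and then reused for the test split.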

3.2.4. Target Balancing

It can be seen in Figure 4 that the dataset is quite unbalanced, which poses a significant challenge for the model’s training because it prevents the model from accurately detecting minority classes such as R2L, probe, and U2R. Classes were balanced to address this problem. Balancing methods alter the dataset’s distribution of the label categories by undersampling majority classes, by oversampling minority classes (as SMOTE and ADASYN do), or by combining the two approaches.

i.: SMOTE

SMOTE [50] is a synthetic oversampling technique that generates synthetic samples for the minority class in order to balance imbalanced datasets. It works by generating synthetic examples along the line segments that connect neighboring instances of the minority class. This is accomplished by selecting a minority class instance at random, determining its k nearest neighbors, and generating synthetic samples by interpolating between the selected instance and its neighbors. The desired level of oversampling determines the number of synthetic samples generated.

The SMOTE technique helps in increasing the representation of the minority class, thereby mitigating the impact of class imbalance. By introducing synthetic samples, SMOTE enhances the learning process by providing a more balanced training set for the classifier.

To tackle the issue of class imbalance in the dataset, the minority class was identified, which had fewer instances than the majority class. The imbalance ratio was computed to determine the proportion of instances between the two classes. The SMOTE algorithm was utilized to overcome this problem. This approach involved selecting a random instance from the minority class and determining its k nearest neighbors. By interpolating between the chosen instance and its neighbors, synthetic samples were generated. The number of synthetic samples produced depended on the desired level of oversampling required to balance the classes. These synthetic samples were then added to the training set, augmenting the representation of the minority class. The process was repeated until the desired level of class balance was achieved, resulting in an enhanced training set for the machine learning models.
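The interpolation step described above can be sketched as follows. This is an illustrative pure-Python version under stated assumptions (Euclidean distance, a fixed random seed, hypothetical two-dimensional points), not the implementation used in the study.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate n_new synthetic points by interpolating between neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbors of the base point (excluding itself).
        others = [p for p in minority if p is not base]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # hypothetical
new_points = smote_sample(minority)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already occupied by the minority class.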

ii.: ADASYN

ADASYN [51] is an adaptive synthetic oversampling technique based on SMOTE principles. ADASYN focuses on creating synthetic samples for minority class instances that are more difficult to learn. It accomplishes this by adjusting the distribution of synthetic samples adaptively based on the density distribution of the instances. The density distribution of minority class instances is estimated in ADASYN, and instances with lower densities are prioritized for generating synthetic samples. This means that the generation of synthetic samples is more focused on difficult-to-learn instances, effectively emphasizing the areas that require more attention.

ADASYN addresses the limitations of SMOTE by adaptively adjusting sample synthesis in cases where the class imbalance is more severe or the minority class instances are distributed in complex patterns. ADASYN seeks to strike a better balance between oversampling the minority population and avoiding overfitting.

The implementation of ADASYN followed a similar approach to SMOTE by identifying the minority class and calculating the imbalance ratio. However, ADASYN required an extra step, to consider the density distribution of the minority class instances. To estimate this distribution, the number of minority class instances within a certain radius around each instance was measured. Instances with lower densities, indicating regions that were more challenging to learn, were given higher importance for generating synthetic samples. The number of synthetic samples to be generated for each instance was determined based on these importance values. The SMOTE algorithm was then applied with adjusted synthesis of samples based on the importance values. This allowed the focus to be on generating synthetic samples specifically for the challenging instances, to address the imbalanced nature of the data. Finally, the synthetic samples were added to the training data resulting in a more balanced representation of the minority class while emphasizing the difficult regions that required more attention during the learning process.
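The allocation of synthetic samples to individual minority instances can be sketched as below. Note the assumption: this sketch follows the formulation in the original ADASYN paper [51], which weights each minority instance by the share of majority-class points among its k nearest neighbors, rather than the radius-based density estimate described above. The function name, parameters, and points are hypothetical.

```python
import math


def adasyn_allocation(minority, majority, k=3, n_synthetic=10):
    """Return how many synthetic samples to create for each minority point."""
    labeled = [(p, 0) for p in minority] + [(p, 1) for p in majority]
    ratios = []
    for point in minority:
        others = [(q, lab) for q, lab in labeled if q is not point]
        nearest = sorted(others, key=lambda item: math.dist(point, item[0]))[:k]
        # Fraction of majority-class points among the k nearest neighbors.
        ratios.append(sum(lab for _, lab in nearest) / k)
    total = sum(ratios)
    if total == 0:
        # No minority point borders the majority class: spread evenly.
        return [n_synthetic / len(minority)] * len(minority)
    return [n_synthetic * r / total for r in ratios]


# Three clustered minority points and one isolated point near the majority.
minority = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0)]
majority = [(5.0, 6.0), (6.0, 5.0), (6.0, 6.0)]
counts = adasyn_allocation(minority, majority, k=2, n_synthetic=10)
```

In this toy example the isolated minority point, which is surrounded by majority points, receives the entire budget, which illustrates how the method concentrates synthesis on hard-to-learn regions. The returned counts are fractional and would be rounded before generating samples.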

Identifying the minority samples that pose challenges for neural network training is of particular interest here. Hence, ADASYN was selected as the method to balance the dataset. Figure 5 depicts the updated distribution of the traffic record categories after oversampling the minority classes with ADASYN, with 20% for each category.

3.3. Classification Strategies for Intrusion Detection

Three classification variants were trained for intrusion detection. First, the task was framed as a binary classification problem: deciding whether a traffic record was legitimate or malicious. Second, for records that the first model flagged as attacks, a four-category classification model predicted whether the attack was a DoS, an R2L, a probe, or a U2R. Third, a five-category classification model was built using the complete dataset.

Figure 6 shows the method for making predictions. The first method (on the left) assigns the intrusion signal a normal or abnormal classification. The four-category classification approach forecasts the type of assault if the signal is categorized as abnormal. The second method (on the right) uses the intrusion signal to directly predict the kind of signal into one of the five pre-existing categories.

This study has developed incremental models for intrusion detection in order to provide a flexible and comprehensive approach. The two-class model serves as the foundation for identifying potential intrusions by distinguishing between normal and abnormal traffic. The four-class incremental model further classifies abnormal traffic into specific attack categories, enabling targeted response and mitigation strategies. The five-class incremental model directly predicts the category of the intrusion signal without classifying it as normal or abnormal, streamlining the prediction process. This approach offers flexibility in the level of detail required for intrusion detection and can be customized based on specific network requirements. The system is scalable and adaptable, catering to different use cases and security needs.
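The two prediction paths of Figure 6 can be sketched as plain functions. The three model arguments are hypothetical callables standing in for trained classifiers, and the `failed_logins` field is made up for illustration.

```python
def predict_two_stage(record, is_attack, attack_type):
    """Left-hand path: binary decision first, then the attack category."""
    if not is_attack(record):
        return "normal"
    return attack_type(record)


def predict_direct(record, five_class):
    """Right-hand path: a single five-category prediction."""
    return five_class(record)


# Hypothetical stand-ins for the trained two-, four-, and five-class models.
def is_attack(record):
    return record["failed_logins"] > 3


def attack_type(record):
    return "R2L"


def five_class(record):
    return "R2L" if record["failed_logins"] > 3 else "normal"


suspicious = {"failed_logins": 5}
benign = {"failed_logins": 0}
```

With the two-stage path, the four-category model only ever sees records that the binary model has already flagged, so its errors are conditional on the first stage.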

t-SNE [52] was applied to visualize the attack distribution. t-SNE, short for t-Distributed Stochastic Neighbor Embedding, is a technique used to reduce the dimensions of high-dimensional data for visualization purposes. However, it is mainly intended for numerical data and may not be suitable for categorical data. The dataset used for the t-SNE visualization had a large number of attributes, many of them one-hot representations of categorical features. Since t-SNE is not ideal for such data, a preliminary dimensionality reduction step was applied, as suggested in the FAQ by Laurens van der Maaten [53].

The dataset can be transformed into a lower-dimensional representation by applying PCA, which reduces the number of attributes to 50 while retaining the most important information. The resulting data can be visualized using t-SNE to show the distribution of attacks in a two-dimensional space. However, using t-SNE with categorical data or high-dimensional datasets may not capture all the complexities of the data, and additional preprocessing or encoding may be required. Moreover, visualizing sparse or imbalanced classes in two dimensions can be challenging, and alternative visualization methods or further analysis may be necessary to fully understand these types of attacks.

The DoS and probe assaults exhibit some discernible patterns, as seen in Figure 7. However, because U2R and R2L have very few samples, it is difficult to see how these classes are distributed using only two dimensions.

3.4. Classification of Relevant Features Using XGBoost

Large feature sets can harm deep learning training because noise in the data can cause overfitting, resulting in poor performance on new, unseen data. In addition, the “curse of dimensionality,” in which high-dimensional data become sparse and make it difficult for the model to capture the underlying patterns in the dataset accurately, is another factor that may harm model performance.

The dataset, which comprises 128 features due to one-hot encoding of the categorical data, may lead to the issues above. Therefore, the data were subjected to a feature selection technique to solve this issue. “Feature selection” refers to picking a subset of pertinent features to be used in the model. The objective is to identify a subset of traits that improve model performance while reducing data dimensionality.

After XGBoost had been trained on the complete dataset, its gain measure was used to determine each attribute’s importance. XGBoost [54], which stands for eXtreme Gradient Boosting, is a popular machine learning algorithm known for its efficiency and effectiveness in handling large feature sets. It is commonly used for feature selection tasks because of its ability to assess the importance of features in a dataset.

In this case, the gain represented the improvement in the model’s performance achieved by splitting on a particular feature. XGBoost built a model on the full dataset of 128 features, and the gain of each feature was evaluated to determine its relevance. The gain values across all decision trees in the model were then aggregated to determine the overall significance of each feature.

As seen in Figure 8, where Src Bytes is the most significant feature, 38 features were retained from the 128 features obtained after one-hot encoding. The other features were eliminated from the dataset, and only these 38 features were used for training the models.
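The ranking step can be sketched without any library as follows. The sketch assumes that gains are summed per feature over every split of every tree; the text does not say whether the authors summed or averaged them. The feature names and gain values are hypothetical.

```python
from collections import defaultdict


def select_by_gain(split_gains, top_k):
    """Sum per-split gains by feature and keep the top_k features."""
    totals = defaultdict(float)
    for feature, gain in split_gains:
        totals[feature] += gain
    ranked = sorted(totals, key=totals.get, reverse=True)
    return ranked[:top_k]


# Hypothetical (feature, gain) pairs collected from every split of every tree.
split_gains = [
    ("src_bytes", 9.0), ("dst_bytes", 4.0), ("src_bytes", 6.0),
    ("duration", 1.5), ("dst_bytes", 2.0), ("flag_SF", 0.5),
]
selected = select_by_gain(split_gains, top_k=2)
```

Features that never appear in a split receive no gain at all and are dropped automatically, which is one reason tree-based importance is convenient for pruning one-hot columns.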

3.5. Model Architecture

Two different model architectures were trained to predict the sorts of assaults for the three sub-problems using this dataset. An MLP method was used in the first architecture. MLPs are a type of feed-forward neural network made up of several layers of interconnected nodes (neurons). They are well-known for capturing complex non-linear relationships between features and target variables. MLPs are widely used for a variety of machine learning tasks, including classification problems such as the one investigated in this study. As the initial model, the authors chose an MLP architecture to evaluate its performance in predicting the types of assaults in the given dataset.

A hybrid model combining CNN-1D and LSTM was used in the second architecture. This hybrid model aimed to improve performance by combining the strengths of CNN-1D and LSTM. CNN-1D is a convolutional neural network that is designed to process one-dimensional sequences such as time series or sequential data. It excels at capturing local patterns and identifying relevant data features. The authors hoped to extract meaningful features from the input data by incorporating a CNN-1D component into the model.

The hybrid model utilized LSTM, a type of RNN, to address the vanishing gradient problem that traditional RNNs can encounter. The problem arises when the gradients of the error become too small while backpropagating through time, making it difficult for the network to learn long-term dependencies. LSTM networks overcome this issue by incorporating memory cells that can retain information over extended periods. This approach enables the hybrid model to effectively remember and utilize prior inputs to capture temporal dependencies in the data. Using CNN-1D and LSTM in combination in the hybrid model allows for local feature extraction and capture of long-term dependencies in the sequential data. This architecture was selected to enhance the model’s ability to learn and represent complex patterns present in the dataset, ultimately improving the prediction performance for various types of assaults.

As shown in Figure 9, the first neural network design used for training was an MLP. The input layer had 38 inputs, one per feature, and the first hidden layer was a dense layer of 64 units with the ReLU activation function. A dropout layer with a ratio of 0.4 followed, to avoid overfitting by randomly discarding nodes during training. The second hidden layer was a dense layer of 128 units with ReLU activation, followed by a dropout layer with a ratio of 0.4. The third hidden layer had 512 units with ReLU activation, followed by a dropout layer with a ratio of 0.4. The fourth hidden layer had 128 units with ReLU activation, again followed by a dropout layer with a ratio of 0.4. Depending on the model type (binary, four-category, or five-category), the output layer was a dense layer with one, four, or five units.
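To give a sense of the size of this MLP, the following sketch counts its trainable parameters from the layer widths given above. It assumes that every dense layer has a bias term; dropout layers contribute no parameters.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out


def mlp_params(n_inputs, hidden_units, n_outputs):
    """Total trainable parameters; dropout layers add none."""
    sizes = [n_inputs, *hidden_units, n_outputs]
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


hidden = [64, 128, 512, 128]  # hidden-layer widths from the description
# One entry per task: 1 output (binary), 4 (attack types), 5 (all classes).
params_by_task = {n_out: mlp_params(38, hidden, n_out) for n_out in (1, 4, 5)}
```

Under these assumptions the three variants differ only in the output layer, so their parameter counts differ by a few hundred out of roughly 143,000.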

The second neural network design used for training was a hybrid model that combined CNN-1D and LSTM, as shown in Figure 10. The input layer consisted of 38 inputs for the features, and the next layer was a CNN-1D layer with 64 filters, a kernel size of 9, and a ReLU activation function. After that, there was a max-pooling layer with a pool size of 2, followed by a dropout layer with a rate of 0.4 to prevent overfitting by randomly removing nodes during training. The second hidden layer was a CNN-1D layer with 64 filters, a kernel size of 6, and the ReLU activation function, followed by a max-pooling layer with a pool size of 2 and a dropout layer with a rate of 0.3. The third hidden layer was an LSTM layer with 64 units, followed by a dropout layer with a rate of 0.2. Finally, the output layer was a dense layer with one, four, or five units, depending on the model type (binary, four-category, or five-category).
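The sequence length seen by each layer of the hybrid model can be traced with a few lines of arithmetic. The sketch assumes that the 38 features are treated as a sequence of 38 steps with one channel, that the convolutions use no padding and a stride of 1, and that pooling windows do not overlap; none of these settings are stated in the text.

```python
def conv1d_length(length, kernel_size, stride=1):
    """Output length of a 1-D convolution with no padding."""
    return (length - kernel_size) // stride + 1


def maxpool1d_length(length, pool_size):
    """Output length of non-overlapping max pooling."""
    return length // pool_size


steps = []
length = 38  # assumed: one sequence step per selected feature
for layer, size in [("conv", 9), ("pool", 2), ("conv", 6), ("pool", 2)]:
    if layer == "conv":
        length = conv1d_length(length, size)
    else:
        length = maxpool1d_length(length, size)
    steps.append(length)
```

Under these assumptions the LSTM layer receives a short sequence of 64-channel feature vectors, which it reduces to a single 64-dimensional vector for the output layer. A different padding choice would change the intermediate lengths.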

4. Results and Analysis

This section presents the results and interpretation of the experiments. It examines the performance of the MLP and hybrid CNN-1D/LSTM models across the classification tasks in terms of accuracy, recall, precision, F1 score, and AUC.

4.1. Experimental Settings

Two different models, namely an MLP model and a hybrid CNN-1D and LSTM model, were utilized in the applied experiments. The performance of these models was assessed using a variety of criteria, including recall, accuracy, F1 score, precision, and AUC (Area Under the Curve) score.

The MLP model was trained for the binary classification task for 30 epochs, employing the ADAM optimizer. A batch size of 128 was used, along with the binary cross-entropy loss function and a learning rate of 0.001. Similarly, the hybrid CNN + LSTM model was trained for 30 epochs, utilizing the same learning rate and batch size, as well as the ADAM optimizer and binary cross-entropy loss function. To evaluate the performance of both models, metrics such as AUC, accuracy, precision, recall, and F1 score were employed.

For the attack classification task, both the MLP model and hybrid CNN + LSTM model were trained with the same settings as the binary classification task. The MLP model utilized the categorical cross-entropy loss function, while the hybrid model used the binary cross-entropy loss function. The evaluation of the models once again involved metrics such as AUC, accuracy, precision, recall, and F1 score. Furthermore, for the all-class classification task, the MLP model was trained with the categorical cross-entropy loss function, and the hybrid CNN + LSTM model used the binary cross-entropy loss function. The training conditions remained comparable to the previous tasks, with both models trained using 30 epochs, a batch size of 128, and a learning rate of 0.001. The ADAM optimizer was employed for optimization, and the performance of the models was assessed using AUC, accuracy, precision, recall, and F1 score metrics.

4.2. Performance Metrics

The models were assessed using various criteria, such as recall, accuracy, F1 score, and precision. The recall is determined by dividing the number of true positives (TP) by the number of actual positive examples in the dataset (TP + FN). A high recall score indicates that the model correctly identifies most of the positive examples in the dataset [55].

\mathrm{Recall} = \frac{TP}{TP + FN}

(2)

The precision is determined by dividing the number of true positives (TP) by the total number of positive predictions (TP + FP). A model with a high precision rating produces few false positive predictions [56].

\mathrm{Precision} = \frac{TP}{TP + FP}

(3)

The F1 score combines precision and recall into one metric by taking their harmonic mean.

F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(4)
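As a worked example of these three metrics, the sketch below applies the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)) to the binary MLP counts reported in Section 4.3, treating class 1 (abnormal) as the positive class. These are per-class values for one class only, so they need not match the averaged figures reported in the text.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts derived from the binary MLP results reported in Section 4.3.
tp = 10362            # abnormal records predicted as abnormal
fn = 12833 - tp       # abnormal records predicted as normal
fp = 9711 - 9330      # normal records predicted as abnormal
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

For this class the model is precise but misses a noticeable share of attacks, which is the pattern that recall is meant to expose.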

The AUC, a key model evaluation metric, is also used to compare the performance of multiple models. The AUC score is the area under the ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) at various classification thresholds. A high AUC value indicates a strong ability to discriminate between positive and negative categories [57].
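The AUC can also be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, which gives a compact way to compute it. The following sketch uses that pairwise formulation; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]            # hypothetical ground truth
scores = [0.1, 0.4, 0.35, 0.8]   # hypothetical model scores
auc = roc_auc(labels, scores)
```

The pairwise loop is quadratic in the number of samples, so it is suitable for illustration only; library implementations sort the scores instead.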

4.3. Binary Classification

Using the ADAM optimizer, the MLP model was trained over a 30-epoch span using the binary cross-entropy loss function, with a learning rate of 0.001 and batch size of 128. As a result, the MLP classifier’s precision, recall, F1 score, and accuracy scores were 89%, 89%, 88%, and 87%, respectively.

The hybrid CNN + LSTM model was trained over 30 epochs using the ADAM optimizer and binary cross-entropy loss function, with a batch size of 128 and a learning rate of 0.001. The hybrid model’s F1 scores were 90%, with an AUC value of 0.98. The hybrid classifier’s precision, recall, and accuracy scores were 90%, 89%, and 89%, respectively. Table 2 compares the binary classification performance of the two models, including the AUC, precision, recall, accuracy, and F1 score.

Figure 11 depicts the MLP model’s confusion matrix, a table that summarizes a classification method’s performance. Each column of the matrix represents an actual class, whereas the rows of the matrix indicate instances of a predicted class. For example, 10,362 out of 12,833 were correctly predicted as class 1 (abnormal), while 9330 out of 9711 were correctly predicted as class 0 (normal).

Figure 12 depicts the hybrid model’s confusion matrix. The matrix rows represent instances of a predicted class, while each column represents an actual class. For example, class 0 (normal) was correctly predicted for 9260 out of 9711, whereas class 1 (abnormal) was correctly predicted for 10,903 out of 12,833.

ROC curves graphically represent the trade-off between the model’s specificity (i.e., true negative rate) and sensitivity (i.e., true positive rate). The ROC curves for each class in the MLP model are shown in Figure 13, with an AUC of 0.88 for class 0 and an AUC of 0.88 for class 1.

The hybrid model’s ROC curves are shown in Figure 14, with an AUC of 0.98 for class 0 and 0.98 for class 1.

4.4. Attack Classification

The MLP model was trained for 30 epochs using the categorical cross-entropy loss function and the ADAM optimizer, with a learning rate of 0.001 and a batch size of 128. The MLP classifier had an AUC score of 0.94 with F1 values, accuracy, precision, and recall of 63%, 72%, and 64%, respectively.

The hybrid CNN + LSTM model was trained for 30 epochs using the ADAM optimizer and the binary cross-entropy loss function, with a batch size of 128 and a learning rate of 0.001. The hybrid model achieved precision, accuracy, recall, F1 score, and AUC score of 74%, 92%, 65%, 64%, and 95%, respectively. Table 3 shows the performance of the two models for categorizing attacks.

The MLP model’s confusion matrix is shown in Figure 15. The matrix’s rows represent a predicted class’s occurrences, and the matrix’s columns represent actual classes. For example, 6945 out of 7460 were correctly predicted as being in class 0, 1467 out of 2885 as being in class 1, 2215 out of 2421 as being in class 2, and 36 out of 67 as being in class 3.

Figure 16 displays the ROC curves for the MLP model for each class, with AUC values of 0.96 for class 0, 0.95 for class 1, 0.94 for class 2, and 0.89 for class 3.

Figure 17 displays the confusion matrix for the hybrid model. The matrix rows represent instances of a forecasting class, while each column represents an actual class. For example, 6945 out of 7460 were correctly predicted as being in class 0, 1467 out of 2885 as being in class 1, 2215 out of 2421 as being in class 2, and 36 out of 67 as being in class 3.

Figure 18 displays the ROC curves for the hybrid model for each class, with AUC values of 0.97, 0.95, 0.95, and 0.92 for classes 0, 1, 2, and 3.

4.5. All-Class Classification

The categorical cross-entropy loss function was used to train the MLP model across 30 epochs using the ADAM optimizer, with a learning rate of 0.001 and a batch size of 128. The MLP classifier achieved 94% AUC, 94% accuracy, 73% precision, 72% recall, and a 66% F1 score.

The hybrid CNN + LSTM model was trained using the ADAM optimizer and the binary cross-entropy loss function, with a batch size of 128 and a learning rate of 0.001. The hybrid model’s precision, accuracy, F1 score, and recall score were 75%, 94%, 70%, and 71%, respectively, while its AUC was 97%. Table 4 below displays the performance of the two models for all-class classification, including AUC, recall, precision, F1, and accuracy.

Figure 19 displays the confusion matrix for the MLP model. The matrix rows represent occurrences of a predicted class, while each column represents an actual class. For example, 9325 out of 9711 were correctly predicted as class 0, 6499 out of 7460 as class 1, 974 out of 2884 as class 2, 2192 out of 2421 as class 3, and 39 out of 67 as class 4.
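The counts quoted above are enough to compute a per-class recall and its unweighted (macro) average, as in the following sketch. It uses only the diagonal entries and class totals reported for Figure 19; whether the metrics in the text are macro-averaged is an assumption, not something the text states.

```python
# Correctly predicted records and class totals, as reported for Figure 19.
correct = [9325, 6499, 974, 2192, 39]
totals = [9711, 7460, 2884, 2421, 67]

# Recall of each class: correct predictions divided by actual class size.
per_class_recall = [c / t for c, t in zip(correct, totals)]

# Macro average: every class counts equally, regardless of its size.
macro_recall = sum(per_class_recall) / len(per_class_recall)
```

A macro average gives the small classes the same weight as the large ones, so a single weak class lowers it visibly even when most records are classified correctly.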

The ROC curves of the MLP model are shown in Figure 20 for each class, with AUC values of 0.96, 0.97, 0.89, 0.97, and 0.93 for classes 0, 1, 2, 3, and 4, respectively.

Figure 21 displays the confusion matrix for the hybrid model. The matrix rows represent instances of a predicted class, while the matrix columns represent actual classes. For example, 9045 out of 9711 were correctly predicted as class 0, 6818 out of 7460 as class 1, 1383 out of 2884 as class 2, 1937 out of 2421 as class 3, and 42 out of 67 as class 4.

With AUC values of 0.98 for class 0, 0.99 for class 1, 0.94 for class 2, 0.95 for class 3, and 0.97 for class 4, the ROC curves for the hybrid model are shown in Figure 22 for each class.

4.6. Comparison with Previous Work

Table 5 compares the binary classification accuracy of the proposed models with that of the ML algorithms employed in [58]. The techniques listed in the table are SVM, KNN, RF, NB, DT, and FFDNN (Feed-Forward Deep Neural Network), along with the accuracy value for each. Two models were proposed in this study: an MLP model and a hybrid model. The MLP model had an accuracy of 87.50%, while the hybrid model had an accuracy of 89.00% for binary classification. Both proposed models therefore outperform all the models described in that earlier study in terms of binary classification accuracy, including the traditional machine learning techniques (SVM, KNN, RF, NB, and DT) and the FFDNN model.

The hybrid model surpasses the best previous model, FFDNN [13], and therefore all the models in [13], by at least 1.16% in accuracy. Additionally, Table 2 presents the results of the two models created for binary classification. The proposed models generalize well to unseen data, as demonstrated by the ROC curves, where the area under the curve (AUC) is high for both classes. The proposed models perform well in detecting each class, with an average F1 score of 89%.

Regarding accuracy, our hybrid CNN–LSTM models exceed every model in [14,15,16,17,18,19,20]. Table 6 compares the accuracy of this work and previous research for attack classification. The table lists various models and their respective accuracy percentages. DNN [14], CNN [15], Deep-MLP [16], RF [17], DLS–IDS [18], MFFSEM [19], and hybrid [20] are some of the models mentioned in the table. Furthermore, the proposed MLP model had an accuracy of 91%, while the hybrid model had an accuracy of 92%, demonstrating the superior performance of the proposed models in comparison to the existing ones.

Furthermore, the hybrid model outperforms both existing techniques and the MLP model. This suggests that the hybrid model’s combination of CNN-1D and LSTM architectures contributes to higher accuracy than MLP alone. The hybrid model takes advantage of both architectures’ strengths to capture complex patterns and temporal dependencies in data, resulting in improved classification performance. Furthermore, the proposed models—the MLP and hybrid models—were designed and trained specifically for the binary classification task under consideration. This specialization enabled them to concentrate on the specific requirements and characteristics of the dataset, resulting in higher accuracy when compared to the more generic approaches of existing techniques.

5. Discussion

IoT security has become an essential concern for network systems due to the rise in applications and users. Degradation of physical objects, equipment failures, and power limits are all examples of physical layer issues. Network layer problems include DoS attacks, sniffer attacks, unauthorized access, and gateway assaults. Many IoT devices rely on internal IoT security mechanisms, rendering them open to attacks. The first challenges that an internet system faces are authentication and immediate physical vulnerabilities. The next aspect of IoT security is the confidentiality of information transmitted through applications and services. Data security difficulties arise when a network system is compromised by spoofing or noise. Attacks such as DoS, probing, and DDoS are among the threats that might harm IoT products and services. The constraints of existing internet security services and advances in attacks made it necessary to implement an IDS based on DNN. The provided strategy addresses the overfitting problem. An IDS monitors network communication for both normal and abnormal behavior. The KDD99 dataset was normalized during the pre-processing stage, using the mean and standard deviation. Due to the intricate classification procedure, ReLU and softmax were used as the activation functions for the hidden and final layers, respectively.

The NSL–KDD dataset was compiled from the internet traffic logs of a basic IDS. Each record had 41 attributes describing the traffic input, as well as a label indicating the type of assault. DoS, probe, R2L (unauthorized remote system access), and U2R were the four types of assaults covered by the target variable “Class.” For binary classification, the MLP model was trained with the binary cross-entropy loss function, using the ADAM optimizer for 30 epochs. The MLP classifier’s AUC was 0.88, accuracy was 87%, precision was 89%, recall was 87%, and F1 was 87%. The hybrid CNN + LSTM model was trained using the ADAM optimizer and the binary cross-entropy loss function, with a batch size of 128 and a learning rate of 0.001. The confusion matrix in Figure 12 also demonstrates the efficiency of this classification strategy: the hybrid model correctly predicted 9260 of the 9711 class 0 (normal) records and 10,903 of the 12,833 class 1 (abnormal) records. The ROC curves for the MLP model for binary classification visually depict the trade-off between the model’s specificity and sensitivity. Figure 13 displays the ROC curves for the MLP model for each class, with AUC values of 0.88 for classes 0 and 1.

Table 3 compares the two models’ abilities to categorize attacks, and the hybrid model generally beats the MLP model. The MLP model was trained for 30 epochs using the ADAM optimizer and the categorical cross-entropy loss function. The MLP classifier recorded an F1 score of 83 percent, accuracy of 83 percent, precision of 86 percent, and an AUC value of 94 percent. The binary cross-entropy loss function was used to train the hybrid CNN + LSTM model across 30 epochs, with a batch size of 128 and a learning rate of 0.001. The MLP model’s confusion matrix is shown in Figure 15. The matrix’s columns represent actual classes, and its rows represent instances of a predicted class. For example, class 0 was correctly predicted for 6945 out of 7460, class 1 for 1467 out of 2885, and class 2 for 2215 out of 2421. The MLP model’s ROC curves for each class are shown in Figure 16, with AUC values of 0.96 for class 0, 0.95 for class 1, 0.94 for class 2, and 0.89 for class 3. The confusion matrix for the hybrid model can be seen in Figure 17. The matrix’s rows correspond to instances of a predicted class, and its columns represent actual classes. According to this matrix, 6945 of the 7460 records in class 0, 1467 in class 1, 2215 in class 2, and 36 of the 67 in class 3 were correctly predicted. Figure 18 shows that the AUC values of the hybrid model’s ROC curves for classes 0, 1, 2, and 3 were 0.97, 0.95, 0.95, and 0.93.

For all-class classification, the same hybrid model outperformed the MLP model. The MLP model was trained with the ADAM optimizer, a learning rate of 0.001, a batch size of 128, and the categorical cross-entropy loss function. The MLP classifier had above-average results of 94% for accuracy, precision, recall, and F1. The hybrid CNN + LSTM model, on the other hand, was trained over 30 epochs using the binary cross-entropy loss function and the ADAM optimizer, with a learning rate of 0.001 and a batch size of 128. The hybrid model’s accuracy, precision, AUC score, recall, and F1 were all greater than 97%. Figure 18 displays the MLP model’s confusion matrix. Each row of the matrix corresponds to a predicted class, whereas each column corresponds to an actual class. For example, 6499, 974, and 2192 samples were correctly predicted as class 1, class 2, and class 3, respectively, while 9325 of the 9711 class 0 samples were correctly predicted. The hybrid model’s ROC curves have AUC values of 0.96, 0.97, 0.89, 0.97, and 0.93 for classes 0, 1, 2, 3, and 4, respectively.
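The two loss functions named above differ in how they treat a multi-class output. The following sketch, on a hypothetical five-class example, contrasts them: categorical cross-entropy scores only the probability assigned to the true class, whereas binary cross-entropy treats each output as an independent yes/no decision. How the study wired binary cross-entropy to the hybrid model’s multi-class output is not stated, so applying it to softmax probabilities here is an assumption made purely for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """-sum_k y_k * log(p_k); only the true class contributes for one-hot targets."""
    return -sum(y * math.log(max(p, eps)) for y, p in zip(one_hot, probs))


def binary_cross_entropy(targets, probs, eps=1e-12):
    """Mean over outputs of -[y log p + (1 - y) log(1 - p)], each output independent."""
    terms = [
        -(y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps)))
        for y, p in zip(targets, probs)
    ]
    return sum(terms) / len(terms)


# Hypothetical logits for one record; class index 3 is taken as the true class.
logits = [0.2, -1.0, 0.5, 2.0, -0.3]
target = [0, 0, 0, 1, 0]
probs = softmax(logits)
cce = categorical_cross_entropy(target, probs)
bce = binary_cross_entropy(target, probs)
print(f"categorical CE = {cce:.4f}, binary CE = {bce:.4f}")
```

The two losses are on different scales, so their raw values should not be compared across models trained with different objectives.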

Finally, the MLP and hybrid models developed in this study were compared with earlier research. For binary classification, the accuracy of the SVM, KNN, RF, NB, DT, and FFDNN baselines ranged from 76.40% to 87.74% (Table 5), whereas the proposed models achieved about 87.50% for MLP and 89% for the hybrid model. For attack classification, the DNN, CNN, Deep-MLP, RF, DLS–IDS, and MFFSEM models scored between 78.50% and 84.33% (Table 6). The proposed hybrid model and MLP model scored 83.5% and 86.20%, respectively.

Three techniques were supported: the channel attention mechanism coupled with the BiLSTM network as the core network, the layered learning algorithm with higher dropout arrangement as the statistics data assimilation method to improve the model’s generalization capacity, and the ADASYN resampling algorithm as the data pre-processing technique to resolve the intrusion prevention data imbalance problem. As a result, the F1 score and accuracy of the proposed network model were 89.12% and 90.45% on the KDDTest+ test set. Furthermore, the suggested DLIND model outperformed several reference network models in evaluation metrics. Therefore, the proposed model is helpful for the present phase of NID development.
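To make the ADASYN resampling step concrete, here is a simplified binary sketch in plain Python. It follows the general recipe of the ADASYN paper (weight each minority sample by the share of majority samples among its k nearest neighbours, then interpolate towards minority neighbours), but it is an illustration under stated assumptions rather than the implementation used in the study: the function name `adasyn`, its parameters, and the toy data are hypothetical, and interpolation partners are drawn from the k nearest minority samples as a simplification.

```python
import math
import random


def _nearest(point, pool, count):
    """Indices of the `count` points in `pool` closest to `point`, excluding itself."""
    order = sorted(range(len(pool)), key=lambda j: math.dist(point, pool[j]))
    return [j for j in order if pool[j] is not point][:count]


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Simplified binary ADASYN; returns a list of synthetic minority samples."""
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_min = len(minority_pts)
    n_maj = len(X) - n_min
    n_to_generate = int((n_maj - n_min) * beta)  # G in the ADASYN paper
    if n_to_generate <= 0 or n_min < 2:
        return []

    # r_i: share of majority samples among the nearest neighbours of each minority point.
    ratios = []
    for p in minority_pts:
        nbrs = _nearest(p, X, k)
        ratios.append(sum(1 for j in nbrs if y[j] != minority) / len(nbrs))
    total = sum(ratios)
    if total == 0:
        # No minority point borders the majority class: fall back to uniform weights.
        weights = [1.0 / n_min] * n_min
    else:
        weights = [r / total for r in ratios]

    synthetic = []
    for p, w in zip(minority_pts, weights):
        min_nbrs = _nearest(p, minority_pts, k)
        # Harder-to-learn points (higher weight) receive more synthetic neighbours.
        for _ in range(round(w * n_to_generate)):
            q = minority_pts[rng.choice(min_nbrs)]
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(p, q)])
    return synthetic


# Hypothetical 2-D toy data: 8 majority points (label 0), 3 minority points (label 1).
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 1.0], [1.0, 0.8],
     [0.2, 0.4], [0.4, 0.2], [1.0, 1.0], [1.1, 0.9], [0.9, 1.1]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
new_points = adasyn(X, y, minority=1, k=3)
print(len(new_points), "synthetic minority samples generated")
```

Because per-sample counts are rounded, the number of generated samples can differ slightly from the target G. Resampling of this kind is normally applied to the training split only, so that the test set keeps its original class distribution.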

Although the Machine Learning-Based Adaptive Synthetic Sampling Technique for Intrusion Detection has shown promise, there are various limitations that need to be taken into account. One of the significant limitations is dataset bias, as the effectiveness of the proposed technique is heavily reliant on the quality and representativeness of the training dataset. If the dataset used for training is biased or lacks diversity, it may limit the model’s generalizability to different network environments and real-world scenarios. Another limitation is the dependency on manual feature engineering, even though the deep learning model combines an attention mechanism with an LSTM network to capture sequence properties. This process can be time-consuming and requires expert knowledge to select and engineer relevant features. Automating this process or exploring more efficient feature extraction techniques could enhance the model’s effectiveness. The scalability of the proposed technique may also be a concern when dealing with large-scale IoT networks. As the number of devices and the amount of network traffic increases, the computational requirements and processing time of the model may become impractical. Ensuring the scalability of the proposed approach to handle large volumes of data is essential for its practical implementation. Moreover, deep learning models such as the ones proposed in this study are often considered black-box models, meaning that their decision-making process and reasoning are not easily interpretable. Incorporating interpretability techniques or exploring explainable AI methods could address this limitation.

6. Conclusions

Network intrusion detection systems (NIDS) are the area of computer science and engineering concerned with creating tools and methods for identifying and blocking unauthorized access to networked and computer systems. Intrusion detection systems (IDS), firewalls, and other IoT security technologies and methodologies make it possible to recognize and respond to potential IoT security risks. The R2L and U2R attacks in the NSL–KDD dataset are the categories with the lowest detection performance and are therefore the most likely to be overlooked, so the method in this work was developed to improve their detection rate. A variety of DL approaches were used in this work to find anomalies in IDS. The approach produced convincing and trustworthy results when measured against various metrics. One of the work’s most noteworthy accomplishments was using the feature selection strategy to train classifiers on the most significant feature correlations while avoiding missed leads during training, to give the best results. The proposed strategy focused on binary classification using DL techniques. When the results of the algorithms were compared, some classifiers’ results were close together, and the CNN classifier generated the best results. Deep learning demonstrated its viability and superiority when applied to the binary classification of network IDS.

The NSL–KDD dataset includes internet traffic records from a basic intrusion detection system. The target variable “Class” covers four types of attack: DoS, Probe, R2L (unauthorized remote system access), and U2R. The hybrid model outperformed the MLP model for all-class classification. Class 1 was correctly identified for 1467 out of 2885 samples, Class 2 for 2215 out of 2421, and Class 0 for 6945 out of 7460. The hybrid model had accuracy, F1 score, precision, recall, and AUC score greater than 89%. The F1 score and accuracy of the proposed network model on the KDDTest+ test set were 90.45% and 89.12%, respectively. According to the comparison, the suggested MLP and hybrid models outperformed prior reference network models in classification. The proposed approach is helpful in NIDS development.

Future research on intrusion detection and network security can explore improving the detection of R2L and U2R attacks, exploring different deep learning architectures, advancing feature selection strategies, extending the approach to multi-class classification, conducting real-world evaluations, and enhancing the interpretability of deep learning models for network administrators and security analysts.

Author Contributions

Conceptualization, M.Z. and S.A.A.; methodology, M.Z.; software, M.S.A.-R.; validation, M.Z., S.A.A., and M.Z.; formal analysis, S.A.A.; investigation, M.Z.; resources, M.S.A.-R.; data curation, M.Z.; writing—original draft preparation, M.Z.; writing—review and editing, M.S.A.-R.; visualization, S.A.A.; supervision, M.Z.; project administration, M.S.A.-R.; funding acquisition, S.A.A. All authors have read and agreed to the published version of the manuscript.

Funding

The authors extend their appreciation to the Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, for funding this research (IFKSURC-1-7104).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used in this study is publicly available at https://www.kaggle.com/datasets/hassan06/nslkdd (accessed on 20 January 2018).

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations and Definitions

Abbreviation	Definition	Abbreviation	Definition
IDS	Intrusion Detection System	ADASYN	Adaptive Synthetic Sampling
CNN	Convolutional Neural Network	MLP	Multilayer Perceptron
DoS	Denial-of-Service Attack	U2R	User-to-Root Attack
R2L	Remote-to-Local Attack	ML	Machine Learning
DL	Deep Learning	KNN	K-Nearest Neighbor
DBN	Deep Belief Network	PNN	Probabilistic Neural Network
RNN	Recurrent Neural Network	Bi-LSTM	Bidirectional LSTM
DLIND	Deep Learning Model for Network Intrusion Detection	SVM	Support Vector Machine
NIDS	Network Intrusion Detection System	HIDS	Host Intrusion Detection System
CIDS	Cloud Intrusion Detection System	API	Application Programming Interface
RF	Random Forest	NB	Naïve Bayes
GA	Genetic Algorithm	FFNN	Feed Forward Neural Network
AI	Artificial Intelligence	DNN	Deep Neural Network
PCA	Principal Component Analysis	LM	Levenberg–Marquardt
ANN	Artificial Neural Network	ReLU	Rectified Linear Unit
NSID	Network Security and Intrusion Detection	t-SNE	t-distributed Stochastic Neighbor Embedding
TPR	True Positive Rate	FPR	False Positive Rate
ROC	Receiver Operating Characteristic

References

Kim, J.; Shin, N.; Jo, S.Y.; Kim, S.H. Method of intrusion detection using deep neural network. In Proceedings of the 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), Jeju, Republic of Korea, 13–16 February 2017; pp. 313–316.
Revathy, F.D.H.; Mani, N. Millennials’ Mentality on School Bullying Through R Programming. Int. J. Recent Technol. Eng. 2019, 8, 2170–2173.
Li, Y.; Xu, Y.; Liu, Z.; Hou, H.; Zheng, Y.; Xin, Y.; Zhao, Y.; Cui, L. Robust detection for network intrusion of industrial IoT based on multi-CNN fusion. Measurement 2019, 154, 107450.
Butun, I.; Morgera, S.D.; Sankar, R. A survey of intrusion detection systems in wireless sensor networks. IEEE Commun. Surv. Tutor. 2013, 16, 266–282.
Liu, H.; Lang, B. Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey. Appl. Sci. 2019, 9, 4396.
Alladi, T.; Kohli, V.; Chamola, V.; Yu, F.R.; Guizani, M. Artificial Intelligence (AI)-Empowered Intrusion Detection Architecture for the Internet of Vehicles. IEEE Wirel. Commun. 2021, 28, 144–149.
Lee, S.-W.; Sidqi, H.M.; Mohammadi, M.; Rashidi, S.; Rahmani, A.M.; Masdari, M.; Hosseinzadeh, M. Towards secure intrusion detection systems using deep learning techniques: Comprehensive analysis and review. J. Netw. Comput. Appl. 2021, 187, 103111.
Carneiro, J.; Oliveira, N.; Sousa, N.; Maia, E.; Praça, I. Machine learning for network-based intrusion detection systems: An analysis of the CIDDS-001 dataset. In Distributed Computing and Artificial Intelligence, Volume 1: 18th International Conference; Springer International Publishing: Berlin/Heidelberg, Germany, 2022; pp. 148–158.
Ogundokun, R.O.; Awotunde, J.B.; Sadiku, P.; Adeniyi, E.A.; Abiodun, M.; Dauda, O.I. An Enhanced Intrusion Detection System using Particle Swarm Optimization Feature Extraction Technique. Procedia Comput. Sci. 2021, 193, 504–512.
Rawat, S.; Srinivasan, A.; Ravi, V.; Ghosh, U. Intrusion detection systems using classical machine learning techniques vs integrated unsupervised feature learning and deep neural network. Internet Technol. Lett. 2020, 5, e232.
Naveed, M.; Arif, F.; Usman, S.M.; Anwar, A.; Hadjouni, M.; Elmannai, H.; Ullah, S.S.; Umar, F. A Deep Learning-Based Framework for Feature Extraction and Classification of Intrusion Detection in Networks. Wirel. Commun. Mob. Comput. 2022, 2022, 2215852.
Fu, Y.; Du, Y.; Cao, Z.; Li, Q.; Xiang, W. A Deep Learning Model for Network Intrusion Detection with Imbalanced Data. Electronics 2022, 11, 898.
Musa, U.S.; Chhabra, M.; Ali, A.; Kaur, M. Intrusion detection system using machine learning techniques: A review. In Proceedings of the 2020 International Conference on Smart Electronics and Communication (ICOSEC), Trichy, India, 10–12 September 2020; IEEE: Manhattan, NY, USA, 2020; pp. 149–155.
Jiadong, R.; Xinqian, L.; Qian, W.; Haitao, H.; Xiaolin, Z. A multi-level intrusion detection method based on KNN outlier detection and random forests. J. Comput. Res. Dev. 2019, 56, 566.
Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
Wisanwanichthan, T.; Thammawichai, M. A Double-Layered Hybrid Approach for Network Intrusion Detection System Using Combined Naive Bayes and SVM. IEEE Access 2021, 9, 138432–138450.
Ieracitano, C.; Adeel, A.; Morabito, F.C.; Hussain, A. A novel statistical analysis and autoencoder driven intelligent intrusion detection approach. Neurocomputing 2020, 387, 51–62.
Ding, Y.; Zhai, Y. Intrusion detection system for NSL-KDD dataset using convolutional neural networks. In Proceedings of the 2018 2nd International Conference on Computer Science and Artificial Intelligence, Shenzhen, China, 8–10 December 2018; pp. 81–85.
Gao, X.; Shan, C.; Hu, C.; Niu, Z.; Liu, Z. An adaptive ensemble machine learning model for intrusion detection. IEEE Access 2019, 7, 82512–82521.
Jiang, K.; Wang, W.; Wang, A.; Wu, H. Network intrusion detection combined hybrid sampling with deep hierarchical network. IEEE Access 2020, 8, 32464–32476.
Bondoc, C.E.; Malawit, T.G. Cybersecurity for higher education institutions: Adopting regulatory framework. Glob. J. Eng. Technol. Adv. 2020, 2, 16.
Berman, D.S.; Buczak, A.L.; Chavis, J.S.; Corbett, C.L. A Survey of Deep Learning Methods for Cyber Security. Information 2019, 10, 122.
Hameed, S.; Khan, F.I.; Hameed, B. Understanding Security Requirements and Challenges in Internet of Things (IoT): A Review. J. Comput. Netw. Commun. 2019, 2019, 9629381.
Ayrour, Y.; Raji, A.; Nassar, M. Modelling cyber-attacks: A survey study. Netw. Secur. 2018, 2018, 13–19.
Hindy, H.; Tachtatzis, C.; Atkinson, R.; Bayne, E.; Bellekens, X. MQTT-IoT-IDS 2020: MQTT internet of things intrusion detection dataset. IEEE Dataport. 2020.
Hindy, H.; Tachtatzis, C.; Atkinson, R.; Bayne, E.; Bellekens, X. Developing a siamese network for intrusion detection systems. In Proceedings of the 1st Workshop on Machine Learning and Systems, ser. EuroMLSys’21, Online, UK, 26 April 2021; ACM: New York, NY, USA, 2021; pp. 120–126.
Hindy, H.; Tachtatzis, C.; Atkinson, R.; Brosset, D.; Bures, M.; Andonovic, I.; Michie, C.; Bellekens, X. Leveraging Siamese networks for One-Shot intrusion detection model. arXiv 2020, arXiv:2006.15343.
Hindy, H.; Atkinson, R.; Tachtatzis, C.; Colin, J.-N.; Bayne, E.; Bellekens, X. Utilising Deep Learning Techniques for Effective Zero-Day Attack Detection. Electronics 2020, 9, 1684.
Khraisat, A.; Gondal, I.; Vamplew, P.; Kamruzzaman, J. Survey of intrusion detection systems: Techniques, datasets and challenges. Cybersecurity 2019, 2, 20.
Liu, C.; Liu, Y.; Yan, Y.; Wang, J. An Intrusion Detection Model with Hierarchical Attention Mechanism. IEEE Access 2020, 8, 67542–67554.
Dwivedi, S.; Vardhan, M.; Tripathi, S.; Shukla, A.K. Implementation of adaptive scheme in evolutionary technique for anomaly-based intrusion detection. Evol. Intell. 2020, 13, 103–117.
Su, T.; Sun, H.; Zhu, J.; Wang, S.; Li, Y. BAT: Deep Learning Methods on Network Intrusion Detection Using NSL-KDD Dataset. IEEE Access 2020, 8, 29575–29585.
Alagrash, Y.; Drebee, A.; Zirjawi, N. Comparing the Area of Data Mining Algorithms in Network Intrusion Detection. J. Inf. Secur. 2020, 11, 1–18.
Khammassi, C.; Krichen, S. A NSGA2-LR wrapper approach for feature selection in network intrusion detection. Comput. Netw. 2020, 172, 107183.
Gauthama Raman, M.R.; Somu, N.; Jagarapu, S.; Manghnani, T.; Selvam, T.; Krithivasan, K.; Shankar Sriram, V.S. An efficient intrusion detection technique based on support vector machine and improved binary gravitational search algorithm. Artif. Intell. Rev. 2020, 53, 3255–3286.
Dey, S.K.; Rahman, M.M. Effects of Machine Learning Approach in Flow-Based Anomaly Detection on Software-Defined Networking. Symmetry 2020, 12, 7.
Elmasry, W.; Akbulut, A.; Zaim, A.H. Evolving deep learning architectures for network intrusion detection using a double PSO metaheuristic. Comput. Netw. 2020, 168, 107042.
Iwendi, C.; Khan, S.; Anajemba, J.H.; Mittal, M.; Alenezi, M.; Alazab, M. The Use of Ensemble Models for Multiple Class and Binary Class Classification for Improving Intrusion Detection Systems. Sensors 2020, 20, 2559.
Kumar, G. An improved ensemble approach for effective intrusion detection. J. Supercomput. 2020, 76, 275–291.
Ashiku, L.; Dagli, C. Cybersecurity as a Centralized Directed System of Systems using SoS Explorer as a Tool. In Proceedings of the 2019 14th Annual Conference System of Systems Engineering (SoSE), Anchorage, AK, USA, 19–22 May 2019; pp. 140–145.
Latif, S.; Zeba, I.; Zhuo, Z.; Jawad, A. DRaNN: A Deep Random Neural Network Model for Intrusion Detection in Industrial IoT. In Proceedings of the 2020 International Conference on UK-China Emerging Technologies (UCET), Glasgow, UK, 20–21 August 2020; pp. 1–4.
Kasongo, S.M.; Sun, Y. A deep learning method with wrapper based feature extraction for wireless intrusion detection system. Comput. Secur. 2020, 92, 101752.
Supratik, P.; Kurin, V.; Whiteson, S. Fast efficient hyperparameter tuning for policy gradients. arXiv 2019, arXiv:1902.06583.
Zhang, H.; Wu, C.Q.; Gao, S.; Wang, Z.; Xu, Y.; Liu, Y. An Effective Deep Learning Based Scheme for Network Intrusion Detection. In Proceedings of the 2018 24th International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 682–687.
Khan, A.R.; Kashif, M.; Jhaveri, R.H.; Raut, R.; Saba, T.; Bahaj, S.A. Deep Learning for Intrusion Detection and Security of Internet of Things (IoT): Current Analysis, Challenges, and Possible Solutions. Secur. Commun. Netw. 2022, 2022, 4016073.
Abbasi, R.; Chen, J.; Al-Otaibi, Y.; Rehman, A.; Abbas, A.; Cui, W. RDH-based dynamic weighted histogram equalization using for secure transmission and cancer prediction. Multimedia Syst. 2021, 27, 177–189.
Ali, M.H.; Jaber, M.M.; Abd, S.K.; Rehman, A.; Awan, M.J.; Damaševičius, R.; Bahaj, S.A. Threat Analysis and Distributed Denial of Service (DDoS) Attack Recognition in the Internet of Things (IoT). Electronics 2022, 11, 494.
Khan, H.U.; Ali, F.; Alshehri, Y.; Nazir, S. Towards Enhancing the Capability of IoT Applications by Utilizing Cloud Computing Concept. Wirel. Commun. Mob. Comput. 2022, 2022, 233531.
Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A. A Detailed Analysis of the KDD CUP 99 Data Set. In Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), Ottawa, ON, Canada, 8–10 July 2009.
Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Over-sampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
He, H.; Bai, Y.; Garcia, E.A.; Li, S. ADASYN: Adaptive synthetic sampling approach for imbalanced learning. In Proceedings of the 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), Hong Kong, China, 1–8 June 2008; pp. 1322–1328.
Van der Maaten, L.J.P.; Hinton, G.E. Visualizing High-Dimensional Data Using t-SNE. J. Mach. Learn. Res. 2008, 9, 2579–2605.
Kasongo, S.M.; Sun, Y. A Deep Learning Method with Filter Based Feature Engineering for Wireless Intrusion Detection System. IEEE Access 2019, 7, 38597–38607.
Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H.; Chen, K.; Mitchell, R.; Cano, I.; Zhou, T. Xgboost: Extreme gradient boosting. Available online: https://cran.microsoft.com/snapshot/2017-12-11/web/packages/xgboost/vignettes/xgboost.pdf (accessed on 21 May 2023).
Zafeiropoulos, N.; Mavrogiorgou, A.; Kleftakis, S.; Mavrogiorgos, K.; Kiourtis, A.; Kyriazis, D. Interpretable Stroke Risk Prediction Using Machine Learning Algorithms. In Intelligent Sustainable Systems: Selected Papers of WorldS4 2022; Springer Nature: Singapore, 2023; Volume 2, pp. 647–656.
Mavrogiorgou, A.; Kiourtis, A.; Kleftakis, S.; Mavrogiorgos, K.; Zafeiropoulos, N.; Kyriazis, D. A Catalogue of Machine Learning Algorithms for Healthcare Risk Predictions. Sensors 2022, 22, 8615.
Thakkar, A.; Lohiya, R. A survey on intrusion detection system: Feature selection, model, performance measures, application perspective, challenges, and future research directions. Artif. Intell. Rev. 2021, 55, 453–563.
Zhang, C.; Ruan, F.; Yin, L.; Chen, X.; Zhai, L.; Liu, F. A Deep Learning Approach for Network Intrusion Detection Based on NSL-KDD Dataset. In Proceedings of the 2019 IEEE 13th International Conference on Anti-counterfeiting Security, and Identification (ASID), Xiamen, China, 25–27 October 2019; p. 4145.

Figure 1. Framework for IDS system.

Figure 2. Machine learning algorithm in IDS.

Figure 3. Overview of the proposed approach.

Figure 4. Distribution of the categories of the traffic records.

Figure 5. Distribution of the categories of the traffic records after ADASYN.

Figure 6. Diagram for the classification of the type of intrusion.

Figure 7. 2D visualization of the intrusion attacks.

Figure 8. Feature importance scores.

Figure 9. Architecture of the MLP model.

Figure 10. Architecture of the hybrid model.

Figure 11. Confusion matrix for the MLP model for the binary classification.

Figure 12. Confusion matrix for the hybrid model for the binary classification.

Figure 13. MLP model ROC curves for binary classification.

Figure 14. Hybrid model ROC curves for binary classification.

Figure 15. Attack classification confusion matrix for the MLP.

Figure 16. ROC curves for the MLP model for attack classification.

Figure 17. Hybrid model confusion matrix for attack classification.

Figure 18. Hybrid model ROC curves for attack classification.

Figure 19. MLP model’s confusion matrix for all-class classification.

Figure 20. The MLP model’s ROC curves for all-class classification.

Figure 21. Confusion matrix for the hybrid model for all-class classification.

Figure 22. The hybrid model’s ROC curves for all-class classification.

Table 1. List of past paper references, including methodology used and results.

Ref.	Algorithm/Dataset	Method Used	Methodology	Findings
[12]	Dataset for NSL–KDD. KDDTest+ had 21,543 traffic samples, while the training set had 125,983 traffic samples. In addition, its training set included 19 different kinds of attack	Supervised Learning	Confusion matrix, LSTM, RNN, DLIND, Adaptive Synthetic Sampling (ADASYN), and SMOTE data augmentation method	89.45% F1 score and 91.4% accuracy
[17]	Datasets from KDD Cup 1999 and NSL–KDD	Supervised Learning	ML, MLP, LSVM, LSTM, and DL	Accuracy of 82.34% and a multi-classification accuracy of 80.12%
[14]	NSL–KDD (KDDTest+) and KDD Cup 1999 datasets	Supervised Learning	KNN, SVM, SMOTE Algorithm, Random Forest, and Similarity Matrix	Accuracy: RF, 99.5%; RepTree, 90.56%; MLP and SVM, 99.12%; enhanced RF, 99.8%
[20]	Datasets NSL–KDD and UNSW-NB15. There were 82,337 test connection records and 175,343 train connection records.	Supervised Learning	CNN, BiLSTM, DL, TensorFlow, Python, SMOTE method	Classification accuracy: 83.65% and 78.13%
[26]	Datasets from IDS, including CICIDS2017, KDD Cup 1999, and NSL–KDD	Supervised Learning	ML, ANN, and Classification Confusion Matrix	85% Siamese Network Accuracy
[25]	NSL–KDD, CICIDS 2017, and KDD Cup 1999	Supervised Learning	CNN, LSTM, Siamese Network, and ANN	85% accuracy; accuracy on KDD Cup 1999 and NSL–KDD is 75%
[36]	The NSL–KDD dataset had 41 characteristics.	Supervised Learning	TensorFlow, LSTM, DL, RNN, ML, SVM, DDoS, DNN	82.34% RF Accuracy and 88.12% GRU-LSTM Classifier
[37]	NSL–KDD and CICIDS2017	Supervised Learning	DNN, LSTM, RNN, Multiclass Classification	RNN: 72.99%; AE + DBN + DNN + ELM: 97.96%; Double PSO + LSTM + RNN: 98.16%; Double PSO + DBN: 99.91%

Table 2. Performance of the two models for binary classification.

Model	AUC	Accuracy	Precision	Recall	F1 Score
MLP	0.88	0.87	0.89	0.89	0.88
Hybrid model	0.98	0.89	0.90	0.89	0.90

Table 3. The effectiveness of the two models for classifying attacks.

Model	AUC	Accuracy	Precision	Recall	F1 Score
MLP	0.94	0.91	0.72	0.64	0.63
Hybrid model	0.95	0.92	0.74	0.65	0.64

Table 4. Performance of the two models for all-class classification.

Model	AUC	Accuracy	Precision	Recall	F1 Score
MLP	0.94	0.94	0.73	0.72	0.66
Hybrid model	0.97	0.94	0.75	0.71	0.70
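In Tables 3 and 4, accuracy sits well above precision, recall, and F1. One plausible reading, given that the abstract refers to macro metrics, is that those three columns are macro-averaged over imbalanced classes. The sketch below shows how such a gap can arise; it uses the row/column convention described in the text (rows are predicted classes, columns are actual classes), and the function name `macro_metrics` and the toy matrix are hypothetical, not the paper’s data.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged precision/recall/F1 from a confusion matrix.

    cm[i][j] = number of samples predicted as class i whose actual class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[k])                    # row sum: everything predicted as k
        actual_k = sum(cm[i][k] for i in range(n))  # column sum: everything truly k
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical imbalanced 3-class matrix: 1000, 100, and 10 actual samples per class.
toy_cm = [
    [950, 10, 6],
    [30, 85, 2],
    [20, 5, 2],
]
toy = macro_metrics(toy_cm)
for name, value in toy.items():
    print(f"{name}: {value:.3f}")
```

In this toy case the rare third class is mostly misclassified, which hardly affects accuracy but lowers every macro-averaged column.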

Table 5. Accuracy comparison between the proposed work and previous research for binary classification.

Techniques	Accuracy (%)
SVM [58]	80.62
KNN [58]	76.40
RF [58]	85.35
NB [58]	78.80
DT [58]	77.31
FFDNN [58]	87.74
Our MLP model	87.50
Our Hybrid model	89.00

Table 6. Accuracy comparison between proposed work and previous research for attack classification.

Models	Accuracy (%)
DNN [14]	78.50
CNN [15]	79.48
Deep-MLP [16]	79.74
RF [17]	81.95
DLS–IDS [18]	83.57
MFFSEM [19]	84.33
Hybrid [20]	85.24
Our MLP model	91.00
Our Hybrid model	92.00

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Zakariah, M.; AlQahtani, S.A.; Al-Rakhami, M.S. Machine Learning-Based Adaptive Synthetic Sampling Technique for Intrusion Detection. Appl. Sci. 2023, 13, 6504. https://doi.org/10.3390/app13116504

