Distinctive Measurement Scheme for Security and Privacy in Internet of Things Applications Using Machine Learning Algorithms

Alhalabi, Wadee; Al-Rasheed, Amal; Manoharan, Hariprasath; Alabdulkareem, Eatedal; Alduailij, Mai; Alduailij, Mona; Selvarajan, Shitharth

doi:10.3390/electronics12030747


by Wadee Alhalabi ¹, Amal Al-Rasheed ², Hariprasath Manoharan ³, Eatedal Alabdulkareem ⁴, Mai Alduailij ⁴, Mona Alduailij ⁴ and Shitharth Selvarajan ⁵,*

¹ Computer Science Department, Virtual Reality Research Group, King Abdulaziz University, P.O. Box 80200, Jeddah 21589, Saudi Arabia
² Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
³ Department of Electronics and Communication Engineering, Panimalar Engineering College, Chennai 600069, India
⁴ Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
⁵ Department of Computer Science and Engineering, Kebri Dehar University, Kebri Dehar 250, Ethiopia
* Author to whom correspondence should be addressed.

Electronics 2023, 12(3), 747; https://doi.org/10.3390/electronics12030747

Submission received: 31 December 2022 / Revised: 30 January 2023 / Accepted: 30 January 2023 / Published: 2 February 2023

(This article belongs to the Section Artificial Intelligence)


Abstract:

More significant data are available thanks to the present Internet of Things (IoT) application trend, which can be accessed in the future using some platforms for data storage. An external storage space is required for practical purposes whenever a data storage platform is created. However, in the IoT, certain cutting-edge storage methods have been developed that compromise the security and privacy of data transfer processes. As a result, the suggested solution creates a standard mode of security operations for storing the data with little noise. One of the most distinctive findings in the suggested methodology is the incorporation of machine learning algorithms in the formulation of analytical representations. The aforementioned integration method ensures high-level quantitative measurements of data security and privacy. Due to the transmission of large amounts of data, users are now able to assess the reliability of data transfer channels and the duration of queuing times, where each user can separate the specific data that has to be transferred. The created system is put to the test in real time using the proper metrics, and it is found that machine learning techniques improve security more effectively. Additionally, for 98 percent of the scenarios defined, the accuracy for data security and privacy is maximized, and the predicted model outperforms the current method in all of them.

Keywords:

software defined networks (SDN); security; privacy; Internet of Things (IoT); machine learning

1. Introduction

For specified applications where high data security must be built, the entire IoT development process is carried out in real time. The majority of data transferred via wireless applications depends on a number of variables, including data volume, size, and mode of transmission, where cloud integrations are established for high-security authentications. However, the majority of IoT apps use certain unnamed elements to secure data, which does not entirely control the accuracy of data transmission and reception [1]. In order to improve data security and privacy during transmission and reception phases, where various parametric evaluations for data determinations are conducted, the proposed method is introduced [2]. With a given set of data, even additional black-box design functions can be created, adding to the complexity of IoT processing methods [3]. By adopting an external source where the cost of monitoring is increased, customers can manage some data issues, such as denial-of-service attacks, cyber-attacks, etc. [4]. However, the suggested system includes a sophisticated data monitoring system with research characteristics for minimization scenarios, such as noise factors, malicious node detection, and data queuing reductions [5]. Additionally, a fresh set of formulations with precise design methods is offered, and all machine learning techniques that offer additional support for the proposed system are merged [6]. There are numerous authentication stages in the IoT development process where security and privacy must be guaranteed with the same configuration steps.

Figure 1 represents a general block diagram for different devices in the network where, at initial state, multiple sensors are connected with accurate IoT devices. Then the device representations are made with distinct connection representations, and data interconnection is prepared with more networks. Once the networks are connected, routing representation procedures are carried out for all connected sensors and components. Then a centralized network formation is made which directly provides connection to wireless networks. Since most of the IoT applications are distributed in wireless operation, high security enhancement is provided with machine learning algorithms in a distributed mode; thus, data storage and analytics for IoT operations are completed in a much clearer way. In this type of continuous secured IoT data transfer, only distributed users are connected, which in turn maximizes the privacy of the entire network.

1.1. Literature Survey

Analyzing some of the literature models that increase support for new IoT applications is therefore important. The majority of the current systems provide security and privacy for data connections in a discrete manner; however, the current technological development necessitates integrating all defined formulations. As a result, each technique is individually analyzed before being introduced in order to create a combined mechanism for data security and privacy. In order to offer information regarding data leakage that is present in various public data functions, Ref. [1] establishes an accuracy trade-off criterion. The IoT data leakage issues will have a significant impact on the system, which in turn causes high distortion rates in real time. As a result, data security will be stronger if notation functions with privacy inputs are supplied. However, if real-time notation functions are applied, the included system will get seriously confused by system modifications, which will impact data secrecy features. Since there would be a substantial impact owing to the increased user base, mathematical models are being created to avoid data theft on social networks [2]. This leads to the introduction of a hypothetical decision-making process with two distinct nodes, such as adversarial and ideal, where data falsification is entirely eliminated from the system. However, the number of false positives in planned systems that are not processed due to fewer requests must be detected using an informed decision-making technique.

Utilizing a trust management framework [3] that processes big data and has a feature set, the majority of the development process for IoT applications is carried out. It is vital to supply crowdsourcing to connected devices where multi-perspective task forces can be enabled in order to regulate the degree of confidence in data privacy. Even if these task forces are activated, confusion metrics in the real-time database will continue to impact user data. As a result, by permitting node technical features [4] with differentiation and clustering approaches, representations are created instead of data task forces. As a result, a multi-signature platform is developed for data preservation and is used to store various transmitted signature properties. However, these features can only be used with a system if a significant amount of memory is available; as a result, a dimensionless system must be developed. In order to provide block chain technology with comprehensive system transfer function protection, a dimensionless system must be established throughout the IoT design phase, which is not achievable with real-time data transmission. Researchers take additional steps to determine the future course of IoT data security and privacy with big data deployments [5]. Certain technological features necessitate clear verification and authentication requirements for data processing and delivery to the receiver. Such technical processes need a real-time watermarking system, since certain apps can detect environmental changes. All defined layers of open system interconnections are examined to understand the real-time effects on data security, and it has been discovered that the physical layer is crucial for preventing data theft [6]. A convex estimating approach is presented to address such attacks because the physical layer is thought to be one of the main sources for passive attacks. 
Even though the physical layer is able to control attacks, the remaining top levels will still significantly disrupt the various communication layers; as a result, the same estimation algorithm must be used for the upper layers, which is a much more challenging procedure.

In order to provide high security to all data-connected networks, some system functions are carried out with resilient characteristics as offloading methods in industrial processes [7]. If an edge computing system is created, some manual work will need to be done to determine whether an attack is coming from an internal or external user. However, because of the bulk of the data, manual operations are avoided, and only automatic attack detection is carried out on centralised servers. A lightweight authentication protocol is enabled for all communication protocols in order to transform network functioning in an automatic manner, making IoT networks very flexible against various forms of threats [8]. To maintain data privacy in the rising cloud environment, a random provability verification method is used in the case of light-weight protocols. Even though IoT cloud storage solutions already have a reliable verification mechanism in place, system developments that are mostly the result of data attacks require an external verification method. A spectrum sharing strategy is required, per the IoT procedure with stable verification [9], for stabilizing the complete database when using distributed ways. Data symmetry is established throughout the full IoT process with power degradation using the distributed technique, which is indicated as a two-scale authentication model. Data cannot reach the receiver if energy to IoT systems is not supplied properly; thus, appropriate power must be supplied in both scale factors. To retain all of the data in a set of records utilizing some encryption keys, evolutionary approaches must be used [10]. To increase the effectiveness of data privacy amongst users, a data sharing scheme is enabled within the same user platform due to the requirement that the encryption keys be consistent across all IoT systems. Table 1 lists the important works in IoT data security and privacy that use various algorithms.

1.2. Research Gap and Motivation

Most of the existing approaches provide real-time IoT information that is present at the data transfer state where different approaches are implemented for enhancing the privacy of the network. However, the major gap that is observed in existing approaches [17,18,19,20] is that data security is made using only an authentication key factor, which is not highly tolerable if multiple data are transmitted over the same network in the same time period. Even if a greater amount of data is transmitted in a distributed way, the system suffers with a misperception problem if size and representation of data are identical. Further, in IoT applications, multiple sensors are interconnected with a distinct network, thereby producing duplicate packets over the entire network, which needs to be avoided. If duplicate data are created, then external users can certainly catch the data by creating multiple destinations in the transmitted time period. Therefore, some alternate design representations must be incorporated into existing approaches to enhance the security and privacy of IoT operations.

To overcome the security and privacy gap that is present in existing approaches, the projected design is generated with analytical representations. In the proposed method, unique design factors are combined with data noise, robustness, configurations and length of the network. Moreover, with the aforementioned security factors it is possible to transmit more data in the same time period even on the same network. In addition, the response of IoT data in the system is much higher with a minimized queue, thereby making the connected network highly secured from duplicated packets. Furthermore, the designed system is combined with a machine learning algorithm by which pseudo labels are provided for each set of transmitted data, thus maximizing data privacy on a large scale.

1.3. Objectives

The proposed design for IoT representations considers the transmission of multiple data to distributed users, where the combined multi-objective case is as follows:

To minimize the robustness and noise factors in IoT data, thereby decreasing error functions.
To eliminate the presence of malicious nodes at low congestion factor in interconnected system operations.
To provide individual data set representations with labelled data sets even for different cluster regions.

2. Security: Analytical Representations

In order to set up a system for various expanding applications where unique design models can increase the accuracy of prediction, the mathematical approaches for IoT models are presented. However, it has been noted that individual system analysis has increased the overall complexity of network operations. Common system representations must be created and examined in order to reduce the complexity of the IoT network for different applications, which is done in this part. Additionally, the main issue of security and data privacy is also modeled mathematically using cloud storage techniques, combining all system communication layers. Every time data is communicated in the Internet of Things, it can be secured by utilizing an imprint representation in the presence of a licensed appearance. To do this, Equation (1) can be used to minimize robustness, as shown below,

$$R_{i} = \min \sum_{i=1}^{n} \frac{M_{p}(i)}{I_{i}(i)} \tag{1}$$

where

$M_{p}$ indicates number of identical configurations,
$I_{i}$ represents imprint configurations.

Equation (1) indicates that if identical configurations are generated for different data, they must be eliminated from the network using some represented trademarks. To use the imprint configuration, the noise in the IoT data signals must be reduced as shown in Equation (2):

$$N_{i} = \min \sum_{i=1}^{n} \frac{γ_{i}}{E_{i}} \tag{2}$$

where

$γ_{i}$ describes maximum number of integrated configurations in data,
$E_{i}$ indicates total error functions.
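The quantities inside the minimizations of Equations (1) and (2) can be evaluated with a short sketch. This is only an illustration: the helper name `ratio_sum` and all input values are hypothetical and are not taken from the paper's experiments.

```python
from typing import Sequence


def ratio_sum(numerators: Sequence[float], denominators: Sequence[float]) -> float:
    """Sum of element-wise ratios, i.e. the expression inside the min of Equations (1) and (2)."""
    if len(numerators) != len(denominators):
        raise ValueError("inputs must have the same length")
    return sum(n / d for n, d in zip(numerators, denominators))


# Hypothetical values for illustration only.
identical_configs = [4.0, 6.0, 9.0]   # M_p(i)
imprint_configs = [2.0, 3.0, 3.0]     # I_i(i)
integrated_configs = [10.0, 20.0]     # gamma_i
error_functions = [5.0, 4.0]          # E_i

robustness = ratio_sum(identical_configs, imprint_configs)  # R_i before minimization
noise = ratio_sum(integrated_configs, error_functions)      # N_i before minimization
```

A search over candidate configurations would then keep the candidate with the smallest `robustness` and `noise` values.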

Equation (2) shows that if transmitted data noise is decreased, it will be considerably simpler to configure the system with original data records, ensuring data privacy. If integrated configurations are represented, the capacity of IoT transmission must be maximized with respect to system function, as specified in Equation (3).

$$C_{i} = \max \sum_{i=1}^{n} \frac{S_{d}(i)}{\mathrm{total}_{\mathrm{data}}} \times 100 \tag{3}$$

where

$S_{d}$ indicates underground IoT data,
$\mathrm{total}_{\mathrm{data}}$ represents total number of transmitted data.
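As a rough illustration of Equation (3), the capacity can be read as the share of separated (underground) data relative to the total transmitted data, expressed as a percentage. The function name `capacity_percent` and the numbers below are hypothetical.

```python
from typing import Sequence


def capacity_percent(underground_bits: Sequence[float], total_bits: float) -> float:
    """Expression inside the max of Equation (3): sum of S_d(i) / total_data, times 100."""
    if total_bits <= 0:
        raise ValueError("total_bits must be positive")
    return sum(s / total_bits for s in underground_bits) * 100.0


# Hypothetical example: two separated data segments out of 1000 transmitted bits.
capacity = capacity_percent([250.0, 250.0], 1000.0)
```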

Equation (3) explains that in order for IoT systems to function at high capacities, total data representations must be maximized while data separation must be carried out in underground systems. Data replication suggests the existence of malicious nodes, which must be eliminated from the network if it is operating with identical representations; this is expressed in Equation (4).

$$M_{n}(i) = \min \sum_{i=1}^{n} (D_{i} \times δ_{i}) + (P_{p}(i) + P_{a}(i)) \tag{4}$$

where

$D_{i}$ indicates requests that are denied from unknown users,
$δ_{i}$ denotes number of detections,
$P_{p}$ , $P_{a}$ represent data privacy defense and access.
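A minimal sketch of the expression in Equation (4) follows. The equation does not make the scope of the summation explicit, so this sketch assumes that both the detection term and the privacy defense/access term are summed per node. The function name and all values are hypothetical.

```python
from typing import Sequence


def malicious_score(
    denied: Sequence[float],      # D_i: requests denied from unknown users
    detections: Sequence[float],  # delta_i: number of detections
    defense: Sequence[float],     # P_p(i): data privacy defense
    access: Sequence[float],      # P_a(i): data privacy access
) -> float:
    """Expression inside the min of Equation (4), assuming all terms lie inside the sum."""
    lengths = {len(denied), len(detections), len(defense), len(access)}
    if len(lengths) != 1:
        raise ValueError("all inputs must have the same length")
    return sum(d * k + (p + a) for d, k, p, a in zip(denied, detections, defense, access))


# Hypothetical values for two nodes.
score = malicious_score([2.0, 3.0], [1.0, 2.0], [0.5, 0.5], [1.0, 1.0])
```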

Equation (4) indicates that all malicious nodes in IoT cloud operation networks can be reduced if the number of unknown accesses is denied. However, the presence of malicious nodes is also caused by the lengthening of systems, as seen in the following formulation of Equation (5).

$$M_{l}(i) = \sum_{i=1}^{n} (L_{i} \times w_{i} \times h_{i}) \tag{5}$$

where

$L_{i}$ , $w_{i}$ , $h_{i}$ denote length, width and height of transmitted data, respectively.

According to Equation (5), the concentration ranges must contain the transmitted data reproduction rates. However, in IoT networks, the data only needs to be validated using Equation (6) if the nodes contain individual data owners.

$$\mathrm{data}_{\mathrm{individual}} = \sum_{i=1}^{n} O_{s}(i) + n_{d}(i) \tag{6}$$

where

$O_{s}$ represents number of data possessors,
$n_{d}$ denotes number of individual data from each user.

Equation (6) indicates that individual data segments must be examined before processing the data in order to decrease the number of rogue nodes if the group of users is very large. Therefore, Equation (7) must be used to decrease the number of individual queues for group nodes in the manner shown below,

$$\mathrm{Queue}_{i} = \min \sum_{i=1}^{n} ϑ_{in} \times ω_{i} \tag{7}$$

where

$ϑ_{in}$ represents number of connected IoT nodes to central server,
$ω_{i}$ denotes shortest route to reach the destination without any queue.

All of the fundamental formulations offered in Equations (1)–(7) are set up to increase network security and will be used as the standard representation for IoT data privacy systems. Equation (8) can therefore be used to obtain the objective function of the suggested approach, which is as follows:

$$\mathrm{Obj}_{i} = \min \sum_{i=1}^{n} R_{i}, N_{i}, M_{n}, \mathrm{Queue}_{i}, \; \max \sum_{i=1}^{n} C_{i} \tag{8}$$

The goal specifies a multi-objective optimization using minimization and maximization functions while keeping an eye on both benign and malicious nodes and individual data. Section 3 states that all machine learning algorithms will be combined to check the specified objective function.
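One common way to compare candidate solutions under a mixed minimization and maximization objective such as Equation (8) is a Pareto dominance check. The sketch below is an assumption about how such a comparison could be done, not the authors' procedure; the `Objectives` class, the helper names, and the candidate values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objectives:
    robustness: float  # R_i, minimized
    noise: float       # N_i, minimized
    malicious: float   # M_n, minimized
    queue: float       # Queue_i, minimized
    capacity: float    # C_i, maximized


def as_costs(o: Objectives) -> tuple:
    """Map every objective to a cost so that smaller is always better."""
    return (o.robustness, o.noise, o.malicious, o.queue, -o.capacity)


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on at least one."""
    ca, cb = as_costs(a), as_costs(b)
    no_worse = all(x <= y for x, y in zip(ca, cb))
    strictly_better = any(x < y for x, y in zip(ca, cb))
    return no_worse and strictly_better


# Hypothetical candidate configurations.
candidate_a = Objectives(robustness=1.0, noise=2.0, malicious=0.5, queue=3.0, capacity=90.0)
candidate_b = Objectives(robustness=2.0, noise=2.0, malicious=0.5, queue=4.0, capacity=80.0)
a_dominates_b = dominates(candidate_a, candidate_b)
```

Negating the capacity lets a single "smaller is better" comparison cover both the minimized and the maximized terms.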

3. Machine Learning Algorithms

Machine learning algorithms are employed in real-time applications for IoT management to determine the type of attack and provide the necessary reaction elements. Since the majority of the data in IoT applications is present in enormous quantities, it is vital to identify malicious activity. Therefore, the suggested method assesses all machine learning methods, including supervised, semi-supervised, unsupervised, and reinforcement learning. To confirm the security aspects of data privacy in IoT applications, a comparison of mathematical descriptions is also required. The main benefit of integrating machine learning algorithms with predefined system models is the rapid detection of all identical occurrences, which lowers network traffic. In addition, the detection procedure is implemented using the Internet of Concession (IoC), which is processed before the data are stored inside cloud representations. Also known as user interface systems, machine learning methods can be integrated with behavior modeling techniques. In order to ensure that there are no attack scenarios in the suggested system model, the aforementioned types might therefore develop a mapping technique to detect the presence of cyber-embezzlement. All four kinds are compared using secured parametric values with a pre-trained data set that is gathered from various application platforms in order to demonstrate zero-attack situations [21,22].

The security features of supervised machine learning can be enabled in two steps, allowing for both internal and external supervision. Therefore, it is always required to mark the data so that external data can be mapped quickly. However, supervised learning algorithms can be used to lessen categorization issues. Equation (9) can be used to formulate the mathematical model of supervised learning for security as follows,

$$SL_{i} = \sum_{i=1}^{n} (f_{i} + l_{i}) \times ls_{i} \tag{9}$$

where

$f_{i}$ indicates a vector set of feature extraction,
$l_{i}$ represents secured labeled data,
$ls_{i}$ denotes number of possible security labels to be added.

As flexibility is restricted for complicated IoT applications, semi-supervised machine learning algorithms perform an intermediate measurement between supervised and unsupervised learning. This limitation in flexibility is mostly due to the fact that both labeled and unlabeled data sets can be employed, with some IoT applications requiring safe data storage while others do not. When security is a top priority, it is typically seen that all IoT procedures are changed to manual labeling processes, which cannot provide any further protection in the system design. As a result, using labeled data in shifting data time periods is never necessary; instead, unlabeled data can also be used by grouping all duplicate data into a single cluster. Additionally, because there are unlabeled data sets present in semi-supervised learning, where continuity processes are used, the implementation cost is also decreased. Equation (10) can be used to express semi-supervised learning with a pseudo-label set mathematically for IoT security and privacy, as shown below,

$$SSL_{i} = \sum_{i=1}^{n} \frac{UL_{i}(x, y)}{L_{n}(i)} \tag{10}$$

where

$UL_{i}$ denotes unlabeled data with two set variables,
$L_{n}$ indicates number of labeled data set.
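The pseudo-label idea mentioned for semi-supervised learning can be illustrated with a nearest-centroid rule: each unlabeled point receives the label of the closest labeled-class centroid. This is a generic technique chosen for illustration and is not confirmed as the authors' method; the function names, labels, and data points are hypothetical.

```python
from collections import defaultdict
from math import dist


def centroids(labeled):
    """Mean feature vector per label, computed from (features, label) pairs."""
    groups = defaultdict(list)
    for features, label in labeled:
        groups[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in groups.items()
    }


def pseudo_label(unlabeled, labeled):
    """Assign each unlabeled point the label of its nearest class centroid."""
    cents = centroids(labeled)
    return [min(cents, key=lambda lab: dist(x, cents[lab])) for x in unlabeled]


# Hypothetical two-variable (x, y) samples.
labeled = [((0.0, 0.0), "benign"), ((0.0, 2.0), "benign"), ((10.0, 10.0), "malicious")]
unlabeled = [(1.0, 1.0), (9.0, 8.0)]
pseudo_labels = pseudo_label(unlabeled, labeled)
```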

The process of data learning in unsupervised machine learning techniques typically occurs without any peripheral node connections; as a result, the acquired data is not trained and is displayed in the same way for data privacy. The suggested solution will process big IoT data sets for all applications with unsupervised learning, where training uses only unlabeled data.

Even in unsupervised machine learning algorithms, the distance of data transmission can be calculated using k measurable locations, as formulated in Equation (11). Figure 2 depicts the integrated system model with machine learning.

$$UL_{i} = \sum_{i=1}^{n} \frac{d_{1} + \dots + d_{i}}{\mathrm{cluster}_{in}} \tag{11}$$

where

$d_{1} + \dots + d_{i}$ indicates number of data points,
$\mathrm{cluster}_{in}$ describes the data that is present at cluster points.
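To make the relation between data points and cluster points concrete, the sketch below runs a small one-dimensional k-means loop and counts the points per cluster. k-means is used here as a stand-in for the unlabeled clustering step; the text does not state which clustering algorithm was used. The function name, the initial centroids, and the data are hypothetical.

```python
def kmeans_1d(points, initial_centroids, iterations=10):
    """Plain k-means on scalar data with fixed initial centroids (deterministic)."""
    centroids = list(initial_centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign the point to the index of the nearest centroid.
            nearest = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
            clusters[nearest].append(p)
        # Recompute centroids; an empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled measurements.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
final_centroids, clusters = kmeans_1d(points, initial_centroids=[0.0, 5.0])
points_per_cluster = [len(c) for c in clusters]
```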

The IoT activities for various applications are carried out using an agent-based model in the reinforcement machine learning technique, where actions are translated into personalized feedback with rewards. The majority of IoT activities use solely reinforcement learning algorithms, since Q-learning rates are used to calculate data privacy values. As high-dimensional goods are transformed into low-dimensional ones at output units, further relevant decisions are made using the reinforcement learning technique. Additionally, the reinforcement learning technique permits external actors to make snap judgments and actions in response to changes in the allotted time period. As a result, Equation (12) may be used to mathematically formulate the latency representations for reinforcement learning in the IoT approach as follows,

$$\mathrm{Time}_{d}(i) = \sum_{i=1}^{n} \frac{μ_{i}}{h_{c}(i)} \tag{12}$$

where

$μ_{i}$ indicates latency of data,
$h_{c}$ represents higher classification of data.
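Since the reinforcement learning discussion refers to Q-learning, a single tabular Q-learning update is sketched here. This is the standard textbook update rule, shown under assumed state and action names ("queued", "reroute", and so on) and assumed learning rate and discount values; none of these come from the paper.

```python
def q_update(q, state, action, reward, next_state, learning_rate=0.5, discount=0.9):
    """Apply one tabular Q-learning update and return the new Q-value."""
    next_values = q.get(next_state, {})
    # A terminal or unseen state contributes no future value.
    best_next = max(next_values.values()) if next_values else 0.0
    target = reward + discount * best_next
    q[state][action] += learning_rate * (target - q[state][action])
    return q[state][action]


# Hypothetical Q-table: an agent decides how to handle a queued packet.
q_table = {"queued": {"reroute": 0.0, "wait": 0.0}, "delivered": {}}
first = q_update(q_table, "queued", "reroute", reward=1.0, next_state="delivered")
second = q_update(q_table, "queued", "reroute", reward=1.0, next_state="delivered")
```

Repeated rewards move the estimate toward the reward value, which is how an agent would come to prefer actions that reduce latency.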

The suggested method integrates all of the aforementioned machine learning algorithms with the proposed system model, allowing for real-time monitoring of each algorithm’s security and privacy percentages (Figure 1). Additionally, the proposed method uses both labeled and unlabeled sets of data, establishing excellent security for the applications in two distinct fields of interest. The following sections offer the results of integration and its comparison cases.

4. Results and Discussions

In this section, the data representation technique is tested and validated in relation to the integrated technical outcomes for the system model that are used in real-time applications. The initial steps of the test bed system for data security are initiated via wireless link establishment, which filters out unnecessary network noise. It has been noted that automatic noise introduced by external linkages into IoT networks during the experimental setup has an impact on marginal outcomes but is addressed in the early stages of development. Additionally, a marginal representation approach is employed to process IoT applications in order to cut down on the number of settings that are exactly the same for various applications. A set of training features are provided to integrated algorithms using labeled and unlabeled data sets, since the main goal of the proposed work is to choose the same system architecture for various IoT application platforms in order to monitor necessary parameters. Users of the same network can transfer data using the impression marks by connecting two distinct communication devices in device management representations. A queue will emerge for inquiry cases if the identification factor cannot be located, allowing the system to be changed to an offloading representation. Once the offloading operations for the inquiries have been finished, the data will be sent to the designated user while maintaining privacy. Extended support cannot be offered, since duplicate data in same-node systems will be destroyed if the query is not resolved. If external users satisfy the data security risk management, the central management system will determine the amount of data that must be transmitted by the user end. Additionally, real-time analysis is performed using data set representation, which takes into account the influence of outside sources on historical data from multiple users. 
A collection of feature values is obtained by using a labeled flow technique for transmission scenarios after converting the full data system representations. The following scenarios are used to test the effectiveness of real-time data security and privacy procedures using machine learning algorithms.

Scenario 1: Robustness and noise reduction factors.

Scenario 2: Detection of malicious nodes.

Scenario 3: Maximization of data transfer.

Scenario 4: Reduction of data queuing.

Scenario 5: Comparison of machine learning accuracy.

All the above-mentioned scenarios are evaluated through simulations analyzed using MATLAB. This simulation setup is chosen because the percentage of data transfer operation for different machine learning algorithms needs to be compared. Further, the real-world traffic analysis with reduction in data queuing is performed by introducing additional nodes, as internal data security support is not extended. The descriptions of all scenarios are as follows.

Scenario 1

As the data transfer confidence level will be verified for efficient operations, the IoT data transmission method must always be free from robust performance. If any data is robust when a system is formed in a secure environment, then numerous steps must be taken for systems that are vulnerable to attacks, and as a result, a system removal procedure may be framed. A poor level of privacy is assigned if it is discovered that the data sequence is altered during such removal processes. Therefore, this scenario is assessed to determine the percentage of data robustness present in two alternative configurations, such as imprinting and sharing, in IoT applications. If two data sets include the same configuration, it must be removed from the system using a licensed formation technique. Because of this, separating the two data configurations will make IoT applications less robust, which in turn will boost network privacy. Additionally, where total error functions are minimized, the aforementioned setups are used to reduce the amount of noise in the data. Therefore, the noise factor for smooth processing will be reduced by the separation values between errors and identical setups.

The output comparison for data robustness and noise factor, which are determined using new and current methods [7,17,18], is shown in Figure 3. The number of similar configurations is significantly bigger and equal to 50,000, 78,000, 86,000, 98,000, and 110,000, respectively, because the projected technique is integrated for studying large data set functions. The imprint modification factor must be used in the planned formulations to reduce the number of identical configurations. As a result, the error values are calculated and identified as 16, 27, 44, 61, and 83, respectively, where additional error values prevent IoT networks from processing data in a secure manner. The robustness is seen, though, as real-time values must be known from the system, even with the same level of error. After making a few observations, it was discovered that the projected method offers less robustness than the current methodology. This can be demonstrated using over 100,000 identical configurations with 83 error values. The robustness of the new and existing approaches is 52 and 147, respectively.

Scenario 2

The presence of malicious nodes in an IoT network will make data security operations very difficult, so it is important to look at how many malicious nodes are present during IoT application operations. Requests must be made immediately from all users in order to examine the number of malicious nodes. If requests are not made before transmission, the data on the IoT device will be assumed to be idle. In contrast, if a request is made for data transmission, the central server examines the status of the users who made the request previously. A user can transfer data to internal receivers if the received request is legitimate, offering a secure data transmission mechanism. However, after a while, the request will be rejected by the server, which shows that some malicious nodes have been found during the data transfer. In order to prevent the presence of hostile nodes during the automated data transfer procedure, defense and access mechanisms are built. The number of malicious nodes for the proposed and existing approaches is shown in Figure 4 [7].

According to Figure 4, it is reasonable to assume that each node will have a substantially higher number of denied requests for big data processing, such as 67, 94, 118, 143, and 169 with unknown detections. The number of unidentified detections in the suggested approach is also discovered to be 10, 13, 18, 24, and 28, in turn. The percentage of malicious nodes will be higher due to more unidentified detections, and it will decrease after some observations. The comparison analysis reveals that the current approach [7,19,20] detects a greater number of malicious nodes because it only processes data with unlabeled requests. The number of malicious nodes is decreased by the combination mechanism in the proposed method, since it uses both labeled and unlabeled data. This is demonstrable by the 169 refused transmitter-side requests with a total of 28 unknown detections. The predicted and existing approaches in this situation have malicious node percentages of 1 and 17 percent, respectively. Therefore, the suggested method has a minimal number of malicious nodes, even in the face of a significant number of refused requests, effectively securing the data.

Scenario 3

Data transfer is one of the crucial metrics in IoT applications that should be tracked in real time. When only a small quantity of data needs to be transferred, it is not necessary to examine the underground data channels; however, because the suggested method will transmit more data at once, the subsurface data must be confirmed in every situation. Furthermore, in IoT systems, the data transfer pathways of each connected node must be specified in order for independent node data transfer to occur correctly. Additionally, the number of transmissions in the system, which analyses data in three dimensions (height, breadth, and length, respectively), directly affects the amount of data transfer. Furthermore, as with subsurface IoT systems, all sent data maintain a high level of anonymity. As a result, the separated ratio of underground data transfer with transmitted data determines the percentage of data transfer functions depicted in Figure 5.

According to Figure 5, the quantity of subterranean data to be transferred for IoT applications is set at 100,000,000, 200,000,000, 300,000,000, 400,000,000, and 500,000,000 bits. Initially, however, only part of the data is transmitted, as requested by the various users: 46,000, 78,000, 116,000, 198,000, and 236,000 bits are transferred in the initial state, with the remaining data transmitted in the following phase. Relative to the amount of communicated data, the proposed method's percentage of data transfer after the separation functions is maximized at the end of every stage, whereas the existing method reaches its maximum only to a limited degree. This can be demonstrated using beginning and final transferred bits set to 100,000 and 500,000, respectively, where the percentage of full data transmission for the current and predicted models using machine learning algorithms is 82, 91, and 93, respectively. Therefore, the data transfer results demonstrate that the proposed IoT system is capable of transferring all data without resorting to an insufficient procedure.

Scenario 4

The number of queuing time periods over the entire network must be kept to a minimum for data privacy in IoT applications. More time will go unused if a transmitted data packet deviates from the path and tries to remain in a specific queue. Therefore, the break will affect any remaining data, lowering the percentage of data transmissions. The proposed method tracks the duration of data queuing to prevent such reductions in data flow. Shortest routes are taken into account while determining individual paths for this type of monitoring system. However, if the first shortest path is congested, the remaining paths will be picked to prevent data queuing. Once the shortest path has been determined, the number of nodes linked to the central server for data transmission is examined; if there is a greater amount of data present, it will be transferred via alternate routes. The data transmission functions’ queuing time period is depicted in Figure 6.
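The path selection rule described above (prefer the shortest path, fall back to the next one when it is congested) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pick_path`, the path names, and the congestion set are hypothetical, and the hop counts 3, 5, and 6 are borrowed from the path distances quoted for Figure 6.

```python
from typing import Dict, Optional, Set


def pick_path(paths: Dict[str, int], congested: Set[str]) -> Optional[str]:
    """Return the shortest path that is not congested, or None if all are."""
    # Rank candidate paths from shortest to longest (ties broken by name).
    ranked = sorted(paths, key=lambda name: (paths[name], name))
    for name in ranked:
        if name not in congested:
            return name
    return None


# Hypothetical candidate paths and their lengths in hops (assumed values).
paths = {"A": 3, "B": 5, "C": 6}

first_choice = pick_path(paths, congested=set())   # no congestion anywhere
fallback = pick_path(paths, congested={"A"})       # shortest path is busy
print(first_choice, fallback)
```

Returning `None` when every path is congested leaves the queuing decision to the caller, which is one possible design choice; the source does not say how that case is handled.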

According to Figure 6, there are 12, 17, 25, 32, and 37 connected nodes in a network, with minimum path distances of 3, 5, and 6. Using the connected node transfer mechanism, the results of the existing [7,17,19] and proposed methods are compared. In the event that an existing path is made available or is congested with traffic, the data delivered for IoT applications will concentrate on alternate paths. Therefore, even when paths and distances change in an unanticipated pattern, precise measurements of queuing time periods are still made. The comparison of queuing times shows that the existing technique produces a longer queue for transmitting specific data, whereas the proposed model using machine learning techniques (which in this case takes into account both labeled and unlabeled data) incurs only a short waiting time. This can be evaluated using six constant pathways with 25, 32, and 37 connected nodes, where the queuing times are 1.3, 1.1, and 0.8 s for the existing method and 0.16, 0.1, and 0 s for the proposed method.
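As a quick check on the size of the improvement, the relative reduction in queuing time can be computed from the figures quoted above. The sketch assumes that the larger values (1.3, 1.1, and 0.8 s) belong to the existing method, which is what the statement about the proposed model's short waiting time implies; the variable and function names are illustrative only.

```python
# Queuing times in seconds for 25, 32, and 37 connected nodes,
# as quoted in the text (assignment to methods is an assumption).
nodes = [25, 32, 37]
existing_s = [1.3, 1.1, 0.8]
proposed_s = [0.16, 0.1, 0.0]


def relative_reduction(old: float, new: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return 100.0 * (old - new) / old


reductions = [
    round(relative_reduction(old, new), 1)
    for old, new in zip(existing_s, proposed_s)
]

for n, r in zip(nodes, reductions):
    print(f"{n} nodes: queuing time reduced by {r} percent")
```

Under that assumption the reduction is roughly 88, 91, and 100 percent for 25, 32, and 37 nodes, respectively.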

Scenario 5

For IoT applications that incorporate machine learning algorithms into a single platform, a comparison of precise measurement models is always required. Therefore, the proposed approach for measuring the percentage of accuracy for all four types of machine learning algorithms uses two sets of measurement data: one set of labeled data and the other set of unlabeled data. The suggested method carries out accuracy measurements by defining both labeled and unlabeled data, and a special case study for reinforcement learning is given. Due to these differences, the suggested method takes into account a cluster point function to determine the sample values in differentiated set functions. Figure 7 compares the accuracy of machine learning algorithms in maintaining data for Internet of Things applications.

Figure 7 shows input trials in steps of 20, increasing to 100 for better observation. The analysis of both labeled and unlabeled sequence sets requires time in each iteration period, but the computing time for reinforcement learning is significantly higher than for the other three types of machine learning algorithms. Due to the existence of distinguishing characteristics in data set measures, the reinforcement learning method outperforms the other three in accuracy measurements. The accuracy of the supervised, semi-supervised, unsupervised, and reinforcement learning algorithms is 96.8, 97.39, 98.31, and 98.18 percent, respectively. This can be confirmed using large iterations of roughly 100.
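One way to picture an accuracy measurement that combines labeled and unlabeled data with a cluster point function is a nearest-centroid pseudo-labeling step. The sketch below is an assumption made for illustration and is not the authors' algorithm: the helper names, the two-dimensional toy points, and the class names "benign" and "malicious" are all invented here.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def centroids(labeled: Sequence[Tuple[Point, str]]) -> Dict[str, Point]:
    """Average the labeled points of each class into one cluster point."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for (x, y), label in labeled:
        acc = sums.setdefault(label, [0.0, 0.0])
        acc[0] += x
        acc[1] += y
        counts[label] = counts.get(label, 0) + 1
    return {lab: (s[0] / counts[lab], s[1] / counts[lab]) for lab, s in sums.items()}


def pseudo_label(point: Point, centers: Dict[str, Point]) -> str:
    """Give an unlabeled point the label of its nearest cluster point."""
    return min(centers, key=lambda lab: dist(point, centers[lab]))


def accuracy_percent(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Share of matching labels, expressed as a percentage."""
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Toy data, invented for illustration only.
labeled = [((0.0, 0.0), "benign"), ((1.0, 0.0), "benign"),
           ((9.0, 9.0), "malicious"), ((10.0, 9.0), "malicious")]
unlabeled = [(0.5, 0.5), (9.5, 8.5), (1.0, 1.0), (6.0, 6.0)]
held_out_truth = ["benign", "malicious", "benign", "benign"]

centers = centroids(labeled)
predicted = [pseudo_label(p, centers) for p in unlabeled]
score = accuracy_percent(predicted, held_out_truth)
print(predicted, score)
```

In this toy run the last point lies closer to the "malicious" cluster point and is mislabeled, so three of four pseudo-labels match and the accuracy is 75 percent. Repeating the measurement for 20, 40, 60, 80, and 100 input trials would mirror the step pattern used in Figure 7.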

Validation Metrics

In the proposed method, performance evaluation is carried out by examining data space complexity, where operations are carried out on the challenging data. The space complexity in the proposed method defines the data size present for a particular operational case study. To simulate such executions, a large amount of memory is needed after wireless data transmission takes place. Thus, the representation architecture itself provides a separate block for the data storage and analytics segments, allowing the system to perform effective data transfer operations on the same network with low complexity. Further, a large data set is provided in the case of a machine learning algorithm; therefore, the initial space complexity is higher, and thereafter the data flow is segmented to reduce complexity further.

Figure 8 compares the space complexity with existing approaches [17,18] for a greater number of data functions, where the proposed model proves to have low data space complexity. To verify the validation metrics, the best epochs are considered in steps of 20 up to 100 in order to determine the exact state of operation. In the initial period, before the data are allocated to different segments, the space complexity is higher, but after IoT data segmentation with high security the complexity is reduced. This can be shown with the best epochs of 20 and 80, where the complexity of the proposed method is 12 and 3 percent, whereas at the same epochs the existing approach yields a data storage complexity of 35 and 18 percent, respectively. Hence, with machine learning optimization it is possible to further reduce the complexity of operations, as only the pseudo-label data set is enabled.
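The gap between the two approaches at the quoted epochs can be worked out directly from the values above. The snippet is only a small arithmetic sketch; the dictionary layout and variable names are illustrative choices, not part of the original study.

```python
# Space complexity (percent) quoted for Figure 8 at best epochs 20 and 80.
reported = {
    20: {"proposed": 12, "existing": 35},
    80: {"proposed": 3, "existing": 18},
}

# Gap between the approaches at each epoch, in percentage points.
gap_points = {
    epoch: values["existing"] - values["proposed"]
    for epoch, values in reported.items()
}

# Drop in complexity between epoch 20 and epoch 80 for each approach.
drop_points = {
    method: reported[20][method] - reported[80][method]
    for method in ("proposed", "existing")
}

print(gap_points, drop_points)
```

On these figures the proposed method sits 23 percentage points below the existing approach at epoch 20 and 15 points below it at epoch 80.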

5. Conclusions

Because most user data is transmitted wirelessly, it will remain vulnerable to a variety of attacks in the coming years. More data protection is required when wireless activities are combined with IoT applications such as medical and surveillance systems. However, the security requirements for IoT-related operations are currently much lower in system development and administration, which leads to users maintaining low integrity in IoT environments. In addition, because there is no documented common system model for security and data transmission channels, IoT integration procedures are more challenging for all identified applications. The proposed method is therefore put into practice to develop a common system model with security evaluation criteria including robustness, noise levels, malicious node determination, and queuing time period evaluation. A distinctive system model is created and linked using machine learning techniques for each of these parametric evaluations. A programming loop is established for supervised, semi-supervised, unsupervised, and reinforcement learning algorithms, and the design model is tested with both labeled and unlabeled data. As a result of the labeled data sets, the security of data transmission is greatly increased and data privacy is protected. A real-time implementation test of the developed analytical model is conducted using an existing dataset that has been segmented into numerous cluster areas. Five additional scenarios are taken into account to assess the effectiveness of the proposed model, and the results show that machine learning techniques are significantly more accurate for running IoT applications. With reinforcement learning, 98 percent of all transferred data were safe and correct. Future IoT applications will be able to connect directly to learning nodes, requiring only the combination of algorithms for path specifications to enable data operation in highly secure environments.

Future Works

The proposed approach can be extended with a greater number of segmented data set features that prevent the system from entering non-private content. In addition, several data functions could be added automatically using a machine learning feature set, so that a high noise factor can be reduced at a maximized tolerance level.

Author Contributions

Conceptualization, S.S. and H.M.; methodology, W.A., A.A.-R. and E.A.; software, W.A., A.A.-R. and E.A.; validation, W.A., A.A.-R. and E.A.; formal analysis, W.A., A.A.-R. and E.A.; investigation, M.A. (Mai Alduailij) and M.A. (Mona Alduailij); resources, M.A. (Mai Alduailij) and M.A. (Mona Alduailij); data curation, M.A. (Mai Alduailij) and M.A. (Mona Alduailij); writing (original draft preparation), S.S. and H.M.; writing (review and editing), S.S. and H.M.; visualization, M.A. (Mai Alduailij) and M.A. (Mona Alduailij); supervision, S.S. and H.M.; project administration, S.S. and H.M.; funding acquisition, W.A., A.A.-R. and E.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Deanship of Scientific Research at Princess Nourah Bint Abdulrahman University through the Research Groups Program Grant no. (RGP-1440-0026).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ah-Fat, P.; Huth, M. Optimal Accuracy-Privacy Trade-Off for Secure Computations. IEEE Trans. Inf. Theory 2019, 65, 3165–3182.
2. Alterazi, H.A.; Kshirsagar, P.R.; Manoharan, H.; Selvarajan, S.; Alhebaishi, N.; Srivastava, G.; Lin, J.C.-W. Prevention of Cyber Security with the Internet of Things Using Particle Swarm Optimization. Sensors 2022, 22, 6117.
3. Bahutair, M.; Bouguettaya, A.; Neiat, A.G. Multi-Perspective Trust Management Framework for Crowdsourced IoT Services. IEEE Trans. Serv. Comput. 2022, 15, 2396–2409.
4. Mummadi, A.; Yadav, B.M.K.; Sadhwika, R.; Shitharth, S. An Appraisal of Cyber-Attacks and Countermeasures Using Machine Learning Algorithms. In Proceedings of the ICAIDS 2021: Artificial Intelligence and Data Science, Hyderabad, India, 17–18 December 2021; Kumar, A., Fister, I., Jr., Gupta, P.K., Debayle, J., Zhang, Z.J., Usman, M., Eds.; Communications in Computer and Information Science. Springer: Cham, Switzerland, 2022; Volume 1673.
5. Khadam, U.; Iqbal, M.M.; Alruily, M.; Al Ghamdi, M.A.; Ramzan, M.; Almotiri, S.H. Text Data Security and Privacy in the Internet of Things: Threats, Challenges, and Future Directions. Wirel. Commun. Mob. Comput. 2020, 2020, 7105625.
6. Khadidos, A.O.; Shitharth, S.; Khadidos, A.O.; Sangeetha, K.; Alyoubi, K.H. Healthcare Data Security Using IoT Sensors Based on Random Hashing Mechanism. J. Sens. 2022, 2022, 8457116.
7. Gyamfi, E.; Jurcut, A. A Robust Security Task Offloading in Industrial IoT-Enabled Distributed Multi-Access Edge Computing. Front. Signal Process. 2022, 2, 13.
8. Wu, T.-Y.; Meng, Q.; Kumari, S.; Zhang, P. Rotating behind Security: A Lightweight Authentication Protocol Based on IoT-Enabled Cloud Computing Environments. Sensors 2022, 22, 3858.
9. Zhang, M.; Chen, J.; He, S.; Yang, L.; Gong, X.; Zhang, J. Privacy-Preserving Database Assisted Spectrum Access for Industrial Internet of Things: A Distributed Learning Approach. IEEE Trans. Ind. Electron. 2020, 67, 7094–7103.
10. Prasanth, S.K.; Shitharth, S.; Kumar, B.P.; Subedha, V.; Sangeetha, K. Optimal Feature Selection based on Evolutionary Algorithm for Intrusion Detection. SN Comput. Sci. 2022, 3, 439.
11. Abbas, G.; Mehmood, A.; Carsten, M.; Epiphaniou, G.; Lloret, J. Safety, Security and Privacy in Machine Learning Based Internet of Things. J. Sens. Actuator Netw. 2022, 11, 38.
12. Abutaha, M.; Atawneh, B.; Hammouri, L.; Kaddoum, G. Secure lightweight cryptosystem for IoT and pervasive computing. Sci. Rep. 2022, 12, 19649.
13. Priyadharshini, T.C.; Geetha, D.M. Efficient Key Management System Based Lightweight Devices in IoT. Intell. Autom. Soft Comput. 2022, 31, 1793–1808.
14. Meng, P.; Tian, C.; Cheng, X. Publicly verifiable and efficiency/security-adjustable outsourcing scheme for solving large-scale modular system of linear equations. J. Cloud Comput. 2019, 8, 24.
15. Dorri, A.; Kanhere, S.S.; Jurdak, R.; Gauravaram, P. LSB: A Lightweight Scalable Blockchain for IoT security and anonymity. J. Parallel Distrib. Comput. 2019, 134, 180–197.
16. Singh, N.K.; Mahajan, V. Mathematical Model of Cyber Intrusion in Smart Grid. In Proceedings of the 2019 IEEE PES GTD Grand International Conference and Exposition Asia (GTD Asia), Bangkok, Thailand, 19–23 March 2019; pp. 965–969.
17. Velinov, A.; Mileva, A.; Wendzel, S.; Mazurczyk, W. Covert channels in the mqtt-based internet of things. IEEE Access 2019, 7, 161899–161915.
18. Kotenko, I.; Saenko, I.; Lauta, O.; Kribel, A. A Proactive Protection of Smart Power Grids against Cyberattacks on Service Data Transfer Protocols by Computational Intelligence Methods. Sensors 2022, 22, 7506.
19. Górski, T. UML Profile for Messaging Patterns in Service-Oriented Architecture, Microservices, and Internet of Things. Appl. Sci. 2022, 12, 12790.
20. Alam, S.; Zardari, S.; Shamsi, J.A. Blockchain-Based Trust and Reputation Management in SIoT. Electronics 2022, 11, 3871.
21. Sarker, I.H. Machine Learning: Algorithms, Real-World Applications and Research Directions. SN Comput. Sci. 2021, 2, 160.
22. Singh, S.; Sulthana, R.; Shewale, T.; Chamola, V.; Benslimane, A.; Sikdar, B. Machine-Learning-Assisted Security and Privacy Provisioning for Edge Computing: A Survey. IEEE Internet Things J. 2022, 9, 236–260.

Figure 1. Flow and security enhancement in IoT data processing.

Figure 2. Machine learning with integrated system model.

Figure 3. Robustness with error functions.

Figure 4. Detection of malicious nodes.

Figure 5. Percentage of data transfer.

Figure 6. Queuing time periods.

Figure 7. Accuracy of machine learning algorithms.

Figure 8. Comparison of space complexities.

Table 1. Comparison of relevant works.

References	Methods	Objectives
[11]	Machine learning for cyber physical systems	Minimization of computational cost and privacy problems
[12]	Pervasive computing networks based on field-programmable gate arrays	Allocation of limited resources and maximization of epoch data security
[13]	Advanced encryption system for data privacy	Data owner and key management for security
[14]	Outsourcing scheme determinations	Security balance with optimal privacy verification
[15]	Scalable block chain for IoT	Maximization of distributed throughput
[16]	Remote terminal unit interaction for data privacy	Minimization of denial of service attacks
Proposed	Machine learning algorithms for data security and privacy	Multi-objective framework with minimization of data robustness, noise factors, malicious nodes and data queues

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
