Journal of Cybersecurity and Privacy

19 pages, 4035 KiB

Open AccessArticle

Anomaly Detection for Modbus over TCP in Control Systems Using Entropy and Classification-Based Analysis

by Tirthankar Ghosh, Sikha Bagui, Subhash Bagui, Martin Kadzis and Jackson Bare

J. Cybersecur. Priv. 2023, 3(4), 895-913; https://doi.org/10.3390/jcp3040041 - 01 Dec 2023

Viewed by 1113

This article presents a statistical approach using entropy and classification-based analysis to detect anomalies in industrial control systems traffic. Several statistical techniques have been proposed to create baselines and measure deviation to detect intrusion in enterprise networks with a centralized intrusion detection approach [...] Read more.

This article presents a statistical approach using entropy and classification-based analysis to detect anomalies in industrial control systems traffic. Several statistical techniques have been proposed to create baselines and measure deviation to detect intrusion in enterprise networks with a centralized intrusion detection approach in mind. Looking at traffic volume alone to find anomalous deviation may not be enough—it may result in increased false positives. The near real-time communication requirements, coupled with the lack of centralized infrastructure in operations technology and limited resources of the sensor motes, require an efficient anomaly detection system characterized by these limitations. This paper presents extended results from our previous work by presenting a detailed cluster-based entropy analysis on selected network traffic features. It further extends the analysis using a classification-based approach. Our detailed entropy analysis corroborates with our earlier findings that, although some degree of anomaly may be detected using univariate and bivariate entropy analysis for Denial of Service (DOS) and Man-in-the-Middle (MITM) attacks, not much information may be obtained for the initial reconnaissance, thus preventing early stages of attack detection in the Cyber Kill Chain. Our classification-based analysis shows that, overall, the classification results of the DOS attacks were much higher than the MITM attacks using two Modbus features in addition to the three TCP/IP features. In terms of classifiers, J48 and random forest had the best classification results and can be considered comparable. For the DOS attack, no resampling with the 60–40 (training/testing split) had the best results (average accuracy of 97.87%), but for the MITM attack, the 80–20 non-attack vs. attack data with the 75–25 split (average accuracy of 82.81%) had the best results. Full article

(This article belongs to the Special Issue Intrusion, Malware Detection and Prevention in Networks)

► Show Figures

Figure 1

13 pages, 3271 KiB

Open AccessArticle

Evaluating Cluster-Based Synthetic Data Generation for Blood-Transfusion Analysis

by Shannon K. S. Kroes, Matthijs van Leeuwen, Rolf H. H. Groenwold and Mart P. Janssen

J. Cybersecur. Priv. 2023, 3(4), 882-894; https://doi.org/10.3390/jcp3040040 - 01 Dec 2023

Viewed by 881

Abstract

Synthetic data generation is becoming an increasingly popular approach to making privacy-sensitive data available for analysis. Recently, cluster-based synthetic data generation (CBSDG) has been proposed, which uses explainable and tractable techniques for privacy preservation. Although the algorithm demonstrated promising performance on simulated data, [...] Read more.

Synthetic data generation is becoming an increasingly popular approach to making privacy-sensitive data available for analysis. Recently, cluster-based synthetic data generation (CBSDG) has been proposed, which uses explainable and tractable techniques for privacy preservation. Although the algorithm demonstrated promising performance on simulated data, CBSDG has not yet been applied to real, personal data. In this work, a published blood-transfusion analysis is replicated with synthetic data to assess whether CBSDG can reproduce more complex and intricate variable relations than previously evaluated. Data from the Dutch national blood bank, consisting of 250,729 donation records, were used to predict donor hemoglobin (Hb) levels by means of support vector machines (SVMs). Precision scores were equal to the original data results for both male (0.997) and female (0.987) donors, recall was 0.007 higher for male and 0.003 lower for female donors (original estimates 0.739 and 0.637, respectively). The impact of the variables on Hb predictions was similar, as quantified and visualized with Shapley additive explanation values. Opportunities for attribute disclosure were decreased for all but two variables; only the binary variables Deferral Status and Sex could still be inferred. Such inference was also possible for donors who were not used as input for the generator and may result from correlations in the data as opposed to overfitting in the synthetic-data-generation process. The high predictive performance obtained with the synthetic data shows potential of CBSDG for practical implementation. Full article

(This article belongs to the Special Issue Data Protection and Privacy)

► Show Figures

Figure 1

24 pages, 5484 KiB

Open AccessArticle

Machine Learning Detection of Cloud Services Abuse as C&C Infrastructure

by Turki Al lelah, George Theodorakopoulos, Amir Javed and Eirini Anthi

J. Cybersecur. Priv. 2023, 3(4), 858-881; https://doi.org/10.3390/jcp3040039 - 01 Dec 2023

Viewed by 1026

Abstract

The proliferation of cloud and public legitimate services (CLS) on a global scale has resulted in increasingly sophisticated malware attacks that abuse these services as command-and-control (C&C) communication channels. Conventional security solutions are inadequate for detecting malicious C&C traffic because it blends with [...] Read more.

The proliferation of cloud and public legitimate services (CLS) on a global scale has resulted in increasingly sophisticated malware attacks that abuse these services as command-and-control (C&C) communication channels. Conventional security solutions are inadequate for detecting malicious C&C traffic because it blends with legitimate traffic. This motivates the development of advanced detection techniques. We make the following contributions: First, we introduce a novel labeled dataset. This dataset serves as a valuable resource for training and evaluating detection techniques aimed at identifying malicious bots that abuse CLS as C&C channels. Second, we tailor our feature engineering to behaviors indicative of CLS abuse, such as connections to known CLS domains and potential C&C API calls. Third, to identify the most relevant features, we introduced a custom feature elimination (CFE) method designed to determine the exact number of features needed for filter selection approaches. Fourth, our approach focuses on both static and derivative features of Portable Executable (PE) files. After evaluating various machine learning (ML) classifiers, the random forest emerges as the most effective classifier, achieving a 98.26% detection rate. Fifth, we introduce the “Replace Misclassified Parameter (RMCP)” adversarial attack. This white-box strategy is designed to evaluate our system’s detection robustness. The RMCP attack modifies feature values in malicious samples to make them appear as benign samples, thereby bypassing the ML model’s classification while maintaining the malware’s malicious capabilities. The results of the robustness evaluation demonstrate that our proposed method successfully maintains a high accuracy level of 84%. In sum, our comprehensive approach offers a robust solution to the growing threat of malware abusing CLS as C&C infrastructure. Full article

(This article belongs to the Special Issue Cloud Security and Privacy)

► Show Figures

Figure 1

14 pages, 587 KiB

Open AccessArticle

Challenging Assumptions of Normality in AES s-Box Configurations under Side-Channel Analysis

by Clay Carper, Stone Olguin, Jarek Brown, Caylie Charlton and Mike Borowczak

J. Cybersecur. Priv. 2023, 3(4), 844-857; https://doi.org/10.3390/jcp3040038 - 29 Nov 2023

Viewed by 809

Abstract

Power-based Side-Channel Analysis (SCA) began with visual-based examinations and has progressed to utilize data-driven statistical analysis. Two distinct classifications of these methods have emerged over the years; those focused on leakage exploitation and those dedicated to leakage detection. This work primarily focuses on [...] Read more.

Power-based Side-Channel Analysis (SCA) began with visual-based examinations and has progressed to utilize data-driven statistical analysis. Two distinct classifications of these methods have emerged over the years; those focused on leakage exploitation and those dedicated to leakage detection. This work primarily focuses on a leakage detection-based schema that utilizes Welch’s t-test, known as Test Vector Leakage Assessment (TVLA). Both classes of methods process collected data using statistical frameworks that result in the successful exfiltration of information via SCA. Often, statistical testing used during analysis requires the assumption that collected power consumption data originates from a normal distribution. To date, this assumption has remained largely uncontested. This work seeks to demonstrate that while past studies have assumed the normality of collected power traces, this assumption should be properly evaluated. In order to evaluate this assumption, an implementation of Tiny-AES-c with nine unique substitution-box (s-box) configurations is conducted using TVLA to guide experimental design. By leveraging the complexity of the AES algorithm, a sufficiently diverse and complex dataset was developed. Under this dataset, statistical tests for normality such as the Shapiro-Wilk test and the Kolmogorov-Smirnov test provide significant evidence to reject the null hypothesis that the power consumption data is normally distributed. To address this observation, existing non-parametric equivalents such as the Wilcoxon Signed-Rank Test and the Kruskal-Wallis Test are discussed in relation to currently used parametric tests such as Welch’s t-test. Full article

► Show Figures

Figure 1

14 pages, 669 KiB

Open AccessArticle

A Hybrid Dimensionality Reduction for Network Intrusion Detection

by Humera Ghani, Shahram Salekzamankhani and Bal Virdee

J. Cybersecur. Priv. 2023, 3(4), 830-843; https://doi.org/10.3390/jcp3040037 - 16 Nov 2023

Cited by 1 | Viewed by 1211

Abstract

Due to the wide variety of network services, many different types of protocols exist, producing various packet features. Some features contain irrelevant and redundant information. The presence of such features increases computational complexity and decreases accuracy. Therefore, this research is designed to reduce [...] Read more.

Due to the wide variety of network services, many different types of protocols exist, producing various packet features. Some features contain irrelevant and redundant information. The presence of such features increases computational complexity and decreases accuracy. Therefore, this research is designed to reduce the data dimensionality and improve the classification accuracy in the UNSW-NB15 dataset. It proposes a hybrid dimensionality reduction system that does feature selection (FS) and feature extraction (FE). FS was performed using the Recursive Feature Elimination (RFE) technique, while FE was accomplished by transforming the features into principal components. This combined scheme reduced a total of 41 input features into 15 components. The proposed systems’ classification performance was determined using an ensemble of Support Vector Classifier (SVC), K-nearest Neighbor classifier (KNC), and Deep Neural Network classifier (DNN). The system was evaluated using accuracy, detection rate, false positive rate, f1-score, and area under the curve metrics. Comparing the voting ensemble results of the full feature set against the 15 principal components confirms that reduced and transformed features did not significantly decrease the classifier’s performance. We achieved 94.34% accuracy, a 93.92% detection rate, a 5.23% false positive rate, a 94.32% f1-score, and a 94.34% area under the curve when 15 components were input to the voting ensemble classifier. Full article

(This article belongs to the Special Issue Intrusion, Malware Detection and Prevention in Networks)

► Show Figures

Figure 1

22 pages, 5820 KiB

Open AccessArticle

D2WFP: A Novel Protocol for Forensically Identifying, Extracting, and Analysing Deep and Dark Web Browsing Activities

by Mohamed Chahine Ghanem, Patrick Mulvihill, Karim Ouazzane, Ramzi Djemai and Dipo Dunsin

J. Cybersecur. Priv. 2023, 3(4), 808-829; https://doi.org/10.3390/jcp3040036 - 15 Nov 2023

Cited by 1 | Viewed by 1446

Abstract

The use of the unindexed web, commonly known as the deep web and dark web, to commit or facilitate criminal activity has drastically increased over the past decade. The dark web is a dangerous place where all kinds of criminal activities take place, [...] Read more.

The use of the unindexed web, commonly known as the deep web and dark web, to commit or facilitate criminal activity has drastically increased over the past decade. The dark web is a dangerous place where all kinds of criminal activities take place, Despite advances in web forensic techniques, tools, and methodologies, few studies have formally tackled dark and deep web forensics and the technical differences in terms of investigative techniques and artefact identification and extraction. This study proposes a novel and comprehensive protocol to guide and assist digital forensic professionals in investigating crimes committed on or via the deep and dark web. The protocol, named D2WFP, establishes a new sequential approach for performing investigative activities by observing the order of volatility and implementing a systemic approach covering all browsing-related hives and artefacts which ultimately resulted in improving the accuracy and effectiveness. Rigorous quantitative and qualitative research has been conducted by assessing the D2WFP following a scientifically sound and comprehensive process in different scenarios and the obtained results show an apparent increase in the number of artefacts recovered when adopting the D2WFP which outperforms any current industry or opensource browsing forensic tools. The second contribution of the D2WFP is the robust formulation of artefact correlation and cross-validation within the D2WFP which enables digital forensic professionals to better document and structure their analysis of host-based deep and dark web browsing artefacts. Full article

(This article belongs to the Special Issue Cyber Security and Digital Forensics)

► Show Figures

Figure 1

14 pages, 692 KiB

Open AccessArticle

Towards a Near-Real-Time Protocol Tunneling Detector Based on Machine Learning Techniques

by Filippo Sobrero, Beatrice Clavarezza, Daniele Ucci and Federica Bisio

J. Cybersecur. Priv. 2023, 3(4), 794-807; https://doi.org/10.3390/jcp3040035 - 06 Nov 2023

Viewed by 1108

Abstract

In the very recent years, cybersecurity attacks have increased at an unprecedented pace, becoming ever more sophisticated and costly. Their impact has involved both private/public companies and critical infrastructures. At the same time, due to the COVID-19 pandemic, the security perimeters of many [...] Read more.

In the very recent years, cybersecurity attacks have increased at an unprecedented pace, becoming ever more sophisticated and costly. Their impact has involved both private/public companies and critical infrastructures. At the same time, due to the COVID-19 pandemic, the security perimeters of many organizations expanded, causing an increase in the attack surface exploitable by threat actors through malware and phishing attacks. Given these factors, it is of primary importance to monitor the security perimeter and the events occurring in the monitored network, according to a tested security strategy of detection and response. In this paper, we present a protocol tunneling detector prototype which inspects, in near real-time, a company’s network traffic using machine learning techniques. Indeed, tunneling attacks allow malicious actors to maximize the time in which their activity remains undetected. The detector monitors unencrypted network flows and extracts features to detect possible occurring attacks and anomalies by combining machine learning and deep learning. The proposed module can be embedded in any network security monitoring platform able to provide network flow information along with its metadata. The detection capabilities of the implemented prototype have been tested both on benign and malicious datasets. Results show an overall accuracy of

97.1 %

and an F1-score equal to

95.6 %

. Full article

(This article belongs to the Collection Machine Learning and Data Analytics for Cyber Security)

► Show Figures

Figure 1

36 pages, 1830 KiB

Open AccessReview

Security in Cloud-Native Services: A Survey

by Theodoros Theodoropoulos, Luis Rosa, Chafika Benzaid, Peter Gray, Eduard Marin, Antonios Makris, Luis Cordeiro, Ferran Diego, Pavel Sorokin, Marco Di Girolamo, Paolo Barone, Tarik Taleb and Konstantinos Tserpes

J. Cybersecur. Priv. 2023, 3(4), 758-793; https://doi.org/10.3390/jcp3040034 - 26 Oct 2023

Viewed by 4035

Abstract

Cloud-native services face unique cybersecurity challenges due to their distributed infrastructure. They are susceptible to various threats like malware, DDoS attacks, and Man-in-the-Middle (MITM) attacks. Additionally, these services often process sensitive data that must be protected from unauthorized access. On top of that, [...] Read more.

Cloud-native services face unique cybersecurity challenges due to their distributed infrastructure. They are susceptible to various threats like malware, DDoS attacks, and Man-in-the-Middle (MITM) attacks. Additionally, these services often process sensitive data that must be protected from unauthorized access. On top of that, the dynamic and scalable nature of cloud-native services makes it difficult to maintain consistent security, as deploying new instances and infrastructure introduces new vulnerabilities. To address these challenges, efficient security solutions are needed to mitigate potential threats while aligning with the characteristics of cloud-native services. Despite the abundance of works focusing on security aspects in the cloud, there has been a notable lack of research that is focused on the security of cloud-native services. To address this gap, this work is the first survey that is dedicated to exploring security in cloud-native services. This work aims to provide a comprehensive investigation of the aspects, features, and solutions that are associated with security in cloud-native services. It serves as a uniquely structured mapping study that maps the key aspects to the corresponding features, and these features to numerous contemporary solutions. Furthermore, it includes the identification of various candidate open-source technologies that are capable of supporting the realization of each explored solution. Finally, it showcases how these solutions can work together in order to establish each corresponding feature. The insights and findings of this work can be used by cybersecurity professionals, such as developers and researchers, to enhance the security of cloud-native services. Full article

(This article belongs to the Special Issue Cloud Security and Privacy)

► Show Figures

Figure 1

14 pages, 2934 KiB

Open AccessArticle

A Framework for Synthetic Agetech Attack Data Generation

by Noel Khaemba, Issa Traoré and Mohammad Mamun

J. Cybersecur. Priv. 2023, 3(4), 744-757; https://doi.org/10.3390/jcp3040033 - 09 Oct 2023

Viewed by 1189

Abstract

To address the lack of datasets for agetech, this paper presents an approach for generating synthetic datasets that include traces of benign and attack datasets for agetech. The generated datasets could be used to develop and evaluate intrusion detection systems for smart homes [...] Read more.

To address the lack of datasets for agetech, this paper presents an approach for generating synthetic datasets that include traces of benign and attack datasets for agetech. The generated datasets could be used to develop and evaluate intrusion detection systems for smart homes for seniors aging in place. After reviewing several resources, it was established that there are no agetech attack data for sensor readings. Therefore, in this research, several methods for generating attack data were explored using attack data patterns from an existing IoT dataset called TON_IoT weather data. The TON_IoT dataset could be used in different scenarios, but in this study, the focus is to apply it to agetech. The attack patterns were replicated in a normal agetech dataset from a temperature sensor collected from the Information Security and Object Technology (ISOT) research lab. The generated data are different from normal data, as abnormal segments are shown that could be considered as attacks. The generated agetech attack datasets were also trained using machine learning models, and, based on different metrics, achieved good classification performance in predicting whether a sample is benign or malicious. Full article

► Show Figures

Figure 1

38 pages, 3956 KiB

Open AccessArticle

Studying Imbalanced Learning for Anomaly-Based Intelligent IDS for Mission-Critical Internet of Things

by Ghada Abdelmoumin, Danda B. Rawat and Abdul Rahman

J. Cybersecur. Priv. 2023, 3(4), 706-743; https://doi.org/10.3390/jcp3040032 - 06 Oct 2023

Viewed by 1107

Abstract

Training-anomaly-based, machine-learning-based, intrusion detection systems (AMiDS) for use in critical Internet of Things (CioT) systems and military Internet of Things (MioT) environments may involve synthetic data or publicly simulated data due to data restrictions, data scarcity, or both. However, synthetic data can be [...] Read more.

Training-anomaly-based, machine-learning-based, intrusion detection systems (AMiDS) for use in critical Internet of Things (CioT) systems and military Internet of Things (MioT) environments may involve synthetic data or publicly simulated data due to data restrictions, data scarcity, or both. However, synthetic data can be unrealistic and potentially biased, and simulated data are invariably static, unrealistic, and prone to obsolescence. Building an AMiDS logical model to predict the deviation from normal behavior in MioT and CioT devices operating at the sensing or perception layer due to adversarial attacks often requires the model to be trained using current and realistic data. Unfortunately, while real-time data are realistic and relevant, they are largely imbalanced. Imbalanced data have a skewed class distribution and low-similarity index, thus hindering the model’s ability to recognize important features in the dataset and make accurate predictions. Data-driven learning using data sampling, resampling, and generative methods can lessen the adverse impact of a data imbalance on the AMiDS model’s performance and prediction accuracy. Generative methods enable passive adversarial learning. This paper investigates several data sampling, resampling, and generative methods. It examines their impacts on the performance and prediction accuracy of AMiDS models trained using imbalanced data drawn from the UNSW_2018_IoT_Botnet dataset, a publicly available IoT dataset from the IEEEDataPort. Furthermore, it evaluates the performance and predictability of these models when trained using data transformation methods, such as normalization and one-hot encoding, to cover a skewed distribution, data sampling and resampling methods to address data imbalances, and generative methods to train the models to increase the model’s robustness to recognize new but similar attacks. In this initial study, we focus on CioT systems and train PCA-based and oSVM-based AMiDS models constructed using low-complexity PCA and one-class SVM (oSVM) ML algorithms to fit an imbalanced ground truth IoT dataset. Overall, we consider the rare event prediction case where the minority class distribution is disproportionately low compared to the majority class distribution. We plan to use transfer learning in future studies to generalize our initial findings to the MioT environment. We focus on CioT systems and MioT environments instead of traditional or non-critical IoT environments due to the stringent low energy, the minimal response time constraints, and the variety of low-power, situational-aware (or both) things operating at the sensing or perception layer in a highly complex and open environment. Full article

(This article belongs to the Special Issue Intrusion, Malware Detection and Prevention in Networks)

► Show Figures

Figure 1

44 pages, 1078 KiB

Open AccessArticle

Cyberattacks in Smart Grids: Challenges and Solving the Multi-Criteria Decision-Making for Cybersecurity Options, Including Ones That Incorporate Artificial Intelligence, Using an Analytical Hierarchy Process

by Ayat-Allah Bouramdane

J. Cybersecur. Priv. 2023, 3(4), 662-705; https://doi.org/10.3390/jcp3040031 - 27 Sep 2023

Cited by 6 | Viewed by 5502

Abstract

Smart grids have emerged as a transformative technology in the power sector, enabling efficient energy management. However, the increased reliance on digital technologies also exposes smart grids to various cybersecurity threats and attacks. This article provides a comprehensive exploration of cyberattacks and cybersecurity [...] Read more.

Smart grids have emerged as a transformative technology in the power sector, enabling efficient energy management. However, the increased reliance on digital technologies also exposes smart grids to various cybersecurity threats and attacks. This article provides a comprehensive exploration of cyberattacks and cybersecurity in smart grids, focusing on critical components and applications. It examines various cyberattack types and their implications on smart grids, backed by real-world case studies and quantitative models. To select optimal cybersecurity options, the study proposes a multi-criteria decision-making (MCDM) approach using the analytical hierarchy process (AHP). Additionally, the integration of artificial intelligence (AI) techniques in smart-grid security is examined, highlighting the potential benefits and challenges. Overall, the findings suggest that “security effectiveness” holds the highest importance, followed by “cost-effectiveness”, “scalability”, and “Integration and compatibility”, while other criteria (i.e., “performance impact”, “manageability and usability”, “compliance and regulatory requirements”, “resilience and redundancy”, “vendor support and collaboration”, and “future readiness”) contribute to the evaluation but have relatively lower weights. Alternatives such as “access control and authentication” and “security information and event management” with high weighted sums are crucial for enhancing cybersecurity in smart grids, while alternatives such as “compliance and regulatory requirements” and “encryption” have lower weighted sums but still provide value in their respective criteria. We also find that “deep learning” emerges as the most effective AI technique for enhancing cybersecurity in smart grids, followed by “hybrid approaches”, “Bayesian networks”, “swarm intelligence”, and “machine learning”, while “fuzzy logic”, “natural language processing”, “expert systems”, and “genetic algorithms” exhibit lower effectiveness in addressing smart-grid cybersecurity. The article discusses the benefits and drawbacks of MCDM-AHP, proposes enhancements for its use in smart-grid cybersecurity, and suggests exploring alternative MCDM techniques for evaluating security options in smart grids. The approach aids decision-makers in the smart-grid field to make informed cybersecurity choices and optimize resource allocation. Full article

► Show Figures

Graphical abstract

Journal Menu

Journal Browser

J. Cybersecur. Priv., Volume 3, Issue 4 (December 2023) – 11 articles

Further Information

Guidelines

MDPI Initiatives

Follow MDPI