Hybrid Data Processing by Combining Machine Learning, Expert, Safety and Security

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 9250

Special Issue Editors


Guest Editor
Faculty of Digital Science and Technology, Macau Millennium College, Macau, China
Interests: software engineering; data processing and analysis; system modeling

Guest Editor
Institute for Data Engineering and Sciences, University of Saint Joseph, Macau, China
Interests: structure learning; Bayesian networks; artificial intelligence

Guest Editor
School of Computer & Information Technology, Beijing Jiaotong University, Beijing 100044, China
Interests: data and knowledge engineering; artificial intelligence and applications

Guest Editor
Faculty of Data Science, City University of Macau, Macau 999078, China
Interests: blockchain; federated learning; attribute encryption

Special Issue Information

Dear Colleagues,

Data contain important information and knowledge that can advance human endeavour. As science and technology progress, the demands placed on data mining and data analysis continue to grow. At present, machine learning and deep learning are widely used in data processing and have achieved good results. In some cases, experienced experts can reason and negotiate with machine-assisted and deep learning methods to reach a balance that saves resources and improves results. While mining and analyzing data, the issues of data safety, data security, and data privacy also demand broad attention.

Therefore, how to reasonably integrate machine learning with expert systems, data safety, and data security into hybrid data intelligence is a problem worth studying.

This Special Issue welcomes all related theoretical and applied works that involve the above areas, in particular the combination of machine learning with these other concerns.

Original research articles and comments are welcome in this Special Issue. Research fields may include (but are not limited to) the following: machine learning, data mining, data safety, and data security. The goal of the Special Issue is to promote the combination of machine learning with expert systems, data safety, and data security.

We look forward to receiving your contribution.

Prof. Dr. Zhiming Cai
Prof. Dr. Wencai Du
Dr. Zuobin Ying
Prof. Dr. Zhihai Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • machine learning 
  • data intelligence 
  • data safety 
  • data security 
  • expert system

Published Papers (9 papers)

Research

20 pages, 4716 KiB  
Article
SSGCL: Simple Social Recommendation with Graph Contrastive Learning
by Zhihua Duan, Chun Wang and Wending Zhong
Mathematics 2024, 12(7), 1107; https://doi.org/10.3390/math12071107 - 07 Apr 2024
Viewed by 355
Abstract
As user–item interaction information is typically limited, collaborative filtering (CF)-based recommender systems often suffer from the data sparsity issue. To address this issue, recent recommender systems have turned to graph neural networks (GNNs) due to their superior performance in capturing high-order relationships. Furthermore, some of these GNN-based recommendation models also attempt to incorporate other information. They either extract self-supervised signals to mitigate the data sparsity problem or employ social information to assist with learning better representations under a social recommendation setting. However, only a few methods can take full advantage of these different aspects of information. Based on some testing, we believe most of these methods are complex and redundantly designed, which may lead to sub-optimal results. In this paper, we propose SSGCL, which is a recommendation system model that utilizes both social information and self-supervised information. We design a GNN-based propagation strategy that integrates social information with interest information in a simple yet effective way to learn user–item representations for recommendations. In addition, a specially designed contrastive learning module is employed to take advantage of the self-supervised signals for a better user–item representation distribution. The contrastive learning module is jointly optimized with the recommendation module to benefit the final recommendation result. Experiments on several benchmark data sets demonstrate the significant improvement in performance achieved by our model when compared with baseline models. Full article
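
As a rough illustration of the joint objective described in the abstract, the sketch below combines a BPR-style recommendation loss with an InfoNCE-style contrastive term between two embedding views (here labelled social_view and interest_view, which are illustrative assumptions). It is a minimal PyTorch sketch, not the authors' SSGCL implementation.

import torch
import torch.nn.functional as F

def info_nce(view_a: torch.Tensor, view_b: torch.Tensor, temperature: float = 0.2) -> torch.Tensor:
    """InfoNCE-style contrastive loss between two embedding views of the same nodes."""
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature          # pairwise similarities
    labels = torch.arange(a.size(0))          # positives are the matching rows
    return F.cross_entropy(logits, labels)

def joint_loss(user_emb, pos_item_emb, neg_item_emb, social_view, interest_view, alpha=0.1):
    """BPR recommendation loss jointly optimized with a contrastive term (alpha is illustrative)."""
    pos_scores = (user_emb * pos_item_emb).sum(dim=-1)
    neg_scores = (user_emb * neg_item_emb).sum(dim=-1)
    bpr = -F.logsigmoid(pos_scores - neg_scores).mean()
    return bpr + alpha * info_nce(social_view, interest_view)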

13 pages, 1061 KiB  
Article
GA-CatBoost-Weight Algorithm for Predicting Casualties in Terrorist Attacks: Addressing Data Imbalance and Enhancing Performance
by Yuxiang He, Baisong Yang and Chiawei Chu
Mathematics 2024, 12(6), 818; https://doi.org/10.3390/math12060818 - 11 Mar 2024
Viewed by 439
Abstract
Terrorism poses a significant threat to international peace and stability. The ability to predict potential casualties resulting from terrorist attacks, based on specific attack characteristics, is vital for protecting the safety of innocent civilians. However, conventional data sampling methods struggle to effectively address the challenge of data imbalance in textual features. To tackle this issue, we introduce a novel algorithm, GA-CatBoost-Weight, designed for predicting whether terrorist attacks will lead to casualties among innocent civilians. Our approach begins with feature selection using the RF-RFE method, followed by leveraging the CatBoost algorithm to handle diverse modal features comprehensively and to mitigate data imbalance. Additionally, we employ Genetic Algorithm (GA) to finetune hyperparameters. Experimental validation has demonstrated the superior performance of our method, achieving a sensitivity of 92.68% and an F1 score of 90.99% with fewer iterations. To the best of our knowledge, our study is the pioneering research that applies CatBoost to address the prediction of terrorist attack outcomes. Full article
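
To make the pipeline in the abstract concrete, the following is a minimal sketch of a genetic-algorithm-style hyperparameter search around CatBoost; the search space, fitness function (F1 on a validation split), and mutation scheme are illustrative assumptions, not the paper's GA-CatBoost-Weight configuration.

import random
from catboost import CatBoostClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def evaluate(params, X, y):
    """Fitness = F1 score of a CatBoost model trained with the candidate hyperparameters."""
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, stratify=y, random_state=0)
    model = CatBoostClassifier(iterations=int(params["iterations"]),
                               depth=int(params["depth"]),
                               learning_rate=params["learning_rate"],
                               verbose=False)
    model.fit(X_tr, y_tr)
    return f1_score(y_val, model.predict(X_val))

def ga_search(X, y, generations=5, population=6):
    """Very small genetic search over three hyperparameters (mutation only, elitist selection)."""
    sample = lambda: {"iterations": random.choice([200, 400, 800]),
                      "depth": random.randint(4, 10),
                      "learning_rate": random.uniform(0.01, 0.3)}
    pop = [sample() for _ in range(population)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda p: evaluate(p, X, y), reverse=True)
        parents = scored[: population // 2]                       # keep the fittest half
        children = [dict(p, depth=max(4, min(10, p["depth"] + random.choice([-1, 1]))))
                    for p in parents]                             # mutate depth slightly
        pop = parents + children
    return max(pop, key=lambda p: evaluate(p, X, y))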

21 pages, 1178 KiB  
Article
CLG: Contrastive Label Generation with Knowledge for Few-Shot Learning
by Han Ma, Baoyu Fan, Benjamin K. Ng and Chan-Tong Lam
Mathematics 2024, 12(3), 472; https://doi.org/10.3390/math12030472 - 01 Feb 2024
Viewed by 579
Abstract
Training large-scale models needs big data. However, the few-shot problem is difficult to resolve due to inadequate training data. It is valuable to perform a task with only a few training samples, since collecting big data is often impractical in application scenarios because of cost and resource constraints. To tackle this problem, we present a simple and efficient method, contrastive label generation with knowledge for few-shot learning (CLG). Specifically, we: (1) propose contrastive label generation to align the label with the data input and enhance feature representations; (2) propose a label knowledge filter to avoid noise during the injection of explicit knowledge into the data and labels; (3) employ a label logits mask to simplify the task; (4) employ a multi-task fusion loss to learn different perspectives from the training set. The experiments demonstrate that CLG achieves an accuracy of 59.237%, about 3% higher than the best baseline. This shows that CLG obtains better features and gives the model more information about the input sentences to improve the classification ability. Full article
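
The sketch below illustrates, under stated assumptions, how a masked classification loss can be fused with a contrastive label-alignment term; the weighting, temperature, and tensor names are illustrative and not taken from the CLG paper.

import torch
import torch.nn.functional as F

def masked_fusion_loss(logits, labels, label_mask, emb, label_emb, weight=0.5, temperature=0.1):
    """Multi-task fusion: masked classification loss + a contrastive label-alignment term.

    `label_mask` marks which label logits are valid for the current task (a logits mask);
    `emb` / `label_emb` are sentence and label embeddings that should be pulled together.
    """
    # Mask out logits of labels that are irrelevant to the task before the softmax.
    masked_logits = logits.masked_fill(~label_mask, float("-inf"))
    ce = F.cross_entropy(masked_logits, labels)

    # Contrastive alignment between input representations and their label representations.
    a = F.normalize(emb, dim=-1)
    b = F.normalize(label_emb, dim=-1)
    sim = a @ b.t() / temperature
    contrast = F.cross_entropy(sim, torch.arange(a.size(0)))

    return ce + weight * contrast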

20 pages, 522 KiB  
Article
Predicting Typhoon Flood in Macau Using Dynamic Gaussian Bayesian Network and Surface Confluence Analysis
by Shujie Zou, Chiawei Chu, Weijun Dai, Ning Shen, Jia Ren and Weiping Ding
Mathematics 2024, 12(2), 340; https://doi.org/10.3390/math12020340 - 19 Jan 2024
Viewed by 897
Abstract
A typhoon passing through or making landfall in a coastal city may result in seawater intrusion and continuous rainfall, which may cause urban flooding. The urban flood disaster caused by a typhoon is a dynamic process that changes over time, and a dynamic Gaussian Bayesian network (DGBN) is used to model these time series events in this paper. The scene data generated by each typhoon are different, which means that each typhoon has different characteristics. This paper establishes multiple DGBNs based on the historical data of Macau flooding caused by multiple typhoons, and a similarity analysis is made between the scene data related to the current flooding to be predicted and the scene data of historical flooding. The DGBN most similar to the scene characteristics of the current flooding is selected as the prediction network for the current flooding. According to the topography, the influence of the surface confluence is considered, and a Manning-formula-based analysis method is proposed. The Manning formula is combined with the DGBN to obtain the final prediction model, DGBN-m, which takes into account the effects of time-series and non-time-series factors. The flooding data provided by the Macau Meteorological Bureau are used to carry out experiments, and it is shown that the proposed model can predict the flooding depth well in a specific area of Macau with only a small amount of data, with the best prediction accuracy reaching 84%. Finally, a generalization analysis is performed to further confirm the validity of the proposed model. Full article
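
For reference, the Manning formula mentioned in the abstract is the standard open-channel flow relation V = (1/n) * R^(2/3) * S^(1/2); the short sketch below computes velocity and discharge from roughness, hydraulic radius, and slope. The example values are illustrative and unrelated to the paper's DGBN-m model.

def manning_velocity(n: float, hydraulic_radius: float, slope: float) -> float:
    """Manning formula (SI units): V = (1/n) * R^(2/3) * S^(1/2)."""
    return (1.0 / n) * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5

def manning_discharge(n: float, area: float, hydraulic_radius: float, slope: float) -> float:
    """Discharge Q = A * V, used to estimate how much surface runoff converges on an area."""
    return area * manning_velocity(n, hydraulic_radius, slope)

# Example (illustrative values): concrete channel (n = 0.015), R = 0.8 m, slope = 0.002, A = 3 m^2.
print(manning_discharge(0.015, 3.0, 0.8, 0.002))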

13 pages, 560 KiB  
Article
Healthcare Cost Prediction Based on Hybrid Machine Learning Algorithms
by Shujie Zou, Chiawei Chu, Ning Shen and Jia Ren
Mathematics 2023, 11(23), 4778; https://doi.org/10.3390/math11234778 - 27 Nov 2023
Cited by 1 | Viewed by 1176
Abstract
Healthcare cost is an issue of concern right now. While many complex machine learning algorithms have been proposed to analyze healthcare cost and address the shortcomings of linear regression and reliance on expert analyses, these algorithms do not take into account whether each characteristic variable contained in the healthcare data has a positive effect on predicting healthcare cost. This paper uses hybrid machine learning algorithms to predict healthcare cost. First, network structure learning algorithms (a score-based algorithm, constraint-based algorithm, and hybrid algorithm) for a Conditional Gaussian Bayesian Network (CGBN) are used to learn the isolated characteristic variables in healthcare data without changing the data properties (i.e., discrete or continuous). Then, the isolated characteristic variables are removed from the original data and the remaining data used to train regression algorithms. Two public healthcare datasets are used to test the performance of the proposed hybrid machine learning algorithm model. Experiments show that when compared to popular single machine learning algorithms (Long Short Term Memory, Random Forest, etc.) the proposed scheme can obtain similar or higher prediction accuracy with a reduced amount of data. Full article
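
A minimal sketch of the "remove isolated variables, then regress" idea follows; it assumes the structure-learning step has already produced an edge list (e.g. from a CGBN learner) and simply drops feature columns that end up with no edges before fitting a standard regressor. The column names in the usage comment are hypothetical.

import networkx as nx
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def drop_isolated_features(df: pd.DataFrame, edges, target: str) -> pd.DataFrame:
    """Remove feature columns that are isolated (no edges) in a learned network structure.

    `edges` is assumed to come from a separate structure-learning step; it is treated as given here.
    """
    g = nx.Graph()
    g.add_nodes_from(df.columns)
    g.add_edges_from(edges)
    isolated = [node for node in nx.isolates(g) if node != target]
    return df.drop(columns=isolated)

# Usage sketch (hypothetical names):
# pruned = drop_isolated_features(data, learned_edges, target="charges")
# model = RandomForestRegressor().fit(pruned.drop(columns=["charges"]), pruned["charges"])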

19 pages, 2159 KiB  
Article
Enhancing the Security and Privacy in the IoT Supply Chain Using Blockchain and Federated Learning with Trusted Execution Environment
by Linkai Zhu, Shanwen Hu, Xiaolian Zhu, Changpu Meng and Maoyi Huang
Mathematics 2023, 11(17), 3759; https://doi.org/10.3390/math11173759 - 01 Sep 2023
Cited by 1 | Viewed by 1190
Abstract
Federated learning has emerged as a promising technique for the Internet of Things (IoT) in various domains, including supply chain management. It enables IoT devices to collaboratively learn without exposing their raw data, ensuring data privacy. However, federated learning faces the threats of local data tampering and upload process attacks. This paper proposes an innovative framework that leverages Trusted Execution Environment (TEE) and blockchain technology to address the data security and privacy challenges in federated learning for IoT supply chain management. Our framework achieves the security of local data computation and the tampering resistance of data update uploads using TEE and the blockchain. We adopt Intel Software Guard Extensions (SGXs) as the specific implementation of TEE, which can guarantee the secure execution of local models on SGX-enabled processors. We also use consortium blockchain technology to build a verification network and consensus mechanism, ensuring the security and tamper resistance of the data upload and aggregation process. Finally, each cluster can obtain the aggregated parameters from the blockchain. To evaluate the performance of our proposed framework, we conducted several experiments with different numbers of participants and different datasets and validated the effectiveness of our scheme. We tested the final global model obtained from federated training on a test dataset and found that increasing both the number of iterations and the number of participants improves its accuracy. For instance, it reaches 94% accuracy with one participant and five iterations and 98.5% accuracy with ten participants and thirty iterations. Full article
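
The aggregation step described above is essentially federated averaging; a minimal sketch follows, covering only the sample-weighted parameter averaging and not the SGX enclave execution or blockchain verification that the paper builds around it.

from typing import Dict, List
import numpy as np

def federated_average(updates: List[Dict[str, np.ndarray]],
                      sample_counts: List[int]) -> Dict[str, np.ndarray]:
    """FedAvg-style aggregation: weight each client's parameters by its number of samples."""
    total = float(sum(sample_counts))
    keys = updates[0].keys()
    return {
        k: sum(w * update[k] for w, update in zip(sample_counts, updates)) / total
        for k in keys
    }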

16 pages, 973 KiB  
Article
Parallel Dense Video Caption Generation with Multi-Modal Features
by Xuefei Huang, Ka-Hou Chan, Wei Ke and Hao Sheng
Mathematics 2023, 11(17), 3685; https://doi.org/10.3390/math11173685 - 26 Aug 2023
Cited by 1 | Viewed by 1094
Abstract
The task of dense video captioning is to generate detailed natural-language descriptions for an original video, which requires deep analysis and mining of semantic captions to identify events in the video. Existing methods typically follow a localisation-then-captioning sequence within given frame sequences, resulting in caption generation that is highly dependent on which objects have been detected. This work proposes a parallel-based dense video captioning method that can simultaneously address the mutual constraint between event proposals and captions. Additionally, a deformable Transformer framework is introduced to reduce or eliminate the manual thresholding of hyperparameters in such methods. An information transfer station is also added for representation organisation; it receives the hidden features extracted from a frame and implicitly generates multiple event proposals. The proposed method also adopts an LSTM (long short-term memory) network with deformable attention as the main layer for caption generation. Experimental results show that the proposed method outperforms other methods in this area to a certain degree on the ActivityNet Captions dataset, providing competitive results. Full article

11 pages, 4951 KiB  
Article
Optimal Multi-Attribute Auctions Based on Multi-Scale Loss Network
by Zefeng Zhao, Haohao Cai, Huawei Ma, Shujie Zou and Chiawei Chu
Mathematics 2023, 11(14), 3240; https://doi.org/10.3390/math11143240 - 24 Jul 2023
Viewed by 871
Abstract
There is a strong demand for multi-attribute auctions in real-world scenarios, since non-price attributes allow participants to express their preferences and the item's value. However, this also makes it difficult to perform calculations with incomplete information, as a single attribute, price, no longer determines the revenue. At the same time, the mechanism must satisfy individual rationality (IR) and incentive compatibility (IC). This paper proposes an innovative dual network to solve these problems. A shared MLP module is constructed to extract bidder features, and a multi-scale loss is used to assess network status and guide updates. The method was tested on real and extended cases, showing that the approach effectively improves the auctioneer's revenue without compromising the bidders. Full article
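
As a loose illustration of a shared-MLP dual network, the sketch below extracts bidder features once and feeds them to two heads, then combines two differently weighted loss terms. The specific heads and loss terms are assumptions for illustration only and do not reproduce the paper's multi-scale loss or its IR/IC treatment.

import torch
import torch.nn as nn

class DualAuctionNet(nn.Module):
    """Shared MLP feature extractor with two heads (illustratively: allocation and payment scores)."""
    def __init__(self, in_dim: int, hidden: int = 64):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, hidden), nn.ReLU())
        self.alloc_head = nn.Linear(hidden, 1)
        self.pay_head = nn.Linear(hidden, 1)

    def forward(self, bids: torch.Tensor):
        features = self.shared(bids)                 # shared bidder features
        return self.alloc_head(features), self.pay_head(features)

def combined_loss(alloc, pay, revenue_target, weights=(1.0, 0.5)):
    """Combine two loss terms at different weights into one training signal (illustrative)."""
    revenue_loss = torch.mean((pay.sum(dim=0) - revenue_target) ** 2)
    regularity_loss = torch.mean(torch.relu(-alloc))  # penalise negative allocation scores
    return weights[0] * revenue_loss + weights[1] * regularity_loss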

22 pages, 1082 KiB  
Article
From Replay to Regeneration: Recovery of UDP Flood Network Attack Scenario Based on SDN
by Yichuan Wang, Junxia Ding, Tong Zhang, Yeqiu Xiao and Xinhong Hei
Mathematics 2023, 11(8), 1897; https://doi.org/10.3390/math11081897 - 17 Apr 2023
Cited by 1 | Viewed by 1334
Abstract
In recent years, various network attacks have emerged. These attacks are often recorded in the form of Pcap data, which contains many attack details and characteristics that cannot be analyzed through traditional methods alone. Therefore, restoring the network attack scenario through scene reconstruction to achieve data regeneration has become an important entry point for detecting and defending against network attacks. However, current network attack scenarios mainly reproduce the attacker’s attack steps by building a sequence collection of attack scenarios, constructing an attack behavior diagram, or simply replaying the captured network traffic. These methods still have shortcomings in terms of traffic regeneration. To address this limitation, this paper proposes an SDN-based network attack scenario recovery method. By parsing Pcap data and utilizing network topology reconstruction, probability, and packet sequence models, network traffic data can be regenerated. The experimental results show that the proposed method is closer to the real network, with a higher similarity between the reconstructed and actual attack scenarios. Additionally, this method allows for adjusting the intensity of the network attack and the generated topology nodes, which helps network defenders better understand the attackers’ posture and analyze and formulate corresponding security strategies. Full article
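
A minimal sketch of the Pcap-parsing and topology-reconstruction step follows, using scapy to read packets and networkx to rebuild a directed source-to-destination graph weighted by UDP packet counts; the probability and packet-sequence models from the paper are not included.

import networkx as nx
from scapy.all import rdpcap
from scapy.layers.inet import IP, UDP

def build_topology(pcap_path: str) -> nx.DiGraph:
    """Rebuild a simple source->destination topology graph from captured UDP traffic."""
    graph = nx.DiGraph()
    for pkt in rdpcap(pcap_path):
        if IP in pkt and UDP in pkt:
            src, dst = pkt[IP].src, pkt[IP].dst
            # Count packets per directed flow; heavy edges hint at flood traffic.
            if graph.has_edge(src, dst):
                graph[src][dst]["packets"] += 1
            else:
                graph.add_edge(src, dst, packets=1)
    return graph

# Usage sketch (hypothetical file name):
# g = build_topology("udp_flood.pcap")
# heavy = sorted(g.edges(data=True), key=lambda e: e[2]["packets"], reverse=True)[:10]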
