Data Mining and Machine Learning in Social Network Analysis

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 July 2024

Special Issue Editor


Dr. Dionisios Sotiropoulos
Guest Editor
Department of Informatics, University of Piraeus, Karaoli & Dimitriou 80, 18534 Piraeus, Greece
Interests: machine learning; data mining; evolutionary computing; signal processing; digital social networks

Special Issue Information

Dear Colleagues,

Data mining and machine learning have found significant applications in the realm of social networks, transforming the way we understand and interact with online communities. These technologies enable the extraction of valuable insights from the massive volumes of data generated on platforms like Facebook, Twitter, and Instagram. By analyzing user behavior, content interactions, and network structures, data mining uncovers hidden patterns and trends that inform personalized content recommendations, targeted advertising, and even sentiment analysis. Machine learning algorithms, on the other hand, play a pivotal role in predicting user preferences, identifying influencers, and detecting anomalies such as fake accounts or cyberbullying.

In the context of social networks, the synergy of data mining and machine learning drives the development of recommendation systems that cater to individual interests, enhancing user engagement and retention. Moreover, the integration of these technologies allows platforms to combat the spread of misinformation and harmful content by recognizing patterns of virality and identifying sources of fake news. As social networks continue to evolve, data mining and machine learning promise to reshape user experiences, fostering more personalized, secure, and socially responsible interactions in the digital landscape. However, ethical considerations surrounding data privacy, algorithmic biases, and potential misuse highlight the need for a thoughtful and balanced approach in leveraging these technologies for the benefit of both users and society as a whole.

This Special Issue will accept publications that fall within the following research topics:

  • Community detection: development of novel machine learning algorithms to identify and classify communities within social networks based on structural and behavioral patterns.
  • Influence propagation modeling: investigating machine learning approaches to model and predict the spread of influence and information within social networks.
  • Anomaly detection: design of techniques using machine learning to detect anomalous behaviors and activities within social networks, such as bots, spam, and unusual user interactions.
  • Link prediction: exploring predictive models using machine learning to forecast future connections between users or entities in social networks.
  • Sentiment analysis: development of advanced sentiment analysis methods using machine learning to understand and predict user emotions and opinions within social media posts.
  • User profiling and personalization: utilizing machine learning to create accurate user profiles for personalized content recommendation and targeted advertising in social networks.
  • Fake news detection: designing machine learning algorithms to identify and combat the dissemination of fake news and misinformation within social networks.
  • Opinion dynamics modeling: investigating how machine learning can be employed to model the evolution of opinions and beliefs in social networks over time.
  • Network evolution prediction: development of predictive models using machine learning to anticipate changes and shifts in the structure and dynamics of social networks.
  • Graph representation learning: exploring techniques for learning informative node and graph embeddings in social networks, enhancing various downstream tasks.
  • Network robustness analysis: using machine learning to study the vulnerability and resilience of social networks against attacks, failures, and cascading events.
  • Privacy preservation: researching machine learning methods to analyze and mitigate privacy risks in social networks while preserving data utility.
  • Temporal network analysis: development of models using machine learning to analyze the temporal dynamics of social networks and capture patterns of interactions over time.
  • Behavioral pattern recognition: designing algorithms that utilize machine learning to recognize recurring behavioral patterns and trends within social network activities.
  • Cross-network analysis: investigating methods to combine information from multiple social networks or platforms using machine learning to gain deeper insights.
  • Network visualization: exploring machine learning-driven visualization techniques to represent complex social network structures and interactions in interpretable ways.
  • Opinion leader identification: developing machine learning approaches to identify influential users and opinion leaders within social networks based on their impact and interactions.
  • Gender and demographic analysis: using machine learning to infer user gender, age, and other demographics from their social network activities, enabling targeted studies.
  • Network fairness and bias: researching machine learning techniques to identify and mitigate biases in social network algorithms that can lead to unfair outcomes.
  • Multi-modal social network analysis: combining textual, visual, and other modalities in social network data using machine learning for a comprehensive understanding of user interactions.

Dr. Dionisios Sotiropoulos
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • data mining
  • machine learning
  • recommendation systems
  • social networks

Published Papers (4 papers)

Research

16 pages, 339 KiB  
Article
RumorLLM: A Rumor Large Language Model-Based Fake-News-Detection Data-Augmentation Approach
by Jianqiao Lai, Xinran Yang, Wenyue Luo, Linjiang Zhou, Langchen Li, Yongqi Wang and Xiaochuan Shi
Appl. Sci. 2024, 14(8), 3532; https://doi.org/10.3390/app14083532 - 22 Apr 2024
Abstract
With the rapid development of the Internet and social media, false information, rumors, and misleading content have become pervasive, posing significant threats to public opinion and social stability, and even causing serious societal harm. This paper introduces a novel solution to address the challenges of fake news detection, presenting the “Rumor Large Language Models” (RumorLLM), a large language model fine-tuned with rumor writing styles and content. The key contributions include the development of RumorLLM and a data-augmentation method for small categories, effectively mitigating the issue of category imbalance in real-world fake-news datasets. Experimental results on the BuzzFeed and PolitiFact datasets demonstrate the superiority of the proposed model over baseline methods, particularly in F1 score and AUC-ROC. The model’s robust performance highlights its effectiveness in handling imbalanced datasets and provides a promising solution to the pressing issue of false-information proliferation.
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)

23 pages, 4610 KiB  
Article
Exploring the Performance of Continuous-Time Dynamic Link Prediction Algorithms
by Raphaël Romero, Maarten Buyl, Tijl De Bie and Jefrey Lijffijt
Appl. Sci. 2024, 14(8), 3516; https://doi.org/10.3390/app14083516 - 22 Apr 2024
Abstract
Dynamic Link Prediction (DLP) addresses the prediction of future links in evolving networks. However, accurately portraying the performance of DLP algorithms poses challenges that might impede progress in the field. Importantly, common evaluation pipelines usually calculate ranking or binary classification metrics, where the scores of observed interactions (positives) are compared with those of randomly generated ones (negatives). However, a single metric is not sufficient to fully capture the differences between DLP algorithms, and is prone to overly optimistic performance evaluation. Instead, an in-depth evaluation should reflect performance variations across different nodes, edges, and time segments. In this work, we contribute tools to perform such a comprehensive evaluation. (1) We propose Birth–Death diagrams, a simple but powerful visualization technique that illustrates the effect of time-based train–test splitting on the difficulty of DLP on a given dataset. (2) We describe an exhaustive taxonomy of negative sampling methods that can be used at evaluation time. (3) We carry out an empirical study of the effect of the different negative sampling strategies. Our comparison between heuristics and state-of-the-art memory-based methods on various real-world datasets confirms a strong effect of using different negative sampling strategies on the test area under the curve (AUC). Moreover, we conduct a visual exploration of the predictions, with additional insights into which types of errors are prominent over time.
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)
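
The evaluation setup this abstract refers to (scores of observed interactions compared with those of randomly generated negatives, summarized as a test AUC after a time-based train–test split) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy event list, and the common-neighbours scorer are all hypothetical choices made for illustration, and timestamps are used only to split the data, which simplifies the continuous-time setting considerably.

```python
import random


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sample_random_negatives(nodes, observed_edges, k, rng):
    """Draw k node pairs uniformly at random, rejecting pairs that are observed edges."""
    observed = {frozenset(edge) for edge in observed_edges}
    negatives = []
    while len(negatives) < k:
        u, v = rng.sample(nodes, 2)
        if frozenset((u, v)) not in observed:
            negatives.append((u, v))
    return negatives


def build_adjacency(edges):
    """Undirected adjacency sets built from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours_score(adj, u, v):
    """Hypothetical heuristic scorer: number of shared neighbours in the training graph."""
    return len(adj.get(u, set()) & adj.get(v, set()))


def evaluate(events, split_time, num_negatives, seed=0):
    """Time-based split, random negative sampling, and a single test AUC."""
    rng = random.Random(seed)
    train = [(u, v) for u, v, t in events if t < split_time]
    test = [(u, v) for u, v, t in events if t >= split_time]
    nodes = sorted({n for u, v, _ in events for n in (u, v)})
    adj = build_adjacency(train)
    negatives = sample_random_negatives(nodes, train + test, num_negatives, rng)
    pos = [common_neighbours_score(adj, u, v) for u, v in test]
    neg = [common_neighbours_score(adj, u, v) for u, v in negatives]
    return auc_from_scores(pos, neg)


# Toy interaction events (u, v, timestamp); purely illustrative data.
TOY_EVENTS = [
    ("a", "b", 1), ("b", "c", 2), ("a", "d", 3), ("c", "d", 4),
    ("d", "e", 5), ("a", "c", 6), ("b", "d", 7),
]
toy_auc = evaluate(TOY_EVENTS, split_time=6, num_negatives=3)
```

A single number such as `toy_auc` is exactly the kind of summary the abstract cautions against relying on: it depends on how the negatives were drawn and hides variation across nodes, edges, and time segments.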

24 pages, 501 KiB  
Article
Outlier Detection and Prediction in Evolving Communities
by Nikolaos Sachpenderis and Georgia Koloniari
Appl. Sci. 2024, 14(6), 2356; https://doi.org/10.3390/app14062356 - 11 Mar 2024
Abstract
Community detection in social networks is of great importance and is used in a variety of applications such as recommendation systems and targeted advertising. While detecting dense groups with high levels of connectivity and similar interests between their members is the main target of traditional network analysis, finding network members whose behavior differs markedly from that of the majority of nodes is important as well. These nodes are known as outliers, and their accurate detection can be very useful; when outliers are marked as noisy nodes, their early exclusion from analysis can lead to substantial computational savings. On the other hand, they can represent interesting components that call for further investigation to find the reasons for their outlying behavior and possible ways to include them in a neighboring community. Both community and outlier detection are challenging in temporal environments where changes occur in real time; thus, dynamic methods need to be deployed rather than static methods. In our work, we take into account the content of the network, in contrast to most related studies, where only the network’s structure contributes to community formation. We define an adaptive outlier score to be assigned to each node in order to quantify its outlierness, and introduce a complete online community detection algorithm that analyzes both the network’s structure and content while at the same time detecting community outliers. To evaluate our method, we retrieved and processed two real datasets regarding social networks with temporal and content information. Experimental results show that our method is capable of detecting outliers in real-time evolving communities and provides an outlier score which is a better metric of each node’s outlierness compared to widely used metrics. Finally, experimental results indicate that our method is suitable for predicting the status of future nodes based on their current outlier score.
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)

19 pages, 1345 KiB  
Article
Two-Stage Dimensionality Reduction for Social Media Engagement Classification
by Jose Luis Vieira Sobrinho, Flavio Henrique Teles Vieira and Alisson Assis Cardoso
Appl. Sci. 2024, 14(3), 1269; https://doi.org/10.3390/app14031269 - 03 Feb 2024
Abstract
The high dimensionality of real-life datasets is one of the biggest challenges in the machine learning field. Due to the increased need for computational resources, the higher the dimension of the input data is, the more difficult the learning task will be, a phenomenon commonly referred to as the curse of dimensionality. Building on this premise, we propose a two-stage dimensionality reduction (TSDR) method for data classification. The first stage extracts high-quality features to a new subset by maximizing the pairwise separation probability, with the aim of avoiding overlap between individuals from different classes that are close to one another, also known as the class masking problem. The second stage takes the subset produced by the first stage and transforms it into a reduced final space in a way that maximizes the distance between the cluster centers of different classes while also minimizing the dispersion of instances within the same class. Hence, the second stage aims to improve the accuracy of the succeeding classifier by lowering its sensitivity to an imbalanced distribution of instances between different classes. Experiments on benchmark and social media datasets show that the proposed method is promising compared with some well-established algorithms, especially regarding social media engagement classification.
(This article belongs to the Special Issue Data Mining and Machine Learning in Social Network Analysis)