Text Mining, Machine Learning, and Natural Language Processing

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 May 2024 | Viewed by 20511

Special Issue Editors


Prof. Dr. Ahmed Rafea
Guest Editor
Computer Science and Engineering Department, American University in Cairo, AUC Avenue, New Cairo 11835, Egypt
Interests: data, text, and web mining; natural language processing and machine translation; knowledge engineering

Prof. Dr. Julian Szymanski
Guest Editor
Faculty of Electronics, Telecommunications and Informatics, Gdansk University of Technology, 80-233 Gdansk, Poland
Interests: natural language processing; intelligent signal analysis; artificial intelligence; data/text mining; machine learning; classification; pattern recognition; clustering

Special Issue Information

Dear Colleagues,

This Special Issue addresses text mining techniques for performing a variety of tasks on textual data. Text mining draws on machine learning and natural language processing to perform tasks such as knowledge extraction, information extraction, summarization, named entity extraction, relation extraction, text embeddings, sentiment classification, topic modeling, fake news identification, and others.

Topics of interest include but are not limited to the following:

  • Text classification and clustering;
  • Text representation using word, sentence, and document embeddings;
  • Text preprocessing using NLP techniques;
  • Text summarization;
  • Web and social content mining;
  • Information and knowledge extraction from textual corpora;
  • Text mining applications in different domains, such as legal, news, and biomedical;
  • Sentiment classification;
  • Opinion mining;
  • Topic modeling.

Prof. Dr. Ahmed Rafea
Prof. Dr. Julian Szymanski
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • classification
  • clustering
  • document embedding
  • information extraction
  • knowledge extraction
  • summarization
  • sentiment analysis
  • opinion mining
  • topic modeling

Published Papers (13 papers)


Research

17 pages, 3568 KiB  
Article
Innovative Use of Self-Attention-Based Ensemble Deep Learning for Suicide Risk Detection in Social Media Posts
by Hoan-Suk Choi and Jinhong Yang
Appl. Sci. 2024, 14(2), 893; https://doi.org/10.3390/app14020893 - 20 Jan 2024
Viewed by 755
Abstract
Suicidal ideation constitutes a critical concern in mental health, adversely affecting individuals and society at large. The early detection of such ideation is vital for providing timely support to individuals and mitigating its societal impact. Social media, serving as a platform for self-expression, offers a rich source of data that can reveal early symptoms of mental health issues. This paper introduces an innovative ensemble learning method named LSTM-Attention-BiTCN, which fuses LSTM and BiTCN models with a self-attention mechanism to detect signs of suicidality in social media posts. Our LSTM-Attention-BiTCN model demonstrated superior performance in comparison to baseline models in classification and suicidal ideation detection, achieving an accuracy of 0.9405, a precision of 0.9385, a recall of 0.9424, and an F1-score of 0.9405. The proposed model can aid healthcare professionals in accurately recognizing suicidal tendencies among social media users, thereby contributing to efforts to reduce suicide rates.
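
As a rough illustration of the fused architecture described above, the following PyTorch sketch runs a BiLSTM branch and a dilated-convolution branch in parallel and pools them with self-attention. The stand-in TCN block, layer sizes, and fusion scheme are assumptions for illustration, not the authors' exact model.

    import torch
    import torch.nn as nn

    class LstmAttnBiTcnSketch(nn.Module):
        """Parallel BiLSTM and dilated-convolution branches fused by
        self-attention pooling; all sizes are illustrative assumptions."""
        def __init__(self, vocab_size, emb_dim=128, hidden=64, num_classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
            # two dilated convolutions stand in for a BiTCN-style block
            self.tcn = nn.Sequential(
                nn.Conv1d(emb_dim, 2 * hidden, 3, padding=1, dilation=1), nn.ReLU(),
                nn.Conv1d(2 * hidden, 2 * hidden, 3, padding=2, dilation=2), nn.ReLU())
            self.attn = nn.Linear(4 * hidden, 1)   # scores each time step
            self.out = nn.Linear(4 * hidden, num_classes)

        def forward(self, ids):                             # ids: (batch, seq)
            x = self.emb(ids)                               # (batch, seq, emb)
            h_lstm, _ = self.lstm(x)                        # (batch, seq, 2*hidden)
            h_tcn = self.tcn(x.transpose(1, 2)).transpose(1, 2)
            h = torch.cat([h_lstm, h_tcn], dim=-1)          # (batch, seq, 4*hidden)
            w = torch.softmax(self.attn(h), dim=1)          # attention over time
            return self.out((w * h).sum(dim=1))             # pooled -> logits

    logits = LstmAttnBiTcnSketch(vocab_size=30000)(torch.randint(0, 30000, (4, 50)))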

14 pages, 434 KiB  
Article
Abstractive Summarizers Become Emotional on News Summarization
by Vicent Ahuir, José-Ángel González, Lluís-F. Hurtado and Encarna Segarra
Appl. Sci. 2024, 14(2), 713; https://doi.org/10.3390/app14020713 - 15 Jan 2024
Viewed by 707
Abstract
Emotions are central to understanding contemporary journalism; however, they are overlooked in automatic news summarization. Indeed, summaries serve as an entry point to the source article and may favor certain emotions to captivate the reader. Nevertheless, the emotional content of summarization corpora and the emotional behavior of summarization models remain unexplored. In this work, we explore the use of established methodologies to study the emotional content of summarization corpora and the emotional behavior of summarization models. Using these methodologies, we study the emotional content of two widely used summarization corpora, CNN/DailyMail and XSum, and the capabilities of three state-of-the-art transformer-based abstractive systems, BART, PEGASUS, and T5, for eliciting emotions in the generated summaries. The main findings are as follows: (i) emotions are persistent in the two summarization corpora, (ii) summarizers approximate the emotions of the reference summaries moderately well, and (iii) more than 75% of the emotions introduced by novel words in generated summaries are present in the reference ones. The combined use of these methodologies has allowed us to conduct a satisfactory study of the emotional content in news summarization.
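
A minimal sketch of the kind of corpus analysis the abstract describes: run an emotion classifier over reference and generated summaries and compare the label distributions. The checkpoint name and toy texts below are assumptions, not the paper's methodology or data.

    from collections import Counter
    from transformers import pipeline

    # Any off-the-shelf emotion classifier can play this role; the checkpoint
    # name below is an assumption, not the one used in the paper.
    clf = pipeline("text-classification",
                   model="j-hartmann/emotion-english-distilroberta-base")

    def emotion_counts(texts):
        # Count the dominant emotion label assigned to each text.
        return Counter(r["label"] for r in clf(texts, truncation=True))

    refs = ["The rescue effort brought the whole town together."]
    gens = ["Volunteers celebrated after the dramatic rescue."]
    print(emotion_counts(refs), emotion_counts(gens))   # compare distributions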

13 pages, 1120 KiB  
Article
Leveraging Prompt and Top-K Predictions with ChatGPT Data Augmentation for Improved Relation Extraction
by Ping Feng, Hang Wu, Ziqian Yang, Yunyi Wang and Dantong Ouyang
Appl. Sci. 2023, 13(23), 12746; https://doi.org/10.3390/app132312746 - 28 Nov 2023
Viewed by 768
Abstract
Relation extraction tasks aim to predict the type of relationship between two entities from a given text. However, many existing methods fail to fully utilize the semantic information and the output probability distribution of pre-trained language models, and existing data augmentation approaches for natural language processing (NLP) may introduce errors. To address these issues, we propose a method that introduces prompt information and Top-K prediction sets and utilizes ChatGPT for data augmentation to improve relation classification performance. First, we add prompt information before each sample and encode the modified samples with the pre-trained language model RoBERTa, using the resulting feature vectors to obtain the Top-K prediction set. We add a multi-attention mechanism to link the Top-K prediction set with the prompt information. We then reduce the possibility of introducing noise by bootstrapping ChatGPT so that it can better perform the data augmentation task, reducing unnecessary subsequent operations. Finally, we investigate the predefined relationship categories in the SemEval 2010 Task 8 dataset and the prediction results of the model and propose an entity location prediction task designed to assist the model in accurately determining the relative locations between entities. Experimental results indicate that our model achieves strong results on the SemEval 2010 Task 8 dataset.
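
A hedged sketch of the Top-K idea: prepend a prompt to the sample, encode it with RoBERTa, and keep the K most probable relation labels. The prompt format and entity markers are illustrative, and the classification head here is untrained; SemEval 2010 Task 8 defines 19 relation labels.

    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    tok = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=19)   # head is randomly initialized here

    prompt = "What is the relation between <e1> and <e2>?"
    sentence = "The <e1>people</e1> have been moving back into <e2>downtown</e2>."
    inputs = tok(prompt + " " + sentence, return_tensors="pt")

    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)     # (1, 19)
    topk = torch.topk(probs, k=3, dim=-1)                  # Top-K prediction set
    print(topk.indices.tolist(), topk.values.tolist())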

14 pages, 793 KiB  
Article
Unlocking Everyday Wisdom: Enhancing Machine Comprehension with Script Knowledge Integration
by Zhihao Zhou, Tianwei Yue, Chen Liang, Xiaoyu Bai, Dachi Chen, Congrui Hetang and Wenping Wang
Appl. Sci. 2023, 13(16), 9461; https://doi.org/10.3390/app13169461 - 21 Aug 2023
Cited by 2 | Viewed by 847
Abstract
Harnessing commonsense knowledge poses a significant challenge for machine comprehension systems. This paper focuses on incorporating a specific subset of commonsense knowledge: script knowledge, i.e., knowledge about the sequences of actions that individuals typically perform in everyday life. Our experiments were centered around the MCScript dataset, which was the basis of SemEval-2018 Task 11: Machine Comprehension using Commonsense Knowledge. As a baseline, we utilized our Three-Way Attentive Network (TriAN) framework to model the interactions among passages, questions, and answers. Building upon TriAN, we proposed to: (1) integrate a pre-trained language model to capture script knowledge; (2) introduce multi-layer attention to facilitate multi-hop reasoning; and (3) incorporate positional embeddings to enhance the model's capacity for event-ordering reasoning. In this paper, we present our proposed methods and demonstrate their efficacy in improving script knowledge integration and reasoning.
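
A small PyTorch sketch of proposal (2), multi-layer attention for multi-hop reasoning: the question representation repeatedly attends over the passage, one hop per layer. Dimensions, head count, and hop count are illustrative assumptions, not the paper's configuration.

    import torch
    import torch.nn as nn

    class MultiHopAttention(nn.Module):
        """Stacked attention: each hop re-attends the passage conditioned on
        the question summary so far, enabling multi-hop reasoning."""
        def __init__(self, dim=64, hops=2):
            super().__init__()
            self.hops = nn.ModuleList(
                nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
                for _ in range(hops))

        def forward(self, question, passage):
            q = question
            for attn in self.hops:
                q, _ = attn(q, passage, passage)   # refine question per hop
            return q

    q = torch.randn(2, 8, 64)     # (batch, question_len, dim)
    p = torch.randn(2, 40, 64)    # (batch, passage_len, dim)
    out = MultiHopAttention()(q, p)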

24 pages, 1852 KiB  
Article
High-Quality Data from Crowdsourcing towards the Creation of a Mexican Anti-Immigrant Speech Corpus
by Alejandro Molina-Villegas, Thomas Cattin, Karina Gazca-Hernandez and Edwin Aldana-Bobadilla
Appl. Sci. 2023, 13(14), 8417; https://doi.org/10.3390/app13148417 - 21 Jul 2023
Viewed by 830
Abstract
Currently, a significant portion of published research on online hate speech relies on existing textual corpora. However, when examining a specific context, preexisting datasets that capture the particularities of various conditions (e.g., geographic and cultural) are often lacking. This issue is evident in the case of online anti-immigrant speech in Mexico, where data available to study this emergent and often overlooked phenomenon are scarce. In light of this situation, we propose a novel methodology wherein three domain experts annotate a set of texts related to the subject. Based on these annotations, we establish a precise control mechanism to evaluate non-expert annotators. The evaluation of contributors is implemented in a custom annotation platform, enabling us to conduct a controlled crowdsourcing campaign and assess the reliability of the obtained data. Our results demonstrate that a combination of crowdsourced and expert data leads to iterative improvements, not only in the accuracy achieved by various machine learning classification models (reaching 0.8828) but also in the models' adaptation to the specific characteristics of hate speech in the Mexican Twittersphere. In addition to these methodological innovations, the most significant contribution of our work is the creation of the first online Mexican anti-immigrant training corpus for machine-learning-based detection tasks.
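
A minimal sketch of the control mechanism described above: score each crowd contributor against expert gold labels on control items and retain only reliable workers. The labels and identifiers are illustrative.

    def annotator_accuracy(gold, annotations):
        # Score each crowd contributor against expert gold labels on the
        # control items they annotated.
        scores = {}
        for annotator, labels in annotations.items():
            seen = [i for i in gold if i in labels]
            hits = sum(labels[i] == gold[i] for i in seen)
            scores[annotator] = hits / len(seen) if seen else 0.0
        return scores

    gold = {0: "anti-immigrant", 1: "neutral", 2: "anti-immigrant"}  # experts
    annotations = {
        "worker_a": {0: "anti-immigrant", 1: "neutral", 2: "neutral"},
        "worker_b": {0: "anti-immigrant", 1: "neutral", 2: "anti-immigrant"},
    }
    print(annotator_accuracy(gold, annotations))  # keep workers above a threshold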

18 pages, 2072 KiB  
Article
GenCo: A Generative Learning Model for Heterogeneous Text Classification Based on Collaborative Partial Classifications
by Zie Eya Ekolle and Ryuji Kohno
Appl. Sci. 2023, 13(14), 8211; https://doi.org/10.3390/app13148211 - 14 Jul 2023
Viewed by 1075
Abstract
The use of generative learning models in natural language processing (NLP) has significantly contributed to the advancement of natural language applications, such as sentiment analysis, topic modeling, text classification, chatbots, and spam filtering. With a large amount of text generated each day from different sources, such as web pages, blogs, emails, social media, and articles, one of the most common tasks in NLP is the classification of a text corpus. This is important in many institutions for planning, decision-making, and creating archives of their projects. Many algorithms exist to automate text classification tasks, but the most intriguing are those that also learn these tasks automatically. In this study, we present a new model that infers and learns from data using probabilistic logic and apply it to text classification. This model, called GenCo, is a multi-input single-output (MISO) learning model that uses a collaboration of partial classifications to generate the desired output. It provides a heterogeneity measure to explain its classification results and enables a reduction in the curse of dimensionality in text classification. Experiments with the model were carried out on the Twitter US Airline dataset, the Conference Paper dataset, and the SMS Spam dataset, outperforming baseline models with 98.40%, 89.90%, and 99.26% accuracy, respectively.
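
One way to picture the collaboration of partial classifications in a MISO setup: train a partial classifier per input field and combine their class-probability estimates with a product rule. This scikit-learn sketch is a guess at the general idea, not GenCo's actual probabilistic-logic inference; the texts and labels are toy placeholders.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    titles = ["deep nets for vision", "cheap pills online now"]
    bodies = ["we train a convnet on images", "click here to buy today"]
    y = [0, 1]   # 0 = conference paper, 1 = spam

    # One partial classifier per input field.
    partials = [make_pipeline(TfidfVectorizer(), MultinomialNB()).fit(f, y)
                for f in (titles, bodies)]

    def predict(title, body):
        p = np.ones(2)
        for clf, text in zip(partials, (title, body)):
            p *= clf.predict_proba([text])[0]   # combine partial classifications
        return p / p.sum()

    print(predict("fast vision models", "a convnet trained on images"))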

24 pages, 3188 KiB  
Article
One-Class Learning for AI-Generated Essay Detection
by Roberto Corizzo and Sebastian Leal-Arenas
Appl. Sci. 2023, 13(13), 7901; https://doi.org/10.3390/app13137901 - 05 Jul 2023
Cited by 1 | Viewed by 2548
Abstract
Detection of AI-generated content is a crucially important task given the increasing adoption of AI tools, such as ChatGPT, and the concerns raised with regard to academic integrity. Existing text classification approaches, including neural-network-based and feature-based methods, are mostly tailored for English data, and they are typically limited to a supervised learning setting. Although one-class learning methods are well suited to such detection tasks, their effectiveness in essay detection is still unknown. In this paper, this gap is explored by adopting linguistic features and one-class learning models for AI-generated essay detection. The detection performance of different models is assessed in settings where positively labeled data, i.e., AI-generated essays, are unavailable for model training. Results on two datasets containing essays in L2 English and L2 Spanish show that it is feasible to accurately detect AI-generated essays. The analysis reveals which models and which sets of linguistic features are most powerful in the detection task.
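
A minimal sketch of the one-class setting, assuming toy linguistic features and a OneClassSVM: the detector is fit on human-written essays only, so AI-generated text is flagged as an outlier at test time. The feature set, model choice, and texts are illustrative, not the paper's.

    import numpy as np
    from sklearn.svm import OneClassSVM

    def linguistic_features(text):
        # Toy features; the paper uses a much richer linguistic feature set.
        words = text.split()
        sentences = max(text.count("."), 1)
        return [len(words) / sentences,                    # mean sentence length
                len(set(words)) / max(len(words), 1)]      # type-token ratio

    human = ["I walked to class. The lecture ran long. We argued about it.",
             "My summer job taught me patience. Customers were rarely simple."]
    X = np.array([linguistic_features(t) for t in human])

    # Train on human-written essays only; AI-generated text is the anomaly.
    detector = OneClassSVM(nu=0.1, kernel="rbf").fit(X)

    candidate = "In conclusion, education is important. It shapes societies."
    print(detector.predict([linguistic_features(candidate)]))  # +1 in, -1 out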

14 pages, 2080 KiB  
Article
Enhancing Abstractive Summarization with Extracted Knowledge Graphs and Multi-Source Transformers
by Tong Chen, Xuewei Wang, Tianwei Yue, Xiaoyu Bai, Cindy X. Le and Wenping Wang
Appl. Sci. 2023, 13(13), 7753; https://doi.org/10.3390/app13137753 - 30 Jun 2023
Cited by 6 | Viewed by 3155
Abstract
As the popularity of large language models (LLMs) has risen over the past year, led by GPT-3/4 and especially its productization as ChatGPT, we have witnessed the extensive application of LLMs to text summarization. However, LLMs cannot intrinsically verify the correctness of the information they supply and generate. This research introduces a novel approach to abstractive summarization that aims to address this limitation by leveraging extracted knowledge graph information and structured semantics as a guide for summarization. Building upon BART, one of the state-of-the-art sequence-to-sequence pre-trained LLMs, multi-source transformer modules are developed as an encoder capable of processing both textual and graphical inputs. Decoding is performed on this enriched encoding to enhance the summary quality. The Wiki-Sum dataset, derived from Wikipedia text dumps, is introduced for evaluation purposes. Comparative experiments with baseline models demonstrate the strengths of the proposed approach in generating informative and relevant summaries. We conclude by presenting our insights into utilizing LLMs with external graph information, a powerful aid towards the goal of factually correct and verified LLMs.
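
The sketch below approximates the idea by linearizing extracted triples into the BART encoder input; the paper instead develops dedicated multi-source transformer encoder modules, so treat this only as a simple stand-in. The triples and article text are invented placeholders.

    from transformers import BartForConditionalGeneration, BartTokenizer

    tok = BartTokenizer.from_pretrained("facebook/bart-base")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

    # Linearize extracted triples and prepend them to the article text.
    triples = [("Marie Curie", "won", "Nobel Prize"),
               ("Marie Curie", "field", "physics")]
    graph = " ; ".join(" | ".join(t) for t in triples)
    article = "Marie Curie, a pioneering physicist, received the Nobel Prize ..."

    inputs = tok(graph + " </s> " + article, return_tensors="pt", truncation=True)
    ids = model.generate(**inputs, max_length=40, num_beams=4)
    print(tok.decode(ids[0], skip_special_tokens=True))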

20 pages, 742 KiB  
Article
EvoText: Enhancing Natural Language Generation Models via Self-Escalation Learning for Up-to-Date Knowledge and Improved Performance
by Zhengqing Yuan, Huiwen Xue, Chao Zhang and Yongming Liu
Appl. Sci. 2023, 13(8), 4758; https://doi.org/10.3390/app13084758 - 10 Apr 2023
Viewed by 1452
Abstract
In recent years, pretrained models have been widely used in various fields, including natural language understanding, computer vision, and natural language generation. However, the performance of these language generation models is highly dependent on model size and dataset size. While larger models excel in some aspects, they cannot learn up-to-date knowledge and are relatively difficult to relearn. In this paper, we introduce EvoText, a novel training method that enhances the performance of any natural language generation model without requiring additional datasets during the entire training process (although a prior dataset is necessary for pretraining). EvoText employs two models: G, a text generation model, and D, a model that determines whether the data generated by G are legitimate. Initially, the fine-tuned D model serves as the knowledge base. The text generated by G is then input to D to determine whether it is legitimate, and finally G is fine-tuned based on D's output. EvoText enables the model to learn up-to-date knowledge through a self-escalation process that builds on a priori knowledge; when EvoText needs to learn something new, it simply fine-tunes the D model. Our approach applies to autoregressive language modeling for all Transformer classes. With EvoText, eight models achieved stable improvements on seven natural language processing tasks without any changes to the model structure.
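
A schematic of one EvoText-style self-escalation round as described above, with toy stand-ins for G, D, and the fine-tuning step; the real method uses pretrained neural generator and discriminator models.

    def evotext_round(G, D, prompts, fine_tune):
        # One self-escalation round: G generates, D filters, G learns from
        # whatever D accepts as legitimate.
        generated = [G(p) for p in prompts]
        accepted = [t for t in generated if D(t)]   # D acts as knowledge base
        if accepted:
            fine_tune(G, accepted)
        return accepted

    # Toy stand-ins for illustration only.
    G = lambda p: p + " ... a generated continuation"
    D = lambda t: "generated" in t
    fine_tune = lambda model, texts: None
    print(evotext_round(G, D, ["The capital of France is"], fine_tune))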

19 pages, 605 KiB  
Article
An Abstractive Summarization Model Based on Joint-Attention Mechanism and a Priori Knowledge
by Yuanyuan Li, Yuan Huang, Weijian Huang, Junhao Yu and Zheng Huang
Appl. Sci. 2023, 13(7), 4610; https://doi.org/10.3390/app13074610 - 05 Apr 2023
Cited by 3 | Viewed by 1510
Abstract
An abstractive summarization model based on a joint-attention mechanism and a priori knowledge is proposed to address the inadequate semantic understanding of text and the generation of summaries that do not conform to human language habits in abstractive summarization models. First, the word vectors most relevant to the original text are selected. Second, the original text is represented at two levels, word-level and sentence-level, as word vectors and sentence vectors, respectively. After this processing, relationships exist not only between word-level vectors but also between sentence-level vectors, and the decoder discriminates between word-level and sentence-level vectors based on their relationship with its hidden state. Then, the pointer-generator network is improved using a priori knowledge. Finally, reinforcement learning is used to improve the quality of the generated summaries. Experiments on two classical datasets, CNN/DailyMail and DUC 2004, show that the model performs well and effectively improves the quality of generated summaries.
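
For reference, a sketch of the standard pointer-generator output distribution that the model improves upon: the final word probability mixes the decoder's vocabulary distribution with attention mass copied from the source. The a priori knowledge improvement itself is not reproduced here; tensor sizes are toy values.

    import torch

    def pointer_generator_dist(p_vocab, attn, src_ids, vocab_size, p_gen):
        # Final distribution: p_gen * P_vocab + (1 - p_gen) * copied attention.
        copy = torch.zeros(vocab_size)
        copy.scatter_add_(0, src_ids, attn)   # route attention mass to tokens
        return p_gen * p_vocab + (1 - p_gen) * copy

    vocab_size = 10
    p_vocab = torch.softmax(torch.randn(vocab_size), dim=0)  # decoder vocab dist
    attn = torch.softmax(torch.randn(4), dim=0)              # attention, 4 tokens
    src_ids = torch.tensor([2, 5, 5, 7])                     # source token ids
    print(pointer_generator_dist(p_vocab, attn, src_ids, vocab_size, p_gen=0.8))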

14 pages, 514 KiB  
Article
Readability Metrics for Machine Translation in Dutch: Google vs. Azure & IBM
by Chaïm van Toledo, Marijn Schraagen, Friso van Dijk, Matthieu Brinkhuis and Marco Spruit
Appl. Sci. 2023, 13(7), 4444; https://doi.org/10.3390/app13074444 - 31 Mar 2023
Cited by 1 | Viewed by 1242
Abstract
This paper introduces a novel method to predict when a Google translation is better than other machine translations (MT) into Dutch. Instead of considering fidelity, this approach considers fluency and readability indicators of when Google ranked best, exploring an alternative approach in the field of quality estimation. The paper contributes a published dataset of sentences translated from English to Dutch, with human-made classifications on a best-worst scale. Logistic regression shows a correlation between T-Scan output, such as readability measurements like lemma frequencies, and the cases where the Google translation was better than those of Azure and IBM. The last part of the results section examines prediction possibilities, first with logistic regression and second with an automatically generated machine learning model, achieving accuracies of 0.59 and 0.61, respectively.
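
A toy sketch of the prediction step: logistic regression over T-Scan-style readability features to classify whether Google ranked best. The feature values below are fabricated placeholders, not the published data.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Rows = translated sentences, columns = readability features (e.g., mean
    # lemma frequency, sentence length); values here are placeholders.
    X = np.array([[4.2, 11], [3.1, 25], [4.8, 9], [2.9, 30], [4.5, 14], [3.3, 22]])
    y = np.array([1, 0, 1, 0, 1, 0])   # 1 = Google ranked best

    print(cross_val_score(LogisticRegression(), X, y, cv=3).mean())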

21 pages, 801 KiB  
Article
Does Context Matter? Effective Deep Learning Approaches to Curb Fake News Dissemination on Social Media
by Jawaher Alghamdi, Yuqing Lin and Suhuai Luo
Appl. Sci. 2023, 13(5), 3345; https://doi.org/10.3390/app13053345 - 06 Mar 2023
Cited by 6 | Viewed by 1805
Abstract
The prevalence of fake news on social media has led to major sociopolitical issues, making the need for automated fake news detection more important than ever. In this work, we investigated the interplay between news content and users' posting behavior clues in detecting fake news using state-of-the-art deep learning approaches: a convolutional neural network (CNN), which applies a series of filters of different sizes and shapes to the original sentence matrix to create further low-dimensional matrices, and a bidirectional gated recurrent unit (BiGRU), a bidirectional recurrent neural network with update and reset gates, coupled with a self-attention mechanism. The proposed architectures introduce a novel approach to learning rich, semantic, and contextual representations of a given news text using the natural language understanding of transfer learning coupled with context-based features. Experiments were conducted on the FakeNewsNet dataset. The experimental results show that incorporating information about users' posting behaviors (when available) improves performance compared to models that rely solely on textual news data.
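
A compact PyTorch sketch of the two text branches named above, multi-width CNN filters and a BiGRU with self-attention pooling; sizes are illustrative, and the transfer-learning encoder and user-behavior features of the full architecture are omitted.

    import torch
    import torch.nn as nn

    class CnnBiGruSketch(nn.Module):
        """Multi-width CNN filters plus a BiGRU with self-attention pooling;
        all sizes are illustrative assumptions."""
        def __init__(self, vocab=20000, emb=100, hid=64, n_classes=2):
            super().__init__()
            self.emb = nn.Embedding(vocab, emb)
            self.convs = nn.ModuleList(
                nn.Conv1d(emb, hid, k, padding=k // 2) for k in (3, 4, 5))
            self.gru = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
            self.attn = nn.Linear(2 * hid, 1)
            self.out = nn.Linear(3 * hid + 2 * hid, n_classes)

        def forward(self, ids):
            x = self.emb(ids)                                # (batch, seq, emb)
            c = torch.cat([conv(x.transpose(1, 2)).relu().max(dim=2).values
                           for conv in self.convs], dim=1)   # CNN branch
            h, _ = self.gru(x)                               # BiGRU branch
            w = torch.softmax(self.attn(h), dim=1)           # self-attention
            return self.out(torch.cat([c, (w * h).sum(dim=1)], dim=1))

    print(CnnBiGruSketch()(torch.randint(0, 20000, (2, 30))).shape)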

14 pages, 1434 KiB  
Article
Predicting Location of Tweets Using Machine Learning Approaches
by Mohammed Alsaqer, Salem Alelyani, Mohamed Mohana, Khalid Alreemy and Ali Alqahtani
Appl. Sci. 2023, 13(5), 3025; https://doi.org/10.3390/app13053025 - 26 Feb 2023
Cited by 3 | Viewed by 2186
Abstract
Twitter, one of the most popular microblogging platforms, has tens of millions of active users worldwide, generating hundreds of millions of posts every day. Twitter posts, referred to as "tweets", are short and noisy texts that bring many challenges, for example in the case of an emergency or disaster. Predicting the location of these tweets is important for social, security, human rights, and business reasons, and has attracted noteworthy attention lately. However, most Twitter users disable the geo-tagging feature, and their home locations are neither standardized nor accurate. In this study, we applied four machine learning techniques, namely Logistic Regression, Random Forest, Multinomial Naïve Bayes, and Support Vector Machine, with and without a geo-distance matrix, to predict the location of a tweet from its textual content. Our extensive experiments on a large collection of Arabic tweets from Saudi Arabia with different feature sets yielded promising results with 67% accuracy.
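
A minimal text-only baseline in the spirit of the study: TF-IDF features with one of the four classifiers (Logistic Regression). The tweets and city labels are invented placeholders, and the geo-distance matrix variant is not shown.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Invented placeholder tweets and labels; the study also evaluates Random
    # Forest, Multinomial Naive Bayes, and Support Vector Machine classifiers.
    tweets = ["traffic on king fahd road again", "beautiful corniche sunset",
              "exam week at the university", "sandstorm covered the city"]
    cities = ["Riyadh", "Jeddah", "Riyadh", "Riyadh"]

    model = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(tweets, cities)
    print(model.predict(["stuck in king fahd road traffic"]))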
