Applications, Challenges and Future Direction of Natural Language Processing Based on Deep Learning

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 31 July 2024 | Viewed by 6387

Special Issue Editor


Dr. Bo Wang
Guest Editor
Department of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
Interests: social computing; natural language processing; sociolinguistics & social psychology

Special Issue Information

Dear Colleagues,

The field of natural language processing (NLP) has seen significant advancements in recent years with the integration of deep learning techniques. NLP is a crucial area of Artificial Intelligence that enables machines to process, understand, and generate human language. Applications of NLP include corpus analysis, information retrieval, machine translation, semantic analysis, text categorization, and many others. With the rise of these approaches, NLP plays a critical role in solving everyday problems and sharing the human workload. The development of NLP directly affects the progress of information technology and the quality of human–computer interaction.

This Special Issue aims to gather researchers from various fields and backgrounds who use NLP methods to solve diverse problems. It provides an opportunity to report on the latest work and achievements and to introduce new perspectives on future directions of the NLP field. We welcome original research articles reporting the development of novel NLP models and algorithms, as well as NLP application papers with novel ideas, especially those based on deep learning.

We encourage submissions dealing with text classification, sentiment analysis, authorship attribution, text document clustering, detection of fake news, machine translation, text summarization, development of chatbots, grammar checking, and interdisciplinary NLP studies. However, we do not limit the examples to these and welcome other innovative NLP applications. Contributions can focus on processing approaches, algorithms, and methods, and we welcome survey papers and reviews.

We invite researchers, practitioners, and experts in the field of NLP to submit their latest work and contribute to this exciting Special Issue. All submissions will undergo a rigorous peer-review process to ensure the quality and relevance of the content.

We look forward to receiving your contributions and to shaping the future of NLP together.

Topics of interest include, but are not limited to, the following:

  • Natural language understanding and generation
  • Machine translation
  • Question answering and dialogue systems
  • Knowledge extraction and modelling
  • Knowledge-graph-based approaches
  • Text summarization and style transfer
  • Text classification, topic extraction, and discourse analysis
  • Fake news, misinformation, and disinformation detection
  • Echo chamber and polarization detection
  • Sentiment analysis, emotion recognition, and stance detection
  • Document analysis, information extraction, and text mining
  • Behavior modeling
  • Social and psychological applications of NLP
  • Multilingual approaches
  • Transfer learning in NLP

Dr. Bo Wang
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Published Papers (8 papers)


Research

18 pages, 1459 KiB  
Article
Contrastive Learning Penalized Cross-Entropy with Diversity Contrastive Search Decoding for Diagnostic Report Generation of Reduced Token Repetition
by Taozheng Zhang, Jiajian Meng, Yuseng Yang and Shaode Yu
Appl. Sci. 2024, 14(7), 2817; https://doi.org/10.3390/app14072817 - 27 Mar 2024
Viewed by 394
Abstract
Medical imaging description and disease diagnosis are vitally important yet time-consuming. Automated diagnosis report generation (DRG) from medical imaging descriptions can reduce clinicians’ workload and improve their routine efficiency. To address this natural language generation task, fine-tuning a pre-trained large language model (LLM) is cost-effective and indispensable, and its success has been witnessed in many downstream applications. However, semantic inconsistency of sentence embeddings has been widely observed, manifesting as undesirable repetition or unnaturalness in text generation. To address the underlying issue of the anisotropic distribution of token representations, in this study, a contrastive learning penalized cross-entropy (CLpCE) objective function is implemented to enhance the semantic consistency and accuracy of token representations by guiding the fine-tuning procedure towards a specific task. Furthermore, to improve the diversity of token generation in text summarization and to prevent sampling from the unreliable tail of token distributions, a diversity contrastive search (DCS) decoding method is designed to restrict report generation to a probable candidate set while maintaining semantic coherence. In addition, a novel metric named the maximum of token repetition ratio (maxTRR) is proposed to estimate token diversity and to help determine the candidate output. Based on a Chinese version of the generative pre-trained Transformer 2 (GPT-2), the proposed CLpCE with DCS (CLpCEwDCS) decoding framework is validated on 30,000 desensitized text samples from the “Medical Imaging Diagnosis Report Generation” track of the 2023 Global Artificial Intelligence Technology Innovation Competition. Using four kinds of metrics based on n-gram word matching, semantic relevance, and content similarity, as well as the maxTRR metric, extensive experiments reveal that the proposed framework effectively maintains semantic coherence and accuracy (BLEU-1, 0.4937; BLEU-2, 0.4107; BLEU-3, 0.3461; BLEU-4, 0.2933; METEOR, 0.2612; ROUGE, 0.5182; CIDEr, 1.4339) and improves text generation diversity and naturalness (maxTRR, 0.12). The phenomenon of dull or repetitive text generation is common when fine-tuning pre-trained LLMs for natural language processing applications. This study might shed some light on relieving this issue through comprehensive strategies that enhance the semantic coherence, accuracy, and diversity of sentence embeddings.
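
The paper gives the exact definition of maxTRR; one plausible reading, assumed in the sketch below, is the frequency of the most repeated token divided by the sequence length, with the decoder keeping the candidate that minimizes it. The function name and toy candidates are hypothetical.

```python
# Hypothetical reconstruction of a token repetition ratio: the share of the
# sequence taken up by its most frequent token. Lower means more diverse
# output; the decoder would keep the candidate with the lowest value.
from collections import Counter

def max_token_repetition_ratio(tokens):
    if not tokens:
        return 0.0
    return max(Counter(tokens).values()) / len(tokens)

candidates = [
    "no focal lesion lesion lesion is seen seen".split(),   # repetitive
    "no focal lesion is seen in either lung".split(),       # diverse
]
best = min(candidates, key=max_token_repetition_ratio)
print(" ".join(best), max_token_repetition_ratio(best))
```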

22 pages, 3287 KiB  
Article
Towards Understanding Neural Machine Translation with Attention Heads’ Importance
by Zijie Zhou, Junguo Zhu and Weijiang Li
Appl. Sci. 2024, 14(7), 2798; https://doi.org/10.3390/app14072798 - 27 Mar 2024
Viewed by 395
Abstract
Although neural machine translation has made great progress, and the Transformer has advanced the state of the art in various language pairs, the decision-making process of the attention mechanism, a crucial component of the Transformer, remains unclear. In this paper, we propose to understand the model’s decisions through the importance of its attention heads. We explore the knowledge acquired by the attention heads, elucidating the decision-making process through the lens of linguistic understanding. Specifically, we quantify the importance of each attention head by assessing its contribution to neural machine translation performance, employing a Masking Attention Heads approach. We evaluate the method and investigate the distribution of attention heads’ importance, as well as its correlation with part-of-speech contribution. To understand the diverse decisions made by attention heads, we concentrate on analyzing multi-granularity linguistic knowledge. Our findings indicate that specialized heads play a crucial role in learning linguistics. By retaining important attention heads and removing unimportant ones, we can optimize the attention mechanism, reducing the number of model parameters and increasing the model’s speed. Moreover, by leveraging the connection between attention heads and multi-granular linguistic knowledge, we can enhance the model’s interpretability. Consequently, our research provides valuable insights for the design of improved NMT models.
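
A rough sketch of the masking idea, assuming head importance is measured as the BLEU drop when a single head's output is zeroed; the `bleu_with_mask` helper is a hypothetical stand-in to be wired to an actual NMT model and dev set, not the paper's implementation.

```python
# A minimal sketch, assuming importance = BLEU(baseline) - BLEU(head masked).
from typing import Dict, Optional, Tuple

def bleu_with_mask(masked_head: Optional[Tuple[int, int]]) -> float:
    """Translate a dev set with one (layer, head) zeroed and return BLEU.

    Placeholder stub: in practice, zero the chosen head's attention output
    (e.g. via a forward hook) and score the translations with BLEU.
    """
    return 30.0  # dummy score so the sketch runs end to end

def head_importance(n_layers: int, n_heads: int) -> Dict[Tuple[int, int], float]:
    baseline = bleu_with_mask(None)
    return {
        (layer, head): baseline - bleu_with_mask((layer, head))
        for layer in range(n_layers)
        for head in range(n_heads)
    }

# Heads whose masking barely hurts BLEU are candidates for removal.
scores = head_importance(n_layers=6, n_heads=8)
prunable = [h for h, drop in scores.items() if drop < 0.5]
```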

19 pages, 827 KiB  
Article
MLSL-Spell: Chinese Spelling Check Based on Multi-Label Annotation
by Liming Jiang, Xingfa Shen, Qingbiao Zhao and Jian Yao
Appl. Sci. 2024, 14(6), 2541; https://doi.org/10.3390/app14062541 - 18 Mar 2024
Viewed by 401
Abstract
Chinese spelling errors are commonplace in our daily lives and may be introduced by input methods, optical character recognition, or speech recognition. Due to the phonetic and visual similarities among Chinese characters, Chinese spelling check (CSC) is a very challenging task. However, existing CSC solutions cannot achieve good spelling check performance, since they often fail to fully extract contextual information and Pinyin information. In this paper, we propose a novel CSC framework based on multi-label annotation (MLSL-Spell), consisting of two basic phases: spelling detection and correction. In the spelling detection phase, MLSL-Spell uses fused vectors of character-based pre-trained context vectors and Pinyin vectors and adopts a sequence labeling method to explicitly label the types of misspelled characters. In the spelling correction phase, MLSL-Spell uses a Masked Language Model (MLM) to generate candidate characters, performs the corresponding screenings according to the error types, and finally selects the correct characters with an XGBoost classifier. Experiments show that the MLSL-Spell model outperforms the benchmark models. On the SIGHAN 2013 dataset, the spelling detection F1 score of MLSL-Spell is 18.3% higher than that of the pointer network (PN) model, and the spelling correction F1 score is 10.9% higher. On the SIGHAN 2015 dataset, the spelling detection F1 score of MLSL-Spell is 11% higher than that of BERT and 15.7% higher than that of the PN model, and the spelling correction F1 score is 6.8% higher than that of the PN model.
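
To illustrate the MLM step of the correction phase, the sketch below masks a detected misspelling and asks a generic Chinese fill-mask model for candidates; the example sentence is invented, and the error-type screening and XGBoost re-ranking from the paper are omitted.

```python
# A minimal sketch of MLM candidate generation, assuming the detection
# phase has already flagged the misspelled character.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-chinese")

sentence = "我们明天去公园完。"  # "完" is a phonetic error for "玩" (invented example)
masked = sentence.replace("完", fill.tokenizer.mask_token, 1)

# Top candidate characters with their MLM scores.
for cand in fill(masked, top_k=5):
    print(cand["token_str"], round(cand["score"], 3))
```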

18 pages, 728 KiB  
Article
TodBR: Target-Oriented Dialog with Bidirectional Reasoning on Knowledge Graph
by Zongfeng Qu, Zhitong Yang, Bo Wang and Qinghua Hu
Appl. Sci. 2024, 14(1), 459; https://doi.org/10.3390/app14010459 - 04 Jan 2024
Viewed by 704
Abstract
Target-oriented dialog explores how a dialog agent connects two topics cooperatively and coherently, aiming to generate a “bridging” utterance that connects a new topic to the previous conversation turn. The central challenge of this task is multi-hop reasoning on a knowledge graph (KG) to reach the desired target. However, current target-oriented dialog approaches suffer from inefficient reasoning and, without bidirectional reasoning, an inability to locate pertinent key information. To address these limitations, we present a bidirectional reasoning model for target-oriented dialog implemented on a commonsense knowledge graph. Furthermore, we introduce an automated technique for constructing dialog subgraphs, which aids in acquiring multi-hop reasoning capabilities. Our experiments demonstrate that the proposed method attains superior performance in reaching the target while providing more coherent responses.
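
As a toy illustration of the bidirectional idea (not the paper's model), searching a small commonsense graph from both the current topic and the target at once finds a short bridging path; the triples below are invented.

```python
# Bidirectional search expands frontiers from both ends and meets in the
# middle, visiting far fewer nodes than a one-directional multi-hop search.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("coffee", "caffeine"), ("caffeine", "sleep"),
    ("sleep", "dream"), ("dream", "movie"),
    ("coffee", "cafe"), ("cafe", "music"),
])

path = nx.bidirectional_shortest_path(G, source="coffee", target="movie")
print(" -> ".join(path))  # coffee -> caffeine -> sleep -> dream -> movie
```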

32 pages, 1399 KiB  
Article
Arabic Emotion Recognition in Low-Resource Settings: A Novel Diverse Model Stacking Ensemble with Self-Training
by Maha Jarallah Althobaiti
Appl. Sci. 2023, 13(23), 12772; https://doi.org/10.3390/app132312772 - 28 Nov 2023
Viewed by 633
Abstract
Emotion recognition is a vital task within Natural Language Processing (NLP) that involves automatically identifying emotions in text. As the need for specialized and nuanced emotion recognition models increases, the challenge of fine-grained emotion recognition with limited labeled data becomes prominent. Moreover, emotion recognition for some languages, such as Arabic, is challenging due to the limited availability of labeled data, a scarcity that concerns both the size of datasets and the granularity of emotion labels. Our research introduces a novel framework for low-resource fine-grained emotion recognition, which uses an iterative process that integrates a stacking ensemble of diverse base models with self-training. The base models employ different learning paradigms, including zero-shot classification, few-shot methods, machine learning algorithms, and transfer learning. Our proposed method eliminates the need for a large labeled dataset to initiate training by gradually generating labeled data over the iterations. In our experiments, we evaluated the performance of each base model and of the proposed method in low-resource scenarios. Our findings indicate that our approach outperforms the individual base models. It also outperforms state-of-the-art Arabic emotion recognition models in the literature, achieving weighted average F1-scores of 83.19% and 72.12% when tested on the AETD and ArPanEmo benchmark datasets, respectively.
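
A compact sketch of the stacking-plus-self-training loop, assuming pre-vectorized text features; the base learners and the confidence threshold are illustrative stand-ins, not the paper's exact models.

```python
import numpy as np
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

def self_train(X_lab, y_lab, X_unlab, rounds=3, threshold=0.9):
    """Iteratively promote high-confidence pseudo-labels into the labeled pool."""
    model = StackingClassifier(
        estimators=[("nb", GaussianNB()), ("svm", SVC(probability=True))],
        final_estimator=LogisticRegression(max_iter=1000),
    )
    for _ in range(rounds):
        model.fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = model.predict_proba(X_unlab)
        confident = proba.max(axis=1) >= threshold
        if not confident.any():
            break
        # Pseudo-label the confident examples and move them to the labeled set.
        pseudo = model.classes_[proba[confident].argmax(axis=1)]
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, pseudo])
        X_unlab = X_unlab[~confident]
    return model
```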

18 pages, 2167 KiB  
Article
RB_BG_MHA: A RoBERTa-Based Model with Bi-GRU and Multi-Head Attention for Chinese Offensive Language Detection in Social Media
by Meijia Xu and Shuxian Liu
Appl. Sci. 2023, 13(19), 11000; https://doi.org/10.3390/app131911000 - 06 Oct 2023
Viewed by 919
Abstract
Offensive language in social media degrades the experience of individuals and groups and harms social harmony and moral values. In recent years, the problem of offensive language detection has therefore attracted the attention of many researchers. However, research has primarily focused on detecting offensive language in English, while few studies exist for Chinese. In this paper, we propose an innovative approach to detecting Chinese offensive language. First, unlike previous approaches, we utilize both sentence-level and word-level embeddings from RoBERTa, combining the word embeddings with a bidirectional GRU and a multi-head self-attention mechanism. This feature fusion allows the model to consider sentence-level and word-level semantic information at the same time, capturing the semantics of Chinese text more comprehensively. Second, by concatenating the output of the multi-head attention with RoBERTa’s sentence embedding, we achieve an efficient fusion of local and global information and improve the representational ability of the model. Experiments show that the proposed model achieves 82.931% accuracy and an 82.842% F1-score on Chinese offensive language detection tasks, delivering high performance and broad application potential.
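
A schematic PyTorch module mirroring the described architecture, with RoBERTa's token and sentence ([CLS]) embeddings passed in as tensors; all dimensions and the mean-pooling step are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class OffensiveDetector(nn.Module):
    def __init__(self, hidden=768, gru_hidden=256, heads=8, classes=2):
        super().__init__()
        self.gru = nn.GRU(hidden, gru_hidden, batch_first=True,
                          bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * gru_hidden, heads,
                                          batch_first=True)
        self.fc = nn.Linear(2 * gru_hidden + hidden, classes)

    def forward(self, word_emb, sent_emb):
        # word_emb: (B, T, 768) token embeddings; sent_emb: (B, 768) [CLS]
        h, _ = self.gru(word_emb)            # Bi-GRU over word embeddings
        a, _ = self.attn(h, h, h)            # multi-head self-attention
        pooled = a.mean(dim=1)               # (B, 2 * gru_hidden)
        fused = torch.cat([pooled, sent_emb], dim=-1)  # local + global fusion
        return self.fc(fused)

# Random tensors stand in for RoBERTa outputs in this sketch.
logits = OffensiveDetector()(torch.randn(4, 32, 768), torch.randn(4, 768))
```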

15 pages, 660 KiB  
Article
A Study of Contrastive Learning Algorithms for Sentence Representation Based on Simple Data Augmentation
by Xiaodong Liu, Wenyin Gong, Yuxin Li, Yanchi Li and Xiang Li
Appl. Sci. 2023, 13(18), 10120; https://doi.org/10.3390/app131810120 - 08 Sep 2023
Viewed by 833
Abstract
In the era of deep learning, representational text-matching algorithms based on BERT and its variants have become mainstream but are limited by the quality of the sentence vectors generated by the BERT model; the SimCSE algorithm proposed in 2021 improved sentence vector quality to a certain extent. In this paper, to address a weakness of the SimCSE algorithm, namely that the greater the difference in length between two sentences, the smaller the probability that they are judged similar, an EdaCSE algorithm is proposed that perturbs sentence length using a simple data augmentation method without affecting sentence semantics. The perturbation adds meaningless English punctuation marks to the original sentence, so that the model no longer tends to recognize sentences of similar length as similar. Based on the BERT series of models, experiments were conducted on five different datasets, showing that EdaCSE improves on the baseline models by averages of 1.67, 0.84, and 1.08 points across the five datasets.
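
A small sketch of the length perturbation, assuming marks are inserted at random token boundaries to produce a positive pair for SimCSE-style contrastive training; the number of insertions and the punctuation set are invented for illustration.

```python
# Insert meaningless punctuation to change a sentence's length without
# changing its meaning, yielding an augmented positive example.
import random

PUNCT = [",", ".", ";", "!", "?"]

def perturb_length(sentence: str, n_marks: int = 3) -> str:
    tokens = sentence.split()
    for _ in range(n_marks):
        pos = random.randint(0, len(tokens))  # random insertion point
        tokens.insert(pos, random.choice(PUNCT))
    return " ".join(tokens)

anchor = "the cat sat on the mat"
positive = perturb_length(anchor)             # same meaning, different length
print(positive)  # e.g. "the , cat sat . on the ! mat"
```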

14 pages, 1258 KiB  
Article
A Novel Embedding Model for Knowledge Graph Entity Alignment Based on Graph Neural Networks
by Hongchan Li, Zhaoyang Han, Haodong Zhu and Yuchao Qian
Appl. Sci. 2023, 13(10), 5876; https://doi.org/10.3390/app13105876 - 10 May 2023
Viewed by 1265
Abstract
The objective of the entity alignment (EA) task is to identify entities with identical semantics across distinct real-world knowledge graphs (KGs), a problem that has garnered extensive recognition in both academic and industrial circles. In this paper, a pioneering entity alignment framework named PCE-HGTRA is proposed. This framework integrates the relation and property information from different KGs, along with the heterogeneity information present within the KGs. Firstly, by learning embeddings, the framework captures the similarity between entities across diverse KGs. Property triplets in the KGs are used to generate property character-level embeddings, facilitating the transfer of entity embeddings from two distinct KGs into the same space. Secondly, the framework strengthens the property character-level embeddings using the transitivity rule to increase the number of entity properties. Then, to effectively capture the heterogeneous features in entity neighborhoods, a relation-aware heterogeneous graph transformer is designed to model the heterogeneous relations in the KGs. Finally, comparative experiments on four widely recognized real-world datasets demonstrate that PCE-HGTRA performs exceptionally well: its Hits@1 performance exceeds the best baseline by 7.94%, outperforming seven other state-of-the-art methods.
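
As a toy illustration of matching entities through character-level property embeddings (a crude stand-in for the learned embeddings and graph transformer in the paper), the sketch below hashes characters to frozen random vectors, averages them per entity, and aligns by cosine similarity; all entity IDs and property strings are invented.

```python
import numpy as np

DIM = 64
rng = np.random.default_rng(0)
char_vecs = {}  # frozen random character vectors (stand-in for learned ones)

def char_vec(c):
    if c not in char_vecs:
        char_vecs[c] = rng.standard_normal(DIM)
    return char_vecs[c]

def embed(text):
    """Average the character vectors of an entity's property string."""
    v = np.mean([char_vec(c) for c in text], axis=0)
    return v / np.linalg.norm(v)

kg1 = {"Q1": "Berlin, 3.6 million, Germany"}
kg2 = {"E9": "Berlin; population 3.6 million; country: Germany",
       "E7": "Tokyo; population 14 million; country: Japan"}

# Align each KG1 entity with its most similar KG2 entity.
for e1, props in kg1.items():
    best = max(kg2, key=lambda e2: float(embed(props) @ embed(kg2[e2])))
    print(e1, "aligns with", best)  # expected: Q1 aligns with E9
```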
