Emerging Theory and Applications in Natural Language Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 October 2024 | Viewed by 8061

Special Issue Editors


Guest Editor
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
Interests: knowledge graph; natural language processing; multimodal

Guest Editor
School of Computer Science, Beijing Jiaotong University, Beijing 100091, China
Interests: natural language processing; knowledge graph; machine learning

Guest Editor
School of Computer Science and Technology, Dalian University of Technology, Dalian 116081, China
Interests: information retrieval; question answering and dialogue; natural language processing; biomedical literature-based knowledge discovery

Special Issue Information

Dear Colleagues,

In recent years, natural language processing (NLP) has been transformed by groundbreaking deep learning advancements and the emergence of large language models (LLMs). The combination of LLMs with adaptation tuning methods has significantly increased the generalization capabilities of NLP models, illuminating the path towards general artificial intelligence systems for researchers. Recognizing the significance of these emerging developments, it is crucial to explore their potential and understand their relationship with classical methods in shaping the future of NLP and its real-world applications. The aim of this Special Issue is to showcase cutting-edge research in NLP, highlighting novel theories, methods, and applications that advance the state of the art, while also promoting interdisciplinary research.

Suggested themes for this Special Issue include, but are not limited to:

(1)  Novel NLP theory, architectures, and algorithms;

(2)  Theoretical foundations of LLMs: emergent abilities, scaling effects, etc.;

(3)  Model training and utilization strategies;

(4)  Efficiency and scalability of language models;

(5)  Integration of NLP with other AI technologies;

(6)  Interpretability of NLP models and LLMs;

(7)  Evaluating large language models: capabilities and limitations;

(8)  Ethical considerations and fairness;

(9)  Safety and alignment in LLMs;

(10) Domain-specific NLP applications;

(11)  Other emerging topics in NLP and LLM research.

Dr. Linmei Hu
Dr. Jian Liu
Dr. Bo Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, you can access the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • large language models
  • NLP theory and application

Published Papers (8 papers)


Research

17 pages, 1582 KiB  
Article
AdMISC: Advanced Multi-Task Learning and Feature-Fusion for Emotional Support Conversation
by Xuhui Jia, Jia He, Qian Zhang and Jin Jin
Electronics 2024, 13(8), 1484; https://doi.org/10.3390/electronics13081484 - 13 Apr 2024
Viewed by 361
Abstract
Emotional support conversation is an emerging and challenging task in natural language processing that aims to alleviate people’s emotional distress. Each utterance in a dialogue has features such as emotion, intent, and commonsense knowledge. Previous research has shown subpar performance in strategy prediction accuracy and response generation quality because certain underlying factors are overlooked. To address these issues, we propose Advanced Multi-Task Learning and Feature-Fusion for Emotional Support Conversation (AdMISC), which extracts various potential factors influencing dialogue through neural networks, thereby improving the accuracy of strategy prediction and the quality of generated responses. Specifically, we extract features affecting dialogue through dynamic emotion extraction and commonsense enhancement and then model strategy prediction. Additionally, the model learns these features through attention networks to generate higher-quality responses. Furthermore, we introduce a method for automatically averaging loss function weights to improve the model’s performance. Experimental results on the emotional support conversation dataset ESConv demonstrate that our proposed model outperforms baseline methods in both strategy label prediction accuracy and a range of automatic and human evaluation metrics.
(This article belongs to the Special Issue Emerging Theory and Applications in Natural Language Processing)

25 pages, 3406 KiB  
Article
Persona-Identified Chatbot through Small-Scale Modeling and Data Transformation
by Bitna Keum, Juoh Sun, Woojin Lee, Seongheum Park and Harksoo Kim
Electronics 2024, 13(8), 1409; https://doi.org/10.3390/electronics13081409 - 09 Apr 2024
Viewed by 315
Abstract
Research on chatbots aimed at facilitating more natural and engaging conversations is actively underway. With the growing recognition of the significance of personas in this context, persona-based conversational research is gaining prominence. Despite the abundance of publicly available chit-chat datasets, persona-based chat datasets remain scarce, primarily due to their higher associated costs. Consequently, we propose a methodology for transforming extensive chit-chat datasets into persona-based chat datasets. We also propose a model adept at effectively incorporating personas into responses, even with a constrained number of parameters. This model can discern the most relevant information from persona memory without resorting to a retrieval model. Furthermore, it decides whether to reference the memory, thereby enhancing the interpretability of the model’s judgments. Our CC2PC framework demonstrates superior performance in both automatic and LLM evaluations when compared to a high-cost persona-based chat dataset. Additionally, experimental results for the proposed model indicate improved persona-based response capabilities.

21 pages, 9086 KiB  
Article
Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts
by Brendan Hannon, Yulia Kumar, Dejaun Gayle, J. Jenny Li and Patricia Morreale
Electronics 2024, 13(5), 842; https://doi.org/10.3390/electronics13050842 - 22 Feb 2024
Viewed by 1027
Abstract
In the rapidly advancing field of Artificial Intelligence (AI), this study presents a critical evaluation of the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to this research are innovative adversarial prompts designed to rigorously test the content moderation capabilities of these AI systems. This study introduces new adversarial tests and the Response Quality Score (RQS), a metric specifically developed to assess the nuances of AI responses. Additionally, the research spotlights FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results from this investigation are pivotal for assessing AI models’ current robustness and security. They highlight the necessity for ongoing development and meticulous testing to bolster AI defenses against various adversarial challenges. Notably, this study also delves into the ethical and societal implications of employing advanced “jailbreak” techniques in AI testing. The findings are significant for understanding AI vulnerabilities and formulating strategies to enhance AI technologies’ reliability and ethical soundness, paving the way for safer and more secure AI applications.

21 pages, 815 KiB  
Article
Prediction of Arabic Legal Rulings Using Large Language Models
by Adel Ammar, Anis Koubaa, Bilel Benjdira, Omer Nacar and Serry Sibaee
Electronics 2024, 13(4), 764; https://doi.org/10.3390/electronics13040764 - 15 Feb 2024
Viewed by 682
Abstract
In the intricate field of legal studies, the analysis of court decisions is a cornerstone for the effective functioning of the judicial system. The ability to predict court outcomes helps judges during the decision-making process and equips lawyers with invaluable insights, enhancing their strategic approaches to cases. Despite its significance, the domain of Arabic court analysis remains under-explored. This paper pioneers a comprehensive predictive analysis of Arabic court decisions on a dataset of 10,813 real commercial court cases, leveraging the advanced capabilities of the current state-of-the-art large language models. Through a systematic exploration, we evaluate three prevalent foundational models (LLaMA-7b, JAIS-13b, and GPT-3.5-turbo) and three training paradigms: zero-shot, one-shot, and tailored fine-tuning. In addition, we assess the benefit of summarizing and/or translating the original Arabic input texts. This leads to a spectrum of 14 model variants, for which we offer a granular performance assessment with a series of different metrics (human assessment, GPT evaluation, ROUGE, and BLEU scores). We show that all variants of LLaMA models yield limited performance, whereas GPT-3.5-based models outperform all other models by a wide margin, surpassing the average score of the dedicated Arabic-centric JAIS model by 50%. Furthermore, we show that all scores except human evaluation are inconsistent and unreliable for assessing the performance of large language models on court decision predictions. This study paves the way for future research, bridging the gap between computational linguistics and Arabic legal analytics.

24 pages, 2195 KiB  
Article
CLICK: Integrating Causal Inference and Commonsense Knowledge Incorporation for Counterfactual Story Generation
by Dandan Li, Ziyu Guo, Qing Liu, Li Jin, Zequn Zhang, Kaiwen Wei and Feng Li
Electronics 2023, 12(19), 4173; https://doi.org/10.3390/electronics12194173 - 08 Oct 2023
Viewed by 1018
Abstract
Counterfactual reasoning explores what could have happened if the circumstances were different from what actually occurred. As a crucial subtask, counterfactual story generation integrates counterfactual reasoning into the generative narrative chain, which requires the model to preserve minimal edits and ensure narrative consistency. Previous work prioritizes conflict detection as a first step, and then replaces conflicting content with appropriate words. However, these methods mainly face two challenging issues: (a) the causal relationship between story event sequences is not fully utilized in the conflict detection stage, leading to inaccurate conflict detection, and (b) the absence of proper planning in the content rewriting stage results in a lack of narrative consistency in the generated story ending. In this paper, we propose a novel counterfactual generation framework called CLICK based on causal inference in event sequences and commonsense knowledge incorporation. To address the first issue, we utilize the correlation between adjacent events in the story ending to iteratively calculate the contents from the original ending affected by the condition. The content with the original condition is then effectively prevented from carrying over into the new story ending, thereby avoiding causal conflict with the counterfactual conditions. Considering the second issue, we incorporate structural commonsense knowledge about counterfactual conditions, equipping the framework with comprehensive background information on the potential occurrence of counterfactual conditional events. Through leveraging a rich hierarchical data structure, CLICK gains the ability to establish a more coherent and plausible narrative trajectory for subsequent storytelling. Experimental results show that our model outperforms previous unsupervised state-of-the-art methods and achieves gains of 2.65 in BLEU, 4.42 in ENTScore, and 3.84 in HMean on the TIMETRAVEL dataset.

16 pages, 615 KiB  
Article
Asking Questions about Scientific Articles—Identifying Large N Studies with LLMs
by Razvan Paroiu, Stefan Ruseti, Mihai Dascalu, Stefan Trausan-Matu and Danielle S. McNamara
Electronics 2023, 12(19), 3996; https://doi.org/10.3390/electronics12193996 - 22 Sep 2023
Viewed by 831
Abstract
The exponential growth of scientific publications increases the effort required to identify relevant articles. Moreover, the scale of studies is a frequent barrier to research, as the majority of studies are low- or medium-scaled and do not generalize well while lacking statistical power. As such, we introduce an automated method that supports the identification of large-scale studies in terms of population. First, we introduce a training corpus of 1229 manually annotated paragraphs extracted from 20 articles with different structures and considered populations. Our method prompts a FLAN-T5 language model with targeted questions and paragraphs from this corpus so that the model returns the number of participants in the study. We adopt a dialogic, extensible approach in which the model is asked a sequence of questions that are gradual in terms of focus. Second, we use a validation corpus of 200 articles labeled for having N larger than 1000 to assess the performance of our language model. Our model, without any preliminary filtering with heuristics, achieves an F1 score of 0.52, surpassing previous analyses that obtained an F1 score of 0.51. Moreover, we achieved an F1 score of 0.69 when combined with previous extraction heuristics, thus arguing for the robustness and extensibility of our approach. Finally, we apply our model to a newly introduced dataset of ERIC publications to observe trends across the years in the Education domain. A spike was observed in 2019, followed by a decrease in 2020 and, afterward, a positive trend; nevertheless, the overall percentage is lower than 3%, suggesting a major problem in terms of scale and the need for a change in perspective.

19 pages, 3239 KiB  
Article
ConKgPrompt: Contrastive Sample Method Based on Knowledge-Guided Prompt Learning for Text Classification
by Qian Wang, Cheng Zeng, Bing Li and Peng He
Electronics 2023, 12(17), 3656; https://doi.org/10.3390/electronics12173656 - 30 Aug 2023
Viewed by 1444
Abstract
Text classification aims to classify text according to pre-defined categories. Despite the success of existing methods based on the fine-tuning paradigm, there is a significant gap between fine-tuning and pre-training. Currently, prompt learning methods can bring state-of-the-art (SOTA) performance to pre-trained language models (PLMs) in text classification by transforming a classification problem into a masked language modeling problem. The crucial step of prompt learning is to construct a map between the original labels and the label extension words. However, most mapping construction methods consider only the labels themselves; relying solely on a label is not sufficient to achieve accurate prediction of mask tokens, especially in classification tasks where semantic features and label words are highly interrelated. Therefore, the accurate prediction of mask tokens requires one to consider additional factors beyond just label words. To this end, we propose a contrastive sample method based on a knowledge-guided prompt learning framework (ConKgPrompt) for text classification. Specifically, this framework utilizes external knowledge bases (KBs) to expand the label vocabulary of verbalizers at multiple granularities. In the contrastive sample module, we incorporate supervised contrastive learning to make representations more expressive. Our approach was validated on four benchmark datasets, and extensive experimental results and analysis demonstrated the effectiveness of each module of the ConKgPrompt method.

18 pages, 546 KiB  
Article
Prompt Learning with Structured Semantic Knowledge Makes Pre-Trained Language Models Better
by Hai-Tao Zheng, Zuotong Xie, Wenqiang Liu, Dongxiao Huang, Bei Wu and Hong-Gee Kim
Electronics 2023, 12(15), 3281; https://doi.org/10.3390/electronics12153281 - 30 Jul 2023
Viewed by 1213
Abstract
Pre-trained language models with structured semantic knowledge have demonstrated remarkable performance in a variety of downstream natural language processing tasks. The typical methods of integrating knowledge are designing different pre-training tasks and training from scratch, which requires high-end hardware, massive storage resources, and long computing times. Prompt learning is an effective approach to tuning language models for specific tasks, and it can also be used to infuse knowledge. However, most prompt learning methods accept one token as the answer, instead of multiple tokens. To tackle this problem, we propose the long-answer prompt learning method (KLAPrompt), with three different long-answer strategies, to incorporate semantic knowledge into pre-trained language models, and we compare the performance of these three strategies through experiments. We also explore the effectiveness of the KLAPrompt method in the medical field. Additionally, we generate a word sense prediction dataset (WSP) based on the Xinhua Dictionary and a disease and category prediction dataset (DCP) based on MedicalKG. Experimental results show that discrete answers with the answer space partitioning strategy achieve the best results, and introducing structured semantic information can consistently improve language modeling and downstream tasks.
