Emerging Theory and Applications in Natural Language Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 October 2024 | Viewed by 8061

Special Issue Editors


Guest Editor
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
Interests: knowledge graph; natural language processing; multimodal

Guest Editor
School of Computer Science, Beijing Jiaotong University, Beijing 100091, China
Interests: natural language processing; knowledge graph; machine learning

Guest Editor
School of Computer Science and Technology, Dalian University of Technology, Dalian 116081, China
Interests: information retrieval; question answering and dialogue; natural language processing; biomedical literature-based knowledge discovery

Special Issue Information

Dear Colleagues,

In recent years, natural language processing (NLP) has been transformed by groundbreaking deep learning advancements and the emergence of large language models (LLMs). The combination of LLMs with adaptation tuning methods has significantly increased the generalization capabilities of NLP models, illuminating the path towards general artificial intelligence systems for researchers. Recognizing the significance of these emerging developments, it is crucial to explore their potential and understand their relationship with classical methods in shaping the future of NLP and its real-world applications. The aim of this Special Issue is to showcase cutting-edge research in NLP, highlighting novel theories, methods, and applications that advance the state of the art, while also promoting interdisciplinary research.

Suggested themes for this Special Issue include, but are not limited to:

(1)  Novel NLP theory, architectures, and algorithms;

(2)  Theoretical foundations of LLMs: emergent abilities, scaling effects, etc.;

(3)  Model training and utilization strategies;

(4)  Efficiency and scalability of language models;

(5)  Integration of NLP with other AI technologies;

(6)  Interpretability of NLP models and LLMs;

(7)  Evaluating large language models: capabilities and limitations;

(8)  Ethical considerations and fairness;

(9)  Safety and alignment in LLMs;

(10) Domain-specific NLP applications;

(11)  Other emerging topics in NLP and LLM research.

Dr. Linmei Hu
Dr. Jian Liu
Dr. Bo Xu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, you can access the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • large language models
  • NLP theory and application

Published Papers (8 papers)


Research

17 pages, 1582 KiB  
Article
AdMISC: Advanced Multi-Task Learning and Feature-Fusion for Emotional Support Conversation
by Xuhui Jia, Jia He, Qian Zhang and Jin Jin
Electronics 2024, 13(8), 1484; https://doi.org/10.3390/electronics13081484 - 13 Apr 2024
Viewed by 361
Abstract
Emotional support conversation is an emerging and challenging task in natural language processing that aims to alleviate people’s emotional distress. Each utterance in a dialogue has features such as emotion, intent, and commonsense knowledge. Previous research has shown subpar performance in strategy prediction accuracy and response generation quality because certain underlying factors are overlooked. To address these issues, we propose Advanced Multi-Task Learning and Feature-Fusion for Emotional Support Conversation (AdMISC), which extracts various potential factors influencing dialogue through neural networks, thereby improving the accuracy of strategy prediction and the quality of generated responses. Specifically, we extract features affecting dialogue through dynamic emotion extraction and commonsense enhancement and then model strategy prediction. Additionally, the model learns these features through attention networks to generate higher-quality responses. Furthermore, we introduce a method for automatically averaging loss function weights to improve the model’s performance. Experimental results on the emotional support conversation dataset ESConv demonstrate that our proposed model outperforms baseline methods in both strategy label prediction accuracy and a range of automatic and human evaluation metrics.
(This article belongs to the Special Issue Emerging Theory and Applications in Natural Language Processing)

25 pages, 3406 KiB  
Article
Persona-Identified Chatbot through Small-Scale Modeling and Data Transformation
by Bitna Keum, Juoh Sun, Woojin Lee, Seongheum Park and Harksoo Kim
Electronics 2024, 13(8), 1409; https://doi.org/10.3390/electronics13081409 - 09 Apr 2024
Viewed by 315
Abstract
Research on chatbots aimed at facilitating more natural and engaging conversations is actively underway. With the growing recognition of the significance of personas in this context, persona-based conversational research is gaining prominence. Despite the abundance of publicly available chit-chat datasets, persona-based chat datasets remain scarce, primarily due to their higher associated costs. Consequently, we propose a methodology for transforming extensive chit-chat datasets into persona-based chat datasets. We also propose a model adept at effectively incorporating personas into responses, even with a constrained number of parameters. This model can discern the most relevant information from persona memory without resorting to a retrieval model. Furthermore, it decides whether to reference the memory, thereby enhancing the interpretability of the model’s judgments. Our CC2PC framework demonstrates superior performance in both automatic and LLM evaluations when compared to a high-cost persona-based chat dataset. Additionally, experimental results for the proposed model indicate improved persona-based response capabilities.

21 pages, 9086 KiB  
Article
Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts
by Brendan Hannon, Yulia Kumar, Dejaun Gayle, J. Jenny Li and Patricia Morreale
Electronics 2024, 13(5), 842; https://doi.org/10.3390/electronics13050842 - 22 Feb 2024
Viewed by 1027
Abstract
In the rapidly advancing field of Artificial Intelligence (AI), this study presents a critical evaluation of the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to this research are innovative adversarial prompts designed to rigorously test the content moderation capabilities of these AI systems. This study introduces new adversarial tests and the Response Quality Score (RQS), a metric specifically developed to assess the nuances of AI responses. Additionally, the research spotlights FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results from this investigation are pivotal for assessing AI models’ current robustness and security. They highlight the necessity for ongoing development and meticulous testing to bolster AI defenses against various adversarial challenges. Notably, this study also delves into the ethical and societal implications of employing advanced “jailbreak” techniques in AI testing. The findings are significant for understanding AI vulnerabilities and formulating strategies to enhance AI technologies’ reliability and ethical soundness, paving the way for safer and more secure AI applications.

21 pages, 815 KiB  
Article
Prediction of Arabic Legal Rulings Using Large Language Models
by Adel Ammar, Anis Koubaa, Bilel Benjdira, Omer Nacar and Serry Sibaee
Electronics 2024, 13(4), 764; https://doi.org/10.3390/electronics13040764 - 15 Feb 2024
Viewed by 682
Abstract
In the intricate field of legal studies, the analysis of court decisions is a cornerstone for the effective functioning of the judicial system. The ability to predict court outcomes helps judges during the decision-making process and equips lawyers with invaluable insights, enhancing their strategic approaches to cases. Despite its significance, the domain of Arabic court analysis remains under-explored. This paper pioneers a comprehensive predictive analysis of Arabic court decisions on a dataset of 10,813 real commercial court cases, leveraging the advanced capabilities of the current state-of-the-art large language models. Through a systematic exploration, we evaluate three prevalent foundational models (LLaMA-7b, JAIS-13b, and GPT-3.5-turbo) and three training paradigms: zero-shot, one-shot, and tailored fine-tuning. In addition, we assess the benefit of summarizing and/or translating the original Arabic input texts. This leads to a spectrum of 14 model variants, for which we offer a granular performance assessment with a series of different metrics (human assessment, GPT evaluation, ROUGE, and BLEU scores). We show that all variants of LLaMA models yield limited performance, whereas GPT-3.5-based models outperform all other models by a wide margin, surpassing the average score of the dedicated Arabic-centric JAIS model by 50%. Furthermore, we show that all scores except human evaluation are inconsistent and unreliable for assessing the performance of large language models on court decision predictions. This study paves the way for future research, bridging the gap between computational linguistics and Arabic legal analytics.

24 pages, 2195 KiB  
Article
CLICK: Integrating Causal Inference and Commonsense Knowledge Incorporation for Counterfactual Story Generation
by Dandan Li, Ziyu Guo, Qing Liu, Li Jin, Zequn Zhang, Kaiwen Wei and Feng Li
Electronics 2023, 12(19), 4173; https://doi.org/10.3390/electronics12194173 - 08 Oct 2023
Viewed by 1018
Abstract
Counterfactual reasoning explores what could have happened if the circumstances were different from what actually occurred. As a crucial subtask, counterfactual story generation integrates counterfactual reasoning into the generative narrative chain, which requires the model to preserve minimal edits and ensure narrative consistency. Previous work prioritizes conflict detection as a first step, and then replaces conflicting content with appropriate words. However, these methods mainly face two challenging issues: (a) the causal relationship between story event sequences is not fully utilized in the conflict detection stage, leading to inaccurate conflict detection, and (b) the absence of proper planning in the content rewriting stage results in a lack of narrative consistency in the generated story ending. In this paper, we propose a novel counterfactual generation framework called CLICK based on causal inference in event sequences and commonsense knowledge incorporation. To address the first issue, we utilize the correlation between adjacent events in the story ending to iteratively calculate the contents from the original ending affected by the condition. The content with the original condition is then effectively prevented from carrying over into the new story ending, thereby avoiding causal conflict with the counterfactual conditions. Considering the second issue, we incorporate structural commonsense knowledge about counterfactual conditions, equipping the framework with comprehensive background information on the potential occurrence of counterfactual conditional events. Through leveraging a rich hierarchical data structure, CLICK gains the ability to establish a more coherent and plausible narrative trajectory for subsequent storytelling. Experimental results show that our model outperforms previous unsupervised state-of-the-art methods and achieves gains of 2.65 in BLEU, 4.42 in ENTScore, and 3.84 in HMean on the TIMETRAVEL dataset.

16 pages, 615 KiB  
Article
Asking Questions about Scientific Articles—Identifying Large N Studies with LLMs
by Razvan Paroiu, Stefan Ruseti, Mihai Dascalu, Stefan Trausan-Matu and Danielle S. McNamara
Electronics 2023, 12(19), 3996; https://doi.org/10.3390/electronics12193996 - 22 Sep 2023
Viewed by 831
Abstract
The exponential growth of scientific publications increases the effort required to identify relevant articles. Moreover, the scale of studies is a frequent barrier to research, as the majority of studies are low- or medium-scaled and do not generalize well while lacking statistical power. As such, we introduce an automated method that supports the identification of large-scale studies in terms of population. First, we introduce a training corpus of 1229 manually annotated paragraphs extracted from 20 articles with different structures and considered populations. Our method prompts a FLAN-T5 language model with targeted questions and paragraphs from this corpus so that the model returns the number of participants in the study. We adopt a dialogic, extensible approach in which the model is asked a sequence of questions that are gradual in terms of focus. Second, we use a validation corpus of 200 articles labeled for having N larger than 1000 to assess the performance of our language model. Our model, without any preliminary filtering with heuristics, achieves an F1 score of 0.52, surpassing previous analyses that obtained an F1 score of 0.51. Moreover, we achieved an F1 score of 0.69 when combined with previous extraction heuristics, thus arguing for the robustness and extensibility of our approach. Finally, we apply our model to a newly introduced dataset of ERIC publications to observe trends across the years in the Education domain. A spike was observed in 2019, followed by a decrease in 2020 and, afterward, a positive trend; nevertheless, the overall percentage is lower than 3%, suggesting a major problem in terms of scale and the need for a change in perspective.

19 pages, 3239 KiB  
Article
ConKgPrompt: Contrastive Sample Method Based on Knowledge-Guided Prompt Learning for Text Classification
by Qian Wang, Cheng Zeng, Bing Li and Peng He
Electronics 2023, 12(17), 3656; https://doi.org/10.3390/electronics12173656 - 30 Aug 2023
Viewed by 1444
Abstract
Text classification aims to classify text according to pre-defined categories. Despite the success of existing methods based on the fine-tuning paradigm, there is a significant gap between fine-tuning and pre-training. Currently, prompt learning methods can bring state-of-the-art (SOTA) performance to pre-trained language models (PLMs) in text classification by transforming a classification problem into a masked language modeling problem. The crucial step of prompt learning is to construct a map between the original labels and the label extension words. However, most mapping construction methods consider only the labels themselves; relying solely on a label is not sufficient to achieve accurate prediction of mask tokens, especially in classification tasks where semantic features and label words are highly interrelated. Therefore, the accurate prediction of mask tokens requires one to consider additional factors beyond just label words. To this end, we propose a contrastive sample method based on a knowledge-guided prompt learning framework (ConKgPrompt) for text classification. Specifically, this framework utilizes external knowledge bases (KBs) to expand the label vocabulary of verbalizers at multiple granularities. In the contrastive sample module, we incorporate supervised contrastive learning to make representations more expressive. Our approach was validated on four benchmark datasets, and extensive experimental results and analysis demonstrated the effectiveness of each module of the ConKgPrompt method.

18 pages, 546 KiB  
Article
Prompt Learning with Structured Semantic Knowledge Makes Pre-Trained Language Models Better
by Hai-Tao Zheng, Zuotong Xie, Wenqiang Liu, Dongxiao Huang, Bei Wu and Hong-Gee Kim
Electronics 2023, 12(15), 3281; https://doi.org/10.3390/electronics12153281 - 30 Jul 2023
Viewed by 1213
Abstract
Pre-trained language models with structured semantic knowledge have demonstrated remarkable performance in a variety of downstream natural language processing tasks. The typical methods of integrating knowledge are designing different pre-training tasks and training from scratch, which requires high-end hardware, massive storage resources, and long computing times. Prompt learning is an effective approach to tuning language models for specific tasks, and it can also be used to infuse knowledge. However, most prompt learning methods accept one token as the answer, instead of multiple tokens. To tackle this problem, we propose the long-answer prompt learning method (KLAPrompt), with three different long-answer strategies, to incorporate semantic knowledge into pre-trained language models, and we compare the performance of these three strategies through experiments. We also explore the effectiveness of the KLAPrompt method in the medical field. Additionally, we generate a word sense prediction dataset (WSP) based on the Xinhua Dictionary and a disease and category prediction dataset (DCP) based on MedicalKG. Experimental results show that discrete answers with the answer space partitioning strategy achieve the best results, and introducing structured semantic information can consistently improve language modeling and downstream tasks.
