Trends and Prospects in Hybrid Methods for Natural Language Processing

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 August 2024 | Viewed by 17114

Special Issue Editors


Dr. Oscar Araque
Guest Editor
Intelligent Systems Group, Universidad Politécnica de Madrid, 28031 Madrid, Spain
Interests: natural language processing; machine learning; sentiment analysis; radicalization

Dr. Lorenzo Gatti
Guest Editor
Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, 7522 NH Enschede, The Netherlands
Interests: NLP; sentiment analysis; natural language generation; affective language; computational creativity

Dr. Álvaro Carrera Barroso
Guest Editor
Intelligent Systems Group, Universidad Politécnica de Madrid, 28040 Madrid, Spain
Interests: machine learning; agent technology; cognitive bots; natural language processing

Dr. Kyriaki Kalimeri
Guest Editor
Institute for Scientific Interchange Foundation, 10126 Turin, Italy
Interests: computational social science; machine learning; moral and personality psychology; cognitive science; vaccine hesitancy; digital humanities; algorithmic biases; natural language processing

Special Issue Information

Dear Colleagues,

Recent advances in natural language processing (NLP) that involve machine learning and deep learning have revolutionized the field. Still, there are specific tasks and domains where these new techniques have not yet surpassed more classical approaches, for example, tasks that require deep linguistic knowledge such as natural language understanding, semantic reasoning, and question answering. Another common limitation is the scarcity of training datasets, a situation that arises when applying recent approaches to new domains. To overcome these limitations, it is necessary to consider hybrid systems that integrate domain-oriented knowledge into learning models in a way that allows machines to grasp the intricacies of real-world applications, equipping them with deep understanding and general common sense.

While there are efforts to design hybrid models, several aspects need to be considered, such as interpretability, transparency, accountability, and efficiency. This Special Issue of Electronics addresses the direction of NLP efforts toward hybrid solutions, considering these characteristics and their effects on end users and society in general.

Topics of interest for this Special Issue include, but are not limited to:

  • Information extraction;
  • Semantic reasoning;
  • Text and speech processing;
  • Relational semantics;
  • Discourse analysis;
  • Argument mining;
  • Text summarization;
  • Machine translation;
  • Natural language generation;
  • Natural language understanding;
  • Question answering;
  • Sentiment and emotion analysis;
  • Affect analysis;
  • Hate speech analysis;
  • Radicalization analysis;
  • Disinformation analysis;
  • Authorship attribution.

Dr. Oscar Araque
Dr. Lorenzo Gatti
Dr. Álvaro Carrera Barroso
Dr. Kyriaki Kalimeri
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing
  • text mining
  • machine learning
  • information extraction

Published Papers (9 papers)

Research

21 pages, 4367 KiB  
Article
Chinese Multicategory Sentiment of E-Commerce Analysis Based on Deep Learning
by Hongchan Li, Jianwen Wang, Yantong Lu, Haodong Zhu and Jiming Ma
Electronics 2023, 12(20), 4259; https://doi.org/10.3390/electronics12204259 - 15 Oct 2023
Viewed by 1037
Abstract
With the continuous rise of information technology and social networks, and the explosive growth of network text information, text sentiment analysis technology now plays a vital role in public opinion monitoring and product development analysis on networks. Text data are high-dimensional and complex, and traditional binary classification can only classify sentiment from positive or negative aspects. This does not fully cover the various emotions of users, and, therefore, natural language semantic sentiment analysis has limitations. To solve this deficiency, we propose a new model for analyzing text sentiment that combines deep learning and the bidirectional encoder representations from transformers (BERT) model. We first use an advanced BERT language model to convert the input text into dynamic word vectors; then, we adopt a convolutional neural network (CNN) to obtain the relatively significant partial emotional characteristics of the text. After extraction, we use the bidirectional gated recurrent unit (BiGRU) to bidirectionally capture the contextual feature information of the text. Finally, with the multi-head attention mechanism we obtain correlations among the data in different information spaces from different subspaces so that the key information related to emotion in the text can be selectively extracted. The final emotional feature representation obtained is classified using Softmax. Compared with other similar existing methods, our model showed a good effect in comparative experiments on an e-commerce text dataset, and the accuracy and F1-score of the classification were significantly improved.

15 pages, 3565 KiB  
Article
Modified Aquila Optimizer with Stacked Deep Learning-Based Sentiment Analysis of COVID-19 Tweets
by Ahmed S. Almasoud, Hala J. Alshahrani, Abdulkhaleq Q. A. Hassan, Nabil Sharaf Almalki and Abdelwahed Motwakel
Electronics 2023, 12(19), 4125; https://doi.org/10.3390/electronics12194125 - 03 Oct 2023
Cited by 1 | Viewed by 952
Abstract
In recent times, global cities have been transforming from traditional cities to sustainable smart cities. In text sentiment analysis (SA), many people face critical issues, namely urban traffic management, urban living quality, urban information security, urban energy usage, urban safety, etc. Artificial intelligence (AI)-based applications play important roles in dealing with these crucial challenges in text SA. In such scenarios, the classification of COVID-19-related tweets for text SA includes using natural language processing (NLP) and machine learning methodologies to classify tweet datasets based on their content. This assists in disseminating relevant information, understanding public sentiment, and promoting sustainable practices in urban areas during this pandemic. This article introduces a modified aquila optimizer with a stacked deep learning-based COVID-19 tweet classification (MAOSDL-TC) technique for text SA. The presented MAOSDL-TC technique incorporates FastText, an effective and powerful text representation approach used for the generation of word embeddings. Furthermore, the MAOSDL-TC technique utilizes an attention-based stacked bidirectional long short-term memory (ASBiLSTM) model for the classification of sentiments that exist in tweets. To improve the detection results of the ASBiLSTM model, the MAO algorithm is applied for the hyperparameter tuning process. The presented MAOSDL-TC technique is validated on the benchmark tweets dataset. The experimental outcomes implied the promising results of the MAOSDL-TC technique compared to recent models in terms of different measures. This MAOSDL-TC technique improves accuracy and interpretability of sentiment prediction.

16 pages, 794 KiB  
Article
A Cross-Domain Generative Data Augmentation Framework for Aspect-Based Sentiment Analysis
by Jiawei Xue, Yanhong Li, Zixuan Li, Yue Cui, Shaoqiang Zhang and Shuqin Wang
Electronics 2023, 12(13), 2949; https://doi.org/10.3390/electronics12132949 - 04 Jul 2023
Cited by 1 | Viewed by 1440
Abstract
Aspect-based sentiment analysis (ABSA) is a crucial fine-grained sentiment analysis task that aims to determine sentiment polarity in a specific aspect term. Recent research has advanced prediction accuracy by pre-training models on ABSA tasks. However, due to the lack of fine-grained data, those models cannot be trained effectively. In this paper, we propose the cross-domain generative data augmentation framework (CDGDA) that utilizes a generation model to produce in-domain, fine-grained sentences by learning from similar, coarse-grained datasets out-of-domain. To generate fine-grained sentences, we guide the generation model using two prompt methods: the aspect replacement and the aspect–sentiment pair replacement. We also refine the quality of generated sentences by an entropy minimization filter. Experimental results on three public datasets show that our framework outperforms most baseline methods and other data augmentation methods, thereby demonstrating its efficacy.

19 pages, 6488 KiB  
Article
Multimodal Natural Language Explanation Generation for Visual Question Answering Based on Multiple Reference Data
by He Zhu, Ren Togo, Takahiro Ogawa and Miki Haseyama
Electronics 2023, 12(10), 2183; https://doi.org/10.3390/electronics12102183 - 10 May 2023
Cited by 2 | Viewed by 1726
Abstract
As deep learning research continues to advance, interpretability is becoming as important as model performance. Conducting interpretability studies to understand the decision-making processes of deep learning models can improve performance and provide valuable insights for humans. The interpretability of visual question answering (VQA), a crucial task for human–computer interaction, has garnered the attention of researchers due to its wide range of applications. The generation of natural language explanations for VQA that humans can better understand has gradually supplanted heatmap representations as the mainstream focus in the field. Humans typically answer questions by first identifying the primary objects in an image and then referring to various information sources, both within and beyond the image, including prior knowledge. However, previous studies have only considered input images, resulting in insufficient information that can lead to incorrect answers and implausible explanations. To address this issue, we introduce multiple references in addition to the input image. Specifically, we propose a multimodal model that generates natural language explanations for VQA. We introduce outside knowledge using the input image and question and incorporate object information into the model through an object detection module. By increasing the information available during the model generation process, we significantly improve VQA accuracy and the reliability of the generated explanations. Moreover, we employ a simple and effective feature fusion joint vector to combine information from multiple modalities while maximizing information preservation. Qualitative and quantitative evaluation experiments demonstrate that the proposed method can generate more reliable explanations than state-of-the-art methods while maintaining answering accuracy.

17 pages, 4265 KiB  
Article
A Novel AB-CNN Model for Multi-Classification Sentiment Analysis of e-Commerce Comments
by Hongchan Li, Yantong Lu, Haodong Zhu and Yu Ma
Electronics 2023, 12(8), 1880; https://doi.org/10.3390/electronics12081880 - 16 Apr 2023
Cited by 2 | Viewed by 1184
Abstract
Despite the success of dichotomous sentiment analysis, it does not encompass the various emotional colors of users in reality, which can be more plentiful than a mere positive or negative association. Moreover, the complexity and imbalanced nature of Chinese text presents a formidable obstacle to overcome. To address prior inadequacies, the three-classification method is employed and a novel AB-CNN model is proposed, incorporating an attention mechanism, BiLSTM, and a CNN. The proposed model was tested on a public e-commerce dataset and demonstrated a superior performance compared to existing classifiers. It utilizes a word vector model to extract features from sentences and vectorize them. The attention layer is used to calculate the weighted average attention of each text, and the relevant representation is obtained. BiLSTM is then employed to read the text information from both directions, further enhancing the emotional level. Finally, softmax is used to classify the emotional polarity.

21 pages, 4238 KiB  
Article
Sarcasm Detection over Social Media Platforms Using Hybrid Ensemble Model with Fuzzy Logic
by Dilip Kumar Sharma, Bhuvanesh Singh, Saurabh Agarwal, Nikhil Pachauri, Amel Ali Alhussan and Hanaa A. Abdallah
Electronics 2023, 12(4), 937; https://doi.org/10.3390/electronics12040937 - 13 Feb 2023
Cited by 12 | Viewed by 3421
Abstract
A figurative language expression known as sarcasm implies a complete contrast between what is stated and what is meant, with the latter usually being rather or extremely offensive, meant to offend or humiliate someone. In routine conversations on social media websites, sarcasm is frequently utilized. Sentiment analysis procedures are prone to errors because sarcasm can change a statement’s meaning. Concern about analytic accuracy has increased as automatic social networking analysis tools have grown. According to preliminary studies, the accuracy of computerized sentiment analysis has been dramatically decreased by sarcastic remarks alone. Sarcastic expressions also affect automatic false news identification and cause false positives. Because sarcastic comments are inherently ambiguous, identifying sarcasm may be difficult. Different individual NLP strategies have been proposed in the past. However, each methodology has text contexts and vicinity restrictions. The methods are unable to manage various kinds of content. This study suggests a unique ensemble approach based on text embedding that includes fuzzy evolutionary logic at the top layer. This approach involves applying fuzzy logic to ensemble embeddings from the Word2Vec, GloVe, and BERT models before making the final classification. The three models’ weights assigned to the probability are used to categorize objects using the fuzzy layer. The suggested model was validated on the following social media datasets: the Headlines dataset, the “Self-Annotated Reddit Corpus” (SARC), and the Twitter app dataset. Accuracies of 90.81%, 85.38%, and 86.80%, respectively, were achieved. These accuracies were higher than those of earlier state-of-the-art models.

12 pages, 282 KiB  
Article
Emotion Recognition Based on the Structure of Narratives
by Tibor Pólya and István Csertő
Electronics 2023, 12(4), 919; https://doi.org/10.3390/electronics12040919 - 11 Feb 2023
Cited by 3 | Viewed by 1748
Abstract
One important application of natural language processing (NLP) is the recognition of emotions in text. Most current emotion analyzers use a set of linguistic features such as emotion lexicons, n-grams, word embeddings, and emoticons. This study proposes a new strategy to perform emotion recognition, which is based on the homologous structure of emotions and narratives. It is argued that emotions and narratives share both a goal-based structure and an evaluation structure. The new strategy was tested in an empirical study with 117 participants who recounted two narratives about their past emotional experiences, including one positive and one negative episode. Immediately after narrating each episode, the participants reported their current affective state using the Affect Grid. The goal-based structure and evaluation structure of the narratives were analyzed with a hybrid method. First, a linguistic analysis of the texts was carried out, including tokenization, lemmatization, part-of-speech tagging, and morphological analysis. Second, an extensive set of rule-based algorithms was used to analyze the goal-based structure of, and evaluations in, the narratives. Third, the output was fed into machine learning classifiers of narrative structural features that previously proved to be effective predictors of the narrator’s current affective state. This hybrid procedure yielded a high average F1 score (0.72). The results are discussed in terms of the benefits of employing narrative structure analysis in NLP-based emotion recognition.

10 pages, 253 KiB
Article
Combining Transformer Embeddings with Linguistic Features for Complex Word Identification
by Jenny A. Ortiz-Zambrano, César Espin-Riofrio and Arturo Montejo-Ráez
Electronics 2023, 12(1), 120; https://doi.org/10.3390/electronics12010120 - 27 Dec 2022
Cited by 4 | Viewed by 2386
Abstract
Identifying which words present in a text may be difficult to understand by common readers is a well-known subtask in text complexity analysis. The advent of deep language models has also established the new state-of-the-art in this task by means of end-to-end semi-supervised (pre-trained) and downstream training of, mainly, transformer-based neural networks. Nevertheless, the usefulness of traditional linguistic features in combination with neural encodings is worth exploring, as the computational cost needed for training and running such networks is becoming more and more relevant with energy-saving constraints. This study explores lexical complexity prediction (LCP) by combining pre-trained and adjusted transformer networks with different types of traditional linguistic features. We apply these features over classical machine learning classifiers. Our best results are obtained by applying Support Vector Machines on an English corpus in an LCP task solved as a regression problem. The results show that linguistic features can be useful in LCP tasks and may improve the performance of deep learning systems.

Review

31 pages, 6959 KiB  
Review
A Survey of Non-Autoregressive Neural Machine Translation
by Feng Li, Jingxian Chen and Xuejun Zhang
Electronics 2023, 12(13), 2980; https://doi.org/10.3390/electronics12132980 - 06 Jul 2023
Cited by 1 | Viewed by 1752
Abstract
Non-autoregressive neural machine translation (NAMT) has received increasing attention recently by virtue of its promising acceleration paradigm for fast decoding. However, these speedup gains come at the cost of accuracy, in comparison to its autoregressive counterpart. To close this performance gap, many studies have been conducted to achieve a better quality and speed trade-off. In this paper, we survey the NAMT domain from two new perspectives, i.e., target dependency management and training strategies arrangement. Proposed approaches are elaborated at length, involving five model categories. We then collect extensive experimental data to present abundant graphs for quantitative evaluation and qualitative comparison according to the reported translation performance. Based on that, a comprehensive performance analysis is provided. Further inspection is conducted for two salient problems: target sentence length prediction and sequence-level knowledge distillation. Accumulative reinvestigation of translation quality and speedup demonstrates that non-autoregressive decoding may not run as fast as it seems and has yet to genuinely surpass autoregressive decoding in accuracy. Finally, we outline prospective work from inner and outer facets and call for more practical and warrantable studies for the future.
