Neural Network Technologies in Natural Language Processing and Data Mining

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: 20 August 2024 | Viewed by 1679

Special Issue Editors


Prof. Dr. Marija Brkić Bakarić
Guest Editor
Faculty of Informatics and Digital Technologies, University of Rijeka, Radmile Matejčić 2, 51000 Rijeka, Croatia
Interests: artificial intelligence; machine learning; interpretable machine learning; educational data mining; natural language processing; machine translation

Prof. Dr. Maja Matetić
Guest Editor
Faculty of Informatics and Digital Technologies, University of Rijeka, 51000 Rijeka, Croatia
Interests: artificial intelligence; data science; machine learning; explainable artificial intelligence; explainable machine learning; human-centric AI; trustworthy Internet of Things systems

Special Issue Information

Dear Colleagues,

Neural network technologies have revolutionized the fields of natural language processing (NLP) and data mining, as well as the way we process and extract hidden insights from vast amounts of textual data. Neural networks play a central role in uncovering patterns and trends in unstructured, complex data sources. They have been successfully applied to a wide range of classification and regression tasks, as well as to clustering, anomaly detection, association rule extraction, and related problems. Different types of neural networks are used depending on the application. BERT, GPT, and T5 are among the most notable transformer architectures applied to NLP tasks such as language understanding, generation, summarization, question answering, and translation; they all share the fundamental concept of leveraging self-attention mechanisms. Both the NLP and data mining fields continue to evolve, exploring new variations and enhancements to address different tasks and challenges more efficiently and effectively.
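As a brief illustration of the self-attention mechanism shared by these architectures, the minimal NumPy sketch below computes single-head scaled dot-product attention over a toy sequence; the sequence length, embedding size, and random weights are illustrative assumptions rather than settings of any particular model.

```python
# Minimal sketch of scaled dot-product self-attention (illustrative shapes only).
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Single-head self-attention over a sequence of token embeddings X."""
    Q = X @ W_q                       # queries
    K = X @ W_k                       # keys
    V = X @ W_v                       # values
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # pairwise token-to-token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over tokens
    return weights @ V                # each output mixes information from all tokens

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))          # 5 tokens, 16-dimensional embeddings
W_q, W_k, W_v = (rng.normal(size=(16, 16)) for _ in range(3))
print(self_attention(X, W_q, W_k, W_v).shape)   # (5, 16)
```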

This Special Issue of Applied Sciences covers all application areas of neural network technologies in the fields of natural language processing (NLP) and data mining. It aims to show how neural network technologies have addressed long-standing challenges in these areas, as well as how they give rise to new challenges.

Both original research articles and comprehensive review articles are welcome.

Topics of interest in this Special Issue include various applications of neural networks, such as:

  • Topic modeling;
  • Text classification;
  • Automatic language translation;
  • Text generation;
  • Profiling;
  • Language understanding;
  • Named entity recognition;
  • Information extraction;
  • Social media analysis;
  • Pattern recognition;
  • Classification;
  • Regression;
  • Anomaly detection;
  • Association rule extraction;
  • Feature extraction;
  • Dealing with imbalanced and biased data sets;
  • Biases in language models;
  • Emerging trends.

Prof. Dr. Marija Brkić Bakarić
Prof. Dr. Maja Matetić
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • neural network technologies
  • natural language processing (NLP)
  • data mining
  • classification
  • regression
  • clustering
  • anomaly detection
  • association rule extraction
  • transformer
  • language understanding
  • generation
  • summarization
  • question answering
  • translation
  • self-attention mechanisms

Published Papers (2 papers)

Research

34 pages, 11122 KiB  
Article
A Bibliometric Analysis of Text Mining: Exploring the Use of Natural Language Processing in Social Media Research
by Andra Sandu, Liviu-Adrian Cotfas, Aurelia Stănescu and Camelia Delcea
Appl. Sci. 2024, 14(8), 3144; https://doi.org/10.3390/app14083144 - 09 Apr 2024
Viewed by 367
Abstract
Natural language processing (NLP) plays a pivotal role in modern life by enabling computers to comprehend, analyze, and respond to human language meaningfully, thereby offering exciting new opportunities. As social media platforms experience a surge in global usage, the imperative to capture and better understand the messages disseminated within these networks becomes increasingly crucial. Moreover, the occurrence of adverse events, such as the emergence of a pandemic or conflicts in various parts of the world, heightens social media users’ inclinations towards these platforms. In this context, this paper aims to explore the scientific literature dedicated to the utilization of NLP in social media research, with the goal of highlighting trends, keywords, and collaborative networks within the authorship that contribute to the proliferation of papers in this field. To achieve this objective, we extracted and analyzed 1852 papers from the ISI Web of Science database. An initial observation reveals a remarkable annual growth rate of 62.18%, underscoring the heightened interest of the academic community in this domain. This paper includes an n-gram analysis and a review of the most cited papers in the extracted database, offering a comprehensive bibliometric analysis. The insights gained from these efforts provide essential perspectives and contribute to identifying pertinent issues in social media analysis addressed through the application of NLP. Full article
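As a rough illustration of the kind of n-gram analysis mentioned in the abstract, the short scikit-learn sketch below counts the most frequent unigrams and bigrams in a small corpus; the example titles and parameters are invented for illustration and are not drawn from the study's dataset.

```python
# Hypothetical n-gram frequency analysis over a toy corpus of paper titles.
from sklearn.feature_extraction.text import CountVectorizer

titles = [
    "sentiment analysis of social media posts with deep learning",
    "topic modeling of social media discussions during the pandemic",
    "deep learning for social media text classification",
]

# Count unigrams and bigrams across the corpus and list the most frequent ones.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english")
counts = vectorizer.fit_transform(titles).sum(axis=0).A1
top = sorted(zip(vectorizer.get_feature_names_out(), counts),
             key=lambda pair: pair[1], reverse=True)[:10]
for ngram, freq in top:
    print(f"{ngram}: {freq}")
```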

27 pages, 1581 KiB  
Article
Authorship Attribution in Less-Resourced Languages: A Hybrid Transformer Approach for Romanian
by Melania Nitu and Mihai Dascalu
Appl. Sci. 2024, 14(7), 2700; https://doi.org/10.3390/app14072700 - 23 Mar 2024
Viewed by 473
Abstract
Authorship attribution for less-resourced languages like Romanian, characterized by the scarcity of large, annotated datasets and the limited number of available NLP tools, poses unique challenges. This study focuses on a hybrid Transformer combining handcrafted linguistic features, ranging from surface indices like word frequencies to syntax, semantics, and discourse markers, with contextualized embeddings from a Romanian BERT encoder. The methodology involves extracting contextualized representations from a pre-trained Romanian BERT model and concatenating them with linguistic features, selected using the Kruskal–Wallis mean rank, to create a hybrid input vector for a classification layer. We compare this approach with a baseline ensemble of seven machine learning classifiers for authorship attribution employing majority soft voting. We conduct studies on both long texts (full texts) and short texts (paragraphs), with 19 authors and a subset of 10. Our hybrid Transformer outperforms existing methods, achieving an F1 score of 0.87 on the full dataset of the 19-author set (an 11% enhancement) and an F1 score of 0.95 on the 10-author subset (an increase of 10% over previous research studies). We conduct linguistic analysis leveraging textual complexity indices and employ McNemar and Cochran’s Q statistical tests to evaluate the performance evolution across the best three models, while highlighting patterns in misclassifications. Our research contributes to diversifying methodologies for effective authorship attribution in resource-constrained linguistic environments. Furthermore, we publicly release the full dataset and the codebase associated with this study to encourage further exploration and development in this field. Full article
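As a rough, hedged sketch of the hybrid input described in the abstract (not the authors' released codebase), the snippet below concatenates a pooled contextual embedding from a Romanian BERT checkpoint with a vector of handcrafted linguistic features before a linear classification layer; the model name, feature dimensionality, and author count are assumptions for illustration.

```python
# Sketch of a hybrid input vector: BERT [CLS] embedding + handcrafted features.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "dumitrescustefan/bert-base-romanian-cased-v1"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def hybrid_vector(text: str, linguistic_features: torch.Tensor) -> torch.Tensor:
    """Concatenate the [CLS] contextual embedding with handcrafted features."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        cls_embedding = encoder(**inputs).last_hidden_state[:, 0, :]  # (1, hidden)
    return torch.cat([cls_embedding, linguistic_features.unsqueeze(0)], dim=-1)

num_authors, num_features = 19, 30            # illustrative sizes only
classifier = torch.nn.Linear(encoder.config.hidden_size + num_features, num_authors)
features = torch.rand(num_features)           # stand-in for selected linguistic indices
logits = classifier(hybrid_vector("Un paragraf de exemplu.", features))
print(logits.shape)                           # torch.Size([1, 19])
```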
