Applications of Text Mining in Data Analytics

A special issue of Applied Sciences (ISSN 2076-3417). This special issue belongs to the section "Computing and Artificial Intelligence".

Deadline for manuscript submissions: closed (20 September 2023) | Viewed by 3671

Special Issue Editors


Dr. Ladislav Hvizdák
Guest Editor
Faculty of Mining, Ecology, Process Control and Geotechnologies, Institute of Earth Resources, Department of Geo and Mining Tourism, Technical University of Košice, 040 01 Košice, Slovakia
Interests: information systems (IS); GIS in modeling the historical montane landscape; 2D and 3D models and historical research of the landscape for geotourism needs; various interactive forms of the presentation of 2D and 3D models through modern techniques (stereoscopy, holography, etc.)

Dr. Lucia Domaracká
Guest Editor
Faculty of Mining, Ecology, Process Control and Geotechnologies, Institute of Earth Resources, Department of Earth Resources Management, Technical University of Košice, 040 01 Košice, Slovakia
Interests: mining; geotourism; mineral exploration; mineral economics

Dr. Peter Blišťan
Guest Editor
Institute of Geodesy, Cartography and Geographical Information Systems, Faculty of Mining, Ecology, Process Control and Geotechnologies, Technical University of Košice, 040 01 Košice, Slovakia
Interests: geographic information systems (GIS); 3D visualization; spatial modelling; applied geology; remote sensing and data collection; geostatistics; CAD systems

Special Issue Information

Dear Colleagues,

Text mining is now used across a wide range of government, research and business settings. All of these groups can use text mining to manage records and find documents relevant to their day-to-day activities. Scientific researchers use text mining to organize large text data sets (addressing the problem of unstructured data), to determine the ideas communicated through text, and to support scientific discoveries in various fields.

This Special Issue will be devoted to new perspectives on text mining and resulting applications, as well as text mining in the age of social media.

The topics discussed in this Special Issue will focus on the latest methods and advances in text mining.

Dr. Ladislav Hvizdák
Dr. Lucia Domaracká
Dr. Peter Blišťan
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Applied Sciences is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • natural language processing (NLP)
  • machine learning
  • ontologies
  • vocabularies
  • custom dictionaries
  • open architecture
  • ETL (extract, transform, load)
  • social media
  • quality data

Published Papers (2 papers)

Research

19 pages, 14885 KiB  
Article
Extracting Domain-Specific Chinese Named Entities for Aviation Safety Reports: A Case Study
by Xin Wang, Zurui Gan, Yaxi Xu, Bingnan Liu and Tao Zheng
Appl. Sci. 2023, 13(19), 11003; https://doi.org/10.3390/app131911003 - 06 Oct 2023
Viewed by 952
Abstract
Aviation safety reports can provide detailed records of past aviation safety accidents, analyze their problems and hidden dangers, and help airlines and other aviation enterprises prevent similar accidents from happening again. In a novel way, we plan to use named entity recognition technology to quickly mine important information in reports, helping safety personnel improve efficiency. The development of intelligent civil aviation creates demands for the incorporation of big data and artificial intelligence. Because of the aviation-specific terms and the complexity of identifying named entity boundaries, the mining of aviation safety report texts is a challenging domain. This paper proposes a novel method for aviation safety report entity extraction. First, ten kinds of entities and sequences, such as event, company, city, operation, date, aircraft type, personnel, flight number, aircraft registration and aircraft part, were annotated using the BIO format. Second, we present a semantic representation enhancement approach through the fusion of enhanced representation through knowledge integration embedding (ERNIE), pinyin embedding and glyph embedding. Then, in order to improve the accuracy of specific entity extraction, we constructed and utilized an aviation domain dictionary that includes high-frequency technical aviation terms. After that, we adopted bilinear attention networks (BANs), the feature fusion approach originally used in multi-modal analysis, in our study to incorporate features extracted from both iterated dilated convolutional neural network (IDCNN) and bi-directional long short-term memory (BiLSTM) architectures. A case study of specific entity extraction for an aviation safety events dataset was conducted. The experimental results demonstrate that our proposed algorithm, with an F1 score reaching 97.93%, is superior to several baseline and advanced algorithms. Therefore, the proposed approach offers a robust methodological foundation for the relationship extraction and knowledge graph construction of aviation safety reports.
(This article belongs to the Special Issue Applications of Text Mining in Data Analytics)
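
The BIO annotation format mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' code: the label names (FLIGHT_NO, CITY, DATE), the helper functions and the English example sentence are assumptions chosen for readability, whereas the paper annotates Chinese text. Each token receives B-X at the start of an entity of type X, I-X inside it, and O outside any entity.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Assign BIO tags given (start, end_exclusive, label) token spans."""
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def decode_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Recover (entity text, label) pairs from a BIO-tagged sequence."""
    entities: List[Tuple[str, str]] = []
    current: List[str] = []
    label = None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), label))
            current, label = [tok], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(tok)
        else:
            if current:
                entities.append((" ".join(current), label))
            current, label = [], None
    if current:
        entities.append((" ".join(current), label))
    return entities


# Hypothetical example sentence and label set, for illustration only.
tokens = ["Flight", "CA1234", "departed", "Beijing", "on", "6", "October"]
spans = [(1, 2, "FLIGHT_NO"), (3, 4, "CITY"), (5, 7, "DATE")]
tags = bio_tags(tokens, spans)
print(list(zip(tokens, tags)))
print(decode_entities(tokens, tags))
```

A sequence labeling model such as the one described in the abstract would predict the tag sequence; the decoding step then turns predicted tags back into entity spans.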

17 pages, 3072 KiB  
Article
Automated Arabic Long-Tweet Classification Using Transfer Learning with BERT
by Meshrif Alruily, Abdul Manaf Fazal, Ayman Mohamed Mostafa and Mohamed Ezz
Appl. Sci. 2023, 13(6), 3482; https://doi.org/10.3390/app13063482 - 09 Mar 2023
Cited by 5 | Viewed by 1994
Abstract
Social media platforms like Twitter are commonly used by people interested in various activities, interests, and subjects that may cover their everyday activities and plans, as well as their thoughts on religion, technology, or the products they use. In this paper, we present a bidirectional encoder representations from transformers (BERT)-based text classification model, ARABERT4TWC, for classifying the Arabic tweets of users into different categories. This work aims to provide an enhanced deep-learning model that can automatically classify the robust Arabic tweets of different users. In our proposed work, a transformer-based model for text classification is constructed from a pre-trained BERT model provided by the Hugging Face Transformers library with custom dense layers. The multi-class classification layer is built on top of the BERT encoder to categorize the tweets. First, data sanitation and preprocessing were performed on the raw Arabic corpus to improve the model’s accuracy. Second, an Arabic-specific BERT model was built and input embedding vectors were fed into it. Using five publicly accessible datasets, substantial experiments were executed, and the fine-tuning technique was assessed in terms of tokenized vector and learning rate. In addition, we assessed the accuracy of various deep-learning models for classifying Arabic text.
(This article belongs to the Special Issue Applications of Text Mining in Data Analytics)
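
The abstract above mentions data sanitation and preprocessing of raw Arabic tweets but does not list the steps. The following is a minimal sketch of what such a step might look like, under the assumption that sanitation means removing URLs, user mentions, hashtag markers, the tatweel (elongation) character and diacritics. The function name `sanitize_tweet` and the choice of steps are hypothetical and are not taken from the paper.

```python
import re
import unicodedata

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
TATWEEL = "\u0640"  # Arabic elongation character


def sanitize_tweet(text: str) -> str:
    """Assumed cleaning steps; the paper's actual pipeline may differ."""
    text = URL_RE.sub(" ", text)        # drop links
    text = MENTION_RE.sub(" ", text)    # drop @user mentions
    text = text.replace("#", " ")       # keep hashtag words, drop the marker
    text = text.replace(TATWEEL, "")    # remove elongation
    # Arabic diacritics are nonspacing marks (Unicode category Mn).
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Mn")
    return " ".join(text.split())       # collapse whitespace


if __name__ == "__main__":
    raw = "@user \u0645\u064e\u0631\u0652\u062d\u064e\u0628\u064b\u0627 https://t.co/abc #\u062a\u0642\u0646\u064a\u0629"
    print(sanitize_tweet(raw))
```

The cleaned text would then be passed to a tokenizer matching the chosen pre-trained BERT model before fine-tuning.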