Natural Language Processing (NLP) and Machine Learning (ML)—Theory and Applications

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Mathematics and Computer Science".

Deadline for manuscript submissions: closed (31 January 2022) | Viewed by 43030

A printed edition of this Special Issue is available.

Special Issue Editors


Prof. Dr. Florentina Hristea
Guest Editor
Department of Computer Science, Faculty of Mathematics and Computer Science, University of Bucharest, 010014 Bucharest, Romania
Interests: artificial intelligence (AI); knowledge representation; natural language processing; computational linguistics; human language technology; computational statistics applied in natural language processing; data analysis

Prof. Dr. Cornelia Caragea
Guest Editor
Department of Computer Science, University of Illinois at Chicago, Chicago, IL 60607, USA
Interests: artificial intelligence; machine learning; natural language processing; information retrieval

Special Issue Information

Dear Colleagues,

Natural language processing (NLP) is one of the most important technologies in use today due to the large and growing amount of online text that needs to be understood to fully ascertain its enormous value. Although many machine learning models have been developed for NLP applications, deep learning approaches have recently achieved remarkable results across many NLP tasks. This Special Issue focuses on the use and exploration of current advances in machine learning and deep learning for NLP topics including (but not limited to) information extraction, information retrieval and text mining, text summarization, computational social science, discourse and dialog systems, interpretability, ethics in NLP, linguistic theories, and NLP for social good. This Special Issue provides a platform for researchers from academia and industry to present their novel and unpublished work in the domain of natural language processing and its applications, with a focus on applications of machine learning and deep learning in the broad spectrum of research areas that are concerned with computational approaches to natural language. This will help to foster future research in NLP and related fields.

Prof. Dr. Florentina Hristea
Prof. Dr. Cornelia Caragea
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Natural language processing
  • Machine learning
  • Deep learning

Published Papers (17 papers)

Editorial

5 pages, 185 KiB  
Editorial
Preface to the Special Issue “Natural Language Processing (NLP) and Machine Learning (ML)—Theory and Applications”
by Florentina Hristea and Cornelia Caragea
Mathematics 2022, 10(14), 2481; https://doi.org/10.3390/math10142481 - 16 Jul 2022
Viewed by 1175
Abstract
Natural language processing (NLP) is one of the most important technologies in use today, especially due to the large and growing amount of online text, which needs to be understood in order to fully ascertain its enormous value [...] Full article

Research

25 pages, 3539 KiB  
Article
Analytics Methods to Understand Information Retrieval Effectiveness—A Survey
by Josiane Mothe
Mathematics 2022, 10(12), 2135; https://doi.org/10.3390/math10122135 - 19 Jun 2022
Cited by 6 | Viewed by 2127 | Correction
Abstract
Information retrieval aims to retrieve the documents that answer users’ queries. A typical search process consists of different phases for which a variety of components have been defined in the literature; each one having a set of hyper-parameters to tune. Different studies focused on how and how much the components and their hyper-parameters affect the system performance in terms of effectiveness, others on the query factor. The aim of these studies is to better understand information retrieval system effectiveness. This paper reviews the literature of this domain. It depicts how data analytics has been used in IR to gain a better understanding of system effectiveness. This review concludes that we lack a full understanding of system effectiveness related to the context which the system is in, though it has been possible to adapt the query processing to some contexts successfully. This review also concludes that, even if it is possible to distinguish effective from non-effective systems for a query set, neither the system component analysis nor the query features analysis were successful in explaining when and why a particular system fails on a particular query. Full article

21 pages, 443 KiB  
Article
On the Use of Morpho-Syntactic Description Tags in Neural Machine Translation with Small and Large Training Corpora
by Gregor Donaj and Mirjam Sepesy Maučec
Mathematics 2022, 10(9), 1608; https://doi.org/10.3390/math10091608 - 09 May 2022
Cited by 3 | Viewed by 1617
Abstract
With the transition to neural architectures, machine translation achieves very good quality for several resource-rich languages. However, the results are still much worse for languages with complex morphology, especially if they are low-resource languages. This paper reports the results of a systematic analysis of adding morphological information into neural machine translation system training. Translation systems presented and compared in this research exploit morphological information from corpora in different formats. Some formats join semantic and grammatical information and others separate these two types of information. Semantic information is modeled using lemmas and grammatical information using Morpho-Syntactic Description (MSD) tags. Experiments were performed on corpora of different sizes for the English–Slovene language pair. The conclusions were drawn for a domain-specific translation system and for a translation system for the general domain. With MSD tags, we improved the performance by up to 1.40 and 1.68 BLEU points in the two translation directions. We found that systems with training corpora in different formats improve the performance differently depending on the translation direction and corpora size. Full article

14 pages, 490 KiB  
Article
Identifying Source-Language Dialects in Translation
by Sergiu Nisioi, Ana Sabina Uban and Liviu P. Dinu
Mathematics 2022, 10(9), 1431; https://doi.org/10.3390/math10091431 - 24 Apr 2022
Cited by 1 | Viewed by 1720
Abstract
In this paper, we aim to explore the degree to which translated texts preserve linguistic features of dialectal varieties. We release a dataset of augmented annotations to the Proceedings of the European Parliament that cover dialectal speaker information, and we analyze different classes of written English covering native varieties from the British Isles. Our analyses aim to discuss the discriminatory features between the different classes and to reveal words whose usage differs between varieties of the same language. We perform classification experiments and show that automatically distinguishing between the dialectal varieties is possible with high accuracy, even after translation, and propose a new explainability method based on embedding alignments in order to reveal specific differences between dialects at the level of the vocabulary. Full article

26 pages, 2077 KiB  
Article
Taylor-ChOA: Taylor-Chimp Optimized Random Multimodal Deep Learning-Based Sentiment Classification Model for Course Recommendation
by Santosh Kumar Banbhrani, Bo Xu, Hongfei Lin and Dileep Kumar Sajnani
Mathematics 2022, 10(9), 1354; https://doi.org/10.3390/math10091354 - 19 Apr 2022
Cited by 7 | Viewed by 1661
Abstract
Course recommendation is a key for achievement in a student’s academic path. However, it is challenging to appropriately select course content among numerous online education resources, due to the differences in users’ knowledge structures. Therefore, this paper develops a novel sentiment classification approach for recommending the courses using Taylor-chimp Optimization Algorithm enabled Random Multimodal Deep Learning (Taylor ChOA-based RMDL). Here, the proposed Taylor ChOA is newly devised by the combination of the Taylor concept and Chimp Optimization Algorithm (ChOA). Initially, course review is done to find the optimal course, and thereafter feature extraction is performed for extracting the various significant features needed for further processing. Finally, sentiment classification is done using RMDL, which is trained by the proposed optimization algorithm, named ChOA. Thus, the positively reviewed courses are obtained from the classified sentiments for improving the course recommendation procedure. Extensive experiments are conducted using the E-Khool dataset and Coursera course dataset. Empirical results demonstrate that Taylor ChOA-based RMDL model significantly outperforms state-of-the-art methods for course recommendation tasks. Full article

23 pages, 2348 KiB  
Article
Automatic Classification of National Health Service Feedback
by Christopher Haynes, Marco A. Palomino, Liz Stuart, David Viira, Frances Hannon, Gemma Crossingham and Kate Tantam
Mathematics 2022, 10(6), 983; https://doi.org/10.3390/math10060983 - 18 Mar 2022
Cited by 9 | Viewed by 2208
Abstract
Text datasets come in an abundance of shapes, sizes and styles. However, determining what factors limit classification accuracy remains a difficult task which is still the subject of intensive research. Using a challenging UK National Health Service (NHS) dataset, which contains many characteristics known to increase the complexity of classification, we propose an innovative classification pipeline. This pipeline switches between different text pre-processing, scoring and classification techniques during execution. Using this flexible pipeline, a high level of accuracy has been achieved in the classification of a range of datasets, attaining a micro-averaged F1 score of 93.30% on the Reuters-21578 “ApteMod” corpus. An evaluation of this flexible pipeline was carried out using a variety of complex datasets compared against an unsupervised clustering approach. The paper describes how classification accuracy is impacted by an unbalanced category distribution, the rare use of generic terms and the subjective nature of manual human classification. Full article

24 pages, 8510 KiB  
Article
Towards a Benchmarking System for Comparing Automatic Hate Speech Detection with an Intelligent Baseline Proposal
by Ștefan Dascălu and Florentina Hristea
Mathematics 2022, 10(6), 945; https://doi.org/10.3390/math10060945 - 16 Mar 2022
Cited by 4 | Viewed by 2243
Abstract
Hate Speech is a frequent problem occurring among Internet users. Recent regulations are being discussed by U.K. representatives (“Online Safety Bill”) and by the European Commission, which plans on introducing Hate Speech as an “EU crime”. The recent legislation having passed in order to combat this kind of speech places the burden of identification on the hosting websites and often within a tight time frame (24 h in France and Germany). These constraints make automatic Hate Speech detection a very important topic for major social media platforms. However, recent literature on Hate Speech detection lacks a benchmarking system that can evaluate how different approaches compare against each other regarding the prediction made concerning different types of text (short snippets such as those present on Twitter, as well as lengthier fragments). This paper intended to deal with this issue and to take a step forward towards the standardization of testing for this type of natural language processing (NLP) application. Furthermore, this paper explored different transformer and LSTM-based models in order to evaluate the performance of multi-task and transfer learning models used for Hate Speech detection. Some of the results obtained in this paper surpassed the existing ones. The paper concluded that transformer-based models have the best performance on all studied Datasets. Full article

14 pages, 268 KiB  
Article
Intermediate-Task Transfer Learning with BERT for Sarcasm Detection
by Edoardo Savini and Cornelia Caragea
Mathematics 2022, 10(5), 844; https://doi.org/10.3390/math10050844 - 07 Mar 2022
Cited by 31 | Viewed by 4684
Abstract
Sarcasm detection plays an important role in natural language processing as it can impact the performance of many applications, including sentiment analysis, opinion mining, and stance detection. Despite substantial progress on sarcasm detection, the research results are scattered across datasets and studies. In this paper, we survey the current state-of-the-art and present strong baselines for sarcasm detection based on BERT pre-trained language models. We further improve our BERT models by fine-tuning them on related intermediate tasks before fine-tuning them on our target task. Specifically, relying on the correlation between sarcasm and (implied negative) sentiment and emotions, we explore a transfer learning framework that uses sentiment classification and emotion detection as individual intermediate tasks to infuse knowledge into the target task of sarcasm detection. Experimental results on three datasets that have different characteristics show that the BERT-based models outperform many previous models. Full article

27 pages, 1761 KiB  
Article
Parallel Stylometric Document Embeddings with Deep Learning Based Language Models in Literary Authorship Attribution
by Mihailo Škorić, Ranka Stanković, Milica Ikonić Nešić, Joanna Byszuk and Maciej Eder
Mathematics 2022, 10(5), 838; https://doi.org/10.3390/math10050838 - 07 Mar 2022
Cited by 3 | Viewed by 3654
Abstract
This paper explores the effectiveness of parallel stylometric document embeddings in solving the authorship attribution task by testing a novel approach on literary texts in 7 different languages, totaling 7051 unique 10,000-token chunks from 700 PoS and lemma annotated documents. We used these documents to produce four document embedding models using Stylo R package (word-based, lemma-based, PoS-trigrams-based, and PoS-mask-based) and one document embedding model using mBERT for each of the seven languages. We created further derivations of these embeddings in the form of average, product, minimum, maximum, and l2 norm of these document embedding matrices and tested them both including and excluding the mBERT-based document embeddings for each language. Finally, we trained several perceptrons on the portions of the dataset in order to procure adequate weights for a weighted combination approach. We tested standalone (two baselines) and composite embeddings for classification accuracy, precision, recall, weighted-average, and macro-averaged F1-score, and compared them with one another. We found that for each language most of our composition methods outperform the baselines (with a couple of methods outperforming all baselines for all languages), with or without mBERT inputs, which are found to have no significant positive impact on the results of our methods. Full article

24 pages, 1634 KiB  
Article
Unsupervised and Supervised Methods to Estimate Temporal-Aware Contradictions in Online Course Reviews
by Ismail Badache, Adrian-Gabriel Chifu and Sébastien Fournier
Mathematics 2022, 10(5), 809; https://doi.org/10.3390/math10050809 - 03 Mar 2022
Cited by 1 | Viewed by 2247
Abstract
The analysis of user-generated content on the Internet has become increasingly popular for a wide variety of applications. One particular type of content is represented by the user reviews for programs, multimedia, products, and so on. Investigating the opinion contained in reviews may help in following the evolution of the reviewed items and thus in improving their quality. Detecting contradictory opinions in reviews is crucial when evaluating the quality of the respective resource. This article aims to estimate the contradiction intensity (strength) in the context of online courses (MOOC). This estimation was based on review ratings and on sentiment polarity in the comments, with respect to specific aspects, such as “lecturer”, “presentation”, etc. Between course sessions, users stop reviewing, and also, the course contents may evolve. Thus, the reviews are time dependent, and this is why they should be considered grouped by the course sessions. Having this in mind, the contribution of this paper is threefold: (a) defining the notion of subjective contradiction around specific aspects and then estimating its intensity based on sentiment polarity, review ratings, and temporality; (b) developing a dataset to evaluate the contradiction intensity measure, which was annotated based on a user study; (c) comparing our unsupervised method with supervised methods with automatic feature selection, over the dataset. The dataset collected from coursera.org is in English. It includes 2244 courses and 73,873 user-generated reviews of those courses. The results proved that the standard deviation of the ratings, the standard deviation of the polarities, and the number of reviews are suitable features for predicting the contradiction intensity classes. Among the supervised methods, the J48 decision trees algorithm yielded the best performance, compared to the naive Bayes model and the SVM model. Full article

9 pages, 471 KiB  
Article
Cross-Lingual Transfer Learning for Arabic Task-Oriented Dialogue Systems Using Multilingual Transformer Model mT5
by Ahlam Fuad and Maha Al-Yahya
Mathematics 2022, 10(5), 746; https://doi.org/10.3390/math10050746 - 26 Feb 2022
Cited by 5 | Viewed by 2355
Abstract
Due to the promising performance of pre-trained language models for task-oriented dialogue systems (DS) in English, some efforts to provide multilingual models for task-oriented DS in low-resource languages have emerged. These efforts still face a long-standing challenge due to the lack of high-quality data for these languages, especially Arabic. To circumvent the costly and time-intensive data collection and annotation, cross-lingual transfer learning can be used when few training data are available in the low-resource target language. Therefore, this study aims to explore the effectiveness of cross-lingual transfer learning in building an end-to-end Arabic task-oriented DS using the mT5 transformer model. We use the Arabic task-oriented dialogue dataset (Arabic-TOD) in the training and testing of the model. We present the cross-lingual transfer learning deployed with three different approaches: mSeq2Seq, Cross-lingual Pre-training (CPT), and Mixed-Language Pre-training (MLT). We obtain good results for our model compared to the literature for the Chinese language using the same settings. Furthermore, cross-lingual transfer learning deployed with the MLT approach outperforms the other two approaches. Finally, we show that our results can be improved by increasing the training dataset size. Full article

11 pages, 405 KiB  
Article
Improving Machine Reading Comprehension with Multi-Task Learning and Self-Training
by Jianquan Ouyang and Mengen Fu
Mathematics 2022, 10(3), 310; https://doi.org/10.3390/math10030310 - 19 Jan 2022
Cited by 2 | Viewed by 4082
Abstract
Machine Reading Comprehension (MRC) is an AI challenge that requires machines to determine the correct answer to a question based on a given passage, in which extractive MRC requires extracting an answer span to a question from a given passage, such as the task of span extraction. In contrast, non-extractive MRC infers answers from the content of reference passages, including Yes/No question answering and unanswerable questions. Due to the specificity of the two types of MRC tasks, researchers usually work on one type of task separately, but real-life application situations often require models that can handle many different types of tasks in parallel. Therefore, to meet the comprehensive requirements in such application situations, we construct a multi-task fusion training reading comprehension model based on the BERT pre-training model. The model uses the BERT pre-training model to obtain contextual representations, which are then shared by three downstream sub-modules for span extraction, Yes/No question answering, and unanswerable questions. Next, we fuse the outputs of the three sub-modules into a new span extraction output and use the fused cross-entropy loss function for global training. In the training phase, since our model requires a large amount of labeled training data, which is often expensive to obtain or unavailable in many tasks, we additionally use self-training to generate pseudo-labeled training data to train our model to improve its accuracy and generalization performance. We evaluated our model on the SQuAD2.0 and CAIL2019 datasets. The experiments show that our model can efficiently handle different tasks. We achieved 83.2 EM and 86.7 F1 scores on the SQuAD2.0 dataset and 73.0 EM and 85.3 F1 scores on the CAIL2019 dataset. Full article

11 pages, 363 KiB  
Article
Evaluating Research Trends from Journal Paper Metadata, Considering the Research Publication Latency
by Christian-Daniel Curiac, Ovidiu Banias and Mihai Micea
Mathematics 2022, 10(2), 233; https://doi.org/10.3390/math10020233 - 13 Jan 2022
Cited by 5 | Viewed by 1530
Abstract
Investigating the research trends within a scientific domain by analyzing semantic information extracted from scientific journals has been a topic of interest in the natural language processing (NLP) field. A research trend evaluation is generally based on the time evolution of the term occurrence or the term topic, but it neglects an important aspect—research publication latency. The average time lag between the research and its publication may vary from one month to more than one year, and it is a characteristic that may have significant impact when assessing research trends, mainly for rapidly evolving scientific areas. To cope with this problem, the present paper is the first work that explicitly considers research publication latency as a parameter in the trend evaluation process. Consequently, we provide a new trend detection methodology that mixes auto-ARIMA prediction with Mann–Kendall trend evaluations. The experimental results in an electronic design automation case study prove the viability of our approach. Full article

21 pages, 440 KiB  
Article
Identifying the Structure of CSCL Conversations Using String Kernels
by Mihai Masala, Stefan Ruseti, Traian Rebedea, Mihai Dascalu, Gabriel Gutu-Robu and Stefan Trausan-Matu
Mathematics 2021, 9(24), 3330; https://doi.org/10.3390/math9243330 - 20 Dec 2021
Cited by 2 | Viewed by 2303
Abstract
Computer-Supported Collaborative Learning tools are exhibiting an increased popularity in education, as they allow multiple participants to easily communicate, share knowledge, solve problems collaboratively, or seek advice. Nevertheless, multi-participant conversation logs are often hard to follow by teachers due to the mixture of multiple and many times concurrent discussion threads, with different interaction patterns between participants. Automated guidance can be provided with the help of Natural Language Processing techniques that target the identification of topic mixtures and of semantic links between utterances in order to adequately observe the debate and continuation of ideas. This paper introduces a method for discovering such semantic links embedded within chat conversations using string kernels, word embeddings, and neural networks. Our approach was validated on two datasets and obtained state-of-the-art results on both. Trained on a relatively small set of conversations, our models relying on string kernels are very effective for detecting such semantic links with a matching accuracy larger than 50% and represent a better alternative to complex deep neural networks, frequently employed in various Natural Language Processing tasks where large datasets are available. Full article

18 pages, 286 KiB  
Article
Definition Extraction from Generic and Mathematical Domains with Deep Ensemble Learning
by Natalia Vanetik and Marina Litvak
Mathematics 2021, 9(19), 2502; https://doi.org/10.3390/math9192502 - 06 Oct 2021
Cited by 1 | Viewed by 1766
Abstract
Definitions are extremely important for efficient learning of new materials. In particular, mathematical definitions are necessary for understanding mathematics-related areas. Automated extraction of definitions could be very useful for automated indexing educational materials, building taxonomies of relevant concepts, and more. For definitions that are contained within a single sentence, this problem can be viewed as a binary classification of sentences into definitions and non-definitions. In this paper, we focus on automatic detection of one-sentence definitions in mathematical and general texts. We experiment with different classification models arranged in an ensemble and applied to a sentence representation containing syntactic and semantic information, to classify sentences. Our ensemble model is applied to the data adjusted with oversampling. Our experiments demonstrate the superiority of our approach over state-of-the-art methods in both general and mathematical domains. Full article

11 pages, 444 KiB  
Article
To Batch or Not to Batch? Comparing Batching and Curriculum Learning Strategies across Tasks and Datasets
by Laura Burdick, Jonathan K. Kummerfeld and Rada Mihalcea
Mathematics 2021, 9(18), 2234; https://doi.org/10.3390/math9182234 - 11 Sep 2021
Cited by 2 | Viewed by 1737
Abstract
Many natural language processing architectures are greatly affected by seemingly small design decisions, such as batching and curriculum learning (how the training data are ordered during training). In order to better understand the impact of these decisions, we present a systematic analysis of different curriculum learning strategies and different batching strategies. We consider multiple datasets for three tasks: text classification, sentence and phrase similarity, and part-of-speech tagging. Our experiments demonstrate that certain curriculum learning and batching decisions do increase performance substantially for some tasks. Full article

Other

7 pages, 195 KiB  
Correction
Correction: Mothe, J. Analytics Methods to Understand Information Retrieval Effectiveness—A Survey. Mathematics 2022, 10, 2135
by Josiane Mothe
Mathematics 2022, 10(18), 3397; https://doi.org/10.3390/math10183397 - 19 Sep 2022
Viewed by 791
Abstract
The author wishes to make the following corrections to this paper [1]: In Abstract, (1) “It depicts how data analytics has been used in IR for a better understanding system effectiveness” should be “It depicts how data analytics has been used in IR to gain a better understanding of system effectiveness”; (2) “This review concludes lack of full understanding of system effectiveness according to the context although it has been possible to adapt the query processing to some contexts successfully” should be changed to “This review concludes that we lack a full understanding of system effectiveness related to the context which the system is in, though it has been possible to adapt the query processing to some contexts successfully”; (3) “This review also concludes that, even if it is possible to distinguish effective from non effective system on average on a query set” should be changed to “This review also concludes that, even if it is possible to distinguish effective from non-effective systems for a query set” [...] Full article