Machine/Statistical Learning and Modeling with Potential Applications in Entropy, Information Theory, and Artificial Intelligence

A special issue of Entropy (ISSN 1099-4300). This special issue belongs to the section "Statistical Physics".

Deadline for manuscript submissions: closed (30 April 2021) | Viewed by 29224

Special Issue Editor


Prof. Dr. Victor Leiva
Guest Editor
School of Industrial Engineering, Pontificia Universidad Católica de Valparaíso, Avenida Brasil 2241, Valparaíso 2362807, Chile
Interests: advanced applied multivariate analysis; artificial intelligence, deep learning, and machine learning; big data, business intelligence, data mining, and data science; statistical learning and modeling

Special Issue Information

Dear Colleagues,

Today, regression is a supervised technique widely used in data science, data mining, machine learning, and statistical learning. Although the focus of this Special Issue is machine/statistical learning and modeling, we welcome contributions in artificial intelligence, classification, and unsupervised learning, as well as in the topics detailed below. We strongly encourage interdisciplinary works with real data.

This Special Issue welcomes submissions in, but not limited to, the following areas:

(i) Machine learning and clustering;
(ii) Artificial intelligence;
(iii) Big data, high dimensionality, and large-scale data analysis in supervised learning;
(iv) Multivariate analysis with emphasis on dimensionality reduction, such as PCA, PLS, and others;
(v) Genetic algorithms, particle swarm optimization, and others, for supervised learning;
(vi) Applications of supervised learning and data science in entropy and information theory;
(vii) Bayesian methods;
(viii) Global and local influence diagnostics in supervised learning.

Prof. Dr. Victor Leiva
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Entropy is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Artificial intelligence
  • Bayesian methods
  • Big data, data mining, and data science
  • Clustering
  • Entropy and information theory
  • Global and local diagnostics
  • Machine learning
  • PLS regression and PCA regression
  • Statistical learning and modeling

Published Papers (7 papers)


Research

17 pages, 4586 KiB  
Article
Silhouette Analysis for Performance Evaluation in Machine Learning with Applications to Clustering
by Meshal Shutaywi and Nezamoddin N. Kachouie
Entropy 2021, 23(6), 759; https://doi.org/10.3390/e23060759 - 16 Jun 2021
Cited by 76 | Viewed by 8273
Abstract
Grouping objects based on their similarities is an important and common task in machine learning applications. Many clustering methods have been developed; among them, k-means-based methods have been broadly used, and several extensions, such as k-means++ and kernel k-means, have been proposed to improve the original k-means clustering method. K-means is a linear clustering method; that is, it divides the objects into linearly separable groups, while kernel k-means is a non-linear technique. Kernel k-means projects the elements to a higher-dimensional feature space using a kernel function and then groups them. Different kernel functions may not perform similarly in clustering a given data set and, in turn, choosing the right kernel for an application can be challenging. In our previous work, we introduced a weighted majority voting method for clustering based on normalized mutual information (NMI). NMI is a supervised measure, since the true labels of a training set are required to calculate it. In this study, we extend our previous work on aggregating clustering results to develop an unsupervised weighting function for settings where a training set is not available. The proposed weighting function is based on the Silhouette index, an unsupervised criterion, so no training set is required to compute it. This makes the new method more consistent with the unsupervised nature of clustering.
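As a rough illustration of the weighting idea described above, the sketch below scores several candidate clusterings of synthetic data with the Silhouette index and normalizes the scores into weights. It is not the authors' code: scikit-learn's SpectralClustering stands in for kernel k-means, and the data set and kernel choices are assumptions.

```python
# A minimal sketch (not the paper's implementation) of Silhouette-based weighting
# of clustering candidates; kernel choices and synthetic data are illustrative.
from sklearn.cluster import KMeans, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# Candidate clusterings: plain k-means and two "kernelized" alternatives.
clusterings = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    "rbf": SpectralClustering(n_clusters=3, affinity="rbf",
                              random_state=0).fit_predict(X),
    "knn": SpectralClustering(n_clusters=3, affinity="nearest_neighbors",
                              random_state=0).fit_predict(X),
}

# Unsupervised weights: Silhouette index of each candidate (no true labels needed).
scores = {name: silhouette_score(X, labels) for name, labels in clusterings.items()}
total = sum(scores.values())
weights = {name: s / total for name, s in scores.items()}
print(weights)  # higher weight -> candidate contributes more to the consensus
```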

17 pages, 1401 KiB  
Article
A New Two-Stage Algorithm for Solving Optimization Problems
by Sajjad Amiri Doumari, Hadi Givi, Mohammad Dehghani, Zeinab Montazeri, Victor Leiva and Josep M. Guerrero
Entropy 2021, 23(4), 491; https://doi.org/10.3390/e23040491 - 20 Apr 2021
Cited by 31 | Viewed by 2958
Abstract
Optimization seeks to find the inputs of an objective function that result in its maximum or minimum. Optimization methods are divided into exact and approximate (algorithmic) ones. Several optimization algorithms imitate natural phenomena, laws of physics, and the behavior of living organisms. Optimization based on such algorithms underlies machine learning, from logistic regression to training neural networks for artificial intelligence. In this paper, a new algorithm called two-stage optimization (TSO) is proposed. The TSO algorithm updates the population members in two steps at each iteration. For this purpose, a group of good population members is selected, and then two members of this group are randomly used to update the position of each population member: the update is based on the first selected good member at the first stage and on the second selected good member at the second stage. We describe the stages of the TSO algorithm and model them mathematically. The performance of the TSO algorithm is evaluated on twenty-three standard objective functions. To compare the optimization results of the TSO algorithm, eight competing algorithms are considered, including genetic, gravitational search, grey wolf, marine predators, particle swarm, teaching-learning-based, tunicate swarm, and whale approaches. The numerical results show that the new algorithm is superior and more competitive in solving optimization problems when compared with the other algorithms.
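The sketch below conveys the flavor of a two-stage, population-based update on a standard test function (the sphere). The move rule, parameters, and greedy acceptance step are illustrative assumptions, not the TSO equations given in the paper.

```python
# Schematic sketch of a two-stage population update in the spirit of TSO;
# the update rule below is a generic illustration, not the paper's formulas.
import numpy as np

def sphere(x):                       # one of the standard benchmark functions
    return np.sum(x**2)

rng = np.random.default_rng(0)
dim, pop_size, n_good, iters = 5, 30, 5, 200
pop = rng.uniform(-10, 10, size=(pop_size, dim))

for _ in range(iters):
    fitness = np.array([sphere(x) for x in pop])
    good = pop[np.argsort(fitness)[:n_good]]          # group of good members
    for i in range(pop_size):
        g1, g2 = good[rng.choice(n_good, 2, replace=False)]
        # Stage 1: move toward the first selected good member (keep if better).
        candidate = pop[i] + rng.random(dim) * (g1 - pop[i])
        if sphere(candidate) < sphere(pop[i]):
            pop[i] = candidate
        # Stage 2: move toward the second selected good member (keep if better).
        candidate = pop[i] + rng.random(dim) * (g2 - pop[i])
        if sphere(candidate) < sphere(pop[i]):
            pop[i] = candidate

print(min(sphere(x) for x in pop))   # best objective value found
```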

23 pages, 602 KiB  
Article
Knowledge Discovery for Higher Education Student Retention Based on Data Mining: Machine Learning Algorithms and Case Study in Chile
by Carlos A. Palacios, José A. Reyes-Suárez, Lorena A. Bearzotti, Víctor Leiva and Carolina Marchant
Entropy 2021, 23(4), 485; https://doi.org/10.3390/e23040485 - 20 Apr 2021
Cited by 50 | Viewed by 6993
Abstract
Data mining is employed to extract useful information and to detect patterns from often large data sets, and is closely related to knowledge discovery in databases and data science. In this investigation, we formulate models based on machine learning algorithms to extract relevant information for predicting student retention at various levels, using higher-education data and specifying the relevant variables involved in the modeling. Then, we utilize this information to support the process of knowledge discovery. We predict student retention at each of three levels, during the first, second, and third years of study, obtaining models with an accuracy exceeding 80% in all scenarios. These models allow us to adequately predict the level at which dropout occurs. The machine learning algorithms used in this work are decision trees, k-nearest neighbors, logistic regression, naive Bayes, random forest, and support vector machines, of which the random forest technique performs the best. We detect that the secondary-education score and the community poverty index are important predictive variables, which had not previously been reported in educational studies of this type. The dropout assessment at various levels reported here is valid for higher-education institutions around the world with conditions similar to the Chilean case, where dropout rates affect the efficiency of such institutions. Having the ability to predict dropout from students' data enables these institutions to take preventive measures, avoiding dropouts. In the case study, balancing the majority and minority classes improves the performance of the algorithms.
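A minimal sketch of the best-performing setup described above, a random forest with class balancing, on synthetic data; the study's higher-education data and explicit balancing procedure are not reproduced, and the feature importances only stand in for the variable-relevance analysis.

```python
# Sketch of a class-balanced random forest for an imbalanced "retained vs. dropped
# out" target; synthetic data replaces the study's institutional data set.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=12, weights=[0.8, 0.2],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" reweights classes; the study balances classes explicitly.
clf = RandomForestClassifier(n_estimators=300, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))
print(clf.feature_importances_)   # importances flag the most predictive variables
```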

17 pages, 413 KiB  
Article
Change Point Test for the Conditional Mean of Time Series of Counts Based on Support Vector Regression
by Sangyeol Lee and Sangjo Lee
Entropy 2021, 23(4), 433; https://doi.org/10.3390/e23040433 - 07 Apr 2021
Cited by 2 | Viewed by 1515
Abstract
This study considers support vector regression (SVR) and twin SVR (TSVR) for time series of counts, wherein the hyperparameters are tuned using the particle swarm optimization (PSO) method. For prediction, we employ the framework of integer-valued generalized autoregressive conditional heteroskedasticity (INGARCH) models. As an application, we consider change point problems, using the cumulative sum (CUSUM) test based on the residuals obtained from the PSO-SVR and PSO-TSVR methods. We conduct Monte Carlo simulation experiments to illustrate the validity of the methods with various linear and nonlinear INGARCH models. Subsequently, a real data analysis, with the return times of extreme events constructed from the daily log-returns of Goldman Sachs stock prices, is conducted to exhibit the scope of application.
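To make the residual-based CUSUM idea concrete, the toy sketch below fits a plain SVR to lagged counts with a simulated mean shift and computes a CUSUM statistic from the residuals. The PSO tuning, the TSVR variant, and the INGARCH framework of the paper are not reproduced; the one-lag fit is only a crude conditional-mean proxy.

```python
# Toy residual CUSUM for a count series with a mean shift; not the paper's test.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
y = rng.poisson(lam=np.r_[np.full(150, 3.0), np.full(150, 6.0)])  # shift at t = 150

# One-step-ahead style fit: predict y_t from y_{t-1} as a rough conditional mean.
X, target = y[:-1].reshape(-1, 1), y[1:]
resid = target - SVR(C=10.0, gamma="scale").fit(X, target).predict(X)

# CUSUM of centered residuals; a large maximum suggests a change point.
s = np.cumsum(resid - resid.mean()) / (resid.std() * np.sqrt(len(resid)))
print(np.abs(s).max(), np.argmax(np.abs(s)))  # statistic and estimated change location
```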

15 pages, 545 KiB  
Article
Co-Training for Visual Object Recognition Based on Self-Supervised Models Using a Cross-Entropy Regularization
by Gabriel Díaz, Billy Peralta, Luis Caro and Orietta Nicolis
Entropy 2021, 23(4), 423; https://doi.org/10.3390/e23040423 - 01 Apr 2021
Cited by 7 | Viewed by 2213
Abstract
Automatic recognition of visual objects using a deep learning approach has been successfully applied to multiple areas. However, deep learning techniques require a large amount of labeled data, which is usually expensive to obtain. An alternative is to use semi-supervised models, such as co-training, where multiple complementary views are combined using a small amount of labeled data. A simple way to associate views with visual objects is through the application of a degree of rotation or a type of filter. In this work, we propose a co-training model for visual object recognition using deep neural networks, adding layers of self-supervised neural networks as intermediate inputs to the views, where the views are diversified through cross-entropy regularization of their outputs. Since the model merges the concepts of co-training and self-supervised learning by considering the differentiation of outputs, we call it Differential Self-Supervised Co-Training (DSSCo-Training). This paper presents experiments applying the DSSCo-Training model to well-known image datasets, such as MNIST, CIFAR-100, and SVHN. The results indicate that the proposed model is competitive with state-of-the-art models and shows an average relative improvement of 5% in accuracy across several datasets, despite its greater simplicity with respect to more recent approaches.
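A toy sketch of the kind of cross-entropy term between two views' outputs that the abstract refers to. The network architecture, the self-supervised layers, and the exact sign and weighting of the regularizer are assumptions here, not the DSSCo-Training loss.

```python
# Toy cross-entropy term between two view heads; sign/weighting is illustrative only.
import torch
import torch.nn.functional as F

logits_view1 = torch.randn(8, 10, requires_grad=True)  # stand-ins for two view heads
logits_view2 = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))

# Standard supervised loss on both views.
supervised = F.cross_entropy(logits_view1, labels) + F.cross_entropy(logits_view2, labels)

# Cross-entropy between the two output distributions: H(p1, p2) = -sum p1 * log p2.
p1 = F.softmax(logits_view1, dim=1)
log_p2 = F.log_softmax(logits_view2, dim=1)
between_views = -(p1 * log_p2).sum(dim=1).mean()

# Illustrative combination: subtracting the term pushes the views apart (diversity).
loss = supervised - 0.1 * between_views
loss.backward()
```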

16 pages, 1031 KiB  
Article
Research on the Prediction of A-Share “High Stock Dividend” Phenomenon—A Feature Adaptive Improved Multi-Layers Ensemble Model
by Yi Fu, Bingwen Li, Jinshi Zhao and Qianwen Bi
Entropy 2021, 23(4), 416; https://doi.org/10.3390/e23040416 - 31 Mar 2021
Cited by 2 | Viewed by 1637
Abstract
Since the “high stock dividend” of A-share companies in China often leads to short-term stock price increases, the prediction of this phenomenon has received wide attention from academia and industry. In this study, a new multi-layer stacking ensemble algorithm is proposed. Unlike the classic stacking ensemble algorithm, which focuses on the differentiation of base models, this paper uses an equal-weight comprehensive feature evaluation method to select features before fitting the base models and uses a genetic algorithm to match the optimal feature subset to each base model. After the base models produce their predictions, the LightGBM (LGB) model is added to the algorithm as a secondary information-extraction layer. Finally, the algorithm feeds the extracted information into a logistic regression (LR) model to complete the prediction of the “high stock dividend” phenomenon. Using A-share market data from 2010 to 2019 for simulation and evaluation, the proposed model improves the AUC (area under the curve) and F1 score by 0.173 and 0.303, respectively, compared to the baseline model. The prediction results shed light on event-driven investment strategies.
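A hedged sketch of the layered idea described above: base models feed a LightGBM information-extraction layer, whose output goes to a final logistic regression. The per-model genetic feature selection and the equal-weight feature evaluation of the paper are omitted, synthetic data replaces the A-share data, and the lightgbm and scikit-learn packages are assumed installed.

```python
# Sketch of a three-layer stack: base models -> LightGBM -> logistic regression.
import numpy as np
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Layer 1: base models produce out-of-fold probability features.
base = [RandomForestClassifier(random_state=0), SVC(probability=True, random_state=0)]
meta_tr = np.column_stack([cross_val_predict(m, X_tr, y_tr, method="predict_proba")[:, 1]
                           for m in base])
meta_te = np.column_stack([m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1] for m in base])

# Layer 2: LightGBM extracts higher-order information from the base outputs.
lgb = LGBMClassifier(random_state=0).fit(meta_tr, y_tr)

# Layer 3: logistic regression makes the final call from LightGBM's output.
lr = LogisticRegression().fit(lgb.predict_proba(meta_tr)[:, 1].reshape(-1, 1), y_tr)
print(lr.score(lgb.predict_proba(meta_te)[:, 1].reshape(-1, 1), y_te))
```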

12 pages, 938 KiB  
Article
Breakpoint Analysis for the COVID-19 Pandemic and Its Effect on the Stock Markets
by Karime Chahuán-Jiménez, Rolando Rubilar, Hanns de la Fuente-Mella and Víctor Leiva
Entropy 2021, 23(1), 100; https://doi.org/10.3390/e23010100 - 12 Jan 2021
Cited by 30 | Viewed by 4198
Abstract
In this research, statistical models are formulated to study the effect of the health crisis arising from COVID-19 on global markets. Breakpoints in the price series of stock indexes are considered. Such indexes are used as an approximation of the stock markets in different countries, taking into account that they are indicative of these markets because of their composition. The main results of this investigation highlight that countries with better institutional and economic conditions are less affected by the pandemic. In addition, the health index included in the models is associated with non-significant parameters, because the health index used in the modeling does not capture the different capacities of the countries analyzed to respond efficiently to the pandemic. Therefore, contagion is the preponderant factor when analyzing the structural breakdown that occurred in the world economy.
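As one possible illustration of breakpoint detection in a return series, the sketch below applies the ruptures package to synthetic returns with a simulated crisis-like shift. The paper's econometric breakpoint tests and actual index data are not reproduced here.

```python
# Illustrative breakpoint detection on synthetic index returns using ruptures;
# the paper's structural-break methodology differs from this toy example.
import numpy as np
import ruptures as rpt

rng = np.random.default_rng(0)
# Synthetic daily returns with a mean/volatility shift mimicking a crisis onset.
returns = np.r_[rng.normal(0.0005, 0.01, 250), rng.normal(-0.002, 0.03, 100)]

algo = rpt.Binseg(model="l2").fit(returns)
print(algo.predict(n_bkps=1))  # estimated breakpoint index (plus the series end)
```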
