Advances of Applied Probability and Statistics

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: 31 May 2024 | Viewed by 3612

Special Issue Editors


Dr. Massimiliano Giacalone
Guest Editor
Department of Economics, University of Campania “Luigi Vanvitelli”, 80143 Capua, Italy
Interests: probability; big data; inference; social science; finance

Prof. Dr. Karina Gibert
Guest Editor
Research Group on Knowledge Engineering and Machine Learning at Intelligent Data Science and Artificial Intelligence Research Center, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
Interests: big data; statistics; artificial intelligence; machine learning

Special Issue Information

Dear Colleagues,

In recent years, statistics and probability have advanced significantly, with new techniques and methodologies developed to address a variety of challenges in data analysis, especially since the advent of the big data era. The sheer volume and complexity of the data generated by modern technologies have presented new challenges for data analysis and have sparked a wave of innovation in both fields. This Special Issue aims to showcase the latest research in this rapidly evolving area, highlighting both theoretical and practical advances.

The issue features a collection of articles written by experts in the field, covering a range of topics such as Bayesian inference, machine learning, causal inference, dimensionality reduction, and much more. Applications of new techniques to interesting real-world problems will also be considered. The articles should address the unique challenges posed by big data and demonstrate the impact that these methods are having on fields such as biology, finance, and social sciences.

This Special Issue also provides a forum for researchers to share their perspectives on the current state of the field and to discuss the future direction of statistics and probability in the big data era. The articles highlight the important role that these disciplines play in making sense of the vast amounts of data generated by modern technologies and provide insights into the opportunities and challenges that lie ahead.

Overall, this Special Issue of the journal provides an invaluable resource for researchers, students, and practitioners interested in staying up to date with the latest advances in the field of statistics and probability in the context of the big data era.

Dr. Massimiliano Giacalone
Prof. Dr. Karina Gibert
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • statistics
  • probability
  • machine learning
  • artificial intelligence
  • social sciences
  • biology
  • finance

Published Papers (4 papers)

Research

18 pages, 841 KiB  
Article
Predicting Intensive Care Unit Patients’ Discharge Date with a Hybrid Machine Learning Model That Combines Length of Stay and Days to Discharge
by David Cuadrado, Aida Valls and David Riaño
Mathematics 2023, 11(23), 4773; https://doi.org/10.3390/math11234773 - 26 Nov 2023
Cited by 2 | Viewed by 757
Abstract
Background: Accurate planning of the duration of stays at intensive care units is of utmost importance for resource planning. Currently, the discharge date used for resource management is calculated only at admission time and is called length of stay. However, the evolution of the treatment may be different from one patient to another, so a recalculation of the date of discharge should be performed, called days to discharge. The prediction of days to discharge during the stay at the ICU with statistical and data analysis methods has been little studied, with low-quality results. This study aims to improve the prediction of the discharge date for any patient in intensive care units using artificial intelligence techniques. Methods: The paper proposes a hybrid method based on group-conditioned models obtained with machine learning techniques. Patients are grouped into three clusters based on an initial length of stay estimation. For each group (grouped by first days of stay), we calculate the group-conditioned length of stay value to know the predicted date of discharge; then, after a given number of days, another group-conditioned prediction model must be used to calculate the days to discharge in order to obtain a more accurate prediction of the number of remaining days. The study is performed with the eICU database, a public dataset of USA patients admitted to intensive care units between 2014 and 2015. Three machine learning methods (i.e., Random Forest, XGBoost, and lightGBM) are used to generate length of stay and days to discharge predictive models for each group. Results: Random Forest is the algorithm that obtains the best days to discharge predictors. The proposed hybrid method achieves a root mean square error (RMSE) and mean absolute error (MAE) below one day on the eICU dataset for the last six days of stay. Conclusions: Machine learning models improve the quality of predictions for the days to discharge and length of stay for intensive care unit patients.
The results demonstrate that the hybrid model, based on Random Forest, improves the accuracy for predicting length of stay at the start and days to discharge at the end of the intensive care unit stay. Implementing these prediction models may help in the accurate estimation of bed occupancy at intensive care units, thus improving the planning for these limited and critical health-care resources.
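The abstract above reports its accuracy as RMSE and MAE measured in days. As a minimal sketch of how these two error measures are computed (this is not the authors' code, and the function names and the example numbers are hypothetical, chosen only for illustration):

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical example: true vs. predicted days to discharge for one patient
# over the last six days of a stay (made-up numbers, not from the paper).
actual_days = [6, 5, 4, 3, 2, 1]
predicted_days = [5.5, 5.5, 3.0, 3.5, 2.0, 1.5]

if __name__ == "__main__":
    print("RMSE (days):", rmse(actual_days, predicted_days))
    print("MAE (days):", mae(actual_days, predicted_days))
```

Both measures are expressed in the unit of the target (here, days). RMSE penalizes large individual errors more heavily than MAE, so RMSE is never smaller than MAE on the same data.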
(This article belongs to the Special Issue Advances of Applied Probability and Statistics)

12 pages, 289 KiB  
Article
On the Height of One-Dimensional Random Walk
by Mohamed Abdelkader
Mathematics 2023, 11(21), 4513; https://doi.org/10.3390/math11214513 - 01 Nov 2023
Viewed by 797
Abstract
Consider the one-dimensional random walk X_n: as it evolves (at each unit of time), it either increases by one with probability p or resets to 0 with probability 1 − p. In the present paper, we analyze the law of the height statistics H_n, corresponding to our model X_n. Also, we prove that the limiting distribution of the walk X_n is a shifted geometric distribution with parameter 1 − p and find the closed forms of the mean and the variance of X_n using the probability-generating function.
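The model described in this abstract is simple enough to simulate. The sketch below is not taken from the paper and rests on several assumptions: the walk starts at 0, the height is taken to be the running maximum of the walk (the abstract does not define it), and the function names are hypothetical. Under the stated dynamics, the stationary probabilities satisfy π_k = (1 − p) p^k for k = 0, 1, 2, ..., which gives a mean of p / (1 − p) and a variance of p / (1 − p)^2; the simulation compares long-run sample moments against these values.

```python
import random


def simulate_reset_walk(n_steps, p, seed=None):
    """Simulate the walk: +1 with probability p, reset to 0 otherwise.

    Returns the visited states and the running maximum, which this sketch
    uses as a stand-in for the height (an assumption, see the lead-in).
    """
    rng = random.Random(seed)
    x = 0  # assumed starting point
    height = 0
    path = []
    for _ in range(n_steps):
        x = x + 1 if rng.random() < p else 0
        height = max(height, x)
        path.append(x)
    return path, height


def sample_mean_var(samples):
    """Sample mean and (population) variance of a list of numbers."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    return mean, var


def limiting_mean_var(p):
    """Moments of the law pi_k = (1 - p) * p**k on k = 0, 1, 2, ..."""
    return p / (1 - p), p / (1 - p) ** 2


if __name__ == "__main__":
    path, height = simulate_reset_walk(200_000, p=0.5, seed=0)
    print("sample mean/variance:", sample_mean_var(path))
    print("limiting mean/variance:", limiting_mean_var(0.5))
    print("running maximum:", height)
```

Time averages along a single long path are used here instead of many independent runs; this works because the walk returns to 0 repeatedly, so the long-run fraction of time spent in each state approaches the limiting distribution.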
27 pages, 700 KiB  
Article
A Unified Formal Framework for Factorial and Probabilistic Topic Modelling
by Karina Gibert and Yaroslav Hernandez-Potiomkin
Mathematics 2023, 11(20), 4375; https://doi.org/10.3390/math11204375 - 21 Oct 2023
Viewed by 686
Abstract
Topic modelling has become a highly popular technique for extracting knowledge from texts. It encompasses various method families, including Factorial methods, Probabilistic methods, and Natural Language Processing methods. This paper introduces a unified conceptual framework for Factorial and Probabilistic methods by identifying shared elements and representing them using a homogeneous notation. The paper presents 12 different methods within this framework, enabling easy comparative analysis to assess the flexibility of each approach and how realistic its assumptions are. This establishes the initial stage of a broader analysis aimed at relating all method families to this common framework, comprehensively understanding their strengths and weaknesses, and establishing general application guidelines. Also, an experimental setup reinforces the convenience of having a harmonized notational schema. The paper concludes with a discussion of the presented methods and outlines future research directions.

33 pages, 6513 KiB  
Article
Variable Selection for Meaningful Clustering of Multitopic Territorial Data
by Xavier Angerri and Karina Gibert
Mathematics 2023, 11(13), 2863; https://doi.org/10.3390/math11132863 - 26 Jun 2023
Viewed by 817
Abstract
This paper proposes a new methodology to improve territorial cohesion in clustering processes where many variables from different topics are considered. Clustering techniques provide added value to identify typologies, but there are still unsolved challenges when data contain an unbalanced number of variables from different topics. The territorial feature selection method (TFSM) is presented as a method to select the representative variable of each topic such that the interpretability of resulting clusters is preserved and the geographical cohesion is improved with respect to classical approaches. This paper also introduces the thermometer as a new knowledge acquisition tool that allows experts to transfer semantics to the data mining process. TFSM proposes the index of potential explainability (E_k) as the criterion to select the most promising variables for clustering. E_k is based on the combination of inferential testing and metrics such as support. The proposal is applied to the INSESS-COVID19 database, where territorial groups of vulnerable populations were found. A set of 195 variables in 21 unbalanced thematic blocks is used to compare the results with a traditional multiview clustering analysis; the results are promising from both the geographical and the thematic points of view, as is the capacity to support further decision making.