# Use of Machine Learning to Predict the Glycemic Status of Patients with Diabetes


## Abstract


## 1. Introduction

## 2. Machine Learning and Predictions

- Artificial Neural Network (ANN) with a payoff equal to 7;
- Polynomial Regression with a payoff equal to 7;
- Gradient Boosted Trees Regression with a payoff equal to 14;
- Random Forest Regression with a payoff equal to 16;
- Simple Regression Tree with a payoff equal to 18;
- Tree Ensemble Regression with a payoff equal to 23;
- Linear Regression with a payoff equal to 27;
- Probabilistic Neural Network (PNN) with a payoff equal to 32.

- Probabilistic Neural Network (PNN) with a payoff equal to 6;
- Simple Regression Tree with a payoff equal to 13;
- Gradient Boosted Trees Regression and Random Forest Regression with a payoff equal to 19;
- Linear Regression with a payoff equal to 27;
- Tree Ensemble Regression and Artificial Neural Network (ANN) with a payoff equal to 28;
- Polynomial Regression with a payoff equal to 40.
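This extract does not define how the payoff is computed. The listed values behave like a rank-based aggregate: each algorithm is ranked on every evaluation metric (rank 1 = best) and the ranks are summed, so a lower payoff indicates better overall performance. The sketch below is a hypothetical reading along those lines, not the authors' documented procedure; the function name and the example data are purely illustrative.

```python
def rank_sum_payoff(scores, higher_is_better):
    """Hypothetical payoff: sum of per-metric ranks (1 = best).

    scores: {algorithm: [metric values]}; higher_is_better: one flag
    per metric. Lower totals indicate better overall performance.
    Ties are broken by sort order for simplicity.
    """
    totals = {name: 0 for name in scores}
    for j, higher in enumerate(higher_is_better):
        ordered = sorted(scores, key=lambda name: scores[name][j],
                         reverse=higher)
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank
    return totals

# Illustrative data: R-squared (higher is better) and RMSE (lower is better).
example = {
    "PNN": [0.97, 0.011],
    "Simple Regression Tree": [0.91, 0.019],
    "Linear Regression": [0.05, 0.058],
}
payoffs = rank_sum_payoff(example, higher_is_better=[True, False])
# The best algorithm on both metrics receives the lowest total.
```

Under this reading, an algorithm that ranks first on every metric would receive a payoff equal to the number of metrics, and one that ranks last on all five metrics would receive 40, matching the extremes of the second list above.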

- Data: a single KNIME node that reads the input data supplied in Excel format;
- Preprocessing: a group of three KNIME nodes. The first, “Column Filter”, selects the columns of interest for the prediction task. The second, “Normalizer”, compresses the data into the range from 0 to 1. The third, “Partitioning”, splits the data into two groups: 70% is used to train the neural network, while the remaining 30% is used for the actual prediction;
- Machine Learning and predictions: the central part of the data-analysis process, consisting of two nodes. The first, “RProp MLP Learner”, trains the neural network; its hyperparameters can be modified according to analytical needs and, in the case analyzed, were left at their default values. The second, “Multilayer Perceptron Predictor”, produces the actual predictions;
- Score: the final phase consists of the “Numeric Scorer” node, which evaluates the predictive efficiency of the neural network through both the R-squared and the statistical errors.
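The four-stage KNIME workflow above can be mirrored in plain Python to make the data flow concrete. The sketch below is only an illustration under stated assumptions: an ordinary least-squares line stands in for the “RProp MLP Learner”, and every function name is invented for this example.

```python
import random

def minmax_normalize(values):
    """'Normalizer' stage: compress values into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def partition(rows, train_fraction=0.7, seed=0):
    """'Partitioning' stage: 70% for training, 30% for prediction."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def fit_line(pairs):
    """Stand-in learner (least squares on one feature); the actual
    workflow uses KNIME's 'RProp MLP Learner' at this stage."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    slope = (sum((x - mx) * (y - my) for x, y in pairs)
             / sum((x - mx) ** 2 for x, _ in pairs))
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def numeric_score(actual, predicted):
    """'Numeric Scorer' stage: R-squared plus error statistics."""
    n = len(actual)
    mean_a = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return {"R2": 1 - ss_res / ss_tot,
            "MAE": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
            "RMSE": (ss_res / n) ** 0.5}
```

On a noiseless linear dataset this scorer reports an R-squared of 1; with real glycemic readings, the stand-in learner would be replaced by the multilayer perceptron.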

## 3. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References


**Table 1.** Synthesis of the main results of machine learning algorithms for the prediction of the glycemic status of Patient A.

| Algorithm | Mean Absolute Error (MAE) | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Signed Difference (MSD) |
|---|---|---|---|---|
| Artificial Neural Network (ANN) | 0.199918 | 0.057661 | 0.240128 | 0.054425 |
| Probabilistic Neural Network (PNN) | 0.275533 | 0.105013 | 0.324056 | 0.164210 |
| Gradient Boosted Trees Regression | 0.238739 | 0.077760 | 0.278855 | 0.024218 |
| Simple Regression Tree | 0.238739 | 0.077760 | 0.278855 | 0.024218 |
| Random Forest Regression | 0.235623 | 0.076095 | 0.275853 | 0.073448 |
| Tree Ensemble Regression | 0.241468 | 0.081222 | 0.284995 | 0.057196 |
| Linear Regression | 0.242458 | 0.079036 | 0.281134 | 0.058953 |
| Polynomial Regression | 0.211184 | 0.067966 | 0.260702 | 0.009944 |
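The error statistics reported in the table can be reproduced from vectors of actual and predicted glycemia values. A minimal stdlib sketch follows; note that the sign convention for the Mean Signed Difference is assumed here to be predicted minus actual, since this extract does not state it.

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean Squared Error: average of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(actual, predicted))

def msd(actual, predicted):
    """Mean Signed Difference (assumed predicted - actual): a value near
    zero means over- and under-predictions roughly cancel out."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)
```

These definitions satisfy RMSE = √MSE, which the table values also do (e.g., √0.057661 ≈ 0.240128 for the ANN).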

**Table 2.** Synthesis of the main results of machine learning algorithms for the prediction of the glycemic status of Patient B.

| Algorithm | R^{2} | Mean Absolute Error (MAE) | Mean Squared Error (MSE) | Root Mean Squared Error (RMSE) | Mean Signed Difference (MSD) |
|---|---|---|---|---|---|
| Artificial Neural Network (ANN) | 0.03793089 | 0.02464794 | 0.00382001 | 0.06180622 | 0.001 |
| Probabilistic Neural Network (PNN) | 0.96996654 | 0.00368226 | 0 | 0.01103748 | 0.001 |
| Gradient Boosted Trees Regression | 0.71567727 | 0.02176984 | 0.00127028 | 0.03564093 | 0.01547137 |
| Simple Regression Tree | 0.91064585 | 0.01378341 | 0 | 0.1895266 | 0.01228716 |
| Random Forest Regression | 0.33102591 | 0.02416602 | 0.00273247 | 0.05227308 | 0.00120760 |
| Tree Ensemble Regression | 0.29050064 | 0.03015130 | 0.00258986 | 0.05089066 | 0.01356692 |
| Linear Regression | 0.04864026 | 0.02449140 | 0.00338018 | 0.05813930 | 0.00309530 |
| Polynomial Regression | 0.01977070 | 0.03305683 | 0.00417221 | 0.06459261 | 0.01651951 |

| Parameter | Patient A | Patient B |
|---|---|---|
| Mean | 126.71 | 147.27 |
| Median | 128 | 146 |
| Minimum | 90 | 140 |
| Maximum | 132 | 233 |
| Standard deviation | 8.1069 | 6.5074 |
| Asymmetry | −2.9518 | 8.6207 |
| Kurtosis | 8.2884 | 95 |
| 5th percentile | 102 | 143 |
| 95th percentile | 132 | 153 |
| Interquartile range | 6.0 | 3.0000 |
| Missing observations | 0 | 0 |
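Descriptive statistics of this kind can be computed with the Python standard library; skewness (asymmetry) and kurtosis are written out explicitly below. This is a generic sketch, not the tool the authors used, and conventions differ across statistics packages (sample vs. population moments, excess vs. raw kurtosis), so small numerical differences from the table are expected.

```python
import statistics

def describe(xs):
    """Descriptive statistics comparable to the table above."""
    n = len(xs)
    mean = statistics.fmean(xs)
    sd = statistics.stdev(xs)  # sample standard deviation
    # Moment-ratio skewness and (non-excess) kurtosis; conventions vary.
    skewness = sum((x - mean) ** 3 for x in xs) / (n * sd ** 3)
    kurtosis = sum((x - mean) ** 4 for x in xs) / (n * sd ** 4)
    pct = statistics.quantiles(xs, n=20)   # cut points at 5% steps
    quart = statistics.quantiles(xs, n=4)  # quartiles
    return {
        "mean": mean,
        "median": statistics.median(xs),
        "minimum": min(xs),
        "maximum": max(xs),
        "std": sd,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "p5": pct[0],
        "p95": pct[-1],
        "iqr": quart[2] - quart[0],
    }
```

For a symmetric sample the skewness is near zero; the markedly negative asymmetry for Patient A and positive asymmetry for Patient B reflect their long left and right tails, respectively.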

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Massaro, A.; Magaletti, N.; Cosoli, G.; Leogrande, A.; Cannone, F.
Use of Machine Learning to Predict the Glycemic Status of Patients with Diabetes. *Med. Sci. Forum* **2022**, *10*, 11.
https://doi.org/10.3390/IECH2022-12293
