Framework to Diagnose the Metabolic Syndrome Types without Using a Blood Test Based on Machine Learning
Abstract
1. Introduction
 Achieve a mathematical representation to diagnose MetS using HMS criteria.
 Propose a segmentation of MetS using HMS criteria.
 Develop a framework to diagnose the different MetS types according to HMS criteria using a set of variables that doctors can obtain using noninvasive methods in a first consultation.
 Evaluate two machine learning techniques using performance indicators for each MetS type.
2. Methodology
2.1. Review
 Could authors predict the Metabolic Syndrome types or segmentation without a blood test? Y/N.
 What Metabolic Syndrome diagnostic criteria did the authors use? e.g., ATP III, IDF, HMS, or other recognized criteria.
 What ANN configuration did the authors use?
 What validation method did the authors use? e.g., hold out, random subsampling, and others.
 What performance indicators did the authors use? e.g., Sensitivity, Area Under the ROC curve, Specificity.
2.2. Analysis
2.2.1. Design and Study Population
 Age of 20 years or over.
 The subject can understand the instructions explained by the researchers.
 The subject can sign an informed consent.
 The subject resides permanently in the area.
 Are you pregnant?
 Are you bedridden?
2.2.2. Physical Examination and Blood Tests
2.3. Model
2.3.1. Mathematical Representation to Diagnose MetS
 W: Represents the normal (0) or raised (1) status of the dichotomous variable of the waist circumference
 P: Represents the normal (0) or raised (1) status of the dichotomous variable of the blood pressure
 G: Represents the normal (0) or raised (1) status of the dichotomous variable of the fasting plasma glucose
 H: Represents the normal (0) or lowered (1) status of the dichotomous variable of the HDLC
 T: Represents the normal (0) or raised (1) status of the dichotomous variable of the triglycerides
2.3.2. Proposed Model MetS Segmentation
2.3.3. Framework to Diagnose the MetS Types
 (a)
 Extraction, Transformation, and Load (ETL). In this stage, we collected data from a population of 615 subjects who authorized a blood sample to measure triglycerides, HDLC, and fasting plasma glucose. The study also recorded anthropometric and clinical variables: Age, Sex, Weight, Height, Waist Circumference (WC), Hip Circumference (HC), Systolic Blood Pressure (SBP), and Diastolic Blood Pressure (DBP).
 Later, through the transformation process, we obtained Body Mass Index, Body Fat Percentage, Waist-to-Hip ratio, Dichotomous Systolic Blood Pressure, Dichotomous Diastolic Blood Pressure, Dichotomous Blood Pressure, Dichotomous triglycerides, Dichotomous fasting blood sugar, Dichotomous HDLC, and Dichotomous Waist circumference, among others.
 Afterward, we used the dichotomous values of the HMS criteria’s risk factors to build the different MetS types obtained from the segmentation process explained in the previous subsection. We obtained the output variables WPG, WPH, WPT, WGH, WGT, WTH, PGT, PGH, PHT, and GHT. Finally, all anthropometric and clinical data were loaded into a dataset of 615 records.
 (b)
 Statistical analysis and dataset balancing. In this stage, we began with a dataset of 615 people with biochemical measurements and their respective MetS diagnosis. We then carried out a descriptive statistical analysis of the dataset and found that some MetS types were imbalanced, as shown in the Results section.
 This problem was caused by the low prevalence of the raised fasting blood glucose risk factor in the study population, which is expected in a study of MetS [40]. We resolved the imbalance with a data balancing technique, the Synthetic Minority Oversampling Technique (SMOTE) [41,42], as implemented in WEKA. We created synthetic data to obtain a balanced dataset of 799 records (615 original plus 184 synthetic) with a better distribution of MetS risk factors, thus improving the quality of discrimination.
 (c)
 Modeling. In this stage, we used an algorithm to select the necessary nonbiochemical features. We used Sequential Feature Selection in Matlab to achieve the maximum discrimination of the proposed model’s output variables in both datasets (imbalanced and balanced).
 In the following step, we used several Multilayer Perceptron (MLP) ANNs to predict each MetS type: WPG, WPH, WPT, WGH, WGT, WTH, PGT, PGH, PTH, and GHT. These ANNs must be trained before they are used to predict the value of the output (dependent) variable. Each ANN is formed by neurons whose elements are a set of inputs that can come from other neurons or from the outside; Figure 3 shows the basic structure of an ANN.
 Each ANN structure is initialized according to the propagation rule, and each node has synaptic weights, which represent the degree of communication between neurons, as shown in Equations (7) and (8). The training data are then introduced into the network, and the propagation algorithm is applied to obtain the final network parameters. In practice, the algorithm is divided into two parts: network training and network testing. The steps of the propagation algorithm are described as follows [43]:
$$\mathrm{net}^{k}=\sum_{i=1}^{n}\left(\omega_{i}^{k}x_{i}^{k}\alpha_{i}^{k}\right) \tag{7}$$
$$y^{k}=\theta\left(\mathrm{net}^{k}\right) \tag{8}$$
 The information flows in one direction only, from the inputs to the hidden layer and then to the output layer; that is, the information coming from the different neurons passes through the activation function, which determines the current state, and finally all the data converge to the output [33].
 Each ANN has several hidden neurons with an activation function such as the hyperbolic tangent sigmoid function, and an output layer with one neuron whose activation function can be a log-sigmoid function [44,46].
 There are no hard and fast rules for the number of hidden neurons: it can be calculated or found empirically and is highly dependent on the problem and the dataset [47]. We used the methodology mentioned in [48,49,50] and described in Equation (9), where the number of hidden neurons (NHN) is $2/3$ of the number of input variables plus the number of output variables.
$$NHN=\frac{2\,(\text{Input variables})}{3}+\text{Output variables} \tag{9}$$
 We report Equation (9) and every other detail of the process so that other researchers can reproduce the experiments and continue investigating machine learning models for diagnosing MetS without biochemical variables, as did Chen [32], who used a different equation to calculate the number of hidden neurons.
 The other machine learning technique used to diagnose the MetS types was the Random Undersampling Boosted tree ensemble (RusBoost), chosen because the data from the MetS study are imbalanced [51]. This technique improves the performance indicators of models trained on imbalanced data by applying random undersampling, which randomly removes samples from the majority class [52], as shown in the algorithm detailed in Appendix B with the configuration shown in Table 5.
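Equation (9) can be illustrated with a short Python sketch. The function name `hidden_neurons` is hypothetical, and rounding to the nearest integer is an assumption: the text does not say how fractional results are handled, although nearest-integer rounding appears consistent with the hidden-neuron counts reported with the selected predictors.

```python
def hidden_neurons(n_inputs: int, n_outputs: int = 1) -> int:
    """Equation (9): NHN = 2 * inputs / 3 + outputs.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    return round(2 * n_inputs / 3 + n_outputs)


# One output neuron per MetS type, with 1 to 5 selected predictors.
for n_inputs in range(1, 6):
    print(n_inputs, hidden_neurons(n_inputs))
```

For instance, a type predicted from three variables would get 3 hidden neurons under this rule, and one predicted from five variables would get 4.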
2.4. Performance Indicators and Model Assessment
2.5. Document
3. Results
3.1. Data Description
3.2. Experiment to Diagnose the Traditional MetS without Biochemical Variables
3.3. Experiments to Diagnose Each MetS Type without a Blood Test
 Approach 1 uses only the ANN technique with a feature selection algorithm on the original dataset.
 Approach 2 uses an ensemble classification algorithm, the Random undersampling Boosted tree (RusBoost) ensemble, on the original dataset.
 Approach 3 uses SMOTE to create additional data, producing what we call the dataset with oversampling, and then applies ANN.
 Approach 4 uses the dataset with oversampling and RusBoost.
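Approaches 3 and 4 depend on SMOTE oversampling. The following is a minimal Python sketch of the interpolation idea behind SMOTE, not the WEKA implementation used in the study. The function name `smote_sketch`, the choice of `k`, and the random selection of the base sample are assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two features (values are illustrative only).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_synthetic=10, k=2)
print(len(new_points))
```

Each synthetic point lies on a line segment between two existing minority samples, so the new records stay inside the region already occupied by the minority class.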
3.3.1. Approach 1: Diagnosis of Each MetS Type Using the Original Dataset and ANN
3.3.2. Approach 2: Diagnosis of Each MetS Type Using the Original Dataset and RusBoost
3.3.3. Approach 3: Diagnosis of Each MetS Type Using the Dataset with Oversampling and ANN
3.3.4. Approach 4: Diagnosis of Each MetS Type Using the Dataset with Oversampling and RusBoost
4. Discussion
5. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
MDPI  Multidisciplinary Digital Publishing Institute 
DOAJ  Directory of open access journals 
IEEE  Institute of Electrical and Electronics Engineers 
ACM  Association for Computing Machinery 
DBLP  Digital Bibliography Library Project 
GCP  Good Clinical Practices 
ICH  Guide and the International Conference on Harmonization 
WHO  World Health Organization 
NCEP ATP III  National Cholesterol Education Programme Adult Treatment Panel III 
EGIR  European Group for the study of Insulin Resistance 
IDF  International Diabetes Federation 
HMS  Harmonized Metabolic Syndrome 
MetS  Metabolic Syndrome 
MetSG  Metabolic Syndrome General 
CHD  Coronary Heart Disease 
IR  Insulin Resistance 
ICD  International Classification of Diseases 
OR  Odds Ratio 
CI  Confidence Interval 
SS  Sensitivity 
SP  Specificity 
FNR  False Negative Rate 
FPR  False Positive Rate 
AROC  Area under Receiver Operating Characteristic Curve 
WC  Waist Circumference 
BP  Blood Pressure 
HDLC  High-Density Lipoprotein Cholesterol 
FPG  Fasting Plasma Glucose 
TG  Triglycerides 
WG  Weight 
HG  Height 
HC  Hip Circumference 
WHR  Waist to Hip Ratio 
WSR  Waist to Stature Ratio 
BMI  Body Mass Index 
BFP  Body Fat Percentage 
SBP  Systolic Blood Pressure 
DBP  Diastolic Blood Pressure 
SBPD  Systolic Blood Pressure Dichotomous 
DBPD  Diastolic Blood Pressure Dichotomous 
W  Represents the normal (0) or raised (1) status of the dichotomous variable of the WC 
P  Represents the normal (0) or raised (1) status of the dichotomous variable of the BP 
G  Represents the normal (0) or raised (1) status of the dichotomous variable of the FPG 
H  Represents the normal (0) or lowered (1) status of the dichotomous variable of the HDLC 
T  Represents the normal (0) or raised (1) status of the dichotomous variable of the TG 
ANN  Artificial Neural Networks 
SMOTE  Synthetic Minority Oversampling Technique 
PCLR  Principal Component Logistic Regression 
RUSBoost  Random Undersampling Boosting 
Appendix A. Solution of Quine–McCluskey Algorithm to Minimize the MetS Types
 All implicants of order 0 that differ in the state of only one variable are grouped together. The group is obtained by eliminating the changed variable, which yields an implicant of order 1. For example, the implicants of order 0 number 7 (W’P’GHT) and number 15 (W’PGHT) are grouped together, resulting in W’GHT, which is of order 1.
 Then the implicants of order 1 that differ in the state of only one variable are grouped together by eliminating the changed variable. For example, the implicants 7, 15 (W’GHT) and 23, 31 (WGHT), both of order 1, are grouped together, resulting in GHT, which is of order 2.
 This process is carried out on all the implicants of order 0 until all implicants are minimized, as shown in Equation (6).
IMPLICANTS  

n  Order 0 *  n  Order 1  n  Order 2  
7  W’P’GHT  7, 15  W’GHT  7, 15, 23, 31  GHT 
11  W’PG’HT  7, 23  P’GHT  11, 15, 27, 31  PHT 
13  W’PGH’T  11, 15  W’PHT  13, 15, 29, 31  PGT 
14  W’PGHT’  11, 27  PG’HT  14, 15, 30, 31  PGH 
15  W’PGHT  13, 15  W’PGT  19, 23, 27, 31  WHT 
19  WP’G’HT  13, 29  PGH’T  21, 23, 29, 31  WGT 
21  WP’GH’T  14, 15  W’PGH  22, 23, 30, 31  WGH 
22  WP’GHT’  14, 30  PGHT’  25, 27, 29, 31  WPT 
23  WP’GHT  15, 31  PGHT  26, 27, 30, 31  WPH 
25  WPG’H’T  19, 23  WP’HT  28, 29, 30, 31  WPG 
26  WPG’HT’  19, 27  WG’HT  
27  WPG’HT  21, 23  WP’GT  
28  WPGH’T’  21, 29  WGH’T  
29  WPGH’T  22, 23  WP’GH  
30  WPGHT’  22, 30  WGHT’  
31  WPGHT  23, 31  WGHT  
25, 27  WPG’T  
25, 29  WPH’T  
26, 27  WPG’H  
26, 30  WPHT’  
27, 31  WPHT  
28, 29  WPGH’  
28, 30  WPGT’  
29, 31  WPGT  
30, 31  WPGH 
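The grouping steps of Appendix A can be sketched in Python as follows. This is a minimal illustration of the merging passes of the Quine–McCluskey procedure for this specific function, not a general minimizer. The helper names (`merge`, `next_order`, `to_term`) are hypothetical, and a plain apostrophe is used here to mark a negated variable.

```python
from itertools import combinations

VARS = "WPGHT"  # W is the most significant bit of the minterm number n


def merge(a, b):
    """Combine two implicants that differ in exactly one fixed variable."""
    diff = [i for i in range(len(a)) if a[i] != b[i]]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]  # "-" marks the eliminated variable
    return None


def next_order(implicants):
    """One grouping pass: all pairwise merges of the given implicants."""
    merged = set()
    for a, b in combinations(sorted(implicants), 2):
        m = merge(a, b)
        if m is not None:
            merged.add(m)
    return merged


def to_term(pattern):
    """Render a pattern such as '0-111' as the product term W'GHT."""
    return "".join(
        v + ("" if bit == "1" else "'")
        for v, bit in zip(VARS, pattern)
        if bit != "-"
    )


# Order 0: minterms with at least three raised risk factors.
order0 = {format(n, "05b") for n in range(32) if format(n, "05b").count("1") >= 3}
order1 = next_order(order0)
order2 = next_order(order1)

print(len(order0), len(order1), len(order2))
print(sorted(to_term(p) for p in order2))
```

Under these assumptions the passes yield 16, 25, and 10 implicants respectively, and the order 2 terms are the ten three-factor products listed in the table.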
Appendix B
Algorithm A1 RUSBoost Algorithm (Adapted from [65]). 
Given:
 Set S of examples $(x_{1},y_{1}),\ldots,(x_{m},y_{m})$ with minority class
 Weak learner (decision tree), WeakLearn
 Number of iterations, T
 Desired percentage of total instances to be represented by the minority class, N
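The random undersampling step that RUSBoost applies before training each weak learner can be sketched in Python as follows. This covers only the resampling step, not the full boosting loop. The function name `random_undersample` and its argument names are hypothetical; `minority_pct` plays the role of N in the inputs listed above.

```python
import random


def random_undersample(majority, minority, minority_pct, seed=0):
    """Randomly drop majority examples until the minority class makes up
    about `minority_pct` percent of the returned training set."""
    rng = random.Random(seed)
    # Majority examples to keep so that
    # len(minority) / (len(minority) + keep) is about minority_pct / 100.
    keep = round(len(minority) * (100 - minority_pct) / minority_pct)
    keep = min(keep, len(majority))
    return rng.sample(majority, keep), list(minority)


# Toy data: 90 majority and 10 minority example identifiers.
majority = list(range(90))
minority = list(range(90, 100))
kept_majority, kept_minority = random_undersample(majority, minority, minority_pct=50)
print(len(kept_majority), len(kept_minority))
```

With `minority_pct=50` the two classes end up the same size, which is the balanced case; lower percentages keep more of the majority class.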

References
 1. Kaur, J. A Comprehensive Review on Metabolic Syndrome. Cardiol. Res. Pract. 2014, 1–21.
 2. Cornier, M.A.; Dabelea, D.; Hernandez, T.L.; Lindstrom, R.C.; Steig, A.J.; Stob, N.R.; Eckel, R.H. The Metabolic Syndrome. Endocr. Rev. 2008, 29, 777–822.
 3. Müller-Nordhorn, J.; Willich, S.N. Coronary Heart Disease. In International Encyclopedia of Public Health, 2nd ed.; Academic Press: Cambridge, MA, USA, 2017; Volume 2, pp. 159–167.
 4. WHO. Global Action Plan for the Prevention and Control of Noncommunicable Diseases 2013–2020; World Health Organization: Geneva, Switzerland, 2013; ISBN 9789241506236.
 5. Navarro Lechuga, E.; Vargas Moranth, R. Metabolic syndrome in the southeast of Barranquilla (Colombia). Salud Uninorte 2008, 24, 40–52.
 6. Chobanian, A.V.; Bakris, G.L.; Black, H.R.; Cushman, W.C.; Green, L.A.; Izzo, J.L.; Roccella, E.J. Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Hypertension 2003, 42, 1206–1252.
 7. Esposito, K.; Chiodini, P.; Colao, A.; Lenzi, A.; Giugliano, D. Metabolic Syndrome and Risk of Cancer: A systematic review and meta-analysis. Diabetes Care 2012, 35, 2402–2411.
 8. Chen, J.; Muntner, P.; Hamm, L.L.; Jones, D.W.; Batuman, V.; Fonseca, V.; He, J. The Metabolic Syndrome and Chronic Kidney Disease in U.S. Adults. Ann. Intern. Med. 2004, 140, 167.
 9. Grundy, S.M. Metabolic Syndrome: Connecting and Reconciling Cardiovascular and Diabetes Worlds. J. Am. Coll. Cardiol. 2006, 47, 1093–1100.
 10. Grundy, S.M. Metabolic Syndrome Pandemic. Arterioscler. Thromb. Vasc. Biol. 2008, 28, 629–636.
 11. Ford, W.H.; Giles, E.S.; Dietz, W.H. Prevalence of the Metabolic Syndrome Among US Adult. J. Am. Med. Assoc. 2002, 287, 356–359.
 12. Mozumdar, A.; Liguori, G. Persistent Increase of Prevalence of Metabolic Syndrome Among U.S. Adults: NHANES III to NHANES 1999–2006. Diabetes Care 2011, 34, 216–219.
 13. Aguilar, M.; Bhuket, T.; Torres, S.; Liu, B. Prevalence of the Metabolic Syndrome in the United States, 2003–2012. JAMA 2015, 313, 1973–1974.
 14. Lakka, H. The Metabolic Syndrome and Total and Cardiovascular Disease Mortality in Middle-aged Men. JAMA 2002, 288, 2709–2716.
 15. Grundy, S.M. Metabolic Syndrome: A Multiplex Cardiovascular Risk Factor. J. Clin. Endocrinol. Metab. 2007, 92, 399–404.
 16. Aschner, P. Metabolic syndrome as a risk factor for diabetes. Expert Rev. Cardiovasc. Ther. 2010, 8, 407–412.
 17. Gutiérrez-Solis, R.M.; Datta Banik, A.L.; Méndez-González, S. Prevalence of Metabolic Syndrome in Mexico: A Systematic Review and Meta-Analysis. Metab. Syndr. Relat. Disord. 2018, 16, 395–405.
 18. Navarro, E.; Vargas, R.F. Coronary risk according to Framingham equation in adults with metabolic syndrome in the city of Soledad, Atlantico, 2010. Rev. Colomb. Cardiol. 2012, 19, 109–118.
 19. Alberti, K.G.M.M.; Zimmet, P.Z. Definition, Diagnosis and Classification of Diabetes Mellitus and its Complications Part 1: Diagnosis and Classification of Diabetes Mellitus Provisional Report of a WHO Consultation. Diabet. Med. 1998, 15, 539–553.
 20. Bartlett, J.G.M. Executive summary of the third report of the National Cholesterol Education Program (NCEP) expert panel on detection, evaluation and treatment of high blood cholesterol in adults. Infect. Dis. Clin. Pract. 2001, 10, 287–288.
 21. Balkau, B.; Charles, M. Comment on the provisional report from the WHO consultation. European Group for the Study of Insulin Resistance (EGIR). Diabet. Med. 1999, 16, 442–443.
 22. Alberti, K.G.M.M.; Zimmet, P.; Shaw, J. Metabolic syndrome—A new worldwide definition. A Consensus Statement from the International Diabetes Federation. J. Compil. 2006, 23, 469–480.
 23. Alberti, K.G.M.M.; Eckel, R.H.; Grundy, S.M.; Zimmet, P.Z.; Cleeman, J.I.; Donato, K.A.; Smith, S.C. Harmonizing the Metabolic Syndrome International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation 2009, 120, 1640–1645.
 24. Minsalud. Informe Nacional de Calidad de la Atención en Salud 2015; Ministerio de Salud y Protección Social: Bogotá, Colombia, 2015; p. 217.
 25. Irving, G.; Neves, A.L.; Dambha-Miller, H.; Oishi, A.; Tagashira, H.; Verho, A.; Holden, J. International variations in primary care Doctor consultation time: A systematic review of 67 countries. BMJ Open 2017, 7, e017902.
 26. Jover, A.; Corbella, E.; Mun, A.; Pedrobotet, J.; Herna, A.; Zu, M. Prevalence of Metabolic Syndrome and its Components in Patients With Acute Coronary Syndrome. Rev. Española Cardiol. 2011, 64, 579–586.
 27. De Kroon, M.L.; Renders, C.M.; Kuipers, E.C.; van Wouwe, J.P.; Van Buuren, S.; De Jonge, G.A.; Hirasing, R.A. Identifying metabolic syndrome without blood tests in young adults—The Terneuzen Birth Cohort. Eur. J. Public Health 2008, 18, 656–660.
 28. Hsiung, D.Y.; Liu, C.W.; Cheng, P.C.; Ma, W.F. Using noninvasive assessment methods to predict the risk of metabolic syndrome. Appl. Nurs. Res. 2015, 28, 72–77.
 29. Alshehri, A. Metabolic syndrome and cardiovascular risk. J. Fam. Community Med. 2010, 17, 73.
 30. Barrios, M.; Jimeno, M.; Villalba, P.; Navarro, E. Novel Data Mining Methodology for Healthcare Applied to a New Model to Diagnose Metabolic Syndrome without a blood test. Diagnostics 2019, 9, 192.
 31. Murguía-Romero, M.; Jiménez-Flores, R.; Méndez-Cruz, A.R.; Villalobos-Molina, R. Predicting Metabolic Syndrome with Neural Networks. In Advances in Artificial Intelligence and Its Applications; Castro, F., Gelbukh, A., González, M., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 464–472.
 32. Chen, H.; Xiong, S.; Ren, X. Evaluating the Risk of Metabolic Syndrome Based on an Artificial Intelligence Model. Abstr. Appl. Anal. 2014, 2014, 207268.
 33. Ivanović, D.; Kupusinac, A.; Stokić, E.; Doroslovački, R.; Ivetić, D. ANN Prediction of Metabolic Syndrome: A Complex Puzzle that will be Completed. J. Med. Syst. 2016, 40, 264.
 34. Navarro Lechuga, E.; Vargas Moranth, R.F.; Alcocer Olaciregui, A.E. Grasa corporal total como posible indicador de síndrome metabólico en adultos. Rev. Española Nutr. Hum. Dietética 2016, 20, 198.
 35. Rodríguez, A.S.; Soidan, J.L.G.; Gómez, M.J.A.; Rodríguez, R.L.; del Alonso, A.Á.; Fernández, M.R.P. Metabolic syndrome and visceral fat in women with cardiovascular risk factor. Nutr. Hosp. 2017, 34, 863–868.
 36. Lean, M.E.J.; Han, T.S.; Deurenberg, P. Predicting body composition by densitometry from simple anthropometric measurements. Am. J. Clin. Nutr. 1996, 63, 4–14.
 37. Fliotsos, M.; Zhao, D.; Rao, V.N.; Ndumele, C.E.; Guallar, E.; Burke, G.L.; Michos, E.D. Body mass index from Early-, Mid-, and Older-adulthood and risk of heart failure and atherosclerotic cardiovascular disease: MESA. J. Am. Heart Assoc. 2018, 7, e009599.
 38. Floyd, T.L. Digital Fundamentals, 8th ed.; Pearson Education: New York City, NY, USA, 2002; ISBN 9780130995278.
 39. Rosen, K.H. Discrete Mathematics and Its Applications, 5th ed.; McGraw-Hill Higher Education: New York, NY, USA, 2002.
 40. Perveen, S.; Shahbaz, M.; Keshavjee, K.; Guergachi, A. Metabolic Syndrome and Development of Diabetes Mellitus: Predictive Modeling Based on Machine Learning Techniques. IEEE Access 2019, 7, 1365–1375.
 41. Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic Minority Oversampling Technique. J. Artif. Intell. Res. 2002, 16, 321–357.
 42. Fernandez, A.; Garcia, S.; Herrera, F.; Chawla, N.V. SMOTE for Learning from Imbalanced Data: Progress and Challenges, Marking the 15-year Anniversary. J. Artif. Intell. Res. 2018, 61, 863–905.
 43. Kumar, S. Neural Networks, 2nd ed.; Tata McGraw-Hill Education: New York, NY, USA, 2012.
 44. Bishop, C.M. Pattern Recognition and Machine Learning (Information Science and Statistics); Springer: Secaucus, NJ, USA, 2006.
 45. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning, 2nd ed.; Springer: New York, NY, USA, 2009.
 46. Witten, I.H.; Frank, E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. (Morgan Kaufmann Series in Data Management Systems); Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2005.
 47. Andrea, T.A.; Kalayeh, H. Applications of Neural Networks in Quantitative Structure-Activity Relationships of Dihydrofolate Reductase Inhibitors. J. Med. Chem. 1991, 34, 2824–2836.
 48. Boger, Z.; Guterman, H. Knowledge extraction from artificial neural networks models. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Orlando, FL, USA, 12–15 October 1997; pp. 3030–3035.
 49. Karsoliya, S. Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture. Int. J. Eng. Trends Technol. 2012, 3, 714–717.
 50. Panchal, F.S.; Panchal, M. Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network. Int. J. Comput. Sci. Mob. Comput. 2014, 3, 455–464.
 51. Mounce, S.R.; Ellis, K.; Edwards, J.M.; Speight, V.L.; Jakomis, N.; Boxall, J.B. Ensemble Decision Tree Models Using RUSBoost for Estimating Risk of Iron Failure in Drinking Water Distribution Systems. Water Resour. Manag. 2017, 31, 1575–1589.
 52. Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: Improving classification performance when training data is skewed. In Proceedings of the 2008 19th International Conference on Pattern Recognition, Tampa, FL, USA, 8–11 December 2008; pp. 1–4.
 53. Xu, Q.S.; Liang, Y.Z. Monte Carlo cross validation. Chemom. Intell. Lab. Syst. 2001, 56, 1–11.
 54. Shao, J. Linear model selection by cross-validation. J. Stat. Plan. Inference 2005, 128, 231–240.
 55. Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat. Comput. 2011, 21, 137–146.
 56. Berrar, D. Cross-Validation. In Encyclopedia of Bioinformatics and Computational Biology; Elsevier: Amsterdam, The Netherlands, 2019; pp. 542–545.
 57. Hosmer, D.W.; Lemeshow, S. Assessing the Fit of the Model. In Applied Logistic Regression; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2004; pp. 143–202.
 58. Guyon, I.; Andre, E. An Introduction to Variable and Feature Selection. J. Mach. Learn. Res. 2003, 3, 1157–1182.
 59. Rückstieß, T.; Osendorfer, C.; van der Smagt, P. Sequential Feature Selection for Classification; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2011; pp. 132–141.
 60. Duncan, G.E.; Perri, M.G.; Theriaque, D.W.; Hutson, A.D.; Eckel, R.H.; Stacpoole, P.W. Exercise Training, Without Weight Loss, Increases Insulin Sensitivity and Postheparin Plasma Lipase Activity in Previously Sedentary Adults. Diabetes Care 2003, 26, 557–562.
 61. Bouwmeester, W.; Zuithoff, N.P.; Mallett, S.; Geerlings, M.I.; Vergouwe, Y.; Steyerberg, E.W.; Moons, K.G. Reporting and methods in clinical prediction research: A systematic review. PLoS Med. 2012, 9, e1001221.
 62. Sun, Y.; Wong, A.K.C.; Kamel, M.S. Classification of imbalanced data: A review. Int. J. Pattern Recognit. Artif. Intell. 2009, 23, 687–719.
 63. Melillo, P.; Luca, N.D.; Bracale, M.; Pecchia, L. Classification tree for risk assessment in patients suffering from congestive heart failure via long-term heart rate variability. IEEE J. Biomed. Health Inform. 2013, 17, 727–733.
 64. Bolón-Canedo, V.; Sánchez-Maroño, N.; Alonso-Betanzos, A. A review of feature selection methods on synthetic data. Knowl. Inf. Syst. 2013, 34, 483–519.
 65. Seiffert, C.; Khoshgoftaar, T.M.; Van Hulse, J.; Napolitano, A. RUSBoost: A Hybrid Approach to Alleviating Class Imbalance. IEEE Trans. Syst. Man Cybern. Part A Syst. Hum. 2010, 40, 185–197.
Risk Factors  HMS Criteria 

Central Obesity  Waist Circumference (WC) population and country specific 
Triglycerides (TG)  ≥150 mg/dL 
Fasting Plasma Glucose (FPG)  ≥100 mg/dL 
High-Density Lipoprotein Cholesterol (HDLC)  <40 mg/dL in males; <50 mg/dL in females 
Blood Pressure  Systolic ≥ 130 mmHg and/or Diastolic ≥ 85 mmHg 
Diagnostic criteria  Three risk factors 
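The thresholds in the table above can be turned into the dichotomous variables W, P, G, H, and T with a short Python sketch. The function name `dichotomize`, its argument names, and the `"M"`/`"F"` sex coding are hypothetical. Because the waist circumference cutoff is population and country specific, it is passed in as the parameter `wc_cutoff`, and the value used in the example is purely illustrative, as is the choice of "greater than or equal" for that comparison.

```python
def dichotomize(tg, fpg, hdlc, sbp, dbp, wc, sex, wc_cutoff):
    """Map raw measurements to the 0/1 risk-factor flags of the HMS criteria.

    tg, fpg, hdlc are in mg/dL; sbp, dbp in mmHg; wc and wc_cutoff in cm.
    """
    hdlc_cutoff = 40 if sex == "M" else 50
    return {
        "W": int(wc >= wc_cutoff),            # population-specific cutoff (assumed >=)
        "P": int(sbp >= 130 or dbp >= 85),    # systolic and/or diastolic raised
        "G": int(fpg >= 100),
        "H": int(hdlc < hdlc_cutoff),         # lowered HDLC
        "T": int(tg >= 150),
    }


# Illustrative subject; wc_cutoff=80 is a placeholder, not a value from the study.
flags = dichotomize(tg=160, fpg=95, hdlc=45, sbp=128, dbp=86, wc=95, sex="F", wc_cutoff=80)
print(flags)
```

A subject meets the diagnostic criterion when at least three of the five flags are 1.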
Authors  Murguía-Romero [31]  Chen [32]  Kupusinac [33] 

Age  E  E  
Sex  E  E  E 
WG: Weight  E  I  I 
HG: Height  E  I  I 
WC: Waist circumference  E  E  I 
HC: Hip circumference  E  
WHR: Waist to Hip ratio  E  
WSR: Waist to Stature Ratio  E  
BMI: Body Mass Index  E  E  E 
SBP: Systolic blood pressure  E  E  
DBP: Diastolic blood pressure  E  E  
Hidden neurons  25  5  85 and 96 
n  W  P  G  H  T  MetS 

0  0  0  0  0  0  0 
1  0  0  0  0  1  0 
2  0  0  0  1  0  0 
3  0  0  0  1  1  0 
4  0  0  1  0  0  0 
5  0  0  1  0  1  0 
6  0  0  1  1  0  0 
7  0  0  1  1  1  1 
8  0  1  0  0  0  0 
9  0  1  0  0  1  0 
10  0  1  0  1  0  0 
11  0  1  0  1  1  1 
12  0  1  1  0  0  0 
13  0  1  1  0  1  1 
14  0  1  1  1  0  1 
15  0  1  1  1  1  1 
16  1  0  0  0  0  0 
17  1  0  0  0  1  0 
18  1  0  0  1  0  0 
19  1  0  0  1  1  1 
20  1  0  1  0  0  0 
21  1  0  1  0  1  1 
22  1  0  1  1  0  1 
23  1  0  1  1  1  1 
24  1  1  0  0  0  0 
25  1  1  0  0  1  1 
26  1  1  0  1  0  1 
27  1  1  0  1  1  1 
28  1  1  1  0  0  1 
29  1  1  1  0  1  1 
30  1  1  1  1  0  1 
31  1  1  1  1  1  1 
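The truth table above follows directly from the rule that MetS is diagnosed when at least three of the five risk factors are present. A minimal Python sketch that regenerates it is shown below; the function name `truth_table` is hypothetical, and W is taken as the most significant bit of n, as in the table.

```python
FACTORS = "WPGHT"  # column order of the table; W is the most significant bit


def truth_table():
    """Return (n, [W, P, G, H, T], MetS) for n = 0..31."""
    rows = []
    for n in range(32):
        bits = [(n >> (4 - i)) & 1 for i in range(5)]
        rows.append((n, bits, int(sum(bits) >= 3)))
    return rows


positive = [n for n, _, mets in truth_table() if mets]
print(positive)
```

The positive rows are the 16 minterms that serve as the order 0 implicants in Appendix A.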
Type  Diagnostic of MetS 

WPT  Increased Waist Circumference, Blood Pressure, and Triglycerides levels 
WPH  Increased Waist Circumference, Blood Pressure, and reduction of HDLC levels 
WPG  Increased Waist Circumference, Blood Pressure, and Fasting Plasma Glucose levels 
WGT  Increased Waist Circumference, Fasting Plasma Glucose, and Triglycerides levels 
WGH  Increased Waist Circumference, Fasting Plasma Glucose, and decreased HDLC levels 
WTH  Increased Waist Circumference, Triglycerides, and decreased HDLC levels 
PGT  Increased Blood Pressure, Fasting Plasma Glucose, and Triglycerides levels 
PGH  Increased Blood Pressure, Fasting Plasma Glucose, and decreased HDLC levels 
PHT  Increased Blood Pressure, Triglycerides, and decreased HDLC levels 
GHT  Increased Fasting Plasma Glucose, Triglycerides, and decreased HDLC levels 
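Given the five dichotomous risk factors of a subject, the MetS types in the table above can be derived by checking every combination of three factors. The sketch below is illustrative; the function name `mets_types` is hypothetical, and it writes the letters in the fixed order W, P, G, H, T, so the type the table calls WTH appears as WHT.

```python
from itertools import combinations

FACTORS = "WPGHT"


def mets_types(flags):
    """Return the three-factor MetS types whose risk factors are all present.

    `flags` maps each of W, P, G, H, T to 0 or 1.
    """
    return [
        "".join(combo)
        for combo in combinations(FACTORS, 3)
        if all(flags[f] for f in combo)
    ]


# Subject with raised WC, BP, and TG, lowered HDLC, and normal FPG.
subject = {"W": 1, "P": 1, "G": 0, "H": 1, "T": 1}
print(mets_types(subject))
```

A subject with more than three risk factors belongs to several types at once, which is why each type is modeled as its own binary output variable.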
Learner Type  Decision Tree 

Maximum number of splits  20 
Number of learners  30 
Learning rate  0.1 
AROC  Discrimination Ability 

AROC = 0.5  No discrimination 
0.5 < AROC < 0.7  Regular 
0.7 ≤ AROC < 0.8  Acceptable 
0.8 ≤ AROC < 0.9  Excellent 
AROC ≥ 0.9  Outstanding 
Variables *  MetS m(SD)  No MetS m(SD)  Total m(SD)  p 

TG  216.94 (112.8)  121.84 (63.05)  160.81 (98.67)  <0.001 
GL  97.33 (38.82)  84 (19.56)  89.47 (29.74)  <0.001 
HDLC[W]  38 (8.22)  46.97 (13.54)  43.39 (12.49)  <0.001 
HDLC[M]  36.11 (11.1)  43.32 (11.63)  40.27 (11.93)  <0.001 
Variables  MetS m(SD)  No MetS m(SD)  Total m(SD)  p 

Age (year)  47.62 (17.49)  38.89 (15.96)  42.61 (17.17)  <0.001 
WC (cm)  99.81 (11.33)  87.24 (11.91)  92.59 (13.21)  <0.001 
HC (cm)  105.51 (10.56)  93.73 (12.50)  98.75 (13.07)  <0.001 
Weight (kg)  79.08 (17.11)  66.59 (13.81)  71.71 (16.43)  <0.001 
Height (m)  1.64 (0.09)  1.62 (0.09)  1.63 (0.09)  0.068 
BMI (kg/m²)  29.09 (5.31)  25.26 (4.74)  26.89 (5.33)  <0.001 
WHR *  0.94 (0.05)  0.93 (0.09)  0.94 (0.08)  <0.001 
BFP (%)  38.64 (8.46)  30.86 (10.23)  34.05 (10.28)  <0.001 
SBP (mmHg)  128.52 (18.46)  112.91 (12.61)  119.55 (17.19)  <0.001 
DBP (mmHg)  78.48 (11.13)  71.18 (9.21)  74.29 (10.69)  <0.001 
Parameter  Value 

Training Function  Levenberg–Marquardt backpropagation 
min_grad  $10^{10}$ 
mu  $10^{3}$ 
mu_dec  0.1 
mu_inc  10 
mu_max  $10^{10}$ 
HL function  hyperbolic tangent sigmoid 
Out function  Logsigmoid 
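The two activation functions in the table above can be illustrated with a minimal forward pass of a single-hidden-layer MLP in Python. This sketch assumes the conventional formulation in which each neuron computes a weighted sum of its inputs plus a bias; that convention, the function name `mlp_forward`, and all weight values are assumptions for illustration and are not taken from the study's trained networks.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid, used in the hidden layer."""
    return math.tanh(x)


def logsig(x):
    """Log-sigmoid, used in the single output neuron."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Inputs -> hidden layer (tansig) -> one output neuron (logsig)."""
    hidden = [
        tansig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return logsig(sum(w * h for w, h in zip(out_w, hidden)) + out_b)


# Two inputs, two hidden neurons; weights are arbitrary placeholders.
y = mlp_forward(
    x=[0.5, -1.0],
    hidden_w=[[0.2, -0.4], [0.7, 0.1]],
    hidden_b=[0.0, 0.1],
    out_w=[1.5, -0.8],
    out_b=0.05,
)
print(y)
```

Because the output passes through a log-sigmoid, it always lies strictly between 0 and 1 and can be thresholded to give the binary diagnosis for one MetS type.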
Types  Predicting Variables  

WPT  WC  SBP  DBP  POD  
WPH  BMI  BFP  HC  SBP  DBP 
WPG  AGE  HC  SBP  
WGT  WC  
WGH  BFP  HC  
WTH  BFP  WC  
PGH  SBP  POD  
PGT  AGE  WG  SBP  
PTH  HC  SBP  DBP  
GHT  WSR  
MetSG  AGE  WC  WHR  SBP 
WPT  WPH  WPG  WGT  WGH  WTH  PGH  PGT  PTH  GHT  MetSG 

4  4  3  2  2  2  2  3  3  2  4 
Target  Predicting Variables  

WPT  WC  SBP  DBP  
WPH  BFP  HG  WHR  SBP  DBP 
WPG  AGE  POD  WG  SBP  
WGT  WC  POD  
WGH  BFP  HC  WHR  DBP  
WTH  BFP  WC  
PGH  AGE  POD  WG  SBP  
PGT  AGE  WG  SBP  
PTH  BFP  HC  SBP  DBP  
GHT  HC  POD  
MetSG  AGE  WC  WHR  SBP 
WPT  WPH  WPG  WGT  WGH  WTH  PGH  PGT  PTH  GHT  MetSG 

3  4  4  2  4  2  4  3  4  2  4 
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. 
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Barrios, M.; Jimeno, M.; Villalba, P.; Navarro, E. Framework to Diagnose the Metabolic Syndrome Types without Using a Blood Test Based on Machine Learning. Appl. Sci. 2020, 10, 8404. https://doi.org/10.3390/app10238404