Article
Peer-Review Record

Evaluation of Machine Learning Algorithms for Early Diagnosis of Deep Venous Thrombosis

Math. Comput. Appl. 2022, 27(2), 24; https://doi.org/10.3390/mca27020024
by Eduardo Enrique Contreras-Luján 1, Enrique Efrén García-Guerrero 1, Oscar Roberto López-Bonilla 1, Esteban Tlelo-Cuautle 2, Didier López-Mancilla 3 and Everardo Inzunza-González 1,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 29 December 2021 / Revised: 1 March 2022 / Accepted: 2 March 2022 / Published: 4 March 2022
(This article belongs to the Special Issue Numerical and Evolutionary Optimization 2021)

Round 1

Reviewer 1 Report

The study is very interesting and presents an important contribution to the field of DVT prediction. I am very grateful to be able to contribute to the evaluation of this study. The paper is well organized and written in good English. I think the paper has great potential to be published. However, there are several aspects that need to be addressed before its acceptance, so I recommend a major revision of the text. My main concern is the incomplete description of the methods.

I have observed that papers dealing with artificial intelligence often do not present an evaluation of the models' configuration parameters. The authors usually present the best result without a proper discussion of how the model was constructed and how the results were evaluated. Thus, I think there is a lack of transparency in several published studies, and this happens in this paper as well.

I believe that the text should present more details about how the models were obtained before it can be published. The authors should address several aspects, such as:

  1. The aim of the study should be stated in the abstract.
  2. The input and output variables should be listed in the abstract.
  3. Lines 46-50 --> objective. It should be at the end of the section.
  4. Lines 62-70 --> this should be the first paragraph of the introduction.
  5. Lines 88-90 --> "As a result, they must be housed in computers, which is not practical because they raise the cost of diagnosis, make it more unpleasant, and consume a lot of energy." Personally, I don't agree with this sentence. Computers are very energy efficient nowadays. Every office has a computer, everyone uses a computer, and computers are cheap, especially for running simple software such as the equations developed for ML applications. The authors should present the advantages of using portable devices to develop models, rather than the disadvantages of using computers.

methods:

  1. Why did the authors not use the age itself as an input?
  2. Lines 144-147: did the authors mean output data?
  3. Section 2.3: why did the NN have 4 hidden layers?
  4. Why did the NN's hidden layers have 32, 64, 32, and 16 neurons, respectively?
  5. Why did the authors not present an extensive search over the ANN configuration parameters (number of layers, number of neurons in each layer, activation functions, optimization algorithms) to find the optimum model? The same applies to the other models: they have several configuration parameters that influence the final result. The search for these optimum configuration values is missing (a minimal sketch of such a search is given after this list).
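
For illustration only, the following is a minimal sketch (not the authors' code) of the kind of configuration search referred to in point 5, using scikit-learn's GridSearchCV over an MLPClassifier. The stand-in data mimics the paper's 11 input variables; the grid values and scoring choice are arbitrary examples.

```python
# Illustrative only: grid search over MLP configurations with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data with 11 features, in place of the paper's dataset.
X, y = make_classification(n_samples=500, n_features=11, random_state=0)

pipe = make_pipeline(StandardScaler(), MLPClassifier(max_iter=2000, random_state=0))
param_grid = {
    "mlpclassifier__hidden_layer_sizes": [(32, 64, 32, 16), (64, 32), (32,)],
    "mlpclassifier__activation": ["relu", "tanh"],
    "mlpclassifier__alpha": [1e-4, 1e-3, 1e-2],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="roc_auc", n_jobs=-1)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```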

Results:

  1. Lines 279-282: reformulate. The word ‘however’ in line 280 can be removed.
  2. Line 294: The word ‘however’ can be removed.
  3. The authors used the RPi4 to train all the models, showed that its results were equal to those obtained by the PC, and that the training time was longer (which was expected). At the end of the paper, the authors suggest a usage scenario for their model in a portable smart system, and in line 338 they state that "The smart system will have a trained ML model…". So why did the authors compare training parameters on both devices if the RPi4 will only use an already trained model? Training time and hardware efficiency are not important in this scenario.
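
As an aside, the deployment scenario described in point 3 is typically realized by training elsewhere and shipping only a serialized model; a minimal sketch with joblib is shown below. The file name and the RandomForestClassifier stand-in are hypothetical, not taken from the paper.

```python
# Illustrative only: train on the PC, serialize, and load on the RPi4 for
# inference only, so training time never matters on the portable device.
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# On the PC: train and serialize (stand-in data and estimator).
X, y = make_classification(n_samples=500, n_features=11, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)
joblib.dump(model, "dvt_model.joblib")

# On the RPi4: load the already trained model and predict.
deployed = joblib.load("dvt_model.joblib")
print(deployed.predict(X[:5]))
```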

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript by Contreras-Luján and colleagues presents an interesting study on machine learning models for DVT. The major contribution is an extensive analysis of existing ML models on a new dataset; no novel ML model is proposed.

One of the major limitations of the paper is that existing results are not properly analyzed. Indeed, VTE prediction with ML (referred to as DVT in this paper) has been studied extensively. Previous studies [Ferroni et al., Medical Decision Making 2017; Ferroni et al., Disease Markers 2017; Riondino et al., Cancers 2019] introduce a model based on Multi-Kernel learning and Random Optimization, which outperforms existing models for VTE prediction in cancer patients. This paper should analyze these studies and compare against them.

The introduction shows a certain lack of medical terminology and should be extensively revised.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

In this work, the authors have evaluated six machine learning models to predict DVT using 11 input variables. Different performance metrics and execution times are reported.

 

Strengths: The authors have described in detail the tools used for the construction of the ML models. Very clear motivation for this work was provided.

 

Limitations: Previous work was not cited. Data sources were not described; only a reference was provided. It is not clear why certain input variables were used. It is also not clear which variables contributed more toward the predictions.

 

Comments:

Could you please provide the data and model for reproducibility?

 

The authors have used previous DVT occurrences as input variables. However, if a patient is predicted to have a DVT, then it is more a question of recurrence of VT than of a first DVT. Could you please clarify why DVT diagnosis was used as the output variable rather than VT recurrence?

 

Could you please discuss which variables contributed more toward prediction in these ML models?
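
For illustration only (not the authors' analysis), permutation importance is one standard way to report which of the 11 input variables contribute most to the predictions; a minimal sketch with stand-in data follows.

```python
# Illustrative only: permutation importance on a held-out validation split.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=11, random_state=0)  # stand-in data
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.15, stratify=y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
result = permutation_importance(model, X_val, y_val, n_repeats=20, random_state=0)
for i in result.importances_mean.argsort()[::-1]:
    print(f"feature {i}: {result.importances_mean[i]:.3f} +/- {result.importances_std[i]:.3f}")
```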

 

Line 40: time as algorithms become more and more optimal and less complex.

I do not think that algorithms are becoming less and less complex; rather, we have more data these days, which is why we are able to apply NNs.

 

Line 59: a NN capable of excluding DVT without …. 

This sentence is not clear. What do the authors mean by excluding DVT?

 

Please describe the data source briefly in the main text and then refer to the source. 

 

Why did the authors decide to use scikit-learn for the neural network implementation? Why were Keras or other frameworks not considered?
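
For context, the 4-hidden-layer architecture mentioned in Section 2.3 (32-64-32-16 neurons) fits in a single scikit-learn estimator, which may explain the framework choice; the activation and solver below are assumptions, not values stated in the paper.

```python
# Illustrative only: scikit-learn applies one activation to all hidden layers
# and trains on CPU, whereas Keras allows per-layer control and GPU training.
from sklearn.neural_network import MLPClassifier

nn = MLPClassifier(hidden_layer_sizes=(32, 64, 32, 16),
                   activation="relu",   # assumed, not stated in the paper
                   solver="adam",       # assumed, not stated in the paper
                   max_iter=2000,
                   random_state=0)
```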

 

How did you divide the dataset into training and validation sets? How was the validation done? How did you optimize the hyper-parameters?

How was the data augmentation done?
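
For illustration only (not the authors' protocol), one common answer to the split and validation questions is a stratified 85/15 split with k-fold cross-validation on the training portion for hyper-parameter selection; the 85/15 proportion follows the split mentioned later in this review, and everything else is a stand-in.

```python
# Illustrative only: stratified split plus cross-validation on the training set.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=11, random_state=0)  # stand-in data
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)

nn = MLPClassifier(hidden_layer_sizes=(32, 64, 32, 16), max_iter=2000, random_state=0)
cv_auc = cross_val_score(nn, X_train, y_train, cv=5, scoring="roc_auc")
print(cv_auc.mean(), cv_auc.std())
```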

 

Line 154-155:  After that, using the Seaborn pairplot command, plots are generated between all of the data so that each of the values may be discriminated against. 

Please provide these plots; if they are too much for the main text, then perhaps include them in the supplementary section.
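
For reference, a pairplot of the kind quoted in lines 154-155 takes only a few lines of Seaborn; the column names and label below are hypothetical.

```python
# Illustrative only: pairplot of the inputs, colored by the class label.
import pandas as pd
import seaborn as sns
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=4, random_state=0)  # small stand-in
df = pd.DataFrame(X, columns=[f"var_{i}" for i in range(X.shape[1])])
df["dvt"] = y
sns.pairplot(df, hue="dvt")
```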

 

How was the data imbalance issue handled?
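
For illustration only (not the authors' approach), two standard options in this setting are oversampling the minority class with SMOTE (from the imbalanced-learn package) or reweighting the classes inside the estimator; a minimal sketch with stand-in data follows.

```python
# Illustrative only: two common ways to handle class imbalance.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Stand-in imbalanced data (80% / 20%).
X, y = make_classification(n_samples=500, n_features=11, weights=[0.8, 0.2], random_state=0)

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)                      # option 1: oversample
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)   # option 2: reweight
```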

 

Maybe add some introduction on the PC and the Raspberry Pi: why were they chosen? Some explanation would be highly appreciated.

 

Line 274: They are a key tool for assessing an ML model’s performance.

The ROC is not a tool but rather a metric to evaluate the performance of machine learning models.

Line 299-302: The performance metrics are the compute Area Under the Curve (AUC) using trapezoidal method . . .

Typos in this sentence.
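
For reference, the AUC computation quoted above corresponds to scikit-learn's roc_curve followed by auc, which integrates the ROC curve with the trapezoidal rule; the data and model below are stand-ins, not the authors' code.

```python
# Illustrative only: ROC curve and trapezoidal AUC with scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=11, random_state=0)  # stand-in data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.15, stratify=y, random_state=0)
scores = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]
fpr, tpr, _ = roc_curve(y_te, scores)
print("AUC:", auc(fpr, tpr))  # auc() applies the trapezoidal rule to the curve
```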

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors improved the paper. Congratulations on the study.

Author Response

Point 1. The authors improved the paper. Congratulations on the study.

Response 1. Thank you very much for your professional comments and contribution to the evaluation of this study. We believe that all your comments, questions, and suggestions helped clarify this paper's contribution and improve its quality.

We appreciate your feedback.

Best regards and have a good day.

Sincerely,

The authors

Reviewer 3 Report

If you divided the dataset into training (85%) and validation (15%), how did you check the performance of the models? Why was a separate test dataset not kept aside to compare the performance of the machine learning models?
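
For illustration only, the reviewer's suggestion corresponds to a three-way split where the test set is never touched during training or tuning; the 70/15/15 proportions below are arbitrary examples, not taken from the paper.

```python
# Illustrative only: train/validation/test split with a held-out test set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=11, random_state=0)  # stand-in data
X_trval, X_test, y_trval, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trval, y_trval, test_size=0.15 / 0.85, stratify=y_trval, random_state=0)
# Fit and tune on (X_train, y_train) / (X_val, y_val); report final metrics on (X_test, y_test).
```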

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
