Article
Peer-Review Record

Reducing False Negatives in Ransomware Detection: A Critical Evaluation of Machine Learning Algorithms

Appl. Sci. 2022, 12(24), 12941; https://doi.org/10.3390/app122412941
by Robert Bold 1, Haider Al-Khateeb 2,* and Nikolaos Ersotelos 3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 28 October 2022 / Revised: 8 December 2022 / Accepted: 10 December 2022 / Published: 16 December 2022

Round 1

Reviewer 1 Report

I would like to commend the authors for this sound and interesting topic. The tools utilized reflect current trends, with study objectives that are relevant and timely. I only have a few comments on the paper.

1. I believe that the introduction needs to be highlighted separately, since the third paragraph is quite long, encompassing the literature and gap, purpose, and tools. I think, for clarity, these can be separated into two or three paragraphs to avoid confusion.

2. The inclusion of research questions would benefit the paper by indicating the probable output readers can expect upon going through the paper.

3. The methodology may opt to consider pseudocode alongside the calculation explanation so readers may benefit from it, especially those who plan to extend/cite the paper.

4. The evaluation I feel needs proper representation like utilizing the Taylor Diagram to encompass all possible accuracy, standard deviation, RMSEA, and correlation among different machine learning tools. You may want to consider https://doi.org/110.3390/su141811329 for reference.

5. The formatting and references of the paper, together with the grammar and sentence construction should be checked.

I hope these comments would help the authors to have their paper ready for publication. 

Author Response

“I would like to commend the authors for this sound and interesting topic. The tools utilized reflect current trends, with study objectives that are relevant and timely. I only have a few comments on the paper.

  1. I believe that the introduction needs to be highlighted separately, since the third paragraph is quite long, encompassing the literature and gap, purpose, and tools. I think, for clarity, these can be separated into two or three paragraphs to avoid confusion.”

The authors thank the reviewer for their valuable comments to improve the current state of the paper. Please note the following revision in response to the review:

The introduction has been revised to split the long paragraph into separate paragraphs, each focusing on one of the following topics:

  • the ransomware-as-a-service phenomenon
  • the spread of ransomware via phishing emails
  • other delivery methods, such as unpatched vulnerabilities
  • the challenges posed by new malware trends
  • finally, the opportunities and challenges offered by ML.

“2. The inclusion of research questions would benefit the paper by indicating the probable output readers can expect upon going through the paper.”

Two key research questions have been included in the introduction.

“3. The methodology may opt to consider pseudocode alongside the calculation explanation so readers may benefit from it, especially those who plan to extend/cite the paper.”

For further clarity, sample code has been included, e.g., showing how to aggregate the API calls made by the processes spawned by each sample file run within our experiment, with the results output to a CSV file.
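As an illustration of what such aggregation code might look like, here is a minimal sketch. The log layout (tuples of sample ID, process ID, API name), the example API names, and the function names are assumptions for illustration only, not the authors' actual implementation:

```python
import csv
from collections import Counter

def aggregate_api_calls(log_rows):
    """Aggregate API-call counts per sample across every process it spawned.

    `log_rows` is an iterable of (sample_id, process_id, api_name) tuples;
    this layout is assumed for illustration, not taken from the paper.
    """
    counts = {}
    for sample_id, _pid, api_name in log_rows:
        counts.setdefault(sample_id, Counter())[api_name] += 1
    return counts

def feature_rows(counts):
    """Yield CSV rows: a header of API names, then one count row per sample."""
    apis = sorted({api for c in counts.values() for api in c})
    yield ["sample"] + apis
    for sample_id, c in sorted(counts.items()):
        yield [sample_id] + [c.get(api, 0) for api in apis]

# Hypothetical monitoring log: two processes spawned by sample "s1".
log = [
    ("s1", 1001, "CreateFileW"),
    ("s1", 1002, "CreateFileW"),
    ("s1", 1001, "WriteFile"),
    ("s2", 2001, "RegOpenKeyExW"),
]
counts = aggregate_api_calls(log)
with open("api_features.csv", "w", newline="") as f:
    csv.writer(f).writerows(feature_rows(counts))
```

Aggregating per sample (rather than per process) is what makes the counts usable as a single feature vector per file, regardless of how many child processes the sample spawned.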

“4. The evaluation I feel needs proper representation like utilizing the Taylor Diagram to encompass all possible accuracy, standard deviation, RMSEA, and correlation among different machine learning tools. You may want to consider https://doi.org/110.3390/su141811329 for reference.”

Thank you for your constructive feedback. The link https://doi.org/110.3390/su141811329 was not accessible online (we received a “DOI Prefix [110.3390] Not Found” error). However, we appreciate the feedback and confirm there can be several metrics for comparing ML algorithms. In our study we have:

  • Discussed and utilised the contextual evaluation metrics used in related work, as explained in Section 4.1
  • Discussed and utilised novel metrics proposed in the literature, such as the Positive Likelihood Ratio, Negative Likelihood Ratio, Diagnostic Odds Ratio, Youden's index, Number Needed to Diagnose, Number Needed to Misdiagnose, and Net Benefit.
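All of these diagnostic metrics can be derived from the four confusion-matrix counts. The following is an illustrative sketch only (the function name and the default Net Benefit threshold are assumptions, not the authors' code):

```python
def diagnostic_metrics(tp, fn, tn, fp, threshold=0.5):
    """Compute diagnostic evaluation metrics from confusion-matrix counts.

    `threshold` is the probability threshold p_t used for Net Benefit.
    """
    n = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)               # true positive rate
    specificity = tn / (tn + fp)               # true negative rate
    accuracy = (tp + tn) / n
    lr_pos = sensitivity / (1 - specificity)   # Positive Likelihood Ratio
    lr_neg = (1 - sensitivity) / specificity   # Negative Likelihood Ratio
    dor = lr_pos / lr_neg                      # Diagnostic Odds Ratio
    youden = sensitivity + specificity - 1     # Youden's index (J)
    nnd = 1 / youden                           # Number Needed to Diagnose
    nnm = 1 / (1 - accuracy)                   # Number Needed to Misdiagnose
    # Net Benefit at threshold p_t: TP/n - FP/n * (p_t / (1 - p_t))
    net_benefit = tp / n - (fp / n) * (threshold / (1 - threshold))
    return {
        "lr_pos": lr_pos, "lr_neg": lr_neg, "dor": dor, "youden": youden,
        "nnd": nnd, "nnm": nnm, "net_benefit": net_benefit,
    }
```

For example, a classifier with TP=90, FN=10, TN=80, FP=20 has sensitivity 0.9 and specificity 0.8, giving LR+ of 4.5, LR− of 0.125, a Diagnostic Odds Ratio of 36, and Youden's J of 0.7.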

“5. The formatting and references of the paper, together with the grammar and sentence construction should be checked.

I hope these comments would help the authors to have their paper ready for publication.”

We have revisited the manuscript for additional proof-reading.

Reviewer 2 Report

This paper presents an empirical comparison of the performance of ML algorithms for the task of detecting ransomware, by monitoring API calls. Special focus is put on the False Negatives that each algorithm produces, which is a "red flag" for an algorithm performing in such a domain.

Language is good, the existing work that is analyzed is sufficient, and the experimental setup used to obtain the results seems sound and realistic.

However, regarding the design, I would expect to see a more thorough evaluation, e.g. k-fold validation, more datasets, more state-of-the-art classification algorithms (and perhaps all composed in an extensible open-source tool that can be used for evaluating algorithms applied in the security sector).

Minor comments:

Including a future work part might improve the overall presentation of the work.

The third paragraph of the introduction is too long, and perhaps disrupts the focus of the reader. For example, some historical background could have been placed in Section 2. 

l. 94: "has been used": Perhaps this should be rephrased (e.g. is used), because the way it is currently written implies use in other (past) studies. 

l. 124: a missing r in "behaviour"

l. 133: but if this is not certain, then perhaps the following discussion should have been based upon the assumption of a different approach, other than that of Netto et al.

l. 454: The title of 4.2 should change, since "novel" implies that the metrics are introduced in the current paper, but this is not the case, as it is mentioned in l. 455. This is somehow inconsistent.

Table 3: The first row could be omitted, since it contains the same value in all columns.

l. 567-574: I would prefer to see the notes for Table 3 in a paragraph instead of in a numbered list.

l.576-onwards: Quite a few whitespaces are missing.

 

Author Response

“This paper presents an empirical comparison of the performance of ML algorithms for the task of detecting ransomware, by monitoring API calls. Special focus is put on the False Negatives that each algorithm produces, which is a "red flag" for an algorithm performing in such a domain.

Language is good, the existing work that is analyzed is sufficient, and the experimental setup used to obtain the results seems sound and realistic.”

The authors thank the reviewer for their valuable comments to improve the current state of the paper. Please note the following revision in response to the review:

“However, regarding the design, I would expect to see a more thorough evaluation, e.g. k-fold validation, more datasets, more state-of-the-art classification algorithms (and perhaps all composed in an extensible open-source tool that can be used for evaluating algorithms applied in the security sector).”

We do appreciate there can be several metrics for comparing ML algorithms. Integrating all classification algorithms within an open-source tool is out of scope for this paper, but we have considered the suggestion in the following ways to improve this study:

  • We have included a new figure (Figure 6) to illustrate the Net Benefit for a range of probability thresholds.
  • We have fully revised the conclusion section and also included a future work section, in which we cover the opportunity for an extensible open-source tool to be developed.
  • The study discusses and utilises the contextual evaluation metrics used in related work, as explained in Section 4.1
  • Additionally, our paper discusses and utilises novel metrics proposed in the literature, such as the Positive Likelihood Ratio, Negative Likelihood Ratio, Diagnostic Odds Ratio, Youden's index, Number Needed to Diagnose, Number Needed to Misdiagnose, and Net Benefit.

“Minor comments:

Including a future work part might improve the overall presentation of the work.”

A future work section was added to the paper.

“The third paragraph of the introduction is too long, and perhaps disrupts the focus of the reader. For example, some historical background could have been placed in Section 2.”

We have implemented the proposed modification.

“l. 94: "has been used": Perhaps this should be rephrased (e.g. is used), because the way it is currently written implies use in other (past) studies.”

We have implemented the proposed modification.

“l. 124: a missing r in "behaviour”"

We have implemented the proposed modification.

“l. 133: but if this is not certain, then perhaps the following discussion should have been based upon the assumption of a different approach, other than that of Netto et al.”

Thank you for your constructive feedback. The sentence has been revised to avoid confusion. The approach described in the referenced study by Netto et al. (2018) is technically clear to us, and the critical discussion included in our study applies to it because it does not rely solely on the utilisation of the FileSystemWatcher class. However, to further benefit the reader, we mention an example of how a programmer can build such a trigger. Likewise, the buffer-overflow discussion within the same paragraph concerns a potential threat regardless of the class used. Hence, we now state more clearly that the FileSystemWatcher class is an example.

“l. 454: The title of 4.2 should change, since "novel" implies that the metrics are introduced in the current paper, but this is not the case, as it is mentioned in l. 455. This is somehow inconsistent.”

We have implemented the proposed modification.

“Table 3: The first row could be omitted, since it contains the same value in all columns.”

We have implemented the proposed modification.

“l. 567-574: I would prefer to see the notes for Table 3 in a paragraph instead of in a numbered list.”

We have implemented the proposed modification.

“l.576-onwards: Quite a few whitespaces are missing.”

Thank you for your constructive feedback. We have revisited the manuscript for missing whitespaces.

Reviewer 3 Report

The work is related to ransomware, which allows criminals with limited knowledge to launch ransomware attacks on systems. The research work offers a critical literature review, examination, and testing of various state-of-the-art machine learning algorithms and models to detect ransomware. The previous focus was to report precision while overlooking the significance of other important factors in confusion matrices, such as false negatives. Therefore, a critical evaluation of ML models using 800 malware and 800 benign samples is done to mitigate ransomware at different levels of the detection system. Some of the detailed comments and recommended changes are mentioned below; as the paper cannot be accepted in its current form, the changes are highly recommended and would be appreciated.

 

Some of the Comments are mentioned below:

  • The writing of the paper can be improved further to enhance the quality of the draft.
  • A generic model of the ransomware detection strategy or system must be shown using a flowchart or figure, so that readers may understand the overall process clearly.
  • In Heading 3 at line 245, some proposed-method-related detail must be included before directly discussing the host environments used for experiments (first, something related to its design or architecture must be presented, and then the experiment-related material).
  • A description of the proposed methods with some flowchart or algorithm representation is recommended in order to make the research design and methodologies more appropriate. The quality of the presentation will be improved in this manner.
  • Some more detail (please elaborate a bit for better understanding) regarding the employment of specific ML algorithms to solve the malware-related problem is highly recommended.
  • The conclusion can be made a bit brief while discussing the detailed results in the discussion and results portion, finally summarizing them in the conclusion very briefly (right now, there seems to be much more detail in it).
  • More paper referencing from 2021 and 2022 must be included in relevant sections, such as Background/ML Algorithms, and in the text where necessary.

Author Response

“The work is related to ransomware, which allows criminals with limited knowledge to launch ransomware attacks on systems. The research work offers a critical literature review, examination, and testing of various state-of-the-art machine learning algorithms and models to detect ransomware. The previous focus was to report precision while overlooking the significance of other important factors in confusion matrices, such as false negatives. Therefore, a critical evaluation of ML models using 800 malware and 800 benign samples is done to mitigate ransomware at different levels of the detection system. Some of the detailed comments and recommended changes are mentioned below; as the paper cannot be accepted in its current form, the changes are highly recommended and would be appreciated.”

 The authors thank the reviewer for their valuable comments to improve the current state of the paper. Please note the following revision in response to the review: 

“Some of the Comments are mentioned below:

  • The writing of the paper can be improved further to enhance the quality of the draft.”

Thank you for your constructive feedback. We have revisited the manuscript for additional proof-reading and in some cases rewriting sentences and paragraphs as part of the review process.

“• A generic model of the ransomware detection strategy or system must be shown using a flowchart or figure, so that readers may understand the overall process clearly.

  • In Heading 3 at line 245, some proposed-method-related detail must be included before directly discussing the host environments used for experiments (first, something related to its design or architecture must be presented, and then the experiment-related material).
  • A description of the proposed methods with some flowchart or algorithm representation is recommended in order to make the research design and methodologies more appropriate. The quality of the presentation will be improved in this manner.”

Thank you for your constructive feedback about the methodology section. We have revised the first paragraph of our methodology section to show how the architecture is influenced by a ransomware detection strategy based on Machine Learning classifiers. We have developed and included a new figure as suggested (Figure 1), which describes both the overall ML process for malware detection and the processing phases required to train an ML classifier (preprocessing, feature extraction, etc.). We have also made other revisions to the methodology section for further clarity, such as including sample code on how to aggregate the API calls made by the processes spawned by each sample file run within our experiment, with the results output to a CSV file.

“• Some more detail (please elaborate a bit for better understanding) regarding the employment of specific ML algorithms to solve the malware-related problem is highly recommended.”

We have incorporated more details in the manuscript regarding the employment of ML algorithms for the detection of ransomware. Some of these changes are now part of the results section, e.g., more analysis and a new figure. Other revisions took place within the background section, e.g., more recent papers have been referenced and discussed in relation to specific algorithms.

“• The conclusion can be made a bit brief while discussing the detailed results in the discussion and results portion, finally summarizing them in the conclusion very briefly (right now, there seems to be much more detail in it).”

We have revisited the conclusion section accordingly making sure it includes a summary of findings while all the results are contained within the results section. We have also added a new section on future work in this area.

“• More paper referencing from 2021 and 2022 must be included in relevant sections, such as Background/ML Algorithms, and in the text where necessary.”

Thank you for your constructive feedback. We have revisited the Background section and extended the critical discussion with more studies from 2021 and 2022 as suggested.

Round 2

Reviewer 1 Report

The comments and suggestions were addressed accordingly. Thank you.

Author Response

We are pleased to know that we have addressed your constructive feedback comments and that you have accepted our paper for publication. Your valuable comments helped us revise the document and improve the proposed methodology and research outcomes.

Reviewer 2 Report

I appreciate the text additions, however there are still some parts that must be improved.

It is not clear whether the dataset consists of 1600 or 1606 samples. Also, 20% of the dataset equals 320 samples, but the results of Table 3 indicate that the test samples were 293, a fact that creates some confusion for the reader.

It would be important to have a note on class imbalance, since, as I anticipate, in real world cases, the number of positive examples will be much lower than the negative examples of the normal traffic. It would greatly improve the quality of this work if a series of experiments with imbalanced datasets was included, otherwise at least recognize and state such issues.

l. 107-110: Since such questions are put in place, I would expect clear and explicit answers to them, e.g. at the end of the results section, or e.g. in the conclusions.

l. 767: ...this thesis... -> this paper?

l. 779: One would expect the algorithm to be fine-tuned before being put under comparison. If further fine-tuning is required, then how can the results of the study be considered precise and reliable?

Author Response

“It is not clear whether the dataset consists of 1600 or 1606 samples. Also, 20% of the dataset equals 320 samples, but the results of Table 3 indicate that the test samples were 293, a fact that creates some confusion for the reader.”

We greatly appreciate this important follow-up; this particular feedback helped us investigate our experimental procedure and related notes. We confirm this has now been corrected: the number of samples included in this particular experiment has been amended to make it clear to the reader within Section 3.2, Dataset Preparation. Overall, the sample size was 1465 (730 malware, 735 benign). Hence, 20% of the overall dataset gives 293 samples, which we confirm is the number we used for testing.
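The corrected arithmetic is easy to verify: with 1465 samples, a standard 80/20 shuffle split yields 293 test samples. A stdlib-only sketch (illustrative; the fixed seed and split mechanics are assumptions, not the authors' exact procedure):

```python
import random

# Reported dataset: 1465 samples in total (730 malware, 735 benign).
samples = [("malware", i) for i in range(730)] + [("benign", i) for i in range(735)]

random.seed(42)                          # fixed seed so the split is reproducible
random.shuffle(samples)

test_size = round(len(samples) * 0.2)    # 20% of 1465 -> 293 samples
test_set = samples[:test_size]
train_set = samples[test_size:]
```

This leaves 1172 samples for training, matching the 80/20 proportions reported above.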

“It would be important to have a note on class imbalance, since, as I anticipate, in real world cases, the number of positive examples will be much lower than the negative examples of the normal traffic. It would greatly improve the quality of this work if a series of experiments with imbalanced datasets was included, otherwise at least recognize and state such issues.”

With regard to the issue of class imbalance in real-world scenarios, where the number of positive samples is much lower, we appreciate that this is a genuine concern in this field of research. Firstly, we balanced our samples to avoid under-sampling/over-sampling. We also included a paragraph in the conclusion section, supported by a reference, to make sure this limitation is clear to the reader and well acknowledged.

“l. 107-110: Since such questions are put in place, I would expect clear and explicit answers to them, e.g. at the end of the results section, or e.g. in the conclusions.”

The conclusion section has now been amended to more precisely summarise the answer to the research questions.

“l. 767: ...this thesis... -> this paper?”

We have corrected the typo.

“l. 779: One would expect the algorithm to be fine-tuned before being put under comparison. If further fine-tuning is required, then how can the results of the study be considered precise and reliable?”

With regard to fine-tuning the ANN Model 3, we have revised the wording of the paragraph. Furthermore, we confirm that fine-tuning was included as part of this study to produce the best possible performance. However, we wanted to acknowledge the theoretical possibility (including via novel means) of further enhancement.

Reviewer 3 Report

It is improved according to my observations, so it is now accepted from my side.

Author Response

We are pleased to know that we have addressed your constructive feedback comments and that you have accepted our paper for publication. Your valuable comments helped us revise the document and improve the proposed methodology and research outcomes.
