Peer-Review Record

Anomaly Detection Method for Unknown Protocols in a Power Plant ICS Network with Decision Tree

Appl. Sci. 2023, 13(7), 4203; https://doi.org/10.3390/app13074203
by Kyoung-Mun Lee 1,2, Min-Yang Cho 3,4, Jung-Gu Kim 5 and Kyung-Ho Lee 2,*
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Reviewer 5: Anonymous
Submission received: 16 February 2023 / Revised: 23 March 2023 / Accepted: 24 March 2023 / Published: 26 March 2023

Round 1

Reviewer 1 Report

In this paper, aiming at anomaly detection for ICS operational stability, the anomaly identification process is designed using a simple and effective AI method. The principle is simple, and the main contribution is the engineering design. I have only minor queries.

1. The performance of the proposed method is significantly better than that of the comparison methods. What is the main reason for this?

2. The paper does not clearly explain whether the tested data comes from simulation or from the actual operation of a power plant. For such an engineering problem, it would be better if the experiments were based on data from multiple power plants.

3. There are inevitably some differences between power plants. Does the proposed method need to be customized for each power plant, or can it provide common, portable code?

Author Response

Dear Reviewer

Thank you for your helpful review comments.
We have tried to incorporate your review comments as much as possible.
Comments that could not be addressed in this revision will be addressed in future research to improve the work.

Thank you.

Author Response File: Author Response.docx

Reviewer 2 Report


The submitted manuscript deals with anomaly detection methods for unknown protocols in power plant ICS networks using machine learning. It provides a basis for securing the health of a closed network, contributing to stable system operation. The paper does have merit, but several issues need to be addressed.

Below I include some detailed comments.


A new method is proposed, but the introduction must be improved and better supported by appropriate references.


Line 248: Figure 7 is followed by a figure numbered 2. Two pages later, a Figure 3 appears again. The numbering of figures must be revised.


What does the sign * in line 266 refer to (DarkTrace*)?


What is meant by “NW configuration” (line 287)?


The numbering of tables must be revised too; see, e.g., page 11.


What are the limitations of the proposed approach? A short discussion on them should be provided.


The reference list should be extended to also include some recent works (from the last two years).


Author Response

Dear Reviewer

Thank you for your helpful review comments.
We have tried to incorporate your review comments as much as possible.
Comments that could not be addressed in this revision will be addressed in future research to improve the work.

Thank you.

Author Response File: Author Response.docx

Reviewer 3 Report

Section 2.1 on related work seems too general, and there are too few details on works similar to the one presented by the authors.

In line 176, the authors propose the null hypothesis. Why is the value 9 chosen? It should be explained in the text.

Lines 225, 231, and 238 with relational algebra are difficult to read and analyze. Moreover, I cannot understand why the authors write "S ≥ ST ∧ S ≤ ST ∧ D ≥ DT ∧ D ≤ DT" instead of "S = ST ∧ D = DT" (according to mathematical logic it is the same).
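For reference, the simplification invoked here is just the antisymmetry of the order relation; written out (treating ST and DT as single compared quantities, denoted S_T and D_T below, which is an assumption about the paper's notation):

```latex
% Antisymmetry of \leq: (x \geq y) \land (x \leq y) \iff x = y.
% Applied to each pair of conjuncts:
(S \geq S_T \wedge S \leq S_T) \wedge (D \geq D_T \wedge D \leq D_T)
  \iff (S = S_T) \wedge (D = D_T)
```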

Lines 269/270: the authors write "control traffic data from the power plant was collected", but they should be more specific about the origin of the data. Which power plant is it?


Small errors:
In the title, "in Power Plan" should be "in Power Plant".
In the text, there should be a reference to Table 1.
The numbers of tables and figures should be corrected:
page 9: Fig. 2 should be Fig. 8;
page 11: Table 1 should be Table 5, Fig. 3 should be Fig. 9, etc.

Additionally, the positions of tables and figures in the text should be corrected, because they make the manuscript difficult to read. Please also make the font smaller in Tables 10-12 so that the numbers in the last few rows fit.

Author Response

Dear Reviewer

Thank you for your helpful review comments.
We have tried to incorporate your review comments as much as possible.
Comments that could not be addressed in this revision will be addressed in future research to improve the work.

Thank you.

Author Response File: Author Response.docx

Reviewer 4 Report

1. Overall, the authors have presented good work. However, they may consider the following:

2. The authors may revise the abstract to elaborate more on the problem statement, findings, and contributions.

3. The introduction is not clear. The authors may improve it further.

4. How can the authors support or justify this claim (prevention of performance bottlenecks and failures)?

5. Why has the decision tree been chosen, with reference to Figure 2? The caption of this figure requires amendment.

6. The authors may elaborate more clearly on the novelty/contribution of their work, and how it contributes to the literature, in the second-to-last paragraph of the introduction.

7. Thorough proofreading is recommended.

8. Most of the figures are of low resolution and hard to read.

9. A few references are missing some information; please complete them.

10. The conclusion is not clear and needs revision for clarity and for alignment with the abstract and title.

11. The provided references are adequate. However, the authors are recommended to consider more recent and related works, such as:


Adeyemo, V. E., Abdullah, A., JhanJhi, N. Z., Supramaniam, M., & Balogun, A. O. (2019). Ensemble and deep-learning methods for two-class and multi-attack anomaly intrusion detection: An empirical study. International Journal of Advanced Computer Science and Applications, 10(9). https://doi.org/10.14569/IJACSA.2019.0100969

Author Response

Dear Reviewer

Thank you for your helpful review comments.
We have tried to incorporate your review comments as much as possible.
Comments that could not be addressed in this revision will be addressed in future research to improve the work.

Thank you.

Author Response File: Author Response.docx

Reviewer 5 Report

In this paper, the authors propose using a decision tree to detect illegal assets and abnormal traffic in the control network of a power plant.

Some suggestions and remarks should be addressed to improve the paper's quality:

1) The article should undergo extensive English revision, since many phrases contain language mistakes.

2) Paper title: The authors used only the decision tree among many supervised machine learning techniques. So, I suggest changing the title to "Anomaly Detection Method for Unknown Protocols in Power Plant ICS Network with decision tree".

2) Section Abstract: Please add one sentence at the end of the abstract that summarizes the experimental results of this paper.

3) Section 1. Introduction:

The Introduction section explains the need for an automatic solution to the problem in the control network of a power plant. However, some points are missing from the Introduction section:

- The motivations for using machine learning, and particularly the decision tree.

- What are the existing works that used machine learning for anomaly detection in Industrial Control Systems?

- What are the limitations of the existing works that used machine learning for anomaly detection in Industrial Control Systems?

- How can the proposed solution overcome these limitations?

4) Section 2.1. Related Work:

In this section, the authors are expected to discuss the existing works based on machine learning. Then, the authors need to explain the differences between the proposed work and the existing works.

There are many recent works that applied machine learning for anomaly detection in Industrial Control Systems:

For examples:

1. Perales Gómez, Ángel Luis, et al. "Madics: A methodology for anomaly detection in industrial control systems." Symmetry 12.10 (2020): 1583.

2. Mokhtari, Sohrab, et al. "A machine learning approach for anomaly detection in industrial control systems based on measurement data." Electronics 10.4 (2021): 407.

3. Wang, Chao, et al. "Anomaly detection for industrial control system based on autoencoder neural network." Wireless Communications and Mobile Computing 2020 (2020): 1-10.

4. Abdelaty, Maged, Roberto Doriguzzi-Corin, and Domenico Siracusa. "DAICS: A deep learning solution for anomaly detection in industrial control systems." IEEE Transactions on Emerging Topics in Computing 10.2 (2021): 1117-1129.

5) In Section 2.2. Proposal:

There is no explanation of how the decision tree works, nor of how it is applied for anomaly detection in Industrial Control Systems.

6) In Section 2.2.4. Abnormal Sign Judgement Rate:

- In Table 2 (Anomaly detection true/false matrix), there is no need to include the performance-measure equations, since the same equations are already given in Table 3.

- In Table 3 (Evaluation criteria), the authors wrongly define the positive and negative classes for the anomaly detection field. Thus, the descriptions of the measures are not correct.

If you refer to the papers mentioned in comment 4, you will see the following definitions:

True Positive (TP): indicates the number of anomalies properly detected by the model.
True Negative (TN): indicates the number of non-anomalies properly classified.
False Positive (FP): indicates the number of non-anomalies wrongly classified as anomalies.
False Negative (FN): indicates the number of anomalies wrongly classified as non-anomalies.

6) In Section 3. Results:

In lines 268-278, the number and names of features used in the training dataset should be described as well.

7) In Section 3.2.3. Analysis of the Verification Results of the Proposed Model


The authors presented results only for Accuracy and Fall-out, and did not include the other performance measures mentioned in Table 3 (Evaluation criteria). Furthermore, in general, the results need more discussion.

Author Response

Dear Reviewer

Thank you for your helpful review comments.
We have tried to incorporate your review comments as much as possible.
Comments that could not be addressed in this revision will be addressed in future research to improve the work.

Thank you.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Regarding my previous comment #1:
I do not see any added references in the introduction! There are even fewer than in the initial submission!

Regarding my previous comment #2:
There is still a problem with numbering of added elements. Figure 10 occurs twice in the paper.

Regarding my previous comment #7:
I do not see any added references. The total number of them is exactly the same as before.

Author Response

Dear Reviewer,


Thank you for your helpful review comments.
In line with your comments, recent and highly relevant references have been added, and the corresponding content has been incorporated into Section 2.1.4, Anomaly Detection in Industrial Control Systems.
In future research, we will also refer to the latest related work in this field.


Thank you.

Author Response File: Author Response.docx

Reviewer 4 Report

The authors addressed most of the comments. However, the literature review is still missing the latest references, and I do not agree with their response on that point. The authors need to benchmark their work against different studies and briefly state their contribution to the literature and their novelty.

They may consider works such as (https://doi.org/10.3390/info13070322) and other related references.

Author Response

Dear Reviewer,

Thank you for your helpful review comments.
In line with your comments, recent and highly relevant references have been added, and the corresponding content has been incorporated into Section 2.1.4, Anomaly Detection in Industrial Control Systems.
In future research, we will also refer to the latest related work in this field.

Thank you.

Author Response File: Author Response.docx

Reviewer 5 Report

The authors failed to address the comments effectively, and so the revised manuscript has not been improved sufficiently.

Author Response

Dear Reviewer,

Thank you for your helpful review comments.
In line with your comments, recent and highly relevant references have been added, and the corresponding content has been incorporated into Section 2.1.4, Anomaly Detection in Industrial Control Systems.
In future research, we will also refer to the latest related work in this field.

Thank you.

Author Response File: Author Response.docx

Round 3

Reviewer 2 Report

Although the manuscript is better now, it still needs to be revised.

The newly added sentence in lines 61-63 is not clear.

The flowcharts in Figures 7 and 8 are not clear, as they include simple lines instead of arcs.

In the chart in Figure 15 - what is the time unit for the y-axis?


Author Response

Dear Reviewer

Thank you for your helpful review comments.
Based on your review comments, the English expression in the manuscript has been revised. In addition, Figures 7 and 8 have been revised so that their meaning is clear.

Thank you.

Author Response File: Author Response.docx

Reviewer 5 Report

For clarification, I list my comments on the responses:

Point 2: Paper title: The authors used only the decision tree among many supervised machine learning techniques. So, I suggest changing the title to "Anomaly Detection Method for Unknown Protocols in Power Plant ICS Network with decision tree".

Response 2: While we acknowledge that decision tree was one of the supervised machine learning techniques used in our study, the aim of the research was not limited to the evaluation of this particular method. Instead, the focus was on proposing an effective anomaly detection method for unknown protocols in power plant ICS network, which was achieved by utilizing a combination of different techniques, including decision tree. Therefore, we believe that the current title accurately reflects the scope and purpose of the study, and we would like to keep it unchanged.

Comment on Response 2: Supervised machine learning techniques differ in their algorithms and performance. Obtaining good results with a decision tree does not mean that other machine learning techniques would perform well, so it is not correct to generalize the performance to all machine learning. If you still insist on using "machine learning" in the title, I suggest also using other common machine learning methods such as SVM, ANN, kNN, RF, and so on.


Point 4: Section 1. Introduction:

The Introduction section explains the need for an automatic solution to the problem in the control network of a power plant. However, some points are missing from the Introduction section:

- The motivations for using machine learning, and particularly the decision tree.

- What are the existing works that used machine learning for anomaly detection in Industrial Control Systems?

- What are the limitations of the existing works that used machine learning for anomaly detection in Industrial Control Systems?

- How can the proposed solution overcome these limitations?

Response 4: Thank you for your valuable feedback. The proposed method aims to address the problem of detecting anomalies in closed networks of private protocols in power plant control systems. We believe that machine learning, particularly the decision tree, can help overcome the limitations of traditional methods that rely on whitelisting techniques designed for open environments. In our study, we presented a model to identify normal communication patterns based on traffic generated by closed protocols.

Regarding existing works, we agree that there are studies that used machine learning for anomaly detection in industrial control systems, but they mostly focused on open industrial networks and did not take into account the closed network model of private protocols. Therefore, our study contributes to the literature by proposing a model that can efficiently detect anomalies in closed networks of private protocols. By expanding patterns through learning in a real data environment, we have demonstrated that our proposed method outperforms existing research methods.

Although we acknowledge that retraining the model can be challenging in the case of unannounced changes to a vendor's private protocol, we believe that our method of detecting pattern communication can be continuously improved, based on the hypothesis that pattern communication will still be attempted. We will further improve our method in future research.

Comment on Response 4: The authors ignored this comment and did not add any enhancement to the Introduction section.


Point 5: Section 2.1. Related Work:

In this section, it is expected that the authors discussed the existing works based on machine learning. Then, the authors need to explain what are the differences between the proposed work and the existing works.

There are many recent works that applied machine learning for anomaly detection in Industrial Control Systems:

For examples:

1) Perales Gómez, Ángel Luis, et al. "Madics: A methodology for anomaly detection in industrial control systems." Symmetry 12.10 (2020): 1583.

2) Mokhtari, Sohrab, et al. "A machine learning approach for anomaly detection in industrial control systems based on measurement data." Electronics 10.4 (2021): 407.

3) Wang, Chao, et al. "Anomaly detection for industrial control system based on autoencoder neural network." Wireless Communications and Mobile Computing 2020 (2020): 1-10.

4) Abdelaty, Maged, Roberto Doriguzzi-Corin, and Domenico Siracusa. "DAICS: A deep learning solution for anomaly detection in industrial control systems." IEEE Transactions on Emerging Topics in Computing 10.2 (2021): 1117-1129.

Response 5: We appreciate the reviewer's suggestions and input. However, it is important to note that the focus of this study is on the detection of abnormal patterns in the closed network of private protocols, which is a different perspective from the examples provided. Therefore, we presented a model that detects abnormal patterns in the closed network of private protocols by detecting pattern communication. Traditional methods for detecting anomalies in open networks have been inefficient in closed networks such as power generation networks. In this study, we presented a model that finds a pattern that can be judged as normal from traffic generated by a closed protocol. This approach is different from existing studies that detect anomalies in industrial control systems based on measurement data or using autoencoder neural networks. We believe that the proposed method can be applied more efficiently to closed networks of private protocols, which is the main focus of this study.

Comment on Response 5: In the first round, the authors ignored this comment. In the second round, the authors added Section 2.1.4, Anomaly Detection in Industrial Control Systems, which includes only 2 references.


Point 6: In Section 2.2. Proposal:

There is no explanation of how the decision tree works, nor of how it is applied for anomaly detection in Industrial Control Systems.

Response 6: Thank you for your comment. We appreciate your suggestion. In this study, we focused on detecting abnormalities by detecting pattern communication in a closed protocol environment, rather than on explaining how the decision tree works. However, we understand the importance of explaining the decision tree method and its application to anomaly detection in Industrial Control Systems. In future studies, we plan to provide a more detailed explanation of how the decision tree works and how it is applied for anomaly detection in Industrial Control Systems. Additionally, we aim to compare the detection efficiency of supervised and unsupervised learning to provide a more comprehensive understanding of the different machine learning approaches. In this study, the decision tree was chosen based on its suitability under the limited conditions of our closed network running a closed protocol.

Comment on Response 6: This is a surprising answer. I am asking about the main contribution: how to apply the decision tree to anomaly detection in Industrial Control Systems. However, the authors ignored this important part and deferred it to future work.
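For context on the kind of application the reviewer is asking about, the following is a minimal, generic sketch of training a decision-tree classifier on per-flow ICS traffic features. It is an editorial illustration, not the authors' implementation; the feature names, file name, and labeling scheme are hypothetical, and scikit-learn and pandas are assumed to be available.

```python
# Generic sketch only: decision-tree classification of ICS traffic flows.
# Feature names, CSV layout, and labels are hypothetical, not the paper's dataset.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix

# Hypothetical per-flow features extracted from captured control traffic.
df = pd.read_csv("ics_flows.csv")  # assumed columns: pkt_count, mean_len, interval_ms, label
X = df[["pkt_count", "mean_len", "interval_ms"]]
y = df["label"]                    # 1 = anomalous flow, 0 = normal flow

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

# A shallow tree keeps the learned rules small enough to inspect manually.
clf = DecisionTreeClassifier(max_depth=5, random_state=42)
clf.fit(X_train, y_train)

tn, fp, fn, tp = confusion_matrix(y_test, clf.predict(X_test)).ravel()
print(f"TP={tp}  TN={tn}  FP={fp}  FN={fn}")
```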


Point 7: In Section 2.2.4. Abnormal Sign Judgement Rate:

- In Table 2 (Anomaly detection true/false matrix), there is no need to include the performance-measure equations, since the same equations are already given in Table 3.

- In Table 3 (Evaluation criteria), the authors wrongly define the positive and negative classes for the anomaly detection field. Thus, the descriptions of the measures are not correct.

If you refer to the papers mentioned in comment 4, you will see the following definitions:

- True Positive (TP): indicates the number of anomalies properly detected by the model.

- True Negative (TN): indicates the number of non-anomalies properly classified.

- False Positive (FP): indicates the number of non-anomalies wrongly classified as anomalies.

- False Negative (FN): indicates the number of anomalies wrongly classified as non-anomalies.

Response 7: Thank you for your comments. Based on them, we have removed the equations from Table 2, and the definition error you pointed out has been corrected.

Comment on Response 7: The authors corrected Table 1. However, in Table 3 the authors still define the positive and negative classes in the anomaly detection field incorrectly.

Sensitivity: Ratio of actual positives, which are actually normal, predicted as positives.

The correct Sensitivity: Ratio of actual positives, which are actually anomalies, predicted as positives.

Specificity: Ratio of correct identification of actual anomalies as anomalies.

The correct Specificity: Ratio of correct identification of actual normal (non-anomaly) cases as normal.

Please read the relevant articles to distinguish between the positive and negative classes in the anomaly detection field.
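To make the distinction concrete, here is a small illustrative computation with anomalies treated as the positive class; the counts below are invented purely for illustration and do not come from the paper.

```python
# Illustrative counts only; anomalies are the positive class.
tp, tn, fp, fn = 90, 950, 50, 10   # hypothetical confusion-matrix counts

accuracy    = (tp + tn) / (tp + tn + fp + fn)  # overall fraction classified correctly
sensitivity = tp / (tp + fn)                   # fraction of actual anomalies detected as anomalies
specificity = tn / (tn + fp)                   # fraction of actual normal traffic identified as normal
fall_out    = fp / (fp + tn)                   # fraction of normal traffic wrongly flagged as anomalous

print(f"accuracy={accuracy:.3f}  sensitivity={sensitivity:.3f}  "
      f"specificity={specificity:.3f}  fall-out={fall_out:.3f}")
```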


Point 9: In section 3.2.3. Analysis of the Verification Results of the Proposed Model

The authors presented results only for Accuracy and Fall-out, and did not include the other performance measures mentioned in Table 3 (Evaluation criteria). Furthermore, in general, the results need more discussion.

Response 9: Thank you for pointing out this issue. We agree that the evaluation criteria should be thoroughly discussed in the paper. We will consider adding a more detailed analysis of the performance measures in Table 3 in our future work. As for the missing performance measures, we have included them in Table 14, and we believe that the presented results are still sufficient to verify the performance of our proposed method.

Comment on Response 9: The authors ignored this comment. I did not see any other measures in Table 14.


Author Response

Dear Reviewer
Thank you for your helpful review comments.
The revision may still fall somewhat short of the reviewer's expectations, but we have accepted and reflected your comments as much as possible. Thank you for your help in improving the manuscript.

Thank you.

Author Response File: Author Response.docx
