Article
Peer-Review Record

High-Precision Real-Time Forest Fire Video Detection Using One-Class Model

Forests 2022, 13(11), 1826; https://doi.org/10.3390/f13111826
by Xubing Yang *,†, Yang Wang, Xudong Liu and Yunfei Liu *,†
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 25 September 2022 / Revised: 28 October 2022 / Accepted: 1 November 2022 / Published: 2 November 2022
(This article belongs to the Section Natural Hazards and Risk Management)

Round 1

Reviewer 1 Report (Previous Reviewer 3)

The revision is sufficient. 

Author Response

Thanks for the comment!

Reviewer 2 Report (New Reviewer)

Dear authors,

 

I consider the study very interesting for mapping forest fires with the application of current technologies (hardware and software). However, I believe that such research would fit better in the journal FIRE (ISSN 2571-6255) than Forests (ISSN 1999-4907).

I suggest some changes.

 

Lines: 162-172

2.2. Method 

Doesn't the fact of using samples of only one class (pure fire) to map forest fires make the model limited to be applied only in specific situations?

Considering that the computer recognition standard will be adapted and facilitated only for the detection of a closed "ball" sample of pure fire, this disregards the different variations of plant fuels, environmental factors (wind, soil, topography, altitude, slope of the terrain, etc.) and the respective flammability stages of each type of forest fuel (dry, moderately wet and wet), among others.

 

Lines: 268-269

Can using the ground truth (GT), provided only by interactive annotation of the fire pixels obtained from fire images, be considered sufficient for validation of this method?

Why weren't real field samples (control) used to detect different intensities or variations in fire behavior patterns in nature?

For better detection of boundary regions between pure fire and no fire.

Please comment further.

 

Lines: 256-291

Would the description of these methodological details not fit better under the topic "2. Materials and Methods > 2.1.1 Fire videos"?

I suggest starting the topic "3. Results" with the excerpt "3.2 Experimental results", which actually present the results of this research.

 

OK!

In the topic "5. Conclusion" the authors clarified several questions.

 

Author Response

Please see attached word file or the following texts.

Response to the comments of Reviewer #2

I consider the study very interesting for mapping forest fires with the application of current technologies (hardware and software). However, I believe that such research would fit better in the journal FIRE (ISSN 2571-6255) than Forests (ISSN 1999-4907).

Response: Thanks for the comment.

 

I suggest some changes.

Q1): Lines: 162-172, 2.2. Method

Doesn't the fact of using samples of only one class (pure fire) to map forest fires make the model limited to be applied only in specific situations? Considering that the computer recognition standard will be adapted and facilitated only for the detection of a closed “ball” sample, of pure fire, disregarding the different variations of plant fuels, environmental factors (wind, soil, topography, altitude, slope of the terrain, etc.) and the respective flammability stages of each type of forest fuel (dry, moderately wet and wet), among others.

Response: It is a good question!

It is true that the existing model-based methods, including deep learning-based ones, are all scene-dependent, i.e., limited to the specific situations the reviewer mentions. Here, "scene-dependent" means that the model can be generalized to various forest scenes, but it must be trained on data collected from the scene under study. We understand the reviewer's concern, though it may be beyond the scope of our discussion: whether a model well trained on current data is effective or not for detecting future fires. The answer involves three aspects, described as follows:

(1) Hardware devices, e.g., the photosensitive CCD sensors of cameras (the transformation from optical to electrical to digital signals). According to the principles of digital imaging, the pixel values obtained by different cameras may differ because of different sensor sizes and pixel counts per sensor unit.

(2) Image formats, or in other words, the difference between natural images (human vision) and digital images (computer vision). According to ITU-R BT.601, a digital criterion for color video from the CCIR (International Radio Consultative Committee), different formats emphasize different color components. For example, the RGB (tricolor) format is convenient for image acquisition and storage: in a 24-bit generic RGB color model, each channel needs 8 bits to encode a pixel value, i.e., 3*8 = 24 bits. YCbCr is quite different. For a YCbCr image with 4:1:1 chroma subsampling, every pixel needs 8 bits for Y, while every group of four pixels shares one 8-bit Cb and one 8-bit Cr sample; on average, each pixel therefore needs only 8 + 8/4 + 8/4 = 12 bits, half of RGB, for its encoding. This is why YCbCr has been accepted as a criterion for image compression, such as JPEG (Joint Photographic Experts Group), MPEG (Moving Pictures Experts Group), and H.264/AVC (Advanced Video Coding). Furthermore, YCbCr describes a color by the principle of luminance and chromatic difference, where Y denotes luminance/illumination, and Cb (Cr) denotes the chrominance difference between Y and the blue (red) signal in RGB;
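To make the format difference concrete, the color-space conversion can be sketched as below. This is a minimal illustration using the full-range (JFIF-style) BT.601 coefficients; the function name and the sample pixel values are our own and are not taken from the manuscript.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range RGB -> YCbCr conversion using the ITU-R BT.601
    coefficients (JFIF variant); inputs and outputs are in 0-255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# An orange-red, fire-like pixel (illustrative values): high luminance Y
# and a Cr component well above the neutral 128.
y, cb, cr = rgb_to_ycbcr(255, 128, 0)
```

Note that a neutral gray maps to Cb = Cr = 128, so saturated flame colors stand out as large Cr deviations, which is one reason chrominance-based spaces are popular in fire-pixel work.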

(3) Data unobtainability. In vision-based forest fire detection research, to the best of our knowledge, the greatest difficulty comes from the limited amount of real data. For a user-specified forest scene, fire data is limited or even unobtainable, because igniting fires in real forest scenes is forbidden, not to mention collecting fire data under various conditions (windy or calm, day or night, dry or wet forest combustibles, etc.). Intuitively, a man-made fire scene, e.g., burning something in a safe and empty place, cannot stand in for a real and complex forest scene. This difficulty also extends to the unavailability of matching non-fire data, e.g., in public fire image or video collections. That is, recognizing forest fires is more difficult than recognizing indoor fires.

Finally, a few words about the model's generalization. In earlier works, e.g., color-based methods, people did attempt to refine universal rules or thresholds suitable for all fire scenes, but the results were not satisfactory: the refined rules may be efficient for one scene but inefficient for another, because of scene variation, as the reviewer mentions. For a model-based method, by contrast, since the training data is collected from the scene under study, the model should be efficient for that scene. Following this view, even if the scene changes, the color of fire still appears light yellow, orange-yellow, or reddish, i.e., subject to the same data distribution. This insensitivity to non-fire environmental factors is therefore an advantage of the data distribution-based one-class model. In contrast, two-class or multi-class methods may be more easily affected by environmental factors, because their decision functions are determined by both fire and non-fire samples. Of course, if the samples selected for model training do not match the fire-pixel distribution, fires may be missed, but this holds for all model-based methods, not only the one-class model. Clearly, this is a matter of sample selection, not of the one-class model itself.

Our manuscript is revised as below.

Lines 169-173: "For our concerned one-class model, the core is to seek an ideal closed 'ball' such that: (1) all fire samples are contained inside; (2) the ball is compact enough to exclude non-fire samples as far as possible. Since no non-fire supervision is used, model training depends only on the one-class fire samples. Therefore, the selected fire samples should match the distribution of fire pixels."
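The "closed ball" idea above can be illustrated with a toy sketch: estimate a center from fire-only samples and choose a radius that covers a chosen fraction of them; a test pixel is predicted as fire iff it falls inside the ball. This is only an illustration under our own assumptions (mean center, quantile radius, made-up chrominance values), not the model actually used in the manuscript.

```python
import math

def fit_ball(fire_samples, coverage=0.95):
    """Fit a naive 'closed ball' from fire-only samples: the center is
    the per-coordinate mean, and the radius is the distance that covers
    the requested fraction of the training samples (a crude way to keep
    the ball compact)."""
    n, dim = len(fire_samples), len(fire_samples[0])
    center = [sum(s[d] for s in fire_samples) / n for d in range(dim)]
    dists = sorted(math.dist(s, center) for s in fire_samples)
    radius = dists[min(n - 1, int(coverage * (n - 1)))]
    return center, radius

def is_fire(pixel, center, radius):
    """A pixel is predicted as fire iff it lies inside the ball."""
    return math.dist(pixel, center) <= radius

# Toy fire-pixel samples in (Cb, Cr) space -- purely illustrative values.
train = [(110, 190), (115, 200), (105, 195), (112, 188), (108, 198)]
center, radius = fit_ball(train, coverage=1.0)
```

Since only fire samples enter `fit_ball`, no non-fire supervision is needed, which mirrors the one-class setting described in the revised text.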

 

Q2) Lines: 268-269

Can using the ground truth (GT), provided only by interactive annotation of the fire pixels obtained from fire images, be considered sufficient for validation of this method? Why weren't real field samples (control) used to detect different intensities or variations in fire behavior patterns in nature? For better detection of boundary regions between pure fire and no fire.

Response: This is the common method for performance assessment: comparing the difference between the GT values and the predictions of the models used, e.g., as measured by quantitative indicators.

Detecting fire behavior of different intensities or variations with real field samples would undoubtedly be a simple and intuitive method for assessing the model's performance. However, due to various limitations and injunctions, the cost of data acquisition is too high, especially for fire samples, as noted above under "Data unobtainability". In fact, our research group has established multiple observation points in our forest fields, but has never captured fire data in nature. Probably for safety reasons, applications to use artificial fire are often denied. This is why public fire databases are so popular.
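The GT-based quantitative assessment mentioned above can be sketched generically as a pixel-wise comparison of binary masks. The indicators below (precision, recall, F1) are standard choices; we do not claim they are the exact indicator set used in the paper.

```python
def pixel_metrics(gt, pred):
    """Compare a binary ground-truth mask with a prediction mask
    (flat sequences of 0/1) and return (precision, recall, F1)."""
    tp = sum(1 for g, p in zip(gt, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gt, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gt, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: GT marks the first two pixels as fire, the detector
# hits one of them and raises one false alarm.
p, r, f = pixel_metrics([1, 1, 0, 0], [1, 0, 1, 0])  # -> 0.5, 0.5, 0.5
```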

 

Q3) Lines: 256-291

The description of these methodological details would not be better in the topic: “2. Materials and Methods > 2.1.1 Fire vídeos”? I suggest starting the topic "3. Results" with the excerpt "3.2 Experimental results", which actually present the results of this research. In the topic "5. Conclusion" the authors clarified several questions.

Response: In our revised manuscript, the heading "Fire videos" is revised to "2.1.1 Fire video data". As for the topic "3. Results", if we used "Experimental results" as the heading, it would not cover the content of the whole section, e.g., the experimental setup. Furthermore, this version also follows the style of references published in the journal Forests, e.g., reference [22] (Zhang, L.; Wang, M.; Fu, Y.; Ding, Y. A Forest Fire Recognition Method Using UAV Images Based on Transfer Learning. Forests 2022, 13, 975. https://doi.org/10.3390/f13070975).

Finally, all the authors sincerely thank Reviewer #2 for his/her carefulness and rigorous academic attitude.

Author Response File: Author Response.docx

Reviewer 3 Report (New Reviewer)

Fire detection with machine learning methods is a fascinating topic. Real data is used in the work to validate the performance of the proposed method. However, the authors may want to proofread the manuscript to make the paper more readable.

Author Response

Thanks for the comment!

In our new version, we have gone through the manuscript again and tried to make it more readable.

This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.


Round 1

Reviewer 1 Report

 

1. Please provide the source of the figure 1.a

2. Line 93 (b). remove dot.

3. Please provide the source of the figure 4.a and 4.b

 

Reviewer 2 Report

The manuscript entitled "High-Precision Real-Time Forest Fire Video Detection Using One-Class Model" by Yang et al. belongs in the category of scientific approaches in which machine learning and computer vision are used to detect forest fires in video (or image) corpora by the characteristic color of fire pixels in RGB space. This methodology, as the authors point out, is justified in situations where large areas need to be surveilled in an automated fashion and event-specific training data is not available. In principle, research in this area is worthwhile and progress is desirable. 

 

Unfortunately, I cannot recommend this manuscript for publication in Forests as things stand, for three reasons:

 

  1. The approach practiced by the authors is very atypical for the readership of Forests. Indeed, the manuscript has the outward appearance of an applied mathematics paper, or maybe belonging in computer vision. Key references are from expert systems journals. I believe that to the extent that the research is publishable, it should be in a journal specializing in these.

  2. Related to (1), the orientation of the scientific questions addressed in the manuscript is of unclear use to a reader interested in forest fire detection from the perspective of actual fire, forest or land management. The mathematical sub-structure takes up a major part of the text, with a degree of technical detail that appears excessive. More seriously, the problems affirmed in the introduction are not sufficiently elucidated by practical examples to convincingly show that they matter for the application of pixel-based fire detection via color. The authors affirm that the independent and identically distributed (i.i.d.) property is violated in their dataset, at least in a one-class model for the "non-fire" class. This may well be the case (though I see no conclusive argument that demonstrates it), but it is not demonstrated that this actually causes machine learning approaches to the classification to break down. On a related note, the method of using the first frame of a fire video for training and all the subsequent frames, sometimes thousands of them, to fit the model seems quite unbalanced to me. Also, the figure presenting the key results (Fig. 5) is entirely unreadable and unclear. The description in the text does not clarify matters.

  3. This last point relates to my third concern: the overall quality of the presentation is extremely vague and idiosyncratic. Goals and evaluation criteria are stated nowhere. The Introduction presents data; the Discussion and Conclusions section does not summarize the results or draw conclusions, while subsection 4.1 of Results contains material that belongs in a discussion. The selection of the three (or four?) methods being compared is not clearly motivated, and the conflation of pixels with objects is somewhat troubling, especially as the authors admit that the "non-fire" class has no specific physical category attached to it, let alone objects. There appears to be no attempt to provide code.

 

Last, while clearly effort has been made to write in good English, unfortunately the manuscript would need extensive editing by an English speaker who understands at least the overall thrust of the technical presentation. Before the manuscript can even be evaluated in detail, the presentation needs substantially more clarity. This said, even if (2) and (3) were addressed, I still do not believe that Forests is a suitable journal for this sort of mathematical and computational presentation.  

 

Reviewer 3 Report

This paper adopts a one-class model to achieve real-time forest fire detection, which can give a reasonable detection result. However, the description of the methodology is not clear enough to show the advantages of the proposed method. A major revision is needed before it can be published in forests. The following comments should be addressed:

1.      The one-class model should be highlighted, as it could be the main innovation point of this paper. What is the difference between the one-class model and binary- and multi-class classification models? What are the advantages of using the one-class model?

2.      The paper devotes a lot of content to introducing unnecessary algorithmic principles, leaving the specific implementation process insufficiently detailed, e.g., the programming language, how the hyper-parameters are tuned, and why this method was chosen.

3.      As shown in Fig.4, the one-class model requires an image containing distinct flame pixels for training, does this mean that the method cannot be used for early forest fire detection? 

4.      Line 74, the authors say that two-class or multi-class classification violates the principle of independent and identical distribution. Please provide more evidence to prove this. Why can good detection results usually be obtained using binary and multi-class classification?

5.      Fig. 5, the performance of the one-class model is not far from that of the SVM model, and its implementation principle is also similar. Therefore, what is the difference between this model and SVM, especially OneClassSVM?

6.      Lines 398-399, there are already many publicly available high-precision fire annotation databases, such as [1,2].

7.      Lines 402-403, even with the one-class model, there is still no guarantee that the data will be balanced. 

8.      Other issues: 

-        Lines 321-323, the GT should only contain the flame pixels.

-        Fig. 5, why use the logarithm of base e for the vertical coordinate? Even if logarithmic coordinates are used, the axis values should be the original values.

 References

[1]      M.T. Cazzolato, L.P.S. Avalhais, D.Y.T. Chino, J.S. Ramos, J.A. de Souza, J.F. Rodrigues-Jr., A.J.M. Traina. FiSmo: A Compilation of Datasets from Emergency Situations for Fire and Smoke Analysis. Brazilian Symposium on Databases, 2017. https://goo.gl/uW7LxW.

[2]      T. Toulouse, L. Rossi, A. Campana, T. Celik, M.A. Akhloufi. Computer vision for wildfire research: An evolving image dataset for processing and analysis. Fire Safety Journal, 2017, 92: 188–194. https://doi.org/10.1016/j.firesaf.2017.06.012.

 

 
