Peer-Review Record

Probabilistic Forecasting of Lightning Strikes over the Continental USA and Alaska: Model Development and Verification

Fire 2024, 7(4), 111; https://doi.org/10.3390/fire7040111

by Ned Nikolov¹,*, Phillip Bothwell² and John Snook²

Reviewer 1: Anonymous

Reviewer 2: Brian Vant-Hull

Submission received: 16 January 2024 / Revised: 22 March 2024 / Accepted: 26 March 2024 / Published: 28 March 2024

(This article belongs to the Special Issue Probabilistic Risk Assessments in Fire Protection Engineering)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Predicting forest fires caused by lightning is a meaningful work, but the paper has the following issues.

1) The climate data from 29 years does not have statistical significance, usually it takes 30 years to have statistical significance.

2) Lightning induced forest fires can also occur in the spring and autumn seasons, and the model must take these two seasons into account.

3) Because horizontal temperature distribution by pressure levels is related basic synoptic condition, the model is lack of true directional lightning physical processes.

4) The latitude span of the United States is large, and using a single model to predict lightning is not suitable.

Comments on the Quality of English Language

The author should make the article more concise.

Author Response

Thank you for reviewing our manuscript!

Here are our responses (in red) to your comments (in black):

The climate data from 29 years does not have statistical significance, usually it takes 30 years to have statistical significance

The lack of statistical significance would be a valid concern if we had a dataset of fewer than 30 points. However, the 29-year-long datasets used to describe meteorology and lightning over the USA contain literally millions of data points. This is clearly described in the manuscript. Hence, the statistical significance of our analysis is not an issue.

Lightning induced forest fires can also occur in the spring and autumn seasons, and the model must take these two seasons into account.

Yes, that's true, and as the manuscript describes, we have developed equations for autumn and winter months as well.

Because horizontal temperature distribution by pressure levels is related basic synoptic condition, the model is lack of true directional lightning physical processes.

We are not sure what this means or implies (mostly due to a poorly formulated sentence). However, as shown in the Results section of our manuscript, the model (consisting of 1,000 logistic equations) displays excellent skill in forecasting the probability of one or more CG strikes over ConUS and AK as assessed against independent data. This implies that the model captures rather well the physical processes governing lightning activity on a continental scale.

The latitude span of the United States is large, and using a single model to predict lightning is not suitable.

We are NOT using a single model for the entire USA! The manuscript clearly explains that we have parametrized a large set of logistic equations to describe the dependence of CG lightning strikes on meteorological conditions in 11 regions (including AK) and for every 3-h interval of an average diurnal cycle in every month of the year. There are 960 individually parametrized equations for ConUS and 40 equations for Alaska... We advise the reviewer to carefully read the revised version of the manuscript.
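The equation count quoted in this response can be checked with a short tally. This is only a sketch: the figure of 10 ConUS regions comes from the authors' later reply to Reviewer 2, the eight 3-h intervals follow from a 24-h diurnal cycle, and the number of months covered for Alaska is not stated anywhere in this record, so the value of 5 used below is merely an inference from 40 / 8 and should be treated as an assumption.

```python
# Tally of individually parametrized logistic equations (illustrative only).
HOURS_PER_DAY = 24
INTERVAL_HOURS = 3
intervals_per_day = HOURS_PER_DAY // INTERVAL_HOURS  # eight 3-h slots per day

conus_regions = 10    # 11 regions in total, one of which is Alaska
months_per_year = 12

conus_equations = conus_regions * months_per_year * intervals_per_day

# ASSUMPTION: the record does not say how many months the Alaska equations
# cover; 5 is simply the value that reproduces the quoted 40 equations.
alaska_months = 5
alaska_equations = alaska_months * intervals_per_day

total_equations = conus_equations + alaska_equations
print(conus_equations, alaska_equations, total_equations)
```

The totals agree with the 960, 40, and 1,000 equations cited by the authors.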

Reviewer 2 Report

Comments and Suggestions for Authors

Introduction

The introduction should serve as an introduction to the state of the field, yet not a single paper is quoted here. It took me less than 2 minutes to find a very similar study done in Australia (Bryson C. Bates, Andrew J. Dowdy and Richard E. Chandler, "Lightning Prediction for Australia Using Multivariate Analyses of Large-Scale Atmospheric Variables", Journal of Applied Meteorology, 1 March 2018, https://doi.org/10.1175/JAMC-D-17-0214.1) and there are many more. Some previous papers are quoted in the Materials and Methods section, but a deeper literature review is expected beyond the direct sources that are cited. Were the authors asked to create a lightning flash density algorithm because existing ones were lacking?

Materials and Methods

- I'm not sure I'd refer to GFS as being independent from NARR, as both use the same underlying observations. The difference is that NARR as a reanalysis is best for research work covering many years, as the model has been set not to change for just this purpose. GFS is a global forecast (spectral rather than discrete element). If you want to convince people why GFS is used, a better reason is that it can be used to extend the model to a global forecast, while NARR is set in North America and can only be used to model historical data.

- The company producing the lightning data is Vaisala, not VISALA (and not a capitalized acronym).

- The link to the Alaska lightning data is a website that does not have information about the instrumentation quoted in this paper. Similarly the link for the CONUS lightning data is to the Desert Research Institute and does not link directly to information about the lightning dataset. The lightning dataset for this paper is essentially undocumented, yet filled with claims such as the detection efficiency increased by 60% after replacing instrumentation. If this is not documented at least provide a personal communication.

- it would be better if there were direct links to documentation about NARR and GFS, though these are generally well known data sets.

- Why were fuels included in a lightning prediction model for dividing into regions. It will be useful for the future application to the fire ignition, but this should be stated. Independent of fires it makes no sense.

Figure 1: what do the colors mean? This should be in the caption. If land cover it doesn't seem very strongly tied to the regions.

Results

- For those not familiar with PCA analysis, it should be noted in the text that every component is a linear combination of the base variables. Figure 2 does a nice job listing the main contributors to each component, but you may want to include an appendix that shows the exact mix of variables for each component.

- For component 1, by horizontal temperature and height fields are these simply the values for each 20 km grid point? I would think local maxima and minima might be more diagnostic; perhaps the 2D gradient.

- There is mention that output maps were delivered to users via website, but was there any feedback or count of views? perhaps too soon for this, but nice to mention if available.

Figure 3: What are the histograms to the lower right? They are not mentioned, the x axis is not labeled, and if not important should be removed. What does each point represent? I would think if it's each occurrence of lightning there would be a lot more points, so there's some averaging/binning going on that is not fully explained. Does model 1 represent 12 UTC and we have 7 others not shown?

Figure 4: There's several questions.

- The left has 4 dots, the right has 3 dots. What does each dot represent? It can only be huge amounts of data binned together, but how is it binned?

- are you really fitting this curve to just a few points? If so, it's obvious overfitting. If not, the points need to be shown. I'd prefer ALL the points be indicated by shading or something else so we have a feeling for what kind of noise these curves are being fit to.

- I'm uncomfortable with the idea of both true and false positives being high. This could use some discussion.

Figure 5: see comments for Figure 3.

Figure 8: From the left side it's not at all obvious to me that the forecast accuracy decreases with the length of the forecast: in fact longer forecast times appear to get closer to the 1:1 line than 0 forecast hours. What are these mysterious points 0.4, 0.3 etc? In the caption "legent" -> "legend"

Figure 9: the color scheme is very different from what people have come to expect, with red being high and blue being low. You may want to change it.

Discussion

- if this is the first and only gridded lightning forecast map for North America, why wasn't this mentioned in the abstract, or introduction, where people are expecting to see such information? By burying it in the discussion it won't be seen by people shuffling through large numbers of papers to find information like this. The claim is also not backed up by a literature search, where you will find plenty of such models, so perhaps what you mean is the "only *operational* gridded lightning forecast algorithm".

- This looks more like a conclusion or summary to me than a discussion. Consider changing the section heading.

Author Response

Thank you for providing a thoughtful review of our manuscript.

Here are our responses (in red) to your comments (in black italic):

We agree with this point and have expanded the Introduction to include a brief historical overview of the modeling of lightning. The most accurate lightning forecast models are those derived for specific regions and seasons using multivariate statistical methods or machine-learning algorithms. Theoretical algorithms (based on "first principles") do not provide the accuracy needed by operational applications. As far as we know, our lightning model consisting of 1,000 logistic equations is the most comprehensive and accurate one for ConUS and Alaska.
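For readers unfamiliar with the model form, each logistic equation of the kind mentioned in this response maps a set of predictors to a probability between 0 and 1. The sketch below shows the generic logistic form only; the function name, the coefficients, and the predictor values are hypothetical placeholders and are not taken from the manuscript.

```python
import math

def strike_probability(intercept, coefficients, predictors):
    """Generic logistic equation: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    if len(coefficients) != len(predictors):
        raise ValueError("need one coefficient per predictor")
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients and predictor scores, for illustration only.
example_coefficients = [0.8, -0.3, 0.5]
example_scores = [1.2, 0.4, -0.7]
p = strike_probability(-2.0, example_coefficients, example_scores)
print(round(p, 3))
```

In a scheme like the one the authors describe, a separate set of coefficients would be selected for each region, month, and 3-h interval before evaluating the equation.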

As described in the Methods section of the manuscript, our lightning prediction model was developed (derived) using NARR data going back to 1993, not GFS initialization fields. We only used GFS fields as a source of independent meteorological data (drivers) in the model verification procedure.

- The company producing the lightning data is Vaisala, not VISALA (and not a capitalized acronym).

Yes, this typo has been corrected in the revised version.

The link to the Alaska lightning data is a website that does not have information about the instrumentation quoted in this paper. Similarly the link for the CONUS lightning data is to the Desert Research Institute and does not link directly to information about the lightning dataset. The lightning dataset for this paper is essentially undocumented, yet filled with claims such as the detection efficiency increased by 60% after replacing instrumentation. If this is not documented at least provide a personal communication.

This has been addressed in the revised manuscript by providing additional clarifications and references.

it would be better if there were direct links to documentation about NARR and GFS, though these are generally well known data sets.

Direct links to NARR and GFS are provided in the References list and cited in the text.

Why were fuels included in a lightning prediction model for dividing into regions. It will be useful for the future application to the fire ignition, but this should be stated. Independent of fires it makes no sense.

The land-cover map in Fig. 1 was only used as a background to the 10 regions shown on the map. Fuel data were not a part of the statistical model derivation procedure. The revised manuscript contains a new version of Fig. 1 without the land-cover background map.

For those not familiar with PCA analysis, it should be noted in the text that every component is a linear combination of the base variables. Figure 2 does a nice job listing the main contributors to each component, but you may want to include an appendix that shows the exact mix of variables for each component.

For component 1, by horizontal temperature and height fields are these simply the values for each 20 km grid point? I would think local maxima and minima might be more diagnostic; perhaps the 2D gradient.

A clarification text has been added to the PCA description in the revised manuscript. The text on Fig. 2 has also been edited to make it clearer what the strongest predictors were at every 20-km grid point.

There is mention that output maps were delivered to users via website, but was there any feedback or count of views? perhaps too soon for this, but nice to mention if available.

RMC has been delivering fire-weather products to the fire-management community in the USA for 23 years now. We have over 2500 registered users on our website. The lightning-probability forecasts are now a part of a wildfire ignition prediction model, which is being evaluated by the user community (primarily the National Predictive Services). We are expecting a report from them this year.

Figure 4: There's several questions.

- The left has 4 dots, the right has 3 dots. What does each dot represent? It can only be huge amounts of data binned together, but how is it binned?

- Are you really fitting this curve to just a few points? If so, it's obvious overfitting. If not, the points need to be shown. I'd prefer ALL the points be indicated by shading or something else so we have a feeling for what kind of noise these curves are being fit to.

I'm uncomfortable with the idea of both true and false positives being high. This could use some discussion.

The dots visible on the ROC curves are now explained in the revised manuscript. These are not fitting data points! The ROC curves were generated by the R software based on thousands of data points. Also, please consult the references provided for explaining the meaning of the ROC curves.
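To illustrate why an ROC curve is not a fit to a handful of points, the sketch below builds one directly from forecast/observation pairs by sweeping a probability threshold and then integrates it with the trapezoidal rule. It is written in plain Python rather than R, the function names are illustrative, and the forecasts and observations are made-up values, not data from the manuscript.

```python
def roc_points(probabilities, observed):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    positives = sum(observed)
    negatives = len(observed) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both events and non-events")
    points = [(0.0, 0.0)]
    # Sweep thresholds from the highest forecast probability downwards.
    for threshold in sorted(set(probabilities), reverse=True):
        hits = sum(1 for p, o in zip(probabilities, observed)
                   if p >= threshold and o == 1)
        false_alarms = sum(1 for p, o in zip(probabilities, observed)
                           if p >= threshold and o == 0)
        points.append((false_alarms / negatives, hits / positives))
    return points

def area_under_curve(points):
    """Trapezoidal integration of the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Made-up forecast probabilities and strike (1) / no-strike (0) observations.
forecast = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
strikes = [1, 1, 0, 1, 0, 1, 0, 0]
curve = roc_points(forecast, strikes)
auc = area_under_curve(curve)
print(round(auc, 2))
```

Every distinct threshold contributes one point to the curve, and at low thresholds both the true and false positive rates approach 1, which is why the two are high together at the upper right of any ROC curve. Labelled dots on a plotted curve commonly mark selected thresholds, although the manuscript's own explanation should be taken as authoritative.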

The dots and histograms on the Reliability Diagrams are now explained in the revised manuscript.
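A reliability diagram of the kind discussed here is normally built by binning the forecast probabilities and comparing each bin's mean forecast with the observed event frequency, while the accompanying histogram shows how many forecasts fall in each bin. The following sketch assumes equal-width bins and uses invented data; the bin count and function name are illustrative choices, not the authors' settings.

```python
def reliability_table(probabilities, observed, n_bins=5):
    """Return one (mean_forecast, observed_frequency, count) row per non-empty bin."""
    sums = [0.0] * n_bins
    events = [0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(probabilities, observed):
        # Clamp the index so that p == 1.0 falls in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        sums[index] += p
        events[index] += o
        counts[index] += 1
    rows = []
    for s, e, c in zip(sums, events, counts):
        if c > 0:
            rows.append((s / c, e / c, c))
    return rows

# Invented forecast probabilities and strike (1) / no-strike (0) observations.
forecast = [0.05, 0.10, 0.15, 0.30, 0.35, 0.50, 0.55, 0.85, 0.90, 1.00]
strikes = [0, 0, 0, 0, 1, 1, 0, 1, 1, 1]
for mean_forecast, frequency, count in reliability_table(forecast, strikes):
    print(f"{mean_forecast:.2f} {frequency:.2f} {count}")
```

Each row corresponds to one dot on the diagram (mean forecast against observed frequency), and the counts are what a sharpness histogram would display; a perfectly reliable forecast would place every dot on the 1:1 line.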

The revised manuscript provides additional discussion regarding Fig. 8 that addresses this comment. The typo in "legent" has been fixed as well.

Figure 9: the color scheme is very different from what people have come to expect, with red being high and blue being low. You may want to change it.

Since we use a black background to better show the observations in white, the contour colors of the forecast probabilities were picked so that brighter colors correspond to higher probabilities. Colors are explained in the captions of Figures 9 and 10.

Discussion

if this is the first and only gridded lightning forecast map for North America, why wasn't this mentioned in the abstract, or introduction, where people are expecting to see such information? By burying it in the discussion it won't be seen by people shuffling through large numbers of papers to find information like this. The claim is also not backed up by a literature search, where you will find plenty of such models, so perhaps what you mean is the "only *operational* gridded lightning forecast algorithm".

This looks more like a conclusion or summary to me than a discussion. Consider changing the section heading.

We agree! This comment is addressed in the revised manuscript. The title of the Discussion section was changed to Conclusion as well.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Due to some of the 13 lightning predictors being repetitive and highly correlated, and lacking important cloud microphysical factors, further optimization and refinement are needed.

Comments on the Quality of English Language

Due to some of the 13 lightning predictors being repetitive and highly correlated, and lacking important cloud microphysical factors, further optimization and refinement are needed.

Author Response

Thank you for continuing to review our manuscript.

Below is our response (in red) to your comment (in italic black):

Due to some of the 13 lightning predictors being repetitive and highly correlated, and lacking important cloud microphysical factors, further optimization and refinement are needed.

The 13 lightning predictors (Principal Components estimated by PCA) are not repetitive, as clearly shown in Fig. 2. The reviewer may have been misled by the multiple appearances of the term Wind in Fig. 2, but note that these are different variables describing winds.
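As general background to this exchange, PCA produces component scores that are mutually uncorrelated over the data used to derive them, even when the underlying variables are strongly correlated. The sketch below illustrates this with the closed-form rotation for two variables in plain Python; the two input series are invented and merely stand in for a pair of correlated meteorological variables. It says nothing about how the manuscript's components behave on independent data.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

def principal_components_2d(a, b):
    """Rotate two centered variables onto the eigenvectors of their covariance matrix."""
    ma, mb = mean(a), mean(b)
    # Rotation angle that diagonalizes the 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * covariance(a, b),
                             covariance(a, a) - covariance(b, b))
    c, s = math.cos(theta), math.sin(theta)
    # Each component is a linear combination of the centered inputs.
    pc1 = [c * (x - ma) + s * (y - mb) for x, y in zip(a, b)]
    pc2 = [-s * (x - ma) + c * (y - mb) for x, y in zip(a, b)]
    return pc1, pc2

# Invented, strongly correlated inputs (e.g. two wind-related variables).
wind_a = [2.0, 3.5, 5.1, 6.8, 8.2, 9.9, 12.1, 13.0]
wind_b = [1.1, 2.0, 2.4, 3.9, 4.1, 5.2, 6.3, 6.4]
pc1, pc2 = principal_components_2d(wind_a, wind_b)
print(round(correlation(wind_a, wind_b), 3),
      round(abs(correlation(pc1, pc2)), 6))
```

The inputs are highly correlated while the two component scores are not, which is the sense in which principal components avoid redundancy even if the same base variable loads on several of them.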

There is no unified physical theory that can accurately predict cloud-to-ground lightning as a function of cloud microphysical properties. This is why the best lightning prediction models are derived using multivariate statistics rather than cloud microphysics. Our study was aimed at accuracy of lightning forecasts, not testing of theoretical lightning models. The effect of cloud microphysics is captured implicitly in our model by the 13 principal components depicted in Fig. 2.

Reviewer 2 Report

Comments and Suggestions for Authors

General notes: there are a large number of formatting problems here which I blame on the unreasonable time limits imposed by MDPI journals. I would suggest a version be produced with no redlining so the figures can be better placed relative to the text. Though the final formatting depends on the layout department it is useful to them to see the author's intentions.

My comments below that are NOT related to formatting are suggestions that are not critical for publication.

Detailed Notes to Authors:

Thank you for adding the research background. So it seems like what makes your approach new is a statistical approach using PCA analysis rather than the physical approach currently used. I suppose Model Output Statistics (MOS) has not been employed for lightning, though it is employed for precipitation. If employed it would be a similar technique to your own, so it's worth mentioning.

Line 93: why was the 3 hour resolution deleted?

line 180: I think it would be nice to mention that GFS is a global model, so if it seems to work your model can be applied globally.

Why are there two identical versions of figure 2?

I hope figures 4, 5 and 6 will be better placed among the text in the final version.

The split caption for figure 8 needs to be fixed.

Line 307: don't you think 8 significant figures is a bit extreme? this indicates ridiculous accuracy for a tool meant to measure errors in the first place.

The placement of Figure 10 needs to be fixed, as it has halfway disappeared.

Author Response

Thank you for continuing to review our manuscript.

Below are our responses (in red) to your comments (in italic black):

My comments below that are NOT related to formatting are suggestions that are not critical for publication.

None of the formatting problems you report are present in the revised manuscript we uploaded to MDPI's web portal. We uploaded an MS .docx version and a PDF version of our paper. We also found no formatting issues in the MS Word version of the manuscript that came to us with the 2nd round of reviews. So, I'm quite puzzled as to why you've encountered these formatting problems?!

Detailed Notes to Authors:

MOS is part of the software of the GFS weather forecast model providing the 7-day forecast fields that drive our lightning prediction model. Since we do not run GFS in-house (it's run by NOAA NCEP), we have no access to its MOS module. We believe that the spatial statistics provided by the R software, which we used, are adequate for our purposes.

Line 93: why was the 3 hour resolution deleted?

There is no reference to a 3-hour resolution in or around line 93 in our version of the manuscript. We have not deleted any reference to the 3-h resolution we employ in our model in the revised manuscript.

line 180: I think it would be nice to mention that GFS is a global model, so if it seems to work your model can be applied globally.

The full name of GFS (Global Forecast System) is already stated on p. 3 (line 104) in the revised manuscript. Also, GFS is a well-known GLOBAL forecast model.

Our lightning prediction model (consisting of 1,000 logistic equations) is parametrized using data only from ConUS and Alaska. Thus, it's not appropriate to apply the model globally. However, our methodology for deriving the equations can be applied globally, if long-term global lightning- and weather parameterization datasets are available.

Why are there two identical versions of figure 2?

I hope figures 4, 5 and 6 will be better placed among the text in the final version.

The split caption for figure 8 needs to be fixed.

None of these formatting issues are present in the revised manuscript we submitted to MDPI. Again, it appears that the publisher created these misalignments during the internal processing of our article.

Line 307: don't you think 8 significant figures is a bit extreme? this indicates ridiculous accuracy for a tool meant to measure errors in the first place.

Line 307 in our version of the manuscript does not contain any numbers. I guess you are referring to this sentence on p. 9 of the revised manuscript:

"The AUC values decrease from 0.9328158 at forecast hours 0–3 to 0.8697127 at forecast hours 168–171"

We simply listed the AUC values of the ROC curves as provided by the R software. Note that these AUC values are not error estimates.

The placement of Figure 10 needs to be fixed, as it has halfway disappeared.

Again, no problem with the placement of Fig. 10 in the manuscript version we uploaded to MDPI.

Round 3

Reviewer 1 Report

Comments and Suggestions for Authors

In fact, in Figure 2, the lightning predictor 2 is highly correlated with factors 1, 3, 5, 8, and 11, and is not independent of each other. The occurrence and development of lightning are highly related to the microphysical processes of clouds (especially the cold cloud processes) and the author overlooked it, so I do not agree with the author's response.

Comments on the Quality of English Language

Quality of English language can be improved further.

Author Response

Please find attached the cover letter

Author Response File: Author Response.pdf
