Article
Peer-Review Record

FTFNet: Multispectral Image Segmentation

J. Low Power Electron. Appl. 2023, 13(3), 42; https://doi.org/10.3390/jlpea13030042
by Justin Edwards and Mohamed El-Sharkawy *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 23 April 2023 / Revised: 8 June 2023 / Accepted: 20 June 2023 / Published: 30 June 2023

Round 1

Reviewer 1 Report

This manuscript is a technical walk-through of the subtle differences in the processing steps of a CNN encoder. The task the authors chose is scene segmentation on a combined LWIR and RGB imagery data set.

I am more of an expert on the imagery side of this than on the CNN processing undertaken. As such, I am taking the authors' word that the introduction of pyramid pooling and a symmetrical encoding step leads to the improved scene segmentation results shown.

I would, however, have liked the authors to circle back to the interplay between the LWIR and RGB portions of the data set. Does the change in methodology from MFNet to FTFNet improve segmentation because the data is multispectral? Would it work as well on purely RGB or purely LWIR sets, or is it something about the multispectral nature of this data that gives FTFNet a segmentation advantage?

Also, how significantly does a 2.5x increase in the number of parameters affect processing time? Given the hardware used, perhaps include a column in Table 6 that lists the processing time for each method.

I would also have liked to see some comparison imagery examples between the methods.  Select one or two example scenes with RGB and LWIR imagery and then show the resulting scene segmentation results in order to give the reader more of a visual interpretation of the level of improvement.  

On the whole, it looks like the authors have realized a ~10% improvement over legacy CNN methods for this problem and that is certainly worthy of presenting.  The authors note that this improvement is due to a number of factors (better loss functions, data augmentation functions, etc.). Is it possible to break out the various factors?  Which of your improvements over the baseline MFNet was the main contributor to the observed improvement?

Overall, the manuscript was well written.

Author Response

Please see attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This work focuses on improving semantic segmentation by combining long-wave infrared (LWIR) imagery with visual-spectrum imagery. The proposed approach, the Fast Thermal Fusion Network (FTFNet), outperforms the baseline architecture (MFNet) in terms of accuracy while maintaining a low footprint. FTFNet aims to enhance real-time autonomous systems by addressing the limitations of using visual imagery alone, making it suitable for various fields such as medical imagery, land demarcation, and autonomous vehicles. Please see my comments below:

Major Comments

1)      The literature review reported in this manuscript is inadequate, as it does not mention the extensive body of work done on the MFNet and FFFnet approaches. The authors should include references to these key works in the Introduction section to provide a comprehensive overview of the existing research in the field.

2)      To establish the novelty of the research and the authors' contributions, a thorough analysis of previous work on FTFNet and MFNet is necessary. This analysis should highlight the gaps in the literature that the current research aims to address. It can be included in a separate section or integrated into the Introduction.

3)      It is essential for the authors to compare their results with similar schemes proposed in previous studies. This comparative analysis is missing in the manuscript and should be included to evaluate the performance of FTFNet in relation to other existing approaches.

4)      The background section of the manuscript requires further elaboration. The authors should provide additional information and context to better establish the background of the research and provide a solid foundation for the study.

Minor Comments:

1)      The manuscript requires improvement in the English language usage. It is recommended to carefully review and revise the entire manuscript to enhance the clarity, grammar, and overall language quality.

2)      The section titled "Discussion" should be replaced with "Conclusion" to accurately reflect the content and purpose of that section.

3)      The resolution of Figures 5, 6, 7, and 8 should be enhanced. It is advisable to ensure that the figures are of sufficient quality and clarity to effectively convey the information to the readers. This may involve increasing the resolution or improving the visual presentation of the figures.

Author Response

Please see attachment.

Author Response File: Author Response.pdf
