Article
Peer-Review Record

Stereo SLAM in Dynamic Environments Using Semantic Segmentation

Electronics 2023, 12(14), 3112; https://doi.org/10.3390/electronics12143112
by Yongbao Ai, Qianchong Sun, Zhipeng Xi, Na Li, Jianmeng Dong and Xiang Wang *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 27 May 2023 / Revised: 24 June 2023 / Accepted: 10 July 2023 / Published: 18 July 2023
(This article belongs to the Special Issue Computer Vision for Modern Vehicles)

Round 1

Reviewer 1 Report

This paper presents a stereo visual SLAM system that removes dynamic objects using deep learning-based semantic segmentation. Moving object detection (MOD) is important for visual SLAM because moving objects degrade its performance in large-scale outdoor environments, so the paper addresses a topic that is relevant to the SLAM field. However, to increase the academic value of the paper, I recommend the following revisions:

1. Since MOD-based visual SLAM has been studied for a long time, there are many research papers on the subject. The authors could include more related work in Chapter 2.

2. In the experiments, the authors choose three comparison algorithms: ORB-SLAM, OpenVSLAM, and DynaSLAM. It would be much better if the authors explained why these algorithms were selected and why they are suitable for evaluating the proposed SLAM framework.

The English is good enough to present the research results.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

My major concern is the use of the semantic segmentation method ENet. The method was published in 2016 and is therefore outdated, given how fast the field is evolving. I wonder how the approach would perform with a more advanced semantic segmentation method.

Following the previous comment, recent semantic segmentation methods should be reviewed, such as "Exploring Cross-Image Pixel Contrast for Semantic Segmentation", "Rethinking Semantic Segmentation: A Prototype View", and "Volumetric Memory Network for Interactive Medical Image Segmentation".

As stated, the method can only recognize the 19 categories defined in Cityscapes. How will it deal with new categories appearing in the scene?
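
For context, the concern about a fixed label set can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard Cityscapes 19-class trainId convention (ids 11 to 18 are the person and vehicle classes), and the function name, the choice of which classes count as movable, and the toy data are hypothetical. Features that fall on pixels labelled with a movable class are dropped, so an object from an unseen category is kept or dropped depending only on which of the 19 labels the network happens to assign to it.

```python
# Hypothetical illustration; not the authors' implementation.
# Cityscapes trainIds for the 19 evaluation classes; ids 11-18 are the
# person/vehicle classes, assumed here to be the "potentially moving" ones.
MOVABLE_TRAIN_IDS = {11, 12, 13, 14, 15, 16, 17, 18}


def filter_static_keypoints(keypoints, label_map):
    """Keep keypoints whose pixel is not labelled with a movable class.

    keypoints: list of (x, y) integer pixel coordinates.
    label_map: 2D list indexed as label_map[y][x], holding trainIds.
    """
    kept = []
    for x, y in keypoints:
        if label_map[y][x] not in MOVABLE_TRAIN_IDS:
            kept.append((x, y))
    return kept


# Toy 2x3 label map: road (0), car (13), building (2), person (11).
label_map = [
    [0, 13, 2],
    [0, 11, 2],
]
keypoints = [(0, 0), (1, 0), (2, 1), (1, 1)]
static_keypoints = filter_static_keypoints(keypoints, label_map)
print(static_keypoints)
```

In this toy case only the keypoints on the road and building pixels survive. A purely label-based filter like this one cannot distinguish a parked car from a moving one, nor can it flag a moving object whose category is outside the label set.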

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The evaluated article is a sample application of visual SLAM using the semantic segmentation neural network ENet. Visual SLAM has been addressed in the scientific literature for about 20 years, but this article brings innovation through the moving object detection (MOD) method proposed by the authors. The authors document their contribution of improved localization accuracy by testing the proposed VSLAM algorithm on the KITTI dataset and comparing it with other VSLAM methods. I consider the graphs in Figures 4, 5, and 6 and Table 2, which quantify the errors and their relatively significant reduction under the proposed VSLAM algorithm, to be the key results of the research.

I consider the conclusions drawn from the presented research to be correct, even though the experimental verification of the VSLAM system was done at a (probably) low movement speed of the TurtleBot3 robot. The references cover the topic treated in the article. I recommend publishing the article after minor formal adjustments and after the questions below have been answered.

Comments:

1. Please unify the colour legend in Figures 4 and 5.

2. Please add units to the vertical axes of the graphs in Figure 4.

Questions:

1. What was the speed of the TurtleBot3 robot during the experiment in chapter 4.3?

2. In line 258, the threshold value is set to N=5. How does the success of moving object detection change when the threshold value is increased or decreased?

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The manuscript is well-revised, so I recommend its publication.

Overall, the English is good, but I recommend that the authors do a final proofreading before submitting the final manuscript.
