Article
Peer-Review Record

Semantic 3D Reconstruction with Learning MVS and 2D Segmentation of Aerial Images

Appl. Sci. 2020, 10(4), 1275; https://doi.org/10.3390/app10041275
by Zizhuang Wei 1,2,†, Yao Wang 1,2,†, Hongwei Yi 1,2, Yisong Chen 1,2,3 and Guoping Wang 1,2,3,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 21 December 2019 / Revised: 8 February 2020 / Accepted: 10 February 2020 / Published: 14 February 2020
(This article belongs to the Special Issue Augmented Reality, Virtual Reality & Semantic 3D Reconstruction)

Round 1

Reviewer 1 Report

The introduction of this article speaks of the importance of 3D modelling in heritage conservation, among other fields. This is entirely accurate insofar as 3D modelling is certainly a key technology in this field, but it is difficult to establish the relevance of this paper's findings to practical heritage conservation. As it stands, the paper does not establish scope, aims, and success criteria before moving on to technical detail.

What sort of semantic labels are considered appropriate or necessary for any of these fields? It is important to establish this at an early stage for the reader, especially readers intending to apply the method to a given field, because their intended use case may or may not be within your intended scope. Therefore, I would suggest that you flesh this out with reference to the state-of-the-art in these fields.

The labels used in this case appear to include general labels (vegetation, building, road...) - how does this compare to semantic labels used in real-world cultural heritage information systems, such as INSPIRE/ResCult, CityGML, etc? I suspect that the proposed system fulfils only a small subset of semantic label requirements in these systems. This is not necessarily a problem but it is important to present a specific and realistic use case if an application area is proposed.

The experimental evaluation, carried out on the Urban Drone Dataset, shows good outcomes. However, the scope and motivation issues identified above weaken the paper overall - it would be a stronger piece of work on its own terms if a strengthened justification for attention on this task were provided, or if the paper focused on incremental improvement in a standard task on this dataset and did not suggest that a direct contribution to application areas existed.

In general, the paper is otherwise well written, with a few minor grammatical points. See, for example, line 129: 'However, the same as most depth estimation methods' should read 'However, as with most depth estimation methods'.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

As the authors describe in this manuscript, they propose and explore a semantic Multi-View Stereo (MVS) method to reconstruct 3D semantic models from 2D images. A Convolutional Neural Network (CNN) predicts 2D semantic probability distributions, Structure from Motion (SfM) is further utilized, and depth maps are estimated by learning-based MVS. Finally, after a Multi-View Consistency step, the proposed method efficiently produces a fine-level 3D semantic point cloud. Their experimental results show re-projection maps achieving 88.4% Pixel Accuracy on the UDD dataset, making technologies such as Virtual Reality (VR) and Augmented Reality (AR) much more promising and flexible.

+The manuscript presents an interesting and practical semantic 3D reconstruction framework that reaches high Pixel Accuracy on the UDD dataset.

+The literature review section is adequate.

+Methodology and experimental evaluations sections are clearly designed and stated.

+Results and conclusions sections are adequate.

+References are current and related.

+Figures, graphs, and tables are persuasively and eloquently presented.

-Please show the full description of the UDD abbreviation in the Abstract section.

Overall, this manuscript is an excellent report of original research.

Author Response

Response to Reviewer 2 Comments

 

Point 1: Please show the full description of the UDD abbreviation in the Abstract section.

 

Response 1: Thank you for your approval of our work. We have updated our Abstract section to give the full description of the UDD (Urban Drone Dataset) abbreviation.
