Article
Peer-Review Record

Generation of Stereo Images Based on a View Synthesis Network

Appl. Sci. 2020, 10(9), 3101; https://doi.org/10.3390/app10093101
by Yuan-Mau Lo 1, Chin-Chen Chang 2,*, Der-Lor Way 3 and Zen-Chung Shih 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 8 March 2020 / Revised: 23 April 2020 / Accepted: 25 April 2020 / Published: 29 April 2020
(This article belongs to the Section Computing and Artificial Intelligence)

Round 1

Reviewer 1 Report

The paper deals with the quite interesting topic of generating stereo images from a single image. The paper is well conceived and structured. The procedure is quite well described, so I think that the paper could be published in its present form.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The reviewed article concerns a method for generating one image of a stereo pair in stereo vision. The authors assume that they have the image seen by the left eye and generate the image that would be seen by the right eye. For this purpose, they use a method based on simulating simultaneous displacement and rotation.

In my opinion, this is a familiar approach, often used in stereo vision, so it is not the complete novelty that the authors claim. The authors also rely on other existing methods, such as previously developed neural networks.

The work also lacks well-presented research results. Subjective impressions cannot serve as confirmation of the research (as in Section 4.2). I also noticed that the authors' method does not always give better results.

However, I would like to draw attention to the potential of the proposed method. The images in the paper show that the method generates new images quite well, especially in regions with clear edges.

Therefore, an improved description of the research part would allow me to accept the work for publication.

My suggestions:
1. Manual labeling is probably not a good idea for these types of methods.
2. Figure 5 could contain some reference lines that illustrate the rotation and offset.
3. Figure 4 is not entirely clear.
4. Table 1 presents the results for "Warping" and "Our approach". It would be good to explain more precisely how the CW-SSIM calculations are made.
5. In Section 4.2 the results are presented only in the form of photos. An interesting addition to the visualization might be the difference image between the "Warping" and "Our" results. Maybe it would reveal more differences?
6. Scaling images always introduces some errors. Was it not possible to change the scale of the network? This would require retraining the network, but the results might be even better.
7. You include 38 categories. With 8233 training samples, that gives just over 200 samples per category. For neural networks this is probably a fairly small set.
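
The difference image proposed in suggestion 5 could be computed along the following lines. This is only a minimal sketch, assuming the two results are available as equally sized 8-bit arrays; the function name `difference_image` and the small synthetic arrays are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def difference_image(img_a, img_b):
    """Per-pixel absolute difference of two 8-bit images, scaled to [0, 1].

    Color images (H x W x C) are averaged over the channel axis so that
    the result is a single-channel map that can be shown as a heatmap.
    """
    a = np.asarray(img_a, dtype=np.float64)
    b = np.asarray(img_b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    diff = np.abs(a - b)
    if diff.ndim == 3:
        diff = diff.mean(axis=2)
    return diff / 255.0


# Synthetic stand-ins for the two results (hypothetical data, not from the paper).
warping = np.full((4, 6, 3), 100, dtype=np.uint8)
ours = warping.copy()
ours[1, 2] = [110, 120, 130]  # the only pixel where the two results differ

diff = difference_image(warping, ours)
print(np.argwhere(diff > 0))  # locations where the two results disagree
```

Displaying `diff` with a color map would highlight where the two results disagree, which is harder to judge from side-by-side photos.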

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The paper proposes an interesting method to estimate a stereo image pair given a single image and considers both translation and rotation of the objects in the scene.

I have the following questions/suggestions:

1. Please provide some more context on the equation in line 223.
2. The image quality is too low to observe artifacts or subtle improvements. Please include high-resolution images in future versions of the paper.
3. Line 310 looks incomplete; I did not understand: "we could the searching region to"?
4. [Fig. 6] Please find a way to show details, e.g., by zooming in. Use the full width of the page to place them as necessary.
5. It is not clear how you selected the 10 pairs from the dataset. It looks like you picked 5 perfect and 5 imperfect pairs not from the same 5 sets, but from 10 sets. Please explain why.
6. [Sec 4, Results] 2-3 minutes is a long time. Please provide time profiling. I would like to see some quantification of the time taken by the most computationally expensive operations. The description is too brief.
7. What stopped you from trying other datasets? Give proper reasons in the paper.
8. [Table 1] Please provide some more insight into the score differences between perfect/imperfect pairs.
9. [Fig. 7, Fig. 8] Really hard to analyze. Please help the reader by showing where to look. A heatmap-like visualization of the errors, or any other way of emphasizing the region of interest, would help.
10. [Fig. 10] The blue boxes look green (due to compression?). Maybe you can draw a thicker border to avoid any confusion.
11. [Fig. 11] There is no need to superimpose the cropped regions; instead, put them to the side (there is enough room in the blank spaces at the sides) and refer to the original image using arrows. This will help to visualize them more clearly.
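
The time profiling requested in suggestion 6 could be gathered with a small per-stage timer such as the one below. This is a rough sketch only: the class `StageTimer` and the stage names `stage_a` and `stage_b` are hypothetical placeholders, and the `time.sleep` calls stand in for the actual operations of the pipeline, which are not described here.

```python
import time
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage."""

    def __init__(self):
        self.totals = {}

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            elapsed = time.perf_counter() - start
            self.totals[name] = self.totals.get(name, 0.0) + elapsed

    def report(self):
        """Return (name, seconds, fraction of total), slowest stage first."""
        total = sum(self.totals.values())
        rows = [
            (name, secs, secs / total if total > 0 else 0.0)
            for name, secs in self.totals.items()
        ]
        return sorted(rows, key=lambda row: row[1], reverse=True)


timer = StageTimer()
with timer.stage("stage_a"):  # placeholder for one expensive operation
    time.sleep(0.05)
with timer.stage("stage_b"):  # placeholder for a cheaper operation
    time.sleep(0.005)

for name, secs, frac in timer.report():
    print(f"{name}: {secs:.3f} s ({frac:.0%} of total)")
```

A table of such per-stage times would show which operations dominate the reported 2-3 minutes.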

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
