Colorization of Logo Sketch Based on Conditional Generative Adversarial Networks
Round 1
Reviewer 1 Report
This paper extends the traditional U-Net structure by adding channel attention and spatial attention mechanisms. Several attention-based U-Net modules are included in the generator. One image from the Att-UNet modules is randomly selected and fed to the discriminator for training. In this context, several engineering advances are proposed, such as attention-based skip connections in the U-Net module to improve the stability of the output.
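For readers unfamiliar with the architecture the reviewer summarizes, the following is a minimal NumPy sketch of a CBAM-style channel-then-spatial attention gate applied to an encoder feature before it joins a U-Net skip connection. The function names, the pooling-plus-sigmoid gating, and the absence of learned weights are illustrative assumptions, not the paper's exact module.

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x):
    # x: (C, H, W) feature map. Squeeze the spatial dimensions with
    # global average and max pooling, then gate each channel.
    avg = x.mean(axis=(1, 2))           # shape (C,)
    mx = x.max(axis=(1, 2))             # shape (C,)
    w = _sigmoid(avg + mx)              # per-channel weights in (0, 1)
    return x * w[:, None, None]

def spatial_attention(x):
    # Pool across channels to a (H, W) map, then gate each location.
    avg = x.mean(axis=0)                # shape (H, W)
    mx = x.max(axis=0)                  # shape (H, W)
    w = _sigmoid(avg + mx)              # per-location weights in (0, 1)
    return x * w[None, :, :]

def attention_skip(encoder_feat):
    # Attention applied to the encoder feature before it is
    # concatenated with the decoder feature in the skip connection.
    return spatial_attention(channel_attention(encoder_feat))
```

In a real Att-UNet the pooled descriptors would pass through small learned layers (e.g. an MLP for the channel branch and a convolution for the spatial branch); here they are combined directly only to show the gating structure.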
Although the authors seem to be knowledgeable about the research related to GANs, they do not compare their method to the related methods [17], [18], and [19].
The math in Section 3.1 is not consistent. Notation is not always defined.
The interpretation of MAE is ad hoc and not convincing. Another objective figure of merit should be used.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
This paper proposes a logo design method based on a Conditional GAN algorithm that can output multiple colorful logos from a simple input logo sketch. The authors improved the traditional U-Net by adding an attention module to the skip connections. Experimental results suggested the feasibility and superior performance of the proposed method. This paper is interesting and worth investigating. However, to enhance the quality of this paper, the following concerns should be addressed.
- From the motivation, it is a great idea to develop an algorithm for automatic colorization of logo sketches. Practically speaking, however, different designers may want different colors for different parts of the logo, even for the same purpose, and the method proposed in this manuscript seems unable to serve this need. Can the authors propose some ideas to tackle this problem?
- The authors presented many impressive examples for visualization. However, I would like to see more quantitative measurements (e.g., either MAE, accuracy, AUC or other common metrics) to compare the proposed method against state-of-the-art methods. So far I only see Table 1, but this is not enough.
- Is Fig. 4 not fully shown? It looks odd. Also, please enrich the caption of each figure to explain every special term, so that readers can understand the figures without consulting the text of the manuscript.
- I would like to know how each part of the proposed method contributed to the final performance. For example, if a plain GAN were used instead of a conditional GAN, how would the results change?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The paper proposed a method for colorization of logo sketches based on a Conditional Generative Adversarial Network (cGAN) with an attention-based U-Net (Att-UNet) as the generator. The manuscript was well prepared; however, there are some issues that should be addressed to clarify the work.
- A performance comparison with more state-of-the-art methods should be included, rather than only with the pix2pix method.
- It would be better to also include the attention weight figures of the other comparative methods in Fig. 6.
- It can be seen from the generated logo images in Fig. 5 and Fig. 7 that the proposed method yields a rather low degree of color variation, as only red or blue dominates. Is this a limitation of the method?
- Some minor issues:
+ Fig. 4 in the manuscript should be fixed.
+ The mathematical notation in the equations should be clarified; for example, what is MLP in Eq. (6)?
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
The authors have addressed this reviewer's comments adequately. The revised manuscript can now be accepted as is.
Author Response
Thank you very much for your encouragement.
Reviewer 2 Report
The authors have addressed most of my concerns; only one has not been ideally addressed. Regarding the first concern, the authors mentioned that they will consider introducing new supervision information to meet the needs of different designers, and they claim that this can disturb the training and prevent overfitting. I would like the authors to discuss how the newly added supervision information can "disturb the training and prevent overfitting" and ultimately serve the purpose of customization for different designers. Preferably, the authors should add a new section or paragraph to discuss this.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
The authors' responses and additional experimental results adequately address the review comments. Overall, the revised manuscript can be accepted for publication.
Author Response
Thank you very much for your encouragement.