Peer-Review Record

MSGCN: Multi-Subgraph Based Heterogeneous Graph Convolution Network Embedding

Appl. Sci. 2021, 11(21), 9832; https://doi.org/10.3390/app11219832
by Junhui Chen, Feihu Huang and Jian Peng *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 21 September 2021 / Revised: 10 October 2021 / Accepted: 19 October 2021 / Published: 21 October 2021
(This article belongs to the Topic Machine and Deep Learning)

Round 1

Reviewer 1 Report

The authors propose a model based on Graph Convolution Network to cope with heterogeneous graph embedding and inductive learning tasks. The idea sounds interesting and scientifically grounded. Indeed, this area has gained considerable attention recently, and it is a hot topic in the specialized field. The main points are highlighted as follows:

1) The Abstract section is a bit confusing. The authors should clarify: a) the motivation, b) the problem under study and the literature gap, c) a brief description of the proposal, and d) a quantitative summary of the results. The reviewer recommends a significant revision of this section, including rewriting most of the text.

2) Section 1 gives an interesting presentation of the problem under study. Remarkably, the gap in the specialized literature regarding network embedding methods for heterogeneous graphs remains open to study. The text of this section is very well written; however, it is missing a comparison of the enumerated contributions with references from 2021. Indeed, network embedding has gained a lot of brand-new contributions in 2021. The reviewer recognizes that the proposed contributions have their merit; however, a comparison with very recent literature is necessary to clarify them.

3) The text in Section 2 has some typos. 

4) Section 2 presents a brief literature review of the envisaged problem. The presentation is interesting, but it lacks a final closure. The reviewer suggests the authors add a table comparing their contributions with those from the literature.

5) The authors could replace the title of Section 3 with a more intuitive one. As it stands, it is very generic.

6) In lines 150 and 158, eliminate the redundant ":".

7) The authors present the core of the proposal in Section 4. The description is feasible to read and understand for someone specialized in the envisaged field. However, the subsection on the inductive learning task is tough to follow: the steps necessary to reproduce this part of the methodology are not clear. Therefore, the authors are advised to rework Section 4.5 to make explicit all steps required to reproduce the main idea.

8) Section 4.5 could better clarify the motivation for using the activation function mentioned in the text, e.g., LeakyReLU (see the sketch after this list).

9) The experimental results, as described in Table 2, are promising. The proposal performed better than the baseline approaches, a comparison strategy used in most contributions in the machine learning field. However, there is a gap here: "why" is the proposal better? The authors should add a deeper discussion of the results and of the reasons the proposal performs better.

10) For the sake of understanding and due to reproducibility concerns, mainly considering Sections 5.5 and 5.6, the authors should explicitly indicate a public repository (GitHub or GitLab) where the reader can reproduce all code and results.

11) The conclusion is very timid. The authors must expand the final discussion, highlighting the main achievements and indicating the bottlenecks of the proposal. Overall, the proposal is exciting, and the results are promising. However, the paper has several flaws in its presentation, mainly in Sections 4 and 5.

12) A proofreading service is necessary in order to eliminate typos and grammatical errors. 
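
For reference on point 8 (this sketch is not part of the manuscript under review): LeakyReLU is conventionally defined as below, and the usual motivation for preferring it over plain ReLU is that the small nonzero slope for negative inputs keeps gradients from dying on inactive units. The slope of 0.01 is the common default; the manuscript may use a different value.

```python
import numpy as np

def leaky_relu(x: np.ndarray, negative_slope: float = 0.01) -> np.ndarray:
    """LeakyReLU: identity for x >= 0, a small linear slope for x < 0.

    Unlike plain ReLU, whose gradient is exactly zero for negative
    inputs (so affected units can stop learning entirely), the small
    nonzero slope keeps some gradient flowing.
    """
    return np.where(x >= 0.0, x, negative_slope * x)

print(leaky_relu(np.array([-2.0, -0.5, 0.0, 1.0])))
# roughly: [-0.02  -0.005  0.     1.   ]
```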

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

In Table 1, G and Gi refer to the original 'heterogeneous' graph and its decompositions, rather than 'homogeneous', I presume?

In Section 4, some of the equations and their definitions change the font of the text, making them slightly hard to follow. For example, Equation 1 describes the aggregation of content information of X (capital X) elements, but then the description in line 185 talks about xi (lowercase x). Make sure there is no discrepancy between these values.

Also, Eq. 7 mentions Dni and Zni when the correct terms should be Di and Zi, respectively (as per Table 1).

In Figure 2, in the upper row, when the graph is decomposed into G1 and G|R|, the edges in the subgraphs don't match those depicted with different colors in the original graph. It would probably be easier to understand the basis of the edge-based graph decomposition if these matched.

In the results section, Figures 6 and 7 can be slightly misleading. Fig. 6 has a three-stage discontinuous X axis, where single ticks vary in the distance they represent: for example, from 1 to 1.2 (delta 0.2), then to 1.5 (delta 0.3), and back to delta 0.2 in the jump from 1.8 to 2; this is even more noticeable in the jumps of size 1 between 2 and 5 and the jump of size 5 from 5 to 10. This clearly affects the perceived slope of the curve shown in the figure.
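
To illustrate this point (with made-up numbers, not values taken from the paper), the sketch below plots the same hypothetical series twice: once with every tick given equal width regardless of the value it represents, as in Fig. 6, and once on a true linear axis. The perceived slope differs noticeably.

```python
import matplotlib.pyplot as plt

# Hypothetical tick values with uneven gaps, mimicking the axis in Fig. 6
x = [1, 1.2, 1.5, 1.8, 2, 3, 4, 5, 10]
y = [0.60, 0.63, 0.66, 0.68, 0.70, 0.74, 0.76, 0.77, 0.78]  # made-up scores

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(9, 3), sharey=True)

# Left: every tick gets the same width regardless of the value it
# represents (what the reviewed figure effectively does).
ax1.plot(range(len(x)), y, marker="o")
ax1.set_xticks(range(len(x)))
ax1.set_xticklabels([str(v) for v in x])
ax1.set_title("Evenly spaced ticks (distorts slope)")

# Right: the same data on a true linear axis; the curve flattens
# out visibly toward x = 10.
ax2.plot(x, y, marker="o")
ax2.set_title("Linear axis (faithful slope)")

plt.tight_layout()
plt.show()
```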

As for Fig. 7, there is no continuity between the values for single, 2 sub-graphs A, and so on, so I think no line should be drawn between the values.

Section 5 has Table 1 and Table 2, when they should be Tables 2 and 3, respectively.

The experimental setup description is really poor. What does it mean that 10% of the data was selected for training? Was it a random selection? How do you account for the distribution of node connectivity (especially in the DBLP dataset, which presented issues in the inductive learning task)?
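
As an illustration of what "random selection" could mean here (hypothetical sizes and labels, not the paper's data), the split protocol could be stated explicitly along these lines, e.g., uniform sampling versus stratification by node category:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes = 10_000                           # hypothetical graph size
labels = rng.integers(0, 4, size=n_nodes)  # hypothetical node categories

# Plain uniform-random 10% training selection
train_mask = np.zeros(n_nodes, dtype=bool)
train_mask[rng.choice(n_nodes, size=n_nodes // 10, replace=False)] = True

# Stratified alternative: sample 10% within each category, so rare
# node types are represented proportionally in the training set
strat_mask = np.zeros(n_nodes, dtype=bool)
for c in np.unique(labels):
    idx = np.flatnonzero(labels == c)
    strat_mask[rng.choice(idx, size=len(idx) // 10, replace=False)] = True
```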

A better presentation of the experiments is needed. For example, although emphasis is placed on the multi-node classification task, no information is given about the multi-type node properties of the input datasets. Although we know each test network has three types of edges, we do not know how many categories of nodes there are (I can guess four from the results shown in Fig. 4, although, since there is no complete legend for that figure, it is impossible to know), or how these might be distributed. Does this have any impact on the results?

I do not understand Fig. 5.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The authors implemented the suggestions mentioned in the first review round. It was tough to identify the changes because the highlighting in the text was not implemented. It is possible to note that the overall quality of the paper has improved, and the reviewer's recommendation is to accept the paper in its present form.
