Article
Peer-Review Record

Short Text Classification Based on Hierarchical Heterogeneous Graph and LDA Fusion

Electronics 2023, 12(12), 2560; https://doi.org/10.3390/electronics12122560
by Xinlan Xu 1, Bo Li 1,*, Yuhao Shen 1, Bing Luo 1, Chao Zhang 2,* and Fei Hao 3
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4: Anonymous
Submission received: 26 April 2023 / Revised: 21 May 2023 / Accepted: 3 June 2023 / Published: 6 June 2023
(This article belongs to the Special Issue Industrial Artificial Intelligence: Innovations and Challenges)

Round 1

Reviewer 1 Report

Latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data, is combined with a hierarchical heterogeneous graph for short text classification in order to adjust feature weights and enhance the effectiveness of short-text expansion.

Given the impressive amount of data available, the authors argue the need to classify short texts, review the most commonly used methods, and propose a new algorithm based on the SHINE and LDA models.

Chapter 2, Related Work, is very intuitive, describing in detail short text classification and expansion methods and the SHINE and LDA models underlying the proposed SHINE+LDA method.

 

The Proposed Method and Experiments chapters are very well structured and well founded, describing in detail the proposed SHINE+LDA model, the test environment, and the dataset used for the algorithm, and evaluating the proposed SHINE+LDA model against the SHINE, HGAT, and LDA+SVM models in order to estimate the approach's effectiveness and demonstrate the accuracy of the experimental results.

Since access to the dataset, algorithm, development environment, and source program is restricted, I cannot validate the obtained results, in particular the results in Tables 1, 2, and 3 and the 2D graphical representations in Figures 5 and 6.

The paper ends with conclusions and bibliography.

Finally, according to the attached report, the percentage of plagiarism is unacceptable.

Comments for author File: Comments.pdf

Minor editing of English language required

Author Response

The authors are grateful to the editor and reviewer for his/her constructive suggestions on our paper. We have revised our paper according to the reviewer's comments. The concrete revisions are as follows.

First, in view of the missing result summary, we added a summary of the results to the abstract; please see the abstract for details. In addition, we added an introduction to the potential applications of short text classification in Section 1 and a description of the document structure at the end of that section, as detailed in Section 1. Moreover, in Section 2, we added and discussed some methods that have recently been used to classify short texts; please see Section 2 for details. Finally, in view of the lack of detail on future work, we now describe the future work in detail in Section 5; please see Section 5 for details.

Reviewer 2 Report

Short Text Classification Based on Hierarchical Heterogeneous

Graph and LDA Fusion

=========================

The paper proposes an approach using a hierarchical heterogeneous graph and LDA fusion for the classification of short text.

The paper is interesting to read. 

However, the paper has the following concerns to be addressed before it can be considered.

1. The open question in the abstract/introduction is unclear. Please rewrite it.

2. The result summary is absent in the abstract.

3. What is the significance of the work? Please highlight it.

4. The subsections under the related work are not coherent (e.g., two subsections are different; how are they linked?). Please organise them coherently.

5. Please discuss some recent short text classification approaches, such as the following:

https://arxiv.org/abs/2203.10286

https://www.hindawi.com/journals/cin/2021/2158184/

These papers explain some recent methods being used for short text (e.g., tweet) classification.

6. The SOTA comparison is weak. Please use more SOTA methods and compare against them in terms of precision, recall, etc.

7. Please do a class-wise performance analysis.

8. The paper needs statistical tests, e.g., Wilcoxon or t-tests, comparing the methods against each other.

 

Author Response

 

The authors are grateful to the editor and reviewer for his/her constructive suggestions on our paper. We have revised our paper according to the reviewer's comments. The concrete revisions are as follows.

Reviewer #2: Comments and Suggestions for Authors

Point 1: The open question in the abstract/introduction is unclear. Please rewrite it.

 Response 1: We have already rewritten it. Please see the abstract and introduction for details.

 Point 2: The result summary is absent in the abstract.

 Response 2: We have modified it. Please see the abstract and introduction for details.

Point 3: What is the significance of the work? Please highlight it.

Response 3: This work proposes a new hierarchical heterogeneous graph representation learning method that can effectively address the lack of context information in short text classification. Specifically, the method constructs multiple feature graphs to represent the text and uses the LDA topic model to adjust the feature weights, thus improving classification performance. In addition, the method also achieves good performance on Chinese short text classification. Therefore, this work is of great significance for improving the accuracy and practicality of short text classification.
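To illustrate how topic distributions can serve as feature weights, here is a minimal sketch using gensim on hypothetical toy texts; it illustrates the idea only and is not the exact implementation used in the paper.

```python
from gensim import corpora
from gensim.models import LdaModel

# Hypothetical tokenized short texts.
texts = [["stock", "market", "falls"],
         ["team", "wins", "league", "final"],
         ["market", "rally", "lifts", "stock"]]

dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]

# Train an LDA topic model; num_topics is a hyperparameter.
lda = LdaModel(corpus, num_topics=2, id2word=dictionary,
               alpha="auto", random_state=0)

# Per-document topic distributions, usable as soft feature weights.
for bow in corpus:
    print(lda.get_document_topics(bow, minimum_probability=0.0))
```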

Point 4: The subsections under the related work are not coherent (e.g., two subsections are different; how are they linked?). Please organise them coherently.

Response 4: This is a constructive suggestion. We have modified it. Please see Section 2, Related Work, for details.

Point 5: Please discuss some recent short text classification approaches, such as the following:

https://arxiv.org/abs/2203.10286

https://www.hindawi.com/journals/cin/2021/2158184/

These papers explain some recent methods being used for short text (e.g., tweet) classification.

Response 5: The Reviewer’s comment is correct and constructive. We have added it in Section 2.1. Please see Section 2.1 for details.

Point 6: The SOTA comparison is weak. Please use more SOTA methods and compare against them in terms of precision, recall, etc.

Response 6: This is a constructive suggestion. The SHINE model we compare against is considered the most advanced method available. Reviewing the relevant literature, we noticed that recall is not widely reported in related papers, whereas accuracy and F1 score are widely used as evaluation indicators. Therefore, we also use accuracy and F1 as evaluation indicators.
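For reference, a minimal sketch of the two reported metrics using scikit-learn; y_true and y_pred are hypothetical label arrays.

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 1, 1, 2, 2, 2]   # hypothetical gold labels
y_pred = [0, 1, 2, 2, 2, 1]   # hypothetical predictions

print("accuracy:", accuracy_score(y_true, y_pred))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```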

 Point 7: Please do a class-wise performance analysis.

 Response 7: This is a constructive suggestion. We've added it in Section 4.6. Please see Section 4.6 for details.

Point 8: The paper needs statistical tests, e.g., Wilcoxon or t-tests, comparing the methods against each other.

 Response 8: The Reviewer’s comment is correct and constructive. Based on the references we have reviewed, we note that the relevant papers do not seem to use statistical tests. For this reason, statistical tests are not used in this paper either.
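Should such tests be added in future work, a minimal sketch with SciPy follows; the paired per-run accuracies are hypothetical.

```python
from scipy.stats import ttest_rel, wilcoxon

# Hypothetical paired per-run accuracies for two models.
shine     = [0.721, 0.715, 0.730, 0.718, 0.725]
shine_lda = [0.733, 0.728, 0.741, 0.730, 0.736]

print(wilcoxon(shine, shine_lda))    # Wilcoxon signed-rank test
print(ttest_rel(shine, shine_lda))   # paired t-test
```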

Reviewer 3 Report

The work is certainly interesting but not entirely timely, given the already widespread acceptance of (and related research on) short texts. Nevertheless, this is counterbalanced by the cumulative volume of existing short texts that still require a variety of management techniques.

 

Expressions such as "At present, the most advanced method for short text classification is SHINE" (as well as similar sentences, e.g., line 118) require either more detail or citations in order to stand, particularly as far as the "performs better" part is concerned.

 

Section 2 would be better described in a compare/contrast manner with respect to the proposed method, which would also strengthen the motivation and contribution of the proposed method.

 

Figure 4 does not show in a fresh version of Acrobat Reader on Windows 10; I had to load it through the Google Drive platform.

 

Section 4.3 requires some form of sensitivity validation and evaluation for all selected hyper-parameter values (i.e., in addition to the 2 tested in Section 4.6): in other words, why were these values selected, how do changes in these values affect the experiments, and intuitively why do these values "work"/perform best?

 

Table 2's results could benefit from further qualitative explanation in order to support the superiority claim, given that the delta between SHINE and SHINE+LDA on the Twitter and Snippets results is quite small. The proposed method's superiority, despite its increased complexity compared to SHINE alone, would also be interesting to show in a composite metric, especially in relation to the aforementioned delta.

 

Section 4.7 starts off in the appropriate manner, i.e., discussing why the Chinese language is of special interest to this work, but both the maturity of the method, as shown by the results, and the qualitative discussion of the results leave readers desiring more.

Author Response

The authors are grateful to the editor and reviewer for his/her constructive suggestions on our paper. We have revised our paper according to the reviewer's comments. The concrete revisions are as follows.

Reviewer #3: Comments and Suggestions for Authors

Point 1: Expressions such as "At present, the most advanced method for short text classification is SHINE" (as well as similar sentences, e.g., line 118) require either more detail or citations in order to stand, particularly as far as the "performs better" part is concerned.

Response 1: Building on SHINE, our method introduces the LDA topic model to further improve the accuracy and generalization ability of short text classification, resulting in better performance in practical applications.

Point 2: Section 2 would be better described in a compare/contrast manner with respect to the proposed method, which would also strengthen the motivation and contribution of the proposed method.

 Response 2: The Reviewer’s comment is correct and constructive. We have modified it. Please see Section 2 for details.

Point 3: Figure 4 does not show in a fresh version of Acrobat Reader on Windows 10; I had to load it through the Google Drive platform.

 Response 3: We have modified it. Please see Figure 4 for details.

Point 4: Section 4.3 requires some form of sensitivity validation and evaluation for all selected hyper-parameter values (i.e., in addition to the 2 tested in Section 4.6): in other words, why were these values selected, how do changes in these values affect the experiments, and intuitively why do these values "work"/perform best?

Response 4: The Reviewer's comment is correct and constructive. In Section 4.3, apart from the threshold and the embedding size of the GCN, the remaining hyper-parameters follow the hyper-parameter settings used in the comparison experiments. We did not consider how changes in these values affect the experiments, and we will investigate the influence of the hyper-parameters in future work.

Point 5: Table 2's results could benefit from further qualitative explanation in order to support the superiority claim, given that the delta between SHINE and SHINE+LDA on the Twitter and Snippets results is quite small. The proposed method's superiority, despite its increased complexity compared to SHINE alone, would also be interesting to show in a composite metric, especially in relation to the aforementioned delta.

 Response 5: This is a constructive suggestion. We have modified it. Please see Section 4.5 for details.

Point 6: Section 4.7 starts off in the appropriate manner, i.e., discussing why the Chinese language is of special interest to this work, but both the maturity of the method, as shown by the results, and the qualitative discussion of the results leave readers desiring more.

 Response 6: The Reviewer’s comment is correct and constructive. We have modified it. Please see Section 4.7 for details.

Reviewer 4 Report

Summary:

This is a nice manuscript on the issue of providing a proper classification for short texts, combining two existing methods. The authors claim that they obtain better results than SHINE, at least for the Chinese language.

 

Broad comments:

Strengths:

  1. The research is focused on an interesting topic, and the approach is simple but apparently effective.
  2. The results seem promising (SHINE is a good reference).
  3. The manuscript is easy to read, even for non-technical personnel.

Weaknesses:

  1. In my humble opinion, the main weakness is related to the impact of the research (it might seem more like a use case).
  2. Some key parts are missing or need to be elaborated. The tuning of the parameters is not yet clear to me, and how the algorithm is applied to the dataset is very briefly addressed.

 

Specific comments:

  1. Major issues:
    1. The main issue I find is that it seems more like a use case than ground-breaking research: mixing graphs and LDA.
    2. I am missing a proper explanation of the application to the different datasets (a Materials and Methods section). What is the impact of the GCN embedding size on the performance of the text classification, and why?
    3. Please, create a “Future works” describing the next steps for this research (not just a sentence in line 324).
    4. I am not sure if this paper is a proper match for Electronics. Maybe Applied Sciences is a more appropriate journal.

 

  1. Minor issues:
    1. Please, add a description at the end of the “Introduction” section, explaining the structure of the document.
    2. Please, move Figure 2 next to its reference (line 182).
    3. Figure 4: it is apparently missing, or at least not visible in the PDF.
    4. Formula (10): please add an explanation of the ReLU function (line 209).
    5. Please, elaborate on the parameters in Section 4.3. It is not clear how they have been selected. Moreover, it is not clear why k=15, k=20, P=2, etc.

Comments for author File: Comments.pdf

  1. Please, consider using a native English professional translator to review the document: some sentences are not clear.

Author Response

The authors are grateful to the editor and reviewer for his/her constructive suggestions on our paper. We have revised our paper according to the reviewer's comments. The concrete revisions are as follows.

Reviewer #4: Comments and Suggestions for Authors

Point 1: The main issue I find is that it seems more like a use case than ground-breaking research: mixing graphs and LDA.

Response 1: Graphs and LDA are indeed known techniques and algorithms rather than brand-new research, but that does not mean their combination lacks value or creativity. Mixing graphs and LDA is very useful in many application areas, and we propose new solutions by improving and combining existing methods to drive their development in natural language processing.

Point 2: I am missing a proper explanation of the application to the different datasets (a Materials and Methods section). What is the impact of the GCN embedding size on the performance of the text classification, and why?

Response 2:

Data preprocessing includes the following steps:

  • Each sentence is divided into words.
  • Stop words and low-frequency words that appear fewer than 5 times in the corpus are removed.
  • Following the method in Reference 9, for each dataset we randomly selected 40 labeled documents in each class, half of which were used as the training set and the other half as the validation set. The remaining documents are used as the test set and are treated as unlabeled documents during training.

Please see Section 4.2 for the specific datasets. A minimal sketch of these preprocessing steps is given below.
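The sketch assumes whitespace tokenization and a hypothetical stop-word list; the exact pipeline in the paper may differ.

```python
import random
from collections import Counter

STOP_WORDS = {"the", "a", "of", "is"}   # hypothetical stop-word list

def preprocess(sentences, min_freq=5):
    """Divide sentences into words, then drop stop words and words
    that appear fewer than min_freq times in the corpus."""
    tokenized = [s.lower().split() for s in sentences]
    counts = Counter(w for toks in tokenized for w in toks)
    return [[w for w in toks if w not in STOP_WORDS and counts[w] >= min_freq]
            for toks in tokenized]

def split_labeled(docs_by_class, n_labeled=40, seed=0):
    """Per class: 40 labeled docs (half train, half validation);
    the remaining docs form the test set, unlabeled during training."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for docs in docs_by_class.values():
        docs = list(docs)
        rng.shuffle(docs)
        labeled = docs[:n_labeled]
        train += labeled[:n_labeled // 2]
        val += labeled[n_labeled // 2:]
        test += docs[n_labeled:]
    return train, val, test
```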

The node embedding representation in a GCN is affected by its neighbor nodes. If the embedding size is too small, the embedding may not capture enough neighbor-node information, resulting in performance degradation. Conversely, if the embedding size is too large, it may result in overfitting or increased computational complexity, which also hurts performance. Therefore, choosing the right embedding size is critical to the performance of the GCN.
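A minimal PyTorch sketch, with hypothetical dimensions, of where the embedding size enters a GCN layer:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, embed_dim):
        super().__init__()
        # embed_dim is the embedding size discussed above.
        self.W = nn.Linear(in_dim, embed_dim, bias=False)

    def forward(self, A_hat, H):
        # A_hat is the normalized adjacency: each node aggregates its
        # neighbors, so embed_dim bounds how much neighborhood
        # information is retained.
        return torch.relu(A_hat @ self.W(H))

layer = GCNLayer(in_dim=300, embed_dim=64)   # hypothetical sizes
```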

Point 3: Please, create a “Future works” describing the next steps for this research (not just a sentence in line 324).

Response 3: The Reviewer’s comment is correct and constructive. We have added it in Section 5. Please see Section 5 for details. 

Point 4: I am not sure if this paper is a proper match for Electronics. Maybe Applied Sciences is a more appropriate journal. 

Response 4: This is a constructive suggestion. However, short text classification technology has many applications. For example, short text classification can be used to analyze soil data to determine the presence of specific contaminants or to determine soil quality levels. It can also be used to analyze data from industrial equipment to better understand its operating state and diagnose faults.

Point 5: Please, add a description at the end of the “Introduction” section, explaining the structure of the document. 

Response 5: The Reviewer’s comment is correct and constructive. We have added it in Section 1. Please see Section 1 for details. 

Point 6: Please, move Figure 2 next to its reference (line 182).

Response 6: We have modified it. Please see line 234 for details. 

Point 7: Figure 4: it is apparently missing, or at least not visible in the PDF. 

Response 7: We have modified it. Please see Figure 4 for details. 

Point 8: Formula (10): please add an explanation of the ReLU function (line 209).

Response 8: The Reviewer’s comment is correct and constructive. We have added it on line 262. Please see line 262 for details. 

Point 9: Please, elaborate on the parameters in Section 4.3. It is not clear how they have been selected. Moreover, it is not clear why k=15, k=20, P=2, etc.

Response 9: This is a constructive suggestion. We have modified it. Please see Section 4.3 for details. 

Point 10: Please, consider using a native English professional translator to review the document: some sentences are not clear. 

Response 10: This is a constructive suggestion. We have modified it.

Reviewer 5 Report

Very interesting paper. As far as I can tell, its grammar is OK. Its structure is appropriate, but I suggest some improvements.

 

1. The goal of the research work is not defined. What are the potential applications of such classifications?

2. The last paragraph of the Introduction section refers to the SHINE+LDA method without explaining it. It also refers to 4 benchmark datasets without naming them.

Please consider moving this paragraph to after the related work is explained, and define clear goals for your work.

3. Line 93 uses the notation Dir(alpha), while line 99 uses Dirichlet(alpha); please use the same notation in both places.

4. Not all operators are explained in (1).

5. The model described in 2.3 is based on the alpha and beta parameters. Please explain how to select them appropriately.

6. L118 states that SHINE performs better; L32 says it is the best. Is this supported by facts?

7. Before Section 3.1, please add a paragraph describing why the word-level graph is needed and how it is produced.

8. L132: how do you get the parameters of (2)?

9. (6) seems to be wrong; it always results in 1. Probably the upper index of the first parameter should be i.

10. Please explain x_e in (6) better; how is it a vector?

11. (10) again refers to trainable parameters without explaining how to train them.

12. L228-L234 should go before Algorithm 1.

13. L234: the data used from 'Headlines Today' must be saved and published to validate your results.

14. The header of Table 1 is not described correctly: quantity of what? Average length (in words?)? Classes?

15. Support (12) and (13) with references.

16. Please describe the potential benefits of your research.

Author Response

The authors are grateful to the editor and reviewer for his/her constructive suggestions on our paper. We have revised our paper according to the reviewer's comments. The concrete revisions are as follows.

Reviewer #5: Comments and Suggestions for Authors

Point 1: The goal of the research work is not defined. What are the potential applications of such classifications?

Response 1: The goal of the research work is to improve the accuracy of short text classification by dealing with semantic and syntactic information missing in short text.

The potential applications of such classifications:

  • Short text classification technology can be used to analyze data from industrial equipment, providing a better understanding of the equipment's operating state and helping to diagnose faults.
  • It can classify and cluster large-scale news reports and social media topics, and analyze and predict the development and trends of events.
  • Based on users' historical behavior and interests, a short text library can be classified and matched to recommend personalized goods or services to users.

Point 2: The last paragraph of the Introduction section refers to the SHINE+LDA method without explaining it. It also refers to 4 benchmark datasets without naming them.

Please consider moving this paragraph to after the related work is explained, and define clear goals for your work.

Response 2: This is a constructive suggestion. Please see the introduction for details. 

Point 3: Line 93 uses the notation Dir(alpha), while line 99 uses Dirichlet(alpha); please use the same notation in both places.

Response 3: We have modified it. Please see the details in Section 2.3. 

Point 4: Not all operators are explained in (1).

Response 4: We have added it. Please see the details in Section 2.3. 

Point 5: The model described in 2.3 is based on the alpha and beta parameters. Please explain how to select them appropriately.

Response 5: We have added how to select the alpha and beta parameters. Please see the details in Section 2.3. 

Point 6: L118 states that SHINE performs better; L32 says it is the best. Is this supported by facts?

Response 6: Yes, it is. Reference 13 shows that SHINE is currently the best-performing model for short text classification.

Point 7: Before Section 3.1, please add a paragraph describing why the word-level graph is needed and how it is produced.

Response 7: The Reviewer’s comment is correct and constructive. We have added it. Please see the details in Section 3.1. 

Point 8: L132: how do you get the parameters of (2)?

Response 8: The trainable parameters in formula (2) are learned as model parameters of the neural network and are updated by the backpropagation algorithm. During neural network training, the trainable parameters are therefore continually updated until the loss function reaches a minimum or the number of training iterations reaches a preset value.
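A generic sketch of such a backpropagation loop in PyTorch, with hypothetical stand-ins for the paper's model and data:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 3)                 # hypothetical model
inputs = torch.randn(32, 10)             # hypothetical features
targets = torch.randint(0, 3, (32,))     # hypothetical labels
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)  # forward pass
    loss.backward()                         # backpropagation computes gradients
    optimizer.step()                        # update the trainable parameters
```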

Point 9: (6) seems to be wrong; it always results in 1. Probably the upper index of the first parameter should be i.

Response 9: This is a constructive suggestion. We have modified it. Please refer to Formula (6) for details. 

Point 10: Please explain x_e in (6) better; how is it a vector?

Response 10: First, an entity-type knowledge graph is constructed by creating a node for each entity type and creating edges for the relationships between different entity types. The TransE method is then used to map entities and relationships into low-dimensional vector spaces. Finally, the entity and relationship embedding vectors are learned by minimizing a distance-based loss function.
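A minimal sketch of the TransE scoring and margin loss; the entity/relation counts, dimensions, and sampled triples are hypothetical.

```python
import torch
import torch.nn as nn

n_entities, n_relations, dim, margin = 100, 10, 50, 1.0
ent = nn.Embedding(n_entities, dim)    # entity embeddings
rel = nn.Embedding(n_relations, dim)   # relation embeddings

def transe_loss(h, r, t, t_neg):
    # TransE assumes h + r ≈ t, so the score is the distance ||h + r - t||.
    pos = torch.norm(ent(h) + rel(r) - ent(t), dim=-1)
    neg = torch.norm(ent(h) + rel(r) - ent(t_neg), dim=-1)
    # The margin ranking loss pushes true triples closer than corrupted ones.
    return torch.clamp(margin + pos - neg, min=0).mean()

# Hypothetical batch of (head, relation, tail) indices plus corrupted tails.
h, r, t, t_neg = (torch.tensor([1, 2]), torch.tensor([0, 3]),
                  torch.tensor([5, 7]), torch.tensor([9, 4]))
print(transe_loss(h, r, t, t_neg))
```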

Point 11: (10) again refers to trainable parameters without explaining how to train them.

Response 11: The trainable parameters in formula (10) are trained by the backpropagation algorithm. End-to-end training is used to optimize the model parameters through backpropagation so as to minimize the loss function. Specifically, in the SHINE+LDA algorithm, adaptive graph learning is implemented through joint optimization over all node embeddings and adjacency matrices. The label information is then propagated using a 2-layer GCN, and the category prediction probability for each short text is calculated using the softmax function. Finally, cross entropy is used as the loss function to train the whole model end to end and obtain the parameters that minimize the loss.
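A minimal sketch of the 2-layer GCN propagation with a cross-entropy loss; the graph, features, and dimensions are hypothetical, and the adaptive graph-learning step is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoLayerGCN(nn.Module):
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.W1 = nn.Linear(in_dim, hid_dim, bias=False)
        self.W2 = nn.Linear(hid_dim, n_classes, bias=False)

    def forward(self, A_hat, X):
        H = torch.relu(A_hat @ self.W1(X))   # layer 1: aggregate neighbors
        return A_hat @ self.W2(H)            # layer 2: class logits per node

A_hat = torch.eye(5)                     # hypothetical normalized adjacency
X = torch.randn(5, 16)                   # hypothetical node features
labels = torch.tensor([0, 1, 2, 0, 1])   # hypothetical classes

model = TwoLayerGCN(16, 8, 3)
logits = model(A_hat, X)
# Softmax + cross entropy; gradients flow end to end through both layers.
loss = F.cross_entropy(logits, labels)
loss.backward()
```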

Point 12: L228-L234 should go before Algorithm 1.

Response 12: This is a constructive suggestion. We have modified it. Please see Section 3.2 for details. 

Point 13: L234: the data used from 'Headlines Today' must be saved and published to validate your results.

Response 13: We have added it. Please see L288-L289 for details. 

Point 14: The header of Table 1 is not described correctly: quantity of what? Average length (in words?)? Classes?

Response 14: The Reviewer’s comment is correct and constructive. We have modified it. Please see Table 1 for details. 

Point 15: Support (12) and (13) with references.

Response 15: This is a constructive suggestion. We have added it. Please see References 31 and 32. 

Point 16: Please describe the potential benefit of your research. 

Response 16:

  • It improves the accuracy and efficiency of short text classification and provides better solutions for practical applications.
  • By constructing a hierarchical heterogeneous graph and introducing the LDA topic model for feature-weight adjustment, the particularities of short text data can be better handled and the data-sparsity problem can be alleviated.
  • It can be applied to a variety of NLP tasks: the SHINE+LDA model is not only suitable for short text classification but can also be extended to other areas, such as named entity recognition and relation extraction.

Round 2

Reviewer 2 Report

Thanks for the revision. The manuscript is acceptable now.

Reviewer 4 Report

I consider that the authors have responded to all the inquiries that I raised in my previous review:

  • As for major concerns:
    • Authors have added an explanation regarding why this research fills a gap in the actual state of the art (combination of LDA and graphs), and they have also given a rationale for its inclusion in Electronics.
    • Authors have also explained the application of the datasets that have been used.
    • The structure of the document has been improved, with the description of the organization of the manuscript, and a brief description of the Future Works as next steps in the development.

 

  • As for minor issues:
    • All minor concerns have been taken into consideration:
      • Description of the organization of the document.
      • Small typos in several lines.
      • Figures 2 and 4 now are visible and in their right place.
      • Formula #(10) has been detailed.
      • The selection of the actual values of the parameters has been explained.

 

As a summary, in my humble opinion, the manuscript can be considered for publication.

Comments for author File: Comments.pdf

Minor editing of the English language is required, but it is much improved.
