Peer-Review Record

Semantic Information G Theory and Logical Bayesian Inference for Machine Learning

Information 2019, 10(8), 261; https://doi.org/10.3390/info10080261
by Chenguang Lu
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 20 June 2019 / Revised: 13 August 2019 / Accepted: 13 August 2019 / Published: 16 August 2019
(This article belongs to the Special Issue Machine Learning on Scientific Data and Information)

Round 1

Reviewer 1 Report

This paper is about Semantic Information G (= generalization) Theory and Logical Bayesian Inference for machine learning. So it is a piece of work drawing pretty deeply on the roots of information theory and logic. It is part review and part novel material, showing algorithms built from the principles discussed in the paper.

In this reviewer's opinion, the paper is unnecessarily hard to read and could perhaps benefit from being split into two, where the first part concentrates on background material and theoretical development and the second develops machine learning algorithms.

In the machine learning algorithm part, it would be nice to see a side-by-side comparison between existing methods (e.g., the EM algorithm) and the newly proposed method, both in terms of algorithmic recipe and empirical results.
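As an illustration of the baseline the reviewer has in mind, a minimal textbook EM recipe for a two-component one-dimensional Gaussian mixture might look like the following. This is a hypothetical sketch of the standard algorithm only, not the author's CM-EM method; the function name and the deterministic initialization are illustrative choices.

```python
import math

def em_gmm_1d(data, iters=100):
    """Minimal EM for a two-component 1-D Gaussian mixture.

    A generic textbook baseline (not the paper's CM-EM variant):
    the E-step computes responsibilities, the M-step re-estimates
    weights, means, and variances in closed form.
    """
    # Deterministic initialization: means at the data extremes,
    # unit variances, equal mixture weights.
    mu = [min(data), max(data)]
    var = [1.0, 1.0]
    w = [0.5, 0.5]

    def pdf(x, m, v):
        return math.exp(-(x - m) ** 2 / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

    for _ in range(iters):
        # E-step: posterior probability (responsibility) of each component.
        resp = []
        for x in data:
            p = [w[k] * pdf(x, mu[k], var[k]) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: closed-form re-estimation of the parameters.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            var[k] = max(var[k], 1e-6)  # guard against variance collapse
    return w, mu, var
```

A side-by-side comparison of this recipe with the proposed CM-EM steps would make the algorithmic differences immediately visible.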

To summarize, this paper could potentially be very interesting for both information theorists and machine learners, but it currently suffers from hard-to-decipher delivery.


Author Response

This file includes two letters. The first letter is sent to both reviewers to explain why the paper is reorganized in this way. The second letter is sent to reviewer 1 with detailed replies.


Author Response File: Author Response.pdf

Reviewer 2 Report

Major revision remarks:

1.      Abstract: The background part is too general and in fact does not discuss the major problem that the presented Channel Matching (CM) algorithm overcomes: “Using the CM algorithm, for multi-label classifications, we need not convert a multi-label learning task into several two-label learning tasks.” I would suggest a shorter background part, more focused on the problems.

2.      Abstract: Missing formulation of the aims of the study.

3.      Abstract: Missing arguments about the novelty, as well as details of the proposed new methods and developments.

4.      Abstract: “For the Maximum Mutual Information (MMI) classifications of unseen instances, two iterations can make mutual information surpass 99% of the MMI in most cases.” -> Missing information about the data used to estimate these results (99%). How do you assess the mutual information?

5.      Abstract: “improved EM algorithm, the CM-EM algorithm, can outperform the EM algorithm” -> The EM and CM-EM algorithms were not described earlier. The reader is left in the dark about the need for, and the kind of, improvement.

6.      Ln 119-120: The author has not formulated the aims of the study properly. The article is too vast and covers material beyond this short formulation: “Therefore, the author also wants to test the G theory by solving the MMI puzzle for classical information theory.”

7.      The paper is not formatted according to the journal’s standard requirements, with major sections: Introduction, Material and Methods, Results, Discussion. In fact, the Methods, Materials, Results and Discussion sections are a mess. I strongly recommend reformatting according to the standard journal requirements. For example, Methods should include subsections “Background” (describing previous knowledge in the field) and “Your method name” (clearly indicating what the novelty of this study is). The description in Ln 121-123 should be deleted, as it is not necessary for the standard formatting.

8.      Methods: Clearly identify the novelty of this study:

8.1.      In its larger part (15 pages), it contains already known mathematical definitions and measures that do not contribute any new knowledge, for example sections “2. The Semantic Information G theory” and “3. Logical Bayesian Inference”.

8.2.     Section 3 includes “3.3.3. A New Confirmation Measure”, but it is not clear why the adjective “new” has been used – is this measure a novelty (proposed for the first time by the author in this study), or is it rather adopted from [24]?

8.3.     It looks like there is some novelty in section “4. Multi-label Learning and Classification”. However, I find it too short (less than 1 page of 29 pages in total) and almost indistinguishable. The author should place a greater accent on the novelty and the contributions of this paper.

8.4.     Identify the novelty in section “5. The CM Iteration Algorithm for the MMI classifications of Unseen instances”. What is the difference between:

8.5.     (1) Matching I in section 4.1 vs. line 632 “Matching I: First, we obtain the Shannon channel for given S:”;

8.6.     (2) Matching II in section 4.2 vs. line 640 “Matching II: Let the Shannon channel match the semantic channel by the classifier”?

8.7.     Identify the novelty in section “6. The CM-EM Algorithm for Mixture Models”.

 

9.      Section “4.3. Results” should be presented in a major section ‘Results’. It is too short (1 page) and must be extended to present in detail the data and the experimental setting used to generate the results. The advantage of improved performance should be demonstrated by a comparison between the new and old methods (e.g., those presented in the mathematical background versus the new Matching I and II).

10.  Section “4.3. Results”: The data for generation of figures 7 and 8 are not described. The number of cases and the type of the data should be identified.

11.  Figure 8 is overloaded and incomprehensible. Include a legend to describe each specific graph. Identify how you derived the values of the thresholds.

12.  Section “4.4. Discussion” should be presented in a major section ‘Discussion’. In fact, after careful reading, I don’t find this discussion appropriate. It can be considerably shortened. Make the discussion more general and focused only on the important achievements. If some details of Figure 8 are discussed, this can be said in the Results.

13.  Section “5.2. Results” should be presented in a major section ‘Results’.

14.  Ln 685-688: “For the initial partition as Figure 11 (a), two iterations make the mutual information reach 99.99% of the convergent MMI. The author has used the above example with different parameters and different initial partitions to test the CM iteration algorithm. All iterative processes are fast and valid. In most cases, 2-3 iterations can make mutual information exceed 99% of the MMI.” -> This result is not well explained with respect to how the mutual information is estimated.

15.  Sections: “5.3. Discussion” and “6.3. Discussion” should be presented in a major section ‘Discussion’.

16.  Section “6.2. Results” should be presented in a major section ‘Results’.
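Remarks 4 and 14 both ask how the mutual information is assessed. For reference, Shannon mutual information can be computed directly from an estimated joint distribution; the following is a generic sketch of that definition, not the manuscript's own implementation (the function name and the list-of-lists representation are illustrative assumptions).

```python
import math

def mutual_information(joint):
    """Shannon mutual information I(X;Y) in bits, from a joint
    distribution given as a 2-D list joint[x][y] that sums to 1."""
    px = [sum(row) for row in joint]            # marginal P(x)
    py = [sum(col) for col in zip(*joint)]      # marginal P(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:  # 0 * log(0/...) contributes nothing
                mi += pxy * math.log2(pxy / (px[i] * py[j]))
    return mi
```

For example, an independent joint distribution yields 0 bits, while a perfectly correlated two-by-two joint distribution yields 1 bit; reporting a figure such as "99% of the MMI" presupposes a computation of this kind on stated data.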

 

Minor revision remarks:

17.  Ln 86: the acronym “SL” was not explained before.

18.  Ln 94: the acronym “MMI” should be introduced in the main text (apart from its first introduction in the abstract). Check for such other cases.

19.  Ln 353: “We the medical test” -> correct the statement;

20.  Figure 2: Insert a legend to clearly indicate which are the two probabilities (statistical and logical), so that the valuable differences can be tracked. There are too many graphs depicted (unexplained and unclear).

21.  Table 1: Footnote (1) should be included in the reference list. This is the standard way of referencing external sources. Footnotes are acceptable only if some information cannot be presented in the main text, which is not the case here.

22.  Figure 5: Include a legend, explaining all 6 depicted graphs.

23.  Ln 481: “Even if sensibility is 1” -> The correct term is “sensitivity”. The calculation of Sensitivity and Specificity is not explained. Their relation to the measure b1 (equation 3.16) is not clear.  

24.  Ln 544: “In the following, we apply the G theory and LBI to machine learning to test them.” -> This sentence is more appropriate for the aims of the study than for the end of Section 3.3.5.

25.  Figure 10:  Overloaded. Include a legend, explaining all 6 depicted graphs, and specific points.

26.  Refer “Appendix A” where appropriate in the text.

27.  Author Contributions: Missing

28.  Funding: Missing
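Regarding remark 23, the standard definitions the reviewer has in mind can be stated compactly: sensitivity = TP/(TP+FN) and specificity = TN/(TN+FP). The sketch below is generic reference code (the function name and label encoding are illustrative); its relation to the manuscript's measure b1 in equation (3.16) is exactly what the reviewer asks the author to clarify.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity and specificity for binary labels
    (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against division by zero when a class is absent.
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```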

Author Response

This file includes two letters. The first letter is sent to both reviewers to explain why the paper is reorganized in this way. The second letter is sent to reviewer 1 with detailed replies.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

The paper is much improved. It is good to see that the author is willing to follow the reviewers' recommendations. The paper could still be clearer and have more impact, but it is at an acceptable level now.

Author Response

Dear Reviewer 2:

I send a letter to you as the attachment.

Best wishes.

Chenguang Lu

Author Response File: Author Response.docx

Reviewer 2 Report

I’m glad that the author followed the reviewers’ comments with responsibility and significantly improved his manuscript. I find adequate answers to all my revision remarks.

I have the following comments:

The English style of the whole text should be extensively revised by a native English speaker (or a professional English proofreading service). Otherwise, the manuscript is not publishable. A simple example is the first sentence of the abstract, which is far from a comprehensible English structure: “An important problem with machine learning is that when label number n>2, it is very difficult to construct and optimize a group of learning functions, and we wish that optimized learning functions are still useful when prior distribution P(x) (where x is an instance) is changed.” Poorly structured sentences are everywhere, and it is not within my capability to mark all of them. I also note that past and present verb tenses are mixed within the same and neighboring sentences; the author should keep the same verb tense throughout a section.

Ln 48: “(see Appendix A for all abbreviations in this paper)” -> remove this supplementary explanation, which unreasonably segments the sentence.

Ln 85-86: “Therefore, the author tried to develop a semantic information theory that can combine Shannon’s information theory and Fisher’s likelihood method.” -> Such an explanation sounds like a definition of the aims of the study; however, this is not appropriate in the middle of the Introduction. Just present the facts consistently (always supported with the source (reference) of the information), without introducing your own thoughts (past or present). Avoid your own comments in the Introduction; keep them for the Discussion.

Ln 106-108: “Therefore, the author of this paper improved it by Eq. (2.15) in Section 2.1.1. Further, bringing likelihood functions and truth functions into Shannon’s mutual information formula, the author obtained the SMI formula.” -> Never, ever present methodological details or results of your study in the Introduction!!! Such comments are appropriate ONLY for the DISCUSSION! Check the whole Introduction for such inconsistencies: “new methods” and “own results” discussed and presented in the Introduction. Your own methods should be mentioned only in the aims of the study (briefly, without details)!

Equation (1.1) -> It is not good practice to present equations in the Introduction (there is a dedicated section (Methods I: Background) for that). You should have a serious motive to include a formula in the Introduction.

Ln 202-203: “This paper also compares the CM algorithms with some popular methods to show their efficiencies.” -> Good style requires a clear definition of the aims: “This study aims to compare the CM algorithms ….” Besides, the term “also” suggests that there should be other aims; however, I cannot identify them, due to the author’s failure to describe them consistently in the last section of the Introduction.


Author Response

Dear Reviewer 2:

I send the letter to you as the attachment.

Best wishes,

Chenguang Lu

Author Response File: Author Response.docx
