Article
Peer-Review Record

LARF: Two-Level Attention-Based Random Forests with a Mixture of Contamination Models

Informatics 2023, 10(2), 40; https://doi.org/10.3390/informatics10020040
by Andrei Konstantinov, Lev Utkin and Vladimir Muliukha
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 11 December 2022 / Revised: 14 April 2023 / Accepted: 21 April 2023 / Published: 28 April 2023
(This article belongs to the Section Machine Learning)

Round 1

Reviewer 1 Report

New models of attention-based random forests called LARF are proposed in the manuscript by introducing a two-level attention mechanism together with a delicately devised computing strategy via a mixture of contamination models. On the whole, the idea behind this work is nice, and the manuscript is well organized with sound experimental validation.

I would like the written presentation to be improved further in the revised manuscript before the paper is finally accepted; this is required work for the authors.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript needs major revision and should be rewritten. It is full of grammatical errors. Please see below for specific comments and edits.

Page 1, line 17: the authors need to explain what the important role is.

Page 1, line 19: how are weights assigned? From prior knowledge?

Page 1, line 27: what is the Nadaraya-Watson kernel regression model?
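
For reference, the Nadaraya-Watson kernel regression estimate (a standard form, not quoted from the paper, whose notation may differ) is

    \hat{y}(x) = \frac{\sum_{i=1}^{n} K\left((x - x_i)/\tau\right) y_i}{\sum_{j=1}^{n} K\left((x - x_j)/\tau\right)},

where K is a kernel, τ is the bandwidth, and (x_i, y_i) are the training pairs: the prediction is a weighted average of training outputs, with weights given by kernel distances to x.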

Page 1, line 30: what is Huber’s ε-contamination model?
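
For reference, Huber’s ε-contamination model standardly defines the set of distributions

    F(x) = (1 - \varepsilon)\, P(x) + \varepsilon\, Q(x),

where P is the fixed nominal distribution, Q ranges over arbitrary distributions, and ε ∈ [0, 1] is the contamination parameter; the paper’s exact formulation may differ.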

Page 1, line 35: why is it not sufficient?

Page 2, line 66: implementation for what?

Page 2, line 67: how are they different?

Page 3, lines 106-108: This part is vague. What do you mean by “some kinds of linear approximation”?

Page 4, line 161: why have you used “will”? Are you referring to the future?

Page 12, line 305: have you optimized the number of trees, which is 100 in your case?

Pages 15 and 16: you have used the phrase “a direction for the future research” many times. Please use it more sparingly.

Page 1, line 16, “of” should be changed to “in”.

Page 1, line 36, “stems” should be changed to “stem”.

Page 2, line 39, “trees” should be changed to “a tree”.

Page 2, line 56, “analog” should be changed to “analogs”.

Page 2, line 62, “to avoid” should be changed to “avoid”.

Page 2, line 82, “training” should be changed to “trained”.

Page 2, line 86, “in” should be changed to “at”.

Page 3, line 108, “approaches” should be changed to “approach”.

Page 3, line 124, “most these” should be changed to “most of these”.

Page 3, line 125, “an disadvantage” should be changed to “a disadvantage”.

Page 4, line 152, “mean” should be changed to “means”.

Page 4, lines 153 and 154, “One of the popular kernel” should be changed to “One of the popular kernels”.

Line 156, “values and the” needs to be changed to “values, and the”.

Line 165, delete “with”.

Line 190, delete “clearly”.

Line 201, “to the possible” needs to be changed to “the possible”.

Line 211, add “the” before “imprecision”.

Lines 214 and 218, use “as an attention parameter”, not “a attention”.

Lines 235 and 236, add “, and” before the last word in the series.

Line 243: “an optimization problem”.

Line 244, use “introduction to”, not “introduction into”.

Lines 254 and 255, “weight is linearly depends” needs to be changed to “weight is linearly dependent”.

Line 273, denoted “as”.

Line 301, Let “us”.

Line 312, “every leaf of a tree”.

Line 313, add “and” before the last word in the series.

Line 315, taken “as” 10.

Line 325, taken at “the” site.

Line 346, change “clearly seen from” to “seen in”.

Line 375, delete “been”.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

The results presented in this paper are interesting and of practical importance for the improvement of machine learning. The authors present a new two-level attention model called LARF (Leaf Attention-based Random Forest). This new model is an extension of the authors’ previous work, “Attention-based random forest and contamination model” [10].

Minor Revision:

The authors list four contributions of the research (lines 71-87).

The last one states “Many numerical experiments with real datasets are performed for studying LARFs. They demonstrate outperforming results of some modifications of LARF.”

The number of trees grown in the Random Forest (ntree) is one of the important parameters of the RF model.

Line 305: The authors state “In all experiments, RFs as well as ERTs consist of 100 trees.”

The conclusions of the paper were based on the comparisons and results in Tables 3-6.

The number of trees should be set sufficiently large. I do not believe that 100 trees are enough for the RFs applied to these example datasets. I would want to know whether 100 trees are enough for the out-of-bag (OOB) error to have “settled down”. The OOB error should “settle down” as the number of trees increases. For the Boston Housing data, growing 100 trees was not enough, as examining the OOB error plot demonstrates. (This was checked by using the otherwise default settings in R with ntree = 100 and comparing the OOB error plot with that for a larger number of trees.) The authors need to demonstrate that 100 trees form a large enough ensemble so that the comparisons and results given in Tables 3-6 are correct.
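
The check described here can be reproduced in a few lines; the following is a minimal sketch in R (the reviewer’s stated environment), assuming the randomForest package and the MASS copy of the Boston Housing data, and is not code from the paper or the review:

    library(randomForest)
    data(Boston, package = "MASS")   # Boston Housing data
    set.seed(1)
    # For regression, rf$mse holds the OOB MSE after 1, 2, ..., ntree trees
    rf <- randomForest(medv ~ ., data = Boston, ntree = 1000)
    plot(rf$mse, type = "l", xlab = "Number of trees", ylab = "OOB MSE")
    abline(v = 100, lty = 2)         # the setting used in the paper

If the curve is still visibly decreasing at the dashed line, 100 trees have not yet settled down.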

Additional Minor Revisions:

1. Line 311: The authors state “all trees in experiments are trained such that at least 10 examples fall into every leaf of trees.” This is one of the parameters of an RF: the minimum size of the leaf, or terminal node. The authors need to explain how this value was determined.

2. In a random forest (RF), there are other parameters. There is a parameter for the number of variables randomly sampled as candidates at each split. The authors need to give its value and explain how it was determined.
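
For concreteness, both parameters mentioned in points 1 and 2 correspond to standard arguments of R’s randomForest; the values below are placeholders for illustration, not the paper’s settings:

    library(randomForest)
    data(Boston, package = "MASS")
    rf <- randomForest(medv ~ ., data = Boston, ntree = 500,
                       nodesize = 10,  # minimum terminal-node size, cf. the quoted line 311
                       mtry = 4)       # variables sampled per split; regression default is p/3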

3. How was the bandwidth for the kernel regression chosen?

In nonparametric kernel regression, the choice of the smoothing parameter, or bandwidth, is very important. The authors have used the term “temperature”; their notation for the bandwidth is tau in (3). Cross-validation is a standard statistical approach to bandwidth selection, and it is data dependent.

Line 307: The authors state “The search for the best parameter τ0 is carried out by considering all its values in a predefined grid. A cross-validation procedure is subsequently used to select their appropriate values.”

The authors should provide more details about the cross-validation procedure.
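
One conventional way to combine a predefined grid with cross-validation is k-fold CV over the temperature grid. The sketch below, in R with a Gaussian kernel and synthetic data, is only an illustration of that generic procedure, not the authors’ actual code; all names and the grid are illustrative:

    # Nadaraya-Watson prediction with a Gaussian kernel of temperature tau
    nw_predict <- function(x_tr, y_tr, x_te, tau) {
      sapply(x_te, function(x0) {
        w <- exp(-(x_tr - x0)^2 / tau)
        sum(w * y_tr) / sum(w)
      })
    }

    # k-fold cross-validated MSE for a given tau
    cv_mse <- function(x, y, tau, k = 5) {
      folds <- sample(rep(1:k, length.out = length(x)))
      mean(sapply(1:k, function(i) {
        pred <- nw_predict(x[folds != i], y[folds != i], x[folds == i], tau)
        mean((y[folds == i] - pred)^2)
      }))
    }

    set.seed(1)
    x <- runif(200); y <- sin(2 * pi * x) + rnorm(200, sd = 0.2)
    taus <- 10^seq(-3, 1, length.out = 25)   # the "predefined grid"
    best_tau <- taus[which.min(sapply(taus, function(t) cv_mse(x, y, t)))]

Reporting the grid, the number of folds, and the selected value would address this comment.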

4. Line 315: The authors state “The value of M is taken 10.”

The authors should provide justification for this choice of M.   

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

None.

Author Response

Response to comments on the paper "LARF: Two-level Attention-based Random Forests with a Mixture of Contamination Models"
Reviewer #2:
Review: English language and style are fine/minor spell check required
Response: We have proofread the paper and corrected the spelling.
