Article
Peer-Review Record

FedUA: An Uncertainty-Aware Distillation-Based Federated Learning Scheme for Image Classification

Information 2023, 14(4), 234; https://doi.org/10.3390/info14040234
by Shao-Ming Lee 1 and Ja-Ling Wu 1,2,*
Reviewer 1:
Reviewer 2: Anonymous
Submission received: 27 February 2023 / Revised: 4 April 2023 / Accepted: 6 April 2023 / Published: 10 April 2023
(This article belongs to the Special Issue Artificial Intelligence and Big Data Applications)

Round 1

Reviewer 1 Report

The authors propose an uncertainty-measurement-based approach to handle the problem of non-IID real-world user data in federated scenarios.

The core idea and approach are interesting to the reader. The authors face a significant challenge, and the concept behind the paper is worth investigating.

However, the paper must be substantially improved, since it falls short on several points:

  1. The paper must improve its style.
    1. Improve Figure 7 and correct the labeling of the figures on page 23.
    2. Fix Figure 4.
    3. Did the authors adopt the correct template for the journal?
    4. Improve the code examples on lines 239 and 245; please adopt a more explicit representation (https://www.overleaf.com/learn/latex/Algorithms).
  2. The related work should be substantially improved to include more references concerning the non-IID issue and the other challenges in the FL domain (https://doi.org/10.48550/arXiv.1905.10497, https://doi.org/10.1016/j.eswa.2021.116109, https://doi.org/10.48550/arXiv.2006.07242, https://doi.org/10.48550/arXiv.1905.06641, http://dx.doi.org/10.2139/ssrn.3696609, https://doi.org/10.1016/j.future.2022.06.006, https://doi.org/10.1016/j.ins.2022.11.126, and more...).

A summary table of the differences between the paper and the related work would be appreciated.

 

  1. Why a Gaussian mixture? How did the authors set its parameters?
  2. Improve paragraph 3.3 with more details.
  3. Improve the subsection “The Impact of Sample Assessment”.
  4. The results should put more emphasis on the contribution of “Uncertainty Measurement”.

Author Response

Point-by-point replies to the comments of Reviewer 1:

 

The authors propose an uncertainty-measurement-based approach to handle the problem of non-IID real-world user data in federated scenarios.

The core idea and approach are interesting to the reader. The authors face a significant challenge, and the concept behind the paper is worth investigating.

 

Reply: We thank the reviewer for the encouragement; we did learn a lot from your informative comments and suggestions.

 

However, the paper must be substantially improved, since it falls short on several points:
  1. The paper must improve its style.
    1. Improve Figure 7 and correct the labeling of the figures on page 23.
    2. Fix Figure 4.
    3. Did the authors adopt the correct template for the journal?
    4. Improve the code examples on lines 239 and 245; please adopt a more explicit representation (https://www.overleaf.com/learn/latex/Algorithms).

 

Reply: As suggested, we have tried our best to improve the quality of, and give better explanations for, all the figures (cf. Figures 2, 4, 5, 6, 7, 11, 12, and 13) and tables (cf. Tables 1-3) in our revision. Moreover, we have reformatted our original manuscript according to the template provided on the journal’s website.

 

  2. The related work should be substantially improved to include more references concerning the non-IID issue and the other challenges in the FL domain (https://doi.org/10.48550/arXiv.1905.10497, https://doi.org/10.1016/j.eswa.2021.116109, https://doi.org/10.48550/arXiv.2006.07242, https://doi.org/10.48550/arXiv.1905.06641, http://dx.doi.org/10.2139/ssrn.3696609, https://doi.org/10.1016/j.future.2022.06.006, https://doi.org/10.1016/j.ins.2022.11.126, and more...).

 

Reply: Thanks for bringing the literature mentioned above to our attention. A new sub-section (Section 4.1) is added to the revision in response to this suggestion. We briefly described the contributions and commented on the pros and cons of all the suggested references in Section 4.1. 

 

  1. A summary table of the differences between the paper and the related work would be appreciated.

 

Reply: We found that Reference [33] has already tabulated the state of the art of federated learning published between 2018 and 2022 in terms of applications, adopted approaches, pros, and cons. We therefore decided not to duplicate that work but to refer interested readers to [33] (lines 523-526 in the revision).

  2. Why a Gaussian mixture? How did the authors set its parameters?

Reply: We added a few words at the beginning of line 189 to respond to this comment; the parameters are set according to Equations 1 and 2.
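For readers who want a concrete picture of this kind of step, the sketch below is only an illustration and is not the paper's implementation: it assumes scikit-learn's GaussianMixture, a two-component mixture over one-dimensional per-sample uncertainty scores, and an arbitrary 0.5 responsibility threshold; the paper's own parameterization follows its Equations 1 and 2.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_by_gmm(uncertainty_scores, threshold=0.5):
    """Fit a two-component Gaussian mixture to 1-D per-sample uncertainty
    scores and flag the samples most likely drawn from the low-uncertainty
    component. Illustrative only; names and threshold are assumptions."""
    x = np.asarray(uncertainty_scores, dtype=float).reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=0).fit(x)
    low = int(np.argmin(gmm.means_.ravel()))   # component with the smaller mean
    p_low = gmm.predict_proba(x)[:, low]       # posterior responsibility per sample
    return p_low > threshold                   # boolean mask of "confident" samples
```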

  3. Improve paragraph 3.3 with more details.

 

Reply: To respond to comment 3, we added a new figure (Figure 5) and redrew Figure 6, adding a few indications of the information flow, so that the picture of our architecture becomes much more comprehensible. Based on Figure 5, some paragraphs (colored in blue) were added in Sub-section 3.3 specifically to explain the composition of our system architecture.

 

  4. Improve the subsection “The Impact of Sample Assessment”.

 

Reply: As for comment 4, besides emphasizing that BALD-based sample screening performs better than random batch selection, a new paragraph (colored in blue) has been added in Sub-section 4.2.1 (a) to explain why BALD filtering is effective in reducing the required computational load.
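As background for the reply above, here is a minimal NumPy sketch of the standard BALD acquisition score (mutual information between predictions and model parameters, estimated from Monte-Carlo dropout passes) together with a top-k selection helper. The function names, array shapes, and selection rule are illustrative assumptions; the paper's exact screening procedure may differ.

```python
import numpy as np

def bald_scores(mc_probs, eps=1e-12):
    """BALD score per sample from T Monte-Carlo dropout passes.
    mc_probs has shape (T, N, C): T stochastic passes, N samples, C classes,
    each row a softmax distribution."""
    mean_p = mc_probs.mean(axis=0)                                     # (N, C)
    entropy_of_mean = -np.sum(mean_p * np.log(mean_p + eps), axis=1)   # predictive entropy
    mean_entropy = -np.mean(np.sum(mc_probs * np.log(mc_probs + eps), axis=2), axis=0)
    return entropy_of_mean - mean_entropy                              # higher = more informative

def select_most_informative(mc_probs, k):
    """Indices of the k samples with the highest BALD scores."""
    return np.argsort(bald_scores(mc_probs))[::-1][:k]
```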

 

  5. The results should put more emphasis on the contribution of “Uncertainty Measurement”.

 

Reply: Finally, a new figure (Figure 12) and a new paragraph (colored in blue) were added in Sub-section 4.2.1 (b) to address the contributions of the proposed Uncertainty Measurement mechanism (as a specific reply to comment 5).

 

Notice that all the newly added paragraphs are colored in blue in the revision.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors of this paper address the challenges faced in Federated Learning (FL), specifically focusing on non-IID data distribution and limited communication bandwidth. They propose a novel architecture for model aggregation by incorporating knowledge distillation and deep neural network (DNN) uncertainty quantification methods. Through image classification experiments, they demonstrate that their approach effectively overcomes the challenges above.

Strengths:

The paper is well-structured and provides a clear overview of the research problem and proposed solution.

 

The authors identify the main challenges in FL, such as data security, unbalanced and non-IID data distribution, and unreliable connections, and focus their work on addressing these issues.

 

The proposed architecture that combines knowledge distillation and DNN uncertainty quantification methods shows promise in improving the performance of FL.

 

The experimental results support the effectiveness of the proposed model aggregation scheme, especially when the transmission cost is limited.

 

Weaknesses:

The paper could benefit from a more detailed description of the proposed architecture, explaining the underlying mechanisms and how they are combined to improve FL performance.

 

A comparison with existing approaches or baseline models in the literature would strengthen the argument for the effectiveness of the proposed method.

 

The authors should consider extending their experiments to other tasks and datasets to demonstrate the generalizability of their proposed model aggregation scheme.

 

A discussion of the computational complexity and scalability of the proposed approach would be valuable to assess its practicality in real-world applications.

 

DNNs have been used for many years; please explain why the model is novel by consulting the book https://www.deeplearningbook.org/ or "Neural Networks and Learning Machines" by S. O. Haykin.

Please edit the English and check for grammatical mistakes; there are online tools that can help with this.

In conclusion, the paper presents a novel approach to address the challenges in Federated Learning, particularly non-IID data distribution and limited communication bandwidth. By incorporating knowledge distillation and DNN uncertainty quantification methods, the authors propose a new architecture for model aggregation that shows promising results in image classification tasks. Further elaboration on the architecture, comparison with existing methods, and extended experiments would strengthen the paper and its contributions to the field.

Author Response

Point-by-point replies to the comments of Reviewer 2:

  1. The authors of this paper address the challenges faced in Federated Learning (FL), specifically focusing on non-IID data distribution and limited communication bandwidth. They propose a novel architecture for model aggregation by incorporating knowledge distillation and deep neural network (DNN) uncertainty quantification methods. Through image classification experiments, they demonstrate that their approach effectively overcomes the challenges above.

Strengths:

The paper is well-structured and provides a clear overview of the research problem and proposed solution.

The authors identify the main challenges in FL, such as data security, unbalanced and non-IID data distribution, and unreliable connections, and focus their work on addressing these issues. 

The proposed architecture that combines knowledge distillation and DNN uncertainty quantification methods shows promise in improving the performance of FL.

The experimental results support the effectiveness of the proposed model aggregation scheme, especially when the transmission cost is limited.

Reply: We thank the reviewer for the encouragement. We did learn a lot from your informative comments and suggestions.

2. Weaknesses:

The paper could benefit from a more detailed description of the proposed architecture, explaining the underlying mechanisms and how they are combined to improve FL performance.

Reply: As suggested, we have revised Sub-sections 3.3, 4.2.1 (a), and 4.2.1 (b) to provide more detailed descriptions of our architecture, the underlying mechanisms, and their effectiveness in performance improvement in FL. Notice that all the newly added paragraphs are colored in blue in the revision.
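To make the distillation component of the architecture easier to picture, the following PyTorch sketch shows the classical knowledge-distillation objective (a temperature-softened KL term against a teacher plus cross-entropy on the hard labels). It is a generic illustration of distillation under assumed hyperparameters (temperature T and mixing weight alpha), not a claim about FedUA's exact loss or aggregation rule.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Classical knowledge-distillation objective: soft KL term against the
    teacher at temperature T, plus cross-entropy on the hard labels.
    Hyperparameters here are illustrative assumptions."""
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    kd_term = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```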

  3. A comparison with existing approaches or baseline models in the literature would strengthen the argument for the effectiveness of the proposed method.

Reply: A new sub-section (Section 4.1) is added to the revision in response to this suggestion. We added six new references in the revision, briefly described their contributions, and commented on the pros and cons of those recently published related references.

 

  4. The authors should consider extending their experiments to other tasks and datasets to demonstrate the generalizability of their proposed model aggregation scheme.

Reply: In theory, we agree with the reviewer's comment that "we should consider extending our experiments to other tasks and datasets to demonstrate the generalizability of our proposed model aggregation scheme." However, finding enough computational resources and large datasets to conduct accurate and concrete experiments is challenging in academia. We added a new paragraph at the beginning of Sub-section 4.2.2 (c) in the revision to address this reality and to explain why we designed our simulations around the effects of limited allowable communication capacity, using the ratio of clients (denoted by C) that upload to the server each round as the testing-condition parameter.
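As a small illustration of the simulation knob described above, the sketch below samples a fraction C of clients to report to the server in a given round, in the style of FedAvg simulations. The helper name, seeding, and the 100-client example are our own assumptions, not details taken from the paper.

```python
import random

def sample_clients(client_ids, C, seed=None):
    """Pick a fraction C of the clients to upload to the server in one round
    (at least one client). Illustrative only."""
    rng = random.Random(seed)
    m = max(1, int(C * len(client_ids)))
    return rng.sample(list(client_ids), m)

# Example: with 100 simulated clients and C = 0.1, ten clients report per round.
round_clients = sample_clients(range(100), C=0.1, seed=42)
```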

  5. A discussion of the computational complexity and scalability of the proposed approach would be valuable to assess its practicality in real-world applications.

Reply: Again, we agree with the reviewer's comment that "a discussion of the computational complexity and scalability of the proposed approach would be valuable to assess its practicality in real-world applications." However, for the same reason given in our previous reply, we cannot yet back the analyzed complexity with convincing scalability experiments on the proposed system. We have decided to leave this to our future work.

  6. DNNs have been used for many years; please explain why the model is novel by consulting the book https://www.deeplearningbook.org/ or "Neural Networks and Learning Machines" by S. O. Haykin.

Reply: Thanks for bringing these references to our attention. After investigating most of the accompanying slides, we found that most of their last updates were made before 2018. However, FL was introduced in 2016, the first work concerning knowledge distillation in this setting was published in 2018, and the concept of data uncertainty was not addressed until 2021. Although we have not read the whole book, we are confident that the proposed model is not covered by it.

  7. Please edit the English and check for grammatical mistakes; there are online tools that can help with this.

Reply: We have revised the paper's wording as much as possible with the aid of Grammarly and have asked a native speaker to check the English usage of the revision.

  8. Further elaboration on the architecture, comparison with existing methods, and extended experiments would strengthen the paper and its contributions to the field.

Reply: As suggested, we have tried our best to prepare this revision.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors have addressed all the reviewers' requests. The paper has been improved, and many of the unclear points have been clarified, making it more readable and easier to understand.
