# Communication Cost Reduction with Partial Structure in Federated Learning


## Abstract


## 1. Introduction

- Distributing the server model to each client as a smaller derivative model reduces the communication cost while providing better training efficiency than the standard method.
- The models distributed to clients require no additional computation, such as decompression, and can be used for local training as they are.
- Because a client never sees the complete structure of the server model, the method offers some protection against attacks such as malicious gradient uploads.
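A rough cost model makes the first point concrete: if the derivative model replaces a span of server layers with a single smaller bridging layer, the per-round traffic shrinks roughly in proportion to the parameters removed. A minimal sketch, in which the layer sizes, client count, and `traffic_per_round` helper are illustrative assumptions rather than the paper's actual architecture:

```python
# Rough per-round communication cost model for federated learning.
# Layer parameter counts here are illustrative assumptions, not the
# paper's actual server architecture.

def traffic_per_round(layer_params, clients):
    """Bytes exchanged in one round: each client downloads the model
    and uploads an update of the same size (4-byte float32 weights)."""
    model_bytes = 4 * sum(layer_params)
    return 2 * clients * model_bytes

# Hypothetical 10-layer server model (parameter counts per layer).
server = [500, 25_000, 50_000, 100_000, 100_000,
          100_000, 100_000, 100_000, 50_000, 5_000]

# Derivative model: layers i..j-1 of the server are replaced by one
# smaller bridging layer, so clients never see the full structure.
i, j = 3, 9
derivative = server[:i] + [10_000] + server[j:]

full = traffic_per_round(server, clients=10)
partial = traffic_per_round(derivative, clients=10)
print(f"traffic ratio (partial/standard) = {partial / full:.3f}")
```

The ratio falls well below 1 whenever the bridging layer is smaller than the span of layers it replaces, which is the source of the traffic savings reported in the experiments.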

## 2. Background

#### 2.1. Federated Learning

#### 2.2. Related Works

## 3. Method

Algorithm 1: FederatedPartial. T is the number of communication rounds, B is the local minibatch size, E is the number of local epochs, $\eta$ is the learning rate, i and j are the indices of the layers to be replaced, and p is the probability of being distributed.
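The caption's symbols suggest the following round structure. This is a hedged reconstruction from the symbol definitions alone, not the paper's exact Algorithm 1: the `make_derivative` construction, the toy local update, and the merge rule are all assumptions.

```python
import random
import numpy as np

# Hedged sketch of a FederatedPartial-style training loop, built only
# from the caption's symbols (T, B, E, eta, i, j, p).
rng = np.random.default_rng(0)

def make_derivative(server_model, i, j, bridge_size=8):
    """Replace server layers i..j-1 with one small bridging layer,
    hiding the full server structure from the client (assumption)."""
    kept = {k: w.copy() for k, w in server_model.items() if not (i <= k < j)}
    kept["bridge"] = rng.normal(size=bridge_size)
    return kept

def local_update(model, data, eta, E, B):
    """E epochs of toy minibatch 'training': nudge each layer toward
    the minibatch mean (stands in for a real gradient step)."""
    model = {k: w.copy() for k, w in model.items()}
    for _ in range(E):
        for start in range(0, len(data), B):
            batch = data[start:start + B]
            for k in model:
                model[k] += eta * (batch.mean() - model[k])
    return model

def federated_partial_round(server_model, client_data, i, j, eta, E, B, p):
    updates = []
    for data in client_data:
        if random.random() < p:            # distribute with probability p
            derived = make_derivative(server_model, i, j)
            updates.append(local_update(derived, data, eta, E, B))
    # Merge: average the shared (non-replaced) layers back into the server.
    for k in server_model:
        shared = [u[k] for u in updates if k in u]
        if shared:
            server_model[k] = np.mean(shared, axis=0)
    return server_model

# Toy run: 10-layer model, 4 clients, T = 3 rounds.
server = {k: rng.normal(size=16) for k in range(10)}
clients = [rng.normal(loc=c, size=100) for c in range(4)]
for _ in range(3):                         # T communication rounds
    server = federated_partial_round(server, clients, i=3, j=9,
                                     eta=0.001, E=5, B=50, p=2 / 3)
```

Note that only the layers outside the replaced span i..j-1 are merged back into the server model; how the paper handles the bridging layer's updates is not recoverable from the caption.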

## 4. Experiment

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References


**Figure 2.** Configuration of the server model used in the experiment, and the process of generating an arbitrary derivative model to distribute to a client.

**Figure 3.** Examples of grayscale handwritten digits and fashion items from the MNIST and Fashion-MNIST datasets.

**Figure 4.** Bar chart showing how the entire dataset is partitioned so that each client has an extreme data distribution.
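An extreme split such as the (10, 90) distribution used in the experiments can be produced along these lines; the label grouping, shard sizes, and the `partition_extreme` helper are illustrative assumptions, not the paper's exact procedure:

```python
import random

# Hedged sketch: give each client an extreme label distribution such as
# (10, 90), i.e., roughly 90% of its samples from one "major" class and
# 10% spread over the remaining classes. Assumptions, not the paper's
# exact partitioning procedure.

def partition_extreme(labels, proportions, num_clients, seed=0):
    rng = random.Random(seed)
    by_label = {}
    for idx, y in enumerate(labels):
        by_label.setdefault(y, []).append(idx)
    for idxs in by_label.values():
        rng.shuffle(idxs)
    classes = sorted(by_label)
    per_client = len(labels) // num_clients
    major_frac = proportions[-1] / sum(proportions)
    shards = []
    for c in range(num_clients):
        major = classes[c % len(classes)]       # this client's 90% class
        minors = [y for y in classes if y != major]
        shard = []
        for _ in range(per_client):
            pool = major if rng.random() < major_frac else rng.choice(minors)
            if by_label[pool]:                  # skip if a pool runs dry
                shard.append(by_label[pool].pop())
        shards.append(shard)
    return shards

labels = [i % 10 for i in range(10_000)]        # stand-in for MNIST labels
shards = partition_extreme(labels, proportions=(10, 90), num_clients=10)
```

Each shard is then treated as one client's private local dataset, giving the non-IID setting the figure depicts.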

**Table 1.** Federated learning results for the image classification model using the standard method and the proposed algorithm, FederatedPartial. The ratios give the cost of FederatedPartial relative to the standard method.

| Exp. | Cost to Hit 0.8 (Rounds) | Cost to Hit 0.8 (Traffic) | Method | Rounds Ratio | Traffic Ratio | Data Distribution | Dataset | Variables | Optimizer |
|---|---|---|---|---|---|---|---|---|---|
| Exp1 | 960 | $2.86\times 10^{9}$ | Standard | 0.892 | 0.569 | (10, 90) | Fashion-MNIST | T: 2000, B: 50, E: 5, $\eta$: 0.001, i: 3, j: 9, p: $\frac{2}{3}$ | SGD with momentum 0.9 |
| Exp2 | 857 | $1.63\times 10^{9}$ | Partial | | | (10, 90) | Fashion-MNIST | | |
| Exp3 | 1032 | $3.08\times 10^{9}$ | Standard | 0.757 | 0.500 | (90, 10) | Fashion-MNIST | | |
| Exp4 | 781 | $1.54\times 10^{9}$ | Partial | | | (90, 10) | Fashion-MNIST | | |
| Exp5 | 1007 | $3.00\times 10^{9}$ | Standard | 0.781 | 0.517 | (10, 10, 80) | Fashion-MNIST | | |
| Exp6 | 787 | $1.55\times 10^{9}$ | Partial | | | (10, 10, 80) | Fashion-MNIST | | |
| Exp7 | 323 | $9.57\times 10^{8}$ | Standard | 0.346 | 0.229 | (10, 90) | MNIST | | |
| Exp8 | 111 | $2.19\times 10^{8}$ | Partial | | | (10, 90) | MNIST | | |
| Exp9 | 316 | $9.42\times 10^{8}$ | Standard | 0.413 | 0.273 | (90, 10) | MNIST | | |
| Exp10 | 131 | $2.57\times 10^{8}$ | Partial | | | (90, 10) | MNIST | | |
| Exp11 | 279 | $8.31\times 10^{8}$ | Standard | 0.431 | 0.284 | (10, 10, 80) | MNIST | | |
| Exp12 | 120 | $2.36\times 10^{8}$ | Partial | | | (10, 10, 80) | MNIST | | |

The Variables and Optimizer settings span all twelve experiments; the ratio columns apply to each Standard/Partial pair.
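The Rounds Ratio and Traffic Ratio columns are the FederatedPartial cost divided by the standard cost for each experiment pair, which can be rechecked directly (small last-digit differences are expected because the displayed costs are themselves rounded):

```python
# Recompute Table 1's ratio columns: ratio = FederatedPartial cost
# divided by the standard method's cost for each experiment pair.
pairs = {
    "Exp1/2":   ((960, 2.86e9), (857, 1.63e9)),
    "Exp3/4":   ((1032, 3.08e9), (781, 1.54e9)),
    "Exp5/6":   ((1007, 3.00e9), (787, 1.55e9)),
    "Exp7/8":   ((323, 9.57e8), (111, 2.19e8)),
    "Exp9/10":  ((316, 9.42e8), (131, 2.57e8)),
    "Exp11/12": ((279, 8.31e8), (120, 2.36e8)),
}
ratios = {name: (p[0] / s[0], p[1] / s[1]) for name, (s, p) in pairs.items()}
for name, (r_rounds, r_traffic) in ratios.items():
    print(f"{name}: rounds {r_rounds:.3f}, traffic {r_traffic:.3f}")
```

Every pair yields ratios below 1, with the MNIST pairs showing the largest savings.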

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kang, D.; Ahn, C.W.
Communication Cost Reduction with Partial Structure in Federated Learning. *Electronics* **2021**, *10*, 2081.
https://doi.org/10.3390/electronics10172081

**AMA Style**

Kang D, Ahn CW.
Communication Cost Reduction with Partial Structure in Federated Learning. *Electronics*. 2021; 10(17):2081.
https://doi.org/10.3390/electronics10172081

**Chicago/Turabian Style**

Kang, Dongseok, and Chang Wook Ahn.
2021. "Communication Cost Reduction with Partial Structure in Federated Learning" *Electronics* 10, no. 17: 2081.
https://doi.org/10.3390/electronics10172081