Article

Kalman Filter-Based Differential Privacy Federated Learning Method

Xiaohui Yang and Zijian Dong *
School of Cyberspace Security and Computer, Hebei University, Baoding 071000, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(15), 7787; https://doi.org/10.3390/app12157787
Submission received: 30 June 2022 / Revised: 25 July 2022 / Accepted: 30 July 2022 / Published: 2 August 2022

Abstract

The data privacy leakage problem of federated learning has attracted widespread attention. Differential privacy can protect the data privacy of each node in federated learning, but adding noise to the model parameters reduces the accuracy and convergence efficiency of the model. A Kalman Filter-based Differential Privacy Federated Learning Method (KDP-FL) is proposed to solve this problem, which mitigates the impact of the added noise on the model through Kalman filtering. Furthermore, the effectiveness of the proposed method is verified for both Non-IID and IID data distributions. The experiments show that the accuracy of the proposed method is improved by 0.3–4.5% compared to differential privacy federated learning.

1. Introduction

With the development of industry and improvements in computing power, artificial intelligence has become increasingly widespread in many areas of daily life, including medical care, logistics, and face and voice recognition. The wide application of machine learning and deep learning benefits from computing power and massive data. Traditional machine learning and deep learning models rely on data centers and require large amounts of high-quality data. As machine learning and deep learning become more demanding in our lives, the requirements for data quantity and quality also rise. In addition, in production and daily life, companies and organizations must pay to collect massive amounts of data. Aside from the cost, protecting users’ privacy is also an important factor limiting data collection. To address these issues, Google proposed federated learning [1,2,3], which has attracted attention from scholars and organizations. Federated learning can train models using decentralized data, which significantly increases the amount and quality of available data. Moreover, participating nodes are not required to provide their own local data in federated learning; they only need to deliver their training results, which protects users’ data privacy. However, federated learning does not make user privacy completely invulnerable: private information may still be obtained through the shared model parameters [4,5].
Two schemes have been proposed to improve privacy protection in federated learning: one based on homomorphic encryption [6,7,8,9] and the other based on differential privacy [10,11,12,13]. However, both have limitations that hinder practical application. Differential privacy protects privacy by adding noise to the original data or training results, which may reduce the model’s accuracy: too much noise makes the model less accurate than expected, while too little noise fails to protect user privacy, making it challenging to balance privacy protection and accuracy [8]. Homomorphic encryption relies on complex encryption and decryption computations, which can cost a large amount of time; moreover, differences in device computing power, combined with the time consumed by these complex computations, can desynchronize the aggregation [14].
To solve the above problems of federated learning, Kalman filter-based differential privacy federated learning (KDP-FL) is proposed. The proposed scheme consists of two parts: differential privacy is used to protect gradient privacy in federated learning, and a Kalman filter is used to reduce the noise added for privacy, thereby improving the accuracy of the global model. The main contributions of this paper are as follows.
(1) A multiparty cooperative model training scheme that protects data privacy is proposed.
(2) A differential privacy federated learning algorithm is proposed to protect user data privacy through a differential privacy mechanism.
(3) The noise introduced by the differential privacy mechanism is reduced by Kalman filtering to improve the model’s accuracy.

2. Related Work

Federated learning has been gradually applied to protect the data privacy of all parties in scenarios where multiple parties cooperate in training and need to exchange data, such as traffic monitoring [15], edge computing [14], and IoT [16], ensuring that the original data is not leaked. However, privacy analysis attacks mean that federated learning still faces the risk of privacy leakage, so protecting gradient privacy, where data privacy is a particular concern, is an essential problem to solve. There are currently two privacy-preserving approaches in federated learning: (1) differential privacy and (2) homomorphic encryption.
Abadi et al. [10] proposed a differentially private deep learning method that combined machine learning and deep learning with differential privacy, together with a new privacy accounting technique that obtains lower privacy consumption under tighter bounds, providing a basis for introducing differential privacy into federated learning. Hu et al. [11] used block coordinate descent to optimize the objective function under differential privacy, which could protect gradient privacy but performed poorly on non-convex models.
Zhao et al. [12] used differentially private federated learning in vehicular networks to train intelligent traffic management systems and protect vehicle location privacy, but did not evaluate the accuracy under the influence of noise. Homomorphic encryption is a cryptographic method based on the computational complexity of mathematical problems: the model parameters computed by each training node are homomorphically encrypted and aggregated at a designated node, and the parameters obtained after decryption are the same as those obtained by direct aggregation without encryption; this property is used to protect gradient privacy. Li et al. [13] used a chained aggregation method to protect gradient privacy, which improved accuracy, but the chained aggregation introduced a large communication overhead. Fang et al. [8] combined the Paillier algorithm with federated learning to protect gradient privacy, but the Paillier algorithm requires complex exponentiation, which introduced a large time overhead. Ma et al. [9] used multi-key homomorphic encryption (MK-HE) to solve the collusion-attack leakage problem caused by a single key, improving privacy protection for federated learning. Zhang et al. [7] improved the encoding method used during gradient transmission, reducing the computational overhead of encryption but decreasing accuracy.
To balance accuracy and differential privacy, this paper uses a differential privacy approach to protect gradient privacy while using Kalman filtering to reduce the impact of the noise on the accuracy.

3. Proposed Method

This section introduces the Kalman filter-based differential privacy federated learning method, which comprises a differential privacy mini-batch gradient descent algorithm and a Kalman noise reduction parameter aggregation method.

3.1. Method Overview

There are two types of nodes in federated learning: the Request Node (RN) and the Training Node (TN). A node that needs others to assist in training a model is called an RN; it delivers task requests to other nodes and initializes or updates the global model. The nodes that assist the RN in training the model are called TNs; they train the model using local data and provide the training results (also called gradients) to the RN. C TNs participate in training in each round. The workflow of KDP-FL is shown in Figure 1. Federated learning consists of T rounds of global training, each comprising task definition, model update, and local training. The RN defines the task and initializes the global model. It then loops over the following steps to complete the global training (a minimal code sketch follows the list):
(1) The RN updates the global model parameters and sends them to the TNs.
(2) Each TN trains the model locally and obtains its local training result for the tth round of training after E local iterations. This step is also called local training.
(3) The TN adds Gaussian noise to the training result and sends it to the RN.
(4) The RN applies Kalman filtering to all the training results to reduce the noise.
(5) The RN aggregates all the training results to obtain the global model parameters for the next round.
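For concreteness, the round structure above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the authors’ implementation; local_dp_sgd and kalman_denoise are hypothetical stand-ins for Algorithm 1 and the Kalman noise reduction described in Section 3.2.3.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_dp_sgd(w, data, lam=0.05, sigma=1.0, M=1.0):
    # Stand-in for Algorithm 1 (DP-SGD): returns a noisy training result.
    grad = np.mean(data, axis=0)                  # placeholder "gradient"
    return grad + rng.normal(0.0, sigma * M, size=grad.shape)

def kalman_denoise(g):
    # Stand-in for the Kalman noise reduction of Section 3.2.3.
    return g                                      # identity placeholder

def kdp_fl_round(w, client_data, C=20, lam=0.05):
    # One global round of KDP-FL, steps (1)-(5) above.
    noisy = [local_dp_sgd(w, client_data[c], lam) for c in range(C)]  # (1)-(3)
    denoised = [kalman_denoise(g) for g in noisy]                     # (4)
    return w + (lam / C) * np.sum(denoised, axis=0)                   # (5)
```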

3.2. Method Details

In this section, we introduce our method in detail. Some significant symbols used in this paper are defined in Table 1.

3.2.1. Differential Privacy-Based Stochastic Mini-Batch Gradient Descent (DP-SGD)

TNs train the model locally instead of providing their raw local data to the RN, then send the training result to the RN, protecting data privacy. Algorithm 1 describes the local training. Each TN uses w_0 as the initial local model parameters, uses its local data as the dataset, and trains the model using the DP-SGD algorithm. In each of the E local epochs of DP-SGD, the TN performs several iterations to optimize the model parameters. During optimization, the training results are not sent to the RN; noise is added to the training result only at the end of the Eth epoch of local training. Local training is performed using the following iteration:
$$w = w - \lambda \frac{1}{B} \sum_{b=1}^{B} g_b \quad (1)$$
After E rounds of local training, the training result is sent to the RN. To protect gradient privacy, the result is revised using a differential privacy method before sending. The differential privacy mechanism in this paper is the Gaussian mechanism, defined in Equation (2), where $N(0, S_f^2 \cdot \sigma^2)$ denotes a normal distribution with mean zero and variance $S_f^2 \cdot \sigma^2$.
$$\mathcal{M}(D) = f(D) + N(0, S_f^2 \cdot \sigma^2) \quad (2)$$
A Gaussian mechanism for a function f with sensitivity $S_f$ satisfies $(\varepsilon, \delta)$-differential privacy if $\delta \geq \frac{4}{5} \exp\left(-(\sigma \varepsilon)^2 / 2\right)$ and $\varepsilon < 1$ [14]. The noisy local training result of the TN is then calculated as:
$$g' = \frac{1}{\lambda} (w - w_0) + N(0, \sigma^2 M^2 I) \quad (3)$$
To address the gradient explosion that may occur during training, L2-norm gradient clipping is used. Gradient clipping also limits the gradient values to a fixed range, which benefits the Gaussian mechanism. The clipping threshold is set to M: the gradient g is unchanged when its L2 norm satisfies $\|g\|_2 \leq M$, and if $\|g\|_2 > M$, the gradient is clipped as described in Algorithm 1. Clipping keeps the gradient values from exceeding the set threshold and prevents gradient explosion from affecting the model. Moreover, after clipping, the L2 norm of the gradient is bounded by $\|g\|_2 \leq M$; limiting this upper bound effectively controls the variance of the noise distribution.
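As a concrete illustration, the clipping step can be written as a one-line function. This is a generic sketch rather than code from the paper’s implementation.

```python
import numpy as np

def clip_gradient(g, M):
    # Scale g so its L2 norm is at most the threshold M
    # (line 6 of Algorithm 1): g <- g / max(1, ||g||_2 / M).
    return g / max(1.0, np.linalg.norm(g) / M)
```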
Using differential privacy in SGD requires tracking the privacy consumption, and there is a risk of privacy leakage when the privacy cost exceeds the privacy budget. Compared to the strong composition theorem [17], the Moments Accountant proposed by Abadi et al. [10] obtains a smaller $\sigma$ for the same privacy budget. According to the Moments Accountant theory, for any $\varepsilon < c_1 q^2 T$ and $\sigma \geq c_2 \frac{q \sqrt{T \log(1/\delta)}}{\varepsilon}$, the DP-SGD algorithm is $(\varepsilon, \delta)$-differentially private for any $\delta > 0$, where $c_1$ and $c_2$ are constants and q is the sampling probability defined in Equation (4). The variance of the added noise thus depends on the number of training rounds and the privacy budget.
$$q = \frac{C N B E}{|D|} \quad (4)$$
$$\sigma \geq c_2 \frac{q \sqrt{T \log(1/\delta)}}{\varepsilon} \quad (5)$$
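As an illustration of Equation (5), the implied noise level can be computed as below. The constant c_2 is left unspecified by the theory, so the value used here is a placeholder assumption; the sketch only shows how σ scales with q, T, ε, and δ.

```python
import math

def noise_level(q, T, eps, delta, c2=1.0):
    # Lower bound on sigma from Equation (5); c2 is an unspecified
    # constant from the theory, set to 1.0 purely for illustration.
    return c2 * q * math.sqrt(T * math.log(1.0 / delta)) / eps

# e.g., q = 0.05, T = 100 global rounds, (eps, delta) = (1, 1e-5)
print(noise_level(0.05, 100, 1.0, 1e-5))
```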
The DP-SGD algorithm is shown in Algorithm 1.
Algorithm 1 DP-SGD
Input: D_c; w_0; M
Output: g′
1: w ← w_0
2: for e = 1 to E do
3:    Take a random batch of size B
4:    for b = 1 to B do
5:       g_b ← ∇_w ℓ(w, x_b)
6:       g_b ← g_b / max(1, ‖g_b‖_2 / M)
7:    end for
8:    w ← w − λ · (1/B) Σ_{b=1}^{B} g_b
9: end for
10: g′ ← (1/λ)(w − w_0) + N(0, σ²M²I)
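A minimal NumPy rendering of Algorithm 1 may help make the steps concrete. It assumes a linear model with squared-error loss purely so the sketch is self-contained; the paper itself trains CNNs.

```python
import numpy as np

def dp_sgd(X, y, w0, E=5, B=32, lam=0.05, M=1.0, sigma=1.0, seed=0):
    # Sketch of Algorithm 1 for a linear model with squared-error loss
    # (a toy stand-in for the CNNs trained in the paper).
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(E):
        batch = rng.choice(len(X), size=B, replace=False)  # random batch
        grads = []
        for b in batch:
            g_b = (X[b] @ w - y[b]) * X[b]                 # per-example gradient
            g_b = g_b / max(1.0, np.linalg.norm(g_b) / M)  # L2 clipping (line 6)
            grads.append(g_b)
        w = w - lam * np.mean(grads, axis=0)               # batch update (line 8)
    # noisy training result sent to the RN (line 10, Equation (3))
    return (w - w0) / lam + rng.normal(0.0, sigma * M, size=w.shape)
```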

3.2.2. Kalman Noise Reduction Parameter Aggregation

The TN sends the training result to the RN after adding noise using Equation (3); the noisy model parameters of node TN_c are denoted $g'_c$, and the RN receives the set of model parameters $G = \{ g'_1, g'_2, \ldots, g'_C \}$.
The model parameters obtained by the RN contain noise, which decreases the accuracy of the model. To reduce the impact of differential privacy on the model and improve its accuracy, the model parameters sent by the TNs must be revised. In this paper, Kalman filtering is used to revise all model parameters obtained by the RN, improving the accuracy of the global model despite the added differential privacy noise:
$$g'' = \mathrm{Kalman}(g'), \quad g' \in G \quad (6)$$
The details are described in Section 3.2.3. The global model parameters of the tth round are obtained by aggregating all the model parameters revised by the Kalman filter; the global model is then updated:

$$w_{t+1} = w_t + \frac{\lambda}{C} \sum_{c=1}^{C} g''_{t,c} \quad (7)$$
The KDP-FL algorithm is shown in Algorithm 2.
Algorithm 2 KDP-FedAvg
Input: D_c; w_0
Output: w
1: for t = 1 to T do
2:    for c = 1 to C in parallel do
3:       TN_c do: g′_{t,c} ← DP-SGD(D_c, w_t, M)
4:       RN do: g″_{t,c} ← Kalman(g′_{t,c})
5:    end for
6:    w_{t+1} ← w_t + (λ/C) Σ_{c=1}^{C} g″_{t,c}
7: end for

3.2.3. Details of Kalman Noise Reduction

The Kalman filter can reduce normally distributed noise in a linear system [18]. The noise added to the training result by differential privacy is Gaussian and thus normally distributed, and the state update equation is linear, so the Kalman filter can be used to reduce the noise in the training results and decrease the impact of differential privacy on the model’s accuracy. The noise revision has two steps: the first is the prediction of the model parameters, and the second is the correction of the predicted parameters.
There are C TNs in the tth round of federated learning; the model parameters of TN_c are $w_{t,c}$, the global model parameters are $w_t$, the parameter vector composed of the model parameters of all TNs is $W_t = \left[ w_t, w_{t,1}, w_{t,2}, \ldots, w_{t,C} \right]$, and the gradient values of TN_c are $g_{t,c}$. Then, the gradient vector composed of the gradient values of all TNs is $G_t = \left[ 0, g_{t,1}, g_{t,2}, \ldots, g_{t,C} \right]$. The linear coefficient matrix U of the state prediction equation is the identity matrix, and V is defined as follows:

$$V = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & \frac{\lambda}{C} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{\lambda}{C} \end{bmatrix} \quad (8)$$
Then, the parameter update equation for model training can be transformed into the state prediction equation:
$$W_{t+1}^T = U W_t^T + V G_t^T \quad (9)$$
In fact, the parameter update written as the state prediction equation (Equation (9)) can also be expressed component-wise as:

$$\begin{cases} w_{t+1} = w_t + \frac{\lambda}{C} \sum_{c=1}^{C} g_{t,c} \\ w_{t+1,1} = w_{t,1} + \lambda \left( g_{t,1} + N(0, \sigma^2 M^2 I) \right) \\ \quad \vdots \\ w_{t+1,C} = w_{t,C} + \lambda \left( g_{t,C} + N(0, \sigma^2 M^2 I) \right) \end{cases} \quad (10)$$
The noise added by the TN for differential privacy follows the Gaussian mechanism and is therefore normally distributed.
To achieve dimensional matching with U and V, the calculation described in Algorithm 3 is needed. The elements of $W_t^T$ and $G_t^T$ are all matrices. First, each element of $W_t^T$ and $G_t^T$ is reshaped: let $N = m \times n$, so that each $m \times n$ matrix $w_{t,c}$ or $g_{t,c}$ is reshaped to $N \times 1$. Let $\varpi_{c,i}$ be the ith parameter of $w_{t,c}$ and $\varrho_{c,i}$ the ith element of $g_{t,c}$. Then, Equation (9) can be calculated using Algorithm 3.
Algorithm 3 Matrix broadcast calculation
Input: G_t^T; W_t^T; U; V
Output: W_{t+1}
1: for i = 1 to N do
2:    ω_i ← [ϖ_i, ϖ_{1,i}, ϖ_{2,i}, …, ϖ_{C,i}]
3:    ρ_i ← [0, ϱ_{1,i}, ϱ_{2,i}, …, ϱ_{C,i}]
4:    ω_i^T ← U ω_i^T + V ρ_i^T
5: end for
6: w_{t+1} ← [ϖ_1, ϖ_2, …, ϖ_N]
7: for c = 1 to C do
8:    w_{t+1,c} ← [ϖ_{c,1}, ϖ_{c,2}, …, ϖ_{c,N}]
9: end for
10: W_{t+1} ← [w_{t+1}, w_{t+1,1}, w_{t+1,2}, …, w_{t+1,C}]
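Because U and V act identically on every one of the N parameter coordinates, the loop in Algorithm 3 can equivalently be expressed as a single matrix product. The sketch below assumes the flattened (C + 1) × N layout described above; the toy sizes are illustrative only.

```python
import numpy as np

def broadcast_update(W, G, U, V):
    # Vectorized equivalent of Algorithm 3: W, G are (C+1, N) arrays,
    # where row 0 is the global model / zero gradient and rows 1..C
    # belong to the TNs; U, V are (C+1, C+1) coefficient matrices.
    return U @ W + V @ G

C, N, lam = 3, 4, 0.05                        # toy sizes
U = np.eye(C + 1)                             # U is the identity matrix
V = np.diag([1.0] + [lam / C] * C)            # V as defined in Equation (8)
W = np.ones((C + 1, N))                       # flattened model parameters
G = np.vstack([np.zeros(N), np.full((C, N), 0.1)])
print(broadcast_update(W, G, U, V))
```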
The gradient vector $G_t^T$ contains the noise added by the TNs, and the distribution of this noise is known. The Gaussian noise added for differential privacy is treated as an external disturbance, i.e., uncertainty in the system.
In Kalman filtering, $z_t$ denotes the actually observed data, calculated using the following equation:

$$z_t = H W_t + R, \quad (11)$$
where H is the data unit transformation matrix. In this paper, the data sent by the TNs are the same as the data obtained by the RN, so H is the (C + 1)-dimensional identity matrix. R is the error column vector; to avoid a singular matrix in subsequent calculations, R = 0.1Q is taken here. The Kalman gain matrix is calculated from the noise covariance matrix $P_t$ as
$$K_{t+1} = P_t H^T \left( H P_t H^T + R \right)^{-1} \quad (12)$$
Since H is the (C + 1)-dimensional identity matrix, the formula for the Kalman gain reduces to
$$K_{t+1} = P_t \left( P_t + R \right)^{-1}. \quad (13)$$
The parameter update equation using the Kalman gain correction is
$$W_{t+1} = W_t + K_t \left( z_t - H W_t \right). \quad (14)$$
We then update the noise covariance matrix:
$$P_{t+1} = \left( I - K_t H \right) P_t + Q \quad (15)$$

$$Q \sim N(0, \sigma^2 M^2 I) \quad (16)$$
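Putting Equations (11)–(16) together, one correction step can be sketched as follows. This is a minimal illustration with H taken as the identity and R = 0.1Q as stated above; the initial covariance P and the toy state values are assumptions.

```python
import numpy as np

def kalman_step(W_pred, z, P, Q):
    # One Kalman correction with H = I and R = 0.1 * Q, per Section 3.2.3.
    R = 0.1 * Q
    K = P @ np.linalg.inv(P + R)              # Kalman gain, Equation (13)
    W_new = W_pred + K @ (z - W_pred)         # state correction, Equation (14)
    P_new = (np.eye(len(P)) - K) @ P + Q      # covariance update, Equation (15)
    return W_new, P_new

sigma, M = 1.0, 1.0
Q = sigma**2 * M**2 * np.eye(3)               # process noise covariance, Equation (16)
P = np.eye(3)                                 # assumed initial covariance
W_pred = np.array([0.50, 0.40, 0.60])         # predicted state (global + C = 2 TNs)
z = np.array([0.52, 0.45, 0.58])              # observed noisy parameters
W_new, P = kalman_step(W_pred, z, P, Q)
print(W_new)
```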

4. Experiment and Analysis

4.1. Experiment Setting

The public datasets MNIST [19], FMNIST [20], and CIFAR10 [21] were selected for training and testing. The data were partitioned to simulate a federated learning environment in two ways: (1) independently and identically distributed (IID) across TNs [1]: the training data were shuffled and then partitioned among 100 clients, each receiving an equal number of examples; (2) non-independently and identically distributed (Non-IID) across TNs [22]: the data were distributed among 100 TNs such that each TN contained samples of only two classes, and the number of samples per TN followed a power law, as detailed in Table 2.
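The two partitioning schemes can be sketched as below. This is an illustrative reconstruction under stated assumptions: two classes per client in the Non-IID case, and a fixed shard size in place of the power-law sizing reported in Table 2.

```python
import numpy as np

def split_iid(n_samples, n_clients=100, seed=0):
    # IID: shuffle all indices and deal out equal-sized shards.
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_clients)

def split_non_iid(labels, n_clients=100, classes_per_client=2,
                  samples_per_client=600, seed=0):
    # Non-IID: each client draws samples from only two label classes;
    # the power-law shard sizing of Table 2 is omitted for brevity.
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    shards = []
    for _ in range(n_clients):
        picked = rng.choice(classes, size=classes_per_client, replace=False)
        pool = np.flatnonzero(np.isin(labels, picked))
        shards.append(rng.choice(pool, size=samples_per_client, replace=False))
    return shards
```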
A CNN was used as the training model for MNIST and FMNIST. Its convolutional layers used 5 × 5 convolutions with stride 1, each followed by a ReLU and a 2 × 2 max pool, with 32 channels in the first layer and 64 channels in the second; the first convolution output was 32 × 14 × 14 after pooling, and the second was 64 × 7 × 7 after pooling. These were followed by two fully connected layers. For CIFAR10, ResNet50 [23] was used to train the model.
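For concreteness, the described CNN can be rendered in PyTorch as follows. The padding and the width of the first fully connected layer are our assumptions, chosen to be consistent with the stated 32 × 14 × 14 and 64 × 7 × 7 feature-map sizes.

```python
import torch.nn as nn

class CNN(nn.Module):
    # Two 5x5 conv layers (32 and 64 channels), each followed by ReLU and
    # 2x2 max pooling, then two fully connected layers. padding=2 keeps the
    # stated 32x14x14 and 64x7x7 feature maps for 28x28 inputs.
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),                  # -> 32 x 14 x 14
            nn.Conv2d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),                  # -> 64 x 7 x 7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512),       # hidden width is an assumption
            nn.ReLU(),
            nn.Linear(512, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```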
The experimental environment was an Intel(R) Xeon(R) W-2295 CPU @ 3.00 GHz, 64 GB RAM, and an NVIDIA GeForce RTX 3080 GPU, and the experimental platform was PyTorch. Following the related works [10,11,12], the parameters were set as shown in Table 3.

4.2. Privacy Budget Analysis

The Moments Accountant provides a tighter bound on the privacy loss than the composition theorem. The privacy loss can be calculated from the noise level σ; its growth with the global epoch is shown in Figure 2. To test the effectiveness of the proposed method under different privacy budgets, we selected ε = 1 and ε = 5: the larger budget ε = 5 yields a more accurate model, while the smaller budget ε = 1 achieves stronger privacy protection. We then set δ = 10⁻⁵ and computed the noise level σ using Equation (5).

4.3. Experimental Results

The effectiveness of the method is demonstrated by the model accuracy; the comparison results are shown in Figure 3. The DP-FedAvg algorithm protects privacy by adding noise to the gradient matrix, and the strength of the protection depends on the privacy budget: the smaller the privacy budget, the better the privacy protection. However, as the privacy budget decreases, more noise must be added, which may reduce the accuracy of the model. Comparing the accuracy of DP-FedAvg on the non-IID dataset in Figure 3a,c, a stricter privacy requirement demands a smaller ε, which leads to lower model accuracy; the accuracy of DP-FedAvg at ε = 1 decreased by about 3% compared to that at ε = 5. The same pattern held in the IID setting.
In addition, the experimental results on the other datasets, FMNIST and CIFAR10, in Table 4 show that KDP-FL also achieved good results there; hence, the method generalizes to other datasets. By testing both the CNN and ResNet50, the experimental results show that KDP-FL is suitable for different neural network structures and has good compatibility.
In this paper, the parameter update equation was transformed into a state update equation. The global model parameter update, previously determined entirely by the TNs’ training results, was thereby transformed into an update jointly determined by the estimated value and the TNs’ computed results. The Kalman gain coefficient was calculated from the noise covariance matrix, and the weights of the estimated and predicted values in the global model update were derived from the Kalman gain to realize an optimal global model update.
The Kalman filter noise reduction method, which reduces the noise at aggregation, decreased the effect of the differential privacy noise on the model, achieved an accuracy close to that of the noise-free FedAvg algorithm, and performed well on both non-IID and IID data. On non-IID data, the accuracy at ε = 1 improved by up to 4.5% compared to DP-FedAvg, and the accuracy at ε = 5 improved by up to 2.5%. These experiments show that the proposed scheme can protect user privacy with little loss of accuracy.

5. Conclusions

In this paper, a Kalman filter-based differential privacy federated learning method, KDP-FL, was proposed. Federated learning was used to achieve data sharing while preserving privacy. In addition, differential privacy was added to federated learning, which can resist gradient differential attacks and further protect user privacy. Furthermore, the influence of the noise added by the differential mechanism on the accuracy of the model was reduced by the Kalman filter. The experiment results and analysis showed that the proposed method was able to train models with little loss of accuracy while protecting user privacy. KDP-FL is suitable for training privacy-preserving models under multiple nodes and can help in data cooperation among multiple nodes.
When the neural network is simple, the additional computational overhead brought by Algorithm 3 is negligible. However, when the neural network is complex with a large number of parameters, optimizing all parameters requires calculating the Kalman gain for each parameter, which will bring about a non-negligible additional computational overhead.
In addition, the proposed method uses Kalman filtering, which increases the computational overhead and makes it unsuitable for devices and nodes with weak computational power; this is a direction for future improvement.

Author Contributions

Conceptualization, X.Y. and Z.D.; data collection and analysis, Z.D.; validation, X.Y.; writing—original draft preparation, Z.D.; writing—review and editing, X.Y. and Z.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Ministry of Science and Technology of China, National Key R&D Program “Cyberspace Security” Key Project, 2017YFB0802305 and the Natural Science Foundation of Hebei Province, F2021201052.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. The source code can be found here: https://git.acwing.com/dzj/kdpfl.git (accessed on 29 July 2022), and the datasets can be found here: https://pan.baidu.com/s/1PVFJ7vnezGe9jEp8Pgh_fg?pwd=n2w5 (accessed on 29 July 2022).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A. Communication-Efficient Learning of Deep Networks from Decentralized Data. Artif. Intell. Stat. 2017, 54, 1273–1282. [Google Scholar]
  2. Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. arXiv 2016, arXiv:1610.02527. [Google Scholar]
  3. Konečný, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated Learning: Strategies for Improving Communication Efficiency. arXiv 2016, arXiv:1610.05492. [Google Scholar]
  4. Melis, L.; Song, C.; Cristofaro, E.D.; Shmatikov, V. Exploiting Unintended Feature Leakage in Collaborative Learning. In Proceedings of the IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 20–22 May 2019; pp. 691–706. [Google Scholar]
  5. Zhao, B.; Mopuri, K.R.; Bilen, H. iDLG: Improved Deep Leakage from Gradients. arXiv 2020, arXiv:2001.02610. [Google Scholar]
  6. Ou, W.; Zeng, J.; Guo, Z.; Yan, W.; Liu, D.; Fuentes, S. A homomorphic-encryption-based vertical federated learning scheme for rick management. Comput. Sci. Inf. Syst. 2020, 17, 819–834. [Google Scholar] [CrossRef]
  7. Zhang, C.; Li, S.; Xia, J.; Wang, W.; Yan, F.; Liu, Y. BatchCrypt: Efficient Homomorphic Encryption for Cross-Silo Federated Learning. In Proceedings of the USENIX Annual Technical Conference, Boston, MA, USA, 15–17 July 2020; pp. 493–506. [Google Scholar]
  8. Fang, H.; Qian, Q. Privacy Preserving Machine Learning with Homomorphic Encryption and Federated Learning. Future Internet 2021, 13, 94. [Google Scholar] [CrossRef]
  9. Ma, J.; Naas, S.; Sigg, S.; Lyu, X. Privacy-preserving Federated Learning based on Multi-key Homomorphic Encryption. arXiv 2021, arXiv:2104.06824. [Google Scholar] [CrossRef]
  10. Abadi, M.; Chu, A.; Goodfellow, I.J.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 308–318. [Google Scholar]
  11. Hu, R.; Guo, Y.; Li, H.; Pei, Q.; Gong, Y. Personalized Federated Learning With Differential Privacy. IEEE Internet Things J. 2020, 7, 9530–9539. [Google Scholar] [CrossRef]
  12. Zhao, Y.; Zhao, J.; Yang, M.; Wang, T.; Wang, N.; Lyu, L.; Niyato, D.; Lam, K. Local Differential Privacy-Based Federated Learning for Internet of Things. IEEE Internet Things J. 2021, 8, 8836–8853. [Google Scholar] [CrossRef]
  13. Li, Y.; Zhou, Y.; Jolfaei, A.; Yu, D.; Xu, G.; Zheng, X. Privacy-Preserving Federated Learning Framework Based on Chained Secure Multiparty Computing. IEEE Internet Things J. 2021, 8, 6178–6186. [Google Scholar] [CrossRef]
  14. Liu, J.; Xu, H.; Xu, Y.; Ma, Z.; Wang, Z.; Qian, C.; Huang, H. Communication-efficient asynchronous federated learning in resource-constrained edge computing. Comput. Netw. 2021, 199, 108429. [Google Scholar] [CrossRef]
  15. Abubaker, Z.; Javaid, N.; Almogren, A.; Akbar, M.; Zuair, M.; Ben-Othman, J. Blockchained service provisioning and malicious node detection via federated learning in scalable Internet of Sensor Things networks. Comput. Netw. 2022, 204, 108691. [Google Scholar] [CrossRef]
  16. Mothukuri, V.; Khare, P.; Parizi, R.M.; Pouriyeh, S.; Dehghantanha, A.; Srivastava, G. Federated-Learning-Based Anomaly Detection for IoT Security Attacks. IEEE Internet Things J. 2022, 9, 2545–2554. [Google Scholar] [CrossRef]
  17. Dwork, C.; Rothblum, G.N.; Vadhan, S.P. Boosting and Differential Privacy. In Proceedings of the 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, Las Vegas, NV, USA, 23–26 October 2010; pp. 51–60. [Google Scholar]
  18. Alazard, D. Introduction to Kalman Filtering. Meas. Control-Lond. Inst. Meas. Control 1986, 19, 84–110. [Google Scholar]
  19. Lecun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef] [Green Version]
  20. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv 2017, arXiv:1708.07747. [Google Scholar]
  21. Ranzato, M.; Krizhevsky, A.; Hinton, G.E. Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, AISTATS 2010, Chia Laguna Resort, Sardinia, Italy, 13–15 May 2010; Volume 9, pp. 621–628. [Google Scholar]
  22. Li, X.; Huang, K.; Yang, W.; Wang, S.; Zhang, Z. On the Convergence of FedAvg on Non-IID Data. arXiv 2019, arXiv:1907.02189. [Google Scholar]
  23. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
Figure 1. The workflow of KDP-FL.
Figure 2. Privacy loss with increasing global epoch.
Figure 3. Experiment results: (a) non-IID, ε = 1; (b) IID, ε = 1; (c) non-IID, ε = 5; (d) IID, ε = 5.
Table 1. Symbol definitions.

Symbol | Definition
w_0 | Initial global model parameters
T | Total number of global training rounds
E | Number of local training iterations between two global training rounds
λ | Learning rate
B | Batch size
g | Local training result of a TN
g′ | Local training result of a TN after adding noise
w | Model parameters of a TN in local training
w_t | Global model parameters in the tth round of global training
w_{t,c} | Model parameters of TN_c in the tth round of global training
g_{t,c} | Local training result of TN_c in the tth round of global training
g′_{t,c} | Local training result of TN_c in the tth round of global training after adding noise
ℳ | Differential privacy mechanism
D | Dataset
M | Clipping threshold
q | Sample probability
I | Identity matrix
C | Number of TNs participating in each round
Table 2. Dataset details.

Dataset | Labels | Devices | Training Samples | Samples/Device (mean) | Samples/Device (std)
MNIST | 10 | 100 | 60,000 | 600 | 328
FMNIST | 10 | 100 | 60,000 | 600 | 328
CIFAR10 | 10 | 100 | 50,000 | 500 | 293
Table 3. Parameter definitions.

Definition | Parameter
Total number of nodes | C_all = 100
Number of TNs involved in training | C = 20
Local batch size | B = 32
Learning rate | λ = 0.05
Local training rounds | E = 5
Privacy budget | ε = 1 or ε = 5
Table 4. Accuracy in different models and datasets (acc/%).

Method | FMNIST Non-IID (ε = 1 / ε = 5) | FMNIST IID (ε = 1 / ε = 5) | CIFAR10 Non-IID (ε = 1 / ε = 5) | CIFAR10 IID (ε = 1 / ε = 5)
FedAvg (no noise) | 86.62 | 89.56 | 83.21 | 86.34
DP-FedAvg | 83.17 / 84.94 | 86.59 / 88.02 | 77.26 / 79.44 | 81.23 / 83.55
KDP-FL | 86.12 / 86.23 | 88.93 / 89.01 | 82.76 / 82.88 | 85.93 / 86.03
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
