Article

Efficient Cross-Project Software Defect Prediction Based on Federated Meta-Learning

1 School of Undergraduate Education, Shenzhen Polytechnic University, Shenzhen 518055, China
2 Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin 150080, China
* Author to whom correspondence should be addressed.
Electronics 2024, 13(6), 1105; https://doi.org/10.3390/electronics13061105
Submission received: 3 February 2024 / Revised: 7 March 2024 / Accepted: 13 March 2024 / Published: 18 March 2024
(This article belongs to the Special Issue Machine Learning Methods in Software Engineering)

Abstract

Software defect prediction is an important part of software development that aims to use existing historical data to predict future software defects. Focusing on the model performance and communication efficiency of cross-project software defect prediction, this paper proposes an efficient communication-based federated meta-learning (ECFML) algorithm. The lightweight MobileViT network is used as the meta-learner of the Model-Agnostic Meta-Learning (MAML) algorithm. By learning common knowledge on the local data of multiple clients and then fine-tuning the model, the number of unnecessary iterations is reduced, and communication efficiency is improved while the number of parameters is reduced. The gradient information of the model is encrypted using differential privacy with the Laplace mechanism, and the optimal privacy budget is determined through experiments. Experiments on three public datasets (AEEEM, NASA, and Relink) verified the effectiveness of ECFML in terms of parameter quantity, convergence, and model performance for cross-project software defect prediction.

1. Introduction

Software defects are typically defined as deviations between programming results and requirements. In the development lifecycle of a software project, the later internal defects are discovered, the higher the cost of fixing them [1]. Researchers hope to identify as many program modules with potential defects as possible before software deployment, and then allocate sufficient testing resources to them [2]. Software defect prediction (SDP) uses historical data from software development and machine learning methods to identify program modules that may have defects in a software project. The current SDP can be divided into Within Project Defect Prediction (WPDP) and Cross-Project Defect Prediction (CPDP) [3].
WPDP builds models and makes predictions on the limited data available within the same project, but its prediction results are often unsatisfactory. This is because establishing a good defect prediction model requires a large amount of training data, and in practice gathering sufficient training data within a single project is difficult and costly in manpower and material resources [4]. Moreover, most historical software defect datasets are imbalanced, which causes the trained model to overfit and lack generalization ability. Therefore, researchers proposed CPDP [5], which builds a software defect prediction model from the historical data of other projects (source projects) and then predicts the defects of the current project (target project).
However, in most cases the data distributions of the source project and the target project are heterogeneous, so a prediction model built on the source project struggles to produce satisfactory predictions for the target project. Jing et al. first introduced canonical correlation analysis (CCA), an effective transfer learning (TL) method, into CPDP to make the distributions of the source and target datasets similar, and proposed the CCA+ cross-project heterogeneous defect prediction model to improve prediction performance [6]. To reduce the data heterogeneity between the source project and the target project, Gong et al. mapped the data of both projects into a unified metric representation (UMR) [7]. Their model introduced the maximum mean discrepancy as the distance between source data and target data and designed a simple neural network to deal with data heterogeneity and class imbalance in CPDP. TCA+ [8], proposed by Nam et al., improves on transfer component analysis. Sun et al. proposed a cross-project semi-supervised defect prediction model based on a generative adversarial network [9], an adversarial feature learning approach composed of a feature converter and a project discriminator that can reduce the data distribution difference between projects.
Ma et al. proposed a heterogeneous CPDP transfer learning method called kernel canonical correlation analysis plus (KCCA+) [10], which combines the kernel method with transfer learning to improve the adaptability of the predictor in nonlinearly separable scenarios. To address the fact that software defect prediction under federated learning does not consider low-quality data in the aggregation stage, Song et al. proposed a federated model aggregation method based on dynamic selection [11]. During aggregation, the server uses dynamic selection to screen out clients that do not meet a threshold condition and weights the clients that do. Experiments on open-source software defect prediction datasets showed that this method improves model performance, greatly reduces the impact of low-quality data, and enhances robustness. Wang et al. proposed federated reinforcement learning via gradient clustering (FRLGC) [12] and verified on three public datasets that FRLGC is superior to related cross-project software defect prediction methods.
Communication efficiency is an important consideration. Since federated learning trains models across distributed clients and servers, data communication is essential, and its efficiency directly affects the performance and scalability of federated learning. The number of model parameters directly determines the communication cost. The gradient aggregation strategy also affects communication efficiency: a common strategy is to select only some clients for gradient aggregation or to use incremental aggregation so that not all gradients are transmitted in each round. The number of exchanges between the client and the server is the communication frequency, and a higher communication frequency increases the communication overhead.
Wang et al. proposed federated transfer learning via knowledge distillation (FTLKD), the first application of federated learning to software defect prediction [13]. A convolutional neural network (CNN) extracts features from the defect data, and transfer learning with fine-tuning is used for knowledge transfer. Federated distillation is a newer algorithmic paradigm of federated learning: by distilling the clients' predictions on an unlabeled auxiliary dataset into a server-side student model, it achieves training performance competitive with earlier parameter-averaging methods while allowing clients to train different model architectures. FedAUX extends this co-distillation idea [14]; to find a suitable initialization, it performs unsupervised pretraining on the auxiliary data and performs well for both CNN and Transformer models. Li et al. proposed a method based on intra-cluster training and top-k gradient sparsity [15]. The intra-cluster training strategy reduces the negative impact of non-independent and identically distributed (non-IID) data, and the gradient transmission between client and server is sparsified. This method outperforms FedAvg in common federated learning scenarios and performs well in terms of communication efficiency and in non-IID settings.
For wireless communication scenarios with limited communication resources, Park et al. proposed a communication-efficient federated learning system using local model update prediction [16]. A local model update compression scheme based on projection onto a selected subspace was designed. In addition, to avoid error propagation during iteration, Park et al. also developed a new criterion to determine whether to compress local model updates. Simulation results show that the algorithm is superior to existing benchmark schemes.
In the context of the limited communication bandwidth of the Internet of Things and edge devices, Yang et al. proposed a federated learning technique with enhanced communication efficiency [17]. This method takes the geometric median of each layer in the global model as the criterion and cooperatively selects the important convolution kernels of the local models to achieve efficient communication. Tang et al. proposed a communication-efficient decentralized federated learning framework [18]: by designing a sparsification algorithm, each client only needs to exchange a highly sparse model with a peer node, improving the communication efficiency of federated learning. Xu et al. proposed a ternary quantization algorithm [19] in which the quantized client model is optimized through self-learned quantization factors; on this basis, a ternary federated averaging protocol is proposed to reduce the communication overhead. The experimental results show that the proposed ternary federated averaging protocol is effective in reducing the communication cost.
The contributions of this article include the following three aspects:
  • We are the first to apply federated meta-learning to software defect prediction. This strategy has fewer parameters, faster inference, and better model performance.
  • The lightweight MobileViT network is used as the meta-learner of the MAML algorithm. By learning general knowledge and fine-tuning the model, the number of unnecessary iterations is reduced, and communication efficiency is improved while the number of parameters is reduced.
  • Comprehensive experiments on three public datasets show that the model performs well in terms of both parameter quantity and predictive performance.

2. Materials and Methods

Figure 1 shows the framework of an efficient communication software defect prediction algorithm based on federated meta-learning. This framework includes global model parameter initialization, fine-tuning during local model training, model testing, and differential privacy encryption transmission.
This method adopts a centralized federated learning paradigm, which includes a server and multiple clients [20]. The server is responsible for coordinating communication between participants, model aggregation, and planning and scheduling model updates, which reduces complex coordination among participants. Meanwhile, servers can reduce the computational burden on clients, especially for devices or users with weaker computing power. The centralized architecture makes the training and updating processes of the model more controllable. Centralized entities can choose appropriate algorithms and model structures based on needs and ensure that all participants use the same algorithm for training. This helps to improve the consistency and robustness of the model.
Firstly, the server initializes the model parameters ω_S of the global meta-learner and distributes them to the selected clients. The local data of each client consist of a support set and a query set. Unlike conventional model training, which performs gradient updates only on the support set, the meta-learner also undergoes a gradient update on the query set. When a client sends its local model gradient to the server, Laplace differential privacy is used for encryption to ensure the security of the information. After receiving the updated gradients uploaded by the clients, the server updates its meta-learner based on their average. The updated model parameters are then distributed back to the clients, and the iterative process is repeated until the local meta-learner f_i(θ) converges. The server then sends the meta-learned model parameters f_S(θ̂) to the target client. After fine-tuning on its support set, the local model of the target client reaches a convergence state. This fine-tuning requires only a few iterations of training on the support set of the target client, thereby reducing the communication overhead.

2.1. Federated Meta-Learning Based on Lightweight MobileViT Network

One of the main advantages of federated learning is localized model training on the device, which reduces the transmission of sensitive data. Lightweight models typically have fewer parameters, which can reduce the amount of information that may be leaked during local model training and improve the level of privacy protection. Lightweight models can reduce the size and computational complexity of the model, thereby reducing the amount of data transmitted between devices and communication overhead. Due to the small size and computational requirements of lightweight models, they can perform real-time model updating, which is useful for software defect prediction scenarios that require rapid response and continuous improvement.

2.1.1. MAML Algorithm

This method utilizes the meta-learning MAML algorithm to achieve efficient cross-project software defect prediction based on federated meta-learning [21]. The MAML algorithm treats the multiple clients as multiple learning tasks, and its optimization goal is to find an initialization model: the target client only needs to perform one or a few gradient descent steps on the initialization model with its local data to personalize the model to its own local data distribution. The MAML algorithm does not require tasks to be designed in advance; instead, through iterative optimization over multiple tasks, the model learns general knowledge from them. When facing a new task, the model can quickly adapt and obtain good performance with only a few gradient updates. The MAML algorithm is suitable for tasks such as few-shot learning, transfer learning, and reinforcement learning. It provides an effective method for rapidly learning and adapting to new tasks and, to a certain extent, alleviates the problems of data scarcity and domain shift.
The goal of the MAML algorithm is to train an algorithm A(φ) that can quickly train models, such as a deep neural network, for new learning tasks. A(φ) is usually parameterized, and its parameter φ is updated using a set of tasks. A task T in the MAML algorithm consists of a labeled support set D_S^T and a labeled query set D_Q^T that do not intersect. The support set and query set are defined as follows:
D_S^T = \{ (x_i, y_i) \}_{i=1}^{|D_S^T|}    (1)
D_Q^T = \{ (x_i, y_i) \}_{i=1}^{|D_Q^T|}    (2)
A(φ) trains model f on the support set D_S^T and outputs parameters θ_T; the model is then evaluated on the query set D_Q^T, and the test loss L(θ) is calculated to verify its generalization ability. Finally, A(φ) is updated to minimize the test loss. Meta-training proceeds in episodes: in each episode a batch of tasks is sampled from the task distribution 𝒯 of the meta-training set. The objective for optimizing algorithm A(φ) is therefore defined as follows:
\min_{\varphi} \mathbb{E}_{T \sim \mathcal{T}} \, L(\theta) = \min_{\varphi} \mathbb{E}_{T \sim \mathcal{T}} \, L\big( A(\varphi)(D_S^T) \big)    (3)
The MAML algorithm is a representative gradient-based meta-learning method that trains the model with gradient update steps. Algorithm A(φ) in MAML is only used to provide the initialization of the model. For each task T, the algorithm maintains φ = θ as the initial parameters. The model is then trained on the support set D_S^T, and θ is updated to θ_T by one or more gradient descent steps on the training loss:
L_{D_S^T}(\theta) = \frac{1}{|D_S^T|} \sum_{(x, y) \in D_S^T} L\big( f_{\theta}(x), y \big)    (4)
where L is the cross-entropy loss function. Finally, the adapted model f(θ_T) is tested on the query set D_Q^T:
L_{D_Q^T}(\theta_T) = \frac{1}{|D_Q^T|} \sum_{(x, y) \in D_Q^T} L\big( f_{\theta_T}(x), y \big)    (5)
The optimization objective in Equation (3) is then instantiated as follows:
\min_{\varphi} \mathbb{E}_{T \sim \mathcal{T}} \, L_{D_Q^T}\big( \theta - \alpha \nabla_{\theta} L_{D_S^T}(\theta) \big)    (6)
where α is the learning rate of the inner update. By embedding meta-learning into the federated learning framework, the data distributed across the clients cooperate to train the meta-learning algorithm, and the data of all clients are used to train the initialization parameters. In the federated setting, each client downloads the initialization model parameters from the server, trains model f(θ) on the support set D_S of its local data, and tests on a separate query set. The updated model gradient is then sent to the server.
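To make the two-level update concrete, the following sketch shows how a single MAML episode on one client could be written in PyTorch. It is a minimal illustration under our own assumptions (a tiny two-layer classifier on flattened 5 × 5 inputs, hypothetical tensor names), not the authors' implementation.

import torch
import torch.nn.functional as F

def forward(params, x):
    # Tiny two-layer classifier used only to illustrate the MAML update.
    w1, b1, w2, b2 = params
    h = F.relu(F.linear(x, w1, b1))
    return F.linear(h, w2, b2)

def maml_client_episode(params, support, query, inner_lr=0.01):
    # One MAML episode on a client: adapt on the support set (Eq. (4)),
    # evaluate on the query set (Eq. (5)), and return the gradient of the
    # query loss w.r.t. the initial meta-parameters (the quantity uploaded).
    (xs, ys), (xq, yq) = support, query

    # Inner update: theta_T = theta - alpha * grad L_{D_S}(theta)
    loss_s = F.cross_entropy(forward(params, xs), ys)
    grads = torch.autograd.grad(loss_s, params, create_graph=True)
    adapted = [p - inner_lr * g for p, g in zip(params, grads)]

    # Outer objective: L_{D_Q}(theta - alpha * grad L_{D_S}(theta))  (Eq. (6))
    loss_q = F.cross_entropy(forward(adapted, xq), yq)
    return torch.autograd.grad(loss_q, params)

# Example usage with random data (25 features = flattened 5 x 5 input)
params = [torch.randn(16, 25, requires_grad=True), torch.zeros(16, requires_grad=True),
          torch.randn(2, 16, requires_grad=True), torch.zeros(2, requires_grad=True)]
support = (torch.randn(8, 25), torch.randint(0, 2, (8,)))
query = (torch.randn(8, 25), torch.randint(0, 2, (8,)))
meta_grads = maml_client_episode(params, support, query)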

2.1.2. Lightweight Meta-Learner Design Based on MobileViT

In this method, the clients and server use the more lightweight MobileViT network as the meta-learner. Compared with the original MobileViT network [22], it has a smaller model size and a smaller number of parameters. However, compared with the traditional deep convolutional neural network, the lightweight MobileViT still maintains high feature extraction ability.
The MobileViT block can use information from the input tensor to build models with fewer parameters. Given an input tensor X ∈ ℝ^{H × W × C}, MobileViT applies a standard n × n convolution layer followed by a pointwise convolution layer to obtain X_L ∈ ℝ^{H × W × d}. To enable MobileViT to learn a global representation of the defect data, X_L is unfolded into N non-overlapping flattened patches of area P = wh:
N = \frac{HW}{P}
where h ≤ n and w ≤ n are the height and width of a patch, respectively. For each p ∈ {1, 2, …, P}, X_G ∈ ℝ^{P × N × d} is obtained by encoding the relationships between patches:
X_G(p) = \mathrm{Transformer}\big( X_U(p) \big), \quad 1 \le p \le P
MobileViT preserves both the patch order and the spatial order of pixels within patches, so X_G ∈ ℝ^{P × N × d} is folded back to obtain X_F ∈ ℝ^{H × W × d}.
Meta-learners in meta-learning need good generalization ability, which means they can quickly learn and adapt to new tasks. When the lightweight MobileViT is used as a meta-learner, it can capture global features of the input and effectively learn representations shared across multiple learning tasks. However, the Transformer architecture has some issues: it has many parameters, requires high computational power, and lacks spatial inductive bias [23]. The main problem is that migrating the Transformer to other tasks is cumbersome, making model training more difficult. The lightweight MobileViT therefore removes the Transformer structure from the MobileViT block and uses a CNN instead for global feature extraction. This global feature learning gives the meta-learner better generalization ability and allows it to adapt to different tasks.
The structure of the MobileViT meta-learner used in this article is shown in Figure 2. The network input is a single-channel 5 × 5 tensor. It first passes through a convolutional layer with a 3 × 3 kernel, and the output is then passed through cascaded MobileViT blocks.
Table 1 shows the main structure of the MobileViT meta-learner. Each block contains convolution layers, BN layers, and feature map stitching operations at different scales; the input size and output channel number of each layer are also given. The outputs of the three blocks pass through average pooling, a fully connected layer, and a Softmax classifier to produce the final prediction.
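For illustration, a rough PyTorch sketch of the meta-learner described by Figure 2 and Table 1 is given below. The exact form of the "feature stitching" step is not specified in the text, so it is assumed here to be an element-wise fusion that preserves the channel count; the activation and padding choices are also our assumptions, so the sketch should be read as an approximation of the architecture rather than the authors' code.

import torch
import torch.nn as nn

class LiteMobileViTBlock(nn.Module):
    # One block from Table 1: two 3x3 convolutions with batch normalization,
    # followed by an assumed element-wise "feature stitching" of the two outputs.
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.SiLU()

    def forward(self, x):
        local_feat = self.act(self.bn1(self.conv1(x)))            # local representation
        global_feat = self.act(self.bn2(self.conv2(local_feat)))  # wider receptive field
        return local_feat + global_feat                           # assumed "feature stitching"

class LiteMobileViT(nn.Module):
    # Meta-learner head: 3x3 stem conv, three blocks, average pooling,
    # linear layer, and a two-way classifier (cf. Figure 2 / Table 1).
    def __init__(self):
        super().__init__()
        self.stem = nn.Conv2d(1, 3, kernel_size=3, padding=1)     # 5x5x1 -> 5x5x3
        self.blocks = nn.Sequential(
            LiteMobileViTBlock(3, 16, stride=2),                  # -> 3x3x16
            LiteMobileViTBlock(16, 32, stride=2),                 # -> 2x2x32
            LiteMobileViTBlock(32, 64, stride=1),                 # -> 2x2x64
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, 2)

    def forward(self, x):                                         # x: (B, 1, 5, 5)
        z = self.pool(self.blocks(self.stem(x))).flatten(1)
        return self.fc(z)                                         # softmax applied in the loss

model = LiteMobileViT()
print(sum(p.numel() for p in model.parameters()))                 # rough parameter count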
Algorithm 1 gives the pseudocode of ECFML. First, the data of the clients participating in training are input, and the number of communication rounds T, the number of fine-tuning rounds N, the learning rate μ, and the other hyperparameters are set. The server then initializes the model parameters ω_S of the meta-learner and sends them to the selected clients. Each client performs training and updates its local meta-learner parameters. After the updates are completed, the server sends the global model parameters ω_S to the target client, which can perform model testing after a few fine-tuning steps.
Algorithm 1 Pseudocode of ECFML
Input: Selected clients C_i, i = 1, 2, …, m; target client C_T; number of communication rounds T; number of fine-tuning rounds N; local epochs E; learning rate μ.
Server processing:
1: Server initializes the model parameters ω_S of the meta-learner.
2: Server sends ω_S to the selected clients.
3: for each round t = 1, 2, …, T do
4:   for each client i ∈ {1, 2, …, m} do
5:     g_i ← LocalUpdate(ω_S, D_i)
6:   end for
7:   f(θ̂) ← (1/m) Σ_{i=1}^{m} g_i (aggregate the client gradients and update ω_S)
8: end for
9: Send ω_S to C_T
Target client processing:
1: for each fine-tuning step n = 1, 2, …, N do
2:   f_T(θ̂) ← (ω_S, D_{T,S})
3: end for
4: Test
Local update:
1: for each local epoch e = 1, 2, …, E do
2:   for each (D_{i,S}, D_{i,Q}) ⊂ D_i do
3:     Update model: f_i(θ) ← (ω_S, D_{i,S}); calculate loss: L_i(θ) ← (f_i(θ), D_{i,Q})
4:     g_i = ∇L_i(θ)
5:   end for
6: end for
7: return g_i
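Read procedurally, Algorithm 1 corresponds to a server loop of the following shape. The client interface (local_update, finetune, test) and the plain gradient-averaging step are assumptions made for illustration, and the sketch omits the Laplace encryption described in Section 2.2.

import copy

def ecfml_training(server_model, clients, target_client,
                   rounds=20, finetune_steps=5, lr=0.01):
    # Schematic ECFML loop following Algorithm 1. `clients` is a list of objects
    # exposing local_update(params) -> meta-gradient list, and `target_client`
    # exposes finetune(params) and test(params); these interfaces are assumed.
    params = [p.detach().clone() for p in server_model.parameters()]

    for t in range(rounds):                                   # communication rounds
        grads = [c.local_update(params) for c in clients]     # client meta-gradients
        # Server aggregation: average the client gradients and take a step
        for j, p in enumerate(params):
            avg = sum(g[j] for g in grads) / len(grads)
            params[j] = p - lr * avg

    # Target client: a few fine-tuning steps on its support set, then test
    local = copy.deepcopy(params)
    for n in range(finetune_steps):
        local = target_client.finetune(local)
    return target_client.test(local)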

2.2. Model Update with Privacy Encryption

Clients in federated learning hold their own local data but do not want to share it with other clients or the server. Encrypting model updates plays an important role in protecting data privacy during transmission and prevents unauthorized access or disclosure. Model update encryption can use a variety of technologies, such as differential privacy [24], secure multi-party computation [25], and homomorphic encryption [26]; the specific choice depends on a combined consideration of security requirements, computational overhead, and performance requirements. Homomorphic encryption is usually relatively slow and requires substantial computing resources and time, and its availability and performance depend on the specific encryption scheme chosen, with different schemes offering different performance and security trade-offs. Secure multi-party computation is better suited to decentralized paradigms that require communication and interaction among the participating parties, whereas this method adopts a centralized federated learning architecture; in addition, its computational cost is usually high, especially with many participants or high computational complexity. Laplace differential privacy has the composability of differential privacy; that is, privacy protection is still guaranteed after multiple queries are combined, so consistent privacy protection can be maintained in multi-query scenarios. Compared with other privacy protection technologies, Laplace differential privacy achieves a good balance between privacy protection and data utility: by setting the privacy budget reasonably, the impact of the noise can be limited, protecting privacy while preserving the accuracy and availability of the data as much as possible.
Each client keeps its local data in federated learning and transmits only model parameter updates. This prevents the original data from being exposed to other clients or the server, thereby protecting user privacy, which is particularly important for data containing sensitive personal information. Privacy protection measures can also enhance clients' trust in federated learning: a reliable federated learning framework with clear privacy protection policies and mechanisms makes clients more willing to share their data and participate in federated learning tasks.
Differential privacy requires that when a query is performed on a dataset D, the output f(D) changes only negligibly if a single record is added to or removed from the dataset, so the query output reveals essentially nothing about whether any individual record is in the dataset. Formally, for adjacent datasets D_1, D_2 ∈ ℕ^{|X|} (differing in a single record), a randomized algorithm K on domain ℕ^{|X|} satisfies ε-differential privacy if, for all S ⊆ Im(K):
\Pr[ K(D_1) \in S ] \le e^{\varepsilon} \Pr[ K(D_2) \in S ]
Since D_1 and D_2 can be exchanged, the lower bound follows directly:
\Pr[ K(D_1) \in S ] \ge e^{-\varepsilon} \Pr[ K(D_2) \in S ]
Combining the two inequalities gives:
-\varepsilon \le \log \frac{ \Pr[ K(D_1) \in S ] }{ \Pr[ K(D_2) \in S ] } \le \varepsilon
For software defect data, differential privacy is a suitable encryption method that also reduces communication and storage overhead, because software defect prediction scenarios usually require statistical analysis, clustering, model training, and other computations over the data. By introducing noise or perturbation into these computations, differential privacy enables aggregate computation over the data while still protecting its privacy.
Laplace differential privacy [27] and Gaussian differential privacy are common differential privacy techniques used to protect individual privacy during data processing. Both add noise to query results to hide sensitive information: Laplace differential privacy draws the noise from a Laplace distribution, while Gaussian differential privacy uses a Gaussian distribution. Laplace noise has heavier tails, which means larger noise values are more likely in the query results, providing stronger privacy protection.
Assume that a randomized algorithm F acts on any two adjacent datasets DB and DB′, with P_M the set of all its possible outputs and S_M any subset of P_M. If F satisfies the following inequality, then F satisfies ε-differential privacy:
\Pr[ F(DB) \in S_M ] \le e^{\varepsilon} \times \Pr[ F(DB') \in S_M ]
Pr[·] is determined by the randomness of the algorithm and reflects the risk of privacy leakage, and ε is the privacy budget, which measures the degree of privacy guaranteed by the algorithm; lower values indicate higher security and require more distortion of the input. The Laplace mechanism provides ε-differential privacy protection for numerical query results. For a given dataset D and a function f : D → ℝ^d with sensitivity Δf, the randomized algorithm F that provides ε-differential privacy outputs:
F(D) = f(D) + Y
where f(D) is the true result of the user query and Y is controllable random noise drawn from a Laplace distribution with scale parameter Δf/ε.
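As an illustration, a model update could be perturbed with the Laplace mechanism as follows; the L1 clipping bound used to fix the sensitivity Δf is an assumption introduced only for the example.

import numpy as np

def laplace_perturb(gradient, sensitivity, epsilon, rng=np.random.default_rng()):
    # Add Laplace noise with scale b = sensitivity / epsilon, i.e. F(D) = f(D) + Y.
    scale = sensitivity / epsilon
    return gradient + rng.laplace(loc=0.0, scale=scale, size=gradient.shape)

# Example: clip the update so its L1 sensitivity is bounded, then add noise
grad = np.array([0.42, -0.17, 0.08])
clip = 1.0                                # assumed clipping bound -> sensitivity Δf = clip
clipped = grad * min(1.0, clip / (np.abs(grad).sum() + 1e-12))
noisy = laplace_perturb(clipped, sensitivity=clip, epsilon=4.2)   # ε = 4.2 as in Section 3.3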

2.3. Implementation Steps of ECFML Algorithm

Figure 3 shows the flow chart of ECFML for further illustration. The clients participating in training and a server responsible for collaborative scheduling are selected. The server initializes the global parameters ω_S of the meta-learner and broadcasts them to all clients. After loading ω_S into their local models, the clients train on their respective support sets and then test on their query sets. After obtaining the model update g_i, each client encrypts it with Laplace differential privacy and sends it to the server. The server averages these g_i, updates the global parameters ω_S, and broadcasts them back to the clients; this process is repeated. The target client then obtains ω_S from the server and fine-tunes on its local support set; after fine-tuning, model testing is performed on the query set of the target client.

3. Results

In this section, we first observe the impact of privacy budget on model performance through the control variable method and determine the best privacy budget. Then, the model parameters and training time are compared with other methods. By recording the test results of each client’s MobileViT meta-learner in each round of training, the convergence curve of MobileViT meta-learner is drawn to analyze the convergence of the corresponding model. Finally, the performance of the proposed method in this paper is compared with other cross-project defect prediction methods.

3.1. Datasets Description

We used the AEEEM [28], NASA [29], and Relink datasets, which are widely used in software defect prediction, to evaluate the performance of the ECFML algorithm. From AEEEM we used the EQ, JDT, and LC projects; each project has 61 features that characterize the software development process from many aspects. The NASA dataset is widely used for defect prediction, and its static code metrics include size, readability, complexity, etc., which are closely related to software quality; from NASA we used the PC3, PC4, and MW1 projects. Relink improves the performance of defect prediction by increasing the quality of the defect data: its defect information has been manually verified and corrected. It consists of three open-source projects, each with 26 complexity metrics. Table 2 shows the statistics of the projects used in the experiment.
The programming language used in this experiment is Python 3.6, and the IDE is PyCharm 2018. The ECFML framework is built on PyTorch 1.4, with an Intel(R) Core(TM) i5-10400 processor and the Windows 10 operating system. All experiments were conducted on an NVIDIA GeForce GTX 1650 Ti graphics card.

3.2. Performance Evaluation Indicator

In software defect prediction, instances with defects are called the positive class, and instances without defects are called the negative class; software defect prediction is a classic binary classification problem. AUC is a commonly used metric for evaluating binary classification models. The G-mean index is a geometric mean that jointly considers the classification performance of the model on the positive and negative classes; it is computed from the confusion matrix. The binary confusion matrix is shown in Table 3.
If the classifier leans towards one class, the accuracy on the other class suffers. The larger the G-mean value, the higher the classification accuracy on both positive and negative classes and the better the performance of the model:
\text{G-mean} = \sqrt{ \frac{TP}{TP + FN} \times \frac{TN}{TN + FP} }
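As a sketch, both indicators can be computed from a vector of predicted scores as follows; the variable names and the 0.5 decision threshold are our own choices for illustration.

import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def g_mean(y_true, y_pred):
    # Geometric mean of recall on the defective (positive) and
    # defect-free (negative) classes, per the G-mean formula above.
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    tpr = tp / (tp + fn) if (tp + fn) else 0.0   # recall on defective modules
    tnr = tn / (tn + fp) if (tn + fp) else 0.0   # recall on defect-free modules
    return np.sqrt(tpr * tnr)

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])    # model probabilities
auc = roc_auc_score(y_true, y_score)
gm = g_mean(y_true, (y_score >= 0.5).astype(int))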

3.3. Impact of Privacy Encryption Budget on Model Accuracy

Laplace differential privacy is a common privacy protection technology, which protects the privacy of sensitive data by adding noise to query results. In the Laplace mechanism, the size of noise is controlled by the privacy budget ε . The larger the budget, the smaller the noise, but the lower the degree of privacy protection; the smaller the budget, the greater the noise, but the higher the degree of privacy protection.
However, as the privacy budget shrinks, the increased noise has a negative impact on data utility: the noise makes query results inaccurate and distorted, reducing the availability of the data and the accuracy of any analysis. Different types of data also have different sensitivities; for very sensitive data, even a large budget may introduce noise with a significant negative effect on utility. To balance privacy protection and data utility as well as possible, an appropriate privacy budget must be selected. We therefore observe the performance of the model while continuously adjusting the privacy budget and choose the budget according to the model's performance.
As shown in Figure 4, as ε slowly increases, the prediction accuracy of the model improves; beyond a certain value, the classification accuracy changes little even though ε continues to increase, which means the added noise is already very small and has little impact on model performance. As ε increases, the AUC gradually converges; when ε = 4.2, AUC = 0.7412. If ε is increased further, the added noise becomes smaller and smaller and the ability to protect data privacy is gradually weakened. Therefore, considering both aspects, ε = 4.2 is the most appropriate privacy budget.

3.4. Comparison of Model Parameters and Inference Time of Different Methods

In order to verify the advantages of this method in terms of model parameter quantity and training time, the model parameters and the average time from training to convergence are compared with those of four other related federated learning-based methods.
signSGD is an optimization algorithm used for model training in distributed machine learning [30]. It improves on traditional stochastic gradient descent with the aim of reducing communication overhead and the computational and storage requirements of model training. signSGD reduces communication by transmitting only signs: for each training batch, each node computes only the sign of its gradient and sends this sign information to the central server. The central server determines the direction of the global gradient from the received signs and updates the parameters.
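A minimal sketch of the sign-compression idea (our simplification, not the cited implementation) is:

import numpy as np

def sign_compress(gradient):
    # Client side: keep only the sign of each gradient coordinate (1 bit each).
    return np.sign(gradient).astype(np.int8)

def server_aggregate(signed_grads, lr=0.01):
    # Server side: a majority vote over the client signs gives the update direction.
    vote = np.sign(np.sum(signed_grads, axis=0))
    return -lr * vote            # parameter update direction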
The DGC (Deep Gradient Compression) [31] algorithm aims to reduce the communication overhead and bandwidth required for model transmission by compressing and sparsifying gradients in a distributed environment. In traditional distributed machine learning, each client (a device or node) computes gradients locally and transmits the complete gradient to the central server for parameter updates; this communication overhead grows with the number of participants and becomes especially inefficient in large-scale distributed computing. Through gradient compression and sparsification, the DGC algorithm can significantly reduce the volume of transmitted gradients, thereby reducing communication overhead and bandwidth requirements. After receiving the compressed sparse gradients, the central server decompresses them and uses them to update the model parameters.
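The core sparsification step of DGC can be sketched as a top-k selection with a local residual; the ratio and threshold handling below are simplified for illustration.

import numpy as np

def topk_sparsify(gradient, ratio=0.01):
    # Keep only the largest-magnitude `ratio` fraction of gradient entries;
    # the rest are zeroed locally (and, in DGC, accumulated for later rounds).
    k = max(1, int(ratio * gradient.size))
    threshold = np.sort(np.abs(gradient).ravel())[-k]
    mask = np.abs(gradient) >= threshold
    sparse = np.where(mask, gradient, 0.0)
    residual = gradient - sparse          # carried over to the next round in DGC
    return sparse, residual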
In the standard FedAvg [32] algorithm, model parameters are usually not compressed. The main goal of FedAvg is model aggregation: the model parameters distributed across multiple devices are averaged (or weighted-averaged) to obtain the global model update, and no explicit compression of the parameters is performed in this process. In the communication phase of the FTLKD method, knowledge distillation is used to update the local model of each client; the prediction results of the teacher model on the training data serve as soft targets, rather than transmitting the complete training data, which reduces the amount of data transmitted between clients and thus the communication cost.
Table 4 shows the comparison of model parameters and training time between our proposed ECFML model and the other methods. It can be seen from Table 4 that the meta-learner based on MobileViT has significantly fewer parameters than the other four methods, which greatly reduces the communication cost in the subsequent training process. In terms of training time, ECFML takes much less time for the local model to reach convergence than the other four methods. In summary, ECFML is superior to the other four methods in both parameter quantity and training time.

3.5. Convergence Analysis of MobileViT Meta-Learner

In order to observe the convergence of the ECFML algorithm on each project, NASA, AEEEM, and Relink were used as client datasets in pairs. Figure 5 shows the convergence trend when using AEEEM and Relink as datasets. Federated meta-learning enables the model to quickly learn new tasks from limited samples. By learning how to infer, generalize, and adapt, the model can make more efficient use of previously learned knowledge, so as to converge faster on new tasks.
When 0 < Rounds < 10, the AUC and G-mean indexes of the six clients increase rapidly. The difference is that in AUC, JDT, LC, and Safe reach a stable state at 0 < Rounds < 5, while EQ, Apache, and Zxing increase slightly at 0 < Rounds < 5, but they grow rapidly and reach a convergence state at 5 < Rounds < 10. In the G-mean index, JDT, Safe, and EQ reach a stable state at 0 < Rounds < 5, while other clients reach a convergence state at 5 < Rounds < 10.
In Figure 6, the six clients also tend to converge as the number of training rounds increases. In the AUC index, PC3, PC4, and Safe increase rapidly and reach convergence within 0 < Rounds < 5, while MW1, Apache, and Zxing increase relatively slowly but show convergence when Rounds = 15. In the G-mean index, only PC3 and Zxing converge when Rounds = 5; PC4, Safe, and MW1 converge when Rounds = 10, and Apache stabilizes when Rounds = 15.
In Figure 7, the six clients can reach the convergence state when Rounds = 10, showing excellent convergence. For most clients, AUC and G-mean increase in varying degrees during the continuous training between the clients and the server. In general, with the increase of training rounds, the vast majority of clients show a relatively stable trend, and each client tends to be stable after 5~10 rounds of training.
Since federated meta-learning focuses on learning general learning algorithms or strategies rather than over fitting the data of specific tasks, federated meta-learning can improve the robustness of the model. This means that the model can maintain good performance even in the face of new tasks with different training data.

3.6. Comparative Experiments and Analysis of Different Prediction Methods

In addition to validating the model parameters and convergence, this method is compared with the other four methods using AUC and G-mean as performance indicators for each project. The clients were drawn in turn from two of the three datasets AEEEM, Relink, and NASA; a total of six projects were selected as clients for testing, giving three groups of experiments. To reduce the impact of randomness, each group of experiments was repeated 15 times and the average results were statistically analyzed.
Table 5 shows the experimental results using AEEEM and Relink as client sources; the six projects are EQ, JDT, LC, Apache, Safe, and Zxing. The AUC and G-mean of the six clients under ECFML are 0.7109~0.7299 and 0.5693~0.5762, with averages of 0.7208 and 0.5730, respectively. Compared with the other four methods, the average AUC and G-mean are higher by (0.1693, 0.0783), (0.1857, 0.0895), (0.1680, 0.0874), and (0.1280, 0.0535), respectively.
Table 6 shows the experimental results using NASA and Relink as client sources; the six projects are PC3, PC4, MW1, Apache, Safe, and Zxing. The AUC and G-mean are 0.5563~0.7751 and 0.5138~0.5917, with averages of 0.6852 and 0.5588, respectively. Compared with the other four methods, the average AUC and G-mean are higher by (0.1408, 0.0661), (0.1529, 0.0721), (0.1394, 0.0719), and (0.1078, 0.0464), respectively.
Table 7 shows the experimental results using NASA and AEEEM as client sources; the six projects are PC3, PC4, MW1, EQ, JDT, and LC. The AUC and G-mean are 0.6507~0.7657 and 0.5404~0.5855, with averages of 0.7234 and 0.5696, respectively. Compared with the other four methods, the average AUC and G-mean are higher by (0.1863, 0.0747), (0.1859, 0.0751), (0.1763, 0.0650), and (0.1518, 0.0569), respectively.
Figure 8 shows the prediction results of different methods on AEEEM and Relink datasets. It can be seen that the AUC and G-mean of the ECFML method on all projects are higher than those of signSGD, DGC, FedAvg, and FTLKD, which shows that meta-learning provides useful prior knowledge by learning the similarities and differences between different client data, so as to improve the performance of the new client model.
Figure 9 shows the prediction results of different methods on the NASA and Relink datasets. It can be seen that the AUC and G-mean of the ECFML method are higher than those of signSGD, DGC, and FedAvg on all projects. ECFML also has the advantage over FTLKD except on the Safe and Zxing projects, whose data distributions are more heterogeneous than those of the other four projects.
Figure 10 shows the prediction results of different methods on the NASA and AEEEM datasets. It can be seen that the AUC and G-mean of the ECFML method are higher than those of signSGD, DGC, FedAvg, and FTLKD on all projects. This shows that meta-learning can learn general feature extraction methods and model structures; by learning the shared features of the data, the model can better adapt to new heterogeneous data.
To sum up, meta-learning can learn general knowledge of software defect data from multiple clients, helps federated learning improve the generalization ability of the meta-learner on new clients, allows the learning algorithm to better adapt to new heterogeneous data distributions, and reduces model training time. This method has advantages not only in communication cost but also in AUC and G-mean performance, outperforming the other methods. Our method enables the model to adapt to different tasks and environments: by learning how to select and adjust the parameters of the learning algorithm, the model can adapt to different task characteristics and environmental conditions, improving its generalization ability and adaptability.

3.7. Generalization of Models on Different Datasets

Software defect prediction is easily affected by factors such as the amount of historical data, usage time, and failure time related to software defects. To verify the generalization of this method on other datasets, we added the SOFTLAB dataset [33] for testing. This experiment used three projects from the SOFTLAB software defect dataset, with a sample size range of 63 to 121. The details of the dataset are shown in Table 8.
The projects in SOFTLAB have 29 metrics. The software projects used in this experiment are widely distributed and come from five different projects, which meets the needs of heterogeneous defect prediction and makes the experiment more representative. The number of samples per project varies greatly: the minimum is only 63, while the maximum reaches 1458, which indirectly indicates the diversity of the projects used in the experiment. We extend the PC3, PC4, and MW1 projects from the NASA dataset with the AR3, AR4, and AR5 projects from the SOFTLAB dataset, and in this way verify the generalization of ECFML on other datasets. Table 9 shows the results of the experiment.
As shown in Table 9, ECFML remains effective after the dataset is expanded. Although the AUC of project AR3, and both the AUC and G-mean of project AR4, are lower than those of FTLKD, the average results over all projects are higher than those of the other methods, which shows that ECFML has good generalization ability.

4. Conclusions

To address the problem of communication efficiency in building cross-project software defect prediction methods based on federated learning, an efficient communication software defect prediction method based on federated meta-learning is proposed. This method regards the multiple clients in federated learning as multiple tasks in meta-learning and extracts general knowledge from them, which allows the learning algorithm to adapt to new tasks faster without extensive training from scratch. At the same time, the MobileViT network is used as the meta-learner to reduce the number of model parameters and further reduce the communication cost. The experimental results show that this method has advantages in terms of model parameters and model performance, and the model also performs well in terms of convergence speed.
The future research direction will start from privacy protection technology and seek cross-project defect prediction methods that can better balance privacy protection and data availability.

Author Contributions

Conceptualization, A.W., L.Y. and H.C.; methodology, A.W. and L.Y.; software, L.Y.; validation, L.Y.; writing—review and editing, A.W., L.Y. and H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the High-end Foreign Experts Introduction Program (G2022012010L) and Key Research and Development Program Guidance Project of Heilongjiang (GZ20220123).

Data Availability Statement

AEEEM, NASA, Relink and SOFTLAB: https://github.com/bharlow058/AEEEM-and-other-SDP-datasets/tree/master/datasetcsv (accessed on 27 June 2015).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Vasileiou, M.; Papageorgiou, G.; Tjortjis, C. A Machine Learning Approach for Effective Software Defect Detection. In Proceedings of the 2023 14th International Conference on Information, Intelligence, Systems & Applications (IISA), Volos, Greece, 10–12 July 2023; pp. 1–8. [Google Scholar]
  2. Bala, Y.Z.; Samat, P.A.; Sharif, K.Y.; Manshor, N. Improving Cross-Project Software Defect Prediction Method through Transformation and Feature Selection Approach. IEEE Access 2023, 11, 2318–2326. [Google Scholar] [CrossRef]
  3. Amasaki, S.; Aman, H.; Yokogawa, T. A Preliminary Evaluation of CPDP Approaches on Just-in-Time Software Defect Prediction. In Proceedings of the 2021 47th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), Palermo, Italy, 1–3 September 2021; pp. 279–286. [Google Scholar] [CrossRef]
  4. Elbosaty, A.T.; Abdelmoez, W.M.; Elfakharany, E. Within-Project Defect Prediction Using Improved CNN Model via Extracting the Source Code Features. In Proceedings of the 2022 International Arab Conference on Information Technology (ACIT), Abu Dhabi, United Arab Emirates, 22–24 November 2022; pp. 1–8. [Google Scholar] [CrossRef]
  5. Li, K.; Xiang, Z.; Chen, T.; Tan, K.C. BiLO-CPDP: Bi-Level Programming for Automated Model Discovery in Cross-Project Defect Prediction. In Proceedings of the 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), Melbourne, VIC, Australia, 21–25 September 2020; pp. 573–584. [Google Scholar]
  6. Jing, X.; Wu, F.; Dong, X.; Qi, F.; Xu, W. Heterogeneous Cross-company Defect Prediction by Unified Metric Representation and CCA-based Transfer Learning. In Proceedings of the 10th Joint Meeting on Foundations of Software Engineering, Bergamo, Italy, 30 August–4 September 2015; pp. 496–507. [Google Scholar] [CrossRef]
  7. Gong, L.; Jiang, S.; Jiang, L. Conditional Domain Adversarial Adaptation for Heterogeneous Defect Prediction. IEEE Access 2020, 8, 150738–150749. [Google Scholar] [CrossRef]
  8. Nam, J.; Pan, S.J.; Kim, S. Transfer Defect Learning; ICSE: San Francisco, CA, USA, 2013; pp. 382–391. [Google Scholar]
  9. Sun, Y.; Jing, X.; Wu, F.; Li, J.; Xing, D.; Chen, H.; Sun, Y. Adversarial Learning for Cross-Project Semi-Supervised Defect Prediction. IEEE Access 2020, 8, 32674–32687. [Google Scholar] [CrossRef]
  10. Ma, Y.; Zhu, S.; Chen, Y.; Li, J. Kernel CCA Based Transfer Learning for Software Defect Prediction. IEICE Trans. Inf. Syst. 2017, 100, 1903–1906. [Google Scholar] [CrossRef]
  11. Song, H.; Li, Y.; Zhang, W.; Liu, Y. Research on Aggregation of Federated Model for Software Defect Prediction Based on Dynamic Selection. In Proceedings of the 2023 10th International Conference on Dependable Systems and Their Applications (DSA), Tokyo, Japan, 10–11 August 2023; pp. 242–249. [Google Scholar] [CrossRef]
  12. Wang, A.; Zhao, Y.; Li, G.; Zhang, J.; Wu, H.; Iwahori, Y. Heterogeneous Defect Prediction Based on Federated Reinforcement Learning via Gradient Clustering. IEEE Access 2022, 10, 87832–87843. [Google Scholar] [CrossRef]
  13. Wang, A.; Zhang, Y.; Yan, Y. Heterogeneous Defect Prediction Based on Federated Transfer Learning via Knowledge Distillation. IEEE Access 2021, 9, 29530–29540. [Google Scholar] [CrossRef]
  14. Sattler, F.; Korjakow, T.; Rischke, R.; Samek, W. FedAUX: Leveraging Unlabeled Auxiliary Data in Federated Learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 34, 5531–5543. [Google Scholar] [CrossRef] [PubMed]
  15. Li, Y.; Liu, Z.; Huang, Y.; Xu, P. FedOES: An Efficient Federated Learning Approach. In Proceedings of the 2023 3rd International Conference on Neural Networks, Information and Communication Engineering (NNICE), Guangzhou, China, 24–26 February 2023; pp. 135–139. [Google Scholar] [CrossRef]
  16. Park, S.; Choi, W. Regulated Subspace Projection Based Local Model Update Compression for Communication-Efficient Federated Learning. IEEE J. Sel. Areas Commun. 2023, 41, 964–976. [Google Scholar] [CrossRef]
  17. Yang, Z.; Sun, Q. Communication-efficient Federated Learning with Cooperative Filter Selection. In Proceedings of the 2022 IEEE International Symposium on Circuits and Systems (ISCAS), Austin, TX, USA, 27 May–1 June 2022; pp. 2172–2176. [Google Scholar] [CrossRef]
  18. Tang, Z.; Shi, S.; Li, B.; Chu, X. GossipFL: A Decentralized Federated Learning Framework with Sparsified and Adaptive Communication. IEEE Trans. Parallel Distrib. Syst. 2023, 34, 909–922. [Google Scholar] [CrossRef]
  19. Xu, J.; Du, W.; Jin, Y.; He, W.; Cheng, R. Ternary Compression for Communication-Efficient Federated Learning. IEEE Trans. Neural Netw. Learn. Syst. 2022, 33, 1162–1176. [Google Scholar] [CrossRef] [PubMed]
  20. Liu, X.; Deng, Y.; Nallanathan, A.; Bennis, M. Federated Learning and Meta Learning: Approaches, Applications, and Directions. IEEE Commun. Surv. Tutor. 2023, 26, 571–618. [Google Scholar] [CrossRef]
  21. Zhong, Q.; Chen, L.; Qian, Y. Few-Shot Learning for Remote Sensing Image Retrieval with MAML. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 2446–2450. [Google Scholar] [CrossRef]
  22. Li, Y.; Tang, J.; Li, L.; Wang, X.; Ding, W.; Li, X.; Yu, T.; Wu, X. MobileViT-based classification of Alzheimer’s disease. In Proceedings of the 2023 IEEE 6th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Haikou, China, 18–20 August 2023; pp. 443–448. [Google Scholar] [CrossRef]
  23. Wang, T.; Lu, X. Face Forgery Detection Algorithm Based on Improved MobileViT Network. In Proceedings of the 2023 8th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 21–23 April 2023; pp. 1396–1400. [Google Scholar] [CrossRef]
  24. Wu, N.; Peng, C.; Niu, K. A Privacy-Preserving Game Model for Local Differential Privacy by Using Information-Theoretic Approach. IEEE Access 2020, 8, 216741–216751. [Google Scholar] [CrossRef]
  25. Wang, J.; Zhang, Y.; Li, H. Electronic voting protocol based on ring signature and secure multi-party computing. In Proceedings of the 2020 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC), Chongqing, China, 29–30 October 2020; pp. 50–55. [Google Scholar] [CrossRef]
  26. Mahmood, Z.H.; Ibrahem, M.K. New Fully Homomorphic Encryption Scheme Based on Multistage Partial Homomorphic Encryption Applied in Cloud Computing. In Proceedings of the 2018 1st Annual International Conference on Information and Sciences (AiCIS), Fallujah, Iraq, 20–21 November 2018; pp. 182–186. [Google Scholar] [CrossRef]
  27. Huang, W.; Zhou, S.; Zhu, T.; Liao, Y.; Wu, C.; Qiu, S. Improving Laplace Mechanism of Differential Privacy by Personalized Sampling. In Proceedings of the 2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Guangzhou, China, 29 December 2020–1 January 2021; pp. 623–630. [Google Scholar] [CrossRef]
  28. Xu, Z.; Liu, J.; Yang, Z.; An, G.; Jia, X. The Impact of Feature Selection on Defect Prediction Performance: An Empirical Comparison. In Proceedings of the 2016 IEEE 27th International Symposium on Software Reliability Engineering (ISSRE), Ottawa, ON, Canada, 23–27 October 2016; pp. 309–320. [Google Scholar] [CrossRef]
  29. Huang, Y.; Xu, X. Two-stage cost-sensitive local models for heterogeneous cross-project defect prediction. In Proceedings of the 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), Los Alamitos, CA, USA, 27 June–1 July 2022; pp. 819–828. [Google Scholar] [CrossRef]
  30. Bernstein, J.; Wang, Y.; Azizzadenesheli, K.; Anandkumar, A. signSGD: Compressed optimisation for non-convex problems. arXiv 2018, arXiv:1802.04434. [Google Scholar]
  31. Lin, Y.; Han, S.; Mao, H.; Wang, Y.; Dally, W.J. Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training. arXiv 2017, arXiv:1712.01887. [Google Scholar]
  32. Mcmahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B. Federated Learning of Deep Networks Using Model Averaging. arXiv 2016, arXiv:1602.05629. [Google Scholar]
  33. Calikli, G.; Tosun, A.; Bener, A.; Celik, M. The effect of granularity level on software defect prediction. In Proceedings of the 2009 24th International Symposium on Computer and Information Sciences, Guzelyurt, Northern Cyprus, 14–16 September 2009; pp. 531–536. [Google Scholar] [CrossRef]
Figure 1. The framework of an efficient communication software defect prediction algorithm based on federated meta-learning.
Figure 2. MobileViT network architecture for cross-project defect prediction.
Figure 3. The flowchart of ECFML.
Figure 4. The impact of privacy encryption budget on model accuracy. The lines and squares in the figure are the scale lines of the horizontal and vertical coordinates.
Figure 5. Convergence trends of different projects in AEEEM and Relink: (a) AUC; (b) G-mean.
Figure 6. Convergence Trends of Different Projects in NASA and Relink: (a) AUC; (b) G-mean.
Figure 7. Convergence Trends of Different Projects in AEEEM and NASA: (a) AUC; (b) G-mean.
Figure 8. Prediction results of different methods on AEEEM and Relink datasets.
Figure 9. Prediction results of different methods on NASA and Relink datasets.
Figure 10. Prediction results of different methods on NASA and AEEEM datasets.
Table 1. The architecture of MobileViT model.
Model Structure | Operation\Stride | Kernel Size | Input Size
3 × 3 Conv2d | Conv2d\Stride = 1 | 1 × 3 × 3 × 3 | 5 × 5 × 1
MobileViT Block 1 | Conv2d\Stride = 2 | 3 × 3 × 3 × 16 | 5 × 5 × 3
MobileViT Block 1 | BN | | 3 × 3 × 16
MobileViT Block 1 | Conv2d\Stride = 1 | 16 × 3 × 3 × 16 | 3 × 3 × 16
MobileViT Block 1 | BN | | 3 × 3 × 16
MobileViT Block 1 | Feature stitching | | 3 × 3 × 16
MobileViT Block 2 | Conv2d\Stride = 2 | 16 × 3 × 3 × 32 | 3 × 3 × 16
MobileViT Block 2 | BN | | 3 × 3 × 32
MobileViT Block 2 | Conv2d\Stride = 1 | 32 × 3 × 3 × 32 | 3 × 3 × 32
MobileViT Block 2 | BN | | 2 × 2 × 32
MobileViT Block 2 | Feature stitching | | 2 × 2 × 32
MobileViT Block 3 | Conv2d\Stride = 1 | 32 × 3 × 3 × 64 | 2 × 2 × 32
MobileViT Block 3 | BN | | 2 × 2 × 64
MobileViT Block 3 | Conv2d\Stride = 1 | 64 × 3 × 3 × 64 | 2 × 2 × 64
MobileViT Block 3 | BN | | 2 × 2 × 64
MobileViT Block 3 | Feature stitching | | 2 × 2 × 64
Avg pooling | Average pooling | | 2 × 2 × 64
Linear | Linear | | 1 × 1 × 64
Softmax | Softmax classifier | | 1 × 2
Table 2. Statistics of the projects used in the experiment.
Database | Project | Description of Project | Metrics | Instances | Defects | Defective (%)
NASA | PC3 | Flight Software of Each Orbiting Satellite | 37 | 1077 | 134 | 12.44
NASA | PC4 | Flight Software of Each Orbiting Satellite | 37 | 1458 | 178 | 12.21
NASA | MW1 | Zero Gravity Experiment on Combustion | 37 | 253 | 27 | 10.67
AEEEM | EQ | OSGi Framework | 61 | 324 | 129 | 39.81
AEEEM | JDT | IDE Development | 61 | 997 | 206 | 20.66
AEEEM | LC | Text Search Engine Library | 61 | 691 | 64 | 9.26
Relink | Apache | Open-Source Software System | 26 | 194 | 98 | 50.52
Relink | Safe | Open Intents Library | 26 | 56 | 22 | 39.29
Relink | Zxing | Barcode Image Processing Library | 26 | 399 | 118 | 29.57
Table 3. Confusion matrix.
Prediction \ True | Defect | Defect-Free
Defect | True Positive (TP) | False Negative (FN)
Defect-free | False Positive (FP) | True Negative (TN)
Table 4. Comparison of model parameter quantity and training time with different methods.
Methods | signSGD | DGC | FedAvg | FTLKD | ECFML
Parameters | 175,187 | 105,125 | 116,813 | 114,292 | 75,069
Time (s) | 155.7122 | 233.5563 | 225.4934 | 185.1224 | 77.8521
Table 5. Experimental results using AEEEM and Relink as client sources.
Project | Index | signSGD | DGC | FedAvg | FTLKD | ECFML
EQ | AUC | 0.5245 | 0.5118 | 0.5087 | 0.6500 | 0.7109
EQ | G-mean | 0.4852 | 0.4724 | 0.4719 | 0.5477 | 0.5700
JDT | AUC | 0.5218 | 0.5123 | 0.5233 | 0.5119 | 0.7287
JDT | G-mean | 0.4772 | 0.4701 | 0.4731 | 0.4907 | 0.5762
LC | AUC | 0.5000 | 0.4919 | 0.4977 | 0.7000 | 0.7132
LC | G-mean | 0.4672 | 0.4627 | 0.4685 | 0.5578 | 0.5705
Apache | AUC | 0.6164 | 0.5861 | 0.5688 | 0.6429 | 0.7282
Apache | G-mean | 0.5302 | 0.5041 | 0.5162 | 0.5429 | 0.5759
Safe | AUC | 0.5935 | 0.5832 | 0.5841 | 0.4944 | 0.7144
Safe | G-mean | 0.5112 | 0.5097 | 0.5113 | 0.4898 | 0.5693
Average | AUC | 0.5512 | 0.5351 | 0.5324 | 0.5928 | 0.7208
Average | G-mean | 0.4947 | 0.4835 | 0.4856 | 0.5195 | 0.5730
Table 6. Experimental results using NASA and Relink as client sources.
Project | Index | signSGD | DGC | FedAvg | FTLKD | ECFML
PC3 | AUC | 0.5154 | 0.5117 | 0.5488 | 0.4988 | 0.7378
PC3 | G-mean | 0.4821 | 0.4719 | 0.4752 | 0.4711 | 0.5784
PC4 | AUC | 0.5119 | 0.5029 | 0.5318 | 0.5561 | 0.7751
PC4 | G-mean | 0.4753 | 0.4684 | 0.4740 | 0.4979 | 0.5917
MW1 | AUC | 0.5000 | 0.4925 | 0.4979 | 0.5000 | 0.6213
MW1 | G-mean | 0.4672 | 0.4698 | 0.4685 | 0.5000 | 0.5342
Apache | AUC | 0.6065 | 0.5818 | 0.5798 | 0.5718 | 0.7138
Apache | G-mean | 0.5293 | 0.5177 | 0.5168 | 0.5172 | 0.5662
Safe | AUC | 0.5834 | 0.5833 | 0.5835 | 0.7500 | 0.7071
Safe | G-mean | 0.5093 | 0.5092 | 0.5116 | 0.5774 | 0.5690
Zxing | AUC | 0.5497 | 0.5221 | 0.5332 | 0.5879 | 0.5563
Zxing | G-mean | 0.4932 | 0.4832 | 0.4754 | 0.5113 | 0.5138
Average | AUC | 0.5444 | 0.5323 | 0.5458 | 0.5774 | 0.6852
Average | G-mean | 0.4927 | 0.4867 | 0.4869 | 0.5124 | 0.5588
Table 7. Experimental results using NASA and AEEEM as client sources.
Project | Index | signSGD | DGC | FedAvg | FTLKD | ECFML
PC3 | AUC | 0.5122 | 0.5000 | 0.5565 | 0.5417 | 0.7657
PC3 | G-mean | 0.4889 | 0.4961 | 0.5209 | 0.5092 | 0.5886
PC4 | AUC | 0.5168 | 0.5011 | 0.5414 | 0.5530 | 0.7383
PC4 | G-mean | 0.4751 | 0.4988 | 0.5095 | 0.5014 | 0.5787
MW1 | AUC | 0.5195 | 0.5070 | 0.4923 | 0.7091 | 0.6507
MW1 | G-mean | 0.4816 | 0.4828 | 0.4696 | 0.5685 | 0.5404
EQ | AUC | 0.5636 | 0.5389 | 0.5972 | 0.6389 | 0.7238
EQ | G-mean | 0.5062 | 0.4729 | 0.5152 | 0.5307 | 0.5602
JDT | AUC | 0.5665 | 0.5971 | 0.5530 | 0.5119 | 0.7596
JDT | G-mean | 0.5162 | 0.5146 | 0.5124 | 0.5013 | 0.5855
LC | AUC | 0.5445 | 0.5809 | 0.5427 | 0.4750 | 0.7023
LC | G-mean | 0.5018 | 0.5018 | 0.5000 | 0.4655 | 0.5644
Average | AUC | 0.5371 | 0.5375 | 0.5471 | 0.5716 | 0.7234
Average | G-mean | 0.4949 | 0.4945 | 0.5046 | 0.5127 | 0.5696
Table 8. Details of the SOFTLAB dataset.
Database | Project | Description of Project | Metrics | Instances | Defects | Defective (%)
SOFTLAB | AR3 | Embedded Controller of The Washing Machine | 29 | 63 | 8 | 12.70
SOFTLAB | AR4 | Embedded Controller of The Dishwasher | 29 | 107 | 20 | 18.69
SOFTLAB | AR5 | Embedded Controller of The Refrigerator | 29 | 36 | 8 | 22.22
Table 9. Experimental results using NASA and SOFTLAB as client sources.
Project | Index | signSGD | DGC | FedAvg | FTLKD | ECFML
PC3 | AUC | 0.5739 | 0.6104 | 0.6066 | 0.6830 | 0.7507
PC3 | G-mean | 0.5134 | 0.5525 | 0.5299 | 0.5596 | 0.5832
PC4 | AUC | 0.5478 | 0.5308 | 0.5882 | 0.7400 | 0.7822
PC4 | G-mean | 0.4963 | 0.5035 | 0.5042 | 0.5795 | 0.5939
MW1 | AUC | 0.5423 | 0.5514 | 0.5556 | 0.5846 | 0.6287
MW1 | G-mean | 0.4949 | 0.4989 | 0.4969 | 0.5235 | 0.5363
AR1 | AUC | 0.5882 | 0.5625 | 0.6051 | 0.6176 | 0.7130
AR1 | G-mean | 0.5113 | 0.5103 | 0.5380 | 0.5239 | 0.5666
AR3 | AUC | 0.5833 | 0.5847 | 0.6150 | 0.6667 | 0.6481
AR3 | G-mean | 0.5092 | 0.5112 | 0.5277 | 0.5443 | 0.5493
AR4 | AUC | 0.5294 | 0.5102 | 0.5172 | 0.7560 | 0.7500
AR4 | G-mean | 0.4851 | 0.4935 | 0.5119 | 0.5853 | 0.5774
Average | AUC | 0.5608 | 0.5583 | 0.5812 | 0.6746 | 0.7121
Average | G-mean | 0.5017 | 0.5116 | 0.5181 | 0.5526 | 0.5677
