Article

A Federated Learning Framework for Breast Cancer Histopathological Image Classification

1 Artificial Intelligence and Human Languages Lab, Beijing Foreign Studies University, Beijing 100089, China
2 Beijing Academy of Artificial Intelligence, Beijing 100083, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Electronics 2022, 11(22), 3767; https://doi.org/10.3390/electronics11223767
Submission received: 25 October 2022 / Revised: 11 November 2022 / Accepted: 13 November 2022 / Published: 16 November 2022

Abstract

The quantity and diversity of datasets are vital to model training in a variety of medical image diagnosis applications. However, two problems arise in real scenarios: the required data may not be available in a single institution because of the number of patients or the type of pathology, and sharing patient data is often infeasible under medical data privacy regulations. Keeping private data safe is therefore mandatory and has become an obstacle to fusing data from multiple parties to train a medical model. To solve these problems, we propose a federated learning framework in which knowledge fusion is achieved by sharing the model parameters of each client through federated training, rather than by sharing data. On the breast cancer histopathological dataset (BreakHis), our federated learning experiments achieve results similar to those of centralized learning, verifying the feasibility and efficiency of the proposed framework.

1. Introduction

With the rapid development of Artificial Intelligence (AI), machine learning approaches have been widely used in smart medical diagnosis [1,2]. The success of smart medical diagnosis depends on a large amount of high-quality labeled data from which machine learning models obtain knowledge. However, access to medical data is extremely restricted out of consideration for patient privacy and data confidentiality. Breaking down isolated data islands while strengthening the privacy and security of data are the two significant challenges when applying artificial intelligence to smart medical diagnosis. As a secure knowledge fusion approach, federated learning allows data owners to train their models locally and then aggregate model parameters rather than fusing data directly. This study explores a federated learning framework that enables an intelligent model to learn from multi-sourced data without damaging data privacy, and takes breast cancer (BC) as an example in the experiments.
Breast cancer tops the list of cancers among women, and early screening is critical for treatment effectiveness. Traditional diagnosis involves the participation of medical professionals, with the attendant risks of treatment delay and subjective diagnosis. Based on breast cancer histopathological datasets, intelligent diagnosis methods [3,4,5] have been developed to accelerate the cancer diagnostic process. In the early years of relevant research, hand-crafted feature engineering [6,7,8] dominated the automatic cancer classification task. In 2012, AlexNet won the ImageNet challenge, marking the beginning of the era of convolutional feature extraction [9]. The application of an AlexNet variant to the classification task brought an increase in accuracy [10]. With the popularization of deep learning, multiple updated deep convolutional algorithms have been applied to histopathological images, all excelling on the most widely used breast cancer histopathological (BreakHis) dataset [11,12,13].
Research to date has tended to focus on algorithm innovations rather than data enhancement. It has previously been observed that 100% accuracy can be achieved or approached on the training set after continuous training with a 5-fold strategy on the BreakHis dataset [14,15,16]. Present deep convolutional networks are therefore adequate for the classification task on the BreakHis dataset [17,18]. However, in practice, histopathological data are more complex and diverse than in experimental environments. This situation indicates a need for larger and more diverse data to improve the generalization and robustness of models in practical applications.
In reality, medical data, including breast histopathological images, cannot be collected and exchanged outside hospitals. Sufficient high-quality data is essential for training machine learning models, yet medical datasets commonly suffer from uneven distribution and insufficient data due to collection difficulties. As a result, the contradiction between privacy protection and the requirement for adequate data fusion is a crucial obstacle to smart healthcare development [19,20]. In this situation, a method that fuses knowledge derived from data instead of fusing the data themselves, i.e., federated learning [21], is suitable for the development of intelligent medical diagnosis systems.
Two information security technologies are widely used to protect the knowledge fusion process in machine learning. The first is secure multi-party computation (SMC) [22,23,24], which enables multi-participant joint training while protecting model weights. However, since there is still a risk of deriving source data information from model parameters, a random noise mechanism is also adopted, known as differential privacy (DP) [25,26,27]. The federated learning process in this paper combines SMC and DP to achieve multi-party joint modeling without compromising performance.
In this paper, we propose an efficient and feasible federated learning framework for medical image diagnosis. Based on the proposed framework, we design an efficient system with resource efficiency in mind, and take breast cancer diagnosis as a practical case. Several comparative experiments are conducted on the BreakHis dataset. The experimental results verify that federated learning is an effective way to solve the data silo and data privacy issues in the knowledge fusion process of the intelligent medical field.

2. Related Work

2.1. Breast Cancer Diagnosis

The texture of the nuclear image provides a reproducible pattern in the histopathological-cytopathological determination of cancer [28]. This reproducible pattern makes it possible for pathological cancer images to be processed automatically without human intervention. Worldwide, breast cancer is the most common cause of cancer death in women, and a large number of computer-aided technologies have been proposed for its diagnosis. Generally, the early stage of diagnosis is a cytological examination of breast tumor material, after which several classifiers are applied to specific nuclei features. In the hand-crafted feature stage, histopathological image features need to be designed and filtered. For nuclei feature engineering, clustering methods such as k-means and fuzzy c-means are used in color space for fast nuclei segmentation [29]. Instead of accurate segmentation, the circular Hough transform is adopted for nuclei estimation [30,31]. A neural network with back-propagation has also been used for the analysis of cytological images, with performance comparable to a traditional support vector machine (SVM) classifier [31].
In terms of breast cancer imaging, most existing work [31,32,33] has been on whole-slide imaging (WSI), which carries a high cost of processing and operation in practice. A large, publicly available, annotated dataset is crucial for developing intelligent cancer diagnosis systems. However, the difficulty of obtaining medical data is a bottleneck in the development of breast cancer detection technology. The emergence of the BreakHis dataset [34] slightly alleviates the problem, supporting more efficient detection technologies in combination with deep learning. Deep convolutional neural networks have been applied directly to histopathological images for feature extraction [10,11]. It is worth noting that breast cancer image classification is feasible without considering magnification factors (40×, 100×, 200× and 400×) [11], which means that uniformity of image magnification across all medical parties is not required.

2.2. Federated Learning

As mentioned above, machine learning models can benefit from the fusion of mass data. However, in the medical field, access to data is extremely restricted out of consideration for user privacy and data confidentiality. In this situation, privacy-preserving decentralized collaborative machine learning techniques are suitable for developing intelligent medical diagnosis systems. Since Google first proposed federated learning in 2016, the concept has been extended to cover collaborative learning and knowledge fusion scenarios among organizations. There are several specific frameworks, including horizontal federated learning, vertical federated learning, and federated transfer learning [21]. For data isolation and label deficiency, federated learning provides a safe and efficient solution: it can train machine learning models that synthesize and fuse the information provided by multi-party data while keeping the data localized.
A considerable portion of research has been devoted to secure computation of machine learning algorithms. For linear regression, a primary machine learning algorithm, the best-fit curve can be computed without disclosing the input data by using homomorphic encryption and Yao's garbled circuits [35]. A gradient boosting decision tree (GBDT) secure computing system with privacy protection can support each private party training a model independently, with the models then aggregated securely [36]. After WeBank proposed an industrial-level federated learning framework, the SecureBoost framework for vertical federation [37] and the loose privacy constraint method for horizontal federation [38] both provided efficient solutions for GBDT secure computing. Google has been committed to mobile federated learning, optimizing federated communication efficiency [39,40] and building scalable federated production systems [41]. Furthermore, cross-device federated learning [42] deals with a large number of unreliable devices with limited computing capabilities and slow access links; by contrast, cross-silo federated learning [43] handles at most a few hundred reliable data silos with powerful computing resources and high-speed connections.

3. Federated Learning Framework

3.1. Overview

The proposed federated learning framework for efficient medical image diagnosis is illustrated in Figure 1. The system consists of three modules: user platform, federated server, and federated client.
The user platform includes two sub-modules: medical tools and data servers. The medical tools provide users with various types of medical field prediction services, and the data server provides the labeled data required for model training. The data sources can be datasets established by the platform, or provided by medical institution partners.
The federated server contains a task scheduler, a model container, and a federated module. As illustrated in Figure 2, the task scheduler maps the service request, which is encrypted with the advanced encryption standard (AES) [44], to the self-built or third-party model container for prediction. The model container decrypts the request, sends it to the corresponding model, and returns the result after AES encryption. When the prediction result of the model is not satisfactory, the federated module loads the model for online federated training optimization.
The federated clients download the corresponding model from the federated server when online federated training optimization begins. As shown in Figure 3, each client trains the model with its local data separately and uploads the parameters, encrypted with the homomorphic encryption Cheon-Kim-Kim-Song (CKKS) algorithm [45], to the federated server. The federated server adopts the federated averaging algorithm [46] to aggregate the parameters of each client. The model parameters are updated and sent back to each federated client. This online federated training optimization process is repeated until the model completes training.

3.2. Workflow

The overall workflow is as follows. First, the user initiates a service request on the user platform. The service request is encrypted and sent to the task scheduler. Second, the processing center of the task scheduler analyzes the request and maps it to the self-built or third-party model container. The chosen model carries out the prediction task and returns the encrypted prediction result to the user. Third, the processing center checks whether the user is satisfied with the result; if so, the workflow ends; otherwise, a warning message is sent to the federated module. The federated server then performs online federated training with K clients and saves the optimized model as a self-built model in the corresponding model container. Finally, the task scheduler returns the optimized result to the user.
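The decision logic of this workflow can be summarized in a few lines. The following is a minimal sketch, in which all class, method, and callback names are illustrative rather than taken from the actual system, and encryption is assumed to be handled inside the container:

```python
# Minimal sketch of the task-scheduler workflow; names are hypothetical.
class TaskScheduler:
    def __init__(self, model_containers, federated_module):
        self.containers = model_containers   # name -> callable model container
        self.federated = federated_module    # callable: model -> optimized model

    def handle_request(self, model_name, request, satisfied):
        model = self.containers[model_name]  # map request to a container
        result = model(request)              # prediction (encryption elided)
        if satisfied(result):                # user accepts the result
            return result
        # otherwise trigger online federated optimization, then retry
        self.containers[model_name] = self.federated(model)
        return self.containers[model_name](request)
```

In this sketch, the `federated` hook stands in for the full online federated training round described in Section 3.1.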

3.3. Robustness

In this subsection, we analyze the robustness of the federated learning system. When a user initiates a medical service request to the task scheduler, the processing center adopts AES encryption to enhance request security and to ensure model privacy, since the user cannot directly access the model container. If there are too many requests and the queuing time grows too long, the task scheduler applies the weighted round robin (WRR) scheduling algorithm [47] for load balancing. Furthermore, the distributed computing framework Spark [48] is also used for computing load balancing; when the user needs to query models in two different model containers, Spark can bring significant efficiency improvements. The model container returns encrypted prediction results, which prevents the original information from being replaced with malicious links or malicious code. In general, the system is developed with high scalability: it can adapt to an increasing number of users, knowledge fusion model types, and third parties participating in online federated training.
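The WRR idea referenced above can be illustrated with a simplified interleaved variant. This is only a sketch of the principle; production schedulers (and the specific WRR variant in [47]) typically use a smoother interleaving:

```python
from itertools import cycle

def weighted_round_robin(weighted_servers):
    """Dispatch servers in proportion to their integer weights.

    A simplified interleaved WRR sketch: each server appears in the
    rotation as many times as its weight.
    """
    expanded = [server for server, weight in weighted_servers
                for _ in range(weight)]
    return cycle(expanded)
```

A server with weight 2 then receives twice as many requests as a server with weight 1 over each full rotation.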

4. Experiments

In this section, we conduct three kinds of confirmatory experiments (centralized, federated and independent training) to verify the feasibility and efficiency of the proposed federated learning framework for breast cancer histopathological image classification.

4.1. Data

BreakHis https://web.inf.ufpr.br/vri/databases/breast-cancer-histopathological-database-breakhis/ (accessed on 10 March 2022) [34] is the dataset used in our experiments. It is composed of 7909 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40×, 100×, 200×, and 400×). To date, it contains 2480 benign and 5429 malignant samples (700 × 460 pixels, 3-channel RGB, 8-bit depth per channel, PNG format).
In BreakHis, both benign and malignant tumors are further categorized into four subtypes each, depending on the appearance of the tumor cells. The four categories of benign tumors are adenosis (A), fibroadenoma (F), phyllodes tumor (PT), and tubular adenoma (TA); ductal carcinoma (DC), lobular carcinoma (LC), mucinous carcinoma (MC), and papillary carcinoma (PC) are the four corresponding classes of malignant tumors. Figure 4 shows an example slide of a breast malignant tumor at the four magnification factors 40×, 100×, 200×, and 400×, and Table 1 details the magnification factors and histological subtypes of tumors with their numbers of images and patients contained in the BreakHis dataset.
The BreakHis dataset is partitioned into training and test sets at a ratio of 7:3, keeping the benign and malignant proportions approximately the same, as shown in Table 2. Moreover, we never use images of the same patient for both training and testing. In the experiments, the training portion is segmented into K = 11 parts, i.e., eleven virtual clients in the experimental environment with similar amounts of data and similar distributions of tumor types (benign and malignant). Each client includes four to six patients. It is worth noting that this data setting follows a non-IID partitioning scheme.
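A patient-level split of this kind can be sketched as follows. The function below is illustrative: the paper does not specify the exact assignment beyond four to six patients per client, so the round-robin assignment here is an assumption:

```python
import random

def split_by_patient(records, k=11, seed=0):
    """Partition (patient_id, image) records into k clients such that
    all images of a patient stay on a single client, mirroring the
    patient-level, non-IID split described above (a sketch)."""
    patients = sorted({pid for pid, _ in records})
    random.Random(seed).shuffle(patients)
    client_of = {pid: i % k for i, pid in enumerate(patients)}
    clients = [[] for _ in range(k)]
    for pid, image in records:
        clients[client_of[pid]].append((pid, image))
    return clients
```

Because assignment is done per patient rather than per image, no patient's images can leak across the client boundary, which also enforces the train/test separation rule at the client level.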

4.2. Models

Four state-of-the-art models, ResNet-152 [49], DenseNet-201 [50], MobileNet-v2-100 [51] and EfficientNet-b7 [52], are used for breast cancer image diagnosis. A brief introduction to the four models follows.

4.2.1. ResNet-152

This model introduces a deep residual learning framework to address the degradation problem by letting the layers fit a residual mapping F(x). For the desired underlying mapping H(x), the residual mapping for the stacked nonlinear layers is set to F(x) := H(x) − x, and the original mapping is recast as F(x) + x. We use the residual network with a depth of 152 layers.
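The residual formulation can be shown with a toy example on plain lists (this is a sketch of the skip connection itself, not an actual ResNet layer):

```python
def residual_block(x, residual_fn):
    """Toy residual connection: the stacked layers learn
    F(x) = H(x) - x, and the block outputs F(x) + x = H(x)."""
    return [f + xi for f, xi in zip(residual_fn(x), x)]
```

The point of the reformulation is that when the optimal mapping is close to the identity, the layers only have to drive F(x) toward zero, which is easier to optimize than fitting the identity directly.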

4.2.2. DenseNet-201

This model proposes an architecture called Dense Convolutional Network (DenseNet) that distills this insight into a simple connectivity pattern. On the one hand, to ensure maximum information flow between layers, DenseNet connects all layers directly with each other. On the other hand, to preserve the feed-forward nature, each layer obtains additional inputs from all preceding layers and passes on its feature maps to all subsequent layers. Crucially, DenseNet combines features by concatenating them, which introduces L(L + 1)/2 connections in an L-layer network. We apply DenseNet with 201 layers.
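The connection count follows because layer i receives inputs from all i preceding feature maps, giving 1 + 2 + … + L direct connections:

```python
def dense_connections(num_layers):
    """Number of direct connections in an L-layer dense block:
    L(L + 1) / 2, since layer i receives i incoming connections."""
    return num_layers * (num_layers + 1) // 2
```

For a 4-layer block this gives 10 connections; note that the 201 layers of DenseNet-201 are organized into several dense blocks, so the formula applies per block rather than to the whole network.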

4.2.3. MobileNet-v2-100

This model is an improved version of MobileNet-v1, which contains a novel layer module: the inverted residual with a linear bottleneck. On the one hand, this module takes input as a low-dimensional compressed representation, which is first expanded to a high dimension and filtered with a lightweight depthwise convolution. On the other hand, features are subsequently projected back to a low-dimensional representation with a linear convolution. We take MobileNet-v2 with a depth multiplier of 1.0.

4.2.4. EfficientNet-b7

To find a principled method to scale up Convolutional Neural Networks (CNNs) that can achieve better accuracy and efficiency, this model proposes a simple yet effective compound scaling method, which uniformly scales network width, depth, and resolution with a set of fixed scaling coefficients. Taking EfficientNet-b0 as a baseline, we scale up the baseline network with different compound coefficients to obtain EfficientNet-b1 to b7.
All four models have been pre-trained on ImageNet-1k with an input image size of 224 × 224. Generally, the performances of the four state-of-the-art models are in the ascending order of ResNet-152, DenseNet-201, MobileNet-v2-100 and EfficientNet-b7.

4.3. Metrics

Referring to [5], we conduct five evaluation metrics on the test dataset of BC histopathology images in this work, including ACC_IL (test accuracy at image level), ACC_PL (test accuracy at patient level), F1 (F1 measure), DOR (diagnostic odds ratio) and Kappa (Kappa criteria).

4.3.1. ACC_IL

ACC_IL (Equation (1)) is the ratio of $N_{rec}$ (the number of BC histopathology images correctly identified) to $N_{all}$ (the total number of BC histopathology images).
$$\mathrm{ACC\_IL} = \frac{N_{rec}}{N_{all}} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

4.3.2. ACC_PL

Patient score (Equation (2)) is the ratio of $N_{rec}$ (correctly identified BC histopathology images of patient P) to $N_P$ (all the BC histopathology images of patient P), and ACC_PL (Equation (3)) is the ratio of the sum of patient scores to the total number of patients.
$$\text{Patient score} = \frac{N_{rec}}{N_P} \tag{2}$$
$$\mathrm{ACC\_PL} = \frac{\sum \text{Patient score}}{\text{Total number of patients}} \tag{3}$$

4.3.3. F1

Precision (Equation (4)) is the number of correct benign BC histopathology images divided by the number of all benign BC histopathology images returned by the classifier. Recall (Equation (5)) is the number of correct benign BC histopathology images divided by the number of all samples that should have been identified as benign. F1 (Equation (6)) is the harmonic mean of the precision and recall.
$$\text{precision} = \frac{TP}{TP + FP} \tag{4}$$
$$\text{recall} = \frac{TP}{TP + FN} \tag{5}$$
$$F1 = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}} \tag{6}$$

4.3.4. DOR

DOR expresses the ratio of the product of TP and TN to the product of FP and FN, which reflects the degree of correlation between the results of the diagnostic prediction and ground truth. When the value is greater than 1, it indicates that the diagnostic prediction is reliable; when the value is less than 1, benign patients are more likely to be diagnosed as malignant patients; when the value is equal to 1, this diagnosis cannot distinguish between benign or malignant patients.
$$\mathrm{DOR} = \frac{TP \times TN}{FP \times FN} \tag{7}$$

4.3.5. Kappa

Kappa is calculated as Equation (10) describes, where $p_0$ (Equation (8)) is equal to ACC_IL defined in Equation (1), and $p_e$ (Equation (9)) is the ratio between the sum, over the benign and malignant categories, of the number of real images in a category multiplied by the predicted number of images in that category, and the square of the total number of samples. Kappa is used for consistency checking, and its value lies in the range [−1, 1], which can be divided into six groups representing the following consistency levels: [−1, 0) indicating no agreement, [0, 0.20] slight, [0.21, 0.40] fair, [0.41, 0.60] moderate, [0.61, 0.80] substantial, and [0.81, 1] almost perfect agreement.
$$p_0 = \frac{N_{rec}}{N_{all}} \tag{8}$$
$$p_e = \frac{\sum_i N_{true\_i} \times N_{pre\_i}}{N_{all} \times N_{all}} \tag{9}$$
$$\mathrm{Kappa} = \frac{p_0 - p_e}{1 - p_e} \tag{10}$$
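These image-level metrics can all be computed from a single binary confusion matrix. The following sketch implements them in plain Python; ACC_PL is omitted because it additionally requires the per-patient grouping of the test images:

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Compute ACC_IL, F1, DOR and Kappa (Equations (1), (4)-(10))
    from a binary confusion matrix."""
    n = tp + tn + fp + fn
    acc_il = (tp + tn) / n                                 # Equation (1)
    precision = tp / (tp + fp)                             # Equation (4)
    recall = tp / (tp + fn)                                # Equation (5)
    f1 = 2 * precision * recall / (precision + recall)     # Equation (6)
    # Equation (7); set to 0 when FP or FN is 0, as in Section 4.5
    dor = (tp * tn) / (fp * fn) if fp and fn else 0.0
    p0 = acc_il                                            # Equation (8)
    # Equation (9): chance agreement from the two class marginals
    pe = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (p0 - pe) / (1 - pe)                           # Equation (10)
    return acc_il, f1, dor, kappa
```

For example, a classifier with TP = TN = 40 and FP = FN = 10 scores ACC_IL = F1 = 0.8, DOR = 16 and Kappa = 0.6 (moderate agreement).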

4.4. Implementation

In the federated case, the federated averaging algorithm [46] is adopted. As described in Algorithm 1, the kth federated client first generates a public key $pk_k$ for encryption and a private key $sk_k$ for decryption based on the security parameter $\lambda_k$, and sends the number of local training samples $n_k$ to the federated server. Next, the kth federated client separately trains the selected model for $E_c$ epochs to update the corresponding model parameters $w_k^c$, and sends $[[w_k^c]]$, encrypted by CKKS with $pk_k$, to the federated server. The federated server then integrates the $[[w_k^c]]$ of the K federated clients by weighted average to obtain the integrated parameters $[[w^s]]$, where the weight of each federated client equals its proportion of the local training data, and returns them to the K federated clients. Finally, the kth federated client receives $[[w^s]]$ from the federated server, decrypts it with $sk_k$, and updates $w_k^c$. The above steps are repeated $E_s$ times to complete the federated training.
Algorithm 1: Federated training with the federated averaging algorithm and CKKS encryption.
Regarding hyper-parameters, we have the following settings in the training phases. Each federated client participates in training over the local data with the mini-batch size b c = 32 and the learning rate l r = 0.001, executing E c = 5 epochs each round. Then, the federated server receives locally-calculated gradients from those K = 11 federated clients each round and calculates their weighted average for a single global update. The global update takes E s = 20 rounds, leading to a total of 100 epochs for training.
For a fair comparison, both centralized training and independent training are conducted with identical hyper-parameter settings. In the centralized case, models are trained directly on the overall training data for 100 epochs. In the independent experiment, the training processes of the eleven clients are confined to their local data, each for 100 epochs.
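The aggregation loop of the federated case can be simulated with a toy model. In the sketch below, `local_update` stands in for the $E_c$ local epochs of each client, and the CKKS encryption of the exchanged parameters is omitted, so only the weighted-averaging step of Algorithm 1 is shown:

```python
def federated_averaging(client_data, init_weights, local_update, e_s=20):
    """Toy simulation of federated averaging: e_s global rounds of
    local updates followed by a data-size-weighted parameter average."""
    weights = list(init_weights)
    sizes = [len(data) for data in client_data]
    total = sum(sizes)
    for _ in range(e_s):                   # E_s global rounds
        local = [local_update(weights, data) for data in client_data]
        # server aggregation: weight each client by n_k / sum(n_k)
        weights = [sum(n * w[i] for n, w in zip(sizes, local)) / total
                   for i in range(len(weights))]
    return weights
```

With a one-parameter "model" whose local update is simply the mean of the local data, the aggregated parameter converges to the data-size-weighted mean across clients, mirroring how the real system weights each client by $n_k$.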
During the training process, we update only the parameters of the last classification layer and freeze the parameters of the preceding layers in order to accelerate training.

4.5. Results

As shown in Table 3, Table 4 and Table 5, the average ACC_IL difference, the average ACC_PL difference, and the average F1 difference of the four models between the centralized learning and the federated learning in the whole dataset are −0.35%, −0.44% and 0.88%, respectively. The corresponding differences between federated learning and independent learning are 13.70%, 12.13%, and 26.27%, respectively. Furthermore, whether in the whole dataset or in each magnification subset (40×, 100×, 200×, 400×), the federated learning results are competitive with the centralized learning results and are much higher than those of independent learning. The experimental results are consistent with the theoretical analyses, which show that the federated learning method indeed brings significant improvements to all the independent clients.
How do the four state-of-the-art models perform in the experiments? On the whole dataset, ResNet-152 [49] achieves the best results: its federated learning ACC_IL, ACC_PL, and F1 scores are 3.68%, 5.7%, and 7.59% higher than those of the second-place MobileNet-v2-100 [51]. The results of EfficientNet-b7 [52] are not ideal; we believe the number of training rounds is not large enough, so the model has not fully converged. For the magnification subsets, most of the experimental results at 100× and 200× magnification are better than those at 40× and 400×; we believe the former two magnifications maintain a balance between image information and precision.
It is worth mentioning that some federated learning results exceed the centralized learning results. The reason is that, in our experimental settings, federated learning updates the gradients every five epochs, while centralized learning updates the gradients every training round, which may cause some deviations in the experimental results.
As for the reliability and consistency of the four state-of-the-art models, Table 6 shows that in the federated learning experiment on the overall dataset, the DOR of ResNet-152 is close to 100, and the DOR of DenseNet-201 exceeds 150, indicating that the diagnostic results of these two models are very reliable. Meanwhile, the DOR of MobileNet-v2-100 is close to 40 and the DOR of EfficientNet-b7 exceeds 20, indicating that the experimental results of all four models are convincing. It is worth mentioning that ResNet-152 has a DOR of 0 on the 100× magnification subset; this is because the calculation sets DOR to 0 when FP or FN equals 0, which actually reflects the strength of the result.
As illustrated in Table 7, in the federated learning experiment on the overall dataset, the Kappa criteria of DenseNet-201, MobileNet-v2-100, and EfficientNet-b7 all fall in [0.61, 0.80], a substantial agreement. Meanwhile, the Kappa of ResNet-152 falls in [0.41, 0.60], a moderate agreement, indicating that the experimental results of the four models have a high agreement overall. It is noteworthy that DenseNet-201 has a Kappa of more than 0.80 on the 200× magnification subset, which is almost perfect agreement.

5. Conclusions

In this paper, we propose a federated learning framework for efficient medical image diagnosis, which conducts knowledge fusion by aggregating model parameters under data privacy requirements. In the system, the task scheduler provides load balancing for multi-user access, and computing efficiency benefits from the distributed computing framework. At the heart of the federated training mechanism, the encryption algorithms ensure the privacy of requests and results. Moreover, the easy extensibility of the model container makes the framework applicable beyond the medical field.
We also conduct breast cancer histopathological image classification experiments based on this framework. The ACC_IL, ACC_PL, and F1 results show that the four state-of-the-art models achieve federated learning results similar to centralized learning results, indicating the feasibility and efficiency of the federated learning framework. In addition, the DOR and Kappa performances of the four models under federated learning reflect the reliability and consistency of the experimental results.
In future work, the trained network can be further tested with larger and more balanced datasets from non-identically and independently distributed data sources. Since the problem of data imbalance is prevalent in the medical field, approaches to dealing with it should also be investigated. Furthermore, we plan to improve the operating efficiency of the homomorphic encryption algorithms and to measure the practical performance of the entire federated learning framework, including its susceptibility to security attacks.

Author Contributions

Conceptualization, S.Y.; methodology, L.L. and N.X.; software, N.X.; validation, L.L., N.X. and S.Y.; formal analysis, L.L. and N.X.; writing—original draft preparation, N.X.; writing—review and editing, L.L.; visualization, L.L.; supervision, S.Y.; project administration, S.Y.; funding acquisition, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key R&D Program of China (2020AAA0105200).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Greenspan, H.; Van Ginneken, B.; Summers, R.M. Guest editorial deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Trans. Med. Imaging 2016, 35, 1153–1159.
2. Shin, H.C.; Roberts, K.; Lu, L.; Demner-Fushman, D.; Yao, J.; Summers, R.M. Learning to read chest x-rays: Recurrent neural cascade model for automated image annotation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2497–2506.
3. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, SMC-3, 610–621.
4. Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987.
5. Xie, J.; Liu, R.; Luttrell, J.; Zhang, C. Deep Learning Based Analysis of Histopathological Images of Breast Cancer. Front. Genet. 2019, 10, 80.
6. Ojansivu, V.; Heikkilä, J. Blur Insensitive Texture Classification Using Local Phase Quantization. In Proceedings of the Image and Signal Processing—3rd International Conference, ICISP 2008, Cherbourg-Octeville, France, 1–3 July 2008.
7. Guo, Z.; Zhang, L.; Zhang, D. A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process. 2010, 19, 1657–1663.
8. Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571.
9. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 84–90.
10. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. Breast cancer histopathological image classification using convolutional neural networks. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 2560–2567.
11. Bayramoglu, N.; Kannala, J.; Heikkilä, J. Deep learning for magnification independent breast cancer histopathology image classification. In Proceedings of the 2016 23rd International Conference on Pattern Recognition (ICPR), Cancún, Mexico, 4–6 December 2016; pp. 2440–2445.
12. Abdullah-Al, N.; Ali, M.M.; Kong, Y. Histopathological Breast Cancer Image Classification by Deep Neural Network Techniques Guided by Local Clustering. Biomed Res. Int. 2018, 2018, 2362108.
13. Zhu, C.; Song, F.; Wang, Y.; Dong, H.; Liu, J. Breast cancer histopathology image classification through assembling multiple compact CNNs. BMC Med. Inform. Decis. Mak. 2019, 19, 198.
14. Zaalouk, A.M.; Ebrahim, G.A.; Mohamed, H.K.; Hassan, H.M.; Zaalouk, M.M. A deep learning computer-aided diagnosis approach for breast cancer. Bioengineering 2022, 9, 391.
15. Hameed, Z.; Zahia, S.; Garcia-Zapirain, B.; Javier Aguirre, J.; María Vanegas, A. Breast cancer histopathology image classification using an ensemble of deep learning models. Sensors 2020, 20, 4373.
16. Zheng, Y.; Li, C.; Zhou, X.; Chen, H.; Xu, H.; Li, Y.; Zhang, H.; Li, X.; Sun, H.; Huang, X.; et al. Application of Transfer Learning and Ensemble Learning in Image-level Classification for Breast Histopathology. arXiv 2022, arXiv:2204.08311.
17. Desai, M.; Shah, M. An anatomization on breast cancer detection and diagnosis employing multi-layer perceptron neural network (MLP) and Convolutional neural network (CNN). Clin. eHealth 2021, 4, 1–11.
18. Mridha, M.F.; Hamid, M.A.; Monowar, M.M.; Keya, A.J.; Ohi, A.Q.; Islam, M.R.; Kim, J.M. A comprehensive survey on deep-learning-based breast cancer diagnosis. Cancers 2021, 13, 6116.
19. Lu, M.Y.; Chen, R.J.; Kong, D.; Lipkova, J.; Singh, R.; Williamson, D.F.; Chen, T.Y.; Mahmood, F. Federated learning for computational pathology on gigapixel whole slide images. Med. Image Anal. 2022, 76, 102298.
  20. Scheibner, J.; Ienca, M.; Kechagia, S.; Troncoso-Pastoriza, J.R.; Raisaro, J.L.; Hubaux, J.P.; Fellay, J.; Vayena, E. Data protection and ethics requirements for multisite research with health data: A comparative examination of legislative governance frameworks and the role of data protection technologies. J. Law Biosci. 2020, 7, lsaa010. [Google Scholar] [CrossRef]
  21. Yang, Q.; Liu, Y.; Chen, T.; Tong, Y. Federated machine learning: Concept and applications. ACM Trans. Intell. Syst. Technol. (TIST) 2019, 10, 1–19. [Google Scholar] [CrossRef]
  22. Damgård, I.; Pastro, V.; Smart, N.P.; Zakarias, S. Multiparty Computation from Somewhat Homomorphic Encryption. IACR Cryptol. EPrint Arch. 2011, 2011, 535. [Google Scholar]
  23. Mohassel, P.; Zhang, Y. SecureML: A System for Scalable Privacy-Preserving Machine Learning. In Proceedings of the 2017 IEEE Symposium on Security and Privacy (SP), San Jose, CA, USA, 22–26 May 2017; pp. 19–38. [Google Scholar]
  24. Kilbertus, N.; Gascón, A.; Kusner, M.; Veale, M.; Gummadi, K.; Weller, A. Blind justice: Fairness with encrypted sensitive attributes. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 2630–2639. [Google Scholar]
  25. Dwork, C.; Roth, A. The Algorithmic Foundations of Differential Privacy. Found. Trends Theor. Comput. Sci. 2014, 9, 211–407. [Google Scholar] [CrossRef]
  26. Abadi, M.; Chu, A.; Goodfellow, I.J.; McMahan, H.B.; Mironov, I.; Talwar, K.; Zhang, L. Deep Learning with Differential Privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, Vienna, Austria, 24–28 October 2016; pp. 308–318. [Google Scholar]
  27. McMahan, H.B.; Ramage, D.; Talwar, K.; Zhang, L. Learning Differentially Private Language Models without Losing Accuracy. arXiv 2017, arXiv:1710.06963. [Google Scholar]
  28. Stenkvist, B.; Westman-Naeser, S.; Holmquist, J.; Nordin, B.; Fox, C.H. Computerized nuclear morphometry as an objective method for characterizing human cancer cell populations. Cancer Res. 1979, 38, 4688–4697. [Google Scholar]
  29. Kowal, M.; Filipczuk, P.; Obuchowicz, A.; Korbicz, J.; Monczak, R. Computer-aided diagnosis of breast cancer based on fine needle biopsy microscopic images. Comput. Biol. Med. 2013, 43, 1563–1572. [Google Scholar] [CrossRef]
  30. Filipczuk, P.; Fevens, T.; Krzyzak, A.; Monczak, R. Computer-Aided Breast Cancer Diagnosis Based on the Analysis of Cytological Images of Fine Needle Biopsies. IEEE Trans. Med. Imaging 2013, 32, 2169–2178. [Google Scholar] [CrossRef]
  31. George, Y.; Zayed, H.; Roushdy, M.; Elbagoury, B. Remote Computer-Aided Breast Cancer Detection and Diagnosis System Based on Cytological Images. IEEE Syst. J. 2013, 8, 949–964. [Google Scholar] [CrossRef]
  32. Zhang, Y.; Zhang, B.; Coenen, F.; Lu, W. Breast cancer diagnosis from biopsy images with highly reliable random subspace classifier ensembles. Mach. Vis. Appl. 2013, 24, 1405–1420. [Google Scholar] [CrossRef]
  33. Zhang, Y.; Zhang, B.; Coenen, F.; Xiao, J.; Lu, W. Erratum to: One-class kernel subspace ensemble for medical image classification. J. Adv. Signal Process. 2015, 88. [Google Scholar] [CrossRef]
  34. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. A dataset for breast cancer histopathological image classification. IEEE Trans. Biomed. Eng. 2015, 63, 1455–1462. [Google Scholar] [CrossRef]
  35. Nikolaenko, V.; Weinsberg, U.; Ioannidis, S.; Joye, M.; Boneh, D.; Taft, N. Privacy-preserving ridge regression on hundreds of millions of records. In Proceedings of the 2013 IEEE Symposium on Security and Privacy, San Francisco, CA, USA, 19–22 May 2013; pp. 334–348. [Google Scholar]
  36. Zhao, L.; Ni, L.; Hu, S.; Chen, Y.; Zhou, P.; Xiao, F.; Wu, L. Inprivate digging: Enabling tree-based distributed data mining with differential privacy. In Proceedings of the IEEE INFOCOM 2018—IEEE Conference on Computer Communications, Honolulu, HI, USA, 15–19 April 2018; pp. 2087–2095. [Google Scholar]
  37. Cheng, K.; Fan, T.; Jin, Y.; Liu, Y.; Chen, T.; Papadopoulos, D.; Yang, Q. Secureboost: A lossless federated learning framework. IEEE Intell. Syst. 2021, 36, 87–98. [Google Scholar] [CrossRef]
  38. Li, Q.; Wen, Z.; He, B. Practical federated gradient boosting decision trees. In Proceedings of the AAAI conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 4642–4649. [Google Scholar]
  39. Konečnỳ, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated optimization: Distributed machine learning for on-device intelligence. arXiv 2016, arXiv:1610.02527. [Google Scholar]
  40. Konečnỳ, J.; McMahan, H.B.; Yu, F.X.; Richtárik, P.; Suresh, A.T.; Bacon, D. Federated learning: Strategies for improving communication efficiency. arXiv 2016, arXiv:1610.05492. [Google Scholar]
  41. Bonawitz, K.; Eichner, H.; Grieskamp, W.; Huba, D.; Ingerman, A.; Ivanov, V.; Kiddon, C.; Konečnỳ, J.; Mazzocchi, S.; McMahan, B.; et al. Towards federated learning at scale: System design. Proc. Mach. Learn. Syst. 2019, 1, 374–388. [Google Scholar]
  42. Yu, H.; Liu, Z.; Liu, Y.; Chen, T.; Cong, M.; Weng, X.; Niyato, D.; Yang, Q. A sustainable incentive scheme for federated learning. IEEE Intell. Syst. 2020, 35, 58–69. [Google Scholar] [CrossRef]
  43. Zhang, C.; Li, S.; Xia, J.; Wang, W.; Yan, F.; Liu, Y. BatchCrypt: Efficient homomorphic encryption for Cross-Silo federated learning. In Proceedings of the 2020 USENIX Annual Technical Conference (USENIX ATC 20), online, 15–17 July 2020; pp. 493–506. [Google Scholar]
  44. Standard, N.F. Announcing the advanced encryption standard (aes). Fed. Inf. Process. Stand. Publ. 2001, 197, 3. [Google Scholar]
  45. Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Hong Kong, China, 3–7 December 2017; pp. 409–437. [Google Scholar]
  46. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; y Arcas, B.A. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282. [Google Scholar]
  47. Hirsch, P.D. Task Scheduling Using Improved Weighted Round Robin Techniques. U.S. Patent 10,324,755, 18 June 2019. [Google Scholar]
  48. Zaharia, M.; Chowdhury, M.; Franklin, M.J.; Shenker, S.; Stoica, I. Spark: Cluster computing with working sets. HotCloud 2010, 10, 95. [Google Scholar]
  49. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
  50. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 22–25 July 2017; pp. 4700–4708. [Google Scholar]
  51. Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L.C. Mobilenetv2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520. [Google Scholar]
  52. Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019; pp. 6105–6114. [Google Scholar]
Figure 1. The federated learning framework for breast cancer histopathological image classification.
Figure 2. The task scheduler and model container sub-modules of the federated server module.
Figure 3. The federated learning sub-module and federated client module of the federated learning framework.
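The parameter exchange between the federated server and the clients shown in Figures 1–3 follows the usual federated-averaging pattern: each client trains locally, and the server aggregates the uploaded parameters weighted by local dataset size. The sketch below is a minimal illustration of that aggregation step only, not the authors' exact implementation; the function and variable names are hypothetical.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Weighted average of client model parameters (FedAvg-style).

    client_weights: one list of np.ndarray layers per client
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    num_layers = len(client_weights[0])
    aggregated = []
    for layer in range(num_layers):
        # Each client's contribution is proportional to its data share.
        layer_sum = sum(
            (n / total) * w[layer]
            for w, n in zip(client_weights, client_sizes)
        )
        aggregated.append(layer_sum)
    return aggregated

# Two clients, one "layer" each; client 0 holds 3x the data of client 1.
clients = [[np.array([1.0, 1.0])], [np.array([5.0, 5.0])]]
global_weights = fedavg(clients, client_sizes=[3, 1])
print(global_weights[0])  # [2. 2.]  -> 0.75 * 1 + 0.25 * 5
```

In a full round, the server would broadcast `global_weights` back to the clients and repeat until convergence.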
Figure 4. A slide of a malignant breast tumor at different magnification factors. The pathologist selects the key areas to be examined at the next higher magnification; the highlighted rectangles were added manually for illustration.
Table 1. Benign and malignant image distribution by magnification factors and histological subtypes.

| Type of Tumor | Subtype of Tumor | 40× | 100× | 200× | 400× | Total | # Patients |
|---|---|---|---|---|---|---|---|
| Benign | adenosis (A) | 114 | 113 | 111 | 106 | 444 | 4 |
| | fibroadenoma (F) | 253 | 260 | 264 | 237 | 1014 | 10 |
| | phyllodes tumor (PT) | 149 | 150 | 140 | 130 | 569 | 7 |
| | tubular adenoma (TA) | 109 | 121 | 108 | 115 | 453 | 3 |
| | Total | 625 | 644 | 623 | 588 | 2480 | 24 |
| Malignant | ductal carcinoma (DC) | 864 | 903 | 896 | 788 | 3451 | 38 |
| | lobular carcinoma (LC) | 156 | 170 | 163 | 137 | 626 | 5 |
| | mucinous carcinoma (MC) | 205 | 222 | 196 | 169 | 792 | 9 |
| | papillary carcinoma (PC) | 145 | 142 | 135 | 138 | 560 | 6 |
| | Total | 1370 | 1437 | 1390 | 1232 | 5429 | 58 |
Table 2. The partitions of the training and test datasets.

| Dataset | # Images | # Patients (B) | # Patients (M) |
|---|---|---|---|
| Training | 5590 | 16 | 40 |
| Testing | 2319 | 8 | 18 |
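The partition in Table 2 is patient-disjoint: all images from a given patient fall entirely into either the training or the test set, which prevents leakage between splits. A sketch of such a split, assuming records of the form `(image_id, patient_id, label)`; all names here are hypothetical, not the authors' code.

```python
import random
from collections import defaultdict

def split_by_patient(records, n_test, seed=0):
    """Split (image_id, patient_id, label) records so that every patient's
    images land entirely in either the training or the test set."""
    rng = random.Random(seed)
    by_label = defaultdict(set)              # label -> set of patient ids
    for _, pid, label in records:
        by_label[label].add(pid)
    test_pids = set()
    for label, n in n_test.items():          # e.g. {"B": 8, "M": 18}
        test_pids |= set(rng.sample(sorted(by_label[label]), n))
    train = [r for r in records if r[1] not in test_pids]
    test = [r for r in records if r[1] in test_pids]
    return train, test

# Tiny synthetic example: 6 patients (3 benign, 3 malignant), 2 images each.
records = [(i, f"p{i % 6}", "B" if i % 6 < 3 else "M") for i in range(12)]
train, test = split_by_patient(records, n_test={"B": 1, "M": 2})
# No patient appears in both splits.
assert {r[1] for r in train}.isdisjoint({r[1] for r in test})
```

Stratifying the sampled test patients by class, as above, keeps the benign/malignant ratio comparable across splits.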
Table 3. ACC_IL (%) of four models on the BreakHis dataset; each cell shows centralized/federated/independent training results.

| Model | 40× | 100× | 200× | 400× | All |
|---|---|---|---|---|---|
| ResNet-152 | 85.82/85.46/77.13 | 87.34/83.39/75.16 | 87.73/86.03/76.49 | 84.14/82.65/76.87 | 86.33/84.39/76.97 |
| DenseNet-201 | 91.59/91.23/77.68 | 92.05/90.81/76.31 | 91.45/93.19/77.58 | 85.43/88.66/77.12 | 90.28/91.06/77.20 |
| MobileNet-v2-100 | 83.02/83.77/63.49 | 85.89/90.18/65.86 | 85.38/86.22/67.00 | 89.52/89.52/67.18 | 85.87/87.38/65.62 |
| EfficientNet-b7 | 82.55/83.81/73.10 | 83.80/83.63/73.43 | 84.66/86.07/73.85 | 80.69/82.43/66.92 | 82.98/84.02/72.26 |
Table 4. ACC_PL (%) of four models on the BreakHis dataset; each cell shows centralized/federated/independent training results.

| Model | 40× | 100× | 200× | 400× | All |
|---|---|---|---|---|---|
| ResNet-152 | 86.23/86.48/79.15 | 89.63/85.59/78.62 | 88.64/86.65/77.01 | 87.17/85.31/79.69 | 88.07/86.01/77.52 |
| DenseNet-201 | 92.19/92.15/79.30 | 92.58/90.94/79.83 | 91.50/93.80/80.49 | 87.39/90.77/83.04 | 91.06/91.87/81.03 |
| MobileNet-v2-100 | 83.40/83.05/64.06 | 83.34/87.90/64.99 | 83.23/85.93/68.42 | 87.58/88.41/68.13 | 84.19/86.17/65.48 |
| EfficientNet-b7 | 81.63/83.47/75.79 | 83.50/83.25/78.63 | 85.54/86.38/76.59 | 80.97/82.86/73.24 | 83.06/84.09/75.61 |
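The two accuracy variants in Tables 3 and 4 differ only in the averaging unit: image-level accuracy (ACC_IL) averages over all images, while patient-level accuracy (ACC_PL) first computes a per-patient recognition rate and then averages those rates over patients. A minimal sketch with hypothetical function names:

```python
from collections import defaultdict

def acc_image_level(y_true, y_pred):
    """Fraction of correctly classified images."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def acc_patient_level(y_true, y_pred, patient_ids):
    """Mean of per-patient recognition rates."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p, pid in zip(y_true, y_pred, patient_ids):
        hits[pid] += int(t == p)
        totals[pid] += 1
    scores = [hits[pid] / totals[pid] for pid in totals]
    return sum(scores) / len(scores)

y_true = [1, 1, 1, 0]
y_pred = [1, 0, 1, 0]
pids = ["a", "a", "a", "b"]
print(acc_image_level(y_true, y_pred))          # 0.75
print(acc_patient_level(y_true, y_pred, pids))  # (2/3 + 1) / 2, about 0.833
```

Because ACC_PL weights every patient equally, a model that fails on a patient with few images is penalized more heavily than under ACC_IL.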
Table 5. F1 (%) of four models on the BreakHis dataset; each cell shows centralized/federated/independent training results.

| Model | 40× | 100× | 200× | 400× | All |
|---|---|---|---|---|---|
| ResNet-152 | 72.79/69.85/53.44 | 76.19/64.65/55.06 | 77.64/72.11/61.16 | 67.92/62.65/56.19 | 73.95/67.45/55.99 |
| DenseNet-201 | 84.49/84.24/55.63 | 85.89/83.65/60.51 | 85.55/88.89/61.87 | 76.32/82.72/55.09 | 83.16/84.97/58.61 |
| MobileNet-v2-100 | 74.79/71.10/39.09 | 76.97/81.97/40.45 | 76.16/74.69/50.55 | 84.43/82.07/51.11 | 77.98/77.38/42.17 |
| EfficientNet-b7 | 70.34/72.05/45.83 | 69.74/69.90/43.39 | 74.03/76.70/45.70 | 69.88/72.17/32.69 | 71.03/72.78/40.74 |
Table 6. DOR of four models on the BreakHis dataset; each cell shows centralized/federated/independent training results.

| Model | 40× | 100× | 200× | 400× | All |
|---|---|---|---|---|---|
| ResNet-152 | 53.15/453.89/9.85 | 106.21/0.00/9.54 | 63.56/135.56/8.75 | 47.50/64.70/8.03 | 63.03/168.54/10.36 |
| DenseNet-201 | 372.36/151.37/8.98 | 162.37/105.57/13.25 | 167.55/218.40/8.89 | 29.21/52.03/8.31 | 97.96/99.81/9.85 |
| MobileNet-v2-100 | 21.35/24.20/2.98 | 31.34/71.32/2.19 | 27.28/42.92/3.03 | 65.63/99.94/3.54 | 31.10/48.02/2.75 |
| EfficientNet-b7 | 19.34/24.98/5.80 | 22.35/20.97/6.39 | 25.08/31.66/5.52 | 14.10/17.98/3.08 | 19.38/23.00/4.98 |
Table 7. Kappa (%) of four models on the BreakHis dataset; each cell shows centralized/federated/independent training results.

| Model | 40× | 100× | 200× | 400× | All |
|---|---|---|---|---|---|
| ResNet-152 | 63.69/61.34/41.26 | 68.02/55.50/37.84 | 69.42/63.56/43.78 | 58.27/52.93/42.17 | 65.16/58.42/41.42 |
| DenseNet-201 | 78.87/78.27/42.03 | 80.41/77.33/40.94 | 79.56/84.01/46.81 | 65.84/74.28/42.22 | 76.42/78.64/43.91 |
| MobileNet-v2-100 | 62.06/60.07/12.65 | 66.85/75.25/13.93 | 65.62/65.51/23.37 | 76.54/24.67/20.55 | 67.59/68.79/20.37 |
| EfficientNet-b7 | 58.18/60.90/29.56 | 58.91/58.83/27.31 | 63.21/66.80/28.65 | 55.69/59.37/11.75 | 59.09/61.58/23.68 |
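The F1 score, diagnostic odds ratio (DOR), and Cohen's kappa reported in Tables 5–7 can all be derived from the binary confusion-matrix counts. A sketch with a hypothetical helper (note that the DOR is undefined when FP or FN is zero, so a convention is needed for perfect classifiers):

```python
def binary_metrics(tp, fp, fn, tn):
    """F1 score, diagnostic odds ratio, and Cohen's kappa
    computed from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    dor = (tp * tn) / (fp * fn)        # undefined if fp == 0 or fn == 0
    n = tp + fp + fn + tn
    p_obs = (tp + tn) / n              # observed agreement
    # Expected chance agreement from the marginal totals.
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return f1, dor, kappa

f1, dor, kappa = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(round(f1, 4), dor, round(kappa, 2))  # 0.8421 36.0 0.7
```

Unlike plain accuracy, kappa discounts the agreement expected by chance, and the DOR summarizes how strongly a positive prediction raises the odds of true disease, which is why both are informative on the class-imbalanced BreakHis data.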
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Li, L.; Xie, N.; Yuan, S. A Federated Learning Framework for Breast Cancer Histopathological Image Classification. Electronics 2022, 11, 3767. https://doi.org/10.3390/electronics11223767
