# Automatic Breast Tumor Screening of Mammographic Images with Optimal Convolutional Neural Network


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Design of the Multilayer Deep-Learning-Based CNN

- Feature enhancement and extraction: A multilayer 2D convolution operation is used to magnify the texture and edge information of possible tumor tissue (usually two or more layers are used), as shown in Figure 2a. Each layer uses a 3 × 3 sliding window to compute the convolutional weights. First, a 2D fractional convolution operation is performed to magnify the tumor characteristics. Then, by combining multilayer convolutional weight calculations, the contour of the tumor is gradually strengthened, noise is removed, and the image is sharpened. These effects strengthen the target area while suppressing non-characteristic information. This study applies 2D spatial fractional-order convolutional processes in the fractional convolutional layer, selects the appropriate fractional-order parameters, and performs convolution in the x and y directions, yielding a combination of 2D weight values in space, with the general formulas [35,36,37,38]:
$$C^{v}I_{xy}=\mathrm{conv}\left(I_{xy},M(i,j),v\right)^{T}$$
$$C_{x}^{v}I_{xy}=\sum_{i=-\frac{h-1}{2}}^{\frac{h-1}{2}}\;\sum_{j=-\frac{h-1}{2}}^{\frac{h-1}{2}}M_{x}(i,j)\,I(x+i,\,y+j)$$
$$C_{y}^{v}I_{xy}=\sum_{j=-\frac{h-1}{2}}^{\frac{h-1}{2}}\;\sum_{i=-\frac{h-1}{2}}^{\frac{h-1}{2}}M_{y}(j,i)\,I(x+j,\,y+i)$$
where $M_{x}$ and $M_{y}$ are 3 × 3 convolutional windows that can be written as follows [35,36,37,38]:
$$M_{x}=\begin{bmatrix}0 & \frac{v^{2}-v}{2} & 0\\ 0 & -v & 0\\ 0 & 1 & 0\end{bmatrix},\qquad M_{y}=M_{x}^{T}=\begin{bmatrix}0 & 0 & 0\\ \frac{v^{2}-v}{2} & -v & 1\\ 0 & 0 & 0\end{bmatrix}$$
$$\nabla^{v}I_{xy}=\begin{bmatrix}C_{x}^{v}I_{xy}\\ C_{y}^{v}I_{xy}\end{bmatrix}^{T},\qquad \left|\nabla^{v}I_{xy}\right|\cong\frac{\left|C_{x}^{v}I_{xy}\right|+\left|C_{y}^{v}I_{xy}\right|}{255}$$
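To make the directional masks concrete, the sketch below is a minimal numpy illustration, not the authors' implementation: it builds $M_{x}$ and $M_{y}$ for a given order $v$ (the example $v = 0.35$ lies within the $v = 0.30$–$0.40$ range used in this study) and combines the two directional responses as in the gradient-magnitude approximation above.

```python
import numpy as np

def fractional_masks(v):
    """Build the 3 x 3 fractional-order masks M_x and M_y = M_x^T."""
    Mx = np.array([[0.0, (v ** 2 - v) / 2.0, 0.0],
                   [0.0, -v, 0.0],
                   [0.0, 1.0, 0.0]])
    return Mx, Mx.T

def conv2_same(img, mask):
    """3 x 3 sliding-window correlation with zero padding ('same' size)."""
    h, w = img.shape
    out = np.zeros((h, w))
    padded = np.pad(img.astype(float), 1)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + 3, j:j + 3] * mask)
    return out

def fractional_gradient(img, v=0.35):
    """|grad^v I| ~ (|C_x^v I| + |C_y^v I|) / 255, as in the text."""
    Mx, My = fractional_masks(v)
    return (np.abs(conv2_same(img, Mx)) + np.abs(conv2_same(img, My))) / 255.0
```

A plain correlation is used here for clarity; any library convolution routine with `same`-size zero padding would serve equally well.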

- Rapid screening of breast tumors: Breast tumors are identified at the image classification layer, which includes the flattening process (FP) and a multilayer classifier, as seen in Figure 1. The FP converts a 2D feature matrix into a 1D feature vector, which is then fed as the input vector of the classifier for further pattern recognition. After two MP treatments, the FP treatment may be written as the general formula (7):
$$X\Big|_{1\times\left(\frac{n}{4}\right)^{2}}=FP\left(MP\Big|_{\frac{n}{4}\times\frac{n}{4}}\right)$$

where $x_{i}$ is an element of the 1D input feature vector, $i = 1, 2, 3, \ldots, n$, and $X = [x_{1}, x_{2}, x_{3}, \ldots, x_{n}]$. The training of the multilayer classifier uses the back-propagation algorithm to adjust the connecting weight parameters of the classifier, with the loss function set as the convergence condition for terminating the training stage. For multi-class classification, the multi-class binary cross-entropy function [7,43,44,45] is shown in Equation (9):

where $t_{j,k}$ is the target value (desired class), $T = [t_{1,k}, t_{2,k}, t_{3,k}, \ldots, t_{m,k}]$ for multiple classes; $y_{j,k}$ is the output prediction value, $Y = [y_{1,k}, y_{2,k}, y_{3,k}, \ldots, y_{m,k}]$; and $m$ is the number of classes. This study sets $m = 2$, either normal or abnormal, coded as $Y = [1, 0]$ and $Y = [0, 1]$, respectively; $k = 1, 2, 3, \ldots, K$, where $K$ is the number of training data; and $W$ is the weight parameter matrix of the fully connected classifier network.

#### 2.2. Adaptive Moment Estimation Method

where $\beta_{1} = 0.900$ and $\beta_{2} = 0.999$ are the attenuation rates of each iteration; $p = 1, 2, 3, \ldots, p_{max}$; and $p_{max}$ is the maximum number of iterations. Each iteration adjusts the weight parameters of the classifier within a limited range using the parameters of Equation (11), as shown in Equations (12) and (13) [31,46]:
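Equations (10)–(13) are not reproduced in this excerpt; the following is a minimal sketch of the standard ADAM update of Kingma and Ba [31] with the stated $\beta_{1} = 0.900$ and $\beta_{2} = 0.999$ (the learning rate and $\varepsilon$ are assumed defaults, not values taken from this paper).

```python
import numpy as np

def adam_step(w, grad, m, s, p, lr=0.001, beta1=0.900, beta2=0.999, eps=1e-8):
    """One ADAM iteration p (p = 1, 2, ..., p_max) in its standard form [31].

    m and s are the running first- and second-moment estimates; the
    bias-corrected estimates bound each weight update to roughly +/- lr."""
    m = beta1 * m + (1.0 - beta1) * grad
    s = beta2 * s + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** p)          # bias corrections
    s_hat = s / (1.0 - beta2 ** p)
    w = w - lr * m_hat / (np.sqrt(s_hat) + eps)
    return w, m, s

# Toy usage: minimize f(w) = w^2, whose gradient is 2w.
w, m, s = 1.0, 0.0, 0.0
for p in range(1, 5001):
    w, m, s = adam_step(w, 2.0 * w, m, s, p)
```

Because the bias-corrected ratio is approximately sign-like, each update moves the weight by roughly the learning rate, which is the "limited range" adjustment the text describes.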

#### 2.3. Classifier’s Performance Evaluations

#### 2.4. Computer Assistive System for Automatic Breast Tumor Screening

™ software is used to develop a computer assistive system for automatic breast tumor screening, integrating (1) ROI image extraction, (2) feature enhancement and extraction, and (3) a breast tumor screening classifier, among other functions. The algorithms for functions (1) and (2) are developed with the MATLAB Script tool. The multilayer CNN algorithm and the interface shown in Figure 4 are written in Python. The interface works as follows:

- Zone ①: Sets the source path of breast mammography images;
- Zone ②: Loads and displays the selected mammography images;
- Zone ③: As per the priority order, ROI images are extracted and automatic tumor screening is performed. In this study, six areas where tumors are most likely to be identified are designated. The CAS automatically prioritizes cutting the ROI feature patterns (100 pixels × 100 pixels), as seen in Figure 2, and then screens those areas. The block marked ③ shows the classifier output, the identification result, and the classification information. The red and green circles indicate normality and abnormality. A classifier output value greater than 0.5 indicates a high degree of confidence that there is a suspected breast tumor.

#### 2.5. Experimental Setup

## 3. Experimental Results

- The number of convolutional layers and pooling layers: This study increases the number of convolutional layers and pooling layers from 1 to 5 and the sizes of the convolution windows from 3 × 3 to 11 × 11. The processing windows for the pooling layers are set to 2 × 2, and the second to fifth convolutional layers each use 16 kernel convolution windows to perform feature enhancement and extraction.

This study uses a computer (Intel® Q370, Intel® Core™ i7-8700, DDR4 2400 MHz, 8 GB × 3) as a development platform to implement the multilayer CNN-based classifier suggested in this study, and a graphics processing unit (GPU) (NVIDIA® GeForce® RTX™ 2080 Ti, 1755 MHz, 11 GB GDDR6) to speed up digital image processing. The feasibility study was validated as described in detail in the subsequent sections.

#### 3.1. Testing of Different Multilayer CNN Models and Determination of the Most Suitable Architecture

#### 3.2. Testing of the First Convolutional Layer and Determination of the Window Type

#### 3.3. Multilayer CNN-Based Classifier Testing and Validation

With tenfold cross-validation ($K_{f}$ = 10), Table 7 shows the overall cross-validation results. Figure 7a indicates that the accuracy (%) of Models #2 and #3 improves over 600 epochs of training. By comparison, the accuracy of Models #1 and #4 improves over 200–400 epochs, after which it converges and the classification accuracy (%) approaches its maximum. The training convergence curve of the classifier is shown in Figure 7b. The accuracy (%) of all four models can exceed 95%. To shorten the classifier's design cycle and reduce the memory requirements for storing classifier parameters, we recommend the architectures of Models #1 and #4 for establishing and implementing the multilayer CNN-based classifiers.

Table 8 lists the tenfold cross-validation ($K_{f}$ = 10) averages of precision (%), recall (%), accuracy (%), and F1 score. Hence, we suggest Model #1 for carrying out a multilayer CNN-based classifier for automatic breast tumor screening. In addition, as seen in Table 9, we also set 4, 8, 16, and 32 kernel convolutional windows and 4, 8, 16, and 32 maximum pooling windows in the second and third convolutional-pooling layers, respectively, establishing four models (Models #1-1 to #1-4). With tenfold cross-validation on randomly selected trained feature patterns, the average training CPU time of Models #1-1 and #1-2 is less than that of Model #1-3 with 16 kernel convolutional windows and 16 maximum pooling windows, while Model #1-4, comprising 32 kernel convolutional windows and 32 maximum pooling windows, increases the average training CPU time and the computational complexity of each cross-validation. With tenfold cross-validation on randomly selected untrained feature patterns, as seen in Table 10, Table 11, Table 12 and Table 13, the proposed multilayer classifier architecture (Model #1-3) achieves promising classification accuracy and performance in terms of average precision (%), average recall (%), average accuracy (%), and average F1 score. Additionally, the proposed CNN architecture with different convolutional windows in the first convolutional layer, including fractional-order, Sobel (first-order), and Histeq convolutional windows, is used to test the performance of the breast tumor screening model. Under tenfold cross-validation, the CNN classifier with a fractional-order convolutional window in the first convolutional layer (Model #1 in Table 14) has better classification accuracy (above 95%) than Model #2 (above 85%) and Model #3 (above 90%).
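As a hedged sketch of the tenfold protocol: with 466 untrained test patterns per fold, as stated in the Discussion, the partitioning can be illustrated as follows; the 4660-pattern total (10 folds × 466) and the random seed are assumptions for illustration, not details from the paper.

```python
import numpy as np

def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition sample indices into k folds; each fold serves in
    turn as the untrained test set while the rest are used for training."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

folds = kfold_indices(4660, k=10)            # 10 folds of 466 patterns each
for f, test_idx in enumerate(folds):
    train_idx = np.concatenate([folds[j] for j in range(len(folds)) if j != f])
    # train the classifier on train_idx, then evaluate precision (%),
    # recall (%), accuracy (%), and F1 score on test_idx
```

Reporting the mean of each metric over the ten test folds yields the averages quoted in Tables 7 and 8.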

#### 3.4. Discussion

™ software, MATLAB Script tools, and the open-source TensorFlow platform (Version 1.9.0) [28] were used and integrated into a computer assistive system with automatic and manual feature extraction and breast tumor screening modes. The fractional-order convolutional layer and two convolutional-pooling layers enhance and sharpen the possible tumor edges, contours, and shapes via one fractional-order and two kernel convolutional processes on the feature patterns. Through a series of convolution and pooling processes at different scales and dimensions, the classifier obtains nonlinear feature representations, from low-level features to high-level information [29]. Then, with specific bounding boxes (automatic or manual mode) for ROI extraction, enhanced feature patterns can be distinguished for further breast tumor screening by the multilayer classifier in the classification layer. A gradient-descent optimization method, namely the ADAM algorithm, is used in the back-propagation process to adjust the network weight parameters in the classification layer. With K-fold ($K_{f}$ = 10) cross-validation and 466 randomly selected untrained feature patterns for each test fold, the proposed multilayer CNN-based classifier has high recall (%), precision (%), accuracy (%), and F1 scores for screening abnormalities in both right and left breasts. Experimental results show that the proposed multilayer CNN model offers image enhancement, feature extraction, automatic screening capability, and higher average accuracy (above 95%) for separating the normal condition from the possible tumor classes. Previous literature [3,4,5,6,7,10,56] has shown that multilayer CNNs comprising several convolutional-pooling layers and a fully connected network can establish a classifier for automatic breast tumor screening, and can also be applied to CT, MRI, chest X-ray, and ultrasound image processing, such as image classification and segmentation in clinical applications [19,23,28,35,36,51,55]. The combination of a cascade of deep learning and fully connected networks has also been carried out as a multilayer CNN-based classifier with a decision scheme [56]. For the screened suspicious regions on mammograms, that cascaded deep-learning method achieved 98% sensitivity and 90% specificity on the SuReMapp (Suspicious Region Detection on Mammogram from PP) dataset and 94% sensitivity and 91% specificity on the mini-MIAS dataset [56]. A CNN-based multilayer classifier can extract multi-scale feature patterns and increase their depth and width through multiple convolutional-pooling processes, which increases overall accuracy. However, excessive convolutional processes lead to a loss of the internal data about the position and orientation of the desired object, and excessive pooling loses valuable information about the spatial relationships between features; many such processes therefore require GPU hardware for the complex computations.
Hence, the proposed optimal multilayer CNN architecture retains 2D spatial information in the fractional-order convolutional layer (with two fractional-order convolutional windows) and continuously enhances the features with two rounds of convolutional-pooling processes (16 kernel convolutional windows and 16 maximum pooling windows), extracting the desired features at different scales and levels. In comparison with other deep-learning methods, the proposed multilayer classifier exhibits promising results for the intended medical diagnostic purpose. The proposed CNN-based classifier thus has the following advantages:

- The ROI extraction, image enhancement, and feature classification tasks are integrated into one learning model;
- The fractional-order convolutional process with fractional-order parameter, v = 0.30–0.40, is used to extract the tumor edges in the first convolutional layer; subsequently, two kernel convolution processes are used to extract the tumor shapes;
- The ADAM algorithm is easy to implement and works well with large datasets and many adjustable parameters;
- The proposed CNN-based classifier has better classification accuracy than the CNN architecture with Sobel and Histeq convolutional windows in the first convolutional layer.

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

CNN | Convolutional Neural Network |

CBIS-DDSM | Curated Breast Imaging Subset of a Digital Database for Screening Mammography |

BC | Breast Cancer |

AI | Artificial Intelligence |

BD | Big Data |

MRI | Magnetic Resonance Imaging |

CT | Computed Tomography |

ROI | Region of Interest |

2D | Two-Dimensional |

1D | One-Dimensional |

ML | Machine Learning |

MP | Maximum-Pooling |

FP | Flattening Process |

MLP | Multilayer Perceptron |

GPU | Graphics Processing Unit |

ADAM | Adaptive Moment Estimation Method |

GeLU | Gaussian Error Linear Unit |

MIAS | Mammographic Image Analysis Society |

SuReMapp | Suspicious Region Detection on Mammogram from PP |

TP | True Positive |

FP | False Positive |

TN | True Negative |

FN | False Negative |

PPV | Positive Predictive Value |

YI | Youden’s Index |

Sens | Sensitivity |

Spec | Specificity |

B | Benign |

M | Malignant |

## References

- Ministry Health and Welfare, Taiwan. 2020 Cause of Death Statistics. 2021. Available online: https://dep.mohw.gov.tw/dos/lp-1800-113.html (accessed on 1 January 2022).
- Tsui, P.-H.; Liao, Y.-Y.; Chang, C.-C.; Kuo, W.-H.; Chang, K.-J.; Yeh, C.-K. Classification of benign and malignant breast tumors by 2-D analysis based on contour description and scatterer characterization. IEEE Trans. Med. Imaging
**2010**, 29, 513–522. [Google Scholar] [CrossRef] [PubMed] - Kallenberg, M.; Petersen, K.; Nielsen, M.; Ng, A.Y.; Diao, P.; Igel, C.; Vachon, C.M.; Holland, K.; Winkel, R.R.; Karssemeijer, N.; et al. Unsupervised deep learning applied to breast density segmentation and mammographic risk scoring. IEEE Trans. Med. Imaging
**2016**, 35, 1322–1331. [Google Scholar] [CrossRef] [PubMed] - Samala, R.K.; Chan, H.; Hadjiiski, L.; Helvie, M.A.; Richter, C.D.; Cha, K.H. Breast cancer diagnosis in digital breast tomosynthesis: Effects of training sample size on multi-stage transfer learning using deep neuralnets. IEEE Trans. Med. Imaging
**2019**, 38, 686–696. [Google Scholar] [CrossRef] [PubMed] - Valkonen, M.; Isola, J.; Ylinen, O.; Muhonen, V.; Saxlin, A.; Tolonen, T.; Nykter, M.; Ruusuvuori, P. Cytokeratin-supervised deep learning for automatic recognition of epithelial cells in breast cancers stained for ER, PR, and Ki-67. IEEE Trans. Med. Imaging
**2020**, 39, 534–542. [Google Scholar] [CrossRef] - Lee, S.; Kim, H.; Higuchi, H.; Ishikawa, M. Classification of metastatic breast cancer cell using deep learning approach. In Proceedings of the 2021 International Conference on Artificial Intelligence in Information and Communication (ICAIIC), Jeju Island, Korea, 13–16 April 2021; pp. 425–428. [Google Scholar]
- Chougrad, H.; Zouaki, H.; Alheyane, O. Deep convolutional neural networks for breast cancer screening. Comput. Methods Programs Biomed.
**2018**, 157, 19–30. [Google Scholar] [CrossRef] - Jia, G.; Lam, H.-K.; Althoefer, K. Variable weight algorithm for convolutional neural networks and its applications to classification of seizure phases and types. Pattern Recognit.
**2021**, 121, 108226. [Google Scholar] [CrossRef] - Li, X.; Zhai, M.; Sun, J. DDCNNC: Dilated and depthwise separable convolutional neural network for diagnosis COVID-19 via chest X-ray images. Int. J. Cogn. Comput. Eng.
**2021**, 2, 71–82. [Google Scholar] [CrossRef] - University of South Florida. DDSM: Digital Database for Screening Mammography, Version 1 (Updated 2017/09/14); University of South Florida: Tampa, FL, USA; Available online: http://www.eng.usf.edu/cvprg/Mammography/Database.html (accessed on 1 January 2022).
- McCarthy, N.; Dahlan, A.; Cook, T.S.; Hare, N.O.; Ryan, M.; John, B.S.; Lawlor, A.; Curran, K.M. Enterprise imaging and big data: A review from a medical physics perspective. Phys. Med.
**2021**, 83, 206–220. [Google Scholar] [CrossRef] - Yaffe, M.J. Emergence of big data and its potential and current limitations in medical imaging. Semin. Nucl. Med.
**2019**, 49, 94–104. [Google Scholar] [CrossRef] - The European Federation of Organisations for Medical Physics (EFOMP). White Paper: Big data and deep learning in medical imaging and in relation to medical physics profession. Phys. Med.
**2018**, 56, 90–93. [Google Scholar] [CrossRef][Green Version] - Diaz, O.; Kushibar, K.; Osuala, R.; Linardos, A.; Garrucho, L.; Igual, L.; Radeva, P.; Prior, F.; Gkontra, P.; Lekadir, K. Data preparation for artificial intelligence in medical imaging: A comprehensive guide to open-access platforms and tools. Phys. Med.
**2021**, 83, 25–37. [Google Scholar] [CrossRef] [PubMed] - Qiu, Y.; Lu, J. A visualization algorithm for medical big data based on deep learning. Measurement
**2021**, 183, 109808. [Google Scholar] [CrossRef] - Saranya, N.; Priya, S.K. Deep convolutional neural network feed-Forward and back propagation (DCNN-FBP) algorithm for predicting heart disease using internet of things. Int. J. Eng. Adv. Technol.
**2021**, 11, 283–287. [Google Scholar] - Zhang, J.; Qu, S. Optimization of backpropagation neural network under the adaptive genetic algorithm. Complexity
**2021**, 2021, 1718234. [Google Scholar] [CrossRef] - Sadad, T.; Munir, A.; Saba, T.; Hussain, A. Fuzzy C-means and region growing based classification of tumor from mammograms using hybrid texture feature. J. Comput. Sci.
**2018**, 29, 34–45. [Google Scholar] [CrossRef] - Comelli, A.; Bruno, A.; Di Vittorio, M.L.; Ienzi, F.; Legalla, R.; Vitabile, S.; Ardizzone, E. Automatic multi-seed detection for MR breast image segmentation. Int. Conf. Image Anal. Process.
**2017**, 10484, 706–717. [Google Scholar] - Lindquist, E.M.; Gosnell, J.M.; Khan, S.K.; Byl, J.L.; Zhou, W.; Jiang, J.; Vettukattilb, J.J. 3D printing in cardiology: A review of applications and roles for advanced cardiac imaging. Ann. 3D Print. Med.
**2021**, 4, 100034. [Google Scholar] [CrossRef] - Drozdzal, M.; Chartrand, G.; Vorontsov, E.; Shakeri, M.; Di Jorio, L.; Tang, A.; Romero, A.; Bengio, Y.; Pal, C.; Kadoury, S. Learning normalized inputs for iterative estimation in medical image segmentation. Med. Image Anal.
**2018**, 44, 1–13. [Google Scholar] [CrossRef][Green Version] - Racz, A.; Bajusz, D.; Heberger, K. Multi-level comparison of machine learning classifiers and their performance metrics. Molecules
**2019**, 24, 2811. [Google Scholar] [CrossRef][Green Version] - Allen, J.; Liu, H.; Iqbal, S.; Zheng, D.; Stansby, G. Deep learning-based photoplethysmography classification for peripheral arterial disease detection: A proof-of-concept study. Physiol. Meas.
**2021**, 42, 054002. [Google Scholar] [CrossRef] - Panwar, M.; Gautam, A.; Dutt, R.; Acharyya, A. CardioNet: Deep learning framework for prediction of CVD risk factors. In Proceedings of the 2020 IEEE International Symposium on Circuits and Systems (ISCAS), Seville, Spain, 12–14 October 2020; pp. 1–5. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM
**2017**, 60, 84–90. [Google Scholar] [CrossRef] - Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Lecture Notes in Computer Science; 8689 LNCS; Springer: Cham, Switzerland, 2014; pp. 818–833. [Google Scholar]
- Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Li, F.-F. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
- Li, Y.-C.; Shen, T.-Y.; Chen, C.-C.; Chang, W.-T.; Lee, P.-Y.; Huang, C.-C. Automatic detection of atherosclerotic plaque and calcification from intravascular ultrasound Images by using deep convolutional neural networks. IEEE Trans. Ultrason. Ferroelectr. Freq. Control
**2021**, 68, 1762–1772. [Google Scholar] [CrossRef] [PubMed] - Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent advances in convolutional neural network. Pattern Recognit.
**2018**, 77, 354–377. [Google Scholar] [CrossRef][Green Version] - Kiranyaz, S.; Avci, O.; Abdeljaber, O.; Ince, T.; Gabbouj, M.; Inman, D.J. 1D convolutional neural networks and applications: A survey. Mech. Syst. Signal Process.
**2021**, 151, 107398. [Google Scholar] [CrossRef] - Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the 3rd International Conference for Learning Representations, San Diego, CA, USA, 7–9 May 2015. [Google Scholar]
- Ma, J.; Yarats, D. Quasi-hyperbolic momentum and Adam for deep learning. In Proceedings of the International Conference on Learning Representations, New Orleans, LA, USA, 6–9 May 2019. [Google Scholar] [CrossRef]
- Pilot European Image Processing Archive. The Mini-MIAS Database of Mammograms. 2012. Available online: http://peipa.essex.ac.uk/pix/mias/ (accessed on 1 January 2022).
- Mammographic Image Analysis Society (MIAS). Database v1.21. 2019. Available online: https://www.repository.cam.ac.uk/handle/1810/250394 (accessed on 1 January 2022).
- Wu, J.-X.; Chen, P.-Y.; Li, C.-M.; Kuo, Y.-C.; Pai, N.-S.; Lin, C.-H. Multilayer fractional-order machine vision classifier for rapid typical lung diseases screening on digital chest X-ray images. IEEE Access
**2020**, 8, 105886–105902. [Google Scholar] [CrossRef] - Lin, C.-H.; Wu, J.-X.; Li, C.-M.; Chen, P.-Y.; Pai, N.-S.; Kuo, Y.-C. Enhancement of chest X-ray images to improve screening accuracy rate using iterated function system and multilayer fractional-order machine learning classifier. IEEE Photonics J.
**2020**, 12, 1–19. [Google Scholar] [CrossRef] - Pu, Y.-F.; Zhou, J.-L.; Yuan, X. Fractional differential mask: A fractional differential-based approach for multiscale texture enhancement. IEEE Trans. Image Process.
**2010**, 19, 491–511. [Google Scholar] - Zhang, Y.; Pu, Y.-F.; Zhou, J.-L. Construction of fractional differential masks based on Riemann-Liouville definition. J. Comput. Inf. Syst.
**2010**, 6, 3191–3199. [Google Scholar] - Thanh, D.N.H.; Kalavathi, P.; Thanh, L.T.; Prasath, V.B.S. Chest X-ray image denoising using Nesterov optimization method with total variation regularization. Procedia Comput. Sci.
**2020**, 171, 1961–1969. [Google Scholar] [CrossRef] - Ba, L.J.; Frey, B. Adaptive dropout for training deep neural networks. Adv. Neural Inf. Process. Syst.
**2013**, 26, 1–9. [Google Scholar] - Clevert, D.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). In Proceedings of the 4th International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016. [Google Scholar]
- Hendrycks, D.; Gimpel, K. Gaussian Error Linear Units (GELUs). arXiv
**2016**, arXiv:1606.08415. [Google Scholar] - Boer, P.; Kroese, D.P.; Mannor, S.; Rubinstein, R.Y. A tutorial on the cross-entropy method. Ann. Oper. Res.
**2005**, 134, 19–67. [Google Scholar] [CrossRef] - Rubinstein, R.Y.; Kroese, D.P. The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte-Carlo Simulation, and Machine Learning; Information Science and Statistics; Springer: New York, NY, USA, 2004; pp. 1–47. [Google Scholar]
- Ho, Y.; Wookey, S. The real-world-weight cross- entropy loss function: Modeling the costs of mislabeling. IEEE Access
**2019**, 8, 4806–4813. [Google Scholar] [CrossRef] - Chen, X.; Liu, S.; Sun, R.; Hong, M. On the convergence of a class of ADAM-type algorithms for non-convex optimization. arXiv
**2018**, arXiv:1808.02941. [Google Scholar] - Syntax: Edge, 1994–2021 Years. Available online: https://www.mathworks.com/help/images/ref/edge.html (accessed on 1 January 2022).
- Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom.
**2020**, 21, 6. [Google Scholar] [CrossRef][Green Version] - Syntax: Histeq, 1994–2021 Years. Available online: https://www.mathworks.com/help/images/ref/histeq.html (accessed on 1 January 2022).
- Syntax: Imadjust, 1994–2021 Years. Available online: https://www.mathworks.com/help/images/ref/imadjust.html (accessed on 1 January 2022).
- Wu, J.-X.; Liu, H.-C.; Chen, P.-Y.; Lin, C.-H.; Chou, Y.-H.; Shung, K.K. Enhancement of ARFI-VTI elastography images in order to preliminary rapid screening of benign and malignant breast tumors using multilayer fractional-order machine vision classifier. IEEE Access
**2020**, 8, 164222–164237. [Google Scholar] [CrossRef] - Valenzuela, G.; Laimes, R.; Chavez, I.; Salazar, C.; Bellido, E.G.; Tirado, I.; Pinto, J.; Guerrero, J.; Lavarello, R.J. In vivo diagnosis of metastasis in cervical lymph nodes using backscatter coefficient. In Proceedings of the 2018 IEEE International Ultrasonics Symposium (IUS), Kobe, Japan, 22–25 October 2018. [Google Scholar]
- Chansong, D.; Supratid, S. Impacts of Kernel size on different resized images in object recognition based on convolutional neural network. In Proceedings of the 2021 9th International Electrical Engineering Congress (iEECON), Pattaya, Thailand, 10–12 March 2021. [Google Scholar]
- Sidek, K.A.; Khalil, I.; Jelinek, H.F. ECG biometric with abnormal cardiac conditions in remote monitoring system. IEEE Trans. Syst. Man Cybern. Syst.
**2014**, 44, 1498–1509. [Google Scholar] [CrossRef] - Zhang, X.-H. A Convolutional Neural Network Assisted Fast Tumor Screening System Based on Fractional-Order Image Enhancement: The Case of Breast X-ray Medical Imaging. Master’s Thesis, Department of Electrical Engineering, National Chin-Yi University of Technology, Taichung City, Taiwan, July 2021. [Google Scholar]
- Bruno, A.; Ardizzone, E.; Vitabile, S.; Midiri, M. A novel solution based on scale invariant feature transform descriptors and deep learning for the detection of suspicious regions in mammogram images. J. Med. Signals Sens.
**2020**, 10, 158–173. [Google Scholar]

**Figure 2.** ROI block cutting and priority extraction. (**a**) The statistics of the prevalence of malignant and benign tumors; (**b**) priority of ROI blocks for automatic ROI extraction.

**Figure 4.** The human–machine interface for automatic breast tumor screening: Zone ①: sets the source path of the breast mammography images; Zone ②: loads and displays the images; Zone ③: extracts ROI images and performs automatic tumor screening.

**Figure 6.** (**a**) The original image (malignant tumor) and the enhanced image after the three convolutions; (**b**) the original image and the pixel greyscale distribution map after image enhancement.

**Figure 7.** Training history curves of the multilayer CNN-based classifier. (**a**) Training performance test and classification performance validation, shown as classification accuracy versus training epoch; (**b**) classifier training convergence curve, shown as loss function value versus training epoch.

**Figure 8.**Human–machine interface of the computer assistive system and its automatic screening results.

Layer Function | Manner | Feature Pattern
---|---|---
Input Feature Pattern | ROI Extraction with 100 × 100 Bounding Box | 2D, 100 pixels × 100 pixels
1st Convolutional Layer | 3 × 3 Fractional-Order Convolutional Window, Stride = 1 | 2D, 100 pixels × 100 pixels
2nd Convolutional Layer | 3 × 3 Kernel Convolution Window, Stride = 1 | 2D, 100 pixels × 100 pixels
2nd Pooling Layer | 2 × 2 Maximum Pooling Layer, Stride = 2 | 2D, 50 pixels × 50 pixels
3rd Convolutional Layer | 3 × 3 Kernel Convolution Window, Stride = 1 | 2D, 50 pixels × 50 pixels
3rd Pooling Layer | 2 × 2 Maximum Pooling Layer, Stride = 2 | 2D, 25 pixels × 25 pixels
Flattening Layer | Flattening Process | 1D, 1 × 625 feature vector
Classification Layer | Multilayer Classifier: 625 input nodes, 168 hidden nodes, 64 hidden nodes, 2 output nodes (for normality and abnormality) | 1 × 625 feature vector fed into the multilayer classifier
Algorithm | ADAM Algorithm |
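The layer-by-layer sizes in the table can be checked with a short shape trace (an illustrative single-channel view: the 1 × 625 flattened vector corresponds to one 25 × 25 feature map).

```python
def conv_same(n, k=3, stride=1, pad=1):
    """Output width of a k x k convolution with the given stride/padding."""
    return (n + 2 * pad - k) // stride + 1

def pool_out(n, k=2, stride=2):
    """Output width of a k x k max-pooling window with the given stride."""
    return (n - k) // stride + 1

n = 100             # ROI input pattern, 100 x 100 pixels
n = conv_same(n)    # 1st (fractional-order) convolutional layer -> 100
n = conv_same(n)    # 2nd convolutional layer -> 100
n = pool_out(n)     # 2nd pooling layer -> 50
n = conv_same(n)    # 3rd convolutional layer -> 50
n = pool_out(n)     # 3rd pooling layer -> 25
flat = n * n        # flattening layer -> 625 classifier input nodes
```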

**Table 2.** Formulas for the evaluation criteria of the proposed classifier, including precision (%), recall (%), accuracy (%), and F1 score.

| | Actual Positive | Actual Negative | Total | Precision (%) |
---|---|---|---|---
Predicted Positive | TP | FP | TP + FP | (TP)/(TP + FP)
Predicted Negative | FN | TN | FN + TN |
Total | TP + FN | FP + TN | | Accuracy (%): (TP + TN)/(TP + FP + TN + FN)
Recall (%) | (TP)/(TP + FN) | | |
F1 Score | (2TP)/(2TP + FP + FN) | | |
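The formulas in Table 2 translate directly into code; the helper below is an illustrative sketch, and the example confusion-matrix counts are hypothetical, not values from the paper.

```python
def screening_metrics(tp, fp, tn, fn):
    """Precision, recall, accuracy, and F1 score per the formulas in Table 2."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return precision, recall, accuracy, f1

# Hypothetical counts for one test fold.
precision, recall, accuracy, f1 = screening_metrics(tp=90, fp=10, tn=85, fn=15)
```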

**Table 3.** Different convolutional layer models of the multilayer CNN-based classifier (Models #1–#5).

Model | 1st Convolution Window | 2nd Convolution Window | 3rd Convolution Window | 4th Convolution Window | 5th Convolution Window | Stride | Padding
---|---|---|---|---|---|---|---
1 | 3 × 3, 2 | - | - | - | - | 1 | 1
2 | 3 × 3, 2 | 5 × 5, 16 | - | - | - | 1 | 1
3 | 3 × 3, 2 | 5 × 5, 16 | 7 × 7, 16 | - | - | 1 | 1
4 | 3 × 3, 2 | 5 × 5, 16 | 7 × 7, 16 | 9 × 9, 16 | - | 1 | 1
5 | 3 × 3, 2 | 5 × 5, 16 | 7 × 7, 16 | 9 × 9, 16 | 11 × 11, 16 | 1 | 1

**Table 4.** Comparisons of average training CPU time and average accuracy (%) for five different CNN models.

Model | 1 | 2 | 3 | 4 | 5
---|---|---|---|---|---
Training CPU Time (min) | <30 | <240 | <7 | <10 | <180
Average Accuracy (%) | 90.99% | 90.34% | 95.92% | 95.28% | 95.71%

**Table 5.** Different convolutional layer models for feature enhancement and extraction (Models #1–#3).

Model | 1st Convolutional Window and Window Size | 2nd Convolutional Window and Window Size | 3rd Convolutional Window and Window Size | Stride/Padding | Maximum Pooling Window | Stride
---|---|---|---|---|---|---
1 | Fractional Order, 3 × 3, 2 | 3 × 3 or 5 × 5, 16 | 3 × 3 or 5 × 5, 16 | 1/1 | 2 × 2, 16 | 2
2 | Sobel (First Order), 3 × 3, 2 | 3 × 3 or 5 × 5, 16 | 3 × 3 or 5 × 5, 16 | 1/1 | 2 × 2, 16 | 2
3 | Histeq, 3 × 3, 2 | 3 × 3 or 5 × 5, 16 | 3 × 3 or 5 × 5, 16 | 1/1 | 2 × 2, 16 | 2

**Table 6.** Comparisons of the training CPU time for different models of the multilayer CNN-based classifier.

Model | 1st Convolutional Layer | 2nd Convolutional Layer | 2nd Pooling Layer | 3rd Convolutional Layer | 3rd Pooling Layer | Classification Layer (Fully Connecting Network) | Average Training Time (s)
---|---|---|---|---|---|---|---
1 | 3 × 3, 2 | 3 × 3, 16 | 2 × 2, 16 | 3 × 3, 16 | 2 × 2, 16 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <280
2 | 3 × 3, 2 | 3 × 3, 16 | 2 × 2, 16 | 5 × 5, 16 | 2 × 2, 16 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <220
3 | 3 × 3, 2 | 5 × 5, 16 | 2 × 2, 16 | 3 × 3, 16 | 2 × 2, 16 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <240
4 | 3 × 3, 2 | 5 × 5, 16 | 2 × 2, 16 | 5 × 5, 16 | 2 × 2, 16 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <330

**Table 7.** Accuracy (%) per test fold of K-fold cross-validation (K_{f} = 10) for Models #1–#4 of the multilayer CNN-based classifier.

Model | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 | Fold 9 | Fold 10 | Average Accuracy (%) |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 96.14 | 97.43 | 98.07 | 97.96 | 98.93 | 98.07 | 96.35 | 95.60 | 96.89 | 98.28 | 97.37 |
2 | 97.42 | 98.93 | 98.28 | 97.64 | 99.14 | 97.21 | 97.85 | 98.28 | 99.14 | 97.42 | 98.13 |
3 | 96.14 | 97.64 | 98.50 | 98.71 | 99.14 | 97.85 | 97.64 | 95.06 | 97.21 | 99.57 | 97.75 |
4 | 98.93 | 98.71 | 96.14 | 97.32 | 99.36 | 98.18 | 91.74 | 90.34 | 97.64 | 90.02 | 95.84 |
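The fold-wise results above are obtained with K-fold cross-validation (K_{f} = 10): the samples are split into ten disjoint folds, the network is trained on nine folds, and the held-out fold is used for testing, rotating through all ten folds. A minimal sketch of the procedure follows; the partitioning scheme and the `evaluate` callback are illustrative placeholders, not the authors' implementation.

```python
# Sketch of K-fold cross-validation (K_f = 10): partition sample
# indices into k disjoint folds, then evaluate each held-out fold.
def k_fold_indices(n_samples, k=10):
    """Partition indices 0..n_samples-1 into k near-equal, disjoint folds."""
    folds = [[] for _ in range(k)]
    for idx in range(n_samples):
        folds[idx % k].append(idx)
    return folds

def cross_validate(n_samples, evaluate, k=10):
    """Call evaluate(train_idx, test_idx) once per fold; return all
    fold scores and their average (e.g., per-fold accuracy)."""
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate(train_idx, test_idx))
    return scores, sum(scores) / k
```

The average of the ten fold scores is what the "Average Accuracy" column reports for each model.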

**Table 8.** Experimental results of K-fold cross-validation (K_{f} = 10) for the proposed deep-learning-based CNN.

Test Fold | Precision (%) | Recall (%) | Accuracy (%) | F1 Score |
---|---|---|---|---|
1 | 95.00 | 96.48 | 95.60 | 0.9574 |
2 | 94.38 | 95.92 | 95.20 | 0.9514 |
3 | 94.82 | 91.80 | 93.80 | 0.9389 |
4 | 95.02 | 96.50 | 95.60 | 0.9575 |
5 | 96.51 | 91.68 | 95.40 | 0.9577 |
6 | 94.09 | 94.84 | 94.40 | 0.9447 |
7 | 92.80 | 97.61 | 95.00 | 0.9515 |
8 | 92.77 | 95.85 | 94.40 | 0.9429 |
9 | 96.46 | 95.70 | 96.00 | 0.9608 |
10 | 95.19 | 95.54 | 95.00 | 0.9536 |
Average | 95.19 | 95.19 | 95.04 | 0.9516 |
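The F1 score column is the harmonic mean of precision and recall, F1 = 2PR/(P + R). A short check against the tabulated values (fold 1: P = 95.00%, R = 96.48%, F1 = 0.9574; fold 9: P = 96.46%, R = 95.70%, F1 = 0.9608) confirms the entries to within rounding:

```python
# F1 as the harmonic mean of precision and recall, both given in percent.
def f1_score(precision_pct, recall_pct):
    p, r = precision_pct / 100.0, recall_pct / 100.0
    return 2 * p * r / (p + r)

# f1_score(95.00, 96.48) ~ 0.9573, matching the tabulated 0.9574 to rounding.
```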

**Table 9.** Comparisons of the training CPU time for multilayer CNN-based classifiers with different numbers of kernel convolutional windows and maximum pooling windows in the second and third convolutional-pooling layers.

Model | 1st Convolutional Layer | 2nd Convolutional Layer | 2nd Pooling Layer | 3rd Convolutional Layer | 3rd Pooling Layer | Classification Layer (Fully Connected Network) | Average Training Time (s) |
---|---|---|---|---|---|---|---|
1–1 | 3 × 3, 2 | 3 × 3, 4 | 2 × 2, 4 | 3 × 3, 4 | 2 × 2, 4 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <150 |
1–2 | 3 × 3, 2 | 3 × 3, 8 | 2 × 2, 8 | 3 × 3, 8 | 2 × 2, 8 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <240 |
1–3 | 3 × 3, 2 | 3 × 3, 16 | 2 × 2, 16 | 3 × 3, 16 | 2 × 2, 16 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <280 |
1–4 | 3 × 3, 2 | 3 × 3, 32 | 2 × 2, 32 | 3 × 3, 32 | 2 × 2, 32 | 625 input nodes, 168 1st hidden nodes, 64 2nd hidden nodes, 2 output nodes | <330 |

**Table 10.** Experimental results of K-fold cross-validation (K_{f} = 10) for Model #1–1 with 4 kernel convolutional windows and 4 maximum pooling windows in each convolutional-pooling layer.

Test Fold | Precision (%) | Recall (%) | Accuracy (%) | F1 Score |
---|---|---|---|---|
1 | 85.30 | 84.70 | 87.80 | 0.8640 |
2 | 87.10 | 93.00 | 91.20 | 0.9000 |
3 | 82.80 | 93.40 | 89.00 | 0.8760 |
4 | 86.10 | 91.50 | 93.20 | 0.9200 |
5 | 84.00 | 92.00 | 89.20 | 0.8780 |
6 | 93.20 | 84.80 | 91.08 | 0.8880 |
7 | 91.20 | 95.70 | 95.40 | 0.9480 |
8 | 90.10 | 94.80 | 93.60 | 0.9260 |
9 | 92.40 | 86.70 | 91.40 | 0.8950 |
10 | 97.20 | 99.10 | 98.40 | 0.9810 |
Average | 88.94 | 91.57 | 92.30 | 0.9076 |

**Table 11.** Experimental results of K-fold cross-validation (K_{f} = 10) for Model #1–2 with 8 kernel convolutional windows and 8 maximum pooling windows in each convolutional-pooling layer.

Test Fold | Precision (%) | Recall (%) | Accuracy (%) | F1 Score |
---|---|---|---|---|
1 | 96.60 | 95.70 | 96.80 | 0.9620 |
2 | 94.80 | 95.60 | 96.00 | 0.9530 |
3 | 96.10 | 93.00 | 95.40 | 0.9450 |
4 | 88.30 | 96.20 | 93.00 | 0.9210 |
5 | 90.10 | 95.30 | 93.60 | 0.9260 |
6 | 92.10 | 93.40 | 93.80 | 0.9270 |
7 | 93.40 | 93.80 | 94.70 | 0.9360 |
8 | 92.30 | 93.00 | 93.80 | 0.9270 |
9 | 94.10 | 97.60 | 96.40 | 0.9580 |
10 | 90.30 | 96.70 | 94.20 | 0.9340 |
Average | 92.81 | 95.03 | 94.77 | 0.9389 |

**Table 12.** Experimental results of K-fold cross-validation (K_{f} = 10) for Model #1–3 with 16 kernel convolutional windows and 16 maximum pooling windows in each convolutional-pooling layer.

Test Fold | Precision (%) | Recall (%) | Accuracy (%) | F1 Score |
---|---|---|---|---|
1 | 96.70 | 96.20 | 97.00 | 0.9640 |
2 | 97.60 | 95.30 | 96.60 | 0.9570 |
3 | 95.10 | 93.40 | 95.40 | 0.9420 |
4 | 97.10 | 94.00 | 96.20 | 0.9540 |
5 | 97.20 | 97.10 | 97.60 | 0.9680 |
6 | 93.00 | 94.00 | 94.40 | 0.9340 |
7 | 95.60 | 92.40 | 95.00 | 0.9400 |
8 | 98.00 | 97.20 | 98.10 | 0.9760 |
9 | 96.50 | 95.70 | 96.00 | 0.9610 |
10 | 95.20 | 95.50 | 95.00 | 0.9540 |
Average | 96.30 | 95.04 | 95.93 | 0.9553 |

**Table 13.** Experimental results of K-fold cross-validation (K_{f} = 10) for Model #1–4 with 32 kernel convolutional windows and 32 maximum pooling windows in each convolutional-pooling layer.

Test Fold | Precision (%) | Recall (%) | Accuracy (%) | F1 Score |
---|---|---|---|---|
1 | 96.70 | 96.20 | 97.00 | 0.9640 |
2 | 99.00 | 96.20 | 99.00 | 0.9760 |
3 | 90.80 | 93.40 | 93.20 | 0.9210 |
4 | 95.60 | 93.40 | 95.40 | 0.9230 |
5 | 99.50 | 97.60 | 98.80 | 0.9860 |
6 | 95.30 | 95.70 | 96.20 | 0.9550 |
7 | 66.00 | 95.60 | 77.20 | 0.7800 |
8 | 96.70 | 98.60 | 98.00 | 0.9770 |
9 | 97.40 | 91.50 | 95.40 | 0.9440 |
10 | 99.00 | 94.80 | 97.40 | 0.9690 |
Average | 93.60 | 95.30 | 94.60 | 0.9395 |

**Table 14.** Comparisons of average accuracy (%) for the CNN-based classifier with different convolutional windows in the first to third convolutional layers.

Model | First Convolutional Window | Second and Third Convolutional Window | Second and Third Pooling Window | Classification Layer | Average Accuracy |
---|---|---|---|---|---|
1 | Fractional-Order Convolutional Window (2) | Kernel Convolution Window (16) | Maximum Pooling Window (16) | One Input Layer, Two Hidden Layers, One Output Layer | >95% |
2 | Sobel (First-Order) Convolutional Window (2) | Kernel Convolution Window (16) | Maximum Pooling Window (16) | One Input Layer, Two Hidden Layers, One Output Layer | >85% |
3 | Histeq Convolutional Window (2) | Kernel Convolution Window (16) | Maximum Pooling Window (16) | One Input Layer, Two Hidden Layers, One Output Layer | >90% |
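Model #1's first-layer window is the fractional-order mask pair M_x and M_y defined in Section 2.1, with M_y the transpose of M_x. A minimal pure-Python sketch of those masks and the gradient-magnitude approximation |∇^v I| ≈ (|C_x^v I| + |C_y^v I|)/255 follows; the convolution loop and function names are illustrative, not the authors' implementation.

```python
# Fractional-order 3x3 masks M_x and M_y (M_y = M_x^T) and the paper's
# gradient-magnitude approximation, applied by direct 3x3 convolution.
def fractional_masks(v):
    """Build M_x from the fractional order v; M_y is its transpose."""
    Mx = [[0.0, (v * v - v) / 2.0, 0.0],
          [0.0, -v,                0.0],
          [0.0, 1.0,               0.0]]
    My = [list(row) for row in zip(*Mx)]  # transpose of Mx
    return Mx, My

def conv3x3(img, mask):
    """Convolve a 2D list-of-lists image with a 3x3 mask (valid interior)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            s = 0.0
            for i in range(3):
                for j in range(3):
                    s += mask[i][j] * img[y + i - 1][x + j - 1]
            out[y][x] = s
    return out

def fractional_gradient(img, v):
    """Approximate |grad^v I| as (|C_x| + |C_y|) / 255, per Section 2.1."""
    Mx, My = fractional_masks(v)
    gx, gy = conv3x3(img, Mx), conv3x3(img, My)
    return [[(abs(a) + abs(b)) / 255.0 for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]
```

With v = 0.5, for example, the nonzero column of M_x is ((v² − v)/2, −v, 1) = (−0.125, −0.5, 1), which weights the center pixel against its vertical neighbors to sharpen edges while suppressing smooth regions.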



## Share and Cite

**MDPI and ACS Style**

Chen, P.-Y.; Zhang, X.-H.; Wu, J.-X.; Pai, C.-C.; Hsu, J.-C.; Lin, C.-H.; Pai, N.-S.
Automatic Breast Tumor Screening of Mammographic Images with Optimal Convolutional Neural Network. *Appl. Sci.* **2022**, *12*, 4079.
https://doi.org/10.3390/app12084079
