# Automatic Tumor Identification from Scans of Histopathological Tissues


## Abstract


## 1. Introduction

- Modern augmentation and image preprocessing methods to analyze WSIs,
- Creating an adaptive U-Net model architecture,
- Applying different optimizers to achieve the best AUC.

- Section 2 reviews machine learning models, architectures, algorithms, and other techniques that can be used for histopathological WSIs,
- Section 3 outlines the methodology, describing step by step the machine learning model, the dataset, and the accuracy requirements for the experiments,
- Section 4 presents the design of the experiments, the main values, and the graphical and statistical results,
- Section 5 lists the major accomplishments and discusses the outcomes,
- Section 6 concludes our work and identifies potential directions for future work.

## 2. Related Work

#### 2.1. Medical Imaging

#### 2.2. Machine Learning Models

#### Learning Rate and Scheduling Algorithms

## 3. Materials and Methods

#### 3.1. Proposed Model

… $10^{-4}$ and then gradually increases to $5 \times 10^{-3}$ over 2400 iterations and descends to $5 \times 10^{-5}$ over 3600 iterations, as shown in Figure 1. Warm-up applies only to the first epoch. We used 6000 iterations per epoch with a batch size of 64.
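
The warm-up schedule described above can be sketched as a function of the iteration index. This is a minimal illustration, not the authors' implementation: the linear shape of both ramps is an assumption, the function name `warmup_lr` is hypothetical, and `lr_start` is a placeholder because the starting value is not fully recoverable from the text.

```python
def warmup_lr(iteration, lr_start=1e-4, lr_peak=5e-3, lr_end=5e-5,
              up_iters=2400, down_iters=3600):
    """Learning rate for one warm-up epoch (up_iters + down_iters = 6000).

    lr_start is an assumed placeholder; linear interpolation is also an
    assumption, since the source only gives the end points and durations.
    """
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    if iteration < up_iters:
        # Rising phase: lr_start -> lr_peak over the first 2400 iterations.
        frac = iteration / up_iters
        return lr_start + frac * (lr_peak - lr_start)
    if iteration < up_iters + down_iters:
        # Falling phase: lr_peak -> lr_end over the next 3600 iterations.
        frac = (iteration - up_iters) / down_iters
        return lr_peak + frac * (lr_end - lr_peak)
    # After the warm-up epoch the rate stays at its final value.
    return lr_end


if __name__ == "__main__":
    for it in (0, 1200, 2400, 4200, 6000):
        print(it, warmup_lr(it))
```

With these defaults the two phases add up to the 6000 iterations of the first epoch, so the schedule ends exactly where the epoch does.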

#### 3.2. Dataset

#### 3.3. Accuracy Calculation

- Precision using Equation (2),

- Recall using Equation (3),

- F1-score using Equation (4),

- AUC, which summarizes how well the model separates the two classes by tracing the true positive rate (sensitivity) against the false positive rate over the entire range of classification thresholds.
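
The threshold-dependent metrics in this list can be computed directly from confusion-matrix counts. The sketch below assumes the standard definitions of precision, recall, and F1-score, which Equations (2) to (4) presumably follow; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, fn):
    """Return (precision, recall, f1) from confusion-matrix counts.

    Standard definitions are assumed:
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)
      f1        = 2 * precision * recall / (precision + recall)
    Empty denominators yield 0.0 instead of raising an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Illustrative counts only, not taken from the experiments.
    print(classification_metrics(tp=8, fp=2, fn=2))
```

AUC is not included here because it requires the full ranking of predicted scores rather than a single confusion matrix.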

## 4. Experiments and Results

#### 4.1. Experimental Setup

- Number of filters (5):

- Number of blocks (6):

- Exclusion factor (7):

- L2 regularization (8): $$\mathrm{L2\ regularization} = 1 \times 10^{-5} + \frac{f}{8} \times 0.0001$$
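
Equation (8) can be evaluated with a one-line function. This sketch reads the constant written as "1 × e−5" as $10^{-5}$ and treats `f` as the variable defined by the preceding equations; both readings are assumptions, and the function name is hypothetical.

```python
def l2_regularization(f):
    """L2 regularization strength from Equation (8).

    Assumes the constant term is 1e-5 and that f is the quantity
    defined earlier in the experimental setup.
    """
    return 1e-5 + (f / 8) * 1e-4


if __name__ == "__main__":
    for f in (0, 8, 16):
        print(f, l2_regularization(f))
```

Under this reading, each increase of `f` by 8 adds 0.0001 to the regularization strength.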

#### 4.2. Results

… $10^{-4}$, and the learning rate schedule used with the SGD optimizer was changed to a cosine schedule. As expected, this training strategy worked very well. After the fifth training iteration, we achieved an AUC of 0.95911, almost 0.4 percentage points better than the previous best model, which was trained with the AdamW optimizer (0.95515).
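
For reference, a cosine learning rate schedule of the kind mentioned above can be sketched as follows. This is the common cosine-annealing form, not necessarily the exact variant used in the experiments; the function name and all parameter values in the example are hypothetical.

```python
import math


def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine annealing from lr_max (step 0) down to lr_min (total_steps)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside [0, total_steps] stay at the end points.
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical settings chosen only to show the shape of the curve.
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_lr(step, total_steps=100, lr_max=1e-2))
```

The rate falls slowly at first, fastest around the midpoint, and slowly again near the end, which avoids the abrupt drops of a step schedule.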

## 5. Discussion

**F1 score of 0.924** with a threshold of **0.393**; this score measures the accuracy of our best model and is computed from precision and recall. The confusion matrix in Figure 8 shows how often the best model predicted correctly: 15,096 true positives, 15,189 true negatives, 1295 false negatives, and 1188 false positives. According to the experiments, even with less-than-ideally prepared training data, the final ensemble method exceeded **0.9691 AUC**.
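
As a consistency check, the reported confusion-matrix counts can be substituted into the standard definitions of precision, recall, and F1-score (assumed here to match Equations (2) to (4)):

$$\begin{aligned}
\mathrm{Precision} &= \frac{TP}{TP + FP} = \frac{15096}{15096 + 1188} \approx 0.9270,\\
\mathrm{Recall} &= \frac{TP}{TP + FN} = \frac{15096}{15096 + 1295} \approx 0.9210,\\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{30192}{32675} \approx 0.9240.
\end{aligned}$$

The resulting value agrees with the reported F1 score of 0.924.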

## 6. Conclusions

**Figure 3.** Random samples of the PatchCamelyon dataset stained with hematoxylin-eosin (H&E). Blue squares (healthy tissue) and red squares (cancerous tissue) mark the central regions.

**Figure 7.** Maximum F1 score reached on an optimized ensemble with the best model (threshold = 0.393).

Model | AUC (Area under the Curve), Using Augmentation | AUC, Not Using Augmentation |
---|---|---|
ResNet50 | 0.95001 | 0.93988 |
DenseNet121 | 0.95511 | 0.93780 |

Model | AUC (Area under the Curve), Using Augmentation | AUC, Not Using Augmentation |
---|---|---|
ResNet50 | 0.96501 | 0.99297 |
DenseNet121 | 0.98891 | 0.99971 |

Model | AUC (Area under the Curve), ImageNet Weights | AUC, Xavier Initialization Weights |
---|---|---|
DenseNet121 | 0.95672 | 0.94560 |
ResNet50 | 0.95078 | 0.94380 |
ResNet50 V2 | 0.95078 | 0.94380 |
MobileNetV1 | 0.94954 | 0.93855 |
MobileNetV2 | 0.95065 | 0.95395 |
Inception | 0.94697 | 0.94608 |
EfficientNetB0 | 0.95121 | 0.94608 |
EfficientNetB1 | 0.93876 | 0.94608 |
EfficientNetB0 V2 | 0.94570 | 0.75981 |
EfficientNetB1 V2 | 0.94287 | 0.79871 |

**Table 4.** MS-model results. New initialization indicates that the model weights were generated by Xavier initialization from a newly chosen random point.

Learning Iteration | AUC |
---|---|
Reusing weights | 0.95501 |
New initialization 1 | 0.95498 |
New initialization 2 | 0.95508 |
New initialization 3 | 0.95505 |

Optimizer | AUC |
---|---|
SGD | 0.95510 |
Adam | 0.95475 |
AdamW | 0.95515 |
Ranger | 0.95500 |

**Table 6.** Summary of results. The Difference column gives the AUC difference, in percentage points, between each ensemble type and the DenseNet121 starting point.

Ensemble Type | AUC (Area under the Curve) | Difference |
---|---|---|
DenseNet121 | 0.95672 | - |
M-model training 5 outputs together | 0.95405 | −0.267% |
M-model training 5 outputs separately | 0.95491 | −0.181% |
MS-model | 0.95508 | −0.164% |
MS-model with AdamW | 0.95515 | −0.157% |
MS-model with repeated training | 0.95911 | 0.239% |
MS-model TTA | 0.96870 | 1.198% |
MS-model ensemble | 0.96592 | 0.920% |
MS-model connecting weights | 0.96240 | 0.568% |
TTA + weights and models ensemble | 0.96922 | 1.250% |
MS-model after corrections | 0.96147 | 0.475% |
MS-model after corrections with repeated training | 0.96675 | 1.003% |
Group of ensembles from all experiments | 0.96977 | 1.305% |
Optimized ensemble based on the best model | 0.97673 | 2.001% |
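
The Difference column can be reproduced from the AUC values alone. The sketch below assumes the difference is the AUC minus the DenseNet121 baseline (0.95672), expressed in percentage points; the function name is hypothetical and only a few rows are shown.

```python
BASELINE_AUC = 0.95672  # DenseNet121 starting point from the table


def auc_difference_pct(auc, baseline=BASELINE_AUC):
    """AUC difference from the baseline, in percentage points (3 decimals)."""
    return round((auc - baseline) * 100, 3)


if __name__ == "__main__":
    rows = {
        "M-model training 5 outputs together": 0.95405,
        "M-model training 5 outputs separately": 0.95491,
        "MS-model with repeated training": 0.95911,
        "Optimized ensemble based on the best model": 0.97673,
    }
    for name, auc in rows.items():
        print(f"{name}: {auc_difference_pct(auc):+.3f}%")
```

Recomputing the column this way is a quick check that each listed difference matches its AUC value.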

