## 1. Introduction

## 2. Results and Discussion

A scan area of 10 × 10 µm² is the most effective AFM scan size for discriminating between normal and cancerous cells. Figure 1 shows examples of typical height and adhesion images of this size for cancerous and precancerous cells. The main conclusion from these images is that the two cell phenotypes are difficult to tell apart by visual inspection alone. To discriminate between normal and cancerous cells quantitatively, it was suggested to investigate the fractal geometry of the cell surface and of its adhesion maps [18,19]. The same studies showed that height images provide very little discriminating power compared with adhesion maps. We therefore do not use the height images to analyze the differences between cancerous and precancerous cells and focus on the adhesion images instead.

## 3. Methods

#### 3.1. Cells and AFM Imaging

A 10 × 10 µm² scanning area was chosen at random for each cell to avoid any possible operator bias.

#### 3.2. Surface Parameters Used in This Study

A 10 × 10 µm² adhesion map was recorded per cell at 512 × 512 pixel resolution. The map was then divided into four 5 × 5 µm² quadrants at 256 × 256 pixel resolution each. The surface parameters were determined for each quadrant. If the average and median values of the four quadrants differed by more than 50%, the cell map was visually checked for possible artifacts (a piece of dirt on the cell surface or dirt picked up by the AFM probe). If an artifact was identified in a quadrant, that quadrant was removed from consideration. The surface parameters were then averaged over the remaining quadrants of each cell.
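The quadrant procedure above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, the map is assumed to be a square list of rows, and the 50% criterion is assumed to be measured relative to the median, which the text does not specify. The visual inspection itself stays manual; the sketch only flags cells for it and averages over the quadrants that were kept.

```python
from statistics import mean, median


def split_into_quadrants(adhesion_map):
    """Split a square 2D map (list of rows) into four equal quadrants."""
    half = len(adhesion_map) // 2
    quadrants = []
    for r0 in (0, half):
        for c0 in (0, half):
            quadrants.append(
                [row[c0:c0 + half] for row in adhesion_map[r0:r0 + half]]
            )
    return quadrants


def needs_visual_check(values, threshold=0.5):
    """Return True if the quadrant mean and median differ by more than `threshold`.

    ASSUMPTION: the difference is taken relative to the median; the source
    only says the two values "differed by more than 50%".
    """
    avg, med = mean(values), median(values)
    if med == 0:
        return avg != 0
    return abs(avg - med) / abs(med) > threshold


def average_over_kept(values, rejected=()):
    """Average one surface parameter over the quadrants that were not rejected."""
    kept = [v for i, v in enumerate(values) if i not in rejected]
    return mean(kept)
```

For a 512 × 512 map, `split_into_quadrants` returns four 256 × 256 blocks. A surface parameter would be computed on each block, passed to `needs_visual_check`, and averaged with `average_over_kept` once any artifact quadrants have been identified by eye.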

…$_{kl}$ is defined as […], where $N_x$ and $N_y$ are the numbers of pixels in the x and y directions, and $u$ and $v$ are the discrete Fourier indices: $u = 0, 1, 2, \ldots, N_x - 1$ and $v = 0, 1, 2, \ldots, N_y - 1$.

[…]$^{b}$. Specifically, the fractal dimension was defined as 2 − b. Two fractal dimensions were calculated, below (Sfd_top) and above (Sfd_bottom) Q = 1/300 nm⁻¹. Both fractal dimensions were used as two separate parameters in the machine learning analysis described in this work.
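As an illustration of how the two fractal dimensions might be extracted, the sketch below fits a power-law exponent b on each side of the crossover and returns 2 − b. Several things here are assumptions rather than the authors' method: that the fitted quantity is a Fourier magnitude tabulated against the spatial frequency Q, that the fit is an ordinary least-squares line in log-log coordinates, and all function names.

```python
import math

# Crossover spatial frequency from the text, in nm^-1.
Q_CROSSOVER = 1.0 / 300.0


def fit_power_law_exponent(q_values, amplitudes):
    """Least-squares slope b of log(amplitude) versus log(Q)."""
    if len(q_values) < 2:
        raise ValueError("need at least two points to fit a slope")
    xs = [math.log(q) for q in q_values]
    ys = [math.log(a) for a in amplitudes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return sxy / sxx


def fractal_dimensions(q_values, amplitudes, q_cross=Q_CROSSOVER):
    """Return Sfd_top (Q below q_cross) and Sfd_bottom (Q above), each as 2 - b.

    ASSUMPTION: `amplitudes` holds positive spectral magnitudes sampled at
    `q_values`; the source does not spell out how the spectrum is reduced to 1D.
    """
    low = [(q, a) for q, a in zip(q_values, amplitudes) if q < q_cross]
    high = [(q, a) for q, a in zip(q_values, amplitudes) if q >= q_cross]
    b_low = fit_power_law_exponent(*zip(*low))
    b_high = fit_power_law_exponent(*zip(*high))
    return {"Sfd_top": 2.0 - b_low, "Sfd_bottom": 2.0 - b_high}
```

On synthetic data that follows a pure power law in each range, the function recovers the exponents exactly up to rounding, which is a convenient sanity check before applying it to measured spectra.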

**Figure 1.** Examples of typical 10 × 10 µm² AFM images of precancerous and cancerous cells used in this study.

**Figure 2.** Schematics of the machine learning (ML) analysis: conversion of the AFM images into surface parameters; splitting of the database into training and testing subsets; development of an ML algorithm using only the training subset; statistical analysis of the developed ML algorithm using the testing subset; and, finally, cross-validation and verification of the lack of overtraining of the developed approach.
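The splitting step of this scheme could be sketched as below. This is only an illustration under stated assumptions: the split is taken to be stratified by class, the 30% test fraction is a placeholder rather than a value from the study, and `stratified_split` is a hypothetical helper name. Calling it with different seeds gives the repeated random splits that a cross-validation loop would iterate over.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=None):
    """Return (train_indices, test_indices) with each class split separately.

    ASSUMPTION: stratification and the default test_fraction are illustrative
    choices, not parameters reported in the source.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test.extend(idx[:n_test])
        train.extend(idx[n_test:])
    return sorted(train), sorted(test)


def repeated_splits(labels, n_repeats, test_fraction=0.3):
    """Generate one independent random split per repetition."""
    return [stratified_split(labels, test_fraction, seed=s) for s in range(n_repeats)]
```

Only the training indices would be used to build the classifier; the testing indices are held back for the statistics, so no information from the test subset reaches the training step.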

**Figure 3.** Results of the ML analysis of the difference between precancerous and cancerous cells: (**a**) confusion matrix, (**b**) ROC curves, and (**c**) histogram of the areas under the curve (AUC).

**Figure 4.** Further verification of the lack of overtraining of the ML algorithm used in this work, with shuffled class assignment of the testing dataset: (**a**) confusion matrix, (**b**) ROC curves, and (**c**) histogram of the areas under the curve (AUC). These AUC data were also used to find the statistical significance of the results shown in Figure 3.
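The shuffled-label check can be illustrated with a short sketch. It computes the AUC from classifier scores with the rank-based (Mann-Whitney) formula, rebuilds the AUC distribution after randomly permuting the class labels, and derives an empirical p-value from it. The p-value formula is an assumption, since the text does not state how significance was obtained from the shuffled AUC data, and the function names are hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 1/2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def shuffled_aucs(labels, scores, n_shuffles=1000, seed=0):
    """AUC values obtained after randomly permuting the class labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    aucs = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        aucs.append(roc_auc(shuffled, scores))
    return aucs


def empirical_p_value(observed_auc, null_aucs):
    """ASSUMPTION: fraction of shuffled AUCs at least as large as the observed one."""
    exceed = sum(1 for a in null_aucs if a >= observed_auc)
    return (exceed + 1) / (len(null_aucs) + 1)
```

If the classifier has learned a real difference, the shuffled AUC values cluster around 0.5 and the observed AUC lies in the far tail of that histogram, giving a small p-value.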

