# Principal Component Analysis-Based Logistic Regression for Rotated Handwritten Digit Recognition in Consumer Devices


## Abstract


## 1. Introduction

- Better robustness of the trained model. A model trained via the proposed method can recognize rotated images, so it continues to work well even when the incoming data are affected by rotation.
- Faster training and higher real-time recognition frequency. By using PCA to reduce the data dimension, the model recognizes images with far fewer features, which makes it easier to train and reduces the memory required for implementation.
- Higher accuracy than many other classifiers. The PCA-based logistic regression presented in this article resolves the problem of orientation uncertainty, which leads to higher accuracy on the testing data.
- Owing to its light computational complexity, the developed handwritten digit recognition algorithm can be realized in an embedded system or integrated into current consumer 3C devices.
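The overall pipeline implied by these contributions can be sketched in a few lines: center the data, project it onto the leading principal components, and train a multinomial logistic regression on the reduced features. The snippet below is a minimal NumPy illustration only; the toy random data, the number of components `k = 40`, and the step counts are assumptions for demonstration, not the paper's actual MNIST settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for flattened 28x28 digit images (784 features), 3 classes.
X = rng.normal(size=(300, 784))
y = rng.integers(0, 3, size=300)

# --- PCA: project onto the top-k principal components ---
k = 40
Xc = X - X.mean(axis=0)                    # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:k].T                          # reduced features, shape (300, k)

# --- Multinomial logistic regression via gradient descent ---
W = np.zeros((k, 3))
Y = np.eye(3)[y]                           # one-hot labels
for _ in range(200):
    logits = Z @ W
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)             # softmax probabilities
    W -= 0.1 / len(Z) * Z.T @ (P - Y)             # cross-entropy gradient step

pred = (Z @ W).argmax(axis=1)              # class predictions on the reduced data
```

Because the classifier only ever sees the `k`-dimensional projections, both the training cost and the stored model size scale with `k` rather than with the raw 784-pixel dimension, which is the source of the speed and memory advantages claimed above.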

## 2. Methodology

#### 2.1. Description of Research Roadmap

#### 2.2. Logistic Regression

#### 2.3. Principal Component Analysis

#### 2.4. Image Straightening Using PCA
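The idea named by this section can be sketched as follows: treat the digit's foreground pixel coordinates as a 2-D point cloud, take the first principal component as the stroke's dominant direction, and rotate the cloud so that this axis becomes vertical. The synthetic tilted stroke and all parameter values below are illustrative assumptions, not the paper's data or exact procedure.

```python
import numpy as np

# Synthetic "tilted stroke": foreground pixel coordinates along a 25-degree line.
t = np.linspace(-10, 10, 200)
theta = np.deg2rad(25)
coords = np.stack([t * np.cos(theta), t * np.sin(theta)], axis=1)
coords += np.random.default_rng(1).normal(scale=0.3, size=coords.shape)

# PCA on the (x, y) pixel coordinates: the eigenvector of the covariance
# matrix with the largest eigenvalue points along the dominant stroke direction.
centered = coords - coords.mean(axis=0)
cov = centered.T @ centered / len(centered)
eigvals, eigvecs = np.linalg.eigh(cov)
principal = eigvecs[:, np.argmax(eigvals)]

# Rotate so the principal axis becomes vertical (an upright digit).
angle = np.arctan2(principal[0], principal[1])  # angle measured from the y-axis
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, -s], [s, c]])
straightened = centered @ R.T                   # variance now concentrated on y
```

Note that the sign of an eigenvector is arbitrary, so this alignment is only defined up to a 180-degree flip; distinguishing, say, a 6 from a 9 requires extra handling beyond this sketch.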

#### 2.5. PCA Improvement

#### 2.6. Image Smoothing
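A common smoothing operation that fits this section is the median filter, which suppresses isolated noisy pixels while preserving stroke edges. The following is a minimal NumPy sketch; the window size and the test image are assumptions for demonstration, not the paper's settings.

```python
import numpy as np

def median_smooth(img, size=3):
    """Median-filter a 2-D array with edge padding; window is size x size."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

# An isolated salt-noise pixel on a flat patch is removed by the median.
img = np.full((8, 8), 0.5)
img[3, 3] = 1.0            # isolated noisy pixel
smoothed = median_smooth(img)
```

Unlike a mean filter, the median is insensitive to a single outlier inside the window, so the spurious pixel vanishes without blurring the surrounding values.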

## 3. Experiment Background

#### 3.1. Data Image

#### 3.2. Hardware Specification

#### 3.3. Real-Time Recognition System

#### 3.4. Comparison Study Platform

## 4. Experimental Results

#### 4.1. MNIST Pre-Processing

#### 4.2. MNIST Recognition Result

## 5. Comparison Study

## 6. Conclusions

## Supplementary Materials

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Appendix A

## References


**Figure 1.** Handwritten digit recognition system in a mobile phone. The images in red blocks represent the recognition result of the system. (**a**) The system correctly recognizes the digit. (**b**) The system fails to recognize the digit subjected to rotations.

**Figure 4.** (**a**) The raw dataset $P$; (**b**) the dataset expressed in the new basis. The linear transformation is determined by setting the first two principal components (the two arrows) as the new basis.

**Figure 6.** Examples of the two main types of data in the error set: (**a**) represents the orientation uncertainty, and (**b**) represents the horizontal digit problem.

**Figure 13.** Image pre-processing of MNIST. The images on the top are the raw images derived from MNIST, while those at the bottom are the pre-processed images.

| Item | Information |
|---|---|
| System | Windows 10 22H2 |
| Software | MATLAB 2023a |
| CPU | 11th Gen Intel(R) Core(TM) i7-11800H @ 2.30 GHz |
| RAM | 16 GB |
| Handwritten Board | Bamboo CTH-470 |

| Model | Training Time (Sec) | Validation Acc (%) | Testing Acc (%) | Model Size (kB) |
|---|---|---|---|---|
| Fine Tree | 78.98 | 71.8 | 41.5 | 156 |
| Medium Tree | 73.60 | 61.2 | 34.7 | 115 |
| Coarse Tree | 58.84 | 36.0 | 21.8 | 107 |
| Linear Discriminant | 401.89 | 81.8 | 49.8 | 1000 |
| Quadratic Discriminant | Failed | N/A | N/A | N/A |
| Efficient Logistic Regression | 631.88 | 83.9 | 59.8 | 6000 |
| Efficient Linear Support Vector Machine | 432.25 | 85.9 | 56.8 | 10,000 |
| Gaussian Naïve Bayes | Failed | N/A | N/A | N/A |
| Linear Support Vector Machine | 7890.60 | 90.0 | 59.1 | 6000 |
| Quadratic Support Vector Machine | 9528.80 | 94.3 | 66.2 | 220,000 |
| Cubic Support Vector Machine | 12,029.00 | 95.0 | 65.1 | 218,000 |
| Fine Gaussian Support Vector Machine | 94,873.00 | 56.3 | 10.4 | 3,000,000 |
| Medium Gaussian Support Vector Machine | 25,055.00 | 91.7 | 37.7 | 493,000 |
| Coarse Gaussian Support Vector Machine | 27,872.00 | 88.7 | 56.9 | 543,000 |
| Fine K Nearest Neighbor | 7052.60 | 90.9 | 61.2 | 360,000 |
| Medium K Nearest Neighbor | 21,222.00 | 91.2 | 62.3 | 360,000 |
| Coarse K Nearest Neighbor | 25,057.00 | 86.6 | 59.3 | 360,000 |
| Cosine K Nearest Neighbor | 28,134.00 | 90.8 | 57.5 | 360,000 |
| Cubic K Nearest Neighbor | 68,775.00 | 87.6 | 43.3 | 360,000 |
| Weighted K Nearest Neighbor | 36,165.00 | 91.5 | 62.5 | 360,000 |
| Boosted Trees | 28,201.00 | 73.8 | 45.6 | 3000 |
| Bagged Trees | 47,474.00 | 91.9 | 64.2 | 60,000 |
| Subspace Discriminant | 32,226.00 | 32.3 | 51.9 | 75,000 |
| Subspace K Nearest Neighbor | 98,531.00 | 95.2 | 80.7 | 5,000,000 |
| RUSBoosted Trees | 31,726.00 | 65.3 | 35.4 | 3000 |
| Narrow Neural Network | 38,601.00 | 85.8 | 45.5 | 181 |
| Medium Neural Network | 34,970.00 | 89.6 | 49.5 | 274 |
| Wide Neural Network | 36,564.00 | 94.2 | 66.7 | 740 |
| Bilayered Neural Network | 43,035.00 | 85.8 | 46.1 | 183 |
| Trilayered Neural Network | 43,716.00 | 86.1 | 48.2 | 184 |
| Support Vector Machine Kernel | 107,320.00 | 95.3 | 75.6 | 1,600,000 |
| Logistic Regression Kernel | 88,626.00 | 92.6 | 68.7 | 1,600,000 |
| PCA-Based Logistic Regression | 15.20 | 87.2 | 73.5 | 4628 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Peng, C.-C.; Huang, C.-Y.; Chen, Y.-H.
Principal Component Analysis-Based Logistic Regression for Rotated Handwritten Digit Recognition in Consumer Devices. *Electronics* **2023**, *12*, 3809.
https://doi.org/10.3390/electronics12183809

**AMA Style**

Peng C-C, Huang C-Y, Chen Y-H.
Principal Component Analysis-Based Logistic Regression for Rotated Handwritten Digit Recognition in Consumer Devices. *Electronics*. 2023; 12(18):3809.
https://doi.org/10.3390/electronics12183809

**Chicago/Turabian Style**

Peng, Chao-Chung, Chao-Yang Huang, and Yi-Ho Chen.
2023. "Principal Component Analysis-Based Logistic Regression for Rotated Handwritten Digit Recognition in Consumer Devices" *Electronics* 12, no. 18: 3809.
https://doi.org/10.3390/electronics12183809