# Distance Metric Optimization-Driven Neural Network Learning Framework for Pattern Classification


## Abstract


## 1. Introduction

The main contributions of this paper are summarized as follows:

- (1) By embedding the capped ${L}_{2,p}$-norm metric distance and the Welsch loss into TELM, we propose a novel robust learning algorithm, the Capped ${L}_{2,p}$-norm Welsch Robust Twin Extreme Learning Machine (CWTELM). CWTELM enhances robustness while retaining the strengths of TELM, so classification performance is also improved;
- (2) To speed up the computation of CWTELM while preserving its advantages, we present a least-squares version of CWTELM, namely Fast CWTELM (FCWTELM). FCWTELM inherits the merits of CWTELM and replaces the inequality constraints with equality constraints, so that training reduces to solving two systems of linear equations, which greatly lowers the computational cost;
- (3) Two efficient iterative algorithms are designed to solve CWTELM and FCWTELM. They are easy to implement and theoretically guarantee a sound optimization procedure; we also carry out a rigorous analysis and proof of the convergence of both algorithms;
- (4) Extensive experiments on various datasets under different noise proportions demonstrate that CWTELM and FCWTELM are competitive with five traditional classification methods in terms of robustness and practicality;
- (5) A statistical analysis further verifies that CWTELM and FCWTELM surpass the five other classifiers in robustness and classification performance.
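For reference, the two robustness ingredients named in contribution (1) can be sketched as follows. This is a minimal illustration only: the scale parameter `sigma` and the cap `eps` are hypothetical choices, and the exact forms used in the paper may differ by constants.

```python
import numpy as np

def welsch_loss(e, sigma=1.0):
    """Welsch (correntropy-induced) loss: bounded above by sigma**2 / 2,
    so an arbitrarily large outlier residual has limited influence."""
    return (sigma**2 / 2.0) * (1.0 - np.exp(-(e**2) / sigma**2))

def capped_l2p_distance(r, p=1.0, eps=2.0):
    """Capped L_{2,p} distance of residual vectors r (one per row):
    min(||r_i||_2^p, eps) truncates the contribution of any single point."""
    d = np.linalg.norm(np.atleast_2d(r), axis=1) ** p
    return np.minimum(d, eps)
```

Both functions are bounded, which is the mechanism behind the robustness claims: a squared loss grows without limit on outliers, whereas these saturate.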

## 2. Related Work

#### 2.1. TELM

#### 2.2. LS-TELM

#### 2.3. Welsch Loss Function

#### 2.4. RTELM

#### 2.5. Capped ${L}_{2,p}$-Norm

## 3. Main Contribution

#### 3.1. CWTELM

**Theorem 1.**

**Proof.**

#### 3.2. FCWTELM

#### 3.3. Convergence Analysis

**Lemma 1.**

**Lemma 2.**

**Lemma 3.**

**Theorem 2.**

**Algorithm 1** Training CWTELM.

Input: Training set ${T}_{1}={\left\{\left({x}_{i},{y}_{i}\right)\right\}}_{i=1}^{l}$, where ${x}_{i}\in {R}^{n}$ and ${y}_{i}\in \left\{-1,+1\right\}$; activation function $G\left(x\right)$; the number of hidden nodes $L$; the parameters ${C}_{1}$, ${C}_{2}$, ${\epsilon}_{1}$, ${\epsilon}_{2}$, ${\epsilon}_{3}$, ${\epsilon}_{4}$, ${\delta}_{1}$ and ${\delta}_{2}$.

Output: ${\beta}_{1}^{*}$ and ${\beta}_{2}^{*}$.

Process:

1. Initialize $F\in {\mathbb{R}}^{{m}_{1}\times {m}_{1}}$, $Q\in {\mathbb{R}}^{{m}_{2}\times {m}_{2}}$, $K\in {\mathbb{R}}^{{m}_{2}\times {m}_{2}}$ and $U\in {\mathbb{R}}^{{m}_{1}\times {m}_{1}}$;
2. Compute the dual variables $\alpha$ and $\beta$;
3. Calculate ${Z}_{1}$ and ${Z}_{2}$ via ${Z}_{1}=-{({H}^{T}FH+{C}_{3}I)}^{-1}{E}^{T}\alpha$ and ${Z}_{2}={({E}^{T}KE+{C}_{4}I)}^{-1}{H}^{T}\beta$;
4. Update the matrices $Q$, $U$, $F$ and $K$ accordingly;
5. Repeat steps 2–4 until the convergence tolerances are met.
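The loop structure of Algorithm 1 can be sketched as follows. This is an illustrative fixed-point iteration only, not the paper's exact method: the Welsch-style reweighting used for `F` and `K` below is a hypothetical stand-in for the paper's update rules, and the dual variables `alpha`, `beta` are held fixed for simplicity.

```python
import numpy as np

def train_cwtelm_sketch(H, E, C3=1.0, C4=1.0, tol=1e-4, max_iter=50):
    """Skeleton of Algorithm 1: initialize weight matrices, solve the two
    regularized linear systems for Z1/Z2, then re-update the weights and
    iterate until the solutions stop changing."""
    m1, L = H.shape           # hidden-layer outputs of one class
    m2, _ = E.shape           # hidden-layer outputs of the other class
    F, K = np.eye(m1), np.eye(m2)          # initial (identity) reweighting
    alpha, beta = np.ones(m2), np.ones(m1)  # fixed placeholder duals
    Z1, Z2 = np.zeros(L), np.zeros(L)
    for _ in range(max_iter):
        # Step 3 of Algorithm 1: two regularized linear solves
        Z1_new = -np.linalg.solve(H.T @ F @ H + C3 * np.eye(L), E.T @ alpha)
        Z2_new = np.linalg.solve(E.T @ K @ E + C4 * np.eye(L), H.T @ beta)
        done = max(np.linalg.norm(Z1_new - Z1), np.linalg.norm(Z2_new - Z2)) < tol
        Z1, Z2 = Z1_new, Z2_new
        if done:
            break
        # Step 4 (hypothetical): Welsch-style weights down-weight large residuals
        F = np.diag(np.exp(-(H @ Z1) ** 2))
        K = np.diag(np.exp(-(E @ Z2) ** 2))
    return Z1, Z2
```

The point of the sketch is the alternating structure: each pass solves closed-form linear systems given the current weights, then refreshes the weights from the new residuals, which is what makes the convergence analysis of Section 3.3 tractable.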

**Proof.**

## 4. Experiments

#### 4.1. Experiments Setup

#### 4.2. Artificial Dataset

#### 4.3. UCI Dataset

#### 4.4. Experimental Results on the UCI Datasets without Outliers

#### 4.5. Robustness against Outliers

#### 4.6. Experimental Results on Artificial Dataset with Outliers

#### 4.7. Statistical Analysis

## 5. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References


Abbreviated Form | Complete Form
---|---
SLFNs | Single-hidden Layer Feedforward Neural Networks
ELM | Extreme Learning Machine
TELM | Twin Extreme Learning Machine
TSVM | Twin Support Vector Machine
LS-TELM | Least Squares Twin Extreme Learning Machine
CHELM | Correntropy-based Robust Extreme Learning Machine
RSS-ELM | Robust Semi-supervised Extreme Learning Machine
${L}_{1}$-NPELM | Non-parallel Proximal Extreme Learning Machine
${L}_{1}$-TELM | Robust ${L}_{1}$-norm Twin Extreme Learning Machine
SVC | Capped ${L}_{2,p}$-norm Support Vector Classification
C${L}_{2,p}$-LSTELM | Capped ${L}_{2,p}$-norm Least Squares Twin Extreme Learning Machine
RTELM | Robust Supervised Twin Extreme Learning Machine
CTSVM | Capped ${L}_{1}$-norm Twin Support Vector Machine
CWTELM | Capped ${L}_{2,p}$-norm Welsch Twin Extreme Learning Machine
FCWTELM | Fast Capped ${L}_{2,p}$-norm Welsch Twin Extreme Learning Machine
LCFTELM | Robust Twin Extreme Learning Machine with Correntropy-based Metric
ACC | Accuracy
TP | True Positives
TN | True Negatives
FN | False Negatives
FP | False Positives
CD | Critical Difference
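The accuracy metric (ACC) abbreviated above is the usual ratio of correct predictions to all predictions, computed from the four confusion-matrix counts:

```python
def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + tn + fp + fn)

# e.g. 50 true positives, 40 true negatives, 5 of each error type
print(accuracy(50, 40, 5, 5))  # 0.9
```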

Symbol | Meaning
---|---
$R$ | The set of real numbers
${R}^{n}$ | Real n-dimensional vector space
${R}^{n\times n}$ | The linear space of real $n\times n$ matrices
$|\cdot|$ | Perpendicular distance of a data point x from the hyperplane
${\parallel x\parallel}_{1}$ | The 1-norm of vector x
${\parallel x\parallel}_{2}$ | The 2-norm of vector x
${\parallel x\parallel}_{2}^{2}$ | Square of the 2-norm of vector x
${\parallel x\parallel}_{p}$ | The p-norm of vector x
${\parallel A\parallel}_{1}$ | The 1-norm of matrix A
${\parallel A\parallel}_{2}$ | The 2-norm of matrix A
${A}^{T}$ | The transpose of matrix A
${A}^{-1}$ | The inverse of matrix A
$\tau$ | Training set
$l$ | Number of samples in the training set
${y}_{i}$ | Label of ${x}_{i}$, ${y}_{i}\in \left\{+1,-1\right\}$
${H}_{1}$ | Hidden-layer output of the samples belonging to the positive class
${H}_{2}$ | Hidden-layer output of the samples belonging to the negative class
$f\left(x\right)$ | Decision function

Datasets | Samples | Attributes | Datasets | Samples | Attributes
---|---|---|---|---|---
Australian | 690 | 14 | Cancer | 699 | 9
Balance | 576 | 4 | Wholesale | 440 | 7
Vote | 432 | 16 | WDBC | 569 | 30
QSAR | 1055 | 41 | Pima | 768 | 8

Datasets | Metric | ELM | CHELM | TELM | CTSVM | RTELM | CWTELM | FCWTELM
---|---|---|---|---|---|---|---|---
Australian | ACC (%) | 85.74 | 86.53 | 86.69 | 84.93 | 86.58 | $\mathbf{88.24}$ | $\mathbf{86.70}$
 | Times (s) | $\mathbf{1.541}$ | 4.561 | 2.093 | 3.466 | 5.125 | 6.847 | $\mathbf{0.536}$
Balance | ACC (%) | 85.11 | 91.04 | 90.41 | 89.29 | $\mathbf{94.64}$ | $\mathbf{96.43}$ | 91.07
 | Times (s) | $\mathbf{1.739}$ | 4.543 | 3.112 | 3.097 | 4.381 | 4.853 | $\mathbf{0.427}$
Vote | ACC (%) | 94.58 | 95.60 | 95.48 | 95.58 | $\mathbf{95.81}$ | $\mathbf{97.62}$ | 92.65
 | Times (s) | 1.043 | 4.547 | $\mathbf{0.901}$ | 9.310 | 6.234 | 4.654 | $\mathbf{0.587}$
Cancer | ACC (%) | 80.61 | 86.43 | 86.33 | 86.88 | 90.75 | $\mathbf{94.20}$ | $\mathbf{91.30}$
 | Times (s) | 1.706 | 5.013 | $\mathbf{0.873}$ | 2.771 | 4.274 | 6.256 | $\mathbf{0.581}$
Wholesale | ACC (%) | 75.07 | 74.31 | 74.56 | 73.49 | 81.40 | $\mathbf{86.05}$ | $\mathbf{81.44}$
 | Times (s) | 1.476 | 4.675 | $\mathbf{0.937}$ | 2.819 | 3.948 | 4.123 | $\mathbf{0.369}$
QSAR | ACC (%) | 84.43 | 81.66 | 86.87 | $\mathbf{88.31}$ | 87.64 | $\mathbf{88.46}$ | 87.50
 | Times (s) | 1.541 | 3.043 | $\mathbf{0.629}$ | 7.856 | 7.798 | 11.437 | $\mathbf{0.798}$
Pima | ACC (%) | 77.76 | 76.78 | 78.01 | 72.68 | $\mathbf{78.86}$ | $\mathbf{79.01}$ | 76.32
 | Times (s) | 2.674 | 3.622 | $\mathbf{1.316}$ | 6.047 | 7.664 | 6.492 | $\mathbf{0.772}$
WDBC | ACC (%) | $\mathbf{95.85}$ | 95.32 | 95.55 | 95.13 | 95.21 | $\mathbf{98.21}$ | 94.64
 | Times (s) | 1.435 | 8.951 | $\mathbf{1.225}$ | 9.549 | 6.449 | 5.224 | $\mathbf{0.454}$

Datasets | Metric | ELM | CHELM | TELM | CTSVM | RTELM | CWTELM | FCWTELM
---|---|---|---|---|---|---|---|---
Australian | ACC (%) | 79.83 | 80.21 | 81.03 | 80.45 | 81.98 | $\mathbf{85.29}$ | $\mathbf{82.35}$
 | Times (s) | $\mathbf{1.523}$ | 4.631 | 2.143 | 3.487 | 5.187 | 6.473 | $\mathbf{0.521}$
Balance | ACC (%) | 83.32 | 84.43 | 83.21 | 87.23 | 85.71 | $\mathbf{91.07}$ | $\mathbf{89.29}$
 | Times (s) | $\mathbf{1.909}$ | 4.876 | 2.453 | 4.417 | 4.418 | 4.497 | $\mathbf{0.426}$
Vote | ACC (%) | 93.57 | 92.22 | 94.65 | 95.01 | $\mathbf{95.43}$ | $\mathbf{95.24}$ | 91.35
 | Times (s) | 0.978 | 3.872 | $\mathbf{0.376}$ | 9.654 | 5.503 | 4.717 | $\mathbf{0.587}$
Cancer | ACC (%) | 79.36 | 83.46 | 85.48 | 85.36 | 84.06 | $\mathbf{89.86}$ | $\mathbf{86.96}$
 | Times (s) | 1.758 | 5.001 | $\mathbf{0.773}$ | 2.608 | 4.316 | 6.354 | $\mathbf{0.590}$
Wholesale | ACC (%) | 74.47 | 75.31 | 73.56 | 73.14 | 76.64 | $\mathbf{83.72}$ | $\mathbf{78.37}$
 | Times (s) | 1.476 | 4.657 | $\mathbf{0.879}$ | 2.892 | 4.063 | 4.150 | $\mathbf{0.373}$
QSAR | ACC (%) | 73.61 | 72.43 | 79.64 | 78.79 | 84.31 | $\mathbf{85.58}$ | $\mathbf{84.62}$
 | Times (s) | $\mathbf{1.931}$ | 6.778 | 2.789 | 9.852 | 10.754 | 11.198 | $\mathbf{0.763}$
Pima | ACC (%) | 72.21 | 73.45 | 73.47 | 70.38 | 75.91 | $\mathbf{76.32}$ | $\mathbf{84.62}$
 | Times (s) | 2.013 | 3.023 | $\mathbf{1.482}$ | 6.924 | 6.765 | 6.714 | $\mathbf{0.768}$
WDBC | ACC (%) | 88.53 | 89.26 | 87.63 | 91.43 | $\mathbf{92.31}$ | $\mathbf{96.43}$ | 91.07
 | Times (s) | 1.238 | 7.693 | $\mathbf{0.924}$ | 8.988 | 5.973 | 6.608 | $\mathbf{0.667}$

Datasets | Metric | ELM | CHELM | TELM | CTSVM | RTELM | CWTELM | FCWTELM
---|---|---|---|---|---|---|---|---
Australian | ACC (%) | 76.48 | 78.85 | 78.91 | 79.80 | $\mathbf{81.12}$ | $\mathbf{83.82}$ | 80.88
 | Times (s) | 1.586 | 6.683 | $\mathbf{0.774}$ | 9.710 | 8.740 | 7.704 | $\mathbf{0.583}$
Balance | ACC (%) | 79.11 | 81.39 | 82.01 | 84.91 | 82.14 | $\mathbf{87.50}$ | $\mathbf{85.71}$
 | Times (s) | $\mathbf{1.679}$ | 5.460 | 2.489 | 3.345 | 4.392 | 4.871 | $\mathbf{0.424}$
Vote | ACC (%) | 92.51 | 90.23 | 93.76 | 94.19 | 94.43 | $\mathbf{95.24}$ | $\mathbf{95.36}$
 | Times (s) | 0.990 | 3.856 | $\mathbf{0.401}$ | 9.348 | 5.607 | 5.321 | $\mathbf{0.506}$
Cancer | ACC (%) | 79.25 | 80.15 | 82.67 | 85.29 | 84.06 | $\mathbf{86.96}$ | $\mathbf{85.51}$
 | Times (s) | 1.739 | 5.203 | $\mathbf{0.798}$ | 2.661 | 4.346 | 6.952 | $\mathbf{0.411}$
Wholesale | ACC (%) | 74.19 | 74.91 | 73.06 | 72.21 | 74.22 | $\mathbf{81.40}$ | $\mathbf{76.74}$
 | Times (s) | 1.330 | 4.857 | $\mathbf{0.896}$ | 3.412 | 3.953 | 4.119 | $\mathbf{0.374}$
QSAR | ACC (%) | 64.87 | 65.44 | 68.44 | 68.56 | 76.92 | $\mathbf{83.65}$ | $\mathbf{79.81}$
 | Times (s) | $\mathbf{2.245}$ | 10.212 | 4.443 | 11.387 | 12.876 | 10.844 | $\mathbf{0.809}$
Pima | ACC (%) | 65.87 | 65.80 | 66.32 | 71.75 | $\mathbf{73.31}$ | $\mathbf{73.68}$ | 71.50
 | Times (s) | 1.746 | 2.498 | $\mathbf{1.090}$ | 7.095 | 5.198 | 6.329 | $\mathbf{0.931}$
WDBC | ACC (%) | 84.76 | 85.37 | 82.08 | 82.75 | 85.56 | $\mathbf{92.86}$ | $\mathbf{89.29}$
 | Times (s) | 1.389 | 8.918 | $\mathbf{1.124}$ | 9.330 | 6.499 | 5.235 | $\mathbf{0.509}$

Datasets | Metric | ELM | CHELM | TELM | CTSVM | RTELM | CWTELM | FCWTELM
---|---|---|---|---|---|---|---|---
Australian | ACC (%) | 66.79 | 69.85 | 65.66 | 74.65 | $\mathbf{76.76}$ | $\mathbf{83.82}$ | 76.47
 | Times (s) | 1.413 | 6.747 | $\mathbf{1.406}$ | 9.552 | 7.895 | 7.842 | $\mathbf{0.724}$
Balance | ACC (%) | 78.39 | 80.11 | 81.12 | 85.46 | 85.71 | $\mathbf{87.50}$ | $\mathbf{85.72}$
 | Times (s) | $\mathbf{1.710}$ | 2.539 | 3.411 | 3.279 | 4.494 | 4.664 | $\mathbf{0.453}$
Vote | ACC (%) | 93.82 | 92.23 | 91.10 | 93.18 | 93.64 | $\mathbf{93.86}$ | $\mathbf{93.98}$
 | Times (s) | 0.893 | 3.653 | $\mathbf{0.664}$ | 9.118 | 7.235 | 5.321 | $\mathbf{0.508}$
Cancer | ACC (%) | 79.25 | 80.36 | 81.32 | 84.86 | 81.16 | $\mathbf{85.61}$ | $\mathbf{84.97}$
 | Times (s) | 1.739 | 5.210 | $\mathbf{0.799}$ | 2.503 | 4.332 | 6.863 | $\mathbf{0.404}$
Wholesale | ACC (%) | 74.00 | 74.01 | 73.06 | 70.35 | 72.09 | $\mathbf{79.07}$ | $\mathbf{74.42}$
 | Times (s) | 1.404 | 5.110 | $\mathbf{0.789}$ | 3.020 | 4.066 | 4.173 | $\mathbf{0.374}$
QSAR | ACC (%) | 63.72 | 67.69 | 65.77 | 73.65 | 70.01 | $\mathbf{74.04}$ | $\mathbf{75.00}$
 | Times (s) | $\mathbf{0.441}$ | 10.357 | 4.499 | 9.336 | 12.876 | 11.357 | $\mathbf{0.790}$
Pima | ACC (%) | 65.83 | 65.59 | 65.29 | 70.39 | $\mathbf{70.66}$ | $\mathbf{73.68}$ | 69.74
 | Times (s) | 1.746 | 2.202 | $\mathbf{1.045}$ | 6.691 | 7.836 | 6.394 | $\mathbf{0.860}$
WDBC | ACC (%) | 77.56 | 75.57 | 79.09 | 80.12 | 84.64 | $\mathbf{85.71}$ | $\mathbf{85.71}$
 | Times (s) | $\mathbf{1.401}$ | 8.631 | 1.553 | 9.407 | 6.639 | 5.248 | $\mathbf{0.435}$

**Table 8.** Average accuracy and ranking of the seven algorithms on the UCI datasets with different noise proportions.

 | ELM | CHELM | TELM | CTSVM | RTELM | CWTELM | FCWTELM
---|---|---|---|---|---|---|---
Avg. ACC 10% | 80.61 | 81.35 | 82.33 | 82.72 | 84.54 | 87.94 | 86.08
Avg. rank 10% | 6.000 | 5.500 | 4.875 | 4.625 | 3.000 | 1.250 | 2.750
Avg. ACC 20% | 77.13 | 77.77 | 78.41 | 79.93 | 81.47 | 85.64 | 83.10
Avg. rank 20% | 6.000 | 5.750 | 5.000 | 4.500 | 2.625 | 1.375 | 2.750
Avg. ACC 25% | 74.92 | 75.68 | 75.30 | 79.08 | 79.33 | 82.91 | 80.75
Avg. rank 25% | 5.625 | 5.500 | 5.750 | 4.125 | 3.625 | 1.375 | 2.000
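Each Avg. ACC entry in Table 8 is the mean of the corresponding algorithm's per-dataset accuracies from the noisy-data tables above. For instance, averaging CWTELM's eight accuracies from the second results table reproduces its 10%-noise entry of 87.94:

```python
# CWTELM per-dataset accuracies at 10% noise (Australian, Balance, Vote,
# Cancer, Wholesale, QSAR, Pima, WDBC), taken from the results table above.
cwtelm_10 = [85.29, 91.07, 95.24, 89.86, 83.72, 85.58, 76.32, 96.43]
avg_acc = round(sum(cwtelm_10) / len(cwtelm_10), 2)
print(avg_acc)  # 87.94, matching the Avg. ACC 10% entry in Table 8
```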


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jiang, Y.; Yu, G.; Ma, J.
Distance Metric Optimization-Driven Neural Network Learning Framework for Pattern Classification. *Axioms* **2023**, *12*, 765.
https://doi.org/10.3390/axioms12080765
