# BELMKN: Bayesian Extreme Learning Machines Kohonen Network

## Abstract

## 1. Introduction

## 2. Methodology

#### 2.1. Feature Learning Using Extreme Learning Machine (ELM)

_{i}, in which i = 1, 2, …, d, with dimension $\mathbb{R}^{d}$. ELM consists of three layers: the input layer, a hidden layer, and the output layer. The weights between the input and the hidden layer are randomly initialized from a uniform distribution, given by $W \in \mathbb{R}^{d\times \alpha}$, in which d is the number of input neurons and $\alpha$ is the number of hidden neurons with a bias. The weight ($\delta$) between the hidden layer and the output layer needs to be computed. A feed-forward pass is performed between the input and hidden layer. The hidden layer activations (H) are calculated using,

_{i} is the number of input neurons and $s(\cdot)$ is the sigmoidal activation function, in general given as $s(v)=\frac{1}{1+e^{-v}}$, in which $v = XW + b$ and b is the input layer bias. The hidden-to-output layer weights are calculated using the objective function [28],
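As an illustration of the feed-forward pass described above, the following is a minimal sketch that computes $H = s(XW + b)$ with randomly initialized weights. The function names, the uniform range [-1, 1], the fixed seed, and drawing the bias from the same distribution are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def sigmoid(v):
    """Sigmoidal activation s(v) = 1 / (1 + e^{-v})."""
    return 1.0 / (1.0 + math.exp(-v))


def elm_hidden_activations(X, n_hidden, seed=0, low=-1.0, high=1.0):
    """Return H = s(XW + b) for randomly initialized W and b.

    X is a list of samples, each a list of d features. The range
    [low, high] and the seed are assumptions of this sketch.
    """
    rng = random.Random(seed)
    d = len(X[0])
    # W has shape d x n_hidden; b holds one bias per hidden neuron.
    W = [[rng.uniform(low, high) for _ in range(n_hidden)] for _ in range(d)]
    b = [rng.uniform(low, high) for _ in range(n_hidden)]
    H = []
    for x in X:
        row = []
        for j in range(n_hidden):
            v = sum(x[i] * W[i][j] for i in range(d)) + b[j]
            row.append(sigmoid(v))
        H.append(row)
    return H


# Three normalized samples with d = 2 features and 4 hidden neurons.
X = [[0.1, 0.9], [0.5, 0.5], [0.8, 0.2]]
H = elm_hidden_activations(X, n_hidden=4)
```

Each row of `H` holds the hidden-layer activations of one sample, so `H` has one row per sample and one column per hidden neuron.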

#### 2.2. Cluster Prediction Using Bayesian Information Criterion (BIC)

_{c}, in which c = 1, 2, …, N, the model with the lowest BIC value is chosen, which gives the optimal number of clusters for the dataset.
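The selection rule can be sketched as follows. The example uses the general form of Schwarz's criterion, $\mathrm{BIC} = k \ln n - 2 \ln \hat{L}$, which may differ in detail from the paper's Equation (7). The log-likelihoods and parameter counts below are made-up values for illustration, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Schwarz's criterion: k * ln(n) - 2 * ln(L_hat)."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_n_clusters(candidates, n_samples):
    """Return the cluster count with the lowest BIC, plus all scores.

    candidates maps a cluster count c to (log_likelihood, n_params).
    """
    scores = {c: bic(ll, k, n_samples) for c, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fitted models for 150 samples (values are made up).
candidates = {
    1: (-420.0, 8),
    2: (-310.0, 17),
    3: (-250.0, 26),
    4: (-240.0, 35),
}
n_c, scores = select_n_clusters(candidates, n_samples=150)
```

With these illustrative numbers, the four-cluster model has the highest likelihood, but its extra parameters are penalized, so the three-cluster model attains the lowest BIC and is selected.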

#### 2.3. Partitional Clustering Using the Kohonen Network

_{c}). The Kohonen Network consists of two layers, namely, the input and the output layers. The number of output layer neurons ($n_o$) is the number of clusters ($n_c$) determined by BIC. The weight matrix between the input and the output layers is $W_k \in \mathbb{R}^{\beta \times n_c}$, in which $\beta$ is the number of input neurons. Further, the weights are learned through competition, with the discriminant function value computed using the Euclidean distance as the distance metric, given by,

_{o}.

$T_1$ and $T_2$ are time constants [29].
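The competition and update steps can be sketched as below. Because the update equations are not reproduced here, the exponential decay of the learning rate and neighborhood width, the Gaussian neighborhood on a one-dimensional neuron index, the assignment of $T_1$ and $T_2$ to the two decays, and all parameter values are assumptions of this sketch rather than the paper's exact formulation. Weight vectors are stored one per output neuron, which is the transpose of the $\beta \times n_c$ layout.

```python
import math
import random


def winner(x, weights):
    """Index of the output neuron whose weight vector is closest to x."""
    dists = [math.dist(x, w) for w in weights]
    return dists.index(min(dists))


def train_kohonen(E, n_c, epochs=50, eta0=0.5, sigma0=1.0,
                  T1=20.0, T2=20.0, seed=0):
    """Train a 1-D Kohonen layer with n_c output neurons on samples E."""
    rng = random.Random(seed)
    # Initialize each weight vector from a randomly chosen sample.
    weights = [list(rng.choice(E)) for _ in range(n_c)]
    for t in range(epochs):
        eta = eta0 * math.exp(-t / T1)      # assumed learning-rate decay
        sigma = sigma0 * math.exp(-t / T2)  # assumed neighborhood decay
        for x in E:
            k = winner(x, weights)
            for j, w in enumerate(weights):
                # Gaussian neighborhood centered on the winning neuron.
                h = math.exp(-((j - k) ** 2) / (2.0 * sigma ** 2))
                for i in range(len(w)):
                    w[i] += eta * h * (x[i] - w[i])
    return weights


E = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
weights = train_kohonen(E, n_c=2)
# Minimum-distance assignment of each sample to a cluster.
labels = [winner(x, weights) for x in E]
```

Each update moves a weight vector toward the current sample by a factor `eta * h` that never exceeds `eta0`, so the weights stay inside the range spanned by the data.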

Pseudo code: A high-level description of BELMKN

Input: The normalized dataset ($X_i$) and randomly initialized weights (W) between the input and hidden layers

Output: The feature learning (E) obtained from the ELM network, the number of clusters computed using BIC ($n_c$), the clustering centers ($C_{no}$), and the clustering accuracy obtained from the KN

Begin

1. Obtain the hidden layer activations using Equation (1)
2. Construct the graph Laplacian (L) using Equations (3) and (4) from the normalized input data
3. Calculate the eigenvectors $v_j$ corresponding to the eigenvalues $\gamma_j$ using Equation (5)
4. Choose the eigenvectors $\delta = [\tilde{v}_2, \tilde{v}_3, \dots, \tilde{v}_{\beta+1}]$, in which $\tilde{v}_j = \frac{v_j}{\lVert H v_j \rVert}$, $j = 2, 3, \dots, \beta+1$, corresponding to the 2nd to $(\beta+1)$th eigenvalues
5. Obtain $\delta$, in which the columns are normalized eigenvectors
6. Compute feature learning (E) using Equation (6)
7. Calculate BIC to obtain the number of clusters ($n_c$) using Equation (7)
8. The input to the KN will be feature learning (E) and the number of clusters ($n_c$)
9. Repeat steps 10 and 11 until there is no change in topology
10. Compute the winning neuron index using Equation (8)
11. Update the weights of the winning neuron and its neighbors using Equations (9) and (10)
12. Assign the cluster numbers for each sample with the weights of the KN using minimum distance criteria
13. Evaluate the clustering accuracy using Equation (11)

End
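The eigenvector normalization in the pseudo code, $\tilde{v}_j = v_j / \lVert H v_j \rVert$, can be illustrated with a small worked example. The sketch below assumes the embedding takes the form $E = H\delta$, which is not stated explicitly in the text above, and the matrix `H`, the vectors in `V`, and the helper names are hypothetical values chosen only to show the arithmetic.

```python
import math


def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def normalize_eigenvectors(H, V):
    """Scale each eigenvector v_j to v_j / ||H v_j||."""
    normalized = []
    for v in V:
        norm = math.sqrt(sum(c * c for c in matvec(H, v)))
        normalized.append([c / norm for c in v])
    return normalized


def embed(H, V_tilde):
    """Assumed embedding E = H * delta; delta's columns are V_tilde."""
    columns = [matvec(H, v) for v in V_tilde]
    # Transpose so that each row of E corresponds to one sample.
    return [list(row) for row in zip(*columns)]


# Hypothetical hidden activations (4 samples, 3 hidden neurons).
H = [[0.2, 0.7, 0.5], [0.9, 0.1, 0.4], [0.3, 0.3, 0.8], [0.6, 0.5, 0.1]]
# Hypothetical eigenvectors standing in for v_2 and v_3.
V = [[1.0, 0.0, -1.0], [0.5, -1.0, 0.5]]
V_tilde = normalize_eigenvectors(H, V)
E = embed(H, V_tilde)
```

Under this assumed form, every column of `E` has unit Euclidean norm, which is the effect of dividing each eigenvector by $\lVert H v_j \rVert$.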

## 3. Illustrative Example

## 4. Results and Discussion

#### 4.1. Dataset Description

- Dataset 1: The Cancer dataset consists of 2 classes that categorize the tumor as either malignant or benign. It contains 569 samples and 30 attributes.
- Dataset 2: The Dermatology dataset is based on the differential diagnosis of erythemato-squamous diseases in dermatology. It consists of 366 samples, 34 attributes, and 6 classes.
- Dataset 3: The E. coli dataset is based on the cellular localization sites of proteins. It contains 327 samples, 7 attributes, and 5 classes.
- Dataset 4: The Glass dataset is based on the oxide content of each glass type. It contains 214 samples, 9 attributes, and 6 classes.
- Dataset 5: The Heart dataset is based on the diagnosis of heart disease. It contains 270 samples, 13 attributes, and 2 classes.
- Dataset 6: The Horse dataset is used to classify whether a horse will die, survive, or be euthanized. It contains 364 samples, 27 attributes, and 3 classes.
- Dataset 7: The Iris dataset is based on the width and length of the sepals and petals of 3 varieties (classes) of flowers, namely, setosa, virginica, and versicolor. It contains 150 samples and 4 attributes.
- Dataset 8: The Thyroid dataset is based on whether the thyroid is over-functioning, functioning normally, or under-functioning (3 classes). It contains 215 samples and 5 attributes.
- Dataset 9: The Vehicle dataset is used to classify a vehicle into 4 classes given its silhouette. It contains 846 samples and 18 attributes.
- Dataset 10: The Wine dataset is obtained from the chemical analysis of wines from 3 different cultivars (3 classes). It contains 178 samples and 13 attributes.

#### 4.2. Analysis of Cluster Prediction

#### 4.3. Effect of Parameter Settings

#### 4.4. Analysis of Clustering Accuracy Using BELMKN

## 5. Conclusions

## Author Contributions

## Conflicts of Interest

## References

1. Jain, A.K.; Murty, M.N.; Flynn, P.J. Data clustering: A review. ACM Comput. Surv. **1999**, 31, 264–323.
2. Senthilnath, J.; Deepak, K.; Benediktsson, J.A.; Xiaoyang, Z. A Novel Hierarchical Clustering Technique Based on Splitting and Merging. Int. J. Image Data Fusion **2016**, 7, 19–41.
3. Google Scholar. Available online: http://scholar.google.com (accessed on 28 March 2018).
4. Jain, A.K. Data clustering: 50 years beyond K-means. Pattern Recognit. Lett. **2010**, 31, 651–666.
5. Yang, X.; Lo, C.P. Using a time series of satellite imagery to detect land use and land cover changes in the Atlanta, Georgia metropolitan area. Int. J. Remote Sens. **2002**, 23, 1775–1798.
6. Senthilnath, J.; Omkar, S.N.; Mani, V.; Tejovanth, N.; Diwakar, P.G.; Shenoy, A.B. Hierarchical clustering algorithm for land cover mapping using satellite images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. **2012**, 5, 762–768.
7. Gvishiani, A.D.; Dzeboev, B.A.; Agayan, S.M. A New Approach to Recognition of the Strong Earthquake Prone Areas in the Caucasus. Izv. Phys. Solid Earth **2013**, 49, 747–766.
8. Rui, X.; Donald, C.W. Clustering Algorithms in Biomedical Research: A Review. IEEE Rev. Biomed. Eng. **2010**, 3, 120–154.
9. Silhavy, R.; Silhavy, P.; Prokopova, Z. Evaluating Subset Selection Methods for Use Case Points Estimation. Inf. Softw. Technol. **2018**, 97, 1–9.
10. Raghu, K.; James, M.K. A Possibilistic Approach to Clustering. IEEE Trans. Fuzzy Syst. **1993**, 1, 98–110.
11. Ron, Z.; Amnon, S. A Unifying Approach to Hard and Probabilistic Clustering. In Proceedings of the 10th IEEE International Conference on Computer Vision, Beijing, China, 17–21 October 2005; pp. 294–301.
12. Sueli, A.M.; Joab, O.L. Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms. Eur. J. Oper. Res. **2006**, 174, 1742–1759.
13. Rui, X.; Donald, W. Survey of Clustering Algorithms. IEEE Trans. Neural Netw. **2005**, 16, 645–678.
14. Tapas, K.; David, M.M.; Nathan, S.N.; Christine, D.P.; Ruth, S.; Angela, Y.W. An efficient k-means clustering algorithm: Analysis and implementation. IEEE Trans. Pattern Anal. Mach. Intell. **2002**, 24, 881–892.
15. Jean-Claude, F.; Patrick, L.; Marie, C. Advantages and drawbacks of the Batch Kohonen algorithm. In Proceedings of the 10th European Symposium on Artificial Neural Networks, Bruges, Belgium, 24–26 April 2002; pp. 223–230.
16. Paul, M.; Shaw, K.C.; David, W. A Comparison of SOM Neural Network and Hierarchical Clustering Methods. Eur. J. Oper. Res. **1996**, 93, 402–417.
17. Senthilnath, J.; Dokania, A.; Kandukuri, M.; Ramesh, K.N.; Anand, G.; Omkar, S.N. Detection of tomatoes using spectral-spatial methods in remotely sensed RGB images captured by UAV. Biosyst. Eng. **2016**, 146, 16–32.
18. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. R. Stat. Soc. Ser. B **1977**, 39, 1–38.
19. Li, H.; Zhang, K.; Jiang, T. The regularized EM algorithm. In Proceedings of the 20th International Conference on Artificial Intelligence, Pittsburgh, PA, USA, 9–13 July 2005; pp. 807–812.
20. Senthilnath, J.; Manasa, K.; Akanksha, D.; Ramesh, K.N. Application of UAV imaging platform for vegetation analysis based on spectral-spatial methods. Comput. Electron. Agric. **2017**, 140, 8–24.
21. Senthilnath, J.; Kulkarni, S.; Raghuram, D.R.; Sudhindra, M.; Omkar, S.N.; Das, V.; Mani, V. A novel harmony search-based approach for clustering problems. Int. J. Swarm Intell. **2016**, 2, 66–86.
22. Ding, S.; Zhang, N.; Zhang, J.; Xu, X.; Shi, Z. Unsupervised extreme learning machine with representational features. Int. J. Mach. Learn. Cybern. **2017**, 8, 587–595.
23. Akogul, S.; Erisoglu, M. An Approach for Determining the Number of Clusters in a Model-Based Cluster Analysis. Entropy **2017**, 19, 452.
24. Burnham, K.P.; Anderson, D.R. Multimodel inference: Understanding AIC and BIC in model selection. Sociol. Methods Res. **2004**, 33, 261–304.
25. Huang, G.; Huang, G.B.; Song, S.; You, K. Trends in extreme learning machines: A review. Neural Netw. **2015**, 61, 32–48.
26. Khan, B.; Wang, Z.; Han, F.; Iqbal, A.; Masood, R.J. Fabric Weave Pattern and Yarn Color Recognition and Classification Using a Deep ELM Network. Algorithms **2017**, 10, 117.
27. Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man Cybern. B **2012**, 42, 513–529.
28. Huang, G.; Song, S.; Gupta, J.N.; Wu, C. Semi-supervised and unsupervised extreme learning machines. IEEE Trans. Cybern. **2014**, 44, 2405–2417.
29. Schwarz, G. Estimating the dimension of a model. Ann. Stat. **1978**, 6, 461–464.
30. Huang, Z.; Ng, M.K. A fuzzy k-modes algorithm for clustering categorical data. IEEE Trans. Fuzzy Syst. **1999**, 7, 446–452.
31. Blake, C.L.; Merz, C.J. Repository of Machine Learning Databases; Department of Information and Computer Science, University of California: Irvine, CA, USA, 1998.
32. Senthilnath, J.; Omkar, S.N.; Mani, V. Clustering using firefly algorithm: Performance study. Swarm Evolut. Comput. **2011**, 1, 164–171.
33. Bhola, R.; Krishna, N.H.; Ramesh, K.N.; Senthilnath, J.; Anand, G. Detection of the power lines in UAV remote sensed images using spectral-spatial methods. J. Environ. Manag. **2018**, 206, 1233–1242.

**Figure 3.** Clustering of flame pattern distribution using (**a**) k-means; (**b**) SOM; (**c**) EM; (**d**) US-ELM; and (**e**) BELMKN.

**Figure 4.** Clustering of face pattern distribution using (**a**) k-means; (**b**) SOM; (**c**) EM; (**d**) US-ELM; and (**e**) BELMKN.

| Sl. No. | Dataset | Number of Samples | Input Dimension | Number of Clusters |
|---|---|---|---|---|
| 1 | Cancer | 569 | 30 | 2 |
| 2 | Dermatology | 366 | 34 | 6 |
| 3 | E. coli | 327 | 7 | 5 |
| 4 | Glass | 214 | 9 | 6 |
| 5 | Heart | 270 | 13 | 2 |
| 6 | Horse | 364 | 27 | 3 |
| 7 | Iris | 150 | 4 | 3 |
| 8 | Thyroid | 215 | 5 | 3 |
| 9 | Vehicle | 846 | 18 | 4 |
| 10 | Wine | 178 | 13 | 3 |

| Dataset | Cancer | Dermatology | E. coli | Glass | Heart | Horse | Iris | Thyroid | Vehicle | Wine |
|---|---|---|---|---|---|---|---|---|---|---|
| Actual clusters | 2 | 6 | 5 | 6 | 2 | 3 | 3 | 3 | 4 | 3 |
| BIC clusters predicted on original dataset | 3 | 6 | 4 | 3 | 2 | 2 | 3 | 3 | 4 | 3 |
| BIC clusters predicted using ELM | 3 | 6 | 5 | 6 | 3 | 3 | 3 | 3 | 4 | 3 |
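The agreement between the predicted and actual cluster counts in the table above can be tallied with a short script. The values are copied from the table; the variable and function names are chosen for this example.

```python
datasets = ["Cancer", "Dermatology", "E. coli", "Glass", "Heart",
            "Horse", "Iris", "Thyroid", "Vehicle", "Wine"]
actual = [2, 6, 5, 6, 2, 3, 3, 3, 4, 3]
bic_original = [3, 6, 4, 3, 2, 2, 3, 3, 4, 3]
bic_elm = [3, 6, 5, 6, 3, 3, 3, 3, 4, 3]


def mismatches(predicted):
    """Names of datasets whose predicted cluster count differs from the actual one."""
    return [name for name, a, p in zip(datasets, actual, predicted) if a != p]


missed_original = mismatches(bic_original)
missed_elm = mismatches(bic_elm)
print(f"Original data: {len(datasets) - len(missed_original)} of {len(datasets)} correct")
print(f"ELM features:  {len(datasets) - len(missed_elm)} of {len(datasets)} correct")
```

On these values, BIC recovers the actual cluster count for 6 of the 10 datasets when applied to the original data (missing Cancer, E. coli, Glass, and Horse) and for 8 of 10 when applied to the ELM features (missing Cancer and Heart).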

| Sl. No. | Dataset | k-Means | SOM | EM | USELM | BELMKN |
|---|---|---|---|---|---|---|
| 1 | Cancer | 85.4% (5) | 86% (4) | 91.21% (2) | 90% (3) | 92.6% (1) |
| 2 | Dermatology | 26.2% (5) | 32% (4) | 67.75% (3) | 82% (2) | 90.1% (1) |
| 3 | E. coli | 59.9% (4) | 61% (3) | 77.98% (2) | 82% (1) | 82% (1) |
| 4 | Glass | 54.2% (1) | 54% (2) | 47.66% (4) | 42% (5) | 48% (3) |
| 5 | Heart | 59.2% (4) | 60% (3) | 53.33% (5) | 70% (2) | 75.5% (1) |
| 6 | Horse | 48% (3) | 48% (3) | 43.4% (4) | 65% (1) | 63.18% (2) |
| 7 | Iris | 80% (5) | 82% (4) | 90% (3) | 96% (2) | 97% (1) |
| 8 | Thyroid | 86% (5) | 87% (4) | 94.27% (3) | 89% (3) | 90.5% (2) |
| 9 | Vehicle | 44% (2) | 44% (2) | 45.035% (1) | 42% (3) | 41% (4) |
| 10 | Wine | 70% (5) | 75% (4) | 90.44% (3) | 94% (2) | 96.6% (1) |

| Clustering Algorithm | k-Means | SOM | EM | USELM | BELMKN |
|---|---|---|---|---|---|
| Average accuracy (%) | 61.29 | 62.9 | 70.1 | 75.2 | 77.65 |
| Rank | 5 | 4 | 3 | 2 | 1 |

| Clustering Algorithm | k-Means | SOM | EM | USELM | BELMKN |
|---|---|---|---|---|---|
| Total | 39 | 33 | 30 | 24 | 17 |
| Rank | 5 | 4 | 3 | 2 | 1 |
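The summary rows above can be recomputed from the per-dataset table of accuracies and ranks. The sketch below copies the printed accuracies (in percent) and the ranks in parentheses, in dataset order from Cancer to Wine; the variable names are chosen for this example.

```python
accuracy = {
    "k-Means": [85.4, 26.2, 59.9, 54.2, 59.2, 48, 80, 86, 44, 70],
    "SOM": [86, 32, 61, 54, 60, 48, 82, 87, 44, 75],
    "EM": [91.21, 67.75, 77.98, 47.66, 53.33, 43.4, 90, 94.27, 45.035, 90.44],
    "USELM": [90, 82, 82, 42, 70, 65, 96, 89, 42, 94],
    "BELMKN": [92.6, 90.1, 82, 48, 75.5, 63.18, 97, 90.5, 41, 96.6],
}
ranks = {
    "k-Means": [5, 5, 4, 1, 4, 3, 5, 5, 2, 5],
    "SOM": [4, 4, 3, 2, 3, 3, 4, 4, 2, 4],
    "EM": [2, 3, 2, 4, 5, 4, 3, 3, 1, 3],
    "USELM": [3, 2, 1, 5, 2, 1, 2, 3, 3, 2],
    "BELMKN": [1, 1, 1, 3, 1, 2, 1, 2, 4, 1],
}

# Mean accuracy over the 10 datasets and sum of per-dataset ranks.
average = {name: sum(vals) / len(vals) for name, vals in accuracy.items()}
rank_total = {name: sum(vals) for name, vals in ranks.items()}

for name in accuracy:
    print(f"{name:8s} average = {average[name]:.2f}%  rank total = {rank_total[name]}")
```

A higher average accuracy and a lower rank total both indicate better overall performance, which is why the two summaries produce the same ordering of the five algorithms.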

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Senthilnath, J.; Simha C, S.; G, N.; Thapa, M.; M, I.
BELMKN: Bayesian Extreme Learning Machines Kohonen Network. *Algorithms* **2018**, *11*, 56.
https://doi.org/10.3390/a11050056
