Article

Meta Classification Model of Surface Appearance for Small Dataset Using Parallel Processing

School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva 5253318, Israel
* Authors to whom correspondence should be addressed.
Electronics 2022, 11(21), 3426; https://doi.org/10.3390/electronics11213426
Submission received: 19 September 2022 / Revised: 13 October 2022 / Accepted: 17 October 2022 / Published: 22 October 2022

Abstract

Machine learning algorithms have become an essential tool in mathematics and engineering, as well as in industry (fabrics, medicine, sport, etc.). This research leverages classical machine learning algorithms for innovative, accurate, and efficient fabric protrusion detection. We present an approach for improving model training with a small dataset. We use several classical statistical machine learning algorithms (decision trees, logistic regression, etc.) and a fully connected neural network (NN) model. We also present an approach that optimizes both model accuracy and the execution time of the accuracy search using parallel processing with Dask (Python).

1. Introduction

Machine learning (ML) models require substantial training datasets. A special challenge arises when the training dataset is not large enough: optimizing an ML model on a small dataset can lead to overfitting, where the training accuracy is usually high but the model generalizes poorly to new data. In this paper, we present several models and our best solution for classifying fabric appearance. Machine learning enables data-driven models to "learn" information about a system directly from observed data. Such models are increasingly becoming the core technology of choice for numerous applications, such as weather forecasting, DNA sequencing, Internet search engines, real-time stock market prediction, and more. Nevertheless, ML systems are rarely used with small datasets, since an insufficient number of training samples can compromise the learning accuracy [1,2]. In practical applications, the dataset is often also imbalanced, which can lead to non-ideal training. To address these challenges, we employed several preprocessing steps, e.g., feature selection and engineering, smoothing techniques, scaling, and normalization.
In this research, we propose a machine learning approach for classifying the surface quality of a two-dimensional random field. The classification is based on estimating the parameters of random field outliers.
To improve the outlier detection accuracy, the random field is represented as a set of profiles in parallel sections, each of which is a realization of a one-dimensional random process. One advantage of replacing the 2D random field image with a set of 1D profile images is the ability to detect outliers against a fixed background.

2. Literature Review

Surface defects in fabrics are among the most challenging problems in the textile and apparel industries. The most common defects are structural failures of the fabric and defects related to separate fibers or groups of fibers protruding from the fabric surface. The most common types of fabric surface defects are pilling, fuzziness, snagging, and hairiness. Grades range from 1 to 5, where 5 is the best (no protrusions at all) and 1 is the worst. The score is traditionally obtained through the use of a standard test method. In this research work, we present a fabric appearance classification approach based on data obtained using a specific image processing parameterization method [3,4].
The problem of textile surface quality evaluation has been widely explored in the literature over the last 25 years [5,6,7,8,9,10,11]. Several sources mention the use of image processing techniques for fabric defect evaluation [8,9,12,13,14,15,16]. The methods used for fabric defect localization and evaluation include the Fourier transform [13,16], artificial neural networks [17,18,19,20,21,22], fuzzy logic [5,7,13,19], wavelet analysis [21,22,23,24,25], and others. Common to all these works is the use of 2D fabric images as the starting point of the analysis. The traditional approach is based on Gaussian fit theory and threshold grading for linear convolution of pill parameters [13]. The application of machine learning to real-world problems presents several challenges: it requires domain specialists to label data samples [26], and it can be influenced by the type of data preprocessing and the tuning of algorithm parameters, thus requiring several machine learning experiments before satisfactory results are achieved [27]. An additional challenge is assessing model performance with small datasets. In this work, we handle small-dataset situations [27,28]. We compare the accuracy scores of a neural network, a decision tree, a random forest, and logistic regression, and choose the best algorithm for the fabric surface quality classification application. The proposed system takes a different approach: instead of using a convolutional neural network, we use a fixed set of parameters extracted from a set of images. These parameters are described in the sections below.
This approach can yield better results when handling a small amount of data, where a convolutional model might otherwise learn unimportant features.
This paper also presents a new approach to finding the best hyperparameters, using parallel processing to reduce execution time.

3. Research Methodology

3.1. System Overview

We use a dataset that was created in 2010 for the textile and apparel industries [3,4]. The parameterization procedure of the original study is not covered in this research. We begin by preprocessing the data so that machine learning classification algorithms can be applied.
An example 2D image of the fabric is illustrated in Figure 1.
Next, we apply descriptive statistics to better understand the data. The dataset is processed further to handle missing data. In this research, there was no need for feature engineering or outlier removal. Before dealing with categorical features and data encoding, it is necessary to perform univariate, bivariate, and multivariate analysis of the dataset. Feature scaling is performed before the algorithms are applied.
We used resampling to balance the imbalanced data before training, to prevent the prediction outcome from simply predicting the most common value of the dependent variable. Neural networks depend on many hyperparameters, whose optimization takes many runs. Due to the large volume of hyperparameter tuning, we suggest a parallel programming technique, whereby independent models with different parameters run in parallel on different cores of the same computer.
The overall methodology used to assess model performance is illustrated in Figure 2.

3.2. Hypothesis for Images—Low-Level Image Processing

A sample of the dataset is illustrated in Figure 3. We applied low-level image processing techniques to identify and extract the relevant fabric profiles.
We then used image segmentation [29] to convert each pixel to 255 or 0 (black or white) and calculated the standard deviation and average of each row of the matrix. A boundary row has a standard deviation greater than 0, and its average lies strictly between 0 and 255. The flow is as follows, where we loop through each row and $P_i$ denotes a pixel value:
$$F(P_i) = \frac{P_i}{255}$$
Now that we have defined the limits clearly, we can easily calculate standard deviation and average for each row, where N represents the number of pixels for each row—this number is equal for all rows:
$$\text{Average} = \frac{1}{N}\sum_{i=1}^{N} P_i$$

$$\text{Standard Deviation} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}$$
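To make this flow concrete, the following minimal NumPy sketch binarizes an image and computes the per-row statistics defined above; the threshold value of 128 is our assumption, as the paper does not state one.

import numpy as np

# Minimal sketch: binarize a grayscale profile image to {0, 255}, then compute
# the per-row average and standard deviation used to locate boundary rows.
def row_statistics(image, threshold=128):
    binary = np.where(image >= threshold, 255, 0)   # segmentation to black/white
    averages = binary.mean(axis=1)                  # average of each row
    stds = binary.std(axis=1, ddof=1)               # sample std, N - 1 denominator
    return binary, averages, stds

# Rows with std > 0 mix black and white pixels, i.e., candidate boundary rows.
image = np.random.default_rng(0).integers(0, 256, size=(6, 8))
_, avg, std = row_statistics(image)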

3.3. Evaluation Criteria on Performance Measure Indices

We used metrics of Accuracy, Recall, F1 score, and Precision to assess classification performance. The results of the performance measurement indicators depend on True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) [30].
Accuracy is considered a good metric when FP and FN have similar costs. If the costs of false positives and false negatives are very different, then Precision and Recall are the better metrics.
$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + FN + TN}$$

$$F_1\ \text{Score} = \frac{2 \cdot \text{Recall} \cdot \text{Precision}}{\text{Recall} + \text{Precision}}$$
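These metrics are available directly in scikit-learn; a small sketch with illustrative labels only (y_true and y_pred stand in for test labels and model predictions):

from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)
print("Accuracy: ", accuracy_score(y_true, y_pred))   # (TP + TN) / total
print("F1 Score: ", f1_score(y_true, y_pred))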

3.4. Tool and Language

We used the Jupyter Notebook environment, Python as the programming language, and Dask for running the program in parallel. The PC used in this research has 8 GB of RAM and 8 cores (8th-generation Intel Core i7).

4. Dataset Preprocessing

4.1. Dataset Overview

Our original dataset size was 582 rows × 25 columns, where the rows are the observations and the columns are the variables. After dropping all the unnecessary and missing data, the dataset was reduced to 388 rows × 15 columns. A sample of the clean dataset is provided in Figure 4.
Figure 5 illustrates the distribution of the target class. It is clear that the dataset is highly imbalanced.
The score labels ranged from 1 to 5 in increments of 0.5, giving 9 possible classes, and the fabric was considered good when the score was higher than 2. We used binary classification according to the following logic:
If Score > 2, Score = 1.
Else Score = 0.
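As a pandas sketch of this rule ("Score" is our assumed column name):

import pandas as pd

df = pd.DataFrame({"Score": [1.0, 2.5, 3.0, 5.0, 2.0]})
df["Score"] = (df["Score"] > 2).astype(int)   # 1 = good fabric, 0 = defective
print(df["Score"].tolist())                   # [0, 1, 1, 1, 0]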
Figure 6 illustrates the distribution of the new binary target class. The new dataset is still imbalanced.

4.2. Dataset Processing and Preparation

4.2.1. Missing Values and Outliers Detection

The treatment of missing values is an important step in training machine learning models. Missing values result from various causes, such as incomplete forms, unavailable measurements, etc. Multiple methods have been suggested for computing missing values [31]. They can be computed using business logic, or the affected records can be deleted when their percentage is very high. The classical methods use the mean, median, and mode: the mean is used for numerical variables when there are not many outliers in the dataset; the median is used for numerical variables when there are outliers in the dataset; and the mode is used for categorical variables. In this research, we dropped rows and columns with missing or meaningless features, such as the code/ID of the fabric, the date the image was taken, etc.
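As an illustration, a pandas sketch of these classical treatments (the column names and values are hypothetical):

import pandas as pd

df = pd.DataFrame({"avg_height": [1.2, None, 0.8, 1.0],
                   "color": ["red", None, "red", "blue"]})
df["avg_height"] = df["avg_height"].fillna(df["avg_height"].median())  # numeric: median
df["color"] = df["color"].fillna(df["color"].mode()[0])                # categorical: mode
df = df.dropna()  # any rows that remain incomplete are dropped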
The presence of outliers in a classification or regression dataset can result in a poor fit and lower predictive modeling performance. Automatic outlier detection methods can be used in the modeling pipeline and compared. Figure 7 presents the box plot graphs for some of the important features in the dataset.
The box plots above help us analyze the middle 50 percent of the data; we can clearly see the minimum, maximum, and median, and thus identify outlier values. Considering our relatively small dataset, and despite identifying some outliers, we decided to keep them.
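A short sketch of how such box plots can be produced with pandas (the values and column names are made up for illustration):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"avg_height": [0.8, 1.1, 0.9, 1.0, 3.5],
                   "brightness": [90, 95, 92, 94, 60]})
df.plot(kind="box", subplots=True, figsize=(8, 3))  # one box plot per feature
plt.tight_layout()
plt.show()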

4.2.2. Univariate, Bivariate, and Multivariate Analysis

We used univariate analysis to check the distribution of the dataset features one at a time. The conclusion is that only the shade feature has a distribution close to uniform; none of the other features are distributed uniformly or normally. Figure 8 presents a pie chart of the distribution of the shade column.
From the pie chart, we conclude that the distribution of the shade feature is close to uniform.
We used bivariate analysis to check the relationship between each feature and the score. Figure 9 demonstrates that the smaller the number of protrusions in an image, the more the score tends toward success (binary class 1 in our case).
When analyzing the impact of protrusion height on the classification, we conclude that the higher the average height, the more the score tends toward failure (binary class 0 in our case). This is illustrated in Figure 10.
We used multivariate analysis to create a full correlation heat map. The map of Figure 11 demonstrates that some columns have more impact on the dependent variable than others.
From the heat map of Figure 11, we can conclude the following (a short sketch of the heat-map computation follows the list):
  • The average distance between protrusions and the average number of protrusions are highly correlated (0.85).
  • Max height and average height of protrusions are correlated as well (0.96).
  • Shade and brightness are highly correlated (0.96).
These correlations are further illustrated in Figure 12, Figure 13 and Figure 14, which depict the relations of the three above-mentioned variable pairs, respectively.
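A sketch of the heat-map step; random data and hypothetical column names stand in for the real features here:

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

rng = np.random.default_rng(19)
df = pd.DataFrame(rng.normal(size=(100, 4)),
                  columns=["avg_height", "max_height", "shade", "brightness"])
# Pearson correlation matrix rendered as an annotated heat map.
sns.heatmap(df.corr(), annot=True, cmap="coolwarm", vmin=-1, vmax=1)
plt.show()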
We further checked the relation between the direction in which the image was taken and its brightness. A visualization of this relation is depicted in Figure 15. From the graph, we can see that images taken along the fabric length are brighter.

4.2.3. Categorical Columns (One-Hot Encoding)

We used one-hot encoding to prepare the categorical features for training. This processing leads to only one dimension of the feature matrix being asserted for a given state (usually '1'), while all other state dimensions are zero [31]. The one-hot encoding of the color column is depicted in Figure 16.
Label encoding is a method to encode a categorical class with values between 0 and N − 1, where N is the number of labels in the class [1]. We used label encoding for the Direction feature. Its two labels, L and W, are encoded as 0 and 1, respectively.
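A sketch of both encodings with pandas and scikit-learn (the column values are hypothetical):

import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({"color": ["red", "blue", "green"], "Direction": ["L", "W", "L"]})
df = pd.get_dummies(df, columns=["color"])                        # one-hot encoding
df["Direction"] = LabelEncoder().fit_transform(df["Direction"])   # L -> 0, W -> 1
print(df)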

4.2.4. Imbalanced Data (Resampling)

As we have seen, our dataset, and in particular its target column, is imbalanced. To balance the dataset, we applied resampling, which either adds records to the minority class or deletes records from the majority class [27,32]. Oversampling and undersampling are opposite but roughly equivalent techniques; to avoid data loss, we used oversampling, specifically the SMOTE technique. The algorithm is as follows: for each pattern $x_0$ in the minority class, (1) pick one of its K nearest neighbors $x$ (also belonging to the minority class); (2) create a new pattern $z$ at a random point on the line segment connecting the pattern and the selected neighbor:
$$z = x_0 + w \cdot (x - x_0)$$
where w is a uniform random variable in the range [0, 1] [33].
Before implementing SMOTE, the target class distribution was:
Class 0: 275 values;
Class 1: 113 values.
After implementing the SMOTE oversampling technique, the target class distribution was:
Class 0: 275 values;
Class 1: 275 values,
thus achieving a balanced dataset.
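A minimal sketch of this step with the imbalanced-learn library; the synthetic data merely stands in for the real fabric features (388 observations, roughly 71% class 0):

from collections import Counter
from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

X, y = make_classification(n_samples=388, weights=[0.71], random_state=19)
X_res, y_res = SMOTE(random_state=19).fit_resample(X, y)
print(Counter(y), "->", Counter(y_res))   # both classes now have equal counts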

4.2.5. Splitting the Dataset

Effective methodologies for evaluating a given dataset are critical. The choice of splitting strategy can have a strong impact on the ranking of a machine learning model during evaluation, and it should be fitted to the specific dataset [2]. When there are few observations in the dataset, it is necessary to consider both the training set, for improving the model, and the test set, so that it remains a representative sample. Hence, we decided to split the dataset into 80% training and 20% testing. The random-state parameter controls the shuffling applied to the data before the split; we set it to 19 rather than the conventional 42.
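Continuing the sketch above, the split can be written as:

from sklearn.model_selection import train_test_split

# 80/20 split with the shuffling seed used in this work (random_state=19).
X_train, X_test, y_train, y_test = train_test_split(
    X_res, y_res, test_size=0.2, random_state=19, shuffle=True)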

4.2.6. Feature Scaling

Feature scaling transforms the feature vector into a format more suitable for machine learning algorithms. There are many approaches to feature scaling; the most used are the Standard Scaler, Min-Max Scaler, Normalizer, and Robust Scaler. Our dataset contains variables on different scales. Therefore, we used the Standard Scaler, which transforms the dataset so that the resulting distribution has a mean of zero and a standard deviation of one. The transformed value is obtained by subtracting the mean from the original value and dividing by the standard deviation [33]:

$$z = \frac{x - \mu}{\sigma}$$

where z is the transformed value of the feature, x is the original value, μ is the mean, and σ is the standard deviation.
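A sketch of this step; note that we fit the scaler on the training set and reuse it on the test set (standard practice, assumed here), continuing the earlier sketches:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)   # learn mean and std on training data only
X_test_s = scaler.transform(X_test)         # apply the same transform to the test set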

5. Classification Algorithms

5.1. Decision Tree, Logistic Regression, and Adaboost Classifiers

Decision tree classifiers are a well-known method for achieving data classification [34]. Special caution should be exercised when implementing machine learning algorithms on small datasets in order to avoid overfitting. To avoid overfitting, we ran the same model with different maximum depths. The maximum depth is the length of the longest path from the tree root to a leaf, where the root node has a depth of 0; with default parameters, the maximum depth can reach N − 1. Figure 17 illustrates the outputs of the model for maximum depths in the range 1–30, where the X-axis is the maximum depth and the Y-axis is the accuracy score of the model [35].
As can be seen from Figure 17, a tree depth of around 10–12 levels fits the training dataset perfectly; we therefore used these parameter values when implementing the decision tree algorithm, avoiding overfitting while achieving maximum accuracy. The accuracy on the test set improves with tree depth until a depth of 10–15, after which it converges to its maximum values (0.8 < accuracy rate < 0.85). An additional accuracy improvement was achieved by implementing a decision-tree-based bagging ensemble classifier. The reason for selecting the decision tree as the single classifier for our ensemble is that our dataset is highly imbalanced, and bagging decision trees behaves very well here by weighting the results of the trees and reducing the variance and the overfitting. The bagging ensemble classifier is fast and can efficiently handle imbalanced datasets [36].
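A sketch of the depth sweep behind Figure 17, plus a bagging ensemble over a fixed-depth tree (the ensemble size of 50 is our assumption; X_train_s and friends continue the earlier sketches):

from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

# Sweep max_depth from 1 to 30 and record test accuracy at each depth.
scores = {}
for depth in range(1, 31):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=19)
    scores[depth] = tree.fit(X_train_s, y_train).score(X_test_s, y_test)

# Bagging ensemble over a depth-10 tree (depth chosen from the sweep above).
bagging = BaggingClassifier(DecisionTreeClassifier(max_depth=10),
                            n_estimators=50, random_state=19)
bagging.fit(X_train_s, y_train)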
The random forest classifier is a metalearner, meaning that it consists of many individual learners (trees). The random forest uses multiple random classification trees that vote on an overall classification for the given set of inputs; in general, each individual learner's vote is given equal weight, and the random forest chooses the classification with the most votes [37]. The following hyperparameters were used: criterion, "gini"; maximum depth, 1; maximum features, the square root of the total number of features in our dataset; maximum leaf nodes, none; bootstrap, true (the default); maximum samples, true; out-of-bag score, true.
All the other hyperparameters were kept at their defaults, as they did not impact the resulting accuracy.
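A sketch of these settings as they map onto scikit-learn (our reading; in scikit-learn, max_samples takes a number rather than a boolean, so it is left at its default here):

from sklearn.ensemble import RandomForestClassifier

rf = RandomForestClassifier(criterion="gini", max_depth=1, max_features="sqrt",
                            max_leaf_nodes=None, bootstrap=True, oob_score=True,
                            random_state=19)
rf.fit(X_train_s, y_train)
print("Out-of-bag score:", rf.oob_score_)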
We further used logistic regression [38] to describe the relationship to the target variable in our multivariate problem. We also implemented AdaBoost [37] with a logistic regression base. To find the best learning rate, we ran the model multiple times with different values; the best learning rate was between 0.3 and 0.4.
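A sketch of the learning-rate search (the grid of values is our choice, continuing the earlier sketches):

import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

for lr in np.arange(0.1, 1.01, 0.1):
    ada = AdaBoostClassifier(LogisticRegression(max_iter=1000),
                             n_estimators=50, learning_rate=lr, random_state=19)
    ada.fit(X_train_s, y_train)
    print(f"learning rate {lr:.1f}: accuracy {ada.score(X_test_s, y_test):.3f}")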

5.2. Deep Learning Model (Neural Networks) Classifier

5.2.1. Neural Network Architecture

We used a fully connected binary classification NN [39] with ReLU activation functions and a binary cross-entropy loss function, described in Equation (10):

$$\text{Log loss} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]$$

where $p_i$ is the predicted probability of class 1 and $(1 - p_i)$ is the probability of class 0.
The network architecture [40,41] is depicted in Figure 18. For training we used the Adam Optimizer [42,43].
We used Early Stopping regularization to avoid overfitting when training a learner with an iterative method, such as gradient descent [44].
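A minimal Keras sketch of this class of network; the exact layer count and sizes of Figure 18 are not reproduced here, so we reuse the best hyperparameters reported in Section 5.2.2 and continue the earlier sketches:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Dense(60, activation="relu", input_shape=(X_train_s.shape[1],)),
    layers.Dropout(0.18),
    layers.Dense(60, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # outputs the probability of class 1
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Early stopping halts training once validation loss stops improving.
early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=7,
                                           restore_best_weights=True)
model.fit(X_train_s, y_train, validation_split=0.2, epochs=100, batch_size=76,
          callbacks=[early_stop], verbose=0)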

5.2.2. Hyperparameter Optimization Using Parallel Processing with Shared Memory

Hyperparameter optimization is the problem of optimizing a loss function over a graph-structured configuration space [45]. Hyperparameters can have a significant impact on model accuracy, and their optimization is a highly computation-demanding process due to the multiple training runs required. It is possible to run such an optimization in parallel using independent models. Embarrassingly parallel programming is a form of parallel algorithm requiring almost no communication between processes: each process performs its own computation without needing to communicate with the others [46]. Today, most computers have a multicore architecture with shared memory, where each core also has a private memory. The shared memory architecture is illustrated in Figure 19.
Embarrassingly parallel algorithms can substantially improve the execution time of a task when implemented correctly. Monte Carlo simulations [47] and Mandelbrot sets (also known as fractals) [48] are examples of embarrassingly parallel algorithms. In the ideal case of an embarrassingly parallel algorithm [49], all subproblems/tasks are defined before the computation begins and all subsolutions are stored in independent memory locations (arrays, variables, etc.); thus, the computation of the subsolutions is completely independent. We used parallel computation and task scheduling with Dask, a Python library that enables parallel out-of-core computation. Dask has many built-in functions that enable dynamic, memory-aware task scheduling to achieve parallelism [50]. The model used in this research (Figure 18) was run 4^4 = 256 times for different sets of parameters. The following list gives the values of each hyperparameter of the independent neural network function f(x_i, y_i, m_i, n_i), whose instances run in parallel on different cores:
Batch size: x = {1, 26, 51, 76}. When handling small datasets, a large batch size is not necessary.
Dense units (dimensionality of the output space): y = {0, 20, 40, 60}. When handling a small set of features, a high output dimensionality is not necessary.
Drop rate: m = {0.02, 0.18, 0.34, 0.5}.
Patience parameter for early stopping: n = {1, 3, 5, 7}. This parameter should be close to 0, since a small dataset does not require many epochs.
Testing the model with more hyperparameter values improves the chance of obtaining a higher classification score; however, more models then need to be run.
Each function call is independent and needs no communication with any other; for example, f(x_0, y_0, m_0, n_0) and f(x_0, y_0, m_0, n_1) are distinct evaluations, and therefore both can run in parallel on different cores without any communication between them, as in the sketch below.
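A minimal sketch of this grid with dask.delayed; train_model is a placeholder for one independent training run, and the worker count follows the setup discussed below:

import itertools
from dask import delayed, compute

# Placeholder for the independent f(x, y, m, n): build, train, and evaluate one
# neural network with these hyperparameters; no shared state is needed.
def train_model(x, y, m, n):
    accuracy = 0.0  # replace with the real test accuracy of the trained model
    return accuracy, (x, y, m, n)

grid = itertools.product([1, 26, 51, 76],          # batch size x
                         [0, 20, 40, 60],          # dense units y
                         [0.02, 0.18, 0.34, 0.5],  # drop rate m
                         [1, 3, 5, 7])             # patience n
tasks = [delayed(train_model)(x, y, m, n) for x, y, m, n in grid]  # 4^4 = 256 tasks
results = compute(*tasks, scheduler="processes", num_workers=6)    # six workers
best_accuracy, best_params = max(results)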
Dask enables setting the number of workers (cores). Figure 20 shows part of the Dask dashboard, which enables the user to view the process while it is running.
Figure 20 indicates the number of tasks being processed, the number of tasks already done or waiting to be processed, and whether any tasks failed to run. It also indicates memory usage, allowing monitoring to avoid crashes due to out-of-memory errors.
Figure 21 presents execution-time results for running the 256 models with different numbers of workers, based on 17 observations for each class, where each class represents a different number of workers.
Figure 21 shows the mean and standard deviation for each class. While it takes 11:37 min (697 s) on average to execute the 256 models with one core, it takes 3:23 min (203 s) on average with six workers; thus, parallel processing runs about 3.4 times faster than a serial, manual optimization technique.
The best hyperparameter values that we have reached are:
Batch Size = 76;
Dense Units = 60;
Drop Rate = 0.18;
Patience (for early stopping) = 7.

6. Performance Results

We have evaluated model performance using Precision, Recall, Accuracy, and F1 Score. Table 1 and Figure 22 present the performance measurement criteria and the comparison between different algorithms.
The neural network provides the best Accuracy (0.927), Precision (0.92), Recall (0.95), and F1 Score (0.93). The second-highest accuracy is obtained by the random forest (0.863), and the poorest accuracy (0.772) belongs to the logistic regression algorithm.

7. Discussion

Protrusion appearance and textile surface quality evaluation have been widely explored in the literature. Many works used classical image processing techniques for fabric defect evaluation, while others performed fabric defect localization and evaluation using the Fourier transform or computer vision with convolutional neural networks [18,19,20,21,22].
In general, expert grading of tested fabric is subjective and, in some cases, difficult to perform. The experts who physically measure the fabric have a wide range of visual abilities as well as differing opinions on the severity of the defects.
Common to all past works in the literature is that they used 2D images of fabric as the starting point of their analysis. A number of publications are devoted to pilling detection and evaluation. One of the first approaches is based on the Gaussian fit theory and the use of threshold grading for linear convolution of the pills' parameters.
In terms of hyperparameter optimization, many approaches are cited in the literature. Most of them use GridSearch and RandomSearch, where the classic approach iterates over all the parameters given in a certain list.
Our approach is based on the fact that each execution is independent, so we can exploit all the available computing power and run the executions in parallel. This expedites execution time by a factor determined by the number of available computer cores.

8. Conclusions and Future Work

In this paper, we have proposed an approach to improve on past accuracy in classifying fabric appearance to detect faults. We described our low-level image processing technique for processing and classifying protrusions on profiles of the random field describing the two-dimensional fabric surface. We used a published dataset with a small number of observations to develop an end-to-end approach for classifying surface quality based on images of one-dimensional surface profiles. We implemented different statistical preprocessing steps to handle outliers and missing values, extracted new features (feature engineering), and performed different kinds of data encoding. We handled an imbalanced dataset by applying a resampling method and scaled the data to improve model classification performance. We compared six different machine and deep learning algorithms to obtain an accurate prediction rate on the dataset. We further proposed a method of optimizing hyperparameters using a parallel processing technique on shared-memory machines; this approach not only improves the model accuracy rate but also runs 3.4 times faster than serial code.
In future work, we will use 3D images in order to extract more parameters from the images. We will also validate the use of parallel processing to reduce execution time when handling large amounts of data, and we will check whether this is possible without causing memory errors.

Author Contributions

Conceptualization, R.K.; methodology, R.K.; software, R.K.; validation, R.K., R.B. and O.H.; formal analysis, R.K.; investigation, R.K.; resources, R.K.; data curation, R.K.; writing—original draft preparation, R.K., R.B. and O.H.; writing—review and editing, R.K., R.B. and O.H.; visualization, R.K.; supervision, R.B. and O.H.; project administration, R.K., R.B. and O.H.; funding acquisition, R.K., R.B. and O.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Skansi, S. Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence; Springer: New York, NY, USA, 2018.
2. Reitermanova, Z. Data splitting. In WDS'10 Proceedings of Contributed Papers; Matfyzpress: Prague, Czech Republic, 2010; Volume 10, p. 10.
3. Cherkassky, A.; Weinberg, A. Objective evaluation of textile fabric appearance part 1: Basic principles, protrusion detection, and parameterization. Textile Res. J. 2010, 80, 226–235.
4. Cherkassky, A.; Weinberg, A. Objective Evaluation of Textile Fabric Appearance. Part 2: SET Opti-grade Tester, Grading Algorithms, and Testing. Textile Res. J. 2010, 80, 135–144.
5. Zhang, Y.F.; Bresee, R.R. Fabric defect detection and classification using image analysis. Textile Res. J. 1995, 65, 1–9.
6. Dar, I.M.; Mahmood, W.; Vachtsevanos, G. Automated pilling detection and fuzzy classification of textile fabrics. In Machine Vision Applications in Industrial Inspection V; SPIE: Bellingham, WA, USA, 1997; Volume 3029, pp. 26–36.
7. Abril, H.C.; Garcia-Verela, M.S.M.; Moreno, Y.M.T.; Navarro, R.F. Automatic method based on image analysis for pilling evaluation in fabrics. Optical Eng. 1998, 37, 2937–2947.
8. Konda, A.; Xin, L.C.; Takadera, M. Evaluation of Pilling by Computer Image Analysis. J. Textile Machin. Soc. Jpn. 1988, 36, 96–107.
9. Ramgulam, R.B.; Amirbayat, J.; Porat, I. The Objective Assessment of Fabric Pilling, Part 1: Methodology. J. Textile Inst. 1993, 84, 221–226.
10. Xu, B. Instrumental Evaluation of Fabric Pilling. J. Textile Inst. 1997, 88, 488–500.
11. His, C.H.; Bresee, R.R.; Annis, P.A. Characterizing Fabric Pilling by Using Image Analysis Techniques, Part 1: Pill Detection and Description. J. Textile Inst. 1997, 88, 80–95.
12. Xin, B.; Hu, J. Objective Evaluation of Fabric Pilling Using Image Analysis Techniques. Textile Res. J. 2002, 72, 1057–1064.
13. Jensen, K.L.; Carstensen, J.M. Fuzz and Pills Evaluated on Knitted Textiles by Image Analysis. Textile Res. J. 2002, 72, 34–38.
14. Behera, B.K.; Madan Mohan, T.E. Objective Measurement of Pilling by Image Processing Technique. Internat. J. Clothing Sci. Technol. 2005, 17, 279–291.
15. Behera, B.K.; Mishra, R. Objective Measurement of Fabric Appearance Using Digital Image Processing. J. Textile Inst. 2006, 97, 147–153.
16. Xu, B. Identifying Fabric Structures with Fast Fourier Transform Techniques. Textile Res. J. 1996, 66, 496–506.
17. Kuo, C.F.J.; Lee, C.J.; Tsai, C.C. Using a Neural Network to Identify Fabric Defects in Dynamic Cloth Inspection. Textile Res. J. 2003, 73, 238–244.
18. Tilocca, A.; Borzone, P.; Carosio, S.; Durante, A. Detecting Fabric Defects with a Neural Network Using Two Kinds of Optical Patterns. Textile Res. J. 2002, 72, 545–550.
19. Park, S.W.; Hwang, Y.G.; Kang, R.C. Applying Fuzzy Logic and Neural Networks to Total Hand Evaluation of Knitted Fabric. Textile Res. J. 2000, 70, 675–681.
20. Rajasekaran, S. Training-free Counter Propagation Neural Network for Pattern Recognition of Fabric Defects. Textile Res. J. 1997, 67, 401–405.
21. Palmer, S. Objective Classification of Fabric Pilling Based on the Two-dimensional Discrete Wavelet Transform. Textile Res. J. 2003, 73, 713–720.
22. Barrett, G.; Clapp, T.G.; Titus, K. An On-Line Fabric Classification Technique Using a Wavelet-based Neural Network Approach. Textile Res. J. 1996, 66, 521–528.
23. Kim, S.C.; Kang, T.J. Image Analysis of Standard Pilling Photographs Using Wavelet Reconstruction. Textile Res. J. 2005, 75, 801–811.
24. Palmer, S.; Wang, X. Evaluating the Robustness of Objective Pilling Classification with the Two-dimensional Discrete Wavelet Transform. Textile Res. J. 2004, 74, 140–145.
25. Zhang, J.; Wang, X.; Palmer, S. Objective grading of fabric pilling with wavelet texture analysis. Textile Res. J. 2007, 77, 871–879.
26. Shamrat, F.J.M.; Ghosh, P.; Sadek, M.H.; Kazi, M.A.; Shultana, S. Implementation of machine learning algorithms to detect the prognosis rate of kidney disease. In Proceedings of the 2020 IEEE International Conference for Innovation in Technology (INOCON), Bangalore, India, 6–8 November 2020; pp. 1–7.
27. Mohammed, R.; Rawashdeh, J.; Abdullah, M. Machine learning with oversampling and undersampling techniques: Overview study and experimental results. In Proceedings of the 2020 11th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; pp. 243–248.
28. Pietersma, D.; Lacroix, R.; Lefebvre, D.; Wade, K.M. Performance analysis for machine-learning experiments using small data sets. Comput. Electron. Agric. 2003, 38, 1–17.
29. Gros, C.; Lemay, A.; Cohen-Adad, J. SoftSeg: Advantages of soft versus binary training for image segmentation. Medical Image Anal. 2021, 71, 102038.
30. Visa, S.; Ramsay, B.; Ralescu, A.L.; Van Der Knaap, E. Confusion matrix-based feature selection. MAICS 2011, 710, 120–127.
31. Yu, L.; Zhou, R.; Chen, R.; Lai, K.K. Missing data preprocessing in credit classification: One-hot encoding or imputation? Emerg. Markets Finance Trade 2022, 58, 472–482.
32. Elreedy, D.; Atiya, A.F. A comprehensive analysis of synthetic minority oversampling technique (SMOTE) for handling class imbalance. Inform. Sci. 2019, 505, 32–64.
33. Thara, D.K.; PremaSudha, B.G.; Xiong, F. Auto-detection of epileptic seizure events using deep neural network with different feature scaling techniques. Pattern Recog. Lett. 2019, 128, 544–550.
34. Charbuty, B.; Abdulazeez, A. Classification based on decision tree algorithm for machine learning. J. Appl. Sci. Technol. Trends 2021, 2, 20–28.
35. Ying, C.; Qi-Guang, M.; Jia-Chen, L.; Lin, G. Advance and prospects of AdaBoost algorithm. Acta Autom. Sinica 2013, 39, 745–758.
36. Zareapoor, M.; Shamsolmoali, P. Application of credit card fraud detection: Based on bagging ensemble classifier. Procedia Comput. Sci. 2015, 48, 679–685.
37. Livingston, F. Implementation of Breiman's random forest machine learning algorithm. ECE591Q Machine Learn. J. Paper 2005, 2005, 1–13.
38. Kleinbaum, D.G.; Klein, M. Introduction to logistic regression. In Logistic Regression; Springer: New York, NY, USA, 2010; pp. 1–39.
39. Jurtz, V.I.; Johansen, A.R.; Nielsen, M.; Almagro Armenteros, J.J.; Nielsen, H.; Sønderby, C.K.; Winther, O.; Sønderby, S.K. An introduction to deep learning on biological sequence data: Examples and solutions. Bioinformatics 2017, 33, 3685–3690.
40. Han, J.; Moraga, C. The influence of the sigmoid function parameters on the speed of backpropagation learning. In International Workshop on Artificial Neural Networks; Springer: Berlin/Heidelberg, Germany, 1995; pp. 195–201.
41. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Machine Learn. Res. 2014, 15, 1929–1958.
42. Halgamuge, M.N.; Daminda, E.; Nirmalathas, A. Best optimizer selection for predicting bushfire occurrences using deep learning. Nat. Hazards 2020, 103, 845–860.
43. Bock, S.; Goppold, J.; Weiß, M. An improvement of the convergence proof of the ADAM-Optimizer. arXiv 2018, arXiv:1804.10587.
44. Ruder, S. An overview of gradient descent optimization algorithms. arXiv 2016, arXiv:1609.04747.
45. Bergstra, J.; Bengio, Y. Random search for hyper-parameter optimization. J. Machine Learn. Res. 2012, 13, 281–305.
46. Chandra, R.; Dagum, L.; Kohr, D.; Menon, R.; Maydan, D.; McDonald, J. Parallel Programming in OpenMP; Morgan Kaufmann: Burlington, MA, USA, 2001.
47. Rosenthal, J.S. Parallel computing and Monte Carlo algorithms. Far East J. Theor. Stat. 2000, 4, 207–236.
48. Drakopoulos, V.; Mimikou, N.; Theoharis, T. An overview of parallel visualisation methods for Mandelbrot and Julia sets. Comput. Graph. 2003, 27, 635–646.
49. Pacheco, P. An Introduction to Parallel Programming; Elsevier: Amsterdam, The Netherlands, 2011.
50. Rocklin, M. Dask: Parallel computation with blocked algorithms and task scheduling. In Proceedings of the 14th Python in Science Conference (SciPy), Austin, TX, USA, 6–12 July 2015; Volume 130.
Figure 1. An example of the image dataset.
Figure 2. Flow chart of the methodology used to compare between different models.
Figure 3. An example of one sample from the original study dataset [2,3].
Figure 4. A sample of the cleaned dataset.
Figure 5. Distribution of the target class.
Figure 6. Distribution of the binary target class.
Figure 7. Box plot for important features.
Figure 8. Pie chart for the column shade.
Figure 9. Effect of average number of protrusions per observation on the target class.
Figure 10. Effect of average height of protrusions on the target class.
Figure 11. Full dataset correlation heat map.
Figure 12. Scatter plot for max height and average height of protrusions.
Figure 13. Scatter plot for average distance between protrusions and average number of protrusions.
Figure 14. Scatter plot for shade and brightness.
Figure 15. Relationship between the direction of the image to its brightness.
Figure 16. An example of the processed dataset after one-hot encoding the column color.
Figure 17. Different maximum depth sizes for decision tree algorithm.
Figure 18. Neural network optimized layers used in the research.
Figure 19. Shared memory architecture.
Figure 20. Dask profiling.
Figure 21. Visualization for execution time in minutes.
Figure 22. Performance comparison between various approaches.
Table 1. Performance measurement criteria. Confusion matrices are given as [row 1; row 2].

Algorithm | Confusion Matrix | Accuracy | Precision | Recall | F1 Score
Decision Tree | [52 12; 9 37] | 0.809 | 0.81 | 0.81 | 0.81
Bagging Classifier (Decision Tree Base) | [49 15; 3 43] | 0.836 | 0.86 | 0.84 | 0.84
Logistic Regression | [45 19; 6 40] | 0.772 | 0.80 | 0.77 | 0.77
AdaBoost (Logistic Regression Base) | [45 19; 5 41] | 0.782 | 0.81 | 0.78 | 0.78
Random Forest | [55 9; 6 40] | 0.863 | 0.87 | 0.86 | 0.86
Neural Network | [56 5; 3 46] | 0.927 | 0.92 | 0.95 | 0.93
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
