Article

A Two-Stage Automatic Color Thresholding Technique

1 HP-NTU Digital Manufacturing Corporate Lab, Nanyang Technological University, Singapore 639798, Singapore
2 HP Security Lab, HP Inc., Bristol BS1 6NP, UK
* Author to whom correspondence should be addressed.
Sensors 2023, 23(6), 3361; https://doi.org/10.3390/s23063361
Submission received: 9 February 2023 / Revised: 24 February 2023 / Accepted: 20 March 2023 / Published: 22 March 2023
(This article belongs to the Special Issue Machine Learning in Robust Object Detection and Tracking)

Abstract

Thresholding is a prerequisite for many computer vision algorithms. By suppressing the background in an image, one can remove unnecessary information and shift one’s focus to the object of inspection. We propose a two-stage histogram-based background suppression technique based on the chromaticity of the image pixels. The method is unsupervised, fully automated, and does not need any training or ground-truth data. The performance of the proposed method was evaluated using a printed circuit assembly (PCA) board dataset and the University of Waterloo skin cancer dataset. Accurately performing background suppression in PCA boards facilitates the inspection of digital images with small objects of interest, such as text or microcontrollers on a PCA board. The segmentation of skin cancer lesions will help doctors to automate skin cancer detection. The results showed a clear and robust background–foreground separation across various sample images under different camera or lighting conditions, which the naked implementation of existing state-of-the-art thresholding methods could not achieve.

1. Introduction

Thresholding is a key computer vision (CV) technique for image segmentation. It is an important image pre-processing step for many applications, such as medical image analysis [1,2], satellite image analysis [3], the spatio-temporal analysis of videos [4,5], and text document analysis [6,7]. Thresholding helps to distinguish the foreground or region of interest (ROI) from the background by grouping pixels into distinct sets based on features such as grey-level changes, edges, texture, color, smoothness, and pixel connectivity [8]. Thresholding techniques can be broadly categorized into global and local methods. Global thresholding applies a fixed threshold value to the entire image in order to separate background and foreground pixels. The local thresholding approach adjusts the threshold value to different subregions in the image and is therefore dependent on the subregion. Concretely, local thresholding algorithms determine the threshold based on the pixel values around a small region or block [9]. As a result, better segmentation within subregions can be achieved. However, recalculating the threshold for each subregion in an image can be computationally expensive. Furthermore, choosing an optimal block size for local thresholding is critical, since the wrong block size may cause blocking artefacts and result in poor foreground–background separation [10]. Issues arise when natural variations occur within an image dataset caused by artefacts such as varying background colors, lighting differences, and camera specifications, to name a few. A thresholding system resilient to these natural and often-occurring deviations in real-life applications is needed to automate the thresholding process, and existing local and global techniques are insufficient to address these requirements.
Conventional thresholding techniques segment the image into two or more classes based on the grayscale value, which holds the intensity information of a pixel. Building upon grayscale thresholding, traditional color thresholding methods use the color intensity to determine a threshold value [8]. Well-established, commonly used methods for grayscale thresholding include Otsu's [11] and Kapur's [12] methods, which were initially proposed for grayscale images and later tailored for RGB-based color image thresholding [1,13,14,15,16,17]. RGB-based color thresholding techniques separate the foreground and background pixels based on the gray value of each RGB channel. Moreover, in the RGB-based image representation, the intensity of the image pixels is not decoupled from the chromaticity, and there is a high correlation between the R, G, and B channels. Thus, RGB channel-based color thresholding techniques are susceptible to intensity changes and are not reliable for thresholding color images. Therefore, other color spaces are generally applied to image processing, such as the HSV (hue, saturation, value) space or the CIE L*a*b* space, where the chromatic features are decoupled from the intensity parameters [8]. In this paper, we propose a two-stage global–local HSV-based color thresholding technique. The main contributions of our method can be summarized as follows:
  • A fully automated histogram-based color thresholding approach is provided, which is invariant to natural variations in images such as varying background colors, lighting differences, and camera specifications.
  • Block size determination and addressing the blocking artefacts problem during the local thresholding stage are achieved by automatically detecting the blocks from the global thresholded image.
  • The method represents an unsupervised technique, as it does not require any labeled data for training, making it advantageous in situations where labeled data are limited or difficult/costly to generate.
An automatic thresholding technique that adapts to image variations can provide immense value to various CV applications. Although the method is illustrated herein with the use case of thresholding images depicting printed circuit assembly (PCA) boards and skin lesion images, the techniques apply to other datasets. To motivate the techniques’ application, consider how PCA boards are becoming increasingly complex and densely packed with small components, making the clean separation of foreground and background pixels increasingly challenging. Performing this accurately makes a big difference in the ability to automate the visual inspection of PCA boards and to facilitate counterfeit or security analyses [18]. By automating the segmentation of lesions from medical images, physicians will be able to detect abnormalities clearly and make more accurate diagnoses.
The paper is organized as follows: Section 2 briefly describes fundamental image thresholding research. Section 3 presents the proposed two-stage automatic global–local color thresholding technique. Section 4 presents the implementation details, Section 5 provides details of the evaluation metrics, and Section 6 explains the experimental results. Section 7 discusses the results, and Section 8 describes the application areas. Section 9 presents the limitations and future directions of the proposed method, and Section 10 concludes the paper.

2. Related Work

In digital image processing, thresholding is a well-known technique for image segmentation. Because of its wide applicability, a range of different thresholding methods have been proposed over the years [9,19]. Histogram-based image segmentation methods are one of the most promising conventional computer vision techniques to separate foreground objects from an image background, and various types are presented in the literature [8,9,12,20,21,22,23]. Otsu’s method automatically finds an optimal threshold value from the intensity of the image histogram, which minimizes intra-class intensity variation between the foreground and background classes [24]. Bhandari [3] and Kapur’s [12] methods find the optimal threshold value by minimizing the cross-entropy between the original image and the thresholded image using the histogram. They have been used for decades in combination with other techniques. As an example, Su et al. [25] and Qi et al. [26] recently proposed multilevel thresholding methods that apply 2D histograms, 2D Kapur entropy, and non-local means to segment chest X-ray images. Otsu’s intraclass variation method and Kapur’s entropy-based techniques are promising for images with bimodal histograms [8], but they are not suitable for images with multimodal or unimodal histograms that have small foreground regions such as the PCA boards. Parker [27] described p-tile, two-peak, and local contrast thresholding. The p-tile method is one of the oldest basic histogram-based thresholding methods, which requires the manual input of a desired black or white pixel ratio. The two-peak method locates the valley between two peaks in a gray-level histogram and defines the valley as the ideal threshold value. Local contrast thresholding segments an image by enhancing the edges and then classifying the pixels based on the contrast measure of the gray-level co-occurrence matrix. Niblack [23] introduced a variable histogram-based thresholding method where the threshold value is dynamically adjusted to the mean and standard deviation in a neighborhood of a particular pixel. Sauvola et al. [28] proposed a local adaptive thresholding method that uses soft decision-based thresholding for non-textual components and histogram-based thresholding for textual components. To partition the image into textual and non-textual regions, they computed the average gray values and the transient difference of local windows.
Histogram concavity analysis methods have been employed for decades. Early research in this field [2,29,30] has shown that selecting thresholds based on the valleys or shoulders (concavities) of a histogram function leads to the sufficient separation of background and foreground pixels in an image. A fundamental histogram concavity analysis technique is the mode method, which was applied by Prewitt and Mendelsohn [2] to analyze images of human blood cells. This method places thresholds at the local minima of a smoothened histogram. To further isolate significant peaks and valleys from insignificant ones, several early methods have been proposed, such as weighting pixels according to the values of neighboring pixels [31] or recursively thresholding the histogram function [32]. Compared to cluster-based methods such as the p-tile method [33], which segments the image based on a manually pre-defined object area, histogram concavity methods have the advantage of automatically grouping pixels based on the result of the concavity analysis and therefore reducing the likelihood of inadvertently classifying foreground objects as background. However, one limitation of concavity analysis, pointed out by [9], is the lack of resilience when it comes to histograms with sharp peaks and elongated shoulders.
Text document binarization is an important task in the field of document analysis and image processing. Over the years, many techniques have been proposed in the literature for text document binarization [34,35]. Wolf and Jolion [36] proposed a text segmentation system that encompasses a contrast-based binarization technique combining the segmentation quality of Niblack’s method and the robustness of Sauvola’s method. Feng and Tan [37] described a local thresholding method for binarizing low-quality text images based on the gray values of two local windows. They compared the gray-value standard deviations of the primary (smaller) and secondary (larger) local window to separate text regions from background regions. Another well-suited method for text segmentation was introduced by Shaikh et al. [7], who used an iterative partitioning approach that calculates an optimal grayscale threshold value for different cells in an image. It works particularly well on text document images with a noisy background. Bradley and Roth’s [5] method thresholds pixels based on the average pixel value of a surrounding square window of pixels. To compute the averages, they used the integral image to accomplish the computation in linear time. Singh et al. [38] also used a local contrast thresholding technique based on the mean pixel value, minimum, and maximum of a local window for binarizing noisy and text document images. Sukesh et al. evaluated various deep-learning-based text document binarization techniques in [39].
Medical image thresholding is a critical task in medical image processing and analysis that involves segmenting an image into distinct regions based on intensity values [40]. Feng et al. [41] used a multi-scale 3D Otsu thresholding algorithm for medical image segmentation. Fazilov et al. [42] incorporated Otsu’s method for mammographic image segmentation. Kapur’s thresholding technique was used in [43] for detecting tumors in MRI images using a transformed differential evolution algorithm. One of the most widely used network and training strategies for biomedical image segmentation is U-Net [44,45]. It works with few training images and outputs more precise segmentations than its predecessors. Venugopal et al. [46] proposed DTP-Net, a deep convolutional neural network (DCNN) that aims to predict the optimal grayscale threshold value for an image. Their method is tailored to binarizing lesions on dermatological macro-images. Similarly, Han et al. [47] described a DCNN-based skin lesion image segmentation method. They trained their HWA-SegNet model with the image’s 2D discrete Fourier transform frequency information and further fine-tuned the edge information of skin lesions. There has also been an increase in newly published image segmentation methods that are specifically tailored to thresholding magnetic resonance and X-ray images. Chen et al. [48] introduced a transformer-based method that incorporates multilevel region and edge information and thus achieved high DSI scores on their magnetic resonance image test dataset. To further improve the run-time quality and segmentation performance of such methods, Uslu and Bharath [49] proposed a quality control method that ultimately aims to increase the trustworthiness of DCNN-based methods in the medical image analysis field.
Many techniques [50,51,52] in the recent literature do not analyze the chromaticity information to threshold color images, since they incorporate RGB-channel-based thresholding. While some [10,53,54] do apply HSV or L*a*b* color space analysis to thresholding problems, they do not automatically determine the threshold limits based on the unique characteristics of the image. Our proposed method determines the chromaticity of the background or foreground pixels using the hue and saturation histograms and computes the optimal color threshold by considering the changes in histogram gradient and histogram cumulative area. Compared to other histogram-based thresholding techniques, it is suitable for thresholding images with unimodal, bimodal, and multimodal histograms, as well as histograms with sharp peaks or elongated shoulders. Thresholding solely based on histogram valleys and shoulders has a significant disadvantage compared to our proposed method: depending on the characteristics of a valley, a suboptimal threshold might be computed. Moreover, histogram concavity analysis (valley and shoulder analysis) is not suitable for thresholding images with bimodal or multimodal histograms. There are also advanced computer vision techniques using DCNNs aimed at developing medical image segmentation approaches [46,55]. The DCNN-based methods do not typically work well when thresholding small objects [56] such as PCA board components. In addition, DCNN models require a large set of labeled training data, which is not feasible here, as it would require the manual labeling of small PCA board components.
In this paper, we propose a two-stage histogram-based automatic color thresholding technique based on the chromaticity of the image pixels. The proposed method is an unsupervised technique, as it does not require any labeled data like deep learning-based thresholding techniques [46,55]. The details of the proposed method are provided in the following section.

3. Methods

In this section, we explain the proposed two-stage histogram-based automatic color thresholding technique to segment an image into foreground and background regions. Figure 1 provides a high-level overview of the entire technique, and Algorithm 1 describes the global and local thresholding stages. Initially, the image is converted to the hue (H), saturation (S), and value (V) format. In the HSV format, the chromaticity components (H and S) are decoupled from the intensity component (V). The intensity component V is excluded from the threshold computation to avoid illumination changes.
Figure 1. (a) Overall sequence of the two-stage global–local thresholding technique. Gray-shaded boxes indicate the key contributions of the proposed method. (b) Schematic overview of the global–local thresholding stages.
Algorithm 1 Two-stage global–local thresholding

Input:
  Img                ▹ Input color image
  Cutoff_Gradient    ▹ Threshold value for the histogram Avg_Gradient
  Cutoff_Area        ▹ Threshold value for the histogram Avg_Area
  Window_Size        ▹ Window size for calculating the Avg_Gradient and Avg_Area
  Limit1             ▹ The limit for the maximum continuous hue range
  Limit2             ▹ The limit for the maximum continuous saturation range
  C1                 ▹ A constant that determines the degree of change from global hue to local hue
  C2                 ▹ A constant that determines the degree of change from global saturation to local saturation
Output:
  GT_Img             ▹ Globally thresholded image, background in black and foreground in white
  LT_Img             ▹ Locally thresholded image, background in black and foreground in white

Stage 1: Global Thresholding
 1: Let H, S, and V be the hue, saturation, and value components of Img
 2: PMF_H ← compute_pmf(H)
 3: Smoothened_PMF_H ← smoothening(PMF_H)
 4: Peak_H ← argmax(Smoothened_PMF_H)                       ▹ Index of the max H component from Smoothened_PMF_H
 5: Nominated_Range_GH ← nominated_range(Smoothened_PMF_H)
 6: Result_GH ← {}
 7: while Peak_H not in Result_GH do                        ▹ Ensures that the global nominated hue range includes Peak_H
 8:     Nominated_Range_GH ← Nominated_Range_GH - Result_GH ▹ Discard the previously selected range and search again
 9:     Result_GH ← max_continuous_range(Nominated_Range_GH, Limit1)
10: end while
11: Global_H_Range(GH_low, GH_high) ← [min(Result_GH), max(Result_GH)]
12: Shortlisted_S ← find_sat_values(GH_low, GH_high, H, S)
13: PMF_S ← compute_pmf(Shortlisted_S)
14: Smoothened_PMF_S ← smoothening(PMF_S)
15: Nominated_Range_GS ← nominated_range(Smoothened_PMF_S)
16: Result_GS ← max_continuous_range(Nominated_Range_GS, Limit2)
17: Global_S_Range(GS_low, GS_high) ← [min(Result_GS), max(Result_GS)]
18: GT_Img ← threshold(Img, Global_H_Range(GH_low, GH_high), Global_S_Range(GS_low, GS_high))
19: LT_Img ← local_threshold(Img, GT_Img, Global_H_Range(GH_low, GH_high), Global_S_Range(GS_low, GS_high))

Stage 2: Local Thresholding
20: function local_threshold(Img, GT_Img, Global_H_Range(GH_low, GH_high), Global_S_Range(GS_low, GS_high))
21:     Final_Image ← zeros(Img.width, Img.height)          ▹ Initialize a blank image with the same size as the input image Img
22:     Blobs ← blobs in GT_Img                             ▹ Detect all blobs from GT_Img
23:     Relevant_blobs ← select_relevant_blobs(Blobs, minimum_Size = 500)   ▹ Blobs with an area ≥ 500 pixels are considered relevant
24:     for b in Relevant_blobs do
25:         Let [(min_x, min_y), (max_x, max_y)] be the location coordinates of blob b in GT_Img
26:         Cropped ← Img[min_x:max_x, min_y:max_y]         ▹ Crop the corresponding section from the input image Img
27:         H_local, S_local, V_local ← hue, saturation, and value components of the Cropped section
28:         H_local_confined ← H_local[(GH_low - C1):(GH_high + C1)]        ▹ Confine the local hue to the global hue range ± C1 to eliminate the hues of foreground components in the local region
29:         PMF_H_Local ← compute_pmf(H_local_confined)
30:         Smoothened_PMF_H_Local ← smoothening(PMF_H_Local)
31:         Nominated_Range_LH ← nominated_range(Smoothened_PMF_H_Local)
32:         Result_LH ← max_continuous_range(Nominated_Range_LH, Limit2)
33:         Local_H_Range(LH_low, LH_high) ← [min(Result_LH), max(Result_LH)]
34:         Shortlisted_S ← find_sat_values(Local_H_Range(LH_low, LH_high), H_local, S_local)
35:         S_local_confined ← Shortlisted_S[(GS_low - C2):(GS_high + C2)]  ▹ Confine the local saturation to the global saturation range ± C2 to eliminate the saturation of foreground components in the local region
36:         PMF_S_Local ← compute_pmf(S_local_confined)
37:         Smoothened_PMF_S_Local ← smoothening(PMF_S_Local)
38:         Nominated_Range_LS ← nominated_range(Smoothened_PMF_S_Local)
39:         Result_LS ← max_continuous_range(Nominated_Range_LS, Limit2)
40:         Local_S_Range(LS_low, LS_high) ← [min(Result_LS), max(Result_LS)]
41:         Cropped_Thresholded ← threshold(Cropped, Local_H_Range(LH_low, LH_high), Local_S_Range(LS_low, LS_high))
42:         Final_Image[min_x:max_x + 1, min_y:max_y + 1] ← Cropped_Thresholded
43:     end for
44:     return Final_Image
45: end function

Helper Functions
46: function smoothening(histogram, window_size ← 5)
47:     smoothened_histogram ← zeros(len(histogram))
48:     for i in range(len(histogram)) do
49:         lower_bound ← max(0, i - window_size/2)
50:         upper_bound ← min(len(histogram), i + window_size/2)
51:         window_sum ← 0
52:         for j in range(lower_bound, upper_bound) do
53:             window_sum ← window_sum + histogram[j]
54:             smoothened_histogram[i] ← window_sum / ((upper_bound - lower_bound) + 1)
55:         end for
56:     end for
57:     return smoothened_histogram
58: end function

59: function compute_pmf(Hist)
60:     N ← sum(Hist)
61:     pmf ← zeros(len(Hist))
62:     for i in range(len(Hist)) do
63:         count ← Hist[i]
64:         pmf[i] ← count / N
65:     end for
66:     return pmf
67: end function
68:
69: function nominated_range(Smoothened_PMF, window_size ← 5)
70:     Nominated_Range ← {}
71:     for each h in Smoothened_PMF do
72:         if (h is not in Nominated_Range) then            ▹ Compute the Avg_Gradient and Avg_Area within the window
73:             h_lower ← max(0, h - window_size/2)
74:             h_upper ← min(len(Smoothened_PMF), h + window_size/2)
75:             Avg_Area ← [ Σ_{i = h_lower}^{h_upper} Smoothened_PMF(i) ] / ((h_upper - h_lower) + 1)
76:             if (h_upper < 180) then
77:                 Avg_Gradient ← [ Σ_{i = h_lower}^{h_upper} arctan(|Smoothened_PMF(i) - Smoothened_PMF(i + 1)|) ] / ((h_upper - h_lower) + 1)
78:             else
79:                 Avg_Gradient ← [ Σ_{i = h_lower}^{h_upper - 1} arctan(|Smoothened_PMF(i) - Smoothened_PMF(i + 1)|) ] / (h_upper - h_lower)
80:             end if
81:             if (Avg_Gradient ≥ Cutoff_Gradient) or (Avg_Area ≥ Cutoff_Area) then
82:                 Nominated_Range ← Nominated_Range ∪ {h_lower to h_upper}
83:             end if
84:         end if
85:     end for
86:     return Nominated_Range
87: end function
88:
89: function max_continuous_range(range_input, Limit)
90:     result ← {}
91:     current ← {}
92:     for r in range_input do
93:         if (len(current) == 0) or (|r - current[-1]| ≤ Limit) then
94:             current ← current ∪ {r}
95:         else
96:             if (len(current) > len(result)) then
97:                 result ← current
98:             end if
99:             current ← {r}
100:        end if
101:    end for
102:    if (len(current) > len(result)) then
103:        result ← current
104:    end if
105:    return result
106: end function
107:
108: function find_sat_values(H_low, H_high, H, S)
109:     Sat ← {}
110:     for i in range(len(H)) do
111:         if (H_low ≤ H[i] ≤ H_high) then
112:             Sat ← Sat ∪ {S[i]}
113:         end if
114:     end for
115:     return Sat
116: end function
117: function threshold(image, H_Range(H_low, H_high), S_Range(S_low, S_high))
118:     Thresholded_image ← zeros(image.height, image.width)
119:     for i in range(image.height) do
120:         for j in range(image.width) do
121:             H ← hue component of image[i][j]
122:             S ← saturation component of image[i][j]
123:             if ((H_low ≤ H ≤ H_high) and (S_low ≤ S ≤ S_high)) then
124:                 Thresholded_image[i][j] ← 0              ▹ Background pixel
125:             else
126:                 Thresholded_image[i][j] ← 255            ▹ Foreground pixel
127:             end if
128:         end for
129:     end for
130:     return Thresholded_image
131: end function
132:
133: function select_relevant_blobs(blobs, minimal_size)
134:     relevant_blobs ← {}
135:     for blob in blobs do
136:         area ← height(blob) × width(blob)
137:         if (area ≥ minimal_size) then
138:             relevant_blobs ← relevant_blobs ∪ {blob}
139:         end if
140:     end for
141:     return relevant_blobs
142: end function
After splitting the image into its HSV components, a hue probability mass function (PMF) is computed from the hue histogram of the input image. To reduce unwanted noise, the PMF is smoothened. It is computationally advantageous to perform the smoothening on the image's histogram rather than on the image itself, which effectively binds all necessary computations to a fixed-size histogram and thus avoids an increase in computational complexity with growing image resolution. After smoothening the PMF, a two-stage thresholding technique based on the chromaticity components is applied to both global and local image regions. In both stages, the proposed method first determines the probable background hue or saturation values (the nominated hue or saturation values) and then finalizes the optimal background hue or saturation range (the max continuous hue or saturation range) from the nominated values. A detailed explanation of the global and local thresholding stages is provided in Section 3.1 and Section 3.2.
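As an illustration of this step, the following sketch shows how the hue PMF and the histogram smoothening could be computed with Python and OpenCV (the library used by our implementation in Section 4); the function names and the moving-average formulation are illustrative rather than the exact released code.

```python
import cv2
import numpy as np

def hue_pmf(img_bgr):
    """Probability mass function of the hue channel (OpenCV hue range: 0-179)."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [180], [0, 180]).ravel()
    return hist / hist.sum()

def smoothen(pmf, window_size=5):
    """Moving-average smoothing applied to the fixed-size histogram, not to the
    image itself, so the cost does not grow with the image resolution."""
    kernel = np.ones(window_size) / window_size
    return np.convolve(pmf, kernel, mode="same")
```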

3.1. Stage 1: Global Thresholding

Stage 1 of Algorithm 1 defines the global hue and saturation threshold ranges that apply to the entire image. Initially, the image is converted to HSV format (step 1), and the PMF of the hue component is smoothened (steps 2 and 3). A specific hue ‘h’ qualifies for the nominated hue when either the average area within a window is greater than a predefined cut-off value (Cutoff_Area) or the average slope surrounding ’h’ (a sudden change in PMF within a window) is greater than a predefined cut-off value (Cutoff_Gradient) (step 5). We applied the max continuous hue range heuristic to the nominated hue values. The global max continuous hue range is the largest range of hues, including the peak hue value (PeakH) within the nominated hue values, for which the difference between the consecutive hue values is less than a certain small threshold Limit 1 (e.g., 2) (steps 6–11). For example, as presented in Figure 2c, the nominated hue for the input PCA board image (Figure 2a) is between 18 and 130. From the hue and saturation shades shown in Figure 2b, it is evident that this range includes shades of green, yellow, and blue. From the input image presented in Figure 2a, one can infer that the hue of the yellow text and the blue capacitor caused small peaks in the PMF, and these local maxima are also included in the nominated hue values. To eliminate these foreground hues, we used the maximum continuous hue range heuristic. This estimated continuous hue range is defined as the global hue range Global_H_Range(GH_low, GH_high). We refer to this as ’global’ since this hue range is derived from all image pixels. It is applied to the whole image for thresholding; hence, it is the global hue range. In Figure 2c, the green shaded region between 72 and 93 is the global hue range, which represents the background hues of the input PCA board image presented in Figure 2a.
As depicted in step 12 of Algorithm 1, to find the global saturation range, the pixels within the global hue range are shortlisted, and the S components of the shortlisted pixels (Shortlisted_S) are collected. From the smoothened PMF of Shortlisted_S, Algorithm 1 estimates the nominated saturation values as described in step 15. Then, we applied the maximum continuous saturation range heuristic to the nominated saturation values to obtain the global saturation range (GS_low to GS_high) (steps 16–17). The max continuous saturation range is the largest range of saturation values within the nominated saturation values for which the difference between consecutive saturation values is less than a certain small threshold Limit2 (e.g., 4). The max continuous hue range heuristic must select a range that includes the peak hue. In the max continuous saturation range heuristic, however, it is not compulsory to select the range including the peak saturation; the selection can also be a significant range that may or may not include the global maximum of the saturation histogram. Once the global hue and saturation ranges are fixed, the image is segmented into background and foreground regions. If a pixel's hue is within the global hue range and its saturation is within the global saturation range, that pixel is considered a background pixel; otherwise, it is considered a foreground pixel. A globally thresholded binary image is generated by setting the intensity values of all background pixels to '0' and the intensity of the foreground pixels to '255' (step 18). The input color image, the globally thresholded image, and the global hue–saturation threshold ranges are passed to stage 2 (local thresholding) to further improve the results within subregions (step 19).
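Assuming the global hue and saturation ranges have already been estimated, the thresholding of step 18 can be sketched with OpenCV's in-range test as follows; the helper name is illustrative, and the V channel is deliberately left unconstrained.

```python
import cv2

def global_threshold(img_bgr, hue_range, sat_range):
    """Pixels whose hue AND saturation fall inside the global ranges become
    background (0); all remaining pixels become foreground (255)."""
    hsv = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2HSV)
    (h_low, h_high), (s_low, s_high) = hue_range, sat_range
    background = cv2.inRange(hsv, (h_low, s_low, 0), (h_high, s_high, 255))
    return cv2.bitwise_not(background)  # invert so that the foreground is white
```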
Figure 2 explains the estimation of the nominated and max continuous hue and saturation ranges for a given input image. As per the proposed algorithm, a specific hue 'h' qualifies as a nominated hue value when either the average area or the average gradient within a window (including 'h') is greater than a predefined Cutoff_Area or Cutoff_Gradient, respectively. We set the Cutoff_Gradient value to 0.001, the Cutoff_Area to 1/180 (1 over the length of the histogram ≈ 0.0055), and the Window_Size constant to 5 (these values were heuristically determined, and more details about the parameter settings can be found in Section 4). In Figure 2c, the average gradient (Avg_Gradient) and average area (Avg_Area) within the window (where hue 72 is the starting point of the window) are 0.0966 and 0.0035, respectively. The Avg_Gradient is greater than the Cutoff_Gradient (0.0966 > 0.001), and the Avg_Area is smaller than the respective cut-off value (0.0035 < 0.0055). As described earlier, either the gradient OR the area must be greater than its respective cut-off value to be considered as the nominated hue. Both the Avg_Area and the Avg_Gradient of the preceding window of hue point 72 are less than their respective cut-off values. Likewise, the Avg_Area and Avg_Gradient of the window after hue point 93 are less than the cut-off values. Hence, 72–93 are included in the nominated hue range set. Similarly, hue ranges 18 to 35 and 116 to 130 are included in the nominated set. Finally, the shaded region 72 to 93 is selected as the optimal hue range by the maximum continuous hue range heuristic. The same reasoning applies to the saturation histogram in Figure 2d. Both the Avg_Area and the Avg_Gradient of the window before saturation point 137 and after saturation point 255 are less than the cut-off values. Hence, 137–255 are included in the nominated saturation range set. Similarly, the saturation ranges 30 to 55 and 75 to 90 are also included in the nominated set. Finally, the shaded region 137–255 is selected as the optimal saturation range by the maximum continuous saturation range heuristic.
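The nomination and max-continuous-range heuristics described above can be read as the following simplified sketch (an assumed rendering of the nominated_range and max_continuous_range helpers; variable names and the exact window handling are illustrative, and the peak-inclusion retry loop of steps 6–10 is omitted for brevity).

```python
import numpy as np

def nominated_bins(smoothed_pmf, cutoff_gradient=0.001, cutoff_area=None, window_size=5):
    """Nominate histogram bins whose surrounding window shows either a large
    average gradient or a large average area (probable background chromaticity)."""
    smoothed_pmf = np.asarray(smoothed_pmf)
    n = len(smoothed_pmf)
    if cutoff_area is None:
        cutoff_area = 1.0 / n                      # one over the histogram length
    nominated = set()
    for h in range(n):
        lo, hi = max(0, h - window_size // 2), min(n - 1, h + window_size // 2)
        window = smoothed_pmf[lo:hi + 1]
        avg_area = window.sum() / (hi - lo + 1)
        avg_gradient = np.arctan(np.abs(np.diff(window))).sum() / (hi - lo + 1)
        if avg_gradient >= cutoff_gradient or avg_area >= cutoff_area:
            nominated.update(range(lo, hi + 1))
    return sorted(nominated)

def max_continuous_range(bins, limit):
    """Longest run of nominated bins whose consecutive gaps do not exceed `limit`."""
    best, current = [], []
    for b in bins:
        if not current or b - current[-1] <= limit:
            current.append(b)
        else:
            if len(current) > len(best):
                best = current
            current = [b]
    if len(current) > len(best):
        best = current
    return (min(best), max(best)) if best else None
```

In the example of Figure 2c, the nominated hue ranges are 18–35, 72–93, and 116–130, and the max continuous range heuristic returns 72–93 as the global hue range.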

3.2. Stage 2: Local Thresholding

The second stage dynamically determines the varying local hue and saturation threshold ranges to refine the background and foreground segmentation within subregions. The locally relevant blocks or regions are detected from the global thresholded binary image (steps 22 and 23), and the detected blocks are improved using local hue and saturation thresholds. The size for local regions is automatically detected using blob detection techniques from the globally thresholded image (step 22). Areas around the blobs are extracted using a bounding box (see Figure 3c,d). We selected the relevant regions by eliminating backgrounds (areas outside the bounding boxes) and irrelevant regions (bounding boxes less than a minimal size) (step 23) for further refining. Subsequently, we picked a region from the relevant list, cropped the image section from the input color image, and computed the corresponding H, S, and V components (steps 24–27). The local H component is confined within the global hue range (step 28). This step helps to eliminate the foreground object’s hues while fine-tuning the background in the local regions. From the smoothened PMF of the local H components, the Local_H_Range(LH_low, LH_high) is computed (steps 31–33). Similarly, the local saturation range Local_S_Range(LS_low, LS_high) is computed as shown in steps 34 to 40. Thresholding is applied dynamically to the image regions based on the corresponding local hue and saturation ranges, and anything within the threshold range is classified as background (steps 41 and 42).
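A hedged sketch of the block detection in steps 22–23, using connected-component analysis on the globally thresholded binary image, is shown below; the 500-pixel bounding-box criterion is taken from Algorithm 1, while the helper itself is an assumed illustration rather than the released implementation.

```python
import cv2

def relevant_blob_boxes(gt_img, min_area=500):
    """Detect blobs in the globally thresholded binary image and keep the
    bounding boxes that are large enough to deserve local re-thresholding."""
    num, labels, stats, _ = cv2.connectedComponentsWithStats(gt_img, connectivity=8)
    boxes = []
    for i in range(1, num):                    # label 0 is the image background
        x, y, w, h, area = stats[i]
        if w * h >= min_area:                  # bounding-box area, as in select_relevant_blobs
            boxes.append((x, y, w, h))
    return boxes
```

Each returned box is then cropped from the input color image, its hue and saturation values are confined to the global ranges ± C1 and ± C2, and the crop is re-thresholded with the locally estimated ranges.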

4. Implementation Details

This section presents the implementation details and parameter settings for the proposed algorithm. For our implementation, we used Python 3.10 and the OpenCV image processing library [57]. The main parameters for Algorithm 1 are Cutoff_Gradient, Cutoff_Area, Window_Size, Limit1, Limit2, C1, and C2. These values are determined heuristically, and the following section explains how we fine-tuned these parameters to obtain optimal results.
A specific hue ‘h’ qualifies as the nominated hue when either the average area around h (within the Window_Size) is greater than Cutoff_Area or the average slope surrounding ‘h’ (within the Window_Size) is greater than the Cutoff_Gradient. We set the Cutoff_Gradient value to 0.001 and the optimal Cutoff_Area to 1/180 (1 over the length of the histogram ≈ 0.0055). The Window_Size constant was set to 5, which effectively calculates the average gradient and area within a window of five consecutive values.
When determining the continuous histogram ranges that qualify as a potential background hue or saturation, we introduced Limit1 (hue continuity) and Limit2 (saturation continuity), which refer to the maximum number of consecutive points on the histogram x-axis that could lie outside of our desired gradient or cumulative area. These parameters provide some flexibility when it comes to color discontinuities or variations in the background. Limit1 defines the allowable hue discontinuity, and Limit2 defines the allowable saturation discontinuity. The constants C1 and C2 define the degree of change in local hue and saturation from the global values. The optimal values for these constants depend on the variance in the image background chromaticity in local regions. If there are limited changes in background chromaticity (no shadows or different shades), a smaller value is sufficient. Otherwise, a higher value is required to perform accurate thresholding on local blobs. Please refer to Appendix B for a collection of configuration examples.
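To make the settings concrete, the heuristically determined values reported in Section 6 can be gathered into a single configuration; the dictionary layout below is only an illustrative convention for passing the settings around, not part of the algorithm itself.

```python
# Values reported for the PCA board dataset (Section 6.2); the skin cancer
# dataset used Limit1 = 4, Limit2 = 4, C1 = 12, and C2 = 12 (Section 6.1).
PCA_BOARD_PARAMS = {
    "Cutoff_Gradient": 0.001,  # minimum average gradient for a nominated bin
    "Cutoff_Area": 1 / 180,    # minimum average area (1 over the histogram length)
    "Window_Size": 5,          # window for computing Avg_Gradient and Avg_Area
    "Limit1": 2,               # allowable hue discontinuity
    "Limit2": 4,               # allowable saturation discontinuity
    "C1": 6,                   # local hue confinement around the global hue range
    "C2": 12,                  # local saturation confinement around the global saturation range
}
```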

5. Evaluation Metrics

We evaluated the performance of the image thresholding techniques using the following evaluation metrics, where GT refers to the ground-truth image and T refers to the thresholded image.

5.1. Dice Similarity Index

The Dice similarity index (DSI), or Dice similarity coefficient (DSC), is commonly used in computer vision tasks to measure the spatial overlap of two images [58] and is defined in Equation (1). The DSI is twice the area of the overlap divided by the total number of pixels in both images. A high DSI score indicates a large spatial overlap between GT and the thresholded image T.
\mathrm{DSI}(GT, T) = \frac{2\,|GT \cap T|}{|GT| + |T|} \qquad (1)
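For binary masks in which nonzero pixels denote the foreground, Equation (1) can be computed as in the following sketch (the helper name is illustrative):

```python
import numpy as np

def dice_similarity(gt, t):
    """Equation (1): DSI between a ground-truth mask and a thresholded mask."""
    gt_fg, t_fg = gt > 0, t > 0
    overlap = np.logical_and(gt_fg, t_fg).sum()
    return 2.0 * overlap / (gt_fg.sum() + t_fg.sum())
```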

5.2. Matthews Correlation Coefficient

The Matthews correlation coefficient (MCC) is a more reliable evaluation metric for binary classification tasks [59]. It considers the true positives, false positives, true negatives, and false negatives:
  • Correctly predicted foreground pixels are considered true positives (TP)—the number of pixels segmented as foreground in both GT and T images.
  • Falsely predicted foreground pixels are considered false positives (FP)—the number of pixels segmented as foreground in T and background in GT.
  • Correctly predicted background pixels are considered true negatives (TN)—the number of pixels segmented as background in both GT and T images.
  • Falsely predicted background pixels are considered false negatives (FN)—the number of pixels segmented as background in T and foreground in GT.
The MCC calculates the Pearson product-moment correlation coefficient between the thresholded image T and a ground-truth image GT and is defined in Equation (2); the higher the MCC score, the higher the thresholding accuracy.
\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP) \times (TP + FN) \times (TN + FP) \times (TN + FN)}} \qquad (2)
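Equation (2) translates directly into the following sketch (an assumed implementation; the factors are cast to float before multiplication to avoid integer overflow on large images):

```python
import numpy as np

def matthews_cc(gt, t):
    """Equation (2): MCC between a ground-truth mask GT and a thresholded mask T."""
    gt_fg, t_fg = gt > 0, t > 0
    tp = np.sum(gt_fg & t_fg)      # foreground in both GT and T
    tn = np.sum(~gt_fg & ~t_fg)    # background in both GT and T
    fp = np.sum(~gt_fg & t_fg)     # foreground in T, background in GT
    fn = np.sum(gt_fg & ~t_fg)     # background in T, foreground in GT
    denom = np.sqrt(float(tp + fp) * float(tp + fn) * float(tn + fp) * float(tn + fn))
    return (tp * tn - fp * fn) / denom if denom > 0 else 0.0
```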

5.3. Peak Signal-to-Noise Ratio

The peak signal-to-noise ratio (PSNR) is commonly used to evaluate the overall quality of an image. It is defined as the “proportion between maximum attainable powers and the corrupting noise that influence likeness of image” [60]. The PSNR is calculated as shown in Equation (3); the higher the PSNR value, the higher the thresholding accuracy.
\mathrm{PSNR} = 20 \times \log_{10}\left(\frac{MAX_I}{\sqrt{\mathrm{MSE}}}\right) \qquad (3)
The MAX_I value refers to the maximum intensity value. In our case, MAX_I was set to 255, which is the highest possible value in an 8-bit grayscale image. The mean squared error (MSE) between GT and the thresholded image T is defined in Equation (4), where GT_ij and T_ij represent the ground-truth image and the thresholded image intensity at the (i, j)-th position, respectively, and m, n are the height and width of the GT and T images.
\mathrm{MSE} = \frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}\left(GT_{ij} - T_{ij}\right)^{2} \qquad (4)
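Equations (3) and (4) can be computed together, for example as follows (an illustrative helper; MAX_I defaults to 255 for 8-bit images):

```python
import numpy as np

def psnr(gt, t, max_i=255.0):
    """Equations (3) and (4): PSNR between the ground-truth image GT and the thresholded image T."""
    mse = np.mean((gt.astype(np.float64) - t.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")        # identical images
    return 20.0 * np.log10(max_i / np.sqrt(mse))
```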

6. Results

We evaluated the proposed method using a skin cancer image dataset and a PCA board image dataset. We tested our method on the University of Waterloo skin cancer database [61], which contains 206 images and their corresponding ground-truth images. The PCA board dataset consists of 50 images of PCA boards with varying image quality and background colors. Our team captured 44 images of PCA boards with different image resolutions and lighting conditions and blue and green board background colors. To incorporate color variability, we downloaded six PCA board images with red, yellow, and orange backgrounds from a free stock image website [62]. The ground-truth images of the 50 PCA board images were produced using a semi-automatic process: global thresholding followed by manual adjustments. Figure 4 provides sample images and corresponding ground truths of the PCA board and skin cancer image datasets. The skin cancer dataset consists of macro skin images featuring lesions with a bimodal histogram, and the PCA board image dataset consists of a background featuring different colored foreground components with unimodal or multimodal histograms. The performance of our proposed two-stage thresholding technique was evaluated quantitatively and qualitatively against other state-of-the-art thresholding techniques [5,11,12,23,27,28,36,37,38,46], and the results are presented in Section 6.1 and Section 6.2.

6.1. Experimental Results Using the Skin Cancer Dataset

This section provides the experimental details for state-of-the-art thresholding techniques [5,11,12,23,27,28,36,37,38,44,46] using the University of Waterloo skin cancer database [61]. The authors of the DTP-Net thresholding method [46] provided a pre-trained network that was trained with a custom dataset of 4550 skin cancer images. The custom dataset was created by merging images of melanoma and nevus lesions from the University of Waterloo skin cancer database [61] and the MED-NODE [63], SD-260 [64], and SD-198 [65] databases. Ground-truth images of all 4550 images are not publicly accessible, and only the University of Waterloo skin cancer database contained the ground-truth images for evaluation. Hence, we used the University of Waterloo skin cancer database to compare the performance of DTP-Net and all other methods shown in Table 1.
The DTP-Net [46] performance was evaluated by fine-tuning the pre-trained model and training the model from scratch in addition to testing the pre-trained model provided by the authors. We evaluated U-Net [44] with the Resnet-152 [66] architecture as the backbone by fine-tuning the pre-trained model (using the skin cancer images) in addition to the pre-trained model (2012 ILSVRC ImageNet dataset [67]). We performed a five-fold cross-validation on the 206 images of the University of Waterloo skin cancer database (four folds with 41 images and the fifth fold with 42 images). Four folds were used to train (fine-tune or train from scratch) the U-Net and the DTP-Net model, and one fold was used for testing. This process was repeated five times, with each fold being used as the test set once. The DSI, MCC, and PSNR scores were then averaged over all five iterations, resulting in the final performance scores. We randomly selected five images from the training fold to set the parameters of the proposed two-stage thresholding technique. Limit1, Limit2, C1, and C2 were heuristically determined as 4, 4, 12, and 12 for the skin cancer dataset. Table 1 presents the DSI, MCC, and PSNR scores of the state-of-the-art thresholding techniques [5,11,12,23,27,28,36,37,38,46,66] and the proposed method. The U-Net model [44] achieved the highest performance scores for the skin cancer image dataset. From the results shown in Table 1 and Figure 5, it is evident that the proposed method was quantitatively and qualitatively more accurate in segmenting skin lesion images than the other methods [5,11,12,23,27,28,36,37,38,46] used for comparison.

6.2. Experimental Results Using the PCA Board Dataset

This section presents the experimental results of the state-of-the-art thresholding techniques [5,11,12,23,27,28,36,37,38,44,46] using our PCA board database. In order to test the performance of the thresholding techniques under varying conditions, we systematically created the PCA board dataset, including images with different background colors, lighting intensities, and image qualities. The DTP-Net [46] performance was also evaluated by fine-tuning the pre-trained model and training the model from scratch (using the PCA board images) in addition to the pre-trained model provided by the authors of DTP-Net. Similarly, we evaluated U-Net [44] with the Resnet-152 [66] architecture as the backbone by fine-tuning the pre-trained model and training the model from scratch (using the PCA board images) in addition to the pre-trained model (2012 ILSVRC ImageNet dataset [67]). We performed a five-fold cross-validation on the 50 PCA board images (each fold consisting of 10 images). Four folds were used to train (fine-tune or train from scratch) the DTP-Net model, and one fold was used for testing. This process was repeated five times, with each fold being used as the test set once. The DSI, MCC, and PSNR scores were then averaged over all five iterations, resulting in the final performance scores. To evaluate the efficacy of the proposed method for thresholding PCA board images with varying background colors, the parameters of the proposed method were set using five randomly selected green-colored PCA board images from the training fold. Limit1, Limit2, C1, and C2 were heuristically determined to be 2, 4, 6, and 12 for the PCA board dataset. In order to validate the statistical stability of the proposed method, we conducted statistical analyses using the Shapiro–Wilk test [68], a one-way ANOVA [69] and the multiple comparisons test [70] for the proposed method applied to the PCA board dataset. The experimental results of the statistical analysis (presented in Appendix A) provided strong evidence for the robustness and reliability of the proposed method.
Table 2 provides the DSI, MCC, and PSNR scores of the thresholding techniques and our proposed method. From the results presented in Table 2 and Figure 6, it is evident that the proposed method achieved more accurate image segmentation results compared to other thresholding techniques in the literature.
To check the effect of background color changes, we deliberately determined the parameters for the algorithm using only green-colored PCA boards (five boards randomly selected from the training folds). Figure 7 depicts the thresholding results for the PCA boards with varying background colors; it is evident that the results of the proposed method were invariant to the changes in the PCA board’s background colors. We also analyzed the performance of the proposed method under varying lighting conditions. From the results illustrated in Figure 8, it is clear that the proposed method was effective in thresholding images with changes in intensity.
Figure 9 shows the output of the global and local thresholding stages of the proposed two-stage color thresholding technique. We could efficiently suppress the PCA board background without affecting the small foreground objects. Furthermore, the image resolution differed significantly across the sample images, ranging from 0.6 MP to 41.9 MP. The global thresholding stage (center column) effectively suppressed the background colors yet left unwanted traces in the output, such as small particles, shadows, and incorrectly connected components. The local thresholding stage (right column) further improved the results by removing such traces. The overall results showed that the proposed method was invariant to changes in background color, illumination, and image quality.

7. Discussion

We proposed a global–local color thresholding technique based on the chromaticity of image pixels. The performance of the proposed method was evaluated using the University of Waterloo skin cancer dataset and a new PCA board dataset. From the experimental results presented in Table 1 and Table 2 and Figure 5 and Figure 6, it is evident that the proposed two-stage global–local color thresholding method outperformed the state-of-the-art thresholding techniques in suppressing the image background.
As depicted in Table 1, the U-Net model achieved the highest performance score (DSI 0.8384, MCC 0.8384, and PSNR 18.79) for the skin cancer image dataset. The proposed method achieved the second highest score (DSI 0.7362, MCC 0.7259, and PSNR 16.2185), and the DTP-Net pre-trained model had the third highest score (DSI 0.6794, MCC 0.6639, and PSNR 15.7098). The U-Net and DTP-Net methods are supervised techniques that require annotated images for training the network. The proposed two-stage color thresholding technique does not require any GT data for training, which is advantageous in the medical domain, as such GT images are limited in number and expensive to obtain.
The PCA board dataset is more complex compared to the skin cancer dataset, since it consists of images depicting small foreground components with varying image quality, intensity, and background color. As presented in Table 2, the proposed method outperformed both the deep-learning-based U-Net model and the DTP-Net fine-tuned model in terms of performance scores. The proposed method achieved a DSI of 0.9846, an MCC of 0.9791, and a PSNR of 23.1545, which were significantly higher than the DSI of 0.6922, MCC of 0.5858, and PSNR of 9.555 achieved by the U-Net model and the DSI of 0.6431, MCC of 0.4996, and PSNR of 8.2162 achieved by the DTP-Net fine-tuned model. The pre-trained network (provided by the DTP-Net authors) was fine-tuned with PCA board images. The U-Net was trained from scratch on PCA board images. To train and fine-tune the network, GT information was required, whereas the proposed method’s parameters were set heuristically based on five green-colored PCA board images. The inadequate performance of the deep learning methods for the PCA board images could be ascribed to the absence of a sufficiently large training dataset and the challenge of precisely thresholding small objects, such as the components typically observed on a PCA board. The DCNN-based methods are not well-equipped to handle such cases [56].
Even though the parameters of the proposed method were set for green-colored PCA boards, the proposed method was efficient in thresholding PCA boards with red, blue, and yellow background colors. In contrast, the performance of U-Net was notably worse for PCA board images with red and yellow backgrounds, which could be attributed to the limited number of training images for these colors in the PCA board dataset. It is worth noting that most PCA boards are typically available in green and blue colors, which could explain the lack of training data for yellow or red backgrounds. The thresholding results presented in Figure 7 and Figure 8 indicate that the U-Net and DTP-Net models were not robust for thresholding images with varying background colors and intensities. The results in Figure 7, Figure 8 and Figure 9 show that the proposed thresholding method was invariant to changes in the background color, intensity, and image quality. The findings suggest that for images without visible shadows, such as rows 1 and 2 of Figure 9, the global thresholding result was adequate, and for images with shadows, such as row 3 of Figure 9, the performance could be enhanced using the local thresholding stage. The proposed method was adaptive to changes in image background color, illumination, and image-capturing equipment. We did not have to adjust the parameters to achieve optimal thresholding for varying image conditions; the technique is fully automated, in contrast to many other color thresholding techniques in the literature [11,20,33]. The statistical results obtained from the Shapiro–Wilk analysis [68], one-way ANOVA [69], and multiple comparisons test [70] (presented in Appendix A) provided strong evidence for the robustness and reliability of the proposed method. Overall, our approach showed great potential in tackling the difficulties of image binarization in scenarios where there are limited training data, diverse image conditions, and a need to segment small objects.
To summarize, the proposed color thresholding technique is:
  • An unsupervised method, as it does not require any ground-truth data;
  • A robust method that is invariant to background color variations and changes in intensity;
  • A fully automated color thresholding approach, as there is no need to adjust parameters based on varying image conditions;
  • Able to automatically detect the block size for the local thresholding stage;
  • Effective at suppressing shadow regions;
  • Easily adjustable to different image qualities;
  • Efficient in suppressing background pixels of images with tiny foreground components;
  • Efficient in determining the threshold value for unimodal, bimodal, and multimodal histograms and also for histograms with sharp peaks and elongated shoulders;
  • Effective for symmetric, skewed, or uniform histogram analysis.

8. Application Areas

Automated skin lesion detection for disease diagnosis can act as an assistive tool for dermatologists to detect malignant lesions. Such a tool could be used to extract features of skin lesions and monitor changes over time. A robust skin lesion segmentation algorithm would form the basis for such a tool.
The dominant trend of outsourcing the PCA board manufacturing process to lower costs [18,71] and the increasing complexity of these boards have exposed the PCA board supply chain to a range of hardware attacks [72]. Scenarios exist whereby during assembly within the supply chain, a board’s functionality could be maliciously altered by adding, substituting, or removing components. Having human experts check boards during or after assembly is time-consuming, expensive, and prone to error. Therefore, robust computer vision techniques are required to analyze boards for unwanted alterations. The application of such techniques is a multistage process, including some pre-processing. As stated, one key pre-processing technique is distinguishing between background and foreground regions in an image (thresholding). Performing this task accurately makes a big difference in the ability to detect anomalies on a PCA board. Even though there are many well-established methods to perform thresholding, most are not fully automatic for varying PCA board conditions. In [18], the user must manually adjust the parameters to optimize results for backgrounds of varying colors, and the method in [71] requires the user to mark the foreground and background region with the help of an expert. These constraints motivated us to propose the two-stage global–local automatic thresholding technique to distinguish between the background of a PCA board and the foreground components mounted thereon.
In addition to the medical image and PCA board analysis, the proposed method could be utilized to analyze a range of images that are relevant for today’s robotic landscape, text detection or recognition, satellite image interpretation, or small object detection tasks. Figure 10 presents some sample images and the corresponding globally and locally thresholded outputs. Moreover, the potential privacy and security aspects of this method have not yet been studied. Due to the automated nature and adaptability of our proposed method, it may act as a building block for systems such as automated web image crawlers and text retrieval systems, which could entail privacy concerns. Existing methods can help to prevent automated text retrieval [73,74].

9. Limitations and Future Work

We observed that the presence of any additional, substantially large background region (e.g., a table or assembly line) affected the determination of the background and foreground hues. Hence, the input image must be properly cropped before being passed to the thresholding algorithm. This is not difficult to achieve, since such image backgrounds are uniform and can be cropped automatically; a simple heuristic is sketched below. Furthermore, foreground objects that have the same (or a very similar) hue value as the background are classified as background (refer to Figure 11a–c). This is because the segmentation process groups pixels by hue range, and foreground objects may fall into the background hue range. Conversely, background pixels may be classified as foreground if their hue is similar to the foreground color. As the handwritten text document example in Figure 11d–f demonstrates, the proposed method in its current form cannot suppress ink stains that resemble the color of the text font. The field of text document binarization has been extensively researched, and numerous techniques [7,34,35,75] in the literature offer effective solutions for binarizing text images similar to the example shown in Figure 11d. A future research direction may be to incorporate edge information together with chromaticity when determining the foreground or background, which would help to improve the thresholding accuracy for foreground objects with the same color as the background and vice versa. From the experimental results with varying image resolutions (Figure 9), it is evident that the quality of the thresholded output decreased when the image resolution was reduced. The 0.6 MP input image in Figure 9j yields connected blobs in its thresholded output (Figure 9l), compared to the higher-resolution (1.9 MP) image in Figure 9g–i. A future development could be to quantitatively measure the thresholded image quality, so that end users can determine the minimum image resolution needed to meet their requirements. U-Net-based segmentation methods have demonstrated impressive performance in many challenging image segmentation tasks [45,76,77], but they require a large amount of annotated training data to achieve such accuracy. Manually generating ground-truth annotations for PCA images at the component level is a time-consuming and laborious process due to the small size of the components. In the future, if enough annotated data become available, it would be worthwhile to evaluate the performance of U-Net-based segmentation methods on PCA board images.
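For illustration, a minimal Python sketch of such an automatic crop is given below. It is a simple heuristic written for this discussion, not part of the proposed algorithm; the file name and tolerance value are placeholders. It keeps the bounding box of pixels whose color differs noticeably from the median border color.

    import cv2
    import numpy as np

    def crop_uniform_border(img, tol=15):
        """Crop a roughly uniform border (e.g., a table or assembly-line surface).

        Keeps the bounding box of pixels whose summed per-channel difference
        from the median border color exceeds `tol` (illustrative heuristic).
        """
        border = np.concatenate([img[0, :], img[-1, :], img[:, 0], img[:, -1]])
        bg = np.median(border, axis=0)
        diff = np.abs(img.astype(int) - bg.astype(int)).sum(axis=2)
        ys, xs = np.where(diff > tol)
        if len(xs) == 0:              # nothing differs from the border color
            return img
        return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

    board = crop_uniform_border(cv2.imread("board_on_table.png"))  # placeholder path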

10. Conclusions

In this paper, we presented an unsupervised automatic color thresholding approach to threshold images by isolating significant hue ranges using the image histogram. To evaluate this method, we used a custom-generated PCA board image dataset (with varying background colors, lighting, and image quality) and the University of Waterloo skin cancer image database. We thereby focused on separating the PCA board foreground components from the board’s background and skin lesion binarization. Our proposed global–local color thresholding technique achieved good performance in terms of DSI, MCC, and PSNR scores compared to the naked implementations of state-of-the-art thresholding methods. The proposed method performed well for segmenting lesions from skin cancer images and thresholding small components from PCA board images without any training data. The results showed a clear and robust background–foreground separation across PCA boards with varying background colors, cameras, and lighting setups. With the advancements in PCA board design and components rapidly shrinking in size, such an automated and reliable thresholding method is key for detecting anomalies on PCA boards. The proposed method is fully automatic and does not require any ground-truth information, and it is advantageous in the medical domain, as obtaining ground-truth images is expensive and strenuous. Our approach showed great potential in tackling the difficulties of image binarization in scenarios where there are limited training data, diverse image conditions, and a need to segment small objects.

Author Contributions

Conceptualization, S.P., D.E. and T.G.; methodology, S.P., D.E. and T.G.; software, S.P. and T.G.; validation, S.P. and T.G.; data curation, S.P. and T.G.; writing—original draft preparation, S.P., D.E. and T.G.; writing—review and editing, S.P., D.E., T.G. and Y.L.; supervision, S.P., D.E. and Y.L.; funding acquisition, Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This study is supported under the RIE2020 Industry Alignment Fund—Industry Collaboration Projects (IAF-ICP) Funding Initiative (I1801E0028), as well as cash and in-kind contribution from the industry partner, HP Inc., through the HP-NTU Digital Manufacturing Corporate Lab.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
CV      Computer vision
DCNN    Deep convolutional neural network
DSI     Dice similarity index
FN      False negative
FP      False positive
GT      Ground truth
HSV     Hue, saturation, value
MCC     Matthews correlation coefficient
MP      Megapixels
PCA     Printed circuit assembly
PCB     Printed circuit board
PMF     Probability mass function
PSNR    Peak signal-to-noise ratio
ROI     Region of interest
TN      True negative
TP      True positive

Appendix A. Statistical Evaluation of the Proposed Method on the PCA Board Dataset

In order to validate the statistical stability of our algorithm and its performance against competitive methods, we performed a one-way analysis of variance (ANOVA) [69] using the 50 images in the PCA board dataset. Before applying the ANOVA test, we first performed a Shapiro–Wilk test [68] on the PSNR scores of the top five methods presented in Table 2 (Otsu [11], two-peak [27], Kapur et al. [12], DTP-NET (fine-tuned model) [46], and the proposed method). The Shapiro–Wilk test is a widely used statistical tool for testing the normality assumption of data. The null hypothesis H0 is that the sample data (PSNR scores) originate from a normally distributed population. A cut-off significance of γ = 0.05 was chosen for this analysis. Table A1 shows the p-value obtained for the PSNR distribution of each method. The high p-values (>γ) in Table A1 indicate that, at γ = 0.05, there was not enough evidence to reject the null hypothesis, so each sample may be considered normally distributed. Q–Q plots visually confirmed that the PSNR scores were approximately normally distributed, backing up the Shapiro–Wilk test results.
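For reproducibility, the normality check can be reproduced with standard statistical libraries. The following minimal Python sketch illustrates the procedure; the per-image PSNR arrays are random placeholders loosely based on the means in Table 2 and should be replaced with the actual measured scores.

    import numpy as np
    from scipy import stats

    # Placeholder per-image PSNR scores (random, loosely based on the means in
    # Table 2); replace with the 50 measured PSNR values of each method.
    rng = np.random.default_rng(0)
    psnr_scores = {
        "Otsu": rng.normal(9.3, 3.7, 50),
        "Two-peak": rng.normal(7.7, 3.9, 50),
        "Kapur et al.": rng.normal(9.3, 3.5, 50),
        "DTP-NET (fine-tuned)": rng.normal(8.2, 4.2, 50),
        "Proposed method": rng.normal(23.2, 4.4, 50),
    }

    gamma = 0.05
    for method, scores in psnr_scores.items():
        w_stat, p_value = stats.shapiro(scores)
        verdict = "may be considered normal" if p_value > gamma else "not normal"
        print(f"{method}: W = {w_stat:.4f}, p = {p_value:.4f} -> {verdict}")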
Table A1. p-values obtained for Shapiro–Wilk analysis using the PSNR values of Otsu [11], two-peak [27], Kapur et al. [12], DTP-NET (fine-tuned model) [46], and the proposed method.
Method                            p-Value
Otsu [11]                         0.1383
Two-peak [27]                     0.1404
Kapur et al. [12]                 0.0501
DTP-NET fine-tuned model [46]     0.1239
Proposed method                   0.4575
After verifying normality, we performed a one-way ANOVA [69] test on the PSNR scores obtained for Otsu [11], two-peak [27], Kapur et al. [12], DTP-NET (fine-tuned model) [46], and the proposed method. The null hypothesis H0 was that the mean PSNR scores of the different methods are equal, and the alternative hypothesis H1 was that they are not all equal. A cut-off significance of γ = 0.05 was chosen for the analysis. Table A2 shows the p-value obtained for the PSNR scores. The small p-value (<γ) indicates statistically significant evidence at γ = 0.05 of a difference between the PSNR scores obtained by the different methods, so we had sufficient evidence to reject H0.
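A one-way ANOVA over the per-method score arrays can be run as follows; again, the data below are placeholders rather than the measured PSNR values.

    import numpy as np
    from scipy import stats

    # Placeholder per-image PSNR scores for the five methods (replace with the
    # measured values); one array of 50 scores per method.
    rng = np.random.default_rng(1)
    groups = [rng.normal(mu, 4.0, 50) for mu in (9.3, 7.7, 9.3, 8.2, 23.2)]

    f_stat, p_value = stats.f_oneway(*groups)   # H0: all group means are equal
    print(f"one-way ANOVA: F = {f_stat:.3f}, p = {p_value:.3e}")
    if p_value < 0.05:
        print("Reject H0: at least one method has a different mean PSNR.")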
Table A2. p-value obtained for ANOVA test for Otsu [11], two-peak [27], Kapur et al. [12], DTP-NET (fine-tuned model) [46], and the proposed method.
             PSNR
p-value      2.11 × 10^−60
We also performed a multiple comparisons test [70] to evaluate the pair-wise differences between the PSNR scores obtained by Otsu [11], two-peak [27], Kapur et al. [12], DTP-NET (fine-tuned model) [46], and the proposed method. For each pair, the null hypothesis H0 was that the mean PSNR scores of the two methods are equal, and the alternative hypothesis H1 was that they are not, with a cut-off significance of γ = 0.05. Table A3 presents the results of the pair-wise comparisons between the PSNR scores of the different methods. All pair-wise comparisons involving the proposed method yielded small p-values (<γ); thus, we had statistically significant evidence to reject the null hypothesis that the mean of the proposed method equals that of the method it was compared with. It is evident from the graph in Figure A1 that the proposed method achieved better PSNR scores than the other four methods.
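Reference [70] covers a family of multiple comparison procedures; as one common concrete choice, a Tukey HSD pair-wise comparison can be computed with statsmodels as sketched below. The data are placeholders, and the exact procedure used to produce Table A3 may differ.

    import numpy as np
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    # Placeholder per-image PSNR scores (replace with the measured values).
    rng = np.random.default_rng(2)
    methods = ["Otsu", "Two-peak", "Kapur et al.", "DTP-NET", "Proposed"]
    means = [9.3, 7.7, 9.3, 8.2, 23.2]
    values = np.concatenate([rng.normal(m, 4.0, 50) for m in means])
    labels = np.repeat(methods, 50)

    result = pairwise_tukeyhsd(endog=values, groups=labels, alpha=0.05)
    print(result.summary())   # pair-wise mean differences, p-values, reject flags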
Table A3. Results of multiple comparisons test: pair-wise comparison of different methods based on the PSNR scores. The third column (Difference) indicates the difference between the means of Method 1 and Method 2. The fourth column (p-value) indicates the probability of the means of Method 1 and Method 2 being equal.
Method 1             Method 2                            Difference    p-Value
Proposed method      Otsu [11]                           −13.8644      0.0000
Proposed method      Two-peak [27]                       −15.4534      0.0000
Proposed method      Kapur et al. [12]                   −13.8882      0.0000
Proposed method      DTP-NET (fine-tuned model) [46]     −14.9383      0.0000
Otsu [11]            Two-peak [27]                       1.5889        0.2719
Otsu [11]            Kapur et al. [12]                   0.0237        1.0000
Otsu [11]            DTP-NET (fine-tuned model) [46]     −0.5151       0.6640
Two-peak [27]        Kapur et al. [12]                   −1.5652       0.2868
Two-peak [27]        DTP-NET (fine-tuned model) [46]     1.0739        0.9677
Kapur et al. [12]    DTP-NET (fine-tuned model) [46]     1.0502        0.6827
The overall results obtained from the Shapiro–Wilk analysis [68], one-way ANOVA [69], and multiple comparisons test [70] provided strong evidence for the robustness and reliability of the proposed method.
Figure A1. Multiple comparisons test: pair-wise comparison of different methods based on the PSNR scores. The central red line indicates the median PSNR value of each method, and the blue box is the interquartile range (IQR), indicating the 25th and 75th percentiles of the PSNR dataset. The dashed black line represents the whiskers, extending to the minimum and maximum data points. Statistical outliers are marked using the red ’+’ symbol.

Appendix B. Configurations of Parameters

In this section, we present some sample experimental results based on different configurations of the parameters Cutoff_Gradient, Cutoff_Area, Window_Size, Limit1, Limit2, C1, and C2 to show how the segmentation quality changed when adjusting the parameters.
As a reminder, a specific hue ‘h’ qualifies as the nominated hue when either the average area around ‘h’ (within the Window_Size) is greater than the Cutoff_Area or the average slope surrounding ‘h’ (within the Window_Size) is greater than the Cutoff_Gradient. We set the Cutoff_Gradient value as 0.001 and the optimal Cutoff_Area as 1/180 based on the shape of the hue or saturation histograms. The Window_Size constant was set to 5, which effectively calculated the average gradient and area within a window of five consecutive values. When determining the continuous histogram ranges that qualified as a potential background hue or saturation, we introduced Limit1 (hue continuity) and Limit2 (saturation continuity), which refer to the maximum number of consecutive points on the histogram x-axis that could lie outside of our desired gradient or cumulative area. These parameters provided some flexibility when it came to color discontinuities or variations in the background. Limit1 defines the allowable hue discontinuity, and Limit2 defines the allowable saturation discontinuity. The constants C1 and C2 define the degree of change in local hue and saturation from the global values, respectively. The optimal values for these constants depended on the variance in the image background chromaticity. The example configurations in Table A4 were based on the input image shown in Figure A2.
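To make the nomination rule concrete, the following Python sketch is a simplified re-implementation of the description above rather than the code used in the experiments; the synthetic hue PMF at the end is only for demonstration.

    import numpy as np

    def nominate_hues(hue_pmf, cutoff_gradient=0.001, cutoff_area=1 / 180,
                      window_size=5):
        """Nominate hue bins whose local average area or gradient exceeds a cut-off.

        hue_pmf: 1-D probability mass function of the hue channel (sums to 1).
        A bin h is nominated when the mean PMF mass (area) or the mean absolute
        gradient within a window of `window_size` bins centred on h exceeds the
        corresponding cut-off.
        """
        pmf = np.asarray(hue_pmf, dtype=float)
        grad = np.abs(np.gradient(pmf))
        half = window_size // 2
        nominated = []
        for h in range(len(pmf)):
            lo, hi = max(0, h - half), min(len(pmf), h + half + 1)
            if pmf[lo:hi].mean() > cutoff_area or grad[lo:hi].mean() > cutoff_gradient:
                nominated.append(h)
        return nominated

    # Demonstration with a synthetic unimodal hue PMF peaked near bin 83 (cf. Figure 2c)
    bins = np.arange(180)
    pmf = np.exp(-0.5 * ((bins - 83) / 6.0) ** 2)
    pmf /= pmf.sum()
    print(nominate_hues(pmf))   # a contiguous run of bins around the dominant hue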
Figure A2. The input image used to test different configurations of the parameters.
Table A4. Example configurations of parameters.
  • Ideal parameters (determined heuristically): Cutoff_Gradient: 0.001; Cutoff_Area: 1/180; Window_Size: 5; Limit1: 4; Limit2: 10; C1: 5; C2: 10.
  • Cutoff_Gradient: 0.1. An increase in Cutoff_Gradient by a factor of 100 to 0.1 led to a decrease in the nominated hue or saturation range, which led to the misclassification of some background pixels as foreground.
  • Cutoff_Gradient: 0.00001. A decrease in Cutoff_Gradient by a factor of 100 to 0.00001 led to an increase in the nominated hue or saturation range, which led to the misclassification of some foreground pixels as background. A change in Cutoff_Area led to the same effect.
  • Window_Size: 10, Limit1: 8, Limit2: 20. An increase in Window_Size, Limit1, and Limit2 by a factor of 2 led to an increase in the nominated hue or saturation range, which led to the misclassification of some foreground pixels as background.
  • Window_Size: 2, Limit1: 2, Limit2: 5. A decrease in Window_Size, Limit1, and Limit2 by a factor of 2 led to a decrease in the nominated hue or saturation range, which led to the misclassification of some background pixels as foreground.
  • C1: 10, C2: 20. An increase in C1 and C2 by a factor of 2 led to the misclassification of some foreground pixels as background in the local thresholding stage.
  • C1: 2, C2: 5. A decrease in C1 and C2 by a factor of 2 led to the misclassification of some background pixels as foreground in the local thresholding stage.

References

  1. Fan, H.; Xie, F.; Li, Y.; Jiang, Z.; Liu, J. Automatic segmentation of dermoscopy images using saliency combined with Otsu threshold. Comput. Biol. Med. 2017, 85, 75–85. [Google Scholar] [CrossRef] [PubMed]
  2. Prewitt, J.M.; Mendelsohn, M.L. The analysis of cell images. Ann. N. Y. Acad. Sci. 1966, 128, 1035–1053. [Google Scholar] [CrossRef] [PubMed]
  3. Bhandari, A.; Kumar, A.; Singh, G. Tsallis entropy based multilevel thresholding for colored satellite image segmentation using evolutionary algorithms. Expert Syst. Appl. 2015, 42, 8707–8730. [Google Scholar] [CrossRef]
  4. Fan, J.; Yu, J.; Fujita, G.; Onoye, T.; Wu, L.; Shirakawa, I. Spatiotemporal segmentation for compact video representation. Signal Process. Image Commun. 2001, 16, 553–566. [Google Scholar] [CrossRef]
  5. Bradley, D.; Roth, G. Adaptive thresholding using the integral image. J. Graph. Tools 2007, 12, 13–21. [Google Scholar] [CrossRef]
  6. White, J.M.; Rohrer, G.D. Image thresholding for optical character recognition and other applications requiring character image extraction. IBM J. Res. Dev. 1983, 27, 400–411. [Google Scholar] [CrossRef]
  7. Shaikh, S.H.; Maiti, A.K.; Chaki, N. A new image binarization method using iterative partitioning. Mach. Vis. Appl. 2013, 24, 337–350. [Google Scholar] [CrossRef]
  8. Garcia-Lamont, F.; Cervantes, J.; López, A.; Rodriguez, L. Segmentation of images by color features: A survey. Neurocomputing 2018, 292, 1–27. [Google Scholar] [CrossRef]
  9. Sahoo, P.K.; Soltani, S.A.; Wong, A.K. A survey of thresholding techniques. Comput. Vis. Graph. Image Process. 1988, 41, 233–260. [Google Scholar] [CrossRef]
  10. Hamuda, E.; Mc Ginley, B.; Glavin, M.; Jones, E. Automatic crop detection under field conditions using the HSV colour space and morphological operations. Comput. Electron. Agric. 2017, 133, 97–107. [Google Scholar] [CrossRef]
  11. Otsu, N. A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [Google Scholar] [CrossRef] [Green Version]
  12. Kapur, J.; Sahoo, P.; Wong, A. A new method for gray-level picture thresholding using the entropy of the histogram. Comput. Vis. Gr. Image Process 1985, 29, 273–285. [Google Scholar] [CrossRef]
  13. Pare, S.; Kumar, A.; Bajaj, V.; Singh, G.K. A multilevel color image segmentation technique based on cuckoo search algorithm and energy curve. Appl. Soft Comput. 2016, 47, 76–102. [Google Scholar] [CrossRef]
  14. Mushrif, M.M.; Ray, A.K. A-IFS histon based multithresholding algorithm for color image segmentation. IEEE Signal Process. Lett. 2009, 16, 168–171. [Google Scholar] [CrossRef]
  15. Cao, Z.; Zhang, X.; Mei, X. Unsupervised segmentation for color image based on graph theory. In Proceedings of the Second International Symposium on Intelligent Information Technology Application, Shanghai, China, 20–22 December 2008; Volume 2, pp. 99–103. [Google Scholar]
  16. Harrabi, R.; Braiek, E.B. Color image segmentation using multi-level thresholding approach and data fusion techniques. EURASIP J. Image Video Process. 2012, 1, 1–11. [Google Scholar]
  17. Kang, S.D.; Yoo, H.W.; Jang, D.S. Color image segmentation based on the normal distribution and the dynamic thresholding. Comput. Sci. Appl. ICCSA 2007, 4705, 372–384. [Google Scholar]
  18. Mehta, D.; Lu, H.; Paradis, O.P.; MS, M.A.; Rahman, M.T.; Iskander, Y.; Chawla, P.; Woodard, D.L.; Tehranipoor, M.; Asadizanjani, N. The big hack explained: Detection and prevention of PCB supply chain implants. ACM J. Emerg. Technol. Comput. Syst. (JETC) 2020, 16, 1–25. [Google Scholar] [CrossRef]
  19. Mittal, H.; Pandey, A.C.; Saraswat, M.; Kumar, S.; Pal, R.; Modwel, G. A comprehensive survey of image segmentation: Clustering methods, performance parameters, and benchmark datasets. Multimed. Tools Appl. 2021, 10, 1–26. [Google Scholar] [CrossRef]
  20. Dirami, A.; Hammouche, K.; Diaf, M.; Siarry, P. Fast multilevel thresholding for image segmentation through a multiphase level set method, Signal Process. Signal Process. 2013, 93, 139–153. [Google Scholar] [CrossRef]
  21. Sezgin, M.; Sankur, B. Survey over image thresholding techniques and quantitative performance evaluation. J. Electron. Imaging 2004, 13, 146–166. [Google Scholar]
  22. Glasbey, C.A. An analysis of histogram-based thresholding algorithms. CVGIP Graph. Model. Image Process. 1993, 55, 532–537. [Google Scholar] [CrossRef]
  23. Niblack, W. An Introduction to Digital Image Processing; Prentice-Hall: Englewood Cliffs, NJ, USA, 1986; pp. 115–116. [Google Scholar]
  24. Goh, T.Y.; Basah, S.N.; Yazid, H.; Safar, M.J.A.; Saad, F.S.A. Performance analysis of image thresholding: Otsu technique. Measurement 2018, 114, 298–307. [Google Scholar] [CrossRef]
  25. Su, H.; Zhao, D.; Elmannai, H.; Heidari, A.A.; Bourouis, S.; Wu, Z.; Cai, Z.; Gui, W.; Chen, M. Multilevel threshold image segmentation for COVID-19 chest radiography: A framework using horizontal and vertical multiverse optimization. Comput. Biol. Med. 2022, 146, 105618. [Google Scholar] [CrossRef] [PubMed]
  26. Qi, A.; Zhao, D.; Yu, F.; Heidari, A.A.; Wu, Z.; Cai, Z.; Alenezi, F.; Mansour, R.F.; Chen, H.; Chen, M. Directional mutation and crossover boosted ant colony optimization with application to COVID-19 X-ray image segmentation. Comput. Biol. Med. 2023, 148, 105810. [Google Scholar] [CrossRef] [PubMed]
  27. Parker, J.R. Algorithms for Image Processing and Computer Vision; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
  28. Sauvola, J.; Seppanen, T.; Haapakoski, S.; Pietikainen, M. Adaptive document binarization. In Proceedings of the IEEE Proceedings of the Fourth International Conference on Document Analysis and Recognition, Ulm, Germany, 18–20 August 1997; Volume 1, pp. 147–152.
  29. Rosenfeld, A.; Kak, A.C. Digital Picture Processing, 2nd ed.; Academic Press: New York, NY, USA, 1982. [Google Scholar]
  30. Rosenfeld, A.; De La Torre, P. Histogram concavity analysis as an aid in threshold selection. IEEE Trans. Syst. Man Cybern. 1983, 2, 231–235. [Google Scholar] [CrossRef]
  31. Mason, D.; Lauder, I.; Rutovitz, D.; Spowart, G. Measurement of C-bands in human chromosomes. Comput. Biol. Med. 1975, 5, 179–201. [Google Scholar] [CrossRef]
  32. Tseng, D.C.; Li, Y.F.; Tung, C.T. Circular histogram thresholding for color image segmentation. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 2, pp. 673–676. [Google Scholar]
  33. Doyle, W. Operations useful for similarity-invariant pattern recognition. J. ACM (JACM) 1962, 9, 259–267. [Google Scholar] [CrossRef]
  34. Pratikakis, I.; Zagoris, K.; Barlas, G.; Gatos, B. ICDAR2017 competition on document image binarization (DIBCO 2017). In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; Volume 1, pp. 1395–1403. [Google Scholar]
  35. Sulaiman, A.; Omar, K.; Nasrudin, M.F. Degraded historical document binarization: A review on issues, challenges, techniques, and future directions. J. Imaging 2019, 5, 48. [Google Scholar] [CrossRef] [Green Version]
  36. Wolf, C.; Jolion, J.M. Extraction and recognition of artificial text in multimedia documents. Form. Pattern Anal. Appl. 2004, 6, 309–326. [Google Scholar] [CrossRef] [Green Version]
  37. Feng, M.L.; Tan, Y.P. Contrast adaptive binarization of low quality document images. IEICE Electron. Express 2004, 1, 501–506. [Google Scholar] [CrossRef] [Green Version]
  38. Singh, O.I.; Sinam, T.; James, O.; Singh, T.R. Local contrast and mean thresholding in image binarization. Int. J. Comput. Appl. 2012, 51, 4–10. [Google Scholar]
  39. Sukesh, R.; Seuret, M.; Nicolaou, A.; Mayr, M.; Christlein, V. A Fair Evaluation of Various Deep Learning-Based Document Image Binarization Approaches. In Document Analysis Systems, Proceedings of the 15th IAPR International Workshop, La Rochelle, France, 22–25 May 2022; Springer International Publishing: Cham, Switzerland, 2022; Volume 1, pp. 771–785. [Google Scholar]
  40. Bankman, I. Handbook of Medical Image PROCESSING and Analysis; Elsevier: Amsterdam, The Netherlands, 2008; Volume 1, p. 69. [Google Scholar]
  41. Feng, Y.; Zhao, H.; Li, X.; Zhang, X.; Li, H. A multi-scale 3D Otsu thresholding algorithm for medical image segmentation. Digit. Signal Process. 2017, 60, 186–199. [Google Scholar] [CrossRef] [Green Version]
  42. Fazilov, S.K.; Yusupov, O.R.; Abdiyeva, K.S. Mammography image segmentation in breast cancer identification using the otsu method. Web Sci. Int. Sci. Res. J. 2022, 3, 196–205. [Google Scholar]
  43. Ramadas, M.; Abraham, A. Detecting tumours by segmenting MRI images using transformed differential evolution algorithm with Kapur’s thresholding. Neural Comput. Appl. 2020, 32, 6139–6149. [Google Scholar] [CrossRef]
  44. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention, Proceedings of the MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Springer International Publishing: Berlin/Heidelberg, Germany, 2015; Volume 3, pp. 234–241. [Google Scholar]
  45. Punn, N.S.; Agarwal, S. Modality specific U-Net variants for biomedical image segmentation: A survey. Artif. Intell. Rev. 2022, 55, 5845–5889. [Google Scholar] [CrossRef]
  46. Venugopal, V.; Joseph, J.; Das, M.V.; Nath, M.K. DTP-Net: A convolutional neural network model to predict threshold for localizing the lesions on dermatological macro-images. Comput. Biol. Med. 2022, 148, 105852. [Google Scholar] [CrossRef]
  47. Han, Q.; Wang, H.; Hou, M.; Weng, T.; Pei, Y.; Li, Z.; Chen, G.; Tian, Y.; Qiu, Z. HWA-SegNet: Multi-channel skin lesion image segmentation network with hierarchical analysis and weight adjustment. Comput. Biol. Med. 2023, 152, 106343. [Google Scholar] [CrossRef] [PubMed]
  48. Chen, S.; Zhong, L.; Qiu, C.; Zhang, Z.; Zhang, X. Transformer-based multilevel region and edge aggregation network for magnetic resonance image segmentation. Comput. Biol. Med. 2023, 152, 106427. [Google Scholar] [CrossRef]
  49. Uslu, F.; Bharath, A.A. TMS-Net: A segmentation network coupled with a run-time quality control method for robust cardiac image segmentation. Comput. Biol. Med. 2023, 152, 106422. [Google Scholar] [CrossRef] [PubMed]
  50. Borjigin, S.; Sahoo, P.K. Color image segmentation based on multi-level Tsallis–Havrda–Charvát entropy and 2D histogram using PSO algorithms. Pattern Recognit. 2019, 92, 107–118. [Google Scholar] [CrossRef]
  51. Fan, P.; Lang, G.; Yan, B.; Lei, X.; Guo, P.; Liu, Z.; Yang, F. A method of segmenting apples based on gray-centered RGB color space. Remote Sens. 2021, 13, 1211. [Google Scholar] [CrossRef]
  52. Naik, M.K.; Panda, R.; Abraham, A. An entropy minimization based multilevel colour thresholding technique for analysis of breast thermograms using equilibrium slime mould algorithm. Appl. Soft Comput. 2021, 113, 107955. [Google Scholar] [CrossRef]
  53. Ito, Y.; Premachandra, C.; Sumathipala, S.; Premachandra, H.W.H.; Sudantha, B.S. Tactile paving detection by dynamic thresholding based on HSV space analysis for developing a walking support system. IEEE Access 2021, 9, 20358–20367. [Google Scholar] [CrossRef]
  54. Rahimi, W.N.S.; Ali, M.S.A.M. Ananas comosus crown image thresholding and crop counting using a colour space transformation scheme. Telkomnika 2020, 18, 2472–2479. [Google Scholar] [CrossRef]
  55. Minaee, S.; Boykov, Y.Y.; Porikli, F.; Plaza, A.J.; Kehtarnavaz, N.; Terzopoulos, D. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542. [Google Scholar] [CrossRef] [PubMed]
  56. Nguyen, N.D.; Do, T.; Ngo, T.D.; Le, D.D. An evaluation of deep learning methods for small object detection. J. Electr. Comput. Eng. 2020, 2020. [Google Scholar] [CrossRef]
  57. OpenCV. Available online: https://docs.opencv.org/4.x/df/d9d/tutorial_py_colorspaces.html (accessed on 20 June 2022).
  58. Zou, K.H.; Warfield, S.K.; Bharatha, A.; Tempany, C.M.; Kaus, M.R.; Haker, S.J.; Wells, W.M., III; Jolesz, F.A.; Kikinis, R. Statistical validation of image segmentation quality based on a spatial overlap index1: Scientific reports. Acad. Radiol. 2004, 11, 178–189. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  59. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 1–13. [Google Scholar] [CrossRef] [Green Version]
  60. Dhanachandra, N.; Manglem, K.; Chanu, Y.J. Image segmentation using K-means clustering algorithm and subtractive clustering algorithm. Procedia Comput. Sci. 2015, 54, 764–771. [Google Scholar] [CrossRef] [Green Version]
  61. University of Waterloo. Vision and Image Processing Lab. Skin Cancer Detection. 2022. Available online: https://uwaterloo.ca/vision-image-processing-lab/research-demos/skin-cancer-detection (accessed on 27 November 2022).
  62. Shutterstock. 2022. Available online: https://www.shutterstock.com (accessed on 20 October 2022).
  63. Giotis, I.; Molders, N.; Land, S.; Biehl, M.; Jonkman, M.F.; Petkov, N. MED-NODE: A computer-assisted melanoma diagnosis system using non-dermoscopic images. Expert Syst. Appl. 2015, 42, 6578–6585. [Google Scholar] [CrossRef]
  64. Yang, J.; Wu, X.; Liang, J.; Sun, X.; Cheng, M.M.; Rosin, P.L.; Wang, L. Self-paced balance learning for clinical skin disease recognition. IEEE Trans. Neural Netw. Learn. Syst. 2019, 31, 2832–2846. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  65. Lin, J.; Guo, Z.; Li, D.; Hu, X.; Zhang, Y. Automatic classification of clinical skin disease images with additional high-level position information. In Proceedings of the 2019 IEEE Chinese Control Conference (CCC), Guangzhou, China, 27–30 July 2019; pp. 8606–8610. [Google Scholar]
  66. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; Volume 1, pp. 770–778. [Google Scholar]
  67. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef] [Green Version]
  68. Shapiro, S.S.; Wilk, M.B. An analysis of variance test for normality (complete samples). Biometrika 1965, 52, 591–611. [Google Scholar] [CrossRef]
  69. Girden, E.R. ANOVA: Repeated Measures; Sage University Paper Series on Quantitative Applications in the Social Sciences; Sage: Thousand Oaks, CA, USA, 1992; Volume 7. [Google Scholar]
  70. Hochberg, Y.; Tamhane, A.C. Multiple Comparison Procedures; John Wiley & Sons, Inc.: New York, NY, USA, 1987. [Google Scholar]
  71. Zhao, W.; Gurudu, S.R.; Taheri, S.; Ghosh, S.; Mallaiyan Sathiaseelan, M.A.; Asadizanjani, N. PCB Component Detection Using Computer Vision for Hardware Assurance. Big Data Cogn. Comput. 2022, 6, 39. [Google Scholar] [CrossRef]
  72. Ghosh, S.; Basak, A.; Bhunia, S. How secure are printed circuit boards against trojan attacks? IEEE Des. Test 2014, 32, 7–16. [Google Scholar] [CrossRef]
  73. Wu, Z.; Shen, S.; Lian, X.; Su, X.; Chen, E. A dummy-based user privacy protection approach for text information retrieval. Knowl.-Based Syst. 2020, 195, 105679. [Google Scholar] [CrossRef]
  74. Wu, Z.; Shen, S.; Li, H.; Zhou, H.; Lu, C. A basic framework for privacy protection in personalized information retrieval: An effective framework for user privacy protection. J. Organ. End User Comput. (JOEUC) 2021, 33, 1–26. [Google Scholar] [CrossRef]
  75. Mustafa, W.A.; Khairunizam, W.; Zunaidi, I.; Razlan, Z.M.; Shahriman, A.B. A comprehensive review on document image (DIBCO) database. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2019; Volume 557, p. 012006. [Google Scholar]
  76. Zhang, S.; Zhang, C. Modified U-Net for plant diseased leaf image segmentation. Comput. Electron. Agric. 2023, 204, 107511. [Google Scholar] [CrossRef]
  77. Zhang, Z.; Liu, Q.; Wang, Y. Road extraction by deep residual U-Net. IEEE Geosci. Remote Sens. Lett. 2018, 15, 749–753. [Google Scholar] [CrossRef] [Green Version]
Figure 2. (a) Input color image. (b) Hue and saturation representation for the HSV format in OpenCV. The hue ranged from 0 to 180, the saturation ranged from 0 to 255, and the value was fixed at 255 for this representation. (c) The probability mass function of the hue component (symmetric unimodal histogram) of the input image. The nominated hue ranges were 18 to 35, 72 to 93, and 116 to 130, where the area under the curve and gradient were greater than a cut-off value within a window. The shaded region (72 to 93) shows the optimal global hue range selected by the proposed algorithm according to the maximum continuous hue range heuristic, including the peak hue at position 83. (d) The probability mass function of the saturation component of the input image (skewed unimodal histogram). The nominated saturation ranges were 30 to 55, 75 to 90, and 137 to 255, where the area under the curve and gradient were greater than a cut-off value within a window. The shaded region (137 to 255) shows the optimal global saturation range selected by the maximum continuous saturation range heuristic. The small black box indicates the window, and the red arrow represents the gradient value in the corresponding window.
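For reference, hue and saturation PMFs such as those in Figure 2c,d can be computed with OpenCV as in the sketch below; the input file name is a placeholder.

    import cv2

    img = cv2.imread("pca_board.png")               # placeholder input image
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)      # OpenCV hue range: 0-180

    hue_hist = cv2.calcHist([hsv], [0], None, [180], [0, 180]).ravel()
    sat_hist = cv2.calcHist([hsv], [1], None, [256], [0, 256]).ravel()

    hue_pmf = hue_hist / hue_hist.sum()             # probability mass functions
    sat_pmf = sat_hist / sat_hist.sum()
    print(int(hue_pmf.argmax()), int(sat_pmf.argmax()))   # dominant hue/saturation bins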
Figure 3. Automatic detection of local windows to refine the background. (a) Input color image. (b) Global thresholded image. Global hue range: (75, 90). Global saturation range: (111, 255). (c) Relevant blobs and selected regions before applying local thresholding. (d) Relevant corresponding image regions. (e) Refined background after global and local thresholding. Local hue and saturation ranges of the two selected subregions: (75, 80), (102, 234) and (70, 95), (101, 252), respectively.
Figure 4. (a–d) Sample images of the PCA dataset and (e–h) corresponding ground truth. (i–l) Sample images of the skin cancer dataset and (m–p) corresponding ground truth.
Figure 5. Sample thresholding results using a skin lesion image: (a) input image, (b) ground truth, (c) Otsu [11], (d) Kapur et al. [12], (e) Niblack [23], (f) P-tile [27], (g) two-peak [27], (h) local contrast [27], (i) Sauvola et al. [28], (j) Wolf and Jolion [36], (k) Feng and Tan [37], (l) Bradley and Roth [5], (m) Singh et al. [38], (n) DTP-NET [46] pre-trained model, (o) U-Net [44] with Resnet-152 as backbone, (p) proposed method.
Figure 6. Sample thresholding results using a PCA board image: (a) input image, (b) ground truth, (c) Otsu [11], (d) Kapur et al. [12], (e) Niblack [23], (f) P-tile [27], (g) two-peak [27], (h) local contrast [27], (i) Sauvola et al. [28], (j) Wolf and Jolion [36], (k) Feng and Tan [37], (l) Bradley and Roth [5], (m) Singh et al. [38], (n) DTP-NET [46] pre-trained model, (o) U-Net [44] with Resnet-152 as backbone, (p) proposed method.
Figure 7. (a–d) Sample images of the PCA board dataset with varying background colors and (e–h) corresponding ground truth. (i–l) Results using Singh et al. [38], (m–p) DTP-Net fine-tuned model [46], (q–t) U-Net with Resnet-152 as backbone, and (u–x) proposed method.
Figure 8. (a–e) Sample images of the PCA board dataset with varying image intensity and (f–j) corresponding ground truth. (k–o) Results using Singh et al. [38], (p–t) DTP-Net fine-tuned model [46], (u–y) U-Net (Resnet-152) fine-tuned model, and (z–ad) proposed method.
Figure 9. Global and local thresholding results with varying image resolution. Left column (a,d,g,j): input PCA board image. Centre column (b,e,h,k): global thresholded image. Right column (c,f,i,l): local thresholded image. Image resolutions: (a) 41.9 MP, (d) 10.1 MP, (g) 1.9 MP, (j) 0.6 MP.
Figure 10. Different sample images and the corresponding globally and locally thresholded outputs: (a,b) screws on a wooden table; (c,d) needles on a red background; (e,f) a drone image of container ships; (g,h) a photo of an old text document; (i–l) text with noise in the background from the DIBCO dataset [75].
Figure 11. Thresholding results of images with similarly colored foreground or background regions. Left column: input image. Centre column: ground truth. Right column: image thresholded by the proposed method. The red-colored component D11 in (a) is misclassified as the background in (c), based on the ground truth (b). Ink stains in the input text image (d) are misclassified as foreground in (f). Images (d–f) were taken from the DIBCO database [75].
Table 1. DSI, MCC, and PSNR scores for the skin cancer dataset of 206 images used to test various thresholding techniques. The highest performance values are marked with bold font.
Method    DSI    MCC    PSNR
Otsu [11] 0.4242 ± 0.3415 0.4188 ± 0.3447 8.3002 ± 5.7482
Kapur et al. [12] 0.6194 ± 0.2964 0.5950 ± 0.3388 13.8523 ± 6.5068
Niblack [23] 0.1480 ± 0.1098 0.0181 ± 0.0211 3.3668 ± 0.2171
p-tile [27] 0.2614 ± 0.1964 0.2498 ± 0.1753 3.9546 ± 1.0244
Two-peak [27] 0.4505 ± 0.3180 0.4402 ± 0.3437 10.9754 ± 5.8259
Local contrast [27] 0.1652 ± 0.1323 0.0385 ± 0.0536 1.9889 ± 0.5827
Sauvola et al. [28] 0.1812 ± 0.1519 0.2523 ± 0.1659 12.6377 ± 4.6075
Wolf and Jolion [36] 0.2002 ± 0.1396 0.2745 ± 0.1494 12.6661 ± 4.5816
Feng and Tan [37] 0.3254 ± 0.1694 0.3132 ± 0.1700 10.6398 ± 3.2629
Bradley and Roth [5] 0.3000 ± 0.1611 0.3245 ± 0.1639 12.2064 ± 4.0079
Singh et al. [38] 0.3746 ± 0.2612 0.3610 ± 0.2493 6.9653 ± 3.9958
DTP-NET pre-trained model [46] 0.6794 ± 0.2599 0.6639 ± 0.2726 15.7098 ± 5.8357
DTP-NET training from scratch [46] 0.5808 ± 0.2825 0.5695 ± 0.2911 13.3061 ± 5.7834
DTP-NET fine-tuned model [46] 0.6723 ± 0.2616 0.6482 ± 0.2888 15.2232 ± 5.8416
U-Net (Resnet-152) pre-trained  [44] 0.0006 ± 0.0035 0.0336 ± 0.0228 11.2333 ± 3.7380
U-Net (Resnet-152) fine-tuned [44] 0.8384 ± 0.1734 0.8384 ± 0.1651 18.7921 ± 5.0272
Proposed method 0.7362 ± 0.2262 0.7259 ± 0.2334 16.2185 ± 6.3079
Table 2. DSI, MCC, and PSNR scores for the PCA board dataset of 50 images used to test various thresholding techniques. The highest performance values are marked with bold font.
Method    DSI    MCC    PSNR
Otsu [11] 0.6621 ± 0.1493 0.6001 ± 0.2161 9.2901 ± 3.678
Kapur et al. [12] 0.6608 ± 0.157 0.5928 ± 0.2039 9.2663 ± 3.5146
Niblack [23] 0.3979 ± 0.1229 0.1068 ± 0.0556 3.2040 ± 0.3981
p-tile [27] 0.4515 ± 0.1462 0.2217 ± 0.1434 3.8677 ± 0.8526
Two-peak [27] 0.4916 ± 0.2491 0.4475 ± 0.2991 7.7011 ± 3.9106
Local contrast [27] 0.4074 ± 0.1081 0.1723 ± 0.0749 4.3730 ± 0.5206
Sauvola et al. [28] 0.3939 ± 0.1393 0.0601 ± 0.0681 1.5010 ± 0.7247
Wolf and Jolion [36] 0.3927 ± 0.14 0.0637 ± 0.0647 1.4920 ± 0.7086
Feng and Tan [37] 0.3787 ± 0.1379 0.0733 ± 0.1184 1.7017 ± 0.8356
Bradley and Roth [5] 0.3881 ± 0.1391 0.0649 ± 0.0995 1.6008 ± 0.7862
Singh et al. [38] 0.3439 ± 0.1501 0.0128 ± 0.2084 2.7533 ± 1.3579
DTP-NET pre-trained [46] 0.6040 ± 0.1814 0.4357 ± 0.3327 7.5109 ± 4.5824
DTP-NET training from scratch [46] 0.6197 ± 0.14 0.4597 ± 0.3179 7.7875 ± 4.3566
DTP-NET fine-tuned [46] 0.6431 ± 0.1646 0.4996 ± 0.3178 8.2162 ± 4.2064
U-Net (Resnet-152) pre-trained  [44] 0.3207 ± 0.0966 0.0523 ± 0.0712 4.0549 ± 0.8271
U-Net (Resnet-152) training from scratch [44] 0.6922 ± 0.1930 0.5858 ± 0.3065 9.5552 ± 4.5157
U-Net (Resnet-152) fine-tuned [44] 0.6249 ± 0.2790 0.5154 ± 0.3569 9.1995 ± 4.7670
Proposed method 0.9846 ± 0.0149 0.9791 ± 0.0203 23.1545 ± 4.406
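The DSI, MCC, and PSNR values in Tables 1 and 2 follow standard definitions; the sketch below computes them for a pair of binary masks and is our own illustration rather than the evaluation code used in the paper.

    import numpy as np

    def segmentation_metrics(pred, gt):
        """DSI (Dice), MCC, and PSNR for a pair of binary masks."""
        pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
        tp = np.sum(pred & gt)
        tn = np.sum(~pred & ~gt)
        fp = np.sum(pred & ~gt)
        fn = np.sum(~pred & gt)

        dsi = 2 * tp / (2 * tp + fp + fn)
        denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
        mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0

        mse = np.mean((pred.astype(float) - gt.astype(float)) ** 2)
        psnr = 10 * np.log10(1.0 / mse) if mse > 0 else np.inf   # peak value = 1
        return dsi, mcc, psnr

    # Toy example: a 4x4 ground-truth square with one false-positive pixel
    gt = np.zeros((4, 4), dtype=bool); gt[1:3, 1:3] = True
    pred = gt.copy(); pred[0, 0] = True
    print(segmentation_metrics(pred, gt))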