DSCU-Net: MEMS Defect Detection Using Dense Skip-Connection U-Net

Wu, Shang; Zhu, Yaxin; Liang, Pengchen

doi:10.3390/sym16030300


by Shang Wu, Yaxin Zhu and Pengchen Liang *

School of Microelectronics, Shanghai University, Shanghai 200444, China

* Author to whom correspondence should be addressed.

Symmetry 2024, 16(3), 300; https://doi.org/10.3390/sym16030300

Submission received: 20 December 2023 / Revised: 21 January 2024 / Accepted: 22 February 2024 / Published: 4 March 2024


Abstract: With the rapid development of intelligent manufacturing and electronic information technology, integrated circuits play a vital role in high-end chips. The semiconductor chip manufacturing process requires precise operation and strict control to ensure chip quality. Traditional manual visual inspection has a high workforce cost, is highly subjective, and suffers from high rates of false detections and missed detections. Computer vision-based wafer defect detection is therefore gaining popularity in the industry. However, previous methods still struggle to meet production requirements for accuracy. To solve this problem, we propose a defect detection network based on an encoder-decoder structure, Dense Skip-Connection U-Net (DSCU-Net), which optimizes the skip connections between the encoder and decoder and strengthens the fusion of high-level and low-level semantics to improve accuracy. To verify its effectiveness, we evaluate DSCU-Net on real microelectromechanical systems (MEMS) data, and the results show that it outperforms the compared methods on most metrics. The proposed DSCU-Net thus offers an effective approach to defect detection in semiconductor chip manufacturing: it reduces workforce cost and subjective interference and improves inspection efficiency and accuracy, which will help promote further development of intelligent manufacturing and electronic information technology.

Keywords:

semiconductor; defect detection; code and decoder structure

1. Introduction

With the rapid development of industrial intelligent manufacturing and electronic information technology, integrated circuits have become increasingly important in today’s high-end chip field [1]. During the manufacturing process of semiconductor chips, the processes are closely linked, forming a complex technology chain. From the initial input of raw materials to the final chip packaging, each link requires precise operation and strict control. Even the slightest change in materials, environment, or process parameters may lead to defects in the chip, affecting chip performance and, consequently, product yield. Therefore, chip quality inspection is an indispensable critical link in the production line.

Through quality inspection, we can find and report product quality problems in time and thus better control the product status at each production stage. With the continuous progress of science and technology, this role is becoming more and more prominent. However, the traditional quality inspection method, i.e., manual visual inspection, has many disadvantages, such as low efficiency, low precision, high cost, high labor intensity, and inconsistent standards [2]. These limit its application in modern production lines.

To solve these problems, automatic inspection technology has gradually replaced manual visual inspection. Early automatic inspection relied mainly on machine vision technology, which is efficient, accurate, reliable, non-contact, and objective, and has therefore been widely studied and applied in many fields. The classical approach combines feature selection algorithms based on manually designed features with pattern recognition classification algorithms.

In recent years, with the development of industrial big data and artificial intelligence technology, it has become feasible to detect defects in real time by analyzing images of semiconductor chips (wafers), thus saving costs and improving yields [3,4,5]. This brings higher efficiency and accuracy to semiconductor foundries and provides strong support for the rapid development of the industry. The successful application of deep learning models in computer vision, represented by convolutional neural networks (CNNs) [6], provides a new solution for defect detection. Trained on large amounts of data, these models can automatically learn and identify various defects and anomalies. This improves the accuracy and efficiency of detection and lets the models adapt to complex environments and different kinds of chips. With the further development of technology, deep learning will likely play a more significant role in chip quality inspection. However, the strict confidentiality of semiconductor data, among other reasons, has led to comparatively little research on wafer defect detection.

In this paper, we propose a novel network structure for wafer defect detection, DSCU-Net, which is based on the encoder-decoder structure and includes three core components: encoder, decoder, and skip connection. DSCU-Net builds on the typical U-Net framework, where the encoder and decoder are arranged symmetrically on both sides of the network. This symmetric structure keeps the dimensionality of inputs and outputs consistent and is widely used in scene segmentation, defect recognition, and localization. DSCU-Net improves the skip connection of the U-Net by adding richer connections on top of the traditional skip connection, which enhances the multi-information fusion between the encoder and decoder, makes the model easier to optimize, and improves defect detection performance. The contributions of this paper are summarized as follows:

A novel network model, DSCU-Net, is proposed. It uses dense skip connections that enhance the multi-information fusion between the encoder and the decoder, strengthening both feature extraction and defect reconstruction.
Tests on actual semiconductor production data show that DSCU-Net effectively improves wafer defect detection and recognition.

2. Related Work

Wafer defect inspection plays a crucial role in the semiconductor manufacturing process. As semiconductor manufacturing becomes more advanced and refined, traditional manual visual inspection can no longer meet the efficiency, accuracy, and sensitivity requirements of modern production lines. Its main problems include low efficiency, low accuracy, high cost, high labor intensity, and inconsistent standards, which significantly limit its application.

Although traditional wafer inspection schemes have good detection capability, they have poor generalization ability. For example, Lee et al. [7] proposed an intelligent online measurement sampling method for wafer fabrication process parameter monitoring, introducing a new application prospect for data mining technology. The method specifies the chip locations within a wafer that need to be measured and the number of estimated chip locations in each wafer to ensure 100% wafer coverage and good defect detection sensitivity. However, this method does not address the problem of poor generalization.

As design rules shrink, the requirements for high-sensitivity defect detection in the lithography process become more stringent. Meshulach et al. [8] analyzed the light scattering and detection of wafer defects for various 3 nm design rule-resistant structures with different polarizations and optical configurations at visible, UV, and DUV wavelengths. However, this work does not offer an effective way to improve generalization capability.

Researchers have also started to consider learning-based solutions. Bourgeat et al. [9] introduced a segmentation algorithm designed explicitly for semiconductor wafer images generated by optical inspection tools. Applying the segmentation method to optical microscope wafer images can significantly enhance the efficiency of the defect detection process. However, this method only detects known defects and performs poorly on unknown defect types.

To solve the above problems, some novel neural network structures have been proposed. For example, the Competitive Hopfield Wafer Defect Detection Neural Network (CHWDNN) (Chang et al. [10]) was used to detect defective regions in wafer images. CHWDNN extends the original single-layer two-dimensional Hopfield neural network into a two-layer three-dimensional Hopfield neural network and implements defect detection on its three dimensions. However, this method may not accurately detect some tiny defects.

A spatial correlation map concept was also introduced to wafer defect detection (Jeong et al. [11]). The authors innovatively proposed the idea of spatial correlation maps for detecting the presence of spatial autocorrelation and classifying defect patterns on wafer maps. Experimental results show that this method has excellent robustness to random noise, regardless of the location and size of the defects. However, the effectiveness of this method for classifying complex defect patterns needs to be improved.

Researchers have begun to explore deep learning-based wafer defect detection methods to address the shortcomings of traditional wafer inspection schemes. For example, Yu and Lu [12] developed a manifold learning-based system for wafer map defect detection and identification. In this system, a joint local and non-local linear discriminant analysis (JLNDA) was proposed to discover the intrinsic manifold information that provides the discriminative features of defect patterns. However, the effectiveness of this method for detecting defects of different types and sizes could be improved. Li and Tsai [13] proposed an automated defect detection method specifically for the photovoltaic industry, focusing on polysilicon solar wafers. The technique helps to detect saw mark defects as early as possible during the wafer dicing process, thus reducing material waste and improving production yield. However, this method targets one specific defect type in the photovoltaic industry and is not suitable for other types of defect detection.

To further improve the generalization ability and accuracy of defect detection, some studies have explored voting ensemble classifiers with multi-type features (Saqlain et al. [14]) for identifying wafer map defect patterns in semiconductor manufacturing. However, the effectiveness of this approach for classifying complex defect patterns could be improved.

On the other hand, Jin et al. [15] proposed a novel clustering-based framework for detecting and classifying wafer bin map (WBM) defect patterns. The framework’s anomaly detection and defect clustering pattern extraction can be performed simultaneously and can detect cluster patterns of arbitrary shapes without specifying the number of clusters in advance. This approach combines the advantages of cluster analysis and deep learning to effectively improve defect detection’s generalization capability and accuracy.

3. Method

This section delves into the novel U-shaped structure-based defect detection model, DSCU-Net, proposed in this paper. Firstly, we clarify the definition of the semiconductor wafer defect detection task and then describe the DSCU-Net model’s structure in detail.

3.1. Problem Definition

Given wafer structure data taken by a microscope, the task of semiconductor defect detection is to determine the location and contour of defects and to obtain a defect mask. Let X denote the wafer structure data captured by the microscope. From X, processed through filters f, we learn an inference function F that can predict the defect mask M for future wafer structure data. The function F is expressed as

$$X \overset{F}{\longrightarrow} M, \tag{1}$$

where M denotes a single-channel image with the same size as X that contains only two values, 0 and 1; 0 means the pixel is normal, and 1 means the pixel lies in a defective area.
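As a minimal sketch of the output format in Equation (1), the snippet below turns a per-pixel defect score map into a binary mask of the same size as the input. The function name `to_defect_mask`, the toy score map, and the 0.5 threshold are illustrative assumptions and are not taken from the paper.

```python
def to_defect_mask(scores, threshold=0.5):
    """Turn per-pixel defect scores into a 0/1 mask of the same size.

    `scores` is a 2D list of floats in [0, 1]. The threshold is an
    assumption made for illustration only.
    """
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


# Toy 3x4 score map standing in for a model output on an image X.
scores = [
    [0.02, 0.10, 0.91, 0.88],
    [0.05, 0.60, 0.97, 0.40],
    [0.01, 0.03, 0.20, 0.07],
]
mask = to_defect_mask(scores)  # 1 marks a defective pixel, 0 a normal one
```

The mask keeps the height and width of the score map and contains only the values 0 and 1, matching the definition of M.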

3.2. Dense Skip-Connection U-Net

This section introduces the proposed DSCU-Net network for semiconductor wafer defect detection, as shown in Figure 1. DSCU-Net is similar to the classical U-Net model and is likewise based on the encoder-decoder structure [16], comprising an encoder, a decoder, and skip connections. The difference, and the core of DSCU-Net, is that it optimizes the skip connection of the traditional U-Net. DSCU-Net establishes many skip connections between the encoder and the decoder, which makes it easier for the convolutional operations to learn features and compensates the decoder for the information lost through downsampling. This design lets DSCU-Net perform well in the semiconductor wafer defect detection task and gives it high practical value.

The DSCU-Net detection model takes X as input. In the encoder, the downsampling layers use increasing numbers of filters f, namely 8, 16, 32, and 64, to gradually extract more advanced feature information. In the decoder, the information in the channels is compressed and condensed for more accurate feature reconstruction: the upsampling layers use 64, 32, 16, and 8 filters, symmetric with the encoder, to preserve the integrity of information transfer. Finally, in the output layer of the model, the number of filters is set to 1 to produce the detection result.
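The filter schedule can be traced with a short sketch. It assumes a single-channel 512 × 512 input and one 2 × 2 pooling step after every encoder level; the paper states the filter counts and the input size, but the input channel count and the exact placement of pooling are assumptions, as is the helper name `trace_encoder_shapes`.

```python
def trace_encoder_shapes(height, width, in_channels, filters):
    """Return (channels, height, width) for the input and after each level.

    Assumes every level applies its convolution module and then halves
    the spatial size by pooling. This layout is an assumption.
    """
    shapes = [(in_channels, height, width)]
    for f in filters:
        height, width = height // 2, width // 2
        shapes.append((f, height, width))
    return shapes


encoder_filters = [8, 16, 32, 64]        # filter counts given in the text
decoder_filters = encoder_filters[::-1]  # symmetric decoder: 64, 32, 16, 8
encoder_shapes = trace_encoder_shapes(512, 512, 1, encoder_filters)

for channels, h, w in encoder_shapes:
    print(f"{channels:>3} channels, {h} x {w}")
```

Under these assumptions the channel count grows from 8 to 64 while the spatial size shrinks from 512 × 512 to 32 × 32, and the decoder reverses the schedule.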

3.2.1. Encoder

The encoder takes data X as input. Its core task is to learn a latent vector V that expresses the key features of X. This process can be modeled as $E(X)$, expressed as

$$X \overset{E}{\longrightarrow} V, \tag{2}$$

where $E(X)$ extracts features through a convolution module. After the convolution module extracts the features, a pooling operation downsamples the feature map, so that the feature map after each downsampling step is half the size of its input.

The convolution module is the heart of the encoder. It contains the convolution operation, normalization operation, and nonlinear transformation, as shown in Figure 2. In the encoder, all convolution operations are uniformly performed using a $3 \times 3$ convolution kernel. Compared with the $5 \times 5$ and $7 \times 7$ convolution kernels, the $3 \times 3$ convolution kernel performs better in reducing the number of model parameters, improving the training efficiency of the model and its generalization ability to avoid the overfitting phenomenon [17]. This is because the $3 \times 3$ convolution kernel can better capture the local features of the input data, thus improving the model’s accuracy. Batch normalization (BN) [18] and a rectified linear unit (ReLU) [19] are used for the normalization operation and nonlinear transformation, respectively. Specifically, BN first normalizes the results of the convolution operation, and the normalized results are then input into the ReLU. This operation utilizes the unilateral suppression of the ReLU for better feature selection.

Using a $3 \times 3$ convolutional kernel, BN, and the ReLU nonlinear transformation, the encoder can learn the key features of the input X more efficiently, thus improving the accuracy and generalization of the model.
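Two points from this subsection can be illustrated with a small sketch: how the parameter count of a convolution layer grows with kernel size, and the BN-then-ReLU ordering. The channel counts (8 in, 16 out) and the sample activations are arbitrary assumptions, and the `batch_norm` helper is simplified: it omits the learnable scale and shift that a full batch normalization layer has.

```python
import math


def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of a standard 2D convolution layer."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


def batch_norm(values, eps=1e-5):
    """Normalize activations to zero mean and unit variance (no scale/shift)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def relu(values):
    """One-sided suppression: negative activations become zero."""
    return [max(0.0, v) for v in values]


# Parameter count for 3x3, 5x5, and 7x7 kernels with 8 input and 16 output channels.
params = {k: conv_params(k, 8, 16) for k in (3, 5, 7)}

# BN first, then ReLU, applied to a toy list of convolution outputs.
activated = relu(batch_norm([2.0, -1.0, 0.5, 4.5]))
```

With these channel counts, the $7 \times 7$ layer has more than five times as many parameters as the $3 \times 3$ layer. After normalization, the activations below the batch mean are suppressed to zero by the ReLU.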

3.2.2. Decoder

The decoder receives the latent vector V as input and reconstructs the defect information through upsampling and feature extraction. This process can be modeled as $D(V)$, expressed as

$$V \overset{D}{\longrightarrow} M, \tag{3}$$

where $D(V)$ progressively recovers the defect feature map through upsampling operations, and the convolution module re-refines the defect features. Each upsampling operation increases the feature map resolution by a factor of 2, mirroring the downsampling in the encoder. The convolution module has the same structure as that in the encoder and can efficiently extract defect features from the latent vector.

In addition, the output layer of the decoder uses a $1 \times 1$ convolution kernel and sets the number of output channels to 1. This setting converts the defect feature map into the defect detection result. In practice, the decoder output can be combined with the original image to locate defects and measure their size accurately.
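The two decoder operations described here can be sketched on plain nested lists. Nearest-neighbor interpolation is an assumption, since the paper does not name the upsampling method, and the helper names `upsample2x` and `conv1x1` and the weights are hypothetical.

```python
def upsample2x(feature_map):
    """Nearest-neighbor upsampling: doubles the height and width."""
    out = []
    for row in feature_map:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def conv1x1(channels, weights, bias=0.0):
    """Collapse C channel maps into one map with a per-pixel weighted sum."""
    height, width = len(channels[0]), len(channels[0][0])
    return [
        [sum(w * ch[i][j] for w, ch in zip(weights, channels)) + bias
         for j in range(width)]
        for i in range(height)
    ]


low_res = [[1.0, 2.0], [3.0, 4.0]]
up = upsample2x(low_res)                        # 2x2 -> 4x4
single = conv1x1([up, up], weights=[0.5, 0.5])  # 2 channels -> 1 channel
```

A $1 \times 1$ convolution mixes channels at each pixel without looking at neighboring pixels, which is why it suits the final reduction to a single output channel.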

3.2.3. Dense Skip Connection

DSCU-Net addresses the insufficient information fusion of traditional U-Net skip connections by establishing dense skip connections between convolutional modules and between the encoder and decoder, enabling the network to fully understand and exploit the contextual information of the image. As shown in Algorithm 1, in the encoder part, eblock1 takes the feature map in_f output from the previous layer as input, while eblock2 receives both the input and the output of eblock1. This design strategy significantly enhances the feature extraction capability of the encoder convolution module. In the decoder part, in addition to skip connections consistent with those of the encoder, each block integrates all the information in the encoder as its input. Specifically, dblock1 receives the three features in_f, eb1_out, and rb2_out, while dblock2, in addition to inheriting all the inputs of dblock1, also takes the output of dblock1 as input. Such a structure allows the decoder to understand and reconstruct the feature information captured by the encoder more comprehensively, so the network performs targeted feature reconstruction at each step of the feature recovery process and better recovers image details and textures. The design also reflects the idea of deep supervision. Different convolutional modules focus on different feature information, and these connections help to fuse feature information from different levels so that the network can understand the input image more comprehensively.

Algorithm 1 Dense skip connection.

Require: feature matrix in_f; encoder layer convolution modules eblock1, eblock2; corresponding layer decoder convolution modules dblock1, dblock2
Ensure: result out_f
// Encoder feature extraction
eb1_out = eblock1(in_f)
rb2_out = eblock2(concat(in_f, eb1_out))
// Decoder defect reconstruction
db1_out = dblock1(concat(in_f, eb1_out, rb2_out))
db2_out = dblock2(concat(in_f, eb1_out, rb2_out, db1_out))
// Output
out_f = upsample(concat(in_f, eb1_out, rb2_out, db1_out, db2_out))
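The data flow of Algorithm 1 can be sketched in a framework-agnostic way. The function below takes the four convolution modules, the concatenation, and the upsampling as callables, so it is only a sketch of the wiring and not the authors' implementation. The toy blocks, which represent a feature as a list of channel names and produce 4 output channels each, are assumptions made purely to show how the input width of each block grows.

```python
def dense_skip_layer(in_f, eblock1, eblock2, dblock1, dblock2, concat, upsample):
    """One level of dense skip connections, following Algorithm 1."""
    # Encoder feature extraction
    eb1_out = eblock1(in_f)
    rb2_out = eblock2(concat([in_f, eb1_out]))
    # Decoder defect reconstruction: each block sees all earlier features
    db1_out = dblock1(concat([in_f, eb1_out, rb2_out]))
    db2_out = dblock2(concat([in_f, eb1_out, rb2_out, db1_out]))
    # Output
    return upsample(concat([in_f, eb1_out, rb2_out, db1_out, db2_out]))


def make_block(name, out_channels, log):
    """Toy stand-in for a convolution module; records its input width."""
    def block(x):
        log[name] = len(x)
        return [f"{name}_{i}" for i in range(out_channels)]
    return block


def concat(features):
    """Channel-wise concatenation of toy features (lists of channel names)."""
    return [ch for f in features for ch in f]


seen = {}
out_f = dense_skip_layer(
    in_f=["in_0", "in_1"],
    eblock1=make_block("eb1", 4, seen),
    eblock2=make_block("eb2", 4, seen),
    dblock1=make_block("db1", 4, seen),
    dblock2=make_block("db2", 4, seen),
    concat=concat,
    upsample=lambda x: x,  # identity placeholder for the upsampling step
)
print(seen)  # input channel count seen by each block
```

In this toy setting each later block receives a wider input than the previous one, because every earlier feature is concatenated into it. In a tensor framework, `concat` would be a channel-wise concatenation and the blocks would be convolution modules whose input channel counts are sized accordingly.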

In the same layer of the decoder, input and output features from the encoder are also used as inputs for the skip connection, which facilitates full communication and fusion of bottom- and top-level features. This design allows the detailed information to be passed to the decoder, improving the segmentation results’ accuracy and retention. As a result, the DSCU-Net model can effectively solve the problem where traditional fully convolutional networks are prone to lose details and contextual information in segmentation tasks.

In addition, the dense skip-connection structure helps to alleviate the gradient propagation problem. During backpropagation, the gradient may vanish or explode, making the model difficult to optimize. With skip connections, information can pass directly between the encoder and decoder, which makes it easier for gradients to reach earlier layers and thus eases optimization of the network. Dense skip connections also help to improve the robustness of the network. Since features at different levels have different receptive field sizes and semantic information, fusing them through skip connections provides more comprehensive information. As a result, the network adapts better to scale changes and occlusions and can better handle complex image segmentation tasks.

4. Experiment

4.1. Experiment Setup and Datasets

In this study, all experiments used NVIDIA A5000 Tensor Core GPUs (NVIDIA Corporation, Santa Clara, CA, USA) with 24 GB of memory to train and test the deep learning models. All algorithms in this section were implemented in PyTorch [20] and trained and tested on the MEMS dataset. DSCU-Net was trained end to end with the Adam optimizer, with the learning rate set to 0.001. All images were scaled to 512 × 512 during training and testing.

We collected data from a semiconductor manufacturing factory to validate the dense skip-connected U-Net network proposed in this paper. From the data, we constructed our training and testing datasets after calibration by senior engineers to ensure the authenticity and validity of the data. The specific dataset information is shown in Table 1.

The training dataset contains 1588 samples, and the test dataset contains 682 samples. The dataset covers many defect types, mainly bubbles, particles, flaking, scratches, and residue, as shown in Figure 3.

4.2. Evaluation

We comprehensively evaluated the performance of DSCU-Net using the mean intersection over union (mIOU) [21], dice, accuracy, precision, recall, and F1-score. IOU measures how accurately the corresponding object is detected in a given dataset and is calculated as the ratio of the intersection to the union of the inference and actual results:

$$\mathrm{mIOU} = \frac{|A \cap B|}{|A \cup B|} = \frac{TP}{TP + FP + FN}, \tag{4}$$

where A represents the model segmentation results and B represents the actual segmentation labels. The symbol $|\cdot|$ denotes the number of pixels in a region, ∩ denotes the intersection of regions, and ∪ denotes their union. Dice [22] is a set similarity measure, usually used to calculate the similarity of two samples:

$$\mathrm{Dice} = \frac{2|A \cap B|}{|A| + |B|} = \frac{2TP}{2TP + FP + FN}. \tag{5}$$

The coefficient 2 in the numerator compensates for the denominator counting the elements common to the inference and actual results twice.

Defect segmentation is a pixel-based classification task, so we also evaluate accuracy, precision, recall, and the F1-score [23]. Accuracy indicates the proportion of correct predictions, precision indicates the fraction of predicted positive pixels that are truly positive, and recall indicates the fraction of truly positive pixels that are detected. Because precision and recall are judged against the predicted and actual results, respectively, the F1-score is introduced as their harmonic mean.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{6}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{7}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$\mathrm{F1\text{-}Score} = 2\,\frac{\mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, \tag{9}$$

where $TP$ denotes that both the predicted and actual labels are positive samples; $FN$ denotes that the predicted result is a counterexample and the actual label is a positive sample; $TN$ denotes that both the predicted and actual results are counterexamples; and $FP$ denotes that the predicted result is a positive sample and the actual label is a counterexample. The coefficient 2 in the F1-score comes from the harmonic mean of precision and recall.
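Equations (4) to (9) can be computed directly from the pixel-level confusion counts. The sketch below uses the standard convention (FP: predicted defective but actually normal; FN: predicted normal but actually defective) on two toy binary masks. The function names and masks are illustrative assumptions, and the sketch assumes at least one predicted and one actual defect pixel so that no denominator is zero.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN over two same-sized 0/1 masks."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1  # predicted defective, actually normal
            elif p == 0 and t == 1:
                fn += 1  # predicted normal, actually defective
            else:
                tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, target):
    """Evaluate Equations (4)-(9) for a single pair of binary masks."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "iou": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


pred = [[1, 1, 0, 0], [0, 1, 0, 0]]
target = [[1, 0, 0, 0], [0, 1, 1, 0]]
metrics = segmentation_metrics(pred, target)
```

For a single pair of binary masks, the Dice coefficient and the F1-score coincide algebraically, since both reduce to $2TP / (2TP + FP + FN)$.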

4.3. Baseline Methods

In the experimental section, we compare the proposed method with the following methods, which are considered the most classical semantic segmentation methods.

U-Net [24]. This image segmentation network is widely applied in medical image processing, remote sensing image segmentation, and other fields, where it performs well.

U-Net++ [25]. This model refines the U-Net structure and enhances segmentation performance by adding more upsampling and downsampling paths at each stage of the U-Net. These improvements have led to its widespread use, and it performs well in tasks such as medical image segmentation.

U-Net3+ [26]. This improved lightweight U-Net model offers better performance and faster training, which helps in challenging image segmentation tasks.

ResU-Net [27]. This model follows the residual network (ResNet) concept, which mitigates vanishing gradients and the degradation of network performance by introducing skip connections between convolutional layers. It shows strong performance and generalization.

ResU-Net++ [28]. This model adds an attention mechanism to ResU-Net, which gives it stronger generalization ability across application scenarios. It has become an important reference model and has received extensive research attention.

Attention U-Net [29]. This image segmentation model improves performance and accuracy through an attention mechanism.

BiSeNet-V2 [30]. This high-performance model is scalable: its performance can be further enhanced by adding layers or adopting more complex structures. It has been widely used in computer vision.

4.4. Results and Analysis

To evaluate the performance of the DSCU-Net algorithm proposed in this paper, we compare it experimentally with classical segmentation algorithms. These algorithms are widely used and influential in image segmentation, and their design ideas and implementation methods informed the design of DSCU-Net.

Table 2 compares the segmentation accuracy of DSCU-Net with that of the U-Net, U-Net++, U-Net3+, BiSeNet-V2, ResU-Net, ResU-Net++, and Attention U-Net models on the semiconductor image-based wafer defect segmentation task. In terms of IOU, DSCU-Net is slightly inferior to U-Net++, with a difference of 0.27%. However, DSCU-Net shows superior performance in the other metrics, such as dice, accuracy, recall, precision, and F1-score, which are improved by at least 0.13%, 0.03%, 0.9%, 3.22%, and 3.22%, respectively, compared to other classical methods. ResU-Net and ResU-Net++ show a significant advantage in the recall metric, but recall only measures how many of the truly defective pixels are recognized. For example, if an algorithm labels every pixel as defective, recall is 1 even though the algorithm is useless, so recall cannot be used alone as an evaluation metric and must be considered together with precision.
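The argument about recall can be checked with a small worked example. The sketch below scores a degenerate predictor that marks every pixel as defective against a toy label mask; the mask and the helper name `precision_recall` are illustrative assumptions.

```python
def precision_recall(pred, target):
    """Pixel-level precision and recall for two same-sized 0/1 masks."""
    pairs = [(p, t) for p_row, t_row in zip(pred, target)
             for p, t in zip(p_row, t_row)]
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)
    fn = sum(1 for p, t in pairs if p == 0 and t == 1)
    return tp / (tp + fp), tp / (tp + fn)


# Toy label mask: 2 defective pixels out of 12.
target = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Degenerate prediction: every pixel is called defective.
all_defect = [[1] * 4 for _ in target]

precision, recall = precision_recall(all_defect, target)
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

Recall is perfect because no defective pixel is missed, while precision drops to the fraction of defective pixels in the image, which is why the two metrics need to be read together.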

These performance improvements can be attributed to several optimized designs of DSCU-Net. Firstly, by introducing many skip-connection and encoder–decoder structures, DSCU-Net effectively mitigates the problem of information loss in traditional segmentation networks and improves the efficiency of feature extraction and information transfer. Secondly, for the problem of disappearing or exploding gradients during the training process, DSCU-Net adopts techniques such as residual connection and batch normalization, thus optimizing the model training process. DSCU-Net demonstrates superior performance in the semiconductor image-based wafer defect segmentation task compared to other classical segmentation methods. Its optimized design and innovative technical applications effectively improve the segmentation accuracy and robustness of the model, providing a valuable reference for research and practice in related fields.

4.5. Visualization

We also show defect segmentation visualization results for the classical algorithms and DSCU-Net on the MEMS dataset, as shown in Figure 3. These results cover common defect types, such as bubbles, scratches, residuals, and dust. Please refer to Figure 3 for a more intuitive view of the performance of these algorithms in defect segmentation.

In the visualization results, the overall effect of DSCU-Net is similar to that of U-Net++, and both are significantly better than the other methods. The difference is that DSCU-Net handles details better than U-Net++: for example, the bubble defect has a ring-shaped morphology, which U-Net++ did not predict accurately. The visualization results thus give qualitative evidence of the defect segmentation ability of DSCU-Net and support the skip-connection optimization and multi-scale information fusion proposed in this paper.

5. Conclusions and Discussion

Wafer quality inspection can find and report product quality problems in time and is an essential part of the wafer manufacturing process. Traditional graphical wafer defect detection methods are gradually being phased out, and deep learning-based solutions are booming. This paper proposes DSCU-Net, a deep learning algorithm based on the encoder-decoder structure. Experimental results show that DSCU-Net achieves excellent results, specifically 72.53% for IOU, 81.96% for dice, and 87.67% for F1-score. Dice improved by more than 0.13% over the comparison algorithms, and F1-score improved by 3.22%. In addition, the visualization results corroborate the experimental results: the segmentation results of DSCU-Net are more accurate than those of the other comparison methods. This is significant for quality inspection and defect traceability in the wafer fabrication process and helps improve yield. However, DSCU-Net also has limitations. Its performance and generalization are limited by the small number of device types, defect categories, and samples in the dataset. Moreover, DSCU-Net does not explicitly extract edge information, which makes its results rough at the edges of defects. In future work, we aim to enhance the algorithm’s ability to extract edge information, improve its precision and recall, and further optimize the model’s performance by introducing edge feature extraction or an edge loss.

Author Contributions

Conceptualization, S.W. and P.L.; methodology, S.W.; validation, S.W. and Y.Z.; data curation, Y.Z.; writing—original draft preparation, S.W.; writing—review and editing, S.W. and Y.Z.; supervision, P.L.; project administration, P.L.; funding acquisition, S.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Due to the confidentiality of the data we used, we are not at liberty to disclose it now.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Huang, Q. Intelligent manufacturing. In Understanding China’s Manufacturing Industry; Springer: Berlin/Heidelberg, Germany, 2022; pp. 111–127.
Han, H.; Gao, C.; Zhao, Y.; Liao, S.; Tang, L.; Li, X. Polycrystalline silicon wafer defect segmentation based on deep convolutional neural networks. Pattern Recognit. Lett. 2020, 130, 234–241.
Zhao, W.; Wei, Q.; Zeng, Z. A Deeply Supervised Semantic Segmentation Method Based on GAN. arXiv 2023, arXiv:2310.04081.
Muksimova, S.; Mardieva, S.; Cho, Y.I. Deep Encoder–Decoder Network-Based Wildfire Segmentation Using Drone Images in Real-Time. Remote Sens. 2022, 14, 6302.
Umirzakova, S.; Ahmad, S.; Mardieva, S.; Muksimova, S.; Whangbo, T.K. Deep learning-driven diagnosis: A multi-task approach for segmenting stroke and Bell’s palsy. Pattern Recognit. 2023, 144, 109866.
Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems: 26th Annual Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; Volume 25.
Lee, J.H.; Yu, S.J.; Park, S.C. Design of intelligent data sampling methodology based on data mining. IEEE Trans. Robot. Autom. 2001, 17, 637–649.
Meshulach, D.; Dolev, I.; Yamazaki, Y.; Tsuchiya, K.; Kaneko, M.; Yoshino, K.; Fujii, T. Advanced lithography: Wafer defect scattering analysis at DUV. In Proceedings of the Metrology, Inspection, and Process Control for Microlithography XXIV, San Jose, CA, USA, 22–25 February 2010; SPIE: Bellingham, WA, USA, 2010; Volume 7638, pp. 195–204.
Bourgeat, P.; Meriaudeau, F.; Gorria, P.; Tobin, K.W., Jr. Content-based segmentation of patterned wafer for automatic threshold determination. In Proceedings of the Machine Vision Applications in Industrial Inspection XI, Santa Clara, CA, USA, 22–24 January 2003; SPIE: Bellingham, WA, USA, 2003; Volume 5011, pp. 183–189.
Chang, C.Y.; Lin, S.Y.; Jeng, M. Using a two-layer competitive hopfield neural network for semiconductor wafer defect detection. In Proceedings of the IEEE International Conference on Automation Science and Engineering, Edmonton, AB, Canada, 1–2 August 2005; IEEE: Piscataway, NJ, USA, 2005; pp. 301–306.
Jeong, Y.S.; Kim, S.J.; Jeong, M.K. Automatic identification of defect patterns in semiconductor wafer maps using spatial correlogram and dynamic time warping. IEEE Trans. Semicond. Manuf. 2008, 21, 625–637.
Yu, J.; Lu, X. Wafer map defect detection and recognition using joint local and nonlocal linear discriminant analysis. IEEE Trans. Semicond. Manuf. 2015, 29, 33–43.
Li, W.C.; Tsai, D.M. Automatic saw-mark detection in multicrystalline solar wafer images. Sol. Energy Mater. Sol. Cells 2011, 95, 2206–2220.
Saqlain, M.; Jargalsaikhan, B.; Lee, J.Y. A voting ensemble classifier for wafer map defect patterns identification in semiconductor manufacturing. IEEE Trans. Semicond. Manuf. 2019, 32, 171–182.
Jin, C.H.; Na, H.J.; Piao, M.; Pok, G.; Ryu, K.H. A novel DBSCAN-based defect pattern detection and classification framework for wafer bin map. IEEE Trans. Semicond. Manuf. 2019, 32, 286–292.
Socher, R.; Huang, E.; Pennington, J.; Manning, C.D.; Ng, A. Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. In Proceedings of the Advances in Neural Information Processing Systems: 25th Annual Conference on Neural Information Processing Systems, Granada, Spain, 12–14 December 2011; Volume 24.
Ying, X. An overview of overfitting and its solutions. J. Phys. Conf. Ser. 2019, 1168, 022022.
Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 7–9 July 2015; pp. 448–456.
Hara, K.; Saito, D.; Shouno, H. Analysis of function of rectified linear unit used in deep learning. In Proceedings of the 2015 International Joint Conference on Neural Networks (IJCNN), Killarney, Ireland, 12–17 July 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–8.
Imambi, S.; Prakash, K.B.; Kanagachidambaresan, G. PyTorch. In Programming with TensorFlow: Solution for Edge Computing Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 87–104.
Rahman, M.A.; Wang, Y. Optimizing intersection-over-union in deep neural networks for image segmentation. In Proceedings of the International Symposium on Visual Computing, Las Vegas, NV, USA, 12–14 December 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 234–244.
Soomro, T.A.; Afifi, A.J.; Gao, J.; Hellwich, O.; Paul, M.; Zheng, L. Strided U-Net model: Retinal vessels segmentation using dice loss. In Proceedings of the 2018 Digital Image Computing: Techniques and Applications (DICTA), Canberra, ACT, Australia, 10–13 December 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1–8.
Yacouby, R.; Axman, D. Probabilistic extension of precision, recall, and f1 score for more thorough evaluation of classification models. In Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, Online, 20 November 2020; pp. 79–91.
Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Springer: Berlin/Heidelberg, Germany, 2015; pp. 234–241.
Zhou, Z.; Rahman Siddiquee, M.M.; Tajbakhsh, N.; Liang, J. Unet++: A nested u-net architecture for medical image segmentation. In Proceedings of the Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 3–11.
Huang, H.; Lin, L.; Tong, R.; Hu, H.; Zhang, Q.; Iwamoto, Y.; Han, X.; Chen, Y.W.; Wu, J. Unet 3+: A full-scale connected unet for medical image segmentation. In Proceedings of the ICASSP 2020—2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1055–1059.
Xiao, X.; Lian, S.; Luo, Z.; Li, S. Weighted res-unet for high-quality retina vessel segmentation. In Proceedings of the 2018 9th International Conference on Information Technology in Medicine and Education (ITME), Hangzhou, China, 19–21 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 327–331.
Jha, D.; Smedsrud, P.H.; Riegler, M.A.; Johansen, D.; De Lange, T.; Halvorsen, P.; Johansen, H.D. Resunet++: An advanced architecture for medical image segmentation. In Proceedings of the 2019 IEEE International Symposium on Multimedia (ISM), San Diego, CA, USA, 9–11 December 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 225–2255.
Oktay, O.; Schlemper, J.; Folgoc, L.L.; Lee, M.; Heinrich, M.; Misawa, K.; Mori, K.; McDonagh, S.; Hammerla, N.Y.; Kainz, B.; et al. Attention u-net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
Yu, C.; Gao, C.; Wang, J.; Yu, G.; Shen, C.; Sang, N. Bisenet v2: Bilateral network with guided aggregation for real-time semantic segmentation. Int. J. Comput. Vis. 2021, 129, 3051–3068.

Figure 1. Network structure. CB is the convolution block. DSCU-Net contains three core components: encoder, decoder, and skip connection. Unlike the traditional U-Net, DSCU-Net optimizes the design of the skip connection, which enhances the information fusion between the encoder and decoder.

Figure 2. Convolution block. The block contains three core components: convolution operation, normalization operation, and nonlinear transformation.
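The three stages named in the Figure 2 caption can be sketched in plain Python for a single-channel input. This is only an illustration under stated assumptions: the 3×3 kernel, the zero padding, the per-feature-map normalization (a stand-in for batch normalization, which the reference list includes but this caption does not specify), and the ReLU nonlinearity are all assumed, and every function name is hypothetical.

```python
import math


def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output size equals input size)."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    y, x = i + di - pad_h, j + dj - pad_w
                    if 0 <= y < h and 0 <= x < w:  # pixels outside the image count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def normalize(feature_map, eps=1e-5):
    """Shift to zero mean and scale to unit variance over the whole feature map (assumption)."""
    values = [v for row in feature_map for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in feature_map]


def relu(feature_map):
    """Nonlinear transformation: clamp negative activations to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def conv_block(image, kernel):
    """Convolution, then normalization, then nonlinearity, in the order the caption lists them."""
    return relu(normalize(conv2d_same(image, kernel)))


# Toy input (hypothetical): a vertical step edge and a horizontal-gradient kernel.
step_image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
gradient_kernel = [[-1.0, 0.0, 1.0] for _ in range(3)]
for row in conv_block(step_image, gradient_kernel):
    print(" ".join(f"{v:5.2f}" for v in row))
```

In a deep learning framework the same block would normally be built from library layers with learned multi-channel kernels; the loops above only make the data flow explicit.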

Figure 3. Visualization. Segmentation results of DSCU-Net and other classical semantic segmentation methods on common semiconductor defects, such as bubbles and residuals.

Table 1. Dataset. The dataset consists of both training and test datasets, and all data are actual tape-out data from the factory.

Type	Count
Train dataset	1588
Test dataset	682
Total	2270

Table 2. Experimental results. The table compares classical segmentation algorithms (U-Net, U-Net++, U-Net3+, BiSeNet-V2, ResU-Net, ResU-Net++, and Attention U-Net) with DSCU-Net (“Ours”) on several metrics.

Method	mIOU	Dice	Accuracy	Precision	Recall	F1-Score
U-Net	71.84%	81.01%	99.70%	91.04%	77.79%	83.90%
U-Net++	72.80%	81.87%	99.72%	93.38%	77.41%	84.65%
U-Net3+	70.30%	78.94%	99.69%	93.06%	74.22%	82.58%
BiSeNet-V2	65.70%	76.27%	99.74%	94.03%	78.96%	85.84%
ResU-Net	46.99%	60.78%	98.98%	49.56%	87.25%	63.21%
ResU-Net++	44.23%	58.71%	98.80%	44.67%	88.28%	59.32%
Attention U-Net	71.38%	80.66%	99.72%	94.18%	77.01%	84.73%
Ours	72.53%	81.96%	99.77%	94.93%	81.44%	87.67%
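As a quick consistency check on Table 2, the F1-score column can be recomputed from the precision and recall columns. The sketch below assumes the standard definition F1 = 2PR / (P + R). The function name and the dictionary layout are hypothetical, and the numbers are copied from the table.

```python
# (precision %, recall %, reported F1 %) copied from Table 2.
TABLE_2 = {
    "U-Net": (91.04, 77.79, 83.90),
    "U-Net++": (93.38, 77.41, 84.65),
    "U-Net3+": (93.06, 74.22, 82.58),
    "BiSeNet-V2": (94.03, 78.96, 85.84),
    "ResU-Net": (49.56, 87.25, 63.21),
    "ResU-Net++": (44.67, 88.28, 59.32),
    "Attention U-Net": (94.18, 77.01, 84.73),
    "Ours": (94.93, 81.44, 87.67),
}


def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition, assumed here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for method, (precision, recall, reported) in TABLE_2.items():
    computed = f1_from_precision_recall(precision, recall)
    print(f"{method:16s} computed {computed:6.2f}   reported {reported:6.2f}")
```

Under that assumption, the recomputed values agree with the reported F1-Score column to within rounding for every row. The check also shows why ResU-Net and ResU-Net++ score low on F1 despite having the highest recall: the harmonic mean is pulled down by their precision below 50%.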

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

