Article

Lung Nodule CT Image Segmentation Model Based on Multiscale Dense Residual Neural Network

Xinying Zhang, Shanshan Kong, Yang Han, Baoshan Xie and Chunfeng Liu
1 The Key Laboratory of Engineering Computing in Tangshan City, North China University of Science and Technology, Tangshan 063210, China
2 College of Science, North China University of Science and Technology, Tangshan 063210, China
3 Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan 063210, China
4 Tangshan Intelligent Industry and Image Processing Technology Innovation Center, North China University of Science and Technology, Tangshan 063210, China
5 Hebei Engineering Research Center for the Intelligentization of Iron Ore Optimization and Ironmaking Raw Materials Preparation Processes, North China University of Science and Technology, Tangshan 063210, China
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(6), 1363; https://doi.org/10.3390/math11061363
Submission received: 9 November 2022 / Revised: 3 March 2023 / Accepted: 8 March 2023 / Published: 10 March 2023
(This article belongs to the Special Issue Engineering Calculation and Data Modeling)

Abstract

To solve the problem of the low segmentation accuracy of U-Net on lung nodule CT images, an improved U-Net-based method for segmenting lung nodules is proposed. First, dense connections and a sawtooth-pattern dilated convolution design were added to the feature extraction part, and a local residual design was adopted in the upsampling path. The effectiveness of the proposed algorithm was then evaluated on the public LIDC-IDRI lung nodule dataset. The results showed that the improved algorithm outperformed the original U-Net segmentation algorithm by 7.03%, 14.05%, and 10.43% on the three evaluation indices DC, MIOU, and SE, respectively, and its accuracy was 2.45% higher than that of U-Net. Thus, the proposed method has an effective network structure.

1. Introduction

In 2020, there were 1,796,144 lung cancer deaths, accounting for 18% of all cancer deaths and making lung cancer the world’s leading cause of cancer-associated mortality [1]. If lung cancer is diagnosed early, the five-year survival rate increases significantly, from 14% to 49% [2]. In its early stage, lung cancer often presents as non-calcified lung nodules, so research on segmentation methods for lung nodules has important clinical value [3].
Pulmonary nodule segmentation methods can be grouped into traditional methods, deep learning methods, and other methods. Traditional methods are carefully designed according to the morphological and gray-level characteristics of pulmonary nodules and perform segmentation based on known nodule locations. They can be divided into threshold and region-growing methods, clustering methods, active contour models, and mathematical model optimization methods. Threshold and region-growing methods exploit the difference in gray values between regions. In 2003, building on threshold segmentation, Kostis et al. used a circular operator with a fixed radius to perform an opening operation that removed adhering vessels and pleura, and then a circular operator with a gradually decreasing radius to perform iterative dilation to restore boundary details; however, this operation resulted in over-segmentation [4]. In 2011, Kubota et al. used a coupled competitive diffusion algorithm to extract the foreground region and combined it with region growing to remove background structures, obtaining preliminary segmentation results [5]. This method is applicable to nodules of different densities; its disadvantage is that seed points must be manually selected on each slice for growing. Clustering methods group pixels in the target region by feature similarity. In 2015, Liu et al. integrated the gray-level and spatial information of nodule pixels into the objective function of a fuzzy C-means (FCM) clustering algorithm, which was effective for segmenting adherent and ground-glass nodules [6]. Liu et al. later proposed an adaptive FCM clustering algorithm that achieved faster and more accurate segmentation; however, for nodules with a diameter of less than 10 mm, the segmentation effect was poor [7]. In 2020, Li et al. used Gaussian mixture model (GMM) statistics to introduce prior information on nodules into the traditional FCM algorithm, which markedly improved segmentation accuracy by eliminating the uneven intensity of nodules and interference from surrounding tissues [8]. Deformable methods such as the active contour model (ACM) evolve a curve that stops at the boundary to obtain the target contour. In 2016, Nithila et al. combined an ACM with FCM clustering to segment nodules; although pulmonary parenchyma reconstruction based on the ACM reduced the error rate of nodule segmentation, the parameters were highly uncertain [9]. In 2017, to achieve 3D nodule segmentation, Farhangi et al. represented the prior shapes of nodules as sparse linear combinations of training shapes, which were integrated into the level-set objective function [10]. This method generalizes well, but its non-automatic initialization is unsatisfactory. Nodule segmentation based on mathematical model optimization has also been reported. In 2007, Wang et al. used rays emitted from the center of a 3D nodule to project it onto a 2D plane and then used a dynamic programming algorithm to determine the optimal contour [11]. Projecting the 3D nodule surface onto a curve simplifies the segmentation method and makes the results more reliable. In 2016, Gonçalves et al. calculated the shape index and curvature from the 3D Hessian matrix eigenvalues of each voxel and combined them to set an optimal threshold for segmenting nodules, realizing multi-scale nodule segmentation; however, this method was sensitive to tubular structures [12].
Traditional methods are based on explicit mathematical representations and are robust; they can achieve accurate segmentation without a large amount of labeled training data and are easy to integrate with anatomical knowledge and clinician experience. However, they depend strongly on manual intervention. For example, the threshold method has high segmentation efficiency and good repeatability, but threshold setting is largely empirical, and the segmentation of adherent nodules is not ideal. Clustering, region-growing, and dynamic programming methods incorporate pixel neighborhood information and have low computational complexity, but they are sensitive to the location of the initial seed points and to the distinction between the target boundary and the background; moreover, they are difficult to combine with prior knowledge, and both over-segmentation and under-segmentation are common. The active contour model can be combined with prior knowledge such as nodule shape and gray level and can self-adjust the parameters of its energy functional to improve segmentation accuracy; however, it depends on the initial contour position, is sensitive to noise, and its computational cost increases with the number of iterations.
Currently, neural networks and deep learning are widely used in medical imaging [13]. Deep learning uses neural network models trained on large amounts of image data to actively learn the low-level features of nodules and form more abstract high-level features with which to predict and segment images. According to their network structure, deep learning methods can be grouped into three types: segmentation methods based on convolutional neural networks, on fully convolutional neural networks, and on encoder–decoder networks. Since 2013, when Cernazanu-Glavan et al. used a convolutional neural network (CNN) to segment X-ray images, CNNs have been widely used because of their good feature extraction and expression abilities [14]. In 2019, Wang et al. combined a 3D Mask R-CNN with self-paced and active learning strategies and, using only 14.85% of the labeled images for training, obtained a segmentation accuracy almost equal to that of training with 100% labeled data [15]. In 2015, to reduce the loss of information caused by the fully connected layers in a CNN, Long et al. replaced the fully connected layers with convolutional layers and proposed the fully convolutional network (FCN), which has been widely adopted [16]. FCNs have fewer parameters and shorter computation times; feature maps are restored to the input image size by a final deconvolution layer, completing end-to-end image prediction. In 2019, Liu et al. replaced the convolutional layers with residual blocks to obtain a residual network; when segmenting nodules with a dual-branch fully convolutional residual network, local features and rich contextual information can be extracted from multiple perspectives and scales, improving segmentation accuracy [17]. In 2015, Ronneberger et al. proposed the encoder–decoder network U-Net, which uses the encoder to extract target features and the decoder to integrate the information and recover the resolution for finer segmentation [18]. Because of their low data requirements and good segmentation performance, U-Nets have been widely used, and many optimized networks have been derived from them. In 2018, Nikolov et al. demonstrated a 3D U-Net architecture that achieved expert-level performance in delineating head and neck organs at risk (OARs) that are commonly segmented in clinical practice [19]. In 2018, Tong et al. introduced bottleneck blocks into the encoder and decoder of U-Net [20]; the rectified linear unit (ReLU) function, a Dropout layer, and the Dice loss function, which replaced the traditional cross-entropy loss, improved the segmentation accuracy of pulmonary nodules, although the overall segmentation performance remained limited. In 2019, Chen et al. proposed a recursive ensemble model based on a 3D U-Net to segment OARs on magnetic resonance (MR) images [21]. In the same year, Man et al. introduced a geometry-aware U-Net to segment the pancreas [22], Lu et al. added a residual module to U-Net for accurate pancreas segmentation [23], and Amorim et al. fed images of three sections of pulmonary nodules and the corresponding gold standard into three branches for training and obtained robust segmentation results [24].
Deep learning methods can effectively segment all nodule types, overcoming the limitation that no single traditional method can handle every nodule type. With support from high-performance computers, deep learning can quickly complete fully automatic nodule segmentation. However, most current deep learning models rely on a large number of manually annotated slices for supervised training, which is extremely resource-consuming. Despite this limitation, deep learning remains significant for research on high-accuracy pulmonary nodule segmentation.
Other approaches to lung nodule segmentation also exist. In 2017, Zhang et al. trained a deep belief network to detect nodules larger than 30 mm and achieved more than 90% accuracy, sensitivity, and specificity [25]. In 2020, Suji et al. used a motion-based optical flow method to segment pulmonary nodules, verified its effectiveness, and proposed a way to improve segmentation efficiency [26]. These methods provide different ideas for pulmonary nodule segmentation and deserve further investigation.
Based on the U-Net structure, we combined the residual structure with the dense network principle to improve the feature extraction and upsampling of U-Net. The LIDC-IDRI dataset, which contains lung cancer CT images, was used as the training data. Under the same parameters, the improved U-Net was experimentally compared with FCN, SegNet, U-Net, ResNet, U-Net++, and DenseNet, and was shown to improve the segmentation of lung nodules.

2. Models and Methods

2.1. U-Net

U-Net is a semantic segmentation network developed from the fully convolutional neural network [18]. The network has 23 layers and a symmetric structure resembling the letter U; hence the name U-Net (Figure 1). The first half performs feature extraction and the second half performs upsampling, forming an encoder–decoder structure. In feature extraction, the input image passes through convolution and pooling layers, yielding feature maps at different levels that contain image features at different degrees of abstraction. In upsampling, deconvolution layers gradually recover the feature map size, and the upsampled feature maps are fused with the corresponding low-abstraction feature maps to repair details lost during downsampling and improve segmentation accuracy. The U-Net structure is simple, the number of layers is small, training does not require a large number of samples, and training is fast.
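To make the encoder–decoder pattern concrete, the following is a minimal sketch in Keras on the TensorFlow backend (the stack named in Section 3.3). The depth and filter counts are illustrative assumptions, not the exact 23-layer configuration of Figure 1.

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, as in a standard U-Net stage.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(input_shape=(96, 96, 1), base_filters=32):
    inputs = layers.Input(input_shape)
    skips, x = [], inputs
    # Encoder: convolution + pooling; feature maps are kept for skip connections.
    for i in range(3):
        x = conv_block(x, base_filters * 2 ** i)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, base_filters * 8)  # bottleneck
    # Decoder: upsample, then fuse with the matching encoder feature map.
    for i in reversed(range(3)):
        x = layers.Conv2DTranspose(base_filters * 2 ** i, 2, strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips[i]])
        x = conv_block(x, base_filters * 2 ** i)
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)  # foreground/background mask
    return Model(inputs, outputs)
```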

2.2. ResNet

In 2015, He et al. proposed the residual network (ResNet) model in the ImageNet image recognition competition [27]; by adding a shortcut channel, the network adds its input directly to the output of deeper layers. The idea is shown in Formula (1):
C(x) = x + F(x)   (1)
where x is the input, F(x) is the output of the hidden layers, and C(x) is the underlying mapping. The residual structure is shown in Figure 2. The residual connection improves the network’s convergence and pixel classification performance.
A Batch Normalization (BN) layer is added before each ReLU activation function, a BN layer is added after the last convolutional layer, and the ReLU activation function is applied after the element-wise addition. The BN layers speed up the network’s training and convergence and help prevent overfitting, while the residual connection effectively mitigates gradient vanishing. Replacing the convolutional layers in the upsampling part of U-Net with this improved residual structure avoids overfitting and reduces the gradient vanishing caused by a deeper network, improving the model’s segmentation performance. The improved residual structure is shown in Figure 3.
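A sketch of this improved residual block, under the same assumed Keras/TensorFlow stack, is given below; the filter count and the 1 × 1 shortcut projection (needed only when channel counts differ) are assumptions.

```python
from tensorflow.keras import layers

def improved_residual_block(x, filters):
    shortcut = x
    # F(x): BN before each ReLU, and BN again after the last convolution.
    y = layers.BatchNormalization()(x)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    y = layers.ReLU()(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    # Project the shortcut with a 1x1 convolution if channel counts differ.
    if shortcut.shape[-1] != filters:
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    out = layers.Add()([shortcut, y])  # C(x) = x + F(x), Formula (1)
    return layers.ReLU()(out)          # ReLU applied after the unit addition
```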

2.3. DenseNet

Densely Connected Convolutional Networks (DenseNet) establish connections between different layers, achieve feature reuse, and deliver excellent performance [28]. Introducing dense blocks into U-Net can improve network fitting and better address the loss of fine details in medical images. A schematic of a two-layer dense block is shown in Figure 4. Each layer of the dense block comprises three parts: a BN layer, a ReLU activation function, and a 3 × 3 convolution. Within the dense block, each layer is connected not only to the next layer but to all subsequent layers, so that every layer receives the feature maps of all preceding layers as additional input.
Dense connections can be applied between convolutional layers or between parts of the encoder and decoder [29]. In this paper, dense blocks are introduced into the encoder of U-Net, with a direct connection between any two layers of a dense block, which better addresses the loss of image details. The idea of dilated convolution is also introduced: dilated convolution enlarges the receptive field at the same computational cost. To avoid excess redundant parameters, global and local dense connections are combined to improve the encoder and the skip connections, reducing the overfitting caused by dense connections and retaining as much detailed feature information as possible while controlling the amount of computation (Figure 5).
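A minimal sketch of such a BN–ReLU–Conv(3 × 3) dense block follows, with an optional dilation rate for the enlarged receptive field; the growth rate and number of layers are assumptions.

```python
from tensorflow.keras import layers

def dense_block(x, num_layers=2, growth_rate=16, dilation=1):
    # Each layer sees the concatenation of all earlier outputs (feature reuse).
    features = [x]
    for _ in range(num_layers):
        y = layers.Concatenate()(features) if len(features) > 1 else features[0]
        y = layers.BatchNormalization()(y)
        y = layers.ReLU()(y)
        # dilation > 1 enlarges the receptive field at the same cost.
        y = layers.Conv2D(growth_rate, 3, padding="same", dilation_rate=dilation)(y)
        features.append(y)
    return layers.Concatenate()(features)
```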

2.4. Network Structure

In this paper, the Multiscale Dense Residual U-Net (MDRU-Net), based on the U-Net structure, is proposed to improve the segmentation of lung nodule images of different sizes. The network structure is shown in Figure 6. The feature extraction part adopts global and local dense network designs to reuse multiscale features of the input image in the channel dimension while controlling the number of parameters. Moreover, dilated convolution is introduced to expand the receptive field without reducing its coverage, so that each convolution output contains a large range of information at a reduced cost. In upsampling, the residual mechanism is used to effectively suppress the influence of network degradation and gradient fragmentation on segmentation accuracy.
The DNet module shown on the left of Figure 6 is the introduced dense structure. Dilated convolutions with a kernel size of 3 and dilation rates of 3 and 5 are adopted to form a sawtooth structure. The topology of the global dense connections is shown by the solid lines in the left half of Figure 6, where each layer of the network is connected to all following layers, making full use of the features; average pooling is used to reduce the loss of background information during pooling. The topology of the local dense connections is shown by the dashed lines in the left part of Figure 6. After dimensionality reduction by a 1 × 1 convolutional layer, feature maps with large scale changes in the feature extraction part are spliced together with the output of the backbone network and fed into the upsampling part; max pooling is adopted to retain the information of feature maps at different scales. The feature maps produced in the pooling process are given by Formulas (2) and (3):
X_i = Concat(W_1(X_1), W_2(X_2), …, W_{i−1}(X_{i−1}))   (2)
Y = Concat(X_out, W_conv1(Conv_1), …, W_conv4(Conv_4))   (3)
where X_i represents the new feature map obtained by concatenating the feature maps of all previous layers received by the i-th layer, Y denotes the feature map output by the encoder, X_out represents the feature map output by the encoder backbone network, W_i stands for the pooling operation, and W_convi denotes the convolution operation.
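Under the same assumptions, the sawtooth dilated stage and the splice of Formula (3) might be sketched as follows; the 1 × 1 filter count, the pooling factors, and the number of tap points are illustrative, not the paper’s exact configuration.

```python
from tensorflow.keras import layers

def sawtooth_dilated_stage(x, filters):
    # Kernel size 3 with dilation rates 3 then 5: the sawtooth structure of DNet.
    x = layers.Conv2D(filters, 3, padding="same", dilation_rate=3, activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", dilation_rate=5, activation="relu")(x)
    return x

def encoder_output(backbone_out, multiscale_maps):
    # Formula (3): Y = Concat(X_out, W_conv1(Conv_1), ..., W_conv4(Conv_4)).
    reduced = []
    for fmap in multiscale_maps:
        r = layers.Conv2D(16, 1, padding="same")(fmap)  # 1x1 dimensionality reduction
        factor = fmap.shape[1] // backbone_out.shape[1]
        if factor > 1:
            r = layers.MaxPooling2D(factor)(r)  # max pooling keeps per-scale detail
        reduced.append(r)
    return layers.Concatenate()([backbone_out] + reduced)
```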

3. Experiment

3.1. Introduction to Dataset

Lung cancer CT image training data were acquired from LIDC-IDRI, which consists of chest medical image files (such as CT and X-ray images) and the corresponding lesion annotations derived from diagnostic results [30]. A total of 1018 cases are included in this dataset, whose characteristics are summarized in Table 1.
Each image in the dataset was independently reviewed by four radiologists, who marked the lesion locations and categories. Three information types were included: (1) nodules ≥ 3 mm; (2) nodules < 3 mm; and (3) non-nodules ≥ 3 mm. Each case has an XML file storing the nodule information. For nodules ≥ 3 mm, nodule characteristics are described in terms of subtlety, internal structure, calcification, sphericity, margin, lobulation, spiculation, texture, and malignancy, among others. For nodules < 3 mm, the information consists of an image identifier of the nodule location and the coordinates of the nodule center point. Each image has 512 × 512 pixels, and 7379 images were used in the experiment.
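For orientation, a sketch of reading nodule outlines from a case’s XML annotation file is shown below; the tag names and namespace follow the commonly documented LIDC-IDRI schema and should be treated as assumptions to verify against the downloaded data.

```python
import xml.etree.ElementTree as ET

NS = {"nih": "http://www.nih.gov"}  # assumed LIDC XML namespace

def read_nodule_outlines(xml_path):
    root = ET.parse(xml_path).getroot()
    outlines = []
    for session in root.findall(".//nih:readingSession", NS):  # one per radiologist
        for nodule in session.findall("nih:unblindedReadNodule", NS):
            for roi in nodule.findall("nih:roi", NS):
                # Each ROI is one slice's outline, stored as edge-map points.
                points = [(int(p.find("nih:xCoord", NS).text),
                           int(p.find("nih:yCoord", NS).text))
                          for p in roi.findall("nih:edgeMap", NS)]
                if points:
                    outlines.append(points)
    return outlines
```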

3.2. Data Preprocessing

For the selected images, the positions of the pulmonary nodules were first marked according to the annotation file in each case folder. Because pulmonary nodules occupy a small proportion of the original CT images, a class imbalance problem arises that affects network training. Therefore, to reduce the influence of other lung tissues on the experimental results, the original images were cropped: based on the nodule center point provided in the annotation file, the original and label images were cut to 96 × 96 pixels, completely retaining the nodule information. The preprocessing results are shown in Figure 7.
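A sketch of this cropping step, assuming the nodule center (row, column) has already been read from the annotation file:

```python
import numpy as np

def crop_around_nodule(image: np.ndarray, center: tuple, size: int = 96) -> np.ndarray:
    """Crop a size x size patch from a 512 x 512 slice, clamped to the image border."""
    half = size // 2
    r = min(max(center[0], half), image.shape[0] - half)
    c = min(max(center[1], half), image.shape[1] - half)
    return image[r - half:r + half, c - half:c + half]

# The same crop is applied to the label mask so image and mask stay aligned.
```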

3.3. Model Training

The experiment was implemented on the Windows 10 operating system, and the deep learning network was developed and run using the Keras [31] library on the TensorFlow [32] backend. The environment was Python 3.9, the processor was an Intel i5-8265U 2.30 GHz CPU, and the memory was 8 GB. Following the data processing described in Section 3.2, the 7379 pulmonary nodule images obtained from LIDC-IDRI were divided into training, validation, and test sets at a ratio of 8:1:1. Because of the small number of labels in the pulmonary nodule segmentation data, the model is prone to overfitting; to improve the generalization ability of the network, the data were augmented by random cropping, random horizontal flips, and random rotation.
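These augmentations could be sketched with Keras preprocessing layers as follows; the rotation range and intermediate crop size are assumptions, and the image and mask are stacked along the channel axis so that both receive exactly the same random transform.

```python
import tensorflow as tf
from tensorflow.keras import layers

augment = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),   # up to about 36 degrees either way (assumption)
    layers.RandomCrop(88, 88),    # random crop (assumption), then resize back
    layers.Resizing(96, 96),
])

def augment_pair(image, mask):
    stacked = tf.concat([image, mask], axis=-1)           # (96, 96, 2)
    out = augment(stacked[tf.newaxis], training=True)[0]
    # Interpolation blurs the mask, so round it back to {0, 1}.
    return out[..., :1], tf.round(out[..., 1:])
```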
A ten-fold cross-validation strategy was used to assess the performance of the method, and a similar data distribution was maintained in the training and test sets to avoid over- and under-segmentation due to data imbalance. The Adam optimization algorithm was used for parameter training. In the standard backpropagation update, the initial learning rate was set to 0.05, and a step decay strategy with a period of 10 was used; that is, the learning rate was halved every 10 epochs. The batch size was set to 2, the number of training iterations was 100, and the momentum factor was 0.9.
The number of training iterations is very important when training a deep learning model and can be determined by observing the curves of the training and validation sets during training. If the model’s performance does not improve further, training stops automatically after 10 additional epochs.
As shown in Figure 8 and Figure 9, at Epoch = 100 the Dice similarity coefficient (DC) and loss (Loss) curves of the network on the validation set tended to be stable. Therefore, the number of training iterations was set to 100. Moreover, for the network to be adequately trained, the number of steps per epoch was set to 500.
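Taken together, the training configuration above might be sketched as follows; model is assumed to come from an earlier sketch, dice_loss from Section 3.4, and train_ds/val_ds are assumed to be repeating tf.data pipelines already batched with batch size 2.

```python
import tensorflow as tf

def step_decay(epoch, lr):
    # Halve the learning rate at the start of every 10th epoch.
    return lr * 0.5 if epoch > 0 and epoch % 10 == 0 else lr

callbacks = [
    tf.keras.callbacks.LearningRateScheduler(step_decay),
    # Stop if no improvement is seen for 10 further epochs, as described above.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                     restore_best_weights=True),
]

# Adam's default beta_1 = 0.9 corresponds to the momentum factor of 0.9.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.05),
              loss=dice_loss)  # dice_loss as sketched in Section 3.4

model.fit(train_ds, validation_data=val_ds,
          epochs=100, steps_per_epoch=500, callbacks=callbacks)
```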

3.4. Evaluation Index

The loss function plays an important role in the training of neural networks. It measures the inconsistency between the real value and the value predicted by the model and is backpropagated to the previous layers to update and optimize the weights. Image segmentation can essentially be cast as a foreground/background classification problem. When a sample belongs to the positive class and the classifier correctly predicts it as positive, it is called a true positive (TP); when a negative sample is wrongly predicted as positive, it is called a false positive (FP). Similarly, when a negative sample is correctly predicted as negative, it is called a true negative (TN), and when a positive sample is wrongly predicted as negative, it is called a false negative (FN). In the following equations, the real image (or expert annotation) is denoted as T ∈ [0, 1]^{m×n}; the prediction (or segmentation) is denoted as P ∈ [0, 1]^{m×n}; the index n ranges over the pixels of the image space N; and each class label belongs to the class set C. The following three evaluation indices were selected.
1. Dice Coefficient (DC) loss
The DC is a function used to measure the similarity of sets and performs well as a loss function [22]. Its values range from 0 to 1, where a DC of 1 indicates perfect and complete overlap. The DC is defined as:
DICE(T, P) = 2 × Σ_{n=1}^{N}(T_n × P_n) / Σ_{n=1}^{N}(T_n + P_n)
The DC loss, defined below, tends to provide the best segmentation:
Loss_DC(T, P) = 1 − DC(T, P)
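A direct translation of these two definitions for the Keras/TensorFlow stack of Section 3.3 might look as follows; the smoothing constant is an added assumption for numerical stability.

```python
import tensorflow as tf

def dice_coefficient(t, p, smooth=1e-6):
    t = tf.reshape(t, [-1])
    p = tf.reshape(p, [-1])
    # DC(T, P) = 2 * sum(T_n * P_n) / sum(T_n + P_n)
    return (2.0 * tf.reduce_sum(t * p) + smooth) / (tf.reduce_sum(t + p) + smooth)

def dice_loss(t, p):
    # Loss_DC(T, P) = 1 - DC(T, P)
    return 1.0 - dice_coefficient(t, p)
```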
2. Sensitivity (SE)
SE = TP / (TP + FN)
3. Mean Intersection over Union (MIOU)
The MIOU is the ratio of the intersection to the union of the sets of true and predicted values, which can be expressed as the ratio of TP (intersection) to the sum of TP, FP, and FN (union). The closer the value is to 1, the more accurate the segmentation. The calculation formula is:
MIOU = (1 / (k + 1)) × Σ_{i=0}^{k} p_ii / (Σ_{j=0}^{k} p_ij + Σ_{j=0}^{k} p_ji − p_ii)
which is equivalent to:
MIOU = (1 / (k + 1)) × Σ_{i=0}^{k} TP / (FN + FP + TP)
where i is the true class, j is the predicted class, p_ij is the number of pixels whose true class is i but are predicted as class j, and p_ii is the number of correctly predicted pixels. p_ij and p_ji correspond to false positives and false negatives, respectively, and k + 1 is the number of classes (including the empty class). The MIOU is generally calculated per class: the Intersection over Union (IOU) of each class is calculated and then averaged to obtain a global evaluation.
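For binary masks (k + 1 = 2 classes, nodule and background), SE and MIOU reduce to the confusion counts defined above; a sketch, assuming numpy and already-binarized inputs:

```python
import numpy as np

def se_and_miou(truth: np.ndarray, pred: np.ndarray):
    t, p = truth.astype(bool), pred.astype(bool)
    tp = np.sum(t & p)
    fp = np.sum(~t & p)
    fn = np.sum(t & ~p)
    tn = np.sum(~t & ~p)
    se = tp / (tp + fn)               # SE = TP / (TP + FN)
    iou_fg = tp / (tp + fp + fn)      # IOU of the nodule class
    iou_bg = tn / (tn + fp + fn)      # IOU of the background class
    return se, (iou_fg + iou_bg) / 2.0  # MIOU: mean over the two classes
```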

3.5. Contrast Experiment

To verify the segmentation effect of the proposed MDRU-Net model, six models, namely FCN, SegNet, U-Net, ResNet, U-Net++, and DenseNet, were each trained using the same training data and training parameters. The comparative experimental results are shown in Table 2.
In Table 2, the segmentation performance of the FCN network is the lowest because the FCN is insensitive to details in the nodule region, ignores the relationships between pixels, and cannot learn global context information. Since SegNet does not use skip connections, it cannot pass shallow feature information to deeper levels, and global context information cannot be exploited. Although U-Net uses skip connections, its expression of nodule features is weak and its segmentation results are not fine enough; the DC, MIOU, and SE values of these three models are all low. ResNet introduces a residual structure that effectively avoids gradient vanishing and gradient explosion, and all of its indices improve. U-Net++ further improves on U-Net with a multi-scale fusion strategy, and its indicators are accordingly better than those of the U-Net model. DenseNet establishes dense connections between all preceding and following layers, achieving better performance than ResNet with fewer parameters and lower computational cost. The Dice coefficient, MIOU, SE, and accuracy of the proposed MDRU-Net on the LIDC-IDRI test set were 92.37%, 87.13%, 91.77%, and 97.68%, respectively. Compared with the traditional U-Net, the four indices improved by 7.03%, 14.05%, 10.43%, and 2.45%, respectively, indicating that the proposed MDRU-Net model has clear advantages: it reuses the multi-scale features of the image, making the segmentation results more precise and optimizing network performance.

4. Experimental Results

Combined with the comparison experiment, the lesion visualization results for pulmonary nodule images are shown in Figure 10. Figure 10a–i show the original image, the gold-standard segmentation of pulmonary nodules labeled by professional physicians, and the segmentation results of the FCN, SegNet, U-Net, ResNet, U-Net++, DenseNet, and MDRU-Net models. The parts in the red boxes highlight the finer segmentation of the proposed MDRU-Net compared with the other models, making the comparison more intuitive.
Because the FCN model structure is relatively simple, it is not sensitive to details in the image; Figure 10c shows that the segmentation produced by the FCN is very rough and contains a large number of false positive areas. By introducing an unpooling structure, the SegNet model achieves more accurate upsampling; compared with the FCN model, SegNet reduces the false positive areas (Figure 10d) and improves the segmentation. U-Net uses skip connections to make full use of low-level features and further improves the segmentation (Figure 10e). However, problems such as low nodule resolution and blurred edges may lead these models to assign some non-nodule tissue to the foreground, with obvious false positives and unsatisfactory segmentation. The ResNet model alleviates such outcomes by introducing a residual structure and improving the segmentation (Figure 10f). U-Net++ averages the segmentation maps sampled at different layers as the final segmentation result, improving the accuracy of pulmonary nodule segmentation; however, the importance of different layers is not considered, so some false positive pixels remain (Figure 10g). DenseNet establishes dense connections between all front and back layers through the dense connection mechanism, further refining the segmentation (Figure 10h). The MDRU-Net model proposed in this paper makes full use of multi-scale feature information. In Figure 10i, MDRU-Net is superior to the other models in processing edges, and the edge contour of the pulmonary nodules is visibly clearer. These findings show that the MDRU-Net model can identify finer edges, and its segmentation result is closest to the gold standard.

5. Discussion

To address the low accuracy of U-Net in lung nodule CT image segmentation, this paper proposes a pulmonary nodule segmentation method based on an improved U-Net structure. The feature extraction part adopts a dense network: the global dense connections improve the transmission of image features at different scales, while the local dense connections enable further reuse of feature maps with large scale changes and counteract the loss of feature information during downsampling. Dilated convolutions with different dilation rates enlarge the receptive field, enabling the network to capture contextual information. The residual design in the upsampling part effectively suppresses the network degradation and gradient fragmentation problems. Comparative experiments show that the proposed network model performs better than other mainstream algorithms; it is therefore applicable as a base network that can be combined with various attention mechanisms and other models for further improvement. Moreover, although the proposed method improves the segmentation performance of the U-Net model for lung nodules in CT images, the segmentation accuracy can be improved further, which remains to be studied in future research.

Author Contributions

Conceptualization, C.L.; Methodology, X.Z.; Validation, X.Z.; Formal analysis, B.X.; Resources, Y.H.; Writing—original draft, X.Z.; Writing—review & editing, S.K. and B.X.; Project administration, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Key Science and Technology Project of Hebei Provincial Department of Education (North China University of Science and Technology, Project Number: JYG2020001).

Institutional Review Board Statement

LIDC-IDRI is a public database; the studies involving the patients in the database received ethical approval. Users can download the data free of charge for research purposes and publish related articles. Our study is based on open-source data, so there are no ethical issues or other conflicts of interest.

Informed Consent Statement

LIDC-IDRI is a public database; the studies involving the patients in the database received ethical approval. Users can download the data free of charge for research purposes and publish related articles. Our study is based on open-source data, so there are no ethical issues or other conflicts of interest.

Data Availability Statement

The data can be found in The Cancer Imaging Archive (TCIA) public access collection: The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): A completed reference database of lung nodules on CT scans (LIDC-IDRI).

Acknowledgments

We thank the Hebei Provincial Education Department Science and Technology Project (North China University of Science and Technology, Project Number: JYG2020001) for supporting this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sung, H.; Ferlay, J.; Siegel, R.L.; Laversanne, M.; Soerjomataram, I.; Jemal, A.; Bray, F. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 2021, 71, 209–249. [Google Scholar] [CrossRef]
  2. Gurcan, M.N.; Sahiner, B.; Petrick, N.; Chan, H.P.; Kazerooni, E.A.; Cascade, P.N.; Hadjiiski, L. Lung nodule detection on thoracic computed tomography images: Preliminary evaluation of a computer-aided diagnosis system. Med. Phys. 2002, 29, 2552–2558. [Google Scholar] [CrossRef] [Green Version]
  3. Lopez Torres, E.; Fiorina, E.; Pennazio, F.; Peroni, C.; Saletta, M.; Camarlinghi, N.; Fantacci, M.E.; Cerello, P. Large scale validation of the M5L lung CAD on heterogeneous CT datasets. Med. Phys. 2015, 42, 1477–1489. [Google Scholar] [CrossRef] [PubMed]
  4. Kostis, W.J.; Reeves, A.P.; Yankelevitz, D.F.; Henschke, C.I. Three-dimensional segmentation and growth-rate estimation of small pulmonary nodules in helical CT images. IEEE Trans. Med. Imaging 2003, 22, 1259–1274. [Google Scholar] [CrossRef]
  5. Kubota, T.; Jerebko, A.K.; Dewan, M.; Salganicoff, M.; Krishnan, A. Segmentation of pulmonary nodules of various densities with morphological approaches and convexity models. Med. Image Anal. 2011, 15, 133–154. [Google Scholar] [CrossRef]
  6. Liu, H.; Zhang, C.M.; Su, Z.Y.; Wang, K.; Deng, K. Research on a pulmonary nodule segmentation method combining fast self-adaptive FCM and classification. Comput. Math. Methods Med. 2015, 2015, 185726. [Google Scholar] [CrossRef] [PubMed]
  7. Liu, H.; Geng, F.; Guo, Q.; Zhang, C.; Zhang, C. A fast weak-supervised pulmonary nodule segmentation method based on modified self-adaptive FCM algorithm. Soft Comput. 2018, 22, 3983–3995. [Google Scholar] [CrossRef]
  8. Li, X.; Li, B.; Liu, F.; Yin, H.; Zhou, F. Segmentation of pulmonary nodules using a GMM fuzzy C-means algorithm. IEEE Access 2020, 8, 37541–37556. [Google Scholar] [CrossRef]
  9. Nithila, E.E.; Kumar, S.S. Segmentation of lung nodule in CT data using active contour model and Fuzzy C-mean clustering. Alex. Eng. J. 2016, 55, 2583–2588. [Google Scholar] [CrossRef] [Green Version]
  10. Farhangi, M.M.; Frigui, H.; Seow, A.; Amini, A.A. 3-d active contour segmentation based on sparse linear combination of training shapes (scots). IEEE Trans. Med. Imaging 2017, 36, 2239–2249. [Google Scholar] [CrossRef]
  11. Wang, J.; Engelmann, R.; Li, Q. Segmentation of pulmonary nodules in three-dimensional CT images by use of a spiral-scanning technique. Med. Phys. 2007, 34, 4678–4689. [Google Scholar] [CrossRef] [PubMed]
  12. Gonçalves, L.; Novo, J.; Campilho, A. Hessian based approaches for 3D lung nodule segmentation. Expert Syst. Appl. 2016, 61, 1–15. [Google Scholar] [CrossRef]
  13. Aresta, G.; Jacobs, C.; Araújo, T.; Cunha, A.; Ramos, I.; van Ginneken, B.; Campilho, A. iW-Net: An automatic and minimalistic interactive lung nodule segmentation deep network. Sci. Rep. 2019, 9, 11591. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  14. Cernazanu-Glavan, C.; Holban, S. Segmentation of bone structure in X-ray images using convolutional neural network. Adv. Electr. Comput. Eng. 2013, 13, 87–94. [Google Scholar] [CrossRef]
  15. Wang, W.; Feng, R.; Chen, J.; Lu, Y.; Chen, T.; Yu, H.; Chen, D.; Wu, J. Nodule-plus R-CNN and deep self-paced active learning for 3D instance segmentation of pulmonary nodules. IEEE Access 2019, 7, 128796–128805. [Google Scholar] [CrossRef]
  16. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 12 June 2015; pp. 3431–3440. [Google Scholar]
  17. Liu, H.; Cao, H.; Song, E.; Ma, G.; Xu, X.; Jin, R.; Jin, Y.; Hung, C.C. A cascaded dual-pathway residual network for lung nodule segmentation in CT images. Phys. Med. 2019, 63, 112–121. [Google Scholar] [CrossRef] [Green Version]
  18. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  19. Nikolov, S.; Blackwell, S.; Zverovitch, A.; Mendes, R.; Livne, M.; De Fauw, J.; Patel, Y.; Meyer, C.; Askham, H.; Romera-Paredes, B.; et al. Deep learning to achieve clinically applicable segmentation of head and neck anatomy for radiotherapy. arXiv 2018, arXiv:1809.04430. [Google Scholar]
  20. Tong, G.; Li, Y.; Chen, H.; Zhang, Q.; Jiang, H. Improved U-NET network for pulmonary nodules segmentation. Optik 2018, 174, 460–469. [Google Scholar] [CrossRef]
  21. Chen, H.; Lu, W.; Chen, M.; Zhou, L.; Timmerman, R.; Tu, D.; Nedzi, L.; Wardak, Z.; Jiang, S.; Zhen, X.; et al. A recursive ensemble organ segmentation (reos) framework: Application in brain radiotherapy. Phys. Med. Biol. 2019, 64, 025015. [Google Scholar] [CrossRef]
  22. Man, Y.; Huang, Y.; Feng, J.; Li, X.; Wu, F. Deep q learning driven ct pancreas segmentation with geometry-aware u-Net. IEEE Trans. Med. Imaging 2019, 38, 1971–1980. [Google Scholar] [CrossRef] [Green Version]
  23. Lu, L.; Jian, L.; Luo, J.; Xiao, B. Pancreatic segmentation via ringed residual u-Net. IEEE Access 2019, 7, 172871–172878. [Google Scholar] [CrossRef]
  24. Amorim, P.H.; de Moraes, T.F.; da Silva, J.V.; Pedrini, H. Lung nodule segmentation based on convolutional neural networks using multi-orientation and patchwise mechanisms. In Proceedings of the VipIMAGE 2019: VII ECCOMAS Thematic Conference on Computational Vision and Medical Image Processing, Porto, Portugal, 16–18 October 2019; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; pp. 286–295. [Google Scholar]
  25. Zhang, T.; Zhao, J.; Luo, J.; Qiang, Y. Deep belief network for lung nodules diagnosed in CT imaging. Int. J. Perform. Eng. 2017, 13, 1358. [Google Scholar] [CrossRef]
  26. Suji, R.J.; Bhadouria, S.S.; Dhar, J.; Godfrey, W.W. Optical flow methods for lung nodule segmentation on LIDC-IDRI images. J. Digit. Imaging 2020, 33, 1306–1324. [Google Scholar] [CrossRef] [PubMed]
  27. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  28. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
  29. Wang, Z.H.; Liu, Z.; Song, Y.Q.; Zhu, Y. Densely connected deep u-Net for abdominal multi-organ segmentation. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 1415–1419. [Google Scholar]
  30. Armato, S.G., III; McLennan, G.; Bidaut, L.; McNitt-Gray, M.F.; Meyer, C.R.; Reeves, A.P.; Zhao, B.; Aberle, D.R.; Henschke, C.I.; Hoffman, E.A.; et al. The lung image database consortium (LIDC) and image database resource initiative (IDRI): A completed reference database of lung nodules on CT scans. Med. Phys. 2011, 38, 915–931. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  31. Gulli, A.; Pal, S. Deep Learning with Keras; Packt Publishing Ltd.: Birmingham, UK, 2017. [Google Scholar]
  32. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. Tensorflow: A system for large-scale machine learning. In Proceedings of the Osdi, Savannah, GA, USA, 2–4 November 2016; Volume 16, pp. 265–283. [Google Scholar]
Figure 1. The U-Net structure.
Figure 2. The residual block.
Figure 3. The improved residual block.
Figure 4. Dense blocks.
Figure 5. Improved dense block internal structure.
Figure 6. Schematic presentation of the MDRU-Net structure.
Figure 7. Data preprocessing.
Figure 8. Dice similarity coefficient curve.
Figure 9. Loss function curve.
Figure 10. Comparisons of the visual effects of the experiment. (a) Original image; (b) gold standard; (c) FCN; (d) SegNet; (e) U-Net; (f) ResNet; (g) U-Net++; (h) DenseNet; and (i) MDRU-Net.
Table 1. Dataset characteristics.

Collection statistics (updated 21 March 2012):
Data size: 124 GB
Image types: CT, DX, CR
Number of images: 244,527
Number of patients: 1010
Number of series: 1018 CT; 290 CR/DX
Number of studies: 1308
Table 2. Comparative experimental results.

Model       DC       MIOU     SE       Accuracy
FCN         65.77%   53.69%   71.23%   94.13%
SegNet      82.31%   74.29%   84.81%   96.24%
U-Net       80.31%   73.08%   80.56%   95.23%
ResNet      87.91%   82.12%   81.34%   95.97%
U-Net++     90.87%   85.90%   90.23%   96.63%
DenseNet    88.21%   81.26%   85.76%   96.69%
MDRU-Net    92.37%   87.13%   91.77%   97.68%
