Article

Effects of Image Size on Deep Learning

Clinical Physiology (Lund), Department of Clinical Sciences (Lund), Lund University, Skåne University Hospital (2nd Floor, Hall B), 221 85 Lund, Sweden
Electronics 2023, 12(4), 985; https://doi.org/10.3390/electronics12040985
Submission received: 12 December 2022 / Revised: 2 February 2023 / Accepted: 14 February 2023 / Published: 16 February 2023

Abstract

In this work, the best size for late gadolinium enhancement (LGE) magnetic resonance imaging (MRI) images in the training dataset was determined to optimize deep learning training outcomes. Non-extra pixel and extra pixel interpolation algorithms were used to determine the new size of the LGE-MRI images. A novel strategy was introduced to handle interpolation masks and remove extra class labels in interpolated ground truth (GT) segmentation masks. The expectation maximization, weighted intensity, a priori information (EWA) algorithm was used for the quantification of myocardial infarction (MI) in automatically segmented LGE-MRI images. Arbitrary threshold, comparison of the sums, and sums of differences are methods used to estimate the relationship between semi-automatic or manual and fully automated quantification of myocardial infarction (MI) results. The relationship between semi-automatic and fully automated quantification of MI results was found to be closer in the case of bigger LGE MRI images (55.5% closer to manual results) than in the case of smaller LGE MRI images (22.2% closer to manual results).

1. Introduction

In this study, the main objective is to determine the best size of LGE-MRI images in the training datasets to achieve optimal deep learning-based segmentation outcomes. Deep learning is a subfield of machine learning and refers to a particular class of neural networks [1,2,3,4,5]. Neural networks are the backbone of deep learning algorithms, and unlike their shallow counterparts, deep neural networks can directly process raw input data, including images, text, and sound [5]. In deep learning, the class of deep neural networks most commonly applied to visual imagery is the convolutional neural network (CNN) [3,5,6]. Figure 1 shows a simplified schematic representation of two of the most commonly used deep learning architectures applicable to visual imagery [7]. As can be seen in Figure 1, one type of deep neural network architecture can also form the backbone of more sophisticated architectures for advanced applications [5,7,8,9]. In this paper, the CNN architecture of interest is U-net. U-net was chosen not only because it outperformed the then-best sliding-window convolutional network approach and won many challenges, but also because it could provide fast and precise segmentation of heart images [10]. Typically, image segmentation locates object boundaries in the image to simplify or change the image into something more meaningful and/or easier to analyze [11,12,13,14,15]. In medical image analysis, segmentation is the stage where a significant commitment is made to delineate structures of interest and discriminate them from background tissue, although this kind of separation is generally effortless and swift for the human visual system [16,17,18,19]. In this work, U-net was dedicated to that stage to ensure swift and accurate delineations and discriminations.
The current literature shows that many works have been proposed for the segmentation of medical images using U-net or closely related variants [20,21,22,23,24,25,26,27,28,29,30,31].
For example, in [26], the author focused on different values of the regularization hyperparameter to evaluate the effects such values had on the quality of semantic segmentation with U-net against GT segmentation. Regarding the tuning of other training hyperparameters, the author adopted a strategy of manually performing new adjustments only when 10% of all epochs were reached before achieving 90% validation accuracy. A comparison of semantic segmentation with U-net against GT segmentation results demonstrated that a small value of L2 regularization could produce U-net segmentation results much closer to ground truth segmentation results. However, the effects of such a regularization hyperparameter on the fully automated quantification of MI were not studied in [26]. Therefore, in [32], the author presented preliminary work related to fully automating the quantification of MI. Here, the author chose the regularization hyperparameter value following the recommendations given in [26]. In [32], the quantification algorithm known as EWA, incorporated into the Segment CMR software, quantified the infarct scar sizes during the fully automated quantification of MI. EWA is based on expectation maximization and a weighted intensity, and in [33] the authors showed that it might serve as a clinical standard for the quantification of MI in LGE-MRI images. Normally, quantification algorithms are applied to segmented structures to extract essential diagnostic information such as shape, size, texture, angle, and motion [16]. As the types of measurement and tissue vary considerably, numerous quantification techniques, including EWA, have been developed to address specific applications [16,33]. In the preliminary work presented in [32], the author demonstrated that more than 50% of the average infarct scar volume, 75% of the infarct scar percentage, and 65% of the microvascular obstruction (mo) percentage were achieved with the EWA algorithm. However, in both previous works [26,32], the effects of the size of LGE-MRI images in the training datasets on the deep learning training outcome or the output of deep learning algorithms were not studied. Therefore, in this paper, the author studied such effects using different interpolation algorithms. To the best of the author’s knowledge, image interpolation algorithms are divided into two major categories: non-extra pixel and extra pixel interpolation algorithms [34]. Unlike the extra-pixel approach, the non-extra-pixel approach only uses original or source image pixels to produce interpolated images of the desired size [35]. Selected examples of interpolation algorithms from both categories are provided in Part 2, Section 2.2. Given that a non-extra pixel category algorithm, such as nearest neighbor interpolation, is routinely used to interpolate ground truth masks due to its inherent advantage of not creating non-original or extra class labels in the interpolated masks (during dataset image resizing), in this work the author demonstrated the possibility and importance of improving the deep learning-based segmentation and MI quantification results by resizing images in the training datasets using extra pixel approach-based interpolation algorithms.
In brief, the author first determined the new size of LGE-MRI images of the reference training datasets using extra-pixel approach-based interpolation algorithms, and corrected errors or removed extra class labels in interpolated ground truth segmentation masks using a novel strategy developed for handling interpolated masks. In this way, the author was able to evaluate how the change in image size improves or worsens the predictive capability or performance of deep learning-based U-net via semantic segmentation and quantification operations. It is important to note that, in this context, U-net is used as an existing and well-documented method to carry out deep learning-based semantic segmentation operations. It is also important to note that the nearest neighbor image interpolation algorithm normally produces heavy visual texture and edge artefacts that reduce or worsen the quality of interpolated images.
The fully automated quantification of MI was achieved by the EWA algorithm applied to the outcome of automatic semantic segmentation with U-net. During experiments, common class metrics were used to evaluate the quality of semantic segmentation with U-net against the GT segmentation. Additionally, arbitrary threshold, comparison of the sums, and sums of differences were used as criteria or options to estimate the relationship between the semi-automatic and fully automated quantification of MI results. After experimental simulations, a stronger or closer relationship was found between semi-automatic and fully automated quantification of MI results with the larger LGE-MRI image dataset than with the smaller one.
In the next parts of this paper, the word manual may refer to semi-automatic or medical experts-based results, while the word automated refers to fully automated or U-net-based results. The rest of the paper is organized as follows: Part II presents the materials and methods used to demonstrate these effects. Part III presents a description of the dataset used, metrics, methods, U-net settings, and graphic card information. Part IV presents discussions related to the experimental results. Part V gives the conclusion of this work.

2. Materials and Methods

2.1. U-Net Architecture

U-Net is a CNN architecture widely used for semantic segmentation tasks [10]. It features a U-shaped design comprising contracting and expansive paths.
In our experiments, we used the unetLayers function in MATLAB to easily create a U-Net architecture for semantic segmentation. This function follows the U-shaped architecture described in the original U-Net paper [10]. The contracting path consists of repeating blocks of convolution, ReLU activation, and max pooling. The expansive path involves transposed convolution, ReLU activation, concatenation with the corresponding feature map from the contracting path, and additional convolution. The unetLayers function provides options to customize the network, but note that it is just one implementation of the U-Net architecture. For more information, refer to the MATLAB documentation [36,37]. Figure 2 briefly shows the input and output layers, as well as the intermediate layers and connections, of the deep learning network as visualized by the analyzeNetwork function in MATLAB.
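As a minimal sketch (not necessarily the exact configuration used in this study), a three-class U-Net for 256 × 256 grayscale LGE-MRI slices can be created with unetLayers as follows; the encoder depth shown is an assumption.

```matlab
% Minimal sketch: build a U-Net for three-class segmentation of 256 x 256
% grayscale LGE-MRI slices with MATLAB's unetLayers.
% The encoder depth of 4 is an assumption, not the study's documented setting.
imageSize  = [256 256 1];   % grayscale input
numClasses = 3;             % classes corresponding to the 0-, 128- and 255-pixel labels
lgraph = unetLayers(imageSize, numClasses, 'EncoderDepth', 4);
analyzeNetwork(lgraph);     % inspect layers and connections, as in Figure 2
```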

2.2. Selected Methods for Image Interpolation

Interpolation is a technique that pervades many applications [34,38,39,40,41,42]. Interpolation is rarely the goal in itself, yet it affects both the desired results and the ways to obtain them [16]. In this work, the nearest neighbor, bicubic, and Lanczos interpolation algorithms are used to determine the new size of LGE-MRI images in the training datasets, due to their acceptable performance and popularity in image processing and analysis software [35,43,44,45,46].

2.2.1. Nearest Neighbor Interpolation

Nearest neighbor interpolation (NN) is the fastest image interpolation method and belongs to the non-extra pixel category [35,43,45]. NN does not use a weighting function; instead, it is based on (linear scaling and) rounding functions that decide which pixel to copy from the source to the destination image [35,43,45].
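For illustration only, the sketch below shows one way the scale-and-round pixel selection can be written in MATLAB; the file name and target size are placeholders, and this is not the study's implementation.

```matlab
% Sketch: core of nearest neighbor interpolation, mapping each destination
% pixel to one source pixel by linear scaling and rounding (no weighting).
src = imread('lge_slice_128.png');            % hypothetical 128 x 128 grayscale input
[srcH, srcW] = size(src);
dstH = 256; dstW = 256;                       % desired output size
[c, r] = meshgrid(1:dstW, 1:dstH);            % destination pixel coordinates
srcR = min(max(round(r * srcH / dstH), 1), srcH);   % scaled and rounded rows
srcC = min(max(round(c * srcW / dstW), 1), srcW);   % scaled and rounded columns
dst = src(sub2ind([srcH srcW], srcR, srcC));  % copy selected source pixels
```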

2.2.2. Bicubic Interpolation

Bicubic interpolation (BIC) is an extension of cubic interpolation for interpolating data points on a two-dimensional regular grid that belongs to the extra pixel category [35,44]. BIC uses a weighted average of 16 samples to achieve the interpolated value of the new pixel sample [44].

2.2.3. Lanczos3 Interpolation

Lanczos interpolation (LCZ) is based on the 3-lobed Lanczos window function as the interpolation function [46,47]. LCZ also belongs to the extra pixel category [35]. LCZ uses source image pixels (36 pixels) and interpolates some pixels along the x-axis and y-axis to produce intermediate results [46,47].
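In practice, all three methods are available through MATLAB's imresize function; the sketch below (with a placeholder file name) upscales a 128 × 128 LGE-MRI image to 256 × 256, the sizes used in this study.

```matlab
% Sketch: resize a 128 x 128 LGE-MRI image to 256 x 256 with the three
% interpolation methods discussed above (file name is a placeholder).
I = imread('lge_slice_128.png');
I_nn  = imresize(I, [256 256], 'nearest');    % non-extra pixel (NN)
I_bic = imresize(I, [256 256], 'bicubic');    % extra pixel (BIC)
I_lcz = imresize(I, [256 256], 'lanczos3');   % extra pixel (LCZ, 3-lobed kernel)
```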

2.3. Histogram Visualization of Interpolated GT Segmentation Masks

After changing the size of LGE-MRI images and GT segmentation masks in the reference training dataset by interpolation, there is a risk that class labels are misplaced in the interpolated GT segmentation masks, or that extra classes or class labels are created in mask regions where they should not be present. To visualize and examine possible extra class labels after GT segmentation mask interpolation, the histogram visualization technique is used, and histograms of interpolated GT segmentation masks are presented in Figure 3. Figure 3 (top left) shows the histogram of the non-interpolated GT mask of size 128 × 128. Figure 3 (top right) shows the histogram of the NN-interpolated GT mask of size 256 × 256. In both cases (Figure 3, top left and top right), the histograms look the same: both show three classes regardless of how the images were obtained. In that case, NN interpolation did not change the number of classes of the original GT segmentation mask, because NN does not create extra pixels in the interpolated GT segmentation masks [35]. Figure 3 (bottom left) and Figure 3 (bottom right) show histograms of the BIC- and LCZ-interpolated GT segmentation masks, respectively. As can be seen, in these two cases the histograms do not look the same; moreover, they show more than the expected three classes. In the Figure 3 (bottom left) and (bottom right) cases, the BIC and LCZ interpolation algorithms changed the number of classes of the original GT segmentation mask, thus requiring the removal of extra class labels to keep the original number of classes unchanged. Note that, while NN did not create extra pixels in the interpolated GT segmentation masks, the reduced quality of NN-interpolated images due to heavy artefacts necessitated the use of extra pixel interpolation algorithms despite the extra effort required.
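This check is easy to reproduce; the sketch below (with a placeholder file name) lists the distinct labels and plots the histogram of a bicubic-interpolated mask, where the expected labels are 0, 128, and 255.

```matlab
% Sketch: inspect which class labels survive interpolation of a GT mask
% (file name is a placeholder; expected labels are 0, 128 and 255).
gt     = imread('gt_mask_128.png');            % original GT segmentation mask
gt_bic = imresize(gt, [256 256], 'bicubic');   % extra-pixel interpolation
disp(unique(gt));       % expected: 0, 128, 255
disp(unique(gt_bic));   % intermediate values indicate extra class labels
imhist(gt_bic);         % histogram view, as in Figure 3 (bottom left)
```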

2.4. A Novel Strategy for Removing Extra Class Labels in Interpolated GT Segmentation Mask

First, it is important to remember that nearest neighbor interpolation would be the simplest option to interpolate GT masks due to its inherent advantage of not creating non-original or extra class labels in the interpolated masks. The only problem is the deterministic rounding function on which its pixel selection strategy is based [48]. Such a strategy slightly shifts the entire image content and is responsible for creating heavy jagged artefacts in interpolation results [35,45,48]. Additionally, it is important to remember that extra-pixel category-based interpolation algorithms do not shift the image content and do not produce heavy jagged artefacts. The only problem is that their weighting functions create extra class labels once used to interpolate GT masks.
There are certainly many strategies one can think of to remove extra class labels and thus solve an image processing problem of this kind. For example, one might assume that extra class labels could simply be removed using a function based on Equation (1) or one closely related to it.
$$T(x) = \begin{cases} 0, & x < 64 \\ 255, & 192 < x \\ 128, & \text{otherwise} \end{cases} \qquad (1)$$
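A direct element-wise MATLAB implementation of Equation (1) might look as follows (a sketch; the function and variable names are illustrative).

```matlab
% Sketch: element-wise implementation of the thresholding rule in Equation (1).
function y = eq1Threshold(x)
    x = double(x);
    y = 128 * ones(size(x));   % default: 128-pixel class label
    y(x < 64)  = 0;            % low intensities -> 0-pixel class label
    y(x > 192) = 255;          % high intensities -> 255-pixel class label
    y = uint8(y);
end
```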
Figure 4 shows the outcome of applying the Equation (1)-based function to remove extra class labels in the interpolated GT segmentation mask. As can be seen, the Equation (1) idea did not work as one would expect, because around the edges between the classes represented by the 0- and 255-pixel labels there were still pixel labels that looked like 128-pixel labels, which should not be the case. Another strategy, which is routinely used, is to apply extra-pixel-category-based algorithms to the training images and the nearest neighbor interpolation algorithm to the training masks. To the best of the author's knowledge, that is not a preferable strategy due to the risk of misalignment of both endocardium and epicardium outlines in nearest neighbor-interpolated GT masks, which is likely to worsen annotation errors and thus negatively affect the accuracy of segmentation with deep learning methods.
Therefore, the author developed a better (and dedicated) strategy focused on removing extra class labels in interpolated GT masks. The developed strategy is based on three important operations: (1) thresholding, (2) median filtering, and (3) subtraction. In this way, extra class labels are removed in five steps (designated by the letter S), as shown in Figure 5.
Step 1: Initially, a GT segmentation mask is resized to the size of interest using BIC or any other extra pixel approach-based image interpolation algorithm. Here, the resulting mask is referred to as S1 and is shown in Figure 6a. Note that S1 is the mask to filter, or in which extra class labels must be removed.
Step 2: Extra class labels of S1 falling outside the desired class label range are removed via thresholding. The resulting mask is referred to as S2, as shown in Figure 6b. However, a few extra pixel labels remain scattered over the S2 surface (see Figure 6b), even after applying the median filter.
Step 3: Unwanted class labels of S2 (e.g., 128) are removed and the result is referred to as S3, as shown in Figure 7a.
Step 4: Again, other unwanted class labels of S2 (e.g., 255) are removed and the result is referred to as S4, as shown in Figure 7b. Here, it is important to note that after removing class labels (255), there were class labels (128) on (or in the neighborhood of) the epicardium outline which had to be removed using the median filter.
Step 5: Here, S4 is subtracted from S3 only when any class label of S3 is equal to 0 (this is to avoid adding one to zero pixels). When none of the class labels of S3 are equal to 0, S4 is subtracted from S3 and one is added to the difference (because in that case the difference is equal to 127). Figure 8b shows the output mask, referred to as S5, and Figure 8a shows the input mask whose interpolated version was shown as S1. Note that all five steps are executed in one single operation.

3. Results

The description of the dataset, metrics, methods, U-net hyperparameter settings, and graphic card information is provided in this part. However, details on experimental results are provided in combination with discussions, in the discussion part.

3.1. Image Datasets

The reference dataset included a total of 3587 LGE-MRI images and GT segmentation masks of size 128 × 128. GT segmentation masks were converted from semi-automatically annotated LGE-MRI images using the Segment software, version 3.1 R8225 (http://segment.heiberg.se, accessed on 14 October 2022) [49]. Each GT segmentation mask consisted of three classes, with class IDs corresponding to the 255-, 128-, and 0-pixel labels. As in [26,32], the main dataset was split into three datasets, namely the training set (60% of the main dataset), the validation set (20% of the main dataset), and the test set (20% of the main dataset). Note that information or details related to clinical trial registration can be found in [33] and are therefore not included in this section.
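A hedged sketch of such a 60/20/20 split is shown below, assuming the images and masks are stored as paired image and pixel label datastores; the folder names, class names, and fixed random seed are assumptions, not details taken from the study.

```matlab
% Sketch: random 60/20/20 split of paired LGE-MRI images and GT masks.
% Folder names, class names and the fixed seed are assumptions.
classNames = ["class0", "class128", "class255"];
labelIDs   = [0 128 255];
imds = imageDatastore(fullfile('dataset', 'images'));
pxds = pixelLabelDatastore(fullfile('dataset', 'masks'), classNames, labelIDs);
rng(0);                                        % reproducible shuffling
n   = numel(imds.Files);                       % 3587 image/mask pairs
idx = randperm(n);
nTrain = round(0.6 * n);  nVal = round(0.2 * n);
trainIdx = idx(1:nTrain);
valIdx   = idx(nTrain+1:nTrain+nVal);
testIdx  = idx(nTrain+nVal+1:end);
imdsTrain = imageDatastore(imds.Files(trainIdx));
pxdsTrain = pixelLabelDatastore(pxds.Files(trainIdx), classNames, labelIDs);
% ...build the validation and test datastores from valIdx and testIdx likewise
```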

3.2. Metrics and Methods

To evaluate the quality of the masks from semantic segmentation using U-net against the GT segmentation, class metrics, namely classification accuracy, intersection over union (IoU), and mean boundary F-1 (BF) score, were used to (1) estimate the percentage of correctly identified pixels for each class, (2) achieve a statistical accuracy measurement that penalizes false positives, and (3) see how well the predicted boundary of each class aligns with the true boundary, or simply use a metric that tends to correlate with human qualitative assessment, respectively [50,51]. In addition, Sørensen–Dice similarity coefficients were used to evaluate the quality of U-nets’ segmented output masks against GT segmentation output masks. To evaluate the relationship between semi-automatic or medical experts-based and fully automated quantification of MI results, the values or sizes of the infarct scar volume and percentage as well as the microvascular obstruction percentage were obtained by applying the EWA algorithm on automatically segmented masks [26,27,28,29,30,31,32,33]. It is also important to mention that the simulation software was MATLAB R2020b, while the Segment CMR software worked well with MATLAB R2019b.
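As an illustration of how such metrics can be computed in MATLAB, the sketch below assumes a trained network net and the test datastores from the split above; the variable names are placeholders.

```matlab
% Sketch: class metrics and per-image similarity measures for the test set.
% 'net', 'imdsTest' and 'pxdsTest' are assumed from earlier steps.
pxdsResults = semanticseg(imdsTest, net, 'WriteLocation', tempdir);
metrics = evaluateSemanticSegmentation(pxdsResults, pxdsTest);
disp(metrics.ClassMetrics)            % Accuracy, IoU and MeanBFScore per class
% Per-image comparison of one predicted mask against its GT mask:
pred = readimage(pxdsResults, 1);     % categorical predicted mask
gt   = readimage(pxdsTest, 1);        % categorical GT mask
diceIdx = dice(pred, gt);             % Sørensen–Dice index per class
iouIdx  = jaccard(pred, gt);          % IoU per class
bfIdx   = bfscore(pred, gt);          % boundary F-1 score per class
```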

3.3. U-Net Settings and Graphic Cards

The training hyperparameters were manually adjusted based on observation of the training graph, with the possibility of new adjustments when 10% of all epochs were reached before the training accuracy reached 90% [26]. Here, U-net’s manually adjusted training hyperparameters included the number of epochs = 180, mini-batch size = 16, initial learning rate = 0.0001, and L2 regularization = 0.000005 (following the recommendations provided in [26]). Adam was the optimizer. The loss function used in this case was the default cross-entropy function provided by the unetLayers function; further information on this function can be found in reference [37]. The execution environment was multi-GPU with both Nvidia Titan RTX and Nvidia GeForce RTX 3090 graphic cards. The data augmentation options used to increase the number of images in the dataset used to train the U-net were a random reflection in the left-right direction as well as vertical and horizontal translations on the interval ranging from −10 to 10 pixels.
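Under the assumption that the standard MATLAB training pipeline was used, the stated settings translate roughly into the sketch below; options not mentioned in the text are left at their defaults and are not claims about the study's exact configuration.

```matlab
% Sketch: the stated hyperparameters and augmentation options in MATLAB form.
% Unlisted settings are left at their defaults (assumption, not the study's code).
augmenter = imageDataAugmenter( ...
    'RandXReflection',  true, ...        % random left-right reflection
    'RandXTranslation', [-10 10], ...    % horizontal translation range (pixels)
    'RandYTranslation', [-10 10]);       % vertical translation range (pixels)
dsTrain = pixelLabelImageDatastore(imdsTrain, pxdsTrain, ...
    'DataAugmentation', augmenter);
options = trainingOptions('adam', ...
    'MaxEpochs',            180, ...
    'MiniBatchSize',        16, ...
    'InitialLearnRate',     1e-4, ...
    'L2Regularization',     5e-6, ...
    'ExecutionEnvironment', 'multi-gpu', ...
    'Plots',                'training-progress');
net = trainNetwork(dsTrain, lgraph, options);   % lgraph from the unetLayers sketch
```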

4. Discussion

4.1. Evaluation of the Effects of Image Size on the Quality of Automatic Segmentation with U-Net against the GT Segmentation

In the effort to evaluate the effects of image size on the quality of deep learning-based segmentation (or deep learning performance on segmentation) when the image size is changed from 128 × 128 to 256 × 256, three classes or regions of segmented masks were evaluated using accuracy, IoU, and mean BF score. Before going into the evaluation of each region, it is important to note that C128 represents the U-net trained on LGE-MRI images of size 128 × 128. N256F, B256F, and L256F represent the U-nets trained on LGE-MRI images of size 256 × 256 obtained after performing interpolation operations using the NN, BIC, and LCZ methods and filtering the corresponding GT segmentation masks using the strategy introduced in Part II. N256U, B256U, and L256U also represent the U-nets trained on LGE-MRI images of size 256 × 256 obtained after performing interpolation operations using the NN, BIC, and LCZ methods, but without removing extra class labels.

4.1.1. Region 1

Region 1 represents the class of the GT segmentation mask corresponding to the 255-pixel label. Class metrics-based results from automated segmentation with U-net of this region are shown in Figure 9. As can be seen in Figure 9, N256F and N256U produced the same results in terms of accuracy, IoU, and mean BF score, thus confirming that there is no need for filtering the NN-interpolated GT segmentation masks. Additionally, as can be seen, the C128-based network led to the poorest performance among the networks compared, in terms of accuracy, IoU, and mean BF score.

4.1.2. Region 2

Region 2 represents the class of the GT segmentation mask corresponding to the 128-pixel label. Class metrics-based results from automated segmentation with U-net of this region are shown in Figure 10.
Here, again, N256F and N256U produced the same results in terms of accuracy, IoU, and mean BF score, thus confirming again that there is no need for filtering the NN-interpolated GT segmentation masks. Here, C128 did not always achieve the poorest performance among all other networks mentioned. For example, in terms of mean BF score, C128 outperformed B256U and L256U. In terms of accuracy, C128 outperformed N256F/U. The C128-based network achieved the poorest performance only in terms of IoU.

4.1.3. Region 3

Region 3 represents the class of the GT segmentation mask corresponding to the 0-pixel label. Class metrics-based results from the automated segmentation with U-net of this region are shown in Figure 11. As can be seen, for the third time, N256F and N256U produced the same results in terms of accuracy, IoU, and mean BF score, thus confirming that there is no need for filtering the NN-interpolated GT segmentation masks. Again, C128 did not always achieve the poorest performance among all other networks mentioned. For example, in terms of mean BF score, C128 outperformed L256U. In terms of accuracy, C128 outperformed N256F/U, B256F, L256F, and L256U. However, the C128-based network achieved the poorest performance in terms of IoU.

4.1.4. Comparison of Final Validation and Global Accuracies of Trained U-Nets

Table 1 shows the final validation and global accuracies achieved by each U-net mentioned. Additionally, Table 1 shows that the validation and global accuracies achieved are generally in the same range; thus, there are no overfitting effects to worry about. Note that previous experiments involving U-net-based segmentation demonstrated that filtering NN-interpolated masks was not fruitful (see Figure 9, Figure 10 and Figure 11 as well as the relevant discussions). In this regard, there is no longer N256F or N256U but only N256, as shown in Table 1. Additionally, Table 1 shows that the training time of C128 is approximately half of the training time taken by the other U-nets.

4.1.5. Performance Evaluation of the U-Net against Segnet

Segnet is another type of CNN designed for semantic image segmentation [52,53]. To the best of the author’s knowledge, U-net and Segnet are the two networks that directly accept training sets of 2D grayscale images and whose source code or functions are easily found for comparison purposes. In this section, the performance of Segnet is evaluated against the performance of U-net, and decisive performance results (in terms of accuracy, IoU, and mean BF score) are provided in Figure 12, Figure 13 and Figure 14 and Table 2. Note that on the y-axes of these three figures, the values from 0 to 3 or 3.5 are simply graphical scale values, automatically selected by MS Excel, and only represent how the real values differ from each other.
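For reference, MATLAB also provides a ready-made SegNet constructor, which is presumably how such a comparison network can be built; the encoder depth shown is an assumption.

```matlab
% Sketch: a SegNet counterpart to the U-Net above, created with MATLAB's
% segnetLayers; the encoder depth of 4 is an assumption.
lgraphSegnet = segnetLayers([256 256 1], 3, 4);   % imageSize, numClasses, encoderDepth
analyzeNetwork(lgraphSegnet);
```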

4.1.6. Evaluation of Automated Segmentation with U-Net and GT Segmentation Using LGE MRI Test Images

From left to right, Figure 15, Figure 16, Figure 17, Figure 18, Figure 19 and Figure 20 show different columns of LGE-MRI test images and masks. In each figure, the first column shows LGE-MRI test images. The second column shows GT segmentation masks. The third column shows segmented output masks using U-nets. The fourth column shows differences between GT segmentation masks and segmented output masks using U-nets. Such differences are highlighted by colors; the greenish and purplish regions highlight areas where the segmentation results differ from the GT segmentation mask. Additionally, note that dice indices are also provided in the caption of each figure in support of the qualitative evaluation. Comparing the dice indices in the caption of Figure 15 to those in the caption of Figure 16, it can be seen that the C128-based network was outperformed only three times by the N256-based network. Next, C128 was outperformed three times by B256F (see Figure 17's caption) and four times by L256F (see Figure 18's caption). However, C128 was outperformed zero times by both B256U and L256U (see Figure 19 and Figure 20's captions); therefore, U-nets based on unfiltered masks were excluded from further discussions. Only U-nets based on filtered masks (previously labeled B256F and L256F) were kept and included in further discussions, as B256 and L256, respectively.

4.2. Evaluation of the Effects of Image Size on the Relationship between Fully Automated Quantification and Semi-Automatic Quantification of the MI Results

The arbitrary threshold, comparison of the sums, and sums of differences between medical experts or semi-automatic and fully automated quantification of MI results are three methods used to estimate the relationship, in terms of percentages, between semi-automatic and fully automated quantification of MI results. Here, it is important to note that the 100% percentage is the target percentage reflecting the semi-automatic or manual or medical expert-based results. Additionally, it is important to note that the MI quantification operation starts with an input image (resized to the size of interest) fed through the U-net, which creates the corresponding output segmentation mask that is later analyzed by the EWA algorithm to produce MI quantification results.

4.2.1. Arbitrary Threshold

This method or strategy uses an arbitrary threshold to separate the fully automated quantification of MI results that are, to some extent, closer to manual or semi-automatic quantification results. With this option, the threshold values, arbitrarily chosen, are 25, 15, and 0.35 for scar (mL), scar (%), and mo (%), respectively. These values reflect the author's opinion on the strength or closeness of the relationship between semi-automatic and fully automated quantification of MI results. Here, it is important to note that other observers could have different opinions.
With this option, when the fully automated quantification results are less than 25, 15, and 0.35 for scar (mL), scar (%), and mo (%), respectively, the automated quantification results are close to some extent to manual or semi-automatic quantification results; thus, there exists a strong or close relationship between semi-automatic and fully automated quantification results. Table 3 shows the percentages, achieved using option 1, that help to estimate the relationship between semi-automatic or medical experts-based quantification (100%) and fully automated quantification (x%) results. In this context, the effects of image size on deep learning can be understood via how close the achieved percentages are to 100% in the cases of LGE-MRI images of sizes 128 × 128 and 256 × 256, respectively.
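Under one possible reading of this criterion, the reported percentage is the fraction of the 24 stacks whose fully automated results fall under the chosen thresholds; a sketch of that computation, with hypothetical result vectors, is given below.

```matlab
% Sketch: fraction of fully automated results falling under the arbitrary
% thresholds (25 mL, 15%, 0.35%); scarML, scarPct and moPct are hypothetical
% per-stack result vectors produced by the EWA algorithm for one U-net.
thrScarML = 25;  thrScarPct = 15;  thrMoPct = 0.35;
pctScarML  = 100 * mean(scarML  < thrScarML);
pctScarPct = 100 * mean(scarPct < thrScarPct);
pctMoPct   = 100 * mean(moPct   < thrMoPct);
fprintf('scar (mL): %.1f%%, scar (%%): %.1f%%, mo (%%): %.1f%%\n', ...
        pctScarML, pctScarPct, pctMoPct);
```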

4.2.2. Comparison of the Sums

This method compares the sums of manual or semi-automatic and automated results by calculating the percentage of the sum of scar (mL), scar (%) and mo (%) of manual results versus the percentage of the sum of scar (mL), scar (%) and mo (%) of fully automatic quantification results. Table 4 shows the percentages achieved, using option 2, that help to estimate to some extent the relationship between semi-automatic quantification (100%) and fully automated quantification (x%) results. Again, in this context, the effects of image size on deep learning can be understood via observing how close achieved percentages are to 100% in the cases of LGE-MRI images of sizes 128 × 128 and 256 × 256, respectively.

4.2.3. Sums of Differences

This method compares the sums of differences between semi-automatic and fully automated quantification of MI results by calculating the percentage of the sum of differences of scar (mL), scar (%), and mo (%) of manual or semi-automatic results versus that of fully automated results. Table 5 shows the percentages achieved, using option 3, that help to estimate to some extent the relationship between medical experts-based or semi-automatic quantification of MI (100%) and fully automated quantification of MI (x%) results. As in the previous two options, the effects of image size on deep learning are also demonstrated by such percentages and can be understood via observing how close the achieved percentages are to 100% in the cases of LGE-MRI images of sizes 128 × 128 and 256 × 256, respectively.
To better interpret the results presented in Table 3, Table 4 and Table 5, it is important to bring attention to the following: in each of the three tables, each U-net has a maximum of three chances to outperform the rest, in terms of scar (mL), scar (%), and mo (%). Across the three tables, the total number of chances increases to nine per U-net. As can be seen by identifying the highest percentage in each row of Table 3, Table 4 and Table 5, C128, N256, and B256 each achieved the highest percentage two times out of the nine expected, which is equivalent to 22.2%. However, L256 achieved the highest percentage five times out of the nine expected, which is equivalent to 55.5%. With this in mind, quantification results based on the larger LGE-MRI image dataset were 55.5% closer to manual or semi-automatic results, while those based on the smaller dataset were 22.2% closer to manual results. It is important to note that the Segment CMR software's EWA algorithm is responsible for generating the scar (mL), scar (%), and mo (%) values (including possible quantification errors) once the plugin linked to the trained U-net is run. Therefore, possible annotation and EWA algorithm errors may significantly affect the results in this context, meaning that future works must pay attention to these possible sources of fully automated quantification errors.

4.2.4. Comparison of the Results from Semi-Automatic and Fully Automated Quantification of MI

As can be seen in Figure 21, Figure 22 and Figure 23, twenty-four stacks of LGE-MRI images, referred to as CHIL-2-6-xxxxx, were used during the experiments. Additionally, these figures graphically show the variation of results from two main quantification approaches, namely semi-automatic (manual) and fully automated (C128, N256, B256, L256).

5. Conclusions

The effects of the size of LGE-MRI images in training datasets were investigated, presented, and discussed. Specifically, such effects were presented in terms of the quality of automatic segmentation with U-net against the GT segmentation and the relationship between fully automated and semi-automatic quantification of MI results. After conducting experiments, a close relationship between semi-automatic and fully automated quantification of MI results was detected more in the case involving the dataset of bigger LGE-MRI images than in the case of the dataset of smaller LGE-MRI images. This happened because the outputs of the U-net trained on LGE-MRI images of size 256 × 256 were much closer to the target vectors than those of the U-net trained on LGE-MRI images of size 128 × 128. In other words, the cross-entropy loss of the U-net trained on the training set of LGE-MRI images of size 256 × 256 was lower than that of the U-net trained on the training set of LGE-MRI images of size 128 × 128, and it is well known that the lower the loss, the more accurate the model (i.e., the U-net, in this case). However, U-nets trained on the training set of LGE-MRI images of size 256 × 256 took more training time than the U-net trained on the training set of LGE-MRI images of size 128 × 128.
It is important to note that the study's main objective was to determine the best size for LGE-MRI images in the training dataset that could contribute to the improvement of LGE-MRI image segmentation accuracy. Additionally, seeking to determine the best size and improve the segmentation accuracy required the use of extra-pixel category-based image interpolation algorithms instead of the traditional nearest neighbor algorithm of the non-extra pixel category. Given that extra pixel category interpolation algorithms produced extra class labels in the GT masks, this problem required the development of a novel strategy to remove extra class labels in interpolated GT segmentation masks. Finally, experimental results were provided to show how the change in LGE-MRI image size improved or worsened the predictive capability or performance of a U-net via segmentation and subsequent MI quantification operations. Note that prior experiments conducted by the author demonstrated that interpolating training images using an extra-pixel category-based interpolation algorithm while interpolating masks using the nearest neighbor interpolation algorithm did not produce results superior to those of the experiments shown in this paper, where the same interpolation algorithm was used for both images and masks. Note also that this study introduced a new method for interpolation mask handling or processing. Further research is needed to address potential errors in training dataset annotations and to investigate errors in the EWA algorithm.

Funding

This research work was supported by Lund University between July and December 2020.

Data Availability Statement

Data supporting the conclusions of this paper are not made public but are available on request and approval.

Acknowledgments

The author would like to thank Lund University and Medviso AB for the materials. Additionally, the author would like to thank reviewers and editors for their helpful comments.

Conflicts of Interest

The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1798–1828. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  2. Kim, K.G. Book Review: Deep learning. Healthc. Inform. Res. 2016, 22, 351–354. [Google Scholar] [CrossRef] [Green Version]
  3. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
  4. Schmidhuber, J. Deep Learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [Google Scholar] [CrossRef] [Green Version]
  5. Morra, L.; Delsanto, S.; Correale, L. Artificial Intelligence in Medical Imaging: From Theory to Clinical Practice; CRC Press: Boca Raton, FL, USA, 2019. [Google Scholar]
  6. Valueva, M.V.; Nagornov, N.N.; Lyakhov, P.A.; Valuev, G.V.; Chervyakov, N.I. Application of the residue number system to reduce hardware costs of the convolutional neural network implementation. Math. Comput. Simul. 2020, 177, 232–243. [Google Scholar] [CrossRef]
  7. Geert, L.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88. [Google Scholar]
  8. Ciresan, D.; Meier, U.; Schmidhuber, J. Multi-column deep neural networks for image classification. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 3642–3649. [Google Scholar]
  9. Krizhevsky, A.; Sutskever, I.; Hinton, G. ImageNet classification with deep convolutional neural networks. In Proceedings of the 26th Conference on Neural Information Processing Systems (NeurIPS), Lake Tahoe, NV, USA, 3–8 December 2012. [Google Scholar]
  10. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Navab, N., Hornegger, J., Wells, W., Frangi, A., Eds.; Springer: Cham, Switzerland, 2015; pp. 234–241. [Google Scholar]
  11. Varfolomeev, I.; Yakimchuk, I.; Safonov, I. An application of deep neural networks for segmentation of microtomographic images of rock samples. Computers 2019, 8, 72. [Google Scholar] [CrossRef] [Green Version]
  12. Fu, K.S.; Mui, J.K. A survey on image segmentation. Pattern Recognit. 1981, 13, 3–16. [Google Scholar] [CrossRef]
  13. Haralick, R.M.; Shapiro, L.G. Image segmentation technique. Comput. Vis. Graph. Image Process. 1985, 29, 100–132. [Google Scholar] [CrossRef]
  14. Pal, N.R.; Pal, S.K. A review on image segmentation techniques. Pattern Recognit. 1993, 26, 1277–1294. [Google Scholar] [CrossRef]
  15. Pham, D.L.; Xu, C.Y.; Prince, J.L. Current methods in medical image segmentation. Annu. Rev. Biomed. Eng. 2000, 2, 315–337. [Google Scholar] [CrossRef] [PubMed]
  16. Bankman, I.N. Handbook of Medical Image Processing and Analysis, 2nd ed.; Academic Press: San Diego, CA, USA, 2008; 984p, ISBN 0123739047. [Google Scholar]
  17. Bezdek, J.C.; Hall, L.O.; Clarke, L.P. Review of MR image segmentation techniques using pattern recognition. Med. Phys. 1993, 20, 1033–1048. [Google Scholar] [CrossRef] [PubMed]
  18. Clarke, L.P.; Velthuizen, R.P.; Camacho, M.A.; Heine, J.J.; Vaidyanathan, M.; Hall, L.O.; Thatcher, R.W.; Silbiger, M.L. MRI Segmentation: Methods and applications. Magn. Reson. Imaging 1995, 13, 343–368. [Google Scholar] [CrossRef] [PubMed]
  19. Yi, J.R.; Wu, P.X.; Jiang, M.L.; Huang, Q.Y.; Hoeppner, D.J.; Metaxas, D.N. Attentive neural cell instance segmentation. Med. Image Anal. 2019, 55, 228–240. [Google Scholar] [CrossRef] [PubMed]
  20. Cicek, O.; Abdulkadir, A.; Lienkamp, S.S.; Brox, T.; Ronneberger, O. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2016, Proceedings of the 19th International Conference, Athens, Greece, 17–21 October 2016; Springer: Cham, Switzerland, 2016; pp. 424–432. [Google Scholar]
  21. Lei, Y.; Liu, Y.Z.; Dong, X.; Tian, S.; Wang, T.H.; Jiang, X.J.; Higgins, K.; Beitler, J.J.; Yu, D.S.; Curran, W.J.; et al. Automatic multi-organ segmentation in thorax CT images using U-Net-GAN. Med. Phys. 2019, 46, 2157–2168. [Google Scholar]
  22. Zhou, Z.W.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J.M. UNet++: A nested U-Net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Proceedings of the 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, 20 September 2018; Springer: Cham, Switzerland, 2018; pp. 3–11. [Google Scholar]
  23. Dong, H.; Yang, G.; Liu, F.; Mo, Y.; Guo, Y. Automatic Brain Tumor Detection and Segmentation Using U-Net Based Fully Convolutional Networks. In Medical Image Understanding and Analysis, Proceedings of the 21st Annual Conference, MIUA 2017, Edinburgh, UK, 11–13 July 2017; Springer: Cham, Switzerland, 2017; pp. 506–517. [Google Scholar]
  24. Kohl, S.A.A.; Romera-Paredes, B.; Meyer, C.; De Fauw, J.; Ledsam, J.R.; Maier-Hein, K.H.; Ali Eslami, S.M.; Rezende, D.J.; Ronneberger, O. A probabilistic U-net for segmentation of ambiguous images. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS), Montréal, QC, Canada, 2–8 December 2018. [Google Scholar]
  25. Ibtehaz, N.; Rahman, M.S. MultiResUNet: Rethinking the U-Net architecture for multimodal biomedical image segmentation. Neural Netw. 2020, 121, 74–87. [Google Scholar] [CrossRef]
  26. Rukundo, O. Effect of the regularization hyperparameter on deep learning-based segmentation in LGE-MRI. In Proceedings of the SPIE/COS Photonics Asia, Nantong, China, 10–20 October 2021; Volume 11897. [Google Scholar]
  27. Kadry, S.; Damaševičius, R.; Taniar, D.; Rajinikanth, V.; Lawal, I.A. U-Net Supported segmentation of ischemic-stroke-lesion from brain MRI slices. In Proceedings of the 7th International conference on Bio Signals, Images, and Instrumentation (ICBSII), Chennai, India, 25–27 March 2021; pp. 1–5. [Google Scholar]
  28. Maqsood, S.; Damasevicius, R.; Shah, F.M. An efficient approach for the detection of brain tumor using fuzzy logic and U-NET CNN classification. In Computational Science and Its Applications—ICCSA 2021, Proceedings of the 21st International Conference, Cagliari, Italy, 13–16 September 2021; Springer: Cham, Switzerland, 2021; pp. 105–118. [Google Scholar]
  29. Almajalid, R.; Zhang, M.; Shan, J. Fully Automatic Knee Bone Detection and Segmentation on Three-Dimensional MRI. Diagnostics 2022, 12, 123. [Google Scholar] [CrossRef]
  30. Shaaf, Z.F.; Jamil, M.M.A.; Ambar, R.; Alattab, A.A.; Yahya, A.A.; Asiri, Y. Automatic Left Ventricle Segmentation from Short-Axis Cardiac MRI Images Based on Fully Convolutional Neural Network. Diagnostics 2022, 12, 414. [Google Scholar] [CrossRef]
  31. Daudé, P.; Ancel, P.; Confort Gouny, S.; Jacquier, A.; Kober, F.; Dutour, A.; Bernard, M.; Gaborit, B.; Rapacchi, S. Deep-Learning Segmentation of Epicardial Adipose Tissue Using Four-Chamber Cardiac Magnetic Resonance Imaging. Diagnostics 2022, 12, 126. [Google Scholar] [CrossRef]
  32. Rukundo, O. Evaluation of deep learning-based myocardial infarction quantification using segment CMR software. In Proceedings of the SPIE/COS Photonics Asia, Nantong, China, 10–20 October 2021; Volume 11897. [Google Scholar]
  33. Engblom, H.; Tufvesson, J.; Jablonowski, R.; Carlsson, M.; Aletras, A.H.; Hoffmann, P.; Jacquier, A.; Kober, F.; Metzler, B.; Erlinge, D.; et al. A new automatic algorithm for quantification of myocardial infarction imaged by late gadolinium enhancement cardiovascular magnetic resonance: Experimental validation and comparison to expert delineations in multi-center, multi-vendor patient data. J. Cardiovasc. Magn. Reson. 2016, 18, 27. [Google Scholar] [CrossRef] [Green Version]
  34. Rukundo, O. Normalized weighting schemes for image interpolation algorithms. Appl. Sci. 2023, 13, 1741. [Google Scholar] [CrossRef]
  35. Rukundo, O. Non-extra pixel interpolation. Int. J. Image Graph. 2020, 20, 2050031. [Google Scholar] [CrossRef]
  36. 2-D Convolutional Layer, Mathworks. Available online: https://se.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.convolution2dlayer.html (accessed on 2 February 2023).
  37. Specify Layers of Convolutional Neural Network, Mathworks. Available online: https://se.mathworks.com/help/deeplearning/ug/layers-of-a-convolutional-neural-network.html (accessed on 2 February 2023).
  38. Rukundo, O. Effects of improved-floor function on the accuracy of bilinear interpolation algorithm. Comput. Inf. Sci. 2015, 8, 1–25. [Google Scholar] [CrossRef] [Green Version]
  39. Rukundo, O. Effects of empty bins on image upscaling in capsule endoscopy. In Proceedings of the 9th International Conference on Digital Image Processing (ICDIP 2017), Hong Kong, China, 19–22 May 2017; Volume 10420. [Google Scholar]
  40. Rukundo, O. Half-unit weighted bilinear algorithm for image contrast enhancement in capsule endoscopy. In Proceedings of the 9th International Conference on Graphic and Image Processing (ICGIP 2017), Qingdao, China, 14–16 October 2017; Volume 10615. [Google Scholar]
  41. Rukundo, O.; Schmidt, S. Extrapolation for image interpolation. In Proceedings of the SPIE/COS Photonics Asia, Beijing, China, 11–13 October 2018; Volume 10817. [Google Scholar]
  42. Rukundo, O.; Schmidt, S. Effects of Rescaling bilinear interpolant on image interpolation quality. In Proceedings of the SPIE/COS Photonics Asia, Beijing, China, 11–13 October 2018; Volume 10817. [Google Scholar]
  43. Rukundo, O.; Cao, H.Q. Nearest neighbor value interpolation. Int. J. Adv. Comput. Sci. Appl. 2012, 3, 25–30. [Google Scholar]
  44. Rukundo, O.; Schmidt, S.E.; Von Ramm, O.T. Software implementation of optimized bicubic interpolated scan conversion in echocardiography. arXiv 2020, arXiv:2005.11269. [Google Scholar]
  45. Rukundo, O. Evaluation of rounding functions in nearest neighbor interpolation. Int. J. Comput. Methods 2021, 18, 2150024. [Google Scholar] [CrossRef]
  46. Thiago, M.; Paulo, A.; Da Silva, J.V.; Pedrini, H. Medical image interpolation based on 3D Lanczos filtering. Comput. Methods Biomech. Biomed. Eng. Imaging Vis. 2020, 8, 294–300. [Google Scholar]
  47. Lanczos Interpolation, Supercomputing Center of USTC. Available online: https://scc.ustc.edu.cn (accessed on 27 December 2020). (In Chinese).
  48. Rukundo, O.; Schmidt, S. Stochastic Rounding for Image Interpolation and Scan Conversion. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 13–22. [Google Scholar] [CrossRef]
  49. Heiberg, E.; Sjögren, J.; Ugander, M.; Carlsson, M.; Engblom, H.; Arheden, H. Design and validation of segment freely available software for cardiovascular image analysis. BMC Med. Imaging 2010, 10, 1. [Google Scholar] [CrossRef] [Green Version]
  50. Csurka, G.; Larlus, D.; Perronnin, F. What is a good evaluation measure for semantic segmentation? In Proceedings of the British Machine Vision Conference, Bristol, UK, 9–13 September 2013; pp. 32.1–32.11. [Google Scholar]
  51. EvaluateSemanticSegmentation. Matlab Documentation. Available online: https://se.mathworks.com/help/vision/ref/evaluatesemanticsegmentation.html (accessed on 10 October 2020).
  52. Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495. [Google Scholar] [CrossRef]
  53. Kolhar, S.; Jagtap, J. Convolutional neural network-based encoder-decoder architectures for semantic segmentation of plants. Ecol. Inform. 2021, 64, 101373. [Google Scholar] [CrossRef]
Figure 1. A schematic representation of two examples of the most commonly used networks/architectures—(A) CNN and (B) multi-stream CNN—for automated medical image analysis. Each block contains relevant layer nodes, while relevant layer connections are generalized by a blue arrow symbol.
Figure 2. U-net architecture. Conv means convolution. ReLU is rectified linear unit. DepthConv is depth concatenation. UpConv means up-convolution or transposed convolution. MaxPool is Max Pooling.
Figure 3. Histograms: (Top left): GT segmentation mask of the size 128 × 128. (Top right): NN-based GT segmentation mask of the size 256 × 256. (Bottom left): BIC-based GT segmentation mask of the size 256 × 256. (Bottom right): LCZ-based GT segmentation mask of the size 256 × 256.
Figure 4. Example showing the bicubic (BIC) interpolated GT segmentation mask after removing extra class labels using the Equation (1)-based function.
Figure 5. Five steps to remove extra class labels in BIC interpolated GT segmentation masks.
Figure 6. (a) S1 and (b) S2 output images of size 256 × 256.
Figure 7. (a) S3 and (b) S4 output images of size 256 × 256.
Figure 8. (a) Input mask of size 128 × 128. (b) S5 output mask of size 256 × 256.
Figure 9. Segmentation results: Region 1.
Figure 10. Segmentation results: Region 2.
Figure 11. Segmentation results: Region 3.
Figure 12. U-net vs. Segnet | Segmentation results | Region 1.
Figure 13. U-net vs. Segnet | Segmentation results | Region 2.
Figure 14. U-net vs. Segnet | Segmentation Results | Region 3.
Figure 15. C128 segmented output masks | From top to bottom: dice indices are equal to 0.9953, 0.9945, 0.9873, and 0.9929.
Figure 16. N256 segmented output masks | From top to bottom: dice indices are equal to 0.9961, 0.9963, 0.9909, 0.9925.
Figure 17. B256F segmented output masks | From top to bottom: dice indices are equal to 0.9945, 0.9956, 0.9900, 0.9944.
Figure 18. L256F segmented output masks | From top to bottom: dice indices are equal to 0.9953, 0.9957, 0.9902, 0.9942.
Figure 19. B256U segmented output masks | From top to bottom: dice indices are equal to 0.9718, 0.9554, 0.8868, 0.9130.
Figure 20. L256U segmented output masks | From top to bottom: dice indices equal to 0.9694, 0.9558, 0.8854, 0.9150.
Figure 21. MI quantification results—scar (mL).
Figure 22. MI quantification results—scar (%).
Figure 23. MI quantification results—mo (%).
Table 1. U-net | Validation accuracy, global accuracy and training time.
Network   Validation Accuracy   Global Accuracy   Training Time
C128      0.9908                0.99078           109 min 56 s
N256      0.9919                0.9918            225 min 52 s
B256F     0.9914                0.99126           225 min 31 s
L256F     0.9916                0.9912            226 min 35 s
B256U     0.9977                0.99756           258 min 35 s
L256U     0.9973                0.9972            265 min 30 s
Table 2. U-net vs. Segnet | Validation accuracy, global accuracy, and training time.
Network   Validation Accuracy   Global Accuracy   Training Time
UC128     0.9908                0.99078           109 min 56 s
UN256     0.9919                0.9918            225 min 52 s
UB256     0.9914                0.99126           225 min 31 s
UL256     0.9916                0.9912            226 min 35 s
SC128     0.9709                0.97149           144 min 08 s
SN256     0.9221                0.92137           540 min 15 s
SB256     0.9255                0.92499           548 min 42 s
SL256     0.9244                0.92474           542 min 52 s
Table 3. Percentages achieved using arbitrary threshold.
Network     C128     N256     B256     L256
Scar (mL)   87.5%    91.6%    91.6%    91.6%
Scar (%)    79.1%    95.8%    75%      83.3%
MO (%)      62.5%    62.5%    62.5%    66.6%
Table 4. Percentages achieved using comparison of the sums.
Network     C128     N256     B256     L256
Scar (mL)   58.4%    49.5%    72.2%    72.3%
Scar (%)    74.8%    75.1%    74.7%    75.7%
MO (%)      6.6%     10.7%    11.3%    9.5%
Table 5. Percentages achieved using sums of differences.
            C128     N256     B256     L256
Scar (mL)   78.2%    74.4%    72.4%    74.8%
Scar (%)    79.07%   75.2%    71.9%    73.7%
MO (%)      74.2%    75.2%    75.08%   75.4%
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
