J. Imaging, Volume 10, Issue 6 (June 2024) – 10 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive the table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view a paper in PDF format, click the "PDF Full-text" link and open it with the free Adobe Reader.
11 pages, 587 KiB  
Article
Accuracy of Digital Imaging Software to Predict Soft Tissue Changes during Orthodontic Treatment
by Theerasak Nakornnoi and Pannapat Chanmanee
J. Imaging 2024, 10(6), 134; https://doi.org/10.3390/jimaging10060134 - 31 May 2024
Abstract
This study aimed to evaluate the accuracy of digital imaging software in predicting soft tissue changes following three types of orthodontic intervention: non-extraction, extraction, and orthognathic surgery treatments. Ninety-six patients were randomly selected from the records of three orthodontic interventions (32 subjects per group): (1) non-extraction, (2) extraction, and (3) orthodontic treatment combined with orthognathic surgery. Cephalometric analysis of soft tissue changes in both the actual post-treatment outcome and the predicted treatment outcome was performed using Dolphin Imaging software version 11.9. A paired t-test was used to assess statistically significant differences between the predicted and actual treatment outcomes for each parameter (p < 0.05). In the non-extraction group, prediction errors appeared only in the lower lip parameters. In the extraction group, prediction errors were observed in both the upper and lower lip parameters. In the orthognathic surgery group, prediction errors were identified in chin thickness, the facial contour angle, and the upper and lower lip parameters (p < 0.05). The software exhibited soft tissue prediction errors of 0.3–1.0 mm in some parameters across all treatment groups, which should be taken into account when applying Dolphin Imaging software to orthodontic treatment planning.
(This article belongs to the Section Medical Imaging)
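The core comparison in this study is a per-parameter paired t-test between predicted and actual post-treatment measurements. A minimal sketch of that test using SciPy follows; the arrays are invented placeholders, not the study's data.

```python
# Illustrative paired t-test comparing predicted vs. actual post-treatment
# cephalometric measurements, as described in the abstract. The values
# below are made-up placeholders, not the study's data.
import numpy as np
from scipy import stats

predicted = np.array([1.2, 0.8, 1.5, 0.9, 1.1])  # software-predicted values (mm)
actual = np.array([1.5, 1.0, 1.9, 1.2, 1.6])     # measured post-treatment values (mm)

t_stat, p_value = stats.ttest_rel(predicted, actual)
mean_error = np.mean(predicted - actual)         # signed prediction error (mm)

if p_value < 0.05:
    print(f"Significant prediction error: mean {mean_error:+.2f} mm (p = {p_value:.3f})")
else:
    print(f"No significant difference (p = {p_value:.3f})")
```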
11 pages, 1015 KiB  
Article
Implicit 3D Human Reconstruction Guided by Parametric Models and Normal Maps
by Yong Ren, Mingquan Zhou, Yifan Wang, Long Feng, Qiuquan Zhu, Kang Li and Guohua Geng
J. Imaging 2024, 10(6), 133; https://doi.org/10.3390/jimaging10060133 - 29 May 2024
Abstract
Accurate and robust 3D human modeling from a single image presents significant challenges. Existing methods have shown potential, but they often fail to generate reconstructions that match the level of detail in the input image, and they particularly struggle with loose clothing. Such methods typically employ parameterized human models to constrain the reconstruction process, ensuring the results do not deviate too far from the model and produce anomalies; however, this also limits the recovery of loose clothing. To address this issue, we propose an end-to-end method called IHRPN for reconstructing clothed humans from a single 2D human image. The method includes an image semantic feature extraction module aimed at achieving pixel-model space consistency and enhancing robustness to loose clothing. We extract features from the input image to infer and recover the SMPL-X mesh, and then combine it with a normal map to guide the implicit function in reconstructing the complete clothed human. Unlike traditional methods, we use local features for implicit surface regression. Our experimental results show that IHRPN performs excellently on the CAPE and AGORA datasets, and its reconstruction of loose clothing is noticeably more accurate and robust.
(This article belongs to the Special Issue Self-Supervised Learning for Image Processing and Analysis)
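The abstract's key departure from prior work is regressing the implicit surface from local, pixel-aligned image features. A hedged PyTorch sketch of that query step follows; the orthographic projection, feature shapes, and MLP here are simplifying assumptions, not IHRPN's actual design.

```python
# Minimal sketch of pixel-aligned implicit surface regression with local
# features, in the spirit of the method described above. All shapes and
# the projection model are illustrative assumptions.
import torch
import torch.nn.functional as F

def query_occupancy(feat_map, points, mlp):
    """feat_map: (B, C, H, W) image features; points: (B, N, 3) in [-1, 1]^3."""
    # Orthographic projection: use (x, y) directly as normalized image coords.
    uv = points[..., :2].unsqueeze(2)                        # (B, N, 1, 2)
    local = F.grid_sample(feat_map, uv, align_corners=True)  # (B, C, N, 1)
    local = local.squeeze(-1).transpose(1, 2)                # (B, N, C)
    z = points[..., 2:3]                                     # depth as extra cue
    return mlp(torch.cat([local, z], dim=-1))                # (B, N, 1) occupancy

mlp = torch.nn.Sequential(torch.nn.Linear(65, 128), torch.nn.ReLU(),
                          torch.nn.Linear(128, 1), torch.nn.Sigmoid())
feats = torch.randn(1, 64, 128, 128)       # placeholder image feature map
pts = torch.rand(1, 1000, 3) * 2 - 1       # random query points
occ = query_occupancy(feats, pts, mlp)     # per-point inside/outside probability
```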
34 pages, 1881 KiB  
Article
Hybridizing Deep Neural Networks and Machine Learning Models for Aerial Satellite Forest Image Segmentation
by Clopas Kwenda, Mandlenkosi Gwetu and Jean Vincent Fonou-Dombeu
J. Imaging 2024, 10(6), 132; https://doi.org/10.3390/jimaging10060132 - 29 May 2024
Abstract
Forests play a pivotal role in mitigating climate change as well as contributing to the socio-economic activities of many countries. It is therefore of paramount importance to monitor forest cover. Traditional machine learning classifiers for segmenting images lack the ability to extract features such as the spatial relationships between pixels and texture, resulting in subpar segmentation results when used alone. To address this limitation, this study proposed a novel hybrid approach that combines deep neural networks and machine learning algorithms to segment an aerial satellite image into forest and non-forest regions. Aerial satellite forest image features were first extracted by two deep neural network models, namely VGG16 and ResNet50. The resulting features were subsequently used by five machine learning classifiers, namely Random Forest (RF), Linear Support Vector Machines (LSVM), k-nearest neighbor (kNN), Linear Discriminant Analysis (LDA), and Gaussian Naive Bayes (GNB), to perform the final segmentation. The aerial satellite forest images were obtained from the DeepGlobe challenge dataset. The performance of the proposed model was evaluated using metrics such as Accuracy, the Jaccard score index, and the Root Mean Square Error (RMSE). The experimental results revealed that the RF model achieved the best segmentation results, with an accuracy, Jaccard score, and RMSE of 94%, 0.913, and 0.245, respectively, followed by LSVM with 89%, 0.876, and 0.332. LDA took third position with 88%, 0.834, and 0.351, followed by GNB with 88%, 0.837, and 0.353; kNN occupied the last position with 83%, 0.790, and 0.408. The experimental results also revealed that the proposed model significantly improved the performance of the RF, LSVM, LDA, GNB, and kNN models compared to their performance when used to segment the images alone. Furthermore, the proposed model outperformed models from related studies, attesting to its superior segmentation capability.
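A hedged sketch of the hybrid idea, pairing a pretrained CNN feature extractor with a classical per-pixel classifier; the data, layer choice, and shapes are illustrative assumptions, not the paper's pipeline.

```python
# Hedged sketch: a pretrained CNN extracts per-pixel features, and a
# classical classifier performs the final segmentation. Placeholder data.
import numpy as np
import tensorflow as tf
from sklearn.ensemble import RandomForestClassifier

# VGG16 up to an early block keeps spatial resolution for dense labels.
backbone = tf.keras.applications.VGG16(include_top=False, weights="imagenet")
extractor = tf.keras.Model(backbone.input,
                           backbone.get_layer("block2_conv2").output)

def pixel_features(images):
    """images: (N, H, W, 3) float in [0, 255] -> (N*h*w, C) feature rows."""
    x = tf.keras.applications.vgg16.preprocess_input(images.astype("float32"))
    fmap = extractor.predict(x, verbose=0)          # (N, h, w, C)
    return fmap.reshape(-1, fmap.shape[-1])

images = np.random.rand(2, 128, 128, 3) * 255       # placeholder images
# For 128x128 inputs, block2_conv2 output is 64x64, matching this mask grid.
masks = np.random.randint(0, 2, (2, 64, 64))        # forest / non-forest labels

X = pixel_features(images)
y = masks.reshape(-1)
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
pred = clf.predict(X).reshape(masks.shape)          # segmented maps
```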
28 pages, 12383 KiB  
Article
Greedy Ensemble Hyperspectral Anomaly Detection
by Mazharul Hossain, Mohammed Younis, Aaron Robinson, Lan Wang and Chrysanthe Preza
J. Imaging 2024, 10(6), 131; https://doi.org/10.3390/jimaging10060131 - 28 May 2024
Abstract
Hyperspectral images include information from a wide range of spectral bands deemed valuable for computer vision applications in various domains such as agriculture, surveillance, and reconnaissance. Anomaly detection in hyperspectral images has proven to be a crucial component of change and abnormality identification, enabling improved decision-making across various applications. These abnormalities/anomalies can be detected using background estimation techniques that do not require prior knowledge of outliers. However, each hyperspectral anomaly detection (HS-AD) algorithm models the background differently, and these differing assumptions may fail to capture all the background constraints in various scenarios. We have developed a new approach called Greedy Ensemble Anomaly Detection (GE-AD) to address this shortcoming. It includes a greedy search algorithm that systematically determines suitable base models from HS-AD algorithms and hyperspectral unmixing for the first stage of a stacking ensemble, and it employs a supervised classifier in the second stage. This helps researchers with limited knowledge of which HS-AD algorithms suit their application scenario to select the best methods automatically. Our evaluation shows that the proposed method achieves a higher average F1-macro score, with statistical significance, than the other individual methods used in the ensemble. This is validated on multiple datasets, including the Airport–Beach–Urban (ABU) dataset, the San Diego dataset, the Salinas dataset, the Hydice Urban dataset, and the Arizona dataset. The evaluation using the airport scenes from the ABU dataset shows that GE-AD achieves a 14.97% higher average F1-macro score than our previous method (HUE-AD), at least 17.19% higher than the individual methods used in the ensemble, and at least 28.53% higher than the other state-of-the-art ensemble anomaly detection algorithms. As the combination of a greedy algorithm and a stacking ensemble to automatically select suitable base models and associated weights has not been widely explored in hyperspectral anomaly detection, we believe that our work will expand the knowledge in this research area and contribute to the wider application of this approach.
(This article belongs to the Section Computer Vision and Pattern Recognition)
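A toy sketch of the two-stage idea the abstract describes: greedy forward selection of base detector score maps, scored by a second-stage supervised classifier's F1-macro. The stubbed detectors and in-sample scoring are placeholders; GE-AD's actual base models, weighting, and validation protocol are not reproduced here.

```python
# Greedy stacking sketch: greedily add base detector score maps while the
# stacked classifier's F1-macro improves. Base detectors are random stubs.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_pixels = 5000
y = rng.integers(0, 2, n_pixels)                      # anomaly ground truth
base_scores = {f"detector_{i}": rng.random(n_pixels) for i in range(6)}

def stacked_f1(names):
    X = np.column_stack([base_scores[n] for n in names])
    pred = LogisticRegression().fit(X, y).predict(X)  # second-stage classifier
    # In practice this score would come from a held-out validation split.
    return f1_score(y, pred, average="macro")

selected, best = [], 0.0
candidates = set(base_scores)
while candidates:
    name = max(candidates, key=lambda n: stacked_f1(selected + [n]))
    score = stacked_f1(selected + [name])
    if score <= best:                                 # stop when no improvement
        break
    selected.append(name)
    candidates.remove(name)
    best = score
print("chosen base models:", selected, "F1-macro:", round(best, 3))
```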
16 pages, 1657 KiB  
Article
Modeling of Ethiopian Beef Meat Marbling Score Using Image Processing for Rapid Meat Grading
by Tariku Erena, Abera Belay, Demelash Hailu, Bezuayehu Gutema Asefa, Mulatu Geleta and Tesfaye Deme
J. Imaging 2024, 10(6), 130; https://doi.org/10.3390/jimaging10060130 - 28 May 2024
Abstract
Meat characterized by a high marbling value is typically anticipated to display enhanced sensory attributes. This study aimed to predict the marbling scores of rib-eye steaks sourced from the Longissimus dorsi muscle of different cattle types, namely Boran, Senga, and Sheko, by employing digital image processing and machine-learning algorithms. Marbling was analyzed using digital image processing coupled with an extreme gradient boosting (XGBoost) machine learning algorithm. Meat texture was assessed using a universal texture analyzer. Sensory characteristics of the beef were evaluated through quantitative descriptive analysis with a trained panel of twenty. Using selected image features from digital image processing, the marbling score was predicted with R2 (prediction) = 0.83. Boran cattle had the highest fat content in sirloin and chuck cuts (12.68% and 12.40%, respectively), followed by Senga (11.59% and 11.56%) and Sheko (11.40% and 11.17%). Tenderness scores for sirloin and chuck cuts differed among the three breeds: Boran (7.06 ± 2.75 and 3.81 ± 2.24 N·mm, respectively), Senga (5.54 ± 1.90 and 5.25 ± 2.47 N·mm), and Sheko (5.43 ± 2.76 and 6.33 ± 2.28 N·mm). Sheko and Senga had similar sensory attributes. Marbling scores were higher in Boran (4.28 ± 1.43 and 3.68 ± 1.21) and Senga (2.88 ± 0.69 and 2.83 ± 0.98) than in Sheko (2.73 ± 1.28 and 2.90 ± 1.52). The study marks a milestone in developing a digital tool for predicting the marbling scores of Ethiopian beef breeds, and the relationship between quality attributes and beef marbling score has been verified. After further validation, the output of this research can be utilized by the meat industry and quality control authorities.
(This article belongs to the Section Image and Video Processing)
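A minimal sketch of regressing a marbling score from image-derived features, using scikit-learn's GradientBoostingRegressor as a stand-in for the gradient boosting model named above; the features and data are invented placeholders.

```python
# Illustrative regression of a marbling score from image features with
# gradient boosting. Feature meanings and data are invented placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(1)
X = rng.random((200, 5))        # e.g., fat-area ratio, texture statistics, etc.
y = X @ np.array([3.0, 1.5, 0.5, 0.2, 0.1]) + rng.normal(0, 0.2, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("R2 (prediction):", round(r2_score(y_te, model.predict(X_te)), 3))
```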
25 pages, 3905 KiB  
Article
Point Cloud Quality Assessment Using a One-Dimensional Model Based on the Convolutional Neural Network
by Abdelouahed Laazoufi, Mohammed El Hassouni and Hocine Cherifi
J. Imaging 2024, 10(6), 129; https://doi.org/10.3390/jimaging10060129 - 27 May 2024
Abstract
Recent advancements in 3D modeling have revolutionized various fields, including virtual reality, computer-aided diagnosis, and architectural design, emphasizing the importance of accurate quality assessment for 3D point clouds. As these models undergo operations such as simplification and compression, the distortions introduced can significantly impact their visual quality, and there is a growing need for reliable and efficient objective quality evaluation methods to address this challenge. In this context, this paper introduces a novel methodology to assess the quality of 3D point clouds using a deep learning-based no-reference (NR) method. First, it extracts geometric and perceptual attributes from distorted point clouds and represents them as a set of 1D vectors. Then, transfer learning is applied to obtain high-level features using a 1D convolutional neural network (1D CNN) adapted from 2D CNN models through weight conversion from ImageNet. Finally, quality scores are predicted through regression using fully connected layers. The effectiveness of the proposed approach is evaluated across diverse datasets, including the Colored Point Cloud Quality Assessment Database (SJTU_PCQA), the Waterloo Point Cloud Assessment Database (WPC), and the Colored Point Cloud Quality Assessment Database featured at ICIP2020. The outcomes reveal superior performance compared to several competing methodologies, as evidenced by enhanced correlation with mean opinion scores.
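The abstract's weight-conversion step, adapting pretrained 2D CNN weights to a 1D CNN, can be illustrated as below. The averaging rule and the ResNet18 layer used here are assumptions; the paper's exact conversion is not specified in this listing.

```python
# Hedged sketch of adapting ImageNet-pretrained 2D CNN weights to a 1D CNN:
# a pretrained 7x7 kernel is collapsed to length 7 by averaging over one
# spatial axis. The conversion rule itself is an assumption.
import torch
import torch.nn as nn
import torchvision

conv2d = torchvision.models.resnet18(weights="IMAGENET1K_V1").conv1  # (64, 3, 7, 7)
conv1d = nn.Conv1d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
with torch.no_grad():
    conv1d.weight.copy_(conv2d.weight.mean(dim=3))   # average kernel width -> (64, 3, 7)

attributes = torch.randn(8, 3, 1024)   # batch of 1D geometric/perceptual vectors
features = conv1d(attributes)          # (8, 64, 512) high-level 1D features
```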
18 pages, 13380 KiB  
Article
Integrated Building Modelling Using Geomatics and GPR Techniques for Cultural Heritage Preservation: A Case Study of the Charles V Pavilion in Seville (Spain)
by María Zaragoza, Vicente Bayarri and Francisco García
J. Imaging 2024, 10(6), 128; https://doi.org/10.3390/jimaging10060128 - 27 May 2024
Abstract
This paper highlights the fundamental role of integrating different geomatics and geophysical imaging technologies in understanding and preserving cultural heritage, with a focus on the Pavilion of Charles V in Seville (Spain). Using a terrestrial laser scanner, a global navigation satellite system, and ground-penetrating radar, we constructed a building information modelling (BIM) system from which comprehensive decision-making models for preserving this historical asset can be derived. These models enable the generation of virtual reconstructions, encompassing not only the building but also its subsurface, which can be distributed online as augmented or virtual reality experiences. By leveraging these technologies, the research investigates intricate details of the pavilion, capturing its current structure and revealing insights into past soil compositions and potential subsurface structures. This detailed analysis empowers stakeholders to make informed decisions about conservation and management. Furthermore, transparent data sharing fosters collaboration, advancing collective understanding and practices in heritage preservation.
16 pages, 6238 KiB  
Article
Enabling Low-Dose In Vivo Benchtop X-ray Fluorescence Computed Tomography through Deep-Learning-Based Denoising
by Naghmeh Mahmoodian, Mohammad Rezapourian, Asim Abdulsamad Inamdar, Kunal Kumar, Melanie Fachet and Christoph Hoeschen
J. Imaging 2024, 10(6), 127; https://doi.org/10.3390/jimaging10060127 - 22 May 2024
Abstract
X-ray Fluorescence Computed Tomography (XFCT) is an emerging non-invasive imaging technique providing high-resolution molecular-level data. However, increased sensitivity with current benchtop X-ray sources comes at the cost of high radiation exposure. Artificial Intelligence (AI), particularly deep learning (DL), has revolutionized medical imaging by delivering high-quality images in the presence of noise. In XFCT, traditional methods rely on complex algorithms for background noise reduction, but AI holds promise in addressing high-dose concerns. We present an optimized Swin-Conv-UNet (SCUNet) model for background noise reduction in X-ray fluorescence (XRF) images at low tracer concentrations, and we evaluate its effectiveness against higher-dose images. While various denoising techniques exist for X-ray and computed tomography (CT) imaging, only a few address XFCT. The DL model is trained and assessed using augmented data, focusing on background noise reduction. Image quality is measured using the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM), comparing outcomes with 100% X-ray-dose images. The results demonstrate that the proposed algorithm yields high-quality images from low-dose inputs, with a maximum PSNR of 39.05 and SSIM of 0.86. The model outperforms block-matching and 3D filtering (BM3D), block-matching and 4D filtering (BM4D), non-local means (NLM), the denoising convolutional neural network (DnCNN), and SCUNet in both visual inspection and quantitative analysis, particularly in high-noise scenarios. This indicates the potential of AI, specifically the SCUNet model, to significantly improve XFCT imaging by mitigating the trade-off between sensitivity and radiation exposure.
(This article belongs to the Special Issue Recent Advances in X-ray Imaging)
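The two metrics used above are standard and easy to reproduce; a minimal scikit-image example on placeholder arrays (not XFCT data) follows.

```python
# Minimal example of the two image-quality metrics named in the abstract,
# computed with scikit-image on placeholder arrays.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(2)
reference = rng.random((128, 128))                   # stand-in for 100%-dose image
denoised = np.clip(reference + rng.normal(0, 0.05, reference.shape), 0, 1)

psnr = peak_signal_noise_ratio(reference, denoised, data_range=1.0)
ssim = structural_similarity(reference, denoised, data_range=1.0)
print(f"PSNR = {psnr:.2f} dB, SSIM = {ssim:.3f}")
```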
25 pages, 7584 KiB  
Article
Fine-Grained Food Image Recognition: A Study on Optimising Convolutional Neural Networks for Improved Performance
by Liam Boyd, Nonso Nnamoko and Ricardo Lopes
J. Imaging 2024, 10(6), 126; https://doi.org/10.3390/jimaging10060126 - 22 May 2024
Abstract
Addressing the pressing issue of food waste is vital for environmental sustainability and resource conservation. While computer vision has been widely used in food waste reduction research, existing food image datasets are typically aggregated into broad categories (e.g., fruits, meat, dairy, etc.) rather than the fine-grained singular food items required for this research. The aim of this study is to develop a model capable of identifying individual food items, to be integrated into a mobile application that allows users to photograph their food items, identify them, and receive recipe suggestions. This research bridges the gap in available datasets and contributes a more fine-grained approach to utilising existing technology for food waste reduction, with both environmental and research significance. The study evaluates various (n = 7) convolutional neural network architectures for multi-class food image classification, emphasising the nuanced impact of parameter tuning to identify the most effective configurations. The experiments were conducted on a custom dataset comprising 41,949 food images categorised into 20 food item classes, with performance evaluated on accuracy and loss. The DenseNet architecture emerged as the top performer of the seven examined, establishing a baseline performance (training accuracy = 0.74, training loss = 1.25, validation accuracy = 0.68, and validation loss = 2.89) on a predetermined set of parameters: the RMSProp optimiser, the ReLU activation function, a 0.5 dropout rate, and a 160×160 image size. Subsequent parameter tuning comprehensively explored six optimisers, four image sizes, two dropout rates, and five activation functions. The results show the superior generalisation capabilities of the optimised DenseNet, with improvements over the established baseline across key metrics: the optimised model achieved a training accuracy of 0.99, a training loss of 0.01, a validation accuracy of 0.79, and a validation loss of 0.92. The optimal DenseNet has been integrated into a mobile application called FridgeSnap, designed to recognise food items and suggest possible recipes to users, thus contributing to the broader mission of minimising food waste.
(This article belongs to the Section AI in Imaging)
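A hedged Keras sketch of the reported baseline configuration (DenseNet backbone, RMSProp, ReLU, 0.5 dropout, 160×160 inputs, 20 classes); DenseNet121 and the classifier head are assumptions, as the abstract does not specify the variant, and dataset loading is omitted.

```python
# Sketch of the baseline configuration reported in the abstract. DenseNet121
# is assumed as the variant; only the architecture wiring is shown.
import tensorflow as tf

base = tf.keras.applications.DenseNet121(include_top=False, weights="imagenet",
                                         input_shape=(160, 160, 3))
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),    # ReLU activation
    tf.keras.layers.Dropout(0.5),                     # 0.5 dropout rate
    tf.keras.layers.Dense(20, activation="softmax"),  # 20 food item classes
])
model.compile(optimizer=tf.keras.optimizers.RMSprop(),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```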
15 pages, 1666 KiB  
Article
MResTNet: A Multi-Resolution Transformer Framework with CNN Extensions for Semantic Segmentation
by Nikolaos Detsikas, Nikolaos Mitianoudis and Ioannis Pratikakis
J. Imaging 2024, 10(6), 125; https://doi.org/10.3390/jimaging10060125 - 21 May 2024
Abstract
A fundamental task in computer vision is the differentiation and identification of objects or entities in a visual scene using semantic segmentation methods. Transformer networks have surpassed traditional convolutional neural network (CNN) architectures in segmentation performance. However, the continuous pursuit of optimal performance on the popular evaluation metrics has led to very large architectures that require significant computational power to operate, making them prohibitive for real-time applications, including autonomous driving. In this paper, we propose a model that leverages a visual transformer encoder with a parallel twin decoder, consisting of a visual transformer decoder and a CNN decoder with multi-resolution connections working in parallel. The two decoders are merged with the aid of two trainable CNN blocks: the fuser, which combines the information from the two decoders, and the scaler, which scales the contribution of each decoder. The proposed model achieves state-of-the-art performance on the Cityscapes and ADE20K datasets while maintaining a low-complexity network that can be used in real-time applications.
(This article belongs to the Special Issue Deep Learning in Computer Vision)
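A hedged PyTorch sketch of merging two parallel decoder outputs with trainable "scaler" and "fuser" blocks, following the description above; the specific block designs (a softmax-weighted 1×1 convolution and a 3×3 classification head) are assumptions, not the paper's architecture.

```python
# Sketch of fusing a transformer decoder and a CNN decoder with trainable
# scaler/fuser blocks. Block designs here are illustrative assumptions.
import torch
import torch.nn as nn

class TwinDecoderMerge(nn.Module):
    def __init__(self, channels, num_classes):
        super().__init__()
        # scaler: predicts a per-pixel weight for each decoder's contribution
        self.scaler = nn.Sequential(nn.Conv2d(2 * channels, 2, 1), nn.Softmax(dim=1))
        # fuser: combines the weighted decoder features into class logits
        self.fuser = nn.Conv2d(channels, num_classes, 3, padding=1)

    def forward(self, transformer_out, cnn_out):
        w = self.scaler(torch.cat([transformer_out, cnn_out], dim=1))  # (B, 2, H, W)
        merged = w[:, :1] * transformer_out + w[:, 1:] * cnn_out
        return self.fuser(merged)

merge = TwinDecoderMerge(channels=64, num_classes=19)   # e.g., 19 Cityscapes classes
logits = merge(torch.randn(1, 64, 128, 256), torch.randn(1, 64, 128, 256))
```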