J. Imaging, Volume 10, Issue 3 (March 2024) – 23 articles

Cover Story: Image decolorization is an image pre-processing step which is widely used in image analysis, computer vision, and printing applications. The most commonly used methods give each color channel a constant weight without considering image content. This approach is simple and fast, but it may cause significant information loss when images contain too many isoluminant colors. In this paper, we propose a new method which is not only efficient but can also preserve a higher level of image contrast and detail than the traditional methods. The algorithm works in RGB color space directly without any color conversion. Experimental results show that the proposed algorithm can run as efficiently as the traditional methods and obtain the best overall performance across four different metrics.
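To illustrate the information loss described in the cover story, the following minimal sketch applies a constant-weight conversion to two clearly different colors. It is not the method proposed in the paper. The weights are the common ITU-R BT.601 luma coefficients, and the helper name `to_gray` and the two sample colors are assumptions chosen for illustration.

```python
# Fixed-weight grayscale conversion using ITU-R BT.601 luma weights.
WEIGHTS = (0.299, 0.587, 0.114)


def to_gray(rgb, weights=WEIGHTS):
    """Map one (R, G, B) pixel to a single gray value with constant weights."""
    return sum(w * c for w, c in zip(weights, rgb))


# Two visibly different colors, constructed (hypothetically) to be isoluminant
# under the weights above: a pure red and a pure green of matching luma.
red = (200.0, 0.0, 0.0)
green = (0.0, 0.299 * 200.0 / 0.587, 0.0)

# Both pixels map to (almost exactly) the same gray level, so the edge
# between them would vanish in the decolorized image.
print(to_gray(red), to_gray(green))
```

Because the weights ignore image content, any pair of colors with equal weighted sums becomes indistinguishable, which is the failure case the cover story refers to.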
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms. To view the papers in PDF format, click on the "PDF Full-text" link, and use the free Adobe Reader to open them.
17 pages, 30409 KiB  
Article
Data Fusion of RGB and Depth Data with Image Enhancement
by Lennard Wunsch, Christian Görner Tenorio, Katharina Anding, Andrei Golomoz and Gunther Notni
J. Imaging 2024, 10(3), 73; https://doi.org/10.3390/jimaging10030073 - 21 Mar 2024
Viewed by 856
Abstract
Since 3D sensors became popular, imaged depth data are easier to obtain in the consumer sector. In applications such as defect localization on industrial objects or mass/volume estimation, precise depth data are important and, thus, benefit from the usage of multiple information sources. However, a combination of RGB images and depth images can not only improve our understanding of objects, enabling one to gain more information about objects, but also enhance data quality. Combining different camera systems using data fusion can enable higher quality data since disadvantages can be compensated. Data fusion itself consists of data preparation and data registration. A challenge in data fusion is the different resolutions of sensors. Therefore, up- and downsampling algorithms are needed. This paper compares multiple up- and downsampling methods, such as different direct interpolation methods, joint bilateral upsampling (JBU), and Markov random fields (MRFs), in terms of their potential to create RGB-D images and improve the quality of depth information. In contrast to the literature in which imaging systems are adjusted to acquire the data of the same section simultaneously, the laboratory setup in this study was based on conveyor-based optical sorting processes, and therefore, the data were acquired at different time periods and different spatial locations. Data assignment and data cropping were necessary. In order to evaluate the results, root mean square error (RMSE), signal-to-noise ratio (SNR), correlation (CORR), universal quality index (UQI), and the contour offset are monitored. JBU outperformed the other upsampling methods, achieving a mean RMSE = 25.22, mean SNR = 32.80, mean CORR = 0.99, and mean UQI = 0.97.
(This article belongs to the Section Image and Video Processing)

19 pages, 16022 KiB  
Article
Analyzing Data Modalities for Cattle Weight Estimation Using Deep Learning Models
by Hina Afridi, Mohib Ullah, Øyvind Nordbø, Solvei Cottis Hoff, Siri Furre, Anne Guro Larsgard and Faouzi Alaya Cheikh
J. Imaging 2024, 10(3), 72; https://doi.org/10.3390/jimaging10030072 - 21 Mar 2024
Viewed by 748
Abstract
We investigate the impact of different data modalities for cattle weight estimation. For this purpose, we collect and present our own cattle dataset representing the data modalities: RGB, depth, combined RGB and depth, segmentation, and combined segmentation and depth information. We explore a recent vision-transformer-based zero-shot model proposed by Meta AI Research for producing the segmentation data modality and for extracting the cattle-only region from the images. For experimental analysis, we consider three baseline deep learning models. The objective is to assess how the integration of diverse data sources influences the accuracy and robustness of the deep learning models considering four different performance metrics: mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and R-squared (R2). We explore the synergies and challenges associated with each modality and their combined use in enhancing the precision of cattle weight prediction. Through comprehensive experimentation and evaluation, we aim to provide insights into the effectiveness of different data modalities in improving the performance of established deep learning models, facilitating informed decision-making for precision livestock management systems.
(This article belongs to the Section Computer Vision and Pattern Recognition)
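The four regression metrics named in the abstract above (MAE, RMSE, MAPE, and R-squared) follow standard textbook definitions, which the sketch below computes in plain Python. The function name `regression_metrics` and the sample weights are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (in percent) and R^2 for paired sequences.

    Assumes no true value is zero (needed for MAPE) and that the true
    values are not all identical (needed for R^2).
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Hypothetical cattle weights in kilograms, for illustration only.
metrics = regression_metrics([500.0, 600.0, 700.0], [510.0, 590.0, 700.0])
print(metrics)
```

MAE and RMSE are in the units of the target (here kilograms), while MAPE and R-squared are scale-free, which is one reason such studies often report all four together.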

15 pages, 980 KiB  
Article
FishSegSSL: A Semi-Supervised Semantic Segmentation Framework for Fish-Eye Images
by Sneha Paul, Zachary Patterson and Nizar Bouguila
J. Imaging 2024, 10(3), 71; https://doi.org/10.3390/jimaging10030071 - 15 Mar 2024
Viewed by 949
Abstract
The application of large field-of-view (FoV) cameras equipped with fish-eye lenses brings notable advantages to various real-world computer vision applications, including autonomous driving. While deep learning has proven successful in conventional computer vision applications using regular perspective images, its potential in fish-eye camera contexts remains largely unexplored due to limited datasets for fully supervised learning. Semi-supervised learning comes as a potential solution to manage this challenge. In this study, we explore and benchmark two popular semi-supervised methods from the perspective image domain for fish-eye image segmentation. We further introduce FishSegSSL, a novel fish-eye image segmentation framework featuring three semi-supervised components: pseudo-label filtering, dynamic confidence thresholding, and robust strong augmentation. Evaluation on the WoodScape dataset, collected from vehicle-mounted fish-eye cameras, demonstrates that our proposed method enhances the model’s performance by up to 10.49% over fully supervised methods using the same amount of labeled data. Our method also improves the existing image segmentation methods by 2.34%. To the best of our knowledge, this is the first work on semi-supervised semantic segmentation on fish-eye images. Additionally, we conduct a comprehensive ablation study and sensitivity analysis to showcase the efficacy of each proposed method in this research.
(This article belongs to the Special Issue Deep Learning in Computer Vision)

17 pages, 687 KiB  
Article
Enhancing Embedded Object Tracking: A Hardware Acceleration Approach for Real-Time Predictability
by Mingyang Zhang, Kristof Van Beeck and Toon Goedemé
J. Imaging 2024, 10(3), 70; https://doi.org/10.3390/jimaging10030070 - 13 Mar 2024
Viewed by 851
Abstract
While Siamese object tracking has witnessed significant advancements, its hard real-time behaviour on embedded devices remains inadequately addressed. In many application cases, an embedded implementation should not only have a minimal execution latency, but this latency should ideally also have zero variance, i.e., be predictable. This study aims to address this issue by meticulously analysing real-time predictability across different components of a deep-learning-based video object tracking system. Our detailed experiments not only indicate the superiority of Field-Programmable Gate Array (FPGA) implementations in terms of hard real-time behaviour but also unveil important time predictability bottlenecks. We introduce dedicated hardware accelerators for key processes, focusing on depth-wise cross-correlation and padding operations, utilizing high-level synthesis (HLS). Implemented on a KV260 board, our enhanced tracker exhibits not only a speed up, with a factor of 6.6, in mean execution time but also significant improvements in hard real-time predictability by yielding 11 times less latency variation as compared to our baseline. A subsequent analysis of power consumption reveals our approach’s contribution to enhanced power efficiency. These advancements underscore the crucial role of hardware acceleration in realizing time-predictable object tracking on embedded systems, setting new standards for future hardware–software co-design endeavours in this domain.

23 pages, 20628 KiB  
Article
Multi-Modal Convolutional Parameterisation Network for Guided Image Inverse Problems
by Mikolaj Czerkawski, Priti Upadhyay, Christopher Davison, Robert Atkinson, Craig Michie, Ivan Andonovic, Malcolm Macdonald, Javier Cardona and Christos Tachtatzis
J. Imaging 2024, 10(3), 69; https://doi.org/10.3390/jimaging10030069 - 12 Mar 2024
Viewed by 854
Abstract
There are several image inverse tasks, such as inpainting or super-resolution, which can be solved using deep internal learning, a paradigm that involves employing deep neural networks to find a solution by learning from the sample itself rather than a dataset. For example, Deep Image Prior is a technique based on fitting a convolutional neural network to output the known parts of the image (such as non-inpainted regions or a low-resolution version of the image). However, this approach is not well adjusted for samples composed of multiple modalities. In some domains, such as satellite image processing, accommodating multi-modal representations could be beneficial or even essential. In this work, Multi-Modal Convolutional Parameterisation Network (MCPN) is proposed, where a convolutional neural network approximates shared information between multiple modes by combining a core shared network with modality-specific head networks. The results demonstrate that these approaches can significantly outperform the single-mode adoption of a convolutional parameterisation network on guided image inverse problems of inpainting and super-resolution.
(This article belongs to the Section Computer Vision and Pattern Recognition)

21 pages, 10758 KiB  
Article
Neural Radiance Field-Inspired Depth Map Refinement for Accurate Multi-View Stereo
by Shintaro Ito, Kanta Miura, Koichi Ito and Takafumi Aoki
J. Imaging 2024, 10(3), 68; https://doi.org/10.3390/jimaging10030068 - 08 Mar 2024
Viewed by 1162
Abstract
In this paper, we propose a method to refine the depth maps obtained by Multi-View Stereo (MVS) through iterative optimization of the Neural Radiance Field (NeRF). MVS accurately estimates the depths on object surfaces, and NeRF accurately estimates the depths at object boundaries. The key ideas of the proposed method are to combine MVS and NeRF to utilize the advantages of both in depth map estimation and to use NeRF for depth map refinement. We also introduce a Huber loss into the NeRF optimization to improve the accuracy of the depth map refinement, where the Huber loss reduces the estimation error in the radiance fields by placing constraints on errors larger than a threshold. Through a set of experiments using the Redwood-3dscan dataset and the DTU dataset, which are public datasets consisting of multi-view images, we demonstrate the effectiveness of the proposed method compared to conventional methods: COLMAP, NeRF, and DS-NeRF.
(This article belongs to the Special Issue Geometry Reconstruction from Images (2nd Edition))
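The Huber loss mentioned in the abstract above is a standard robust loss: quadratic for small residuals and linear beyond a threshold. The sketch below shows the textbook definition only. How the paper wires it into the NeRF optimization is not stated in the abstract, and the function name `huber` and the default threshold `delta=1.0` are assumptions.

```python
def huber(residual, delta=1.0):
    """Textbook Huber loss for a single residual.

    Quadratic (0.5 * r^2) when |r| <= delta, linear beyond that, so large
    errors grow the loss more slowly than under a squared-error loss.
    """
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)


# Small residual: same value as half the squared error.
print(huber(0.5))
# Large residual: 2.5 instead of the 4.5 that 0.5 * r^2 would give.
print(huber(3.0))
```

The two branches meet with equal value and slope at `|r| = delta`, so the loss stays smooth while limiting the influence of outliers.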

14 pages, 7617 KiB  
Article
Revolutionizing Cow Welfare Monitoring: A Novel Top-View Perspective with Depth Camera-Based Lameness Classification
by San Chain Tun, Tsubasa Onizuka, Pyke Tin, Masaru Aikawa, Ikuo Kobayashi and Thi Thi Zin
J. Imaging 2024, 10(3), 67; https://doi.org/10.3390/jimaging10030067 - 08 Mar 2024
Viewed by 796
Abstract
This study innovates livestock health management, utilizing a top-view depth camera for accurate cow lameness detection, classification, and precise segmentation through integration with a 3D depth camera and deep learning, distinguishing it from 2D systems. It underscores the importance of early lameness detection in cattle and focuses on extracting depth data from the cow’s body, with a specific emphasis on the back region’s maximum value. Precise cow detection and tracking are achieved through the Detectron2 framework and Intersection Over Union (IOU) techniques. Across a three-day testing period, with observations conducted twice daily with varying cow populations (ranging from 56 to 64 cows per day), the study consistently achieves an impressive average detection accuracy of 99.94%. Tracking accuracy remains at 99.92% over the same observation period. Subsequently, the research extracts the cow’s depth region using binary mask images derived from detection results and original depth images. Feature extraction generates a feature vector based on maximum height measurements from the cow’s backbone area. This feature vector is utilized for classification, evaluating three classifiers: Random Forest (RF), K-Nearest Neighbor (KNN), and Decision Tree (DT). The study highlights the potential of top-view depth video cameras for accurate cow lameness detection and classification, with significant implications for livestock health management.
(This article belongs to the Section Computer Vision and Pattern Recognition)
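The abstract above relies on Intersection over Union (IoU) for tracking. As a rough illustration of that measure, the sketch below computes IoU for two axis-aligned boxes. The `(x1, y1, x2, y2)` box format and the function name `iou` are assumptions, and the paper's actual tracking logic is not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

In a tracker, a detection in the current frame is typically matched to the track whose previous box gives the highest IoU, which is one plausible reading of how the measure is used here.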

10 pages, 9244 KiB  
Article
Magnetic Resonance Imaging as a Diagnostic Tool for Ilio-Femoro-Caval Deep Venous Thrombosis
by Lisbeth Lyhne, Kim Christian Houlind, Johnny Christensen, Radu L. Vijdea, Meinhard R. Hansen, Malene Roland V. Pedersen and Helle Precht
J. Imaging 2024, 10(3), 66; https://doi.org/10.3390/jimaging10030066 - 08 Mar 2024
Viewed by 990
Abstract
This study aimed to test the accuracy of a magnetic resonance imaging (MRI)-based method to detect and characterise deep venous thrombosis (DVT) in the ilio-femoro-caval veins. Patients with verified DVT in the lower extremities with extension of the thrombi to the iliac veins, who were suitable for catheter-based venous thrombolysis, were included in this study. Before the intervention, magnetic resonance venography (MRV) was performed, and the ilio-femoro-caval veins were independently evaluated for normal appearance, stenosis, and occlusion by two single-blinded observers. The same procedure was used to evaluate digital subtraction phlebography (DSP), considered to be the gold standard, which made it possible to compare the results. A total of 123 patients were included for MRV and DSP, resulting in 246 image sets to be analysed. In total, 496 segments were analysed for occlusion, stenosis, or normal appearance. In MRV, the highest sensitivity (0.98) was found when comparing occlusion with either normal appearance or stenosis, while the lowest (0.84) was found between stenosis and normal appearance. Specificity varied from 0.59 (stenosis versus occlusion) to 0.94 (occlusion versus normal). The kappa statistic was calculated as a measure of inter-observer agreement. The kappa value was 0.91 for MRV and 0.80 for DSP. In conclusion, MRV represents a sensitive method to analyse DVT in the pelvic veins with advantages such as no radiation and contrast and the possibility to investigate the anatomical relationship in the area.
(This article belongs to the Section Medical Imaging)
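The abstract above reports a kappa statistic for inter-observer agreement but does not say which variant was used. The sketch below assumes unweighted Cohen's kappa for two observers. The function name `cohens_kappa` and the example ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected by chance from each rater's label frequencies.
    Undefined (division by zero) when p_e equals 1.
    """
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical segment ratings from two observers.
obs_1 = ["occlusion", "occlusion", "normal", "normal"]
obs_2 = ["occlusion", "normal", "normal", "normal"]
print(cohens_kappa(obs_1, obs_2))
```

Kappa corrects raw agreement for chance: here the observers agree on 3 of 4 segments (0.75), but half of that agreement is expected by chance, giving a kappa of 0.5.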

18 pages, 24722 KiB  
Article
Historical Text Line Segmentation Using Deep Learning Algorithms: Mask-RCNN against U-Net Networks
by Florian Côme Fizaine, Patrick Bard, Michel Paindavoine, Cécile Robin, Edouard Bouyé, Raphaël Lefèvre and Annie Vinter
J. Imaging 2024, 10(3), 65; https://doi.org/10.3390/jimaging10030065 - 05 Mar 2024
Viewed by 1022
Abstract
Text line segmentation is a necessary preliminary step before most text transcription algorithms are applied. The leading deep learning networks used in this context (ARU-Net, dhSegment, and Doc-UFCN) are based on the U-Net architecture. They are efficient, but fall under the same concept, requiring a post-processing step to perform instance (e.g., text line) segmentation. In the present work, we test the advantages of Mask-RCNN, which is designed to perform instance segmentation directly. This work is the first to directly compare Mask-RCNN- and U-Net-based networks on text segmentation of historical documents, showing the superiority of the former over the latter. Three studies were conducted, one comparing these networks on different historical databases, another comparing Mask-RCNN with Doc-UFCN on a private historical database, and a third comparing the handwritten text recognition (HTR) performance of the tested networks. The results showed that Mask-RCNN outperformed ARU-Net, dhSegment, and Doc-UFCN using relevant line segmentation metrics, that performance evaluation should not focus on the raw masks generated by the networks, that a light mask processing is an efficient and simple solution to improve evaluation, and that Mask-RCNN leads to better HTR performance.
(This article belongs to the Section Document Analysis and Processing)

14 pages, 4516 KiB  
Article
Elevating Chest X-ray Image Super-Resolution with Residual Network Enhancement
by Anudari Khishigdelger, Ahmed Salem and Hyun-Soo Kang
J. Imaging 2024, 10(3), 64; https://doi.org/10.3390/jimaging10030064 - 04 Mar 2024
Viewed by 1059
Abstract
Chest X-ray (CXR) imaging plays a pivotal role in diagnosing various pulmonary diseases, which account for a significant portion of the global mortality rate, as recognized by the World Health Organization (WHO). Medical practitioners routinely depend on CXR images to identify anomalies and make critical clinical decisions. Dramatic improvements in super-resolution (SR) have been achieved by applying deep learning techniques. However, some SR methods are very difficult to utilize due to their low-resolution inputs and features containing abundant low-frequency information, similar to the case of X-ray image super-resolution. In this paper, we introduce an advanced deep learning-based SR approach that incorporates the innovative residual-in-residual (RIR) structure to augment the diagnostic potential of CXR imaging. Specifically, we propose forming a light network consisting of residual groups built by residual blocks, with multiple skip connections that allow abundant low-frequency information to be bypassed efficiently. This approach allows the main network to concentrate on learning high-frequency information. In addition, we adopted the dense feature fusion within residual groups and designed high parallel residual blocks for better feature extraction. Our proposed methods exhibit superior performance compared to existing state-of-the-art (SOTA) SR methods, delivering enhanced accuracy and notable visual improvements, as evidenced by our results.

16 pages, 2122 KiB  
Article
Enhancing COVID-19 Detection: An Xception-Based Model with Advanced Transfer Learning from X-ray Thorax Images
by Reagan E. Mandiya, Hervé M. Kongo, Selain K. Kasereka, Kyamakya Kyandoghere, Petro Mushidi Tshakwanda and Nathanaël M. Kasoro
J. Imaging 2024, 10(3), 63; https://doi.org/10.3390/jimaging10030063 - 29 Feb 2024
Viewed by 1092
Abstract
Rapid and precise identification of Coronavirus Disease 2019 (COVID-19) is pivotal for effective patient care, comprehending the pandemic’s trajectory, and enhancing long-term patient survival rates. Despite numerous recent endeavors in medical imaging, many convolutional neural network-based models grapple with the expressiveness problem and overfitting, and the training process of these models is always resource-intensive. This paper presents an innovative approach employing Xception, augmented with cutting-edge transfer learning techniques to forecast COVID-19 from X-ray thorax images. Our experimental findings demonstrate that the proposed model surpasses the predictive accuracy of established models in the domain, including Xception, VGG-16, and ResNet. This research marks a significant stride toward enhancing COVID-19 detection through a sophisticated and high-performing imaging model.

17 pages, 4016 KiB  
Article
Enhancing Deep Edge Detection through Normalized Hadamard-Product Fusion
by Gang Hu and Conner Saeli
J. Imaging 2024, 10(3), 62; https://doi.org/10.3390/jimaging10030062 - 29 Feb 2024
Viewed by 954
Abstract
Deep edge detection is challenging, especially with the existing methods, like HED (holistic edge detection). These methods combine multiple feature side outputs (SOs) to create the final edge map, but they neglect diverse edge importance within one output. This creates a problem: to include desired edges, unwanted noise must also be accepted. As a result, the output often has increased noise or thick edges, ignoring important boundaries. To address this, we propose a new approach called the normalized Hadamard-product (NHP) operation-based deep network for edge detection. By multiplying the side outputs from the backbone network, the Hadamard-product operation encourages agreement among features across different scales while suppressing disagreed weak signals. This method produces additional Mutually Agreed Salient Edge (MASE) maps to enrich the hierarchical level of side outputs without adding complexity. Our experiments demonstrate that the NHP operation significantly improves performance, e.g., an ODS score reaching 0.818 on BSDS500, outperforming human performance (0.803), achieving state-of-the-art results in deep edge detection.
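The core idea in the abstract above, multiplying side outputs element-wise so that only mutually agreed responses survive, can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's NHP operation: the abstract does not specify the normalization, so the sketch assumes a simple min-max rescaling to [0, 1], and the function name `hadamard_fuse` and the toy maps are hypothetical.

```python
def hadamard_fuse(side_outputs):
    """Element-wise (Hadamard) product of equally sized 2D maps.

    The product is rescaled to [0, 1] with min-max normalization, which is
    an assumed choice; the paper's actual normalization may differ.
    """
    rows, cols = len(side_outputs[0]), len(side_outputs[0][0])
    product = [[1.0] * cols for _ in range(rows)]
    for edge_map in side_outputs:
        for r in range(rows):
            for c in range(cols):
                product[r][c] *= edge_map[r][c]
    flat = [v for row in product for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return product  # constant map, nothing to rescale
    return [[(v - lo) / (hi - lo) for v in row] for row in product]


# Toy side outputs: both agree strongly only at position (0, 0).
so_fine = [[0.9, 0.8], [0.1, 0.0]]
so_coarse = [[0.8, 0.1], [0.9, 0.0]]
fused = hadamard_fuse([so_fine, so_coarse])
print(fused)
```

Positions where only one map responds strongly are pushed toward zero by the multiplication, while the position where both agree keeps the maximum value.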

25 pages, 10231 KiB  
Article
Comprehensive Evaluation of Multispectral Image Registration Strategies in Heterogenous Agriculture Environment
by Shubham Rana, Salvatore Gerbino, Mariano Crimaldi, Valerio Cirillo, Petronia Carillo, Fabrizio Sarghini and Albino Maggio
J. Imaging 2024, 10(3), 61; https://doi.org/10.3390/jimaging10030061 - 29 Feb 2024
Viewed by 1053
Abstract
This article is focused on the comprehensive evaluation of approaches to scale-invariant feature transform (SIFT) and random sample consensus (RANSAC) based multispectral (MS) image registration. In this paper, the idea is to extensively evaluate three such SIFT- and RANSAC-based registration approaches over a heterogeneous mix containing Triticum aestivum crop and Raphanus raphanistrum weed. The first method is based on the application of a homography matrix, derived during the registration of MS images on spatial coordinates of individual annotations to achieve spatial realignment. The second method is based on the registration of binary masks derived from the ground truth of individual spectral channels. The third method is based on the registration of only the masked pixels of interest across the respective spectral channels. It was found that the MS image registration technique based on the registration of binary masks derived from the manually segmented images exhibited the highest accuracy, followed by the technique involving registration of masked pixels, and lastly, registration based on the spatial realignment of annotations. Among automatically segmented images, the technique based on the registration of automatically predicted mask instances exhibited higher accuracy than the technique based on the registration of masked pixels. In the ground truth images, the annotations performed through the near-infrared channel were found to have the highest accuracy, followed by the green, blue, and red spectral channels. Among the automatically segmented images, the blue channel exhibited the highest accuracy, followed by the green, near-infrared, and red channels. At the individual instance level, the registration based on binary masks showed the highest accuracy in the green channel, followed by the method based on the registration of masked pixels in the red channel, and lastly, the method based on the spatial realignment of annotations in the green channel. The instance detection of wild radish with YOLOv8l-seg was observed at a mAP@0.5 of 92.11% and a segmentation accuracy of 98% towards segmenting its binary mask instances.

17 pages, 8064 KiB  
Article
Development of A Micro-CT Scanner with Dual-Energy Option and Endovascular Contrast Agent Administration Protocol for Fetal and Neonatal Virtual Autopsy
by Robert Zboray, Wolf Schweitzer, Lars Ebert, Martin Wolf, Sabino Guglielmini, Stefan Haemmerle, Stephan Weiss and Bruno Koller
J. Imaging 2024, 10(3), 60; https://doi.org/10.3390/jimaging10030060 - 29 Feb 2024
Viewed by 1015
Abstract
The rate of parental consent for fetal and perinatal autopsy is decreasing, whereas parents are more likely to agree to virtual autopsy by non-invasive imaging methods. Fetal and perinatal virtual autopsy needs high-resolution and good soft-tissue contrast for investigation of the cause of death and underlying trauma or pathology in fetuses and stillborn infants. This is offered by micro-computed tomography (CT), as opposed to the limited resolution provided by clinical CT scanners, and this is one of the most promising tools for non-invasive perinatal postmortem imaging. We developed and optimized a micro-CT scanner with a dual-energy imaging option. It is dedicated to post-mortem CT angiography and virtual autopsy of fetuses and stillborn infants in that the chamber can be cooled down to around 5 °C; this increases tissue rigidity and slows decomposition of the native specimen. This, together with the dedicated gantry-based architecture, attempts to reduce potential motion artifacts. The developed methodology is based on prior endovascular injection of a BaSO4-based contrast agent. We explain the design choices and considerations for this scanner prototype. We detail the optimization of the dual-energy and virtual mono-energetic imaging option, which is based on minimizing noise propagation and maximizing the contrast-to-noise ratio for vascular features. We demonstrate the scanner capabilities with proof-of-concept experiments on phantoms and stillborn piglets.
(This article belongs to the Section Medical Imaging)

19 pages, 2355 KiB  
Article
Privacy-Preserving Face Recognition Method Based on Randomization and Local Feature Learning
by Yanhua Huang, Zhendong Wu, Juan Chen and Hui Xiang
J. Imaging 2024, 10(3), 59; https://doi.org/10.3390/jimaging10030059 - 28 Feb 2024
Viewed by 1019
Abstract
Personal privacy protection has been extensively investigated. The privacy protection of face recognition applications combines face privacy protection with face recognition. Traditional face privacy-protection methods encrypt or perturb facial images for protection. However, the original facial images or parameters need to be restored during recognition. In this paper, it is found that faces can still be recognized correctly when only some of the high-order and local feature information from faces is retained, while the rest of the information is fuzzed. Based on this, a privacy-preserving face recognition method combining random convolution and self-learning batch normalization is proposed. This method generates a privacy-preserved scrambled facial image and an image fuzzy degree that is close to an encryption of the image. The server directly recognizes the scrambled facial image, and the recognition accuracy is equivalent to that of the normal facial image. The method ensures the revocability and irreversibility of the privacy preserving of faces at the same time. In this experiment, the proposed method is tested on the LFW, CelebA, and self-collected face datasets. On the three datasets, the proposed method outperforms the existing face privacy-preserving recognition methods in terms of face visual information elimination and recognition accuracy. The recognition accuracy is >99%, and the visual information elimination is close to an encryption effect.
(This article belongs to the Section Biometrics, Forensics, and Security)

15 pages, 5153 KiB  
Article
An Improved Path-Finding Method for the Tracking of Centerlines of Tortuous Internal Carotid Arteries in MR Angiography
by Se-On Kim and Yoon-Chul Kim
J. Imaging 2024, 10(3), 58; https://doi.org/10.3390/jimaging10030058 - 28 Feb 2024
Viewed by 902
Abstract
Centerline tracking is useful in performing segmental analysis of vessel tortuosity in angiography data. However, a highly tortuous artery can produce multiple centerlines due to over-segmentation of the artery, resulting in inaccurate path-finding results when using the shortest path-finding algorithm. In this study, the internal carotid arteries (ICAs) from three-dimensional (3D) time-of-flight magnetic resonance angiography (TOF MRA) data were used to demonstrate the effectiveness of a new path-finding method. The method is based on a series of depth-first searches (DFSs) with randomly different orders of neighborhood searches and produces an appropriate path connecting the two endpoints in the ICAs. It was compared with three existing methods: (a) DFS with a sequential order of neighborhood search, (b) the Dijkstra algorithm, and (c) the A* algorithm. The path-finding accuracy was evaluated by counting the number of successful paths. The method resulted in an accuracy of 95.8%, outperforming the three existing methods. In conclusion, the proposed method has been shown to be more suitable as a path-finding procedure than the existing methods, particularly in cases where there is more than one centerline resulting from over-segmentation of a highly tortuous artery. Full article
(This article belongs to the Section Medical Imaging)
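The idea of repeating a depth-first search with a randomly shuffled neighbor order can be illustrated on a toy graph. The following is a minimal sketch, not the authors' implementation: the function names `dfs_path` and `randomized_dfs`, the adjacency-dictionary graph, and the rule of keeping the shortest path found across trials are all assumptions made for illustration, since the abstract does not state how the final path is selected.

```python
import random


def dfs_path(graph, start, goal, rng):
    """Iterative DFS that visits neighbours in a random order.

    Returns a simple path from start to goal, or None if goal is unreachable.
    """
    stack = [(start, [start])]
    visited = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        neighbours = list(graph.get(node, ()))
        rng.shuffle(neighbours)  # random neighbourhood search order
        for nxt in neighbours:
            if nxt not in visited:
                stack.append((nxt, path + [nxt]))
    return None


def randomized_dfs(graph, start, goal, trials=20, seed=0):
    """Run several randomized DFS trials and keep one path.

    Assumption: the shortest path over all trials is kept; the source
    abstract does not specify the selection criterion.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        path = dfs_path(graph, start, goal, rng)
        if path is not None and (best is None or len(path) < len(best)):
            best = path
    return best


# Hypothetical toy graph with two parallel branches between A and D.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
path = randomized_dfs(graph, "A", "E")
```

In a real centerline setting the nodes would be skeleton voxels and the neighborhood would be their 3D connectivity; the toy graph only shows how the random ordering lets different trials explore different branches.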

1 page, 139 KiB  
Correction
Correction: Guo et al. A Siamese Transformer Network for Zero-Shot Ancient Coin Classification. J. Imaging 2023, 9, 107
by Zhongliang Guo, Ognjen Arandjelović, David Reid, Yaxiong Lei and Jochen Büttner
J. Imaging 2024, 10(3), 57; https://doi.org/10.3390/jimaging10030057 - 26 Feb 2024
Viewed by 745
Abstract
Jochen Büttner was not included as an author in the original publication [...] Full article
15 pages, 4956 KiB  
Article
Collaborative Modality Fusion for Mitigating Language Bias in Visual Question Answering
by Qiwen Lu, Shengbo Chen and Xiaoke Zhu
J. Imaging 2024, 10(3), 56; https://doi.org/10.3390/jimaging10030056 - 23 Feb 2024
Viewed by 1132
Abstract
Language bias stands as a noteworthy concern in visual question answering (VQA), wherein models tend to rely on spurious correlations between questions and answers for prediction. This prevents the models from effectively generalizing, leading to a decrease in performance. In order to address this bias, we propose a novel modality fusion collaborative de-biasing algorithm (CoD). In our approach, bias is considered as the model’s neglect of information from a particular modality during prediction. We employ a collaborative training approach to facilitate mutual modeling between different modalities, achieving efficient feature fusion and enabling the model to fully leverage multimodal knowledge for prediction. Our experiments on various datasets, including VQA-CP v2, VQA v2, and VQA-VS, using different validation strategies, demonstrate the effectiveness of our approach. Notably, employing a basic baseline model resulted in an accuracy of 60.14% on VQA-CP v2. Full article
(This article belongs to the Section Multimedia Systems and Applications)

35 pages, 72538 KiB  
Article
Optical and Electromechanical Design and Implementation of an Advanced Multispectral Device to Capture Material Appearance
by Majid Ansari-Asl, Markus Barbieri, Gaël Obein and Jon Yngve Hardeberg
J. Imaging 2024, 10(3), 55; https://doi.org/10.3390/jimaging10030055 - 23 Feb 2024
Viewed by 1000
Abstract
The application of materials with changing visual properties with lighting and observation directions has found broad utility across diverse industries, from architecture and fashion to automotive and film production. The expanding array of applications and appearance reproduction requirements emphasizes the critical role of material appearance measurement and surface characterization. Such measurements offer twofold benefits in soft proofing and product quality control, reducing errors and material waste while providing objective quality assessment. Some image-based setups have been proposed to capture the appearance of material surfaces with spatial variations in visual properties in terms of Spatially Varying Bidirectional Reflectance Distribution Functions (SVBRDF) and Bidirectional Texture Functions (BTF). However, comprehensive exploration of optical design concerning spectral channels and per-pixel incident-reflection direction calculations, along with measurement validation, remains an unexplored domain within these systems. Therefore, we developed a novel advanced multispectral image-based device designed to measure SVBRDF and BTF, addressing these gaps in the existing literature. Central to this device is a novel rotation table as sample holder and passive multispectral imaging. In this paper, we present our compact multispectral image-based appearance measurement device, detailing its design, assembly, and optical considerations. Preliminary measurements showcase the device’s potential in capturing angular and spectral data, promising valuable insights into material appearance properties. Full article
(This article belongs to the Special Issue Imaging Technologies for Understanding Material Appearance)

12 pages, 1835 KiB  
Article
Comparison of Echocardiography and Myocardial Scintigraphy to Detect Cancer Therapy-Related Cardiovascular Toxicity in Breast Cancer Patients
by Yuko Harada, Kyosuke Shimada, Satoshi John Harada, Tomomi Sato, Yukino Kubota and Miyoko Yamashita
J. Imaging 2024, 10(3), 54; https://doi.org/10.3390/jimaging10030054 - 21 Feb 2024
Viewed by 1125
Abstract
The mortality rate of cancer patients has been decreasing; however, patients often suffer from cardiac disorders due to chemotherapy or other cancer therapies (e.g., cancer-therapy-related cardiovascular toxicity (CVR-CVT)). Therefore, the field of cardio-oncology has drawn more attention in recent years. The first European Society of Cardiology (ESC) guidelines on cardio-oncology were established last year. Echocardiography is the gold standard for the diagnosis of CVR-CVT, but many breast cancer patients are unable to undergo echocardiography due to their surgery wounds or anatomical reasons. We performed a study to evaluate the usefulness of myocardial scintigraphy using Iodine-123 β-methyl-P-iodophenyl-pentadecanoic acid (123I-BMIPP) in comparison with echocardiography and published the results in the Journal of Imaging last year. This is the secondary analysis following our previous study. A total of 114 breast cancer patients who received chemotherapy within 3 years underwent echocardiography, as well as Thallium (201Tl) and 123I-BMIPP myocardial perfusion and metabolism scintigraphy. The ratio of isotope uptake reduction was scored by Heart Risk View-S software (Nihon Medi-Physics). The scores were then compared with the echocardiography parameters. All the patients’ charts and data from January 2022 to November 2023 were reviewed for the secondary analysis. Echocardiogram parameters were obtained from 99 patients (87% of total patients). No correlations were found between the echocardiography parameters and Heart Risk View-S scores of 201Tl myocardial perfusion scintigraphy, nor those of the BMIPP myocardial metabolism scintigraphy. In total, 8 patients out of 114 (7.0%) died within 22 months, while 3 patients out of 26 CVR-CVT patients (11.5%) died within 22 months. Evaluation by echocardiography was sometimes difficult to perform on breast cancer patients. However, other imaging modalities, including myocardial scintigraphy, cannot serve as alternatives to echocardiography. Cardiac scintigraphy detects circulation disorder or metabolism disorder in the myocardium; therefore, it should be able to reveal myocardial damage to some extent. The mortality rate of breast cancer patients was higher with CVR-CVT. A new modality to detect CVR-CVT besides echocardiography can possibly be anticipated for patients who cannot undergo echocardiography. Full article
(This article belongs to the Section Medical Imaging)

26 pages, 29677 KiB  
Article
Development of a Powder Analysis Procedure Based on Imaging Techniques for Examining Aggregation and Segregation Phenomena
by Giuseppe Bonifazi, Paolo Barontini, Riccardo Gasbarrone, Davide Gattabria and Silvia Serranti
J. Imaging 2024, 10(3), 53; https://doi.org/10.3390/jimaging10030053 - 21 Feb 2024
Viewed by 928
Abstract
In this manuscript, a method that utilizes classical image techniques to assess particle aggregation and segregation, with the primary goal of validating particle size distribution determined by conventional methods, is presented. This approach can represent a supplementary tool in quality control systems for powder production processes in industries such as manufacturing and pharmaceuticals. The methodology involves the acquisition of high-resolution images, followed by their fractal and textural analysis. Fractal analysis plays a crucial role by quantitatively measuring the complexity and self-similarity of particle structures. This approach allows for the numerical evaluation of aggregation and segregation phenomena, providing valuable insights into the underlying mechanisms at play. Textural analysis contributes to the characterization of patterns and spatial correlations observed in particle images. The examination of textural features offers an additional understanding of particle arrangement and organization. Consequently, it aids in validating the accuracy of particle size distribution measurements. To this end, by incorporating fractal and structural analysis, a methodology that enhances the reliability and accuracy of particle size distribution validation is obtained. It enables the identification of irregularities, anomalies, and subtle variations in particle arrangements that might not be detected by traditional measurement techniques alone. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
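One common way to put a number on the "complexity and self-similarity" that fractal analysis measures is the box-counting dimension. The abstract does not say which fractal measure the authors use, so the sketch below is only an assumed illustration: the function names, the choice of box sizes, and the representation of a binarized particle image as a list of foreground pixel coordinates are all hypothetical.

```python
import math


def box_count(points, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    return len({(x // size, y // size) for x, y in points})


def box_counting_dimension(points, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) against log(1/s).

    points: iterable of (x, y) integer pixel coordinates (assumed input format).
    sizes:  box edge lengths in pixels (assumed values).
    """
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Sanity checks on shapes with known dimension.
filled_square = [(x, y) for x in range(16) for y in range(16)]
straight_line = [(x, 0) for x in range(16)]
dim_square = box_counting_dimension(filled_square)
dim_line = box_counting_dimension(straight_line)
```

A filled square should give a slope close to 2 and a straight line close to 1; aggregated or segregated powder images would be expected to fall somewhere in between, though any interpretation of such values is beyond what the abstract states.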

16 pages, 9769 KiB  
Article
Pedestrian-Accessible Infrastructure Inventory: Enabling and Assessing Zero-Shot Segmentation on Multi-Mode Geospatial Data for All Pedestrian Types
by Jiahao Xia, Gavin Gong, Jiawei Liu, Zhigang Zhu and Hao Tang
J. Imaging 2024, 10(3), 52; https://doi.org/10.3390/jimaging10030052 - 21 Feb 2024
Viewed by 1049
Abstract
In this paper, a Segment Anything Model (SAM)-based pedestrian infrastructure segmentation workflow is designed and optimized, which is capable of efficiently processing multi-sourced geospatial data, including LiDAR data and satellite imagery data. We used an expanded definition of pedestrian infrastructure inventory, which goes beyond the traditional transportation elements to include street furniture objects that are important for accessibility but are often omitted from the traditional definition. Our contributions lie in producing the necessary knowledge to answer the following three questions. First, how can mobile LiDAR technology be leveraged to produce comprehensive pedestrian-accessible infrastructure inventory? Second, which data representation can facilitate zero-shot segmentation of infrastructure objects with SAM? Third, how well does the SAM-based method perform on segmenting pedestrian infrastructure objects? Our proposed method is designed to efficiently create pedestrian-accessible infrastructure inventory through the zero-shot segmentation of multi-sourced geospatial datasets. Through addressing three research questions, we show how the multi-mode data should be prepared, what data representation works best for what asset features, and how SAM performs on these data presentations. Our findings indicate that street-view images generated from mobile LiDAR point-cloud data, when paired with satellite imagery data, can work efficiently with SAM to create a scalable pedestrian infrastructure inventory approach with immediate benefits to GIS professionals, city managers, transportation owners, and walkers, especially those with travel-limiting disabilities, such as individuals who are blind, have low vision, or experience mobility disabilities. Full article
(This article belongs to the Special Issue Image and Video Processing for Blind and Visually Impaired)

15 pages, 3905 KiB  
Article
An Efficient and Effective Image Decolorization Algorithm Based on Cumulative Distribution Function
by Tirui Wu, Ciaran Eising, Martin Glavin and Edward Jones
J. Imaging 2024, 10(3), 51; https://doi.org/10.3390/jimaging10030051 - 20 Feb 2024
Viewed by 1166
Abstract
Image decolorization is an image pre-processing step which is widely used in image analysis, computer vision, and printing applications. The most commonly used methods give each color channel (e.g., the R component in RGB format, or the Y component of an image in CIE-XYZ format) a constant weight without considering image content. This approach is simple and fast, but it may cause significant information loss when images contain too many isoluminant colors. In this paper, we propose a new method which is not only efficient, but also can preserve a higher level of image contrast and detail than the traditional methods. It uses the information from the cumulative distribution function (CDF) of the information in each color channel to compute a weight for each pixel in each color channel. Then, these weights are used to combine the three color channels (red, green, and blue) to obtain the final grayscale value. The algorithm works in RGB color space directly without any color conversion. In order to evaluate the proposed algorithm objectively, two new metrics are also developed. Experimental results show that the proposed algorithm can run as efficiently as the traditional methods and obtain the best overall performance across four different metrics. Full article
(This article belongs to the Special Issue Image Processing and Computer Vision: Algorithms and Applications)
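The general shape of a CDF-driven, per-pixel channel weighting can be sketched in a few lines. This is not the authors' formula, which the abstract does not give: here the weight of a channel at a pixel is simply assumed to be that channel's empirical CDF evaluated at the pixel's value, normalized across the three channels. The names `channel_cdf` and `decolorize` and the list-of-tuples image format are likewise assumptions.

```python
def channel_cdf(values, levels=256):
    """Empirical cumulative distribution of 8-bit values in one channel."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    cdf, running, total = [], 0, len(values)
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf


def decolorize(pixels):
    """Convert a list of (r, g, b) 8-bit tuples to grayscale values.

    Assumed weighting (illustrative only): weight_c = CDF_c(value_c),
    normalized so that the three weights of a pixel sum to 1.
    """
    cdfs = [channel_cdf([p[c] for p in pixels]) for c in range(3)]
    gray = []
    for p in pixels:
        weights = [cdfs[c][p[c]] for c in range(3)]
        total = sum(weights)  # > 0: the CDF at an observed value is at least 1/N
        gray.append(round(sum(w * v for w, v in zip(weights, p)) / total))
    return gray


# Neutral gray pixels should map to themselves under any normalized weighting.
neutral = decolorize([(10, 10, 10), (200, 200, 200)])
# Works directly on RGB values, with no color-space conversion.
mixed = decolorize([(255, 0, 0), (0, 255, 0), (30, 60, 90)])
```

Because the weights are normalized per pixel, each output is a convex combination of that pixel's R, G, and B values and therefore stays within the 8-bit range. Unlike a fixed-weight conversion, the weights here depend on the image content through the channel histograms.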
