J. Imaging, Volume 7, Issue 7 (July 2021) – 19 articles

Cover Story: Computer Vision and Convolutional Neural Networks (CNNs) have been used to identify iconography subjects in paintings. However, it has yet to be demonstrated whether the classification results obtained by CNNs rely on the same iconographic properties that human experts exploit when studying iconography. This work compares state-of-the-art class activation map algorithms in terms of their capacity for identifying the iconographic attributes that determine the classification of characters in Christian art paintings. Quantitative and qualitative analyses have been performed, and the salient image areas computed by the best algorithm have been used to estimate object-level bounding boxes. View this paper
17 pages, 9007 KiB  
Article
ChainLineNet: Deep-Learning-Based Segmentation and Parameterization of Chain Lines in Historical Prints
by Aline Sindel, Thomas Klinke, Andreas Maier and Vincent Christlein
J. Imaging 2021, 7(7), 120; https://doi.org/10.3390/jimaging7070120 - 19 Jul 2021
Cited by 3 | Viewed by 2436
Abstract
The paper structure of historical prints acts as a kind of unique fingerprint: paper with the same origin shows similar chain line distances. As the manual measurement of chain line distances is time-consuming, the automatic detection of chain lines is beneficial. We propose an end-to-end trainable deep learning method for segmentation and parameterization of chain lines in transmitted light images of German prints from the 16th century. We trained a conditional generative adversarial network with a multitask loss for line segmentation and line parameterization. We formulated a fully differentiable pipeline for line coordinate estimation that consists of line segmentation, horizontal line alignment, 2D Fourier filtering of line segments, line region proposals, and differentiable line fitting. We created a dataset of high-resolution transmitted light images of historical prints with manual line coordinate annotations. Our method shows superior qualitative and quantitative chain line detection results with high accuracy and reliability on our historical dataset in comparison to competing methods. Further, we demonstrated that our method achieves a low error of less than 0.7 mm in comparison to manually measured chain line distances. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)
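The chain line distances that turn the paper structure into a fingerprint can be illustrated with a minimal sketch; the function name, the pixel-offset input and the dpi handling below are illustrative assumptions, not the authors' differentiable line-fitting pipeline.

```python
import numpy as np

def chain_line_distances(offsets_px, dpi):
    """Distances in mm between adjacent chain lines, given detected line
    offsets (in pixels) along the axis orthogonal to the lines.
    Illustrative only; the paper estimates the lines end-to-end."""
    offsets_mm = np.sort(np.asarray(offsets_px, dtype=float)) * 25.4 / dpi
    return np.diff(offsets_mm)

# e.g. line offsets detected in a 600 dpi transmitted light image
print(chain_line_distances([120, 710, 1304, 1892], dpi=600))  # ~25 mm spacings
```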

16 pages, 54918 KiB  
Article
Forgery Detection in Digital Images by Multi-Scale Noise Estimation
by Marina Gardella, Pablo Musé, Jean-Michel Morel and Miguel Colom
J. Imaging 2021, 7(7), 119; https://doi.org/10.3390/jimaging7070119 - 17 Jul 2021
Cited by 12 | Viewed by 3515
Abstract
A complex processing chain is applied from the moment a raw image is acquired until the final image is obtained. This process transforms the originally Poisson-distributed noise into a complex noise model. Noise inconsistency analysis is a rich source for forgery detection, as forged regions have likely undergone a different processing pipeline or out-camera processing. We propose a multi-scale approach, which is shown to be suitable for analyzing the highly correlated noise present in JPEG-compressed images. We estimate a noise curve for each image block, in each color channel and at each scale. We then compare each noise curve to its corresponding noise curve obtained from the whole image by counting the percentage of bins of the local noise curve that lie below the global one. This procedure yields crucial detection cues since many forgeries create a local noise deficit. Our method is shown to be competitive with the state of the art. It outperforms all other methods when evaluated using the MCC score and, regardless of the evaluation metric, on sufficiently large forged regions and for colorization attacks. Full article
(This article belongs to the Special Issue Image and Video Forensics)
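A rough sketch of the local-versus-global noise-curve comparison described in the abstract; the simple high-pass residual estimator, the bin count and the 50-pixel minimum are simplifying assumptions, not the authors' multi-scale estimator.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def noise_curve(channel, n_bins=16):
    """Crude intensity-dependent noise estimate: bin pixels by intensity and
    take the std of a high-pass residual within each bin."""
    channel = np.asarray(channel, dtype=float)
    residual = (channel - uniform_filter(channel, size=3)).ravel()
    edges = np.linspace(channel.min(), channel.max() + 1e-6, n_bins + 1)
    idx = np.digitize(channel.ravel(), edges) - 1
    curve = np.full(n_bins, np.nan)
    for b in range(n_bins):
        sel = idx == b
        if sel.sum() > 50:                 # skip sparsely populated bins
            curve[b] = residual[sel].std()
    return curve

def local_deficit(block, image, n_bins=16):
    """Percentage of bins where the block's noise curve lies below the global
    one; a high value hints at a local noise deficit (a forgery cue)."""
    local, glob = noise_curve(block, n_bins), noise_curve(image, n_bins)
    valid = ~np.isnan(local) & ~np.isnan(glob)
    return 100.0 * np.mean(local[valid] < glob[valid]) if valid.any() else 0.0
```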

31 pages, 8121 KiB  
Article
Computer Vision Meets Image Processing and UAS PhotoGrammetric Data Integration: From HBIM to the eXtended Reality Project of Arco della Pace in Milan and Its Decorative Complexity
by Fabrizio Banfi and Alessandro Mandelli
J. Imaging 2021, 7(7), 118; https://doi.org/10.3390/jimaging7070118 - 16 Jul 2021
Cited by 15 | Viewed by 3806
Abstract
This study aims to enrich the knowledge of the monument Arco della Pace in Milan, surveying and modelling the sculpture that crowns the upper part of the building. The statues and the decorative apparatus are recorded with the photogrammetric technique using both a terrestrial camera and an Unmanned Aerial Vehicle (UAV). Research results and performance are oriented to improve computer vision and image processing integration with Unmanned Aerial System (UAS) photogrammetric data to enhance interactivity and information sharing between user and digital heritage models. The vast number of images captured from terrestrial and aerial photogrammetry will also permit the use of the Historic Building Information Modelling (HBIM) model in an eXtended Reality (XR) project developed ad hoc, allowing different types of users (professionals, non-expert users, virtual tourists, and students) and devices (mobile phones, tablets, PCs, VR headsets) to access details and information that are not visible from the ground. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)

15 pages, 3473 KiB  
Article
Improved JPEG Coding by Filtering 8 × 8 DCT Blocks
by Yasir Iqbal and Oh-Jin Kwon
J. Imaging 2021, 7(7), 117; https://doi.org/10.3390/jimaging7070117 - 15 Jul 2021
Cited by 5 | Viewed by 2088
Abstract
The JPEG format, consisting of a set of image compression techniques, is one of the most commonly used image coding standards for both lossy and lossless image encoding. In this format, various techniques are used to improve image transmission and storage. In the final step of lossy image coding, JPEG uses either arithmetic or Huffman entropy coding modes to further compress data processed by lossy compression. Both modes encode all the 8 × 8 DCT blocks without filtering empty ones. An end-of-block marker is coded for empty blocks, and these empty blocks cause an unnecessary increase in file size when they are stored with the rest of the data. In this paper, we propose a modified version of the JPEG entropy coding. In the proposed version, instead of storing an end-of-block code for empty blocks with the rest of the data, we store their location in a separate buffer and then compress the buffer with an efficient lossless method to achieve a higher compression ratio. The size of the additional buffer, which keeps the information of location for the empty and non-empty blocks, was considered during the calculation of bits per pixel for the test images. In image compression, peak signal-to-noise ratio versus bits per pixel has been a major measure for evaluating the coding performance. Experimental results indicate that the proposed modified algorithm achieves lower bits per pixel while retaining quality. Full article
(This article belongs to the Special Issue New and Specialized Methods of Image Compression)
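A toy sketch of the idea of moving empty-block information into a separately and losslessly compressed buffer; the block representation, the emptiness test and the use of zlib are illustrative assumptions, not the modified entropy coder itself.

```python
import zlib
import numpy as np

def split_empty_blocks(dct_blocks):
    """dct_blocks: array of shape (n_blocks, 8, 8) with quantized coefficients.
    Returns a losslessly compressed bitmap marking empty blocks plus the
    non-empty blocks that would go to the usual entropy coder."""
    is_empty = np.all(dct_blocks == 0, axis=(1, 2))
    bitmap = np.packbits(is_empty.astype(np.uint8))          # 1 bit per block
    side_buffer = zlib.compress(bitmap.tobytes(), level=9)   # lossless side info
    return side_buffer, dct_blocks[~is_empty], is_empty

# Toy usage: 1000 blocks, 700 of them empty
blocks = np.zeros((1000, 8, 8), dtype=np.int16)
blocks[:300] = np.random.randint(-5, 6, size=(300, 8, 8))
buf, kept, mask = split_empty_blocks(blocks)
print(len(buf), "bytes of side information for", int(mask.sum()), "empty blocks")
```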

18 pages, 3180 KiB  
Article
Performance Evaluation of Source Camera Attribution by Using Likelihood Ratio Methods
by Pasquale Ferrara, Rudolf Haraksim and Laurent Beslay
J. Imaging 2021, 7(7), 116; https://doi.org/10.3390/jimaging7070116 - 15 Jul 2021
Cited by 3 | Viewed by 1846
Abstract
Performance evaluation of source camera attribution methods typically stops at the level of analysis of hard-to-interpret similarity scores. Standard analytic tools include Detection Error Trade-off or Receiver Operating Characteristic curves, or other scalar performance metrics, such as Equal Error Rate or error rates at a specific decision threshold. However, the main drawback of similarity scores is their lack of probabilistic interpretation and thereby their lack of usability in forensic investigation, when assisting the trier of fact to make more sound and more informed decisions. The main objective of this work is to demonstrate a transition from the similarity scores to likelihood ratios in the scope of digital evidence evaluation, which not only have probabilistic meaning, but can be immediately incorporated into the forensic casework and combined with the rest of the case-related forensic evidence. Likelihood ratios are calculated from the Photo Response Non-Uniformity source attribution similarity scores. The experiments conducted aim to compare different strategies applied to both digital images and videos, by considering their respective peculiarities. The results are presented in a format compatible with the guideline for validation of forensic likelihood ratio methods. Full article
(This article belongs to the Special Issue Image and Video Forensics)
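A minimal sketch of converting similarity scores into likelihood ratios by modelling same-source and different-source score distributions; the Gaussian score models and the synthetic numbers below are assumptions, not the calibration strategies compared in the paper.

```python
import numpy as np
from scipy.stats import norm

def fit_score_models(mated_scores, non_mated_scores):
    """Fit simple Gaussian models to same-camera and different-camera
    similarity scores (a common, if simplistic, calibration choice)."""
    return (norm(np.mean(mated_scores), np.std(mated_scores)),
            norm(np.mean(non_mated_scores), np.std(non_mated_scores)))

def likelihood_ratio(score, mated_model, non_mated_model):
    """LR = p(score | same camera) / p(score | different camera)."""
    return mated_model.pdf(score) / non_mated_model.pdf(score)

# Toy usage with synthetic PRNU-style scores (illustrative numbers only)
rng = np.random.default_rng(0)
mated = rng.normal(0.08, 0.02, 500)       # same-camera scores
non_mated = rng.normal(0.00, 0.01, 500)   # different-camera scores
m, nm = fit_score_models(mated, non_mated)
print(likelihood_ratio(0.05, m, nm))      # LR > 1 supports the same-source hypothesis
```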

20 pages, 1248 KiB  
Article
Camera Color Correction for Cultural Heritage Preservation Based on Clustered Data
by Marco Trombini, Federica Ferraro, Emanuela Manfredi, Giovanni Petrillo and Silvana Dellepiane
J. Imaging 2021, 7(7), 115; https://doi.org/10.3390/jimaging7070115 - 13 Jul 2021
Cited by 4 | Viewed by 2060
Abstract
Cultural heritage preservation is a crucial topic for our society. When dealing with fine art, color is a primary feature that encompasses much information related to the artwork’s conservation status and to the pigments’ composition. As an alternative to more sophisticated devices, the analysis and identification of color pigments may be addressed via a digital camera, i.e., a non-invasive, inexpensive, and portable tool for studying large surfaces. In the present study, we propose a new supervised approach to camera characterization based on clustered data in order to address the homoscedasticity of the acquired data. The experimental phase is conducted on a real pictorial dataset, where pigments are grouped according to their chromatic or chemical properties. The results show that such a procedure leads to better characterization with respect to state-of-the-art methods. In addition, the present study introduces a method to deal with organic pigments in a quantitative visual approach. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)
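A hedged sketch of supervised, per-cluster camera characterization: one least-squares correction is fitted for each pigment cluster using a degree-2 polynomial expansion of camera RGB. The expansion and function names are illustrative, not the paper's exact procedure or its treatment of homoscedasticity.

```python
import numpy as np

def polynomial_expand(rgb):
    """Degree-2 polynomial expansion of RGB triplets, a common choice in
    camera colorimetric characterization."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.stack([np.ones_like(r), r, g, b, r*g, r*b, g*b, r*r, g*g, b*b], axis=1)

def fit_per_cluster(camera_rgb, reference, labels):
    """Fit one least-squares color-correction matrix per pigment cluster."""
    models = {}
    for c in np.unique(labels):
        X = polynomial_expand(camera_rgb[labels == c])
        Y = reference[labels == c]
        models[c], *_ = np.linalg.lstsq(X, Y, rcond=None)
    return models

def correct(camera_rgb, labels, models):
    """Apply the cluster-specific correction to each sample."""
    out = np.empty(camera_rgb.shape, dtype=float)
    for c, M in models.items():
        out[labels == c] = polynomial_expand(camera_rgb[labels == c]) @ M
    return out
```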

19 pages, 10347 KiB  
Article
DCNet: Noise-Robust Convolutional Neural Networks for Degradation Classification on Ancient Documents
by Fitri Arnia, Khairun Saddami and Khairul Munadi
J. Imaging 2021, 7(7), 114; https://doi.org/10.3390/jimaging7070114 - 12 Jul 2021
Cited by 1 | Viewed by 2416
Abstract
Analysis of degraded ancient documents is challenging due to the severity and combination of degradation present in a single image. Ancient documents also suffer from additional noise during the digitalization process, particularly when digitalization is done using low-specification devices and/or under poor illumination conditions. This noise makes the analysis of degraded ancient documents even more troublesome. In this paper, we propose a new noise-robust convolutional neural network (CNN) architecture for degradation classification of noisy ancient documents, which is called a degradation classification network (DCNet). DCNet was constructed based on the ResNet101, MobileNetV2, and ShuffleNet architectures. Furthermore, we propose a new self-transition layer following DCNet. We trained the DCNet using (1) noise-free document images and (2) heavy-noise (zero-mean Gaussian noise (ZMGN) and speckle) document images. Then, we tested the resulting models with document images containing different levels of ZMGN and speckle noise. We compared our results to three CNN benchmarking architectures, namely MobileNet, ShuffleNet, and ResNet101. In general, the proposed architecture performed better than MobileNet, ShuffleNet, ResNet101, and conventional machine learning (support vector machine and random forest), particularly for documents with heavy noise. Full article
(This article belongs to the Section Document Analysis and Processing)
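The two training noise conditions named in the abstract (zero-mean Gaussian and speckle noise) can be synthesized as below; the parameter values are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np

def add_zmgn(image, sigma=25.0):
    """Zero-mean Gaussian noise (ZMGN) on an 8-bit grayscale document image."""
    noisy = image.astype(float) + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def add_speckle(image, var=0.04):
    """Multiplicative speckle noise: I * (1 + n), with n ~ N(0, var)."""
    noisy = image.astype(float) * (1.0 + np.random.normal(0.0, np.sqrt(var), image.shape))
    return np.clip(noisy, 0, 255).astype(np.uint8)
```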

10 pages, 1842 KiB  
Article
Giant Intraosseous Cyst-Like Lesions of the Metacarpal Bones in Rheumatoid Arthritis
by Wanxuan Fang, Ikuma Nakagawa, Kenneth Sutherland, Kazuhide Tanimura and Tamotsu Kamishima
J. Imaging 2021, 7(7), 113; https://doi.org/10.3390/jimaging7070113 - 12 Jul 2021
Viewed by 4871
Abstract
The purpose of this study was to illustrate the clinical and imaging properties of giant intraosseous cyst-like lesions (GICLs) of the metacarpal bones extending beyond the central diaphysis in rheumatoid arthritis (RA) patients on magnetic resonance (MR) images. A keyword search was conducted to extract GICLs of the metacarpal bones out of MR reports in RA patients. There were nine GICLs extending from the subchondral bone region beyond the central diaphysis of the metacarpal bones on MR images in eight subjects with RA (seven females, one male). The age range was from 60 to 87 years with a median age of 65.5 years. The average disease duration was 13.1 years. As for the disease activity, one was low, six were moderate and one was high. None of the nine lesions were visible on radiography. The Steinbrocker stage distribution was as follows: I (n = 3), II (n = 2), and III (n = 3). Intraosseous cyst-like lesion of the metacarpal bones on MR images is a relatively rare manifestation in patients with long-standing RA. Although the lesion seems to be derived from subcortical bone break, it is not necessarily erosive in nature. Full article
(This article belongs to the Section Medical Imaging)

15 pages, 673 KiB  
Article
No-Reference Image Quality Assessment with Multi-Scale Orderless Pooling of Deep Features
by Domonkos Varga
J. Imaging 2021, 7(7), 112; https://doi.org/10.3390/jimaging7070112 - 10 Jul 2021
Cited by 4 | Viewed by 3200
Abstract
The goal of no-reference image quality assessment (NR-IQA) is to evaluate the perceptual quality of digital images without using the distortion-free, pristine counterparts. NR-IQA is an important part of multimedia signal processing since digital images can undergo a wide variety of distortions during storage, compression, and transmission. In this paper, we propose a novel architecture that extracts deep features from the input image at multiple scales to improve the effectiveness of feature extraction for NR-IQA using convolutional neural networks. Specifically, the proposed method extracts deep activations for local patches at multiple scales and maps them onto perceptual quality scores with the help of trained Gaussian process regressors. Extensive experiments demonstrate that the introduced algorithm performs favorably against the state-of-the-art methods on three large benchmark datasets with authentic distortions (LIVE In the Wild, KonIQ-10k, and SPAQ). Full article
(This article belongs to the Special Issue Image and Video Quality Assessment)
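A sketch of the overall idea: pool deep activations from several rescaled versions of an image and regress quality scores with a Gaussian process. The ResNet-18 backbone, the scale set and the pooling below are placeholders, not the paper's architecture.

```python
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms.functional as TF
from sklearn.gaussian_process import GaussianProcessRegressor

backbone = models.resnet18(weights=None)   # placeholder; a pretrained backbone would be used in practice
backbone.fc = torch.nn.Identity()          # keep the 512-d globally pooled features
backbone.eval()

def multiscale_features(img_tensor, scales=(1.0, 0.5, 0.25)):
    """Concatenate globally pooled deep features extracted at several scales.
    img_tensor: float tensor of shape (3, H, W)."""
    feats = []
    with torch.no_grad():
        for s in scales:
            h, w = int(img_tensor.shape[-2] * s), int(img_tensor.shape[-1] * s)
            x = TF.resize(img_tensor, [max(h, 64), max(w, 64)])
            feats.append(backbone(x.unsqueeze(0)).squeeze(0).numpy())
    return np.concatenate(feats)

# Regress mean opinion scores from the concatenated features:
# X_train: (n_images, n_features), y_train: (n_images,) quality scores
gpr = GaussianProcessRegressor()
# gpr.fit(X_train, y_train)
# gpr.predict(multiscale_features(test_img)[None, :])
```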

19 pages, 4441 KiB  
Article
A Deep Learning Ensemble Method to Assist Cytopathologists in Pap Test Image Classification
by Débora N. Diniz, Mariana T. Rezende, Andrea G. C. Bianchi, Claudia M. Carneiro, Eduardo J. S. Luz, Gladston J. P. Moreira, Daniela M. Ushizima, Fátima N. S. de Medeiros and Marcone J. F. Souza
J. Imaging 2021, 7(7), 111; https://doi.org/10.3390/jimaging7070111 - 09 Jul 2021
Cited by 28 | Viewed by 3901
Abstract
In recent years, deep learning methods have outperformed previous state-of-the-art machine learning techniques for several problems, including image classification. Classifying cells in Pap smear images is very challenging, and it is still of paramount importance for cytopathologists. The Pap test is a cervical cancer prevention test that tracks preneoplastic changes in cervical epithelial cells. Carrying out this exam is important because early detection is directly related to a greater chance of cure and a reduction in the number of deaths caused by the disease. The analysis of Pap smears is exhaustive and repetitive, as it is performed manually by cytopathologists. Therefore, a tool that assists cytopathologists is needed. This work considers 10 deep convolutional neural networks and proposes an ensemble of the three best architectures to classify cervical cancer based on cell nuclei and reduce the professionals’ workload. The dataset used in the experiments is available in the Center for Recognition and Inspection of Cells (CRIC) Searchable Image Database. Considering the metrics of precision, recall, F1-score, accuracy, and sensitivity, the proposed ensemble improves on previous methods reported in the literature for two- and three-class classification. We also introduce the six-class classification outcome. Full article
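One common way to fuse the three best architectures is soft voting over their softmax outputs; this is a hedged illustration only, and the paper's exact fusion rule may differ.

```python
import numpy as np

def ensemble_predict(prob_lists):
    """prob_lists: list of (n_cells, n_classes) softmax outputs, one per model.
    Returns the class chosen by averaging probabilities (soft voting)."""
    mean_probs = np.mean(np.stack(prob_lists), axis=0)
    return mean_probs.argmax(axis=1)
```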

36 pages, 860 KiB  
Article
HOSVD-Based Algorithm for Weighted Tensor Completion
by Zehan Chao, Longxiu Huang and Deanna Needell
J. Imaging 2021, 7(7), 110; https://doi.org/10.3390/jimaging7070110 - 07 Jul 2021
Cited by 2 | Viewed by 1939
Abstract
Matrix completion, the problem of completing missing entries in a data matrix with low-dimensional structure (such as rank), has seen many fruitful approaches and analyses. Tensor completion is the tensor analog that attempts to impute missing tensor entries from similar low-rank type assumptions. In this paper, we study the tensor completion problem when the sampling pattern is deterministic and possibly non-uniform. We first propose an efficient weighted Higher Order Singular Value Decomposition (HOSVD) algorithm for the recovery of the underlying low-rank tensor from noisy observations and then derive the error bounds under a properly weighted metric. Additionally, the efficiency and accuracy of our algorithm are both tested using synthetic and real datasets in numerical simulations. Full article
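A hedged, unweighted sketch of HOSVD-based completion: alternate a low-multilinear-rank HOSVD projection with re-imposing the observed entries. The paper's weighted algorithm, deterministic sampling analysis and error bounds are not reproduced here.

```python
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding of a tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_multiply(T, M, mode):
    """Multiply tensor T by matrix M along the given mode."""
    return np.moveaxis(np.tensordot(M, np.moveaxis(T, mode, 0), axes=1), 0, mode)

def hosvd_truncate(T, ranks):
    """Project T onto its leading HOSVD factors (Tucker truncation)."""
    factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r]
               for m, r in enumerate(ranks)]
    core = T
    for m, U in enumerate(factors):
        core = mode_multiply(core, U.T, m)
    approx = core
    for m, U in enumerate(factors):
        approx = mode_multiply(approx, U, m)
    return approx

def complete(observed, mask, ranks, n_iter=50):
    """Hard-imputation loop: low-multilinear-rank projection, then re-impose
    the observed entries given by the boolean mask."""
    X = np.where(mask, observed, observed[mask].mean())
    for _ in range(n_iter):
        X = hosvd_truncate(X, ranks)
        X[mask] = observed[mask]
    return X
```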

19 pages, 13071 KiB  
Article
Fall Detection of Elderly People Using the Manifold of Positive Semidefinite Matrices
by Abdessamad Youssfi Alaoui, Youness Tabii, Rachid Oulad Haj Thami, Mohamed Daoudi, Stefano Berretti and Pietro Pala
J. Imaging 2021, 7(7), 109; https://doi.org/10.3390/jimaging7070109 - 06 Jul 2021
Cited by 13 | Viewed by 2593
Abstract
Falls are one of the most critical health care risks for elderly people, being, in some adverse circumstances, an indirect cause of death. Furthermore, demographic forecasts for the future show a growing elderly population worldwide. In this context, models for automatic fall detection and prediction are of paramount relevance, especially AI applications that use ambient sensors or computer vision. In this paper, we present an approach for fall detection using computer vision techniques. Video sequences of a person in a closed environment are used as inputs to our algorithm. In our approach, we first apply the V2V-PoseNet model to detect the 2D body skeleton in every frame. Specifically, our approach involves four steps: (1) the body skeleton is detected by V2V-PoseNet in each frame; (2) the skeleton joints are mapped into the Riemannian manifold of positive semidefinite matrices of fixed rank 2 to build time-parameterized trajectories; (3) a temporal warping is performed on the trajectories, providing a (dis-)similarity measure between them; (4) finally, a pairwise proximity function SVM is used to classify them into fall or non-fall, incorporating the (dis-)similarity measure into the kernel function. We evaluated our approach on two publicly available datasets, URFD and Charfi. The results of the proposed approach are competitive with respect to state-of-the-art methods, while only involving 2D body skeletons. Full article
(This article belongs to the Special Issue 2020 Selected Papers from Journal of Imaging Editorial Board Members)
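The key representation can be sketched directly: the 2D joints of each frame are mapped to a Gram matrix, a positive semidefinite matrix of rank at most 2. The temporal warping and the pairwise proximity SVM are omitted, and the translation removal is an assumption of this sketch.

```python
import numpy as np

def skeleton_to_psd(joints_2d):
    """joints_2d: (n_joints, 2) array of 2D joint coordinates.
    Returns the Gram matrix G = J J^T, an n_joints x n_joints positive
    semidefinite matrix of rank at most 2 (after removing global translation)."""
    J = joints_2d - joints_2d.mean(axis=0)
    return J @ J.T

# A trajectory is the sequence of such matrices over frames:
# traj = np.stack([skeleton_to_psd(frame_joints) for frame_joints in sequence])
```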

29 pages, 1797 KiB  
Article
Media Forensics Considerations on DeepFake Detection with Hand-Crafted Features
by Dennis Siegel, Christian Kraetzer, Stefan Seidlitz and Jana Dittmann
J. Imaging 2021, 7(7), 108; https://doi.org/10.3390/jimaging7070108 - 01 Jul 2021
Cited by 20 | Viewed by 4575
Abstract
DeepFake detection is a novel task for media forensics and is currently receiving a lot of research attention due to the threat these targeted video manipulations pose to the trust placed in video footage. The current trend in DeepFake detection is the application of neural networks to learn feature spaces that allow manipulated videos to be distinguished from unmanipulated ones. In this paper, we discuss an alternative to this trend, based on features hand-crafted by domain experts. The main advantage that hand-crafted features have over learned features is their interpretability and the consequences this might have for plausibility validation of the decisions made. Here, we discuss three sets of hand-crafted features and three different fusion strategies to implement DeepFake detection. Our tests on three pre-existing reference databases show detection performances that are, under comparable test conditions, on par (peak AUC > 0.95) with those of state-of-the-art methods using learned features. Furthermore, our approach shows a similar, if not better, generalization behavior than neural network-based methods in tests performed with different training and test sets. In addition to these pattern recognition considerations, first steps towards a data-centric examination approach for forensic process modeling are taken to increase the maturity of the present investigation. Full article
(This article belongs to the Special Issue Image and Video Forensics)

14 pages, 2193 KiB  
Article
Quantitative Analyses of the Left Ventricle Volume and Cardiac Function in Normal and Infarcted Yucatan Minipigs
by Anna V. Naumova, Gregory Kicska, Kiana Pimentel, Lauren E. Neidig, Hiroshi Tsuchida, Kenta Nakamura and Charles E. Murry
J. Imaging 2021, 7(7), 107; https://doi.org/10.3390/jimaging7070107 - 01 Jul 2021
Cited by 1 | Viewed by 2501
Abstract
(1) Background: The accuracy of the left ventricular volume (LVV) and contractility measurements with cardiac magnetic resonance imaging (CMRI) is decreased if the papillary muscles are abnormally enlarged, such as in hypertrophic cardiomyopathy in human patients or in pig models of human diseases. The purpose of this work was to establish the best method of LVV quantification with CMRI in pigs. (2) Methods: The LVV in 29 Yucatan minipig hearts was measured using two different techniques: the “standard method”, which uses smooth contouring along the endocardial surface and adds the papillary volume to the ventricular cavity volume, and the “detailed method”, which traces the papillary muscles and trabeculations and adds them to the ventricular mass. (3) Results: Papillary muscles add 21% to the LV mass in normal and infarcted hearts of Yucatan minipigs. The inclusion or exclusion of these from the CMRI analysis significantly affected the study results. In the normal pig hearts, the biggest differences were found in measurements of the LVV, ejection fraction (EF), LV mass and indices derived from the LV mass (p < 0.001). The EF measurement in the normal pig heart was 11% higher with the detailed method, and 19% higher in the infarcted pig hearts (p < 0.0001). The detailed method of endocardium tracing with CMRI closely represented the LV mass measured ex vivo. (4) Conclusions: The detailed method, which accounts for the large volume of the papillary muscles in the pig heart, provides better accuracy and interobserver consistency in the assessment of LV mass and ejection fraction, and might therefore be preferable for these analyses. Full article
(This article belongs to the Topic Medical Image Analysis)
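The ejection fraction that both tracing methods feed into follows the standard volume formula; the numbers below are illustrative only, not the study's measurements.

```python
def ejection_fraction(edv_ml, esv_ml):
    """EF (%) = (end-diastolic volume - end-systolic volume) / end-diastolic volume."""
    return 100.0 * (edv_ml - esv_ml) / edv_ml

# Illustrative numbers only: including or excluding papillary muscle volume
# changes both EDV and ESV and can therefore shift the computed EF.
print(ejection_fraction(60.0, 25.0))   # ~58.3 %
```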

22 pages, 12934 KiB  
Article
Comparing CAM Algorithms for the Identification of Salient Image Features in Iconography Artwork Analysis
by Nicolò Oreste Pinciroli Vago, Federico Milani, Piero Fraternali and Ricardo da Silva Torres
J. Imaging 2021, 7(7), 106; https://doi.org/10.3390/jimaging7070106 - 29 Jun 2021
Cited by 13 | Viewed by 3610
Abstract
Iconography studies the visual content of artworks by considering the themes portrayed in them and their representation. Computer Vision has been used to identify iconographic subjects in paintings, and Convolutional Neural Networks have enabled the effective classification of characters in Christian art paintings. However, it has yet to be demonstrated whether the classification results obtained by CNNs rely on the same iconographic properties that human experts exploit when studying iconography and whether the architecture of a classifier trained on whole artwork images can be exploited to support the much harder task of object detection. A suitable approach for exposing the process of classification by neural models relies on Class Activation Maps, which emphasize the areas of an image contributing the most to the classification. This work compares state-of-the-art algorithms (CAM, Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++) in terms of their capacity for identifying the iconographic attributes that determine the classification of characters in Christian art paintings. Quantitative and qualitative analyses show that Grad-CAM, Grad-CAM++, and Smooth Grad-CAM++ have similar performances, while CAM has lower efficacy. Smooth Grad-CAM++ isolates multiple disconnected image regions that identify small iconographic symbols well. Grad-CAM produces wider and more contiguous areas that cover large iconographic symbols better. The salient image areas computed by the CAM algorithms have been used to estimate object-level bounding boxes, and a quantitative analysis shows that the boxes estimated with Grad-CAM reach 55% average IoU, 61% GT-known localization and 31% mAP. The obtained results are a step towards the computer-aided study of the variations in the positioning and mutual relations of iconographic elements in artworks and open the way to the automatic creation of bounding boxes for training detectors of iconographic symbols in Christian art images. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)
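A minimal sketch of how a class activation map can be turned into an object-level bounding box and scored with IoU; the 50% relative threshold is an illustrative assumption, not the paper's estimation procedure.

```python
import numpy as np

def cam_to_bbox(cam, rel_threshold=0.5):
    """Threshold a class activation map at a fraction of its maximum and
    return the tight bounding box (x0, y0, x1, y1) of the salient area."""
    mask = cam >= rel_threshold * cam.max()
    ys, xs = np.where(mask)
    return xs.min(), ys.min(), xs.max(), ys.max()

def iou(box_a, box_b):
    """Intersection over Union of two (x0, y0, x1, y1) boxes."""
    x0, y0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x1, y1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x1 - x0 + 1) * max(0, y1 - y0 + 1)
    area = lambda b: (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area(box_a) + area(box_b) - inter)
```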

10 pages, 1373 KiB  
Article
How Can a Deep Learning Algorithm Improve Fracture Detection on X-rays in the Emergency Room?
by Guillaume Reichert, Ali Bellamine, Matthieu Fontaine, Beatrice Naipeanu, Adrien Altar, Elodie Mejean, Nicolas Javaud and Nathalie Siauve
J. Imaging 2021, 7(7), 105; https://doi.org/10.3390/jimaging7070105 - 25 Jun 2021
Cited by 7 | Viewed by 4925
Abstract
The growing need for emergency imaging has greatly increased the number of conventional X-rays, particularly for traumatic injury. Deep learning (DL) algorithms could improve fracture screening by radiologists and emergency room (ER) physicians. We used an algorithm developed for the detection of appendicular skeleton fractures and evaluated its performance for detecting traumatic fractures on conventional X-rays in the ER, without the need for training on local data. This algorithm was tested on all patients (N = 125) consulting at the Louis Mourier ER in May 2019 for limb trauma. Patients were selected by two emergency physicians from the clinical database used in the ER. Their X-rays were exported and analyzed by a radiologist. The prediction made by the algorithm and the annotation made by the radiologist were compared. For the 125 patients included, 25 patients with a fracture were identified by the clinicians, 24 of whom were identified by the algorithm (sensitivity of 96%). The algorithm incorrectly predicted a fracture in 14 of the 100 patients without fractures (specificity of 86%). The negative predictive value was 98.85%. This study shows that DL algorithms are potentially valuable diagnostic tools for detecting fractures in the ER and could be used in the training of junior radiologists. Full article
(This article belongs to the Special Issue Deep Learning in Medical Image Analysis, Volume II)
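The reported sensitivity, specificity and negative predictive value follow directly from the counts stated in the abstract; a quick check of that arithmetic:

```python
# Confusion counts as reported in the abstract
tp, fn = 24, 1        # 25 fractures, 24 flagged by the algorithm
fp, tn = 14, 86       # 100 patients without fracture, 14 false alarms

sensitivity = tp / (tp + fn)   # 24/25  = 0.96
specificity = tn / (tn + fp)   # 86/100 = 0.86
npv = tn / (tn + fn)           # 86/87  ≈ 0.9885
print(sensitivity, specificity, npv)
```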

18 pages, 2223 KiB  
Article
Unsupervised Foreign Object Detection Based on Dual-Energy Absorptiometry in the Food Industry
by Vladyslav Andriiashen, Robert van Liere, Tristan van Leeuwen and Kees Joost Batenburg
J. Imaging 2021, 7(7), 104; https://doi.org/10.3390/jimaging7070104 - 24 Jun 2021
Cited by 2 | Viewed by 2269
Abstract
X-ray imaging is a widely used technique for non-destructive inspection of agricultural food products. One application of X-ray imaging is the autonomous, in-line detection of foreign objects in food samples. Examples of such inclusions are bone fragments in meat products, plastic and metal debris in fish, and fruit infestations. This article presents a processing methodology for unsupervised foreign object detection based on dual-energy X-ray absorptiometry (DEXA). A novel thickness correction model is introduced as a pre-processing technique for DEXA data. The aim of the model is to homogenize regions in the image that belong to the food product and to enhance contrast where the foreign object is present. In this way, the segmentation of the foreign object is more robust to noise and lack of contrast. The proposed methodology was applied to a dataset of 488 samples of meat products acquired from a conveyor belt. Approximately 60% of the samples contain foreign objects of different types and sizes, while the rest of the samples are void of foreign objects. The results show that samples without foreign objects are correctly identified in 97% of cases and that the overall accuracy of foreign object detection reaches 95%. Full article

19 pages, 54262 KiB  
Article
Restoration and Enhancement of Historical Stereo Photos
by Marco Fanfani, Carlo Colombo and Fabio Bellavia
J. Imaging 2021, 7(7), 103; https://doi.org/10.3390/jimaging7070103 - 24 Jun 2021
Cited by 2 | Viewed by 2095
Abstract
Restoration of digital visual media acquired from repositories of historical photographic and cinematographic material is of key importance for the preservation, study and transmission of the legacy of past cultures to the coming generations. In this paper, a fully automatic approach to the digital restoration of historical stereo photographs is proposed, referred to as Stacked Median Restoration plus (SMR+). The approach exploits the content redundancy in stereo pairs for detecting and fixing scratches, dust, dirt spots and many other defects in the original images, as well as improving contrast and illumination. This is done by estimating the optical flow between the images, and using it to register one view onto the other both geometrically and photometrically. Restoration is then accomplished in three steps: (1) image fusion according to the stacked median operator, (2) low-resolution detail enhancement by guided supersampling, and (3) iterative visual consistency checking and refinement. Each step implements an original algorithm specifically designed for this work. The restored image is fully consistent with the original content, thus improving over the methods based on image hallucination. Comparative results on three different datasets of historical stereograms show the effectiveness of the proposed approach, and its superiority over single-image denoising and super-resolution methods. Results also show that the performance of the state-of-the-art single-image deep restoration network Bringing Old Photo Back to Life (BOPBtL) can be strongly improved when the input image is pre-processed by SMR+. Full article
(This article belongs to the Special Issue Fine Art Pattern Extraction and Recognition)
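A hedged sketch of the core fusion step: register one view onto the other with dense optical flow and take a per-pixel median of the aligned stack. The Farneback flow and the two-view stack, as well as the omission of photometric registration, guided supersampling and consistency refinement, are simplifications relative to SMR+.

```python
import cv2
import numpy as np

def register_and_fuse(left_gray, right_gray):
    """Warp the right view onto the left using dense optical flow, then fuse
    the aligned pair with a per-pixel median (with only two views this reduces
    to their mean; more registered views make the median more selective)."""
    flow = cv2.calcOpticalFlowFarneback(left_gray, right_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = left_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    right_warped = cv2.remap(right_gray, map_x, map_y, cv2.INTER_LINEAR)
    stack = np.stack([left_gray.astype(float), right_warped.astype(float)])
    fused = np.median(stack, axis=0)
    return np.clip(fused, 0, 255).astype(np.uint8)
```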

23 pages, 11458 KiB  
Article
Exposing Manipulated Photos and Videos in Digital Forensics Analysis
by Sara Ferreira, Mário Antunes and Manuel E. Correia
J. Imaging 2021, 7(7), 102; https://doi.org/10.3390/jimaging7070102 - 24 Jun 2021
Cited by 11 | Viewed by 6989
Abstract
Tampered multimedia content is being increasingly used in a broad range of cybercrime activities. The spread of fake news, misinformation, digital kidnapping, and ransomware-related crimes are amongst the most recurrent crimes in which manipulated digital photos and videos are the perpetrating and disseminating medium. Criminal investigation has been challenged in applying machine learning techniques to automatically distinguish between fake and genuine seized photos and videos. Despite the pertinent need for manual validation, easy-to-use platforms for digital forensics are essential to automate and facilitate the detection of tampered content and to help criminal investigators with their work. This paper presents a machine learning method based on Support Vector Machines (SVM) to distinguish between genuine and fake multimedia files, namely digital photos and videos, which may indicate the presence of deepfake content. The method was implemented in Python and integrated as new modules in the widely used digital forensics application Autopsy. The implemented approach extracts a set of simple features resulting from the application of a Discrete Fourier Transform (DFT) to digital photos and video frames. The model was evaluated with a large dataset of classified multimedia files containing both legitimate and fake photos and frames extracted from videos. Regarding deepfake detection in videos, the Celeb-DFv1 dataset was used, featuring 590 original videos collected from YouTube, and covering different subjects. The results obtained with the 5-fold cross-validation outperformed those of SVM-based methods documented in the literature, by achieving an average F1-score of 99.53%, 79.55%, and 89.10%, respectively for photos, videos, and a mixture of both types of content. A benchmark with state-of-the-art methods was also done, by comparing the proposed SVM method with deep learning approaches, namely Convolutional Neural Networks (CNN). Despite CNN having outperformed the proposed DFT-SVM compound method, the competitiveness of the results attained by DFT-SVM and the substantially reduced processing time make it appropriate to be implemented and embedded into Autopsy modules, by predicting the level of fakeness calculated for each analyzed multimedia file. Full article
(This article belongs to the Special Issue Image and Video Forensics)
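A hedged sketch of the kind of DFT-based feature the method relies on: a radially averaged log power spectrum per photo or frame, fed to an SVM. The exact feature design, frame handling and SVM parameters used in the paper differ.

```python
import numpy as np
from sklearn.svm import SVC

def dft_radial_profile(gray, n_bins=64):
    """Radially averaged log power spectrum of a grayscale image, a compact
    frequency-domain descriptor often used to expose GAN/deepfake artifacts."""
    f = np.fft.fftshift(np.fft.fft2(gray.astype(float)))
    power = np.log1p(np.abs(f) ** 2).ravel()
    h, w = gray.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2).ravel()
    edges = np.linspace(0, r.max(), n_bins + 1)
    idx = np.clip(np.digitize(r, edges) - 1, 0, n_bins - 1)
    return np.array([power[idx == b].mean() for b in range(n_bins)])

# X = np.stack([dft_radial_profile(img) for img in images]); y = labels
clf = SVC(kernel="rbf")
# clf.fit(X, y); clf.predict(X_test)
```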
