Deep Learning Technology and Image Sensing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 20 April 2024 | Viewed by 21977

Special Issue Editors


Prof. Dr. Sukho Lee
Guest Editor
Division of Computer Engineering, Dongseo University, 47 Jurye Road, Sasang Gu, Busan 47011, Republic of Korea
Interests: image deconvolution/restoration; color image compression; computer vision; deep learning

Prof. Dr. Dae-Ki Kang
Guest Editor
Machine Learning/Deep Learning Research Labs, Department of Computer Engineering, Dongseo University, Busan 47011, Republic of Korea
Interests: automated machine learning; adversarial machine learning; multi-agent reinforcement learning; few-shot learning; generative adversarial networks

Special Issue Information

Dear Colleagues,

Deep-learning-based computing technology is significantly improving the quality and reliability of image recognition today. In autonomous driving, for example, the effective performance of the sensors themselves is increasing through deep-learning-based fusion of data from front camera sensors and radars. Other deep-learning-based computer vision technologies help to improve smartphone camera applications such as face recognition, panorama photography, depth/geometry detection, and high-quality magnification and detection. Still other computer vision technologies can now accurately recognize human behavior and posture, which allows human behavior to be used as a tool for human–computer interfaces (HCI) in applications such as the Metaverse. This Special Issue covers all topics related to applications using deep-learning-based image and video sensing technologies.

Topics include but are not limited to:

  • Deep-learning-based image sensing techniques;
  • Deep-learning-based video sensing techniques;
  • Deep-learning-based computer vision algorithms;
  • Deep-learning-based signal processing techniques;
  • Deep-learning-based computational photography.

Prof. Dr. Sukho Lee
Prof. Dr. Dae-Ki Kang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • deep learning
  • image sensing
  • video sensing
  • image sensor
  • video sensor
  • computer vision

Published Papers (11 papers)


Research

17 pages, 3874 KiB  
Article
Hubble Meets Webb: Image-to-Image Translation in Astronomy
by Vitaliy Kinakh, Yury Belousov, Guillaume Quétant, Mariia Drozdova, Taras Holotyak, Daniel Schaerer and Slava Voloshynovskiy
Sensors 2024, 24(4), 1151; https://doi.org/10.3390/s24041151 - 09 Feb 2024
Viewed by 781
Abstract
This work explores the generation of James Webb Space Telescope (JWST) imagery via image-to-image translation from the available Hubble Space Telescope (HST) data. Comparative analysis encompasses the Pix2Pix, CycleGAN, TURBO, and DDPM-based Palette methodologies, assessing the criticality of image registration in astronomy. While the focus of this study is not on the scientific evaluation of model fairness, we note that the techniques employed may bear some limitations and the translated images could include elements that are not present in actual astronomical phenomena. To mitigate this, uncertainty estimation is integrated into our methodology, enhancing the translation’s integrity and assisting astronomers in distinguishing between reliable predictions and those of questionable certainty. The evaluation was performed using metrics including MSE, SSIM, PSNR, LPIPS, and FID. The paper introduces a novel approach to quantifying uncertainty within image translation, leveraging the stochastic nature of DDPMs. This innovation not only bolsters our confidence in the translated images but also provides a valuable tool for future astronomical experiment planning. By offering predictive insights when JWST data are unavailable, our approach allows for informed preparatory strategies for upcoming JWST observations, potentially optimizing its precious observational resources. To the best of our knowledge, this work is the first attempt to apply image-to-image translation for astronomical sensor-to-sensor translation.
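The per-pixel uncertainty estimation described above leverages the stochasticity of diffusion sampling: repeated draws from the same conditional model yield an ensemble whose spread flags unreliable regions. The following minimal sketch is not the authors' code; `sample_translation` is a hypothetical stand-in for one stochastic HST-to-JWST forward pass:

    import numpy as np

    def sample_translation(hst_image: np.ndarray, rng: np.random.Generator) -> np.ndarray:
        # Placeholder for one stochastic forward pass of a diffusion-based translator.
        return hst_image + 0.05 * rng.standard_normal(hst_image.shape)

    def translate_with_uncertainty(hst_image: np.ndarray, n_samples: int = 16, seed: int = 0):
        rng = np.random.default_rng(seed)
        samples = np.stack([sample_translation(hst_image, rng) for _ in range(n_samples)])
        mean_pred = samples.mean(axis=0)   # point estimate of the translated image
        pixel_std = samples.std(axis=0)    # per-pixel uncertainty map
        return mean_pred, pixel_std

Regions with a large pixel_std would be flagged as low-confidence predictions.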

20 pages, 37823 KiB  
Article
A Real-Time Subway Driver Action Sensoring and Detection Based on Lightweight ShuffleNetV2 Network
by Xing Shen and Xiukun Wei
Sensors 2023, 23(23), 9503; https://doi.org/10.3390/s23239503 - 29 Nov 2023
Viewed by 571
Abstract
The driving operations of the subway system are of great significance in ensuring the safety of trains. There are several hand actions defined in the driving instructions that the driver must strictly execute while operating the train. These actions directly indicate whether the equipment is operating normally. Therefore, it is important to automatically sense the region of the driver and detect the driver's actions from surveillance cameras to determine whether the corresponding actions are carried out correctly. In this paper, a lightweight two-stage model for subway driver action sensoring and detection is proposed, consisting of a driver detection network to sense the region of the driver and an action recognition network to recognize the category of an action. The driver detection network adopts the pretrained MobileNetV2-SSDLite. The action recognition network employs an improved ShuffleNetV2, which incorporates a spatial enhanced module (SEM), improved shuffle units (ISUs), and shuffle attention modules (SAMs). SEM is used to enhance the feature maps after convolutional downsampling. ISU introduces a new branch to expand the receptive field of the network. SAM enables the model to focus on important channels and key spatial locations. Experimental results show that the proposed model outperforms the 3D MobileNetV1, 3D MobileNetV3, SlowFast, SlowOnly, and SE-STAD models. Furthermore, a subway driver action sensoring and detection system based on a surveillance camera is built, composed of a video-reading module, a main operation module, and a result-displaying module. The system can perform action sensoring and detection from surveillance cameras directly. According to the runtime analysis, the system meets the requirements for real-time detection.
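As a rough illustration of the two-stage structure (detect the driver region, then classify the action within it), the sketch below uses hypothetical `driver_detector` and `action_classifier` callables standing in for the MobileNetV2-SSDLite and improved ShuffleNetV2 networks; it is not the authors' implementation:

    import numpy as np

    def detect_and_recognize(frame: np.ndarray, driver_detector, action_classifier):
        box = driver_detector(frame)        # (x1, y1, x2, y2) for the driver region, or None
        if box is None:
            return None
        x1, y1, x2, y2 = box
        crop = frame[y1:y2, x1:x2]          # restrict recognition to the driver region
        return action_classifier(crop)      # predicted action category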

20 pages, 7301 KiB  
Article
Deep Learning Framework for Liver Segmentation from T1-Weighted MRI Images
by Md. Sakib Abrar Hossain, Sidra Gul, Muhammad E. H. Chowdhury, Muhammad Salman Khan, Md. Shaheenur Islam Sumon, Enamul Haque Bhuiyan, Amith Khandakar, Maqsud Hossain, Abdus Sadique, Israa Al-Hashimi, Mohamed Arselene Ayari, Sakib Mahmud and Abdulrahman Alqahtani
Sensors 2023, 23(21), 8890; https://doi.org/10.3390/s23218890 - 01 Nov 2023
Viewed by 1321
Abstract
The human liver exhibits variable characteristics and anatomical information, which is often ambiguous in radiological images. Machine learning can be of great assistance in automatically segmenting the liver in radiological images, which can be further processed for computer-aided diagnosis. Magnetic resonance imaging (MRI) is preferred by clinicians for liver pathology diagnosis over volumetric abdominal computed tomography (CT) scans, due to its superior representation of soft tissues. The convenience of Hounsfield unit (HU)-based preprocessing in CT scans is not available in MRI, making automatic segmentation challenging for MR images. This study investigates multiple state-of-the-art segmentation networks for liver segmentation from volumetric MRI images. Here, T1-weighted (in-phase) scans are investigated using expert-labeled liver masks from a public dataset of 20 patients (647 MR slices) from the Combined Healthy Abdominal Organ Segmentation (CHAOS) grand challenge. T1-weighted images are used because they demonstrate brighter fat content, thus providing enhanced images for the segmentation task. Twenty-four different state-of-the-art segmentation networks with varying depths of dense, residual, and inception encoder and decoder backbones were investigated for the task. A novel cascaded network is proposed to segment axial liver slices. The proposed framework outperforms existing approaches reported in the literature for the liver segmentation task (on the same test set), with a Dice similarity coefficient (DSC) of 95.15% and an intersection over union (IoU) of 92.10%.
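For reference, the two reported metrics can be computed from binary masks as in this minimal sketch, which is independent of the cascaded network itself:

    import numpy as np

    def dice_and_iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7):
        # Dice similarity coefficient and intersection over union for binary masks.
        pred, target = pred.astype(bool), target.astype(bool)
        inter = np.logical_and(pred, target).sum()
        union = np.logical_or(pred, target).sum()
        dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
        iou = (inter + eps) / (union + eps)
        return dice, iou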

16 pages, 3435 KiB  
Article
A Novel Approach for Brain Tumor Classification Using an Ensemble of Deep and Hand-Crafted Features
by Hareem Kibriya, Rashid Amin, Jinsul Kim, Marriam Nawaz and Rahma Gantassi
Sensors 2023, 23(10), 4693; https://doi.org/10.3390/s23104693 - 12 May 2023
Cited by 7 | Viewed by 2403
Abstract
Brain tumors, caused by the uncontrollable proliferation of cells inside the skull, are among the most severe types of cancer. Hence, a fast and accurate tumor detection method is critical for the patient’s health. Many automated artificial intelligence (AI) methods have recently been developed to diagnose tumors. These approaches, however, often result in poor performance; hence, there is a need for an efficient technique to perform precise diagnoses. This paper suggests a novel approach for brain tumor detection via an ensemble of deep and hand-crafted feature vectors (FV). The novel FV is an ensemble of hand-crafted features based on the GLCM (gray-level co-occurrence matrix) and deep features based on VGG16. The novel FV contains more robust features than the independent vectors, which improves the suggested method’s discriminating capabilities. The proposed FV is then classified using support vector machine (SVM) and k-nearest neighbor (KNN) classifiers. The framework achieved the highest accuracy of 99% on the ensemble FV. The results indicate the reliability and efficacy of the proposed methodology; hence, radiologists can use it to detect brain tumors through MRI (magnetic resonance imaging). The results show the robustness of the proposed method, which can be deployed in real environments to detect brain tumors from MRI images accurately. In addition, the performance of our model was validated via cross-tabulated data.
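A minimal sketch of such a feature fusion, assuming grayscale uint8 slices, scikit-image/scikit-learn, and a torchvision VGG16 backbone (weights omitted here; pretrained ImageNet weights would normally be loaded), might look as follows; it illustrates the general idea rather than the authors' pipeline:

    import numpy as np
    import torch
    from skimage.feature import graycomatrix, graycoprops
    from sklearn.svm import SVC
    from torchvision.models import vgg16

    def glcm_features(img_u8: np.ndarray) -> np.ndarray:
        # Hand-crafted texture features from the gray-level co-occurrence matrix.
        glcm = graycomatrix(img_u8, distances=[1], angles=[0], levels=256,
                            symmetric=True, normed=True)
        props = ["contrast", "homogeneity", "energy", "correlation"]
        return np.array([graycoprops(glcm, p)[0, 0] for p in props])

    def deep_features(img_u8: np.ndarray, backbone: torch.nn.Module) -> np.ndarray:
        # Deep features: global-average-pooled VGG16 convolutional maps.
        x = torch.from_numpy(img_u8).float().div(255.0)
        x = x.expand(3, -1, -1).unsqueeze(0)        # 1 x 3 x H x W (VGG expects 3 channels)
        with torch.no_grad():
            fmap = backbone.features(x)
        return fmap.mean(dim=(2, 3)).squeeze(0).numpy()

    backbone = vgg16(weights=None).eval()           # load pretrained weights in practice
    # With images (a list of H x W uint8 arrays) and labels y:
    # fvs = [np.concatenate([glcm_features(im), deep_features(im, backbone)]) for im in images]
    # clf = SVC(kernel="rbf").fit(fvs, y)           # or a KNeighborsClassifier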

14 pages, 3022 KiB  
Communication
Identifying the Edges of the Optic Cup and the Optic Disc in Glaucoma Patients by Segmentation
by Srikanth Tadisetty, Ranjith Chodavarapu, Ruoming Jin, Robert J. Clements and Minzhong Yu
Sensors 2023, 23(10), 4668; https://doi.org/10.3390/s23104668 - 11 May 2023
Cited by 3 | Viewed by 1735
Abstract
With recent advancements in artificial intelligence, fundus diseases can be classified automatically for early diagnosis, and this is of interest to many researchers. This study aims to detect the edges of the optic cup and the optic disc in fundus images taken from glaucoma patients, which has further applications in the analysis of the cup-to-disc ratio (CDR). We apply a modified U-Net model architecture to various fundus datasets and use segmentation metrics to evaluate the model. We apply edge detection and dilation to post-process the segmentation and better visualize the optic cup and optic disc. Our model results are based on the ORIGA, RIM-ONE v3, REFUGE, and Drishti-GS datasets. Our results show that our methodology obtains promising segmentation efficiency for CDR analysis.
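For context, once binary cup and disc masks are available, a vertical cup-to-disc ratio can be derived as in this small sketch; it is a generic illustration, not the paper's exact post-processing:

    import numpy as np

    def vertical_cdr(cup_mask: np.ndarray, disc_mask: np.ndarray) -> float:
        # Vertical cup-to-disc ratio from binary H x W segmentation masks.
        def vertical_extent(mask: np.ndarray) -> int:
            rows = np.where(mask.any(axis=1))[0]
            return 0 if rows.size == 0 else int(rows[-1] - rows[0] + 1)
        cup_h, disc_h = vertical_extent(cup_mask), vertical_extent(disc_mask)
        return cup_h / disc_h if disc_h else float("nan")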

15 pages, 23033 KiB  
Article
Kernel Estimation Using Total Variation Guided GAN for Image Super-Resolution
by Jongeun Park, Hansol Kim and Moon Gi Kang
Sensors 2023, 23(7), 3734; https://doi.org/10.3390/s23073734 - 04 Apr 2023
Cited by 2 | Viewed by 1551
Abstract
Various super-resolution (SR) kernels in the degradation model deteriorate the performance of SR algorithms, producing unpleasant artifacts in the output images. Hence, SR kernel estimation has been studied for more than a decade as a way to improve SR performance. In particular, a method named KernelGAN was recently proposed. To estimate the SR kernel from a single image, KernelGAN introduces generative adversarial networks (GANs) that utilize the recurrence of similar structures across scales. Subsequently, an enhanced version of KernelGAN, named E-KernelGAN, was proposed to consider image sharpness and edge thickness. Although it is stable compared to the earlier method, it still encounters challenges in estimating sizable and anisotropic kernels because the structural information of an input image is not sufficiently considered. In this paper, we propose a kernel estimation algorithm called Total Variation Guided KernelGAN (TVG-KernelGAN), which efficiently enables networks to focus on the structural information of an input image. The experimental results show that the proposed algorithm accurately and stably estimates kernels, particularly sizable and anisotropic kernels, both qualitatively and quantitatively. In addition, we compared the results of non-blind SR methods using SR kernel estimation techniques. The results indicate that the performance of the SR algorithms was improved using our proposed method.
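Total variation, the quantity used to expose an image's structural (edge) information, can be written as the sum of absolute finite differences; a minimal PyTorch sketch of the anisotropic form (a generic definition, not the paper's exact guidance term) is:

    import torch

    def total_variation(img: torch.Tensor) -> torch.Tensor:
        # Anisotropic total variation of a B x C x H x W tensor:
        # the sum of absolute vertical and horizontal finite differences.
        dh = (img[..., 1:, :] - img[..., :-1, :]).abs().sum()
        dw = (img[..., :, 1:] - img[..., :, :-1]).abs().sum()
        return dh + dw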

17 pages, 1233 KiB  
Article
Fully Cross-Attention Transformer for Guided Depth Super-Resolution
by Ido Ariav and Israel Cohen
Sensors 2023, 23(5), 2723; https://doi.org/10.3390/s23052723 - 02 Mar 2023
Cited by 3 | Viewed by 3828
Abstract
Modern depth sensors are often characterized by low spatial resolution, which hinders their use in real-world applications. However, the depth map in many scenarios is accompanied by a corresponding high-resolution color image. In light of this, learning-based methods have been extensively used for guided super-resolution of depth maps. A guided super-resolution scheme uses a corresponding high-resolution color image to infer high-resolution depth maps from low-resolution ones. Unfortunately, these methods still suffer from texture copying problems due to improper guidance from the color images. Specifically, in most existing methods, guidance from the color image is achieved by a naive concatenation of color and depth features. In this paper, we propose a fully transformer-based network for depth map super-resolution. A cascaded transformer module extracts deep features from a low-resolution depth map. It incorporates a novel cross-attention mechanism to seamlessly and continuously guide the color image into the depth upsampling process. Using a window partitioning scheme, linear complexity in image resolution can be achieved, so the method can be applied to high-resolution images. Extensive experiments show that the proposed guided depth super-resolution method outperforms other state-of-the-art methods.
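The core idea of cross-attention guidance, with depth features as queries and color features as keys/values, can be sketched with a standard attention layer as below; this is a generic illustration under assumed token shapes, not the paper's architecture:

    import torch
    import torch.nn as nn

    class CrossAttentionFusion(nn.Module):
        # Depth tokens attend to color tokens: queries come from the depth branch,
        # keys/values from the high-resolution color (guidance) branch.
        def __init__(self, dim: int = 64, heads: int = 4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, depth_tokens: torch.Tensor, color_tokens: torch.Tensor) -> torch.Tensor:
            fused, _ = self.attn(query=depth_tokens, key=color_tokens, value=color_tokens)
            return self.norm(depth_tokens + fused)    # residual connection

    # Example: one image, 256 tokens per branch, 64-dimensional features.
    out = CrossAttentionFusion()(torch.randn(1, 256, 64), torch.randn(1, 256, 64))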

14 pages, 14818 KiB  
Article
Deep Non-Line-of-Sight Imaging Using Echolocation
by Seungwoo Jang, Ui-Hyeon Shin and Kwangsu Kim
Sensors 2022, 22(21), 8477; https://doi.org/10.3390/s22218477 - 03 Nov 2022
Cited by 2 | Viewed by 2159
Abstract
Non-line-of-sight (NLOS) imaging aims to visualize hidden scenes from an observer’s (e.g., camera) viewpoint. Typically, hidden scenes are reconstructed from diffuse signals that are emitted by light sources in optical equipment and reflected multiple times. Optical systems are commonly adopted in NLOS imaging because lasers can transport energy and focus light over long distances without loss. In contrast, we propose NLOS imaging using acoustic equipment, inspired by echolocation. Existing acoustic NLOS is a computational method, motivated by seismic imaging, that analyzes the geometry of underground structures. However, this physical method is susceptible to noise and requires a clear signal, resulting in long data acquisition times. Therefore, we reduced the scan time by collecting the echoes simultaneously rather than sequentially. We then propose end-to-end deep-learning models to overcome the challenge of echoes interfering with each other. We designed three distinctive architectures: an encoder that extracts features by dividing multi-channel echoes into groups and merging them hierarchically, a generator that constructs an image of the hidden object, and a discriminator that compares the generated image with the ground-truth image. The proposed model successfully reconstructed the outlines of the hidden objects.

16 pages, 8387 KiB  
Article
Instance-Level Contrastive Learning for Weakly Supervised Object Detection
by Ming Zhang and Bing Zeng
Sensors 2022, 22(19), 7525; https://doi.org/10.3390/s22197525 - 04 Oct 2022
Cited by 1 | Viewed by 1875
Abstract
Weakly supervised object detection (WSOD) has received increasing attention in the object detection field because it only requires image-level annotations to indicate the presence or absence of target objects, which greatly reduces labeling costs. Existing methods usually focus on the current individual image to learn object instance representations, while ignoring instance correlations between different images. To address this problem, we propose an instance-level contrastive learning (ICL) framework to mine reliable instance representations from all learned images, and use the contrastive loss to guide instance representation learning for the current image. Due to the diversity of instances, with different appearances, sizes, or shapes, we propose an instance-diverse memory updating (IMU) algorithm to mine different instance representations and store them in a memory bank with multiple representation vectors per class, which also considers background information to enhance foreground representations. With the help of the memory bank, we further propose a memory-aware instance mining (MIM) algorithm that combines proposal confidence and instance similarity across images to mine more reliable object instances. In addition, we propose a memory-aware proposal sampling (MPS) algorithm to sample more positive proposals and remove some negative proposals to balance the learning of positive and negative samples. We conduct extensive experiments on the PASCAL VOC2007 and VOC2012 datasets, which are widely used in WSOD, to demonstrate the effectiveness of our method. Compared to our baseline, our method brings 14.2% mAP and 13.4% CorLoc gains on the PASCAL VOC2007 dataset, and 12.2% mAP and 8.3% CorLoc gains on the PASCAL VOC2012 dataset.
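The contrastive objective at the heart of such a framework can be illustrated with an InfoNCE-style loss against a memory bank; the sketch below is a generic formulation under assumed shapes, not the paper's exact loss:

    import torch
    import torch.nn.functional as F

    def instance_contrastive_loss(feat: torch.Tensor, pos_bank: torch.Tensor,
                                  neg_bank: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
        # Pull an instance feature toward same-class memory vectors (pos_bank, P x D)
        # and push it away from other-class vectors (neg_bank, N x D); feat is D-dimensional.
        feat = F.normalize(feat, dim=0)
        pos = F.normalize(pos_bank, dim=1) @ feat / tau
        neg = F.normalize(neg_bank, dim=1) @ feat / tau
        logits = torch.cat([pos, neg])
        log_prob = logits - torch.logsumexp(logits, dim=0)
        return -log_prob[: pos.numel()].mean()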

12 pages, 4137 KiB  
Article
Low-Light Image Enhancement Using Hybrid Deep-Learning and Mixed-Norm Loss Functions
by JongGeun Oh and Min-Cheol Hong
Sensors 2022, 22(18), 6904; https://doi.org/10.3390/s22186904 - 13 Sep 2022
Cited by 4 | Viewed by 2642
Abstract
This study introduces a low-light image enhancement method using a hybrid deep-learning network and mixed-norm loss functions, in which the network consists of a decomposition-net, an illumination enhance-net, and a chroma-net. To consider the correlation between the R, G, and B channels, YCbCr channels converted from the RGB channels are used for the training and restoration processes. Operating on the luminance, the decomposition-net aims to decouple the reflectance and illuminance and to learn the reflectance, leading to a more accurate feature map with noise reduction. The illumination enhance-net connected to the decomposition-net is used to enhance the illumination, so that the illuminance is improved with reduced halo artifacts. In addition, the chroma-net is independently used to reduce color distortion. Moreover, a mixed-norm loss function used in the training process of each network is described; it increases stability and removes blurring in the reconstructed image by reflecting the properties of reflectance, illuminance, and chroma. The experimental results demonstrate that the proposed method leads to promising subjective and objective improvements over state-of-the-art deep-learning methods.
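A mixed-norm reconstruction loss typically blends an L1 term (edge fidelity) with an L2 term (smooth convergence); the sketch below shows one generic weighted form, not the specific per-network losses used in the paper:

    import torch
    import torch.nn.functional as F

    def mixed_norm_loss(pred: torch.Tensor, target: torch.Tensor, alpha: float = 0.8) -> torch.Tensor:
        # Weighted combination of an L1 term and an L2 (MSE) term.
        return alpha * F.l1_loss(pred, target) + (1.0 - alpha) * F.mse_loss(pred, target)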

18 pages, 3395 KiB  
Article
A Novel Quick-Response Eigenface Analysis Scheme for Brain–Computer Interfaces
by Hojong Choi, Junghun Park and Yeon-Mo Yang
Sensors 2022, 22(15), 5860; https://doi.org/10.3390/s22155860 - 05 Aug 2022
Cited by 13 | Viewed by 1749
Abstract
The brain–computer interface (BCI) is used to understand brain activity and control external devices with the help of motor imagery (MI). To date, the classification results for the four-class EEG BCI competition datasets have been improved to provide better classification accuracy for BCI systems. Based on this observation, a novel quick-response eigenface analysis (QR-EFA) scheme for motor imagery is proposed to improve the classification accuracy of BCIs. We consider BCI signals in a standardized and sharable quick response (QR) image domain; then, we systematically combine EFA and a convolutional neural network (CNN) to classify the neuro images. To overcome the non-stationary and non-ergodic characteristics of the available BCI datasets, we utilize effective neuro data augmentation in the training phase. For the ultimate improvement in classification performance, QR-EFA maximizes the similarities existing in the domain-, trial-, and subject-wise directions. To validate and verify the proposed scheme, we performed experiments on the BCI datasets. Specifically, the scheme is intended to provide higher classification accuracy for BCI competition 4 dataset 2a (C4D2a_4C) and BCI competition 3 dataset 3a (C3D3a_4C). The experimental results confirm that the newly proposed QR-EFA method outperforms previously published results, improving accuracy from 85.4% to 97.87% ± 0.75 for C4D2a_4C and to 88.21% ± 6.02 for C3D3a_4C. Therefore, the proposed QR-EFA could be a highly reliable and constructive MI classification framework for BCI applications.
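Eigenface-style analysis amounts to projecting vectorized images onto the principal components of the training set; a minimal PCA-based sketch (a generic illustration, not the QR-EFA pipeline) is shown below:

    import numpy as np
    from sklearn.decomposition import PCA

    def fit_eigenfaces(images: np.ndarray, n_components: int = 32) -> PCA:
        # images: N x H x W array; the fitted PCA components act as "eigenfaces".
        flat = images.reshape(len(images), -1)
        return PCA(n_components=n_components, whiten=True).fit(flat)

    def project(pca: PCA, image: np.ndarray) -> np.ndarray:
        # Low-dimensional feature vector for a downstream classifier (e.g., a CNN).
        return pca.transform(image.reshape(1, -1))[0]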
