
Deep Learning-Based Image and Signal Sensing and Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 15520

Special Issue Editors


Prof. Dr. Jian-Jiun Ding
Guest Editor
Graduate Institute of Communication Engineering, National Taiwan University, Taipei 10617, Taiwan
Interests: digital signal processing; digital image processing

Prof. Dr. Feng-Tsun Chien
Guest Editor
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
Interests: signal processing; deep learning; green learning; wireless communications

Dr. Chih-Chang Yu
Guest Editor
Department of Information and Computer Engineering, Chung Yuan Christian University, Taoyuan 320314, Taiwan
Interests: pattern recognition; computer vision; deep learning; learning analytics

Special Issue Information

Dear Colleagues,

Deep learning is highly effective in signal sensing, computer vision, and object recognition, and many of the advanced image and signal sensing and processing algorithms proposed in recent years are based on it. Deep learning is therefore a critical technique in both image sensing and signal sensing. In image processing, deep learning techniques have been widely applied to object detection, object recognition, object tracking, image denoising, image quality improvement, and medical image analysis. In signal processing, deep learning techniques can be applied to speech recognition, musical signal recognition, source separation, signal quality improvement, ECG and EEG signal analysis, and medical signal processing. Deep learning techniques are thus important for both academic research and product design. In this Special Issue, we encourage authors to submit manuscripts related to algorithms, architectures, solutions, and applications that adopt deep learning techniques. Potential topics include, but are not limited to:

  • Face detection and recognition
  • Learning-based object detection
  • Learning-based object tracking and ReID
  • Hand gesture recognition
  • Human motion recognition
  • Semantic, instance, and panoptic segmentation
  • Image denoising and quality enhancement
  • Medical image processing
  • Learning-based speech recognition
  • Music signal recognition
  • Source separation and echo removal for vocal signals
  • Signal denoising and quality improvement
  • Medical signal analysis

Prof. Dr. Jian-Jiun Ding
Prof. Dr. Feng-Tsun Chien
Dr. Chih-Chang Yu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • sensing
  • object detection
  • object recognition
  • tracking
  • medical image processing
  • denoising
  • signal enhancement
  • speech
  • music signal recognition
  • medical signal processing

Published Papers (11 papers)


Research

19 pages, 2849 KiB  
Article
A Lightweight Image Super-Resolution Reconstruction Algorithm Based on the Residual Feature Distillation Mechanism
by Zihan Yu, Kai Xie, Chang Wen, Jianbiao He and Wei Zhang
Sensors 2024, 24(4), 1049; https://doi.org/10.3390/s24041049 - 06 Feb 2024
Viewed by 732
Abstract
In recent years, the development of image super-resolution (SR) has explored the capabilities of convolutional neural networks (CNNs). The current research tends to use deeper CNNs to improve performance. However, blindly increasing the depth of the network does not effectively enhance its performance. Moreover, as the network depth increases, more issues arise during the training process, requiring additional training techniques. In this paper, we propose a lightweight image super-resolution reconstruction algorithm (SISR-RFDM) based on the residual feature distillation mechanism (RFDM). Building upon residual blocks, we introduce spatial attention (SA) modules to provide more informative cues for recovering high-frequency details such as image edges and textures. Additionally, the output of each residual block is utilized as hierarchical features for global feature fusion (GFF), enhancing inter-layer information flow and feature reuse. Finally, all these features are fed into the reconstruction module to restore high-quality images. Experimental results demonstrate that our proposed algorithm outperforms other comparative algorithms in terms of both subjective visual effects and objective evaluation quality. The peak signal-to-noise ratio (PSNR) is improved by 0.23 dB, and the structural similarity index (SSIM) reaches 0.9607. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
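To make the spatial-attention idea in the abstract above concrete, the following PyTorch sketch gates a residual block's output with a channel-pooled spatial-attention map. It is an illustrative approximation, not the authors' implementation; the module names, kernel size, and channel count are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Channel-pooled spatial attention gate (illustrative, not the paper's exact SA module)."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_pool = x.mean(dim=1, keepdim=True)   # (B, 1, H, W)
        max_pool = x.amax(dim=1, keepdim=True)   # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * attn                          # re-weight spatial positions

class ResidualSABlock(nn.Module):
    """Residual block whose body output is refined by spatial attention before the skip connection."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.sa = SpatialAttention()

    def forward(self, x):
        return x + self.sa(self.body(x))

feat = torch.randn(1, 64, 48, 48)       # a low-resolution feature map
out = ResidualSABlock(64)(feat)         # same shape, attention-refined
```

In a full SR network, the outputs of several such blocks would be collected as hierarchical features for global feature fusion before the upsampling/reconstruction module.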

12 pages, 2446 KiB  
Article
Recovery-Based Occluded Face Recognition by Identity-Guided Inpainting
by Honglei Li, Yifan Zhang, Wenmin Wang, Shenyong Zhang and Shixiong Zhang
Sensors 2024, 24(2), 394; https://doi.org/10.3390/s24020394 - 09 Jan 2024
Viewed by 714
Abstract
Occlusion in facial photos poses a significant challenge for machine detection and recognition. Consequently, occluded face recognition for camera-captured images has emerged as a prominent and widely discussed topic in computer vision. The present standard face recognition methods have achieved remarkable performance in unoccluded face recognition but performed poorly when directly applied to occluded face datasets. The main reason lies in the absence of identity cues caused by occlusions. Therefore, a direct idea of recovering the occluded areas through an inpainting model has been proposed. However, existing inpainting models based on an encoder-decoder structure are limited in preserving inherent identity information. To solve the problem, we propose ID-Inpainter, an identity-guided face inpainting model, which preserves the identity information to the greatest extent through a more accurate identity sampling strategy and a GAN-like fusing network. We conduct recognition experiments on the occluded face photographs from the LFW, CFP-FP, and AgeDB-30 datasets, and the results indicate that our method achieves state-of-the-art performance in identity-preserving inpainting, and dramatically improves the accuracy of normal recognizers in occluded face recognition. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)

14 pages, 5282 KiB  
Article
A Real-Time Flame Detection Method Using Deformable Object Detection and Time Sequence Analysis
by Jingyuan Zhang, Bo Shi, Bin Chen, Heping Chen and Wangming Xu
Sensors 2023, 23(20), 8616; https://doi.org/10.3390/s23208616 - 21 Oct 2023
Cited by 1 | Viewed by 953
Abstract
Timely and accurate flame detection is a very important and practical technology for preventing the occurrence of fire accidents effectively. However, the current methods of flame detection are still faced with many challenges in video surveillance scenarios due to issues such as varying flame shapes, imbalanced samples, and interference from flame-like objects. In this work, a real-time flame detection method based on deformable object detection and time sequence analysis is proposed to address these issues. Firstly, based on the existing single-stage object detection network YOLOv5s, the network structure is improved by introducing deformable convolution to enhance the feature extraction ability for irregularly shaped flames. Secondly, the loss function is improved by using Focal Loss as the classification loss function to solve the problems of the imbalance of positive (flames) and negative (background) samples, as well as the imbalance of easy and hard samples, and by using EIOU Loss as the regression loss function to solve the problems of a slow convergence speed and inaccurate regression position in network training. Finally, a time sequence analysis strategy is adopted to comprehensively analyze the flame detection results of the current frame and historical frames in the surveillance video, alleviating false alarms caused by flame shape changes, flame occlusion, and flame-like interference. The experimental results indicate that the average precision (AP) and the F-Measure index of flame detection using the proposed method reach 93.0% and 89.6%, respectively, both of which are superior to the compared methods, and the detection speed is 24–26 FPS, meeting the real-time requirements of video flame detection. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
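Two ingredients named in the abstract lend themselves to short sketches: the focal classification loss used to rebalance flame/background samples, and the time-sequence analysis over consecutive frames. The Python below is an illustrative approximation rather than the authors' code; the alpha/gamma values and the k-of-n voting rule are assumptions.

```python
import torch
import torch.nn.functional as F
from collections import deque

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss on raw logits; targets are 0/1 floats. alpha/gamma are assumed values."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)               # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()         # down-weights easy negatives

class TemporalVoter:
    """Raise an alarm only when flames are detected in at least k of the last n frames."""
    def __init__(self, n=10, k=6):
        self.history = deque(maxlen=n)
        self.k = k

    def update(self, flame_detected: bool) -> bool:
        self.history.append(flame_detected)
        return sum(self.history) >= self.k
```

The voter illustrates how per-frame detections can be smoothed over time to suppress false alarms from brief flame-like interference.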

15 pages, 3236 KiB  
Article
Content-Seam-Preserving Multi-Alignment Network for Visual-Sensor-Based Image Stitching
by Xiaoting Fan, Long Sun, Zhong Zhang, Shuang Liu and Tariq S. Durrani
Sensors 2023, 23(17), 7488; https://doi.org/10.3390/s23177488 - 29 Aug 2023
Viewed by 792
Abstract
As an important representation of scenes in virtual reality and augmented reality, image stitching aims to generate a panoramic image with a natural field-of-view by stitching multiple images together, which are captured by different visual sensors. Existing deep-learning-based methods for image stitching only conduct a single deep homography to perform image alignment, which may produce inevitable alignment distortions. To address this issue, we propose a content-seam-preserving multi-alignment network (CSPM-Net) for visual-sensor-based image stitching, which could preserve the image content consistency and avoid seam distortions simultaneously. Firstly, a content-preserving deep homography estimation was designed to pre-align the input image pairs and reduce the content inconsistency. Secondly, an edge-assisted mesh warping was conducted to further align the image pairs, where the edge information is introduced to eliminate seam artifacts. Finally, in order to predict the final stitched image accurately, a content consistency loss was designed to preserve the geometric structure of overlapping regions between image pairs, and a seam smoothness loss is proposed to eliminate the edge distortions of image boundaries. Experimental results demonstrated that the proposed image-stitching method can provide favorable stitching results for visual-sensor-based images and outperform other state-of-the-art methods. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
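For contrast with the learned multi-alignment approach above, the OpenCV sketch below shows the classical single-homography baseline (ORB features plus RANSAC) whose alignment distortions and visible seams motivate CSPM-Net. It is not the paper's method, and the match count and RANSAC threshold are assumptions.

```python
import cv2
import numpy as np

def stitch_pair(img_left, img_right):
    """Single-homography stitching baseline: ORB matching, RANSAC homography, naive overlay."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img_left, None)
    k2, d2 = orb.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)      # warp right image onto left
    h, w = img_left.shape[:2]
    canvas = cv2.warpPerspective(img_right, H, (w * 2, h))
    canvas[0:h, 0:w] = img_left                               # naive overlay: seams remain visible
    return canvas
```

A single global homography cannot model parallax in the overlap region, which is exactly what the content-preserving pre-alignment and edge-assisted mesh warping above are designed to address.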

22 pages, 5577 KiB  
Article
High-Speed Tracking with Mutual Assistance of Feature Filters and Detectors
by Akira Matsuo and Yuji Yamakawa
Sensors 2023, 23(16), 7082; https://doi.org/10.3390/s23167082 - 10 Aug 2023
Viewed by 904
Abstract
Object detection and tracking in camera images is a fundamental technology for computer vision and is used in various applications. In particular, object tracking using high-speed cameras is expected to be applied to real-time control in robotics. Therefore, it is required to increase tracking speed and detection accuracy. Currently, however, it is difficult to achieve both of those things simultaneously. In this paper, we propose a tracking method that combines multiple methods: correlation filter-based object tracking, deep learning-based object detection, and motion detection with background subtraction. The algorithms work in parallel and assist each other’s processing to improve the overall performance of the system. We named it the “Mutual Assist tracker of feature Filters and Detectors (MAFiD method)”. This method aims to achieve both high-speed tracking of moving objects and high detection accuracy. Experiments were conducted to verify the detection performance and processing speed by tracking a transparent capsule moving at high speed. The results show that the tracking speed was 618 frames per second (FPS) and the accuracy was 86% for Intersection over Union (IoU). The detection latency was 3.48 ms. These experimental scores are higher than those of conventional methods, indicating that the MAFiD method achieved fast object tracking while maintaining high detection performance. This proposal will contribute to the improvement of object-tracking technology. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
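One of the three cues that MAFiD combines is motion detection via background subtraction. The sketch below uses OpenCV's MOG2 subtractor to produce candidate moving regions that a correlation-filter tracker or deep detector could then verify; the thresholds and morphology settings are assumptions, and this is not the authors' high-speed implementation.

```python
import cv2

def motion_regions(video_path, min_area=200):
    """Yield (frame, boxes) pairs, where boxes are candidate moving-object regions."""
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)                               # foreground mask
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)        # remove speckle noise
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
        yield frame, boxes
    cap.release()
```

Running such a lightweight motion cue in parallel with a correlation filter and a deep detector is the kind of mutual-assistance arrangement the abstract describes.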

21 pages, 32835 KiB  
Article
Enhancing UAV Detection in Surveillance Camera Videos through Spatiotemporal Information and Optical Flow
by Yu Sun, Xiyang Zhi, Haowen Han, Shikai Jiang, Tianjun Shi, Jinnan Gong and Wei Zhang
Sensors 2023, 23(13), 6037; https://doi.org/10.3390/s23136037 - 29 Jun 2023
Cited by 5 | Viewed by 1463
Abstract
The growing intelligence and prevalence of drones have led to an increase in their disorderly and illicit usage, posing substantial risks to aviation and public safety. This paper focuses on addressing the issue of drone detection through surveillance cameras. Drone targets in images possess distinctive characteristics, including small size, weak energy, low contrast, and limited and varying features, rendering precise detection a challenging task. To overcome these challenges, we propose a novel detection method that extends the input of YOLOv5s to a continuous sequence of images and inter-frame optical flow, emulating the visual mechanisms employed by humans. By incorporating the image sequence as input, our model can leverage both temporal and spatial information, extracting more features of small and weak targets through the integration of spatiotemporal data. This integration augments the accuracy and robustness of drone detection. Furthermore, the inclusion of optical flow enables the model to directly perceive the motion information of drone targets across consecutive frames, enhancing its ability to extract and utilize features from dynamic objects. Comparative experiments demonstrate that our proposed method of extended input significantly enhances the network’s capability to detect small moving targets, showcasing competitive performance in terms of accuracy and speed. Specifically, our method achieves a final average precision of 86.87%, representing a noteworthy 11.49% improvement over the baseline, and the speed remains above 30 frames per second. Additionally, our approach is adaptable to other detection models with different backbones, providing valuable insights for domains such as Urban Air Mobility and autonomous driving. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
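The core idea above is to extend the detector's input from a single frame to a frame sequence plus inter-frame optical flow. The sketch below builds such a stacked input with Farneback optical flow from OpenCV; the number of frames and the channel ordering used in the paper are assumptions here.

```python
import cv2
import numpy as np

def build_spatiotemporal_input(frames):
    """Stack consecutive grayscale frames with dense optical flow as extra input channels."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = []
    for prev, curr in zip(grays[:-1], grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            pyr_scale=0.5, levels=3, winsize=15,
                                            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        flows.append(flow)                                    # (H, W, 2): dx, dy per pixel
    channels = [g[..., None].astype(np.float32) / 255.0 for g in grays]
    channels += [f.astype(np.float32) for f in flows]
    return np.concatenate(channels, axis=-1)                  # (H, W, len(frames) + 2*(len(frames)-1))
```

Feeding this tensor to a detector (in place of a single RGB frame) gives the network direct access to the temporal and motion information that makes small, weak drone targets easier to separate from the background.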

17 pages, 3816 KiB  
Article
Research on Apple Recognition Algorithm in Complex Orchard Environment Based on Deep Learning
by Zhuoqun Zhao, Jiang Wang and Hui Zhao
Sensors 2023, 23(12), 5425; https://doi.org/10.3390/s23125425 - 08 Jun 2023
Cited by 2 | Viewed by 1181
Abstract
In the complex environment of orchards, in view of the low recognition accuracy, poor real-time performance, and poor robustness of traditional fruit recognition algorithms, this paper proposes an improved fruit recognition algorithm based on deep learning. Firstly, the residual module was assembled with the cross stage partial network (CSPNet) to optimize recognition performance and reduce the computing burden of the network. Secondly, the spatial pyramid pooling (SPP) module is integrated into the recognition network of YOLOv5 to blend the local and global features of the fruit, thus improving the recall rate for small fruit targets. Meanwhile, the NMS algorithm was replaced by the Soft-NMS algorithm to enhance the ability to identify overlapped fruits. Finally, a joint loss function was constructed based on focal and CIoU loss to optimize the algorithm, and the recognition accuracy was significantly improved. The test results show that the mAP value of the improved model after dataset training reaches 96.3% on the test set, which is 3.8% higher than the original model. The F1 value reaches 91.8%, which is 3.8% higher than the original model. The average detection speed under GPU reaches 27.8 frames/s, which is 5.6 frames/s higher than the original model. Compared with current advanced detection methods such as Faster RCNN and RetinaNet, among others, the test results show that this method has excellent detection accuracy, good robustness, and real-time performance, and has important reference value for solving the problem of accurate fruit recognition in complex environments. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
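One concrete component mentioned above is the replacement of NMS by Soft-NMS to keep overlapping fruit detections. The NumPy sketch below implements the Gaussian-decay variant of Soft-NMS; it is a generic illustration, and the sigma and score threshold are assumed values rather than the authors' configuration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of discarding them."""
    boxes, scores = boxes.copy(), scores.copy()
    kept = []
    while len(boxes):
        i = int(np.argmax(scores))
        kept.append((boxes[i], float(scores[i])))
        box = boxes[i]
        boxes = np.delete(boxes, i, axis=0)
        scores = np.delete(scores, i, axis=0)
        scores = scores * np.exp(-(iou(box, boxes) ** 2) / sigma)   # decay neighbours by overlap
        mask = scores > score_thresh
        boxes, scores = boxes[mask], scores[mask]
    return kept

# Example: two heavily overlapping apples plus one separate apple.
boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], dtype=float)
print(soft_nms(boxes, np.array([0.9, 0.8, 0.7])))
```

Because overlapping boxes are only down-weighted, two touching apples can both survive suppression, which is the behaviour the abstract targets.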

18 pages, 2308 KiB  
Article
Obstacle Detection System for Navigation Assistance of Visually Impaired People Based on Deep Learning Techniques
by Yahia Said, Mohamed Atri, Marwan Ali Albahar, Ahmed Ben Atitallah and Yazan Ahmad Alsariera
Sensors 2023, 23(11), 5262; https://doi.org/10.3390/s23115262 - 01 Jun 2023
Cited by 5 | Viewed by 2813
Abstract
Visually impaired people seek social integration, yet their mobility is restricted. They need a personal navigation system that can provide privacy and increase their confidence for better life quality. In this paper, based on deep learning and neural architecture search (NAS), we propose an intelligent navigation assistance system for visually impaired people. The deep learning model has achieved significant success through well-designed architecture. Subsequently, NAS has proved to be a promising technique for automatically searching for the optimal architecture and reducing human efforts for architecture design. However, this new technique requires extensive computation, limiting its wide use. Due to its high computation requirement, NAS has been less investigated for computer vision tasks, especially object detection. Therefore, we propose a fast NAS to search for an object detection framework by considering efficiency. The NAS will be used to explore the feature pyramid network and the prediction stage for an anchor-free object detection model. The proposed NAS is based on a tailored reinforcement learning technique. The searched model was evaluated on a combination of the Coco dataset and the Indoor Object Detection and Recognition (IODR) dataset. The resulting model outperformed the original model by 2.6% in average precision (AP) with acceptable computation complexity. The achieved results proved the efficiency of the proposed NAS for custom object detection. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
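The paper searches the feature pyramid and prediction head with a tailored reinforcement-learning controller. As a deliberately simplified stand-in, the sketch below shows only the bare sample-evaluate-keep-best loop of architecture search using random sampling; the search space and the reward function are placeholders, not the authors' design.

```python
import random

# Hypothetical search space for an anchor-free detector's FPN and head (illustrative only).
SEARCH_SPACE = {
    "fpn_channels": [64, 96, 128],
    "num_fpn_levels": [3, 4, 5],
    "head_depth": [2, 3, 4],
}

def sample_architecture():
    """Draw one candidate architecture uniformly at random from the search space."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    """Placeholder reward; in practice this would train/evaluate a proxy detector and return its AP."""
    return -abs(arch["fpn_channels"] - 96) / 96 + 0.1 * arch["num_fpn_levels"]

best_arch, best_reward = None, float("-inf")
for _ in range(50):                       # fixed search budget
    arch = sample_architecture()
    reward = evaluate(arch)
    if reward > best_reward:
        best_arch, best_reward = arch, reward
print(best_arch, best_reward)
```

An RL-based NAS replaces the uniform sampler with a learned controller that is updated from the rewards, so the sampling distribution gradually concentrates on high-scoring architectures.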

17 pages, 8781 KiB  
Article
Tool Wear Condition Monitoring Method Based on Deep Learning with Force Signals
by Yaping Zhang, Xiaozhi Qi, Tao Wang and Yuanhang He
Sensors 2023, 23(10), 4595; https://doi.org/10.3390/s23104595 - 09 May 2023
Cited by 5 | Viewed by 2089
Abstract
Tool wear condition monitoring is an important component of mechanical processing automation, and accurately identifying the wear status of tools can improve processing quality and production efficiency. This paper studied a new deep learning model, to identify the wear status of tools. The force signal was transformed into a two-dimensional image using continuous wavelet transform (CWT), short-time Fourier transform (STFT), and Gramian angular summation field (GASF) methods. The generated images were then fed into the proposed convolutional neural network (CNN) model for further analysis. The calculation results show that the accuracy of tool wear state recognition proposed in this paper was above 90%, which was higher than the accuracy of AlexNet, ResNet, and other models. The accuracy of the images generated using the CWT method and identified with the CNN model was the highest, which is attributed to the fact that the CWT method can extract local features of an image and is less affected by noise. Comparing the precision and recall values of the model, it was verified that the image obtained by the CWT method had the highest accuracy in identifying tool wear state. These results demonstrate the potential advantages of using a force signal transformed into a two-dimensional image for tool wear state recognition and of applying CNN models in this area. They also indicate the wide application prospects of this method in industrial production. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
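The pipeline above converts a 1-D force signal into a 2-D image before feeding it to a CNN. The sketch below produces such a scalogram with a continuous wavelet transform (PyWavelets, Morlet wavelet) on a synthetic signal; the synthetic data, the wavelet choice, and the scale range are assumptions standing in for real cutting-force measurements.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

# Synthetic force signal standing in for a real cutting-force measurement.
fs = 1000                                   # sampling rate in Hz
t = np.linspace(0, 1, fs, endpoint=False)
force = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t) + 0.2 * np.random.randn(fs)

# Continuous wavelet transform with a Morlet wavelet -> 2-D time-scale "image".
scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(force, scales, 'morl', sampling_period=1 / fs)

plt.imshow(np.abs(coeffs), aspect='auto', origin='lower', cmap='viridis')
plt.xlabel('time sample')
plt.ylabel('scale')
plt.savefig('scalogram.png')                # the image a CNN wear-state classifier would consume
```

STFT spectrograms and Gramian angular summation fields can be generated analogously, which is how the abstract compares the three signal-to-image encodings.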

17 pages, 2000 KiB  
Article
PW-360IQA: Perceptually-Weighted Multichannel CNN for Blind 360-Degree Image Quality Assessment
by Abderrezzaq Sendjasni and Mohamed-Chaker Larabi
Sensors 2023, 23(9), 4242; https://doi.org/10.3390/s23094242 - 24 Apr 2023
Cited by 4 | Viewed by 1247
Abstract
Image quality assessment of 360-degree images is still in its early stages, especially when it comes to solutions that rely on machine learning. There are many challenges to be addressed related to training strategies and model architecture. In this paper, we propose a perceptually weighted multichannel convolutional neural network (CNN) using a weight-sharing strategy for 360-degree IQA (PW-360IQA). Our approach involves extracting visually important viewports based on several visual scan-path predictions, which are then fed to a multichannel CNN using DenseNet-121 as the backbone. In addition, we account for users’ exploration behavior and human visual system (HVS) properties by using information regarding visual trajectory and distortion probability maps. The inter-observer variability is integrated by leveraging different visual scan-paths to enrich the training data. PW-360IQA is designed to learn the local quality of each viewport and its contribution to the overall quality. We validate our model on two publicly available datasets, CVIQ and OIQA, and demonstrate that it performs robustly. Furthermore, the adopted strategy considerably decreases the complexity when compared to the state-of-the-art, enabling the model to attain comparable, if not better, results while requiring less computational complexity. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
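A central design choice above is the weight-sharing multichannel CNN: every extracted viewport passes through the same DenseNet-121 backbone, and per-viewport scores are combined into a global quality score. The PyTorch sketch below illustrates only that sharing pattern; the simple average pooling and head dimensions are assumptions, and the perceptual weighting and scan-path handling of PW-360IQA are omitted.

```python
import torch
import torch.nn as nn
from torchvision import models

class SharedBackboneIQA(nn.Module):
    """All viewports share one DenseNet-121; local scores are averaged into a global score."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet121(weights=None)
        self.features = backbone.features                    # shared across all viewports
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(1024, 1))

    def forward(self, viewports):                             # viewports: (B, V, 3, H, W)
        b, v, c, h, w = viewports.shape
        x = viewports.reshape(b * v, c, h, w)                 # fold viewports into the batch
        local_q = self.head(self.features(x)).reshape(b, v)   # one quality score per viewport
        return local_q.mean(dim=1)                            # naive average as the global score

scores = SharedBackboneIQA()(torch.randn(2, 6, 3, 224, 224))  # e.g., 6 viewports per 360-degree image
```

Weight sharing keeps the parameter count independent of the number of viewports, which is one reason the multichannel design stays comparatively lightweight.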

16 pages, 15058 KiB  
Article
SR-FEINR: Continuous Remote Sensing Image Super-Resolution Using Feature-Enhanced Implicit Neural Representation
by Jinming Luo, Lei Han, Xianjie Gao, Xiuping Liu and Weiming Wang
Sensors 2023, 23(7), 3573; https://doi.org/10.3390/s23073573 - 29 Mar 2023
Cited by 2 | Viewed by 1603
Abstract
Remote sensing images often have limited resolution, which can hinder their effectiveness in various applications. Super-resolution techniques can enhance the resolution of remote sensing images, and arbitrary resolution super-resolution techniques provide additional flexibility in choosing appropriate image resolutions for different tasks. However, for subsequent processing, such as detection and classification, the resolution of the input image may vary greatly for different methods. In this paper, we propose a method for continuous remote sensing image super-resolution using feature-enhanced implicit neural representation (SR-FEINR). Continuous remote sensing image super-resolution means users can scale a low-resolution image into an image with arbitrary resolution. Our algorithm is composed of three main components: a low-resolution image feature extraction module, a positional encoding module, and a feature-enhanced multi-layer perceptron module. We are the first to apply implicit neural representation in a continuous remote sensing image super-resolution task. Through extensive experiments on two popular remote sensing image datasets, we have shown that our SR-FEINR outperforms the state-of-the-art algorithms in terms of accuracy. Our algorithm showed an average improvement of 0.05 dB over the existing method on ×30 across three datasets. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
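An implicit neural representation for continuous super-resolution maps pixel coordinates (usually together with local image features) to RGB values, so the image can be queried at any resolution. The sketch below shows only the coordinate-to-RGB part, with a sinusoidal positional encoding and a small MLP; the feature-enhancement module of SR-FEINR is omitted and all dimensions are assumptions.

```python
import math
import torch
import torch.nn as nn

def positional_encoding(coords, num_freqs=10):
    """Map (x, y) coordinates in [-1, 1] to sin/cos features at multiple frequencies."""
    freqs = (2.0 ** torch.arange(num_freqs)) * math.pi              # (F,)
    angles = coords.unsqueeze(-1) * freqs                           # (N, 2, F)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(1)  # (N, 4*F)

class CoordMLP(nn.Module):
    """Tiny MLP mapping an encoded pixel coordinate to an RGB value; the real model also
    conditions on enhanced local features extracted from the low-resolution image."""
    def __init__(self, in_dim=40, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, coords):
        return self.net(positional_encoding(coords))

# Query the implicit image on a denser coordinate grid to obtain an arbitrary-resolution output.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 96), torch.linspace(-1, 1, 96), indexing="ij")
rgb = CoordMLP()(torch.stack([xs, ys], dim=-1).reshape(-1, 2))       # (96*96, 3)
```

Because the grid resolution is chosen at query time, the same trained representation can serve as a x2, x4, or x30 upscaler without retraining, which is the "arbitrary resolution" property discussed above.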
