
Deep Learning-Based Image and Signal Sensing and Processing

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 30 June 2024 | Viewed by 15520

Special Issue Editors


Prof. Dr. Jian-Jiun Ding
Guest Editor
Graduate Institute of Communication Engineering, National Taiwan University, Taipei 10617, Taiwan
Interests: digital signal processing; digital image processing

Prof. Dr. Feng-Tsun Chien
Guest Editor
Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan
Interests: signal processing; deep learning; green learning; wireless communications

Dr. Chih-Chang Yu
Guest Editor
Department of Information and Computer Engineering, Chung Yuan Christian University, Taoyuan 320314, Taiwan
Interests: pattern recognition; computer vision; deep learning; learning analytics

Special Issue Information

Dear Colleagues,

Deep learning is highly effective in signal sensing, computer vision, and object recognition, and many of the advanced image and signal sensing and processing algorithms proposed in recent years are based on it. Deep learning is therefore a critical technique in both image sensing and signal sensing. In image processing, deep learning techniques have been widely applied to object detection, object recognition, object tracking, image denoising, image quality improvement, and medical image analysis. In signal processing, deep learning techniques can be applied to speech recognition, musical signal recognition, source separation, signal quality improvement, ECG and EEG signal analysis, and medical signal processing. Deep learning techniques are thus important for both academic research and product design. In this Special Issue, we encourage authors to submit manuscripts related to algorithms, architectures, solutions, and applications that adopt deep learning techniques. Potential topics include, but are not limited to:

  • Face detection and recognition
  • Learning-based object detection
  • Learning-based object tracking and ReID
  • Hand gesture recognition
  • Human motion recognition
  • Semantic, instance, and panoptic segmentation
  • Image denoising and quality enhancement
  • Medical image processing
  • Learning-based speech recognition
  • Music signal recognition
  • Source separation and echo removal for vocal signals
  • Signal denoising and quality improvement
  • Medical signal analysis

Prof. Dr. Jian-Jiun Ding
Prof. Dr. Feng-Tsun Chien
Dr. Chih-Chang Yu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • sensing
  • object detection
  • object recognition
  • tracking
  • medical image processing
  • denoising
  • signal enhancement
  • speech
  • music signal recognition
  • medical signal processing

Published Papers (11 papers)


Research

19 pages, 2849 KiB  
Article
A Lightweight Image Super-Resolution Reconstruction Algorithm Based on the Residual Feature Distillation Mechanism
by Zihan Yu, Kai Xie, Chang Wen, Jianbiao He and Wei Zhang
Sensors 2024, 24(4), 1049; https://doi.org/10.3390/s24041049 - 06 Feb 2024
Viewed by 732
Abstract
In recent years, the development of image super-resolution (SR) has explored the capabilities of convolutional neural networks (CNNs). The current research tends to use deeper CNNs to improve performance. However, blindly increasing the depth of the network does not effectively enhance its performance. Moreover, as the network depth increases, more issues arise during the training process, requiring additional training techniques. In this paper, we propose a lightweight image super-resolution reconstruction algorithm (SISR-RFDM) based on the residual feature distillation mechanism (RFDM). Building upon residual blocks, we introduce spatial attention (SA) modules to provide more informative cues for recovering high-frequency details such as image edges and textures. Additionally, the output of each residual block is utilized as hierarchical features for global feature fusion (GFF), enhancing inter-layer information flow and feature reuse. Finally, all these features are fed into the reconstruction module to restore high-quality images. Experimental results demonstrate that our proposed algorithm outperforms other comparative algorithms in terms of both subjective visual effects and objective evaluation quality. The peak signal-to-noise ratio (PSNR) is improved by 0.23 dB, and the structural similarity index (SSIM) reaches 0.9607. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
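To make the spatial-attention idea in the abstract above concrete, the following PyTorch sketch gates a residual block's output with a channel-pooled spatial-attention map. It is an illustrative approximation, not the authors' implementation; the module names, kernel size, and channel count are assumptions.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Channel-pooled spatial attention gate (illustrative, not the paper's exact SA module)."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_pool = x.mean(dim=1, keepdim=True)   # (B, 1, H, W)
        max_pool = x.amax(dim=1, keepdim=True)   # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_pool, max_pool], dim=1)))
        return x * attn                          # re-weight spatial positions

class ResidualSABlock(nn.Module):
    """Residual block whose body output is refined by spatial attention before the skip connection."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.sa = SpatialAttention()

    def forward(self, x):
        return x + self.sa(self.body(x))

feat = torch.randn(1, 64, 48, 48)       # a low-resolution feature map
out = ResidualSABlock(64)(feat)         # same shape, attention-refined
```

In a full SR network, the outputs of several such blocks would be collected as hierarchical features for global feature fusion before the upsampling/reconstruction module.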

12 pages, 2446 KiB  
Article
Recovery-Based Occluded Face Recognition by Identity-Guided Inpainting
by Honglei Li, Yifan Zhang, Wenmin Wang, Shenyong Zhang and Shixiong Zhang
Sensors 2024, 24(2), 394; https://doi.org/10.3390/s24020394 - 09 Jan 2024
Viewed by 714
Abstract
Occlusion in facial photos poses a significant challenge for machine detection and recognition. Consequently, occluded face recognition for camera-captured images has emerged as a prominent and widely discussed topic in computer vision. The present standard face recognition methods have achieved remarkable performance in unoccluded face recognition but performed poorly when directly applied to occluded face datasets. The main reason lies in the absence of identity cues caused by occlusions. Therefore, a direct idea of recovering the occluded areas through an inpainting model has been proposed. However, existing inpainting models based on an encoder-decoder structure are limited in preserving inherent identity information. To solve the problem, we propose ID-Inpainter, an identity-guided face inpainting model, which preserves the identity information to the greatest extent through a more accurate identity sampling strategy and a GAN-like fusing network. We conduct recognition experiments on the occluded face photographs from the LFW, CFP-FP, and AgeDB-30 datasets, and the results indicate that our method achieves state-of-the-art performance in identity-preserving inpainting, and dramatically improves the accuracy of normal recognizers in occluded face recognition. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)

14 pages, 5282 KiB  
Article
A Real-Time Flame Detection Method Using Deformable Object Detection and Time Sequence Analysis
by Jingyuan Zhang, Bo Shi, Bin Chen, Heping Chen and Wangming Xu
Sensors 2023, 23(20), 8616; https://doi.org/10.3390/s23208616 - 21 Oct 2023
Cited by 1 | Viewed by 953
Abstract
Timely and accurate flame detection is a very important and practical technology for preventing the occurrence of fire accidents effectively. However, the current methods of flame detection are still faced with many challenges in video surveillance scenarios due to issues such as varying flame shapes, imbalanced samples, and interference from flame-like objects. In this work, a real-time flame detection method based on deformable object detection and time sequence analysis is proposed to address these issues. Firstly, based on the existing single-stage object detection network YOLOv5s, the network structure is improved by introducing deformable convolution to enhance the feature extraction ability for irregularly shaped flames. Secondly, the loss function is improved by using Focal Loss as the classification loss function to solve the problems of the imbalance of positive (flames) and negative (background) samples, as well as the imbalance of easy and hard samples, and by using EIOU Loss as the regression loss function to solve the problems of a slow convergence speed and inaccurate regression position in network training. Finally, a time sequence analysis strategy is adopted to comprehensively analyze the flame detection results of the current frame and historical frames in the surveillance video, alleviating false alarms caused by flame shape changes, flame occlusion, and flame-like interference. The experimental results indicate that the average precision (AP) and the F-Measure index of flame detection using the proposed method reach 93.0% and 89.6%, respectively, both of which are superior to the compared methods, and the detection speed is 24–26 FPS, meeting the real-time requirements of video flame detection. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
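Two ingredients named in the abstract lend themselves to short sketches: the focal classification loss used to rebalance flame/background samples, and the time-sequence analysis over consecutive frames. The Python below is an illustrative approximation rather than the authors' code; the alpha/gamma values and the k-of-n voting rule are assumptions.

```python
import torch
import torch.nn.functional as F
from collections import deque

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss on raw logits; targets are 0/1 floats. alpha/gamma are assumed values."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)               # probability of the true class
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()         # down-weights easy negatives

class TemporalVoter:
    """Raise an alarm only when flames are detected in at least k of the last n frames."""
    def __init__(self, n=10, k=6):
        self.history = deque(maxlen=n)
        self.k = k

    def update(self, flame_detected: bool) -> bool:
        self.history.append(flame_detected)
        return sum(self.history) >= self.k
```

The voter illustrates how per-frame detections can be smoothed over time to suppress false alarms from brief flame-like interference.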

15 pages, 3236 KiB  
Article
Content-Seam-Preserving Multi-Alignment Network for Visual-Sensor-Based Image Stitching
by Xiaoting Fan, Long Sun, Zhong Zhang, Shuang Liu and Tariq S. Durrani
Sensors 2023, 23(17), 7488; https://doi.org/10.3390/s23177488 - 29 Aug 2023
Viewed by 792
Abstract
As an important representation of scenes in virtual reality and augmented reality, image stitching aims to generate a panoramic image with a natural field-of-view by stitching multiple images together, which are captured by different visual sensors. Existing deep-learning-based methods for image stitching only conduct a single deep homography to perform image alignment, which may produce inevitable alignment distortions. To address this issue, we propose a content-seam-preserving multi-alignment network (CSPM-Net) for visual-sensor-based image stitching, which could preserve the image content consistency and avoid seam distortions simultaneously. Firstly, a content-preserving deep homography estimation was designed to pre-align the input image pairs and reduce the content inconsistency. Secondly, an edge-assisted mesh warping was conducted to further align the image pairs, where the edge information is introduced to eliminate seam artifacts. Finally, in order to predict the final stitched image accurately, a content consistency loss was designed to preserve the geometric structure of overlapping regions between image pairs, and a seam smoothness loss is proposed to eliminate the edge distortions of image boundaries. Experimental results demonstrated that the proposed image-stitching method can provide favorable stitching results for visual-sensor-based images and outperform other state-of-the-art methods. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
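For contrast with the learned multi-alignment approach above, the OpenCV sketch below shows the classical single-homography baseline (ORB features plus RANSAC) whose alignment distortions and visible seams motivate CSPM-Net. It is not the paper's method, and the match count and RANSAC threshold are assumptions.

```python
import cv2
import numpy as np

def stitch_pair(img_left, img_right):
    """Single-homography stitching baseline: ORB matching, RANSAC homography, naive overlay."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img_left, None)
    k2, d2 = orb.detectAndCompute(img_right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)      # warp right image onto left
    h, w = img_left.shape[:2]
    canvas = cv2.warpPerspective(img_right, H, (w * 2, h))
    canvas[0:h, 0:w] = img_left                               # naive overlay: seams remain visible
    return canvas
```

A single global homography cannot model parallax in the overlap region, which is exactly what the content-preserving pre-alignment and edge-assisted mesh warping above are designed to address.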

22 pages, 5577 KiB  
Article
High-Speed Tracking with Mutual Assistance of Feature Filters and Detectors
by Akira Matsuo and Yuji Yamakawa
Sensors 2023, 23(16), 7082; https://doi.org/10.3390/s23167082 - 10 Aug 2023
Viewed by 904
Abstract
Object detection and tracking in camera images is a fundamental technology for computer vision and is used in various applications. In particular, object tracking using high-speed cameras is expected to be applied to real-time control in robotics. Therefore, it is required to increase tracking speed and detection accuracy. Currently, however, it is difficult to achieve both of those things simultaneously. In this paper, we propose a tracking method that combines multiple methods: correlation filter-based object tracking, deep learning-based object detection, and motion detection with background subtraction. The algorithms work in parallel and assist each other’s processing to improve the overall performance of the system. We named it the “Mutual Assist tracker of feature Filters and Detectors (MAFiD method)”. This method aims to achieve both high-speed tracking of moving objects and high detection accuracy. Experiments were conducted to verify the detection performance and processing speed by tracking a transparent capsule moving at high speed. The results show that the tracking speed was 618 frames per second (FPS) and the accuracy was 86% for Intersection over Union (IoU). The detection latency was 3.48 ms. These experimental scores are higher than those of conventional methods, indicating that the MAFiD method achieved fast object tracking while maintaining high detection performance. This proposal will contribute to the improvement of object-tracking technology. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
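One of the three cues that MAFiD combines is motion detection via background subtraction. The sketch below uses OpenCV's MOG2 subtractor to produce candidate moving regions that a correlation-filter tracker or deep detector could then verify; the thresholds and morphology settings are assumptions, and this is not the authors' high-speed implementation.

```python
import cv2

def motion_regions(video_path, min_area=200):
    """Yield (frame, boxes) pairs, where boxes are candidate moving-object regions."""
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)                               # foreground mask
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)        # remove speckle noise
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        boxes = [cv2.boundingRect(c) for c in contours if cv2.contourArea(c) >= min_area]
        yield frame, boxes
    cap.release()
```

Running such a lightweight motion cue in parallel with a correlation filter and a deep detector is the kind of mutual-assistance arrangement the abstract describes.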

21 pages, 32835 KiB  
Article
Enhancing UAV Detection in Surveillance Camera Videos through Spatiotemporal Information and Optical Flow
by Yu Sun, Xiyang Zhi, Haowen Han, Shikai Jiang, Tianjun Shi, Jinnan Gong and Wei Zhang
Sensors 2023, 23(13), 6037; https://doi.org/10.3390/s23136037 - 29 Jun 2023
Cited by 5 | Viewed by 1463
Abstract
The growing intelligence and prevalence of drones have led to an increase in their disorderly and illicit usage, posing substantial risks to aviation and public safety. This paper focuses on addressing the issue of drone detection through surveillance cameras. Drone targets in images possess distinctive characteristics, including small size, weak energy, low contrast, and limited and varying features, rendering precise detection a challenging task. To overcome these challenges, we propose a novel detection method that extends the input of YOLOv5s to a continuous sequence of images and inter-frame optical flow, emulating the visual mechanisms employed by humans. By incorporating the image sequence as input, our model can leverage both temporal and spatial information, extracting more features of small and weak targets through the integration of spatiotemporal data. This integration augments the accuracy and robustness of drone detection. Furthermore, the inclusion of optical flow enables the model to directly perceive the motion information of drone targets across consecutive frames, enhancing its ability to extract and utilize features from dynamic objects. Comparative experiments demonstrate that our proposed method of extended input significantly enhances the network’s capability to detect small moving targets, showcasing competitive performance in terms of accuracy and speed. Specifically, our method achieves a final average precision of 86.87%, representing a noteworthy 11.49% improvement over the baseline, and the speed remains above 30 frames per second. Additionally, our approach is adaptable to other detection models with different backbones, providing valuable insights for domains such as Urban Air Mobility and autonomous driving. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
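The core idea above is to extend the detector's input from a single frame to a frame sequence plus inter-frame optical flow. The sketch below builds such a stacked input with Farneback optical flow from OpenCV; the number of frames and the channel ordering used in the paper are assumptions here.

```python
import cv2
import numpy as np

def build_spatiotemporal_input(frames):
    """Stack consecutive grayscale frames with dense optical flow as extra input channels."""
    grays = [cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in frames]
    flows = []
    for prev, curr in zip(grays[:-1], grays[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            pyr_scale=0.5, levels=3, winsize=15,
                                            iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
        flows.append(flow)                                    # (H, W, 2): dx, dy per pixel
    channels = [g[..., None].astype(np.float32) / 255.0 for g in grays]
    channels += [f.astype(np.float32) for f in flows]
    return np.concatenate(channels, axis=-1)                  # (H, W, len(frames) + 2*(len(frames)-1))
```

Feeding this tensor to a detector (in place of a single RGB frame) gives the network direct access to the temporal and motion information that makes small, weak drone targets easier to separate from the background.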

17 pages, 3816 KiB  
Article
Research on Apple Recognition Algorithm in Complex Orchard Environment Based on Deep Learning
by Zhuoqun Zhao, Jiang Wang and Hui Zhao
Sensors 2023, 23(12), 5425; https://doi.org/10.3390/s23125425 - 08 Jun 2023
Cited by 2 | Viewed by 1181
Abstract
In the complex environment of orchards, in view of the low recognition accuracy, poor real-time performance, and poor robustness of traditional fruit recognition algorithms, this paper proposes an improved fruit recognition algorithm based on deep learning. Firstly, the residual module was assembled with the cross stage partial network (CSPNet) to optimize recognition performance and reduce the computing burden of the network. Secondly, the spatial pyramid pooling (SPP) module is integrated into the recognition network of YOLOv5 to blend the local and global features of the fruit, thus improving the recall rate for small fruit targets. Meanwhile, the NMS algorithm was replaced by the Soft-NMS algorithm to enhance the ability to identify overlapped fruits. Finally, a joint loss function was constructed based on focal and CIoU loss to optimize the algorithm, and the recognition accuracy was significantly improved. The test results show that the mAP value of the improved model after dataset training reaches 96.3% on the test set, which is 3.8% higher than the original model. The F1 value reaches 91.8%, which is 3.8% higher than the original model. The average detection speed under GPU reaches 27.8 frames/s, which is 5.6 frames/s higher than the original model. Compared with current advanced detection methods such as Faster RCNN and RetinaNet, among others, the test results show that this method has excellent detection accuracy, good robustness, and real-time performance, and has important reference value for solving the problem of accurate fruit recognition in complex environments. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
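One concrete component mentioned above is the replacement of NMS by Soft-NMS to keep overlapping fruit detections. The NumPy sketch below implements the Gaussian-decay variant of Soft-NMS; it is a generic illustration, and the sigma and score threshold are assumed values rather than the authors' configuration.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of discarding them."""
    boxes, scores = boxes.copy(), scores.copy()
    kept = []
    while len(boxes):
        i = int(np.argmax(scores))
        kept.append((boxes[i], float(scores[i])))
        box = boxes[i]
        boxes = np.delete(boxes, i, axis=0)
        scores = np.delete(scores, i, axis=0)
        scores = scores * np.exp(-(iou(box, boxes) ** 2) / sigma)   # decay neighbours by overlap
        mask = scores > score_thresh
        boxes, scores = boxes[mask], scores[mask]
    return kept

# Example: two heavily overlapping apples plus one separate apple.
boxes = np.array([[10, 10, 60, 60], [12, 12, 62, 62], [100, 100, 150, 150]], dtype=float)
print(soft_nms(boxes, np.array([0.9, 0.8, 0.7])))
```

Because overlapping boxes are only down-weighted, two touching apples can both survive suppression, which is the behaviour the abstract targets.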

18 pages, 2308 KiB  
Article
Obstacle Detection System for Navigation Assistance of Visually Impaired People Based on Deep Learning Techniques
by Yahia Said, Mohamed Atri, Marwan Ali Albahar, Ahmed Ben Atitallah and Yazan Ahmad Alsariera
Sensors 2023, 23(11), 5262; https://doi.org/10.3390/s23115262 - 01 Jun 2023
Cited by 5 | Viewed by 2813
Abstract
Visually impaired people seek social integration, yet their mobility is restricted. They need a personal navigation system that can provide privacy and increase their confidence for better life quality. In this paper, based on deep learning and neural architecture search (NAS), we propose an intelligent navigation assistance system for visually impaired people. The deep learning model has achieved significant success through well-designed architecture. Subsequently, NAS has proved to be a promising technique for automatically searching for the optimal architecture and reducing human efforts for architecture design. However, this new technique requires extensive computation, limiting its wide use. Due to its high computation requirement, NAS has been less investigated for computer vision tasks, especially object detection. Therefore, we propose a fast NAS to search for an object detection framework by considering efficiency. The NAS will be used to explore the feature pyramid network and the prediction stage for an anchor-free object detection model. The proposed NAS is based on a tailored reinforcement learning technique. The searched model was evaluated on a combination of the Coco dataset and the Indoor Object Detection and Recognition (IODR) dataset. The resulting model outperformed the original model by 2.6% in average precision (AP) with acceptable computation complexity. The achieved results proved the efficiency of the proposed NAS for custom object detection. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
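The paper searches the feature pyramid and prediction head with a tailored reinforcement-learning controller. As a deliberately simplified stand-in, the sketch below shows only the bare sample-evaluate-keep-best loop of architecture search using random sampling; the search space and the reward function are placeholders, not the authors' design.

```python
import random

# Hypothetical search space for an anchor-free detector's FPN and head (illustrative only).
SEARCH_SPACE = {
    "fpn_channels": [64, 96, 128],
    "num_fpn_levels": [3, 4, 5],
    "head_depth": [2, 3, 4],
}

def sample_architecture():
    """Draw one candidate architecture uniformly at random from the search space."""
    return {k: random.choice(v) for k, v in SEARCH_SPACE.items()}

def evaluate(arch):
    """Placeholder reward; in practice this would train/evaluate a proxy detector and return its AP."""
    return -abs(arch["fpn_channels"] - 96) / 96 + 0.1 * arch["num_fpn_levels"]

best_arch, best_reward = None, float("-inf")
for _ in range(50):                       # fixed search budget
    arch = sample_architecture()
    reward = evaluate(arch)
    if reward > best_reward:
        best_arch, best_reward = arch, reward
print(best_arch, best_reward)
```

An RL-based NAS replaces the uniform sampler with a learned controller that is updated from the rewards, so the sampling distribution gradually concentrates on high-scoring architectures.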

17 pages, 8781 KiB  
Article
Tool Wear Condition Monitoring Method Based on Deep Learning with Force Signals
by Yaping Zhang, Xiaozhi Qi, Tao Wang and Yuanhang He
Sensors 2023, 23(10), 4595; https://doi.org/10.3390/s23104595 - 09 May 2023
Cited by 5 | Viewed by 2089
Abstract
Tool wear condition monitoring is an important component of mechanical processing automation, and accurately identifying the wear status of tools can improve processing quality and production efficiency. This paper studied a new deep learning model, to identify the wear status of tools. The force signal was transformed into a two-dimensional image using continuous wavelet transform (CWT), short-time Fourier transform (STFT), and Gramian angular summation field (GASF) methods. The generated images were then fed into the proposed convolutional neural network (CNN) model for further analysis. The calculation results show that the accuracy of tool wear state recognition proposed in this paper was above 90%, which was higher than the accuracy of AlexNet, ResNet, and other models. The accuracy of the images generated using the CWT method and identified with the CNN model was the highest, which is attributed to the fact that the CWT method can extract local features of an image and is less affected by noise. Comparing the precision and recall values of the model, it was verified that the image obtained by the CWT method had the highest accuracy in identifying tool wear state. These results demonstrate the potential advantages of using a force signal transformed into a two-dimensional image for tool wear state recognition and of applying CNN models in this area. They also indicate the wide application prospects of this method in industrial production. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
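The pipeline above converts a 1-D force signal into a 2-D image before feeding it to a CNN. The sketch below produces such a scalogram with a continuous wavelet transform (PyWavelets, Morlet wavelet) on a synthetic signal; the synthetic data, the wavelet choice, and the scale range are assumptions standing in for real cutting-force measurements.

```python
import numpy as np
import pywt
import matplotlib.pyplot as plt

# Synthetic force signal standing in for a real cutting-force measurement.
fs = 1000                                   # sampling rate in Hz
t = np.linspace(0, 1, fs, endpoint=False)
force = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t) + 0.2 * np.random.randn(fs)

# Continuous wavelet transform with a Morlet wavelet -> 2-D time-scale "image".
scales = np.arange(1, 128)
coeffs, freqs = pywt.cwt(force, scales, 'morl', sampling_period=1 / fs)

plt.imshow(np.abs(coeffs), aspect='auto', origin='lower', cmap='viridis')
plt.xlabel('time sample')
plt.ylabel('scale')
plt.savefig('scalogram.png')                # the image a CNN wear-state classifier would consume
```

STFT spectrograms and Gramian angular summation fields can be generated analogously, which is how the abstract compares the three signal-to-image encodings.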

17 pages, 2000 KiB  
Article
PW-360IQA: Perceptually-Weighted Multichannel CNN for Blind 360-Degree Image Quality Assessment
by Abderrezzaq Sendjasni and Mohamed-Chaker Larabi
Sensors 2023, 23(9), 4242; https://doi.org/10.3390/s23094242 - 24 Apr 2023
Cited by 4 | Viewed by 1247
Abstract
Image quality assessment of 360-degree images is still in its early stages, especially when it comes to solutions that rely on machine learning. There are many challenges to be addressed related to training strategies and model architecture. In this paper, we propose a perceptually weighted multichannel convolutional neural network (CNN) using a weight-sharing strategy for 360-degree IQA (PW-360IQA). Our approach involves extracting visually important viewports based on several visual scan-path predictions, which are then fed to a multichannel CNN using DenseNet-121 as the backbone. In addition, we account for users’ exploration behavior and human visual system (HVS) properties by using information regarding visual trajectory and distortion probability maps. The inter-observer variability is integrated by leveraging different visual scan-paths to enrich the training data. PW-360IQA is designed to learn the local quality of each viewport and its contribution to the overall quality. We validate our model on two publicly available datasets, CVIQ and OIQA, and demonstrate that it performs robustly. Furthermore, the adopted strategy considerably decreases the complexity when compared to the state-of-the-art, enabling the model to attain comparable, if not better, results while requiring less computational complexity. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
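A central design choice above is the weight-sharing multichannel CNN: every extracted viewport passes through the same DenseNet-121 backbone, and per-viewport scores are combined into a global quality score. The PyTorch sketch below illustrates only that sharing pattern; the simple average pooling and head dimensions are assumptions, and the perceptual weighting and scan-path handling of PW-360IQA are omitted.

```python
import torch
import torch.nn as nn
from torchvision import models

class SharedBackboneIQA(nn.Module):
    """All viewports share one DenseNet-121; local scores are averaged into a global score."""
    def __init__(self):
        super().__init__()
        backbone = models.densenet121(weights=None)
        self.features = backbone.features                    # shared across all viewports
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(1024, 1))

    def forward(self, viewports):                             # viewports: (B, V, 3, H, W)
        b, v, c, h, w = viewports.shape
        x = viewports.reshape(b * v, c, h, w)                 # fold viewports into the batch
        local_q = self.head(self.features(x)).reshape(b, v)   # one quality score per viewport
        return local_q.mean(dim=1)                            # naive average as the global score

scores = SharedBackboneIQA()(torch.randn(2, 6, 3, 224, 224))  # e.g., 6 viewports per 360-degree image
```

Weight sharing keeps the parameter count independent of the number of viewports, which is one reason the multichannel design stays comparatively lightweight.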

16 pages, 15058 KiB  
Article
SR-FEINR: Continuous Remote Sensing Image Super-Resolution Using Feature-Enhanced Implicit Neural Representation
by Jinming Luo, Lei Han, Xianjie Gao, Xiuping Liu and Weiming Wang
Sensors 2023, 23(7), 3573; https://doi.org/10.3390/s23073573 - 29 Mar 2023
Cited by 2 | Viewed by 1603
Abstract
Remote sensing images often have limited resolution, which can hinder their effectiveness in various applications. Super-resolution techniques can enhance the resolution of remote sensing images, and arbitrary resolution super-resolution techniques provide additional flexibility in choosing appropriate image resolutions for different tasks. However, for subsequent processing, such as detection and classification, the resolution of the input image may vary greatly for different methods. In this paper, we propose a method for continuous remote sensing image super-resolution using feature-enhanced implicit neural representation (SR-FEINR). Continuous remote sensing image super-resolution means users can scale a low-resolution image into an image with arbitrary resolution. Our algorithm is composed of three main components: a low-resolution image feature extraction module, a positional encoding module, and a feature-enhanced multi-layer perceptron module. We are the first to apply implicit neural representation in a continuous remote sensing image super-resolution task. Through extensive experiments on two popular remote sensing image datasets, we have shown that our SR-FEINR outperforms the state-of-the-art algorithms in terms of accuracy. Our algorithm showed an average improvement of 0.05 dB over the existing method on ×30 across three datasets. Full article
(This article belongs to the Special Issue Deep Learning-Based Image and Signal Sensing and Processing)
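An implicit neural representation for continuous super-resolution maps pixel coordinates (usually together with local image features) to RGB values, so the image can be queried at any resolution. The sketch below shows only the coordinate-to-RGB part, with a sinusoidal positional encoding and a small MLP; the feature-enhancement module of SR-FEINR is omitted and all dimensions are assumptions.

```python
import math
import torch
import torch.nn as nn

def positional_encoding(coords, num_freqs=10):
    """Map (x, y) coordinates in [-1, 1] to sin/cos features at multiple frequencies."""
    freqs = (2.0 ** torch.arange(num_freqs)) * math.pi              # (F,)
    angles = coords.unsqueeze(-1) * freqs                           # (N, 2, F)
    return torch.cat([angles.sin(), angles.cos()], dim=-1).flatten(1)  # (N, 4*F)

class CoordMLP(nn.Module):
    """Tiny MLP mapping an encoded pixel coordinate to an RGB value; the real model also
    conditions on enhanced local features extracted from the low-resolution image."""
    def __init__(self, in_dim=40, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),
        )

    def forward(self, coords):
        return self.net(positional_encoding(coords))

# Query the implicit image on a denser coordinate grid to obtain an arbitrary-resolution output.
ys, xs = torch.meshgrid(torch.linspace(-1, 1, 96), torch.linspace(-1, 1, 96), indexing="ij")
rgb = CoordMLP()(torch.stack([xs, ys], dim=-1).reshape(-1, 2))       # (96*96, 3)
```

Because the grid resolution is chosen at query time, the same trained representation can serve as a x2, x4, or x30 upscaler without retraining, which is the "arbitrary resolution" property discussed above.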
