
Object Detection Based on Vision Sensors and Neural Network

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 31 August 2024

Special Issue Editors

Dr. Man Qi
Guest Editor
Senior Lecturer in Computing, Canterbury Christ Church University, Canterbury, UK
Interests: Internet of Things; cyber security; intelligent computing and applications; HCI

Dr. Matteo Dunnhofer
Guest Editor
Machine Learning and Perception Lab, University of Udine, Via delle Scienze, 206, 33100 Udine, Italy
Interests: ADAS development; driver’s stopping behavior; computer vision; machine learning; deep learning; data modelling

Special Issue Information

Dear Colleagues,

Object detection has long been a research hotspot in computer vision. It is now attracting growing attention from both the research community and industry, driven by the rapid development and deployment of enabling technologies such as deep neural networks (DNNs) and high-resolution vision sensors. The past few years have witnessed many successful, high-performing applications of DNNs. However, mainstream DNNs tend to be ever more computationally complex, deeper in structure, and trained on ever-larger datasets. This places a barrier in the way of deploying such data- and compute-intensive DNNs on resource-limited vision sensors for applications such as object detection, especially when results are needed in a timely manner.

This Special Issue looks at object detection from another angle, soliciting state-of-the-art research that enables object detection in a more lightweight fashion, taking into account the resource constraints of vision sensors. Topics of interest include, but are not limited to, the following:

  • Computation-efficient lightweight DNNs;
  • Object detection in data streams;
  • One-shot object detection;
  • Object detection on the move;
  • Edge computing in support of object detection on sensors;
  • Neural network compression techniques;
  • Federated learning for object detection;
  • Bio-inspired sensing technologies;
  • Real-time object detection techniques;
  • New object representation techniques;
  • Swarm learning for object detection in a collective manner;
  • High-performance sensing systems.

Dr. Man Qi
Dr. Matteo Dunnhofer
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, proceed to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • object detection
  • vision sensing
  • lightweight neural network

Published Papers (9 papers)


Research

19 pages, 2306 KiB  
Article
Enhanced Knowledge Distillation for Advanced Recognition of Chinese Herbal Medicine
by Lu Zheng, Wenhan Long, Junchao Yi, Lu Liu and Ke Xu
Sensors 2024, 24(5), 1559; https://doi.org/10.3390/s24051559 - 28 Feb 2024
Abstract
The identification and classification of traditional Chinese herbal medicines demand significant time and expertise. We propose the dual-teacher supervised decay (DTSD) approach, an enhancement for Chinese herbal medicine recognition utilizing a refined knowledge distillation model. The DTSD method refines the output soft labels, adapts the attenuation parameters, and employs a dynamic combination loss in the teacher model. Implemented on the lightweight MobileNet_v3 network, the methodology has been deployed successfully in a mobile application. Experimental results reveal that incorporating an exponential warmup learning rate reduction strategy during training optimizes the knowledge distillation model, achieving an average classification accuracy of 98.60% for 10 types of Chinese herbal medicine images. The model achieves an average detection time of 0.0172 s per image, with a compressed size of 10 MB. Comparative experiments demonstrate the superior performance of the refined model over DenseNet121, ResNet50_vd, Xception65, and EfficientNetB1. The refined model not only introduces an approach to Chinese herbal medicine image recognition but also provides a practical solution for lightweight models in mobile applications.
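
The abstract lists the ingredients of DTSD (two teachers, refined soft labels, an attenuation parameter, a combination loss) without giving formulas. As a rough illustration, the PyTorch sketch below blends two teachers' temperature-softened outputs into a single soft target and attenuates the distillation term; the function name, temperature T, mixing weight alpha, and decay factor are our assumptions, not the paper's actual DTSD.

```python
import torch
import torch.nn.functional as F

def dual_teacher_distill_loss(student_logits, t1_logits, t2_logits, labels,
                              T=4.0, alpha=0.5, decay=1.0):
    # Hard-label loss against the ground truth.
    hard = F.cross_entropy(student_logits, labels)
    # Blend the two teachers' temperature-softened outputs into one
    # soft-label target (a stand-in for DTSD's refined soft labels).
    soft_target = (alpha * F.softmax(t1_logits / T, dim=1)
                   + (1 - alpha) * F.softmax(t2_logits / T, dim=1))
    # KL divergence between the student and the blended teacher distribution.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    soft_target, reduction="batchmean") * (T * T)
    # `decay` imitates an attenuation parameter that down-weights the
    # distillation term as training progresses (assumption).
    return hard + decay * soft
```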

20 pages, 868 KiB  
Article
Simple Conditional Spatial Query Mask Deformable Detection Transformer: A Detection Approach for Multi-Style Strokes of Chinese Characters
by Tian Zhou, Wu Xie, Huimin Zhang and Yong Fan
Sensors 2024, 24(3), 931; https://doi.org/10.3390/s24030931 - 31 Jan 2024
Abstract
In the Chinese character writing task performed by robotic arms, stroke category and position information must be extracted through object detection. Detection algorithms based on predefined anchor frames have difficulty accommodating the differences among the many styles of Chinese character strokes. Deformable detection transformer (deformable DETR) algorithms, which use no predefined anchor frames, instead suffer from invalid sampling points that contribute nothing to the feature update of the current reference point, owing to the random sampling in the deformable attention module; this slows the rate at which the query vectors learn stroke features in the detection head. In view of this problem, a new detection method for multi-style strokes of Chinese characters, called the simple conditional spatial query mask deformable DETR (SCSQ-MDD), is proposed in this paper. Firstly, a mask prediction layer is jointly determined using the shallow feature map of the Chinese character image and the query vector of the transformer encoder; it is used to filter the points with actual contributions and to resample the points without contributions, addressing the randomness of the correlation calculation among the reference points. Secondly, by separating the content query and spatial query of the transformer decoder, the dependence of the prediction task on the content embedding is relaxed. Finally, a detection model without predefined anchor frames based on the SCSQ-MDD is constructed. Experiments are conducted using a multi-style Chinese character stroke dataset to evaluate the performance of the SCSQ-MDD. Compared with the deformable DETR, the mean average precision (mAP) improves by 3.8% and the mean average recall (mAR) by 1.1% in the testing stage, illustrating the effectiveness of the proposed method.
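
The mask prediction layer is described only at a high level; one plausible reading is that it yields a per-point validity mask that suppresses non-contributing sampling points before the attention weights are renormalized. The sketch below illustrates just that filtering step in PyTorch; it is our simplification, not the SCSQ-MDD architecture.

```python
import torch

def mask_filtered_weights(attn_weights, validity_mask, eps=1e-6):
    # attn_weights: (batch, queries, points) deformable-attention weights
    # validity_mask: same shape; 1.0 marks sampling points predicted useful
    w = attn_weights * validity_mask                 # suppress invalid points
    return w / (w.sum(dim=-1, keepdim=True) + eps)   # renormalize the rest

# Toy usage: 2 queries, 4 sampling points each; one point masked per query.
w = torch.softmax(torch.randn(1, 2, 4), dim=-1)
m = torch.tensor([[[1., 1., 0., 1.], [1., 0., 1., 1.]]])
print(mask_filtered_weights(w, m))
```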

21 pages, 8098 KiB  
Article
The Impact of Noise and Brightness on Object Detection Methods
by José A. Rodríguez-Rodríguez, Ezequiel López-Rubio, Juan A. Ángel-Ruiz and Miguel A. Molina-Cabello
Sensors 2024, 24(3), 821; https://doi.org/10.3390/s24030821 - 26 Jan 2024
Abstract
The application of deep learning to image and video processing has become increasingly popular. Employing well-known pre-trained neural networks for detecting and classifying objects in images is beneficial in a wide range of application fields. However, diverse impediments may degrade the performance achieved by these neural networks. In particular, Gaussian noise and brightness alterations, among others, may be present in images as sensor noise due to the limitations of image acquisition devices. In this work, we study the effect of the most representative noise types and brightness alterations on the performance of several state-of-the-art object detectors, such as YOLO and Faster R-CNN. Different experiments were carried out, and the results demonstrate how these adversities deteriorate detector performance. Moreover, it is found that the size of the objects to be detected is a factor that, together with noise and brightness, has a considerable impact on performance.
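
The corruptions studied here are straightforward to reproduce. A minimal NumPy sketch, assuming 8-bit images, is shown below; sweeping sigma and delta over increasing severities and re-evaluating a detector on each corrupted copy reproduces the general shape of such an experiment.

```python
import numpy as np

def add_gaussian_noise(img, sigma=10.0, rng=None):
    # Additive zero-mean Gaussian noise, clipped back to the valid pixel range.
    rng = rng or np.random.default_rng()
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def shift_brightness(img, delta=40):
    # Uniform brightness offset; a negative delta darkens the image.
    return np.clip(img.astype(np.int16) + delta, 0, 255).astype(np.uint8)

# e.g. for sigma in (5, 10, 20, 40): evaluate the detector on
# add_gaussian_noise(img, sigma) and record mAP per severity level.
```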

15 pages, 4821 KiB  
Article
Lightweight Detection Methods for Insulator Self-Explosion Defects
by Yanping Chen, Chong Deng, Qiang Sun, Zhize Wu, Le Zou, Guanhong Zhang and Wenbo Li
Sensors 2024, 24(1), 290; https://doi.org/10.3390/s24010290 - 03 Jan 2024
Abstract
The accurate and efficient detection of defective insulators is an essential prerequisite for ensuring the safety of the power grid in the new generation of intelligent electrical system inspections. Currently, traditional object detection algorithms for detecting defective insulators in images face issues such as excessive parameter size, low accuracy, and slow detection speed. To address these issues, this article proposes an insulator defect detection model based on a lightweight Faster R-CNN (Faster Region-based Convolutional Network) model (Faster R-CNN-tiny). First, the Faster R-CNN backbone is made lightweight by substituting EfficientNet for ResNet (Residual Network), greatly decreasing the model parameters while increasing detection accuracy. Second, a feature pyramid is employed to build feature maps at various resolutions for feature fusion, enabling the detection of objects at various scales. In addition, replacing ordinary convolutions in the network with more efficient depth-wise separable convolutions increases detection speed while slightly reducing detection accuracy. Transfer learning is introduced, and a training method involving freezing and unfreezing the model is employed to enhance the network's ability to detect small target defects. The proposed model is validated using the insulator self-explosion defect dataset. The experimental results show that Faster R-CNN-tiny significantly outperforms the Faster R-CNN (ResNet) model in terms of mean average precision (mAP), frames per second (FPS), and number of parameters.
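
A depth-wise separable convolution is a standard, well-documented building block, so its structure can be shown concretely. Below is a minimal PyTorch version; the layer names and the BN/ReLU arrangement are conventional choices, not necessarily those of Faster R-CNN-tiny.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """k x k depthwise conv (one filter per channel) followed by a 1x1
    pointwise conv, cutting parameters and FLOPs versus a standard conv."""
    def __init__(self, in_ch, out_ch, k=3, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, k, stride,
                                   padding=k // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# A 3x3 standard conv with 256 in/out channels has 256*256*9 ≈ 590k weights;
# the separable version above has 256*9 + 256*256 ≈ 68k.
print(DepthwiseSeparableConv(256, 256)(torch.randn(1, 256, 32, 32)).shape)
```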

13 pages, 4703 KiB  
Article
Swin Transformer-Based Edge Guidance Network for RGB-D Salient Object Detection
by Shuaihui Wang, Fengyi Jiang and Boqian Xu
Sensors 2023, 23(21), 8802; https://doi.org/10.3390/s23218802 - 29 Oct 2023
Abstract
Salient object detection (SOD), which identifies the most distinctive objects in a given scene, plays an important role in computer vision tasks. Most existing RGB-D SOD methods employ a CNN-based network as the backbone to extract features from RGB and depth images; however, the inherent locality of a CNN limits the performance of such methods. To tackle this issue, we propose a novel Swin Transformer-based edge guidance network (SwinEGNet) for RGB-D SOD, in which the Swin Transformer is employed as a powerful feature extractor to capture the global context. An edge-guided cross-modal interaction module is proposed to effectively enhance and fuse features. In particular, we employ the Swin Transformer as the backbone to extract features from RGB images and depth maps. We then introduce an edge extraction module (EEM) to extract edge features and a depth enhancement module (DEM) to enhance depth features. Additionally, a cross-modal interaction module (CIM) is used to integrate cross-modal features from global and local contexts. Finally, we employ a cascaded decoder to refine the prediction map in a coarse-to-fine manner. Extensive experiments demonstrate that SwinEGNet achieves the best performance on the LFSD, NLPR, DES, and NJU2K datasets and comparable performance on the STEREO dataset relative to 14 state-of-the-art methods. Our model outperforms SwinNet while using 88.4% of its parameters and 77.2% of its FLOPs. Our code will be made publicly available.
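
The EEM's internals are not given in the abstract. As a loose stand-in for edge-feature extraction, the sketch below computes a fixed Sobel gradient magnitude in PyTorch; a learned module would replace these fixed kernels.

```python
import torch
import torch.nn.functional as F

def sobel_edges(gray):
    # gray: (B, 1, H, W) tensor; returns the per-pixel gradient magnitude.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    k = torch.stack([kx, kx.t()]).unsqueeze(1)   # (2, 1, 3, 3): Gx and Gy
    g = F.conv2d(gray, k.to(gray.dtype), padding=1)
    return g.pow(2).sum(dim=1, keepdim=True).sqrt()

edges = sobel_edges(torch.rand(1, 1, 64, 64))    # shape (1, 1, 64, 64)
```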

13 pages, 3783 KiB  
Article
A Novel Approach for Apple Freshness Prediction Based on Gas Sensor Array and Optimized Neural Network
by Wei Wang, Weizhen Yang, Maozhen Li, Zipeng Zhang and Wenbin Du
Sensors 2023, 23(14), 6476; https://doi.org/10.3390/s23146476 - 17 Jul 2023
Abstract
The apple is an important cash crop in China, and predicting its freshness can effectively reduce storage risk and avoid economic loss. Changes in the concentrations of odor compounds such as ethylene, carbon dioxide, and ethanol emitted during apple storage are important features for characterizing apple freshness. To accurately predict the freshness level of apples, an electronic nose system based on a gas sensor array and a wireless transmission module is designed, and a neural network prediction model is proposed that uses a sparrow search algorithm (SSA), improved with a Tent chaotic sequence, to optimize a back-propagation (BP) network. The odor information emitted by apples is studied to complete the apple freshness prediction. Furthermore, by fitting the relationship between the prediction coefficient and the input vector, an accuracy benchmark for the prediction model is set, which further improves the prediction accuracy on apple odor information. Compared with traditional prediction methods, the system is simple to operate, low in cost, reliable, and portable, and it avoids damaging apples during freshness prediction, thereby realizing non-destructive testing.
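
Tent-chaotic initialization of a population is a common, well-defined recipe, even though the paper's exact SSA variant is not spelled out here. A minimal NumPy sketch, with illustrative bounds and parameters, could look as follows.

```python
import numpy as np

def tent_chaotic_population(n_agents, dim, low, high, beta=0.7, z0=0.37):
    # Iterate the Tent map: z <- z/beta if z < beta, else (1-z)/(1-beta),
    # scaling each chaotic value into the search bounds [low, high].
    pop = np.empty((n_agents, dim))
    z = z0
    for i in range(n_agents):
        for j in range(dim):
            z = z / beta if z < beta else (1.0 - z) / (1.0 - beta)
            pop[i, j] = low + z * (high - low)
    return pop

# e.g. initial candidate weight vectors for the BP network, bounded in [-1, 1]:
init = tent_chaotic_population(n_agents=30, dim=50, low=-1.0, high=1.0)
```

Compared with uniform random initialization, the chaotic sequence spreads the initial sparrows more evenly over the search space, which is the usual motivation for this scheme.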

28 pages, 3115 KiB  
Article
CoSOV1Net: A Cone- and Spatial-Opponent Primary Visual Cortex-Inspired Neural Network for Lightweight Salient Object Detection
by Didier Ndayikengurukiye and Max Mignotte
Sensors 2023, 23(14), 6450; https://doi.org/10.3390/s23146450 - 17 Jul 2023
Abstract
Salient object-detection models attempt to mimic the human visual system's ability to select relevant objects in images. To this end, the development of deep neural networks on high-end computers has recently achieved high performance. However, developing deep neural network models with the same performance for resource-limited vision sensors or mobile devices remains a challenge. In this work, we propose CoSOV1Net, a novel lightweight salient object-detection neural network model inspired by the cone- and spatial-opponent processes of the primary visual cortex (V1), which inextricably link color and shape in human color perception. Our proposed model is trained from scratch, without using backbones from image classification or other tasks. Experiments on the most widely used and challenging datasets for salient object detection show that CoSOV1Net achieves competitive performance (e.g., Fβ = 0.931 on the ECSSD dataset) relative to state-of-the-art salient object-detection models, while having few parameters (1.14 M), low FLOPS (1.4 G), and high FPS (211.2) on a GPU (Nvidia GeForce RTX 3090 Ti) compared with the state of the art in lightweight and non-lightweight salient object detection. CoSOV1Net is thus a lightweight salient object-detection model that can be adapted to mobile environments and resource-constrained devices.
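
The cone-opponent idea can be illustrated with the classic opponent-color transform, although CoSOV1Net's actual layers are learned rather than fixed. A toy NumPy version:

```python
import numpy as np

def opponent_channels(rgb):
    # rgb: (H, W, 3) array in [0, 1].
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g                     # red-green opponency
    by = b - (r + g) / 2.0         # blue-yellow opponency
    lum = (r + g + b) / 3.0        # achromatic (luminance) channel
    return np.stack([rg, by, lum], axis=-1)
```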

17 pages, 41360 KiB  
Article
Traversable Region Detection and Tracking for a Sparse 3D Laser Scanner for Off-Road Environments Using Range Images
by Jhonghyun An
Sensors 2023, 23(13), 5898; https://doi.org/10.3390/s23135898 - 25 Jun 2023
Cited by 1
Abstract
This study proposes a method for detecting and tracking traversable regions in off-road conditions for unmanned ground vehicles (UGVs). Off-road conditions, such as rough terrain or fields, present significant challenges for UGV navigation, and detecting and tracking traversable regions is essential to ensure safe and efficient operation. Using a 3D laser scanner and a range-image-based approach, a method is proposed for detecting traversable regions under off-road conditions; this is followed by a Bayesian fusion algorithm for tracking the traversable regions across consecutive frames. The range-image-based traversable-region-detection approach enables efficient processing of point cloud data from a 3D laser scanner, allowing the identification of traversable areas that are safe for the UGV to drive on. The effectiveness of the proposed method was demonstrated using real-world data collected during UGV operations on rough terrain, highlighting its potential for improving UGV navigation capabilities in challenging environments.
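
Projecting a LiDAR point cloud into a range image is the well-known preprocessing step this pipeline relies on. The NumPy sketch below does a spherical projection; the resolution and vertical field of view are illustrative assumptions, not the paper's sensor parameters.

```python
import numpy as np

def to_range_image(points, h=64, w=1024, v_fov=(-25.0, 3.0)):
    # points: (N, 3) array of x, y, z in the sensor frame.
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                          # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-6))      # elevation angle
    u = ((yaw + np.pi) / (2 * np.pi) * w).astype(int) % w
    lo, hi = np.radians(v_fov[0]), np.radians(v_fov[1])
    v = ((hi - pitch) / (hi - lo) * h).astype(int)
    img = np.full((h, w), np.inf)
    ok = (v >= 0) & (v < h)
    np.minimum.at(img, (v[ok], u[ok]), r[ok])       # keep the nearest return
    return img
```

Traversability cues (ground slope, height discontinuities) can then be computed per pixel over this compact 2D grid instead of over the raw, sparse point cloud.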

14 pages, 2189 KiB  
Article
DenseTextPVT: Pyramid Vision Transformer with Deep Multi-Scale Feature Refinement Network for Dense Text Detection
by My-Tham Dinh, Deok-Jai Choi and Guee-Sang Lee
Sensors 2023, 23(13), 5889; https://doi.org/10.3390/s23135889 - 25 Jun 2023
Abstract
Detecting dense text in scene images is a challenging task due to the high variability, complexity, and overlap of text areas. To adequately distinguish text instances with high density in scenes, we propose an efficient approach called DenseTextPVT. We first generate high-resolution features at different levels to enable accurate dense text detection, which is essential for dense prediction tasks. Additionally, to enhance the feature representation, we design a Deep Multi-Scale Feature Refinement Network (DMFRN), which effectively detects texts of varying sizes, shapes, and fonts, including small-scale texts. DenseTextPVT then draws on the pixel aggregation (PA) similarity-vector algorithm to cluster text pixels into the correct text kernels in the post-processing step. In this way, our proposed method enhances the precision of text detection and effectively reduces overlap between text regions in dense, adjacent text in natural images. Comprehensive experiments indicate the effectiveness of our method on the TotalText, CTW1500, and ICDAR-2015 benchmark datasets in comparison with existing methods.
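
Pixel aggregation is described in the PA literature as clustering text pixels toward kernel embeddings. The following NumPy sketch captures that post-processing idea in its simplest form (nearest kernel mean under a distance threshold); the actual PA algorithm grows regions iteratively and differs in detail.

```python
import numpy as np

def aggregate_pixels(sim_vecs, text_mask, kernel_labels, dist_thr=0.8):
    # sim_vecs: (H, W, D) per-pixel similarity vectors
    # text_mask: (H, W) bool, pixels predicted as text
    # kernel_labels: (H, W) int, 0 = background, k > 0 = kernel id
    out = np.zeros_like(kernel_labels)
    kernel_ids = [k for k in np.unique(kernel_labels) if k > 0]
    # Mean similarity vector of each shrunken text kernel.
    means = {k: sim_vecs[kernel_labels == k].mean(axis=0) for k in kernel_ids}
    ys, xs = np.nonzero(text_mask)
    for y, x in zip(ys, xs):
        dists = {k: np.linalg.norm(sim_vecs[y, x] - m) for k, m in means.items()}
        if dists:
            k = min(dists, key=dists.get)
            if dists[k] < dist_thr:   # assign only if close enough to a kernel
                out[y, x] = k
    return out
```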
