Object Detection, Segmentation and Categorization in Artificial Intelligence

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Artificial Intelligence".

Deadline for manuscript submissions: 15 April 2024 | Viewed by 8373

Special Issue Editors

School of Electronic Engineering, Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi’an 710071, China
Interests: computational intelligence; evolutionary computation; neural networks; multi-objective optimization; remote sensing image interpretation
Academy of Advanced Interdisciplinary Research, Xidian University, Xi’an 710071, China
Interests: image processing; pattern recognition; machine learning; change detection; few-shot knowledge graph
Shaanxi Key Laboratory of Underwater Information Technology, School of Marine Science and Technology, Northwestern Polytechnical University, Xi’an 710072, China
Interests: underwater object detection; machine learning; neural networks; few-shot knowledge graph
School of Electronic Engineering, Xidian University, Xi’an 710071, China
Interests: visualization system simulation and modeling; intelligent algorithm research on body posture and expression; image processing

Special Issue Information

Dear Colleagues,

Object detection, segmentation and categorization are core tasks of artificial intelligence in applications such as image understanding, intelligent interpretation of remote sensing images, medical image analysis, augmented reality, object recognition and tracking, object retrieval, video surveillance and autonomous vehicles. Owing to their wide practical applicability, these tasks have attracted considerable attention from researchers around the world. Object detection involves extracting both the location and class of specific objects, or of all instances, in an image. Segmentation determines the boundaries of same-class objects across an entire scene. Categorization assigns class labels to individual pixels or whole images. Because these are essential steps in image processing and subsequent analysis, improved detection, segmentation and categorization techniques are needed to achieve higher performance. Although deep learning has achieved unprecedented success in this field, open application issues remain that must be comprehensively addressed.

This Special Issue aims to gather papers presenting recent advances in object detection, segmentation and categorization with novel and impactful applications. Topics of interest include, but are not limited to:

  • Machine learning for object detection, segmentation and categorization;
  • Multiobjective or multitask optimization for object detection, segmentation and categorization;
  • Object detection, segmentation and categorization based on evolutionary computation;
  • Remote sensing/teaching image object detection, segmentation and categorization;
  • Medical image segmentation and categorization;
  • Underwater target detection, identification and tracking;
  • Ocean acoustic remote sensing;
  • Sensor signal detection, identification and categorization.

Dr. Hao Li
Dr. Fei Xie
Dr. Jianbo Zhou
Dr. Jieyi Liu
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • image processing
  • object detection
  • image segmentation
  • image classification
  • deep learning
  • neural networks
  • computational intelligence

Published Papers (7 papers)


Research

12 pages, 7760 KiB  
Article
Rotated Object Detection with Circular Gaussian Distribution
by Hang Xu, Xinyuan Liu, Yike Ma, Zunjie Zhu, Shuai Wang, Chenggang Yan and Feng Dai
Electronics 2023, 12(15), 3265; https://doi.org/10.3390/electronics12153265 - 29 Jul 2023
Viewed by 1008
Abstract
Rotated object detection is a challenging task because rotated objects must be located and separated effectively from the background. For rotation prediction, researchers have explored numerous regression-based and classification-based approaches to estimate the rotation angle. However, both paradigms suffer from flaws that make accurate angle prediction difficult, such as the multi-solution and boundary problems, which limit the performance upper bound of detectors. To address these issues, we propose a circular Gaussian distribution (CGD)-based method for angle prediction. We convert the labeled angle into a discrete circular Gaussian distribution spanning a single minimal positive period and let the model predict the distribution parameters instead of directly regressing or classifying the angle. To improve the overall efficiency of the detection model, we also design a rotated object detector based on CenterNet. Experimental results on several public datasets demonstrate the effectiveness and superior performance of our method. In particular, our approach achieves better results than state-of-the-art competitors, with improvements of 1.92% and 1.04% in AP on the HRSC2016 and DOTA datasets, respectively.
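To make the angle-encoding idea concrete, the sketch below converts an angle label into a discrete circular Gaussian distribution over one minimal positive period. The bin count, period, and standard deviation are illustrative assumptions, not the authors' exact settings.

```python
import numpy as np

def circular_gaussian_target(angle, num_bins=180, period=np.pi, sigma_bins=6.0):
    """Encode an angle label as a discrete circular Gaussian distribution.

    The period is split into `num_bins` bins and a Gaussian is wrapped around
    the circle, so bins near 0 and near the end of the period both receive
    probability mass, avoiding the usual boundary discontinuity. All settings
    here are placeholders, not the paper's exact configuration.
    """
    centers = (np.arange(num_bins) + 0.5) * period / num_bins
    # Circular (wrapped) distance between each bin centre and the label.
    diff = np.abs(centers - (angle % period))
    diff = np.minimum(diff, period - diff)
    diff_bins = diff * num_bins / period          # distance measured in bins
    dist = np.exp(-0.5 * (diff_bins / sigma_bins) ** 2)
    return dist / dist.sum()                      # normalise to a distribution

# Example: an angle close to the period boundary still yields a smooth target.
target = circular_gaussian_target(0.02)
print(target.argmax(), target[:3], target[-3:])
```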

16 pages, 5810 KiB  
Article
An Underwater Dense Small Object Detection Model Based on YOLOv5-CFDSDSE
by Jingyang Wang, Yujia Li, Junkai Wang and Ying Li
Electronics 2023, 12(15), 3231; https://doi.org/10.3390/electronics12153231 - 26 Jul 2023
Viewed by 1293
Abstract
Underwater target detection is a key technology for exploring and developing the ocean. Because underwater targets are often very dense, mutually occluded, and affected by light, the detected objects are often unclear, so underwater target detection faces unique challenges. To improve the performance of underwater target detection, this paper proposes a new target detection model, YOLOv5-CFDSDSE, based on YOLOv5s. In this model, the CFnet structure (an efficient fusion of C3 and FasterNet) is used to optimize the YOLOv5 network structure, improving accuracy while reducing the number of parameters. Dyhead is then adopted to achieve better scale, spatial, and task perception. In addition, a small object detection (SD) layer is added to combine feature information from different scales effectively, retain more detailed information, and improve the detection of small objects. Finally, the squeeze-and-excitation (SE) attention mechanism is introduced to enhance the feature extraction ability of the model. Comparison and ablation experiments were conducted on the self-built underwater small object dataset URPC_UODD. The experimental results show that the proposed model is more accurate than the original YOLOv5s and other baseline models in the underwater dense small object detection task, while using fewer parameters than YOLOv5s. YOLOv5-CFDSDSE is therefore an innovative solution for underwater target detection tasks.
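The squeeze-and-excitation (SE) component mentioned above is a standard channel-attention block; a minimal PyTorch sketch of its usual form is given below. The reduction ratio and the placement inside YOLOv5 are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Standard squeeze-and-excitation channel attention (illustrative form)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # squeeze: global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                           # excitation: per-channel weights
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                # reweight feature channels

# Example: attach to a backbone feature map.
feat = torch.randn(2, 256, 40, 40)
print(SEBlock(256)(feat).shape)   # torch.Size([2, 256, 40, 40])
```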

22 pages, 9681 KiB  
Article
A Specialized Database for Autonomous Vehicles Based on the KITTI Vision Benchmark
by Juan I. Ortega-Gomez, Luis A. Morales-Hernandez and Irving A. Cruz-Albarran
Electronics 2023, 12(14), 3165; https://doi.org/10.3390/electronics12143165 - 21 Jul 2023
Cited by 2 | Viewed by 1135
Abstract
Autonomous driving systems have emerged with the promise of preventing accidents. The first critical aspect of these systems is perception, where the usual practice is to use top-view point clouds as the input; however, existing databases in this area only provide scenes as 3D point clouds with their respective labels. This creates an opportunity, and the objective of this work is to present a database whose scenes are given directly in the top view, with labels in the same plane, together with a segmentation map for each scene as a label for segmentation work. The method used to create the proposed database is presented, covering how 3D point clouds are transformed into 2D top-view images, how detection labels in the plane are generated, and how a neural network is implemented for the generated segmentation maps of each scene. Using this method, a database of 7481 scenes was developed, each with its corresponding top-view image, label file, and segmentation map; the road segmentation metrics are as follows: F1, 95.77; AP, 92.54; ACC, 97.53; PRE, 94.34; and REC, 97.25. This article presents the development of a database for segmentation and detection tasks, highlighting its use for environmental perception work.
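As a rough illustration of the 3D-to-top-view conversion described above, the sketch below rasterises a LiDAR point cloud into a bird's-eye-view height image. The function name, ranges, resolution, and ground-height default are illustrative assumptions, not the database's actual settings.

```python
import numpy as np

def pointcloud_to_topview(points, x_range=(0.0, 70.0), y_range=(-40.0, 40.0),
                          resolution=0.1, z_min=-3.0):
    """Rasterise an (N, 3) point cloud into a top-view height image.

    Each cell stores the maximum point height falling into it; empty cells keep
    the assumed ground height `z_min`. All parameters are placeholders.
    """
    h = int((x_range[1] - x_range[0]) / resolution)
    w = int((y_range[1] - y_range[0]) / resolution)
    img = np.full((h, w), z_min, dtype=np.float32)

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    keep = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z = x[keep], y[keep], z[keep]

    rows = ((x - x_range[0]) / resolution).astype(int)
    cols = ((y - y_range[0]) / resolution).astype(int)
    np.maximum.at(img, (rows, cols), z.astype(np.float32))  # highest point per cell
    return img

# Example with random points standing in for a KITTI scan.
cloud = np.random.uniform([0, -40, -2], [70, 40, 1], size=(10000, 3))
print(pointcloud_to_topview(cloud).shape)   # (700, 800)
```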

18 pages, 7502 KiB  
Article
Aircraft Detection and Fine-Grained Recognition Based on High-Resolution Remote Sensing Images
by Qinghe Guan, Ying Liu, Lei Chen, Shuang Zhao and Guandian Li
Electronics 2023, 12(14), 3146; https://doi.org/10.3390/electronics12143146 - 20 Jul 2023
Viewed by 834
Abstract
To realize the detection and recognition of specific aircraft types in remote sensing images, this paper proposes an algorithm called Fine-grained S2ANet (FS2ANet), based on an improved Single-shot Alignment Network (S2ANet), for remote sensing aircraft object detection and fine-grained recognition. Firstly, to address the imbalanced number of instances of various aircraft in the dataset, we perform data augmentation on some remote sensing images using flipping and color space transformations. Secondly, we select ResNet101 as the backbone, combine space-to-depth (SPD) with an improved FPN structure to construct the FPN-SPD module, and build an aircraft fine feature focusing module (AF3M) in the detection head, which reduces the loss of fine-grained information during feature extraction, enhances the network's ability to extract fine aircraft features, and improves the detection accuracy of small remote sensing aircraft objects. Finally, we use SkewIoU based on Kalman filtering (KFIoU) as the regression loss function, improving the algorithm's convergence speed and the regression accuracy of the object boxes. Experiments on the detection and fine-grained recognition of 11 types of remote sensing aircraft objects, such as the Boeing 737, A321, and C919, show that FS2ANet achieves an mAP0.5 of 46.82%, which is 3.87% higher than S2ANet, and that it is applicable to remote sensing aircraft object detection and fine-grained recognition.
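The space-to-depth (SPD) operation referenced in the FPN-SPD module rearranges spatial blocks into channels so that fine detail is preserved when resolution is reduced. The sketch below shows the generic operation only; the block size of 2 and where it sits in the FPN are assumptions.

```python
import torch

def space_to_depth(x: torch.Tensor, block: int = 2) -> torch.Tensor:
    """Rearrange (B, C, H, W) into (B, C*block*block, H//block, W//block).

    Spatial detail is moved into the channel dimension instead of being
    discarded by strided downsampling; block size 2 is illustrative.
    """
    b, c, h, w = x.shape
    assert h % block == 0 and w % block == 0
    x = x.view(b, c, h // block, block, w // block, block)
    x = x.permute(0, 1, 3, 5, 2, 4).contiguous()
    return x.view(b, c * block * block, h // block, w // block)

# Example: a 64-channel feature map is halved spatially but widened in channels.
feat = torch.randn(1, 64, 128, 128)
print(space_to_depth(feat).shape)   # torch.Size([1, 256, 64, 64])
```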

17 pages, 8109 KiB  
Article
Eye-Blink Event Detection Using a Neural-Network-Trained Frame Segment for Woman Drivers in Saudi Arabia
by Muna S. Al-Razgan, Issema Alruwaly and Yasser A. Ali
Electronics 2023, 12(12), 2699; https://doi.org/10.3390/electronics12122699 - 16 Jun 2023
Viewed by 1032
Abstract
Women have been allowed to drive in Saudi Arabia since 2018, when a 30-year ban was revoked, subject to the country's traffic rules. Conventional drivers are often monitored for safe driving through their facial reactions, eye blinks, and expressions. Because novice women drivers in Saudi Arabia have had less exposure to driving experience and vehicle handling features, technical assistance and physical observation are mandatory. Such observations are sensed as images/video frames for computer-based analysis, and precise computer vision processes are employed to detect and classify events using image processing. The identified events are unique to novice women drivers in Saudi Arabia, assisting with their vehicle usage. This article introduces the Event Detection using Segmented Frame (ED-SF) method to improve abnormal Eye-Blink Detection (EBD) for women drivers. In this process, the eye region is segmented using variation pixel extraction, which requires textural variation identified from different frames, under the condition that the frames are continuous during event detection. The method employs a convolutional neural network with two hidden-layer processes: the first layer identifies continuous and discrete frame differentiations, and the second layer segments the eye region based on the textural variation. The variations and discrete frames are used to train the neural network to prevent segmentation errors in the extraction process. The frame segment changes are therefore used for identifying expressions across different inputs and texture luminosities. The method applies to less-experienced women drivers lacking road-safety knowledge who have begun driving in countries similar to Saudi Arabia. The proposed method improves EBD accuracy by 9.5% compared to Hybrid Convolutional Neural Networks (HCNN), HCNN combined with Long Short-Term Memory networks (HCNN + LSTM), Two-Stream Spatial-Temporal Graph Convolutional Networks (2S-STGCN), and the Customized Driving Fatigue Detection Method (CDFDM).
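The method relies on textural variation between continuous frames. As a hedged illustration of that first step only, the sketch below thresholds the difference between consecutive grayscale frames inside an assumed eye region; the region box and threshold are placeholders, not the paper's trained segmentation.

```python
import numpy as np

def frame_variation(prev_frame: np.ndarray, curr_frame: np.ndarray,
                    eye_box=(100, 160, 200, 300), threshold=25):
    """Return a binary map of textural change inside an assumed eye region.

    `eye_box` = (row0, row1, col0, col1) is a hypothetical region of interest;
    in the paper the eye region is segmented by the trained network instead.
    """
    r0, r1, c0, c1 = eye_box
    diff = np.abs(curr_frame[r0:r1, c0:c1].astype(int) -
                  prev_frame[r0:r1, c0:c1].astype(int))
    return (diff > threshold).astype(np.uint8)

# Example with synthetic grayscale frames.
f0 = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
f1 = np.random.randint(0, 256, (480, 640), dtype=np.uint8)
print(frame_variation(f0, f1).mean())   # fraction of changed pixels in the region
```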

17 pages, 4345 KiB  
Article
Underwater Noise Modeling and Its Application in Noise Classification with Small-Sized Samples
by Guoli Song, Xinyi Guo, Qianchu Zhang, Jun Li and Li Ma
Electronics 2023, 12(12), 2669; https://doi.org/10.3390/electronics12122669 - 14 Jun 2023
Viewed by 1059
Abstract
Underwater noise classification is of great significance for identifying ships and other vehicles, and it is helpful in ensuring a marine habitat-friendly, noise-free ocean environment. However, a key challenge is the small size of available underwater noise samples: because the noise is influenced by multiple sources, it is often difficult to determine and label which source, or which two sources, are dominant. At present, research addressing this problem focuses on noise image processing or advanced computing techniques rather than starting from the noise generation mechanism and its modeling. Here, a typical underwater noise generation model (UNGM) is established to augment noise samples. It generates noise with a specified kurtosis according to the spectral and statistical characteristics of the actual noise and through filter design. In addition, an underwater noise classification model is developed based on the UNGM and convolutional neural networks (CNNs). The UNGM-CNN-based model is then used to classify nine types of typical underwater noise, with either the 1/3-octave noise spectrum level (NSL) or the power spectral density (PSD) as the input features. The results show that the approach effectively improves classification accuracy: by 1.59%, from 98.27% to 99.86%, with NSL input features, and by 2.44%, from 97.45% to 99.89%, with PSD input features. Additionally, the UNGM-CNN-based method improves macro-precision and macro-recall by approximately 0.87% and 0.83%, respectively, compared to the CNN-based method. These results demonstrate the effectiveness of the UNGM for noise classification with small-sized samples.
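The augmentation idea, synthesising noise that matches a target spectral shape, can be sketched as follows: white Gaussian noise is shaped in the frequency domain by an assumed power-spectral-density profile. The 1/f-type slope and sampling rate are illustrative; the paper's UNGM additionally matches kurtosis and the measured spectrum via filter design.

```python
import numpy as np

def generate_shaped_noise(n_samples=48000, fs=48000.0, slope=-1.0, seed=0):
    """Generate noise whose PSD follows an assumed f**slope shape.

    White Gaussian noise is coloured in the frequency domain; the spectral
    slope is a placeholder for the measured spectrum a real UNGM would match.
    """
    rng = np.random.default_rng(seed)
    white = rng.standard_normal(n_samples)
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    freqs[0] = freqs[1]                         # avoid division by zero at DC
    spectrum *= freqs ** (slope / 2.0)          # amplitude shaping = sqrt of PSD shape
    coloured = np.fft.irfft(spectrum, n=n_samples)
    return coloured / np.std(coloured)          # normalise to unit variance

sample = generate_shaped_noise()
print(sample.shape, round(float(np.std(sample)), 3))
```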

18 pages, 3959 KiB  
Article
Multi-Mode Channel Position Attention Fusion Side-Scan Sonar Transfer Recognition
by Jian Wang, Haisen Li, Guanying Huo, Chao Li and Yuhang Wei
Electronics 2023, 12(4), 791; https://doi.org/10.3390/electronics12040791 - 04 Feb 2023
Cited by 1 | Viewed by 1162
Abstract
Side-scan sonar (SSS) target recognition is an important part of building an underwater detection system and ensuring high-precision perception of underwater information. In this paper, a novel multi-channel, multi-position attention mechanism is proposed for a multi-modal, phased-transfer side-scan sonar target recognition model. Optical images from the ImageNet database, synthetic aperture radar (SAR) images, and SSS images are used as the training datasets. The backbone network for feature extraction is transferred and trained using a staged transfer learning method. The head network used to predict the target type extracts attention features from the SSS images through the multi-channel and multi-position attention mechanism and subsequently performs target recognition. The proposed model is tested on an SSS test dataset, evaluated using several metrics, and compared with different recognition algorithms. The results show that the model has better recognition accuracy and robustness for SSS targets.
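As a hedged sketch of combining channel attention with positional (spatial) attention, the block below chains a channel gate with a spatial gate in a generic, CBAM-style arrangement; it illustrates the general idea rather than the paper's exact multi-channel, multi-position mechanism, and the reduction ratio and kernel size are assumptions.

```python
import torch
import torch.nn as nn

class ChannelPositionAttention(nn.Module):
    """Generic channel + spatial attention block (illustrative, CBAM-like)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)                        # reweight channels
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * self.spatial_gate(pooled)                # reweight positions

# Example on a sonar-sized feature map.
feat = torch.randn(2, 128, 32, 32)
print(ChannelPositionAttention(128)(feat).shape)   # torch.Size([2, 128, 32, 32])
```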
