Recent Advances in Computer Vision: Technologies and Applications

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Computer Science & Engineering".

Deadline for manuscript submissions: closed (15 March 2024) | Viewed by 24551

Special Issue Editors


Guest Editor
College of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255049, China
Interests: computer vision; machine learning; intelligent optimization control

Guest Editor
College of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255049, China
Interests: person re-identification; power vision technology

Guest Editor
College of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255049, China
Interests: computer vision; machine learning; intelligent optimization control

Special Issue Information

Dear Colleagues,

In recent decades, the theory and technology of computer vision have advanced rapidly, driven by the growth of computing power and intelligent learning algorithms. Computer vision has achieved great success in many fields, such as object detection and tracking, image analysis and understanding, identity recognition, and smart cities. Automated image analysis and information extraction from massive amounts of real-world data have significantly increased productivity in practical engineering applications. Deep learning and other advanced methods are accelerating this transformation, in which computer vision will play a key role thanks to its power in information processing, decision-making, and knowledge management.

This Special Issue aims to gather recent advances in both theoretical and practical studies of computer vision, emphasizing object detection and tracking, image analysis and understanding, and pattern recognition. Potential topics include, but are not limited to, the use of computer vision techniques such as neural networks, fuzzy logic, metaheuristics, and expert systems in the following fields:

(1) Image processing, analysis and understanding;

(2) Object detection and visual tracking;

(3) Behavior analysis;

(4) Computer vision and smart cities;

(5) Industrial machine vision;

(6) Supervised/semi-supervised/unsupervised learning;

(7) Reinforcement learning;

(8) Deep learning theory and applications;

(9) Pattern recognition;

(10) Data mining;

(11) Feature engineering.

Technical Program Committee Member:

1. Qilei (Kevin) Li, Queen Mary University of London

Dr. Mingliang Gao
Dr. Guofeng Zou
Prof. Dr. Zhenzhou Wang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website. Once registered, authors can access the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and written in good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • computer vision
  • image processing
  • pattern recognition
  • image analysis and understanding
  • machine learning
  • deep learning

Related Special Issue

Published Papers (9 papers)


Research


19 pages, 1069 KiB  
Article
PCNet: Leveraging Prototype Complementarity to Improve Prototype Affinity for Few-Shot Segmentation
by Jing-Yu Wang, Shang-Kun Liu, Shi-Cheng Guo, Cheng-Yu Jiang and Wei-Min Zheng
Electronics 2024, 13(1), 142; https://doi.org/10.3390/electronics13010142 - 28 Dec 2023
Viewed by 586
Abstract
With the advent of large-scale datasets, significant advancements have been made in image semantic segmentation. However, the annotation of these datasets necessitates substantial human and financial resources. Therefore, the focus of research has shifted towards few-shot semantic segmentation, which leverages a small number of labeled samples to effectively segment unknown categories. Current mainstream methods use the meta-learning framework to achieve model generalization, and the main challenges are as follows. (1) The trained model is biased towards the seen classes, so it misactivates seen classes when segmenting unseen ones, making it difficult to achieve the ideal class-agnostic behavior. (2) When the sample size is limited, there exists an intra-class gap between the provided support images and the query images, significantly impacting the model's generalization capability. To solve these two problems, we propose a network with prototype complementarity characteristics (PCNet). Specifically, we first generate a self-support query prototype based on the query image. Through self-distillation, the query prototype and the support prototype perform complementary feature learning, which effectively reduces the influence of the intra-class gap on model generalization. A standard semantic segmentation model is introduced to segment the seen classes during training to achieve accurate shielding of irrelevant classes. After that, we use the rough prediction map to extract its background prototype and shield the background in the query image with this background prototype. In this way, we obtain more accurate fine-grained segmentation results. The proposed method exhibits superiority in extensive experiments conducted on the PASCAL-5i and COCO-20i datasets. We achieve new state-of-the-art results in the few-shot semantic segmentation task, with mIoU scores of 71.27% and 51.71%, respectively, in the 5-shot setting. Comprehensive ablation experiments and visualization studies show that the proposed method has a significant effect on few-shot semantic segmentation. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
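For readers unfamiliar with prototype-based few-shot segmentation, the core matching step common to methods like PCNet (masked average pooling of support features into a class prototype, then scoring query locations against it) can be sketched as follows. This is not the paper's architecture; the function names and tensor shapes are illustrative:

```python
import numpy as np

def masked_average_prototype(feat, mask):
    """Masked average pooling: average the foreground support features into one prototype.
    feat: (C, H, W) support feature map; mask: (H, W) binary foreground mask."""
    m = mask.reshape(1, -1)
    f = feat.reshape(feat.shape[0], -1)
    return (f * m).sum(axis=1) / (m.sum() + 1e-6)

def prototype_similarity_map(feat, proto):
    """Cosine similarity between each query feature location and the prototype."""
    f = feat.reshape(feat.shape[0], -1)
    f = f / (np.linalg.norm(f, axis=0, keepdims=True) + 1e-6)
    p = proto / (np.linalg.norm(proto) + 1e-6)
    return (p @ f).reshape(feat.shape[1:])
```

A "self-support query prototype" in the paper's sense would pool the query's own features in the same way, using a rough prediction as the mask.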

14 pages, 1300 KiB  
Article
GMIW-Pose: Camera Pose Estimation via Global Matching and Iterative Weighted Eight-Point Algorithm
by Fan Chen, Yuting Wu, Tianjian Liao, Huiquan Zeng, Sujian Ouyang and Jiansheng Guan
Electronics 2023, 12(22), 4689; https://doi.org/10.3390/electronics12224689 - 18 Nov 2023
Viewed by 1092
Abstract
We propose a novel approach, GMIW-Pose, to estimate the relative camera poses between two views. This method leverages a Transformer-based global matching module to obtain robust 2D–2D dense correspondences, followed by iterative refinement of matching weights using ConvGRU. Ultimately, the camera’s relative pose is determined through the weighted eight-point algorithm. Compared with the previous best two-view pose estimation method, GMIW-Pose reduced the Absolute Trajectory Error (ATE) by 24% on the TartanAir dataset; it achieved the best or second-best performance in multiple scenarios of the TUM-RGBD and KITTI datasets without fine-tuning, among which ATE decreased by 22% on the TUM-RGBD dataset. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
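The final stage of GMIW-Pose, the weighted eight-point algorithm, is a classical construction: each correspondence contributes one row to a linear system in the nine entries of the fundamental/essential matrix, rows are scaled by the learned weights, and the solution is the null vector followed by a rank-2 projection. A minimal sketch (our own function name and conventions; Hartley normalization of coordinates is omitted for brevity):

```python
import numpy as np

def weighted_eight_point(x1, x2, w):
    """Estimate F (up to scale) from weighted correspondences x2^T F x1 = 0.
    x1, x2: (N, 2) normalized image coordinates; w: (N,) non-negative weights."""
    n = x1.shape[0]
    # Each row encodes x2^T F x1 = 0 with F flattened row-major.
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0], x1[:, 1], np.ones(n),
    ])
    A = A * w[:, None]                      # down-weight unreliable matches
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)                # null vector of the weighted system
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0.0                              # enforce the rank-2 constraint
    return U @ np.diag(S) @ Vt
```

In GMIW-Pose the weights w come from the iteratively refined matching confidence, so outlier correspondences contribute little to the solution.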

19 pages, 852 KiB  
Article
Hierarchical Classification for Large-Scale Learning
by Boshi Wang and Adrian Barbu
Electronics 2023, 12(22), 4646; https://doi.org/10.3390/electronics12224646 - 14 Nov 2023
Viewed by 644
Abstract
Deep neural networks (DNNs) have drawn much attention due to their success in various vision tasks. Current DNNs are used on data with a relatively small number of classes (e.g., 1000 or less) and employ a fully connected layer for classification, which allocates one neuron for each class; thus, per-example classification scales as O(K) with the number of classes K. This approach is computationally intensive for many real-life applications where the number of classes is very large (e.g., tens of thousands of classes). To address this problem, our paper introduces a hierarchical approach for classification with a large number of classes that scales as O(√K) and could be extended to O(log K) with a deeper hierarchy. The method, called Hierarchical PPCA, uses a self-supervised pretrained feature extractor to obtain meaningful features and trains Probabilistic PCA models on the extracted features for each class separately, making it easy to add classes without retraining the whole model. The Mahalanobis distance is used to obtain the classification result. To speed up classification, the proposed Hierarchical PPCA framework clusters the image class models, represented as Gaussians, into a smaller number of super-classes using a modified k-means clustering algorithm. The speed-up is obtained by assigning a sample to a small number of the most likely super-classes and restricting the image classification to the image classes corresponding to these super-classes. The fact that the model is trained on each class separately makes it applicable to very large datasets such as the whole ImageNet with more than 10,000 classes. Experiments on three standard datasets (ImageNet-100, ImageNet-1k, and ImageNet-10k) indicate that the hierarchical classifier can achieve superior accuracy with up to a 16-fold speed increase compared to a standard fully connected classifier. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
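The underlying classification rule, one Gaussian model per class and assignment by minimum Mahalanobis distance, can be sketched with plain full-covariance Gaussians (the paper uses Probabilistic PCA models, which constrain the covariance; the helper names and regularization here are illustrative):

```python
import numpy as np

def fit_class_gaussians(X, y, eps=1e-3):
    """Fit an independent Gaussian (mean + precision matrix) per class label.
    Each class is trained separately, so new classes can be added without retraining."""
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov = np.cov(Xc, rowvar=False) + eps * np.eye(X.shape[1])  # regularize
        models[c] = (mu, np.linalg.inv(cov))
    return models

def classify(x, models):
    """Assign x to the class with the smallest Mahalanobis distance."""
    def maha(c):
        mu, prec = models[c]
        d = x - mu
        return d @ prec @ d
    return min(models, key=maha)
```

The hierarchical speed-up then amounts to running this rule twice: once over super-class Gaussians to shortlist candidates, and once over the classes inside the shortlisted super-classes.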

17 pages, 3541 KiB  
Article
Digital Restoration and 3D Virtual Space Display of Hakka Cardigan Based on Optimization of Numerical Algorithm
by Qianqian Yu and Guangzhou Zhu
Electronics 2023, 12(20), 4190; https://doi.org/10.3390/electronics12204190 - 10 Oct 2023
Cited by 3 | Viewed by 1085
Abstract
The Hakka cardigan stands as a quintessential representation of traditional Hakka attire, embodying not only the rich cultural heritage of a nation but also serving as a global cultural treasure. In this academic paper, we focus on a representative model to showcase the development of an autonomous 3D scanning system founded on an offline point cloud generation algorithm. Through a meticulous process involving the emulation of clothing pattern restoration, we employ a diverse array of software tools including Photoshop, Autodesk Maya, and CorelDRAW, harnessing graphic and image processing techniques to seamlessly transition from two-dimensional pattern restoration to a three-dimensional realm. This study revolves around the establishment of an autonomous 3D scanning system centered on a representative model, leveraging an offline point cloud generation algorithm. We incorporate the Laplace mesh deformation algorithm to execute conformal transformations on neighboring vertices of motion vertices, while delving into the fundamental methodologies behind digital restoration and the three-dimensional virtual presentation of Hakka cardigans. Our experiments culminate in the measurement of six three-dimensional clothing pieces, revealing the absolute deviation between the model and the actual clothing. Furthermore, when we compare the automatic measurements from 200 3D scanned human bodies with their manually obtained counterparts, the measurement error hovers at approximately 0.5 cm. This research endeavor charts an expedited pathway to achieve digital restoration and three-dimensional virtual representation of Hakka cardigans. It not only offers a novel perspective for the digital revitalization of traditional clothing but also serves as a valuable augmentation to contemporary methods of preserving traditional clothing. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)

13 pages, 1831 KiB  
Article
High-Level Hessian-Based Image Processing with the Frangi Neuron
by Tomasz Hachaj and Marcin Piekarczyk
Electronics 2023, 12(19), 4159; https://doi.org/10.3390/electronics12194159 - 7 Oct 2023
Viewed by 969
Abstract
The Frangi neuron proposed in this work is a complex element that allows high-level Hessian-based image processing. Its adaptive parameters (weights) can be trained using a minimum number of training data. In our experiment, we showed that just one image is enough to optimize the values of the weights. An intuitive application of the Frangi neuron is to use it in the image segmentation process. In order to test the performance of the Frangi neuron, we used diverse medical datasets on which second-order structures are visualized. The Frangi network presented in this paper, trained on a single image, proved to be significantly more effective than the U-net trained on the same dataset. For the datasets tested, the network performed better, as measured by the area under the receiver operating characteristic curve (ROC AUC), than both U-net and the Frangi algorithm. Moreover, the Frangi network performed several times faster than the non-GPU implementation of the Frangi algorithm. There is nothing to prevent the Frangi neuron from being used as a component of any other network to process two-dimensional images, for example, to detect certain second-order features in them. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
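The classical Frangi vesselness measure that this neuron generalizes is built from the eigenvalues of the Gaussian-smoothed Hessian. A minimal single-scale 2D sketch (our own implementation for illustration; the sensitivity constants beta and c normally need tuning per image):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frangi_vesselness(img, sigma=2.0, beta=0.5, c=15.0):
    """Single-scale 2D Frangi vesselness for bright tubular structures on a dark background."""
    # Hessian entries via Gaussian derivative filters (order = derivative per axis)
    Hxx = gaussian_filter(img, sigma, order=(0, 2))
    Hyy = gaussian_filter(img, sigma, order=(2, 0))
    Hxy = gaussian_filter(img, sigma, order=(1, 1))
    # Eigenvalues of the symmetric 2x2 Hessian, sorted so |l1| <= |l2|
    tmp = np.sqrt((Hxx - Hyy) ** 2 + 4.0 * Hxy ** 2)
    l1 = 0.5 * (Hxx + Hyy + tmp)
    l2 = 0.5 * (Hxx + Hyy - tmp)
    swap = np.abs(l1) > np.abs(l2)
    l1[swap], l2[swap] = l2[swap], l1[swap]
    l2_safe = np.where(np.abs(l2) < 1e-12, 1e-12, l2)
    Rb2 = (l1 / l2_safe) ** 2                  # blobness: low along a ridge
    S2 = l1 ** 2 + l2 ** 2                     # second-order structureness
    v = np.exp(-Rb2 / (2.0 * beta ** 2)) * (1.0 - np.exp(-S2 / (2.0 * c ** 2)))
    return np.where(l2 < 0, v, 0.0)            # bright ridges have l2 < 0
```

The Frangi neuron of the paper makes such parameters trainable; `skimage.filters.frangi` provides a multi-scale version of the fixed filter.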

24 pages, 7434 KiB  
Article
Software System for Automatic Grading of Paper Tests
by Vladimir Jocovic, Bosko Nikolic and Nebojsa Bacanin
Electronics 2023, 12(19), 4080; https://doi.org/10.3390/electronics12194080 - 28 Sep 2023
Cited by 1 | Viewed by 1307
Abstract
The advent of digital technology has revolutionized numerous aspects of modern life, including the field of assessment and testing. However, paper tests, despite their seemingly archaic nature, continue to hold a prominent position in various assessment domains. The accessibility, familiarity, security, cost-effectiveness, and versatility of paper tests collectively contribute to their continued prominence. Hence, numerous educational institutions responsible for conducting examinations involving a substantial number of candidates continue to rely on paper tests. Consequently, there arises a demand for the possibility of automated assessment of these tests, aiming to alleviate the burden on teaching staff, enhance objectivity in evaluation, and expedite the delivery of test results. Therefore, diverse software systems have been developed, showcasing the capability to automatically score specific question types. Thus, it becomes imperative to categorize related question types systematically, thereby facilitating a preliminary classification based on the content and format of the questions. This classification serves the purpose of enabling effective comparison among existing software solutions. In this research paper, we present the implementation of such a software system using artificial intelligence techniques, progressively expanding its capabilities to evaluate increasingly complex question types, with the ultimate objective of achieving a comprehensive evaluation of all question types encountered in paper-based tests. The system detailed above demonstrated a recognition success rate of 99.89% on a curated dataset consisting of 734,825 multiple-choice answers. For the matching type, it achieved a recognition success rate of 99.91% on 86,450 answers. In the case of short answer type, the system achieved a recognition success rate of 95.40% on 129,675 answers. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)

21 pages, 9047 KiB  
Article
Study on a Low-Illumination Enhancement Method for Online Monitoring Images Considering Multiple-Exposure Image Sequence Fusion
by Wenlong Zhao, Chengwei Jiang, Yunzhu An, Xiaopeng Yan and Chaofeng Dai
Electronics 2023, 12(12), 2654; https://doi.org/10.3390/electronics12122654 - 13 Jun 2023
Cited by 1 | Viewed by 824
Abstract
To address the problem of low image quality caused by insufficient illumination, we propose a robust low-light image enhancement method that can effectively handle extremely dark images while also achieving good results for scenes with insufficient local illumination. First, we expose the images to different degrees to form a multi-exposure image sequence; then, we introduce global luminance-based weights and contrast-based gradient weights to fuse the multi-exposure image sequence; finally, we use a bootstrap filter to suppress the noise that may arise during processing. We employ pertinent assessment criteria, such as the Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM), Average Gradient (AG), and Figure Definition (FD), to assess the enhancement performance of the method. Experimental results show that the PSNR (31.32) and SSIM (0.74) are the highest in extremely dark scenes compared with most conventional algorithms such as MF, BIMEF, and LECARM. Similarly, when processing unevenly illuminated scenes such as "moonlit night" images, the AG (10.21) and FD (14.54) are at their maximum. Other evaluation metrics, such as Shannon entropy (SH), are also optimal in the above scenarios. In addition, we apply the algorithm to the online monitoring images of electric power equipment, improving image lightness while recovering detail information. The algorithm is robust for extremely dark images and natural low-light images, and the enhanced images show minimal distortion and the best appearance across different low-light scenes. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
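The fusion step described in the abstract, combining a multi-exposure sequence with per-pixel luminance and contrast weights, can be sketched as follows. This is a generic exposure-fusion sketch under our own weight definitions, not the paper's exact formulas:

```python
import numpy as np
from scipy.ndimage import laplace

def fuse_exposures(images, sigma=0.2):
    """Fuse a multi-exposure image sequence (values in [0, 1]) with per-pixel weights."""
    ws = []
    for img in images:
        lum_w = np.exp(-((img - 0.5) ** 2) / (2.0 * sigma ** 2))  # favor mid-tones
        con_w = np.abs(laplace(img)) + 1e-6                       # favor local contrast
        ws.append(lum_w * con_w)
    ws = np.stack(ws)
    ws /= ws.sum(axis=0, keepdims=True)            # normalize weights per pixel
    return (ws * np.stack(images)).sum(axis=0)
```

A practical pipeline would generate the sequence by synthetically re-exposing the single dark input (e.g., gamma curves) before fusing, and then denoise the result as the paper does.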

16 pages, 9226 KiB  
Article
Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network
by Zhenguo Liu, Zhao Li, Wengang Ao, Shaoshuang Zhang, Wenlong Liu and Yizhi He
Electronics 2023, 12(7), 1594; https://doi.org/10.3390/electronics12071594 - 28 Mar 2023
Viewed by 1333
Abstract
At present, in convolution-based stereo matching methods, 2D convolution is less computationally expensive and faster than 3D convolution. However, compared to the initial cost volume generated by a 3D convolution method, the initial cost volume generated by 2D convolution in the correlation layer lacks rich information, resulting in lower robustness in illumination-affected regions of the disparity map and thus reduced accuracy. Therefore, to address the lack of rich cost volume information in the 2D convolution method, this paper proposes a multi-scale adaptive cost attention and adaptive fusion stereo matching network (MCAFNet) based on AANet+. First, the extracted features are used for initial cost computation, and the cost volume is input into the multi-scale adaptive cost attention module to generate attention weights, which are then combined with the initial cost volume to suppress irrelevant information and enrich the cost volume. Second, the cost aggregation part of the model is improved: a multi-scale adaptive fusion module is added to improve the fusion efficiency of cross-scale cost aggregation. On the Scene Flow dataset, the EPE is reduced to 0.66. The error matching rates on the KITTI2012 and KITTI2015 datasets are 1.60% and 2.22%, respectively. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
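The correlation-layer cost volume that the abstract contrasts with 3D-convolution cost volumes is simply a per-disparity dot product of left and right feature maps. A minimal sketch (illustrative shapes; real networks build this on learned features and at multiple scales):

```python
import numpy as np

def correlation_cost_volume(feat_l, feat_r, max_disp):
    """Correlation-layer cost volume: dot product of left/right features per disparity.
    feat_l, feat_r: (C, H, W) feature maps; returns (max_disp, H, W) costs."""
    C, H, W = feat_l.shape
    cost = np.zeros((max_disp, H, W))
    for d in range(max_disp):
        # left pixel x matches right pixel x - d; columns x < d have no match
        cost[d, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, : W - d]).sum(axis=0)
    return cost
```

Because each disparity slice collapses the C feature channels into one scalar, this volume is cheaper but less informative than a 3D-convolution (concatenation) volume, which is the gap the paper's cost attention module aims to close.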

Review


24 pages, 5493 KiB  
Review
Techniques and Challenges of Image Segmentation: A Review
by Ying Yu, Chunping Wang, Qiang Fu, Renke Kou, Fuyu Huang, Boxiong Yang, Tingting Yang and Mingliang Gao
Electronics 2023, 12(5), 1199; https://doi.org/10.3390/electronics12051199 - 2 Mar 2023
Cited by 22 | Viewed by 15171
Abstract
Image segmentation, which has become a research hotspot in the field of image processing and computer vision, refers to the process of dividing an image into meaningful and non-overlapping regions, and it is an essential step in natural scene understanding. Despite decades of effort and many achievements, there are still challenges in feature extraction and model design. In this paper, we systematically review the advancement of image segmentation methods. According to the segmentation principles and image data characteristics, three important stages of image segmentation are reviewed: classic segmentation, collaborative segmentation, and semantic segmentation based on deep learning. We elaborate on the main algorithms and key techniques in each stage, compare and summarize the advantages and defects of different segmentation models, and discuss their applicability. Finally, we analyze the main challenges and development trends of image segmentation techniques. Full article
(This article belongs to the Special Issue Recent Advances in Computer Vision: Technologies and Applications)
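As a concrete example of the "classic segmentation" stage the review covers, Otsu's method picks a global threshold by maximizing the between-class variance of the image histogram. A minimal sketch (our own implementation; bin count and tie-breaking are illustrative choices):

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Classic Otsu thresholding: choose the split maximizing between-class variance."""
    hist, edges = np.histogram(img, bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_t, best_var = centers[0], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()      # class probabilities
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (p[:k] * centers[:k]).sum() / w0  # class means
        mu1 = (p[k:] * centers[k:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2        # between-class variance
        if var > best_var:
            best_var, best_t = var, centers[k - 1]
    return best_t
```

Pixels above the returned threshold form the foreground region; deep semantic segmentation replaces this global intensity rule with learned per-pixel classification.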
