
Deep Learning for Computer Vision and Image Processing Sensors

A special issue of Sensors (ISSN 1424-8220). This special issue belongs to the section "Sensing and Imaging".

Deadline for manuscript submissions: 20 July 2024 | Viewed by 15,191

Special Issue Editors


Prof. Dr. Dongsheng Zhang
Guest Editor
School of Mechanics and Engineering Science, Shanghai University, Shanghai 200444, China
Interests: opto-mechanics

Dr. Zhilong Su
Co-Guest Editor
Shanghai Key Laboratory of Mechanics in Energy Engineering, School of Mechanics and Engineering Science, Shanghai University, Shanghai, China
Interests: 3D vision & photomechanics; elastodynamics simulation; deep learning; numeric computation

Special Issue Information

Dear Colleagues,

This Special Issue aims to bring together recent research advances in deep learning for image-related tasks. The goal is to highlight the most innovative and impactful work that leverages deep learning algorithms to analyze and understand images captured by various sensors, including cameras, depth sensors, and other synthetic imaging devices. The issue will showcase the potential of deep learning in computer-vision-based inspection, measurement, and image processing, including new techniques for visible and infrared image classification, object detection, segmentation, and static and deformable object tracking. It will also explore the challenges and opportunities in integrating deep learning with image sensors for applications such as robotics, smart devices, and optical instruments.

Prof. Dr. Dongsheng Zhang
Dr. Zhilong Su 
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to the website; once registered, use the online submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Sensors is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • defect/fault diagnosis
  • object tracking
  • sparse and dense optical flow
  • image analysis
  • measurement

Published Papers (12 papers)


Research

24 pages, 3866 KiB  
Article
Dual-Dependency Attention Transformer for Fine-Grained Visual Classification
by Shiyan Cui and Bin Hui
Sensors 2024, 24(7), 2337; https://doi.org/10.3390/s24072337 - 06 Apr 2024
Viewed by 323
Abstract
Visual transformers (ViTs) are widely used in various visual tasks, such as fine-grained visual classification (FGVC). However, the self-attention mechanism at the core of visual transformers incurs quadratic computational and memory complexity. The sparse-attention and local-attention approaches currently favored by most researchers are unsuitable for FGVC tasks, which require dense feature extraction and global dependency modeling. To address this challenge, we propose a dual-dependency attention transformer model that decouples global token interactions into two paths: a position-dependency attention pathway based on the intersection of two types of grouped attention, and a semantic-dependency attention pathway based on dynamic central aggregation. This approach enhances the high-quality semantic modeling of discriminative cues while reducing the computational cost to linear complexity. In addition, we develop discriminative enhancement strategies that increase the sensitivity of high-confidence discriminative cue tracking using a knowledge-based representation approach. Experiments on three datasets, NABirds, CUB, and Dogs, show that the method is well suited to fine-grained image classification, striking a balance between computational cost and performance.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
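To make the complexity claim concrete, the quadratic cost the abstract refers to arises from the N × N attention-score matrix in vanilla self-attention. A minimal PyTorch sketch (illustrative only, not the authors' dual-dependency model):

```python
# Vanilla self-attention: the (N x N) score matrix makes time and memory
# scale quadratically with the number of tokens N.
import torch

def vanilla_attention(q, k, v):
    # q, k, v: (batch, N, dim)
    scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)  # (batch, N, N)
    return scores.softmax(dim=-1) @ v

x = torch.randn(1, 196, 64)              # 14 x 14 patch tokens, as in a small ViT
print(vanilla_attention(x, x, x).shape)  # torch.Size([1, 196, 64])
```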

17 pages, 10223 KiB  
Article
A Novel Method for Monocular Depth Estimation Using an Hourglass Neck Module
by Seung-Jin Oh and Seung-Ho Lee
Sensors 2024, 24(4), 1312; https://doi.org/10.3390/s24041312 - 18 Feb 2024
Viewed by 565
Abstract
In this paper, we propose a novel method for monocular depth estimation using an hourglass neck module. The proposed method has the following originality. First, feature maps are extracted from Swin Transformer V2 using a masked image modeling (MIM) pretrained model. Since Swin Transformer V2 uses a different patch size at each attention stage, it is easier to extract local and global features from images input to the vision transformer (ViT)-based encoder. Second, to maintain the polymorphism and local inductive bias of the feature map extracted from Swin Transformer V2, the feature map is fed into the hourglass neck module. Third, deformable attention is applied at the waist of the hourglass neck module to reduce the computational cost and highlight the locality of the feature map. Finally, the feature map traverses the neck and proceeds through a decoder, composed of a deconvolution layer and an upsampling layer, to generate a depth image. To objectively evaluate the proposed method, we compared it with previously published methods on the NYU Depth V2 dataset. In the experiments, the RMSE of the proposed method was 0.274, lower than the values reported in the compared works. Since a lower RMSE indicates better depth estimation, this demonstrates the method's advantage over the compared techniques.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
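For orientation, the decoder stage described above (deconvolution followed by upsampling) can be sketched as below; all channel and layer sizes are invented for illustration and are not the paper's configuration:

```python
# Sketch of a deconvolution-plus-upsampling depth-decoder stage.
import torch
import torch.nn as nn

decoder_stage = nn.Sequential(
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # 2x spatial
    nn.ReLU(inplace=True),
    nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False), # another 2x
    nn.Conv2d(128, 1, kernel_size=3, padding=1),                       # 1-channel depth map
)

feat = torch.randn(1, 256, 30, 40)   # a hypothetical encoder feature map
print(decoder_stage(feat).shape)     # torch.Size([1, 1, 120, 160])
```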

17 pages, 24418 KiB  
Article
Accuracy Improvement of Automatic Smoky Diesel Vehicle Detection Using YOLO Model, Matching, and Refinement
by Yaojung Shiao, Tan-Linh Huynh and Jie Ruei Hu
Sensors 2024, 24(3), 771; https://doi.org/10.3390/s24030771 - 24 Jan 2024
Viewed by 717
Abstract
The detection of smoky diesel vehicles is a key step in reducing air pollution from transportation. We propose a new method for identifying smoky vehicles that proceeds in three stages: (1) the detection of vehicle shapes, license plates, and smoke regions; (2) the application of two matching techniques based on the smoke region–vehicle shape and smoke region–license plate relationships; and (3) the refinement of the detected smoke regions. The first stage involves evaluating various You Only Look Once (YOLO) models to identify the best-fit model for object detection. YOLOv5s was the most effective, particularly for smoke region prediction, achieving a precision of 91.4% and a mean average precision at 0.5 (mAP@0.5) of 91%. It also had the highest mean mAP@0.5 of 93.9% across all three classes. The two matching techniques significantly reduced the false negative rate and enhanced the true positive rate for smoky diesel vehicles through the detection of their license plates. Moreover, a refinement process based on image processing theory was implemented, effectively eliminating incorrect smoke region predictions caused by vehicle shadows. As a result, our method achieved a detection rate of 97.45% and a precision of 93.50%, both higher than those of two existing popular methods, with an acceptable false alarm rate of 5.44%. Notably, the proposed method substantially reduced the processing time to as low as 85 ms per image, compared with 140.3 and 182.6 ms per image in the two reference studies. In conclusion, the proposed method offers remarkable improvements in the accuracy, robustness, and feasibility of smoky diesel vehicle detection and has the potential for application in real-world settings.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
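The matching stage pairs each detected smoke region with a vehicle (or license plate). As a rough sketch of this kind of step, the snippet below matches boxes by maximum overlap; the IoU criterion and threshold are assumptions for illustration, not the rule used in the paper:

```python
# Pair each smoke box with the vehicle box it overlaps most (IoU-based).
def iou(a, b):
    # boxes as (x1, y1, x2, y2)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def match_smoke_to_vehicles(smoke_boxes, vehicle_boxes, thr=0.1):
    matches = []
    for s in smoke_boxes:
        best = max(vehicle_boxes, key=lambda v: iou(s, v), default=None)
        if best is not None and iou(s, best) > thr:
            matches.append((s, best))
    return matches

print(match_smoke_to_vehicles([(50, 60, 120, 100)], [(0, 0, 130, 110)]))
```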

19 pages, 9951 KiB  
Article
Dynamic and Real-Time Object Detection Based on Deep Learning for Home Service Robots
by Yangqing Ye, Xiaolon Ma, Xuanyi Zhou, Guanjun Bao, Weiwei Wan and Shibo Cai
Sensors 2023, 23(23), 9482; https://doi.org/10.3390/s23239482 - 28 Nov 2023
Cited by 1 | Viewed by 1205
Abstract
Home service robots operating indoors, such as inside houses and offices, require real-time and accurate identification and localization of target objects to perform service tasks efficiently. However, images captured by visual sensors in motion usually contain varying degrees of blur, presenting a significant challenge for object detection. In particular, daily life scenes contain small objects like fruits and tableware, which are often occluded, further complicating object recognition and positioning. To address these problems, a dynamic and real-time object detection algorithm is proposed for home service robots, composed of an image deblurring algorithm and an object detection algorithm. To improve the clarity of motion-blurred images, the DA-Multi-DCGAN algorithm is proposed. It comprises an embedded dynamic adjustment mechanism and a multimodal multiscale fusion structure based on robot motion and surrounding environmental information, enabling the deblurring of images captured under different motion states. Compared with DeblurGAN, DA-Multi-DCGAN achieved a 5.07 improvement in Peak Signal-to-Noise Ratio (PSNR) and a 0.022 improvement in Structural Similarity (SSIM). An AT-LI-YOLO method is proposed for detecting small and occluded objects. Based on depthwise separable convolution, this method highlights key areas and integrates salient features by embedding an attention module in the AT-Resblock, improving the sensitivity and detection precision for small and partially occluded objects. It also employs a lightweight network unit, Lightblock, to reduce the network's parameters and computational complexity, improving its computational efficiency. Compared with YOLOv3, the mean average precision (mAP) of AT-LI-YOLO increased by 3.19%, and the detection precision for small objects such as apples and oranges, and for partially occluded objects, increased by 19.12% and 29.52%, respectively. Moreover, model inference time was reduced by 7 ms. Based on the typical home activities of older people and children, the Grasp-17 dataset was established for training and testing the proposed method. Using the TensorRT neural network inference engine of the developed service robot prototype, the proposed dynamic and real-time object detection algorithm required 29 ms, meeting the real-time requirement for smooth vision.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
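For reference, the PSNR figure quoted above follows the standard definition; a minimal implementation (not the paper's code):

```python
# Peak Signal-to-Noise Ratio between two images (standard definition).
import numpy as np

def psnr(img_a, img_b, max_val=255.0):
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

a = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
print(psnr(a, a))        # inf: identical images
print(psnr(a, 255 - a))  # low: maximally different images
```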

27 pages, 6782 KiB  
Article
Attention-Aware Patch-Based CNN for Blind 360-Degree Image Quality Assessment
by Abderrezzaq Sendjasni and Mohamed-Chaker Larabi
Sensors 2023, 23(21), 8676; https://doi.org/10.3390/s23218676 - 24 Oct 2023
Viewed by 811
Abstract
An attention-aware patch-based deep-learning model for blind 360-degree image quality assessment (360-IQA) is introduced in this paper. It employs spatial attention mechanisms to focus on spatially significant features, along with short skip connections to align them. A long skip connection is adopted to allow features from the earliest layers to be used at the final level. Patches are sampled on the sphere so as to correspond to the viewports displayed to the user in head-mounted displays. The sampling incorporates the relevance of patches by considering (i) exploration behavior and (ii) a latitude-based selection. An adaptive strategy is applied to improve the pooling of local patch qualities into a global image quality score. This includes an outlier rejection step that relies on the standard deviation of the obtained scores to account for their agreement, as well as a saliency-based weighting of the scores according to their visual significance. Experiments on available 360-IQA databases show that our model outperforms the state of the art in terms of accuracy and generalization ability. This holds for general deep-learning-based models, multichannel models, and models based on natural scene statistics. Furthermore, compared with multichannel models, the computational complexity is significantly reduced. Finally, an extensive ablation study gives insights into the efficacy of each component of the proposed model.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
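The adaptive pooling step can be pictured as follows: reject patch scores far from the mean, then average the survivors with saliency weights. The snippet is a sketch under assumed thresholds, not the authors' exact procedure:

```python
# Std-based outlier rejection followed by a saliency-weighted average.
import numpy as np

def pool_patch_scores(scores, saliency, k=1.0):
    scores = np.asarray(scores, dtype=np.float64)
    saliency = np.asarray(saliency, dtype=np.float64)
    mu, sigma = scores.mean(), scores.std()
    keep = np.abs(scores - mu) <= k * sigma    # drop disagreeing patches
    w = saliency[keep] / saliency[keep].sum()  # weight by visual significance
    return float((scores[keep] * w).sum())

# The fourth patch disagrees strongly and is rejected before pooling.
print(pool_patch_scores([3.2, 3.4, 3.1, 0.2], [0.9, 0.8, 0.7, 0.1]))
```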

24 pages, 11853 KiB  
Article
A New Efficient Multi-Object Detection and Size Calculation for Blended Tobacco Shreds Using an Improved YOLOv7 Network and LWC Algorithm
by Kunming Jia, Qunfeng Niu, Li Wang, Yang Niu and Wentao Ma
Sensors 2023, 23(20), 8380; https://doi.org/10.3390/s23208380 - 11 Oct 2023
Viewed by 991
Abstract
Detecting the four tobacco shred varieties and determining the unbroken tobacco shred rate are the primary tasks in cigarette inspection lines. It is especially critical to identify both single and overlapped tobacco shreds at one time, that is, to perform fast multi-object detection of blended tobacco shreds. However, it is difficult to classify tiny single tobacco shreds with complex morphological characteristics, not to mention tobacco shreds with 24 types of overlap, posing significant difficulties for machine-vision-based blended tobacco shred multi-object detection and unbroken tobacco shred rate calculation. This study focuses on these two challenges. In this paper, a new multi-object detection model for blended tobacco shred images is developed based on an improved YOLOv7-tiny model. YOLOv7-tiny is used as the mainframe of the multi-object detection network, and a lightweight ResNet19 is used as the model backbone. The original SPPCSPC and coupled detection head are replaced with a new spatial pyramid, SPPFCSPC, and a decoupled joint detection head, respectively. An algorithm for the two-dimensional size calculation of blended tobacco shreds (LWC) is also proposed; it is applied to the blended tobacco shred detection results to obtain independent tobacco shred objects and calculate the unbroken tobacco shred rate. The experimental results showed that the final detection precision, mAP@.5, mAP@.5:.95, and testing time were 0.883, 0.932, 0.795, and 4.12 ms, respectively. The average length and width detection accuracies for the blended tobacco shred samples were −1.7% and 13.2%, respectively. The model achieved high multi-object detection accuracy and 2D size calculation accuracy, consistent with the manual inspection process in the field. This study provides a new, efficient implementation method for the multi-object detection and size calculation of blended tobacco shreds in cigarette quality inspection lines, and a new approach for other similar multi-object detection tasks on blended images.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
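The LWC algorithm itself is not detailed in this listing, but a generic way to obtain the length and width of a segmented shred is a rotated minimum-area rectangle; the OpenCV sketch below only illustrates this kind of 2D size computation, not the paper's method:

```python
# Length/width of a segmented blob via cv2.minAreaRect (illustrative).
import cv2
import numpy as np

mask = np.zeros((200, 200), dtype=np.uint8)
cv2.ellipse(mask, (100, 100), (60, 15), 30, 0, 360, 255, -1)  # shred-like blob

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
(cx, cy), (w, h), angle = cv2.minAreaRect(contours[0])
length, width = max(w, h), min(w, h)
print(f"length={length:.1f}px, width={width:.1f}px")  # roughly 120 x 30
```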

16 pages, 3788 KiB  
Article
Defect Detection in Steel Using a Hybrid Attention Network
by Mudan Zhou, Wentao Lu, Jingbo Xia and Yuhao Wang
Sensors 2023, 23(15), 6982; https://doi.org/10.3390/s23156982 - 06 Aug 2023
Cited by 3 | Viewed by 1591
Abstract
Defect detection on steel surfaces focuses on accurately identifying and precisely locating defects on the surface of steel materials. Deep learning methods for defect detection have gained significant research attention. Existing algorithms achieve satisfactory results, but detection accuracy still needs improvement. To address this issue, a hybrid attention network is proposed in this paper. Firstly, a CBAM attention module is used to enhance the model's ability to learn effective features. Secondly, an adaptively spatial feature fusion (ASFF) module is used to improve accuracy by extracting multi-scale defect information. Finally, the CIoU algorithm is introduced to optimize the training loss of the baseline model. The experimental results show that our method performs strongly on the NEU-DET dataset, with an 8.34% improvement in mAP. Compared with major object detection algorithms such as SSD, EfficientNet, YOLOv3, and YOLOv5, the mAP was improved by 16.36%, 41.68%, 20.79%, and 13.96%, respectively, demonstrating that the mAP of the proposed method is higher than that of other major algorithms.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
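The CIoU loss mentioned above augments the IoU term with penalties for center distance and aspect-ratio mismatch between predicted and ground-truth boxes. torchvision (≥ 0.13) ships an implementation, shown here as a quick reference rather than the paper's code:

```python
# Complete-IoU loss between predicted and ground-truth boxes.
import torch
from torchvision.ops import complete_box_iou_loss

pred = torch.tensor([[10.0, 10.0, 50.0, 60.0]])
gt = torch.tensor([[12.0, 14.0, 48.0, 58.0]])
print(complete_box_iou_loss(pred, gt, reduction="mean"))  # small: boxes nearly agree
```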

12 pages, 8991 KiB  
Article
An Underwater Image Enhancement Method for a Preprocessing Framework Based on Generative Adversarial Network
by Xiao Jiang, Haibin Yu, Yaxin Zhang, Mian Pan, Zhu Li, Jingbiao Liu and Shuaishuai Lv
Sensors 2023, 23(13), 5774; https://doi.org/10.3390/s23135774 - 21 Jun 2023
Cited by 2 | Viewed by 2081
Abstract
This paper presents an efficient underwater image enhancement method, named ECO-GAN, to address the challenges of color distortion, low contrast, and motion blur in underwater robot photography. The proposed method is built upon a preprocessing framework using a generative adversarial network. ECO-GAN incorporates a convolutional neural network that specifically targets three underwater issues: motion blur, low brightness, and color deviation. To optimize computation and inference speed, a single encoder is employed to extract features, while dedicated decoders handle the different enhancement tasks. Moreover, ECO-GAN employs cross-stage fusion modules between the decoders to strengthen their connections and enhance the quality of the output images. The model is trained using supervised learning with paired datasets, enabling blind image enhancement without additional physical knowledge or prior information. Experimental results demonstrate that ECO-GAN effectively achieves denoising, deblurring, and color deviation removal simultaneously. Compared with methods relying on individual modules or simple combinations of multiple modules, the proposed method achieves superior underwater image enhancement and offers the flexibility to be extended to multiple underwater image enhancement functions.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
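The one-encoder/several-decoders layout described above can be sketched as follows; the layer sizes are invented, and ECO-GAN's actual blocks (and its cross-stage fusion modules) are far richer:

```python
# Shared encoder with one dedicated decoder per enhancement task.
import torch
import torch.nn as nn

class MultiHeadEnhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU())
        self.deblur = nn.Conv2d(32, 3, 3, padding=1)    # motion blur head
        self.brighten = nn.Conv2d(32, 3, 3, padding=1)  # low-brightness head
        self.decast = nn.Conv2d(32, 3, 3, padding=1)    # color-deviation head

    def forward(self, x):
        f = self.encoder(x)  # features computed once, shared by all decoders
        return self.deblur(f), self.brighten(f), self.decast(f)

outs = MultiHeadEnhancer()(torch.randn(1, 3, 64, 64))
print([tuple(o.shape) for o in outs])
```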

21 pages, 10784 KiB  
Article
Enhance the Accuracy of Landslide Detection in UAV Images Using an Improved Mask R-CNN Model: A Case Study of Sanming, China
by Lu Yun, Xinxin Zhang, Yuchao Zheng, Dahan Wang and Lizhong Hua
Sensors 2023, 23(9), 4287; https://doi.org/10.3390/s23094287 - 26 Apr 2023
Cited by 6 | Viewed by 1731
Abstract
Extracting landslide areas with high accuracy from high-spatial-resolution remote sensing images using deep learning methods is a hot topic in current research. However, existing deep learning algorithms are affected by background noise and landslide scale effects during extraction, leading to poor feature extraction. To address this issue, this paper proposes an improved mask regions-based convolutional neural network (Mask R-CNN) model to identify the landslide distribution in unmanned aerial vehicle (UAV) images. The model is improved in three main respects: (1) a convolutional block attention module (CBAM) is added to the residual neural network (ResNet) backbone; (2) a bottom-up channel is added to the feature pyramid network (FPN) module; and (3) the region proposal network (RPN) is replaced by guided anchoring (GA-RPN). Sanming City, China was selected as the study area for the experiments. The experimental results show that the improved model achieves a recall of 91.4% and an accuracy of 92.6%, which are 12.9% and 10.9% higher, respectively, than those of the original Mask R-CNN model, indicating that the improved model is more effective for landslide extraction.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
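As background on improvement (1), CBAM combines channel and spatial attention; the channel half is sketched below (spatial attention omitted, sizes illustrative, not the authors' code):

```python
# Channel-attention half of CBAM: a shared MLP over average- and
# max-pooled descriptors produces per-channel weights.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, c, r=16):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(c, c // r), nn.ReLU(), nn.Linear(c // r, c))

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))  # global average pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))   # global max pooling branch
        w = torch.sigmoid(avg + mx)[:, :, None, None]
        return x * w                        # reweight feature channels

print(ChannelAttention(64)(torch.randn(1, 64, 32, 32)).shape)
```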

18 pages, 64628 KiB  
Article
Regularization for Unsupervised Learning of Optical Flow
by Libo Long and Jochen Lang
Sensors 2023, 23(8), 4080; https://doi.org/10.3390/s23084080 - 18 Apr 2023
Viewed by 1365
Abstract
Regularization is an important technique for training deep neural networks. In this paper, we propose a novel shared-weight teacher–student strategy and a content-aware regularization (CAR) module. Based on a tiny, learnable, content-aware mask, CAR is randomly applied to some channels of the convolutional layers during training, guiding predictions within the shared-weight teacher–student strategy. CAR prevents motion estimation methods trained with unsupervised learning from co-adaptation. Extensive experiments on optical flow and scene flow estimation show that our method significantly improves the performance of the original networks and surpasses other popular regularization methods. It also surpasses all variants with similar architectures, as well as the supervised PWC-Net, on MPI-Sintel and KITTI. Our method shows strong cross-dataset generalization; for example, trained solely on MPI-Sintel, it outperforms a similarly trained supervised PWC-Net by 27.9% and 32.9% on the KITTI benchmarks. Our method uses fewer parameters, requires less computation, and has faster inference times than the original PWC-Net.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
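The core CAR idea, a small learnable mask applied to a random subset of channels during training, can be sketched as below; the mask shape and masking rule here are assumptions for illustration, not the paper's module:

```python
# Learnable mask applied to randomly chosen channels (training only).
import torch
import torch.nn as nn

class ChannelMaskReg(nn.Module):
    def __init__(self, channels, p=0.25):
        super().__init__()
        self.mask = nn.Parameter(torch.ones(channels, 1, 1))  # learnable
        self.p = p  # fraction of channels masked per step

    def forward(self, x):
        if not self.training:
            return x
        chosen = torch.rand(x.shape[1], device=x.device) < self.p
        scale = torch.where(chosen[:, None, None], self.mask.sigmoid(),
                            torch.ones_like(self.mask))
        return x * scale

print(ChannelMaskReg(64)(torch.randn(2, 64, 16, 16)).shape)
```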

20 pages, 937 KiB  
Article
Multi-Domain Feature Alignment for Face Anti-Spoofing
by Shizhe Zhang and Wenhui Nie
Sensors 2023, 23(8), 4077; https://doi.org/10.3390/s23084077 - 18 Apr 2023
Viewed by 1442
Abstract
Face anti-spoofing is critical for enhancing the robustness of face recognition systems against presentation attacks. Existing methods predominantly rely on binary classification tasks. Recently, methods based on domain generalization have yielded promising results. However, due to the distribution discrepancies between various domains, domain-related differences in the feature space considerably hinder generalization to unfamiliar domains. In this work, we propose a multi-domain feature alignment framework (MADG) that addresses the poor generalization that arises when multiple source domains are scattered in the feature space. Specifically, an adversarial learning process is designed to narrow the differences between domains, aligning the features of multiple sources and thus achieving multi-domain alignment. Moreover, to further improve the effectiveness of the proposed framework, we incorporate a multi-directional triplet loss to achieve a higher degree of separation between fake and real faces in the feature space. To evaluate the performance of our method, we conducted extensive experiments on several public datasets. The results demonstrate that the proposed approach outperforms current state-of-the-art methods, validating its effectiveness in face anti-spoofing.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
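For reference, the vanilla triplet loss that the paper's multi-directional variant builds on pulls an anchor toward a same-class (real/fake) positive and pushes it away from a negative; the standard PyTorch form is shown below (the authors' multi-directional loss itself is not reproduced here):

```python
# Standard triplet margin loss on embedding vectors.
import torch
import torch.nn as nn

triplet = nn.TripletMarginLoss(margin=0.3)
anchor, positive, negative = (torch.randn(8, 128) for _ in range(3))
print(triplet(anchor, positive, negative))
```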

15 pages, 5973 KiB  
Article
Polarized Object Surface Reconstruction Algorithm Based on RU-GAN Network
by Xu Yang, Cai Cheng, Jin Duan, You-Fei Hao, Yong Zhu and Hao Zhang
Sensors 2023, 23(7), 3638; https://doi.org/10.3390/s23073638 - 31 Mar 2023
Viewed by 1255
Abstract
There are six possible solutions for the surface normal vectors obtained from polarization information during 3D reconstruction. To resolve this ambiguity, scholars have introduced additional information, such as shading cues; however, this makes the 3D reconstruction task overly burdensome. To make 3D reconstruction more generally applicable, this paper therefore proposes a complete framework for reconstructing the surface of an object using only polarized images. To solve the normal vector ambiguity, a jump-compensated U-shaped generative adversarial network (RU-GAN) is designed to fuse the six candidate surface normal vectors; in particular, jump compensation is introduced in the encoder and decoder, and the content loss function is redesigned. To address the problem that reflective regions in the original image cause the estimated normal vector to deviate from the true one, a specular reflection model is proposed to optimize the dataset and thereby reduce these regions. Experiments show that the estimated normals improve accuracy by about 20° compared with previous conventional work, and by about 1.5° compared with a recent neural network model, indicating that the proposed model is better suited to the normal vector estimation task. Furthermore, the proposed surface reconstruction framework is simple to implement and reconstructs texture with high accuracy.
(This article belongs to the Special Issue Deep Learning for Computer Vision and Image Processing Sensors)
