Advances in Deep Learning Based 3D Scene Understanding from LiDAR

A special issue of Remote Sensing (ISSN 2072-4292). This special issue belongs to the section "Remote Sensing Image Processing".

Deadline for manuscript submissions: closed (30 January 2023) | Viewed by 37030

Special Issue Editors

Dr. Dong Chen, Guest Editor
Department of Geomatics Engineering, College of Civil Engineering, Nanjing Forestry University, Nanjing 210037, China
Interests: image- and LiDAR-based segmentation and reconstruction; full-waveform LiDAR data processing; related remote sensing applications in the field of forest ecosystems

Dr. Jiju Poovvancheri, Guest Editor
Department of Mathematics and Computing Science, Faculty of Science, Saint Mary’s University, Halifax, NS B3H 3C2, Canada
Interests: computer graphics; 3D computer vision; geometric deep learning; related applications including motion capture for VR/AR and LiDAR-based urban modeling

Dr. Zhenxin Zhang, Guest Editor
Department of Remote Sensing Technology, College of Resource Environment and Tourism, Capital Normal University, Beijing 100048, China
Interests: light detection and ranging data processing; quality analysis of geographic information systems; remote sensing image processing; algorithm development

Dr. Liqiang Zhang, Guest Editor
Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China
Interests: remote sensing imagery recognition; 3D modeling and reconstruction; deep learning

Special Issue Information

Dear Colleagues,

Rapid advancements in Light Detection and Ranging (LiDAR) technology and recent breakthroughs in 3D deep learning have dramatically improved the ability to recognize physical objects and interpret the physical world at scale. Many applications, such as autonomous robotics and urban planning, use real-time and/or offline inference of information about the physical world and the objects therein from 3D point clouds. In general, the 3D scene understanding problem consists of a set of sub-problems, including scan registration, segmentation, object recognition, and scene modeling. Driven by the increasing availability of annotated public datasets, e.g., KITTI, Toronto3D, RoofND, and Semantic3D, the remote sensing community is increasingly shifting towards machine learning/deep learning algorithms to efficiently solve these fundamental problems in physical world interpretation. This Special Issue aims to provide a forum for disseminating recent advances in the research and applications of 3D scene understanding from LiDAR scans, with a particular focus on deep learning algorithms. It calls for machine learning/deep learning models, datasets, and tools for data generation or annotation for LiDAR-based scene understanding, object classification, and modeling.

Dr. Dong Chen
Dr. Jiju Poovvancheri
Dr. Zhenxin Zhang
Dr. Liqiang Zhang
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Remote Sensing is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2700 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • Deep learning for LiDAR processing
  • LiDAR scan processing
  • LiDAR registration
  • Object detection
  • 3D object recognition
  • LiDAR segmentation
  • LiDAR classification
  • Scene understanding
  • 3D scene modeling
  • Dynamic object tracking
  • Tree classification/modeling
  • Road segmentation
  • Building reconstruction
  • Ubiquitous point cloud interpretation
  • Point cloud filtering

Published Papers (14 papers)


Research

26 pages, 31710 KiB  
Article
MC-UNet: Martian Crater Segmentation at Semantic and Instance Levels Using U-Net-Based Convolutional Neural Network
by Dong Chen, Fan Hu, P. Takis Mathiopoulos, Zhenxin Zhang and Jiju Peethambaran
Remote Sens. 2023, 15(1), 266; https://doi.org/10.3390/rs15010266 - 02 Jan 2023
Cited by 5 | Viewed by 2098
Abstract
Crater recognition on Mars is of paramount importance for many space science applications, such as accurate planetary surface age dating and geological mapping. Such recognition is achieved by means of various image-processing techniques employing traditional CNNs (convolutional neural networks), which typically suffer from slow convergence and relatively low accuracy. In this paper, we propose a novel CNN, referred to as MC-UNet (Martian Crater U-Net), wherein the classical U-Net is employed as the backbone for accurate identification of Martian craters at semantic and instance levels from thermal-emission-imaging-system (THEMIS) daytime infrared images. Compared with the classical U-Net, the depth of the layers of MC-UNet is expanded to six, while the maximum number of channels is decreased to one-fourth, thereby making the proposed CNN-based architecture computationally efficient while maintaining a high recognition rate of impact craters on Mars. To enhance the operation of MC-UNet, we adopt average pooling and embed channel attention into the skip-connection process between the encoder and decoder layers at the same network depth, so that large-sized Martian craters can be more accurately recognized. The proposed MC-UNet is trained on annotated THEMIS daytime infrared images of Martian craters with radii of 2–32 km. For the predicted Martian crater rim pixels, template matching is subsequently used to recognize Martian craters at the instance level. The experimental results indicate that MC-UNet has the potential to recognize Martian craters with a maximum radius of 31.28 km (136 pixels) with a recall of 0.7916 and an F1-score of 0.8355. This promising performance shows that the proposed MC-UNet is on par with or even better than other classical CNN architectures, such as U-Net and Crater U-Net.
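
The channel-attention-augmented skip connection is the architectural heart of this design. Below is a minimal PyTorch sketch of the idea: attention weights derived by average pooling re-weight encoder channels before they are concatenated with decoder features at the same depth. Layer sizes and the exact attention form are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: channel attention on a U-Net skip connection (illustrative).
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # average pooling, as in the paper
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # re-weight encoder channels

# Skip connection: attend over encoder features, then concatenate with the
# upsampled decoder features at the same network depth.
enc = torch.randn(1, 64, 128, 128)  # encoder feature map (hypothetical size)
dec = torch.randn(1, 64, 128, 128)  # upsampled decoder feature map
fused = torch.cat([ChannelAttention(64)(enc), dec], dim=1)
print(fused.shape)  # torch.Size([1, 128, 128, 128])
```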

18 pages, 3495 KiB  
Article
Semantic Segmentation of 3D Point Clouds Based on High Precision Range Search Network
by Zhonghua Su, Guiyun Zhou, Fulin Luo, Shihua Li and Kai-Kuang Ma
Remote Sens. 2022, 14(22), 5649; https://doi.org/10.3390/rs14225649 - 09 Nov 2022
Cited by 1 | Viewed by 2251
Abstract
Semantic segmentation of 3D point clouds plays a critical role in the construction of 3D models. Due to the sparse and disordered nature of point clouds, semantic segmentation of such unstructured data poses technical challenges. A recently proposed deep neural network, PointNet, delivers attractive semantic segmentation performance, but it exploits only the global features of point clouds without incorporating any local features, limiting its ability to recognize fine-grained patterns. To address this, this paper proposes a deeper hierarchical structure called the high precision range search (HPRS) network, which learns local features with increasing contextual scales. We develop an adaptive ball query algorithm that provides a comprehensive set of grouping strategies; it gathers more detailed local feature points than the common ball query algorithm, especially when there are not enough feature points within the ball range. Furthermore, compared to the sole use of either max pooling or mean pooling, our network, which combines the two, aggregates point features of local regions from the hierarchical structure while resolving the disorder of points and minimizing the loss of feature information. The network achieves superior performance on the S3DIS dataset, with an mIoU only 0.26% below that of the state-of-the-art DPFA network.
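
The adaptive ball query is the paper's key departure from plain ball-query grouping. The following toy NumPy/SciPy sketch illustrates one plausible reading: widen the query ball until enough neighbors are found, then fall back to k-nearest neighbors; the authors' actual grouping strategies differ in detail, and all names here are illustrative.

```python
# Toy sketch of an "adaptive" ball query (illustrative, not the HPRS code).
import numpy as np
from scipy.spatial import cKDTree

def adaptive_ball_query(points, query, radius, nsample, grow=1.5, max_tries=4):
    tree = cKDTree(points)
    r = radius
    for _ in range(max_tries):
        idx = tree.query_ball_point(query, r)  # neighbors within the ball (unsorted)
        if len(idx) >= nsample:
            return np.asarray(idx[:nsample])
        r *= grow  # too few feature points in range: widen the ball and retry
    # fall back to plain k-nearest neighbors if the ball stays too sparse
    return tree.query(query, k=min(nsample, len(points)))[1]

pts = np.random.rand(1000, 3)
print(adaptive_ball_query(pts, pts[0], radius=0.05, nsample=16))
```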

24 pages, 7621 KiB  
Article
HFENet: Hierarchical Feature Extraction Network for Accurate Landcover Classification
by Di Wang, Ronghao Yang, Hanhu Liu, Haiqing He, Junxiang Tan, Shaoda Li, Yichun Qiao, Kangqi Tang and Xiao Wang
Remote Sens. 2022, 14(17), 4244; https://doi.org/10.3390/rs14174244 - 28 Aug 2022
Cited by 6 | Viewed by 2016
Abstract
Landcover classification is an important application in remote sensing, but it remains a challenge to distinguish different features with similar characteristics or large-scale differences. Some deep learning networks, such as UperNet, PSPNet, and DANet, use pyramid pooling and attention mechanisms to improve their multi-scale feature extraction. However, because they neglect the low-level features contained in the underlying network and the information differences between feature maps, they have difficulty identifying small-scale objects. Thus, we propose a novel image segmentation network, named HFENet, for mining multi-level semantic information. Like UperNet, HFENet adopts a top-down horizontal connection architecture while including two improved modules, HFE and MFF. According to the characteristics of different levels of semantic information, the HFE module reconstructs the feature extraction part by introducing an attention mechanism and a pyramid pooling module to fully mine semantic information. With the help of a channel attention mechanism, the MFF module up-samples and re-weights the feature maps to fuse them and enhance the expressive ability of multi-scale features. Ablation studies and comparative experiments between HFENet and seven state-of-the-art models (U-Net, DeepLabv3+, PSPNet, FCN, UperNet, DANet, and SegNet) are conducted on a self-labeled GF-2 remote sensing image dataset (MZData) and two open datasets, landcover.ai and the WHU building dataset. The results show that HFENet outperforms the other models on all three datasets across six evaluation metrics (mIoU, FWIoU, PA, mP, mRecall and mF1), with the mIoU improved by 7.41–10.60% on MZData, 1.17–11.57% on the WHU building dataset, and 0.93–4.31% on landcover.ai. HFENet performs better in the task of refining the semantic segmentation of remote sensing images.
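
The pyramid pooling module that the HFE module builds on is a standard PSPNet-style component worth seeing concretely. A compact PyTorch sketch follows; the pooling scales and channel counts are illustrative assumptions, not HFENet's exact configuration.

```python
# Minimal pyramid pooling module (PPM) sketch, PSPNet/UperNet style.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidPooling(nn.Module):
    def __init__(self, in_ch, out_ch, scales=(1, 2, 3, 6)):
        super().__init__()
        self.stages = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(s), nn.Conv2d(in_ch, out_ch, 1))
            for s in scales
        )
        self.project = nn.Conv2d(in_ch + len(scales) * out_ch, out_ch, 3, padding=1)

    def forward(self, x):
        h, w = x.shape[2:]
        # pool at several scales, project, and upsample back to the input size
        feats = [F.interpolate(st(x), (h, w), mode="bilinear", align_corners=False)
                 for st in self.stages]
        return self.project(torch.cat([x] + feats, dim=1))

x = torch.randn(1, 256, 32, 32)
print(PyramidPooling(256, 64)(x).shape)  # torch.Size([1, 64, 32, 32])
```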

20 pages, 1794 KiB  
Article
Introducing Improved Transformer to Land Cover Classification Using Multispectral LiDAR Point Clouds
by Zhiwen Zhang, Teng Li, Xuebin Tang, Xiangda Lei and Yuanxi Peng
Remote Sens. 2022, 14(15), 3808; https://doi.org/10.3390/rs14153808 - 07 Aug 2022
Cited by 14 | Viewed by 2004
Abstract
The use of Transformer-based networks has been proposed for the processing of general point clouds. However, there has been little research on multispectral LiDAR point clouds that contain both spatial coordinate information and multi-wavelength intensity information. In this paper, we propose networks for the point-by-point classification of multispectral LiDAR point clouds based on an improved Transformer. Specifically, considering the sparseness of different regions of multispectral LiDAR point clouds, we add a bias to the Transformer to improve its ability to capture local information and construct an easy-to-implement multispectral LiDAR point cloud Transformer (MPT) classification network. The MPT network achieves 78.49% mIoU, 94.55% OA, 84.46% F1, and 0.92 Kappa on the multispectral LiDAR point cloud testing dataset. To further extract the topological relationships between points, we present a standardization set abstraction (SSA) module, which includes global point information while considering the relationships among local points. Based on the SSA module, we propose an advanced version called MPT+ for the point-by-point classification of multispectral LiDAR point clouds. The MPT+ network achieves 82.94% mIoU, 95.62% OA, 88.42% F1, and 0.94 Kappa on the same testing dataset. Compared with seven point-based deep learning algorithms, our proposed MPT+ achieves state-of-the-art results on several evaluation metrics.
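
The core trick, adding a bias to the attention logits so that sparse local neighborhoods are captured better, can be sketched compactly. The distance-based bias below is an assumption made for illustration; the authors' actual bias term differs.

```python
# Sketch: self-attention with an additive pairwise bias (illustrative).
import torch
import torch.nn.functional as F

def biased_attention(x, xyz, scale=10.0):
    # x: (N, C) per-point features; xyz: (N, 3) point coordinates
    q, k, v = x, x, x                      # single head, no projections for brevity
    logits = q @ k.t() / x.shape[1] ** 0.5
    bias = -scale * torch.cdist(xyz, xyz)  # penalize attention to distant points
    return F.softmax(logits + bias, dim=-1) @ v

pts = torch.randn(128, 3)
feat = torch.randn(128, 32)
print(biased_attention(feat, pts).shape)  # torch.Size([128, 32])
```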

23 pages, 7843 KiB  
Article
Individual Tree Detection in Urban ALS Point Clouds with 3D Convolutional Networks
by Stefan Schmohl, Alejandra Narváez Vallejo and Uwe Soergel
Remote Sens. 2022, 14(6), 1317; https://doi.org/10.3390/rs14061317 - 09 Mar 2022
Cited by 12 | Viewed by 3501
Abstract
Since trees are a vital part of urban green infrastructure, automatic mapping of individual urban trees is becoming increasingly important for city management and planning. Although deep-learning-based object detection networks are the state of the art in computer vision, their adaptation to individual tree detection in urban areas has scarcely been studied. Some existing works have employed 2D object detection networks for this purpose, but these have used three-dimensional information only in the form of projected feature maps. In contrast, we exploited the full 3D potential of airborne laser scanning (ALS) point clouds by using a 3D neural network for individual tree detection. Specifically, a sparse convolutional network was used for 3D feature extraction, feeding both semantic segmentation and circular object detection outputs, which were combined for further increased accuracy. We demonstrate the capability of our approach on an urban topographic ALS point cloud with 10,864 hand-labeled ground truth trees. Our method achieved an average precision of 83% under the common 0.5 intersection-over-union criterion. 85% of the stems were found correctly with a precision of 88%, while the individual tree detections covered the tree area with an F1 accuracy of 92%. Thereby, we outperformed traditional delineation baselines and recent detection networks.
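
Evaluating circular tree detections against a 0.5-IoU criterion requires circle-circle IoU, which reduces to classical lens-area geometry. The sketch below is standard geometry written out for illustration, not the authors' evaluation code.

```python
# Circle-circle IoU for circular object detections (standard geometry).
import numpy as np

def circle_iou(c1, r1, c2, r2):
    d = np.linalg.norm(np.asarray(c1) - np.asarray(c2))
    if d >= r1 + r2:                 # disjoint circles
        return 0.0
    if d <= abs(r1 - r2):            # one circle contained in the other
        inter = np.pi * min(r1, r2) ** 2
    else:                            # lens-shaped overlap
        a1 = r1**2 * np.arccos((d**2 + r1**2 - r2**2) / (2 * d * r1))
        a2 = r2**2 * np.arccos((d**2 + r2**2 - r1**2) / (2 * d * r2))
        tri = 0.5 * np.sqrt((-d + r1 + r2) * (d + r1 - r2)
                            * (d - r1 + r2) * (d + r1 + r2))
        inter = a1 + a2 - tri
    union = np.pi * (r1**2 + r2**2) - inter
    return inter / union

print(round(circle_iou((0, 0), 2.0, (1, 0), 2.0), 3))  # ~0.521: counts as a match
```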

20 pages, 7143 KiB  
Article
AFE-RCNN: Adaptive Feature Enhancement RCNN for 3D Object Detection
by Feng Shuang, Hanzhang Huang, Yong Li, Rui Qu and Pei Li
Remote Sens. 2022, 14(5), 1176; https://doi.org/10.3390/rs14051176 - 27 Feb 2022
Cited by 6 | Viewed by 2255
Abstract
The point clouds scanned by LiDAR are generally sparse, which can result in fewer sampling points on objects. To perform precise and effective 3D object detection, it is necessary to improve the feature representation ability so as to extract more feature information from the object points. Therefore, we propose an adaptive feature enhanced 3D object detection network based on point clouds (AFE-RCNN). AFE-RCNN is a point-voxel integrated network. We first voxelize the raw point clouds and obtain the voxel features through a 3D voxel convolutional neural network. Then, the 3D feature vectors are projected to the 2D bird’s eye view (BEV), and the relationships between features in both the spatial and channel dimensions are learned by the proposed residual dual-attention proposal generation module. High-quality 3D box proposals are generated based on the BEV features and an anchor-based approach. Next, we sample key points from the raw point clouds to summarize the information of the voxel features, and obtain the key point features through a multi-scale feature extraction module based on adaptive feature adjustment. The neighboring contextual information is integrated into each key point through this module, and the robustness of feature processing is also guaranteed. Lastly, we aggregate the features of the BEV, voxels, and point clouds as the key point features used for proposal refinement. In addition, to ensure the correlation among the vertices of the bounding box, we propose a refinement loss function module with vertex associativity. AFE-RCNN exhibits performance comparable to state-of-the-art methods on the KITTI dataset and the Waymo Open Dataset. On the KITTI 3D detection benchmark, for the moderate difficulty level of the car and cyclist classes, the 3D detection mean average precisions of AFE-RCNN reach 81.53% and 67.50%, respectively.
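
Projecting sparse voxel features to a bird's-eye-view map by collapsing the height axis into channels is the standard bridge between a 3D voxel backbone and a 2D proposal head. A minimal PyTorch sketch, with illustrative grid sizes (duplicate voxel coordinates simply overwrite here):

```python
# Sketch: scatter sparse voxel features into a dense BEV feature map.
import torch

def voxels_to_bev(coords, feats, nx, ny, nz):
    # coords: (M, 3) integer voxel indices (ix, iy, iz); feats: (M, C)
    C = feats.shape[1]
    dense = torch.zeros(nz, ny, nx, C)
    dense[coords[:, 2], coords[:, 1], coords[:, 0]] = feats
    # collapse height into channels: (nz*C, ny, nx) BEV feature map
    return dense.permute(0, 3, 1, 2).reshape(nz * C, ny, nx)

coords = torch.randint(0, 8, (100, 3))              # hypothetical occupied voxels
feats = torch.randn(100, 16)
print(voxels_to_bev(coords, feats, 8, 8, 8).shape)  # torch.Size([128, 8, 8])
```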

25 pages, 39607 KiB  
Article
AFGL-Net: Attentive Fusion of Global and Local Deep Features for Building Façades Parsing
by Dong Chen, Guiqiu Xiang, Jiju Peethambaran, Liqiang Zhang, Jing Li and Fan Hu
Remote Sens. 2021, 13(24), 5039; https://doi.org/10.3390/rs13245039 - 11 Dec 2021
Viewed by 2078
Abstract
In this paper, we propose a deep learning framework, namely AFGL-Net, for building façade parsing, i.e., obtaining the semantics of small components of building façades, such as windows and doors. To this end, we present an autoencoder embedding position and direction encoding for local feature encoding. The autoencoder enhances local feature aggregation and augments the representation of skeleton features of windows and doors. We also integrate a Transformer into AFGL-Net to infer the geometric shapes and structural arrangements of façade components and to capture the global contextual features. These global features help recognize inapparent windows/doors in façade points corrupted by noise, outliers, occlusions, and irregularities. An attention-based feature fusion mechanism is finally employed to obtain more informative features by simultaneously considering local geometric details and global contexts. The proposed AFGL-Net is comprehensively evaluated on the Dublin and RueMonge2014 benchmarks, achieving 67.02% and 59.80% mIoU, respectively. We also demonstrate the superiority of AFGL-Net through comparisons with state-of-the-art methods and various ablation studies.
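
The attention-based fusion of per-point local and global features can be pictured as learning a pair of convex weights per point. The two-layer scoring MLP below is an illustrative assumption, not the paper's exact fusion block.

```python
# Toy sketch: attention-weighted fusion of local and global point features.
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(),
                                   nn.Linear(dim, 2))

    def forward(self, local_f, global_f):
        # local_f, global_f: (N, dim) per-point features
        w = torch.softmax(self.score(torch.cat([local_f, global_f], -1)), dim=-1)
        return w[:, :1] * local_f + w[:, 1:] * global_f  # per-point convex blend

loc, glb = torch.randn(2048, 64), torch.randn(2048, 64)
print(AttentiveFusion(64)(loc, glb).shape)  # torch.Size([2048, 64])
```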

22 pages, 85039 KiB  
Article
A Vehicle-Borne Mobile Mapping System Based Framework for Semantic Segmentation and Modeling on Overhead Catenary System Using Deep Learning
by Lei Xu, Shunyi Zheng, Jiaming Na, Yuanwei Yang, Chunlin Mu and Debin Shi
Remote Sens. 2021, 13(23), 4939; https://doi.org/10.3390/rs13234939 - 04 Dec 2021
Cited by 4 | Viewed by 2666
Abstract
Automatic detection of the overhead catenary system (OCS) is of great significance for the safe operation and maintenance of electrified railways. A vehicle-borne mobile mapping system (VMMS) may significantly improve data acquisition. This paper proposes a VMMS-based framework for the automatic detection and modelling of the OCS. The proposed framework performs semantic segmentation, model reconstruction, and geometric parameter detection based on the LiDAR point cloud acquired by the VMMS. Firstly, an enhanced VMMS is designed for accurate data generation. Secondly, an automatic searching method based on a two-level stereo frame is designed to filter out irrelevant non-OCS points. Then, a deep learning network based on multi-scale feature fusion and an attention mechanism (MFF_A) is trained for semantic segmentation of the catenary facility. Finally, 3D modelling is performed based on the OCS segmentation result, and geometric parameters are extracted. An experimental case study was conducted on a 100 km high-speed railway in Guangxi, China. The experimental results show that the proposed framework achieves a better accuracy of 96.37%, outperforming other state-of-the-art segmentation methods. Compared with traditional manual laser measurement, the proposed framework achieves a trustworthy accuracy within 10 mm for OCS geometric parameter detection.
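
The coarse filtering step, discarding points far from the catenary corridor before segmentation, can be approximated with a simple distance-to-centerline crop. The paper's two-level stereo frame is more elaborate; this sketch is a simplified stand-in with synthetic data.

```python
# Toy sketch: keep only points within a corridor around the track centerline.
import numpy as np

def corridor_filter(points, centerline_xy, half_width=6.0):
    # points: (N, 3); centerline_xy: (M, 2) densely sampled track centerline
    d = np.linalg.norm(points[:, None, :2] - centerline_xy[None], axis=2).min(1)
    return points[d < half_width]  # drop irrelevant non-OCS points

pts = np.random.rand(5000, 3) * [100, 40, 10]
line = np.column_stack([np.linspace(0, 100, 200), np.full(200, 20.0)])
print(corridor_filter(pts, line).shape)
```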

17 pages, 2685 KiB  
Article
Three-Dimensional Urban Land Cover Classification by Prior-Level Fusion of LiDAR Point Cloud and Optical Imagery
by Yanming Chen, Xiaoqiang Liu, Yijia Xiao, Qiqi Zhao and Sida Wan
Remote Sens. 2021, 13(23), 4928; https://doi.org/10.3390/rs13234928 - 04 Dec 2021
Cited by 5 | Viewed by 2262
Abstract
The heterogeneity of the urban landscape in the vertical direction should not be neglected in urban ecology research, which requires transforming urban land cover products from two dimensions to three dimensions using light detection and ranging (LiDAR) point clouds. Previous studies have demonstrated that the performance of two-dimensional land cover classification can be improved by fusing optical imagery and LiDAR data using several strategies. However, few studies have focused on fusing LiDAR point clouds and optical imagery for three-dimensional land cover classification, especially within a deep learning framework. In this study, we propose a novel prior-level fusion strategy and compare it with a no-fusion strategy (baseline) and three other commonly used fusion strategies (point-level, feature-level, and decision-level). The proposed prior-level fusion strategy uses the two-dimensional land cover derived from optical imagery as prior knowledge for three-dimensional classification. The LiDAR point cloud is then linked to this prior information using the nearest neighbor method and classified by a deep neural network. Our proposed prior-level strategy achieves higher overall accuracy (82.47%) on data from the International Society for Photogrammetry and Remote Sensing than the baseline (74.62%), point-level (79.86%), feature-level (76.22%), and decision-level (81.12%) strategies. The improved accuracy reflects two points: (1) fusing optical imagery with LiDAR point clouds improves the performance of three-dimensional urban land cover classification, and (2) the proposed prior-level strategy directly uses the semantic information provided by the two-dimensional land cover classification rather than the original spectral information of the optical imagery. Furthermore, the proposed prior-level fusion strategy helps fill the gap between two- and three-dimensional land cover classification.
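
The prior-level linking step is easy to make concrete: each LiDAR point inherits the land-cover label of its nearest labeled 2D sample in planimetric x-y, and that label is appended as an input feature for the 3D classifier. A sketch with synthetic coordinates and labels:

```python
# Sketch: nearest-neighbor transfer of 2D land-cover labels to LiDAR points.
import numpy as np
from scipy.spatial import cKDTree

pixels_xy = np.random.rand(500, 2) * 100        # labeled 2D land-cover samples
pixel_labels = np.random.randint(0, 5, 500)     # e.g., 5 land-cover classes
points = np.random.rand(2000, 3) * 100          # LiDAR points (x, y, z)

_, nn_idx = cKDTree(pixels_xy).query(points[:, :2])
prior = pixel_labels[nn_idx]                    # per-point 2D prior label
features = np.hstack([points, prior[:, None]])  # prior appended as an input feature
print(features.shape)  # (2000, 4)
```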

20 pages, 8716 KiB  
Article
GACM: A Graph Attention Capsule Model for the Registration of TLS Point Clouds in the Urban Scene
by Jianjun Zou, Zhenxin Zhang, Dong Chen, Qinghua Li, Lan Sun, Ruofei Zhong, Liqiang Zhang and Jinghan Sha
Remote Sens. 2021, 13(22), 4497; https://doi.org/10.3390/rs13224497 - 09 Nov 2021
Cited by 2 | Viewed by 1747
Abstract
Point cloud registration is the foundation and a key step for many vital applications, such as digital cities, autonomous driving, passive positioning, and navigation. The differences among spatial objects and the structural complexity of object surfaces are the main challenges for the registration problem. In this paper, we propose a graph attention capsule model (named GACM) for the efficient registration of terrestrial laser scanning (TLS) point clouds in urban scenes, which fuses graph attention convolution and a three-dimensional (3D) capsule network to extract local point cloud features and obtain 3D feature descriptors. These descriptors can take into account the differences in spatial structure and point density of objects and make the spatial features of ground objects more prominent. During training, we used both matched points and non-matched points to train the model. During the registration test, the points in the neighborhood of each keypoint were fed to the trained network to obtain feature descriptors, after which the rotation and translation matrices were calculated using a K-dimensional (KD) tree and the random sample consensus (RANSAC) algorithm. Experiments show that the proposed method achieves more efficient registration results and higher robustness than other frontier registration methods in the pairwise registration of point clouds.
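
Once descriptors yield putative correspondences, the rotation and translation are typically recovered with RANSAC wrapped around a closed-form (Kabsch/SVD) fit. A compact NumPy sketch with illustrative thresholds, not the GACM implementation:

```python
# Sketch: RANSAC + Kabsch rigid registration from putative correspondences.
import numpy as np

def kabsch(src, dst):
    cs, cd = src.mean(0), dst.mean(0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def ransac_register(src, dst, iters=200, tol=0.1):
    best, best_inliers = None, 0
    rng = np.random.default_rng(0)
    for _ in range(iters):
        pick = rng.choice(len(src), 3, replace=False)   # minimal 3-point sample
        R, t = kabsch(src[pick], dst[pick])
        inliers = np.sum(np.linalg.norm(src @ R.T + t - dst, axis=1) < tol)
        if inliers > best_inliers:
            best, best_inliers = (R, t), inliers
    return best

src = np.random.rand(100, 3)
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0], [0, 0, 1]])
dst = src @ R_true.T + np.array([1.0, 2.0, 0.5])
R, t = ransac_register(src, dst)
print(np.allclose(R, R_true, atol=1e-5), np.round(t, 3))
```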

19 pages, 5376 KiB  
Article
Early Labeled and Small Loss Selection Semi-Supervised Learning Method for Remote Sensing Image Scene Classification
by Ye Tian, Yuxin Dong and Guisheng Yin
Remote Sens. 2021, 13(20), 4039; https://doi.org/10.3390/rs13204039 - 09 Oct 2021
Cited by 4 | Viewed by 2640
Abstract
The classification of aerial scenes has been extensively studied as basic work in remote sensing image processing and interpretation. However, the performance of remote sensing image scene classification based on deep neural networks is limited by the number of labeled samples. To alleviate the demand for massive labeled samples, various methods have been proposed that apply semi-supervised learning to train the classifier using labeled and unlabeled samples. However, given the complex contextual relationships and huge spatial differences, existing semi-supervised learning methods introduce varying numbers of incorrectly labeled samples when pseudo-labeling unlabeled data; in particular, when the number of labeled samples is small, this affects the generalization performance of the model. In this article, we propose a novel semi-supervised learning method with early labeled and small loss selection. First, the model learns the characteristics of simple samples in the early stage and uses multiple early models to screen out a small number of unlabeled samples for pseudo-labeling based on this characteristic. Then, the model is trained in a semi-supervised manner by combining labeled samples, pseudo-labeled samples, and unlabeled samples. In the training process, small loss selection is used to further eliminate some of the noisily labeled samples to improve the recognition accuracy of the model. Finally, to verify the effectiveness of the proposed method, it is compared with several state-of-the-art semi-supervised classification methods. The results show that when there are only a few labeled samples in remote sensing image scene classification, our method consistently outperforms previous methods.
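
Small loss selection itself is only a few lines: rank pseudo-labeled samples by their per-sample loss and keep the smallest fraction, treating the rest as probably noisy. A PyTorch sketch with an illustrative keep ratio:

```python
# Sketch: small-loss selection of pseudo-labeled samples (illustrative ratio).
import torch
import torch.nn.functional as F

def small_loss_select(logits, pseudo_labels, keep_ratio=0.7):
    losses = F.cross_entropy(logits, pseudo_labels, reduction="none")
    k = int(keep_ratio * len(losses))
    return torch.topk(-losses, k).indices  # indices of the k smallest losses

logits = torch.randn(32, 10)                             # model outputs
pseudo = torch.randint(0, 10, (32,))                     # pseudo-labels
idx = small_loss_select(logits, pseudo)
clean_loss = F.cross_entropy(logits[idx], pseudo[idx])   # train on kept samples only
print(idx.shape, clean_loss.item())
```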

25 pages, 9119 KiB  
Article
LiDAR-Based SLAM under Semantic Constraints in Dynamic Environments
by Weiqi Wang, Xiong You, Xin Zhang, Lingyu Chen, Lantian Zhang and Xu Liu
Remote Sens. 2021, 13(18), 3651; https://doi.org/10.3390/rs13183651 - 13 Sep 2021
Cited by 6 | Viewed by 2780
Abstract
Facing the realistic demands of robot application environments, the application of simultaneous localisation and mapping (SLAM) has gradually moved from static environments to complex dynamic environments. Traditional SLAM methods, however, usually suffer from pose estimation deviations caused by data association errors arising from the interference of dynamic elements in the environment. The present study effectively solves this problem by proposing a SLAM approach based on light detection and ranging (LiDAR) under semantic constraints in dynamic environments. Four main modules are used: projection of point cloud data, semantic segmentation, dynamic element screening, and semantic map construction. A LiDAR point cloud semantic segmentation network, SANet, based on a spatial attention mechanism is proposed, which significantly improves the real-time performance and accuracy of point cloud semantic segmentation. A dynamic element selection algorithm is designed and used with prior knowledge to significantly reduce the pose estimation deviations caused by dynamic elements. The results of experiments conducted on the public datasets SemanticKITTI, KITTI, and SemanticPOSS show that the accuracy and robustness of the proposed approach are significantly improved.
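
The point cloud projection module typically means a spherical range-image projection, the 2D representation that segmentation networks of this kind consume. A standard NumPy version; the 64x1024 grid and vertical field of view are illustrative values, not the paper's exact settings:

```python
# Standard spherical (range-image) projection of a LiDAR scan.
import numpy as np

def range_projection(points, H=64, W=1024, fov_up=3.0, fov_down=-25.0):
    x, y, z = points.T
    r = np.linalg.norm(points, axis=1)
    yaw, pitch = np.arctan2(y, x), np.arcsin(z / r)
    fu, fd = np.radians(fov_up), np.radians(fov_down)
    u = (0.5 * (1.0 - yaw / np.pi) * W).astype(int) % W          # horizontal bin
    v = ((1.0 - (pitch - fd) / (fu - fd)) * H).clip(0, H - 1).astype(int)
    img = np.full((H, W), -1.0)
    img[v, u] = r   # later points overwrite; sort by depth first if needed
    return img

pts = np.random.randn(10000, 3) * 10 + np.array([20, 0, 0])
print(range_projection(pts).shape)  # (64, 1024)
```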

20 pages, 53952 KiB  
Article
Point Cloud Classification Algorithm Based on the Fusion of the Local Binary Pattern Features and Structural Features of Voxels
by Yong Li, Yinzheng Luo, Xia Gu, Dong Chen, Fang Gao and Feng Shuang
Remote Sens. 2021, 13(16), 3156; https://doi.org/10.3390/rs13163156 - 10 Aug 2021
Cited by 7 | Viewed by 3238
Abstract
Point cloud classification is a key technology for point cloud applications, and feature extraction is a key step towards achieving it. Although there are many point cloud feature extraction and classification methods, and the acquisition of colored point cloud data has become easier in recent years, most point cloud processing algorithms either do not consider the color information associated with the point cloud or do not make full use of it. Therefore, we propose a voxel-based local feature descriptor, the voxel-based local binary pattern (VLBP), and fuse point cloud RGB information and geometric structure features using a random forest classifier to build a color point cloud classification algorithm. The proposed algorithm voxelizes the point cloud; divides the neighborhood of the center point into cubes (i.e., multiple adjacent sub-voxels); compares the gray information of the voxel center and adjacent sub-voxels; performs voxel global thresholding to convert it into a binary code; and uses a local difference sign–magnitude transform (LDSMT) to decompose the local difference of an entire voxel into two complementary components of sign and magnitude. Then, the VLBP feature of each point is extracted. To obtain more structural information about the point cloud, the proposed method extracts the normal vector of each point and the corresponding fast point feature histogram (FPFH) based on the normal vector. Finally, the geometric structure features (normal vector and FPFH) and color features (RGB and VLBP) of the point cloud are fused, and a random forest classifier is used to classify the colored laser point cloud. The experimental results show that the proposed algorithm achieves effective classification for point cloud data from different indoor and outdoor scenes, and the proposed VLBP features improve the accuracy of point cloud classification.
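
The VLBP encoding can be illustrated on a single 3x3x3 block of gray values: threshold the 26 neighbor-minus-center differences into a binary code and keep the magnitudes separately, mirroring the LDSMT sign/magnitude split. A toy NumPy version; real VLBP operates on gray values derived from voxelized RGB:

```python
# Toy VLBP-style code for one voxel neighborhood (illustrative).
import numpy as np

def vlbp_code(gray_block):
    # gray_block: (3, 3, 3) gray values of the center voxel and its 26 neighbors
    center = gray_block[1, 1, 1]
    diffs = np.delete(gray_block.ravel(), 13) - center   # 26 neighbor differences
    signs = (diffs >= 0).astype(np.uint8)                # sign component -> bits
    magnitudes = np.abs(diffs)                           # magnitude component
    code = int("".join(map(str, signs)), 2)              # 26-bit binary pattern
    return code, magnitudes

block = np.random.rand(3, 3, 3)
code, mags = vlbp_code(block)
print(code, mags.mean())
```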

20 pages, 40439 KiB  
Article
Critical Points Extraction from Building Façades by Analyzing Gradient Structure Tensor
by Dong Chen, Jing Li, Shaoning Di, Jiju Peethambaran, Guiqiu Xiang, Lincheng Wan and Xianghong Li
Remote Sens. 2021, 13(16), 3146; https://doi.org/10.3390/rs13163146 - 09 Aug 2021
Cited by 9 | Viewed by 2930
Abstract
This paper proposes a building façade contouring method for LiDAR (Light Detection and Ranging) scans and photogrammetric point clouds. To this end, we calculate a confidence property at multiple scales for each individual point to measure the point cloud’s quality. The confidence property is utilized in the definition of the gradient for each point. We encode each point’s gradient in a structure tensor, whose eigenvalues reflect the gradient variations in the local neighborhood. The critical points representing the building façade and rooftop contours (where such rooftops exist) are then extracted by jointly analyzing dual thresholds on the gradient and the gradient structure tensor. To meet the requirements of compact representation, the initially obtained critical points are finally downsampled, achieving a reasonable tradeoff between accurate geometry and abstract representation. Various experiments using representative buildings from the Semantic3D benchmark and other ubiquitous point clouds from the ALS DublinCity and Dutch AHN3 datasets, the MLS TerraMobilita/iQmulus 3D urban analysis benchmark, a UAV-based photogrammetric dataset, and GeoSLAM ZEB-HORIZON scans show that the proposed method generates building contours that are accurate, lightweight, and robust across ubiquitous point clouds. Two comparison experiments also prove the superiority of the proposed method in terms of topological correctness, geometric accuracy, and representation compactness.
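
The gradient structure tensor test can be sketched directly: accumulate outer products of per-point gradient vectors over a neighborhood and inspect the eigenvalues of the resulting 3x3 tensor; strongly anisotropic tensors flag contour candidates. The gradient approximation below is a simplified stand-in for the paper's confidence-weighted definition:

```python
# Sketch: per-point gradient structure tensor and its eigenvalues.
import numpy as np
from scipy.spatial import cKDTree

def structure_tensor_eigvals(points, values, k=16):
    tree = cKDTree(points)
    _, nbr = tree.query(points, k=k)
    eig = np.empty((len(points), 3))
    for i, idx in enumerate(nbr):
        d = points[idx] - points[i]               # neighbor offsets
        g = (values[idx, None] - values[i]) * d   # crude per-neighbor gradients
        T = g.T @ g / k                           # 3x3 structure tensor
        eig[i] = np.sort(np.linalg.eigvalsh(T))[::-1]  # descending eigenvalues
    return eig

pts = np.random.rand(500, 3)
vals = pts[:, 2] ** 2                             # stand-in scalar field
print(structure_tensor_eigvals(pts, vals)[:2])
```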
