Article

Building Extraction from Airborne LiDAR Data Based on Min-Cut and Improved Post-Processing

1 School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
2 Department of Oceanography, Dalhousie University, Halifax, NS B3H 4R2, Canada
3 School of Resources Environment Science and Technology, Hubei University of Science and Technology, Xianning 437000, China
4 Faculty of Resources and Environmental Science, Hubei University, Wuhan 430062, China
* Author to whom correspondence should be addressed.
Remote Sens. 2020, 12(17), 2849; https://doi.org/10.3390/rs12172849
Submission received: 18 July 2020 / Revised: 25 August 2020 / Accepted: 31 August 2020 / Published: 2 September 2020

Abstract

Building extraction from LiDAR data has been an active research area, but it is difficult to discriminate between buildings and vegetation in complex urban scenes. A building extraction method from LiDAR data based on minimum cut (min-cut) and improved post-processing is proposed. To discriminate building points on intersecting roof planes from vegetation, a point feature based on the variance of normal vectors estimated via the low-rank subspace clustering (LRSC) technique is proposed, and non-ground points are separated into two subsets based on min-cut after filtering. The building extraction results are then refined via improved post-processing using restricted region growing and the constraints of height, maximum intersection angle, and consistency. The maximum intersection angle constraint removes large non-building point clusters with narrow width, such as greenbelts along streets. Contextual information and the consistency constraint are both used to eliminate inhomogeneity. Experiments on seven datasets, including five provided by the International Society for Photogrammetry and Remote Sensing (ISPRS), one with high-density point data, and one with dense buildings, verify that most buildings, even those with curved roofs, are successfully extracted by the proposed method, with over 94.1% completeness and at least 89.8% correctness at the per-area level. In addition, the proposed point feature significantly outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes. Hence, the extracted building points can be used in various applications.

1. Introduction

Automatic building extraction from remote sensing data is a prerequisite for applications such as three-dimensional (3D) building reconstruction, urban planning, disaster assessment, and the updating of digital maps and geographic information system (GIS) databases [1,2,3,4]. Airborne Light Detection and Ranging (LiDAR) provides an alternative way to extract buildings due to its high-density and high-accuracy point data.
In the field of LiDAR, building extraction means separating building points from other points, and can also be termed building detection [1,5]. Due to the complexity of building structures and urban scenes, extracting buildings from LiDAR data remains challenging, and many studies have addressed it in the past two decades. Some methods combine other data sources, such as optical images with spectral and texture information, intensity data, waveform data, and GIS data [6,7,8,9]. Among these, image data is the most commonly used due to its high spatial resolution and its color and texture information [6,10]. By combining two-dimensional (2D) information from images with 3D information from LiDAR data, complementary information can be exploited to extract and reconstruct buildings automatically [1,10]. However, methods using images unavoidably involve some problems. First, LiDAR data and images need to be registered before fusion, which is challenging due to their different characteristics [10,11]. Second, the spatial resolutions of images and LiDAR data differ, which may decrease accuracy after fusion [10,11,12]. Moreover, in some regions, image and LiDAR data are not both available to end users, which limits the practicality of these methods.
Building extraction methods based solely on the point cloud are the mainstream focus [12,13,14,15,16]. Such methods can be broadly categorized into two classes: supervised and unsupervised [12,13]. Based on the basic processing unit, supervised methods fall into three groups: point-based, segment-based, and multiple-entity-based classification [13,14]. Point-based classification methods are the most commonly used and generally comprise three steps: training sample selection, point feature extraction, and classification. Segment-based methods generally consist of four steps: segmentation of raw points, training sample selection, segment feature extraction, and classification [13]. Multiple-entity-based classification is considered a combination of the segment-based and point-based approaches [14]. Satisfactory results can be obtained from supervised approaches with accurate features and proper training samples, but some defects cannot be ignored [15,16,17,18,19,20,21,22,23,24,25,26,27]. First, constant neighborhood scale parameters used for calculating point features fail to accurately describe local structure [21,23]. Second, most classifiers, such as JointBoost [15], support vector machines [16], random forests [17], expectation maximization [18], XGBoost [19], and adaptive boosting [20], classify each point independently without considering the labels of its neighbors, leading to inhomogeneous results in complex scenes [22,24]. Third, supervised methods involve many point features, but more features do not necessarily guarantee higher classification accuracy [27]; on the contrary, too many features may introduce redundant information and increase computation cost [25,26]. In addition, segment-based and multiple-entity-based classification methods rely heavily on the segmentation strategy and are hierarchical procedures involving many steps [28,29,30].
Deep learning has been introduced to point cloud processing in recent years for tasks such as object recognition, classification, and segmentation [31,32,33]. However, the number of samples required to train such models is far larger than for the aforementioned classifiers. Moreover, many deep learning algorithms developed for images cannot be applied to point clouds without modification, due to their irregularity and discreteness [31,32].
Due to the above-mentioned reasons, many researchers study unsupervised methods. These mainly include fitting methods [34,35,36], region growing methods [28,30], clustering methods [37,38], morphology-based methods [39,40], and energy minimization methods [12,15]. Among the fitting methods, random sample consensus (RANSAC) [34] and the 3D Hough transform [36] are widely used to extract buildings. However, the RANSAC approach often extracts pseudo planes in vegetation areas due to random sampling, and its efficiency decreases significantly as the amount of point cloud data grows; the 3D Hough transform is time-consuming and sensitive to fitting parameters [36]. The region growing algorithm is also widely used due to its simplicity and ease of implementation. Generally, it works well only when ideal initial building seeds are correctly selected and the constraints used in the algorithm are reasonable; it tends to overgrow under unreasonable constraints and inaccurate features, especially in transition regions between different objects [28,30]. Clustering methods are statistical techniques that cluster points based on local surface properties or other features [37,38]. In [37], building seeds were detected through semi-suppressed fuzzy C-means, and a restricted region growing algorithm was then applied to search for more building points based on the seeds. Morphology-based methods first rasterize the point cloud and then remove non-building pixels under constraints of size, shape, height, and building element structure; however, the pixel size and the geometrical properties greatly influence the final results [39]. Energy minimization approaches, such as the graph cuts algorithm, provide a global solution that formulates building extraction as an optimization problem [12,15]. In [12], a graph was constructed from the pixels of a generated Digital Surface Model (DSM) image, features of points and grids were both used to construct the energy function, and pixels were finally labelled by minimizing the energy function. In [41], the graph was constructed from voxels after voxelizing the point cloud. Although satisfactory results were obtained in [12,41], the data structure is changed and accuracy may decrease during rasterization or voxelization, because not all points within the same pixel or voxel belong to a building, and post-processing steps are needed to remove zigzags along building boundaries [12]. In [15], the graph cuts algorithm was used to optimize the classification results of a JointBoost classifier, because contextual information between points was not considered by the classifier. However, the JointBoost results were used as initial foreground and background seed points to define the energy function, which is impractical for automatic building extraction because no building seeds are available beforehand.
Based on the above statements, current studies on building extraction still face two problems [24,27]: (1) Most existing algorithms use large numbers of point features to extract buildings [13,24,37]. Too many features increase the computation cost of calculating point features and reduce algorithmic efficiency; moreover, not all features are suitable for building extraction, and unsuitable ones may decrease accuracy [27]. (2) Given the variety of roof types and the complexity of scenes, contextual information is useful for classification [12,24]. However, most methods ignore it, leading to inhomogeneous results and an increased post-processing workload.
Therefore, it is of great importance to continue building extraction research using airborne LiDAR data. The main objectives of this study are (1) to design an effective point feature that discriminates building points on intersecting roof planes from vegetation, and to evaluate its performance against an existing point feature under different scenes and parameter settings; (2) to realize building extraction with few point features, avoiding excessive time spent on feature calculation; and (3) to introduce contextual information into the algorithm to reduce inhomogeneity and improve the accuracy of the extraction results.
The main contributions of this work can be summarized as follows: (1) A point feature based on the variance of accurate normal vectors estimated via the low-rank subspace clustering (LRSC) technique is proposed to discriminate building points on intersecting roof planes from vegetation; the proposed feature significantly outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes. (2) The proposed maximum intersection angle constraint effectively removes large non-building point clusters with narrow width, such as greenbelt points along streets, overcoming the difficulty of setting an area threshold in area-based methods. (3) Contextual information and a consistency constraint are both used to eliminate inhomogeneity, which benefits building extraction. (4) Unlike most previous building extraction methods, only two point features are used, which decreases the computation cost of feature calculation and improves algorithmic efficiency.

2. Methodology

The proposed method includes three main steps: outliers removal and filtering; point features calculation, graph construction and cut; and improved post-processing. The algorithm workflow is shown in Figure 1.

2.1. Outliers Removal and Filtering

The original point cloud provided by an airborne LiDAR system contains outliers introduced during data collection for various reasons, such as multipath effects. It is necessary to remove these outliers to alleviate their effect on subsequent processing. The "StatisticalOutlierRemoval" tool implemented in the Point Cloud Library (PCL) is applied to the original point cloud [42]. The algorithm calculates the mean distance from each point to all of its neighbors and removes points whose mean distance falls outside a range defined by the global mean distance and standard deviation [43]. After that, the denoised points are classified into ground and non-ground subsets by progressive TIN densification (PTD) [44]. PTD is widely used in both the academic community and engineering applications due to its accuracy and efficiency, and has been embedded in commercial software such as TerraSolid and LiDAR_Suite [44,45].
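For illustration, a minimal C++ sketch of this denoising step using PCL's StatisticalOutlierRemoval filter is given below; the neighborhood size and standard deviation multiplier shown are illustrative assumptions, not values reported in the paper.

```cpp
#include <pcl/point_types.h>
#include <pcl/filters/statistical_outlier_removal.h>

// Hedged sketch of the outlier removal step. The parameter values
// (50 neighbors, multiplier 1.0) are illustrative assumptions only.
pcl::PointCloud<pcl::PointXYZ>::Ptr
removeOutliers(const pcl::PointCloud<pcl::PointXYZ>::Ptr& cloud)
{
    pcl::PointCloud<pcl::PointXYZ>::Ptr denoised(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::StatisticalOutlierRemoval<pcl::PointXYZ> sor;
    sor.setInputCloud(cloud);
    sor.setMeanK(50);             // neighbors used to compute each mean distance
    sor.setStddevMulThresh(1.0);  // keep points within mean + 1.0 * stddev
    sor.filter(*denoised);
    return denoised;
}
```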

2.2. Point Features Calculation and Normalization

After filtering, points are classified as ground and non-ground. Non-ground points are generated by laser echoes from buildings, vegetation, and other man-made or natural objects (e.g., vehicles, wires) [15,37,40], and are the input for the subsequent building extraction process. In general, buildings are considered to be composed of planar patches, while vegetation is non-planar, meaning that building points are locally flat and vegetation points are locally rough. Therefore, two features based on these characteristics are used in the proposed method.

2.2.1. Curvature Feature ($f_c$)

Let $NP = \{q_0, q_1, \ldots, q_n\}$ denote the set of non-ground points, and let $N_p = \{p_j \mid p_j \in NP,\; p_j \in k\text{-nearest}(q_i)\}$ represent the k-nearest neighborhood of $q_i$. The covariance matrix $M$ is constructed from $q_i$ and its neighborhood $N_p$, defined as follows:

$$M = \frac{1}{k} \sum_{p \in N_p} (p - \bar{p})(p - \bar{p})^{T}$$

where $\bar{p} = \frac{1}{k} \sum_{p \in N_p} p$ is the centroid of all points in $N_p$ and $k$ is the number of points in $N_p$. The three eigenvalues $\lambda_1, \lambda_2, \lambda_3$ ($\lambda_1 \le \lambda_2 \le \lambda_3$) of $M$ are then obtained via eigen decomposition, and the curvature feature $f_c$ is calculated from them as follows [27]:

$$f_c = \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_3}$$
The curvature feature describes the flatness of a surface and is widely used for plane extraction [27,37,46]. Although the neighborhood size influences $f_c$, the influence is minimal [47]; $k$ is empirically set to 15 by referring to [12].
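As a concrete illustration of the two formulas above, the following sketch computes $f_c$ for a single point from its k nearest neighbors using Eigen; the neighbor search itself (e.g., a k-d tree query) is assumed to happen elsewhere.

```cpp
#include <Eigen/Dense>
#include <vector>

// Hedged sketch: curvature feature f_c of one point from its k neighbors
// (assumed non-empty), following M = (1/k) * sum (p - mean)(p - mean)^T and
// f_c = lambda_1 / (lambda_1 + lambda_2 + lambda_3), lambda_1 the smallest.
double curvatureFeature(const std::vector<Eigen::Vector3d>& neighbors)
{
    const double k = static_cast<double>(neighbors.size());
    Eigen::Vector3d mean = Eigen::Vector3d::Zero();
    for (const auto& p : neighbors) mean += p;
    mean /= k;

    Eigen::Matrix3d M = Eigen::Matrix3d::Zero();
    for (const auto& p : neighbors) {
        const Eigen::Vector3d d = p - mean;
        M += d * d.transpose();
    }
    M /= k;

    // Eigenvalues of a self-adjoint matrix are returned in increasing order.
    Eigen::SelfAdjointEigenSolver<Eigen::Matrix3d> solver(M);
    const Eigen::Vector3d ev = solver.eigenvalues();
    return ev(0) / (ev(0) + ev(1) + ev(2));
}
```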

2.2.2. Variance of LRSC-Based Normal Vector Feature ($f_v$)

The curvature of points on intersecting building roof planes is much larger than that of points on flat roofs, making such building points likely to be misclassified as vegetation [12,37]. To discriminate buildings from vegetation, a feature based on the variance of normal vectors calculated via principal component analysis (PCA) was proposed in [12]. However, normal vectors estimated via unmodified PCA are inaccurate for points on intersecting building roof planes, because their neighborhoods span different planes [48]. Figure 2a,b illustrate PCA-based normal vectors of a synthetic cube and of building roofs [49]; the normal vectors of points on intersecting planes are clearly inaccurate. As a result, features based on these normal vectors, such as the variance of normal vector direction [12] and the normal vector angle distribution histogram [50], are also inaccurate.
Considering the above problem, an accurate normal vector estimation method via low-rank subspace clustering (LRSC) [48] is introduced. This normal vector estimation technique has already been used for automatic building roof segmentation from LiDAR data, with satisfactory experimental results [51]. The algorithm is composed of three main steps: First, points around sharp and smooth regions are identified by covariance analysis of their neighborhoods, and their initial normal vectors are estimated via PCA. Second, the normal vectors of a point's neighbors are used as prior knowledge to construct a guiding matrix. Third, each neighborhood is segmented into several isotropic sub-neighborhoods by LRSC with the guiding matrix, and a consistent sub-neighborhood is used to estimate the point's final normal vector. Figure 2c,d illustrate the LRSC-based normal vectors of the cube and building roofs. Compared with the PCA-based normal vectors, the LRSC-based normal vectors of points in sharp regions are more accurate and reasonable.
Therefore, the feature $f_v$, based on the variance of LRSC-based normal vectors, is proposed to discriminate buildings from vegetation. Its calculation includes the following sub-steps:
Step 1: Calculate the normal vectors of all points via the LRSC technique, and compute the angle $\alpha$ between each normal vector and the vertical direction ($v = (0, 0, 1)$ in 3D space). Note that an estimated normal vector may point in the opposite direction; therefore, when $\alpha$ is larger than $\pi/2$, it is replaced by $\pi - \alpha$.
Step 2: For a point $P$ with neighborhood $N_P = \{q_1, q_2, \ldots, q_m\}$ and corresponding angles $\alpha_P = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$, divide the range $[0, \pi/2]$ into $D_n$ equal bins to construct a $D_n$-dimensional histogram. The number of angles falling within each bin is taken as the value of that bin.
Step 3: Calculate the variance of the histogram as $f_v$ for $P$ with the following formulas:

$$f_v = \frac{\sigma^2}{\mu^2}, \qquad \sigma^2 = \frac{1}{D_n}\sum_{i=1}^{D_n} (n_i - \mu)^2, \qquad \mu = \frac{m}{D_n}$$

where $m$ is the number of neighbors and $n_i$ is the number of angles falling within the $i$th bin.
Step 4: Repeat Steps 2 and 3 until $f_v$ is calculated for all points.
According to [12], the parameter $m$ is empirically set to 60 for each point. $D_n$ has little impact in the range of 5 to 10 and is set to 6 in the proposed method.
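A minimal sketch of Steps 2–4 follows, assuming the per-neighbor angles have already been computed from the LRSC-based normals and folded into [0, π/2]; the LRSC normal estimation itself is outside the scope of this sketch.

```cpp
#include <algorithm>
#include <vector>

// Hedged sketch: f_v of one point from the angles (radians, in [0, pi/2])
// between its neighbors' LRSC-based normals and the vertical direction.
// Dn = 6 bins and m = 60 neighbors follow the settings in the text.
double varianceFeature(const std::vector<double>& angles, int Dn = 6)
{
    const double PI = 3.14159265358979323846;
    const double m = static_cast<double>(angles.size());
    std::vector<double> bins(Dn, 0.0);
    const double binWidth = (PI / 2.0) / Dn;
    for (double a : angles) {
        int i = std::min(static_cast<int>(a / binWidth), Dn - 1);
        bins[i] += 1.0;                       // histogram of angle counts
    }
    const double mu = m / Dn;                 // mean bin count, mu = m / Dn
    double sigma2 = 0.0;
    for (double n_i : bins) sigma2 += (n_i - mu) * (n_i - mu);
    sigma2 /= Dn;                             // variance of bin counts
    return sigma2 / (mu * mu);                // f_v = sigma^2 / mu^2
}
```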
Figure 3a,b illustrate the feature $f_v$ of [12] and ours, respectively, in Area 1 of Vaihingen [49], with points rendered by the value of normalized $f_v$. Compared with [12], the proposed $f_v$ of points on intersecting building roof planes with large slopes is almost consistent with that of points on smooth roof planes (green rectangles in Figure 3c), and the difference in the proposed $f_v$ between buildings and vegetation is more obvious (yellow rectangles in Figure 3c). In addition, discriminating small complex buildings composed of multiple small planar patches from vegetation is a challenging task for the $f_v$ of [12] (blue rectangles in Figure 3c).

2.2.3. Feature Normalization

Considering that the value ranges of the two features differ significantly, a normalization step is needed. A logistic function is employed to normalize the two features, defined as:

$$f(x) = \frac{1}{1 + e^{k(x - x_0)}}$$

where $x_0$ is the feature threshold and $k$ controls the steepness of the logistic curve. In practice, building extraction results are significantly influenced by $x_0$ and minimally influenced by $k$ [12]. Therefore, $k$ is set to −35 and 2.0 for $f_c$ and $f_v$, respectively, according to [12], while the specific values of $x_0$ for $f_c$ and $f_v$ are analyzed and discussed in later sections.
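The normalization is a one-line computation; a minimal sketch follows, using the $k$ values just quoted (the $x_0$ values are the thresholds analyzed in Section 4).

```cpp
#include <cmath>

// Logistic normalization f(x) = 1 / (1 + exp(k * (x - x0))).
// k = -35 for f_c and k = 2.0 for f_v, following [12].
double normalizeFeature(double x, double x0, double k)
{
    return 1.0 / (1.0 + std::exp(k * (x - x0)));
}
```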

2.2.4. Graph Construction and Cut

Point segmentation can be viewed as a labeling problem: assigning a label from a set of labels to each point by minimizing an objective function [52,53]. For building extraction, the labeling problem is to assign a building or non-building label to each non-ground point. A typical objective function is an energy function with two terms: a data term and a smooth term. Among the optimization methods for minimizing such energy functions, the graph cuts approach [54] based on the minimum cut (min-cut) shows good performance and is commonly used in many applications, such as image segmentation [55,56,57] and point cloud segmentation [12,15,41]. Thus, the graph cuts algorithm is adopted to minimize the energy function.
In [54], the graph is composed of a set of nodes and a set of edges, including two special terminal nodes, called the source and the sink, which represent the "foreground" and "background" labels. In the proposed method, each non-ground point is a node. The energy function $E(l)$ is defined as follows:

$$E(l) = \sum_{p \in P} D_p(l_p) + \sum_{p,q \in N} V_{pq}(l_p, l_q)$$

where the first term $\sum_{p \in P} D_p(l_p)$ is the data term, and $D_p(l_p)$ is the penalty of assigning label $l_p$ to node $p$; its value measures how well the label fits node $p$. The second term $\sum_{p,q \in N} V_{pq}(l_p, l_q)$ is the smooth term, and $V_{pq}(l_p, l_q)$ is interpreted as a penalty for discontinuity between nodes $p$ and $q$. Generally, if $p$ and $q$ are similar, $V_{pq}(l_p, l_q)$ is large, meaning $p$ and $q$ are more likely to belong to the same object. The data penalty $D_p(l_p)$ is calculated as follows:
$$D_p(l_p = \text{building}) = \lambda_1 f_c + \lambda_2 f_v$$
$$D_p(l_p = \text{non-building}) = 1 - (\lambda_1 f_c + \lambda_2 f_v)$$

where $\lambda_1$ and $\lambda_2$ are the weights of $f_c$ and $f_v$, respectively, satisfying $\lambda_1 + \lambda_2 = 1$.
The smooth penalty $V_{pq}(l_p, l_q)$ is calculated as follows:

$$V_{pq}(l_p, l_q) = e^{-(\lambda_1 |f_p^c - f_q^c| + \lambda_2 |f_p^v - f_q^v|)} \cdot \frac{1}{d(p,q)^2}, \quad d(p,q) < d_s$$
$$V_{pq}(l_p, l_q) = e^{-(\lambda_1 |f_p^c - f_q^c| + \lambda_2 |f_p^v - f_q^v|)} \cdot \frac{1}{d_s^2}, \quad d(p,q) \ge d_s$$

where $f_p^c$ and $f_q^c$ are the $f_c$ of $p$ and $q$, and $f_p^v$ and $f_q^v$ are the $f_v$ of $p$ and $q$; $d(p,q)$ is the Euclidean distance between points $p$ and $q$; $|\cdot|$ is a norm distance metric, for which the $L_1$ norm is adopted; and $d_s$ is the distance threshold between points, set to twice the average point spacing. Once the graph is constructed, it is cut based on min-cut and each node is given a label; building points are then extracted according to the assigned labels.
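To make the energy concrete, a hedged sketch of the two penalty terms is given below; building the graph and running min-cut with a max-flow solver (e.g., the implementation of [54]) is omitted.

```cpp
#include <algorithm>
#include <cmath>

// Hedged sketch of the energy terms. fc and fv are the normalized features;
// lambda1 + lambda2 = 1 (Section 4.2 settles on 0.4 / 0.6). Feeding these
// penalties into a min-cut solver is outside the scope of this sketch.
struct DataPenalty { double building, nonBuilding; };

DataPenalty dataTerm(double fc, double fv, double lambda1, double lambda2)
{
    const double s = lambda1 * fc + lambda2 * fv;
    return { s, 1.0 - s };   // D_p(building) = s, D_p(non-building) = 1 - s
}

double smoothTerm(double fc_p, double fv_p, double fc_q, double fv_q,
                  double dist, double ds, double lambda1, double lambda2)
{
    const double diff = lambda1 * std::fabs(fc_p - fc_q)
                      + lambda2 * std::fabs(fv_p - fv_q);
    const double d = std::min(dist, ds);   // distance factor is capped at d_s
    return std::exp(-diff) / (d * d);
}
```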

2.3. Improved Post-Processing

Although most building points are successfully extracted from the non-ground points, some non-building points (e.g., vehicles with smooth surfaces and flat overpasses) are wrongly classified as buildings, and some building points are omitted. To solve these problems, improved post-processing is adopted to refine the building extraction results.

2.3.1. Height Constraint

In general, a building should be sufficiently high. Thus, a height threshold $T_h$ is set: a point is removed if the absolute height difference between it and its nearest ground point is less than $T_h$. Under this constraint, some or even all points of vehicles with smooth surfaces can be excluded due to their low height. $T_h$ is set according to average human height, e.g., 1.5 m [23].

2.3.2. Restricted Region Growing

Note, however, that some buildings are located on slopes, where some building points fail the height constraint. Figure 4a illustrates the profile of a building located on a slope; some of its points are classified as non-building after applying the height constraint. Thus, restricted region growing based on a height constraint is conducted to recover the omitted building points: a non-building point is reclassified as building if the absolute height difference between it and its nearest building point is less than a predefined threshold (0.1 m, following [37]). Building extraction results after restricted region growing are shown in Figure 4b. Note that ground, building, and non-building points are rendered in blue, red, and white, respectively, in subsequent sections.
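A hedged sketch of the growing loop is shown below; the Point type and the brute-force nearest-neighbor search are illustrative placeholders (in practice a k-d tree over the current building points would be used).

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hedged sketch of restricted region growing: a non-building point becomes
// building if its height differs from its nearest building point by less
// than dz (0.1 m, following [37]). Point is an illustrative placeholder.
struct Point { double x, y, z; bool isBuilding; };

void restrictedRegionGrowing(std::vector<Point>& pts, double dz = 0.1)
{
    bool grown = true;
    while (grown) {                       // repeat until no point is relabeled
        grown = false;
        for (auto& p : pts) {
            if (p.isBuilding) continue;
            // Brute-force nearest building point in the horizontal plane.
            double best = std::numeric_limits<double>::max();
            const Point* nearest = nullptr;
            for (const auto& q : pts) {
                if (!q.isBuilding) continue;
                const double d2 = (p.x - q.x) * (p.x - q.x)
                                + (p.y - q.y) * (p.y - q.y);
                if (d2 < best) { best = d2; nearest = &q; }
            }
            if (nearest && std::fabs(p.z - nearest->z) < dz) {
                p.isBuilding = true;      // absorb the point and keep growing
                grown = true;
            }
        }
    }
}
```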

2.3.3. Maximum Intersection Angle Constraint

After the above-mentioned steps, some non-building points or clusters, belonging to vegetation with flat surfaces or to small vehicles with smooth surfaces, are still wrongly classified as buildings. Area-based strategies are commonly adopted to remove such non-building points by clustering, based on the assumption that buildings generally occupy a certain minimum area [12,23,27]. Although satisfactory results can be obtained by setting proper area thresholds, these strategies fail in some scenes. Figure 5 illustrates the building extraction results after the above steps in Area 3 of Vaihingen [49], from LiDAR data with an average point density of 3.2 points/m². Building points are separated into two clusters via Euclidean clustering, one located in the green rectangle and the other in the yellow rectangle. The small cluster in the yellow rectangle contains more than 110 points, meaning it occupies approximately 35 m². In practice, however, many buildings are smaller than that and would be eliminated by area-based methods with an area threshold of 35 m². To solve this issue, the maximum intersection angle constraint is proposed.
Figure 5c illustrates the concept of the intersection angle, which is formed at the current building point by its cylindrical neighbors searched from the non-building and ground points. Because buildings occupy a certain area and building façades act as barriers preventing ground points from falling inside the building footprint, the maximum intersection angle of a real building point is larger than 90° at building corners and larger than 180° away from corners. In contrast, the maximum intersection angle of a vegetation point or other non-building point is less than 90°, since such a point is surrounded by ground and non-building points in all directions within the cylinder (yellow circle in Figure 5c) [58].
Figure 5d shows the calculation of the maximum intersection angle for a given current building point O and its cylindrical nearest neighbors. The calculation is composed of three main steps:
Step 1: Select an initial direction $ON$ and, for each nearest point, calculate its rotational angle $\alpha_i$ with respect to $ON$ using its x, y coordinates.
Step 2: Sort the above angles in ascending order, and calculate the intersection angle $\delta$ between adjacent rotational angles as follows:

$$\delta_i = \begin{cases} \alpha_{i+1} - \alpha_i, & i = 1, 2, \ldots, k-1 \\ 360^{\circ} + \alpha_1 - \alpha_k, & i = k \end{cases}$$

Step 3: Select the maximum $\delta_i$ as the maximum intersection angle of point O.
For a current building point O, if its maximum intersection angle is larger than the pre-defined threshold, O is kept as a building point; otherwise, it is reclassified as non-building. If neither non-building points nor ground points fall within the cylinder of O, it is classified as building directly.
The maximum intersection angle constraint involves two threshold parameters: the radius of the cylinder used to search neighbors among ground and non-building points, and the angular threshold defining the minimum angle formed by the façade alignments at corners. In the proposed method, the radius is empirically set to 2.5 m according to [12,37], and the angular threshold is set to 90° according to [58].
Note that the width of a flat overpass generally ranges between 2.5 m and 3.5 m [59]; thus, with the above empirical thresholds, flat overpasses can be excluded from the detected building points under the maximum intersection angle constraint.
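A hedged sketch of Steps 1–3 follows; it takes the 2D offsets of the cylindrical neighbors (drawn from the ground and non-building points) relative to the candidate point O, and returns the largest angular gap. The no-neighbor case, which the text classifies as building directly, is represented here by returning 360°.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Hedged sketch: maximum intersection angle (degrees) of a candidate
// building point, from the (dx, dy) offsets of its cylindrical neighbors.
double maxIntersectionAngle(const std::vector<std::pair<double, double>>& offsets)
{
    const double PI = 3.14159265358979323846;
    if (offsets.empty()) return 360.0;    // no neighbors: treated as building

    std::vector<double> angles;
    for (const auto& o : offsets) {
        double a = std::atan2(o.second, o.first) * 180.0 / PI;  // (-180, 180]
        if (a < 0.0) a += 360.0;                                // -> [0, 360)
        angles.push_back(a);
    }
    std::sort(angles.begin(), angles.end());

    double maxGap = 360.0 + angles.front() - angles.back();  // wrap-around gap
    for (size_t i = 0; i + 1 < angles.size(); ++i)
        maxGap = std::max(maxGap, angles[i + 1] - angles[i]);
    return maxGap;
}
```

A candidate point is then kept as building when the returned angle exceeds the 90° threshold.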

2.3.4. Consistency Constraint

Although most buildings are extracted after the above three steps, some special building points are omitted and some non-building points are wrongly classified as buildings. Figure 6a illustrates a roof terrace in Vaihingen [49] with some attachments on it, such as tables, chairs, and small vegetation. Points in this area are clearly rough, whereas the building surface is flat, as shown in Figure 6c; as a result, some building points fail to be detected (hereinafter referred to as undetected building points). Figure 7 shows two trees with dense leaves, one of which has a flat top; consequently, points falling in that region are wrongly classified as buildings (hereinafter referred to as false building points).
Note that undetected building points are surrounded by building points, and false building points are surrounded by non-building points. Therefore, a consistency constraint is proposed to solve both problems.
Since detecting undetected building points and removing false building points involve similar steps, we take the elimination of false building points as an example. It is composed of three main steps: First, the minimal and maximal x and y coordinates of the non-ground points, denoted $x_{max}$, $x_{min}$, $y_{max}$, $y_{min}$, are obtained, and the minimum bounding rectangle covering the non-building and ground points is partitioned into uniform cells of size $l$. Second, for each building point $p(x_i, y_i)$, the row number $\lfloor (x_i - x_{min})/l \rfloor$ and column number $\lfloor (y_i - y_{min})/l \rfloor$ of its cell $g$ are calculated from its x, y coordinates. Third, cells are searched along each of four directions, stopping when a cell containing points is reached. Figure 8 illustrates the search paths: cell $g$ is rendered in red, a yellow arrow shows the search path in each direction, the last cell in each direction is colored magenta, and the other cells are rendered in green. The cell size $l$ is set to twice the average point spacing according to [12,37].
Finally, four non-empty cells are obtained. If these cells contain non-building points and no ground points, then $p(x_i, y_i)$ is relabeled as non-building; otherwise, it remains a building point. Figure 6d illustrates that the undetected building points are recovered, and Figure 7d shows that the false building points are removed, avoiding inhomogeneity in the building extraction results.
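The sketch below illustrates the four-direction search for one candidate false building point, assuming the minimum bounding rectangle has already been rasterized into an occupancy grid of non-building and ground points; the Cell type is an illustrative placeholder.

```cpp
#include <vector>

// Hedged sketch of the consistency check. Each cell of the precomputed grid
// records whether it contains any (non-building or ground) points and
// whether any of them is a ground point; the cell size is twice the average
// point spacing, as in the text.
struct Cell { bool hasPoints = false; bool hasGround = false; };

// True if the building point in cell (row, col) should be relabeled
// non-building: the first non-empty cell in each of the four directions
// exists and none of those four cells contains ground points.
bool isFalseBuilding(const std::vector<std::vector<Cell>>& grid, int row, int col)
{
    if (grid.empty() || grid[0].empty()) return false;
    const int dr[4] = { -1, 1, 0, 0 };
    const int dc[4] = { 0, 0, -1, 1 };
    const int rows = static_cast<int>(grid.size());
    const int cols = static_cast<int>(grid[0].size());
    for (int d = 0; d < 4; ++d) {
        int r = row + dr[d], c = col + dc[d];
        while (r >= 0 && r < rows && c >= 0 && c < cols && !grid[r][c].hasPoints) {
            r += dr[d];
            c += dc[d];
        }
        if (r < 0 || r >= rows || c < 0 || c >= cols)
            return false;   // ran off the grid: the point is not enclosed
        if (grid[r][c].hasGround)
            return false;   // a ground cell breaks the enclosure
    }
    return true;            // enclosed by non-building points on all sides
}
```

The same search with the roles of building and non-building points swapped recovers the undetected building points.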

3. Experimental Results and Analysis

To validate the proposed method, datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) in Vaihingen and Toronto [49], one dataset in New Zealand [60], and one dataset in the state of Utah [61] were used in the experiments. The results were displayed in LiDAR_Suite, an airborne LiDAR data processing software package developed by the authors' research and development (R&D) group.

3.1. Experiments on the ISPRS Benchmark Dataset

3.1.1. Data Description

The ISPRS dataset is composed of five reference areas: Area 1 to 3 in Vaihingen and Area 4 to 5 in Toronto. The average point density of Area 1 to 3 is approximately 4–7 points/m², and that of Area 4 to 5 is about 6 points/m². Note that the three Vaihingen test areas (Area 1 to 3) are located in residential regions, while the two Toronto areas (Area 4 to 5) are within commercial zones, as shown in Figure 9a–e.
Area 1: includes 37 buildings, mainly composed of dense historic buildings having rather complex shapes along with roads and trees.
Area 2: includes 14 buildings, mainly composed of a few high regular residence buildings with horizontal roofs.
Area 3: includes 56 buildings, mainly composed of several detached buildings with simple structured roofs and vegetation along roads.
Area 4: includes 58 buildings, mainly composed of low and high-storey buildings with complex roof structure and rooftop attachments.
Area 5: includes 38 buildings, comprising a cluster of high-rise buildings with diverse roof structures, complex shapes, and rooftop attachments.

3.1.2. Results and Analysis

Initial building extraction results based on the graph cuts algorithm and the height constraint are shown in Figure 10. Most buildings are successfully extracted in Area 1 to 5, and almost no inhomogeneity exists within individual buildings in Area 1 to 3, where roofs have almost no irregular complex attachments. This is because the proposed method considers spatial neighborhood information, which helps exploit contextual information and improve classification accuracy [41,62]. The satisfactory initial results reduce the post-processing workload. However, small clusters or individual points belonging to vegetation or small objects still exist, as shown in the green rectangles in Figure 10b,c, and a few points inside buildings fail to be extracted, as shown in the green rectangles in Figure 10a.
For quantitative evaluation, four indicators are computed at the per-area and per-object levels: completeness CP (%), correctness CR (%), quality Q (%), and $F_1$-score (%) [37,63,64,65]. Completeness is the percentage of reference buildings that are correctly detected; correctness is the percentage of detected buildings that are correct. Quality and $F_1$-score are compound metrics that balance completeness and correctness. The equations are as follows:

$$CP = \frac{TP}{TP + FN}, \qquad CR = \frac{TP}{TP + FP}, \qquad Q = \frac{TP}{TP + FN + FP}, \qquad F_1 = \frac{2 \cdot CP \cdot CR}{CP + CR}$$

where $TP$ is the number of correctly detected buildings, $FN$ is the number of omitted buildings, and $FP$ is the number of wrongly detected buildings. The quantitative evaluation results of Area 1 to 5 are listed in Table 1. To compare the performance of different methods on Area 1 to 3, 20 methods [12,23,37,49] that use LiDAR data alone are selected, and the average results are listed in Table 2, where the ID "WHU_TQ" refers to the proposed method. For Area 4 to 5, 11 methods [23,49] are selected and compared with the proposed method in Table 3. Figure 11 shows the extraction results at the per-pixel level against the reference ground truth.
From Table 1, Table 2 and Table 3, we find that satisfactory results are obtained by the proposed method at both the per-area and per-object levels. The CP metric at the per-area level is important because it directly relates to post-editing work and subsequent building modeling. At this level, average CP values of 95.5% in Vaihingen and 98.4% in Toronto are obtained, outperforming the comparison methods in Table 2 and Table 3, which means buildings are more easily recognized by the proposed method. In addition, the differences in Q among the five areas are minimal at the per-area level, which demonstrates the stability of the proposed method. Although the average per-area Q of Area 1 to 3 is lower than that of some methods, such as [12,23,37], those methods have some defects. Contextual information was not considered in [23,37], while existing research has demonstrated that contextual information benefits data processing [12,15,24]; as a result, the initial extraction results of [37] are not as good as ours, which increases the post-processing workload. In [23], some inhomogeneity occurred when extracting complex buildings. Moreover, the number of point features used in the proposed method is smaller than in [12,23,37], which benefits building extraction by decreasing the computation cost of feature calculation and improving algorithmic efficiency. For the Area 4 to 5 datasets, the average Q of the best method slightly outperforms the proposed method at the per-area level, mainly because several non-building objects with flat surfaces and large size are wrongly classified as buildings. But the CP of the proposed method is clearly higher than that of the other methods at the per-area level.
From Table 2 and Table 3, it can be seen that the scores of the four metrics differ for the same method, indicating that the preferred method depends on which metric matters; in fact, the best method is application-oriented. For example, buildings with a floor area larger than 50 m² are more important in urban planning and reconstruction [1,5], so it is reasonable to choose [12] to extract buildings from the Area 1 to 3 datasets. The efficiency of a method also needs to be considered in practice.
As shown in Figure 11, most buildings are correctly extracted by the proposed method. Although some vegetation with smooth surfaces is initially classified as buildings (green rectangle in Figure 10b), it is eliminated under the consistency constraint. Likewise, there are long greenbelt vegetation areas of uniform height above 1.5 m whose points are classified as buildings due to their flat surfaces (green rectangle in Figure 10c); these are eliminated using the maximum intersection angle constraint. However, some building attachments are omitted (hereinafter referred to as false negative errors) and some non-building points are wrongly classified as buildings (hereinafter referred to as false positive errors).
False negative errors have four main causes: (1) Complex building structure. In Figure 11f, there is a skylight (about 3.0 m × 4.0 m) and a roof terrace (about 2.5 m × 3.0 m) containing some small attachments (apparently chairs and a bed). As a result, points in these regions are rough and the buildings are hard to extract effectively, which is a common problem for many methods [12,23,37]. However, if the attachments are located on a large roof terrace, the terrace can still be extracted under the consistency constraint, as demonstrated in the green rectangle in Figure 11g; the phenomenon also occurs in Area 5 (green circle in Figure 11e). (2) Occlusion by vegetation. The blue rectangle in Figure 11g is a building roof that is partially occluded; as a result, only the non-occluded parts are extracted successfully. When a small building is partially occluded by vegetation, the whole building may be omitted. (3) Low building height. Generally, a building should be high enough for people to enter and exit. However, in Figure 11i, the maximal absolute height difference between building and ground is about 1.0 m, less than 1.5 m, and the heights of many building points are almost equal to the ground; in [12,41], points in this region were wrongly classified as ground for the same reason. In this case, the building fails to be extracted; in fact, without the height constraint, points in this region are extracted in our experiments. The phenomenon also occurs in Area 4 (green rectangle in Figure 11d). (4) Missing data. Possibly due to the roof material of the building shown in Figure 11j, only partial point cloud data of the building were collected; thus, only part of the building is extracted by the proposed method. A possible solution is to fuse LiDAR data and images.
Conversely, false positive errors have two main causes: (1) Surrounding vegetation with height similar to the building and a smooth surface. Because the graph cuts algorithm considers neighborhood relationships based on feature differences, and the features themselves are calculated from neighborhoods, building regions easily spread into adjacent vegetation areas, as shown in Figure 11h. Although the problem can be mitigated by decreasing the smooth term, completeness then decreases for lack of neighborhood information. (2) Non-building objects similar to buildings. Figure 11k illustrates a non-building object with flat surfaces and large size; it is hard to discriminate such objects from buildings using only the common constraints, such as the height, area, and maximum intersection angle constraints. As a result, it is wrongly classified as a building. This phenomenon also occurs in Area 5 (green rectangle in Figure 11e). A possible solution is the fusion of LiDAR data with other data sources.

3.2. Experiments on Other Two LiDAR Datasets

3.2.1. Data Description

Two datasets with different point densities are used: one captured in New Zealand [60] and the other in the state of Utah [61]. The average point density of the New Zealand dataset is about 20 points/m², while that of the Utah dataset is about 3 points/m², as shown in Figure 12.
New Zealand dataset: includes two buildings with curved roofs and several large connected complex buildings.
Utah dataset: includes dense residential buildings with significantly different sizes, shapes and structures surrounded by vegetation.

3.2.2. Results and Analysis

The New Zealand and Utah datasets have been classified into four classes (ground, vegetation, building, and others) using LiDAR_Suite and manual post-processing; the obtained buildings are therefore used as ground truth in the experiments. Building extraction results at the per-pixel level for the two datasets are illustrated in Figure 13, and quantitative evaluation results at the per-area level are shown in Table 4.
In the New Zealand dataset, the large connected complex buildings and the buildings with curved roofs are successfully extracted, as shown in Figure 13a,b. Note that Q in the New Zealand dataset is 93.2%, significantly larger than the 88.3% of the Utah dataset. The reasons may include: (1) the New Zealand dataset is high-density point data (20 points/m²), and more accurate details can be obtained as point density increases [47], which benefits building extraction; (2) buildings are far from vegetation and only slightly occluded by it.
In the Utah dataset, although building sizes, shapes, and structures differ significantly, most buildings are successfully extracted by the proposed method. However, three buildings (green rectangles in Figure 13c) fail to be extracted completely, for two main possible reasons: (1) Missing data. Possibly due to the roof material, only partial building points of these three buildings were collected, as shown in Figure 12b. As a result, points in these local areas are rough, and only some of them are classified as buildings. (2) The maximum intersection angle constraint. Some non-building and building points mix together after the graph cuts algorithm due to the missing data, and these building points are then eliminated under the maximum intersection angle constraint owing to the way the angle is computed. Despite these problems, buildings with complete data are extracted successfully, as shown in Figure 13c.

4. Discussion

The point features, the parameter $x_0$ of the logistic function used to normalize each feature, and the weights $\lambda_1$, $\lambda_2$ of each feature in the data and smooth terms are important to the final results. Therefore, this section discusses the proposed feature, the above parameter settings, and the running time of the proposed method.

4.1. Discussion of $f_v$

In the proposed method, the point feature $f_v$ based on the variance of LRSC-based normal vectors (hereinafter $f_{LRSC}$) is used to discriminate building points from vegetation. To evaluate its performance, a comparison between $f_{LRSC}$ and the $f_v$ based on PCA-based normal vectors (hereinafter $f_{PCA}$) is conducted on the Area 1 to 3 datasets. In the comparison, only $f_{LRSC}$ or $f_{PCA}$ is used to extract buildings in the proposed method, and the quality metric Q (%) at the per-area level measures the accuracy of building extraction, as shown in Figure 14.
In Figure 14, the accuracy of $f_{PCA}$ changes significantly for different $x_0$ compared with $f_{LRSC}$ in Area 1 to 3, indicating that $f_{PCA}$ is more sensitive to the feature threshold $x_0$. Moreover, the Q difference between the two features is larger in Area 1 than in Areas 2 and 3, because Area 1 includes rather complex historic buildings composed of irregular roof planes with different slopes, while most roofs in Area 2 are horizontal and the buildings in Area 3 are simple and regular. In addition, the LRSC technique calculates more accurate normal vectors than PCA [48]. This indicates that $f_{LRSC}$ performs much better than $f_{PCA}$ for extracting buildings in complex scenes.

4.2. Discussion of Parameters Setting

The parameter $x_0$ of $f_{LRSC}$ is set to 1.0 according to the results in Figure 14. The per-area Q (%) is used to study the optimal $x_0$ of $f_c$ on the Area 1 to 3 datasets, with only $f_c$ used in the proposed method; Q is shown in Figure 15a. Q is more sensitive to $x_0$ in Area 1 than in the other two areas, possibly due to the complex buildings, and Q reaches its maximum when $x_0$ is set properly for each of the three areas. According to the average results, the highest Q is obtained when $x_0$ of $f_c$ is set to 0.06.
When analyzing the weight parameters $\lambda_1$ and $\lambda_2$, one is adjusted from 0 to 1 and the other is set correspondingly so that the two sum to 1. The per-area Q is used to study the optimal $\lambda_1$ and $\lambda_2$, as shown in Figure 15b. The highest average Q is obtained when $\lambda_1 = 0.4$ and $\lambda_2 = 0.6$, showing that accuracy is improved by combining the two features.

4.3. Discussion of Running Time

Experiments on the ISPRS datasets (Area 1 to 5) and the other two LiDAR datasets (New Zealand and Utah) were performed on a laptop with 16 GB RAM, an Intel Core i7-7700HQ @ 2.8 GHz CPU, and a 64-bit Windows 10 operating system. The proposed method was implemented in C++ on the Visual Studio 2013 platform. Note that the total running time ($T$) of the proposed method is composed of two parts: the time ($T_1$) before post-processing (i.e., Steps 1 and 2) and the time ($T_2$) of post-processing (i.e., Step 3), listed in Table 5.
From Table 5, it can be seen that $T_1$ is significantly less than $T_2$, because only two point features need to be calculated and the graph cuts and PTD algorithms are efficient [44,45,54]. Therefore, satisfactory initial extraction results can be obtained in little time. Analysis shows that the maximum intersection angle constraint occupies most of the post-processing time, because large numbers of neighbors are searched to calculate angles and obtain the maximum intersection angle for eliminating small clusters of non-building points. An area-based method could eliminate these non-building points more efficiently [12], but it is sensitive to the area threshold and, unlike the maximum intersection angle constraint, fails to eliminate large non-building point clusters with narrow width. A possible solution is to combine the area-based method with the maximum intersection angle constraint; how to combine the two methods, and their potential influence on eliminating non-building points, are worth further study. Note that the running time of Area 4 and 5 is far longer than that of Area 1 to 3, because Area 4 and 5 contain far more points and more complex scenes.

5. Conclusions

Due to the complexity of urban scenes, it is still challenging to extract buildings automatically. In this paper, a building extraction method from airborne LiDAR data based on min-cut and improved post-processing is proposed. To discriminate building points on intersecting roof planes from vegetation, a point feature based on the variance of normal vectors estimated via low-rank subspace clustering (LRSC) is proposed, and non-ground points are separated into two subsets based on min-cut after filtering. Then, improved post-processing refines the building extraction results using restricted region growing and the constraints of height, maximum intersection angle, and consistency. Omitted points of buildings located on slopes are recovered by the restricted region growing. The proposed maximum intersection angle constraint effectively removes large non-building point clusters with narrow width, such as greenbelts along streets, overcoming the difficulty of setting an area threshold in area-based methods. Contextual information and the consistency constraint are both used to eliminate inhomogeneity in the process of building extraction. No manual operations are required except predefining some threshold values. Experiments on seven datasets verify that most buildings, even those with curved roofs, are successfully extracted by the proposed method. In terms of precision, for Area 1 to 3 in Vaihingen the average Q metrics of the proposed method reach 87.7%, 83.7%, and 99.1% at the per-area, per-object, and per-object (>50 m²) levels, respectively; for Area 4 to 5 in Toronto, the average Q metric is 88.9% at the per-area level. The per-area Q of the proposed method reaches 93.2% for the high-density dataset with an average point density of 20 points/m². Moreover, the proposed point feature outperforms the comparison alternative and is less sensitive to the feature threshold in complex scenes. In addition, only two point features are used in the proposed method, which decreases the computation cost of feature calculation and improves algorithmic efficiency.
However, some defects still exist in the proposed method. It remains challenging to extract buildings with complex skylights or roof terraces due to the roughness of their points; a feasible solution is to combine images or intensity data to obtain extra features [6,10], which deserves further study. In addition, several parameters are adopted in the proposed method, which reduces the full automation of building extraction; we will attempt to construct a self-adaptive building extraction algorithm in future work. Moreover, other normalization functions are available, such as min-max normalization and Z-score normalization [66], but only the logistic function is employed here. How to introduce other normalization functions, and their potential influence on building extraction, are worth further research.

Author Contributions

Conceptualization, K.L.; methodology, K.L.; software, L.Z. and H.M. (Haichi Ma); validation, H.M. (Hongchao Ma) and Z.C.; formal analysis, K.L. and H.M. (Hongchao Ma); investigation, K.L.; resources, H.M. (Haichi Ma) and K.L.; data curation, H.M. (Haichi Ma) and K.L.; writing—original draft preparation, K.L.; writing—review and editing, H.M. (Hongchao Ma) and Z.C.; visualization, K.L. and H.M. (Haichi Ma); supervision, H.M. (Hongchao Ma); project administration, H.M. (Hongchao Ma); funding acquisition, H.M. (Hongchao Ma) and L.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (2018YFB0504500), National Natural Science Foundation of China (Grant numbers 41601504 and 61378078), and National High Resolution Earth Observation Foundation (11-Y20A12-9001-17/18).

Acknowledgments

The test datasets were provided by the International Society for Photogrammetry and Remote Sensing (ISPRS) Working Group and OpenTopography. The authors would like to thank them and the anonymous reviewers for their constructive comments on the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Wang, R.; Peethambaran, J.; Chen, D. LiDAR point clouds to 3-D Urban Models: A review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 606–627.
2. Guo, L.; Luo, J.; Yuan, M.; Huang, Y.; Shen, H.; Li, T. The influence of urban planning factors on PM2.5 pollution exposure and implications: A case study in China based on remote sensing, LBS, and GIS data. Sci. Total Environ. 2019, 659, 1585–1596.
3. Janalipour, M.; Mohammadzadeh, A. A novel and automatic framework for producing building damage map using post-event LiDAR data. Int. J. Disaster Risk Reduct. 2019, 39, 1–13.
4. Peng, Z.; Gao, S.; Xiao, B.; Guo, S.; Yang, Y. CrowdGIS: Updating digital maps via mobile crowdsensing. IEEE Trans. Autom. Sci. Eng. 2017, 15, 369–380.
5. Zhou, Z.; Gong, J. Automated residential building detection from airborne LiDAR data with deep neural networks. Adv. Eng. Inform. 2018, 36, 229–241.
6. Chen, S.; Shi, W.; Zhou, M.; Zhang, M.; Chen, P. Automatic building extraction via adaptive iterative segmentation with LiDAR data and high spatial resolution imagery fusion. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 2081–2095.
7. Mallet, C.; Bretar, F.; Roux, M.; Soergel, U.; Heipke, C. Relevance assessment of full-waveform lidar data for urban area classification. ISPRS J. Photogramm. Remote Sens. 2011, 66, S71–S84.
8. Donoghue, D.N.M.; Watt, P.J.; Cox, N.J.; Wilson, J. Remote sensing of species mixtures in conifer plantations using LiDAR height and intensity data. Remote Sens. Environ. 2007, 110, 509–522.
9. Salimzadeh, N.; Hammad, A. High-level framework for GIS-based optimization of building photovoltaic potential at urban scale using BIM and LiDAR. In Proceedings of the International Conference on Sustainable Infrastructure, New York, NY, USA, 26–28 October 2017; pp. 123–134.
10. Awrangjeb, M.; Zhang, C.; Fraser, C. Automatic extraction of building roofs using LIDAR data and multispectral imagery. ISPRS J. Photogramm. Remote Sens. 2013, 83, 1–18.
11. Awrangjeb, M.; Fraser, C. Automatic segmentation of raw LiDAR data for extraction of building roofs. Remote Sens. 2014, 6, 3716–3751.
12. Du, S.; Zhang, Y.; Zou, Z.; Xu, S.; He, X.; Chen, S. Automatic building extraction from LiDAR data fusion of point and grid-based features. ISPRS J. Photogramm. Remote Sens. 2017, 130, 294–307.
13. Ni, H.; Lin, X.; Zhang, J. Classification of ALS point cloud with improved point cloud segmentation and random forests. Remote Sens. 2017, 9, 288.
14. Vosselman, G.; Coenen, M.; Rottensteiner, F. Contextual segment-based classification of airborne laser scanner data. ISPRS J. Photogramm. Remote Sens. 2017, 128, 354–371.
15. Guo, B.; Huang, X.; Zhang, F.; Sohn, G. Classification of airborne laser scanning data using JointBoost. ISPRS J. Photogramm. Remote Sens. 2015, 100, 71–83.
16. Zhang, J.; Lin, X.; Ning, X. SVM-based classification of segmented airborne LiDAR point clouds in urban areas. Remote Sens. 2013, 5, 3749–3775.
17. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
18. Moon, T.K. The expectation-maximization algorithm. IEEE Signal Process. Mag. 1996, 13, 47–60.
19. Torlay, L.; Perrone-Bertolotti, M.; Thomas, E.; Baciu, M. Machine learning–XGBoost analysis of language networks to classify patients with epilepsy. Brain Inform. 2017, 4, 1–11.
20. Zhang, Z.; Zhang, L.; Tong, X.; Mathiopoulos, T.; Guo, B.; Huang, X.; Wang, Z.; Wang, Y. A multilevel point-cluster-based discriminative feature for ALS point cloud classification. IEEE Trans. Geosci. Remote Sens. 2016, 54, 3309–3321.
21. Weinmann, M.; Jutzi, B.; Mallet, C. Semantic 3D scene interpretation: A framework combining optimal neighborhood size selection with relevant features. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2014, 2, 181–188.
22. Wang, C.; Shu, Q.; Wang, X.; Guo, B.; Liu, P.; Li, Q. A random forest classifier based on pixel comparison features for urban LiDAR data. ISPRS J. Photogramm. Remote Sens. 2019, 148, 75–86.
23. Huang, R.; Yang, B.; Liang, F.; Dai, W.; Li, J.; Tian, M.; Xu, W. A top-down strategy for buildings extraction from complex urban scenes using airborne LiDAR point clouds. Infrared Phys. Technol. 2018, 92, 203–218.
24. Wang, Y.; Jiang, T.; Yu, M.; Tao, S.; Sun, J.; Liu, S. Semantic-based building extraction from LiDAR point clouds using contexts and optimization in complex environment. Sensors 2020, 20, 3386.
25. Dong, W.; Lan, J.; Liang, S.; Yao, W.; Zhang, Z. Selection of LiDAR geometric features with adaptive neighborhood size for urban land cover classification. Int. J. Appl. Earth Obs. Geoinf. 2017, 60, 99–110.
26. Han, W.; Wang, R.; Huang, D.; Xu, C. Large-scale ALS data semantic classification integrating location-context-semantics cues by higher-order CRF. Sensors 2020, 20, 1700.
27. Cai, Z.; Ma, H.; Zhang, L. Feature selection for airborne LiDAR data filtering: A mutual information method with Parzen window optimization. GISci. Remote Sens. 2020, 57, 323–337.
28. Vo, A.V.; Truong-Hong, L.; Laefer, D.F.; Bertolotto, M. Octree-based region growing for point cloud segmentation. ISPRS J. Photogramm. Remote Sens. 2015, 104, 88–100.
29. Xu, Y.; Yao, W.; Hoegner, L.; Stilla, U. Segmentation of building roofs from airborne LiDAR point clouds using robust voxel-based region growing. Remote Sens. Lett. 2017, 8, 1062–1071.
30. Awrangjeb, M.; Lu, G.; Fraser, C. Automatic building extraction from LiDAR data covering complex urban scenes. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2014, 40, 25–32.
31. Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 2019, 38, 1–12.
32. Hu, X.; Yuan, Y. Deep-learning-based classification for DTM extraction from ALS point cloud. Remote Sens. 2016, 8, 730.
33. Yao, X.; Guo, J.; Hu, J.; Cao, Q. Using deep learning in semantic classification for point cloud data. IEEE Access 2019, 7, 37121–37130.
34. Verma, V.; Kumar, R.; Hsu, S. 3D building detection and modeling from aerial LiDAR data. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Washington, DC, USA, 17–22 June 2006; pp. 2213–2220.
35. Tarsha-Kurdi, F.; Landes, T.; Grussenmeyer, P. Hough-transform and extended RANSAC algorithms for automatic detection of 3D building roof planes from LiDAR data. Int. Arch. Photogramm. Remote Sens. 2007, 66, 124–132.
36. Borrmann, D.; Elseberg, J.; Lingemann, K.; Nüchter, A. The 3D Hough transform for plane detection in point clouds: A review and a new accumulator design. 3D Res. 2011, 2, 1–13.
37. Cai, Z.; Ma, H.; Zhang, L. A building detection method based on semi-suppressed fuzzy C-means and restricted region growing using airborne LiDAR. Remote Sens. 2019, 11, 848.
38. Adhikari, S.K.; Sing, J.K.; Basu, D.K.; Nasipuri, M. Conditional spatial fuzzy C-means clustering algorithm for segmentation of MRI images. Appl. Soft Comput. 2015, 34, 758–769.
39. Meng, X.; Wang, L.; Currit, N. Morphology-based building detection from airborne LIDAR data. Photogramm. Eng. Remote Sens. 2009, 75, 437–442.
40. Cheng, L.; Zhao, W.; Han, P.; Zhang, W.; Shan, J.; Liu, Y. Building region derivation from LiDAR data using a reversed iterative mathematic morphological algorithm. Opt. Commun. 2013, 286, 244–250.
41. Gerke, M.; Xiao, J. Fusion of airborne laserscanning point clouds and images for supervised and unsupervised scene classification. ISPRS J. Photogramm. Remote Sens. 2014, 87, 78–92.
42. Rusu, R.B.; Cousins, S. 3D is here: Point Cloud Library (PCL). In Proceedings of the IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 1–4.
43. Rusu, R.B.; Marton, Z.C.; Blodow, N.; Dolha, M.; Beetz, M. Towards 3D point cloud based object maps for household environments. Robot. Auton. Syst. 2008, 56, 927–941.
  44. Ma, H.; Zhou, W.; Zhang, L. DEM refinement by low vegetation removal based on the combination of fullwaveform data and progressive TIN densification. ISPRS J. Photogramm. Remote Sens. 2018, 146, 260–271. [Google Scholar] [CrossRef]
  45. Axelsson, P. Processing of laser scanner data-algorithms and applications. ISPRS J. Photogramm. Remote Sens. 1999, 54, 138–147. [Google Scholar] [CrossRef]
  46. Liu, K.; Ma, H.; Zhang, L.; Cai, Z.; Ma, H. Strip adjustment of airborne LiDAR data in urban scenes using planar features by the minimum hausdorff distance. Sensors 2019, 19, 5131. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  47. Blomley, R.; Jutzi, B.; Weinmann, M. Classification of airborne laser scanning data using geometric, multi-scale features and different neighborhood types. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 3, 169–176. [Google Scholar] [CrossRef]
  48. Zhang, J.; Cao, J.; Liu, X.; Wang, J.; Liu, J.; Shi, X. Point cloud normal estimation via low-rank subspace clustering. Comput. Graph. 2013, 37, 697–706. [Google Scholar] [CrossRef]
  49. ISPRS. Available online: http://www2.isprs.org/commissions/comm3/wg4/tests.html (accessed on 22 August 2020).
  50. Li, Y.; Tong, G.; Du, X.; Yang, X.; Zhang, J.; Yang, L. A single point-based multilevel features fusion and pyramid neighborhood optimization method for ALS point cloud classification. Appl. Sci. 2019, 9, 951. [Google Scholar] [CrossRef] [Green Version]
  51. Gilani, S.A.N.; Awrangjeb, M.; Lu, G. Segmentation of airborne point cloud data for automatic building roof extraction. GIsci. Remote Sens. 2018, 55, 63–89. [Google Scholar] [CrossRef] [Green Version]
  52. Delong, A.; Osokin, A.; Isack, H.N.; Boykov, Y. Fast approximate energy minimization with label costs. Int. J. Comput. Vis. 2012, 96, 1–27. [Google Scholar] [CrossRef]
  53. Ural, S.; Shan, J. Min-Cut based segmentation of airborne LiDAR point clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, 39, 167–172. [Google Scholar] [CrossRef] [Green Version]
  54. Boykov, Y.; Kolmogorov, V. An experimental comparison of min-cut/max- flow algorithms for energy minimization in vision. IEEE Trans. Pattern Anal. Mach. Intell. 2004, 26, 1124–1137. [Google Scholar] [CrossRef] [Green Version]
  55. Li, W.; Wang, Y.; Du, J.; Lai, J. Synergistic integration of graph-cut and cloud model strategies for image segmentation. Neurocomputing 2017, 257, 37–46. [Google Scholar] [CrossRef]
  56. Guo, Y.; Akbulut, Y.; Şengür, A.; Xia, R.; Smarandache, F. An efficient image segmentation algorithm using neutrosophic graph cut. Symmetry 2017, 9, 185. [Google Scholar] [CrossRef] [Green Version]
  57. Reza, M.N.; Na, I.S.; Baek, S.W.; Lee, K.H. Rice yield estimation based on K-means clustering with graph-cut segmentation using low-altitude UAV images. Biosyst. Eng. 2019, 177, 109–121. [Google Scholar] [CrossRef]
  58. Sánchez-Lopera, J.; Lerma, J.L. Classification of lidar bare-earth points, buildings, vegetation, and small objects based on region growing and angular classifier. Int. J. Remote Sens. 2014, 35, 6955–6972. [Google Scholar] [CrossRef]
  59. Truong, L.T.; Nguyen, H.T.; Nguyen, H.D.; Vu, H.V. Pedestrian overpass use and its relationships with digital and social distractions, and overpass characteristics. Accid. Anal. Prev. 2019, 131, 234–238. [Google Scholar] [CrossRef]
  60. OpenTopography. Available online: https://portal.opentopography.org/datasetMetadata?otCollectionID=OT.022020.2193.2 (accessed on 22 August 2020). [CrossRef]
  61. OpenTopography. Available online: https://portal.opentopography.org/datasetMetadata?otCollectionID=OT.122014.26912.1 (accessed on 22 August 2020). [CrossRef]
  62. Ardila, J.P.; Tolpekin, V.A.; Bijker, W.; Stein, A. Markov-Random-Field-Based super-resolution mapping for identification of urban trees in VHR images. ISPRS J. Photogramm. Remote Sens. 2011, 66, 762–775. [Google Scholar] [CrossRef]
  63. Rutzinger, M.; Rottensteiner, F.; Pfeifer, N. A comparison of evaluation techniques for building extraction from airborne laser scanning. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2009, 2, 11–20. [Google Scholar] [CrossRef]
  64. Ma, L.; Li, Y.; Li, J.; Zhong, Z.; Chapman, M. Generation of horizontally curved driving lines in HD maps using mobile laser scanning point clouds. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 1572–1586. [Google Scholar] [CrossRef]
  65. Powers, D. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Tech. 2011, 2, 37–63. [Google Scholar]
  66. Eesa, A.S.; Arabo, W.K. A normalization methods for backpropagation: A comparative study. Sci. J. Univ. Zakho 2017, 5, 319–323. [Google Scholar] [CrossRef]
Figure 1. Algorithm flowchart.
Figure 2. Normal vector. (a) PCA-based normal vector of a cube; (b) PCA-based normal vector of a building roof; (c) LRSC-based normal vector of a cube; (d) LRSC-based normal vector of a building roof.
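As background for Figure 2: the PCA baseline estimates each normal from the covariance of a fixed neighborhood, so neighborhoods that straddle a roof ridge or cube edge average two planes and the normals smear across the crease, whereas LRSC [48] first groups the neighborhood into low-rank subspaces and fits the normal within one of them. Below is a minimal sketch of the PCA baseline, not the authors' code; the neighborhood size k, the array names, and the use of SciPy's cKDTree are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def pca_normals(points, k=15):
    """Estimate one unit normal per point from the PCA of its k nearest neighbors."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)              # (n, k) neighbor indices
    normals = np.empty_like(points)
    for i, nb in enumerate(idx):
        q = points[nb] - points[nb].mean(axis=0)  # centered neighborhood
        w, v = np.linalg.eigh(q.T @ q)            # eigenvalues in ascending order
        normals[i] = v[:, 0]                      # smallest-eigenvalue eigenvector
    normals[normals[:, 2] < 0] *= -1              # orient upward for airborne data
    return normals
```

This averaging at creases is exactly the artifact visible in Figure 2a,b, and it is why the paper adopts LRSC-estimated normals (Figure 2c,d).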
Figure 3. Visualization of f_v (rendered by the normalized value of f_v). (a) f_v of [12]; (b) the proposed f_v; (c) Orthophoto.
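Per the abstract, the proposed f_v is the variance of LRSC-estimated normal vectors around each point. The following rough stand-in, not the paper's exact definition, shows how such a variance-of-normals feature separates planar roofs from vegetation and produces renderings like Figure 3; the function name, k, and the min-max normalization step are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def normal_variance_feature(points, normals, k=15):
    """Variance of neighborhood normals: near zero on planar roofs, high in canopy."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    # total variance of the k neighbor normals around each point
    fv = np.array([normals[nb].var(axis=0).sum() for nb in idx])
    # min-max normalization to [0, 1] for rendering, as in Figure 3
    return (fv - fv.min()) / (fv.max() - fv.min() + 1e-12)
```

The intended advantage, per the abstract, is that with LRSC normals even building points on intersecting roof planes remain separable from vegetation, which is the contrast panels (a) and (b) draw against the f_v of [12].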
Figure 4. Profile of a building located on a slope. (a) Before restricted region growing; (b) After restricted region growing.
Figure 5. Concept of intersection angle. (a) Building extraction after restricted region growing; (b) Orthophoto; (c) Concept of intersection angles; (d) Calculation of the maximum intersection angle.
Figure 6. Roof terrace. (a) Orthophoto; (b) Point data. Yellow lines are reference vectors provided by ISPRS. The approximate location of the profile is indicated by the green line; (c) Profile before the consistency constraint; (d) Profile after the consistency constraint.
Figure 7. Two trees with dense leaves. (a) Orthophoto; (b) Point data. The approximate location of the profile is indicated by the green line; (c) Profile before the consistency constraint; (d) Profile after the consistency constraint.
Figure 8. Search path in four directions.
Figure 9. Test datasets. (a) Area 1; (b) Area 2; (c) Area 3; (d) Area 4; (e) Area 5.
Figure 10. Initial results of building extraction based on graph cuts and height constraints. (a) Results of Area 1; (b) Results of Area 2; (c) Results of Area 3; (d) Results of Area 4; (e) Results of Area 5.
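The initial labels in Figure 10 come from a min-cut over the filtered non-ground points. As a hedged illustration of the mechanism only (the paper defines its own energy terms and solves them with the Boykov-Kolmogorov algorithm [54]; p_building, lam, and the exponential edge weight below are assumptions), a source/sink graph can be built and cut with NetworkX as follows:

```python
import networkx as nx
import numpy as np
from scipy.spatial import cKDTree

def mincut_building_labels(points, p_building, k=5, lam=0.5):
    """Binary building / non-building labeling of points via an s-t min-cut."""
    eps = 1e-6
    G = nx.DiGraph()
    for i, p in enumerate(p_building):
        # s->i is cut when i ends up non-building: cost -log P(non-building)
        G.add_edge('s', i, capacity=-np.log(max(1.0 - p, eps)))
        # i->t is cut when i ends up building: cost -log P(building)
        G.add_edge(i, 't', capacity=-np.log(max(p, eps)))
    dist, idx = cKDTree(points).query(points, k=k + 1)
    for i in range(len(points)):
        for d, j in zip(dist[i, 1:], idx[i, 1:]):  # column 0 is the point itself
            w = lam * float(np.exp(-d))            # smoothness, decays with distance
            G.add_edge(i, int(j), capacity=w)
            G.add_edge(int(j), i, capacity=w)
    _, (source_side, _) = nx.minimum_cut(G, 's', 't')
    labels = np.zeros(len(points), dtype=int)
    labels[[i for i in source_side if i != 's']] = 1  # source side = building
    return labels
```

In the paper's pipeline, height constraints are then applied to these initial labels before the restricted region growing and the other post-processing steps refine the result.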
Figure 11. Visualization of building extraction results at the per-pixel level and error factors. (a) Area 1; (b) Area 2; (c) Area 3; (d) Area 4; (e) Area 5; (f) Details of A in Area 1; (g) Details of B in Area 1; (h) Details of C in Area 1; (i) Details of D in Area 2; (j) Details of E in Area 3; (k) Details of F in Area 4.
Figure 12. Test datasets. (a) New Zealand dataset; (b) Utah dataset.
Figure 13. Building extraction at the per-pixel level. (a) Results of New Zealand dataset; (b) Details in New Zealand dataset; (c) Results of Utah dataset.
Figure 14. Quality (Q) metric of the extraction results at the per-area level. (a) Q of Area 1; (b) Q of Area 2; (c) Q of Area 3; (d) Average Q.
Figure 15. Results of different x_0 for f_c and different weights for features. (a) Results of different x_0 for f_c; (b) Results of different weights for each feature.
Table 1. Building extraction results. (The best values per column are shown in bold.)

Test Case     | Per-Area (%)         | Per-Object (%)       | Per-Object > 50 m² (%)
              | CP   CR   Q    F1    | CP   CR   Q    F1    | CP    CR    Q     F1
Area 1        | 97.1 91.4 88.9 94.2  | 83.8 96.9 81.6 89.9  | 100   100   100   100
Area 2        | 95.4 92.9 88.9 94.1  | 85.7 100  85.7 92.3  | 100   100   100   100
Area 3        | 94.1 90.2 85.4 92.1  | 83.9 100  83.9 91.2  | 97.4  100   97.4  98.7
Average (1-3) | 95.5 91.5 87.7 93.5  | 84.5 99.0 83.7 91.2  | 99.1  100   99.1  99.5
Area 4        | 98.2 90.5 89.1 94.2  | 98.3 85.5 84.6 91.5  | 100   93.4  93.4  96.6
Area 5        | 98.6 89.8 88.7 94.0  | 94.7 72.0 69.2 81.8  | 97.1  84.6  82.5  90.4
Average (4-5) | 98.4 90.2 88.9 94.1  | 96.5 78.8 76.7 86.6  | 98.6  89.0  88.0  93.5
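For reference, the CP, CR, Q, and F1 columns in Tables 1-4 are the standard completeness, correctness, quality, and F-measure scores [63,65]; assuming the usual true-positive/false-positive/false-negative (TP/FP/FN) counting, they are

```latex
\mathrm{CP} = \frac{TP}{TP + FN}, \qquad
\mathrm{CR} = \frac{TP}{TP + FP}, \qquad
Q = \frac{TP}{TP + FP + FN}, \qquad
F_1 = \frac{2\,\mathrm{CP}\cdot\mathrm{CR}}{\mathrm{CP} + \mathrm{CR}}.
```

As a quick consistency check on the Area 1 per-area row: F1 = 2(0.971)(0.914)/(0.971 + 0.914) ≈ 94.2%, and since 1/Q = 1/CP + 1/CR − 1, Q ≈ 89.0%, matching the tabulated 94.2% and 88.9% up to rounding.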
Table 2. Comparison of average building extraction results (Areas 1-3). (The best values per column are shown in bold.)

ID      | Per-Area (%)         | Per-Object (%)        | Per-Object > 50 m² (%)
        | CP   CR   Q    F1    | CP   CR    Q    F1    | CP     CR     Q      F1
UMTA    | 92.3 87.5 81.5 89.8  | 80.0 98.6  79.1 88.3  | 99.1   100.0  99.1   99.5
UMTP    | 92.4 86.0 80.3 89.1  | 80.9 95.8  78.1 87.7  | 98.8   97.2   96.0   98.0
MON     | 92.7 88.7 82.8 90.7  | 82.7 93.1  77.7 87.6  | 99.1   100.0  99.1   99.5
VSK     | 85.8 98.4 84.6 91.7  | 79.7 100.0 79.7 88.7  | 97.9   100.0  97.9   98.9
WHUY1   | 87.3 91.6 80.8 89.4  | 77.6 98.1  76.5 86.7  | 97.4   97.9   95.4   97.6
WHUY2   | 89.7 90.9 82.3 90.3  | 83.0 97.5  81.3 89.7  | 99.1   98.0   97.2   98.5
HANC1   | 91.5 92.5 85.2 92.0  | 81.5 72.7  62.4 76.8  | 100.0  95.8   95.8   97.9
HANC2   | 90.2 93.2 84.6 91.7  | 85.1 69.6  61.9 76.6  | 100.0  100.0  100.0  100.0
MAR1    | 87.0 97.1 84.8 91.8  | 78.2 96.2  75.7 86.3  | 99.1   100.0  99.1   99.5
MAR2    | 89.7 95.2 85.8 92.4  | 80.6 93.7  76.5 86.7  | 99.1   98.9   98.0   99.0
TON     | 77.7 97.7 76.3 86.6  | 67.5 98.9  66.9 80.2  | 92.7   98.8   91.6   95.7
HANC3   | 91.3 95.9 87.8 93.5  | 85.4 82.2  71.7 83.8  | 100.0  98.9   98.9   99.4
WHU_QC  | 85.8 98.7 84.8 91.8  | 80.9 99.0  80.3 89.0  | 96.8   100.0  96.8   98.4
MON2    | 87.6 91.0 80.6 89.3  | 86.3 93.9  81.6 89.9  | 99.1   100.0  99.1   99.5
WHU_YD  | 89.8 98.6 88.6 94.0  | 87.8 99.3  87.3 93.2  | 99.1   100.0  99.1   99.5
MON4    | 94.3 82.9 79.0 88.2  | 83.9 93.8  79.3 88.6  | 99.1   100.0  99.1   99.5
MON5    | 89.9 90.3 82.0 90.1  | 87.2 96.3  84.4 91.5  | 99.1   100.0  99.1   99.5
[12]    | 94.0 94.9 89.5 94.4  | 83.3 100.0 83.3 90.9  | 100.0  100.0  100.0  100.0
[23]    | 89.8 98.6 88.6 94.0  | 87.8 99.3  87.3 93.2  | -      -      -      -
[37]    | 93.4 95.8 89.6 94.6  | -    -     -    -     | -      -      -      -
WHU_TQ  | 95.5 91.5 87.7 93.5  | 84.5 99.0  83.7 91.2  | 99.1   100.0  99.1   99.5
Table 3. Comparison of average building extraction results (Areas 4-5). (The best values per column are shown in bold.)

ID      | Per-Area (%)         | Per-Object (%)       | Per-Object > 50 m² (%)
        | CP   CR   Q    F1    | CP   CR   Q    F1    | CP    CR    Q     F1
TUM     | 85.1 80.6 70.6 82.7  | 83.9 90.3 76.9 87.0  | 88.2  92.5  82.3  90.3
MAR1    | 96.1 92.1 88.7 94.0  | 98.7 86.8 85.8 92.4  | 98.6  87.6  86.5  92.8
WHUY2   | 94.3 91.3 86.5 92.7  | 90.4 95.8 87.0 93.0  | 94.8  95.8  91.0  95.3
ITCM    | 76.9 87.5 69.2 81.8  | 86.5 21.7 20.9 34.6  | 89.7  70.5  65.2  78.9
ITCR    | 75.0 94.5 71.9 83.6  | 79.6 43.5 39.1 56.2  | 83.8  91.8  77.9  87.6
MAR2    | 94.0 94.3 88.9 94.1  | 91.3 91.9 84.5 91.6  | 95.7  96.8  92.8  96.2
MON2    | 95.9 92.2 88.7 94.0  | 93.4 81.1 76.7 86.8  | 95.7  94.5  90.7  95.1
Z_GIS   | 91.7 90.3 83.4 91.0  | 95.7 86.4 83.1 90.8  | 96.3  87.3  84.4  91.5
WHU_YD  | 95.8 94.6 90.8 95.2  | 91.3 95.4 87.4 93.3  | 95.7  95.4  91.4  95.5
HKP     | 97.6 92.7 90.6 95.1  | 93.9 90.4 85.4 92.1  | 95.7  90.4  86.9  93.0
[23]    | 95.8 94.7 90.8 95.2  | 91.3 95.4 87.5 93.3  | -     -     -     -
WHU_TQ  | 98.4 90.2 88.9 94.1  | 96.5 78.8 76.7 86.6  | 98.6  89.0  88.0  93.5
Table 4. Evaluation results of building extraction at the area level.

Data                 | CP (%) | CR (%) | Q (%) | F1
New Zealand dataset  | 98.4   | 94.7   | 93.2  | 96.5
Utah dataset         | 95.3   | 92.3   | 88.3  | 93.8
Table 5. Running time (s).

Item | Area 1 | Area 2 | Area 3 | Area 4 | Area 5 | New Zealand Dataset | Utah Dataset
T1   | 14     | 18     | 20     | 357    | 307    | 70                  | 127
T2   | 23     | 51     | 61     | 4831   | 4642   | 653                 | 2239
T3   | 37     | 69     | 81     | 5188   | 4949   | 723                 | 2366
