Article

3D Point Cloud Completion Method Based on Building Contour Constraint Diffusion Probability Model

1 College of Geomatics and Geoinformation, Guilin University of Technology, Guilin 541004, China
2 Guangxi Institute of Industrial Technology, Nanning 530200, China
3 Ecological Spatiotemporal Big Data Perception Service Laboratory, Guilin 541004, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(20), 11246; https://doi.org/10.3390/app132011246
Submission received: 6 September 2023 / Revised: 29 September 2023 / Accepted: 9 October 2023 / Published: 13 October 2023
(This article belongs to the Special Issue 3D Scene Understanding and Object Recognition)

Abstract

Building point cloud completion is the process of reconstructing missing parts of a building’s point cloud, which have been affected by external factors during data collection, to restore the original geometric shape of the building. However, the uncertainty in filling point positions in the areas where building features are missing makes it challenging to recover the original distribution of the building’s point cloud shape. To address this issue, we propose a point cloud generation diffusion probability model based on building outline constraints. This method constructs building-outline-constrained regions using information related to the walls on the building’s surface and adjacent roofs. These constraints are encoded by an encoder and fused into latent codes representing the incomplete building point cloud shape. This ensures that the completed point cloud adheres closely to the real geometric shape of the building by constraining the generated points within the missing areas. The quantitative and qualitative results of the experiment clearly show that our method performs better than other methods in building point cloud completion.

1. Introduction

Urban-scale real-world 3D construction is becoming an important direction in the digital era of urbanization, with widespread applications in smart cities, urban planning, and surveying [1,2,3]. 3D point clouds map the geometric shape, surface features, and spatial distribution of real-world buildings into digital data [4], making them a crucial data source for real-world 3D construction. However, LiDAR point clouds may have a non-uniform density distribution due to laser scattering, and point clouds generated by photogrammetry can be influenced by lighting conditions, weather, equipment accuracy, and other factors, leading to differences in the accuracy and consistency of point cloud data. In addition, both methods can leave point cloud data missing in areas occluded by other objects, resulting in incomplete geometric information about buildings. To correctly fill in missing building point cloud data, point cloud completion has therefore become a crucial preprocessing step for point clouds [5].
Generative Adversarial Networks (GAN) [6], Variational Auto-Encoders (VAE) [7], Normalizing Flows (NF) [8], and Denoising Diffusion Probabilistic Models (DDPM) [9] have achieved significant success in content generation tasks. Building upon these generative models, research has explored point cloud completion methods based on GANs [10,11], VAE models [12], flow-based models [13], and diffusion-based models [14]. While these methods have made progress in certain aspects, they also have inherent limitations. GAN-based methods can be challenging to train because two networks must be trained jointly, leading to unstable training and potential convergence issues. VAE-based models may generate building shapes that are not sufficiently smooth, partially attributable to injected noise and the use of element-wise measurements such as the squared error. Flow-based models can require long training times and may not perform well when generating sparse point clouds. DDPM-based models, owing to the high stochasticity of the noise injection and denoising processes during training, may struggle to deterministically control the generation of geometric point clouds in the missing areas of buildings.
To address these problems, we propose a building contour constraint diffusion probabilistic model for completing building point clouds. By constructing contour-constrained shape latent codes for the missing regions, we constrain the generation process of the diffusion probabilistic model so that the completed point cloud of the missing regions stays close to the geometric features of the original building. This method not only recovers missing parts from incomplete point cloud data but also maintains consistency with the building’s geometric characteristics and contours. An overview of our approach is as follows: First, we use RANSAC [15] to fit the building’s point cloud into multiple surface polygons. Next, we group individual buildings into one or more clusters based on predefined polygon correlation constraints. Then, we project the roof and wall polygons of clusters with missing areas onto the ground plane and use geometric transformations to construct closed polygons, forming contour-constrained regions. Finally, the extracted contour-constrained region is encoded into a latent code representing its shape distribution, which is then fused with the latent code of the incomplete building point cloud. This fused representation is used in the reverse diffusion process to complete the point cloud. Our primary contributions are as follows:
(1) For clustered segments with missing parts, we construct constraint regions by projecting wall contour lines and roof edge contour lines onto the ground plane.
(2) By encoding the constrained region’s point cloud and fusing the latent code of the incomplete building point cloud with the latent code of the building’s contour-constrained region, the point cloud’s missing regions are completed through the reverse process.
The rest of this paper is structured as follows: Section 2 presents related work. Section 3 introduces our model. Section 4 describes the experiments we conducted on the BuildingNet dataset for point cloud completion of incomplete building point clouds. Section 5 concludes our paper.

2. Related Works

2.1. Traditional Point Cloud Completion Methods

Traditional point cloud completion methods can typically be categorized into geometry-based methods and alignment-based methods [16]. Geometry-based methods leverage the geometric features of the original point cloud for shape completion. Among geometry-based point cloud completion methods, Berger et al. [5] used interpolation to fill gaps in the incomplete point cloud, while Thrun et al. [17] identified symmetry in the input point cloud based on regular structural repetitions and predicted the missing pieces. These methods are sometimes useful in specific situations; however, they require the input point cloud to be symmetric with a low missing rate and may lose some fine-grained details.
Alignment-based methods aim to achieve point cloud shape completion by finding matching primitives from a model database. In alignment-based point cloud completion methods, Li et al. [18] segmented building point clouds into planar patches and used an improved global fitting approach to estimate the parameters of corresponding building primitives, progressively completing the missing regions. Verdie et al. [19] employed geometric attributes and a set of semantic rules combined with Markov random fields to classify building scenes and fit corresponding planar structures to complete missing areas. Although these methods are effective, retrieving matching building primitives from a model library often requires significant computational resources, and maintaining the reliability and accuracy of retrieval necessitates regular updates to the model library to reflect new building elements and their variations.

2.2. Deep Learning-Based Point Cloud Completion Methods

Deep learning-based point cloud completion models can learn features from the input point cloud to complete the target's full shape. They offer stronger generalization capabilities than traditional methods.

2.2.1. Voxel-Based Method

Point cloud completion on voxelized point clouds can be accomplished using CNNs. Wu et al. [20] introduced 3D ShapeNets, a method that utilizes Convolutional Deep Belief Networks to transform 3D geometric shapes into binary variable probability distributions on 3D voxel grids. The authors of [21] employed an MRF model combining geometric and multi-view RGB data for point cloud completion; the authors of [22] created a completed point cloud by combining the overall structural information and local geometric information of the input point cloud; and the authors of [23] fused 3D shape information through a volumetric deep neural network to complete the point cloud. GRNet [24] regularizes disordered point clouds into 3D grids to better access the geometric structure and contextual information of the point clouds. SPU-Net [25] acquires contextual information both within and between local regions by combining self-attention modules with a GCN, and then uses a hierarchically learnable folding strategy with a learnable two-dimensional grid to generate point clouds. However, these methods usually require voxelizing the point cloud before applying convolutions, resulting in high computational cost.

2.2.2. Transformer-Based Methods

Transformer-based methods can model how a point cloud’s local features relate to its overall structural characteristics. Pointr [26] treats point cloud completion as a set-to-set translation task by representing a point cloud as a set of unordered point proxies, employing an Encoder–Decoder structure to generate the point cloud of missing regions, and designing a geometry-aware block to explicitly model geometric relationships. PMP-Net++ [27] adds a Transformer network to PMP-Net [28], making the point moving paths conform to the target structural features through the relationship between the overall structural features and local features of the point cloud, which significantly improves completion performance. However, compared to other methods, Transformer-based methods have many parameters, require expensive experimental equipment, and their attention mechanisms are difficult to interpret.

2.2.3. Point-Based Method

Point-based methods directly model each point in the point cloud.
PointNet [29] uses multi-layer perceptrons (MLPs) to learn the distribution of and relationships among the points in a point cloud, creating new points to fill the missing regions so that the completed point cloud maintains a consistent geometric shape. PointNet++ [30] introduced hierarchical layers for multi-scale feature aggregation, enabling the learning of multi-scale features of point clouds. To mitigate structural losses introduced by MLPs, AtlasNet [31] reconstructs complete outputs by estimating a set of parameterized surface elements, achieving full point cloud generation. PCN [16] performs feature learning directly in the point cloud space to complete missing point clouds. However, PCN only utilizes global features and does not consider local geometric features, making it challenging to capture the topological structures of fine-grained building details. FinerPCN [32] combines PCN with pointwise convolution to extract local information, progressively generating detailed and complete point cloud data. LAKe-Net [33] completes point clouds using a keypoint-skeleton-shape prediction approach by locating aligned keypoints. AGConv [34] enhances the flexibility of point cloud convolution by generating adaptive kernels based on dynamically learned point features. This enables effective and precise capture of the diverse relationships between points from different semantic parts, achieving adaptability within the convolution operation. Although point-based methods process points independently at the local level to maintain permutation invariance, they lose local features because they do not consider the geometric relationships between points and their positions.
The latent space representation of point clouds provides crucial feature information for 3D shape reconstruction. Cycle4Completion [35] lets the model learn complete point cloud completion through two simultaneous cyclic transformations between the latent spaces of complete and incomplete 3D shapes. ShapeInversion [36] introduces GANs into point cloud completion by searching, in a pre-trained GAN, for the latent code of the complete shape that best reconstructs the given partial input shape; gradient descent is then used to backpropagate the loss and update the latent code. PC-GAN [11] combines hierarchical and interpretable sampling with hierarchical Bayesian and implicit generative networks, which increases the model’s stability and extracts more expressive features, thus improving the quality of the generated point clouds. PF-Net [37] uses a multi-layer network based on feature points to generate point clouds, taking incomplete point clouds as input and utilizing multi-stage point cloud completion and an adversarial loss to produce more realistic point clouds for the missing regions, thus better preserving the individual features of objects. CP3 [38] introduces an Incomplete-to-Incomplete (IOI) sampling mechanism as an inverse prompt function, which randomly cuts the incomplete point cloud into a new incomplete point cloud; this new incomplete point cloud is then fed into a pre-trained generative network to recover the original incomplete point cloud. While GAN architectures can use discriminator-guided implicit learning to evaluate the point sets provided by the generator, the unordered nature of point clouds can result in differently positioned corresponding areas in 3D space, leading to incorrect completions. Additionally, training GANs on complex raw point sets (exceeding 2048 points) can be challenging due to the difficulty of handling the point distribution.
VAE and DDPM models are typically trained on complete 3D objects to learn the weights used to generate latent representations of the corresponding point cloud data. Spurek et al. [12] proposed the HyperPocket variational autoencoder architecture, which uses the hypernetwork paradigm to fill the missing regions of the point cloud with different variants produced by the latent representations. VRCNet [39] uses dual-path units and VAE-based relational enhancement modules for probabilistic modeling of point clouds. Luo and Hu [14] proposed a diffusion probabilistic model for point cloud generation that models the repeated addition of noise to the point cloud as a Markov chain transforming the target distribution into a noise distribution; a neural network learns the reverse of this process to obtain optimal parameters for generating, from Gaussian noise, point clouds that conform to the target data distribution. However, the diffusion model is a stochastic process and may result in partial deviations between the finally generated point cloud and the actual shape.

3. Methods

The structure of our proposed 3D point cloud generation diffusion probability model based on building contour constraints is shown in Figure 1. The model consists of four modules: Contour Constraint, Point Encoding, Diffusion, and Reverse. The contour constraint module extracts the building contour constraint regions $C$ and $M$ of the missing part of the input point cloud and samples them as dense point clouds $X_C$ and $X_M$. The diffusion module gradually adds Gaussian noise to the input original building point cloud over the time steps. The point-encoding module is divided into three parts: the first part encodes the input building point cloud into its shape latent code $h$; the second part encodes the dense point clouds $X_C$ and $X_M$ extracted by the contour constraint module, which represent the building's contour constraint regions, into their shape latent codes $C$ and $M$; and the third part combines the shape latent codes obtained in the first two parts into $H$. The reverse module samples from the noise distribution $p(X^{(T)}) \sim \mathcal{N}(0, I)$, which approximates the point set $q(X^{(T)})$, and feeds the samples to a network $\theta$ that estimates the denoising motion of each point at time step $t$, thereby reconstructing the overall building point cloud. In this context, the point cloud of building $Z$ is denoted as $X^{(0)} = \{x_i^{(0)}\}_{i=1}^{N}$.

3.1. Contour Constraint

Due to the complexity of the geometric structure of a building surface, extracting its geometric features with deep learning requires a large amount of training data, and training such a model is complex and time-consuming. Therefore, we use RANSAC [15] to fit the building's point cloud into multiple surface polygons and divide it into one or more building blocks $Z_1, \ldots, Z_a$. Then, we extract contour constraint regions for each group to address the missing parts. The extracted polygons are denoted as $V = \{V_1, V_2, \ldots, V_n\}$, the outward-pointing normal vectors of the polygons as $n = \{n_1, n_2, \ldots, n_n\}$, and the centers of the polygons as $G = \{G_1, G_2, \ldots, G_n\}$. Figure 2 illustrates the process of constructing the contour constraint.

3.1.1. Grouping of Building Surface Polygons

We define a polygon correlation constraint $E = (F_{Con}, \rho)$ to group the polygons $V$ according to the relationships between them. The constraint $E$ is composed of the local convexity $F_{Con}$ [40] and the minimum Euclidean distance $\rho$ [41], and is defined as follows:

$$E = \begin{cases} \mathrm{True}, & F_{Con}(V_i, V_j) = \mathrm{convex} \ \text{and} \ \rho(V_i, V_j) = 0 \\ \mathrm{False}, & F_{Con}(V_i, V_j) = \mathrm{concave} \ \text{or} \ \rho(V_i, V_j) \neq 0 \end{cases} \quad (1)$$

The local convexity $F_{Con}$ and the minimum Euclidean distance $\rho$ are defined as:

$$F_{Con}(V_i, V_j) = \begin{cases} \mathrm{convex}, & \theta_i > \theta_j \Leftrightarrow \langle n_i, G_{i\text{-}j} \rangle < \langle n_j, G_{i\text{-}j} \rangle \\ \mathrm{concave}, & \theta_i < \theta_j \Leftrightarrow \langle n_i, G_{i\text{-}j} \rangle > \langle n_j, G_{i\text{-}j} \rangle \end{cases} \quad (2)$$

$$\rho(V_i, V_j) = \min \sqrt{\left(x_p^{(V_i)} - x_q^{(V_j)}\right)^2 + \left(y_p^{(V_i)} - y_q^{(V_j)}\right)^2 + \left(z_p^{(V_i)} - z_q^{(V_j)}\right)^2}, \quad (3)$$

In the above two equations, $n$ is the normal vector of a polygon, $G_{i\text{-}j}$ is the vector from the polygon center $G_i$ toward $G_j$, and $\theta = \arccos\langle n, G_{i\text{-}j} \rangle$ is the angle between the polygon normal $n$ and the vector $G_{i\text{-}j}$. $(x_p, y_p, z_p)$ and $(x_q, y_q, z_q)$ are points on edges of the two different polygons, and $i, j \in \{1, 2, \ldots, n\}$ with $i \neq j$.
Utilizing Equation (2), we have illustrated a schematic representation of the convex-concave relationships among adjacent polygons in Figure 3.
Based on the aforementioned polygon correlation constraint $E = (F_{Con}, \rho)$, correlated polygons are merged into associated groups, thereby grouping the building surface polygons $V$ into $Z_1, \ldots, Z_a$. The plane grouping obtained by fitting an actual incomplete building point cloud is shown in Figure 4.
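To make the grouping step concrete, the sketch below shows one way Equations (1)–(3) and the subsequent grouping could be realized. It is an illustration rather than the authors' implementation: the polygon data layout (dictionaries with `center`, `normal`, and boundary `points` fields), the contact tolerance `eps`, and the union-find grouping are our assumptions.

```python
# Illustrative sketch of the polygon correlation constraint E = (F_Con, rho).
import numpy as np

def local_convexity(center_i, normal_i, center_j, normal_j):
    """Return 'convex' or 'concave' for two adjacent polygons (Eq. 2)."""
    g_ij = center_j - center_i                      # vector G_{i-j}
    g_ij = g_ij / (np.linalg.norm(g_ij) + 1e-12)
    # theta = arccos(<n, G_{i-j}>); a smaller projection of n_i onto G_{i-j}
    # than of n_j means theta_i > theta_j, i.e. a convex connection.
    return "convex" if np.dot(normal_i, g_ij) < np.dot(normal_j, g_ij) else "concave"

def min_euclidean_distance(pts_i, pts_j):
    """Minimum point-to-point distance between two polygon boundaries (Eq. 3)."""
    d = np.linalg.norm(pts_i[:, None, :] - pts_j[None, :, :], axis=-1)
    return d.min()

def correlated(poly_i, poly_j, eps=1e-3):
    """Polygon correlation constraint E (Eq. 1): convex connection and contact."""
    conv = local_convexity(poly_i["center"], poly_i["normal"],
                           poly_j["center"], poly_j["normal"])
    rho = min_euclidean_distance(poly_i["points"], poly_j["points"])
    return conv == "convex" and rho < eps

def group_polygons(polys, eps=1e-3):
    """Merge correlated polygons into groups Z_1..Z_a via connected components."""
    n = len(polys)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if correlated(polys[i], polys[j], eps):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```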

3.1.2. Definition of Building Surface Polygon Attributes

Based on the grouping results of the building surface polygons in Section 3.1.1, and in accordance with the standard Code for Design of Building Foundation [42], which specifies that the wall inclination of buildings below 24 m should be less than 0.23°, we define the polygon attribute $F(V)$ to determine whether a polygon is a wall or a roof:

$$F(V) = \begin{cases} \mathrm{Wall}, & \theta = \arccos\dfrac{n_i \cdot n_{horizon}}{\lVert n_i \rVert\,\lVert n_{horizon} \rVert} \in \left(90° - 0.23°,\ 90° + 0.23°\right) \\ \mathrm{Roof}, & \theta \in \left[0°,\ 90° - 0.23°\right] \cup \left[90° + 0.23°,\ 180°\right] \end{cases} \quad (4)$$

where $n_i$ is the normal vector of the building surface polygon, $i \in \{1, 2, \ldots, n\}$; $n_{horizon}$ is the normal vector of the ground plane; and $\theta$ is the angle between the building surface polygon and the ground plane.

Based on the polygon attribute $F(V)$, attributes are assigned to every polygon of the individual building, and the polygons are divided into wall polygons $W(Z_1), \ldots, W(Z_a)$ and roof polygons $R(Z_1), \ldots, R(Z_a)$ accordingly.
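The following minimal sketch illustrates the wall/roof attribute of Equation (4); the function name, the default ground normal, and the example inputs are ours, while the 0.23° tolerance follows the design code cited above.

```python
# Minimal sketch of the wall/roof attribute F(V) from Equation (4).
import numpy as np

def polygon_attribute(normal, ground_normal=np.array([0.0, 0.0, 1.0]), tol_deg=0.23):
    """Classify a surface polygon as 'Wall' or 'Roof' from its normal vector."""
    cos_theta = np.dot(normal, ground_normal) / (
        np.linalg.norm(normal) * np.linalg.norm(ground_normal))
    theta = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    # Walls are (nearly) vertical: their normal is (nearly) horizontal, i.e.
    # the angle to the ground normal lies within 90 deg +/- 0.23 deg.
    return "Wall" if abs(theta - 90.0) <= tol_deg else "Roof"

# Example: a vertical facade with an outward horizontal normal is a wall.
print(polygon_attribute(np.array([1.0, 0.0, 0.0])))   # -> Wall
print(polygon_attribute(np.array([0.0, 0.3, 0.95])))  # -> Roof
```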

3.1.3. Constructing Contour Constraint Polygons

Because wall contours are difficult to handle in three dimensions and the edges of different subparts influence one another, each building subpart is sequentially projected onto the ground plane $V_{horizon}$ to generate a two-dimensional constraint outline, which is then projected back onto the roof plane to construct the contour constraint polygons.
As shown in Figure 5, taking the subpart $Z_a$ with missing data as an example, the associated polygons are projected onto the ground plane $V_{horizon}$: the wall polygons are projected as straight lines $L_1^{W}(Z_a), L_2^{W}(Z_a), \ldots, L_b^{W}(Z_a)$ on $V_{horizon}$, and the roof polygons are projected as a roof projection polygon on $V_{horizon}$. The regular edges $L_1^{R}(Z_a), L_2^{R}(Z_a), \ldots, L_p^{R}(Z_a)$ of the projection polygon are retained, along with the irregular curves $S_1^{R}(Z_a), \ldots, S_q^{R}(Z_a)$ if they exist (otherwise the roof has no missing areas), where $p = 1, 2, \ldots$ and $q = 1, 2, \ldots$.
First, by projecting the regular edges $L^{R}$ of the roof projection polygon associated with subpart $Z_a$ onto the ground plane, we generate the contour constraint polygon $U_R$. As shown in Figure 6a, within the roof projection polygon of subpart $Z_a$, extension segments are obtained by extending the regular roof edges $L_1^{R}$ and $L_4^{R}$ on the side intersecting the irregular curve $S^{R}$ at point $P_0$. Projecting the lines through these extension segments along the $z$-axis onto the roof plane yields the intersecting segments $L_1^{R}$ and $L_4^{R}$, which, together with $S^{R}$, form the contour constraint polygon $U_R$.
Next, by projecting the wall polygons associated with subpart $Z_a$ onto the ground plane as straight lines $L^{W}$, we generate the contour constraint polygon $U_W$. As shown in Figure 6b, within subpart $Z_a$, the projected wall lines $L_1^{W}$ and $L_4^{W}$, which are closest to the midpoints of $L_1^{R}$ and $L_4^{R}$, are translated until they intersect at points $P_1$ and $P_5$, generating the contour edges of the missing area. Then, projecting these edges along the $z$-axis onto the roof plane yields the intersecting segments $L_1^{W}$ and $L_4^{W}$, which, together with $S^{R}$, form the contour constraint polygon $U_W$.
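Two geometric primitives recur in this construction: projecting polygon vertices onto the ground plane $V_{horizon}$ and intersecting projected (extended or translated) edges. The sketch below shows simple implementations under the assumption that the ground plane is $z = 0$; the function names are illustrative, not the authors' code.

```python
# Hedged sketch of the two geometric primitives used in Section 3.1.3.
import numpy as np

def project_to_ground(vertices):
    """Project 3D polygon vertices onto the ground plane z = 0."""
    flat = np.asarray(vertices, dtype=float).copy()
    flat[:, 2] = 0.0
    return flat

def line_intersection_2d(p1, d1, p2, d2):
    """Intersect two 2D lines given as point + direction, e.g. to find where
    two extended roof edges or two translated wall lines meet."""
    A = np.array([[d1[0], -d2[0]], [d1[1], -d2[1]]], dtype=float)
    b = np.array([p2[0] - p1[0], p2[1] - p1[1]], dtype=float)
    t, _ = np.linalg.solve(A, b)          # raises LinAlgError if parallel
    return np.array(p1[:2], dtype=float) + t * np.array(d1[:2], dtype=float)

# Example: the x-axis and the vertical line x = 2 intersect at (2, 0).
print(line_intersection_2d([0, 0], [1, 0], [2, -1], [0, 1]))
```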

3.1.4. Constructing Contour Constraint Regions

As shown in Figure 7, within the missing area of subpart $Z_a$, when $U_R \geq U_W$, the contour constraint possibility region $C(Z_a)$ is bounded by the contour constraint polygons $U_R$ and $U_W$, while the contour constraint necessity region $M(Z_a)$ is bounded by the contour constraint polygon $U_W$ and the irregular roof curve $S^{R}$ (if $U_R < U_W$, the opposite regions are generated).
To facilitate mapping the building contour constraints into the latent space, we sample the extracted contour constraint possibility region $C(Z)$ and the contour constraint necessity region $M(Z)$ as dense point clouds $X_C$ and $X_M$, respectively.
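As an illustration of this sampling step, the sketch below draws a dense, approximately uniform point set inside a closed 2D constraint polygon by rejection sampling and lifts it to a given height; the paper does not prescribe a particular sampler, so this choice, the dependency on matplotlib's `Path`, and the parameter values are assumptions.

```python
# Sketch: sample a contour constraint region as a dense point cloud.
import numpy as np
from matplotlib.path import Path

def sample_region(polygon_2d, n_points=2048, z=0.0, seed=0):
    """Uniformly sample n_points inside a closed 2D polygon (M x 2 array)
    and lift them to height z, giving an (n_points, 3) point cloud."""
    rng = np.random.default_rng(seed)
    path = Path(polygon_2d)
    lo, hi = polygon_2d.min(axis=0), polygon_2d.max(axis=0)
    samples = []
    while len(samples) < n_points:
        cand = rng.uniform(lo, hi, size=(4 * n_points, 2))   # bounding-box proposals
        inside = cand[path.contains_points(cand)]            # keep those inside
        samples.extend(inside.tolist())
    xy = np.array(samples[:n_points])
    return np.column_stack([xy, np.full(n_points, z)])

# Example: X_M sampled from a unit square constraint region at roof height 3.0.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
X_M = sample_region(square, n_points=1024, z=3.0)
```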

3.2. Diffusion

To transform the distribution of the original building point cloud into a Gaussian distribution suitable for modeling, we follow the process shown in Figure 8. Over time steps, we gradually add Gaussian noise with known parameters, dispersing the points from the original distribution into a disordered point set that approximates a Gaussian distribution.
The input building point cloud is $X^{(0)} = \{x_i^{(0)}\}_{i=1}^{N}$, where each point is sampled independently from the point distribution. The process of adding random Gaussian noise at each time step is modeled as a Markov chain:

$$q\left(X^{(1:T)} \mid X^{(0)}\right) = \prod_{t=1}^{T} q\left(X^{(t)} \mid X^{(t-1)}\right), \quad (5)$$

$$q\left(X^{(t)} \mid X^{(t-1)}\right) = \mathcal{N}\left(X^{(t)} \,\middle|\, \sqrt{1-\beta_t}\, X^{(t-1)},\ \beta_t I\right), \quad (6)$$

where $q(X^{(t)} \mid X^{(t-1)})$ is the Markov diffusion kernel, $\beta_t$ is the hyperparameter controlling the variance schedule of the diffusion process, and $t = 1, \ldots, T$.
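A minimal sketch of this forward diffusion, following Equations (5) and (6), is given below; the linear variance schedule and its endpoints are assumptions, since the paper does not state the exact schedule used.

```python
# Sketch of the forward diffusion of Equations (5)-(6).
import torch

def forward_diffusion(x0, T=200, beta_start=1e-4, beta_end=0.05):
    """x0: (N, 3) building point cloud.  Returns the trajectory [x^(0), ..., x^(T)]."""
    betas = torch.linspace(beta_start, beta_end, T)   # assumed linear schedule
    xs = [x0]
    x = x0
    for t in range(T):
        noise = torch.randn_like(x)
        # q(x^t | x^{t-1}) = N(sqrt(1 - beta_t) x^{t-1}, beta_t I)
        x = torch.sqrt(1.0 - betas[t]) * x + torch.sqrt(betas[t]) * noise
        xs.append(x)
    return xs

# Example: diffuse a random 2048-point cloud into (approximately) Gaussian noise.
trajectory = forward_diffusion(torch.randn(2048, 3))
```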

3.3. Point Encoding

As shown in Figure 9, to enable the fusion of the contour constraint regions with the shape latent code of the original point cloud, we encode the contour constraint regions with the same encoder used to encode the original building point cloud into its shape latent code.
Feeding the original building point cloud $X^{(0)}$ into the trained encoder $\varphi$, we map the point cloud from the sample space to the latent space, generating the corresponding shape latent code $h$. The process is described as follows:

$$q_\varphi\left(h \mid X^{(0)}\right) = \mathcal{N}\left(h \,\middle|\, \mu_\varphi\left(X^{(0)}\right),\ \Sigma_\varphi\left(X^{(0)}\right)\right), \quad (7)$$

where $\mu_\varphi(X^{(0)})$ and $\Sigma_\varphi(X^{(0)})$ are the estimated mean and variance obtained through pre-training with PointNet [29].
To enable the fusion of the building contour constraints with the shape latent code of the incomplete point cloud, we use the same encoder $\varphi$ to map the contour constraint point clouds $X_C$ and $X_M$ into the same low-dimensional feature space, generating the corresponding shape latent codes $C$ and $M$:

$$C = \varphi\left(X_C\right), \quad (8)$$

$$M = \varphi\left(X_M\right), \quad (9)$$

where $\varphi$ is the encoder.
By merging the shape latent code $h$ of the incomplete building point cloud, the contour constraint shape latent code $M$, and the intersection of the shape latent codes $h$ and $C$, we obtain the building shape latent code $H$:

$$H = h \cup M \cup \left(h \cap C\right), \quad (10)$$
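The sketch below illustrates the point-encoding module with a PointNet-style shared encoder $\varphi$ and a fusion step producing $H$. Because the exact fusion operator of Equation (10) is not fully recoverable from the text, concatenation followed by a linear layer is used here as a stand-in; all module and parameter names are ours.

```python
# Illustrative sketch of the point-encoding module (not the authors' code).
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """Shared encoder phi: (B, N, 3) point cloud -> (B, latent_dim) shape code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, latent_dim))
    def forward(self, pts):
        feats = self.mlp(pts)              # per-point features
        return feats.max(dim=1).values     # permutation-invariant pooling

class LatentFusion(nn.Module):
    """Fuse h (incomplete cloud) with C and M (contour constraints) into H."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.fuse = nn.Linear(3 * latent_dim, latent_dim)
    def forward(self, h, c, m):
        return self.fuse(torch.cat([h, c, m], dim=-1))

# Example usage with random stand-in point clouds.
phi, fusion = PointEncoder(), LatentFusion()
x0, xc, xm = torch.randn(2, 2048, 3), torch.randn(2, 1024, 3), torch.randn(2, 1024, 3)
H = fusion(phi(x0), phi(xc), phi(xm))      # (2, 256) fused shape latent code
```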

3.4. Reverse

As shown in Figure 10, the reverse process is the opposite of the diffusion process: point clouds are sampled from the Gaussian distribution that approximates the result of the forward diffusion and, conditioned on the encoded shape latent code, are gradually denoised through decoding, thereby completing the missing building point cloud.
Points sampled from the Gaussian noise distribution $p(X^{(T)}) \sim \mathcal{N}(0, I)$, which approximates the point set $q(X^{(T)})$, are taken as input, and, using the shape latent code $H$ fused with geometric features, denoising decoding proceeds step by step from time step $T$ to time step $0$. A network $\theta$ is used to estimate the denoising motion of each point at time step $t$, recursively moving toward the point distribution that approximates $q(X^{(0)})$. This process can be modeled as a Markov chain:

$$p_\theta\left(X^{(0:T)} \mid H\right) = p\left(X^{(T)}\right) \prod_{t=1}^{T} p_\theta\left(X^{(t-1)} \mid X^{(t)}, H\right), \quad (11)$$

$$p_\theta\left(X^{(t-1)} \mid X^{(t)}, H\right) = \mathcal{N}\left(X^{(t-1)} \,\middle|\, \mu_\theta\left(X^{(t)}, t, H\right),\ \beta_t I\right), \quad (12)$$

where $\mu_\theta$ is the estimated mean obtained by the neural network $\theta$.
Since each point is sampled independently, the probability of the whole point cloud in the reverse process is the product of the probabilities of the individual points:

$$p_\theta\left(X^{(0:T)} \mid H\right) = \prod_{i=1}^{N} p_\theta\left(x_i^{(0:T)} \mid H\right), \quad (13)$$

The points sampled from the noise distribution $p(X^{(T)}) \sim \mathcal{N}(0, I)$ that approximates the point set $q(X^{(T)})$ are thus generated into a point cloud $X = \{x_i\}_{i=1}^{N}$ with the shape of the building through the reverse Markov chain $p_\theta(X^{(0:T)} \mid H)$.
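A hedged sketch of this reverse sampling loop, corresponding to Equations (11) and (12), is shown below; `DenoiseNet` is a placeholder for the paper's (unspecified) denoising network, and the variance schedule matches the forward-diffusion sketch above.

```python
# Sketch of the reverse process of Equations (11)-(12).
import torch
import torch.nn as nn

class DenoiseNet(nn.Module):
    """Stand-in network predicting mu_theta(x^t, t, H) for every point."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 + 1 + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 3))
    def forward(self, x_t, t, H):
        B, N, _ = x_t.shape
        t_feat = torch.full((B, N, 1), float(t), device=x_t.device)
        h_feat = H.unsqueeze(1).expand(B, N, -1)
        return self.net(torch.cat([x_t, t_feat, h_feat], dim=-1))

@torch.no_grad()
def reverse_sample(model, H, n_points=2048, T=200, beta_start=1e-4, beta_end=0.05):
    """Generate a point cloud from Gaussian noise, conditioned on the fused code H."""
    betas = torch.linspace(beta_start, beta_end, T, device=H.device)
    x = torch.randn(H.shape[0], n_points, 3, device=H.device)   # x^(T) ~ N(0, I)
    for t in reversed(range(T)):
        mu = model(x, t, H)                                      # mu_theta(x^t, t, H)
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mu + torch.sqrt(betas[t]) * noise                    # sample p_theta(x^{t-1}|x^t,H)
    return x

# Example usage with a random fused latent code.
completed = reverse_sample(DenoiseNet(), torch.randn(1, 256))
```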

3.5. Training Objectives

The training objective of the point cloud generation process is to maximize the log-likelihood of the point cloud, $\mathbb{E}\left[\log p_\theta\left(X^{(0)}\right)\right]$. To this end, we form the objective $L$ from its variational lower bound and, based on the shape latent code $H$ fused with geometric features, iteratively minimize $L$ to obtain the optimal parameters of the whole model:

$$L\left(\theta, \varphi, \alpha\right) = \mathbb{E}_q \Bigg[ \sum_{t=2}^{T} \sum_{i=1}^{N} D_{KL}\Big( q\big(x_i^{(t-1)} \mid x_i^{(t)}, x_i^{(0)}\big) \,\Big\|\, p_\theta\big(x_i^{(t-1)} \mid x_i^{(t)}, H\big) \Big) - \sum_{i=1}^{N} \log p_\theta\big(x_i^{(0)} \mid x_i^{(1)}, H\big) + D_{KL}\Big( q_\varphi\big(H \mid X^{(0)}\big) \,\Big\|\, p_w(w)\,\Big|\det \tfrac{\partial F_\alpha}{\partial w}\Big|^{-1} \Big) \Bigg], \quad (14)$$

where $D_{KL}(\cdot\|\cdot)$ is the Kullback–Leibler divergence, $q_\varphi(H \mid X^{(0)})$ is the point cloud shape encoder, $F_\alpha$ is a trainable bijection implemented by affine coupling layers, and $p_w(w)$ is an isotropic Gaussian distribution.
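As a rough illustration, the snippet below shows a simplified surrogate of Equation (14): under the standard DDPM assumption of fixed Gaussian variances, the per-point KL terms reduce to squared differences of means, and the flow-based prior term over $H$ is abbreviated here to a standard-normal KL. This is a sketch of the objective's structure, not the authors' exact loss.

```python
# Hedged, simplified surrogate of the training objective in Equation (14).
import torch

def kl_standard_normal(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ) for the encoded shape latent."""
    return -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()

def training_loss(pred_mu, true_mu, mu_h, logvar_h, kl_weight=1e-3):
    """pred_mu, true_mu: (B, N, 3) predicted / posterior means at a diffusion step."""
    recon = ((pred_mu - true_mu) ** 2).sum(dim=-1).mean()   # per-point KL surrogate
    prior = kl_standard_normal(mu_h, logvar_h)              # latent regularizer (simplified)
    return recon + kl_weight * prior
```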

4. Experimental Section

4.1. Experimental Dataset and Processing

We utilized the BuildingNet [43] public dataset for the point cloud generation task. This dataset comprises 1849 complete building point cloud models, encompassing various architectural categories, including houses, churches, skyscrapers, city halls, libraries, and castles, among others. The dataset was randomly divided into training, testing, and validation sets in a ratio of 80%, 15%, and 5%, respectively. From each complete building point cloud, we uniformly sampled 20,480 points. We randomly selected one viewpoint from multiple viewpoints as the center and removed points within a certain radius to obtain incomplete point cloud data. We chose building point cloud data with a 5% missing rate for training and testing in the experiments. All input point cloud data were centered at the origin and their coordinates were normalized to [−1, 1].
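The data preparation described above can be sketched as follows; the way the removal radius is determined (dropping the points nearest a randomly chosen viewpoint until the target fraction is reached) is our assumption, since the paper only states the target missing rates.

```python
# Sketch of the dataset preparation: incomplete-cloud generation and normalization.
import numpy as np

def normalize(points):
    """Center the cloud at the origin and scale coordinates into [-1, 1]."""
    centered = points - points.mean(axis=0)
    return centered / np.abs(centered).max()

def make_incomplete(points, missing_rate=0.05, seed=0):
    """Remove the points closest to a random viewpoint until the target
    fraction of points is missing."""
    rng = np.random.default_rng(seed)
    viewpoint = points[rng.integers(len(points))]
    dist = np.linalg.norm(points - viewpoint, axis=1)
    n_remove = int(missing_rate * len(points))
    keep = np.argsort(dist)[n_remove:]        # drop the n_remove nearest points
    return points[keep]

# Example: a 20,480-point cloud reduced to a 5%-missing, normalized input.
full = np.random.rand(20480, 3)
partial = normalize(make_incomplete(full, missing_rate=0.05))
```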

4.2. Experimental Equipment and Model Training Configuration

Experimental hardware configuration: The CPU was an Intel Core i7-7700K, and the GPU was an NVIDIA GeForce RTX 2080 Ti. The deep learning environment for the experiments included CUDA 11.2, Python 3.6.9, and PyTorch 1.6. The training process consisted of 100 epochs with a batch size of 4, utilizing the Adam [44] optimizer with an initial learning rate of 0.001.

4.3. Evaluation Metrics

To comprehensively evaluate the quality of point cloud generation, we used CD (Chamfer Distance) and EMD (Earth Mover’s Distance) [45] as evaluation metrics. CD measures the average squared distance between the generated point cloud and the ground truth point cloud. A smaller CD distance indicates that the generated point cloud is closer to the vicinity of the building’s surface, reflecting the completeness of the point cloud shape and the smoothness of the edge contours. A smaller EMD distance is preferable, indicating that each spatial point generated by the model can be paired with real sample point cloud positions with smaller distances, reflecting the similarity in point cloud resolution and density.
Let $G$ be the generated point cloud set and $R$ the real point cloud set; $x$ and $y$ denote points in $G$ and $R$, respectively.

$$d_{CD}\left(G, R\right) = d_{CD}\left(G \to R\right) = \frac{1}{\lvert G \rvert} \sum_{x \in G} \min_{y \in R} \lVert x - y \rVert_2^2, \quad (15)$$

In this equation, for each point in set $G$, the closest point in set $R$ is found and the squared Euclidean distance is computed; these distances are then summed and averaged.

$$d_{CD}\left(R, G\right) = d_{CD}\left(R \to G\right) = \frac{1}{\lvert R \rvert} \sum_{y \in R} \min_{x \in G} \lVert y - x \rVert_2^2, \quad (16)$$

The above equation follows the same calculation process as Equation (15), but with the point sets taken in reverse.
The EMD requires a one-to-one mapping between the points of the two sets.

$$d_{EMD}\left(G, R\right) = \min_{\phi : G \to R} \frac{1}{\lvert G \rvert} \sum_{x \in G} \lVert x - \phi(x) \rVert_2, \quad (17)$$

$$d_{EMD}\left(R, G\right) = \min_{\phi : R \to G} \frac{1}{\lvert R \rvert} \sum_{y \in R} \lVert y - \phi(y) \rVert_2, \quad (18)$$

In these equations, $\phi(\cdot)$ is the bijection between the two point sets.
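For reference, the two metrics can be sketched as below; the Chamfer terms follow Equations (15) and (16) directly, while the EMD bijection is computed here with the Hungarian algorithm, which is exact but only practical for small, equally sized point sets.

```python
# Sketch of the evaluation metrics of Equations (15)-(18).
import numpy as np
from scipy.optimize import linear_sum_assignment

def chamfer_distance(G, R):
    """Returns (d_CD(G->R), d_CD(R->G)) using squared Euclidean distances."""
    d = np.linalg.norm(G[:, None, :] - R[None, :, :], axis=-1) ** 2
    return d.min(axis=1).mean(), d.min(axis=0).mean()

def earth_mover_distance(G, R):
    """d_EMD between two equal-size point sets via optimal assignment."""
    assert len(G) == len(R), "EMD as defined here needs equally sized sets"
    cost = np.linalg.norm(G[:, None, :] - R[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)     # optimal bijection phi
    return cost[rows, cols].mean()

# Example on small random point sets.
G, R = np.random.rand(256, 3), np.random.rand(256, 3)
print(chamfer_distance(G, R), earth_mover_distance(G, R))
```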

4.4. Comparative Analysis of Experimental Results

To demonstrate the superiority of our model, we conducted comparative experiments with PCN [16], PF-Net [37], VRC-Net [39], and our model on the BuildingNet dataset. We used the metrics described in Section 4.3 to evaluate the point clouds generated by the above models and summarize the results in Table 1 and Table 2. In Figure 11, we present a visual comparison between the three point cloud completion methods and our approach.
Table 1 and Table 2 show the quantitative comparison results of the models. G→R refers to the error from the generated point cloud to the real point cloud, which measures the difference between the generated and real scenarios. R→G refers to the error from the real point cloud to the generated point cloud, which reflects the degree to which the ground truth surface is covered by the predicted point cloud. The $d_{CD}$ scores are multiplied by $10^3$ and the $d_{EMD}$ scores by $10^2$.
In Table 1, the overall point cloud completion error consists of two components: prediction error in the missing regions and shape variation in the original parts. Table 1 unmistakably demonstrates that our approach surpasses other currently available methods in the comprehensive completion of building point clouds.
In Table 2, the missing area error is calculated to ensure the fairness of the model evaluation. Table 2 continues to affirm that our approach is better suited for completing building point clouds, specifically in terms of filling in the gaps, when compared to other methods.
The findings presented in Table 1 and Table 2 illustrate that our approach excels in producing point clouds with increased precision and reduced distortion, addressing both overall point cloud completion and the specific task of filling in missing areas.
As can be seen from the comparison of experimental results in Figure 11, our method achieves a better point cloud completion effect. In Figure 11, PCN focuses on global features and generates point clouds from coarse to fine during decoding. However, due to insufficient extraction of local features, it fails to completely recover these structures and produces an uneven spatial distribution of generated points, as shown in ‘complete’ for Building 1 and ‘View 1’. Additionally, PCN is trained on the assumption of smooth features, which smooths away the roof parts connecting different sections, as seen in ‘View 2’ for Building 1 and ‘View 1’ for Building 2. Although PF-Net uses multiple fully connected layers to improve the smoothness of the prediction and simplify point cloud generation, during encoding it only downsamples the input point cloud for feature extraction and does not effectively capture all local point cloud features. As a result, when it attempts to recover missing areas, the point clouds generated at the boundaries deviate from the original geometric space of the building, as shown in ‘View 1’ for Building 1 and in Building 2. VRC-Net consistently provides a smooth global shape, and its generated point clouds exhibit good spatial uniformity; it effectively retains the intricate details present in the input point cloud. However, at the sharp edges of buildings, excessive smoothing produces some non-realistic points, as shown in ‘View 2’ for Building 1; similarly, multi-planar surfaces that approximate a curve are smoothed into curves, as seen in ‘View 1’ for Building 2. Our model addresses this by using contour-constrained shape codes to constrain the diffusion probability model, so the generated point clouds do not deviate from the original geometric space of the buildings. As shown in Building 1, compared to other methods, its edges closely match the geometric edges of the real building. In the case of Building 2, View 2, our method accurately completes the building point cloud in the corner area between two parts, while other methods connect these two parts together. This demonstrates the strong reconstruction capability of our method; the predicted point cloud shape exhibits strong spatial continuity and geometric invariance, reducing the generation of non-realistic point clouds and effectively improving point cloud completion accuracy.

4.5. Completion Comparison Experiments under Different Missing Rates

Experiments were conducted to complete the same buildings with missing rates of 10%, 15%, 20%, and 25%, respectively, to evaluate the model's capacity to restore buildings at varying rates of absence. The precise quantitative outcomes are shown in Table 3, illustrating the performance of Model [14], VRC-Net, and our model on the building test dataset. The $d_{CD}$ scores are multiplied by $10^3$, and lower $d_{CD}$ scores indicate better performance.
In Figure 12, the visualization shows the variation of the $d_{CD}$ values for the three models: VRC-Net, Model [14], and ours.
Even as the missing rate increased from 10% to 25% and the number of input points correspondingly decreased relative to the true point cloud data, our model remained ahead of Model [14] and VRC-Net in building point cloud completion. Nevertheless, when the missing rate of the building data surpassed 20%, the substantial loss of point cloud data resulted in a notable absence of contour structure information for the building. This made it difficult to construct contour constraint polygons that fit the original building, leading to a noticeable decrease in our model's point cloud completion performance. Nonetheless, even with high data loss, our approach can effectively fill in the missing parts of the point cloud data and, compared to other models, still demonstrates notably strong building point cloud completion capabilities.

4.6. Ablation Experiment

To evaluate the influence of the shape latent code with incorporated contour constraints on point cloud completion, we performed ablation experiments on the BuildingNet dataset. The errors reported in the table are divided into two parts, G→R and R→G, with $d_{CD}$ scores multiplied by $10^3$; smaller $d_{CD}$ scores indicate better performance.
The quantitative results are shown in Table 4. When using the point cloud diffusion probability model alone to extract features, the highest error values were obtained, with $d_{CD}$ values of 1.711/1.508. When adding the contour-constrained shape latent code $C$, the results were better than the Baseline Model, with $d_{CD}$ values of 1.556/0.923. Similarly, when adding the contour-constrained shape latent code $M$, the results were also better than the Baseline Model, with $d_{CD}$ values of 1.231/1.126. The contour-constrained shape latent code $C$ introduces outward redundancy, which enlarges the difference between the generated point cloud and the ground truth, leading to relatively higher $d_{CD}$ values for G→R. However, this redundancy increases the coverage of the ground truth surface by the predicted point cloud, and hence the $d_{CD}$ value for R→G is relatively lower. On the other hand, the contour-constrained shape latent code $M$ causes inward contraction, leading to a relatively smaller difference between the generated point cloud and the ground truth; however, this contraction reduces the coverage of the ground truth surface by the predicted point cloud, resulting in relatively lower $d_{CD}$ values for G→R and relatively higher $d_{CD}$ values for R→G. Our model combines the contour-constrained shape latent codes $C$ and $M$, leveraging their advantages to minimize the difference between the generated point cloud and the real point cloud while achieving a higher coverage of the ground truth surface by the predicted point cloud. The $d_{CD}$ values were 0.928/0.751, better than all the models mentioned above. This suggests that the inclusion of building contour constraints enhances the performance of point cloud completion.
The visualization results, as shown in Figure 13, demonstrate that the point cloud diffusion probability model based on building contour constraints produces the best results. As shown in Figure 13, the Baseline Model can extract the complete shape of building point clouds. However, in comparison to the real building model, there is some redundancy at the edges of the completion area. This suggests that the point cloud diffusion probability can capture global building features. The model that only integrates the contour-constrained shape latent code C shows slight redundancy at the edges of the completion area. Compared to the Baseline Model, the completed point cloud deviates less, aligning better with the real building point cloud shape. The model that only integrates the contour-constrained shape latent code M exhibits some minor point cloud omissions at the edges of the completion area compared to real buildings. However, the completed point cloud deviates less overall. Our model combines both contour-constrained shape latent codes C and M . In the completion area, the point cloud at the roof’s edge is nearly identical to the real building point cloud. This indicates that the building contour constraints effectively suppress the generation of point clouds outside the building’s geometric space, significantly improving the geometric accuracy of point cloud completion and reducing the generation of non-realistic point clouds.

5. Discussion

In the digital realm, 3D architectural surfaces are typically composed of multiple planar polygons, which contain crucial information regarding the geometric shape of the building. Hence, we can employ geometric methods to construct high-precision constraint outlines for missing regions, thereby rectifying incomplete architectural point cloud 3D models. However, while the overall shape of the building point cloud encoded by the encoder φ remains intact, the encoded shape in the missing regions often deviates from the original geometric positions. Consequently, we amalgamated these two characteristics by merging the building-outline-constrained shape latent code with the building’s original point cloud shape latent code. This yields a shape latent code that more closely conforms to the architectural geometric shape, thus more effectively constraining the point cloud completion model to generate point clouds that correctly fill the missing regions.
In this undertaking, the construction of precise constraint outlines plays a pivotal role in enhancing the accuracy of point cloud completion. Achieving high geometric accuracy in constraint outlines primarily focuses on two critical aspects: ① Improving the geometric shape accuracy of the planar polygons extracted from the building surface point cloud; and ② rationalizing the grouping of building surface polygons.
Therefore, future work will emphasize the following two aspects: firstly, extracting planar polygons with high geometric accuracy, ensuring they accurately reflect the geometric shape of the building surface, and secondly, optimizing the grouping method of polygons to eliminate interference from irrelevant polygons and generate more rational constraint outlines within blocks that have missing regions.

6. Conclusions

This paper presents a 3D point cloud completion method based on a building-contour-constrained diffusion probability model. It constructs contour shape latent codes using building contour information, which consists of both walls and roofs, to constrain point cloud generation using the diffusion probability model. This ensures that the generated point cloud has the overall shape with geometric details of building edges. Our method effectively addresses the issue of incorrect geometric distribution in completing point clouds for missing areas in buildings. We tested our method’s performance in building point cloud completion on the BuildingNet dataset and reached the following conclusions:
(1) By grouping building surface polygons, we simplified the process of constructing building-contour-constrained polygons. This eliminates the need for training complex point cloud semantic segmentation models to obtain the required building-contour-constrained regions.
(2) In the comparison experiments presented in Table 1 and Table 2, our method demonstrates superior building point cloud completion accuracy compared to other point cloud completion methods, such as PCN, PF-Net, and VRC-Net. This is because we incorporated the geometric contour information of the target building point cloud during the building point cloud completion process, resulting in a point cloud distribution that closely matches the geometric shape of the target building.

Author Contributions

Conceptualization, B.Y. and H.W.; methodology, B.Y., H.W. and J.L.; software, J.J.; validation, B.Y. and J.L.; formal analysis, B.Y. and J.J.; investigation, Y.L.; resources, E.G.; data curation, H.W.; writing—original draft preparation, B.Y. and H.W.; writing—review and editing, B.Y. and J.L.; visualization, J.J.; supervision, T.Y.; project administration, J.L.; funding acquisition, J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (No. 41961063), the Guilin City Technology Application and Promotion Project in 2022 (20220138-2), and the Key R&D Projects in Guilin City in 2022 (20220109).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data are open public data and able to be downloaded free of charge. The BuildingNet dataset is available at https://buildingnet.org/ accessed on 15 June 2023.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yang, B.; Haala, N.; Dong, Z. Progress and perspectives of point cloud intelligence. Geo-Spat. Inf. Sci. 2023, 26, 189–205. [Google Scholar] [CrossRef]
  2. Yu, B.; Liu, H.; Wu, J.; Hu, Y.; Zhang, L. Automated derivation of urban building density information using airborne LiDAR data and object-based method. Landsc. Urban Plan. 2010, 98, 210–219. [Google Scholar] [CrossRef]
  3. Wang, R.; Huang, S.; Yang, H. Building3D: An Urban-Scale Dataset and Benchmarks for Learning Roof Structures from Point Clouds. arXiv 2023. [Google Scholar] [CrossRef]
  4. Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3d point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364. [Google Scholar] [CrossRef]
  5. Berger, M.; Tagliasacchi, A.; Seversky, L.M.; Alliez, P.; Levine, J.A.; Sharf, A.; Silva, C.T. State of the art in surface reconstruction from point clouds. In Proceedings of the 35th Annual Conference of the European Association for Computer Graphics, Eurographics 2014-State of the Art Reports, Strasbourg, France, 7–11 April 2014; pp. 161–185. [Google Scholar] [CrossRef]
  6. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
  7. Kingma, D.P.; Welling, M. Auto-encoding variational bayes. arXiv 2013. [Google Scholar] [CrossRef]
  8. Kobyzev, I.; Prince, S.J.; Brubaker, M.A. Normalizing flows: An introduction and review of current methods. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3964–3979. [Google Scholar] [CrossRef]
  9. Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851. [Google Scholar] [CrossRef]
  10. Cheng, M.; Li, G.; Chen, Y.; Chen, J.; Wang, C.; Li, J. Dense point cloud completion based on generative adversarial network. IEEE Trans. Geosci. Remote. Sens. 2021, 60, 1–10. [Google Scholar] [CrossRef]
  11. Li, C.-L.; Zaheer, M.; Zhang, Y.; Poczos, B.; Salakhutdinov, R. Point cloud gan. arXiv 2018. [Google Scholar] [CrossRef]
  12. Spurek, P.; Kasymov, A.; Mazur, M.; Janik, D.; Tadeja, S.K.; Tabor, J.; Trzciński, T. Hyperpocket: Generative point cloud completion. In Proceedings of the 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, 23–27 October 2022; pp. 6848–6853. [Google Scholar] [CrossRef]
  13. Yang, G.; Huang, X.; Hao, Z.; Liu, M.-Y.; Belongie, S.; Hariharan, B. Pointflow: 3d point cloud generation with continuous normalizing flows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 4541–4550. [Google Scholar] [CrossRef]
  14. Luo, S.; Hu, W. Diffusion Probabilistic Models for 3D Point Cloud Generation. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 2836–2844. [Google Scholar] [CrossRef]
  15. Gallo, O.; Manduchi, R.; Rafii, A. CC-RANSAC: Fitting planes in the presence of multiple surfaces in range data. Pattern Recognit. Lett. 2011, 32, 403–410. [Google Scholar] [CrossRef]
  16. Yuan, W.; Khot, T.; Held, D.; Mertz, C.; Hebert, M. PCN: Point Completion Network. In Proceedings of the 2018 International Conference on 3D Vision (3DV), Verona, Italy, 5–8 September 2018; pp. 728–737. [Google Scholar] [CrossRef]
  17. Thrun, S.; Wegbreit, B. Shape from symmetry. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV’05) Volume 1, Beijing, China, 17–21 October 2005; pp. 1824–1831. [Google Scholar] [CrossRef]
  18. Li, Z.; Shan, J. RANSAC-based multi primitive building reconstruction from 3D point clouds. ISPRS J. Photogramm. Remote Sens. 2022, 185, 247–260. [Google Scholar] [CrossRef]
  19. Verdie, Y.; Lafarge, F.; Alliez, P. LOD generation for urban scenes. ACM Trans. Graph. 2015, 34, 30. [Google Scholar] [CrossRef]
  20. Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1912–1920. [Google Scholar] [CrossRef]
  21. Nguyen, D.T.; Hua, B.-S.; Tran, K.; Pham, Q.-H.; Yeung, S.-K. A field model for repairing 3d shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5676–5684. [Google Scholar] [CrossRef]
  22. Han, X.; Li, Z.; Huang, H.; Kalogerakis, E.; Yu, Y. High-resolution shape completion using deep neural networks for global structure and local geometry inference. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22 September 2017; pp. 85–93. [Google Scholar] [CrossRef]
  23. Dai, A.; Ruizhongtai Qi, C.; Nießner, M. Shape completion using 3d-encoder-predictor cnns and shape synthesis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 5868–5877. [Google Scholar] [CrossRef]
  24. Xie, H.; Yao, H.; Zhou, S.; Mao, J.; Zhang, S.; Sun, W. Grnet: Gridding residual network for dense point cloud completion. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 365–381. [Google Scholar] [CrossRef]
  25. Liu, X.; Liu, X.; Liu, Y.S.; Han, Z. SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization. IEEE Trans. Image Process. 2022, 31, 4213–4226. [Google Scholar] [CrossRef] [PubMed]
  26. Yu, X.; Rao, Y.; Wang, Z.; Liu, Z.; Lu, J.; Zhou, J. Pointr: Diverse Point Cloud Completion with Geometry-Aware Transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 12498–12507. [Google Scholar] [CrossRef]
  27. Wen, X.; Xiang, P.; Han, Z.; Cao, Y.-P.; Wan, P.; Zheng, W.; Liu, Y.-S. PMP-Net++: Point cloud completion by transformer-enhanced multi-step point moving paths. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 852–867. [Google Scholar] [CrossRef] [PubMed]
  28. Wen, X.; Xiang, P.; Han, Z.; Cao, Y.-P.; Wan, P.; Zheng, W.; Liu, Y.-S. Pmp-net: Point cloud completion by learning multi-step point moving paths. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7443–7452. [Google Scholar] [CrossRef]
  29. Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar] [CrossRef]
  30. Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
  31. Groueix, T.; Fisher, M.; Kim, V.G.; Russell, B.C.; Aubry, M. A papier-mâché approach to learning 3d surface generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 216–224. [Google Scholar] [CrossRef]
  32. Chang, Y.; Jung, C.; Xu, Y. FinerPCN: High fidelity point cloud completion network using pointwise convolution. Neurocomputing 2021, 460, 266–276. [Google Scholar] [CrossRef]
  33. Tang, J.; Gong, Z.; Yi, R.; Xie, Y.; Ma, L. Lake-net: Topology-aware point cloud completion by localizing aligned keypoints. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 1726–1735. [Google Scholar] [CrossRef]
  34. Wei, M.; Wei, Z.; Zhou, H.; Hu, F.; Si, H.; Chen, Z.; Zhu, Z.; Qiu, J.; Yan, X.; Guo, Y.; et al. AGConv: Adaptive Graph Convolution on 3D Point Clouds. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9374–9392. [Google Scholar] [CrossRef]
  35. Wen, X.; Han, Z.; Cao, Y.-P.; Wan, P.; Zheng, W.; Liu, Y.-S. Cycle4completion: Unpaired point cloud completion using cycle transformation with missing region coding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 13080–13089. [Google Scholar] [CrossRef]
  36. Zhang, J.; Chen, X.; Cai, Z.; Pan, L.; Zhao, H.; Yi, S.; Yeo, C.K.; Dai, B.; Loy, C.C. Unsupervised 3D Shape Completion through GAN Inversion. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 1768–1777. [Google Scholar] [CrossRef]
  37. Huang, Z.; Yu, Y.; Xu, J.; Ni, F.; Le, X. PF-Net: Point Fractal Network for 3D Point Cloud Completion. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 7659–7667. [Google Scholar] [CrossRef]
  38. Xu, M.; Wang, Y.; Liu, Y.; He, T.; Qiao, Y. CP3: Unifying Point Cloud Completion by Pretrain-Prompt-Predict Paradigm. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9583–9594. [Google Scholar] [CrossRef]
  39. Pan, L.; Chen, X.; Cai, Z.; Zhang, J.; Zhao, H.; Yi, S.; Liu, Z. Variational relational point completion network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8524–8533. [Google Scholar] [CrossRef]
  40. Hu, P.; Miao, Y.; Hou, M. Reconstruction of complex roof semantic structures from 3d point clouds using local convexity and consistency. Remote Sens. 2021, 13, 1946. [Google Scholar] [CrossRef]
  41. Desolneux, A.; Moisan, L.; Morel, J.-M. Gestalt theory and computer vision. In Seeing, Thinking and Knowing: Meaning and Self-Organisation in Visual Cognition and Thought; Springer: Berlin/Heidelberg, Germany, 2004; pp. 71–101. [Google Scholar]
  42. GB 50007-2011; National Standard of the People’s Republic of China Code for Design of Building Foundation. National Standards of People’s Republic of China: Beijing, China, 2012.
  43. Selvaraju, P.; Nabail, M.; Loizou, M.; Maslioukova, M.; Averkiou, M.; Andreou, A.; Chaudhuri, S.; Kalogerakis, E. BuildingNet: Learning to label 3D buildings. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10397–10407. [Google Scholar] [CrossRef]
  44. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014. [Google Scholar] [CrossRef]
  45. Fan, H.; Su, H.; Guibas, L.J. A point set generation network for 3d object reconstruction from a single image. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 605–613. [Google Scholar] [CrossRef]
Figure 1. Point cloud generation diffusion probability model based on building contour constraints.
Figure 2. Building outline constraint feature extraction module: $X^{(0)}$ is the input initial point cloud; $V$ represents the plane polygons extracted by RANSAC; $E$ denotes the polygon correlation constraint; $Z_1, \ldots, Z_a$ is the grouping of the plane polygons; $F(V)$ is the definition of the polygon attributes; $R$ and $W$ are the categories defined for the plane polygons $V$, representing the roof and the walls, respectively; $L^{R}$ denotes the regular edges of the roof and wall polygons projected onto the ground plane $V_{horizon}$; $L^{W}$ represents the regular edges of the wall polygons projected onto the ground plane $V_{horizon}$; $S^{R}$ represents the irregular edges of the roof projected onto the ground plane $V_{horizon}$; $C(Z)$ is the possible generation area for the contour constraints, and $X_C$ is the point cloud sampled from it; $M(Z)$ is the inevitable generation area for the contour constraints, and $X_M$ is the point cloud sampled from it.
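As background for the plane-fitting stage summarized in Figure 2, the sketch below illustrates one way to obtain the planar segments V from an input cloud X_0 with iterative RANSAC using Open3D. It is a minimal illustration under assumed thresholds and variable names, not the implementation used in this paper.

```python
# Minimal sketch: iterative RANSAC plane extraction from a building point cloud.
# Thresholds, stopping rules, and the roof/wall heuristic are illustrative assumptions.
import numpy as np
import open3d as o3d

def extract_planes(points, max_planes=10, dist_thresh=0.05, min_inliers=500):
    """Repeatedly fit a plane with RANSAC and peel off its inlier points."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    planes = []  # list of (plane_model, inlier_points) -- the candidate polygons V
    for _ in range(max_planes):
        if len(pcd.points) < min_inliers:
            break
        model, inliers = pcd.segment_plane(distance_threshold=dist_thresh,
                                           ransac_n=3,
                                           num_iterations=1000)
        if len(inliers) < min_inliers:
            break
        planes.append((model, np.asarray(pcd.select_by_index(inliers).points)))
        pcd = pcd.select_by_index(inliers, invert=True)  # remove inliers, continue
    return planes

# model is (a, b, c, d); a near-zero |c| (near-horizontal normal) suggests a wall,
# a larger |c| suggests a roof facet -- one simple heuristic for the R/W categories.
# Usage: planes = extract_planes(np.loadtxt("building.xyz"))  # hypothetical file
```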
Figure 3. Schematic diagram of common polygonal convex–concave connections on building surfaces: in (a,b,e) the angle θ_i > θ_j and the polygons form a convex connection; in (c,d) the angle θ_i < θ_j and the polygons form a concave connection.
Figure 4. Building grouping results schematic diagram: The figure divides the surface polygons fitted from the incomplete building point cloud into four groups, with each color representing a polygon association group. The white irregular region represents the missing point cloud area.
Figure 5. Projection diagram of Z_a onto the horizontal plane V_horizon.
Figure 6. Building constraint contour diagram: (a) contour constraint polygon U_R; (b) contour constraint polygon U_W.
Figure 7. The contour constraint possibility region C(Z_a) and the contour constraint necessity region M(Z_a).
Figure 8. Diffusion process.
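Figure 8 depicts the standard DDPM forward process [9], which corrupts the clean point cloud X_0 as x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. The snippet below is a generic sketch of this noising step on an (N, 3) point array; the step count and linear beta schedule are illustrative assumptions, not the settings used in the paper.

```python
# Generic DDPM forward-noising sketch for a point cloud of shape (N, 3).
# The linear beta schedule and the number of steps T are illustrative assumptions.
import numpy as np

T = 200
betas = np.linspace(1e-4, 0.05, T)      # noise schedule beta_t
alphas_bar = np.cumprod(1.0 - betas)    # cumulative product of alpha_t = 1 - beta_t

def q_sample(x0, t, rng=np.random.default_rng()):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return xt, eps   # eps is the regression target for the denoising network

# Usage: x_t, eps = q_sample(points, t=100)  # points: (N, 3) array
```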
Figure 9. Point encoding process.
Figure 10. Reverse process.
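Figure 10 corresponds to the reverse (denoising) process, in which a noise predictor conditioned on the shape latent codes is applied step by step from x_T back to x_0. The sketch below shows the standard DDPM ancestral sampling loop; `eps_model` is a placeholder for the trained, condition-aware noise predictor, and the schedule values mirror the forward sketch above.

```python
# Standard DDPM ancestral sampling loop, conditioned on a shape latent code z.
# `eps_model(x_t, t, z)` is a placeholder for a trained noise predictor.
import numpy as np

T = 200
betas = np.linspace(1e-4, 0.05, T)   # same illustrative schedule as the forward sketch
alphas = 1.0 - betas
alphas_bar = np.cumprod(alphas)

def p_sample_loop(eps_model, z, num_points=2048, rng=np.random.default_rng()):
    """Draw x_T ~ N(0, I) and denoise it step by step down to x_0."""
    x = rng.standard_normal((num_points, 3))
    for t in range(T - 1, -1, -1):
        eps_hat = eps_model(x, t, z)                                   # predicted noise
        mean = (x - (betas[t] / np.sqrt(1.0 - alphas_bar[t])) * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise                           # x_{t-1}
    return x                                                           # completed point cloud
```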
Figure 11. Comparison of building point cloud completion results using different methods: the figure shows the initial incomplete point cloud, the completed point clouds generated by PCN, PF-Net, VRC-Net, and our method, and the corresponding real point clouds for reference.
Figure 12. Line plot of point cloud completion results at different missing rates: (a) G→R is the variation in dCD values measured from the generated to the real point clouds; (b) R→G is the variation in dCD values measured from the real to the generated point clouds.
Figure 13. Visualization of ablation experiment point cloud completion results: The point clouds visualized in the figure include, in sequence, the input incomplete building point cloud, the building point cloud completed by the Baseline Model, the building point cloud completed by the model with only the fusion of contour-constrained shape latent code C , the building point cloud completed by the model with only the fusion of contour-constrained shape latent code M , and the completed point cloud obtained by our method.
Table 1. Comparison of dCD and dEMD between our approach and PCN, PF-Net, and VRC-Net on the BuildingNet dataset, evaluated over the complete building point clouds. The paired values report dCD and dEMD in the G→R and R→G directions; lower is better.
Building 1

| Model   | dCD (↓) G→R | dCD (↓) R→G | dEMD (↓) G→R | dEMD (↓) R→G |
|---------|-------------|-------------|--------------|--------------|
| PCN     | 2.896 | 3.112 | 6.121 | 6.099 |
| PF-Net  | 2.060 | 1.838 | 4.001 | 3.773 |
| VRC-Net | 1.419 | 1.307 | 3.853 | 3.531 |
| Ours    | 1.057 | 0.828 | 3.568 | 3.196 |

Building 2

| Model   | dCD (↓) G→R | dCD (↓) R→G | dEMD (↓) G→R | dEMD (↓) R→G |
|---------|-------------|-------------|--------------|--------------|
| PCN     | 3.405 | 2.887 | 7.335 | 7.686 |
| PF-Net  | 2.417 | 2.165 | 4.808 | 4.431 |
| VRC-Net | 1.774 | 1.579 | 4.434 | 4.182 |
| Ours    | 1.646 | 1.522 | 4.253 | 4.015 |
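For reference, the directional dCD values in Tables 1 and 2 (G→R and R→G) are the two halves of the Chamfer distance [45]: the mean squared distance from each generated point to its nearest real point, and vice versa; dEMD follows from an optimal one-to-one assignment between the two sets. The snippet below is a small reference implementation of both quantities; any scaling applied to the reported numbers is not reproduced here.

```python
# Directional Chamfer distance (G→R and R→G) and an exact EMD for equal-sized sets.
# Uses SciPy; the metric scaling used in the tables is not reproduced here.
import numpy as np
from scipy.spatial import cKDTree
from scipy.optimize import linear_sum_assignment

def directional_chamfer(src, dst):
    """Mean squared distance from each point in src to its nearest neighbor in dst."""
    d, _ = cKDTree(dst).query(src)      # nearest-neighbor distances
    return float(np.mean(d ** 2))

def emd(src, dst):
    """Earth Mover's Distance via optimal assignment (intended for len(src) == len(dst))."""
    cost = np.linalg.norm(src[:, None, :] - dst[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return float(cost[rows, cols].mean())

# Usage:
# d_gr = directional_chamfer(generated, real)   # G→R
# d_rg = directional_chamfer(real, generated)   # R→G
```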
Table 2. Comparison of dCD and dEMD between our method and PCN, PF-Net, and VRC-Net on the BuildingNet dataset, evaluated only in the missing areas of the buildings. The paired values report dCD and dEMD in the G→R and R→G directions; lower is better.
Building 1

| Model   | dCD (↓) G→R | dCD (↓) R→G | dEMD (↓) G→R | dEMD (↓) R→G |
|---------|-------------|-------------|--------------|--------------|
| PCN     | 5.674 | 5.763 | 9.156 | 9.043 |
| PF-Net  | 4.835 | 4.785 | 7.248 | 6.970 |
| VRC-Net | 4.248 | 4.196 | 6.863 | 6.725 |
| Ours    | 3.831 | 3.781 | 6.324 | 6.428 |

Building 2

| Model   | dCD (↓) G→R | dCD (↓) R→G | dEMD (↓) G→R | dEMD (↓) R→G |
|---------|-------------|-------------|--------------|--------------|
| PCN     | 7.263 | 7.038 | 11.847 | 11.647 |
| PF-Net  | 6.098 | 5.945 | 8.566 | 8.891 |
| VRC-Net | 5.059 | 4.903 | 7.695 | 7.633 |
| Ours    | 4.817 | 4.512 | 7.314 | 7.062 |
Table 3. Comparison of dCD between our method, Model [14], and VRC-Net on the BuildingNet dataset at four missing rates (10%, 15%, 20%, and 25%). The paired values report dCD in the G→R and R→G directions; lower is better.
| Missing Ratio | VRC-Net G→R | VRC-Net R→G | Model [14] G→R | Model [14] R→G | Ours G→R | Ours R→G |
|---------------|-------------|-------------|----------------|----------------|----------|----------|
| 10% | 2.431 | 2.095 | 2.619 | 2.283 | 1.835 | 1.606 |
| 15% | 2.879 | 2.607 | 3.011 | 2.718 | 2.051 | 1.811 |
| 20% | 5.154 | 4.942 | 4.914 | 4.868 | 3.586 | 2.963 |
| 25% | 8.672 | 8.261 | 8.226 | 8.052 | 7.902 | 7.515 |
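The missing rates in Table 3 denote the fraction of points removed from the complete cloud before completion. Purely as an illustration of how such partial inputs can be produced (not necessarily the protocol used in this paper), one can remove the points closest to a randomly chosen seed until the target fraction is reached:

```python
# Illustrative way to simulate a contiguous missing region at a given missing rate.
import numpy as np

def drop_region(points, missing_ratio, rng=np.random.default_rng()):
    """Remove the `missing_ratio` fraction of points closest to a random seed point."""
    n_drop = int(len(points) * missing_ratio)
    seed = points[rng.integers(len(points))]
    dist = np.linalg.norm(points - seed, axis=1)
    keep = np.argsort(dist)[n_drop:]            # keep the farthest (1 - ratio) points
    return points[keep]

# Usage: partial = drop_region(full_cloud, missing_ratio=0.20)
```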
Table 4. Comparative dCD results of the ablation experiments on the BuildingNet dataset. The paired values report dCD in the G→R and R→G directions; lower is better.
| Type           | Contour-constrained latent code C | Contour-constrained latent code M | dCD (↓) G→R | dCD (↓) R→G |
|----------------|-----------------------------------|-----------------------------------|-------------|-------------|
| Baseline Model | ×                                 | ×                                 | 1.711 | 1.508 |
| Model 1        | ✓                                 | ×                                 | 1.556 | 0.923 |
| Model 2        | ×                                 | ✓                                 | 1.231 | 1.126 |
| Ours           | ✓                                 | ✓                                 | 0.928 | 0.751 |
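The ablation varies whether the contour-constrained shape latent codes C and M are fused with the latent representation of the incomplete cloud. The exact fusion operator is not restated here; the sketch below shows one plausible variant, concatenation followed by an MLP, where all module names and dimensions are assumptions rather than the paper's architecture.

```python
# Hypothetical fusion of the partial-shape latent with contour-constraint codes C and M.
# Concatenation + MLP is one plausible choice; dimensions and names are assumptions.
import torch
import torch.nn as nn

class LatentFusion(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, z_partial, z_c, z_m):
        # z_partial: latent of the incomplete cloud; z_c, z_m: contour-constraint codes
        return self.mlp(torch.cat([z_partial, z_c, z_m], dim=-1))

# The ablation variants correspond to omitting (or zeroing out) z_c and/or z_m.
```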