Technical Note

Using Synthetic Tree Data in Deep Learning-Based Tree Segmentation Using LiDAR Point Clouds

Australian Centre For Robotics (ACFR), School of Aerospace, Mechanical and Mechatronic Engineering, University of Sydney, Sydney, NSW 2006, Australia
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(9), 2380; https://doi.org/10.3390/rs15092380
Submission received: 4 April 2023 / Revised: 27 April 2023 / Accepted: 27 April 2023 / Published: 1 May 2023

Abstract

Deep learning, neural networks and other data-driven processing techniques are increasingly used in the analysis of LiDAR point cloud data in forest environments due to the benefits they offer in accuracy and adaptability to new environments. One downside of these techniques in practical applications is their requirement for manually annotated training data, which can be time-consuming and costly to obtain. We develop an approach to training neural networks for forest tree stem segmentation from point clouds that uses synthetic data from a custom tree simulator, which can generate large quantities of training examples without manual human effort. Our tree simulator captures the geometric characteristics of tree stems and foliage, from which automatically-labelled synthetic point clouds can be generated for training a semantic segmentation algorithm based on the PointNet++ architecture. Using evaluations on real aerial and terrestrial LiDAR point clouds from a range of different forest sites, we demonstrate that our synthetic data-trained models can out-perform, or provide comparable performance to, models trained on real data from other sites or on limited amounts of real data from the target site (increases in IoU of 1–7%). Our simulation code is open-source and made available to the research community.

1. Introduction

This paper examines the utility of synthetic tree data for supervised deep learning approaches to individual tree structure analysis from LiDAR point cloud data. LiDAR and other point cloud data sources are used extensively in forest management for mapping and monitoring change in biomass and forest tree composition, and for understanding and measuring forest resources in commercial applications. Recently, there has been a surge in the use of supervised deep learning methods for processing and interpreting point cloud data [1,2,3,4,5,6,7,8,9], due to the accuracy and robustness they offer. One issue with supervised deep learning is the requirement for a large, well-labelled database of tree point clouds to train effective models and algorithms. Extensive training datasets are difficult and costly to obtain, particularly when working in new forest environments or deploying new sensors [10]. Point clouds are particularly difficult and time-consuming to manually annotate/label (for example, labelling of tree parts/regions such as stem, foliage and branches) due to their three-dimensional nature, and forest environments are typically cluttered and structurally complex, which further adds to the difficulty and time taken to produce a training dataset.
One strategy for overcoming issues of training data scarcity in supervised deep learning is via the use of synthetic training data. Synthetic data is artificial or simulated data which is generated by a handcrafted simulation of the real-world scenario based on an empirical understanding of how data may appear in the real world. Simulations that generate synthetic data based on a randomised initial seed value can essentially produce unlimited, automatically-labelled training examples for “free” (i.e., without manual/human effort), significantly reducing the manual effort involved in labelling real data. This synthetic data can then be used to train models that are deployed on real-world data. The success of this approach depends heavily on the quality of the simulation and whether the synthetic data are representative of the details of the real-world data and scenario. Synthetic training data have been used successfully in other learning domains such as visual recognition for autonomous driving [11,12], plant leaf segmentation [13], object detection in indoor scenes [14] and human face analysis from images [15], amongst others. In the context of forest point cloud analysis, point clouds of simulated trees could potentially be generated through a wide variety of software designed primarily for computer graphics or silvicultural purposes [16,17,18].
In this paper, we explore the use of synthetic tree point clouds for training supervised deep learning models for tree point segmentation from LiDAR point cloud data. We develop a tree point cloud simulator based on a randomised seed point generator and evaluate its utility for training models for a semantic segmentation task which is deployed and tested on several real airborne and terrestrial LiDAR forest datasets. Our paper addresses the following questions:
1. Does the use of simulated training examples help boost the performance of point segmentation models on real data, compared to training with limited real examples only?
2. How does the accuracy of trained models depend on the amount of real or simulated data?
3. What level of sophistication in the simulation of individual tree point clouds is necessary to achieve a boost in performance?

1.1. Related Work

1.1.1. Individual Tree Segmentation and Forest Point Cloud Deep Learning

Advances in scanning technologies and resolutions for both aerial LiDAR and mobile ground-based LiDAR have recently enabled individual tree-level analyses of forest point clouds [6,19,20,21,22,23], as opposed to wider area-based approaches [24,25]. Once individual tree point clouds have been identified and extracted from broader forest scans, key structural properties of each tree, such as height, diameter and shape, can be measured and tallied using a variety of processing techniques. Many key structural parameters, such as stem diameter, basal area, stem volume, sweep and taper, depend on being able to identify which points in an individual tree point cloud correspond to the central stem of the tree (as opposed to foliage points and smaller branches). Automated processing techniques for determining stem points and measuring parameters include circle-fitting via the Hough Transform and RANdom SAmple Consensus (RANSAC) [26], clustering-based approaches [27] and Quantitative Structural Models (QSMs) [28]. Although effective in many situations, these techniques often require significant effort in parameter tuning when applied to data of varying resolutions and forest types, and often fail in the presence of significant gaps in coverage, for example those caused by scanning occlusions.
More recently, there has been an increase in the use of data-driven processing methods for the automatic analysis of forest point clouds, in particular via the use of supervised deep learning algorithms, including deep neural networks [1,2,3,4,5,6,7,8]. These techniques allow for algorithms to adapt to the underlying nature of the data on which they are trained, resulting in robustness and flexibility to the specifics of the forest study site, dominant tree species and point cloud data properties, including coverage, resolution and noise. Deep neural networks have been used to classify tree species [3,4] and to classify conifers vs. deciduous trees [2] using terrestrial laser scanning and high-resolution airborne LiDAR. In [1], the authors use a 3D convolutional neural network to directly estimate above-ground biomass and tree count from airborne LiDAR point clouds. Deep neural networks have also been used to segment/classify tree stem points from LiDAR point clouds. In [5], the authors use a 3D fully convolutional network to classify stem and branch points from terrestrial LiDAR, and the authors in [6] demonstrate both 3D volumetric and PointNet++ [29,30] architectures for tree stem point segmentation as part of a larger processing pipeline, including individual tree detection and stem geometry reconstruction. In [8], the authors use a PointNet architecture that combines 3D points, laser return intensity and a handcrafted feature based on local point distributions to segment tree stem points. In [7], the authors demonstrate the use of a PointNet++ architecture for segmenting stem, foliage, ground and undergrowth vegetation using a variety of terrestrial and high-resolution aerial LiDAR datasets.

1.1.2. Deep Learning Using Synthetic Data

Although supervised deep learning approaches have demonstrated many advantages for forest point cloud processing, their effectiveness depends on the quality, variety and quantity of training data provided to a model. Collecting and manually labelling large training datasets of complex 3D data can be expensive and time-consuming. One way to mitigate this issue, which has been used with considerable success in other learning applications, is through the use of synthetic (or simulated) data during training.
Simulated, artificial city-scapes have been used to generate image training data for applications in autonomous driving [11,12], due to the large number of potential scenarios experienced in urban driving and the ability to simulate data for potentially dangerous driving situations that would be difficult to obtain in real data. Results from [11,12] have demonstrated that use of synthetic data improves the performance of supervised learning algorithms for semantic image segmentation over training with real images only, due to the very large number of synthetic data examples (up to 200 k image examples) used during training. In [31], the authors use Blender-based image simulation of 3D satellite models to boost the performance of deep learning-based space debris characterisation from optical telescope data, for which real world validation data (of objects in orbit) is difficult to acquire. In [15], the authors develop a new framework for simulating highly-realistic artificial images of human faces and use synthetic data alone to train models for face landmark localisation that outperform models trained on limited real face datasets. Deep learning using synthetic data has also been demonstrated successfully using point cloud data; the authors in [32] present a large simulated point cloud in an urban environment for deep learning applications, and [33] demonstrated a method for generating simulated LiDAR point clouds of level crossings which increased the performance of networks for point segmentation when used during training.

1.2. Contributions of This Paper

In this paper, we evaluate the potential of synthetic point cloud data for training deep learning-based models for semantic segmentation of forest trees from LiDAR point clouds. The main contributions of our paper are:
1. We develop a tree point cloud simulation framework that generates realistic synthetic data of forest trees that can be used for training deep learning-based segmentation models. Our simulation code is open-source and made available to the research community (https://github.com/mitchbryson/SimpleSynthTree (accessed on 3 April 2023)).
2. We demonstrate an approach to deep learning-based tree stem point detection that can use our synthetic data, and demonstrate its effectiveness on real forest LiDAR point clouds collected from a variety of data sources (aerial and terrestrial) and different forest sites.
3. We demonstrate that models trained on synthetic data are competitive with, or out-perform, models trained on limited amounts of real data or on real data from non-target forest sites.
We also present ablation studies of our simulation model that demonstrate which features of the simulation process are beneficial to model development via learning. Our paper demonstrates a feasible approach to forest point cloud analysis using deep learning without the need for extensive manual data annotation when operating in new forest sites or using new sensors, which addresses a common operational problem with deploying these techniques.

2. Materials and Methods

This section provides an overview of our methodology including the collection and processing of real LiDAR point cloud datasets, the development of a tree simulator and its use in a supervised deep learning framework for tree point cloud segmentation.

2.1. Real LiDAR Datasets

2.1.1. Study Areas and Data Capture

LiDAR point clouds used in experiments were collected in forests over three different study sites:
  • Tumut: Commercial forest located outside of Tumut, NSW, Australia consisting primarily of mature Radiata pine trees, collected using airborne scanning.
  • HQP: A commercial plantation consisting primarily of Pinus caribaea spp. trees, located in Queensland, Australia, captured using mobile ground-based scanning.
  • DogPark: Recreational forest (various species) in Rotorua, New Zealand, captured using mobile ground-based scanning.
Airborne datasets were captured using a helicopter-mounted Riegl VUX-1 scanner flying at a height of approximately 60–90 m above the top of the forest canopy, resulting in resolutions of approximately 300 to 700 points per m². Ground-based datasets were captured using an Emesent Hovermap scanner carried in a backpack while scanning forest plots, resulting in resolutions of approximately 8000 points per m².

2.1.2. Initial Point Cloud Processing

At each study site, an existing method for individual tree detection based on past work [6,19] was used to split the forest point cloud into per-tree point clouds, where each per-tree point cloud corresponded to an individual tree, with ground and undergrowth points removed. At the HQP and DogPark sites, all visible trees were extracted from plot-level LiDAR scans (across several 30 m diameter sampling plots). At the Tumut site, trees were extracted from a 300-by-150 m sampling area. To provide ground-truth labels for training and validation, each point in each per-tree cloud was manually labelled/annotated into one of two classes corresponding to stem (main tree stem and any large branches/leaders coming from the main stem) and foliage (canopy points and any small branches). Points were manually labelled using the software package Meshlab (https://www.meshlab.net/ (accessed on 3 April 2023)).
Examples of per-tree point clouds from the three study sites are shown in Figure 1a. A total of 270 per-tree point clouds were processed across the three sites (90 trees from Tumut, 128 trees from HQP and 52 trees from DogPark). Each per-tree point cloud was down-sampled to 32,000 points per tree to maintain consistency in the resolution across the different datasets.
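The down-sampling step can be expressed compactly in Python. The following is a minimal sketch under the description above; the function name and the use of uniform random sampling are our own illustrative assumptions, not the exact implementation used in the study:

```python
import numpy as np

def downsample_tree_cloud(points, labels, n_target=32000, seed=0):
    """Randomly down-sample a labelled per-tree point cloud (points: (n, 3),
    labels: (n,)) to a fixed size for consistent resolution across datasets."""
    rng = np.random.default_rng(seed)
    # Sample without replacement when enough points are available; otherwise
    # sample with replacement to pad smaller clouds up to n_target points.
    replace = points.shape[0] < n_target
    idx = rng.choice(points.shape[0], size=n_target, replace=replace)
    return points[idx], labels[idx]
```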

2.2. Development of a Simulator for Generating Synthetic Tree Point Cloud Data

A simulation process was developed for generating synthetic examples of individual tree point clouds as they might appear when scanned using LiDAR sensors. During laser scanning, pulses of light emitted by the scanner interact with the physical surfaces of the tree’s stem, branches and foliage and are reflected back to the scanner registering multiple 3D points in space representing where on the tree structure these interactions occur. The strength of the LiDAR return depends on the reflectance of the tree surface and the density of measured returns depends on the local tree structure and positioning of the scanner relative to the tree. Airborne scanning tends to produce a high density of returns towards the canopy at the top of a tree, whereas ground/terrestrial-based scanning typically produces higher densities of returns at the bottom of the tree and tree stem.
The synthetic tree point cloud simulation process was therefore broken into two stages: (1) simulation of the tree structure as a 3D mesh-based model and (2) a point-based sampling of this mesh that reflects density of points (and its variation for different parts of the tree) depending on the mode of scanning which is to be emulated (i.e., aerial vs. terrestrial). Figure 2 illustrates the various steps of the simulation process, which are described in detail in the sub-sections below.

2.2.1. Tree Mesh Model Simulation

Simulated tree mesh models were composed of two different meshes: one mesh representing the central tree stem/trunk and a second mesh representing the distribution of small branches and foliage/canopy/leafy material for the tree. The central stem of a tree was simulated as a curved and tapered cylinder, starting from the bottom/base of the tree. The diameter of the stem at the base of the tree ($d_b$) and the overall tree height ($h_t$) (the length of the cylinder) are selected as random values, taken from a user-specified distribution of values. The taper of the stem (the change in diameter $d(h)$ over height $h$) is simulated as a linear function from the specified base diameter to a diameter of zero at the top of the tree:

$$d(h) = d_b - \left(\frac{d_b}{h_t}\right) h$$
The curvature and sweep of the tree stem is simulated using a cubic spline that interpolates between the base of the tree stem and two randomly selected 3D points, $\mathbf{x}_m = \{x_m, y_m, h_m\}$ and $\mathbf{x}_t = \{x_t, y_t, h_t\}$, corresponding to the mid-height of the tree ($h_m$) and the top of the tree ($h_t$), respectively. The $x$ and $y$ coordinates of these points are randomly sampled up to a user-specified maximum distance from the base of the tree. The tree base is located at the origin of a tree-relative 3D coordinate system (with $x, y$ horizontal and $z$ upwards). Cubic spline coefficients ($k_{0x}, k_{1x}, k_{2x}, k_{0y}, k_{1y}, k_{2y}$) are computed for both the $x$ and $y$ directions by solving the spline equations:

$$\begin{bmatrix} \frac{2}{h_m - h_b} & \frac{1}{h_m - h_b} & 0 \\ \frac{1}{h_m - h_b} & \frac{2}{h_m - h_b} + \frac{2}{h_t - h_m} & \frac{1}{h_t - h_m} \\ 0 & \frac{1}{h_t - h_m} & \frac{2}{h_t - h_m} \end{bmatrix} \begin{bmatrix} k_{0x} \\ k_{1x} \\ k_{2x} \end{bmatrix} = \begin{bmatrix} 3\frac{x_m - x_b}{(h_m - h_b)^2} \\ 3\left(\frac{x_m - x_b}{(h_m - h_b)^2} + \frac{x_t - x_m}{(h_t - h_m)^2}\right) \\ 3\frac{x_t - x_m}{(h_t - h_m)^2} \end{bmatrix},$$

$$\begin{bmatrix} \frac{2}{h_m - h_b} & \frac{1}{h_m - h_b} & 0 \\ \frac{1}{h_m - h_b} & \frac{2}{h_m - h_b} + \frac{2}{h_t - h_m} & \frac{1}{h_t - h_m} \\ 0 & \frac{1}{h_t - h_m} & \frac{2}{h_t - h_m} \end{bmatrix} \begin{bmatrix} k_{0y} \\ k_{1y} \\ k_{2y} \end{bmatrix} = \begin{bmatrix} 3\frac{y_m - y_b}{(h_m - h_b)^2} \\ 3\left(\frac{y_m - y_b}{(h_m - h_b)^2} + \frac{y_t - y_m}{(h_t - h_m)^2}\right) \\ 3\frac{y_t - y_m}{(h_t - h_m)^2} \end{bmatrix}$$
The $x, y$ coordinates of the tree center line as a function of the tree height $h$ are then calculated using the spline coefficients. For heights below the mid-height $h_m$, the following equations are used:

$$x(h) = x_b (1 - t_1) + x_m t_1 + t_1 (1 - t_1) \left( a_{1x} (1 - t_1) + b_{1x} t_1 \right),$$
$$y(h) = y_b (1 - t_1) + y_m t_1 + t_1 (1 - t_1) \left( a_{1y} (1 - t_1) + b_{1y} t_1 \right).$$

For heights above the mid-height $h_m$, the center line $x, y$ coordinates are computed as follows:

$$x(h) = x_m (1 - t_2) + x_t t_2 + t_2 (1 - t_2) \left( a_{2x} (1 - t_2) + b_{2x} t_2 \right),$$
$$y(h) = y_m (1 - t_2) + y_t t_2 + t_2 (1 - t_2) \left( a_{2y} (1 - t_2) + b_{2y} t_2 \right),$$

where:

$$t_1 = \frac{h - h_b}{h_m - h_b}, \quad t_2 = \frac{h - h_m}{h_t - h_m},$$
$$a_{1x} = k_{0x} (h_m - h_b) - (x_m - x_b), \quad b_{1x} = -k_{1x} (h_m - h_b) + (x_m - x_b),$$
$$a_{2x} = k_{1x} (h_t - h_m) - (x_t - x_m), \quad b_{2x} = -k_{2x} (h_t - h_m) + (x_t - x_m),$$
$$a_{1y} = k_{0y} (h_m - h_b) - (y_m - y_b), \quad b_{1y} = -k_{1y} (h_m - h_b) + (y_m - y_b),$$
$$a_{2y} = k_{1y} (h_t - h_m) - (y_t - y_m), \quad b_{2y} = -k_{2y} (h_t - h_m) + (y_t - y_m).$$
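To make the stem construction above concrete, the sketch below draws a random swept, tapered stem centerline. It is a simplified stand-in for the equations above: we let scipy.interpolate.CubicSpline (with natural boundary conditions) perform the spline solve, and all function and parameter names are illustrative assumptions rather than the published simulator's API:

```python
import numpy as np
from scipy.interpolate import CubicSpline  # stand-in for the explicit spline solve

def simulate_stem_centerline(h_t, d_b, mid_offset=0.5, top_offset=2.5,
                             n=100, rng=None):
    """Draw a random stem centerline with sweep and linear taper.

    Returns an (n, 4) array of [x, y, h, d]: centerline position and stem
    diameter at n heights from the base (origin, h = 0) to the top (h = h_t).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Random x-y offsets of the mid-height and tree-top points from the base.
    x_m, y_m = rng.uniform(-mid_offset, mid_offset, size=2)
    x_t, y_t = rng.uniform(-top_offset, top_offset, size=2)
    knots = np.array([0.0, 0.5 * h_t, h_t])  # base, mid and top heights
    spline_x = CubicSpline(knots, [0.0, x_m, x_t], bc_type='natural')
    spline_y = CubicSpline(knots, [0.0, y_m, y_t], bc_type='natural')
    h = np.linspace(0.0, h_t, n)
    d = d_b - (d_b / h_t) * h  # linear taper: zero diameter at the tree top
    return np.stack([spline_x(h), spline_y(h), h, d], axis=1)
```

A tapered stem mesh can then be built by sweeping circular cross-sections of diameter $d(h)$ along this centerline.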
To simulate splitting/forking of the main tree stem, additional curved, tapering cylinders were generated that branch off from the main stem at random locations (with a branching height of $h_{br}$) and reach out to randomly selected locations in the canopy of the tree. Parameters for the curvature and branching location were selected randomly for each synthetic tree from a user-specified distribution. Using the shape parameters specified, a series of vertices and triangular mesh faces was generated for each cylinder to represent the surface of the tree stem (see Figure 2a).
To simulate smaller branches and the tree's canopy, branching locations were selected randomly along the main tree stem and line segments were generated radiating outwards from the main stem to randomly selected locations in the canopy. The length of each line-segment branch was generated randomly using a mean length that depended upon the height of the branch. Two user-defined heights were used to determine how this mean length changed, in order to simulate various canopy shapes: a minimum canopy height and a maximum-canopy-width height. Branches could only exist above the minimum canopy height. Mean branch length increased linearly from zero at the minimum canopy height to a user-defined maximum width at the maximum-canopy-width height, and then decreased proportionally to the square root of the remaining height between the maximum-canopy-width height and the tree top, reaching zero length at the top of the tree. The resulting set of simulated branches/line segments was used to specify the location of randomised mesh segments (Figure 2b) from which sampled foliage/canopy points would be simulated in the mesh point sampling process described below.
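Under our reading of this canopy-shape description, the mean branch-length profile can be sketched as follows (the function name and the exact normalisation of the square-root fall-off are illustrative assumptions):

```python
import numpy as np

def mean_branch_length(h, h_min, h_w, h_t, w_max):
    """Mean branch length at stem height h: zero below the minimum canopy
    height h_min, rising linearly to w_max at the maximum-canopy-width
    height h_w, then falling as the square root of the remaining height
    to zero at the tree top h_t."""
    h = np.asarray(h, dtype=float)
    rising = (h - h_min) / (h_w - h_min) * w_max
    falling = np.sqrt(np.clip((h_t - h) / (h_t - h_w), 0.0, None)) * w_max
    length = np.where(h < h_w, rising, falling)
    return np.where(h < h_min, 0.0, length)  # no branches below the canopy
```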

2.2.2. Mesh Point Sampling

Once the structure of a tree has been generated using meshes, a simulated point cloud scan of the tree can be generated by randomly sampling 3D points on the surface of the tree structure meshes. The total number of sample points $N$ is chosen to reflect the resolution of the type of scanner being simulated. Points are sampled separately for the two classes, stem (sampled from the cylindrical stem meshes) and foliage (sampled from the randomised branch/foliage mesh segments), where random noise is added to the vertical coordinates of the foliage points to simulate the fine-scale complexity of tree foliage. The number of sampled stem points $N_{stem}$ and the number of sampled foliage points $N_{fol}$ are:

$$N_{stem} = \lambda N,$$
$$N_{fol} = (1 - \lambda) N,$$

where $\lambda$ is the user-definable ratio of stem points to overall points, a parameter which can be used to simulate scans that are closer in appearance to either ground-based (higher stem densities) or aerial (higher foliage densities) scanner locations.
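A minimal sketch of this sampling step, using trimesh surface sampling, is given below; the function and parameter names are illustrative, and the published simulator should be consulted for the exact implementation:

```python
import numpy as np
import trimesh

def sample_tree_points(stem_mesh, foliage_mesh, n_total=4096, lam=0.2,
                       foliage_noise=0.5, rng=None):
    """Sample a labelled synthetic point cloud from the two tree meshes.

    lam is the stem point ratio: higher values emulate ground-based scans,
    lower values aerial scans. Returns points (n_total, 3) and per-point
    labels (1 = stem, 0 = foliage)."""
    rng = np.random.default_rng() if rng is None else rng
    n_stem = int(lam * n_total)
    n_fol = n_total - n_stem
    stem_pts, _ = trimesh.sample.sample_surface(stem_mesh, n_stem)
    fol_pts, _ = trimesh.sample.sample_surface(foliage_mesh, n_fol)
    # Vertical jitter on foliage points emulates fine-scale canopy complexity.
    fol_pts = np.asarray(fol_pts).copy()
    fol_pts[:, 2] += rng.uniform(-foliage_noise, foliage_noise, size=n_fol)
    points = np.vstack([np.asarray(stem_pts), fol_pts])
    labels = np.concatenate([np.ones(n_stem, dtype=int),
                             np.zeros(n_fol, dtype=int)])
    return points, labels
```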
Figure 1b and Figure 2c show examples of the final simulated tree point clouds, labelled according to their per-point stem/foliage classes, compared against real LiDAR scans of trees considered in the study.

2.2.3. Simulation Implementation

The tree simulation was developed in the Python programming language and uses the Python packages numpy (https://numpy.org (accessed on 3 April 2023)) and trimesh (https://trimsh.org (accessed on 3 April 2023)) for simulating point sampling from the tree structure meshes. The simulation code is open-source and made available at https://github.com/mitchbryson/SimpleSynthTree (accessed on 3 April 2023).

2.3. Supervised Deep Learning Point Cloud Semantic Segmentation Model

The PointNet++ architecture [30] was used to develop a supervised deep learning model for tree point segmentation. PointNet++ models were trained using labelled per-tree point clouds taken from either the real LiDAR data and/or simulation-generated synthetic examples (see experiments described in Section 3 below). Once trained, each model would take new unlabelled per-tree point clouds as an input and provide per-point class labels (stem or foliage) as an output.
Two training datasets were created to train PointNet++ models: a synthetic dataset and a real dataset. The simulator described in Section 2.2 was used to generate N = 12,800 synthetic tree point clouds for the synthetic training set, where each point cloud contained 4096 points, using the parameter values shown in Table 1. Real per-tree clouds processed and manually labelled from the LiDAR datasets (see Section 2.1) were randomly split 50% for validation and 50% for training for each site. For each per-tree cloud selected for training, additional training examples were generated via data augmentation by randomly sampling 4096 points from the 32,000 available for each cloud and applying a random rotation about the vertical (z) axis, resulting in an augmented training dataset of up to 12,800 real tree point clouds per site.
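A minimal sketch of one augmentation draw, following the description above (the function name is an illustrative assumption):

```python
import numpy as np

def augment_tree(points, labels, n_points=4096, rng=None):
    """One augmented training example: a random n_points subset of the
    32,000-point tree cloud, rotated by a random angle about the z axis."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(points.shape[0], size=n_points, replace=False)
    pts, lbl = points[idx].copy(), labels[idx]
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return pts @ rot_z.T, lbl  # rotation preserves the vertical structure
```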
Each model was trained using Keras/TensorFlow (https://keras.io (accessed on 3 April 2023)) in Python, with custom PointNet++ layers implemented using the pointnet2-tensorflow2 package (https://github.com/dgriffiths3/pointnet2-tensorflow2 (accessed on 3 April 2023)). For further details of the architecture and training process for PointNet++, the reader is referred to [30]. All models were trained for 30 epochs using the Adam optimiser [34] with a learning rate of $1 \times 10^{-3}$ and a batch size of 5. Hyper-parameters for the ball query radii in each sampling layer were manually optimised using cross-validation results from the training datasets.
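The reported training configuration could look like the following hedged Keras sketch. The PointNet++ network itself is passed in pre-built (e.g., assembled from the pointnet2-tensorflow2 layers), since its exact constructor is specific to that package and not reproduced here:

```python
import tensorflow as tf

def train_segmentation_model(model, train_points, train_labels, val_data):
    """Compile and fit a per-point segmentation model with the settings
    reported above (Adam, learning rate 1e-3, 30 epochs, batch size 5).

    Assumes `model` maps (batch, 4096, 3) point clouds to (batch, 4096, 2)
    unnormalised class scores (logits) for the stem/foliage classes."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
    )
    return model.fit(train_points, train_labels, epochs=30, batch_size=5,
                     validation_data=val_data)
```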

3. Results

This section outlines our key experiments and results using the frameworks presented in Section 2, comparing the use of real and synthetic data during training.

3.1. Models Trained on Real Data

As a baseline for comparison, point segmentation models were trained on real LiDAR data with varying training dataset size, and evaluated using the real data evaluation splits across the different study sites described in Section 2.1.1. To study the effect of training dataset size on model performance, models were trained on increasing numbers of real training examples, varying from N = 20 to N = 12,800 trees from the augmented training data split described in Section 2.3. In this experiment, we trained different models for each of the three study sites and evaluated each model using evaluation set examples taken from the corresponding site for which the model was trained. For each site and each training dataset size N, we trained five models using five different randomly selected sets of training examples, to account for the variation in model performance when the number of training examples was low. Each model was evaluated using the real data validation sets by measuring the mean Intersection over Union (IoU), precision and recall of each of the stem and foliage classes.
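The per-class metrics used throughout the evaluation can be computed directly from per-point ground-truth and predicted labels; a minimal sketch:

```python
import numpy as np

def per_class_metrics(y_true, y_pred, cls):
    """IoU, precision and recall for one class (e.g., stem) given flat
    arrays of per-point ground-truth and predicted class labels."""
    tp = np.sum((y_pred == cls) & (y_true == cls))  # true positives
    fp = np.sum((y_pred == cls) & (y_true != cls))  # false positives
    fn = np.sum((y_pred != cls) & (y_true == cls))  # false negatives
    iou = tp / (tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return iou, precision, recall
```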
Figure 3 shows the mean stem class IoU across different sites and for increasing numbers of examples used during training. Model performance was observed to increase as the number of training examples N used during training increased at each site, and optimal model performance was only observed at each site when approximately N = 1000 or greater training examples were used. Overall model performance was lowest at the Tumut site, which contained a larger number of trees with split (multi-apical) stems and a wider variety of foliage profiles and tree lean, and, hence, was more challenging for the model to accurately segment.

3.2. Models Trained on Synthetic-Only Data

To study the effectiveness of the synthetic data simulator for model training, point segmentation models were trained on synthetic data only, and evaluated using real LiDAR scans from the different study sites. To study the effect of synthetic training data size on model performance, models were trained on increasing numbers of synthetic training examples, varying from 20 to 12,800 synthetic trees. Each model was evaluated using the real data-only validation sets.
Figure 4 shows stem class IoU for synthetic-only trained models of various training sizes evaluated on the real data evaluation sets at each site. Performance on real data was found to increase as the number of synthetic examples used during training increased, with the best model trained on 12,800 examples resulting in stem IoUs of 0.675 to 0.830, depending on the study site.

3.3. Comparison of Real vs. Synthetic Data-Trained Models

Figure 5 shows examples of segmented real tree point clouds from the different real vs. synthetic-data trained models, compared to ground-truth point cloud annotations. Models trained on either real (all) or synthetic datasets were able to segment points corresponding to the central tree stem with a level of accuracy that would be reasonable for many down-stream processing methods (e.g., measuring stem height, shape and diameter for inventory purposes). In contrast, models trained on a limited amount of real data (Real (N50)) had lower performance and often failed to identify stem points towards the upper half of trees, where foliage often obscures the location of the stem.
Table 2 shows a summary of the performance of different models (trained on synthetic vs. real data) evaluated on the real data evaluation sets. Results indicated that the best performing models were those trained on the whole (N = 12,800) real data sets; however, models trained on synthetic data consistently outperformed models trained on real data when the amount of data available at training was limited to N = 50 real examples (mean stem IoU increases from 0.598 to 0.675 (Tumut), from 0.817 to 0.830 (HQP) and from 0.790 to 0.810 (DogPark)).

3.4. Cross-Dataset Performance of Real Data-Trained Models vs. Synthetic-Only Trained Model

In addition to the analysis above, in which real data-trained models were evaluated on the same sites from which their training data were collected, we also evaluated the cross-dataset performance of each model. We compared the performance of synthetic data-trained models against models trained on real data from one site and evaluated on a different real data site not seen during training.
Table 3 shows the results of the cross-dataset evaluation. Compared with Table 2, the performance of real data-trained models dropped when they were evaluated on a different site from the one they were trained on. For two of the three sites (Tumut and DogPark), the synthetic data-trained model outperformed all real data-trained models that had been trained on other sites, even when all available real training examples were used (as measured by stem class IoU, Table 3).

3.5. Synthetic Data Simulator Ablation Study

In order to evaluate which aspects of the synthetic tree simulator had the most influence on the performance of models deployed on real trees, an ablation study was performed. Segmentation models were trained using synthetic data generated after removing key aspects of the tree simulation model:
1. No sweep and taper (ST): tree stems were modelled as straight cylinders with a constant diameter $d(h)$ along the height of the tree and no cubic spline parameters.
2. No foliage distribution (FD): the distribution of tree branching and foliage was generated along the height of the tree with a uniform random distribution and a constant branch length of 2 m.
Models were trained with N = 12,800 synthetic training examples and compared to the models trained on the synthetic data from the original full simulation by evaluating on real tree data from all of the three sites.
Table 4 shows the results of the ablation study. The full simulation process described in Section 2.2 exhibited the best performance (in terms of mean IoU and recall); each ablation to the simulation process decreased the overall performance of models trained on the resulting synthetic data.

4. Discussion

The use of synthetic data for training deep learning models for tree structural analysis has several advantages when compared to training from real data alone. The generation of large amounts of synthetic data is possible with effectively no human/manual effort, unlike the effort required to analyse and annotate large amounts of real LiDAR data when training models using real data. Although training with a large quantity of real, annotated training data results in the best model performance, there are many operational scenarios where, in practice, only a smaller, limited real dataset is available (due to cost/effort) or a model trained on an existing dataset is intended to be used at a new site for which annotated training data are not yet available.
Models trained purely on synthetic trees have the capacity to be competitive with, and out-perform, models trained purely on real data, particularly in situations where accurately labelled real training data are scarce (Table 2). Real data-trained models exhibited improvements in performance as the size of the training set increased: our results showed that synthetic-only trained models out-performed all real data-trained models when the training dataset size was low; however, real data-trained models out-performed synthetic ones once the training set size reached a critical number, which varied at each study site (Figure 3 and Figure 4). This is consistent with the use of synthetic data for supervised deep learning using point clouds [35] and in other domains (e.g., [12,14]).
Cross-dataset performance experiments indicated that the synthetic data-trained model either out-performed (Tumut/DogPark sites) or had equivalent performance (HQP) to real data-trained models from other sites (Table 3). Due to variations in tree stem profiles and properties across the different sites, models trained from real data and deployed across different sites were not as effective at segmenting new trees. This is consistent with other cross-dataset studies in the literature (e.g., [9]) where different methods (e.g., domain adaptation) have been used to address this issue. The synthetic data-trained models in our study had access to training examples that, although synthetic, emulated a broader distribution of structural characteristics, resulting in better predictions when applied to real data across sites. This has implications for a common operational scenario in which stem segmentation models are to be deployed at a new site where training data have not yet been annotated. Our results show that, in this scenario, a model trained on our synthetic data often out-performs models trained using real data from other sites, and would be the preferred approach until data at the new site can be annotated and used for re-training.
Results of our ablation study demonstrate the effectiveness of the different simulation features used during synthetic data generation. Simulation features that model sweep, lean and taper of real tree stems and approximate the foliage distribution on trees in a realistic manner improve the performance of models trained using synthetic data when deployed on the real LiDAR datasets considered in our study (Table 4). These features are likely to be important in the ability for a model to learn how to find tree stem points, particularly when identifying stem points towards the upper half of trees, where foliage density increases relative to the stem and where branching and forking increases.
There are two main potential avenues for future work from this paper. The first involves improving the fidelity of the tree simulation framework demonstrated here: although we demonstrate that our simulation framework is beneficial for learning, it does not capture fine-scale tree details or strongly multi-apical tree structures. More advanced tree simulation models are likely to further improve model performance and enable more fine-grained structural analysis tasks (such as branch detection) using deep learning based on synthetic-data-trained models. The second avenue is to explore methods that combine both real and synthetic data during training. Approaches such as domain adaptation [36,37] may be able to learn common features between real and synthetic trees, or transfer the appearance of synthetic trees to real LiDAR scans of trees using a limited amount of real tree point cloud data.

5. Conclusions

Deep learning and data-driven processing techniques will continue to grow in importance for the analysis of 3D data collected in natural environments, including forest point clouds. In this paper, we have established the utility of synthetic data for training such models for tasks such as the semantic segmentation of trees. We have developed a simple yet flexible tree simulation framework, made available for future research, which can be tuned to a variety of different tree and forest types. Results of our experiments using real LiDAR forest data from several different sites demonstrate that models trained on our synthetic data are competitive with, or out-perform, models trained on limited amounts of real data from the target site, or models trained on real data from other sites. This addresses an important gap often encountered in practical forest inventory when collecting data from a new forest (or potentially using a different point cloud sensor) for the first time, and provides a viable alternative to collecting a large annotated training dataset at the target site.

Author Contributions

Conceptualisation, M.B.; Methodology, M.B.; Software, M.B., F.W. and J.A.; Formal Analysis, M.B.; Data curation, M.B.; Data annotation/labelling, M.B., F.W. and J.A.; Writing—original draft, M.B.; Writing—review and editing, F.W. and J.A.; Project administration and funding acquisition, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in part by the National Institute for Forest Production Innovation grant NIF073-1819, Forest and Wood Products Australia grant VNC520-1920 and the University of Sydney.

Data Availability Statement

The simulation code used to generate our synthetic tree datasets is open-source and made available at https://github.com/mitchbryson/SimpleSynthTree (accessed on 3 April 2023).

Acknowledgments

This work has been supported by the Australian Centre For Robotics (ACFR), University of Sydney. Thanks to David Herries, Susana Gonzales, Lee Stamm, Interpine New Zealand and HQPlantations Australia for providing access to airborne and terrestrial laser scanning datasets.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ayrey, E.; Hayes, D. The Use of Three-Dimensional Convolutional Neural Networks to Interpret LiDAR for Forest Inventory. Remote Sens. 2018, 10, 649.
  2. Hamraz, H.; Jacobs, N.; Contreras, M.; Clark, C. Deep learning for conifer/deciduous classification of airborne LiDAR 3D point clouds representing individual trees. ISPRS J. Photogramm. Remote Sens. 2018, 158, 219–230.
  3. Chen, J.; Chen, Y.; Liu, Z. Classification of Typical Tree Species in Laser Point Cloud Based on Deep Learning. Remote Sens. 2021, 13, 4750.
  4. Liu, B.; Chen, S.; Huang, H.; Tian, X. Tree species classification of backpack laser scanning data using the PointNet++ point cloud deep learning method. Remote Sens. 2022, 14, 3809.
  5. Xi, Z.; Hopkinson, C.; Chasmer, L. Filtering Stems and Branches from Terrestrial Laser Scanning Point Clouds Using Deep 3-D Fully Convolutional Networks. Remote Sens. 2018, 10, 1215.
  6. Windrim, L.; Bryson, M. Detection, segmentation, and model fitting of individual tree stems from airborne laser scanning of forests using deep learning. Remote Sens. 2020, 12, 1469.
  7. Krisanski, S.; Taskhiri, M.; Gonzalez-Aracil, S.; Herries, D.; Turner, P. Sensor Agnostic Semantic Segmentation of Structurally Diverse and Complex Forest Point Clouds Using Deep Learning. Remote Sens. 2021, 13, 1413.
  8. Wu, B.; Zheng, G.; Chen, Y. An Improved Convolution Neural Network-Based Model for Classifying Foliage and Woody Components from Terrestrial Laser Scanning Data. Remote Sens. 2020, 12, 1010.
  9. Wang, F.; Bryson, M. Tree Segmentation and Parameter Measurement from Point Clouds Using Deep and Handcrafted Features. Remote Sens. 2023, 15, 1086.
  10. Lines, E.; Allen, M.; Cabo, C.; Calders, K.; Debus, A.; Greive, S.; Miltiadou, M.; Noach, A.; Owen, H.; Puliti, S. AI applications in forest monitoring need remote sensing benchmark datasets. arXiv 2022, arXiv:2212.09937.
  11. Ros, G.; Sellart, L.; Materzynska, J.; Vazquez, D.; Lopez, A. The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
  12. Johnson-Roberson, M.; Barto, C.; Mehta, R.; Sridhar, S.N.; Rosaen, K.; Vasudevan, R. Driving in the Matrix: Can Virtual Worlds Replace Human-Generated Annotations for Real World Tasks? In Proceedings of the International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017.
  13. Ward, D.; Moghadam, P.; Hudson, N. Deep Leaf Segmentation Using Synthetic Data. In Proceedings of the British Machine Vision Conference (BMVC), Newcastle, UK, 3–6 September 2018.
  14. Georgakis, G.; Mousavian, A.; Berg, A.; Kosecka, J. Synthesizing Training Data for Object Detection in Indoor Scenes. In Proceedings of Robotics: Science and Systems, Cambridge, MA, USA, 12–16 July 2017.
  15. Wood, E.; Baltrusaitis, T.; Hewitt, C. Fake it till you make it: Face analysis in the wild using synthetic data alone. In Proceedings of the International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021.
  16. Weber, J.; Penn, J. Creation and rendering of realistic trees. In Proceedings of the 22nd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH), Los Angeles, CA, USA, 6–11 August 1995.
  17. Hewitt, C. Procedural Generation of Tree Models for Use in Computer Graphics. Undergraduate Dissertation, Trinity Hall, Dublin, Ireland, 2017.
  18. Westling, F.; Bryson, M.; Underwood, J. SimTreeLS: Simulating aerial and terrestrial laser scans of trees. Comput. Electron. Agric. 2021, 187, 106277.
  19. Bryson, M. PointcloudITD: A software package for individual tree detection and counting. In Deployment and Integration of Cost-Effective, High Spatial Resolution, Remotely Sensed Data for the Australian Forestry Industry; FWPA Technical Report; Forest & Wood Products Australia: Melbourne, VIC, Australia, 2017; pp. 1–19.
  20. Vandendaele, B.; Fournier, R.; Vepakomma, U.; Pelletier, G.; Lejeune, P.; Martin-Ducup, O. Estimation of northern hardwood forest inventory attributes using UAV laser scanning (ULS): Transferability of laser scanning methods and comparison of automated approaches at the tree- and stand-level. Remote Sens. 2021, 13, 2796.
  21. Neuville, R.; Bates, J.; Jonard, F. Estimating forest structure from UAV-mounted LiDAR point cloud using machine learning. Remote Sens. 2021, 13, 352.
  22. Hao, Y.; Widagdo, F.; Liu, X.; Liu, Y.; Dong, L.; Li, F. A hierarchical region-merging algorithm for 3-D segmentation of individual trees using UAV-LiDAR point clouds. IEEE Trans. Geosci. Remote Sens. 2022, 69, 5701416.
  23. Persson, P.; Olofsson, K.; Holmgren, J. Two-phase forest inventory using very-high-resolution laser scanning. Remote Sens. Environ. 2022, 271, 112909.
  24. Gobakken, T.; Naesset, E. Estimation of diameter and basal area distributions in coniferous forest by means of airborne laser scanner data. Scand. J. For. Res. 2004, 19, 529–542.
  25. Maltamo, M.; Suvanto, A.; Packalén, P. Comparison of basal area and stem frequency diameter distribution modelling using airborne laser scanner data and calibration estimation. For. Ecol. Manag. 2007, 247, 26–34.
  26. Olofsson, K.; Holmgren, J.; Olsson, H. Tree stem and height measurements using terrestrial laser scanning and the RANSAC algorithm. Remote Sens. 2014, 6, 4323–4344.
  27. Lamprecht, S.; Stoffels, J.; Dotzler, S.; Hab, E.; Udelhoven, T. aTrunk—An ALS-Based Trunk Detection Algorithm. Remote Sens. 2015, 7, 9975–9997.
  28. Raumonen, P.; Kaasalainen, M.; Akerblom, M.; Kaasalainen, S.; Kaartinen, H.; Vastaranta, M.; Holopainen, M.; Disney, M.; Lewis, P. Fast Automatic Precision Tree Models from Terrestrial Laser Scanner Data. Remote Sens. 2013, 5, 491–520.
  29. Qi, C.; Yi, L.; Su, H.; Guibas, L. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
  30. Qi, C.; Yi, L.; Su, H.; Guibas, L. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. In Proceedings of the Conference on Neural Information Processing Systems (NIPS), Long Beach, CA, USA, 4–9 December 2017.
  31. Allworth, J.; Windrim, L.; Bennett, J.; Bryson, M. A transfer learning approach to space debris classification using observational light curve data. Acta Astronaut. 2021, 181, 301–315.
  32. Griffiths, D.; Boehm, J. SynthCity: A large-scale synthetic point cloud. arXiv 2019, arXiv:1907.04758.
  33. Uggla, G.; Horemuz, M. Towards synthesized training data for semantic segmentation of mobile laser scanning point clouds: Generating level crossings from real and synthetic point cloud samples. Autom. Constr. 2021, 130, 103839.
  34. Kingma, D.; Ba, J. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations (ICLR), San Diego, CA, USA, 7–9 May 2015.
  35. Xiao, A.; Huang, J.; Guan, D.; Zhan, F.; Lu, S. Transfer Learning from Synthetic to Real LiDAR Point Cloud for Semantic Segmentation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, 20–28 February 2022.
  36. Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial Discriminative Domain Adaptation. In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017.
  37. Tsai, Y.; Sohn, K.; Schulter, S.; Chandraker, M. Domain Adaptation for Structured Output via Discriminative Patch Representations. In Proceedings of the International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
Figure 1. Example tree point clouds: (a) real LiDAR-scanned point clouds of trees taken from the three study sites and (b) synthetic tree point clouds generated using the simulation process.
Figure 2. Simulation process for randomised synthetic tree data: (a) example tree main stem cylinder model; (b) randomised meshes used for the placement of smaller branches, foliage and other tree canopy; (c) simulated LiDAR point cloud using point-based sampling of the tree structure.
Figure 3. Performance (mean IoU of stem class) of real data-trained models on real data evaluation sets with increasing number of real examples N used during training. Plots show the average IoU across five different models trained for each site and N combination, with error bars indicating the standard deviation over these models.
Figure 4. Performance (mean IoU of stem class) of synthetic data-trained models on real data evaluation sets with increasing number of synthetic examples used during training.
Figure 5. Examples of predicted segmentations of real tree point clouds using models trained on real vs. synthetic training data: (a) ground-truth class labels; (b) predicted classes by a model trained on limited real data (Real N50); (c) predicted classes by a model trained on all real data; (d) predicted classes by a model trained on synthetic data.
Table 1. Simulation parameters used to generate synthetic trees used to train tree point segmentation models.

Parameter | Values Used
tree height range ($h_t$) | 30–50 m
base diameter range ($d_b$) | 0.5–1 m
stem split/fork height range ($h_{br}$) | 0.15$h_t$–0.5$h_t$
stem split probability | 0.15
number of small branches | 60–100
minimum canopy height | 0.2$h_t$–0.5$h_t$
maximum canopy width | 7 m
maximum canopy width height | 0.4$h_t$–0.8$h_t$
tree top x-y distance from base range | ±2.5 m
tree mid x-y distance from base range | ±0.5 m
foliage point randomised height | 0.5 m
stem point ratio ($\lambda$) | 0.1–0.3
Table 2. Performance of models trained on real vs. synthetic-only data: shown are the mean IoU, precision and recall for each of the real data evaluation sets at each study site for each model. "Real (N50)" are models trained on N = 50 real data examples, "Real (all)" are models trained on all available real data examples at each site and "Synthetic" are models trained on N = 12,800 synthetic examples.

Test Dataset | Model Trained On | Stem IoU | Stem Precision | Stem Recall | Foliage IoU | Foliage Precision | Foliage Recall
Tumut | Real (N50) | 0.598 | 0.927 | 0.638 | 0.974 | 0.979 | 0.995
Tumut | Real (all) | 0.761 | 0.884 | 0.841 | 0.985 | 0.991 | 0.994
Tumut | Synthetic | 0.675 | 0.790 | 0.806 | 0.978 | 0.989 | 0.989
HQP | Real (N50) | 0.817 | 0.858 | 0.947 | 0.945 | 0.989 | 0.955
HQP | Real (all) | 0.849 | 0.865 | 0.978 | 0.952 | 0.995 | 0.957
HQP | Synthetic | 0.830 | 0.852 | 0.968 | 0.947 | 0.993 | 0.953
DogPark | Real (N50) | 0.790 | 0.872 | 0.897 | 0.791 | 0.911 | 0.863
DogPark | Real (all) | 0.875 | 0.917 | 0.950 | 0.880 | 0.954 | 0.918
DogPark | Synthetic | 0.810 | 0.854 | 0.941 | 0.797 | 0.943 | 0.837
Table 3. Cross-dataset performance of real data-trained models vs. synthetic-only trained model: shown are the mean IoU, precision and recall for the stem class for each of the three datasets when the models used were trained on other datasets or on synthetic data. All real data-trained models were trained on all available real examples at each site (N = 12,800); synthetic data models were trained on N = 12,800 synthetic examples.

Test Dataset | Model Trained On | IoU | Precision | Recall
Tumut | HQP Model | 0.596 | 0.875 | 0.644
Tumut | DogPark Model | 0.602 | 0.859 | 0.658
Tumut | Synthetic Model | 0.675 | 0.790 | 0.806
HQP | Tumut Model | 0.862 | 0.875 | 0.984
HQP | DogPark Model | 0.882 | 0.899 | 0.980
HQP | Synthetic Model | 0.830 | 0.852 | 0.968
DogPark | Tumut Model | 0.789 | 0.855 | 0.914
DogPark | HQP Model | 0.798 | 0.893 | 0.882
DogPark | Synthetic Model | 0.810 | 0.854 | 0.941
Table 4. Simulation ablation study results: shown are the mean IoU, precision and recall for the stem class for real trees at all three sites, when the models used were trained on synthetic data ('Full Sim' is the complete tree simulator, '-ST' is without stem sweep and taper modelling, '-FD' is without foliage distribution modelling).

Test Dataset | Synthetic Data-Trained Model | IoU | Precision | Recall
Combined Real Sites | Full Sim | 0.769 | 0.832 | 0.903
Combined Real Sites | Full Sim -ST | 0.725 | 0.804 | 0.875
Combined Real Sites | Full Sim -FD | 0.709 | 0.850 | 0.803
Combined Real Sites | Full Sim -ST -FD | 0.591 | 0.654 | 0.830