A VOXEL-BASED METHOD FOR THE THREE-DIMENSIONAL MODELLING OF HEATHLAND FROM LIDAR POINT CLOUDS: FIRST RESULTS

: Bushfires are an intrinsic part of the New South Wales’ (NSW) environment in Australia, especially in the Blue Mountains region (11400km 2 ), that is dominated by fire prone vegetation that includes heathland. Many of the Australian native plants in this region are fire-prone and combustible, and many species even require fire to regenerate. The classification of the lateral and vertical distribution of living vegetation is necessary to manage the complexity of bushfires. Currently, interpretation of aerial and satellite images is the prevalent method for the classification of vegetation in NSW. The result does not represent important vegetation structural attributes, such as vegetation height, subcanopy height, and destiny. This paper presents an automated method for the three-dimensional modelling of heathland and important heathland parameters, such as heath shrub height and continuity, and sparse tree and mallee height and density in support of bushfire behaviour modelling. For this study airborne lidar point clouds with a density of 120 points per square meter are used. For the processing and modelling the study is divided into a point cloud processing phase and a voxel-based modelling phase. The point cloud processing phase consists of the normalisation of the height and extraction of the above ground vegetation, while the voxel phase consists of seeded region growing for segmentation, and K-means clustering for the classification of the vegetation into three different canopy layers: a) heath shrubs, b) sparse trees and mallee, c) tall trees.


INTRODUCTION
The Blue Mountain region in is one of the most fire-prone regions in the world and it is renowned for its fire prone vegetation (New South Wales Rural Fire Service, 2021). Heathland is one of the many vegetation groups that is present in the Blue Mountains. Heathland is structurally complex, vertically non-uniform, and discontinuous and occurs in patches of three to four hectares (Hammill and Tasker, 2010;Sullivan et al., 2012). It is dominated by shrubs ranging between 0.5 meter (m) to 2 m in height, while trees are absent or sparsely scattered, rarely exceeding 10 m in height, or they may be present as mallee (multi-stemmed eucalyptus) (Keith et al., 2014).
Heathland is notorious for its high flammability. The presence of flammable terpenes and waxes in the foliage of some shrubs and mallee along with the ladder type structure of heathland, propagating surface to crown fire, create a complex fire behaviour. Bushfire behaviour models are essential for controlling such complex bushfires, requiring three-dimensional geoinformation. A primary input parameter for heathland bushfire modelling is vegetation height or bulk density. Currently aerial photography and satellite imagery are utilised in the classification of heathland in NSW. These two-dimensional techniques are leveraged off the geometric arrangement and illumination conditions of the vegetation. The drawback with such models is that even with high spatial resolution, important heathland parameters such as the canopy height, canopy continuity and density, cannot be modelled. This is specifically important for heathland as it is slightly unique in that it can be * Corresponding author floristically consistent but structurally variable. An example of this is the Eastern Suburbs Banksia Scrub that is an endangered ecological community of heath vegetation in Sydney (Figure 1).

Figure 1.
Heathland a) Exposed sites with heights of 1.2 m or less, b) Closed sites with heights in excess of 2 m and sparse trees Three-dimensional techniques have many merits compared to two-dimensional techniques, since they allow for the separation of objects found at different heights above the Earth's surface (Wagner et al., 2008). Point clouds acquired from Airborne Laser Scanner (ALS) and Mobile Laser Scanners (MLS) are a prominent technique for the characterisation and estimation of three-dimensional vegetation elements within forested and city environments (Chen et al., 2016;Morsdorf et al., 2006;Xu et al., 2021). The lidar beams can penetrate through the upper canopy foliage to provide three-dimensional spatial coordinates (x,y,z) of the upper canopy, understorey vegetation and terrain. From the acquired point clouds different vegetation layers can be modelled, and accurate estimation of vegetation height, cover, and canopy structure is possible. These vegetation parameters are important and can improve the reliability of heathland fire behaviour models (Anderson et al., 2015). This is because wind speed, an important weather parameter that is applied in heathland bushfire behaviour modelling, is directly impacted by objects (i.e. trees) and their porosity (their density). For example, a dense object has no porosity and can reduce wind speed. Thus, this influences the rate of spread and the intensity of a fire .
Lidar point clouds are applied in a range of studies for the threedimensional modelling and estimation of vegetation structural attributes (Xu et al., 2021). They are applied in individual tree modelling (Leiterer et al., 2012), tree position and tree species modelling (Huo et al., 2021;Lee and Lucas, 2007), canopy height (Coops et al., 2007), tree diameter distribution (Liang et al., 2018), foliage and wood density and stand volume modelling (Hall et al., 2005) or for spaces estimates (Wang et al., 2020;Wang et al., 2021) . However, raw point clouds can be very large (number of points in the millions), noisy, sparse and unstructured and featuring uneven point densities (Remondino, 2003). This unstructured nature of point clouds is a bottleneck for preprocessing and manipulation. Furthermore, point clouds do not contain sematic, topological, or geometrical information about an object. To derive localised information about point clouds, neighbourhoods between points must be defined, but this requires expensive computations when performed on the raw point clouds.
Voxels (VOlumetric piXELs), the 3D analogue to 2D pixels, resolve current drawbacks that are experienced with point clouds. They provide a structured and efficient method to represent threedimensional objects in a topologically explicit three-dimensional array while reducing processing time and memory space. This is achieved by combining qualitative information, such as a discrete vegetation layer, with quantitative information, such as volume. They allow for straightforward and efficient processing algorithms for the extraction of three-dimensional features through voxel neighbourhood connectivity. When applied in heathland modelling, they can represent important heath shrub connectivity and discontinuity.
Voxel-based modelling is frequently applied in a variety of point cloud applications, showing great potential and value in vegetation modelling. Voxels are applied at both a single tree level and area-based level vegetation modelling. At a single tree level, Gorte and Pfeifer (2004) and Gorte and Winterhalder (2004) applied voxels to extract tree stems and major tree branches from the tree topology skeletons by morphological operations and connectivity analysis. At an area based level, Kükenbrink et al. (2017) applied a voxel method for mapping and a voxel traversal method for quantifying occluded forest canopy volume by tracing ALS laser pulses through a pre-defined 1 m voxel grid. Bienert et al. (2010) applied a voxel method to model stand height, tree density and plant area density (PAD). This latter work shares similarities with our approach, but for the application of fire modelling a classification of vegetation is necessary too.
Although methods for the modelling of vegetation using both point clouds and voxels exist, currently a method based on the implementation of a series of algorithms for the threedimensional modelling of heathland and mallee parameters does not exist. Current methods for the modelling of single trees and forests are based on extracting important parameters such as different canopy layers and the different canopy structures, while in automated point cloud canopy model methods, cylindrical shape or vertical orientation are applied for the modelling of trees. This is too time consuming and programmatically extensive, and is expected to be less robust than voxel-based algorithms, because it requires more assumptions (e.g. extent of point coverage per stem). Furthermore, heathland is different in shape and structure compared to other vegetation groups such that it is without a shape and structurally complex, predominantly composed of shrubs with the presence of sparsely scattered trees and mallee. Since heathland does not have a distinct geometric structure (like cylindrical tree stems), voxels are the preferred method for the modelling of heathland. Therefore, this paper presents a new automated method, composed of a series of algorithms, for the three-dimensional modelling of heathland and important heathland parameters such as heath shrub height and continuity, and sparse trees and mallee height and density, in support of bushfire behaviour modelling in NSW.
To begin this paper provides a review on current automated methods for vegetation modelling and classification using point cloud and voxels in the following section. Section 3 follows with a presentation of the proposed methodology and the algorithms for the classification of heathland. Section 4 discusses the point cloud processing phase. While section 5 discusses the voxelbased processing phase. The analysis and results are discussed in section 6. Finally, the paper concludes with intentions for further research.

RELATED RESEARCH
Lidar point clouds are commonly applied for three-dimensional vegetation modelling and canopy layer classification. Novo et al. (2020) applied an automated approach for horizontal and vertical classification of vegetation for fire hazard modelling. The study focused on the classification of vegetation in two categories; shrubs and trees. Lee and Lucas (2007) applied lidar point clouds for the delineation of forest canopy at an individual tree or cluster level. Furthermore, they identified tree stem and height in the overstorey and sub-canopy layer of a multi-layered wooded savanna forest in Australia. Ferraz et al. (2012) classified ALS point clouds into three different categories of ground vegetation, understorey and overstorey vegetation group. This research also included obtaining the thickness of the main vegetation layers, but also the spatial arrangement and size of the individual plants that compose each stratum. Amiri et al. (2016) applied ALS point clouds for change detection at the lower canopy layers. This study specifically focused on the estimation of single tree regeneration coverage below 5 m.
Voxels are another method for the three-dimensional modelling and classification of different canopy layers. Voxels quantify point clouds into a structured three-dimensional composition for the modelling of homogeneous vegetation connectivity. Vetter et al. (2011) applied a 1x1 m 2 horizontal and 0.5 m vertical voxel structure method to count the number of echoes within each voxel cell. This study delineated homogeneous vegetation roughness based on the spatial discretisation of the laser echoes and aggregation of the points into cells, voxels, and connections. Mücke (2014) applied volumetric spaces referred to as density pixels that resemble a voxel but from a programming point of view, all the calculations were based on the pixel. Through the calculation of the number of ALS echoes inside one height level divided by the total number of ALS echoes inside the pixel column, a density pixel is calculated. Based on pre-defined parameters a series of pixels of one height represent the presence of the abundance of subdominant vegetation layers. A significance test results in the binary classification of the pixels and by setting a threshold the pixels are divided into different canopy layers. Brolly et al. (2021) applied a voxel-based approach in a multi-layered stand for identifying different tree parameters. A seeded region growing algorithm was applied for the segmentation of individual trees.
Voxels are also applied for the modelling of occluded understorey vegetation in dense multi-layered forests (Barton et al., 2020;Kükenbrink et al., 2017;Mücke, 2014). Barton et al. (2020) utilised a voxel method for measuring fuel load in a multilayered forest in Newcastle, Australia, with the aim to quantify vertical fuel layers for hazard reduction burning. Connected component labelling (CCL) was applied for the segmentation of different canopy layers while a thresholding approach was applied for the classification of the vegetation.
The current methods that are listed for the modelling of vegetation are for the classification of canopy layers such as forestland and woodland. Currently to our knowledge there are no studies that can be found on the three-dimensional modelling of heathland.

METHODOLOGY
A method is proposed for the three-dimensional modelling of heathland. This method consists of two phases. Phase one of this method relates to direct point cloud processing. It consists of the normalisation of the height and the above ground vegetation classification. Phase two consists of voxelisation of the point clouds, followed by segmentation and classification of the voxels (Figure 2). The novelty in this proposed approach is that a series of algorithms are assembled for the purpose of deriving important heathland parameters, such as heath shrub height and continuity, and sparse tree and mallee height and density, specifically in the voxel-based processing phase. Using the voxel-based method, segmentation and classification steps are applied to extract features. The segmentation step measures the similarity of features based on voxel neighbourhood connectivity, while the classification step derives the semantic details of the features.

POINT CLOUD PROCESSING
The first phase of this approach focusses on the point cloud processing, which consists of two primary steps, 1) the normalisation of the height to the Digital Terrain Model (DTM) and, 2) the classification of all above ground vegetation and the extraction of the DTM from the dataset. The point cloud process is completed using the scientific software OPALS (Orientation and processing of Airborne Laser Scanning) .

Normalisation of Height
Generating an accurate DTM is the first step in producing a threedimensional vegetation model (Hodgson and Bresnahan, 2004). This involves converting the lidar ellipsoidal heights (Z) to orthometric heights (H). The correct estimation of the DTM is critical for an unbiased estimation of vegetation height otherwise it can contribute to underestimation of canopy height. This can lead to errors in the range of several decimetres. The normalisation of the height includes removing the influence of the terrain on the above ground measurements

Above Ground Vegetation Classification and Terrain Removal
Following the normalisation of the height all above ground vegetation are extracted. The canopy height model (CHM) is applied for the extraction of all above ground vegetation. It is the height or distance between the ground and the topmost echoes such as the top of a tree and it is calculated as the difference between the DSM and the DTM. The DSM is a representation of the topmost surface visible from an aerial platform. This is represented by the first echo returns while the DTM represents the elevation on the ground surface. The next step is to perform the echo ratio estimation. The echo ratio is a good indication on the level of roughness and allows for classification of vertical vegetation point clouds such as trees and shrubs from artificial objects such as buildings (Höfle et al., 2009). It is also useful for removing other types of solid objects from the dataset. The echo ratio is defined by Höfle et al. (2009) as the ratio between the number of neighbouring echoes in a fixed search distance in 3D (a sphere) and all echoes located within the same search distance in 2D (a cylinder).
The output from the echo ratio displays flat surfaces such as grassland, buildings, and roads with a value close to 100%. This is because their geometric appearances are similar, and their point density ratios are less. Transparent objects such as trees and shrubs have lower echo ratio values due to having neighbouring point clouds in horizontal and vertical planes. Through these values two different classes of data can be extracted, objects with a high echo ratio value and objects with a low echo ratio value.
The advantage of using the echo ratio for this dataset is its application to remove solid objects, but the disadvantage is that grassland is separated into a different category from the vegetation group. This does not create a problem for this dataset as our data is focused on shrubs and trees.

VOXEL BASED VEGETATION MODEL
Phase two of this research is the voxel-based framework for 3D modelling of heathland. This phase consists of three steps; 1) structuring the point clouds into a voxel space, i.e. voxelisation 2) segmentation based on the seeded region growing algorithm, and 3) K-means clustering for the classification of all above ground vegetation (Figure 2). The voxelisation step is applied following the extraction of all above ground vegetation. This step requires defining a voxel resolution and a minimum voxel density. The next step in this phase is the segmentation and classification of the occupied voxels into three different vegetation height classes, 1) heath shrubs, 2) sparse trees and mallee, 3) tall trees.

Voxelisation
The point clouds are structured using the Python library routines NumPy (Harris et al., 2020) and Laspy (Python Package Index-PyPI, 2022). The first voxelisation step is the calculation of the bounding box over the point clouds using the Poux (2020) voxel code. The bounding box is further discretised into smaller cubic grids (voxel cubes) by defining the voxel resolution.
Defining a fitting voxel resolution is an integral part of the representation of the data. This is because voxelisated objects are subject to a bias in the estimation of surface areas, which causes a misrepresentation of objects that are not aligned with the grid. Hence, it is important to select a fitting voxel resolution for the data. A large voxel resolution can aggregate the data and result in the loss of significant geographic data. Though a small voxel resolution can represent geographical features more accurately to achieve a higher level of detail, it can create too many voxels such that each point in the point cloud has a voxel grid.

Segmentation of occupied voxels
Segmentation algorithms originating in two-dimensional image processing can easily be applied for object-based voxel segmentation, unlike point cloud segmentation which is mostly based on recognition of a simple shape in point clouds. Following the voxelisation of the point clouds, all occupied voxels are segmented.
The segmentation of the voxels is based on voxel neighbourhood connectivity. Seeded region growing or connected component labelling (CCL) are the two preferred algorithms. Both algorithms are based on voxel neighbourhood connectivity, while the former searches for an additional similarity trait. In a voxel space the seeded region growing algorithm assumes that all voxels that belong to one object are connected and have similar attributes. Therefore, the seeded region growing algorithm is the preferred algorithm for this research as it provides the option to introduce other attributes to the segmentation process at a later date.
The voxel neighbourhood connectivity is a common voxel property to consider. It defines how the voxels are linked. In three-dimensional space, neighbouring voxels, are connected to each other through either 6, 18 or 26 voxel neighbourhood connectivity. The different neighbourhoods directly impact the cluster (object) output. Strongly connected voxels with 18 or 26 neighbourhood connectivity result in thinner objects with shorter lengths or areas while a 6 voxel neighbourhood connectivity results in a thicker object output (Aleksandrov et al., 2021). Homogeneous vegetation clusters are extracted through the voxel neighbourhood connectivity. A strong voxel neighbourhood connectivity, such as 18 or 26 neighbourhood connectivity, results in less segmentation clusters compared to a 6-voxel connectivity. The voxel neighbourhood connectivity is an effective method to estimate the connectivity and discontinuity of heath shrubs. While tree heights are estimated from the segmentation of tree crowns in the voxel-space.
The Hancock (2017) seeded region growing code is utilised for this study, in which a minimum voxel density can be defined during the processing (Algorithm 1). This density threshold can directly impact the connectivity and cluster output. Thus, consideration should be taken of the voxel size and density prior to defining this limit.

Classification of the Segmented Voxels
All segmented voxels are classified following the seeded region growing segmentation, using the SciKit, K-means clustering python machine learning library (Pedregosa et al., 2011). This process assigns and defines regions to specific classes based on different criteria. The K-means clustering is an unsupervised algorithm to identify a set of clusters (objects) in a dataset based on their similarities. These clusters of objects are loosely defined as a group based on their computed distances to defined centroids. The expected result from the K-means clustering algorithm is three different canopy height levels: a) heath shrubs, b) sparse trees and mallee, c) tall trees.

RESULTS
For this study lidar point clouds collected over Kelly Hills Cave, Kangaroo Island, South Australia are utilised. Kangaroo Island is 4,405 km² in area and is dominated by heath and mallee vegetation. The Kelly Hills Cave lidar data is collected and supplied by Airborne Research Australia (ARA) (Airborne Research Australia, 2020). The scan for this region was carried out following the 2019-2020 bushfires. The data includes hyperspectral data in addition to geometric and radiometric attributes (xyz, intensity, Echo Number and Number of Echoes). The point clouds are discrete return point clouds, and the data is compressed and supplied in laz format, a compressed version of las. The data covers 1200 Hectare (ha) and is split in 6 strips. The strip used for this study is 190 ha in area with a point density of 120 points per square meter (ppsm) (Figure 3).

Point Cloud Processing
Point cloud density is an important attribute to consider for the three-dimensional modelling of heathland. ALS or MLS point clouds, point cloud density of >100ppsm, are required for heathland modelling. Low point density could affect the vegetation vertical complexity and the accuracy of heathland parameters. A high point density increases the overall ground return likelihood and the level of classification and estimation of vegetation structure (Hamraz et al., 2017). To analyse the penetration of laser echoes through vegetation a quantitative analysis using the output of the range between the all-echo density and the last echo density was produced in OPALS. The output from this analysis displays the point counts between 0.8 -3.2 ( Figure 4). It represents a greater point density in the vegetated regions while the flatter regions have a lower point density. Therefore, this is a representation of the discontinuous vegetation pattern of heathland.  (Korzeniowska et al., 2014). The most important purpose of the DTM is analysis of the terrain along with the normalisation of the height for generation a CHM. For generating the DTM the hierarchic robust filtering technique (Pfeifer et al., 1998) using the the moving plane interpolation is applied. The DTM is calculated from the last echoes with a grid resolution of 0.5 m, based on the 12 nearest neighbouring points. The maximum search distance is set to a large value of 1.25 m in order to fill gaps in the data.

Above Ground Vegetation Classification and Terrain Removal:
The next step is generating the CHM. The CHM is the height or distance between the ground and the top of the trees. The results from the CHM represent many above ground surfaces between 2-4 m in height. However, a clear peak around 20 m was also identified that represented tall trees.
After generating the CHM, the echo ratio is applied. A good recommendation for selecting the best fitting search radius, for applying the echo ratio, is to double the average point spacing (Höfle et al., 2009). As explained by Höfle et al. (2009) this guarantees a good representation of neighbourhoods and avoids a large neighbourhood selection, which can result in expanded transition zones at the border of two objects with different surface structure. A search radius of 1 m is applied. This guarantees a good representative number of neighbours, while avoiding too large of a neighbourhood, that can cause expanded transition zones at the border of two objects with different surface structure. The overlay results of the echo ratio output over the aerial imagery displays a good representation of trees and shrubs with the exclusion of artificial objects (Figure5). A comparison of the echo ratio output and the ratio of all-echoes over the last echo output (Figure 4) represents matching vegetation coverages. This is specifically noticeable on the sides of the road and the dirt road.

Voxel-based Vegetation Modelling:
The first step as part of the voxelisation phase is to compute a bounding box for the point clouds followed by setting the voxel size of 0.4 m which represents the vegetation more realistically. The result is a dense voxel array composed of empty and occupied voxels. To decrease storage space and processing time, the empty voxels and the occupied voxels are separated with only the occupied voxels remaining. A total of 2844446 voxels out of 300 million remained with an average point cloud density of 4 points/dm 3 per occupied voxel. This is referred to as a sparse voxel array.

Segmentation:
Following, the voxelisation, seeded region growing is performed on the spare voxel array. In this study the region growing is only based on the voxel neighbourhood connectivity. Due to the low number of points per voxels, average of 4 points per voxel, a minimum threshold of zero points per voxel was set in the region growing algorithms. Voxel connectivity can be lost by applying a threshold for such a low number of point clouds per voxel which impacts the final segmentation, resulting in creating multiple clusters in a single tree.
A 26-voxel neighbourhood connectivity is set for the seeded region growing algorithm. The result of the segmentation creates a total of 2415 clusters. A change in the voxel neighbourhood connectivity can impact the shape and size of the resulting clusters.

Classification:
The different canopy layers are grouped based on their relative height from the DTM. The K-means clustering algorithm cluster is set to three and all other clusters are disregarded by the algorithm. Clusters of 2 m in height are the shrub layer. Clusters between 2 m and 12 m are the sparse trees and mallee layer. While clusters above 12 m in height are the tall trees and region where the heathland integrates into woodland. Figure 6 displays the classification output from the K-means clustering. The result is a two-dimensional export in QGIS (QGIS Development Team, 2009). The image displays the 2415 segmentation clusters classified into three layers. A total of 1330 segments were classified as shrubs. A total of 571 segmentation clusters were classified as sparse trees and mallee. While a total of 514 segments are displayed in the taller tree layer. Figure 6. Classification result of the heathland voxels into three categories: 1) shrubs (in green), sparse trees and mallee (in blue) and tall trees (in red).

CONCLUSION AND FUTURE WORK
In this paper an automated method based on a series of algorithms for the three-dimensional modelling of heathland and important heathland parameters was presented. A voxel-based approach using 0.4 m voxel size, 26 voxel neighbourhood connectivity, and zero-point density threshold is applied on all above ground vegetation point clouds. Furthermore, the data is classified into three vegetation layers using K-means clustering.
Future work is recommended to further improve the results from this paper using morphological operations to improve the segmentation cluster output (Gorte and Pfeifer, 2004). Morphological operations will be applied in the point cloud phase to increase the point cloud density. This is to improve the segmentation output of single objects. Currently, some single objects are segmented into multiple clusters. This is due to the lack of connectivity in the point clouds.
Subsequently, tests and further estimates on the voxel size and selected thresholds for segmentation and classification will be applied. The 0.4 m voxel size will be compared with larger voxel resolution for the representation of vegetation parameters. The segmentation and classification output from the larger voxel sizes will be compared using different point density thresholds in the segmentation algorithm and the transition of vegetations into neighbouring layers will be investigated. Additionally, different voxel neighbourhood connectivities will be tested on the data to examine different segmentation cluster outputs.