AUTOMATED UAV LIDAR STRIP ALIGNMENT IN FORESTED AREAS USING DENSITY- BASED CANOPY CLUSTERING

Recently, LiDAR point cloud data acquired by Unmanned Aerial Vehicles (UAVs) are used in many scientific disciplines and like the former photogrammetric techniques these data are usually collected in overlapping strips. Generation of comprehensive models of the scanned areas requires these strips to be aligned together which is a challenging process due to the multi sensor scanning system including the scanning sensor, the GNSS receiver and the IMU sensor. The main errors result from the inaccurate GNSS locations and flight path shifts as well as failure of the GNSS signals in complex urban or forest environments. For that reasons, the development of an automatic feature-dependant method in urban areas or individual tree-based in forest areas where there are no distinct features for strip adjustment in these environments become a must. This research work focuses on automated co-registration/alignment multiple point cloud strips of forested areas acquired from UAV LiDAR (or referred to as ULS) lack of artificial ground control. The main limitations of ULS data of forests are the relatively low sampling density of near ground areas and stem nullity due to the top-view scanning mode of ULS. To obviate this, this work explicates the tree crowns shape to identify the key points required for co-registration by applying a density based clustering algorithm (DBSCAN) to the tree crowns and models resulting clusters with Gaussian mixture models by learning the best parameters using maximum likelihood estimation to define the key points. A feature vector is assigned to each point by quantifying its angular and linear relationship with respect to the local system origin. Next, the similarity score matrix is computed by a fixed geometric relationship between the distance and angle similarity. Then, the maximum weight matching problem is solved for the similarity score to gain point-to-point correspondence. Finally, the optimal 2D rigid transformation parameters (one rotation and two translations without scale factor ) are obtained using permutations to try out for all possible paired combinations and count the number of inlier points satisfying a tolerance of planimetric deviation after alignment within a user defined threshold. The results of two test forest plots with different tree species and ULS point densities show a mean planimetric enhancement from 1.79 m to 0.22 ± 0.13 m for plot one and from 2.33 ± 0.53 m to 0.61 ± 0.21 m.


INTRODUCTION
Forest monitoring and mapping are currently very vital because of their ecosystem's role in environment and economy. One of the most valuable techniques for forest mapping is Light Detection And Ranging (LiDAR) which exploit the rage measurements to the surroundings to calculate the 3D positions of the targets with extreme dense point cloud and high measuring accuracy. LiDAR platforms include air-based and ground-based. The quick developments in hardware manufacturing resulted in small, lightweight platforms like Unmanned Aerial Vehicle Laser Scanner (ULS) for air-based systems and Backpack Laser Scanner (BLS) in case of ground-based systems. Both ULS and BLS has significantly contributed to forest applications ( Polewski et al., 2019;Wallace et al., 2016). Lately, developments in small-scale technology offered a Hand-held Laser Scanning system (Bosse et al., 2012) with light weight and suitable for many mapping and localization applications. Each of these systems has its advantages and limitations. Unmanned Aerial Vehicles (UAVs) present a distinctive combination of very high-resolution data collection at a relatively lower survey charge. In forest it may be used for assessments of woodlots, fires investigation, vegetation monitoring, species detection, volume calculation, along with silviculture (Nex and Remondino, 2014). Forest structure, including spreading and physical appearance of trees, branches, The point cloud data from any LiDAR system are affected by different errors (especially random and systematic errors) from the system configuration and measurement. Strip adjustment aims at minimizing the effects of systematic errors to maintain qualified results when dealing with multiple strips of point clouds. A proper calibration of strips requires accurate Position and Orientation (POS) measurements of GNSS/IMU. In addition, the establishment of control points (artificial targets) is a time-consuming process and of high cost rather than the lack of GNSS signals in GPS-denied environments (urban canyon or dense forests). To overcome these difficulties, several approaches used the scanned features as alternatives for the ground control points to perform an automatic or object-based strip alignment of point cloud data using the wellknown Iterative Closest Point (ICP) algorithm ICP would be more suitable for urban point clouds than forest areas after coarse registration is completed (Xu et al., 2019), since it is lack of adequate variety of objects in forests which make the registration process more difficult. ICP approach performs a fine registration of an overlapping pair of point clouds by iteratively approximating the transformation parameters, presuming proper a priori configuration is provided.
One of the common features of ULS and other airborne photogrammetric data acquisition techniques that data are collected in adjacent strips to ensure full coverage of area of interest. According to this acquisition nature and the multisensory composition of the system some related errors constantly occur (e.g. trajectory errors, POS systematic errors and Misalignment and gyro drift of the INS (Filin, 2013). The occurrence of such errors is observed by visualizing the (figure 1) inconsistencies between the neighbouring strips. From this prospective, an adjustment process is necessary to reduce or eliminate these errors. Strip adjustment is the process in which a couple of scans, or more are to be truly geometrically configured. It requires the original observations (GPS, IMU and the laser measurements), which are not generally offered to the end-user (Habib et al., 2008). The chief purpose of strip alignment is the simultaneous optimization of relative and absolute orientation of the acquired point cloud data strips. Combining point cloud data from numerous, co-registered laser scans from the same or several platforms is a well-known preprocessing stage to overcome the restrictions of data obscuration due to laser occlusion. UAVs offer high-quality point clouds of reasonably small areas (few hectares). They have a wide field of view and ensure point cloud data with less occlusions (Glira et al., 2015). The significance of lidar strip adjustment affects many phases of forest inventory simply because it guarantees the thorough coverage, which in turn provides sufficient information on the canopy and tree structure (Wallace et al., 2016), biomass estimation (Cao et al., 2016).
This paper is to align pairwise UAV lidar strips in a forested area using the detected key points by fitting Gaussian mixture models to individual tree crowns assembled by Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. The corresponding point pairs are identified by applying graph matching to the matching score matrix obtained from similarity measures of feature descriptors formed by the distance and angular metrics. After obtaining a correspondence set, the 2D rigid transformation parameters are calculated by permuting over all the possible combinations of matched pairs to reach the optimal transformation. The transformation parameters include one rotation angle and two translations while no scale factor because the two strips (datasets) are acquired by the same lidar sensor and platform.
In addition, no z-axis alignment is required as the ULS provides georeferenced data due to the GPS-IMU integration with no significant z-axis misalignment (figure 1 -bottom row left, which represent a profile of the two strips before alignment).
The main contributions of this work are represented in: Providing proper key points by estimating the best parameters of a normal Gaussian distribution fitted to the density-based clustered tree crowns to overcome the inaccuracy of tree locations in broadleaf forests where the treetop is ambiguous. Assuming fixed inner geometry of the forest structure, the feature descriptor is formed based on a combination of angular and linear relationships between each key point and the system origin (mean of all point coordinates) Permutations on the correspondence set to gain the optimal rigid transformation parameters The rest of this paper is organized as follows, section 2 contains the related works, section 3 the research methodology, section 4 describes the study area and the datasets, section 6 shows the research results with discussion and finally section 7 for the conclusion.

RELATED WORK
The co-registration of point cloud data has a vital role in many applications like 3D construction, urban mapping and forest biomass inventory. So, there are many approaches in literature regarding this topic. Theiler et al. (2014) performed the coregistration of TLS point cloud without any artificial markers placed in the scene. They used 4-Points Congruent Sets (4PCS) to match and align the point cloud utilizing methods from philosophies from image processing and computational geometry. Yang et al., (2016) proposed a method to co-register point cloud data from ALS and TLS depending on the extracted building outlines. These outlines were matched by applying geometric constraints on the features. Then, a correlation coefficient was computed of all geometric features by decomposing Laplacian matrices into the spectral space and find the coarse registration while the fine registration was performed via a multi-line adjustment approach.
Figure 1. The planimetric discrepancy between two UAV strips Kelbe et al. (2017) used TLS data and proposed a method for creating view-invariant feature descriptors that are fundamental to the point cloud data and allow blind marker-free registration in forest environments. Polewski et al. (2016) considered a similarity function based on the planimetric and vertical distances among the tree positions in the same plot to automatically register ALS/TLS data. The feature descriptors are compared to each other to obtain the matching score using a model of biological sequence matching and one-to-one correspondence is provided by graph matching while they use heuristic search to retrieve the optimal tie points set for optimal transformation. This study is then applied to co-register BLS and ULS datasets with updated similarity function and transformation model in (Polewski et al., 2019) using simulated annealing to calculate 3D transformation parameters. A probabilistic method is developed by Dai et al., (2019) to coregister point cloud data from ALS and TLS by extracting the modes of tree crowns and identifying the correspondence by probability density function. Guan et al. 2019) utilized the tree locations and common fixed forest geometry to register UAV, Backpack and TLS based on TIN matching and then applied ICP for fine registration.

METHODOLOGY
The research methodology three main stages: First, a densitybased clustering is performed on the point cloud to extract the tree crowns as point clusters. Then these clusters are modelled using mixture models to define the virtual key points which will be used in the next stage of adjustment. If these points hold fixed linear and angular relationship with respect to the origin of the Area of Interest (AOI) which is simply calculated as the mean of all keypoints coordinates, each point is defined by a feature descriptor with its distance and angle to the origin. Second, two similarity measures are used to calculate the cost matrix between the descriptors. Third, the maximum weight matching problem is solved by applying Hungarian algorithm to identify the correspondence pair list of the two scans. The optimal transformation parameters are obtained by permutation of all possible pair combinations.

DBSCAN Clustering
The unsupervised clustering approach Density-Based Spatial Clustering of Applications with Noise (DBSCAN) named in (Ester et al., 1996) is used to derive the tie points required for the strip adjustment. The algorithm returns several clusters by distinguishing the dense points zones separated by lower or blank areas otherwise the point is classified as noise. The main advantages of DBSCAN are identifying arbitrary shape clusters, robustness to noise and limited inputs. On the other hand, the key problem is its high sensitivity to the input parameters. The algorithm picks a random point and counts the nearby points within the search radius (Eps). If the number of neighbors is within the minimum number of points (MinPts) to form a cluster, the algorithm continues to search from new data point until no data points exist within the neighboring radius or the less than the MinPts. Then, a new cluster will be discovered. Otherwise, point is considered as outlier.

Gaussian Mixture Modelling and Parameters Estimation
The Gaussian distribution is used to fit the individual clusters where the model parameters (i.e. mean and covariance) are optimally estimated by Maximum Likelihood Estimation (MLE). A Gaussian Mixture Model (GMM) acts as a linear weighted sum of K Gaussian densities: (1) In which, is the k th mixture coefficient and, is the k th Gaussian density with mean and covariance matrix and dimensionality D.
The likelihood function ( ) of the multivariate Gaussian distribution (Bishop, 2006) for a D -dimensional vector is: ( 3 ) Therefore, for MLE of a Gaussian model, the best estimates (figure 3) of the parameters θ = [μ, Σ] can be obtained by taking the logarithm of the likelihood function then set the partial derivative and solve for θ.
( 4 ) Figure 3. The sampled tree crown with extracted the keypoint (red circle)

Feature Descriptor Formation
The output of clustering and modelling will be a pair of point sets representing the two datasets. Each point in named dataset will be described by two main components (figure 4): the angle and the radial planimetric distance to system centre (i = 1 : N points, j = 1,2). Each descriptor will be a vector of length two [ ] with DH indicating the distance to the system origin and β representing the angle with respect to the same origin.

Similarity Measure
For each single feature vector (Desi) in the model dataset, two modes of similarity are computed to all feature vectors of the target dataset. The angle similarity measure (Zhou et al., 2004) is used to define the affinity (S ) between two angles ( 1, 2) by the defining the angle difference (Δ( ) as a Gaussian distribution: The distance proximity matrix is determined by: The total similarity score matrix of dimension STotal (P1, P2) can be calculated by subtracting the two measures (9)

Feature Matching
The point-to point correspondence is identified by solving the maximum weight function (figure 5) using the Kuhn-Munkres algorithm. A weighted graph is established with a set of vertices (P1) from the first dataset and edges (P2) describing the points in the second dataset. The points count in both datasets usually differ, so pseudo vertices are added to the minor group. The matching pairs are obtained by maximizing the total similarity score matrix (STotal).

Optimal Transformation Parameters
From the graph matching, two sets of common horizontal coordinates (xi, yi) and (xj, yj) of tree locations were obtained. The two sets are aligned by performing rigid transformation using the method of rigid transformation (Kabsch, 1976). First, the centroids (c1, c2) of the two sets were determined, and then the covariance matrix C was computed. If VSW T be the singular value decomposition of C. Then, the optimum rotation Ro and translation Txy.
T xy = -Rc1 + c2 (11) The method searches for the optimal rigid transformation parameters by permutation without replacement to seek all conceivable combinations. The total attempts count will be with N the total number of matched pairs and M = 2 (least number of pairs to solve for. For each iteration, the planimetric difference before and after alignment is computed and the total number of points within the threshold (dthr) is recorded. The optimal pair is the one with the highest number of matched points. Hence, the best estimate of transformation is obtained and applied to the whole point set.

Vertical Offset Calculation
After identifying the optimal transformation, the vertical offset between the two strips can be estimated by comparing the elevations of the matched point set projected onto a DTM. A unique value (the median) is calculated for the vertical differences at all matched points and used to align the pair of strips vertically.
The total square area of the forest park is approximately 2235 ha including different tree groups. The basic tree species are, Dawn redwood (Metasequoia glyptostroboides), Poplar (Populus deltoids) and Ginkgo (Ginkgo biloba) (Cao et al., 2016). Two square test plots (table 2) with 50 m side length have been chosen for the adjustment process based on forest map with different tree species and stem density. The two plots were chosen to represent different characteristics. On hand, the test plot I has coniferous dawn redwood tree species with relatively high stem density (417 per hectare) which are regularly planted with average tree age 29 years. On the other hand, test plot II includes broadleaf poplar trees with around half stem density of the test plot I (208 per hectare) with irregular distribution and 22 years average tree age.

Data Collection
The strips of point cloud data were acquired using a GreenValley (GreenValley, 2018) UAV LiDAR System of the forest stands.
The UAV platform used for data collection consists of a lidar sensor (Velodyne Puck VLP-16) and an onboard GNSS sensor (Novatel) integrated with IMU (Novatel SPAN-MEMS-IMU) to update the sensor location during the scanning process. The platform is controlled by an auto-pilot system to ensure flying on the trajectories for around 25 min. (Guo et al., 2017). Table 1 introduces the flying parameters and the sensor characteristics.

Parameter name Value
Flying altitude ( The swath width of UAV covers each strip individually while the study area is covered by the overlap between the two strips to guarantee 100 % coverage within the study area.

EXPERIMENTAL PARAMETER SET UP
The main experiments parameters are the epsilon (Eps) and the minimum points to form a cluster with DBSCAN. The Eps is set as the same for both plots while the minimum number of points is 110 and 50 for first and second plots respectively because both plots are acquired with the same sensor under the same flight conditions. The cloth simulation filter (Zhang et al., 2016) is used to classify the point cloud to ground and non-ground points. The height above ground to eliminate ground points and define tree crowns was set to 15 m for plot one and 25 m for the second plot.

RESULTS AND DISCUSSIONS
The methodology has been examined on the two mentioned test plots. The test plot I with coniferous trees and regular stem distribution achieves a matching percentage over 69 % while the percentage is calculated w.r.t the smaller number of points from both strips. (matching percentage = ). Figure 7 illustrates the alignment results of test plot I with solid circles indicating the matched points from each strip. While figure shows the extracted points of the same plot to visually demonstrate the improvement after applying the proposed methodology. The same outputs are shown for test plot II (figure 8) to describe the situation before and after adjustment. The test plot II has two chief differences from test plot I: broadleaf poplar tree species and the stem density. Consequently, the matching percentage is 23 % which clearly reveals the influence of tree species on the matching results accuracy. The alignment accuracy has been evaluated by comparing the planimetric differences between the matched points before and after adjustment for the two test plots. The validation criteria demonstrate an improvement of the planimetric deviation range from 1.33 m to 2.27 m before alignment to 0.03 m to 0.5 m after alignment for the test plot one (top row in figure 9). The same values were investigated for the test plot 2 (figure 9 bottom row). The range before alignment was from 1.27m to 3.1m while after were 0.21 m to 1 m. After horizontal alignment of strips, the vertical shift correction takes place. Figure 10 describes the result of planimetric and vertical alignment using a profile of the point cloud of the two test plots. It is visually noticed that for both plots, there is a parallelism between their ground level which indicates true alignment of their z-axes before and after the process of strip adjustment. The study area is a planted forest in which the trees within the same plot have the same characteristics (e.g. species, age, height, stem diameter etc.). Also, the stem distribution is regular over most test plots. The effect of these factors appears in the matching and adjustment results as seen from figures 7,8 and 9. On hand, figure 7 represents test plot I which includes dawn redwood tree species with dense regular distribution with high matching percentage. On the other hand, figure 8 illustrates the irregular stem distribution of test plot II with low matching percentage. Figure  9 plots the planimetric deviations at the matched points before and after alignment, it is shown that test plot I has better alignment accuracy (mean = 0.22 m) than test plot II (0.61 m). From all mentioned, the method can perform better in dawn redwood tree species with regular stem distribution over broadleaf tree poplar trees.

CONCLUSION
This work proposed a method to align multiple ULS scans in forested areas. The research contributions can be represented in extraction of virtual keypoints by fitting GMM to tree crowns obtained from density based clustering (DBSCAN), creating the feature descriptors by wrapping radial distance and angle from each keypoint to the calculated system origin, and permutations over all possible pair combinations for best estimates of the transformation parameters. The proposed method can overcome the limitations of tree localization in case of complex forests dominated by broadleaf tree species in which the local maxima of single trees is fuzzy to determine. Moreover, the method stipulates nothing other than the existence of canopy cover in datasets to be aligned and keypoints are assumed to be extracted with the same rule based on canopy clustering. Herein, the strategy is validated on two real-life forest test plots and obtained promising results after alignment showing significant corrections to the planimetric differences for matched points. For test plot I the mean difference improved from 1.79 m to 0.22 m while the same values for test plot II are 2.33 m, 0.61 m, respectively. The future work underlaying this methodology will be focused on the aerial-ground multi-platform point cloud fusion especially in cases where tree locations cannot be accurately determined.