3D POINT ERRORS AND CHANGE DETECTION ACCURACY OF UNMANNED AERIAL VEHICLE LASER SCANNING DATA

Unmanned aerial vehicle laser scanning (ULS) has recently become available for operational mapping and monitoring (e.g. for forestry applications or erosion studies). It combines advantages of terrestrial and airborne laser scanning, but there is still little proof of ULS accuracy. For the detection and monitoring of small-magnitude surfaces changes with multitemporal point clouds, an estimate of the level of detection (LOD) is required. The LOD is a threshold applied on distance measurements to separate real surface change (e.g. due to erosion or deposition by geomorphic processes) from errors. This paper investigates key components of the error budget for two ULS point clouds acquired for erosion monitoring at a grassland site in the Alps. In addition to the registration error and effects of the local surface roughness, we assess the positional uncertainties of each point that result from laser footprint effects, which are a function of the scanning geometry (including range, incidence angle and beam divergence). By removing erroneous points with an increasingly stricter point error criterion, we illustrate that the positional point errors strongly affect the LOD and discuss how this type of error can be mitigated. Moreover, our experimental results with three different surface classes (bare earth and rock, buildings and grassland) show that the level of detection tends to be slightly better for areas with bare earth and rock than for grass-covered areas (due to their roughness). For all these surface types reliable distance measurements are possible with sub-decimetre levels of detection. * Corresponding author


INTRODUCTION
Today, laser scanning is an operational method in environmental monitoring and geomorphic applications (e.g. Höfle and Rutzinger, 2011;Telling et al., 2017;Hooke, 2020). The advent of unmanned aerial vehicle laser scanning (ULS) provides new possibilities in 3D change detection of geomorphological processes by overcoming difficulties of ground-based oblique-view mapping techniques, such as terrestrial laser scanning (TLS). The striking advantage of ULS is that it can acquire detailed data of larger areas with a minimum of flight strips and with better viewing angles, smaller ranges (and thus smaller footprint sizes), more homogeneous point distributions and less object shadow effects compared to TLS. For the detection and monitoring of small-magnitude surfaces changes with multitemporal ULS point clouds, an estimate of the level of detection (LOD) is required. This LOD can be applied as a threshold on distance measurements to separate systematic errors and noise from actual changes and, thus, obtain more reliable estimates for geomorphic process magnitudes and frequencies (e.g. Lane et al., 2003;Wheaton et al., 2010;Lague et al., 2013). The determination of LODs requires detailed knowledge of the error budgets of acquired data sets (Schär et al., 2007;Glennie, 2007). This paper demonstrates the estimation of an LOD for ULS point clouds for an Alpine grassland site, aiming to assess the suitability of ULS data for shallow erosion studies. The paper investigates several key components of the error budget. In addition to the registration error and effects of the local surface roughness, we consider the positional uncertainties of each point that result from laser footprint effects and thus are a function of the scanning geometry (including range, incidence angle and beam divergence; Schär et al., 2007;Fey and Wichmann, 2017). To the authors knowledge, LOD studies for ULS surveys do not yet exist, and thus the presented study provides important information for environmental monitoring tasks.
The paper is structured as follows. Section 2 gives a brief summary of related work, section 3 describes the test site characteristics and data sets used, and section 4 introduces the workflow and methods applied. Section 5 discusses the results, and section 6 concludes on applicability in geomorphological research and on possibilities for further improvement of LODs.

Erosion studies using laser scanning
In steep Alpine grasslands, shallow erosion is an abundant phenomenon, which results in a large number of relatively wellconstrained spots, where the bare earth (soil, unconsolidated sediments or bedrock) is exposed (Wiegand and Geitner, 2013). The development of such eroded areas can be initiated by different processes, such as trampling and grazing by cattle (Torresani et al., 2019), shallow landsliding (Cruden and Varnes, 1996), as well as abrasion and stripping of the vegetation and topsoil by snow gliding (Wiegand and Geitner, 2013;Höller et al., 2014).
Shallow eroded areas are monitored by laser scanning for morphological characterisation and enhanced insights to process dynamics (Mayr et al., 2019a), for assessing the spatial distribution of affected areas (Đomlija et al., 2019), for analysing their spatio-temporal occurrence (Zieher et al., 2016), and to estimate volumetric process magnitudes (Mayr et al., 2019b). Furthermore, laser scanning data and derivatives are inputs to erosion models (Schindewolf et al., 2016;Zieher et al., 2017). With typical process magnitudes in the decimetre-range (in terms of elevation change caused by erosion events), shallow erosion monitoring based on airborne laser scanning (ALS) is limited by ALS data accuracy and resolution (Zieher et al., 2016;Đomlija et al., 2019).

Unmanned aerial vehicle laser scanning
For a few years, survey-grade ULS systems have been in use for mapping and monitoring of forests (e.g. Brede et al., 2017;Wieser et al., 2017;Liang et al, 2019) or landslides and infrastructure . Due the high accuracy of laser point measurements combined with accurate positioning solutions (IMU/DGNSS), the geometric stability within individual ULS epochs is high (i.e. low internal distortions; see e.g. Mayr et al. 2019b). In many cases, a reasonable co-registration of multitemporal point clouds can be achieved via stable surface patches, which are either userdefined or automatically detected (e.g. Wujanz et al., 2016). This eliminates the need for targets or ground control points placed across the survey area. Besides reducing the workload in the field, this has advantages for applications in hazardous or inaccessible terrain (e.g. Pfeiffer et al., 2018).
The wide field-of-view of ULS-specific laser scanners (Sect. 3) results in a relatively large swath width and a potential for high strip overlap, but also in unfavourable scanning geometries for many points. Hence, a strong variation in point quality can be assumed, which is further amplified if the beam divergence of the scanner is comparatively large (e.g. 0.5 mrad in this study; Sect. 3), due to strong footprint effects (cf. Glennie, 2007;Schär et al., 2007;see Sect. 2.3). Thus, we adopt the idea of Schär et al (2007) and Fey and Wichmann (2017) to apply a qualitybased filtering to laser scanning point clouds, and we test how this affects the level of detection in the case of ULS.

3D Point cloud quality for change detection and deformation monitoring
Laser scanning based change detection and deformation analysis, e.g. in geomorphological studies, requires a thorough assessment of the elevation data quality. For example, Heritage and Hetherington (2007) discussed the problem of change detection analysis in fluvial geomorphology and limitations related to input digital elevation model resolution and quality. Prokop and Panholzer (2009) showed that knowledge of the data quality is necessary for the interpretation of erosion and deposition rates in a landslide area. Schär et al. (2007) presented a quality measure for ALS point clouds aiming at an enhancement for applications such as classification and digital terrain model generation. They proposed a quality indicator comprising errors due to the direct georeferencing of the laser beam, the measurement errors of the laser itself, local curvature, and scan geometry (i.e. range, incidence angle, and beam divergence). Lague et al. (2013) proposed a method to obtain a spatially variable estimate of the LOD with a certain confidence interval, considering registration errors and the local surface roughness as key error sources. Fey and Wichmann (2017) extended this approach by including the positional uncertainties due to laser footprint effects (Schär et al., 2007), incidence angle, and surface roughness. This allowed them to define an LOD for separating real changes at mountain slopes from noise and systematic measurement errors, enabling a more detailed and reliable slope deformation monitoring by long-range TLS. Kromer et al. (2017) monitored a landslide using permanent TLS. They introduced a spatio-temporal LOD, which considers system geometry, point density, atmospheric effects and surface roughness, and also includes the frequency of available scans. Anders et al. (2019) investigated dynamics of a sandy beach by analysing high-frequency long-range TLS time series. They defined the LOD after Kromer et al. (2017) by considering TLS system measurement accuracy, influence of scan geometry, registration and georeferencing errors, surface roughness, and atmospheric parameters. They analysed the LOD spatially and in time, and they discussed how scans with a high temporal frequency can improve the LOD.

TEST SITE AND DATA
The data used in our experiment was acquired at a test site in the Italian Alps (Villnöß Valley, Autonomous Province of Bozen -South Tyrol), with elevations between approx. 2100 m and 2450 m a.s.l. (Fig. 1). Most of this area around the summit of Zendleser Kofel (2423 m) is covered by grassland, which is partly used as meadow or pasture for cattle and partly abandoned today. Trees and shrubs, in contrast, are relatively scarce. The area contains several hiking trails and a gravel road leading to the buildings of a guesthouse for hikers. Bare rock is exposed at several small cliffs and boulders, which are scattered across the test site. Moreover, shallow eroded areas (Wiegand and Geitner, 2013) are an abundant phenomenon, resulting from different processes (Sect. 2.1). Within an erosion monitoring project (http://erodyn. mountainresearch.at/), two ULS flight campaigns with a Riegl RiCopter (Riegl LMS, 2019) were conducted in summer 2018 and 2019, respectively. This octocopter-type UAV carries a Riegl VUX-1LR laser scanner (Riegl LMS, 2019), as well as an Applanix AP20 inertial measurement unit (IMU) integrated with a differential global navigation satellite system (DGNSS) receiver (Applanix, 2019). A major requirement was to cover the project test site (approx. 48 ha) with three sets of batteries. Accordingly, the flight plan comprised three flights per campaign, each consisting of strips in different elevations above sea level to account for the complex terrain with > 300 m elevation difference. The average flying altitude was approximately 70 m and the planned flight speed was 8 m/s. The laser scanner was configured to a pulse repetition rate (PRR) of 820 kHz and an angular scan resolution of 0.0496°. The beam divergence of the VUX-1LR laser scanner is 0.5 mrad, which results in a beam cross section diameter (gaussian beam definition) of 50 mm at 100 m scan range (Riegl LMS, 2019).
The IMU/GNSS data for the recorded trajectories was postprocessed with Applanix PosPac (Applanix, 2019) using data from a permanent base station in the area (STPOS, 2019). Subsequent processing steps included point cloud extraction, georeferencing and strip adjustment with dedicated Riegl software packages (Riegl LMS, 2019). The resulting point clouds from the two campaigns (2018 and 2019) contain approx. 500 Mio points each and a mean point density of 760 (± 374) pts/m² (2018 point cloud). A rectangular subset of the test site was selected for the investigation presented in this paper ( Fig. 1).

METHODS
Using the two point cloud epochs from 2018 and 2019, we first determined some of the main factors contributing to the total error budget of point cloud distance measurement (deformation monitoring). Subsequently, we investigated the magnitude and distribution of the level of detection. Furthermore, we modelled point errors (Schär et al., 2007) to filter the point clouds used for distance calculation and analysed how a removal of erroneous points affects the LOD. These analyses were done for (i) the rectangular test site subset (Fig. 1, Fig. 2) and (ii) for manually selected sample areas of different surface classes (landcover types; i.e. bare earth and rock, buildings and grassland; Fig. 2). All processing steps presented in the following were performed using SAGA GIS (Conrad et al., 2015) with the LIS extension (Laserdata GmbH, 2020) and Python scripting (Python Software Foundation, 2020).

Registration of epochs
Point cloud (co-)registration and registration accuracy assessment was performed using small subsets of the two point clouds from 2018 and 2019 at circular surface patches (with 3 m diameter). These patches were selectively placed on vegetationfree areas (such as rock cliffs, boulders etc.), which are spread across the test site. Based on field observations (i.e. lack of visible geomorphological activity) and the local terrain context, these surface patches were assumed to be stable.
The registration by direct georeferencing, based on the sensor trajectories and orientations, was improved by using a set of five surface patches for iterative closest point (ICP) adjustment (Besl and McKay, 1992). A second set of 18 validation patches was used for estimating the registration error (reg) as the sum of (i) unsigned mean and (ii) standard deviation of the 3D point cloud distance. This distance was calculated according to Lague et al. (2013) and Fey and Wichmann (2017).

Quality-based point cloud filtering
In the next step, we considered the positional uncertainties for each point that result from laser footprint effects and, thus, are a function of the scanning geometry (including range, incidence angle and beam divergence) as an important criterion for point quality (Schär et al., 2007). First, we modelled these positional uncertainties, following the approach of Schär et al. (2007). Each footprint is formed by the intersection of the laser beam with the local surface and, thus, the point error depends on (i) the beam cross section at the given measurement range and (ii) the incidence angle of the laser beam on the local surface. The cross section of the laser beam is estimated from the (scannerto-surface) measurement range and the beam divergence (0.5 mrad for the scanner used).
The measurement range and the beam direction were calculated for each point, using its corresponding scanner position in the trajectory (identified via a unique time stamp). As a description of the local tangent plane, the local normal vector for each point was calculated by fitting planes to point sets in a spherical neighbourhood of 0.1 m radius. From this normal vector and the laser beam direction, the incidence angle of the beam was determined. Based on this incidence angle and the beam cross section, the laser footprint was then modelled as an ellipse in 3D. This ellipse was decomposed into its maximum extensions in horizontal and vertical direction to estimate the horizontal error, the vertical error and the total 3D point error, according to Schär et al. (2007) and Fey and Wichmann (2017).
Subsequently, we used this modelled total point error pe (i.e. positional uncertainty of each point) for a quality-based filtering of the point clouds. We first analysed the magnitude and frequency of point errors in both point cloud epochs for the test site subset (Fig. 3) to select a range of thresholds for filtering (tpe [m] = {0.20, 0.10, 0.08, 0.06, 0.04, 0.02, 0.01}). Different versions of the point cloud epochs were created, where all points with pe > tpe were removed. The intention for this step was to tackle the large errors first and to investigate how a gradually more restrictive filtering can further improve the level of detection (Sect. 4.3) but also how this affects the completeness of the point clouds for further analysis (such as distance calculation).

Level of detection for point cloud distances
In the context of distance calculation between point clouds, it is important to know the minimum magnitude of geometric surface change (i.e. deformation or erosion/deposition) that can be distinguished from point cloud errors. For this purpose, Lague et al. (2013) developed a method to determine the local level of detection at the 95% confidence interval (LOD95), which considers registration errors and effects of the surface roughness on the distance calculation. Calculated distance values with a magnitude > LOD95 are then regarded as "real" distance (i.e. difference of the surface geometry), while the others are considered as unreliable. Fey and Wichmann (2017) extended this approach to account also for those uncertainties of each point that are due to footprint effects on the positional accuracy (point errors, Sect. 4.2).
Based on the registration accuracy assessment with stable surface patches, a fixed value for the registration error reg (Sect. 4.1 and Sect. 5.1) was used for all points. We applied the implementation by Fey and Wichmann (2017) of the Multiscale Model to Model Cloud Comparison (M3C2) algorithm for point cloud distance calculation (Lague et al., 2013). Within the M3C2 method the local surface roughness and the point measurement noise affect the results, because the distance is calculated between planes fitted (least-squares) to the neighbourhoods of corresponding point pairs in the two epochs. Consequently, the LOD95 calculation includes the standard deviations of the point sets from these fitted planes. For our experiment, we defined a spherical neighbourhood with a radius of 0.15 m for querying these point sets. To get insights into surface roughness or noise effects (Sect. 5.4), we also computed the total standard deviation of the planes fitted in both point cloud epochs, calculated as the square root of the summed variances. Using the method of Fey and Wichmann (2017), the LOD with a confidence interval of 95% was calculated for two point cloud epochs A and B as 95 = ±1.96 (√( 2 + 2 ) + + where σ = plane fitting standard deviation, n = number of points in the plane fitting neighbourhood, reg = registration error, pe = point error.

Registration
Based on the assessment with the validation patches, the mean registration error with direct georeferencing was estimated as reg = 0.066 m (0.043 m ± 0.023 m) for the entire test site. After fine-registration by (global) ICP adjustment on the five registration patches, the registration error was reduced to reg = 0.018 m (0.002 m ± 0.016 m), again as assessed by the validation patches. Accordingly, we use reg = 0.018 m as input for estimating the LOD95 (Sect. 4.3). Fig. 3 shows the magnitude of 3D point errors for the test site subset (see Fig. 1) that are related to footprint effects, i.e. they depend on the range and incidence angle of the individual point's laser measurement (Sect. 4.1). Note that points with pe > 0.20 m are already excluded here. In both point cloud epochs, the majority of these point errors is small, but existing larger errors can severely compromise a change analysis locally. The distribution of 3D point error magnitudes in the samples is very similar for the class bare earth and rock and for the class grassland (Fig. 4). In the buildings samples, however, there are (relatively) more points with larger errors. We attribute this phenomenon to the situation that the few buildings are all located in the same peripheral part of the test site ( Fig. 1) and that, due to the projects' erosion monitoring scope, the flight pattern was not planned to cover this area very well. Hence, a less favourable scanning geometry is assumed to have negative impacts on the point accuracies here.

Level of detection
The computed level of detection (LOD95) for the least restrictive filtering threshold (tpe = 0.20 m) is very variable throughout the point cloud samples of all three classes (Fig. 5), with minima as low as 0.05 m to 0.06 m and maxima of up to approximately 0.25 m. For approximately 25% of the distance calculations the LOD95 exceeds 0.15 m (third quartile).
In our experiment, the quality-based point cloud filtering with stricter thresholds tpe improved the median level of detection (LOD95) for all three investigated classes by several centimetres, as shown by the boxplots in Fig. 5. The effect on the upper quartile of the distribution (i.e. above the upper whiskers in the plot) was even more pronounced, thus also reducing the variability of the LOD strongly. These effects are also illustrated exemplarily for two and three different thresholds in Tab. 1 and Fig. 6, respectively.
The improvements in LOD by quality-based filtering, however, came at the cost of a decreased point number and point density. The reduction of points by excluding those with pe > tpe is amplified by the circumstance that for a valid calculation of the distance and the LOD certain requirements must be fulfilled (i.e. (i) enough points for plane fitting within the search radius and (ii) a corresponding point in point cloud B (the 2018 epoch) within the search cylinder; see Fey and Wichmann (2017) for details regarding the distance calculation). Therefore, the number of points with a valid distance and LOD was reduced considerably by the quality-based filtering (Tab.1). In some areas of the test site, there will still be enough points for an analysis and interpretation of surface changes, while in other areas the point cloud became strikingly incomplete and dominated by gaps (Fig. 6). We interpret these spatial patterns as being governed largely by the flight plan. Optimizing the flight plan towards higher point density and lower point error by simply reducing the strip spacing, however, would reduce the productivity of ULS data acquisition (in terms of areal coverage in the given time). As data gaps are clearly increasing with stricter point error filtering (Fig. 6), the point error threshold must be chosen carefully. Future studies could implement a locally adaptive threshold, which considers the local point error distribution and point density.

Surface roughness effects
For an enhanced understanding of the level of detection (Sect. 5.3), this section takes a look at the plane fitting standard deviation (Sect. 4.3) and briefly discusses its contribution to the total error budget and the LOD, respectively. The standard deviation of locally fitted planes (required for M3C2 distance calculation; Lague et al., 2013) can be affected by (i) point cloud noise due to point errors and (ii) surface roughness. For visualization, the plane fitting standard deviation of qualityfiltered point clouds (tpe = 0.04 m) was rasterized as the arithmetic mean of the point values per 0.1 m cell (Fig. 7). Here, most areas with bare earth (such as the gravel road, hiking path or eroded areas) are characterized by low standard deviations. In comparison, grassland areas tend to have higher standard deviations, unless the grass is mown (e.g. in the area west and northwest of the buildings). These spatial patterns are partly reflected in the LOD maps (Fig. 6). For the samples, the tendency of higher standard deviations in grassland (compared to bare earth and rock) persists across all filtered point cloud versions in our experiment (Fig. 8), indicating that there is an effect of real surface roughness (not noise). However, Fig. 8 shows also a reduction of the plane fitting standard deviations by the quality-based filtering, which is most notable for the buildings class (tin roofs). This reflects the removal of noise (due to point errors) as well as the planarity of these artificial surfaces, and it highlights that the quality-based filtering can indirectly enhance the LOD (in addition to its direct effect by improving the point cloud accuracy).

Erosion monitoring example
To illustrate the applicability of the methods for shallow erosion monitoring, a small subset of the test site is shown in Fig. 9, with 3D point cloud distances representing surface change by secondary erosion. Between the two scans (2018 and 2019), the majority of existing eroded areas in the subset remained stable, but at one eroded area a clod of turf and soil was dislocated from the scarp and deposited in several smaller pieces approximately 40 m downslope (horizontal distance). For all points in this subset with distances exceeding the LOD95 (i.e. "real change"), the LOD95 ranges between 0.063 m and 0.358 m, with a mean of 0.080 m (± 0.017 m). The absolute (i.e. unsigned) 3D distances are ranging between 0.065 m and 0.761 m, with a mean of 0.196 m (± 0.126 m).
For cleanly visualizing this example, spurious effects of vegetation change on the deformation measurement were largely filtered out by deleting small point groups, i.e. points with less than 25 points in their spherical neighbourhood (with 0.2 m radius). Distances are signed to indicate if the new surface is below or behind (-) and above or in front of (+) the old surface, respectively. We conclude that the topographic change by this secondary erosion process with its decimetrescale event magnitude is well captured by the ULS-based distance measurement.

CONCLUSIONS
The presented study shows that sub-decimetre 3D change detection by multitemporal unmanned aerial vehicle laser scanning (ULS) is feasible over areas with tens hectares and with complex terrain. As expected (due to the different roughness of these surface types), the level of detection (LOD) tends to be slightly better for areas with bare earth and rock than for grass-covered areas. For shallow erosion monitoring, these LODs confirm that ULS point clouds are suitable to reliably detect new eroded areas, where the magnitude of change events typically is around 0.2 m (mean eroded area depth; Wiegand and Geitner, 2013). In addition, these findings indicate that also secondary erosion events (see e.g. Fig. 9) and tendencies over multiple years, such as retrogressive erosion or a reactivation of the movement at shallow landslides (cf. Mayr et al., 2019a) can be monitored by ULS.
However, our results show also that, if centimetre-level accuracy of the change detection is required, this calls for an assessment and filtering of point errors related to the scanning geometry (i.e. incidence angle and range) to improve the quality of the ULS point clouds. Such a quality-based filtering of the point clouds, however, can result in areas with low point density or even data gaps, depending on the terrain, flight plan, and sensor configuration. Consequently, filtering thresholds must be chosen carefully and should consider or adapt to local variations of the point density and quality, to maintain point clouds that are consistently complete and accurate across the area of interest.
To optimize both point cloud quality and completeness, the scanning geometry and the required LOD should be considered already during flight planning. Future work could integrate these aspects into dedicated flight planning and point cloud simulation tools (cf. Bechtold andHöfle, 2016, Bremer et al., 2019).