AUTOMATIC EXTRACTION OF CONTROL POINTS FROM 3D LIDAR MOBILE MAPPING AND UAV IMAGERY FOR AERIAL TRIANGULATION

: Installing targets and measuring them as ground control points (GCPs) are time consuming and cost inefficient tasks in a UAV photo-grammetry project. This research aims to automatically extract GCPs from 3D LiDAR mobile mapping system (L-MMS) measurements and UAV imagery to perform aerial triangulation in a UAV photogrammetric network. The L-MMS allows to acquire 3D point clouds of an urban environment including floors and facades of buildings with an accuracy of a few centimetres. Integration of UAV imagery, as complementary information enables to reduce the acquisition time of measurement as well as increasing the automation level in a production line. Therefore, a higher quality measurements and more diverse products are obtained. This research hypothesises that the spatial accuracy of the L-MMS is higher than that of the UAV photogrammetric point clouds. The tie points are extracted from the UAV imagery based on the well-known SIFT method, and then matched. The structure from motion (SfM) algorithm is applied to estimate the 3D object coordinates of the matched tie points. Rigid registration is carried out between the point clouds obtained from the L-MMS and the SfM. For each tie point extracted from the SfM point clouds, their corresponding neighbouring points are selected from the L-MMS point clouds, and then a plane is fitted and then a tie point was projected on the plane, and this is how the Lidar-based control points (LCPs) are calculated. The re-projection error of the analyses carried out on a test data sets of the Glian area in Iran show a half pixel size accuracy standing for a few centimetres range accuracy.


INTRODUCTION
In recent decades, advances in laser scanning measurement systems have improved the quality, resolution, and data acquisition speed for acquiring 3D spatial information.Besides, satellite, aerial, and close-range imagery have been widely employed to obtain spectral and spatial information.Unmanned aerial vehicles (UAVs) or drones are alternative solutions for relatively large-scale mapping and monitoring tasks (Kume et al., 2015).The reason for the popularity of UAVs are regarded as: 1) costeffective measurement techniques, 2) high-speed data transformation, 3) ease of use, 4) flexibility, 5) high quality, and 6) highdensity products (Colomina and Molina, 2014).A digital metric camera mounted in UAVs allows for a reliable 3D reconstruction based on structure from motion (SfM) workflows (Nex and Remondino, 2014).However, the use of non-metric cameras in UAV platforms has become common due to the improvements of the calibration procedure and their relatively low-cost values.
Among point clouds production technologies such as optical and laser scanning systems, laser scanners directly provide 3D spatial data together with intensity/reflectivity values.Such measurement system emerges in terrestrial laser scanners (TLS), airborne laser scanners (ALS), handheld or backpack laser scanners, or LiDAR mounted , e.g., in UAV or mobile mapping system (Chen et al., 2019).
UAV imagery and LiDAR point clouds data are complementary to each other, and the weakness of the one is overcome by the strength of the another one.Therefore, their combination improves the accuracy, completeness, and efficiency of 3D reconstruction and modelling.Many studies have confirmed the importance of this data fusion for numerous applications such as 3D modelling (Remondino and El-Hakim, 2006), change detection (Islam et al., 2018), building extraction (Dornaika et al., 2016), object identification (Kent et al., 2015), 3D object modelling (Prieto et al., 2020), and 3D tree crown analysis (Leckie et al. 2003), and so on.
The registration/co-registration technique is an important step in integration of multi sensor data and/or multi temporal data.The registration of imagery and laser scanning point clouds have been carried out by numerous researchers, e.g., Omidalizarandi and Neumann (2015); Eslami and Saadatseresht (2021).Such data fusion enables to relate the spatial information.It is typically carried out using different features such as, e.g., points, lines, and surfaces obtained from both aforementioned data sets.For instance, it is possible to reconstruct 3D building models, e.g., LOD2 building model, from the 3D point clouds as proposed by Partovi et al. (2019) and register the aforementioned data sets.Pandey et al. (2012) applied mutual information (MI)-based registration methods to register close-range photogrammetric images and TLS data.Nonetheless, it has been demonstrated that the MI-based approach has less efficiency in natural areas compared to urban areas (Mastin et al., 2009).MI-based approaches, typically minimise a cost function consists of, e.g., statistical correlation function, grayscale similarity, or the sum of squared differences (Parmehr et al., 2014).Omidalizarandi and Neumann (2015) performed target-based registration of TLS and digital camera based on the space resection bundle adjustment and variance component estimation.In addition, they have compared the target-based and MI-based approaches which has been shown the lower accuracy of the latter than the former one.Omidalizarandi et al. (2019) developed three different robust adjustment procedures to obtain precise, reliable, accurate and robust estimation of the exterior orientation parameters (EOPs) between the TLS and camera.
To derive accurate 3D geospatial information from imagery, it is a vital step to estimates the camera's interior and exterior orientation parameters.Interior orientation parameters (IOPs) consist of principal distance (i.e.focal length), principal point coordinates, and lens tangential and radial distortions which are specified through a camera calibration procedure.EOPs define the translations and orientations of the camera at the point of exposure in a mapping frame, which can be estimated using Ground Control Points (GCPs) through an Aerial Triangulation (AT) process (Zhou et al., 2020).
AT is one of the most critical steps in aerial photogrammetry to estimate the tie object points coordinates (OPC), EOPs, and IOP which is performed through the bundle Adjustment (BA) procedure (Chen et al., 2019).BA plays a vital role in 3D reconstruction (SLAM and SfM).Observed 2D points cannot be directly compared with the projected 3D points on the cameras so BA problems can be modelled as a sum of squares.This kind of problem is called a non-linear least-square problem (Angla et al., 2020).
The GCPs are essential for the task of AT.Traditionally, they are obtained from field surveying (e.g.GPS (Global Positioning System), orthoimages, or topographic maps (Liu and Zhang, 2008).Although the former method may achieve higher accuracy for GCPs, but it is time-consuming and labour intensive.Similarly, generating DEMs by field surveying or topographical maps is not cost-effective.In a photogrammetric project, it is demanded to have a high-quality GCPs for AT processing (Liu et al., 2007).For this purpose, researchers use indirect georeferencing to estimate IOP, EOP, and OPC.Indirect georeferencing has traditionally been used to integrate photogrammetric and LiDAR data sets.In case of LiDAR data, geometric primitives such as points, lines, and/or surface are required to be extracted in additional post-processing step.This is due to reason that LiDAR data does not directly show such primitives (Costa et al., 2017).Delara et al. (2004) used a low-cost digital camera for AT to extract LiDAR-based Control Points (LCPs) from LiDAR intensity images.She used the term LCP instead of GCP for the first time in this article.In fact, LCP is the same as GCP, which were extracted from lidar and are used in aerial triangulation.With the difference that LCP can be on walls or buildings or anywhere and not just on the ground, that's why it is a better term.Jaw (1999) and Jaw and Wu (2006) extended the photogrammetric model by establishing a new relationship with planar surfaces.Habib et al. (2004) and Habib et al. (2005) presented an approach for aligning photogrammetric models to LiDAR reference frames by using linear features derived from LiDAR data.A 3D similarity transformation and linear features were used to register LiDAR with photogrammetric data.James et al. (2006) extracted reference ground control points manually for a photogrammetric model using high-resolution shaded liDAR DEMs.The quality of DEMs plays a vital role in successfully implementing the overhead research projects.Liu et al. (2007) presented a method to extract Lidar-base control Points from intensity images from LiDAR and use DEM for image orthorectification.Mitishita et al. (2008) proposed a method of detecting roof centroid from LiDAR point clouds, which was used in a bundle adjustment to find the exact position of the roof centroid.In Wildan et al. (2011), LCPs were used to execute the AT of a large photogrammetric block of analogue aerial photographs.Ju (2013) used features, intensity, and frequency-based methods to develop a hybrid method for robust LiDAR and optical imagery registration.Gneeniss (2014) examined how much and where LCPs should be distributed when performing AT for large photogrammetry blocks.This approach focused on point-feature extraction by the intersection of three building roof planes.Also, it presented a comprehensive review of different methods of extracting features.Li et al. (2015) presented a method for solving the problem of linking LiDAR and photogrammetric data in deserted areas without GCP using sand ridges as primitives.Costa et al. (2017), the method entails four main steps: filtering LiDAR points on the rooftops, determining the roof building planes, modelling the roof building planes, and intersecting the roof building planes.It begins with the identification of the residential areas, which must contain a minimum of three-plane roofs.Afterwards, LiDAR points on building roofs are categorised and mapped.As a final step, three roof planes of the same building intersect to calculate the 3D coordinates of the LCP.We used least-squares adjustment to model the building roof planes.
A primarily based method for registering objects is to identify and extract standard spatial features, such as points, lines, and planar patches.Control points were manually extracted from aerial liDAR point clouds by all authors.This is followed by determining the transformation parameters required to align the two data sets, usually based on a 3D conformal or 3D similarity transformation.Registration is normally done by using straight lines drawn by intersecting two planes or directly observing the liDAR surface to identify reference targets Wang and Tseng (2011).It is necessary to have many linear features with good spatial distribution in the photogrammetric block to achieve equivalent accuracy to standard control point patterns Mitishita et al. (2008).Other methods have used planes as standard features Brenner et al. (2008).Registering regular or irregular surfaces is also possible by interpolating both data sets, with registration performed by minimising vertical or Euclidean distances between the two planes (Akca, 2007).Registration quality highly depends on the process adopted, which can be classified as manual, semiautomatic, or automatic.This paper proposes an approach to automatically extract LCPs from the LiDAR mobile mapping system (L-MMS) and UAV imagery used in the AT.The proposed strategy consists of three main steps as follows: 1. Generate image-based sparse point clouds via a SfM strategy, 2. Iterative plane fitting approach for extraction of LCPs from L-MMS, 3. Refine IOPs of a camera, EOPs of block images, and estimating OPCs of tie points through a BA procedure.
Here, it is assumed that L-MMS data as a reference spatial information.In one hand, it is supposed that all EOPs and IOPs of image sensors are inaccurate and should be reliably estimated.These inaccurate EOPs and IOPs values are related to approximate sensor positions and non-metric camera parameters.On the other hand, not much attention has been paid to a handheld laser scanners used in L-MMS.In this study due to the above necessities, we investigate the benefits of extracting control points from L-MMS data for being used in a photogrammetric UAV network.
This paper is organised as follows.In Section 2, data acquisition is explained.Section 3 describes the methodology.Section 4 explains experiments and results in two different case studies.In the end, conclusions and outlook are presented.

UAV Imagery
Images of this study were taken by DJI Phantom 4pro with the aim of producing a two-dimensional (2D) cadastral map with a scale of 1/500 during flight operations.According to the required accuracy, the flight altitude is approximately 88 m to achieve a spatial resolution of 2.4 cm (DJI, 2017).

LiDAR Mobile Mapping
In addition to UAV aerial images, mobile mapping point clouds are captured using a laser scanner.The mobile mapping system is georeferenced by using GNSS and IMU.In areas with inaccurate GNSS signal, the georeferencing is carried out based on the Simultaneous localisation and mapping (SLAM) algorithm.Largescale crossroads were generated using Velodyne Puck SBG Ellipse-D (Velodyne, 2019;SBG, 2021).Velodyne sensor can capture 300,000 points with a rotation rate of 5 to 20 revolutions per second with horizontal angles of 0.1 to 0.4 minutes and 16 lines with a vertical angle difference of two degrees.

METHODOLOGY
The proposed approach for extracting the control point from the mobile mapping data used in the AT of photogrammetric image blocks is presented in the following.A general flowchart of our proposed method is shown in Figure 1.This section is organised as following.In section 3.1, rigid registration are discussed.Section 3.2 explains about find neighbourhood and calculate control points.Section 3.3 contains blunder detection.Section 3.4 sparse bundle adjustment are discussed.Finally, section 3.5 presents the stop condition.

Rigid Registration
Registration algorithms is classified into the rigid and non-rigid approaches.As long as the environment is fixed and rigid, rigorous approaches can model a homogeneous transformation with only six degrees of freedom (DOF).One of the rigid registration methods is the high-speed Principal component analysis (PCA) Huizinga et al. (2016) method, which works without the need for any features and achieves satisfactory results.Two data sets are analyzed using PCA, which calculates three eigenvectors and identifies four complementary pairs of feature points based on their centers of gravity.Following that, the initial registration matrix is derived from the corresponding point set using the previously mentioned registration method.Finally, the PCA method brought the Point clouds from the UAV images and the L-MMS data close to each other.

Find Neighbourhood and Calculate LiDAR Control Points
LCPs are obtained for those tie points that are located close to the L-MMS Point Clouds through the neighbourhood analysis.
In this study, LCPs are used as ground references for aligning LiDAR and photogrammetric data sets.They are also used as control points to perform in-situ camera calibration and to estimate IOP and EOPs.Image point features can be extracted from LiDAR point clouds or LiDAR intensity images using a variety of approaches.Geometric shapes like corner angles of buildings or ridge roof corners are commonly used to extract point features.
In this study, the neighbourhood of each tie point is searched using the KD-tree method according to the two criteria of a neighbourhood radius and a minimum number of points.A KD-tree consists of nodes representing rectangles in d-dimensional space.
A rectangle is formed by d closed intervals on the coordinate axes.A hyperplane aligned with the axis of each internal node divides the rectangle into two sub-rectangles associated with each child node.Throughout a point cloud, points are located in rectangles that contain rectangles.The KD-tree is capable of accelerating searches for k-nearest neighbours by using ball-rectangle intersection tests.Given a query point p and k candidate neighbours, we can be sure that the true k-neighbourhood will be found inside a ball centred on p and passing through the current kthnearest candidate.Next, the distribution of points is checked using the analysis of individual values by Equation ( 1), and the neighbourhood that forms the page is identified.For this purpose, a ratio was defined experimentally.If in a neighbourhood, the scatter of points in two directions are more than the another direction, it indicates the presence of a plane in that neighbourhood.
In Equation (1), A contains the 3D coordinates of all neighbouring points in a predefined neighbourhood, U is left singular vectors, Σ is singular values and V is right singular values.
After identifying the neighbouring points from the L-MMS data, the plane is fitted to them based on the robust plane fitting method, and the plane parameters are calculated.Then, with Equation (3), the tie point is projected to the plane, and the new coordinates of the projected point are calculated.Given a straight line defined by tie (X, Y, Z) and direction Normal (nx, ny, nz) that shown in Figure 2.
and a plane defined by point M (Xm, Ym, Zm) that exist on plane and Normal (nx, ny, nz) Substituting in Equation ( 4) for the variable point from Equation (3) the solution for parameter t is: In the case of a line parallel to the plane, the denominator becomes zero.The coordinates of the point of intersection are obtained if the solution for t is substituted in Equation ( 3).Here that neighbourhood and its corresponding node are excluded from the calculations.Finally, the project of the node points on the plane play the role of our control points in Bundle Adjustment.
Figure 2. Representation of the tie point and its projection on the plane.

Blunder LCP Detection
Some estimated LCPs obtained from the previous step might have poor quality or be a gross error as a result of the inappropriate plane fitting to the neighbourhood or inefficiency in the SfM algorithm used to calculate the object coordinates of the tie point.
It is then necessary to identify blunders to exclude them from the calculations.High probability errors are used to find large errors.
Because large errors rarely occur in the data set.At this stage, the estimated LCPs are re-projected to the image and compared with the image observation values.In typical blunder detection approaches, any data that differs from the mean by more than 3σ is considered as outliers and should remove from the data set or down-weighted.These residuals are checked at 99% confidence interval, and the blunders are removed.

Sparse Bundle Adjustment
The basic photogrammetric principal geometry consists of three geometric entities: object space points (3D points), corresponding image points (2D points), and perspective centres.Such a geometry can be formalised with collinearity equations from the Brown-Conrady model Duane (1971) such as Equations ( 6) and ( 7).The additional parameters related to lens distortions, coordinates of the principal point (i.e., the point closest to the projection centre), and sensor distortions, in practice, be used to develop the theoretical collinearity condition equations between image points, camera position, and object points.To estimate the 6 EOPs, the image coordinates are first rectified using the calculated IOPs of the digital camera, consisting of the principal point (xp, yp), the focal length f , the coefficients of radial distortions (K1, K2, K3), and the coefficients of decentering distortions (P1,P2).According to the Brown's equation Luhmann et al. (2019), the image measurements (x, y) are rectified to (x ′ , y ′ ) according to Where Therewith, Rij(i = 1, 2, 3, ..., j = 1, 2, 3, ...) stands for the nine elements of the rotation matrix Equation ( 13), which can be modelled by three rotation Euler angles ω, φ and κ Luhmann et al. ( 2006) X0, Y0 and Z0 are the translation parameters of the camera pose.
Meanwhile, X, Y and Z are the object point coordinates usually given in meters in the mapping reference system (i.e., Earthfixed coordinate system; UTM).Finally, the IOPs, EOPs, and OPC are updated.In this research, the Sparse Bundle Adjustment (SBA) method is employed to solve linear condition equations.Since the number of observation equations and the number of unknowns is large, the dimensions of matrix A is so large that they cannot be solved in the usual way (Lourakis and Argyros, 2009).The open-source code of the presented method is available at https://github.com/naiem-reza/Sparse-Bundle-Adjustments-with-GCPs.
Bundle adjustment is commonly used in SfM and SLAM approaches.It minimises the sum of errors between 2D observations and their corresponding estimates.In addition to the 2D observations, there are also 3D LCPs which are defined as constraints.In addition, the residual vector dl is defined in such a way that the sum of squares errors minimise based on a robust cost function rather than just using a simple cost function.With a Weight as robuster, Equation (15) through the bundle adjustment procedure minimises the Φ. dl is the image and object residuals, w is weight matrix and df is a degree of freedom.

Stop Condition
Image and object residuals are calculated in each iteration of the SBA.The convergence criterion is satisfied in such a way that the residuals of each iteration are compared with the previous iteration.Ideally, it is expected that a stop condition will be issued when this value reaches zero.d is the maximum of the absolute differences of the residuals in Equation ( 16).
After executing the bundle adjustment, the camera IOPs, EOPs and OCP parameters are updated.Then the posterior standard deviation of the unit weight σ 2 0 is calculated, where σ 2 0 is the unknown theoretical variance of unit weight.Its a-priori value here is considered to be 1.Weight matrix is updated based on the variance component estimation applied in Omidalizarandi et al. (2019), and input to next SBA iteration.The algorithm runs with an updated weight matrix and searches for more reliable and optimal LCPs.In each iteration, the position of the tie points is updated, and they get closer to the L-MMS point clouds.This trend affects the number of control points and their regional distribution.

EXPERIMENTS AND RESULTS
The UAV images and L-MMS data of this research are captured from the Glian village located in Fars province, Iran.The geographical location of Glian village is 53°52'52"E, 28°52'38"N (Figure 3).First, the search algorithm, it extracted 478 LCPs according to the thresholds of the neighbourhood radius, the minimum point, and the ratio of singular values.According to the obtained experimental results, the minimum number of neighbouring points was 1/10 of the LiDAR mobile mapping density, so the neighbourhoods with a number of points less than 57 points of the calculation cycle were discarded.By reducing the singular value ratio, the strictness of the plane selection in the LiDAR point cloud space is reduced, resulting in a greater number of LCPs.In the Figure 4, the results of changes in thresholds are shown.
A red point represents the whole set of tie points.In contrast, a green point represents the points that pass the radius threshold of a neighbourhood and the minimum number of points.In contrast, a blue point represents the LCPs obtained after fitting the plane in the defined neighbourhood and exceeding the threshold.
As explained above, in the first iteration, 478 LCPs were extracted and the bundle was run and initially reduced the root mean square error (RMSE) of image residuals from 127 pixels to less half pixel during bundle loop as shown in Figure 5.In addition, the RMSE of object residuals increased at first and then decreased and reached the value of 5 cm as shown in Figure 6.In the beginning, the increase of the RMSE of object residuals can be that the tie points were produced with the approximate values of the camera parameters in the SFM method.It seems that in the initial iterations, the bundle tries to improve the parameters of the camera and then the improvement is made on the parameters of the OPC.After first iteration, the SBA runs again with updated IOPs, EOPs and OCP and weight matrix that tried to find more reliable and better LCPs.In the second iteration, the algorithm find new LCPs and the number of LCPs reached 470 points.This means that 8 points were outlier and not reliable for LCPs.In the current iteration, the bundle ban has only performed one iteration, and the remaining histograms have provided results similar to the Figure 7-(down).The results of IOPs, EOPs and OPC were similar to the previous iteration.

CONCLUSION
This research aims to automatically extract control points from the 3D L-MMS measurements and UAV imagery to perform aerial triangulation in a UAV photogrammetric network.For this purpose, first, tie points are extracted from the UAV imagery and their corresponding neighbouring point clouds are selected from the L-MMS data.A robust plane fitting is applied to the selected neighbouring points and the projected tie points on the fitted plane are defined as LCPs.SBA procedure is employed to estimate IOPs, EOPs, and OCP.This approach allows a dense network of LCPs to be extracted.
As a result, the need for conventional control points may be reduced if these are used in more significant numbers.Transformation parameters will shift or rotate if the camera parameters are changed, or the GNSS data is inaccurate.It is clear that the developed research approach provides an alternative to SBA that is reasonable and affordable.However, some technical challenges involve identifying the corresponding points within the L-MMS data and establishing a tie point as a control point.Huge differences in point density, different coverage, and the discrete nature of points.Further research is also recommended to improve the robust estimation and reweighting the LCPs.Also, using the voxel technique so that the control points are evenly distributed throughout the area.

Figure 3 .
Figure 3. Location and boundary of villages of Glian in Fars province, Iran.

Figure 4 .
Figure 4. Displaying LCPs in two modes with different thresholds.

Figure 5 .
Figure 5. Displaying RMSE of image residuals in Bundle adjustment loop.

Figure 6 .
Figure 6.Displaying RMSE of object residuals in Bundle adjustment loop.

Table 2 .
Laser scanner specifications