THREE-DIMENSIONAL PATH PLANNING OF UAVS IMAGING FOR COMPLETE PHOTOGRAMMETRIC RECONSTRUCTION

Lightweight unmanned aerial vehicles (UAVs) have been widely used in image acquisition for 3D reconstruction. With the availability of compact and high-end imaging sensors, UAVs can be the platform for precise photogrammetric reconstruction. However, the completeness and precision of complex environment or targets highly rely on the flight planning due to the self-occlusion of structures. Flight paths with back-and-forth pattern and nadir views will result in incompleteness and precision loss of the 3D reconstruction. Therefore, multiple views from different directions are preferred in order to eliminate the occlusion. We propose a 3D path planning method for multirotor UAVs aiming at capturing images for complete and precise photogrammetric 3D reconstructions. This method takes the coarse model from an initial flight as prior knowledge and estimates its completeness and precision. New imaging positions are then planned taking photogrammetric constraints into account. The real-world experiment on a ship lock shows that the proposed method can acquire a more complete result with similar precision compared with an existing 3D planning


INTRODUCTION
With the availability of lightweight unmanned aerial vehicles (UAVs) and compact imaging sensors, UAVs are being widely used in remote sensing and photogrammetry. The flexibility and mobility of lightweight UAVs have made it possible to capture close-up images for complex urban structures, and generate a realistic model of high-accuracy with the aid of stateof-the-art photogrammetric and structure from motion (SfM) -multi-view stereo (MVS) software, like Pix4DMapper and Agisoft Metashape (Pix4D SA, 2019, Agisoft LLC, 2019). These high-quality 3D models are of great use in the field of cultural heritage documentation and structural inspection (Rodríguez-Moreno et al., 2018, Chen et al., 2019. However, complex structures are often concave and have the nature of self-occlusion. Mapping them with nadir images often result in incompleteness and the precision loses additionally. Therefore, considering the specification of the UAV platform, planning viewpoints in three-dimensional space would be the best solution for lightweight UAVs mapping complex structures. General research into path 3D planning for UAV mapping has mainly two categories: model-free and model-based. The main difference between these two categories is whether prior knowledge, i.e. a coarse model, is required. The model-free methods aim at exploring unknown environments while updating the reconstructed model iteratively based on information gain and volumetric occupancy maps (Palazzolo, Stachniss, 2018, Bircher et al., 2018. However, the goal of these methods is to create a map for the guidance of robots, they do not focus on completeness or precision of the 3D reconstruction. The other category, model-based methods, takes an initial 3D model as input, additional viewpoints are then planned to refine the model. The explore-and-exploit procedure focuses on the completeness of the 3D model, which fulfills the need of photogrammetric * Corresponding author 3D reconstruction. Model-based methods do not require online calculation or the determination of next-best-view, they aim at reaching a global optimum of completeness (Roberts et al., 2017, Hepp et al., 2018. The initial model can be acquired and reconstructed from the image of a regular flight or extracted from a heightmap (Smith et al., 2018). An optimal subset of viewpoints is selected from candidates with different criteria. For example the angle between the viewpoint look-at direction and the normal of the surface (Hoppe et al., 2012), parallax angle and distance (Hepp et al., 2018, Smith et al., 2018, or other quality indices (Roberts et al., 2017, Peng, Isler, 2019. The selected viewpoints are finally connected to build actual flight paths for UAV considering the limitation of flying time or path length. Even though the completeness of the refined model is increased by considering the overlap between images, the precision is not guaranteed. Photogrammetric camera network design is not only model-based, but it also takes the precision of the acquired model into account as well. Photogrammetric quality indices are introduced to guide the planning, which contains the contribution of redundancy and the baseline/depth ratio (Alsadik et al., 2014), or the 'limited error propagation' which approximates the covariance matrix of the object point (Olague, Dunn, 2007). However, these methods are limited within small-scaled and highly-controlled scenes with several viewpoints, the convergence is not guaranteed either. The error ellipsoid of the observed point is only used for validation. Furthermore, they are not efficient enough to be applied in outdoorscaled scenes and plan the path for UAVs imaging for 3D reconstruction.
Recent studies of model-based methods have inspired this work. Roberts et al. and Hepp et al. build the framework of information-driven path planning (Roberts et al., 2017, Hepp et al., 2018. The framework is based on the observation that the mutual information obtained by all viewpoints never decreases and the objective function related to information is submodu-lar (Krause, Golovin, 2014). This allows adding viewpoints iteratively while considering global optimization, which significantly reduces the amount of computation. Therefore, the problem can be formulated as an orienteering problem or be solved using mixed-integer programming, downhill simplex, or genetic algorithm (Smith et al., 2018, Martin et al., 2016. Recently, Agisoft announced the 3D mission planning functionality in their Metashape software system which aims at obtaining optimal sets of camera positions based on a rough model and creating mission plans using these optimal sets (Agisoft LLC, 2019). Its aim is the same as ours and we will take it for comparison in this paper.
In this study, we introduce the photogrammetric constraints into the framework of model-based view planning, to plan a path for multirotor UAVs imaging for complete photogrammetric 3D reconstruction. The completeness and precision of the coarse input model are firstly analyzed and additional viewpoints are then planned to increase the quality of the 3D model. Camera geometry constraints like the baseline and observing angle are considered while the error ellipsoid of each observed point guides the view planning as well. The viewpoint planning takes the advantages of the submodularity but the exteriors of each viewpoint are optimized separately to achieve better completeness and precision.

METHOD
In this section, we discuss the 3D path planning method guided by point cloud analysis and photogrammetric constraints. The proposed method is inspired by the framework of model-based view planning (Scott, 2009), but with a two-phase's viewpoint addition (see Figure 1). The input of the method is the dense image matching (DIM) point cloud, reconstructed triangulated mesh and corresponding camera network from a prior flight. We firstly analyze the completeness of the point cloud. To analyze the completeness of the DIM point cloud, it should be compared with a 'complete' object. The triangulated mesh is reconstructed considering the visibility and with small gaps or holes completed (Vu, 2011). The incomplete part is marked out by point cloud index calculation and filtering. Then we do a two-phase viewpoint addition. The first phase is adding viewpoints to make the point cloud complete, which results in a 'strengthened camera network'. After that, the theoretical precision of each point is estimated, and the second phase is to increase the photogrammetric precision and an 'optimized camera network' is finally acquired.

Completeness and precision estimation of point cloud
In order to get a complete 3D reconstruction, the incomplete area of the initial flight should be detected beforehand. Considering the characteristics of the DIM point cloud, the main reason for incompleteness is occlusion. There are other causes like moving objects and the lack of texture though, we currently ignore them for simplification. So, the target of the completeness analysis is to locate the holes in the dense point cloud.
To find and locate the incompleteness (holes), the input DIM point cloud is compared with corresponding triangulated mesh. To make it easier when comparing point cloud with mesh, a Poisson Disk Sampling method is used to extract sample points s from the mesh surface, the radius of Poisson disks equals to the ground sampling distance (GSD). These sample points are compared with the DIM point cloud p. Some indices are then calculated to indicate the completeness of the point cloud. Finally, the incomplete area is marked out within the sample points.
The first index is the hole index, which indicates the likelihood of holes existing in the input dense point cloud Given Poisson Disk Sample point s, we firstly search for all DIM point p in the k-nearest neighbor of s. Vector np and ns are normals of corresponding points. The hole index f h is related to the dot product of np and ns, which is shown in Equation 1. If the mesh surface is consistent with the DIM point cloud, the dot product will result in a larger value. Otherwise, the result in the neighborhood varies where holes exist, which leads to a smaller f h .
Note that w(d) is a distance weighting function, dps is the distance between point p and s, dm is the largest distance in the k-nearest neighbor of s. σs is an indicator that controls the false positive rate of hole detection, its value is empirically set to 0.018 (Wu et al., 2014).
The curvature index fc represents the local curvature around the sample point s. It sums up the normal change rate rp of the DIM point cloud within the R radius neighbor of s (Equation 2). The radius R equals to the radius of the Poisson disk. A larger fc indicates a higher local normal change rate, which should be considered as unreconstructable vegetation and filtered out. The parameter k indicates the threshold of the curvature, its value is empirically set to 8.
By combining the hole index and the curvature index, a filter is created to filter the sample point cloud and the incomplete area is then marked out. It is worth noting that the noise of the DIM point cloud is inevitable though, it does not have a significant effect on the filtering result. Theoretically, the normal of the noise is randomly distributed, which may result in a larger hole index and curvature index. The distance weighting function w(d) in Equation 1 can assign a lower weight to the distant noise. Moreover, the radius in the curvature index is fixed, which can filter the majority of the noise. Practically, the reconstruction of the corresponding triangulated mesh has minimized the noise.
On the other hand, the inner precision of all target sample point s should be estimated as well to plan more viewpoints to increase the precision. The inner precision is estimated by computing the error ellipsoid of each point, using the multi-view intersection. The calculation is based on the linearization of the collinearity equations and bundle adjustment (Förstner, Wrobel, 2016). The error ellipsoid of each point is then extracted from the eigenvalues λi, i = 1, 2, 3 and eigenvectors of the covariance matrix (Equation 3).
The largest eigenvector stands for the direction of the axis. For a 0.95 confidence level, the scale s of the ellipsoid can be obtained from a three-degree chi-square distribution, χ 2 (0.95, 3) = 2.796 (Spruyt, 2015).

Camera viewpoint generation
This section discusses the camera viewpoint generation to capture more images to increase the completeness and precision of 3D reconstruction. The viewpoint generation splits into two parts: viewpoint addition for completeness and for precision, see Figure 1.
The viewpoint addition for completeness is a select-andoptimize process of generating an optimal set of viewpoints to maximize the completeness of 3D reconstruction. Viewpoints are selected from the initialization set and their position and orientation are then optimized. Candidate viewpoints are initialized for optimal photogrammetric reconstruction -they are located along the normal of each point in the incomplete area, with their viewing directions pointing towards corresponding points. The viewpoint candidate who observes the largest number of sample points is selected iteratively for optimization. Note that the overall number of observed sample points by all selected viewpoints never decreases and the marginal benefit decreases when adding new viewpoints. Therefore, the process of viewpoint selection is submodular. It can be solved with a greedy algorithm with a guarantee of at least a 0.63 approximation ratio, relative to the optimal set (Krause, Golovin, 2014).
Following the selection of each viewpoint, the location and viewing direction are then optimized, to achieve better precision. The viewpoint look-at direction should be as perpendicular to the major axis of the existing error ellipsoid as possible.
The objective function (Equation 4) takes the improvement of visibility as well as precision into account, denote that viewpoint C contains its position pc and viewing direction nc.
where fp(C) is the count of visible points of the viewpoint and Ψ(C) = i sin ψi is the sum of the angle between viewpoint look-at direction and the maximal dimension of error ellipsoid of existing visible points.
The photogrammetric constraints are as follows and also illustrated in Figure 2 1. GSD constraint -the distance between the planned viewpoint and target point should be within a certain threshold 6. Safety altitude constraint -the z value of the new camera location should be larger than a certain threshold zs After the previous step of viewpoint addition, a strengthened camera network is obtained, which expands the coverage of images. However, more viewpoints have to be planned to optimize the geometry of the entire camera network. After the estimation of the inner precision, we then find some points with lower or unacceptable precision, which have bad-shaped error ellipsoids, to be the target of refinement. Next, for each target points, one viewpoint is added to increase the precision of the forward intersection. The viewpoint addition is under several geometry constraints related to point normal, the direction of error ellipsoid and baseline, as illustrated in Figure 3. The details are the same as viewpoint addition for completeness. The objective is that the viewpoint should be as perpendicular to the major axis of the error ellipsoid as possible. So the objective function is similar to equation 4, while the last term here is Ψ(C) = cos ψi.
Combining those generated viewpoints from two view planning steps, a set of new waypoints is obtained. The 3D flight path of UAV can be generated by connecting the waypoints in sequence. The total length of the flight path mainly contributes to energy consumption (Liu et al., 2018), therefore, waypoints should be connected to achieve the shortest path. We assume that the UAV takes off and returns at the same position and the flight path consists of all waypoints. The shortest flight path can be generated by solving a traveling salesman problem (TSP).

EXPERIMENT AND RESULTS
The objective of the experiment is to evaluate the proposed 3D path planning method in the matter of accuracy, precision, and completeness. We compare the result of aerial triangulation as well as the reconstructed DIM point cloud form different modes of flight planning, including nadir flight and its combination with the 3D path from the proposed method and Agisoft Metashape (Agisoft LLC, 2019)). Although there exist a variety of drone planning software focusing on 3D modeling, e.g. Pix4DCapture, Hangar, or Litchi, they mainly plan circulating or façade scanning path. Agisoft Metashape can plan 3D missions through analyzing a pre-captured model, its goal is similar to ours.

EXPERIMENT SET UP
The test area was the ship lock on Neckar River at Hessigheim, Baden Württemberg, Germany. The dimension was about 55 × 206 m. Four control points and thirteen checkpoints were available in this experiment (Figure 4). The platform was a DJI Phantom 4 quadcopter. A manually refined mesh model was chosen as the ground truth. The model was reconstructed from oblique aerial images, it had 0.87 cm GSD and 2.70 cm overall accuracy.
Flight paths were designed based on the comparison between our 3D planning method, Metashape's method and a baseline Nadir path. All flight paths were designed with 1.29 cm GSD. The Nadir flight had a 70% overlap and 70% sidelap, proposed 3D path (3D) and Metashape's path (MS) are showed in Figure 5. The number of images of Nadir, 3D and MS paths was 64, 29 and 30, separately. Since the purpose of the flight planning was to complete and optimize the survey, images from 3D and MS paths were combined with Nadir's. Agisoft Metashape was used to perform aerial triangulation and generate a DIM point cloud. Control points and checkpoints were semi-automatically marked to ensure accuracy. All the parameters stayed default except 'image matching preselection' was turned off in the process of image alignment, to make sure Metashape could find enough image pairs.

RESULTS
In this section, we compare the flight planning between ours and Metashape's quantitatively. The comparison will be in the matter of accuracy, precision as well as completeness. We evaluated the accuracy of all flights after aerial triangulation with checkpoints. Moreover, the maximum dimension of the error ellipsoid was calculated from the covariance matrix of each tie point. The geometry configuration of the flight planning was compared by counting tie points whose theoretical uncertainties are within a certain threshold. The precision and completeness comparison was based on the DIM point cloud visually and quantitatively.

Tie point accuracy and uncertainty evaluation
Since accuracy is the basis in the matter of mapping and 3D reconstruction, we firstly evaluated the accuracy with thirteen checkpoints. Figure 6 demonstrates the total error of each checkpoint. The total error is the square sum of errors in X, Y and Z direction. A 3GSD threshold is marked in this figure with a red dash line, which indicates the commonly accepted error in mapping. In Figure 6, the total errors in the Nadir flight are within the 3GSD threshold except the point 3007, which is at the corner of the target area and further away from the controlled area. The error arises with the distance to control points. The errors from two 3D paths are nearly half of that from the Nadir flight, they are consistent and all within the threshold as well.
Concerning tie point uncertainties, we counted the maximum dimension of the error ellipsoid of tie points and plot them in histograms (Figure 7). The peak values of two 3D planning paths (Figure 7b and c) are lower than that of Nadir's, which means for both 3D planning paths, a larger amount of tie points had lower uncertainties.

Completeness evaluation of DIM point cloud
In this section, we compare the completeness of the 3D reconstruction from all paths. The DIM point cloud was taken in the comparison. The ground truth of the incomplete area was firstly generated by comparing the result from the baseline Nadir path with the ground truth model. Then, the completeness was evaluated using the method by (Hepp et al., 2018) and the coverage of the incomplete area. Table 1 shows the quantitative comparison of the DIM point cloud. Similar to (Hepp et al., 2018), we introduced precision P , recall R for evaluating the quality of the DIM point cloud.
The precision quantifies how many reconstructed points are  close to a ground truth point. The recall quantifies how many ground truth points are close to a reconstructed point, which can be recognized as completeness. The distance threshold was set to 3GSD as well. Moreover, the incomplete area coverage and the precision in the incomplete area were evaluated as well to test the performance of two 3D planning methods specifically for incomplete area. As can be seen from Table 1, the DIM point cloud precision of all 3D methods was significantly higher than the Nadir flight. The 3D configuration of flight paths may reduce noisy DIM points. The recall of 3D planning methods was higher as well, as adding more oblique images covered more structures. Compared with the MS path, the 3D path had a 2.6% higher recall but doubled coverage of the incomplete area. Its precision in the incomplete area was higher than MS as well.
To demonstrate a detailed visualization, Figure 8 shows the overview and zoomed snapshot of DIM point clouds from all paths. The first row of Figure 8 is the overview, in which yel-  low rectangles represent zoom in areas in the following rows. In Zoom #1, the façade of the center ship lock building was wellreconstructed in the flight 3D, however, the result from flight MS was as incomplete as the Nadir one. Zoom #2 was a small building on the ship lock. The result from Nadir and MS was barely the same while there were some points on the façade in our result. In Zoom #3, the result from Nadir and MS were lack of information underneath the road, while our method successfully captured the data there.

CONCLUSIONS
We proposed a novel 3D path planning method for UAVs imaging for complete and precise photogrammetric 3D reconstruction. This method takes the model from an initial flight as prior knowledge. The incompleteness area of the model is marked firstly. Two steps of viewpoint addition are then performed targeting the completeness and precision with the constraints of photogrammetry. The result from the in-field experiment suggested that compared with an existing 3D planning method from Agisoft Metashape, the proposed flight planning method could achieve a more complete 3D reconstruction with fewer images. Meanwhile, the accuracy of the proposed method was as good as Metashape's and the precision of the DIM point cloud from our method was better in the incomplete area. Since the framework of the proposed method is finding an optimal subset of viewpoint, it can be applied before the alignment of a large image dataset, to reduce redundancy before dense image matching.
However, during the implementation of the 3D path, more energy is required than traditional flight modes of aerial mapping, since it contains more turning and change in altitude. Our future work may introduce trajectory optimization or multi-drone collaboration method to reduce the energy or time cost. Moreover, the method of completeness analysis can be refined as well, by introducing semantic information to better detect objects that are inappropriate to be reconstructed. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume V-1-2020, 2020 XXIV ISPRS Congress (2020 edition)