ON THE BENEFIT OF CONCURRENT ADJUSTMENT OF ACTIVE AND PASSIVE OPTICAL SENSORS WITH GNSS & RAW INERTIAL DATA

In airborne laser scanning a high-frequency trajectory solution is typically determined from inertial sensors and employed to directly geo-reference the acquired laser points. When low-cost MEMS inertial sensors are used, such as in lightweight unmanned aerial vehicles, non-negligible errors in the estimated trajectory project to the final point-cloud, resulting in unsatisfactory accuracy on the ground. There are different multi-sensor fusion approaches to correct the point-cloud errors caused by an imperfect trajectory determination. Mismatches between different optical observations and/or in the overlapping regions of the point-cloud can allow the correction of the final point-cloud, either directly, by means of rigid transformations, or indirectly, via improving the scanner trajectory estimation. In this work we propose to fuse lidar and cameras in a single adjustment based on dynamic networks, considering 2D tie-points from the imagery and 3D tie-points from overlapping point-cloud sections. On a challenging corridor mapping scenario, we show that considering either 2D or 3D tie-points, along with inertial and GNSS observations, results in a remarkably accurate point-cloud, even when low-cost inertial sensors are employed and in the presence of challenging surface textures, such as high vegetation. Furthermore, since the distribution of the 2D and 3D tie-points is complementary, considering them together further increases the robustness of the adjustment due to higher redundancy. By employing the proposed approach within this controlled example, we were able to improve the final point-cloud accuracy by more than three times with respect to conventional geo-referencing methodology and to reduce the magnitude of the errors.


INTRODUCTION
Airborne Laser Scanning (ALS) is one of the most important geospatial data acquisition technologies, providing precise geometric products with reflectivity information, i.e., intensity values, and is currently used in several disciplines such as infrastructure monitoring, archaeology, agriculture and forestry. The constant effort to miniaturize Light Detection And Ranging (lidar) units reflects the need to use them in smaller, more flexible, lightweight and low-cost mapping platforms. When it comes to aerial mapping, Unmanned Aerial Vehicles (UAVs) have become of significant importance since they allow for prompt and flexible mobilization (Nex et al., 2022), while lidar units offer high spatial resolution and vegetation penetration due to their ability to detect multiple pulse echoes.
In kinematic laser scanning, due to the sequential measurement principle and the motion of the carrier platform, the exterior orientation parameters (position and attitude) of the scanner differ for every object point or, for some scanners, for every line of sparse points. Hence, unlike in mapping with frame-based imagery, mobile lidar scanners rely on direct geo-referencing (Glennie, 2007) and therefore require appropriate GNSS and inertial hardware for trajectory determination. Restrictions on the weight and size (as well as the cost) of the payload on UAVs limit the accuracy of the navigation and optical sensors that can be employed. Consequently, the accuracy of laser scanning products depends heavily on the quality of the navigation sensors used for the trajectory determination, and fails to meet user expectations in certain cases (Schaer et al., 2009).
With modern high-end GNSS board-receivers being at most the size of a credit card, the trajectory limitations are mainly on the side of the inertial sensors. Indeed, the residual attitude errors due to inertial sensor noise, and especially due to imperfections of the initial platform orientation (so-called "system misalignment errors" in navigation terminology), are the predominant factors of trajectory quality. Additional systematic errors, or their projection, e.g., caused by an imperfect boresight between the lidar and the Inertial Measurement Unit (IMU), further deteriorate the final point-cloud accuracy. Plane-to-plane correspondences in overlapping parts of the point-cloud can correct and recover boresight and/or system misalignments and provide a refined point-cloud as shown in (Kager, 2004) and (Skaloud, Lichti, 2006). However, such approaches may fail due to under-fitting if planes of different slope and orientation are not present in the mapping area. Although such plane-to-plane constraints can be generalized when matching certain terrain features or objects (Kerstling et al., 2012), these conditions are usually weaker and thus the parameter recovery is less precise.
Many approaches (apart from system calibration) have been proposed to correct errors of the final 3D point-clouds caused by an imperfect determination of the scanner trajectory. These can be divided into the following categories: i) approaches that act directly on the final 3D point-cloud (e.g., via rigid transformations between different sections), and ii) approaches that attempt to directly correct the scanner trajectory. All these techniques typically rely on mismatches between the measurements of different optical sensors and/or in the overlapping regions of the point-cloud. Further details are given in the sections below.

Point-cloud alignment methods
Here, we discuss methods that attempt to correct the point-cloud after it has been geo-referenced, by exploiting overlapping scans, based on rigid transformations that are able to globally correct the point-cloud, e.g., a shift and rotation of entire strips. As an example we refer to (Glira et al., 2015), where the ICP algorithm is applied in overlapping regions to improve point-cloud alignment. However, this approach is sensitive to a poor initial registration, such as one possibly caused by low-accuracy inertial units on UAVs. Given a coarsely geo-referenced point-cloud, extra information from image tie-points can improve the point-cloud geo-referencing accuracy in a hybrid adjustment as shown, e.g., in (Glira et al., 2019). When imagery information is considered, obtaining sufficient constraints on the point-cloud is highly dependent on the employed methodology, as described in (Cledat, Skaloud, 2020). Rasterizing and colorizing the processed lidar point-cloud using the elevation (Jayendra-Lakshman, Devarajan, 2013) or the intensity values (Hussnain et al., 2019) and performing 2D feature matching between overlapping regions can improve the point-cloud geo-referencing. Such approaches discard part of the original point-cloud information, since 2D feature extraction is applied on the raster data and the point-cloud is treated solely in the 2D domain. Ongoing research in neural networks has shown that end-to-end learned approaches can improve the point-cloud registration accuracy. 3DSmoothNet (Gojcic et al., 2019), SuperGlue (Sarlin et al., 2019) and MDGAT-matcher (Shi et al., 2021) are some examples based on local feature matching, but to date these methods have not been evaluated for ALS.

Trajectory correction methods
At a trajectory level, (Glira et al., 2016) propose a spline trajectory correction model for a rigorous strip adjustment that handles the trajectory errors after the Kalman filtering of the INS/GNSS integration. (Mandlburger et al., 2017) employ dense image matching to link image tie-points and lidar point-clouds in a hybrid strip adjustment, still relying on direct geo-referencing, separately for the lidar and imagery observations. (Li et al., 2019) exploit the advantage of a graph-based optimization algorithm to obtain an improved high-frequency trajectory via image matching and raw inertial observations, following (Cucci et al., 2017). Using the optimized image poses as reference, the authors register the lidar point-cloud by minimizing the depth discrepancy between the depth maps obtained from image matching and the raw laser scans. In this two-step methodology, lidar observations are considered only at the final stage. (Hussnain et al., 2021) propose a B-spline based 6 degrees-of-freedom trajectory adjustment with IMU observations and links between a terrestrial lidar and a reference (aerial) photogrammetric point-cloud. All these methods rely on multiple optical sensors and all related calibration parameters must be known accurately a priori.
Recently, a novel approach has been proposed in (Brun et al., 2022) where the trajectory to be used in the final point-cloud geo-referencing is estimated considering raw inertial, GNSS and lidar observations in a single adjustment step. First, a coarse trajectory, obtained by fusing raw inertial and GNSS observations only, is used to geo-reference the initial point-cloud. Next, 3D matches, or tie-points, are established within overlapping regions of this point-cloud by means of an automated 3D feature extraction and matching algorithm (Pham et al., 2019). These three-dimensional matches are then introduced as additional observations in a Dynamic Network (DN) adjustment (Cucci, Matteucci, 2014) along with GNSS positions and raw inertial data, namely specific forces and angular rates. This approach has been shown to drastically improve the final point-cloud registration. Further details are given in Section 2.

Contributions
In this work we extend the approach put forward in (Brun et al., 2022) by considering conventional tie-points, extracted from camera images acquired simultaneously with respect to the lidar points, together with 3D tie-points and raw inertial and GNSS observations in a single DN adjustment. We evaluate a challenging corridor mapping scenario observed simultaneously with industrial-grade and navigation-grade sensors, the latter providing the reference. We study the effect of passive (cameras) and active (lidar) sensor observations first separately and then jointly. The availability of a ground truth point-cloud with centimeter level accuracy allows us to precisely quantify the georeferencing error in object space. The details on reference accuracy verifications are presented in (Vallet et al., 2020).
This work is organized as follows. In Section 2, we briefly review the approach put forward in (Brun et al., 2022) for establishing 3D tie-points between overlapping point-clouds. In Section 3, we describe the way these matching points are introduced in the DN adjustment, along with photogrammetric observations, and we stress the single-step integration of all the available information for trajectory determination. In Section 4, we present the design of experiments and in Section 5 the obtained results, before drawing the conclusions in Section 6.

3D TIE-POINTS
This section summarizes the approach in (Brun et al., 2022). Apart from the rigorous formulation via DN in a global frame that builds on (Cucci et al., 2017), the approach put forward by the authors is based on the automatic extraction of matching three-dimensional features in overlapping regions of the point-cloud. This approach is comparable to the two-dimensional feature processing of any photogrammetric software that applies computer vision algorithms to detect 2D features on images and extract 2D tie-points based on feature matching. Similarly, in (Brun et al., 2022) 3D tie-point extraction was implemented to identify matching features to be used as additional information in the DN adjustment, as will be detailed in Section 3. The 3D tie-point extraction consists of four main steps, depicted in Fig. 1: detection, description, matching and outlier rejection of local features, which are then linked to the trajectory. In three-dimensional space, features correspond to spheroids and feature matching actually relates the centres of such spheroids. Automated 3D feature extraction has been studied extensively and there exist multiple 3D feature descriptors, both handcrafted, e.g., Intrinsic Shape Signature (ISS) (Zhong, 2009) and Signature of Histograms of OrienTations (SHOT) (Tombari et al., 2010), and learned, e.g., SpinNet (Ao et al., 2021) and LCD (Pham et al., 2019). We refer the reader to (Brun et al., 2022) for a comparison of many methods employed with ALS.

Point-cloud preparation
An initial rough trajectory, estimated via a Kalman Filter (KF) followed by Recursive Smoother (RS), is required to coarsely geo-reference the point-cloud and thus identify overlapping areas. The overlapping regions are then split into rectangular tiles and automatic feature extraction is performed on each tile separately. The necessary steps are discussed in the following.
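The tiling step can be illustrated with a minimal sketch. It assumes square tiles aligned with the coordinate axes and uses the 50 m tile size mentioned later in the text; the function name `split_into_tiles` and the sample coordinates are hypothetical and not taken from the original implementation.

```python
import math
from collections import defaultdict


def split_into_tiles(points, tile_size=50.0):
    """Group (x, y, z) points into square tiles keyed by integer (column, row)."""
    tiles = defaultdict(list)
    for x, y, z in points:
        # The tile index is the floor of the planimetric coordinate over the tile size.
        key = (math.floor(x / tile_size), math.floor(y / tile_size))
        tiles[key].append((x, y, z))
    return dict(tiles)


# Hypothetical coarsely geo-referenced points (metres).
points = [
    (10.0, 5.0, 400.0),
    (49.9, 49.9, 401.0),
    (50.0, 5.0, 402.0),
    (-0.1, 5.0, 399.0),
]
tiles = split_into_tiles(points)
```

Feature extraction would then run on each value of `tiles` independently.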

3D feature detection
This step, depicted in Fig. 1 (point 1), is implemented using the handcrafted ISS algorithm (Zhong, 2009), which was originally conceived as a descriptor. (Brun et al., 2022) employed only the first parts of ISS, which correspond to keypoint detection. Thus, keypoint candidates for each point-cloud are selected based on the eigenvalues of the weighted covariance matrices that are computed for the spherical neighbourhood of each point.

3D feature description
This step, shown in Fig. 1 (point 2a) is based on a learned approach, the recently proposed Learned Cross-Domain descriptor (LCD) algorithm (Pham et al., 2019). As showcased in (Brun et al., 2022), LCD has better performance when compared with other competing approaches, at least in the case of ALS over a mixed natural-built environment. By encoding the geometric structure of spherical neighbourhoods for each keypoint, LCD outputs feature descriptors so that similar geometries have similar descriptors.

3D feature matching
In this step, depicted in Fig. 1 (point 2b), keypoints from the first cloud are compared with all points in the overlapping one, as opposed to the conventional keypoint matching workflows where only keypoints of the two clouds are considered. Considering all points in the query point-cloud helps to compensate for potentially weak detector repeatability (Salti et al., 2012). The matching is based on searching, for every keypoint in the first point-cloud, its nearest neighbour point in the overlapping one in terms of L2-norm of the difference of the feature vectors.
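The nearest-neighbour search in descriptor space can be sketched as follows. This is a brute-force illustration under the assumption that descriptors are plain numeric vectors; a real implementation would likely use a spatial index, and the two-dimensional descriptors below are hypothetical toy values (learned descriptors are much longer).

```python
import math


def match_keypoints(keypoint_descriptors, target_descriptors):
    """For each keypoint descriptor, return (index of nearest target descriptor, L2 distance)."""
    matches = []
    for d_key in keypoint_descriptors:
        best_index, best_dist = -1, math.inf
        # Compare against ALL target points, not only against target keypoints.
        for j, d_tgt in enumerate(target_descriptors):
            dist = math.dist(d_key, d_tgt)  # L2-norm of the descriptor difference
            if dist < best_dist:
                best_index, best_dist = j, dist
        matches.append((best_index, best_dist))
    return matches


# Toy descriptors: two keypoints in the first cloud, three points in the second.
keypoints = [(0.0, 1.0), (1.0, 0.0)]
targets = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5)]
matches = match_keypoints(keypoints, targets)
```

Each keypoint always receives a match here, which is why a subsequent outlier rejection step is needed.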

Outlier rejection
Due to the comparison of keypoints from one tile with all the points of another tile, the matching process is prone to a large number of outliers, i.e., incorrect matching points. To eliminate false matches, as shown in Fig. 1 (point 3), a variant of the Random Sample Consensus (RANSAC) algorithm (Fischler, Bolles, 1981) is used, relying on the assumption that errors in the navigation solution yield similar geo-referencing errors for the entire tile. This assumption is reasonable provided that the tiles are small enough, i.e., were acquired within a time interval of a few seconds.
The final output of this four-step feature extraction workflow is a set of 3D homologous points in the two overlapping point-clouds. For each matching pair of points, the surroundings of the two different points (in each respective point-cloud) have similar structure in the object space, and thus probably identify the same point on the ground. In this way, couples of raw lidar measurements are created, for which geometric constraints can be formulated (as discussed in Section 3), similar to the use of epipolar constraints for image coordinates in a photogrammetric workflow.
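The outlier rejection idea can be illustrated with a simplified RANSAC-style sketch. It assumes that the common geo-referencing error within a tile is a pure 3D shift, which is a simplification chosen for illustration; the variant actually used may fit a different model. The function name, threshold and sample data are hypothetical.

```python
import math
import random


def ransac_common_shift(pairs, threshold=0.5, iterations=200, seed=0):
    """Return indices of matches whose offset agrees with the most supported offset.

    pairs: list of (p, q), each an (x, y, z) tuple of a matched point pair.
    """
    rng = random.Random(seed)
    offsets = [tuple(qc - pc for pc, qc in zip(p, q)) for p, q in pairs]
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single match defines a candidate shift.
        candidate = offsets[rng.randrange(len(offsets))]
        inliers = [i for i, off in enumerate(offsets)
                   if math.dist(off, candidate) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Five matches sharing a shift of roughly (0.3, -0.2, 0.1) m, then two false matches.
pairs = [
    ((0.0, 0.0, 0.0), (0.30, -0.20, 0.10)),
    ((10.0, 0.0, 1.0), (10.32, -0.18, 1.10)),
    ((0.0, 10.0, 2.0), (0.28, 9.79, 2.12)),
    ((5.0, 5.0, 0.0), (5.31, 4.80, 0.08)),
    ((7.0, 3.0, 1.0), (7.29, 2.81, 1.11)),
    ((1.0, 1.0, 1.0), (6.0, 6.0, 6.0)),
    ((2.0, 2.0, 2.0), (-1.0, 4.0, 2.0)),
]
inliers = ransac_common_shift(pairs)
```

Only the matches listed in `inliers` would be passed on as 3D tie-points.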

MULTI SENSOR INFORMATION FUSION IN DYNAMIC NETWORKS
Dynamic Networks (DNs), first introduced in (Colomina, Blazquez, 2004), are an extension of conventional geodetic networks and have many applications in multi-sensor information fusion for trajectory determination, navigation and photogrammetry. In DNs, the unknowns are samples of the platform position and orientation, 3D coordinates of points in object space (such as tie-points) and optionally system calibration parameters, such as boresights, lens distortion coefficients, etc. Each raw sensor observation, such as the image coordinates of a tie-point in a given image, or the GNSS position observation at a certain time, forms a constraint between one or more unknowns.
Minimizing the squared error associated with each constraint, weighted by the uncertainty of the sensor measurements, allows the determination of the maximum-likelihood estimate for all unknowns. An in-depth discussion of the approach is beyond the scope of this work, for which we refer the reader to the original publications, and to the ones referenced below.
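The principle of uncertainty-weighted least squares can be shown on a deliberately tiny problem. This sketch estimates a single scalar from observations with different standard deviations; it is not a DN solver, and the observation values are hypothetical.

```python
def weighted_least_squares(observations):
    """Maximum-likelihood estimate of one unknown from (value, sigma) pairs.

    Minimizes sum(((value - x) / sigma) ** 2), assuming independent Gaussian errors.
    """
    weights = [1.0 / sigma ** 2 for _, sigma in observations]
    estimate = sum(w * z for w, (z, _) in zip(weights, observations)) / sum(weights)
    sigma_estimate = (1.0 / sum(weights)) ** 0.5
    return estimate, sigma_estimate


# A precise observation (2 cm) and a coarse one (20 cm) of the same quantity.
observations = [(100.00, 0.02), (100.30, 0.20)]
estimate, sigma_estimate = weighted_least_squares(observations)
```

The estimate stays close to the precise observation, and its standard deviation is slightly smaller than that of the best single observation. A DN applies the same principle to many unknowns and non-linear constraints.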
In this contribution, we employ DNs to determine the trajectory of an airborne mapping system by fusing the following types of sensor measurements: 1. raw specific forces and angular velocities as measured by an IMU, the observation models being those presented in (Cucci, Skaloud, 2019),
As an example, a 3D tie-point allows us to formulate the following constraint on the trajectory:
$$\Gamma^n_{b,t_1}\,\Gamma^b_L\,v^L_1 - \Gamma^n_{b,t_2}\,\Gamma^b_L\,v^L_2 + \xi = 0 \qquad (1)$$
where $\Gamma^n_{b,t_1}$ and $\Gamma^n_{b,t_2}$ are the transformation matrices encoding the position and orientation of the body frame $b$ with respect to the global or navigation frame $n$ at times $t_1$ and $t_2$, $\Gamma^b_L$ encodes the lidar boresight and lever-arm, and $v^L_1$ and $v^L_2$ are the two raw lidar measurements (points in the lidar frame $L$) that have been matched by the algorithm presented in the previous section. $\xi$ is a zero-mean Gaussian error which identifies the uncertainty associated with the constraint and is related to the Ground Sampling Distance (GSD) of the point-cloud. For more details please see (Brun et al., 2022). More informally, Equation 1 states that if two points in the laser frame are matched to form a 3D tie-point, the same coordinates should be obtained once $v^L_1$ and $v^L_2$ are geo-referenced (translated and rotated to the global frame $n$) using the corresponding samples of the body frame position and orientation.
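The 3D tie-point constraint described above can be evaluated numerically with a short sketch. For brevity it assumes that attitude reduces to a yaw rotation and that each transformation is a (rotation, translation) pair; all function names and numerical values are hypothetical.

```python
import math


def rot_z(angle_rad):
    """Rotation matrix about the vertical axis (yaw only, for illustration)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(rotation, translation, v):
    """Rigid transform: rotation * v + translation."""
    return [sum(rotation[i][k] * v[k] for k in range(3)) + translation[i]
            for i in range(3)]


def georeference(pose_nb, mount_bL, v_L):
    """Map a lidar-frame point to the n-frame: body pose applied after the mounting."""
    v_b = apply(mount_bL[0], mount_bL[1], v_L)  # boresight and lever-arm
    return apply(pose_nb[0], pose_nb[1], v_b)   # body pose at the measurement time


def tie_point_residual(pose_t1, pose_t2, mount_bL, v1_L, v2_L):
    """Difference of the two geo-referenced points; ideally zero up to noise."""
    p1 = georeference(pose_t1, mount_bL, v1_L)
    p2 = georeference(pose_t2, mount_bL, v2_L)
    return [a - b for a, b in zip(p1, p2)]


mount = (rot_z(0.0), [0.1, 0.0, 0.0])
pose_t1 = (rot_z(0.0), [0.0, 0.0, 230.0])
pose_t2 = (rot_z(math.pi / 2), [50.0, 0.0, 230.0])
# Both raw measurements observe the same ground point (20, 10, 0).
v1_L = [19.9, 10.0, -230.0]
v2_L = [9.9, 30.0, -230.0]
residual = tie_point_residual(pose_t1, pose_t2, mount, v1_L, v2_L)

# The same tie-point with a 0.1 degree yaw error in the second pose.
pose_t2_bad = (rot_z(math.pi / 2 + math.radians(0.1)), [50.0, 0.0, 230.0])
residual_bad = tie_point_residual(pose_t1, pose_t2_bad, mount, v1_L, v2_L)
```

With consistent poses the residual vanishes, while the yaw error produces a residual of several centimetres. The adjustment exploits this sensitivity to correct the orientation.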
The advantage of DNs lies in their flexibility and generality: many types of constraints such as the one presented in eq. 1 can be formulated to fuse all information from multiple heterogeneous and partly redundant sensors in a single adjustment step (Fig. 2). This has several advantages, such as rigorous sensor modeling, consistent uncertainty quantification, better observability and easier de-correlation of calibration parameters.
In the following we will demonstrate how DNs allow the exploitation of different types of optical sensors, i.e., lidar sensors and cameras, along with inertial and positioning sensors, to obtain a high-frequency trajectory suitable for accurate georeferencing.

EXPERIMENTAL EVALUATION
In this section, we present the design of experiments to evaluate the concurrent adjustment of passive and active sensors. We describe the dataset used and its adequacy for the specific evaluation, as well as the different scenarios tested to investigate the significance of each set of optical observations.  The experimental evaluation of this work is performed on the controlled airborne dataset presented in (Vallet et al., 2020).

Data
This dataset includes measurements from optical and navigation sensors of high and low accuracy in terms of orientation and mapping performance. It was gathered over different terrain types to include many possible mapping features in urban and rural areas, such as low and high vegetation, a part of a railway and power lines. All sensors were rigidly mounted in a vibration dampened configuration on a helicopter. The helicopter flew at a constant speed of 12 m/s to mimic a small UAV.
In this contribution, we consider only some of the available sensors. We consider the pre-calibrated IXAR180 (PhaseOne) camera with 80 megapixels and a 42 mm lens and the medium-range lidar sensor VQ-480 (Riegl). We also consider the Navchip v1 (Thales) IMU with 500 Hz sampling frequency, the performance of which is similar to that of the popular APX15 (Applanix) (also present on board but not considered here), and the Delta TRE-3 (Javad) multi-frequency receiver, post-processed in PPK mode with Grafnav (Novatel). The optimal recursive smoothing of PPK with a navigation-grade AIRINS (iXblue) IMU provides the reference trajectory (T REF in Tab. 1).
We focus on two successive flight lines (FL) depicted in Fig. 3. They include 66 images with 80% forward and 40%-60% lateral overlap and a GSD of approximately 3 cm. The swath width of the lidar is about 180 m and the side overlap is close to 40%. The field-of-view of the lidar unit is almost the same as that of the camera (≈ 60°). Thus, the lidar swath coincides with the image footprints in Fig. 3. The helicopter was flown at 230 m above ground level, which resulted in a point-cloud density of 35 pts/m² to 70 pts/m² (lidar GSD between 10 and 20 cm).
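The relation between point density and lidar GSD can be checked with a rough calculation. This sketch assumes that points are spread uniformly, so that the mean spacing is approximately the inverse square root of the density; the function name is hypothetical.

```python
import math


def mean_point_spacing(density_pts_per_m2):
    """Approximate mean point spacing (m), assuming a uniform point distribution."""
    return 1.0 / math.sqrt(density_pts_per_m2)


spacing_sparse = mean_point_spacing(35.0)  # lowest reported density
spacing_dense = mean_point_spacing(70.0)   # highest reported density
```

Under this assumption both spacings fall within the quoted 10 to 20 cm interval.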

Study cases
The system and optical-sensor calibration parameters, including lever-arms, boresights, and camera-intrinsic parameters, were obtained by different techniques before the examined flight, including: (i) laboratory calibration of the lever-arm proposed in (Vallet, Skaloud, 2004), (ii) in-flight recovery of the boresight between the IMU and the optical sensors as described in (Skaloud, Lichti, 2006), (iii) in-flight determination of camera additional parameters as detailed in (Lichti et al., 2008). These are considered as known. The time-correlated errors of the industrial-grade IMU are treated as unknowns in the dynamic network.
While the GNSS-derived position and velocity information is identical to that of the reference, we examine four trajectory determination approaches (summarized in Tab. 1) using different subsets of the considered sensors (see Section 4.1).

Reference trajectory ( T REF ):
We consider the trajectory computed with the navigation-grade IMU as ground-truth. We use the APPS software (iXblue) to integrate the AIRINS IMU reference data with the GNSS position and velocity (PPK). The trajectory integrated with the AIRINS IMU has attitude errors smaller than 0.003°, which is roughly one order of magnitude lower than those based on MEMS IMUs. This implies that, at ranges < 250 m, the maximum error due to reference attitude is lower than the laser ranging precision, which is ≈ 2 cm (1σ). The boresight between the AIRINS and the Navchip IMUs is known. Since the two units operate under the same conditions, we can safely consider the AIRINS trajectory as ground-truth for a fair comparison.
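The comparison between the reference attitude error and the ranging precision can be verified with a one-line calculation. The sketch assumes that an attitude error displaces a point by the range multiplied by the tangent of the angular error; the function name is hypothetical.

```python
import math


def attitude_error_to_position_error(range_m, attitude_error_deg):
    """Displacement (m) of a point at a given range caused by an attitude error."""
    return range_m * math.tan(math.radians(attitude_error_deg))


# Worst case quoted in the text: 250 m range, 0.003 degree attitude error.
reference_error = attitude_error_to_position_error(250.0, 0.003)  # about 0.013 m
```

The result, about 1.3 cm, is below the 2 cm ranging precision, consistent with the statement above.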

Recursive Smoothing ( T 0 ):
In this approach the trajectory is generated via the loosely coupled integration of the Navchip v1 IMU readings with the GNSS position and velocity solution in a recursive smoother. We use Posproc (Applanix) with internally designed filters tuned for the Navchip v1 IMU.

DN with 3D tie-points ( T 1·3D ):
This approach corresponds to the one proposed in (Brun et al., 2022) in which 3D tie-points between overlapping point-clouds, raw inertial and GNSS observations are tightly fused in a DN. The automatic 3D tie-point extraction is performed as described in Section 2.

DN with 2D tie-points ( T 2·2D ):
This approach corresponds to the one proposed in (Cucci et al., 2017) according to which 2D image tie-points, raw inertial and GNSS readings are fused simultaneously in a DN. This trajectory can be obtained with the freely available tools presented in (Cucci, 2022). The automatic 2D tie-point extraction is achieved using Metashape (Agisoft), where after initial alignment using camera poses derived from T 0 (and before dense matching) we extract the image coordinates of the tie-points per photo. The image observations are used in the DN along with the angular velocities, the specific forces and the GNSS position observations. The camera intrinsic calibration and the boresight with respect to the Navchip v1 IMU were previously estimated and are considered as known.

DN with 3D and 2D tie-points ( T 3·3D·2D ):
This approach combines lidar and camera observations as proposed in Section 3. Both 2D and 3D tie-points are jointly integrated with raw inertial and GNSS measurements in a DN.
All approaches allow us to obtain a high-frequency trajectory for the IMU reference frame. These trajectories are employed to geo-reference the lidar ranges by means of the in-house software LIEO (Skaloud, 2017) and obtain the final point-cloud (the lidar boresight has been previously estimated). The accuracy of each of the four approaches is evaluated based on deviations of each point in the point-cloud with respect to the reference, which is obtained by geo-referencing the same lidar ranges using the reference trajectory T REF.
Table 1. Overview of the examined trajectories and their characteristics, i.e., the estimator used to compute them and the inclusion of tie-points with the IMU and GNSS observations. RS: recursive smoother, DN: dynamic network.

Distribution of 2D and 3D tie-points
An interesting property is observed in the spatial distribution of 2D and 3D tie-points. 2D tie-points are mainly detected in urban areas or areas with low vegetation. For example, feature extraction is adequate in parts of the imagery that correspond to agricultural land, since the different colors of each crop and the harvesting traces create sufficient texture for the automated 2D feature matching (Fig. 4 - area 1). 2D feature detection is sensitive to the surface homogeneity present in, e.g., vegetated, rocky or snow-covered areas. As a consequence, in parts of the imagery dominated by forest or high vegetation (Fig. 4 - area 3), few 2D tie-points are found. This results not only in sparse regions in the 3D point-cloud that can be obtained from photogrammetry only (e.g., by dense matching) and holes in the final digital surface model, but also in less controlled trajectory determination for such areas.
The opposite situation occurs with the automatic 3D tie-point extraction. As also reported in (Brun et al., 2022), in the forest area (Fig. 4 - area 3) there are almost five times more 3D tie-points than in the built-up regions. This is due to the ISS algorithm, which detects more features in high vegetation. In the same figure, one can also distinguish the 50 × 50 m tiles into which the point-cloud was split for 3D tie-point extraction.
The two automatic feature extraction workflows appear to be complementary, i.e., each establishes tie-points where the other cannot, which offers continuous control (conditioning) along the trajectory when 2D and 3D tie-points are considered together (T 3·3D·2D). This was somewhat unexpected, as features are notoriously difficult to detect on high-vegetation surfaces, e.g., trees, due to the high degree of homogeneity and possibly moving branches. This is examined further in the following section.

Trajectory comparison
In the following, we present the orientation error statistics of the different trajectories with respect to T REF. These are summarized in Tab. 2. We do not report position error statistics since they are similar for all methods. Indeed, the estimated position is largely driven by the GNSS measurements, resulting in negligible, sub-cm differences for all methods.
We note that all methods for trajectory correction, i.e., including 2D, 3D or both types of tie-points in the DN adjustment, lead to more accurate orientation estimates than the one obtained via optimal recursive smoothing (T 0). Such improvement is clearly visible in the mean error, where all methods achieve roll and pitch accuracy comparable to the reference (≈ 0.003°), but also in the standard deviation, where an improvement by a factor of 2-3 with respect to T 0 is observed for all methods. A residual bias is visible in the yaw for all methods, but it is still ≈ 4 times smaller than that obtained with T 0. The higher accuracy of the estimated orientation directly translates to a more accurate point-cloud once the estimated trajectories are employed for direct geo-referencing; this is discussed in more detail in Section 5.3.
We also note that 3D tie-points appear to be responsible for most of the improvement in the orientation error. In general, photogrammetry (T 2·2D) exhibits higher residual biases in pitch and yaw, probably because of the weak geometry of the flight and the non-uniform distribution of 2D tie-points over the surveyed area. Note that no ground control points were used; residuals on the check-points are presented together with the employed DN solver in (Cucci, 2022) (their RMS remains at sub-pixel level). Nevertheless, the use of both types of tie-points together (T 3·3D·2D) slightly reduces error statistics in most cases, especially in yaw (STD and RMS), at the cost of a slightly higher mean (bias). It becomes clear that, in order to take full advantage of both types of tie-points, all sensor boresights need to be resolved to better than 0.01°. Given the weak geometry of the examined flight, it is not recommended to refine such parameters within the DN adjustment; for more details, see (Brun et al., 2022).
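The orientation error statistics discussed above (mean, standard deviation and RMS) can be computed as in the following sketch. It assumes angles in degrees sampled at the same epochs and wraps differences to the interval [-180, 180); the function names and sample values are hypothetical.

```python
import math


def wrap_deg(angle):
    """Wrap an angle difference to the interval [-180, 180) degrees."""
    return (angle + 180.0) % 360.0 - 180.0


def error_statistics(estimated_deg, reference_deg):
    """Mean (bias), standard deviation and RMS of the angular error."""
    errors = [wrap_deg(e - r) for e, r in zip(estimated_deg, reference_deg)]
    n = len(errors)
    mean = sum(errors) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    rms = math.sqrt(sum(e ** 2 for e in errors) / n)
    return mean, std, rms


# Hypothetical yaw samples close to north, where wrap-around matters.
estimated_yaw = [359.99, 0.02, 0.03]
reference_yaw = [0.01, 0.00, 0.01]
mean, std, rms = error_statistics(estimated_yaw, reference_yaw)
```

With these definitions the three quantities satisfy rms² = mean² + std², so a residual bias can raise the mean while the STD and RMS still decrease.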

Point-cloud comparison
An accurately estimated trajectory is reflected in the geo-referencing error of the lidar point-cloud. In the considered scenario, the attitude errors are responsible for over 90% of the geo-referencing error budget. As described in Section 4.2, a point-cloud is generated for each one of the four computed trajectories using the system parameters and the raw lidar measurements. The accuracy of each output is evaluated by comparing each point-cloud with the one generated from the reference trajectory. The difference of point coordinates with respect to the reference corresponds to the geo-referencing error, which is shown in Fig. 5, while Tab. 3 presents the error statistics. Low-accuracy IMU observations (Fig. 5 - T 0) result in a poorly geo-referenced 3D model due to imperfect trajectory determination. The mean error in this case (0.51 m in FL1 and 0.46 m in FL2) is about 3 times larger than the mean value of the point-cloud GSD (≈ 0.15 m). A significant improvement factor of 4-5 in terms of lower mean error (0.09 m and 0.12 m) is obtained when 3D tie-points are inserted in the DN estimator (Fig. 5 - T 1·3D). As also reported in (Brun et al., 2022), when 3D tie-points are considered, the mean geo-referencing error is smaller than the mean point-cloud GSD.
In terms of global statistics, the influence of the 2D tie-points (Tab. 3 - T 2·2D) on the geo-referencing accuracy is very similar to that of the 3D tie-points. When combining the observations from both optical sensors, no global geo-referencing accuracy improvement is observed over the whole flight lines compared to the use of 3D tie-points alone. However, there are important local differences, highlighted for example in Fig. 6.
Indeed, in the T 3·3D·2D trajectory, the complementary nature of the 2D and 3D matches on different surface textures gives an even distribution of tie-points along the flight lines. An example is shown in Fig. 6, which focuses on the extreme western side of flight line 1. The lack of 3D tie-points in this area results in considerably larger geo-referencing deviations between the T1 and T3 adjustments. The use of 2D tie-points in the T3 case reduces the maximum geo-referencing error within this area by ≈ 0.3 m, which is significant (Fig. 6).
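The point-wise evaluation described in this section can be sketched as follows. It relies on the fact that both point-clouds are generated from the same lidar ranges, so that points correspond by index; the function name and coordinates are hypothetical.

```python
import math


def georeferencing_errors(cloud, reference_cloud):
    """Norm of the coordinate difference for each pair of corresponding points."""
    return [math.dist(p, q) for p, q in zip(cloud, reference_cloud)]


# Toy example: three points geo-referenced with a test and a reference trajectory.
reference_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
test_cloud = [(0.3, 0.4, 0.0), (1.0, 0.0, 0.1), (0.0, 1.0, 0.0)]

errors = georeferencing_errors(test_cloud, reference_cloud)
mean_error = sum(errors) / len(errors)
max_error = max(errors)
```

The mean summarizes a whole flight line, whereas the maximum exposes local degradations such as the one in the area lacking 3D tie-points.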

CONCLUSIONS
In this work we have evaluated the joint contribution of 2D tie-points from camera observations and 3D tie-points from a lidar point-cloud when employed along with GNSS position and raw inertial observations in a single dynamic network adjustment. We have shown that:
1. 2D tie-points are mostly detected in urban and low-vegetation areas, while 3D tie-points are mostly detected in high-vegetation areas, where 2D feature matching is challenging; this implies that the two types of tie-points are complementary for mixed terrain types,
2. 3D tie-points are responsible for most of the improvement in the trajectory orientation error,
3. the use of 2D tie-points results in higher residual biases in pitch and yaw, probably because of the very weak flight geometry (corridor),
4. the use of both types of tie-points together yields slightly better trajectory error statistics in most cases, except for a higher residual yaw bias,
5. in this specific corridor mapping case, the inclusion of image tie-points in the adjustment does not further improve the geo-referencing accuracy of the point-cloud when evaluated over whole flight lines; in other words, the mean absolute geo-referencing error of the point-cloud is equivalent to the one obtained with 2D tie-points only and the one obtained with 3D tie-points only,
6. in all cases the mean error in the point-cloud coordinates improves by 4-5 times with respect to direct geo-referencing and its value is 1.2-1.5 times smaller than the mean GSD value of the point-cloud, while the standard deviation (norm) is reduced by a factor of 2-3,
7. when both optical observations are considered, the maximum geo-referencing errors are reduced significantly where 3D tie-points are sparse or absent.
The presented use case provides initial and limited, yet encouraging, indications of the practical benefit of including tie-points detected in both sets of optical observations in an adjustment involving raw inertial observations via the dynamic network paradigm. More insights are expected from more detailed future studies.