MAPPING GNSS RESTRICTED ENVIRONMENTS WITH A DRONE TANDEM AND INDIRECT POSITION CONTROL

The problem of autonomously mapping highly cluttered environments, such as urban and natural canyons, is intractable with the current UAV technology. The reason lies in the absence or unreliability of GNSS signals due to partial sky occlusion or multi-path effects. High quality carrier-phase observations are also required in efficient mapping paradigms, such as Assisted Aerial Triangulation, to achieve high ground accuracy without the need of dense networks of ground control points. In this work we consider a drone tandem in which the first drone flies outside the canyon, where GNSS constellation is ideal, visually tracks the second drone and provides an indirect position control for it. This enables both autonomous guidance and accurate mapping of GNSS restricted environments without the need of ground control points. We address the technical feasibility of this concept considering preliminary real-world experiments in comparable conditions and we perform a mapping accuracy prediction based on a simulation scenario.


INTRODUCTION
Unmanned Aerial Vehicles (UAVs) are becoming an important tool for surveyors, engineers and scientists as the number of cost-effective and easy-to-use systems is increasing rapidly (Colomina and Molina, 2014). These platforms nowadays offer an alternative to conventional airborne mapping whenever small or cluttered areas have to be mapped with centimeter-level resolution. Many successful applications have been reported, such as repetitive surveys of buildings, civil engineering structures or construction sites, land monitoring and precision farming.
One important limit of current UAV technology is the dependency on GNSS coverage. Indeed, mapping missions are typically planned offline by defining a set of waypoints in terms of absolute coordinates; the autopilot then closes the position control loops using the position observations from a GNSS receiver. One example is the eBee Plus platform (senseFly, 2016) from senseFly Ltd, a market leader in drones for professional applications, whose ground control segment does not allow take-off if the GNSS reception is degraded. While certain platforms could also be flown in manual mode, the actual improvement in mapping productivity comes with a high degree of platform autonomy, as less qualified personnel is required and the scale of the operation can be wider.
The dependency on GNSS reception limits the applicability of UAV-based mapping in many interesting scenarios, such as natural and urban canyons, in which the sky is in large part occluded by natural or artificial structures. In these situations the quality of the constellation geometry is poor and severe multi-path effects can occur, introducing shifts in the position fix that could result in crashes and making GNSS-based navigation extremely risky. In the worst case it is even impossible to compute the position fix. Examples of such sites, which require regular inspection for assessment, safety and renovation planning, are mountain roads, bridges, rock-fall protection galleries and dams, see Figure 1.
One very active research topic in UAVs and, more generally, in robotics is the development of visual-only or visual/inertial navigation systems, which would allow autonomous platforms to be guided in an unknown environment without depending on GNSS coverage. Despite the number of promising solutions published in scientific venues, see for instance (Forster et al., 2014), the technology readiness level of such systems is still rather low, and no such general system is implemented in commercial products. One reason is that it is practically impossible to formulate guarantees about the performance of such navigation systems.

Figure 1: Rockfall protection structures and bridges in a 300 m deep gorge (Viamala, Thusis, Switzerland), where the GNSS reception is absent or unreliable for autonomous UAV guidance.

Even if such GNSS-independent navigation systems were available and performed well in arbitrary environmental conditions, high quality GNSS carrier-phase measurements would still be required to perform high accuracy photogrammetry. Indeed, by far the most common approach to image orientation in UAVs, Aerial Triangulation (AT), also referred to as Indirect Sensor Orientation (ISO), is based solely on image observations, yet a dense network of ground control points (GCPs) is required to ensure global orientation and 3D pointing accuracy. First, establishing ground control is extremely expensive in time and money in the absence of GNSS coverage, as conventional topographic methods based on total stations have to be put in place. Second, the topology of such scenarios can make access to certain areas very impractical and even dangerous for the operators, see again Figure 1. It is well known that the requirements on GCPs can be eliminated in image-block scenarios if precise absolute or relative aerial control is introduced in the bundle adjustment, in the so-called Assisted Aerial Triangulation (AAT) fashion (Rehak and Skaloud, 2015, Mian et al., 2015, Eling et al., 2014). Indeed, the recent evolution of GNSS antenna technology has made it possible to use multi-frequency and multi-constellation GNSS receivers on board commercial MAVs (Mavinci, 2016, senseFly, 2016) and to integrate the derived "geo-tags" (i.e., aerial position control) within the established processing software, e.g., (Pix4D, 2016).
In this work we propose a novel mapping concept, based on two UAVs, that enables the autonomous acquisition of aerial images in cluttered environments where the GNSS reception is degraded, such as deep gorges and natural and urban canyons. The first drone flies above the canyon, where the GNSS reception is good. The second drone flies autonomously in the gorge using position observations provided by the first drone. These are determined in real time by tracking multiple optical signaling devices (e.g., high power LEDs) mounted on the second drone. Via the concept of indirect position control, the proposed mechanism also allows the aerial images taken by the second drone to be georeferenced, and thus enables accurate mapping without the need to establish dense networks of ground control points.
The idea of cooperative mapping is not new in the literature, yet it is often focused on strategies to divide the work and perform it in parallel (Avellar et al., 2015, Lakeside Labs, 2013). Cooperative localization instead consists in having a tight link between the mapping robots that permits them to achieve a shared notion of each one's position. In (Tully et al., 2010) three terrestrial robots are equipped with cameras and an optical target and move in a so-called "Leap-Frog" pattern: one robot moves while the other two stay stationary, then the roles of the robots are exchanged. This pattern makes it possible to build a triangulation network similar to the ones used for mapping entire countries with theodolites in the nineteenth century (Levallois, 1988). This cooperative principle is used for terrestrial robots, for example in (Marjovi et al., 2010) where olfactory sensors (air quality sensors) are embedded on the robots, for underwater vehicles (Matsuda et al., 2015) and for a team of UAVs (Grocholsky and Michael, 2013). In this last case, if the precision of the positioning is not satisfactory, one UAV could land and act as a fixed beacon. (Pires et al., 2016) raises the problem of the complexity of dealing with a large team of cooperative robots.
Recently (Wanasinghe et al., 2015) introduced a hierarchy between the robots. Certain robots (called leaders) have better localization capabilities and higher quality sensors and can assist in localization the robots which do the detailed mapping (child robots). Such a hierarchy also exists in the mapKITE project, where tactical grade navigation instruments are placed on a terrestrial vehicle, along with an optical target. This target makes it possible to track the moving terrestrial vehicle from a UAV and to enhance its aerial mapping accuracy (Cucci, 2016, Molina et al., 2017).
In this work we build on cooperative localization ideas and propose a solution to replace the GNSS signal both in real time, for guidance, and in post-processing, for accurate mapping without ground control points. After presenting the concept in detail in Section 2, we discuss in Section 3 how the main technical difficulties could be tackled, based on preliminary real-world experiments. In Section 4 we present the results of mapping accuracy predictions using different flavours of indirect position control in a conventional bundle adjustment scenario. We conclude the paper with some remarks and hints towards a real implementation.

INDIRECT POSITION CONTROL
In this work we propose a novel mapping system suited for operations in cluttered outdoor environments where natural or artificial structures occlude the line-of-sight to GNSS satellites. The system is based on two UAVs, refer to Figure 2. The first one, from now on referred to as D1, performs the actual mapping mission, acquiring high resolution nadir and possibly side aerial images. The second one, D2, carries high accuracy navigation sensors. It follows D1 and provides position observations for D1 in real time. D2 also captures nadir images to be used in post-processing along with the ones acquired by D1. A detailed description follows.
D2 flies in line of sight with respect to D1, typically, but not necessarily, above it. D2 flies high enough that no environmental structure occludes the sky and the GNSS constellation is ideal. The payload of D2 includes a high grade INS/GNSS navigation system, such as, for instance, the SPAN-IGM-A1 (Novatel, 2016). Such systems nowadays weigh around 0.5 kg and are suitable for rotary-wing UAVs. The position and the orientation of D2 are thus available with high precision in real time (RTK GNSS can be employed, but it is not necessary). The payload of D2 also includes a high resolution machine vision camera to acquire nadir images, store them, and make them available for processing by an on-board companion computer.
Multiple high power LEDs are mounted in a known, asymmetric, 3D pattern on the upper part of the D1 frame. These LEDs are visible in camera images from a very long distance, as we will show later on, and are robustly identifiable with simple image processing algorithms. As the 3D LED pattern is known, the relative position and orientation of D2 with respect to D1 can be determined by solving the Perspective-n-Point problem (Wu and Hu, 2006). For this, the intrinsic camera calibration parameters must be known, yet, as we will discuss later on, the quality of such calibration is not critical for the real-time processing.
Once the relative position of D2 with respect to D1 is known, the absolute position of D1 can also be determined in real time: we compose the absolute position and orientation of D2 given by the INS/GNSS navigation system with the relative information from the visual tracking system. The solution is then transmitted to D1, which uses it as a position observation in the autopilot navigation filter, as if it were computed by a conventional GNSS receiver. This is what we call indirect position control.
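The composition described above can be illustrated with a minimal sketch. It assumes that the Perspective-n-Point solution has already provided the position of the D1 LED pattern origin in the D2 camera frame, and that the camera boresight and lever-arm on D2 are calibrated. The function name, the argument names and the frame conventions (an East-North-Up mapping frame, a camera frame with the optical axis along +z) are illustrative assumptions, not part of the original system.

```python
import numpy as np

def rot_z(yaw_rad):
    """Rotation matrix for a rotation of yaw_rad about the z axis."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def d1_absolute_position(p_d2_world, R_world_body, R_body_cam,
                         lever_arm_body, t_cam_d1):
    """Absolute position of the D1 LED pattern origin (hypothetical helper).

    p_d2_world     : D2 position in the mapping frame, from INS/GNSS
    R_world_body   : rotation from D2 body frame to mapping frame, from INS/GNSS
    R_body_cam     : boresight, rotation from camera frame to D2 body frame
    lever_arm_body : camera centre expressed in the D2 body frame
    t_cam_d1       : D1 pattern origin in the D2 camera frame, from PnP
    """
    # Camera centre in the mapping frame.
    p_cam_world = p_d2_world + R_world_body @ lever_arm_body
    # Rotate the relative vector into the mapping frame and add it.
    return p_cam_world + R_world_body @ R_body_cam @ t_cam_d1

# Example with assumed values: nadir camera (optical axis pointing down),
# D2 yawed by 90 degrees, D1 seen 70 m along the optical axis and 10 m off-axis.
R_body_cam = np.diag([1.0, -1.0, -1.0])  # 180 degree rotation about x
p_d1 = d1_absolute_position(
    np.array([100.0, 200.0, 500.0]), rot_z(np.pi / 2), R_body_cam,
    np.zeros(3), np.array([10.0, 0.0, 70.0]))
print(p_d1)
```

In a real system the uncertainty of the D2 attitude would also have to be propagated, since an attitude error acts on the relative vector as a lever.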
Once an absolute position fix is available, D1 can perform waypoint-based navigation, and thus execute a conventional mapping mission autonomously. Such a mission can be planned beforehand by means of 3D mission planning software, such as (Gandor et al., 2015). D1 is equipped with a conventional nadir camera suited for UAVs, such as the Sony NEX-5, as in (Skaloud et al., 2014). Whereas the nadir camera is required, as will become clear in the following, a side camera can optionally be installed in case the user wants to map facades or slopes, see again Figure 2. A low-cost IMU can also be installed on D1; it provides relative attitude control in post-processing, as in (Blázquez and Colomina, 2012), as well as some robustness in case of temporary loss of position fixes from D2.
For this concept to work, D2 has to follow D1 such that D1 is always in line-of-sight. This is critical because, if the line-of-sight is lost, the position fix for D1 is lost as well, possibly leading to accidents. The simplest strategy is for D2 to generate for itself a stream of waypoints that always lie on the vertical of D1. D2 could also send commands to D1 to control the execution of the mission plan, such as pausing or aborting it, in case, for instance, the line-of-sight is in danger or the speed is too high.
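The following sketch illustrates the simplest following strategy and a possible supervision rule. All names and thresholds (vertical separation, minimum altitude, image border margin, maximum speed) are hypothetical values chosen for illustration only.

```python
def follow_waypoint(d1_position, separation_m, min_up_m):
    """Next D2 waypoint: on the vertical of D1, separation_m above it,
    but never below min_up_m (assumed safety floor, e.g. the canyon rim)."""
    east, north, up = d1_position
    return (east, north, max(up + separation_m, min_up_m))

def supervise(led_pixels, image_size, d1_speed, margin_px=200, max_speed=5.0):
    """Decide which command D2 sends to D1 (hypothetical rule).

    led_pixels : list of (u, v) LED detections in the current D2 image
    image_size : (width, height) of the D2 image in pixels
    d1_speed   : estimated D1 speed in m/s
    """
    if not led_pixels:
        # Line-of-sight lost: no position fix can be provided.
        return "ABORT"
    width, height = image_size
    near_border = any(
        u < margin_px or u > width - margin_px
        or v < margin_px or v > height - margin_px
        for u, v in led_pixels
    )
    if near_border or d1_speed > max_speed:
        # Line-of-sight in danger or D1 too fast: let D2 catch up.
        return "PAUSE"
    return "CONTINUE"
```

A practical implementation would probably add hysteresis and a timeout before aborting, so that a single missed detection does not interrupt the mission.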
Once the mapping mission has been performed, the data have to be post-processed in order to obtain the final mapping products. In the following we propose a post-processing strategy that can be performed with currently available commercial software.
As a first step, the INS/GNSS raw data from D2 are fused by means of an offline Kalman smoother, such as the one available in commercial INS/GNSS processing software, e.g., POSPac (Applanix, 2016). This gives centimeter-level position (GNSS raw observations are processed in carrier-phase differential mode) and orientation for D2, the quality of which depends on the available IMU.
Next, the two streams of nadir images, from D1 and D2, are processed together for automatic tie-point detection. There will thus be two kinds of matches: i) features that are matched only between images belonging to the same stream (i.e., only seen by D1 or only by D2), and ii) features that are matched in both streams, or, in other words, features that are identified in at least one image from D1 and one image from D2. Matches of type ii) are the ones that allow the global position control to be transferred between D1 and D2, which we call off-line indirect position control.
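The distinction between the two kinds of matches can be made concrete with a short sketch. The input format, a list of (tie-point identifier, drone) pairs with one entry per image observation, is an assumption made for illustration; real matching software exports richer structures.

```python
def classify_tie_points(observations):
    """Split tie-points into single-stream and common ones.

    observations : iterable of (tie_point_id, drone), drone in {"D1", "D2"},
                   one entry per image observation of the tie-point.
    Returns (single_stream_ids, common_ids), both sorted.
    """
    seen_by = {}
    for tie_point_id, drone in observations:
        seen_by.setdefault(tie_point_id, set()).add(drone)
    # Type i): observed in images of one stream only.
    single = sorted(tp for tp, drones in seen_by.items() if len(drones) == 1)
    # Type ii): observed in at least one D1 image and one D2 image.
    common = sorted(tp for tp, drones in seen_by.items() if len(drones) == 2)
    return single, common

observations = [(1, "D1"), (1, "D1"), (2, "D1"), (2, "D2"), (3, "D2"), (3, "D2")]
single, common = classify_tie_points(observations)
print(single, common)
```

Only the tie-points in the second list transfer position control from D2 to D1, so their number and spatial distribution are worth checking before the adjustment.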
Image observations from D1 and D2, and absolute position and orientation control for the D2 ones, obtained from INS/GNSS (we assume that images from D2 are time-tagged via the GNSS receiver), are then combined in conventional bundle-adjustment software capable of Assisted Aerial Triangulation (AAT). This step yields the nadir mapping products.
As we will discuss in Section 4, there are cases in which only a limited number of common tie-points is available between D1 and D2 images. In this case, the precise image positions of the signaling devices fixed on D1, as seen in D2 images, can also be introduced in the bundle adjustment as extra collinearity observations. Also, relative orientation control obtained by pre-processing D1's IMU data should be considered, as in (Blázquez and Colomina, 2012, Rehak and Skaloud, 2016), which may require custom adjustment software.
Once the positions and the orientations of the D1 nadir camera are known, they can be used as position and orientation control for the D1 oblique cameras, after the proper boresight and lever-arm have been applied. This allows the conventional Assisted Aerial Triangulation (AAT) pipeline to be run for these images as well. Nadir and side images can also be processed together for increased accuracy, provided that the bundle-adjustment software can handle boresights and lever-arms between different cameras.
The proposed mechanism allows autonomous mapping missions to be performed in environments that are intractable with currently available technology. We will discuss certain critical, yet technical, details in the next section. The proposed adjustment scheme also yields accurate georeferenced mapping products even in the absence of absolute position control for D1. In Section 4 we will discuss different adjustment scenarios and derive conclusions regarding the precision that can be expected for both mapping products.

TECHNICAL FEASIBILITY
Here we discuss possible issues and point towards technological solutions that have worked in the past in similar scenarios.

Visual Tracking of D1 from D2
We suggest realizing the real-time visual tracking of D1 from D2 by locating, in D2 nadir images, three high power LEDs fixed in an asymmetric pattern on the D1 airframe. To validate this idea, we placed three high power white LEDs above black areas on an optical target; one B&W image taken from 27 m is shown in Figure 3, along with a detail of the lower-right corner. One pixel on the image plane corresponds to approximately 9 mm on the optical target plane, whereas the LED dimension is 11 × 11 mm.
The LEDs appear as easily distinguishable peaks in the image intensity. Note that part of the light coming from the LED is also captured by neighbouring pixels due to the lens point spread function. These pixels are also saturated, which suggests that the LED would have been clearly visible from a longer distance as well. Also note that the LEDs are light sources pointing towards the camera and thus they are inherently brighter than any other object in the environment, with the exception of spurious reflective surfaces possibly present in the scene. The power of the employed LEDs was 10 W, which is insignificant compared to the power consumption of rotary-wing UAV engines. Higher power LEDs can also be employed.
The concept of isolating intensity peaks in camera images to locate 3D targets is well known and successfully employed in commercial 3D motion capture systems. These typically employ passive targets which reflect infrared light, an approach that does not work in outdoor environments and with conventional cameras.
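As an illustration of how simple the peak isolation can be, the sketch below thresholds a grey-level image near saturation and groups the bright pixels into blobs with a greedy distance test. It is a toy example under stated assumptions (8-bit image, blobs much smaller than their mutual separation); the threshold and separation values are hypothetical, and this is not the detector used in the experiments.

```python
import numpy as np

def led_centroids(image, threshold=250, min_separation_px=10):
    """Intensity-weighted centroids (u, v) of near-saturated blobs.

    image             : 2D array of grey levels (rows = v, columns = u)
    threshold         : grey level above which a pixel is considered an LED
    min_separation_px : pixels closer than this to the first pixel of a
                        blob are assigned to that blob (greedy grouping)
    """
    vs, us = np.nonzero(image >= threshold)
    points = np.column_stack([us, vs]).astype(float)
    weights = image[vs, us].astype(float)
    clusters = []  # each cluster is a list of indices into points
    for i, p in enumerate(points):
        for cluster in clusters:
            if np.linalg.norm(points[cluster[0]] - p) < min_separation_px:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return [tuple(np.average(points[c], axis=0, weights=weights[c]))
            for c in clusters]

# Synthetic image with three saturated 3x3 blobs.
img = np.zeros((100, 100), dtype=np.uint8)
for u, v in [(20, 30), (70, 30), (40, 80)]:
    img[v - 1:v + 2, u - 1:u + 2] = 255
print(led_centroids(img))
```

Spurious reflections would show up as extra blobs; the known asymmetric geometry of the LED pattern can then be used to reject them.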

Accuracy of the Real-time Indirect Position Fix
Within the scope of the mapKITE project, an experiment was performed to test the feasibility of optically following a terrestrial vehicle. An optical target (Cucci, 2016) was mounted on top of the vehicle and tracked in real time by the UAV. The relative position of the target was determined by identifying five points on the target and then solving the Perspective-n-Point problem, see Figure 4. The absolute position of the terrestrial vehicle was then determined by composing this relative information with the real-time absolute position and orientation given by an INS/GNSS navigation system placed on the UAV. This setup is very similar to the one considered in this work and is well suited to quantifying the quality of the real-time indirect position control.
A description of the experimental setup follows. The rotary-wing UAV was equipped with a 4 Mp machine vision camera and the Trimble APX-15 INS/GNSS navigation system (in stand-alone mode). In this configuration, the error RMS for the APX-15 is 1 to 3 m for position, 0.04 deg for roll and pitch, and 0.3 deg for heading, according to the producer's specifications (Trimble, 2014). The UAV flew at an average elevation of 100 m with respect to the terrestrial vehicle, which was driven for 2 km. The target was isolated and measured in 760 images.
A tactical grade INS/GNSS navigation system was used to determine, in post-processing, the reference position of the target center. The position error can be assumed to be below 5 cm. We compare the real-time target positions determined from the UAV with the reference. The error statistics are shown in Table 1, and their empirical probability density function is shown in Figure 5. Equal or better accuracy and precision were obtained with respect to the conventional code-only GNSS receivers commonly employed on UAVs. These results were obtained without boresight and focal-length calibration for the camera, which could explain part of the systematic error visible in Figure 5. This experiment suggests that an indirect position fix for D1 can be computed in real time from D2 with sufficient quality to replace a conventional GNSS receiver for navigation purposes.

Tie-points Matched in Both D1 and D2 Nadir Images
As presented in Section 2, indirect position control from D2 to D1 is obtained when the same environmental feature is seen by both UAVs' nadir cameras. As D2 alone can accurately georeference world features seen in its own images via AAT, these points can act as ground control points for D1 if they are also seen in D1's nadir images. Thus, the key to indirect position control is that enough image points are correctly matched between D1 and D2 nadir images.
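To confirm that such matches are possible and indeed common, even though images are captured from different elevations and orientations, we examine the tie-points extracted with Pix4D mapper in a standard, UAV-based photogrammetric flight over a rural area, see Figure 6. North-South flight lines were flown at an elevation of 150 m, while East-West ones were flown at 190 m. The average GSD was 4.55 cm. A total of 1885 usable tie-points were extracted, out of which 1746 (92.63%) were seen from both elevations, while only 139 (7.37%) were matched in one image stream only. The density was 130 tie-points per hectare, which is quite conventional for this kind of survey.
From Figure 6 it is possible to see that common tie-points are uniformly distributed in the considered area (the red dashed polygon) and that there is no area in which these points are missing. We recognise that the considered flight depicts a nearly optimal case, and that the elevation difference between crossing flight lines may not reflect the one needed in the environments considered in this work. In the following we will consider a much lower percentage of common tie-points and we will show how the proposed method can work in much more degraded scenarios.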

MAPPING ACCURACY PREDICTION
In this section we formulate predictions on the mapping quality achievable with the proposed method based on a simulated scenario.
We are interested in the precision of the tie-point 3D positions obtained in a conventional bundle-adjustment scenario. The parameters describing the photogrammetric network are the absolute poses of each drone (position and orientation) and the 3D position of each tie-point. These parameters are concatenated to form the state vector $x$. The observations are: i) position and orientation control obtained from the D2 INS/GNSS navigation system (post-processed in tightly coupled, carrier-phase differential mode), ii) image observations of the tie-points in both D1 and D2 images, and iii) (optionally) image observations of the D1 LEDs in D2 nadir images. These observations are concatenated to form the observation vector $\ell$. It is possible to build a function $f$ which simulates $\ell$ knowing $x$: $\ell = f(x)$.
The design matrix $A$ is defined as the Jacobian matrix of $f$ with respect to the state vector $x$, see Equation 1:

$$A = \frac{\partial f(x)}{\partial x} \qquad (1)$$

The observation models are well known, e.g., see (Rehak and Skaloud, 2016). The covariance matrix $\Sigma_{xx}$ of the parameter vector is obtained from the design matrix $A$ and the observation covariance matrix $\Sigma_{\ell\ell}$:

$$\Sigma_{xx} = \left(A^{T} \Sigma_{\ell\ell}^{-1} A\right)^{-1} \qquad (2)$$

The predicted tie-point precision is obtained from the proper diagonal blocks of $\Sigma_{xx}$.
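The precision prediction can be sketched numerically as follows. The sketch assumes the usual weighted least-squares expression, in which the parameter covariance is the inverse of the normal matrix built from the design matrix and the observation covariance. It uses a finite-difference Jacobian and a toy linear observation function; the function names and the toy model are illustrative assumptions, not the simulator used in this work.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x (the design matrix A)."""
    x = np.asarray(x, dtype=float)
    n_obs = np.asarray(f(x), dtype=float).size
    A = np.zeros((n_obs, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps
        A[:, j] = (np.asarray(f(x + dx)) - np.asarray(f(x - dx))) / (2.0 * eps)
    return A

def predicted_covariance(A, sigma_ll):
    """Parameter covariance (A^T Sigma_ll^-1 A)^-1 from the design matrix A
    and the observation covariance sigma_ll."""
    normal_matrix = A.T @ np.linalg.inv(sigma_ll) @ A
    return np.linalg.inv(normal_matrix)

# Toy model: two parameters, three observations with 0.1 standard deviation.
A0 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f = lambda x: A0 @ x
A = numerical_jacobian(f, [0.3, -0.2])
sigma_xx = predicted_covariance(A, 0.01 * np.eye(3))
print(np.sqrt(np.diag(sigma_xx)))  # predicted standard deviations
```

For a real photogrammetric network the normal matrix is large and sparse, so a sparse factorization would be preferable to an explicit dense inverse.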
For this case study we consider an irregular, 350 m long canyon, up to 70 m wide and 100 m deep. See Figure 7 for the contour lines.
Both D1 and D2 cameras have a 16 Mp sensor (4912×3264 pixels) and a focal length of 16 mm (≈ 3300 pix). Thus, the vertical field of view is 73°, and the horizontal one is 53°. The precision of a tie-point observation is assumed to be one pixel, while that of a LED observation is one third of a pixel. The standard deviation of the position control for D2 is 2 cm in planimetry and 3 cm in elevation, which is compatible with GNSS carrier-phase differential processing. For the orientation control, we considered a standard deviation of 0.012° for roll and pitch, and 0.074° for heading, as reported for the SPAN-IGM-A1 (Novatel, 2016).
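The stated fields of view follow from the sensor size and the focal length expressed in pixels. A minimal check, assuming an ideal pinhole camera with the principal point at the image centre (the function name is illustrative):

```python
import math

def field_of_view_deg(n_pixels, focal_px):
    """Full field of view, in degrees, across n_pixels for a pinhole camera."""
    return math.degrees(2.0 * math.atan(n_pixels / 2.0 / focal_px))

fov_long_side = field_of_view_deg(4912, 3300)   # about 73 degrees
fov_short_side = field_of_view_deg(3264, 3300)  # about 53 degrees
print(round(fov_long_side, 1), round(fov_short_side, 1))
```

The 73° value corresponds to the 4912-pixel side of the sensor and the 53° value to the 3264-pixel side.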
D2 flies between 110 m and 115 m above the canyon floor; its ground sampling distance is around 33 mm on the floor of the canyon, and the footprint of the image is around 110 m (considered in the direction of the canyon). The forward overlap is around 90 %. D1 flies between 36 m and 42 m above the canyon floor. The ground sampling distance of its nadir camera is around 11 mm on the floor of the canyon, and the footprint of these images is around 38 m (considered in the direction of the canyon). The longitudinal distance (i.e., in the direction of the canyon) between two poses remains 10 m, but the drone also performs lateral displacements (i.e., perpendicular to the direction of the canyon). The overlap between two successive images is up to 70 %. Two side cameras are also embedded on D1. These cameras are equivalent to the nadir one, and are rotated by 90°. The distance from the canyon slopes oscillates between 10 m and 35 m, so the GSD varies from 3 mm to 11 mm, and the average overlap of the oblique images is around 40 %.
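The ground sampling distances and footprints quoted above can be reproduced with simple pinhole geometry. The sketch below assumes flat terrain, a nadir view, the 3264-pixel side of the sensor aligned with the canyon direction, and, for D2, the same 10 m spacing between exposures that is stated for D1. The function names are illustrative.

```python
FOCAL_PX = 3300          # focal length in pixels, from the text
PIXELS_ALONG_CANYON = 3264  # assumed: short sensor side along the canyon

def gsd_m(distance_m, focal_px=FOCAL_PX):
    """Ground sampling distance in metres at the given object distance."""
    return distance_m / focal_px

def footprint_m(distance_m, n_pixels=PIXELS_ALONG_CANYON, focal_px=FOCAL_PX):
    """Image footprint in metres across n_pixels at the given distance."""
    return n_pixels * distance_m / focal_px

def forward_overlap(footprint, baseline_m):
    """Fraction of the footprint shared by two successive images."""
    return 1.0 - baseline_m / footprint

d2_gsd = gsd_m(110.0)                                    # about 0.033 m
d2_footprint = footprint_m(110.0)                        # about 109 m
d2_overlap = forward_overlap(d2_footprint, 10.0)         # about 0.91
d1_gsd = gsd_m(38.0)                                     # about 0.0115 m
d1_footprint = footprint_m(38.0)                         # about 37.6 m
side_gsd_range = (gsd_m(10.0), gsd_m(35.0))              # about 3 mm to 11 mm
print(d2_gsd, d2_footprint, d2_overlap, d1_gsd, d1_footprint, side_gsd_range)
```

These values agree with the rounded figures in the text; the actual overlap of D1 images also depends on the lateral displacements, which this sketch ignores.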
The simulation results are summarized in Table 2. The rows D1, D2, D12 and Side give the precision and the number of the tie-points visible, respectively, by the D1 nadir camera, by D2, by both, and by the D1 side cameras. σx is the precision along the x direction (perpendicular to the direction of the canyon), σy is the precision along the y direction (in the direction of the canyon), and σz is the precision along the z direction.
The state-of-the-art (SOTA) approach would be to map the canyon with a single drone equipped with an INS/GNSS navigation system and one or multiple cameras. This approach cannot work here due to the degraded GNSS constellation. Nevertheless, we can pretend that high quality GNSS observations were available and use this case as a reference for comparing the other cases (column SOTA case of Table 2).
We consider four different adjustment scenarios. In the first case (Case 1) several tie-points are visible both by the upper drone and by the lower one (row D12 of Table 2). Most of these tie-points are visible in at least two images of D2. It is thus possible to determine their position thanks to D2, and they can act as GCPs for D1. The precision of D1 tie-points matches that of the SOTA case, meaning that the position and orientation control for D1 is fully replaced by the indirect approach proposed in this work. In highly cluttered environments, like urban or natural canyons, the number of common tie-points visible both by D1 and D2 could be lower than in Case 1. The lower the number of common tie-points, the higher the standard deviation of the tie-points. The extreme case arises when there are fewer than 3 common tie-points: the system becomes unsolvable. Case 2 is a middle case, between Case 1 and this unsolvable case.
In Case 3, all the common tie-points are removed, see Figure 8. To make the system solvable again, we introduce the image observations of the LEDs. These observations make it possible to replace all the common tie-point measurements between D1 and D2. The results are comparable to those of Case 1 for the tie-points we are interested in: the tie-points visible by the nadir and side cameras of D1. This shows the importance of LED observations, which could substitute for hundreds of common tie-points between D1 and D2 in difficult scenarios. Such observations are always available in post-processing, as D2 has to keep D1 in the line-of-sight and uses the LEDs to provide the real-time position fix. However, the x precision of the tie-points taken by the nadir camera of D1, and the z precision of the tie-points taken by the side camera, are worse than in Case 1. This is due to the poor determination of the roll angle of D1.
A final case is also considered in which we add another type of observation, more difficult to achieve in practice, that is, the D2 position in D1 images, as if LEDs were also placed on the bottom of D2. The roll and pitch angles become more observable, as these observations have the effect of introducing position control with tens of meters of lever-arm (position control is available for D2), thus also constraining the D1 orientation. The results are comparable to the SOTA case (except for the altitude, whose precision is slightly worse).

CONCLUSIONS
This paper has presented a new technique for mapping highly cluttered environments, such as natural or urban canyons. The principle is cooperative mapping between two drones: one flies high enough to receive GNSS signals and localizes the other one, which flies in the cluttered environment.
The visual link between the two drones has shown its importance, first, for guidance purposes (to guide the lower drone) and, second, for post-processing the photogrammetric data. This visual link makes it possible to reach an accuracy comparable with the one achievable in scenarios that are not GNSS-denied.
In this work we have neglected all the important aspects related to intrinsic camera calibration and to the determination of boresights and lever-arms. We considered the cameras, the lever-arms and the boresight matrices to be perfectly calibrated. However, we argue that the intrinsic camera calibration is also observable in the combined adjustment of D1 and D2 images, and that lever-arms and boresights can be calibrated in dedicated flights, as is common in single-drone UAV-based photogrammetry. The only non-trivial lever-arms are the ones which relate the D1 camera to the LEDs. However, these can be determined with millimeter-level accuracy with careful UAV fabrication.

Figure 2: Schematic representation of the proposed method. Red shading represents the field of view of the cameras embedded on the D1 and D2 drones, blue lines represent image measurements, and black dotted lines represent carrier-phase GNSS observations.

Figure 3: Three 10 W LEDs placed on the corners of an optical target, with a zoom on one of them.The image was taken at a distance of 27 m, 1 px ≈ 9 mm.

Figure 4: A portion of an aerial image of the mapKITE terrestrial vehicle with the optical target. The red dots mark the identified points for the PnP problem. A cube was overlaid on the image based on the extracted target 3D position and orientation.

Figure 5: Empirical probability distribution of the target positioning error.

Figure 6: Planimetric position of the tie-points extracted with Pix4D mapper in a standard UAV-based photogrammetric flight over a rural area (North-South flight lines flown at an elevation of 150 m, East-West ones at 190 m, average GSD 4.55 cm, 130 tie-points per hectare). The black lines are the UAV flight path. Yellow dots are tie-points seen from both N-S and E-W flight lines (1746 of 1885, 92.63%); blue dots are seen only from N-S or only from E-W flight lines (139, 7.37%).

Figure 7: Contour lines of the canyon every five meters in height.

Table 1: Real-time target tracking error statistics with respect to a local-level, East-North-Up frame.

Table 2: Accuracy prediction of the tie-points representing the canyon floor and the canyon slopes (unit: mm).