SURVEY ACCURACY AND SPATIAL RESOLUTION BENCHMARK OF A CAMERA SYSTEM MOUNTED ON A FAST FLYING DRONE

Abstract. Many drones are used to obtain high-resolution imagery, and the subsequent derivation of 3D object points from such images is an established technique. While rotor-craft drones are often used to capture fine, detailed structures and objects in small-scale areas, fixed-wing versions are commonly used to cover larger areas, even far beyond line of sight. These drones usually fly at much higher velocities during data acquisition, and the corresponding sensor requirements are therefore much higher. This paper presents the evaluation of a prototype camera system for fast-flying fixed-wing drones. The focus of the investigation is to find out whether higher operating velocities, up to 100 km/h during image acquisition, have any influence on the photogrammetric survey and on image quality itself. It will be shown that images obtained by the presented camera system and carrier do not suffer from motion blur and that the overall survey accuracy is approximately 1/4 of the ground sample distance. Survey accuracy analysis is carried out with standard photogrammetric procedures, using signaled control points and checkpoints and verifying their conformity in image space and object space. Fundamentals of image quality will be introduced, as well as an approach to determine and evaluate motion smear of remote sensing sensors (in theory and in a practical use case). Furthermore, it will be shown that the designed camera system mounted on a fixed-wing carrier does not suffer from motion smear.



INTRODUCTION
The rapid development of commercial drones has led to an increased availability of civilian solutions that are reliable, safe and easy to operate. There are rotor-craft and fixed-wing designs.
Most common are small rotor-craft drones (e.g. DJI, Yuneec, Intel) equipped with optical payloads. These systems are suitable for small-scale aerial imaging applications, e.g. optical measurement and documentation of buildings and industrial plants.
In comparison to rotor-craft drones, fixed-wing versions (e.g. Delair, senseFly, Quantum Systems) allow for longer flight times because of their gliding characteristics, which makes them an ideal tool for large-scale mapping applications (Hein et al., 2019) and for LIDAR and multi-spectral applications (Khan et al., 2017, Bakuła et al., 2018, Hruska et al., 2012). Latest developments of rotor-craft and fixed-wing drones increasingly address the survey market. These drones are equipped with advanced GNSS receivers to improve position accuracy by enabling real-time kinematic (RTK) and post-processed kinematic (PPK) techniques, leading to a more precise approximation of the exterior orientation of aerial images. Survey accuracy of models derived from all these systems can be obtained using a geodetic-photogrammetric test field (Przybilla et al., 2019).
Fixed-wing drones operate at typical velocities between 80 km/h and 100 km/h (rotor-craft approx. 20-40 km/h) and have the advantage of far longer flight endurance, between 60 min and 120 min (rotor-craft approx. 20-40 min). The question therefore arises whether survey accuracy and image quality decrease when operating at higher velocities. Survey accuracy is determined using the aforementioned geodetic-photogrammetric test field, and image quality is assessed in terms of spatial resolution according to the upcoming German standard DIN 18740-8 "Photogrammetric products - Part 8: Requirements for image quality (quality of optical remote sensing data)". The prototype camera system used is described in section 2. A short review of the test field and terrestrial geodetic surveying is given in section 3. The planned and executed campaign with its underlying preliminary considerations is described in section 4. A short review on image quality and spatial resolution measurement is given in section 5, followed by a theoretical description of motion smear and a benchmark procedure to compare sensor-lens combinations under static (laboratory) and kinematic (operating) conditions in the same section. Related results for survey accuracy and spatial resolution are presented in section 6, followed by several conclusions in section 7.

CAMERA SYSTEM
Due to national legislation the maximum take-off weight (MTOW) is often restricted, which also limits the payloads. Using carriers with an MTOW of less than 5 kg is quite popular in Germany (BMVI, 2016), which was the starting point to develop a lightweight and metric aerial camera system (Kraft et al., 2016b) with the intention to verify the system using traditional photogrammetric evaluation procedures (Kraft et al., 2016a). This investigation confirmed that the prototype is a metric camera system with long-term stable interior orientation.
Based on this metric camera system the first prototype of the drone-based real-time mapping camera was developed in 2018 (see figure 1). The system incorporates an industrial camera, a dual-frequency GNSS receiver including inertial-aided attitude processing (INS), and an embedded computer. The camera head consists of a 16 MPix CCD sensor (ON Semiconductor KAI-16070 with Bayer pattern) and an industrial F-Mount lens (Schneider Kreuznach Xenon-Emerald 2.2/50). The aperture is set to f/4.0 and the focus is fixed to the hyperfocal distance. Exterior orientation calculation is based on a dual-antenna GNSS receiver (Novatel OEM7720) in combination with an industrial-grade MEMS-IMU (Epson G320N). The dual-antenna set-up is used to determine true heading independently from the INS. This improves the orientation accuracy, in particular when movement direction and heading do not correlate due to cross-wind. Depending on flight trajectory, differences of up to 10 degrees have been observed. Additionally, the dual-antenna system allows for very fast attitude initialization already on the ground without aircraft movement. The distance (basis) between both GNSS antennas (mounted in the front and tail) is 0.95 m. The GNSS receiver continuously estimates position and attitude. The end-of-exposure signal is passed to the GNSS receiver.
Thus, every image is assigned a precisely measured time, position and orientation. Considering the (long-term stable) interior camera orientation, direct geo-referencing can be applied. Due to continuous synchronization of all subsystems, each image can be time-stamped with a precision better than 1 µs. Time synchronization, image acquisition and real-time image processing are done by the embedded computing unit. This computer is powered by a quad-core processor (Intel Atom E3950) with 8 GB RAM and runs a Linux operating system. In this configuration the system can capture up to 4 raw images per second, which can be stored on a removable storage device. The camera system is shown in figure 1 (bottom). The weight is 1.4 kg (including embedded PC, camera, IMU, GNSS receiver, GNSS antenna, power management and structure) and the dimensions are 10 x 14 x 20 cm³.
A fixed-wing drone (see figure 1, top) is used as carrier, providing a flight time of approximately 90 minutes at cruise speeds between 60 km/h and 90 km/h. Thus, the carrier is capable of travelling a distance of up to 105 km per battery charge. It is specified with an MTOW of 14 kg including a payload of up to 2 kg and has a wingspan of 3.5 m. It can operate at wind speeds of up to 8 m/s and temperatures between 0°C and 35°C. While its typical flight operation altitude is in the range between 100 m and 300 m above ground level, it is capable of operating at altitudes up to 3,000 m above sea level.
The operational range is only limited by the maximum flight time because the autopilot system allows fully automated flights beyond visual line of sight (BVLOS). For safety reasons this requires a predefined flight plan with terrain-follow mode. The drone is equipped with a conventional command and control link as well as an additional mobile network radio for BVLOS operation. For safety reasons this carrier is also equipped with position lights and an integrated automatic dependent surveillance broadcast (ADS-B) transceiver.

TEST FIELD AND TERRESTRIAL GEODETIC SURVEYING
The area of Zeche Zollern, also used as test field in ISPRS Benchmarks (Nex et al., 2015, Przybilla et al., 2019), covers almost the entire area of the open-air museum (see figure 2) and has an extent of 320 m x 220 m. The highest vertical point is given by the approx. 40 m high pitheads. It consists of 45 rotor-shaped signaled ground points and 3 Siemens-stars, which are used for image quality analysis. The network measurements were carried out using tachymeters, a precision level and an RTK-GNSS system (Przybilla et al., 2018): "To determine the coordinates of the control points in ETRS89 system, GNSS RTK measurements using the SAPOS HEPS service have been obtained. The UTM coordinates and ellipsoidal heights of these measurements served as data points for the subsequent 3D network adjustment. The ellipsoidal heights of the GNSS measurements were additionally transformed with the quasi-geoid heights of the GCG2016 to obtain normal heights of the DHHN2016. They were used as connecting heights in the following adjustment of the leveling measurements. The standard deviations of the adjusted normal heights in the system DHHN2016 (NHN) are in a range of 1 mm to 3 mm and their relative accuracy is better than 1 mm. For the network measurements a standard deviation of a single coordinate for the 15 common tachymeter survey stations, which could be measured in forced centering, was 1.2 mm. The corresponding value for the ground control points (GCP) was 2.5 mm."
Figure 3. Relation between distance during exposure at 75 km/h (cyan) and 95 km/h (magenta) at different exposure times.

FLIGHT CAMPAIGN
The altitude above ground level of the image flight was 95 m resulting in ground sample distance (GSD) of approximately 1.4 cm. Planned flight speed was 75 km/h. In total, 340 images of 9 alternating flight strips with an overlap of 80% in track and 60% across track have been used for evaluation. In addition one strip was planned with maximum speed of 95 km/h.
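The relation between flight altitude, GSD and image spacing can be sketched as follows. This is a minimal illustration, not the authors' flight planning tool: the pixel pitch of 7.4 µm and the along-track image extent of 3232 pixels are assumptions that are not stated in the text, while the focal length (50 mm), altitude (95 m), overlap (80%) and speeds are taken from it.

```python
# Sketch: ground sample distance, exposure base and trigger interval.
# PIXEL_PITCH_M and ALONG_TRACK_PIXELS are assumptions (not given in the text).

FOCAL_LENGTH_M = 0.050       # 50 mm lens, from the camera description
PIXEL_PITCH_M = 7.4e-6       # assumed pixel pitch of the sensor
ALONG_TRACK_PIXELS = 3232    # assumed image extent in flight direction


def ground_sample_distance(altitude_m):
    """GSD in metres for a nadir image taken at the given altitude."""
    return altitude_m * PIXEL_PITCH_M / FOCAL_LENGTH_M


def exposure_base(altitude_m, overlap_in_track):
    """Distance in metres between two consecutive exposures in a strip."""
    footprint_m = ALONG_TRACK_PIXELS * ground_sample_distance(altitude_m)
    return footprint_m * (1.0 - overlap_in_track)


def trigger_interval(base_m, speed_kmh):
    """Time in seconds between two exposures at the given ground speed."""
    return base_m / (speed_kmh / 3.6)


gsd = ground_sample_distance(95.0)
base = exposure_base(95.0, 0.80)
print(f"GSD: {gsd * 100:.2f} cm, base: {base:.1f} m")
for speed in (75.0, 95.0):
    print(f"{speed:.0f} km/h: one image every {trigger_interval(base, speed):.2f} s")
```

Under these assumptions the required trigger interval stays above 0.25 s at both speeds, which is compatible with the stated capture rate of up to 4 raw images per second.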
As already mentioned, the exposure time is important in terms of avoiding image blur caused by an overly long exposure. Figure 3 shows the linear relation between exposure time and flight speed at the planned altitude and GSD. With a planned exposure time of 300 µs, the flown distance during exposure is 6 mm at 75 km/h and 8 mm at 95 km/h and thus below the GSD of 1.4 cm.
Basically, flight planning follows the procedure proposed by (Wenzel et al., 2013) for image data acquisition. Image overlap is crucial for the quality and completeness of the point cloud or surface model later derived from these images. This is just one important aspect, however: a wide field of view causes objects to be imaged tilted (distorted) in the overlapping image area, and occlusions might occur. Furthermore, point correspondences (features) are more difficult or impossible to find because of low image similarity. In addition to the overlap specifications, the maximum intersection angle between identical image features must be considered. For high image similarity, angular differences should be kept below 15 degrees (Wohlfeil et al., 2013). The range of intersection angles for this camera system and planned flight was between 5 degrees and 24 degrees in track and at most 15 degrees across track to adjacent strips. If the automatic measurement accuracy ∆B is conservatively approximated with 1/5 pixel, the height accuracy ∆Z for a single stereo measurement in strip can be estimated from the planned base lengths B between two stereo images and the height above ground level Z of 95 m (see table 1) according to (Kraus, 2007). Horizontal accuracy ∆X, ∆Y is calculated independently, also following (Kraus, 2007). Thus, the expected horizontal point accuracy (0.3 cm) of stereo images with this camera and flight plan is approximately 1/4 GSD and the vertical point accuracy (1.5 cm) is around 1 GSD.
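The numbers in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the common normal-case stereo approximation, in which the horizontal accuracy is the measurement accuracy scaled to object space and the height accuracy is additionally scaled by the height-to-base ratio Z/B. Whether this is exactly the form used from (Kraus, 2007) is an assumption, and the base length of 18 m is a purely illustrative value, not one taken from table 1.

```python
# Sketch: flown distance during exposure and expected stereo accuracy.
# The accuracy formulas are the usual normal-case approximation (assumption),
# and ILLUSTRATIVE_BASE_M is a hypothetical base length.

GSD_M = 0.014                 # ground sample distance from the text
HEIGHT_M = 95.0               # height above ground level Z
MEASUREMENT_PX = 1.0 / 5.0    # measurement accuracy in pixels
ILLUSTRATIVE_BASE_M = 18.0    # hypothetical base length B


def flown_distance(speed_kmh, exposure_s):
    """Distance in metres travelled by the carrier while the sensor is exposed."""
    return speed_kmh / 3.6 * exposure_s


def horizontal_accuracy(gsd_m, measurement_px):
    """Measurement accuracy projected into object space (metres)."""
    return gsd_m * measurement_px


def height_accuracy(gsd_m, measurement_px, height_m, base_m):
    """Height accuracy of a single stereo measurement (metres)."""
    return height_m / base_m * horizontal_accuracy(gsd_m, measurement_px)


for speed in (75.0, 95.0):
    d = flown_distance(speed, 300e-6)
    print(f"{speed:.0f} km/h: {d * 1000:.1f} mm = {d / GSD_M:.2f} GSD")

d_xy = horizontal_accuracy(GSD_M, MEASUREMENT_PX)
d_z = height_accuracy(GSD_M, MEASUREMENT_PX, HEIGHT_M, ILLUSTRATIVE_BASE_M)
print(f"horizontal: {d_xy * 100:.2f} cm, vertical: {d_z * 100:.2f} cm")
```

With the illustrative base length the height estimate lands close to 1 GSD. Shorter bases increase Z/B and degrade the height accuracy proportionally.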

IMAGE QUALITY
Image quality of a sensor system is affected by multiple factors and directly influences perceptible detail in aerial images. Light rays which are reflected by an object and detected by a camera sensor partially traverse the atmosphere and lose some of their energy due to diffusion and absorption. In drone applications this part can be considered very small and will not be discussed further here.
Next, the light passes a (complex) lens system in which an aperture is integrated that limits the effective solid angle for every ray. As a consequence the lens aperture directly affects the amount of light, which in turn determines the number of photons that reach the sensor plane and contribute to the imaging process. The smaller the aperture is chosen, the more diffraction of light limits sharp optical imaging. On the other hand, if the aperture is chosen too large, spherical and chromatic aberrations gain influence. The number of photons passing through the lens system and reaching the sensor within a given time frame directly influences the exposure time needed to create an equivalent sensor signal. In aerial photogrammetry, however, the exposure time affects sharp optical imaging in terms of motion blur, which results from the system's change of location (movement) while the sensor is exposed. This change of location can be compensated actively, and several remote sensing systems offer such techniques. But almost all systems for drones are not equipped with such solutions, as additional parts increase total weight and thereby limit flight endurance and operation time.
Another interfering aspect is the increase of shading (or, inversely, the decrease of luminous intensity) from the principal point towards the image corners. This effect is often described as vignetting and is caused by the lens system itself and by the integrated aperture. Vignetting can be measured and corrected as an image processing step while determining the Photo Response Non-Uniformity (PRNU) (Wg EMVA, 2016).
After the light rays have passed the lens system they hit the sensor surface. That part of the camera system creates a digitally interpretable signal that directly depends on the number of photons collected during the exposure time window. The quality of that signal is affected by several electronic components (e.g. sensor read-out electronics, analog-digital converter). A measure of this quality is the signal-to-noise ratio (SNR). The SNR is also characterized by a) the ambient noise level that unavoidably occurs when a semiconductor is connected to its supply voltage and b) the photo-effective area of each sensor element (pixel). The larger the effective area, the more photons contribute to the signal, assuming identical time frames, and therefore increase the signal. Electronic ambient noise can be determined pixel by pixel as part of the Dark Signal Non-Uniformity (DSNU) (Wg EMVA, 2016).

Spatial Resolution Determination
Spatial resolution is an essential parameter of imaging systems (Meißner et al., 2018) as it defines a measure of imaged detail for every image taken by a sensor-lens configuration. Resolution estimation is therefore important to quantify the potential of aerial camera systems. Spatial resolution as an image quality parameter is part of the upcoming German standard DIN 18740-8 "Photogrammetric products - Part 8: Requirements for image quality (quality of optical remote sensing data)".
Spatial resolution can be defined mathematically as follows: a point-like input signal $U(x', y')$ with object space coordinates $x'$ and $y'$ will be spread (or smeared) due to non-ideal imaging properties (Jahn and Reulke, 1995) and creates an output signal $V(x, y)$ with image coordinates $x$ and $y$:

$$V(x, y) = \iint H(x, y, x', y')\, U(x', y')\, dx'\, dy' \tag{4}$$

The spread (or smeared) output signal depends on the system impulse response $H(r) = H(x, y, x', y')$ with $r = \sqrt{(x - x')^2 + (y - y')^2}$, which is therefore called point spread function (PSF) (Williams and Becklund, 1989, Jahn and Reulke, 1995).
Furthermore, sharpness as an image property can be characterized by the modulation transfer function (MTF) $\tilde{H}(k)$, which is the spatial frequency response of an imaging system to a given illumination and is equal to the Fourier transform of the PSF $H(r)$:

$$\tilde{H}(k) = \mathcal{F}\{H(r)\} \tag{5}$$

"High spatial frequencies correspond to fine image detail. The more extended the response, the finer the detail - the sharper the image." (Mix, 2005).
The effective resolution of an imaging device can be determined in different ways. A classic approach is the use of well-known test charts (e.g. USAF resolution test chart with groups of bars) (USAF, 1959). There, the (subjectively) identified image resolution corresponds to the distance at which the smallest group is still discriminable. This is very similar to the Rayleigh resolution limit (Rayleigh, 1874). There, the response of an imaging system illuminated with a point light source is defined and approximated by a sine cardinal function. Further, Rayleigh postulated the resolution limit as the minimal distance between two sources at which they are still discriminable. Using the definition that point light sources are approximated as sine cardinal functions, the resolution limit is reached if the first maximum position of one function is identical to the first minimum of the other function.
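The relation between PSF and MTF can be checked numerically for the Gaussian-shape PSF used later in this section. The following sketch is an illustration only: it assumes a one-dimensional Gaussian PSF with standard deviation sigma (in pixels) and compares a brute-force Fourier integral with the closed form exp(-2 π² sigma² k²). The function names are hypothetical.

```python
import math


def gaussian_psf(x, sigma):
    """Normalised 1D Gaussian point spread function."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def mtf_numeric(k, sigma, step=0.01, half_width_sigmas=8.0):
    """Fourier transform of the PSF at frequency k by direct summation.

    The PSF is symmetric, so only the cosine (real) part contributes.
    """
    n = int(half_width_sigmas * sigma / step)
    total = 0.0
    for i in range(-n, n + 1):
        x = i * step
        total += gaussian_psf(x, sigma) * math.cos(2.0 * math.pi * k * x) * step
    return total


def mtf_analytic(k, sigma):
    """Closed-form Fourier transform of the Gaussian PSF."""
    return math.exp(-2.0 * (math.pi * sigma * k) ** 2)


for k in (0.0, 0.3, 0.6):
    print(k, round(mtf_numeric(k, 0.5), 6), round(mtf_analytic(k, 0.5), 6))
```

A wider PSF (larger sigma) makes the MTF fall off faster, which is the quantitative form of the statement that a more extended frequency response corresponds to a sharper image.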
Besides the subjective components included in this process, the function values (ground resolved distance, GRD) are discrete instead of continuous, depending on the resolution steps between groups of bars. To reduce subjective influence with bar charts during the determination process and to convert discrete function values to continuous ones, some approaches use signal processing techniques to calculate the effective image resolution. The method described by (Reulke et al., 2004, Reulke et al., 2006) is one of the latter approaches. There, the contrast transfer function (CTF) and subsequently the MTF are calculated for images with a designated test pattern (e.g. Siemens-star, see figure 4). According to the above-mentioned approaches, the smallest recognizable detail or "the resolution limit is reached if the distance between two points leads to a certain contrast in image intensity between the two maxima." Using a priori knowledge of the original scene (well-known Siemens-star target), CTF, MTF and PSF can be approximated e.g. by a Gaussian shape function (Honkavaara et al., 2006) or a polynomial function.
Coordinate axis X for CTF and MTF is the spatial frequency k (eq. 6), calculated as the quotient of the target frequency k_s divided by the current scan radius r multiplied by π. The target frequency k_s is constant and equivalent to the number of black-white Siemens-star segments. Related (initially discrete) values for the contrast transfer function C_d(k) are derived using the intensity maximum I_max and minimum I_min for every scanned circle (eq. 7). Simultaneously the function value is normalized to the contrast level C_0 at spatial frequency equal to 0 (infinite radius). Continuous function values C are derived by fitting either a Gaussian function or e.g. a fifth-order polynomial to the discrete input data. According to (Coltman, 1954) the obtained CTF describes the system response to a square wave input while the MTF is the system response to a sine wave input. The proposed solution is a normalization with π/4 followed by a series expansion using odd frequency multiples (eq. 8).
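The processing chain described above can be sketched in a few functions. This is a hedged illustration rather than the software tool used in the paper: the spatial frequency follows the verbal definition k = k_s / (π r), the contrast is assumed to be the Michelson contrast of the intensities along one scan circle, and the square-wave to sine-wave conversion uses the first four terms of Coltman's series. All function names are hypothetical.

```python
import math


def spatial_frequency(k_s, radius):
    """Spatial frequency on a scan circle of the given radius."""
    return k_s / (math.pi * radius)


def michelson_contrast(samples):
    """Contrast from intensity maximum and minimum along one scan circle."""
    i_max, i_min = max(samples), min(samples)
    return (i_max - i_min) / (i_max + i_min)


def coltman_mtf(ctf, k, cutoff):
    """Sine-wave response from square-wave response (truncated Coltman series).

    ctf is a continuous contrast function C(k), e.g. a fitted Gaussian.
    Terms whose frequency multiple exceeds the cutoff do not contribute.
    """
    terms = ((1, 1.0), (3, 1.0 / 3.0), (5, -1.0 / 5.0), (7, 1.0 / 7.0))
    total = 0.0
    for multiple, weight in terms:
        if multiple * k <= cutoff:
            total += weight * ctf(multiple * k)
    return math.pi / 4.0 * total


# Example with a made-up Gaussian CTF fit (sigma_k is an illustrative value).
sigma_k = 0.6
fitted_ctf = lambda k: math.exp(-0.5 * (k / sigma_k) ** 2)
for k in (0.1, 0.3, 0.5):
    print(k, round(coltman_mtf(fitted_ctf, k, cutoff=1.0), 4))
```

Above one third of the cutoff frequency only the first series term remains, so the MTF there is simply π/4 times the CTF.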
There are several criteria specifying the resolving power of camera systems. The parameter σ (standard deviation) of the PSF (assuming Gaussian shape) is one criterion. It directly relates to image space and can be seen as an objective measure to compare different camera performances. Another criterion is the width of the PSF at half the height of its maximum (full width at half maximum, FWHM).
The value for MTF at 10% modulation contrast is often referred to as resolution limit or cut-off frequency of the MTF, $\tilde{H}(k) = 0.10$, at spatial frequency $k_{MTF10}$, where its reciprocal, in terms of the PSF $H(r)$, corresponds to the least resolved scale in image domain. This scale factor multiplied by the nominal ground sample distance then delivers the least resolved distance and is named ground resolved distance (GRD) (Kharfi et al., 2012, Artmann and Wueller, 2012, Valenzuela and Reyes, 2019, Nakamura, 2016).
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume V-1-2020, 2020 XXIV ISPRS Congress (2020 edition)
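For a Gaussian-shape PSF the resolution criteria named above are linked in closed form. The sketch below assumes a Gaussian PSF with standard deviation sigma in pixels and spatial frequencies in lines per pixel, and it reads the GRD definition as nominal GSD divided by the MTF10 frequency. The numeric inputs are illustrative, not measured values.

```python
import math


def gaussian_mtf(k, sigma):
    """MTF of a Gaussian PSF with standard deviation sigma."""
    return math.exp(-2.0 * (math.pi * sigma * k) ** 2)


def fwhm(sigma):
    """Full width at half maximum of a Gaussian PSF."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma


def mtf10_frequency(sigma):
    """Spatial frequency at which the Gaussian MTF drops to 10%."""
    return math.sqrt(math.log(10.0) / 2.0) / (math.pi * sigma)


def ground_resolved_distance(gsd, k_mtf10):
    """Least resolved distance: reciprocal of the MTF10 frequency times GSD."""
    return gsd / k_mtf10


sigma = 0.4  # illustrative PSF width in pixels
k10 = mtf10_frequency(sigma)
print(f"FWHM: {fwhm(sigma):.3f} px, MTF10 at {k10:.3f} lines/px")
print(f"GRD for a GSD of 1.4 cm: {ground_resolved_distance(1.4, k10):.2f} cm")
```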

Motion smear
Sensor motion of aerial imaging systems can be described as sensor rotation with three DOF (roll, pitch, yaw), sensor translation with three DOF (X, Y, Z, e.g. in world coordinates) or as both motions at the same time. An optical sensor under motion during image acquisition not only collects photons of the static pixel footprint projected onto the observed surface but of the extended footprint along the projected line of movement. Then, the input signal $U(x', y')$ (eq. 4) can be described as an integral along the projected line of movement in object space:

$$U(x', y') = \int U_\sigma(x', y', m)\, dm \tag{9}$$

where $U_\sigma(x', y', m)$ is the input signal at every projected position depending on motion m (6 DOF) during the exposure time window.
During that window, motion induces smear which directly affects the overall PSF H(r) (eq. 4). With increasing motion (e.g. 1D translation in flight direction of the drone at high velocity) the PSF will be smeared along the projected motion. That hypothesis can be investigated empirically by a simulation. The sequence is to apply predefined modulation (MTF) or spread parameters (PSF) to an ideal representation of resolving patterns (see fig. 5). That can be done by forming a convolution of mathematically ideal image-intensity values of an image (I), a Gaussian-shape model PSF (H_m) and a mathematically ideal aliasing PSF (H_s). The simulated PSF (H_sim) can then be formulated as follows (Meißner et al., 2020):

$$H_{sim} = I * H_m * H_s \tag{10}$$

Convolution of image-intensity values I(ρ) and an increasingly smeared kernel (initially Gaussian-shaped, see figure 6, top) then delivers a more and more stretched version of the corresponding 2D-PSF. The deformation is quantified by the quotient M of the semi-minor axis b and the semi-major axis a of an ellipse fitted to the measured 2D-PSF:

$$M = \frac{b}{a} \tag{11}$$

Then, circle-shaped 2D-PSFs and the corresponding ellipse fit will deliver values for M close to 1.00, and decreasing values for an increasingly elliptic shape of the measured 2D-PSF (see figure 6, bottom). This measure will be used to evaluate motion smear of the camera system (see section 2) designed for fixed-wing applications. Results are given in section 6.2.1.
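The effect of linear motion on the shape of a 2D-PSF can be illustrated with a small simulation. The sketch below is not the simulation of (Meißner et al., 2020): it smears a sampled Gaussian kernel along one image axis by averaging shifted copies, and it replaces the ellipse fit by an ellipse derived from the second moments of the kernel, which is an assumption made here for brevity. The quotient M is the ratio of semi-minor to semi-major axis.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised, circular-symmetric Gaussian PSF on a square pixel grid."""
    axis = range(-radius, radius + 1)
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in axis]
              for y in axis]
    total = sum(map(sum, kernel))
    return [[value / total for value in row] for row in kernel]


def smear_along_x(kernel, length_px):
    """Average length_px copies of the kernel shifted by one pixel each."""
    width = len(kernel[0])
    smeared = []
    for row in kernel:
        new_row = [0.0] * (width + length_px - 1)
        for x, value in enumerate(row):
            for shift in range(length_px):
                new_row[x + shift] += value / length_px
        smeared.append(new_row)
    return smeared


def axis_ratio(kernel):
    """Quotient M = semi-minor / semi-major axis of the moment ellipse."""
    cells = [(x, y, v) for y, row in enumerate(kernel) for x, v in enumerate(row)]
    total = sum(v for _, _, v in cells)
    cx = sum(x * v for x, _, v in cells) / total
    cy = sum(y * v for _, y, v in cells) / total
    sxx = sum((x - cx) ** 2 * v for x, _, v in cells) / total
    syy = sum((y - cy) ** 2 * v for _, y, v in cells) / total
    sxy = sum((x - cx) * (y - cy) * v for x, y, v in cells) / total
    mean = 0.5 * (sxx + syy)
    spread = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    # Eigenvalues of the covariance matrix are the squared axis lengths.
    return math.sqrt((mean - spread) / (mean + spread))


psf = gaussian_kernel(sigma=1.0, radius=5)
for length in (1, 2, 3, 5):
    print(length, round(axis_ratio(smear_along_x(psf, length)), 3))
```

A smear length of one pixel leaves the kernel unchanged and M stays at 1, while longer smear lengths stretch the kernel in one direction and M decreases monotonically.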

Sensor Validation
Usually, spatial resolution of a sensor-lens combination under laboratory conditions will be significantly better than under operating and thus kinematic conditions due to aforementioned vibrations and motion (6 DOF) of the carrier. It is therefore beneficial to determine and compare spatial resolution in both cases.

Laboratory measurement
The benchmark procedure to determine spatial resolution parameters for a specific sensor-lens combination under laboratory conditions is defined as follows (Meißner et al., 2017). In order to guarantee a repeatable measurement procedure with identical, controlled light conditions and to prevent extraneous light, a sufficiently large basement hall has been identified (see figure 7). The GSD in this benchmark (according to focal length and sensor pixel size) has been set to 1.0 cm to address the aforementioned fields of application including their resolution requirements. Usually, resolving power changes across the field of view. In order to analyze this effect, multiple images have been taken with the resolution target imaged at different locations in image space (e.g. image center, image half field, image corner). For every image the MTF10 (see section 5.1) has been calculated. This should guarantee the genuine system response from object space to sensor for the expected field of applications. Results are presented in section 6.2.2.
In-field measurement
As mentioned before, the spatial resolution of a camera system in motion is expected to be lower than under (static) laboratory conditions. The main reason is the sensor movement (6 DOF) through 3D space, with additional influence of the carrier unit's (micro-)vibrations under operating conditions.
Prepared and correctly executed flight plans, introduced in section 4 with overlaps of 80% in track and 60% across track ensure the spatial resolution test target (Siemens-star) to be in at least 15 images. Simultaneously the test target is imaged at different image locations. Similar to the benchmark procedure under static conditions (see section 5.3.1) images containing the Siemens-star at center-, half-field and corner-position have been selected for measurement and comparison. Results are given in section 6.2.2.

Test Field Validation
The data processing chain consists of aerial triangulation for all images and subsequent dense image matching for surface reconstruction. Image orientation as the result of aerial triangulation has been calculated including 4 GCP in the corners and 1 GCP in the test field center. The remaining 40 points were used as checkpoints (CP), especially to verify relative accuracies of the bundle block (see figure 2). After bundle block adjustment the checkpoints' horizontal RMS was 4.0 mm and the vertical RMS 6.0 mm, as shown in figure 8. The balanced distribution of 40 independent checkpoints allows the conclusion that an object model (with no model distortion) can be derived with a 3D-error standard deviation of approximately 7.2 mm. While this procedure only considers checked single points and their absolute accuracy before the automatic step of 3D point [...]
As estimated for the flight campaign (see section 4), measurements deviate in the range of 1.20 cm and 2.20 cm. From a photogrammetric point of view, it can therefore be concluded that the camera works in the range of planned and predicted accuracy even at high velocities.
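The stated 3D error follows from the horizontal and vertical checkpoint RMS values by quadratic combination. The short sketch below shows this arithmetic together with a generic RMS helper; the residual list in the example is made up for illustration and is not checkpoint data from the test field.

```python
import math


def rms(residuals):
    """Root mean square of a list of residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def combined_3d_error(horizontal_rms, vertical_rms):
    """3D error from horizontal and vertical RMS (quadratic sum)."""
    return math.hypot(horizontal_rms, vertical_rms)


GSD_MM = 14.0
error_3d = combined_3d_error(4.0, 6.0)
print(f"3D error: {error_3d:.1f} mm")
print(f"horizontal RMS / GSD: {4.0 / GSD_MM:.2f}")
print(f"RMS of made-up residuals: {rms([3.0, -4.0]):.2f} mm")
```

The horizontal RMS of 4.0 mm corresponds to roughly a quarter of the 1.4 cm GSD, in line with the estimate made for the flight campaign.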

Spatial Resolution
This sub-section is divided into two parts. The first part presents the results for motion smear in drone images (see section 5.2) followed by sensor validation under static and kinematic conditions (see section 5.3).

Motion Smear
A general comparison of aerial images with and without dominant motion smear is given in figure 9. The image in the upper row suffers from sharpness loss due to motion smear. This becomes clear when looking at the degree of deformation of the 2D-PSF. Quotient M in this case (see eq. 11) is 0.680. The image in the lower row does not suffer from dominant motion smear, and the related quotient M in that case is 0.967. The Siemens-star test target has been imaged at 15 different locations in image space by maintaining the predefined flight plan (see section 4). Using the software tool for spatial resolution (see section 5.1), the 2D-PSF and corresponding quotient M (relation of semi-minor and semi-major ellipse axis) have been calculated.

Sensor Validation
As explained in section 5.3, spatial resolution parameters for static (laboratory) conditions are expected to be better than under kinematic (operating) conditions due to motion caused by carrier movement and vibrations. Further, spatial resolution is expected to be better in the image center than in the image corners. To investigate both issues, images under static conditions containing the Siemens-star at center, half-field and corner position have been selected for measurement. The same selection has been made for kinematic conditions. For every image the value for MTF at 10% contrast level (resolution limit, see section 5.1) has been calculated and serves as the measure to conduct the analysis. Results for both aspects are given in figure 10. As expected, the resolution limit (MTF10) declines from image center via image half-field to image corner for both static and kinematic conditions. This deterioration is mainly caused by chromatic and spherical aberrations induced by the lens system. Furthermore, the mean difference between operating and static conditions is 23.6% (center 20.9%, half-field 22.7%, corner 27.2%).
Considering very high image performance under laboratory conditions (around 1.00 line/pixel and thus near Nyquist-limit) a reduction of 23% of resolution under operating conditions still can be considered satisfactory.
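The mean loss quoted above is the plain average of the three per-position losses. The sketch below reproduces it and, as an illustration, applies it to a laboratory MTF10 of 1.00 line/pixel. Treating the loss as a relative reduction of the MTF10 frequency is an assumption about how the percentages were computed.

```python
# Relative resolution loss between static and kinematic conditions (percent),
# as reported per image position.
losses_percent = {"center": 20.9, "half-field": 22.7, "corner": 27.2}

mean_loss = sum(losses_percent.values()) / len(losses_percent)


def kinematic_mtf10(static_mtf10, loss_percent):
    """MTF10 under operating conditions for a given relative loss (assumption)."""
    return static_mtf10 * (1.0 - loss_percent / 100.0)


print(f"mean loss: {mean_loss:.1f} %")
print(f"MTF10 in flight: {kinematic_mtf10(1.00, mean_loss):.3f} lines/pixel")
```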

SUMMARY AND CONCLUSION
An evaluation of a camera system prototype for fixed-wing drones has been presented. The focus of this investigation was to find out whether higher velocities during image acquisition have any influence on the photogrammetric survey and on image quality itself in comparison to common rotor-craft drones.
Several preliminary considerations have been described such as general expectation of survey accuracy and the relation between exposure-time and carrier velocity. Furthermore, a measure for motion smear has been introduced and evaluated with simulated motion-PSF.
Survey accuracy has been determined using a geodetic-photogrammetric test field. With 5 GCP and 40 CP included, the standard deviation of the 3D error is approximately 7.2 mm, which is similar to or even slightly better than that of other camera systems mentioned in (Przybilla et al., 2019).
Presence of motion smear has been investigated with a software tool for spatial resolution determination. It has been shown that obtained aerial images and corresponding image quality do not suffer from motion smear. The mean loss of resolving power under operating conditions (compared to laboratory conditions) of approximately 23.6% is satisfactory and expected.
It can therefore be concluded that the presented camera system and carrier deliver very reliable and high precision results in terms of survey accuracy and spatial resolution. The presented fixed-wing system entails the advantage of much higher flight endurance while maintaining survey accuracy of slower carriers without sacrificing image quality.