HIGH-PRECISION OBJECT DELINEATION WITH UAV - DEMONSTRATED ON A TRACK SYSTEM

The proper function of rail-based transport networks relies on the accurate positioning of the tracks. Regular control and maintenance intervals are in place to guarantee safe and reliable operation. This also holds for the crane rails of the storage cranes in the container terminal in the Hamburg harbour. Especially in the terminal “Altenwerder”, the geomorphological conditions of the soil lead to a permanent subsidence of the tracks and thus call for intensive surveying and maintenance activities. The allowed tolerances are in the range of 10mm in the XY-plane on a stretch of 300m. In daily practice, the measurements are done using a traditional tachymetric survey in combination with a rail car carrying a reflector. This method is reliable but comes with the disadvantage that the operation of the cranes needs to be interrupted. In this paper we present an alternative, automatic approach which employs state-of-the-art UAV-based photogrammetry to measure the actual location of the rail. The mid-format camera system combined with a 150mm tele-lens results in a GSD of 0.9mm at 35m flying height. The challenges addressed concern the proper setup and installation of the ground control network, the flight planning and the bundle adjustment. Furthermore, an automated rail delineation in the derived surface model was developed. First experiments show that an automatic workflow is possible, including the delineation task. Remaining obstacles concern, for instance, the compliance with the requirements regarding absolute positional accuracy, since the inner block geometry is theoretically much more accurate than the realised control point network.


INTRODUCTION
The crane rails of the storage cranes of the container terminal in Altenwerder of HHLA Hamburg Hafen und Logistik AG must meet high demands in terms of stable position and exact alignment. However, the geomorphological condition of the ground in the port continually leads to significant subsidence and track changes in the rail systems, which therefore have to be regularly checked, measured and corrected. Today's semi-automatic methods for surveying the exact rail location are very complex and costly. They are associated with a residual risk in terms of occupational safety and lead to container storage areas being closed down for hours or days, which can cause corresponding operational and capacity restrictions. In this research project, the automatic measurement of the crane tracks is to be achieved with the help of a camera system installed on a multicopter. The automation covers the flight execution, especially the consideration of the cranes moving on the rails to be measured, as well as the image evaluation for the measurement of the rail position. The primary goal of the project is to determine theoretically and experimentally which advantages and disadvantages result from a UAV-based approach. At the Container Terminal in Altenwerder (CTA), so-called "Double Rail Mounted Gantry Cranes" (DRMG) are used, which have track widths of 31m and 41m, and heights of 21m and 24m, respectively. Fig. 1 shows a sketch of a storage block in the CTA. On the waterside, containers are unloaded with the container bridge and transported to the block storage by means of automated guided vehicles (AGV). There the DRMGs store the containers until they are transported further on land by truck or rail. Containers to be shipped are loaded back onto an AGV on the waterside.
The accuracy requirements for the position and height measurements of the rails are derived from the German VDI 3576 guideline or the specification of the cranes. A tolerance of +/-10mm applies to the track location and the height tolerance is +/-100mm. In this article, the current project is presented. We discuss the system selection, the measurement of the ground control points and the flying on-site, as well as the automatic measurement of the rail position in the data generated from the image composites. The first results show the potential of the approach, as well as the challenge regarding the accessibility and verifiability of achievable accuracy. However, another important component, the detection of moving cranes during the image flight, is not discussed here. For a photogrammetric acquisition or determination of the rail position and height, many factors play a role when it comes to estimating the geometrical accuracy to be achieved. First of all, it must be noted that the tolerance refers to the maximum deviation. According to Kuhlmann et al. (2017), σ=T/4, with T: tolerance, applies for a probability of error of 5% for the measuring accuracy to be maintained. Hence, for this project, a measuring accuracy of σxy=2.5mm is considered for the location and σz=25mm for the height accuracy. In the following, we will first discuss the selection of the flight and camera system, with an additional question concerning the laboratory calibration of the camera parameters. Subsequently, details of the network measurement, which is necessary for the positioning of the ground control points, are explained. The fourth chapter is dedicated to the methodology for the automatic measurement of the track position in orthophotos and the elevation model. Chapter five finally deals with the results obtained.
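The relation σ=T/4 used above to derive the target accuracies from the tolerances can be checked with a few lines of code. This is only a minimal sketch; the function name and the default divisor are illustrative choices, not part of the project software.

```python
def sigma_from_tolerance(tolerance_mm: float, divisor: float = 4.0) -> float:
    """Required measuring accuracy (standard deviation) for a tolerance T,
    using the rule sigma = T / 4 quoted from Kuhlmann et al. (2017)."""
    return tolerance_mm / divisor


# Tolerances stated in the text: +/-10 mm (position), +/-100 mm (height)
sigma_xy = sigma_from_tolerance(10.0)
sigma_z = sigma_from_tolerance(100.0)

print(f"sigma_xy = {sigma_xy} mm, sigma_z = {sigma_z} mm")
```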

Launch System and Camera
The selection of the camera and the launch system is a critical step in this project because there are some special features and requirements to consider. The flying height should not fall below 35m above ground in practical operation to ensure a safe distance to the cranes in operation. In addition, the ground resolution should be as high as possible; on the other hand, the pixel size should not be arbitrarily small, in order to ensure a good signal-to-noise ratio and to allow short exposure times. Regarding the UAV system, a sufficiently long flight time must be guaranteed with a relatively heavy payload. Furthermore, positioning via RTK-GNSS is desirable for precise navigation, but also to support the photogrammetric block (Gerke and Przybilla 2016), even if the expected internal accuracy of the photogrammetric block is better than RTK-GNSS by an order of magnitude. After thorough literature and market research, the following system was purchased for the project:
The height measurement accuracy can be estimated by the height-base ratio. In the best case, the base B is half the image width with a transversely mounted camera, i.e. about 4.5m. This means that the height accuracy theoretically deteriorates by a factor of H/B = 35/4.5 ≈ 7.8 compared to the x-y accuracy. However, since there is a high overlap and thus redundancy, this factor can be corrected accordingly (Förstner, 1988) for multi-image photogrammetry using √(k³), where k represents the number of images in which a point is observed. From these data and the requirements resulting from the tolerance, it is clear that a 3D control point field with very high accuracy requirements must be realized for a thorough realization and verification. Chapter 3 deals with this aspect.
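The nominal ground resolution and the height-base ratio discussed in this section follow from simple geometry and can be sketched as below. The pixel pitch of about 4 µm is an assumption, inferred from the manufacturer figure of 6 micrometers corresponding to 1.5 pixels quoted in the calibration section; all function and variable names are illustrative.

```python
def ground_sampling_distance(pixel_pitch_m: float, focal_length_m: float,
                             height_m: float) -> float:
    """GSD of a nadir image: pixel pitch scaled by height / focal length."""
    return pixel_pitch_m * height_m / focal_length_m


PIXEL_PITCH_M = 4.0e-6    # assumption: 6 micrometers / 1.5 pixels
FOCAL_LENGTH_M = 0.150    # 150mm tele-lens
FLYING_HEIGHT_M = 35.0    # minimum flying height above ground
BASE_M = 4.5              # best-case base quoted in the text

gsd_m = ground_sampling_distance(PIXEL_PITCH_M, FOCAL_LENGTH_M, FLYING_HEIGHT_M)
height_to_base = FLYING_HEIGHT_M / BASE_M

print(f"GSD ~ {gsd_m * 1000:.2f} mm, height-base ratio ~ {height_to_base:.1f}")
```

With these assumed values the GSD comes out slightly above 0.9mm, which is consistent with the nominal value given in the text, and the height-base ratio evaluates to roughly 7.8.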

Laboratory Calibration
A difficulty in this project is that the scene to be measured corresponds to a plane. Therefore, the estimation of the internal camera parameters in the framework of the bundle adjustment (simultaneous calibration) probably cannot be performed reliably due to the strong correlation between focal length and object distance. With the chosen camera system, it is additionally important to note that the optics can only be focused to infinity from an object distance of 750m. Within this range, the focal plane corresponding to the distance is adjusted by a motor. According to the manufacturer's specifications, the focal plane can be approached with an accuracy of approx. 6 micrometers, i.e. 1.5 pixels. For any calibration method, this means that the internal camera parameters must be estimated separately for each flight altitude. The camera was calibrated at the Institute for Optical Sensor Systems, DLR, Berlin before the first flight. A method using optical diffraction (DOE, diffractive optical elements) was used, see (Bauer et al., 2008) and (Dahlke et al., 2019). In this calibration, a geometrically highly precisely defined pattern created by a laser and a diffraction unit is recorded by the camera behind a collimator. The imaging model, which contains the calibration parameters of the camera, including the optical distortion, is iteratively optimized on the basis of an image. Dahlke et al. (2019) show that this method leads to similarly good results as a 3D test field calibration. A disadvantage of this approach, however, is that the distortion parameters and the focal length cannot be decorrelated since only one image is used. Because of the small field of view due to the long focal length, only about 300 diffraction points were recorded on the sensor, which is why the results are not as reliable as shown, for example, in the publications mentioned above. The object distances were set to 35m and 20m and the measurements were repeated four and two times, respectively. The calibration parameters determined were averaged in each case. The estimated accuracy was sf ≈ 0.1 pixel for the focal length and su0 ≈ sv0 ≈ 1 pixel for the principal point position. In chapter 5, a bundle block adjustment in which these parameters are held fixed is compared with a simultaneously calibrating adjustment.

ACCURACY ASPECTS AND GROUND CONTROL NETWORK
Some aspects must be considered when determining the requirements for the positional and vertical accuracy of the ground control network. The highest requirement is given by the expected internal accuracy of the photogrammetric block. Even with the pessimistic assumption that well-defined points cannot be determined better than one pixel, the expected accuracy is in the range of the GSD, i.e. approx. sxy ≈ GSD = 0.9mm. This means that for a thorough accuracy check using a simple rule of thumb, a network accuracy of sxy=0.3*GSD ≈ 0.3mm must be achieved. With an outdoor extent of the rail area, including ancillary areas, of 30x330m, this would be very difficult to achieve even with high-precision laser trackers. For this reason, the target accuracy that results from the tolerances for the rail position discussed above was retained. Assuming the same rule of thumb, the target accuracies σxy=2.5mm and σz=25mm result in values for the network measurement of σxy' ≈ 1 mm and σz' ≈ 10 mm. If these accuracies can be achieved (with a good distribution of the control points), the photogrammetric block cannot be completely tested, but the tolerance of the rail position can be checked. One assumption is of course that the rail body can be detected sufficiently accurately in the images.
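The rule of thumb applied in the paragraph above (the reference network should be about 0.3 times as accurate as the quantity to be checked) can be written out as a short sketch. The function name is illustrative, and the unrounded values are shown so that the rounding to 0.3mm, 1mm and 10mm in the text can be followed.

```python
RULE_OF_THUMB_FACTOR = 0.3  # factor used in the text


def required_network_sigma(target_sigma_mm: float,
                           factor: float = RULE_OF_THUMB_FACTOR) -> float:
    """Accuracy the control network needs in order to verify a target accuracy."""
    return factor * target_sigma_mm


cases = {
    "internal block accuracy (xy, GSD)": 0.9,
    "rail position target (xy)": 2.5,
    "rail height target (z)": 25.0,
}
for label, sigma_mm in cases.items():
    print(f"{label}: network sigma ~ {required_network_sigma(sigma_mm):.2f} mm")
```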
To ensure a largely automatic workflow in the photogrammetric evaluation and to guarantee high point measurement accuracy, one requirement was that coded markers should be used as control and check points. A traditional measurement of these points is therefore basically not possible. For this reason, a combination of prisms and coded markers was developed, see Fig. 2. The two prisms and the centre of the marker are arranged on one axis and the distances are calibrated. By measuring the two prisms, the position and height of the centre of the marker can therefore be determined by extrapolation. Two prisms were chosen because a vertical positioning of this target setup cannot be guaranteed. For the first experiment in summer 2019, 20 such combinations of prisms and targets were produced and attached to concrete weights, which are thus mobile on the one hand, but can also be assumed to be stable over a certain period of time. One difficulty in surveying this elongated track object is that fixed points are only available quite far outside. A simulation in the run-up to the measurement campaign showed that the required position and height accuracy can still be achieved under optimal conditions. During the campaign, however, it was found that only four pre-determined datum points were visible from the block under consideration, and furthermore the meteorological conditions were not ideal (cloudless sky, direct sunlight). The realized network configuration, including error ellipses at the 20 ground points (2x20 prisms), is shown in Fig. 3. By applying the target detection of the Leica MS50 total station in use, an automated measurement of sets could be achieved. However, in this network configuration, where the targets are almost in alignment, the total station can easily confuse targets during the automated measurement. Atmospheric flickering due to ground proximity exacerbates this problem.
The adjustment was carried out using the PANDA software package and is based on the principle of free network adjustment. This network was then transformed onto the four datum points. After the network adjustment, the above-mentioned extrapolation of the coordinates was carried out. The extrapolation requires a variance propagation so that the accuracies of the extrapolated coordinates can be determined for the photogrammetric targets. On average, a 3D accuracy of 3 mm is achieved, with sxy' ≈ 2.8 mm in position and sz' ≈ 1.2 mm in height. During the evaluation, unexplainable coordinate differences between the measurements from different instrument stations occurred at three points, so these were removed from the network; thus 17 ground points remained for the photogrammetric evaluation. The accuracy required above was therefore not achieved, at least for the position, and this finding must be taken into account in the evaluation.
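The extrapolation of the marker centre from the two prisms, together with the associated variance propagation, might look roughly as follows. This is a sketch under several assumptions that are not stated in the text: the marker is taken to lie on the prism axis beyond the second prism, the calibrated distances are treated as error-free, and the two prism positions are treated as uncorrelated. All names are hypothetical.

```python
def extrapolate_marker(p1, p2, d_prisms_m, d_marker_m, cov1, cov2):
    """Marker centre on the axis prism1 -> prism2, beyond prism2.

    p1, p2      : measured 3D prism coordinates (metres)
    d_prisms_m  : calibrated distance prism1 to prism2
    d_marker_m  : calibrated distance prism2 to marker centre
    cov1, cov2  : 3x3 covariance matrices of the prism coordinates (m^2)
    """
    t = d_marker_m / d_prisms_m
    # Linear relation: marker = (1 + t) * p2 - t * p1
    marker = [(1.0 + t) * b - t * a for a, b in zip(p1, p2)]
    # Variance propagation for a linear function of uncorrelated inputs
    cov_marker = [[(1.0 + t) ** 2 * cov2[i][j] + t ** 2 * cov1[i][j]
                   for j in range(3)] for i in range(3)]
    return marker, cov_marker


# Hypothetical example: vertical rod, prisms 0.5 m apart, marker 0.25 m below
sigma = 1.0e-3  # 1 mm standard deviation per coordinate (made-up value)
cov = [[sigma ** 2 if i == j else 0.0 for j in range(3)] for i in range(3)]
marker, cov_marker = extrapolate_marker(
    [10.0, 20.0, 1.00], [10.0, 20.0, 0.50], 0.5, 0.25, cov, cov)
print(marker, cov_marker[2][2] ** 0.5)
```

The example illustrates why the extrapolated point is less accurate than the prisms themselves: the variance grows with the factor (1 + t)² + t², here 2.5.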

AUTOMATIC MEASUREMENT OF RAIL POSITION
The approach requires the DEM, which contains the 2.5D information for the study area. The procedure aims to extract the edges and centres of the rails from the DEM. For this purpose, profiles are created in the DSM at a distance of 5cm, perpendicular to the rail direction. The position of the profiles and an exemplary profile are shown in Fig. 4.
Fig. 4: Upper: simulated cuts at a distance of 5 cm each, averaging over 6 profiles, reference axis. Lower: profile section.
Since the dense image matching on the rail section can lead to faulty points, only profiles that scatter in the upper area by less than +/-3cm from the mean value are taken into account. From the valid profiles, the lower points that are directly adjacent to the edge are then extracted. The points from the 6 adjacent profiles on both sides are used to create a straight line to further reduce the influence of matching errors (see Fig. 4, above, yellow points). The reference axis for each rail is specified by HHLA, see the dotted axis in Fig. 4 (upper). The actual position is determined from the averaged profile points over half the specified rail width.
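The filtering and line-fitting steps described above can be sketched as follows. This is not the project implementation: the input format (edge points as pairs of station along the rail and lateral offset from the reference axis), the sign convention for adding half the rail width, and all function names are assumptions made for illustration.

```python
from statistics import mean


def profile_is_valid(top_heights_m, max_dev_m=0.03):
    """Accept a profile only if its upper (rail head) heights scatter
    by less than +/-3 cm around their mean value."""
    m = mean(top_heights_m)
    return all(abs(h - m) < max_dev_m for h in top_heights_m)


def fit_line(points):
    """Least-squares line offset = intercept + slope * station."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def rail_centre_offset(edge_points, station_m, rail_width_m):
    """Rail centre = fitted edge line at the station plus half the rail width
    (assumes offsets increase from the measured edge towards the centre)."""
    intercept, slope = fit_line(edge_points)
    return intercept + slope * station_m + rail_width_m / 2.0


# Hypothetical edge points from 6 adjacent profiles, 5 cm apart
edge_points = [(0.05 * i, 0.100 + 0.01 * 0.05 * i) for i in range(6)]
centre = rail_centre_offset(edge_points, 0.125, 0.10)
print(f"centre offset from reference axis: {centre * 1000:.2f} mm")
```

Fitting a line through several neighbouring profiles, rather than using a single edge point, is what dampens the influence of individual matching errors.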

RESULTS
This section describes the first flight campaign, which took place at the CTA in June 2019. A block was chosen for this campaign, whose rails had been renewed shortly before and in which, therefore, no containers were placed. Conventional position measurement of the rails was carried out with the rail measurement car of a surveying company and a track angle. The results are compared with those of the method developed here, see Section 5.4. The system flights were carried out outside of the operation of the cranes as well as during operation, mainly to test the approach to the detection of the cranes, which will not be discussed further here.

Flight planning and data acquisition
The area between two storage blocks contains 4 rails and has dimensions of 310x14m. The projection of an image has a size of 10.4x7.8m. The goal of the planning was on the one hand to capture the area in one flight; on the other hand, the resulting number of images should not be unnecessarily high, in order to be able to carry out the subsequent processing in a reasonable time frame. To limit motion blur in the images, the maximum speed of the aircraft is limited to v_max = 0.5 · GSD / t, where t is the exposure time. Table 1 shows the influence of forward and side overlap (OL/SL) on the flight over the mission area. The maximum trigger frequency of the camera is 3Hz; therefore the choice of a very high overlap would be possible, but would lead to more images and thus to a correspondingly high processing time. A variation of the side overlap has a direct influence on the number of flight strips, which increases both the flight time and the number of images. The choice of 80%/70% (OL/SL) thus seems to be a reasonable compromise.
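How the overlap settings drive the number of strips and images can be illustrated with a simplified planning sketch. It assumes that the short image side (7.8m) points in flight direction, uses the flight speed of 1.1m/s mentioned for the mission planning, and ignores turns and extra images at the block borders; it is therefore not expected to reproduce the values of Table 1 or the image count of the actual campaign.

```python
import math


def plan_block(length_m, width_m, footprint_along_m, footprint_across_m,
               forward_overlap, side_overlap, speed_ms):
    """Rough strip and image count for a rectangular block."""
    base_m = footprint_along_m * (1.0 - forward_overlap)    # exposure spacing
    spacing_m = footprint_across_m * (1.0 - side_overlap)   # strip spacing
    images_per_strip = math.ceil(length_m / base_m)
    strips = math.ceil(width_m / spacing_m)
    return {
        "strips": strips,
        "images": strips * images_per_strip,
        "trigger_hz": speed_ms / base_m,  # must stay below the camera limit
    }


# Block of 310x14m, footprint 10.4x7.8m, forward overlap 80%
plan_70 = plan_block(310.0, 14.0, 7.8, 10.4, 0.80, 0.70, 1.1)
plan_80 = plan_block(310.0, 14.0, 7.8, 10.4, 0.80, 0.80, 1.1)
print(plan_70)
print(plan_80)
```

Under these assumptions, raising the side overlap from 70% to 80% adds two strips, and the required trigger frequency stays well below the 3Hz limit of the camera.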

Tab. 1: Influence of forward and side overlap (OL/SL) on relevant mission parameters
In order to meet the special requirements of the project, the ground station app "Inspekt GS" was developed as part of the project. It enables the planning of the mission with the necessary precision (e.g. flight at 1.1m/s in 35m height), Fig. 6. Also, functions for the automated identification of invalid photos (e.g. due to obstruction by cranes) and subsequent reactive mission planning are implemented. The identification here uses either external information sources (e.g. current position of cranes) or computer vision algorithms on the live camera image. The app is implemented using the DJI Mobile SDK for Android, is compatible with common DJI drones and covers the entire workflow of mission planning, monitoring, and evaluation. Fig. 5 shows the app during the execution of a mission (including live camera images, the planned mission and current drone and crane positions).
Fig. 5: Execution of the mission using the self-developed ground station app "Inspekt GS".
An aerial image in PhaseOne RAW format "IIQ large" occupies about 110MB (complete mission 85GB) and as 8bit TIFF about 300MB (complete mission 232GB). During the necessary conversion in the software CaptureOne, which comes with the hardware, the shadows in all images were lightened by one f-stop to retrieve information in the cast shadows. Fig. 6 shows an exemplary aerial image after the conversion.
Fig. 6: The study area targeted by the drone (top), full coverage of an aerial photo from 35m height (11664x8750 pixels) (bottom left), and image section (507x380 pixels, corresponds to approx. 45x34cm) (bottom right).

Image orientation
At this point, tests with different configurations of the bundle block will be reported. Different questions arise, whereby one interest is to determine experimentally the necessary number and the best distribution of control points on the ground. Furthermore, the role of laboratory calibration in bundle block adjustment (BBA) will be discussed. All evaluations were performed with the software Agisoft Metashape. One reason for this is that this software can read the coded markers and thus guarantees an automatic data flow. Experiments with other software packages are still pending. In all experiments, the respective control points used and the RTK GNSS observations are provided with their theoretically determined position and height accuracy. Wherever possible, other parameters were set to optimum values for evaluation, for example, the tie points were searched for in the best resolution. For reasons of limited space, only the following scenarios are compared: 5.2.1: Holding the laboratory calibration, flight altitude 35m, comparison of the best with the worst control point configuration. 5.2.2: Simultaneous calibration of the inner camera parameters, flight altitude 35m, comparison of the best and worst control point configuration.
The two control and checkpoint configurations are to be described as follows: In (the worst) configuration K1 an extrapolation beyond the control point area is explicitly created, i.e. 10 control points (GCP) are located in the center of the block and the 7 check points (CP) at the two longitudinal outer areas.
The configuration K2 corresponds to the usual recommendation: 10 GCP are regularly distributed in the block, with the 7 CP in between.

Maintaining the laboratory calibration, flight altitude 35m:
The image block, which was captured at a flight altitude of 35m and contains a total of 780 images, was evaluated in the software. The parameters resulting from the DOE calibration were set accordingly in the software and the option for simultaneous calibration was disabled. Fig. 7 shows the residuals in the form of RMSE values, separated according to GCP and CP, in the upper area for K1, in the lower area for K2. A comparison shows first of all that the residuals at the control points for K1, i.e. the worse configuration, are somewhat smaller than for K2 (1.9mm compared to 2.5mm). This behavior can be explained by the fact that the residual errors could be minimized over a spatially smaller area within the BBA.
On the other hand, the RMSE values at the check points clearly reflect the unfavourable configuration K1, which is worse by a factor of 2.5 (XY) and 12 (Z) compared to K2. The model deformation resulting from the extrapolation, which shows up in the height, is clearly visible for K1.
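For reference, RMSE values separated into position and height, as reported for the control and check points, could be computed along the following lines. The convention used here (XY as the root mean square of the horizontal distance) is an assumption; the evaluation software may use a different definition. The coordinates in the example are made up.

```python
import math


def rmse_xy_z(estimated, reference):
    """RMSE of 3D points, split into horizontal (XY) and vertical (Z) parts."""
    n = len(estimated)
    sq_xy = sum((e[0] - r[0]) ** 2 + (e[1] - r[1]) ** 2
                for e, r in zip(estimated, reference))
    sq_z = sum((e[2] - r[2]) ** 2 for e, r in zip(estimated, reference))
    return math.sqrt(sq_xy / n), math.sqrt(sq_z / n)


# Made-up check point coordinates in metres
reference = [(0.0, 0.0, 0.0), (100.0, 5.0, 0.1)]
estimated = [(0.003, 0.004, 0.0), (100.0, 5.0, 0.102)]
rmse_xy, rmse_z = rmse_xy_z(estimated, reference)
print(f"RMSE XY = {rmse_xy * 1000:.2f} mm, RMSE Z = {rmse_z * 1000:.2f} mm")
```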

Simultaneous calibration, flight altitude 35m:
In these tests, the same input data were used as in 5.2.1. In contrast, however, a simultaneous calibration of the camera parameters was carried out. Fig. 8 shows the corresponding values in the same way as Fig. 7. The two graphs give RMSE values of the same order of magnitude as in 5.2.1; with the good configuration, however, the positional accuracy is slightly worse here, while the height accuracy is better.

Conclusion on bundle block adjustment:
The comparison of the results from the use of the laboratory calibration and the simultaneous calibration does not show any significant differences, especially since it has to be considered that the ground point network to which the image block is adjusted has an accuracy of the same order of magnitude as the residuals shown here. Thus, a final conclusion on the influence and stability of the calibration method used cannot be drawn. Nevertheless, a trend can be seen in the height accuracy. Since the control points were measured with a theoretical internal accuracy of sz' ≈ 1.2 mm, the RMSE values in Z at K2 of 3.1 mm and 2.4 mm can be considered quite realistic. Without further investigations, however, no reliable explanations for the different values can be found.
The RMSE values for K2 of 2.1mm (2.4mm) in XY and 3.1mm (3.4mm) in Z are better than the specifications (σxy=2.5mm, or σz=25mm), although it must be remembered that with the given network this accuracy cannot be tested thoroughly.
To check the relative accuracy, or local scale estimation, a high-precision calibrated scale bar made of CFRP from the company Aicon, which is used for the photogrammetry system DPA, was positioned in the scene, see Fig. 9. The difference between the length derived from coordinate differences in the bundle block and the calibrated length is 0.26 mm. This value is an indication that coordinate differences in small areas can be determined with the internal accuracy resulting from the GSD.

Surface modelling and orthoprojection
The rail position is not detected directly in the image block via multi-image evaluation, but via the digital 2.5D height model (DEM) derived from the dense image matching. Furthermore, the data is also superimposed on a generated (true) orthophoto (TOP), for example for visual control and for combination with other georeferenced data. For these reasons it is important to quantify how accurate these derived products are. Sources of error during processing are deposits or "holes" in the point cloud, for example in low-contrast regions. Such artifacts directly affect the quality of the DEM. Furthermore, the quality of the orthoprojection depends on the surface modelling via the known relations (relief displacement). To quantify the geometric accuracy of the products DEM and TOP, prominent points (the given, marked ground control points), but also points manually introduced into the BBA (clearly defined corners), were measured in the orthophoto and in the DEM, respectively. The RMSE values for these deviations are in the range of 0.1mm for position and 0.4mm for height.

Measuring the rail position and discussion
Based on the image block and the BBA with the best results, a DEM and TOP of the whole area was created. According to the methodology presented above, rail points could be extracted and the difference to the reference line calculated.
As a reference, a measurement with a rail car and a track measurement angle, which was previously carried out by the surveying company, was used. The absolute accuracy of this result is given as σxyz=3mm and is thus of a similar magnitude to our theoretical estimate. In Fig. 10 the position is considered first. In the upper graph, the blue line shows the deviation from the reference line as a function of the position along the rail, based on the distance determined in this project. The red line is the distance calculated from the rail car/track measurement angle. First of all, it is noticeable that the blue line shows much more high-frequency fluctuation than the red line. These quite high fluctuations in our result from one measuring point to the next (in the given 2m grid) indicate residual uncertainties in the automatic image analysis.
The deviation increases towards the middle of the track up to approx. 2cm and becomes smaller again in the last third. The general trend is similar; it is interesting that our result is closer to the reference axis up to about station 200m, but beyond this point it is further away than the position determined by the other system. Also of interest is the difference between the two measuring methods, which is shown in Fig. 10, below. If it can be assumed that the high fluctuations from one 2m profile to the next are due to the remaining residual uncertainties in the automatic image analysis, low-pass filtering should help to give a somewhat more realistic result. Therefore, the same values were subjected to different sliding low-pass filters (median and mean value, 3 and 5 neighbouring values, respectively). The resulting graphs are shown in Fig. 11. The largest deviation in terms of magnitude is -11mm for the median filter with filter length 5; the mean deviation is 1.5mm.
Fig. 11: Difference between the two measurements with different sliding low-pass filters.
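The sliding median and mean filters applied to the differences can be sketched with the Python standard library as follows. The handling of the series ends (the window simply shrinks) is an assumption, since the text does not specify it, and the input values are made up.

```python
from statistics import mean, median


def sliding_filter(values, window, func):
    """Apply func (e.g. median or mean) over a centred sliding window.
    Near the ends of the series the window is truncated."""
    half = window // 2
    filtered = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        filtered.append(func(values[lo:hi]))
    return filtered


# Made-up differences between the two measurements (mm) on a 2m grid
differences_mm = [1, -2, 9, 0, 3, -1, 2]

median_3 = sliding_filter(differences_mm, 3, median)
mean_5 = sliding_filter(differences_mm, 5, mean)
print(median_3)
print(mean_5)
```

The median filter suppresses the isolated spike of 9mm completely, whereas the mean filter only spreads it over the neighbouring values, which is one reason to compare both filter types.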
Due to the rather high uncertainties in this measurement campaign, no reliable statement can be made about the reasons for the deviation. The following aspects need to be further investigated:
• Is there a systematic deviation? From Fig. 11, one could suspect a piecewise systematic shift. One aspect that cannot be conclusively assessed here concerns the datum definition. The measurement of the rail measurement car was not connected to the same fixed points as the network of control points. This means that inconsistencies in the datum definition cannot be excluded and may lead to systematic offsets.
• If the position measurements with both methods have an accuracy of about σxy ≈ 3mm, then after variance propagation the difference has σDxy ≈ √2 · 3mm ≈ 4.2mm. This means that a 95% confidence band of roughly +/-9mm applies to all deviations shown in Fig. 11. Most of the values lie within this band, so theoretically they are not significantly different from zero.
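The variance propagation for the difference of the two measurements and the resulting significance check can be reproduced as below. The factor k=2 is used here as a common approximation for the 95% level; with it, the band evaluates to about +/-8.5mm, close to the +/-9mm quoted in the text. Function names are illustrative.

```python
import math


def sigma_of_difference(sigma_a_mm: float, sigma_b_mm: float) -> float:
    """Standard deviation of a difference of two uncorrelated measurements."""
    return math.hypot(sigma_a_mm, sigma_b_mm)


def is_significant(difference_mm: float, sigma_diff_mm: float,
                   k: float = 2.0) -> bool:
    """True if the difference lies outside the +/- k * sigma band."""
    return abs(difference_mm) > k * sigma_diff_mm


sigma_d = sigma_of_difference(3.0, 3.0)   # both methods: about 3 mm
band_mm = 2.0 * sigma_d

print(f"sigma_D ~ {sigma_d:.2f} mm, 95% band ~ +/-{band_mm:.1f} mm")
print(is_significant(1.5, sigma_d), is_significant(-11.0, sigma_d))
```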
In Fig. 12, above, the elevation profile is plotted, again separated according to our result and classical surveying. The "ramps" at the beginning and at the end of the track, where a height difference of 5 to 7cm is visible, can be observed well in both profiles. A technical explanation for these ramps is that the rails are not loaded at the end because the cranes do not drive into this area. In general, the slope to the waterside (ascending stations) is visible. The difference between the two profiles is shown in the lower graph in Fig. 12. This curve is much smoother than that of the location. At the ends of the original distance graphs in the upper graph, it can be seen very clearly that these curves are shifted laterally. This shift leads to the rather large deviations of 8 to 10mm and supports the hypothesis that a residual deviation could result from the datum definition. Since the deviation becomes larger towards the water side, there could be a scale error. This assumption will be investigated in future work.

SUMMARY
This article reports the first results from the BMVI project. The aim of this project is to carry out a rail survey at the Container Terminal in Altenwerder using very high resolution drone data. The special requirement in this project is the high absolute positional accuracy to be satisfied: over a length of approx. 300m a positional tolerance of 10mm is allowed; the height tolerance is more moderate at 100mm. The challenge in selecting a UAV/camera system configuration is that the flight altitude must not fall below 35m above ground and that the recordings should also be made during crane operation. The nominal GSD that can be achieved with the selected system is 0.9mm. First preliminary results show that a fully automatic data flow from the image acquisition to the determination of the rail position is possible. However, a thorough accuracy analysis proves to be difficult for the data set, since the reference data were recorded with similar accuracy. In future work, the methodology for extracting the rail edge will be improved, for example by estimating both edges of the rail. Regarding the measurement of the ground control points, the (weather) conditions should be better than in the campaign used here, and it should be ensured that the two measurement methods use the same datum. Another goal is to reduce the number of required control points. It should be noted, however, that the internal accuracy theoretically achievable by the arrangement and system equipment presented here can only be transferred to object space and verified with a very high measurement effort.