UCalMiCeL - UNIFIED INTRINSIC AND EXTRINSIC CALIBRATION OF A MULTI-CAMERA-SYSTEM AND A LASERSCANNER

Unmanned Aerial Vehicles (UAVs) with adequate sensors enable new applications in the scope between expensive, large-scale, aircraft-carried remote sensing and time-consuming, small-scale, terrestrial surveying. For these applications, cameras and laserscanners are a good sensor combination due to their complementary properties. To exploit this sensor combination, the intrinsics and relative poses of the individual cameras and the relative poses of the cameras and the laserscanners have to be known. In this manuscript, we present a calibration methodology for the Unified Intrinsic and Extrinsic Calibration of a Multi-Camera-System and a Laserscanner (UCalMiCeL). The innovation of this methodology, which is an extension of the calibration of a single camera to a line laserscanner, is a unifying bundle adjustment step to ensure an optimal calibration of the entire sensor system. We use generic camera models, including pinhole, omnidirectional and fisheye cameras. For our approach, the laserscanner and each camera have to share a joint field of view, whereas the fields of view of the individual cameras may be disjoint. The calibration approach is tested with a sensor system consisting of two fisheye cameras and a line laserscanner with a range measuring accuracy of 30 mm. We evaluate the estimated relative poses between the cameras quantitatively by using an additional calibration approach for Multi-Camera-Systems based on control points which are accurately measured by a motion capture system. In the experiments, our novel calibration method achieves a relative pose estimation with a deviation below 1.8° and 6.4 mm.


INTRODUCTION
Capturing spatial information with sensors carried by an Unmanned Aerial Vehicle (UAV) has become popular in recent years due to low costs and a flexible field of applications (Armenakis, 2015). These new applications arise in the scope between expensive, large-scale, aircraft-carried remote sensing and time-consuming, small-scale, terrestrial surveying. A typical Unmanned Aerial System (UAS) is equipped with optical sensors for the purpose of documentation, localization of the UAS or mapping of the environment. The localization task is often tackled by using Visual Odometry algorithms (Nistér et al., 2004) or, more recently, with approaches which utilize convolutional neural networks, like PoseNet (Kendall et al., 2015) or SqueezePoseNet (Mueller et al., 2017). Furthermore, Structure from Motion (SfM) approaches, like Bundler (Snavely et al., 2008), or visual Simultaneous Localization and Mapping (SLAM) algorithms additionally handle the task of mapping in 3D by using images from single cameras (Engel et al., 2017), stereo cameras (Mur-Artal and Tardós, 2016) or Multi-Camera-Systems (MCSs) (Urban and Hinz, 2016). Instead of reconstructing the environment from visual observations, range measurements with laserscanners are practicable and can lead to dense and accurate results (Bosse et al., 2012; Weinmann et al., 2017). The manufacturers of laserscanners see market potential in UAVs and have launched several lightweight products in recent years, like the RIEGL miniVUX-1UAV, the SICK TIM551 or the Hokuyo UTM-30LX-EW. Further, the combination of cameras and laserscanners attached to a UAV provides a powerful tool to capture and analyze spatial information (Jutzi et al., 2014; Premebida et al., 2009).

For accurate spatial reconstructions, the sensor system has to be geometrically calibrated, i.e. the intrinsics and relative poses of the individual cameras as well as the relative poses of the cameras and the laserscanner have to be known (Figure 1). Previous work concentrated on estimating the relative pose of cameras in a MCS and the relative pose of a single camera and a line laserscanner. In contrast, the purpose of our method, namely Unified Intrinsic and Extrinsic Calibration of a Multi-Camera-System and a Laserscanner (UCalMiCeL), is to estimate the relative poses of the MCS and the laserscanner in a unified approach. In addition to the calibration of the entire sensor system, this approach offers an alternative calibration method for a MCS without joint fields of view. For this purpose, the laserscanner is utilized to create a connection between the independent observations of the individual cameras. Either way, the cameras and the laserscanner have to share a joint field of view. Typical line laserscanners have fields of view of 270° or even 360°, which makes it easy to achieve an overlap with the cameras.
Further, to obtain ground truth to compare with, we implement a classical approach to calibrate a MCS by using accurately measured control points. Thereby, we are able to evaluate the calibration results quantitatively. This manuscript is organized as follows. Previous work related to the calibration of a MCS and the calibration of a single camera and a line laserscanner is summarized in Section 2. In Section 3, the methodology of our novel calibration approach for the entire sensor system consisting of multiple cameras and a line laserscanner is described. For convenience, the article is formulated on the basis of one laserscanner. However, the approach can easily be adapted to a sensor system with multiple cameras and multiple laserscanners. The experiments and results of the introduced calibration approach are presented in Section 4. Finally, in Section 5, we conclude this contribution.

RELATED WORK
The calibration of a sensor system consisting of cameras and line laserscanners in previous work can be subdivided into the calibration of a MCS and the calibration of a line laserscanner and a single camera. The calibration of a MCS can further be subdivided into methods assuming a joint field of view and methods that allow arbitrarily arranged but rigidly coupled cameras.
The calibration of a MCS with joint fields of view is well studied. Researchers use reference bars (Maas, 1999), laser pointers or LED markers (Baker and Aloimonos, 2000; Barreto and Daniilidis, 2004; Kurillo et al., 2008; Svoboda, 2003) or active self-calibration (Brückner et al., 2014) to estimate the extrinsics of the cameras in a reference frame, which consequently leads to the relative poses of the MCS. These methods are usually used to calibrate motion capture or rigid body tracking systems.
The task of calibrating a MCS with arbitrarily arranged, but rigidly coupled cameras is more challenging. The core challenge is to establish correspondence between the observations of the different cameras. This challenge can be tackled by moving the MCS in a field of control points, which are known in a reference frame (Blaser et al., 2017). For a method like this, the control points have to be measured with an extra sensor like a tachymeter or a lasertracker. To deal with this drawback, other approaches use natural features extracted in static environments. These approaches estimate the extrinsics of the cameras for different points of view and exploit the rigidity of the system to estimate the relative poses. The latter is usually performed by a nonlinear refinement step. The extrinsics are estimated by using Wide-Baseline Matching, Structure from Motion (Esquivel et al., 2007), Visual Odometry (Heng et al., 2013) or SLAM (Carrera et al., 2011; Urban and Hinz, 2016). These methods usually refine not only the calibration parameters, but also the intrinsics of the individual cameras and the estimated feature locations if applicable.
For the calibration of a line laserscanner to a single camera, many approaches exist. Most of the approaches extract one or multiple planes in images taken from different points of view in conjunction with corresponding points or lines in the observation of the laserscanner. The calibration task is formulated as the registration of the corresponding observations from the camera and the laserscanner. Many of the existing approaches make use of special calibration objects like triangles, folding patterns, cubes or more complex calibration objects consisting of multiple connected planes (Hu et al., 2016; Sim et al., 2016; Dong and Isler, 2016; Li et al., 2016; Chen et al., 2016; Kong et al., 2013; Yu et al., 2013). Most of these calibration objects are additionally equipped with checkerboards or markers. Thereby, the pose of the camera can be estimated from a single image. For practical reasons, the most widely used calibration methods rely on planar checkerboards as they are commonly utilized for intrinsic camera calibrations (Zhang and Pless, 2004; Zhou, 2014; Tulsuk et al., 2014; Vasconcelos et al., 2012). In contrast, the usage of scene corners avoids the need for special calibration objects (Gomez-Ojeda et al., 2015), which offers potential to be extended to a self-calibration technique. Gräter et al. (2016) follow a specific approach. They mention that many cameras are also sensitive to electromagnetic radiation in the wavelengths emitted by laserscanners. This makes it possible to measure the laserscanning projections directly in the images and thus to obtain corresponding observations. However, this approach is not applicable to every type of sensor combination.

METHODOLOGY
To calibrate a sensor system consisting of multiple cameras and a line laserscanner in a unified approach, we extend the calibration procedure proposed by Urban and Jutzi (2017). This procedure was developed to calibrate line laserscanners to a single camera with a generic camera model, including pinhole, fisheye and omnidirectional cameras. The calibration in Urban and Jutzi (2017) is an extension of the Robust Automatic Detection in Laser Of Calibration Chessboards (RADLOCC) toolbox (Zhang and Pless, 2004) and the minimal approach of registering a set of lines to a set of planes (Vasconcelos et al., 2012). These algorithms treat the calibration of a line laserscanner w.r.t. a single pinhole camera. Section 3.1 states a formulation of our calibration approach. The details of the calibration described by Urban and Jutzi (2017) are recapitulated in Section 3.2, and the extension to the calibration of the entire sensor system, namely UCalMiCeL, is described in Section 3.3. Figure 2 provides an overview of the methodology.

Problem Statement
The purpose of the calibration is to estimate the transformation matrix $M^{MCS}_{L}$, which maps laserscanner measurements to the reference frame of the MCS, and the transformation matrices $M^{MCS}_{C_i}$, which represent the relative pose of the MCS w.r.t. camera $C_i$. Here, $i$ is the index of a camera ($i = 1, ..., N$), $N$ is the number of cameras in the MCS and $M$ are transformation matrices in homogeneous representation. $M^{MCS}_{L}$ is determined by using the transformation matrices $M^{C_i}_{L}$ that map the laserscanner measurements to each camera $C_i$:

$$M^{MCS}_{L} = M^{MCS}_{C_i} \, M^{C_i}_{L} \qquad (1)$$

Figure 2: The top part depicts the estimation of a single camera to a laserscanner (Section 3.2). This part is performed for each camera $C_i$. The remaining projection and bundle adjustment are performed just once to calibrate the entire system (Section 3.3). For simplification, the figure does not show the intrinsic camera calibration, which can also be estimated based on the calibration plane observed in different poses.

By defining the origin of the MCS reference frame as the origin of camera $C_1$ and the rotation as the identity, Equation 1 can be represented as

$$M^{C_1}_{L} = M^{C_1}_{C_i} \, M^{C_i}_{L} \qquad (2)$$

Therefore, the transformation matrices $M^{C_i}_{L}$ that map the laserscanner measurements to camera $C_i$ are required. To estimate approximations for these transformation matrices, we carry out the procedure described in Section 3.2 for each camera $C_i$.
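The chaining of homogeneous transformations described above can be illustrated with a minimal sketch. It assumes that a matrix named `M_A_B` maps points from frame B to frame A; the helper `make_transform` and all numeric values are hypothetical and only serve the illustration.

```python
import numpy as np

def make_transform(rotation_z_deg, translation):
    """Build a 4x4 homogeneous transform from a rotation about z and a translation."""
    a = np.deg2rad(rotation_z_deg)
    M = np.eye(4)
    M[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
    M[:3, 3] = translation
    return M

# Hypothetical example values (not taken from the paper).
M_C2_L = make_transform(30.0, [0.10, 0.00, 0.05])    # laserscanner -> camera C2
M_C1_C2 = make_transform(-90.0, [0.20, 0.00, 0.00])  # camera C2 -> camera C1

# With C1 as the MCS reference frame: laserscanner -> C1 via camera C2.
M_C1_L = M_C1_C2 @ M_C2_L

# A laserscanner point in homogeneous coordinates, mapped in one step or in two.
x_L = np.array([1.0, 2.0, 0.0, 1.0])
x_C1_direct = M_C1_L @ x_L
x_C1_chained = M_C1_C2 @ (M_C2_L @ x_L)
```

Both routes yield the same point in the frame of camera C1, which is the property the calibration relies on when relating all sensors to a single reference frame.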

Single Camera to Laserscanner Calibration
Like the calibration procedures proposed by Zhang and Pless (2004) and Vasconcelos et al. (2012), our calibration requires a plane whose pose can be estimated from the images, e.g. a checkerboard. The calibration plane is observed by camera $C_i$ and the laserscanner in different poses $k$. This makes it possible to estimate the intrinsics of the individual cameras by using a standard calibration approach like the ones proposed by Sturm and Maybank (1999) or Zhang (2000). The transformation matrix $M^{C_i}_{L}$ is determined by utilizing the corresponding observations of the camera and the laserscanner. To this end, we estimate the pose of the calibration plane from each image, e.g. in the case of a checkerboard by extracting the interest points and fitting a plane to the points. In the next step of the procedure, all laserscanner points which correspond to the calibration plane are searched in the assigned observation. Finally, the approximate transformation matrix $M^{C_i}_{L}$ is estimated essentially by using random sample consensus (RANSAC), as in the minimal approach of registering a set of lines to a set of planes (Vasconcelos et al., 2012).
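The step of fitting a plane to the extracted checkerboard points can be sketched as follows. This is a minimal illustration, assuming the 3D coordinates of the corners in the camera frame are already available; the function names `fit_plane` and `point_plane_distances` and the toy corner grid are hypothetical and not part of the original procedure.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through 3D points.

    Returns a homogeneous plane (nx, ny, nz, d) with unit normal, so that
    nx*x + ny*y + nz*z + d = 0 for points on the plane.
    """
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # The normal is the right singular vector belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    d = -normal @ centroid
    return np.append(normal, d)

def point_plane_distances(plane, points):
    """Signed distances of 3D points to a homogeneous plane with unit normal."""
    points = np.asarray(points, dtype=float)
    return points @ plane[:3] + plane[3]

# Hypothetical checkerboard corners lying on the plane z = 2 in the camera frame.
xs, ys = np.meshgrid(np.arange(4) * 0.05, np.arange(3) * 0.05)
corners = np.column_stack([xs.ravel(), ys.ravel(), np.full(xs.size, 2.0)])
plane = fit_plane(corners)
distances = point_plane_distances(plane, corners)
```

The same distance function can serve as the inlier test of a RANSAC loop, although the minimal solver of Vasconcelos et al. (2012) itself is not reproduced here.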

Unified Bundle Adjustment
The procedure described in Section 3.2 provides approximate solutions to the laserscanner to single camera transformations $M^{C_i}_{L}$. Therefore, an initial guess for the transformation matrices $M^{C_1}_{C_j}$ can be determined by utilizing Equation 2:

$$M^{C_1}_{C_j} = M^{C_1}_{L} \, \left(M^{C_j}_{L}\right)^{-1} \qquad (3)$$

In the final step of UCalMiCeL, we exploit the rigidity of the entire sensor system by refining the transformation matrices $M^{C_1}_{C_j}$ and $M^{C_i}_{L}$ in a unified bundle adjustment. To this end, we project the planes extracted in the images of all cameras to the reference frame of the laserscanner:

$$P^{L}_{ik} = \left(M^{C_i}_{L}\right)^{\top} P^{C_i}_{k} \qquad (4)$$

Here, $P^{L}_{ik}$ is a plane w.r.t. the laserscanner reference frame in homogeneous representation at pose $k$ in camera $i$, and $P^{C_i}_{k}$ denotes a plane extracted in camera $C_i$ at pose $k$ in homogeneous representation. Further, to obtain the optimal transformation matrices $M^{C_1}_{C_j}$ and $M^{C_i}_{L}$ in a least-squares sense, we minimize the distances of all extracted line-segments $S_{ik}$ to their corresponding planes $P^{C_i}_{k}$.
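The initial guess for the camera-to-camera transformation and the projection of a plane into the laserscanner frame can be sketched as follows. The sketch assumes that points map as `x_Ci = M_Ci_L @ x_L` and that a homogeneous plane `p` satisfies `p @ x = 0`, so that planes transform with the transpose; the function names and the pure-translation toy values are hypothetical.

```python
import numpy as np

def initial_camera_to_camera(M_C1_L, M_Cj_L):
    """Initial guess for the transform Cj -> C1 from two laserscanner-to-camera transforms."""
    return M_C1_L @ np.linalg.inv(M_Cj_L)

def plane_to_laser_frame(M_Ci_L, plane_Ci):
    """Map a homogeneous plane from camera frame Ci to the laserscanner frame.

    Assumption: points map as x_Ci = M_Ci_L @ x_L and planes satisfy plane @ x = 0.
    """
    return M_Ci_L.T @ plane_Ci

def translation(x, y, z):
    """4x4 homogeneous transform consisting of a pure translation."""
    M = np.eye(4)
    M[:3, 3] = [x, y, z]
    return M

# Hypothetical approximate solutions of the single-camera calibrations.
M_C1_L = translation(0.2, 0.0, 1.0)  # laserscanner -> camera C1
M_C2_L = translation(0.0, 0.0, 1.0)  # laserscanner -> camera C2

M_C1_C2 = initial_camera_to_camera(M_C1_L, M_C2_L)

# Plane z = 2 in camera C2, expressed in the laserscanner frame.
plane_C2 = np.array([0.0, 0.0, 1.0, -2.0])
plane_L = plane_to_laser_frame(M_C2_L, plane_C2)

# A laserscanner point on that plane keeps its incidence in both frames.
x_L = np.array([0.3, -0.1, 1.0, 1.0])
incidence_L = plane_L @ x_L
incidence_C2 = plane_C2 @ (M_C2_L @ x_L)
```

In this toy setup the plane z = 2 of camera C2 becomes z = 1 in the laserscanner frame, and the point lies on the plane in both representations.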
By using all observations in a unified bundle adjustment, this approach achieves a larger coverage of the field of view of the laserscanner and therefore better geometrical conditions for the pose estimation of the laserscanner compared to a single camera to laserscanner calibration.
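The least-squares objective of the unified bundle adjustment described in this section can be sketched as a stacked vector of point-to-plane distances. This is an illustrative sketch under several assumptions: the parameters are kept as 4x4 matrices rather than Rodrigues vectors, the distance is evaluated in the camera frame, and the function `unified_residuals`, its data layout and the toy values are hypothetical. The residual vector could be handed to any nonlinear least-squares solver; the choice of solver is not specified here.

```python
import numpy as np

def unified_residuals(M_C1_L, M_C1_Cj_list, observations):
    """Stack point-to-plane distances over all cameras and poses.

    M_C1_L       : 4x4 transform, laserscanner -> camera C1
    M_C1_Cj_list : list of 4x4 transforms, camera Cj -> camera C1 (identity for C1)
    observations : list of (camera_index, plane_Ci, points_L), where plane_Ci is a
                   homogeneous plane with unit normal in the camera frame and
                   points_L is an (n, 3) array of laserscanner points on that plane
    """
    residuals = []
    for cam, plane_Ci, points_L in observations:
        # Rigidity: every laserscanner-to-camera transform derives from shared parameters.
        M_Ci_L = np.linalg.inv(M_C1_Cj_list[cam]) @ M_C1_L
        pts_h = np.column_stack([points_L, np.ones(len(points_L))])
        pts_Ci = pts_h @ M_Ci_L.T            # laserscanner points in camera Ci
        residuals.append(pts_Ci @ plane_Ci)  # signed distances to the plane
    return np.concatenate(residuals)

def translation(x, y, z):
    """4x4 homogeneous transform consisting of a pure translation."""
    M = np.eye(4)
    M[:3, 3] = [x, y, z]
    return M

# Hypothetical toy setup with pure translations (not the data of the paper).
M_C1_L = translation(0.0, 0.0, 1.0)
M_C1_Cj_list = [np.eye(4), translation(0.2, 0.0, 0.0)]
observations = [
    (0, np.array([0.0, 0.0, 1.0, -2.0]), np.array([[0.0, 0.0, 1.0], [0.5, 0.1, 1.0]])),
    (1, np.array([1.0, 0.0, 0.0, -0.3]), np.array([[0.5, 0.0, 0.0], [0.5, 0.4, 0.2]])),
]
r_exact = unified_residuals(M_C1_L, M_C1_Cj_list, observations)
r_shifted = unified_residuals(translation(0.0, 0.0, 1.01), M_C1_Cj_list, observations)
```

With consistent parameters all residuals vanish. Shifting the laserscanner pose by 10 mm along z changes only the residuals of the plane that constrains this direction, which illustrates why observations from several cameras with differently oriented planes improve the conditioning of the pose estimation.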

EXPERIMENTS AND RESULTS
We perform UCalMiCeL with a sensor system consisting of two fisheye cameras of the type VRmagic VRmC-12/BW OEM and a laserscanner of the type Hokuyo UTM-30LX-EW. The cameras are arranged with a divergent view angle. Due to the large field of view of the fisheye cameras of 185°, the images have a small overlap. However, the calibration approach does not require a joint field of view of the cameras. The laserscanner is mounted onto the front of the UAS at an oblique angle, as sketched in Figure 1. This allows scanning the ground ahead of and next to the UAS. The laserscanner has a small size of 62 mm × 62 mm × 87.5 mm and a weight of 210 g. Therefore, and because of its low cost, it is frequently applied to UASs (Huh et al., 2013; Mader et al., 2015). Table 1 summarizes the specifications of all sensors used in our experiments.
The origins of the laserscanner and the cameras cannot be measured directly, because they are not physically accessible. Therefore, providing ground truth for the calibration task is hard to accomplish. We compare our calibration method with a method which uses control points. For our setup, we attach five markers used by a motion capture system to a standard checkerboard, which is commonly utilized for intrinsic camera calibrations. Thus, the 6DoF pose of the checkerboard w.r.t. the motion capture reference frame can be determined accurately. Moreover, we measure the position of the four outer checkerboard corners with the motion capture system and determine the position of each checkerboard corner w.r.t. the attached markers. Consequently, we obtain the accurate position of every checkerboard corner in the motion capture reference frame.
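Expressing each checkerboard corner in the motion capture reference frame can be sketched as a rigid transformation of a corner grid. The sketch assumes that the tracked 6DoF pose is given as a rotation matrix and a translation vector and that the corners are known in the board frame; the function names, the grid size and the pose values are hypothetical.

```python
import numpy as np

def corners_in_board_frame(rows, cols, square_size):
    """Corner grid of a planar checkerboard in its own frame (z = 0), row by row."""
    ys, xs = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")
    return np.column_stack([xs.ravel() * square_size,
                            ys.ravel() * square_size,
                            np.zeros(rows * cols)])

def corners_in_mocap_frame(R_board, t_board, corners_board):
    """Apply the tracked board pose (rotation R_board, translation t_board) to all corners."""
    return corners_board @ R_board.T + t_board

# Hypothetical tracked pose: 90 degree rotation about z, shifted 1 m along x.
R_board = np.array([[0.0, -1.0, 0.0],
                    [1.0,  0.0, 0.0],
                    [0.0,  0.0, 1.0]])
t_board = np.array([1.0, 0.0, 0.0])

corners_board = corners_in_board_frame(rows=2, cols=3, square_size=0.05)
corners_mocap = corners_in_mocap_frame(R_board, t_board, corners_board)
```

Repeating this for every tracked frame yields a set of 3D control points with known coordinates, one set per checkerboard pose.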
The checkerboard is moved in front of the cameras with different orientations and distances. Throughout the acquisition, the pose of the checkerboard is tracked by the motion capture system at 360 frames per second and the positions of the checkerboard corners are computed.
To determine the image points of the checkerboard corners in each image with subpixel accuracy, we use a well-known detection algorithm by Geiger et al. (2012). Since we know the position of every checkerboard corner in the motion capture reference frame, we are able to compute the extrinsics of the camera in the reference frame based on 2D-3D correspondences by using OPnP (Zheng et al., 2013). We further refine the intrinsics of each individual camera and the extrinsics in a Levenberg-Marquardt optimization step, which minimizes the backprojection errors. Finally, we use the optimized intrinsics and extrinsics of the cameras to estimate the relative poses of the MCS by another Levenberg-Marquardt optimization. As we are able to create an arbitrarily dense set of control points with an intended spatial arrangement and with an accuracy of a few millimeters due to the motion capture system, we denote the result of this method as ground truth.
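How a relative camera pose follows from two extrinsics expressed in the same reference frame can be sketched as follows. This only shows the closed-form composition that could serve as a starting value; the Levenberg-Marquardt refinement of the paper is not reproduced. The names `relative_pose` and `make_transform` and all values are hypothetical, and a matrix `M_A_B` is assumed to map points from frame B to frame A.

```python
import numpy as np

def make_transform(rotation_z_deg, translation):
    """Build a 4x4 homogeneous transform from a rotation about z and a translation."""
    a = np.deg2rad(rotation_z_deg)
    M = np.eye(4)
    M[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
    M[:3, 3] = translation
    return M

def relative_pose(M_C1_W, M_C2_W):
    """Transform mapping camera C1 coordinates to camera C2 coordinates.

    M_Ci_W maps points from the motion capture (world) frame W into camera Ci.
    """
    return M_C2_W @ np.linalg.inv(M_C1_W)

# Hypothetical extrinsics of both cameras for one time instant.
M_C1_W = make_transform(0.0, [0.0, 0.0, 1.0])
M_C2_W = make_transform(90.0, [0.1, 0.0, 1.0])
M_C2_C1 = relative_pose(M_C1_W, M_C2_W)

# Consistency check with an arbitrary world point.
x_W = np.array([0.4, -0.2, 0.7, 1.0])
x_C2_direct = M_C2_W @ x_W
x_C2_via_C1 = M_C2_C1 @ (M_C1_W @ x_W)
```

Because the cameras are rigidly coupled, this relative pose should be the same for every time instant, which is the constraint a joint optimization over all frames can exploit.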
The data for UCalMiCeL are acquired as follows. We move the checkerboard in the joint field of view of the laserscanner and the cameras. While capturing, we hold each position for some seconds to eliminate remaining errors in the synchronization of the cameras and the laserscanner as well as to increase the accuracy of the distance measurements of the laserscanner by averaging five consecutive observations. To ensure that the minimal solution described by Vasconcelos et al. (2012) estimates a correct pose of the scanner w.r.t. the cameras, a set of well distributed laserscanner segments with varying orientations is needed. Figure 3 shows the extracted laserscanner segments in the laserscanner reference frame which correspond to the extracted checkerboard planes. Note that outliers are already rejected by a RANSAC step (Fischler and Bolles, 1981) during the estimation of the minimal solution by Vasconcelos et al. (2012).
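The effect of averaging five consecutive observations can be illustrated with a small simulation. It assumes independent, zero-mean Gaussian range noise with the stated standard deviation of 30 mm, which is a simplification and not a claim about the actual noise characteristics of the laserscanner; the scene and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

sigma_single = 0.030  # standard deviation of a single range measurement in metres
n_scans = 5           # number of consecutive observations that are averaged

# Hypothetical static scene: 1000 beams that all measure a true range of 2 m.
true_ranges = np.full(1000, 2.0)
scans = true_ranges + rng.normal(0.0, sigma_single, size=(n_scans, true_ranges.size))

# Per-beam average over the consecutive scans.
averaged = scans.mean(axis=0)

# For independent noise the standard deviation shrinks by sqrt(n_scans).
expected_sigma = sigma_single / np.sqrt(n_scans)
empirical_sigma = (averaged - true_ranges).std()
```

Under this assumption the noise of the averaged ranges drops from 30 mm to roughly 13 mm.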
In Table 2, the estimated parameters of the relative pose of the second camera w.r.t. the first camera ($M^{C_2}_{C_1}$) of both calibrations are presented. Here, the approximate solution is the result of the calibration before the final optimization, i.e. the approximate estimation of the minimal solution by Vasconcelos et al. (2012) without exploiting the rigidity of the entire sensor system. For enhanced interpretability of the results, the rotation part of the pose estimation is presented in Euler angles, whereas we use Rodrigues vectors for optimization. The approximate solution provides an estimation of the Euler angles which is already close to the estimation of the ground truth calibration. However, the translations deviate by about two centimeters in each direction in space. In Table 3, the Relative Pose Error (RPE) of the approximate solution and of the estimation by UCalMiCeL is presented. The rotation estimation is slightly worse for the first of the three angles. However, the estimation of the other two angles and particularly of the translation clearly outperforms the approximate solution. The final calibration result deviates a few millimeters from the ground truth calibration.

We also evaluate the resulting laserscanner to MCS calibration visually by backprojecting the laserscanner points to the image. To this end, we use the estimated transformation matrix $M^{C_1}_{L}$ and the intrinsics of the camera. Figure 4 shows a typical example of this backprojection for an image used in the calibration. The corresponding observations of the camera and the laserscanner match very well and the laserscanner points are well aligned with the checkerboard plane. Moreover, to evaluate the quality in a common environment, Figure 5 presents the backprojected laserscanner observation in a desktop scene with distinct depth discontinuities. Figures 5a and 5b show the images of the two fisheye cameras. Figures 5c and 5d depict detailed views of these images. In the middle of Figure 5d, a layover effect can be observed. The laserscanner pulses are reflected by the wall while the camera observes the computer case in the same image section. This parallax is caused by the distance between the optical center of the camera and the origin of the laserscanner. In the case of Figure 5d, the laserscanner is located to the left of the camera and consequently observes more points on the wall next to the computer case. In contrast, in Figure 5c, the laserscanner is located to the right of the camera. Therefore, some points to the left of the computer case seem to be missing. Finally, Figures 5e and 5f show two other examples of the parallax. At far ranges, the effect of the parallax becomes smaller.
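As a complement to the quantitative evaluation in this section, a deviation between an estimated and a reference relative pose can be computed as sketched below. This sketch summarizes the rotation deviation as a single angle and the translation deviation as a norm, which is an assumption made for brevity; the paper reports per-axis Euler angles and translations. The function name `relative_pose_error` and the example values are hypothetical.

```python
import numpy as np

def relative_pose_error(M_est, M_gt):
    """Rotation angle (degrees) and translation norm of the residual transform inv(M_gt) @ M_est."""
    delta = np.linalg.inv(M_gt) @ M_est
    cos_angle = (np.trace(delta[:3, :3]) - 1.0) / 2.0
    angle_deg = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return angle_deg, np.linalg.norm(delta[:3, 3])

# Hypothetical reference pose and an estimate that is off by 1 degree and 5 mm.
M_gt = np.eye(4)
a = np.deg2rad(1.0)
M_est = np.eye(4)
M_est[:3, :3] = [[np.cos(a), -np.sin(a), 0.0],
                 [np.sin(a),  np.cos(a), 0.0],
                 [0.0,        0.0,       1.0]]
M_est[:3, 3] = [0.003, 0.004, 0.0]

angle_error_deg, translation_error = relative_pose_error(M_est, M_gt)
```

For the toy values the function returns a rotation deviation of 1 degree and a translation deviation of 5 mm.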

CONCLUSION
In this contribution, a novel unified approach for the intrinsic and extrinsic calibration of a sensor system consisting of multiple cameras and a laserscanner, namely UCalMiCeL, is presented. The challenge of establishing a connection between the individual observations of multiple cameras without a joint field of view is tackled by using a line laserscanner. In other words, for the calibration approach the laserscanner and the cameras have to share a joint field of view, whereas the fields of view of the cameras may be disjoint. The calibration does not require a special preparation of the environment. It only requires a plane whose pose can be estimated based on the images, e.g. a checkerboard as it is frequently used for intrinsic camera calibrations. By considering all observations in a unified bundle adjustment, this approach achieves a larger coverage of the field of view of the laserscanner and therefore better geometrical conditions for the pose estimation of the laserscanner compared to a single camera to laserscanner calibration.
We test UCalMiCeL with a sensor system consisting of two fisheye cameras and a laserscanner on the basis of real data. To be able to evaluate the calibration results quantitatively, we moreover implement an additional approach for calibrating a multi-camera system, which uses control points determined by a motion capture system. As we are able to create an arbitrarily dense set of control points with an intended spatial arrangement and with an accuracy of a few millimeters due to the motion capture system, we denote the result of this method as ground truth.
The results of the test calibrations show a Relative Pose Error of a few degrees and a few millimeters between the ground truth calibration and UCalMiCeL. Considering that the range measurement of the laserscanner used has a low quality with a standard deviation of 30 mm, the calibration result is very good. The good quantitative results are supported by the visual results, which consist of a single laserscanner observation backprojected to the assigned image. The range measurements match the optical observations to within a few pixels. Future work will examine the influence of the coverage of the field of view of the laserscanner on the calibration result. With regard to recent publications which use scene corners instead of calibration objects consisting of multiple connected planes to estimate the relative pose of a single camera and a laserscanner (Gomez-Ojeda et al., 2015), our approach has potential to be extended to a self-calibration method for a sensor system consisting of multiple cameras and line laserscanners.

Figure 1: Sketch of a UAS with the minimal configuration of sensors utilized by our calibration approach. The purpose of the calibration is to estimate the transformation matrices M.

Figure 3: Extracted segments in the laserscanner reference frame which are used to estimate the pose of the laserscanner w.r.t. the MCS. Red dotted markers represent the segments that correspond to planes extracted in the left camera, blue dotted markers correspond to planes extracted in the right camera. The black asterisk represents the origin and the hatched area is the blind angle of the laserscanner. By using all observations in a unified adjustment, we achieve a larger coverage of the field of view of the laserscanner compared to a single camera to laserscanner calibration.

Figure 4: Laserscanner segment backprojected to the image. Green plus markers represent the extracted interest points used for the estimation of the plane pose. Red dotted markers represent the corresponding laserscanner points. In the data used for calibration, the corresponding observations match well.

Figure 5: Backprojected laserscanner observation. (a) Image taken by left camera, (b) Image taken by right camera, (c) Detailed view of the left image, (d) Detailed view of the right image, (e) Example of the parallax at close range, (f) Example of the parallax at far range. The color represents the range measured by the laserscanner (blue: close range, red: far range). For convenience, the color spectrum for the cases (c)-(f) is adjusted to the individual image selection.

Table 1: Specifications of all sensors used in the experiments.

Table 3: Relative Pose Error with respect to the ground truth.