DESIGN OF AN INDOOR MAPPING SYSTEM USING THREE 2 D LASER SCANNERS AND 6 DOF SLAM

We present the design for a new indoor mapping system based on three 2D laser scanners as well as a method to process the range measurements such that the pose of the system and the planes of floor, ceiling and walls can be estimated simultaneously. By the combined use of the measurements of all three scanners the pose of the system can be reconstructed in 3D without the need for an IMU. The six pose parameters are modelled as a continuous function over time such that scan line deformations caused by rapid scanner movements do not lead to biases in the estimated poses. The theoretical feasibility of the approach is demonstrated by analysing reconstruction results derived from simulated sensor data of two indoor models. Assuming a perfectly calibrated sensor and ranging noise of 3 cm, the results on data in 10x20 m corridor show that the plane orientation precision is better than 0.1 degree and that the standard deviation of plane-to-plane distances is below 1.5 cm after three loops in the corridor.


INTRODUCTION
Indoor position and mapping has been an active research topic within the robotics community for several decades (Thrun, 2002).While the early research was often motivated by the need for autonomous navigation of robots, including obstacle avoidance, more recent work clearly also aims at 3D modelling of the indoor environment (Biber et al., 2004, Borrmann et al., 2008;Henry et al., 2014).With the increasing demands for indoor modelling (Worboys, 2011;Zlatanova et al., 2013) various commercial systems for indoor mapping have now been introduced.Trimble and Viametris both developed laser scanning systems mounted on a push cart.While Trimble integrated an IMU for indoor positioning of the TIMMS platform (Trimble, 2014), Viametris uses the scans of a horizontally mounted 2D laser scanner in their i-MMS for simultaneous localisation and mapping (SLAM) (Viametris, 2014).CSIRO developed Zebedee (commercially available as ZEB1) with a single 2D laser scanner and IMU on a spring (Bosse et al., 2012).Thomson et al. (2013) analysed the performance of i-MMS and Zebedee.Several tables in their paper indicate that i-MMS generated a higher accuracy point cloud than Zebedee.Yet, i-MMS as well as TIMMS only include 2D positioning and can therefore not be used on ramps, across thresholds, or on staircases.In this paper we present the design and data processing for a new indoor mapping system that allows accurate 3D indoor positioning without an IMU.The system only uses three 2D laser scanners.While the hardware is the same as in the i-MMS, we use a different configuration of the three scanners.We demonstrate that this enables a new SLAM procedure in which range observations of all three scanners contribute to an accurate 6-DoF pose estimation of the system.We do not use the common scan line matching, but predict the transformation of the next scan lines and associate linear scan segments to earlier reconstructed planar faces.The six transformation parameters are described as a function over time using splines.In this way the transformation varies from point to point such that scan deformations caused by rapid sensor movements are accurately modelled.Using simulated measurements in a corridor with a loop we demonstrate the high precision potential of the approach.After an overview on further related literature in the next section, the design of the new indoor scanning system is described and motivated in section 3.In section 4 we present the mathematical model that links the range measurements to the sensor pose parameters and plane parameters.We also elaborate the data association procedure (determining which points belong to which plane) and a strategy to avoid the need for a continuous global estimation of all pose and plane parameters.
Results of experiments on simulated data are presented in section 5. Borrmann et al. (2008) distinguish four categories of SLAM approaches using laser scanners: (a) 2D positioning and 2D mapping with a single 2D laser scanner.Many examples are found in early robotics work (Lu and Milios, 1994;Thrun, 2002).(b) 2D positioning combined with multiple scanners that capture the 3D geometry of the environment.The already mentioned TIMMS and i-MMS systems are in this category.The motion of such systems is restricted to a (typically horizontal) plane.(c) 3D positioning combined with a slice wise capturing of the environment with a single 2D laser scanner.In this case additional sensors are required for the platform positioning, e.g.IMU or visual odometry.The system for terrain analysis by Thrun et al. (2006), but also the Zebedee scanner (Bosse et al., 2012), combines an IMU with a 2D laser scanner.(d) 3D positioning combined with 3D range imaging.Here 3D scans are obtained at many nearby locations.This can be done in a stop-and-go fashion with a terrestrial laser scanner, like in (Borrmann et al., 2008), or with range cameras like Kinect that instantaneously capture a 3D point cloud (Henry et al., 2014).Many of the reported studies in these four categories make use of an IMU to improve the positioning accuracy, although, except for the third category, an IMU is not strictly required to obtain a positioning solution.Several systems have been designed to capture indoor environments with sensors attached to a backpack frame.This offers a large freedom to move around, but clearly requires 3D (and not just 2D) pose estimation.Elseberg et al. (2013) combined a 2D laser scanner mounted horizontally with a Riegl terrestrial laser scanner and concluded that global scene constraints are required to obtain an optimal solution.Naikal et al. (2009) combined three orthogonally mounted 2D laser scanners with a camera.The laser scanners were used to estimate the backpack's yaw, pitch and roll rotations by scan line matching.Estimates for the translation were obtained through visual odometry.In later work by the same group Chen et al. ( 2010) combine 2D laser scanners with an IMU, but they also provide a solution for 3D pose estimation without the IMU by independent analyses of the scans of the sensors.I.e., the scans of the horizontally mounted sensor are used to estimate changes in the 2D pose (yaw and X-and Y-translation).Next, the scans in the plane perpendicular to the direction of motion are used to estimate the roll angle change.Finally, the scan line matching in data of a third scanner is used to estimate changes in pitch and height.As the scans of the different scanners are processed separately, they need to make the assumption that the scanners remain scanning in the same planes.The commercially available systems make use of Hokuyo and SICK scanners.Some of their specifications are briefly described as reference for the following discussion on the system design presented in this paper.Bosse et al. (2012) describe the design and data processing of Zebedee.It employs the Hokuyo UTM-30LX 2D scanner with an opening angle of 270° and 40 scans per second with 1080 range measurements per scan.According to the scanner manufacturer the maximum range is 30 m and the ranging noise 3 cm.To associate segments of new scans to surface patches previously reconstructed they use an ICP-like registration.Pose parameters are estimated at fixed intervals and interpolated in between.The i-MMS system by Viametris uses three Hokuyo scanners of the same type.One horizontally mounted scanner is used for the 2D pose estimation.The two further scanners are scanning in two orthogonal vertical planes, both under an angle of 45° to the push cart's direction of motion (Viametris, 2014).The TIMMS system by Trimble is equipped with two SICK scanners scanning in the same vertical plane perpendicular to the direction of motion.With an opening angle of 180°, the two scanners combined cover the full circle at a rate of 75 Hz.The SICK scanner has a ranging accuracy of 0.5-1.5 cm, a maximum range of 16 m and an angular increment of 1° (Trimble, 2014).

DESIGN
For indoor mapping 2D range sensors are attractive because of their good accuracy over a longer range.Range cameras like Kinect would have the advantage that they capture a part of a scene in 3D at an instant.However, their limited field of view and relatively short maximum range, typically no more than 5-10 m, make range cameras unsuitable for data acquisition in larger indoor spaces (airports, hospitals, museums) and it's particular those kind of public indoor spaces for which the demand for indoor modelling is apparent.Accurate capturing of long corridors with image based systems is also challenging since the base-to-depth ratio is small when viewing in the direction of a corridor and many images need to be connected when viewing in perpendicular directions.The currently available commercial systems show a preferred use of 2D laser scanners to capture the geometry of indoor spaces.Only the ZEB1 (Zebedee) system provides a 3D positioning solution.The use of a single 2D laser scanner combined with an IMU seems, however, less accurate than i-MMS with 2D positioning based on SLAM (Thomson et al., 2013).We therefore investigate whether we can exploit the strength of the scanning geometry for accurate positioning in three dimensions and thereby eliminate the drift problems associated with the use of IMU's.For this we design a scanning system with three 2D laser scanners in a configuration as shown in Figure 1.This system has not yet been built.This study serves to analyse the feasibility and performance of the designed scanner.We plan to use three Hokuyo scanners with a 270° opening angle.The top scanner (green in Figure 1) is scanning in an approximately horizontal plane.It should scan the walls, but does not need to be exactly horizontal.The other two scanners are placed at an angle of 45° with the moving direction (like for the i-MMS), but are also tilted by 45°.The rotations of these two scanners are therefore comparable to those favoured in mobile laser scanning systems by Riegl and Optech.The first rotation w.r.t the moving direction aims to cover surfaces perpendicular to the moving direction.To cover all walls it is important to have a forward and a backward looking part of the scans on both sides of the scanner (see right side of Figure 1).We therefore rotate the scanners around their axes such that one scanner has a gap on the ceiling and the other scanner has a gap on the floor.The horizontal scanner is rotated such that the gap is to the left.In this way capturing of the walls in front and behind the scanner is ensured.The tilt of the two scanners generates slanted scans on the walls.This tilt is very important for our SLAM approach.Compared to vertical scans it has two advantages.First, a slanted scan line will more likely capture a part of the wall face covered by the previous scan line.E.g. in case of an open door a sequence of slanted scans may continuously capture parts of the wall left or right of the door.In case of a vertical scan line all or nearly all points of a next scan line may be measurements on another surface seen through the door opening.This will weaken the capability to reconstruct the sensor position if this other surface has not been observed before.Second, a slanted scan line will also strengthen the system of observation equations near concave corners.Here, many scan lines will cover the two surfaces on both sides of the corner and thereby increase the accuracy of the pose estimation.

6DOF SLAM WITH 2D LASER SCANNERS
We want to simultaneously retrieve the 3D sensor pose and the plane parameters from the observed scans of the three scanners.After defining the parameterisation of the poses and planes (4.1) we elaborate the observation equation (4.2) and describe the scan line segmentation (4.3).The procedure to determine which range measurements belong to which planes is explained in paragraph 4.4 followed by a description of the initialisation using the data of the first scan that also defines the local coordinate system (4.5).In this paper we use the term scan to refer to a set of three scan lines that are captured simultaneously by the three scanners (as shown in Figures 1 and 3).Optimisation of the estimation procedure is discussed in paragraph 4.6.

Parameterisation
We assume that the three scanners are rigidly mounted on a frame and that the rotations and offsets between these scanners have been calibrated.We can then define a three-dimensional frame coordinate system and use the observed ranges (and known directions of the laser beam) to calculate the coordinates of a reflecting point in the frame coordinate system.We want to determine the rotation and translation between the frame coordinate system and a locally defined model coordinate system in which the indoor model is to be described.As the sensor frame is constantly moving, the rotation and translation are continuous functions over time .
with as a function of three angles , , and .All six pose parameters are modelled as a function of time using splines.E.g., for : , with , as the spline coefficient of on spline interval and as the B-spline value at time on interval .Planes of walls, ceiling and floor are defined by a normal vector and a distance to the origin in the model coordinate system.While the observation equation can be elaborated for arbitrarily oriented planes, currently we only use horizontal and vertical planes.These are parameterised by and X cos sin , respectively.

Observation equation
Assuming that an observed point in the frame coordinate system is correctly associated to a certain plane, we expect that the distance between this point transformed to the model coordinate system and the plane equals zero.

0
We linearize the splines of the six transformation parameters.E.g. for : where the upper index 0 denotes the approximate value.The rotation is linearized by This then leads to the following observation equation where the observed distance is calculated using the approximate transformation and plane parameters.In bold are the unknown increments to the approximate spline coefficients and plane parameters.The increment Δ to the plane orientation is only included for vertical planes.

Scan line segmentation
We segment the scans of the three scanners into linear pieces in the frame coordinate system.For this we use a greedy line growing procedure followed by an optimisation of the end points of adjacent segments.The segmentation is used to improve the data association.The association of linear segments to planes instead of individual points to planes is faster and more reliable.

Data association
Suppose that the pose splines are determined up to a certain scan.To extend the pose estimation to the next scan we need to associate the segments in that scan to the planes that have been reconstructed so far.To test potential associations we extrapolate the pose splines over the time period of the next scan.These transformation parameters are used to obtain an approximate location of the segments in the model coordinate system.When the distance between a segment and a planar face is within some limit, the segment can be associated to this plane and observation equations can be formulated for the points of the segment.
When a linear segment of some minimum size cannot be associated to an already reconstructed plane, it is used to instantiate a new plane.Under our current assumption that planes are either vertical or horizontal, the knowledge of the slope of a single segment and of the scanner that measured the segment is sufficient to decide whether the plane is horizontal or vertical.In the more general case of arbitrary plane orientations information of a few adjacent scans with known pose is required to instantiate a new plane.

Initialisation
The first scans of the three scanners are used to establish the model coordinate system.We assume that the scanner frame is not strongly rotated.In that case, the scans of the two tilted scanners (orange en purple in Figure 1) can be used to identify linear segments on the floor and/or ceiling.Assuming that these surfaces are parallel, we define the direction of the Z-axis in the model coordinate system by the vector product of the longest segments of both scanners.We then identify the two longest coplanar segments on walls that are not collinear and determine the plane in the frame coordinate system.The vector product of the normal vector of this plane and the Z-axis then defines the direction of the X-axis.The normal vector of the floor/ceiling and the normal vector of a wall thereby fix the orientation of the model coordinate system.By definition the start of the frame trajectory is taken as the origin of the model coordinate system.
In the estimations with the observation equations the model coordinate system remains unchanged by using the following constraints: (1) the floor and ceiling planes are horizontal, (2) the orientation of the first wall is kept fixed by not estimating Δ , and (3) the start of the trajectory is fixed at the origin by , 0 , 0 and two similar observation equations for the translation in Yand Z-direction.

Parameter estimation
Ideally, one would update all pose and plane parameters after every new association of a scan segment to a plane.In the experiments we used cubic splines with an interval of 5 scans.
With 40 scans per second this leads to additional 48 spline coefficients to be estimated for every second of processed data.
When sorting the variables such that the spline coefficients precede the plane coefficients, the normal matrix takes the structure as shown in Figure 2.Such equation systems can be efficiently solved with Choleski factorization and sparse matrix techniques.Experiments showed that updating all pose and plane parameters is not required as long as the plane parameters are estimated accurately such that errors in those parameters do not lead to incorrect or missed data associations in future scans.We therefore adopt the following strategy for the parameter estimation: The first two scans of each of the three scanners are used to estimate linear functions of the six pose parameters over the time interval of these two scans.New planes are instantiated, but not adjusted.The splines are extended linearly to obtain the predicted pose for the next scan.After associating the segments of the next scan to the already known planes the last two scans are used to again estimate linear functions of the pose parameters over the now shifted time interval, again without updating the plane parameters.Once a plane has collected segment associations over a set of ten successive scans, an integral adjustment is executed with all observed ranges in the past ten scans, cubic splines coefficients of the pose parameters, and all parameters of planes instantiated within the past ten scans.Parameters of earlier instantiated planes are kept fixed as they are considered accurate enough.In this way sufficiently accurate pose and plane parameters are obtained while keeping the adjustments limited to typically no more than 40 parameters.Only after the final scan, all pose and plane parameters are updated simultaneously to obtain the most accurate estimation.

EXPERIMENTS AND RESULTS
To analyse the feasibility of the approach and theoretical precision of the reconstruction two experiments have been conducted in simulated environments.For the first experiment a corridor model with a loop was constructed with the dimensions shown in Figure 3 and a height of 3 m.A path with multiple loops was generated to study the performance over time.The path started at the middle of a long side.Data was simulated with the specifications of the Hokuyo UTM-30LX scanner, but with a reduced point density of 270 points per scan line (instead of 1080 points).A platform speed of 1 m/s was assumed.This leads to a point cloud with around 220,000 and 110,000 points on respectively the longer and shorter outer wall and 91,000 and 17,000 points on respectively the longer and shorter inner wall after one loop through the corridor.
Figure 3. Corridor with a loop.Left: floor plan with dimensions and path.Right: 3D view with one scan of the three scanners.
Loop closure has been disabled in a sense that association of line segments to a plane that had not been observed for a while (100 scans) was not allowed.Instead a new plane was instantiated.As a result wall planes were instantiated again in every loop, but the floor and ceiling plane were instantiated only once.This set up allows investigating the decrease in precision over time by comparing walls reconstructed in the first loop with the same walls reconstructed in later loops.
The reconstruction of poses and planes was repeated 50 times with independently generated noisy data.We analysed the precision of the angles and distances between reconstructed wall planes.The standard deviation of the corner angles was 0.008 degree for outer corners and 0.016 degree for inner corners.Thus, the system of observation equations allows to accurately reconstruct the corner angles.The outer corners are more precise because they contain more points and these points span a larger area.Figure 4 shows how this error propagates to the standard deviation of angles between walls as a function of the number of corners in between the walls.The function shows a clearly linear relationship.After three loops the relative precision between the orientation of the first and last passed wall is still below 0.1 degree.
Figure 4. Angle precision in degrees as a function of the number of passed corners in the corridor.
For the analysis of the plane to plane distances we considered the very first inner and outer walls.These walls were captured four times within the three scanned loops.Figure 5 shows that the noise in the distance of the wall centre determined at the start to the plane of the wall captured after one loop is only 1 mm (for both the inner and outer wall).The noise in the distances of the initial wall corners to the wall plane after one loop is 3.6 and 5.3 mm for the inner and outer walls respectively.Clearly the uncertainty in the angles contributed to these larger distance noises.The precision of the plane to plane distances linearly decreases with the number of loops.The low standard deviations are realised through the very large number of points on the surfaces (and the assumption of a perfectly calibrated system).The root mean square value of the residuals in the overall estimation is 0.023 m, which is a little lower than the ranging noise of 0.03 m.This can be explained by the fact that many ranges are not taken perpendicular to the walls.Hence, the noise in the distance of measured points to the walls is lower than the ranging noise.The geometry of the scanner measurements allows estimating all parameters well.Maximum coefficients of correlation between estimated parameters are typically below 0.7.
To analyse the feasibility of the SLAM approach in a bit more realistic environment, a part of the floor map of our building was digitised and a path was drawn for the platform trajectory (Figure 7).

DISCUSSION
In this paper we have presented the design of a new indoor mapping system based on only three 2D laser scanners.We have demonstrated the feasibility of the SLAM approach with 3D pose estimation in two simulated indoor environments.In the simulated corridor nearly all range measurements were associated to planes.In the office model measurements on the smaller wall surfaces did not trigger the instantiation of new planes as we required a minimum length of a scan line segment.Yet, the remaining measurements to the larger planes were clearly sufficient to estimate all pose parameters.Real indoor environments will, of course, contain many objects other than walls, floor and ceiling.This will lead to smaller, but typically more, linear segments of the scan lines and a more challenging data association process.The complexity may vary with the environment.E.g. office spaces will likely contain more objects than museums.In general, visibility of only three known independent planes is sufficient to reconstruct the scan poses.I.e., if one wall is completely occluded by a book shelve without clear planar surfaces measurements on an opposite wall may still suffice.It should also be noted that we only made use of a reduced number of 270 points on scan line.Making use of the full 1080 points will enable the reliable extraction of smaller linear segments and hence provide a larger number of observation equations.
The accuracy of a real system will, of course, also strongly depend on the calibration of the scanner system.Errors in estimated relative scanner poses as well as in the time offsets between the scan lines of the three scanners will impact the obtainable point cloud accuracy.
Sensor poses were reconstructed based on measurements to at least three independent planar surfaces.Reconstruction with the current formulation of the observation equations will fail if a corridor is curved.It remains to be investigated to what extent curvatures of corridors can be parameterised and estimated without introducing high correlations with the (kappa-) rotation of the sensor frame.
In this paper we primarily wanted to investigate whether a system with three 2D laser scanners provides a sufficiently strong geometry to enable the estimation of pose and plane parameters.The system and method as presented require further optimisation.Before construction of a system the optimal angles of the scanners with respect to each other should be determined.
The used angles where only chosen such that gaps in walls and corners do not lead to sudden interruptions of data associations.These angles can, however, be tuned to obtain the highest possible precision of the reconstructed planes.In the SLAM approach we used cubic splines at intervals of 5 scans.The order of the splines as well as the knot intervals can also be further optimised to find the best balance between precision of the reconstruction, estimability of parameters, and computational efforts.
In further work we will improve the scan line segmentation.Currently, we use a single threshold on the distance of a point to a line segment to decide whether the line segment should be extended to this point.This threshold takes into account both the noise in the coordinates of the point and the uncertainty in the estimated line parameters.With a three sigma confidence interval this leads to a very tolerant assignment of points to line segments.While this ensures that noisy points on a planar surface are correctly grouped to a line segment, it presents a large risk that points that slightly deviate from this surface are incorrectly included in a line segment.Such undersegmentations lead to biases in the estimation of pose and plane parameters, which in turn may lead to data association failures in future scans.We want to avoid such under-segmentations by extending the statistical testing to sequences of successive points.Small systematic deviations of point sequences to a line segment can then be better recognised.
A further improvement could be obtained by making the parameter estimation more robust.I.e. when larger residuals are found for series of successive points, it could be concluded that there is a mistake in the data association and that the observation equations of these points should be removed from the estimation of the pose and plane parameters.A correct data association is essential to the success of the SLAM procedure.Incorrect data association leads to a bias in the estimated pose parameters.Consequently, the predictions of future poses will be biased and may result in failure of the data association in further scans and the subsequent failure to estimate the remaining path.Further experiments will need to demonstrate whether the data association can be made reliable enough to cope with cluttered environments.

Figure 1 .
Figure 1.Configuration of the three 2D scanners with a 270° opening angle and resulting scan pattern in a corridor.

Figure 2 .
Figure 2. Structure of the normal matrix (black: filled, grey: not filled).It shows a band with values arising from the splines coefficients and stronger fill-in in the last rows and columns by the plane parameters.

Figure 5 .
Figure 5.Standard deviations of plane to plane distances in m measured at plane centres and corners as a function of the number of loops in the corridor.
With a simulated platform speed of 1 m/s it takes 90 s to complete the path.Points on 3635 scans were simulated for each of the three scanners.The scans often contain many segments on shorter pieces of wall.Still, the data assignment is successful for a sufficient number of planes to reconstruct all poses.The reconstructed point cloud again shows a 0.023 m standard deviation w.r.t. the reconstructed planes.For visualisation purposes the points on the ceiling and floor have been removed from the reconstructed point cloud shown in Figure7.The final global estimation involved around 4400 spline coefficients and 66 plane parameters.

Figure 6 .
Figure 6.Model with walls of an office environment obtained by digitizing a floor plan together with a simulated sensor path.

Figure 7 .
Figure 7. Point cloud resulting from the SLAM procedure together with the reconstructed path.