FAST CONVERGENCE METHOD FOR GLOBAL OPTIMAL 4DOF REGISTRATION

: Four degrees of freedom (4DoF) registration is a class of point cloud registration problems for ﬁnding a rigid transformation to align two point clouds under the constraint that the rigid transformation is composed of a three-dimensional (3D) translation and 1D rotation. This constraint is suitable to align scan pairs acquired using modern terrestrial Light Detection and Ranging (LiDAR) scanners, the scans of which can share the direction of gravity as the Z-axis due to such scanners using tripods or internal inclinometers. We propose a fast convergence method for global optimal 4DoF registration. The proposed method consists of (i) our newly developed 4DoF registration model formulated as an optimization problem involving the cylindrical norm to measure the distance between two points, and (ii) a fast convergence algorithm to ﬁnd a global optimal solution of the model. We experimentally demonstrated that the proposed method reduced the number of iterations to convergence and computation time compared with a current 4DoF registration method, especially when the given scan pairs are similar but cannot be aligned, which often appears in registration of multiple point clouds.


INTRODUCTION
Terrestrial LiDAR scanners to reproduce the real world in a 3D digital space as-is has become widely used in many engineering fields, such as building information modeling (BIM), construction information modeling (CIM), facility management, and the equipment delivery planning. To reproduce a 3D model of an entire target architecture, multiple scans acquired at different scan points must be aligned because every scan captures only a part of the architecture visible from a scan point. Therefore, point cloud registration, which is a technique to align multiple LiDAR scans, is necessary. This paper focuses on pairwise registration methods for aligning two point clouds by applying a rigid transformation to one side because many point cloud registration methods for multiple point clouds has been developed on the basis of the execution of pairwise registration (Theiler et al., 2015). When using pairwise registration as a sub-process of registration for multiple scans, it is not guaranteed that the two input point clouds can be aligned. Therefore, pairwise registration should be achieved quickly even when input scan pairs cannot be aligned.
One of the most common pairwise registration methods is the iterative closest point (ICP) method (Besl and McKay, 1992), which achieves highly accurate registration. However, it finds a locally optimal solution, which requires a good initial solution to obtain a correct result. Methods for finding a good initial solution to make such a method accurate are called coarse registration methods.
We propose a fast convergence method with the following three main features: • global optimality: The proposed method consists of a 4DoF registration model written as an optimization problem (see Problem 1) and its global optimization algorithm * Corresponding author (see Algorithm 1). The global optimality guarantees that the proposed method gives the best transformation to achieve the highest objective value regardless of the initial pose of input point clouds.
• 4DoF registration: This concept was originally introduced by Cai et al. into global optimal registration (Cai et al., 2019). Since modern terrestrial LiDARs are equipped with tripods or internal inclinometers, registration methods only needs to search for 3D translation and 1D rotation, not 3D rotation. This concept enables global optimal registration within a practical computation time. Problem 1 is our newly developed 4DoF registration model, which involves the cylindrical norm (see Figure 1) instead of the standard Euclidean norm.
• faster convergence: Compared with the current global optimal 4DoF registration method (Cai et al., 2019), called branch-and-bound algorithm with fast match pruning (FMP+BnB), the proposed method converges faster. As shown later, the proposed method enables faster convergence, especially when input scan pairs cannot be aligned.
The cylindrical norm is the key factor to achieve the faster convergence property. By adapting the cylindrical norm, as shown in Figure 3, two of the four parameters in 4DoF registration can be simultaneously optimized via the computation geometry problem which searches the most overlapped region of multiple rectangles and can be solved in polynomial time. We demonstrate the computation efficiency of the proposed method in experiments.

Related work
Researchers have proposed various methods for coarse registration. However, with many of these methods, the possibility of successful registration depends on the random sample consensus (RANSAC) strategy (Chen et al., 1999) or appropriate parameter tunings for every input point cloud pair on the basis of their overwrap ratio (Aiger et al., 2008). These features are inconvenient for practical use, such as difficulty for users to identify whether the success factors of registration is in the LiDAR locations or algorithm settings. Certain studies used specific structures in a point cloud, such as lines (Jaw and Chuang, 2008) or ground planes (Li et al., 2021). However, dependence on such structure are inconvenient due to their limited applicability.
To avoid these inconveniences, certain coarse registration methods, e.g., (Yang et al., 2016), have been developed on the basis of global optimization techniques, which can clearly describe the results as an exact solution of a certain objective function for evaluating the quality of rigid transformations. In the context of global optimal registration, it has been challenging to exactly solve optimization problems within a practical computation time.
To tackle this challenge, (Cai et al., 2019) introduced the concept of 4DoF registration and proposed the global optimal registration method, called FMP+BnB, which enables coarse registration for large real-world point clouds within a practical computation time. However, from our observations discussed in Section 4.2, FMP+BnB often consumes a large amount of time, especially when the input point clouds are similar but cannot be aligned, which often appears when using pairwise registration as a sub-process of registration for multiple point clouds.

CYLINDRICAL-NORM BASED 4DOF REGISTRATION MODEL
For two given point clouds, i.e., the source point cloud P ⊂ R 3 and target point cloud Q ⊂ R 3 , the goal with 4DoF registration is to find a rigid transformation f : R 3 → R 3 which aligns P to Q, under the constraint that f is formulated as with a rotation angle θ ∈ [0, 2π) and translation vector t ∈ R 3 .
To achieve this goal, we first generate 3D keypoint matches C := (pi, qi)i∈I between P and Q, where I is the finite index set. Note that using C for registration is common, and we follow the standard procedure for generating such matches (see Section 4 for the procedure used in our experiments).
In general, the C contain many false correspondences for reasons such as the two given point clouds only partially overwrap. Therefore, we designed our 4DoF registration model in a manner similar to robust estimation techniques. Concretely, the model is written as a problem to find an f in (2) which maximizes the number of correspondences in C closer than a certain constant distance, called an inlier threshold. Problem 1 formulates the model.
Problem 1 (Our 4DoF registration model). Let R(θ) ∈ R 3×3 for θ ∈ [0, 2π) be defined in (3). For given source point cloud P ⊂ R 3 , target point cloud Q ⊂ R 3 , C := (pi, qi)i∈I between P and Q with a finite index set I, and inlier threshold ( 1, 2) ∈ R++ × R++, We call the norm defined in (6) in Problem 1 cylindrical norm because of the shape of its unit ball, i.e., the Figure 1 illustrates the unit ball of the cylindrical norm in (6). The unit ball forms a cylinder with the base radius of 1 and height of 2 2, centered at the origin and having the Z-axis as its vertical axis.  (6) Note that our model in Problem 1 is different from the current 4DoF registration model used with FMP+BnB (Cai et al., 2019) in terms of the norm to measure the distance between matched keypoints. While the current model use the standard Euclidean norm, our model uses the cylindrical norm.
To show the benefit of the cylindrical norm, we rewrite the objective function E in Problem 1 in accordance with the definition of the cylindrical norm in (6) as follows: Intuitively, (7)-(9) indicate that E in Problem 1 counts the number of correspondences in C that satisfies the following two conditions: (i) [Horizontal condition]: The horizontal distance between R(θ)pi + t and qi is up to 1.
(ii) [Vertical condition]: The vertical distance between pi + t and qi is up to 2.
The vertical condition does not depend on the rotation angle θ because the rotation matrix R(θ) in (3) has no effect on the Zcoordinate of a point. The horizontal condition does not depend on [t]z which is the translation along the Z-axis. In other words, θ and [t]z affect different conditions. We emphasize that θ and [t]z play an important role in achieving fast convergence with the proposed method. As shown later in Section 3.3, θ and [t]z can be efficiently optimized simultaneously for a fixed [t]xy.
In Section 3, we describe finding a global optimal solution (θ , t ) with our 4DoF registration model.

GLOBAL OPTIMIZATION ALGORITHM
Algorithm 1 shows the algorithm for finding a global optimal solution to Problem 1. The algorithm consists of the following two steps: To lighten the burden of the subsequent optimization step, this step reduces given I to a smaller subset I under the constraint that I preserves the global optimal solution (θ , t ) of Problem 1. See Section 3.1 for details of this pruning step.
(ii) [Optimization]: This step searches for a global optimal solution (θ , t ) of Problem 1 on I by using a custom branch-and-bound (BnB) algorithm. See Sections 3.2-3.3 for details of this optimization step.

Prune indices
The pruning step is a preprocess to reduce I in Problem 1 to I such that a global optimal solution is preserved in I , i.e., arg max Proposition 1 states a sufficient condition for global optimal solutions to be preserved when an index k ∈ I is removed from I.
Proposition 1 (A sufficient condition for removable index). In Problem 1, let k ∈ I and define ( Then the index k is removable, i.e., arg max Proof. See Appendix A. In accordance with Proposition 2, we can find removable indices in I and generate I . TheĒ k in (12) is calculated by solving an optimization problem for θ, called the max-stabbing problem (Cai et al., 2019), which can be solved in a computational complexity of O(|I k | log |I k |).
Algorithm 2 shows the process of the pruning step.

Overview of optimization step
In Sections 3.2-3.4, we describe the optimization algorithm to solve Problem 1. This algorithm is directly applicable to the optimization step by replacing I in Problem 1 with the pruned I computed in Section 3.1.
To derive an optimization algorithm for Problem 1, we first rewrite the optimization problem in (4) as follows: The purpose of the translation to (15)-(16) is as follows: • As we see in Section 3.3, we can efficiently solve the inner optimization problem in (16), i.e., the problem simultaneously optimizing θ and [t]z for a fixed [t]xy, by computation complexity of O(|I| log |I|).
• As we see in Section 3.4, we can find a global optimal solution [t]z of the outer optimization problem in (15) by using a custom BnB algorithm which searches for the solution on a 2D space.
Note that the optimization algorithm of the proposed method is designed in a way similar to the algorithm of FMP+BnB (Cai et al., 2019) which also solves their 4DoF registration model by decomposing into an inner and outer optimization problems.   (6), the following condition can be derived: The Θi in (22) consists of at most two intervals of [0, 2π), and the Ti in (23) consists of an interval of R. Therefore, the condition in (21) shows that, as shown in Figure 2 As shown in Figure 3, the optimization problem for (θ, [t]z) in (17) can be translated into a computation geometry problem called the rectangle-intersection problem (Choi et al., 2012), which searches a point in the region where the given multiple rectangles overlap the most. The rectangle-intersection problem can be solved in O(N log N ) for given N rectangles (Imai and Asano, 1983). In our problem, the number of rectangles is at most 2|I|. See Figure 3 and Appendix B for our version of the algorithm to solve the rectangle-intersection problem.
We can now find an optimal solution (θ, [t]z) of the optimization problem in (16)   , which is on most overlapped region for multiple rectangles (slashed-red region), can be found as pair of (θi, tj), where θi and tj are a vertex of rectangles.

Optimize U in (15) by using BnB algorithm
We now discuss to finding a global solution of an optimization problem for [t]xy in (15) by using a custom BnB algorithm that uses the computation of U in (16) with a fixed [t]xy mentioned in Section 3.3. The BnB algorithm finds a global optimal solution of non-convex problems, which splits the original maximization problem into multiple smaller sub-problems (branches) then searches only branches with an upper bound exceeding the highest objective value found with the algorithm. The BnB iterations stop when all upper bounds of branches that have not yet been explored are less than or equal to the current highest objective value because there is no better solution in such branches.
In the context of solving (15), the custom BnB algorithm initializes the search space with a square S0 ⊂ R 2 that contains the optimal solution [t ]xy, then splits S0 into four smaller congruent squares with edges parallel to the X-axis or Y-axis. For each smaller square S ⊂ S0, let [tS ]xy ∈ R 2 be the center point of S. If [tS ]xy gives a higher objective value than the current best estimate [t]xy, [t]xy is updated to [tS ]xy. Divide S into the four smaller congruent squares then, as we describe later, compute an upper bound U of each square and keep squares with U exceeding the current highest objective value U ([t]xy) as branches, which will be explored later in the custom BnB algorithm.
Algorithm 3 shows the custom BnB algorithm described above. Note that while Algorithm 3 appears to find only an optimal horizontal translation [t ]xy via the optimization problem in (15), it simultaneously finds the optimal residual parameters (θ , [t ]z) in Problem 1. Once Algorithm 3 gives an optimal solution [t ]xy, the residual parameters (θ , [t ]z) in Problem 1 can be obtained by evaluating U for fixed [t ]xy.
To run Algorithm 3, we need to compute U (S) which is an upper bound of U on a given S ⊂ R 2 . Proposition 2 shows the computation of U (S) such an upper bound. Proposition 2 (An upper bound of U ). In Problem 1, define U as (16). Let S ⊂ R 2 be a bounded square, s0 be the center of S, and dS be the radius of S. Then, Proof. See Appendix C.
The inequality in (24) shows that U (S) is an upper bound of U on a S. U (S) can be computed by evaluating U for fixed s0.

EXPERIMENTS
We conducted experiments to demonstrate (i) the applicability of the proposed method to coarse registration (Section 4.1) and (ii) its computation efficiency (Section 4.2) by using the realworld datasets Arch and Trees 1 . The Arch dataset consists of five scans acquired with low overlap (30-40%), and the Trees dataset consists of six scans acquired in a forest with a large amount of underwood.
To demonstrate registration performance in a scenario in which point cloud pairs can and cannot be aligned, we prepared regular and irregular pairs of both Arch and Trees datasets.
• regular pairs: point cloud pairs that are assumed can be aligned with the dataset. We list all the regular pairs in Table 2 for the Arch dataset and in Table 3 for the Trees dataset.
• irregular pairs: all pairs of different point clouds. The X and Y coordinates were swapped in the source point cloud for all irregular pairs. Since the swapping is not a rigid transformation, all irregular pairs cannot be aligned.
The matches C in Problem 1 was generated by the following procedure: 1. The voxel grid down-sampling at 10cm in length and the intrinsic shape signatures (ISS) keypoint extraction (Zhong, 2009).
2. The fast point feature histograms (FPFH) feature (Rusu et al., 2009) computation. Matching keypoints on the basis of the distance between their FPFH features. Concretely, the correspondence (pi, qi) is included in C if their FPFH features are one of the ten nearest neighbors to each other.
The inlier threshold ( 1, 2) in Problem 1 was set to (0.4, 0.4). All experiments were implemented in C++ and executed on an Intel Xeon 3.20GHz CPU.

Validity of proposed method for coarse registration
We first evaluated the applicability of the proposed method to coarse registration by using the regular pairs in the Arch and Trees datasets.
For the Arch dataset, Table 2 represents the registration accuracy of the proposed method without and with the ICP method, which is a well-known fine registration method. The registration performance was measured on the basis of the rotation error and translation error formulated as where · F means the Frobenius norm and (Rgt, tgt) ∈ R 3×3 × R 3 is the ground truth of the rotation matrix and translation vector given by the dataset. The validity of the proposed method was confirmed by determining whether the result satisfies the following condition; both the rotation and translation errors decreased by using the ICP method and the translation error was sufficiently small (≤ 5 cm) after using this method, which means that the proposed method gave a good initial solution for fine registration. As shown in  Figure 4 shows the results of the proposed method for the point cloud pair s5-s1 in the Arch dataset. Note that pair s5-s1 is the worst case in terms of the translation error in Table 2. As shown in Figure 4, the source and target point clouds were aligned successfully even if they scanned different planes on the arch. We evaluated the results for the Trees dataset under the same conditions as the Arch dataset. Table 3 shows the accuracy of the proposed method for the regular pairs of the Trees dataset.
The proposed method was also suitable for coarse registration for all pairs in the Trees dataset. Figure 5 shows the results of the proposed method for the point cloud pair s2-s5, which is the worst case in terms of the translation error in

Computation efficiency
We now discuss the computation efficiency of the proposed method in terms of the computation time and convergence speed for finding a global optimal solution. We compared the proposed method with FMP+BnB, which also achieves global optimal 4DoF registration.
Arch dataset: Figure 6 shows the computation times for the regular and irregular pairs in the Arch dataset. The proposed method reduced the worst computation time to 51% compared with FMP+BnB. The computation time of the proposed method was more stable than that of FMP+BnB, especially for the irregular pairs. Numerically, the instability of the computation time defined as instability := maximum time − minimium time minimum time (26) was calculated; instability was 3.95 for FMP+BnB and 1.82 for the proposed method.  Table 4 shows the details of the computation-time comparison. Every value in Table 4 was computed as the ratio of the computation time of the proposed method to that of FMP+BnB for a given pair. The proposed method was more efficient than FMP+BnB for all irregular pairs. The harmonic mean of all ratios was 0.  Figure 7 shows the comparison of the number of iterations, i.e., the number of branches executed using the BnB algorithm, regarding convergence speed. The proposed method effectively reduced the number of iterations to 14% in the worst case. In other words, the maximum number of iterations was 7.1 times less with the proposed method than with FMP+BnB. This is the main aim of introducing the cylindrical norm and limiting the search dimension of the BnB algorithm into a 2D plane.

Trees dataset:
We also evaluated the results for the Trees dataset under the same conditions as the Arch dataset. As shown in Figure 8, the proposed method reduced the worst computation time to 51% compared with FMP+BnB. Instability was also reduced; 3.64 with FMP+BnB and 1.64 with the proposed method.   Figure 9 shows the comparison of the number of iterations regarding convergence speed. The proposed method reduced the number of iterations to 15% in the worst case. In other words, the maximum number of iterations was 6.7 times less than with FMP+BnB.

CONCLUSIONS
We proposed a fast convergence method for global optimal 4DoF registration. The proposed method consists of our newly developed 4DoF registration model (Problem 1), which includes the cylindrical norm in (6), and its global optimization algorithm (Algorithm 1). Since two parameters ([t]z, θ) in Problem 1 can be simultaneously optimized in polynomial time O(|I| log |I|) via the rectangle intersection problem (Figure 3), the cutom BnB algorithm of the proposed algorithm searches only a 2D space (Table 1) and can converge faster than that of FMP+BnB.
We conducted experiments to demonstrate the (i) applicability of the proposed method to coarse registration and (ii) its computation efficiency by using datasets for terrestrial LiDAR point cloud registration. Compared with FMP+BnB, the proposed method was 6.7 times faster in terms of the maximum number of iterations and 2.0 times faster in terms of the worst computation time.

APPENDIX B. SOLVING RECTANGLE INTERSECTION PROBLEM
We describe an algorithm to solve the rectangle-intersection problem shown in Figure 3 by computation complexity O(N log N ) for N rectangles. To store the information about rectangle sides, the algorithm uses the segment tree with lazy propagation (STLP) (Laaksonen, 2017, Sec. 15.2.1), which is a tree-data structure storing information about intervals. The STLP can compute the following range updates and queries by O(log N ).
• MAX n 2 n 1 (): obtain the index i ∈ [n1, n2), which gives the maximum value in [n1, n2). We denote MAX all () as MAX N 0 (), which obtains the index that gives a maximum value in all elements.
By using the STLP, we can solve the rectangle-intersection problem. In Algorithm 4, we denote the vertices of the i-th rectangle as {(θ (i) low , t