AN UNSUPERVISED REGISTRATION OF 3D POINT CLOUDS TO 2D CAD MODEL: A CASE STUDY OF FLOOR PLAN

Thanks to the proliferation of commodity 3D devices such as HoloLens, 3D models of indoor building objects are easy to acquire. However, such an as-built model does not directly match the available 2D computer-aided design (CAD) models. To address this problem, this study proposes a three-step registration method. First, binary images, including walls and background, are generated for the 3D point cloud (PC) and the 2D CAD model. Then, 2D-to-2D corresponding pixels (CPs) are extracted based on the intersections of walls in the binary image of the PC (BIPC) and the binary CAD (BCAD) model. Since the 3D PC space coordinates (XYZ) of all BIPC pixels are known, the BIPC part of each 2D-to-2D CP can be replaced by its 3D coordinates. Lastly, the parameters of the 8-parameter affine are estimated using the resulting 2D-to-3D CPs, which are pixel coordinates in the BCAD model together with their correspondences in the 3D PC space. Experimental results indicate the efficiency of the proposed method compared to manual registration.


INTRODUCTION
For most applications, one needs to know the relationship between 3D point clouds (PC) and the 2D CAD model, because a wide range of urban projects are based on existing 2D CAD models (Wijmans and Furukawa, 2017). Specifically, to navigate through the indoor objects of a building with a 3D scanner like HoloLens, the HoloLens-derived PC should be transformed into the available 2D CAD model. For this purpose, a few studies have addressed registration between a CAD model and other datasets (Usamentiaga et al., 2018; Wang and Sohn, 2010). PC-to-CAD registration is a convoluted process because the two datasets have different modalities: one is a mix of colored lines, while the other contains depth information. Another issue that makes this process difficult is the different dimensionality of the two datasets. The CAD model is 2D, while the PC is 3D. Apart from these two problems, partial data degrades the registration. Data are partial when only a portion of one dataset is covered by the other.
Although few recent studies address registration between 3D PCs and 2D CAD models, some relevant work can be found in the literature. Most of it is based on render-based image synthesis. In this area, the average shading gradient was proposed and used in different studies (Plötz and Roth, 2015; Plötz and Roth, 2017) to register an image to an untextured geometry. In (Rashwan et al., 2019), linear curve features are used to register rendered depth images, derived from the 3D model, to the image. In (Corsini et al., 2009), a 3D model is rendered using camera parameters that are refined during an iterative optimization process. The loss function of this optimization is mutual information, which is recommended for multimodal registration (Gaens et al., 1998). However, these methods cannot be used for the datasets in question because the data are partial and, more importantly, contain repetitive patterns (e.g., similar doors, windows, and corridors).

We propose a label-based three-step registration method between PC and CAD models to address this problem. First, a binary CAD (BCAD) model and a binary image of the PC (BIPC) are generated. These images include walls and background. Following this, corresponding pixels of the BCAD and BIPC are extracted based on the intersections of the floor's walls. At the end of this step, the 2D-to-2D corresponding pixels/control points (CPs) are transformed to 2D-to-3D CPs by replacing the 2D pixel coordinates of the CPs in the BIPC with their 3D coordinates in PC space. Note that the PC space coordinates (XYZ) of the BIPC pixels are known from the first step (generation of binary images). Lastly, the parameters of the 8-parameter affine (Okamoto, 1999), a common 2D-to-3D transformation, are estimated from the CPs obtained in the previous step.

THEORETICAL BACKGROUND
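Here, the 8-parameter affine, a well-known 2D-to-3D transformation, is discussed. It relates the 2D coordinates $(x, y)$ of a point to its 3D space coordinates $(X, Y, Z)$ as

$$x = a_1 X + a_2 Y + a_3 Z + a_4, \qquad y = a_5 X + a_6 Y + a_7 Z + a_8 \tag{1}$$

The parameters are estimated by least squares as

$$U = (A^{T} A)^{-1} A^{T} L \tag{2}$$

where $U$ includes the estimated parameters $a_1, \dots, a_8$. Furthermore, $A$ and $L$ are, respectively, $2n \times 8$ and $2n \times 1$ matrices formed as follows. Note that $n$ is the number of correspondences, and $X_i$, $Y_i$, and $Z_i$ are the 3D space coordinates of the $i$th point.

$$A = \begin{bmatrix} X_1 & Y_1 & Z_1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & X_1 & Y_1 & Z_1 & 1 \\ \vdots & & & & & & & \vdots \\ X_n & Y_n & Z_n & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & X_n & Y_n & Z_n & 1 \end{bmatrix} \tag{3}$$

$$L = \begin{bmatrix} x_1 & y_1 & \cdots & x_n & y_n \end{bmatrix}^{T} \tag{4}$$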
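The least-squares estimation of an 8-parameter affine can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. It assumes the standard form of the model, x = a1·X + a2·Y + a3·Z + a4 and y = a5·X + a6·Y + a7·Z + a8, with the rows of the design matrix interleaved as (x, y) pairs. The function names `estimate_affine_2d3d` and `apply_affine_2d3d` are hypothetical. `numpy.linalg.lstsq` is used in place of the explicit normal equations; the two agree when the design matrix has full column rank.

```python
import numpy as np


def estimate_affine_2d3d(xyz, xy):
    """Least-squares estimate of the 8 parameters mapping 3D points to 2D points.

    xyz: (n, 3) array of 3D coordinates; xy: (n, 2) array of 2D coordinates.
    Returns U = [a1, ..., a8].
    """
    xyz = np.asarray(xyz, dtype=float)
    xy = np.asarray(xy, dtype=float)
    n = xyz.shape[0]
    homogeneous = np.hstack([xyz, np.ones((n, 1))])  # rows [X, Y, Z, 1]

    A = np.zeros((2 * n, 8))
    A[0::2, 0:4] = homogeneous  # even rows hold the x-equations
    A[1::2, 4:8] = homogeneous  # odd rows hold the y-equations
    L = xy.reshape(-1)          # [x1, y1, x2, y2, ...]

    U, *_ = np.linalg.lstsq(A, L, rcond=None)
    return U


def apply_affine_2d3d(U, xyz):
    """Map (n, 3) 3D points to (n, 2) 2D points with parameters U."""
    xyz = np.asarray(xyz, dtype=float)
    P = np.asarray(U, dtype=float).reshape(2, 4)
    return xyz @ P[:, :3].T + P[:, 3]
```

With n correspondences the system has 2n equations, so at least four points in general (non-coplanar) position are needed to determine the eight unknowns.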

POINT CLOUD TO CAD REGISTRATION
The proposed method has three main steps, each of which will be discussed in the following.

Binary Image Generation
Here, both the PC and the CAD model are converted to binary images in which 0 indicates background and 1 indicates walls. The CAD model can be converted to a binary image directly because walls are identifiable from the legend prepared for each map. The PC is first converted to an image by setting the Z-component of its points to zero. This is permissible because most light detection and ranging (LiDAR) instruments, including HoloLens, are leveled. After the image is generated, the labels provided by the HoloLens for each point are used to convert the PC-derived image to the BIPC. It is worth mentioning that the PC space coordinates (XYZ) of each BIPC pixel are stored in this step.
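The conversion of a labeled PC to a binary image can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the pixel size, the mapping of X to columns and Y to rows, and the names `rasterize_walls`, `wall_label`, and `pixel_size` are all hypothetical. When several wall points fall into the same pixel, the sketch simply keeps one of them as the stored XYZ.

```python
import numpy as np


def rasterize_walls(points, labels, wall_label, pixel_size):
    """Drop Z, bin points on a regular XY grid, and mark wall pixels with 1.

    points: (n, 3) array of XYZ coordinates; labels: length-n surface labels.
    Returns (image, xyz), where image is the binary wall image and
    xyz[r, c] holds the XYZ of one wall point in pixel (r, c), or NaN if none.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    origin = points[:, :2].min(axis=0)  # lower-left corner of the grid
    col_row = np.floor((points[:, :2] - origin) / pixel_size).astype(int)
    n_cols, n_rows = col_row.max(axis=0) + 1

    image = np.zeros((n_rows, n_cols), dtype=np.uint8)
    xyz = np.full((n_rows, n_cols, 3), np.nan)

    wall = labels == wall_label
    rows, cols = col_row[wall, 1], col_row[wall, 0]
    image[rows, cols] = 1
    xyz[rows, cols] = points[wall]  # one representative point per pixel
    return image, xyz
```

Keeping the `xyz` array alongside the image is what later allows a 2D pixel in the binary image to be replaced by its 3D coordinates.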

Finding Correspondences
2D-to-2D correspondences can be found between the BCAD model and the BIPC because the starting point and the path direction are available from the HoloLens. Algorithm 1 and Algorithm 2 contain the steps taken to find corridors (wall intersections) in the BCAD and the BIPC, respectively. The BCAD and BIPC have the same size, m × n, where m is the number of rows and n is the number of columns.
Algorithm 1 (corridor extraction in the BCAD), final steps: Pick the indices of the four largest values of vertical_line_length and call them 1, 2, 3, 4, in ascending order. Reconstruct the target corridor using the above coordinates.
Algorithm 2 (corridor extraction in the BIPC), final steps: Pick the indices of the two largest values of vertical_line_length and call them 1, 2, in ascending order. Reconstruct the target corridor using the above coordinates.
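The selection step of the two algorithms can be sketched as follows. The text does not define vertical_line_length, so this sketch assumes it is, for each image column, the length of the longest vertical run of wall pixels; it also assumes that the selected indices are ordered by ascending column index. Both assumptions, and the function name `largest_columns`, are illustrative only.

```python
import numpy as np


def vertical_line_length(binary):
    """Longest vertical run of wall pixels (value 1) in each column."""
    binary = np.asarray(binary)
    n_cols = binary.shape[1]
    best = np.zeros(n_cols, dtype=int)
    current = np.zeros(n_cols, dtype=int)
    for row in binary:  # walk down the image one row at a time
        current = np.where(row == 1, current + 1, 0)  # extend or reset runs
        best = np.maximum(best, current)
    return best


def largest_columns(lengths, k):
    """Indices of the k largest entries, sorted by ascending index."""
    idx = np.argsort(lengths, kind="stable")[-k:]
    return np.sort(idx)
```

Under these assumptions, `largest_columns(vertical_line_length(bcad), 4)` would give the four candidate columns in the BCAD, and the same call with `k=2` would give the two in the BIPC.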
Since the starting position is known in both the BCAD and the BIPC, we can extract the corridors in question using Algorithm 1 and Algorithm 2, respectively (red lines in Figure 4 and Figure 5). The starting point serves as an initial value that prevents the method from selecting the wrong parts as a corridor. Subsequently, knowing the approximate starting point in the BCAD, we can find the corners of the BCAD corridor that correspond to the corners of the corridor extracted from the BIPC. In this way, four corresponding points can be extracted. Two of them are corresponding corners (red and yellow points in Figure 6 and Figure 7). The other two are the cross points obtained by extending the lines of corresponding corridor walls (blue and green points in Figure 6 and Figure 7). These four corresponding points are called 2D-to-2D CPs.
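A cross point obtained by extending two wall lines is the intersection of two infinite 2D lines. The following sketch shows one way to compute it, assuming each wall line is given by two pixel coordinates lying on it. The function name `line_intersection` and the tolerance `eps` are hypothetical; the text does not say how the authors compute these points.

```python
import numpy as np


def line_intersection(p1, p2, q1, q2, eps=1e-12):
    """Intersection of the infinite line through p1, p2 with the one through q1, q2.

    Returns a length-2 array, or None when the lines are (nearly) parallel.
    """
    p1, p2, q1, q2 = (np.asarray(v, dtype=float) for v in (p1, p2, q1, q2))
    d1 = p2 - p1  # direction of the first line
    d2 = q2 - q1  # direction of the second line

    denom = d1[0] * d2[1] - d1[1] * d2[0]  # 2D cross product d1 x d2
    if abs(denom) < eps:
        return None

    w = q1 - p1
    t = (w[0] * d2[1] - w[1] * d2[0]) / denom  # position along the first line
    return p1 + t * d1
```

For example, extending a horizontal wall through (0, 0) and (1, 0) until it meets a vertical wall through (2, -1) and (2, 1) gives the cross point (2, 0).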

Registration using 8-parameter Affine
As mentioned in the first part of the proposed method, the PC space coordinates (XYZ) of each point in the BIPC are known. Hence, the 3D PC coordinates of each of the four 2D-to-2D CPs can be obtained. Furthermore, the correspondence of each CP in the CAD model is known. Accordingly, we have four 2D-to-3D CPs from which the parameters of the 8-parameter affine are estimated. Knowing the parameters of the affine is equivalent to registering the two spaces.
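Four 2D-to-3D CPs give eight equations for the eight affine parameters, so the system is exactly determined and has no redundancy. The sketch below illustrates this case under the assumption that the affine has the form [x, y]ᵀ = P·[X, Y, Z, 1]ᵀ with a 2×4 parameter matrix P. The names `register_with_four_cps` and `project_to_cad` are hypothetical. The rank of the 4×4 matrix of homogeneous CP coordinates is returned because a unique solution requires full rank (four CPs that are not coplanar).

```python
import numpy as np


def register_with_four_cps(cp_xyz, cp_pixels):
    """Solve for the 2x4 affine matrix P from four 2D-to-3D control points.

    cp_xyz: (4, 3) PC-space coordinates; cp_pixels: (4, 2) CAD pixel coordinates.
    Returns (P, rank), where rank is the rank of the homogeneous CP matrix.
    """
    cp_xyz = np.asarray(cp_xyz, dtype=float)
    cp_pixels = np.asarray(cp_pixels, dtype=float)
    G = np.hstack([cp_xyz, np.ones((cp_xyz.shape[0], 1))])  # rows [X, Y, Z, 1]
    rank = int(np.linalg.matrix_rank(G))

    # Solve G @ P.T = cp_pixels; lstsq also copes with a rank-deficient G
    # by returning the minimum-norm solution.
    P_T, *_ = np.linalg.lstsq(G, cp_pixels, rcond=None)
    return P_T.T, rank


def project_to_cad(P, points):
    """Map (n, 3) PC points to (n, 2) CAD pixel coordinates."""
    points = np.asarray(points, dtype=float)
    return points @ P[:, :3].T + P[:, 3]
```

Once P is known, `project_to_cad(P, pc_points)` places every point of the PC in the CAD image, which is the registration result.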

Dataset
As mentioned before, our dataset contains a CAD model and a 3D PC.

CAD Model:
The CAD model is a floor plan containing a range of features (including walls, doors, windows, etc.). These features are recognizable by their colors and symbols (see Figure 1). Here, walls are represented in grey.

PC:
We developed an application for Microsoft HoloLens1 to collect data for this experiment. Microsoft HoloLens is equipped with an IMU, four environment understanding cameras, one depth camera, and one photo camera. It provides the Spatial Mapping API (Microsoft, 2018) to generate meshes of the real-world space using its sensors. We used ray casting to retrieve the surface types (floor, ceiling, and wall) in the mesh. Using a point sampling technique, we generated the PC from the meshes and labeled its points according to their surface types. In Figure 2, each color represents the surface type of the points. Red, green, and blue indicate floor, ceiling, and wall surface types, respectively.

Manual Control Points:
To assess the performance of the proposed method, 8 CPs are manually selected in each of the CAD and PC spaces. These CPs are used to register the PC to the CAD model, and the result is compared with that of the proposed method. The locations of the CPs in the CAD model are shown in Figure 3.
Figure 3. Manually selected CPs are illustrated using red signs. CPs in the CAD model are restricted to the area that overlaps with the PC.

Discussion and Results
First, binary images are generated for the PC and the CAD model, where 1 indicates walls and 0 indicates background, including the floor and the ceiling. Following this, corridors in both the BCAD and the BIPC are extracted. Figure 4 and Figure 5 illustrate the BCAD and the BIPC, respectively, together with their extracted corridors (in red). Figure 5 also shows the inherent difference between the PC-derived binary image and the CAD model. As can be seen, the problem is complicated by errors occurring during 1) the labeling of the PC and 2) the mapping of the PC to the image space. These complications highlight the importance of corridor extraction. Finding correspondences between the noisy PC and the CAD model, which have different modalities, is challenging. A preferable solution is to first extract mutual features that have little error, noise, and complexity. In our case, corridors are among the best choices, as they are simple parts common to both the PC and CAD spaces and are less affected by error and noise. According to Algorithm 1 and Algorithm 2, corresponding points (red, green, blue, and yellow points in Figure 6 and Figure 7) are extracted in both the BCAD and the BIPC. As a result of this step, we have four 2D-to-2D CPs, which can be converted to 2D-to-3D CPs because the three PC space coordinates (XYZ) of the BIPC pixels are known. Lastly, the parameters of the 8-parameter affine are estimated from the 2D-to-3D CPs. Figure 8 shows the result of the proposed method. For comparison, the result of the manual registration, whose parameters are estimated from 8 manually selected CPs, is also shown (see Figure 9). Visually, our method leads to a much better result without using any manually collected CPs, although it uses only 4 automatically detected CPs.
The proposed method performs better because the automatically extracted CPs are more accurate than the manually selected ones. Considering how difficult it is for an operator to select accurate 2D-to-3D CPs, the proposed method is a more efficient and practical alternative.

CONCLUSION
This paper presents a straightforward 2D-3D registration method for buildings that have floor plans. The method has three steps and essentially works with extracted corner points (i.e., intersections of walls). Thanks to the nature of the CAD model on the one hand and the data collection of the HoloLens on the other, binary images of the PC (BIPC) and the CAD model (BCAD) are first generated. Following this, corresponding pixels between the BCAD and the BIPC are identified based on the corresponding corner points. Finally, registration is done by estimating the parameters of the 8-parameter affine from the corresponding points obtained in the previous step.
As with any new research, some unresolved issues remain. In our view, one of the most important is the automatic registration of scan patches that contain no corner points. Generalizing the proposed method to address this case could be a topic for future research.