PAIRWISE LINKAGE FOR POINT CLOUD SEGMENTATION

In this paper, we first present a novel hierarchical clustering algorithm named Pairwise Linkage (P-Linkage), which can be used for clustering data of any dimension, and then apply it to the segmentation of unstructured 3D point clouds. The P-Linkage clustering algorithm first calculates a feature value for each data point, for example, the density for 2D data points and the flatness for 3D point clouds. Then, for each data point, a pairwise linkage is created between the point and its closest neighboring point that has a greater feature value than its own. The initial clusters are then discovered by searching along the linkages. After that, a cluster merging procedure, which can be designed for specialized applications, is applied to obtain the final refined clustering result. Based on the P-Linkage clustering, we develop an efficient segmentation algorithm for unstructured 3D point clouds, in which the flatness of the estimated surface at a 3D point is used as its feature value. For each initial cluster a slice is created, and a novel and robust slice merging method is then proposed to obtain the final segmentation result. The proposed P-Linkage clustering and 3D point cloud segmentation algorithms each require only one input parameter. Experimental results on synthetic data of different dimensions from 2D to 4D demonstrate the efficiency and robustness of the proposed P-Linkage clustering algorithm, and extensive experimental results on Vehicle-Mounted, Aerial and Stationary Laser Scanner point clouds illustrate the robustness and efficiency of the proposed 3D point cloud segmentation algorithm.


INTRODUCTION
Segmentation is one of the most important pre-processing steps for the automatic processing of point clouds. It is the process of classifying and labeling data points into a number of separate groups or regions, each corresponding to the specific shape of a surface of an object. Cluster analysis, which classifies elements into categories according to their similarities, has been applied in many fields, such as data mining, astronomy, and pattern recognition, and can also be applied to the segmentation of 3D point clouds.

Point Cloud Segmentation
Segmentation of 3D point clouds obtained from laser scanners is not trivial, because the three-dimensional point data are usually incomplete, sparsely distributed, and unorganized. In addition, there is no prior knowledge about the statistical distribution of the points, and the density of the points varies with the point distribution. Many methods have been developed to improve the quality of segmentation in 3D point clouds, and they can be classified into three main categories: edge/border based, region growing based and hybrid.
The edge/border based methods attempt to detect discontinuities in the surfaces that form the closed boundaries, and then points are grouped within the identified boundaries and connected edges. These methods are usually applied to the depth map, where the edges are defined as the points at which the changes of the local surface properties exceed a given threshold. The most commonly used local surface properties are surface normals, gradients, principal curvatures, or higher order derivatives (Sappa and Devy, 2001, Wani and Arabnia, 2003). However, due to noise caused by the laser scanners themselves or spatially uneven point distributions in 3D space, such methods often detect disconnected edges, which makes it difficult for them to identify closed segments (Castillo et al., 2013) without a filling or interpretation procedure.
The region growing based approaches deal with segmentation by detecting continuous surfaces that have homogeneous geometrical properties. In the segmentation of unstructured 3D point clouds, these methods first choose a seed point from which to grow a region, and then local neighbors of the seed point are combined with it if they are similar in terms of surface point properties such as orientation and curvature (Rabbani et al., 2006, Jagannathan and Miller, 2007). There are also algorithms which take a sub-window (Xiao et al., 2013) or a line segment (Harati et al., 2007) as the growth unit. (Woo et al., 2002) proposed an octree-based 3D-grid method to handle large amounts of unstructured point clouds. The smoothly connected regions are the key element of the region growing based methods. Surface normal and curvature constraints have been widely used to find the smoothly connected areas (Klasing et al., 2009, Belton and Lichti, 2006). In general, the region growing based methods are more robust to noise than the edge-based ones because they use global information (Liu and Xiong, 2008). However, these methods are sensitive to the location of the initial seed regions, inaccurate estimates of the normals and curvatures of points near region boundaries can cause inaccurate segmentation results, and outliers can result in over- and under-segmentation.
The hybrid approaches use both edge/border-based and region-growing-based methods to overcome the limitations of the respective approaches (Vieira and Shimada, 2005, Lavoué et al., 2005). (Benkő and Várady, 2004) proposed a hybrid approach for the segmentation of engineering objects, which detects sharp edges and small blends using an edge-based approach in the first step and then finds smooth regions after filtering out the sharp edges and small blends. However, the success of these hybrid methods depends on the success of either or both of the underlying methods.

Clustering
Cluster analysis, which aims at classifying elements into categories on the basis of their similarities, has been applied in many fields, such as data mining, astronomy, and pattern recognition. In the last several decades, thousands of algorithms have been proposed to try to find a better solution for this problem in a simple but philosophical way. In general, these algorithms can be divided into two categories: partitioning and hierarchical methods. The partitioning clustering algorithms usually assign each data point to one of several clusters via a certain similarity measurement. The traditional algorithms K-Means (MacQueen et al., 1967) and CLARANS (Ng and Han, 1994) belong to this category. The hierarchical methods usually create a hierarchical decomposition of a dataset by iteratively splitting the dataset into smaller subsets until each subset consists of only one object, for example, the single-linkage (SLink) method and its variants (Sibson, 1973).

Clustering in Point Cloud Segmentation
The clustering algorithms, which classify elements into categories on the basis of their similarities, can also be applied to the segmentation of 3D point clouds. The widely used K-Means algorithm (MacQueen et al., 1967), which divides the data points into K clusters (K being a predefined parameter), was applied in (Lavoué et al., 2005) to classify the point clouds into 5 clusters according to their curvatures. The shortcoming of the K-Means clustering algorithm is that it needs to know the number of clusters beforehand, which cannot be predefined in many cases. To overcome this shortcoming, the mean shift algorithm (Comaniciu and Meer, 2002), which is a general nonparametric technique for clustering scattered data, was employed for point cloud segmentation (Yamauchi et al., 2005b, Yamauchi et al., 2005a, Zhang et al., 2008). In the works of (Yamauchi et al., 2005b, Yamauchi et al., 2005a), the mean shift algorithm was employed to integrate the mesh normals and the Gaussian curvatures, respectively. In the work of (Liu and Xiong, 2008), the normal orientation was converted onto the Gaussian sphere, and a novel cell mean shift algorithm was proposed to identify planar, parabolic, hyperbolic or elliptic surfaces in a parameter-free way. However, most of the point cloud segmentation methods based on clustering can only discover a small number of segments. They can be employed in some industrial applications, but may fail on vehicle-mounted and aerial laser scanner point clouds, which contain thousands of surfaces in large indoor/outdoor scenes.

Objectives and Motivation
In this paper, we aim to develop a simple and efficient point cloud segmentation algorithm that can be applied to large unstructured Vehicle-Mounted, Aerial and Stationary Laser Scanner point clouds by employing a clustering algorithm for point cloud segmentation. To achieve this goal, we introduce two algorithms:
P-Linkage Clustering: Based on the assumption that a data point should be in the same cluster as its closest neighboring point (CNP) that is more likely to be a cluster center, we propose a novel hierarchical clustering method named Pairwise Linkage (P-Linkage), which can discover the clusters in a simple and efficient way. Firstly, a pairwise linkage procedure is applied to link each data point to its CNP on the data-point level. Then the initial clusters are discovered by searching along the pairwise linkages, starting from the points with local-maximal densities. The proposed clustering method is not iterative and needs only one step in general cases; a cluster merging method is also proposed for specific cases. In Figure 2, the data points p1 and p34 are the cluster centers, the symbol → indicates a pairwise linkage, and the large green circle denotes the neighborhood set of a data point with cutoff distance dc.
Point Cloud Segmentation: Based on the proposed P-Linkage clustering, we develop a simple and efficient point cloud segmentation algorithm which needs only one parameter and can be applied to large unstructured Vehicle-Mounted, Aerial and Stationary Laser Scanner point clouds. The P-Linkage clustering in point cloud segmentation takes the flatness of the estimated surface at a 3D point as its feature value and forms the initial clusters by collecting the data points along the linkages. For each initial cluster we create a slice. All the slices are then merged with a simple and efficient strategy to obtain the final segmentation result. The single parameter of the proposed point cloud segmentation algorithm balances the segmentation results between planar and curved surface structures.
The remainder of this paper is organized as follows. The proposed P-Linkage clustering algorithm is described in detail in Section 2. The point cloud segmentation algorithm, which employs the P-Linkage clustering on unstructured 3D point clouds, is introduced in Section 3. Experimental results on different kinds of synthetic and real data are presented in Section 4, followed by the conclusions drawn in Section 5.

PAIRWISE LINKAGE
The key concept of the P-Linkage clustering method is that a data point pi should be in the same cluster as its closest neighbor pj that is more likely to be a cluster center; this relationship between pi and pj is called a pairwise linkage. This concept is derived from the idea of non-maximum suppression (NMS) (Canny, 1986, Neubeck and Van Gool, 2006), in which a data point only needs to be compared with its neighbors and is suppressed if it is not local-maximal. Figure 1 shows an illustration of the NMS, from which we can see that p1 is suppressed by p2 and the same suppression occurs on p2 when it is compared to p3, which results in a link p1 → p2 → p3. In this way, all the data points on the curve are finally linked to the cluster center c, which is the local-maximal one, just by comparing them to their neighboring points. In fact, the P-Linkage clustering bridges the gap between the local and the global information of the data points, which makes it more robust than the local-based clustering methods and more efficient than the global-based ones. In the following subsections, we introduce the pairwise linkage algorithm for the clustering of 2D data points, which takes the density of a data point as its feature value to build linkages.

Cutoff Distance:
The cutoff distance dc, as shown in Figure 2, is a global parameter that demarcates the neighborhood set of a data point pi from the other data points. In the recent work of (Rodriguez and Laio, 2014), dc was set to the value found at the 1% to 2% position of the set D of all distances between any two data points, sorted in ascending order. However, it is not appropriate to set the cutoff distance in this way, because dc is an indicator of the distribution of the neighboring points and should therefore be derived from the local neighborhood instead of from D. Thus, we propose a simple method to determine the value of dc, described as follows. For each data point pi, the distance between pi and its closest neighbor is recorded in the set Dcn, and then dc is computed as:
$$d_c = \mathit{scale} \cdot \mathrm{median}(D_{cn}),$$
where scale is a user-defined parameter, so that the cutoff distance dc is scale times the median value of the set Dcn. In this way, dc carries much more information about the neighborhood distribution than the value at the 1% to 2% position of D. Only the data points whose distances to pi are smaller than dc are considered as the neighborhood set of pi, which is denoted as Ii and shown as the green circle in Figure 2.
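The computation of dc can be illustrated with a minimal Python sketch. The function name `cutoff_distance` and the brute-force distance matrix are choices made for this illustration only; for large data sets a spatial index would be the more natural way to find the closest neighbors.

```python
import numpy as np

def cutoff_distance(points, scale=5.0):
    """Return d_c = scale * median(D_cn), D_cn being the nearest-neighbor distances."""
    pts = np.asarray(points, dtype=float)
    # Full pairwise Euclidean distance matrix (O(N^2) memory, fine for a sketch).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)    # ignore the zero distance of a point to itself
    d_cn = dist.min(axis=1)           # distance from each point to its closest neighbor
    return scale * float(np.median(d_cn))

# Four collinear points with gaps 1, 1 and 3: the nearest-neighbor distances are 1, 1, 1, 3.
pts = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 0.0]]
d_c = cutoff_distance(pts, scale=5.0)
```

Because the median is used, the single isolated point at x = 5 does not inflate dc, which is the behavior the local definition above is meant to provide.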
Density: (Rodriguez and Laio, 2014) defined the density of a data point pi as the number of data points in its neighborhood, which is discrete-valued and thus not suitable for our application, which requires continuous density values. In our proposed method, the density ρi of a data point pi is calculated by applying a Gaussian kernel on all the data points as follows: where N denotes the number of all the data points and dij is the distance between the two points pi and pj.
Pairwise Linkage: With the densities of all the data points, the pairwise linkages can be recovered in a non-iterative way, as follows. For a data point pi whose neighborhood set is Ii, we traverse each point in Ii and find the closest data point pj whose density is greater than that of pi. We then consider that pi should be in the same cluster as pj and record the linkage between pi and pj. If the density of pi is local-maximal, which means that there exists no data point in Ii whose density is greater than that of pi, we consider pi as a cluster center. The result of the pairwise linkage procedure consists of a lookup table T recording the linkage relationships and a set Ccenter recording all the cluster centers.
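As an illustration of a continuous, kernel-based density, the following Python sketch sums a Gaussian kernel over all data points. The exact kernel is an assumption of this sketch: it uses exp(-(dij/dc)^2) with dc as the bandwidth and includes the term j = i, neither of which is spelled out in the text above.

```python
import numpy as np

def gaussian_density(points, d_c):
    """Continuous density of every point under an assumed Gaussian kernel."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumed kernel: exp(-(d_ij / d_c)^2), summed over all j. The term j = i
    # adds the same constant 1 to every point, so it does not change the ranking.
    return np.exp(-(dist / d_c) ** 2).sum(axis=1)

# Three nearby 1D points and one isolated point.
pts = [[0.0], [1.0], [2.0], [10.0]]
rho = gaussian_density(pts, d_c=2.0)
```

In this toy example the middle point of the dense group receives the highest density and the isolated point the lowest, and no two points share the same value, which is the property a count-based density cannot guarantee.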
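The linkage step can be sketched in Python as follows. The array `parent` plays the role of the lookup table T and the list `centers` the role of Ccenter; both names, and the use of -1 to mark a point without a linkage, are conventions of this sketch rather than part of the method description.

```python
import numpy as np

def pairwise_linkage(points, rho, d_c):
    """Link each point to its closest neighbor (within d_c) of higher density.

    Returns (parent, centers): parent[i] is the index that p_i links to, or -1
    when p_i is a local density maximum; centers lists those local maxima.
    """
    pts = np.asarray(points, dtype=float)
    rho = np.asarray(rho, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    n = len(pts)
    parent = np.full(n, -1, dtype=int)
    centers = []
    for i in range(n):
        # Candidates: inside the neighborhood I_i and strictly denser than p_i.
        mask = (dist[i] < d_c) & (rho > rho[i])
        mask[i] = False
        if mask.any():
            cand = np.flatnonzero(mask)
            parent[i] = cand[np.argmin(dist[i, cand])]
        else:
            centers.append(i)
    return parent, centers

pts = np.array([[0.0], [1.0], [2.0], [10.0]])
rho = np.array([2.0, 3.0, 2.5, 1.0])
parent, centers = pairwise_linkage(pts, rho, d_c=3.0)
```

Here points 0 and 2 both link to point 1, while points 1 and 3 have no denser neighbor within dc and become cluster centers. The comparison is strict, so two neighboring points with exactly equal densities would both become centers; a continuous density makes such ties unlikely.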

Hierarchical Clustering:
The hierarchical clustering is a top-down procedure, which is similar to that of the divisive clustering algorithm. For each cluster center ci in Ccenter, we search the lookup table T starting from ci in a depth-first or breadth-first way to gather all the data points that are directly or indirectly connected with ci, which generates a cluster whose center is ci. The whole hierarchical clustering yields the final clusters C. Figure 2 shows an illustration of the hierarchical clustering procedure. From Figure 2, we can observe that p1 is a cluster center due to its local-maximal density and that there are four pairwise linkages between (p1, p13), (p13, p4), (p4, p27), and (p27, p8). Thus the hierarchical clustering is performed as p1 → p13 → p4 → p27 → p8. In this way, the clustering information is propagated from the dense data points to the sparse ones, which is similar to heat propagation.
Cluster Merging: When the data points are Gaussian-distributed, as shown in Figure 1, the hierarchical clustering via pairwise linkage can find the global cluster centers and recover the clusters quite well, but it may produce fragmented clustering results when one or more additional local maxima exist within a cluster. To deal with all kinds of data point distributions, a customized cluster merging strategy is proposed with the following three steps. Firstly, for each cluster Cp, the average density µp and the standard deviation σp of all the data points in Cp are calculated. Secondly, the adjacent clusters of each cluster Cp are collected by searching for the border data points between adjacent clusters. For each data point pi in Cp, its neighborhood set is denoted as Ii. If a data point pj in Ii belongs to another cluster Cq, these two clusters are considered to be a pair of adjacent clusters, and pi and pj are recorded as the adjacent points of Cp and Cq, respectively. Thirdly, for each adjacent cluster pair Cp and Cq, the average densities of the adjacent points of Cp and Cq are denoted as $\bar{\rho}_p$ and $\bar{\rho}_q$, respectively. These two adjacent clusters will be merged if the following conditions are met:
The cluster merging is conducted iteratively, which means that all the clusters that are directly or indirectly adjacent to the start cluster are merged.
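The gathering step can be sketched as a breadth-first search over the reversed linkages. In the following Python illustration the lookup table is assumed to be stored as a `parent` list (with -1 for cluster centers), and the toy table mirrors the chain p8 → p27 → p4 → p13 → p1 with the points re-indexed as 4 → 3 → 2 → 1 → 0; this representation is an assumption of the sketch.

```python
from collections import defaultdict, deque

def gather_clusters(parent, centers):
    """Label every point with the index of the cluster center it is linked to."""
    # Reverse the lookup table: children[p] lists the points that link to p.
    children = defaultdict(list)
    for i, p in enumerate(parent):
        if p >= 0:
            children[p].append(i)
    labels = [-1] * len(parent)
    for cid, c in enumerate(centers):
        queue = deque([c])
        labels[c] = cid
        while queue:                    # breadth-first search from the center
            u = queue.popleft()
            for v in children[u]:
                labels[v] = cid
                queue.append(v)
    return labels

# Indices 0..4 form one chain ending at center 0; indices 5 and 6 form a second cluster.
parent = [-1, 0, 1, 2, 3, -1, 5]
labels = gather_clusters(parent, centers=[0, 5])
```

Since every point has at most one outgoing linkage and densities strictly increase along it, the linkages form a forest rooted at the cluster centers, so each point is visited exactly once.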

Outliers:
In the previous work presented by (Ester et al., 1996), the outlier points are the ones whose densities are smaller than a certain threshold. In this way, low-density data points may be classified as outliers. In the work of (Rodriguez and Laio, 2014), the outlier points are those whose densities are smaller than the highest density in the border region of a cluster, which means that all the data points in the border region of a cluster are discarded as outliers. In our work, we consider the outliers on the cluster level. If the density of a data point pi is local-maximal but smaller than the median density, median(ρ), of all the data points, all the data points in the same cluster as pi are considered as outliers.
Figure 2 shows an illustration of some basic ideas of the proposed P-Linkage method, from which we can observe that there are two clusters, in blue and red, respectively. For each data point, the pairwise linkage is formed by searching for the closest neighboring point whose density is greater than its own. For example, p8 is first linked to p27, p27 is then linked to p4, p4 is further linked to p13, and p13 is finally linked to p1, which has the greatest density in its neighborhood. In this way, the complete linkage is found as p8 → p27 → p4 → p13 → p1, and thus all of these five data points are classified into the same cluster, whose cluster center is p1. The same procedure is applied to all the data points in blue, which form one cluster. The red cluster is formed in the same way. The pairwise linkage can also be applied to the data points on the boundary between two clusters. Taking the data point p33 as an example, p19, p7, p6, and p28 are its neighboring points and p6 is its closest neighboring point (CNP), and thus p33 is classified into the blue cluster, which is quite reasonable. The four data points in black, p26, p18, p17, and p16, are classified as outliers because there exists no CNP in their neighborhoods and the densities of their cluster centers are not high enough either.
In summary, the proposed clustering method can discover the clusters and cluster centers in only one step in general cases, without the merging procedure. For each data point pi with a density ρi, we find its closest neighboring point CNP(pi) whose density is greater than that of pi, and assign pi to the same cluster as CNP(pi). If the density ρi of the data point pi is local-maximal and greater than the average density $\bar{\rho}$, we consider pi as a cluster center. Algorithm 1 describes the complete procedure of the proposed P-Linkage clustering method in detail.
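The cluster-level outlier rule can be sketched in a few lines of Python. The inputs (`labels` giving the cluster index of each point and `centers` giving the index of each cluster center) are assumed to come from a preceding linkage and gathering step; their names and layout are choices of this sketch.

```python
import numpy as np

def cluster_level_outliers(labels, centers, rho):
    """Boolean mask: True where the point's cluster center is sparser than median(rho)."""
    labels = np.asarray(labels)
    rho = np.asarray(rho, dtype=float)
    median_rho = np.median(rho)
    is_outlier = np.zeros(len(labels), dtype=bool)
    for cid, c in enumerate(centers):
        if rho[c] < median_rho:
            # The whole cluster is discarded, not only its low-density points.
            is_outlier[labels == cid] = True
    return is_outlier

rho = [5.0, 4.0, 3.0, 1.0, 0.5]
labels = [0, 0, 0, 1, 1]
mask = cluster_level_outliers(labels, centers=[0, 3], rho=rho)
```

In this example the median density is 3.0, so the second cluster, whose center has density 1.0, is marked as outliers as a whole, while the low-density border point of the first cluster is kept.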

P-LINKAGE FOR POINT CLOUD SEGMENTATION
The segmentation of point clouds can also be formulated as a clustering problem, because the data points on a small surface often share similar normals. Thus we can employ the proposed P-Linkage clustering method for the segmentation of point clouds, which differs from the clustering of 2D data points in three aspects: (1) the neighborhood is based on the K nearest neighbors (KNN) instead of the fixed distance neighbors; (2) the feature value is the flatness of the estimated surface instead of the density of neighbors; (3) the distance between two data points is measured as the deviation of their normal orientations instead of their Euclidean distance. In the following subsections we explain the P-Linkage based point cloud segmentation algorithm in detail.

Normal Estimation:
The normal of each point is estimated by fitting a plane to some neighboring points. This neighborhood can be specified in two different ways: K nearest neighbors (KNN) based and fixed distance neighbors (FDN) based. For each data point, the KNN based methods select the K points of the point cloud with the minimum distances to it as its neighborhood, which is usually achieved by applying space partitioning strategies like the k-d tree (Arya et al., 1998). The FDN based methods (Toth et al., 2004) select all the points within a fixed distance of each point, and thus the number of neighbors changes according to the density of the point cloud. Compared to KNN, FDN yields fewer neighbors in low-density areas, which may result in inaccurate estimates of the normals.
In this paper, we employ the KNN method to find the neighbors of each data point and estimate the normal of the neighboring surface via Principal Component Analysis (PCA). The procedure contains the following three steps. Firstly, we build a k-d tree by applying the ANN library (Mount and Arya, 2010). For each data point pi, its K nearest neighbors (KNN) are found and recorded as Ii, which is sorted in ascending order according to the distances to pi. Secondly, for each data point pi, the covariance matrix is formed from the first K/2 data points in its KNN set Ii as follows:
$$\Sigma = \frac{1}{K/2} \sum_{j=1}^{K/2} (p_j - \bar{p})(p_j - \bar{p})^T,$$
where Σ denotes the 3×3 covariance matrix and $\bar{p}$ represents the mean vector of the first K/2 data points in Ii. Then the standard eigenvalue equation:
$$\lambda V = \Sigma V \qquad (5)$$
can be solved using Singular Value Decomposition (SVD), where V is the matrix of eigenvectors (Principal Components, PCs) and λ is the matrix of eigenvalues. The eigenvectors v2, v1, and v0 in V are defined according to the corresponding eigenvalues sorted in descending order, i.e., λ2 > λ1 > λ0. The first two PCs v2 and v1 form an orthogonal basis that indicates the two dimensions of highest variability and defines the best fitted plane of the neighboring points in Ii. The third PC v0 is orthogonal to the first two PCs and approximates the normal of the fitted plane. λ0 estimates how much the points deviate from the tangent plane and can therefore be used to evaluate the quality of a plane fitting: the smaller the value of λ0, the better the quality of the plane fitting.
For each data point, we first find its K nearest neighbors and calculate the eigenvectors from the first K/2 neighbors via PCA, and then take the eigenvector v0 as the normal and the eigenvalue λ0 as the flatness of the estimated plane. After that, the Maximum Consistency with Minimum Distance (MCMD) algorithm (Nurunnabi et al., 2015) is employed to find the inliers and outliers, which is conducted as follows. First, the orthogonal distances of the K nearest neighbors of a data point pi to its estimated plane are calculated and collected as a set $N_{OD} = \{d_o^k\}_{k=1}^{K}$. Then, the Median Absolute Deviation (MAD) is calculated as follows:
$$MAD = a \cdot \mathrm{median}_k \left| d_o^k - \mathrm{median}(N_{OD}) \right|,$$
where median(NOD) is the median value of NOD and a = 1.4826 is a constant. The inliers, also known as the Consistent Set (CS), are those data points whose Rz scores:
$$R_z^k = \frac{\left| d_o^k - \mathrm{median}(N_{OD}) \right|}{MAD}$$
are less than a constant threshold of 2.5 (Nurunnabi et al., 2015). Thus for each data point pi, we obtain its normal n(pi), flatness λ(pi) and Consistent Set CS(pi).
Linkage Building: With the normals, flatnesses and CSs of all the data points, the pairwise linkages can be recovered in a non-iterative way, as follows. For each data point pi, we search its CS for the neighbors whose flatnesses are smaller than that of pi, and among them choose the one whose normal has the minimum deviation from that of pi as CNP(pi). If CNP(pi) exists, a pairwise linkage between CNP(pi) and pi is created and recorded in a lookup table T. If the flatness λ(pi) of pi is the minimum one in its neighborhood and λ(pi) is smaller than the following threshold: where $\bar{\lambda}$ is the average value of the flatnesses of all the N data points and $\sigma_\lambda = \sqrt{\sum_{i=1}^{N} (\lambda(p_i) - \bar{\lambda})^2 / N}$ is the standard deviation of all the flatnesses, we take pi as a cluster center and insert it into the list of cluster centers Ccenter.
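The PCA step can be illustrated with a short Python sketch. Several simplifications are assumptions of this illustration: the neighbors are found by brute force instead of a k-d tree, the point pi itself is excluded from its neighbor set, the covariance is normalized by the number of points used, and `numpy.linalg.eigh` is used instead of an SVD since the covariance matrix is symmetric.

```python
import numpy as np

def pca_normal(neighbors):
    """Return (normal v0, flatness lambda0) for a (k, 3) array of neighboring points."""
    nbrs = np.asarray(neighbors, dtype=float)
    centered = nbrs - nbrs.mean(axis=0)
    cov = centered.T @ centered / len(nbrs)     # 3x3 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    return eigvecs[:, 0], float(eigvals[0])      # smallest eigenpair: normal, flatness

def estimate_normal(points, i, k):
    """Estimate the normal at point i from the first k/2 of its k nearest neighbors."""
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts - pts[i], axis=1)
    knn = np.argsort(d)[1:k + 1]                 # k nearest neighbors, excluding p_i
    return pca_normal(pts[knn[: k // 2]])

# A 4x4 grid of points lying exactly on the plane z = 0.
pts = [[float(x), float(y), 0.0] for x in range(4) for y in range(4)]
normal, flatness = estimate_normal(pts, i=5, k=8)
```

For points lying exactly on a plane the flatness is zero up to rounding and the normal is the plane normal up to sign; PCA normals carry no consistent orientation, which matters whenever two normals are compared.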
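The MAD and Rz computations can be sketched in Python as follows. The function name, the way the plane is passed in (a point on the plane plus a unit normal), and the small floor that guards against a zero MAD are assumptions of this sketch and are not taken from (Nurunnabi et al., 2015).

```python
import numpy as np

def mcmd_consistent_set(neighbors, plane_point, normal, a=1.4826, rz_max=2.5):
    """Return the indices of the neighbors whose Rz score is below rz_max."""
    nbrs = np.asarray(neighbors, dtype=float)
    # Orthogonal distance of every neighbor to the plane (plane_point, normal).
    d_o = np.abs((nbrs - np.asarray(plane_point, dtype=float)) @ np.asarray(normal, dtype=float))
    med = np.median(d_o)
    mad = a * np.median(np.abs(d_o - med))
    rz = np.abs(d_o - med) / max(mad, 1e-12)     # floor avoids division by zero
    return np.flatnonzero(rz < rz_max)

# Eight points close to the plane z = 0 and one gross outlier at z = 5.
z = [0.00, 0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.00, 5.0]
nbrs = [[float(k), 0.0, zk] for k, zk in enumerate(z)]
cs = mcmd_consistent_set(nbrs, plane_point=[0.0, 0.0, 0.0], normal=[0.0, 0.0, 1.0])
```

Because both the location and the spread are estimated with medians, the single gross outlier barely affects the scores of the remaining points and is rejected with a very large Rz value.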
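A hedged Python sketch of the linkage building step is given below. The flatness threshold is passed in as the argument `flat_threshold` rather than computed, and the deviation between two normals is measured through the absolute value of their dot product so that the sign ambiguity of PCA normals is ignored; both are choices of this sketch.

```python
import numpy as np

def build_linkages(normals, flatness, consistent_sets, flat_threshold):
    """Link each point to the flatter CS member with the most similar normal.

    Returns (parent, centers); parent[i] == -1 means that p_i has no linkage.
    """
    normals = np.asarray(normals, dtype=float)
    flatness = np.asarray(flatness, dtype=float)
    parent = np.full(len(flatness), -1, dtype=int)
    centers = []
    for i, cs in enumerate(consistent_sets):
        cs = np.asarray(cs, dtype=int)
        flatter = cs[flatness[cs] < flatness[i]]
        if flatter.size:
            # The largest |cos| corresponds to the smallest angle between normals.
            cosines = np.abs(normals[flatter] @ normals[i])
            parent[i] = flatter[np.argmax(cosines)]
        elif flatness[i] < flat_threshold:
            centers.append(i)                     # flattest in its CS and flat enough
    return parent, centers

normals = [[0.0, 0.0, 1.0], [0.28, 0.0, 0.96], [0.6, 0.0, 0.8], [1.0, 0.0, 0.0]]
flatness = [0.01, 0.02, 0.03, 0.5]
consistent_sets = [[1, 2], [0, 2], [0, 1], []]
parent, centers = build_linkages(normals, flatness, consistent_sets, flat_threshold=0.1)
```

In this toy input, point 2 has two flatter candidates and links to point 1, whose normal is closer to its own, while point 3 is neither linked nor flat enough to become a center and therefore ends up in no cluster.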
Slice Creating: To create the surface slices, the clusters C are first formed by searching along the lookup table T from each cluster center in Ccenter to collect the data points that are directly or indirectly connected with it. The clusters with fewer than 10 data points are removed as outliers. Then for each cluster Cp a slice is created by plane fitting via the MCS method proposed by (Nurunnabi et al., 2015), which is an iterative procedure with the iteration number calculated as:
$$I_t = \frac{\log(1 - P)}{\log\left(1 - (1 - \epsilon)^{h_0}\right)},$$
where h0 is the size of the minimal subset of the data points in Cp, which equals 3 for plane fitting, P is the probability that at least one of the randomly chosen h0-subsets in the It iterations is outlier free, and ε is the outlier rate in Cp, which was set to 50% for general cases. In each iteration of the MCS method, the following steps are performed:
(1) First, h0 data points are chosen randomly.
(2) For the h0-subset, a plane is fitted via PCA, and the orthogonal distances of all the data points in Cp to this plane are calculated and recorded in NOD.
(3) Then NOD is sorted in ascending order and the first h data points (h equals half the size of Cp) are chosen to form the h-subset.
(4) Finally, PCA is applied again on the h-subset to fit a plane, whose flatness λ0 is added to the list of previous flatnesses, defined as the set S(λ0).
After the iterations, S(λ0) is sorted in ascending order, and the plane corresponding to the smallest flatness is chosen as the best fitted plane of Cp.
Then the MCMD outlier removal method is applied to find the inliers, also known as the Consistent Set (CS), of the best fitted plane in Cp. Thus for each slice Sp, we obtain its normal n(Sp), flatness λ(Sp) and Consistent Set CS(Sp) in the same way as for each data point.
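Steps (1) to (4) can be sketched in Python as follows. The value P = 0.99, the fixed random seed and the function names are assumptions made for this illustration; the sketch keeps track of the best plane directly instead of storing and sorting the full set S(λ0), which gives the same result.

```python
import math
import numpy as np

def mcs_iterations(p_success=0.99, outlier_rate=0.5, h0=3):
    """Number of random h0-subsets needed to draw an outlier-free one with probability P."""
    return math.ceil(math.log(1.0 - p_success) / math.log(1.0 - (1.0 - outlier_rate) ** h0))

def fit_plane_pca(pts):
    """Return (centroid, normal, flatness) of the PCA plane through pts."""
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    eigvals, eigvecs = np.linalg.eigh(centered.T @ centered / len(pts))
    return centroid, eigvecs[:, 0], float(eigvals[0])

def mcs_plane(cluster, p_success=0.99, outlier_rate=0.5, seed=0):
    """Robust plane fit: keep the h-subset plane with the smallest flatness."""
    pts = np.asarray(cluster, dtype=float)
    rng = np.random.default_rng(seed)
    h = len(pts) // 2
    best = None
    for _ in range(mcs_iterations(p_success, outlier_rate)):
        sample = pts[rng.choice(len(pts), size=3, replace=False)]   # step (1)
        c, n, _ = fit_plane_pca(sample)                              # step (2)
        nod = np.abs((pts - c) @ n)                                  # orthogonal distances
        h_subset = pts[np.argsort(nod)[:h]]                          # step (3)
        c, n, lam0 = fit_plane_pca(h_subset)                         # step (4)
        if best is None or lam0 < best[2]:
            best = (c, n, lam0)
    return best

# 25 points on the plane z = 0 plus 5 gross outliers.
inliers = [[float(x), float(y), 0.0] for x in range(5) for y in range(5)]
outliers = [[0.5, 0.5, 3.0], [1.5, 2.5, -4.0], [3.5, 0.5, 5.0], [2.5, 3.5, -2.0], [0.5, 3.5, 6.0]]
centroid, normal, lam0 = mcs_plane(inliers + outliers)
```

With ε = 50% and h0 = 3, a target probability of P = 0.99 requires 35 iterations, which is why the procedure stays cheap even when applied to every cluster.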
Slice Merging: To obtain complete planar and curved surfaces, which are quite common in indoor and industrial applications, an efficient slice merging method based on the slice normals is proposed. It is similar to the Cluster Merging procedure introduced for the P-Linkage clustering and contains the following steps. First, we search for the adjacent slices of each slice. Two slices Sp and Sq are considered adjacent if the following condition is satisfied: where pi ∈ CS(pj) and pj ∈ CS(pi).
Then, a slice Sp and one of its adjacent slices Sq are merged if the following condition is satisfied:
$$\arccos\left| n(S_p)^T \cdot n(S_q) \right| < \theta,$$
where n(Sp) and n(Sq) are the normals of Sp and Sq, respectively, and θ is the threshold of the angle deviation. The greater θ is, the more curved the merged surface can be. The slice merging is conducted iteratively, which means that all the slices that are directly or indirectly adjacent to the start slice are processed.
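The iterative merging can be sketched in Python with a union-find structure over the slices. The list of adjacent slice pairs is assumed to be given, the angle test is applied to the original slice normals of each adjacent pair (the normals are not re-estimated after a merge), and the absolute value of the dot product is used because PCA normals have no consistent sign; all three points are assumptions of this sketch.

```python
import numpy as np

def merge_slices(normals, adjacent_pairs, theta_deg):
    """Return one group label per slice after merging adjacent slices with similar normals."""
    normals = np.asarray(normals, dtype=float)
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    root = list(range(len(normals)))

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]      # path halving keeps the trees shallow
            x = root[x]
        return x

    cos_theta = np.cos(np.radians(theta_deg))
    for p, q in adjacent_pairs:
        # An angle below theta is equivalent to |n_p . n_q| above cos(theta).
        if abs(normals[p] @ normals[q]) > cos_theta:
            root[find(p)] = find(q)
    return [find(i) for i in range(len(normals))]

def tilted(deg):
    """Unit normal tilted by deg degrees from the z axis (helper for the example)."""
    return [np.sin(np.radians(deg)), 0.0, np.cos(np.radians(deg))]

normals = [tilted(0), tilted(10), tilted(20), tilted(90)]
labels = merge_slices(normals, adjacent_pairs=[(0, 1), (1, 2), (2, 3)], theta_deg=15.0)
```

In this example the first three slices are merged into one curved surface although the first and the third differ by 20 degrees, which exceeds the threshold: merging propagates through chains of adjacent slices, and this is how a larger θ admits more strongly curved surfaces.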

Evaluation on Clustering
The only user-defined parameter that P-Linkage requires is the scale used to determine the value of dc. We set scale = 5 for general cases, which the following experiments show to be a good choice. The merging procedure, which is specifically designed for uniformly distributed data, is not used in general cases but can be customized to obtain a better result if the initial clustering result contains too many fragments.
To evaluate the proposed method on Gaussian distributed data, we chose two test data sets from the "Shape sets", which contain 15 and 31 clusters, respectively. Figure 3 shows the clustering results on them, where the black points stand for the outliers and the points in other colors belong to different clusters. We can see that, in both data sets, all the clusters were correctly discovered, and the outliers extracted by the proposed method were all reasonable. In both cases, the value of scale was set to 5.
To further evaluate the proposed method on uniformly distributed data, two other test data sets from the "Shape sets" were tested. Figure 4 shows the clustering results of our method. In Figure 4(a), all the 7 clusters were discovered and distinguished well. In Figure 4(b), however, the data points at the bottom left, in purple, were all classified into a single cluster, because scale = 5 is too large for this case. By reducing scale to 2 and applying the merging procedure, a better clustering result was achieved, as shown in Figure 4(c). In this setting, however, more data points were discarded as outliers (points in black), because there are not enough data points in their neighborhoods and their densities are therefore smaller than the density threshold required to form a new cluster.
We also tested the P-Linkage method on multi-dimensional data sets to assess its robustness. Figure 5 shows the clustering result on a simulated 3D data set composed of three Gaussian distributed subsets. Each subset contains 200 data points. The numbers of correctly classified data points for the three subsets are 175 (87.5%), 195 (97.5%) and 191 (95.5%), respectively. In all the three 2D views in Figure 5, we can see that the three subsets were separated well.

Evaluation on Point Cloud Segmentation
Unlike the traditional K-Means algorithm and the approach proposed by (Rodriguez and Laio, 2014), the proposed P-Linkage clustering can be employed directly in applications like point cloud segmentation, where there exist a huge number of clusters of data points in complex scenes. To test the robustness of the proposed point cloud segmentation method, we applied it to Vehicle-Mounted, Aerial and Stationary Laser Scanner point clouds. In all these tests, only the parameter θ was adjusted to obtain the best segmentation results, without any filtering or thinning of the point clouds.

Vehicle-mounted:
The segmentation of dense vehicle-mounted laser scanner points is a challenging task due to the existence of various kinds of road furniture, such as signs and light poles, road barricades and billboards, together with the ground and vehicles. In this work, two vehicle-mounted data sets were tested: the one shown in Figure 6(a) was captured along an urban street 355 meters long and contains 217k points, and the one shown in Figure 6(b) was captured in detail from a small part of a city and contains 120k points. From Figure 6(a), which contains road, buildings, street lamps, vehicles and trees, we observe that the road surface was clustered into a single complete segment and separated entirely from the other objects. The building facades were also discovered quite well. From Figure 6(b), we observe that most of the building facades were segmented well even though their densities vary over a wide range. The facades perpendicular to the road (the green slice within the big red ellipse) and the small slices connected with other big ones (the two slices within the small red ellipses) were recovered quite well. Figure 6(c) shows a detailed view of the segmentation result, with the red quadrangles representing different slices, which indicates that the segmentation result of the proposed method can be used for the extraction of street patches with more specific operations.

Aerial:
The first aerial data set tested is composed of 3433k points, which covers an urban area of 5km×5km.There are buildings, ground, road and vegetations in this data set.Figure 7(a) shows a partial result of the whole data set, from which we can see that the ground was separated by the road (left bottom, in orange) into two parts, in purple and lemon yellow, respectively, and the ground in purple was segmented into a whole surface despite of the various objects on the ground.The roofs of buildings were segmented into a whole part in general, while some small structures can also be kept.Figure 7 (b) shows a detailed view of the segmentation result of the area in red frame in Figure 7(a), from which we can see that the proposed method can preserve the details well.The second tested aerial data set is the ISPRS commission III/4 benchmark on Urban Classification and 3D Building Reconstruction and Semantic Labeling2 , the results of which are shown in Figure 7(c) and (d).We can see that the objects including the trees and building are generally separated from the ground which is segmented into a single part, and the details of the roofs are preserved well as Figure 7(d) shows.
More experiments to evaluate the accuracy and correctness of the segmentation method will be presented in future work. It is noteworthy that, to achieve the segmentation results on both data sets, only the number of nearest neighbors K used with the k-d tree and the parameter θ used in slice merging were adjusted.
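The effect of an angular parameter such as θ can be illustrated with a small sketch. This section does not spell out the merging rule, so the test below, which compares the angle between the normals of two slices against θ, is only an assumption about how such a parameter might act; the names normal_angle_deg and within_theta are hypothetical.

```python
import math


def normal_angle_deg(n1, n2):
    """Angle in degrees between two normals, ignoring their orientation."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norms = math.hypot(*n1) * math.hypot(*n2)
    # abs() treats n and -n as the same orientation; min() guards acos.
    return math.degrees(math.acos(min(1.0, abs(dot) / norms)))


def within_theta(n1, n2, theta_deg):
    """Assumed merge test: the normals differ by less than theta degrees."""
    return normal_angle_deg(n1, n2) < theta_deg


# A normal tilted by 10 degrees away from the z axis.
tilted = (0.0, math.sin(math.radians(10)), math.cos(math.radians(10)))
print(within_theta((0, 0, 1), tilted, theta_deg=20.0))  # True
print(within_theta((0, 0, 1), tilted, theta_deg=5.0))   # False
```

Under this assumed rule, a larger θ tolerates more curvature between neighboring slices, which would yield fewer and larger segments.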

Stationary:
The stationary laser scanner data set tested consists of 2500×1076 data points, which means that it can be unfolded into a 2D image with 2500 columns and 1076 rows. Figure 8 shows the segmentation result as a 2D image, from which we can see that all the main surfaces were segmented well. In particular, the details of the modulator tubes on the ceiling were preserved quite well. In this case we set θ = 20.0° because the indoor scene contains many streamlined objects, which results in more surfaces and fewer planes.
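Unfolding an ordered scan into an image amounts to reshaping the flat list of per-point values. The sketch below assumes that the points are stored row by row (row-major order), which this section does not state; the helper names unfold and pixel_of are hypothetical.

```python
def unfold(values, rows, cols):
    """Reshape a flat, row-major list of per-point values into rows x cols."""
    if len(values) != rows * cols:
        raise ValueError("expected rows * cols values")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def pixel_of(index, cols):
    """Return the (row, column) of a flat point index."""
    return divmod(index, cols)


image = unfold(list(range(6)), rows=2, cols=3)
print(image)                      # [[0, 1, 2], [3, 4, 5]]
print(pixel_of(2500, cols=2500))  # (1, 0)
```

For the data set above, the same call with rows=1076 and cols=2500 would turn a list of per-point segment labels into a label image.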

CONCLUSION
In this paper we propose a novel hierarchical clustering method named P-Linkage, which discovers clusters in a simple and efficient way by recovering the pairwise linkages at the data point level. The proposed P-Linkage clustering can be employed directly in applications that involve a huge number of clusters and complex scenes. By applying P-Linkage clustering to point cloud segmentation, we develop an efficient point cloud segmentation algorithm that can handle a huge number of data points captured from different scenes. Experimental results on synthetic data of different dimensions, from 2D to 4D, demonstrate the efficiency and robustness of the proposed P-Linkage clustering algorithm, and a large number of experimental results on the Vehicle-Mounted, Aerial and Stationary Laser Scanner point clouds illustrate the robustness and efficiency of our proposed 3D point cloud segmentation algorithm.

Figure 1: An illustration of the pairwise linkage on a 2D Gaussian curve. Derived from non-maximum suppression, P-Linkage compares each data point to its neighbors and forms the linkages p1 → p2 → p3 → ... → c.
(a) scale = 5 (b) scale = 5
Figure 3: Clustering results of the proposed method on the Gaussian distributed test data.

Figure 5: Clustering results of the proposed method on the 3D (XYZ) simulated test data: the original data points and the clustering result projected onto the XY plane, the YZ plane, and the XZ plane, from left to right.
Figure 6: Segmentation results of the proposed method on the vehicle-mounted test data.

Figure 8: Segmentation results of the proposed method on the stationary indoor test data with θ = 20.0°.

Algorithm 1: Hierarchical Clustering by Pairwise Linkage
Require: The density of each data point; the cutoff distance dc.
Ensure: The clusters C; their cluster centers Ccenter.
ρi: the density of a data point pi
ρ: the average density of all the data points
Ii: the neighborhood set of a data point pi
T: the lookup table recording all the pairwise linkages
for each data point pi do
    Set LocalMaximal ← TRUE
    Set dmin ← ∞ and CNP(pi) ← ∅
    for each neighboring point pj in Ii do
        Set dij ← the distance between pi and pj
        if ρj > ρi and dij < dmin then
            Set LocalMaximal ← FALSE
            Set CNP(pi) ← pj, dmin ← dij
        end if
    end for
    if CNP(pi) ≠ ∅ then
        Record the linkage between pi and CNP(pi) into T
    else if LocalMaximal and ρi > median(ρ) then
        Insert pi into Ccenter
    end if
end for
Collect the clusters C by searching data points in the lookup table T from each data point in Ccenter
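The linkage and center-selection steps of the listing above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the density is taken to be the number of neighbors within the cutoff distance, distances are computed by brute force rather than with a k-d tree, centers are compared against the median density as in the listing, and the later cluster merging step is omitted. The function name p_linkage is hypothetical.

```python
import math
from statistics import median


def p_linkage(points, cutoff):
    """Return (labels, centers); label -1 marks points not linked to a center."""
    n = len(points)
    # Brute-force pairwise distances; a k-d tree would replace this on large data.
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Neighborhood set of each point: all other points within the cutoff.
    neighbors = [[j for j in range(n) if j != i and dist[i][j] <= cutoff]
                 for i in range(n)]
    # Assumed density: the neighbor count within the cutoff distance.
    density = [len(nb) for nb in neighbors]
    threshold = median(density)

    parent = [None] * n  # lookup table T: point -> closest denser neighbor
    centers = []
    for i in range(n):
        d_min = math.inf
        for j in neighbors[i]:
            if density[j] > density[i] and dist[i][j] < d_min:
                parent[i], d_min = j, dist[i][j]
        # No denser neighbor means a local maximum; keep it only if dense enough.
        if parent[i] is None and density[i] > threshold:
            centers.append(i)

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id
    for i in range(n):
        end = i
        # Density strictly increases along a chain, so this loop terminates.
        while parent[end] is not None:
            end = parent[end]
        labels[i] = labels[end]
    return labels, centers


# Two small plus-shaped blobs and one isolated point.
blob = [(0, 0), (1, 0), (2, 0), (1, 1), (1, -1)]
points = blob + [(x + 10, y) for x, y in blob] + [(5, 5)]
labels, centers = p_linkage(points, cutoff=1.1)
print(labels)   # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
print(centers)  # [1, 6]
```

In this toy input each blob links to its middle point, while the isolated point has no denser neighbor and too low a density to become a center, so it stays unassigned.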