POLE-LIKE ROAD FURNITURE DETECTION IN SPARSE AND UNEVENLY DISTRIBUTED MOBILE LASER SCANNING DATA

Pole-like road furniture detection has received much attention in recent years because of the traffic functionality of such furniture. In this paper, we develop a framework to detect pole-like road furniture from sparse mobile laser scanning data. The framework is carried out in four steps. The unorganised point cloud is first partitioned. Then, after removing ground points, above ground points are clustered and roughly classified. A slicing check in combination with cylinder masking is proposed to extract pole-like road furniture candidates. Pole-like road furniture is obtained after occlusion analysis in the last stage. The average completeness and correctness of pole-like road furniture detection in sparse and unevenly distributed mobile laser scanning data were above 0.83. This is comparable to the state of the art in pole-like road furniture detection in mobile laser scanning data of good quality, and the framework is potentially of practical use in processing point clouds collected by autonomous driving platforms.


INTRODUCTION
In recent years, road safety has been stressed in many countries. As modernisation and urbanisation advance in developing countries, an increasing number of vehicles appear on urban roads. In order to improve road safety, governments adopt many measures, including urban object inventory. During urban object inventory, urban objects are counted by category and their locations are recorded. The obtained inventory information can be used to check the suitability of existing road furniture or to assist in planning the placement of road furniture. Road safety can thereby be enhanced by these analyses.
To support urban object inventory, high quality data is needed. In urban scenes, 3D data is commonly used as the data source for road-side object inventory because it precisely records 3D location and geometric structure. With the development of sensors, mapping systems have become more precise in the past decades (Puente et al., 2013). Three types of 3D laser scanning systems are often used to collect 3D data in outdoor scenes: airborne laser scanning (ALS) systems, mobile laser scanning (MLS) systems and terrestrial laser scanning (TLS) systems. Compared with TLS systems, MLS systems are more flexible and faster when acquiring data. Compared with the data collected by ALS systems, MLS data are denser and more precise. For these reasons, MLS systems are widely used to collect 3D point cloud data in urban scenes. Autonomous driving is a popular topic that has been widely studied, not only because of its convenience but also because of its contribution to road safety. As a result, 3D urban object detection in sparse and unevenly distributed points (e.g. 3D point clouds collected by Velodyne sensors) becomes crucial.
Currently, urban object inventory mainly relies on manual labelling. Manual inventory of urban objects can be tedious and time-consuming. Therefore, automatic road-side object inventory is in urgent demand. Much research has been carried out on automatic urban road-side object inventory (Golovinskiy et al., 2009; Pu et al., 2011; Yang et al., 2015; Lehtomäki et al., 2016; Wang et al., 2017; Li et al., 2017). One of the major focuses is pole-like road furniture detection, because of the essential traffic functionalities of such furniture. Besides their traffic functionalities, detected pole-like road furniture can potentially be used as features for autonomous driving systems (Hofmann and Brenner, 2009) or for Simultaneous Localisation and Mapping (SLAM) in urban scenes. Great progress has been achieved in pole-like road furniture detection using mobile laser scanning data (Brenner et al., 2009; Lehtomäki et al., 2010; El-Halawany and Lichti, 2011; Cabo et al., 2014). However, the accuracy of detection is still modest, especially in sparse and unevenly distributed point cloud data, which is crucial for 3D object detection in MLS data collected by autonomous driving systems. One reason is that these methods rely on geometric features, such as eigenvalue-based features, which are not robust in sparse and unevenly distributed MLS data. Another reason is that objects behind building façades are not eliminated. Therefore, in this paper we propose a method to detect pole-like road furniture from sparse and unevenly distributed mobile laser scanning data. Our method does not rely on point density or point distribution. It is capable of detecting poles that are more distant from the road, and can potentially be used for pole-like road furniture detection in autonomous driving systems.
This paper is organised as follows. Related work is reviewed in Section 2. Our proposed method is explained in Section 3. In Section 4 we test our algorithm and analyse the results. Finally, we draw conclusions and give an outlook on our future work (Section 5).

RELATED WORK
As one of the focus topics in the research field of urban object identification, pole-like road furniture detection has attracted much attention in recent years, and much progress has been made. A number of methods have been proposed to detect pole-like road furniture in mobile laser scanning data. Currently proposed methods fall into two types: supervised learning methods and knowledge driven methods. In supervised learning methods, distinctive features are used as input and a model is trained to predict candidates. In knowledge driven methods, by contrast, rules or constraints are defined to make predictions based on inductive experience or knowledge.
Pole-like road furniture detection has been explored by a number of knowledge driven methods. Brenner et al. (2009) represent an early attempt to extract pole-like objects from MLS data. A cylindrical stack model is utilised to analyse the structure of measured laser points. Lehtomäki et al. (2010) further develop this method by using the scanline information of MLS data. They first extract short clusters whose points lie in the same scanline. Then constraints are designed to connect these clusters, and a similar cylinder masking is applied to check whether they are poles. Similar to these two methods, Cabo et al. (2014) develop a voxel based framework to detect pole-like objects from MLS data in urban areas. However, this framework is not able to distinguish trees from pole-like objects. Fukano et al. (2015) detect pole-like objects and trees by using scanline information and a slice cutting algorithm. However, this method strongly relies on the triangulation of points, which does not work in sparse and unevenly distributed data. Pu et al. (2011) propose a percentile method to detect pole-like road furniture from MLS data. They first remove ground points in a rough classification step. Then a percentile method and shape analysis are carried out to classify above ground components into detailed classes such as trees and traffic signs. Li and Oude Elberink (2014) optimise this framework to improve the detection rate of trees by additionally using collected multi-echo information. Huang and You (2015) detect pole-like road furniture in MLS data by using slicing, seed generation and bucket augmentation. Multi-echo information is not adopted in this method.
Eigenvalue-based features have been adopted for pole-like object detection from MLS data. Liberge et al. (2010) extract above ground objects by local discontinuity. Three types of vertical posts are detected by using eigenvalue features. Another eigenvalue based method to extract pole-like objects is investigated by El-Halawany and Lichti (2011). They extract points with high linearity and detect poles by using region growing to include their surrounding points. Bremer et al. (2013) adopt multi-scale eigenvalue features in combination with defined rules to detect pole-like objects from MLS data. Aijazi et al. (2013) segment above ground points by applying voxelisation and a link-chain rule. Descriptors of supervoxels are used to categorise segmented objects into five classes. Yokoyama et al. (2013) utilise the Laplacian filter to smooth above ground components. Eigenvalue features are then used to describe points in terms of linearity, planarity and scattering. A designed model is fitted to decide whether an above ground component is a pole-like object. Yang et al. (2015) voxelise MLS data at multiple scales based on their point attributes, and define a set of rules to recognise objects in urban scenes. A normalised cut algorithm is proposed to separate above ground points in Yu et al. (2015). They subsequently construct pairwise 3D shape context for a set of defined models to perform feature matching and recognise pole-like road furniture in urban scenes.
In contrast to knowledge driven methods, supervised learning methods do not require defined rules or models to make predictions. For supervised learning methods, the process after feature extraction is end-to-end. Golovinskiy et al. (2009) segment above ground objects by using a min-cut approach and recognise urban objects by shape features allied with machine learning techniques. Weinmann et al. (2015) employ an optimal neighbourhood selection and SVM to semantically label 3D urban objects, which include pole-like objects. By using massive features and random forest, Hackel et al. (2016) semantically label urban scenes in MLS point clouds. Lehtomäki et al. (2016) develop an object-wise classification framework on the basis of SVM to identify urban objects.
The performance of pole-like road furniture detection in sparse and unevenly distributed MLS data is still modest, less than 70%. Most of the aforementioned methods depend strongly on point density when calculating the features that decide whether road furniture is pole-like or not. Supervised learning methods need many training samples. In this paper, we utilise the number of slices in combination with cylinder masking, which are not related to point density and are based on generic knowledge, to detect pole-like road furniture from MLS data. Numerous training samples are not needed in our method.

METHODOLOGY
In this section, we describe a four-stage framework to detect pole-like road furniture from MLS data. In the first stage, unorganised MLS data is cut into blocks based on trajectory data. In the following stage, a rough classification is carried out to extract ground, buildings and vegetation. Then slicing in combination with cylinder masking is proposed to detect pole-like road furniture. Features used at this stage, such as the number of slices which belong to poles, do not require high point density or even point distribution. In the end, we perform occlusion analysis to eliminate indoor pole-like objects. An overview of our framework is shown in Figure 1.

Data partition
The volume of collected MLS data can be massive. In order to reduce memory usage and save computation time, we cut the unorganised MLS data into blocks along the trajectory in the first stage. First, we sort trajectory points based on their recording time. Then the length and width of the blocks are defined to generate the outline of every data block. We use these outlines to crop the corresponding MLS data from the unorganised dataset. This work is analogous to Pu et al. (2011). The length corresponds to the direction along the trajectory line and the width corresponds to the direction perpendicular to the trajectory line. In our paper, the width is defined to be 30 m in both datasets A and B, and the length is defined to be 40 m and 60 m respectively.
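
The partition step can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: instead of cropping with oriented block outlines, it assumes that each point is assigned to its nearest trajectory vertex and that the block index follows from the distance travelled along the time-sorted trajectory. The function name and the tuple layouts are hypothetical.

```python
import math
from collections import defaultdict

def partition_by_trajectory(points, trajectory, block_length, block_width):
    """Group points into blocks along the trajectory.

    points: list of (x, y, z); trajectory: list of (time, x, y).
    Simplification (assumption): a point belongs to the block of its
    nearest trajectory vertex, not to an exact oriented rectangle.
    """
    traj = sorted(trajectory, key=lambda s: s[0])  # sort by recording time
    # Cumulative distance travelled at every trajectory vertex.
    travelled = [0.0]
    for (_, x0, y0), (_, x1, y1) in zip(traj, traj[1:]):
        travelled.append(travelled[-1] + math.hypot(x1 - x0, y1 - y0))
    blocks = defaultdict(list)
    for p in points:
        dists = [math.hypot(p[0] - x, p[1] - y) for _, x, y in traj]
        i = min(range(len(traj)), key=dists.__getitem__)
        # Keep only points within half the block width of the trajectory.
        if dists[i] <= block_width / 2:
            blocks[int(travelled[i] // block_length)].append(p)
    return dict(blocks)
```

With the settings reported for Dataset A, the call would use `block_length=40.0` and `block_width=30.0`. The brute-force nearest-vertex search is only suitable for illustration; a spatial index would be needed for real data volumes.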

Rough classification
In this stage, we perform a rough classification to detect ground points, buildings and trees. First, ground points are extracted based on local height variance and trajectory recordings. Then above ground points are clustered by connected component analysis. Thereafter, buildings are detected among the above ground components. In the last part of this stage, we extract trees from above ground components.

Ground extraction:
In order to obtain above ground objects, ground points are first extracted and removed. Compared with above ground objects, sets of ground points show only small local height variance. Normally, ground points also lie underneath their corresponding trajectory locations. Based on these two attributes, ground points are extracted. Specifically, we calculate the local height variance as the difference between the highest and the lowest point of every point's neighbourhood. The neighbourhood of every point is defined to be its k nearest points within a certain distance. If the height variance is smaller than an empirically defined threshold (e.g. 0.1 m), the point is labelled as a ground point. The ground extraction is point-wise.
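
The point-wise test above can be sketched as follows. This is a minimal illustration under assumptions, not the code used in our experiments: neighbours are found by brute force with 3D Euclidean distance, the values of `k` and `max_dist` are hypothetical, and "underneath the trajectory" is interpreted as lying below the height of the nearest trajectory point.

```python
import math

def extract_ground(points, trajectory, k=10, max_dist=0.5, max_range=0.1):
    """Return one boolean per point: True if it is labelled as ground.

    points and trajectory are lists of (x, y, z). The values of k and
    max_dist are assumed; max_range follows the 0.1 m example in the text.
    """
    labels = []
    for p in points:
        # k nearest neighbours within max_dist (the point itself included).
        near = sorted(
            (math.dist(p, q), q[2]) for q in points if math.dist(p, q) <= max_dist
        )[: k + 1]
        heights = [z for _, z in near]
        height_range = max(heights) - min(heights)
        # Nearest trajectory point in the horizontal plane.
        t = min(trajectory, key=lambda s: math.hypot(p[0] - s[0], p[1] - s[1]))
        labels.append(height_range < max_range and p[2] < t[2])
    return labels
```

Because the neighbourhood is capped both in size and in distance, the test stays local even where the point spacing varies strongly.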

Building façade detection:
First, above ground points are obtained by removing the ground points extracted in the previous step, and a connected component analysis is carried out to cluster above ground points into separated components (Vosselman et al., 2004). Then a surface growing algorithm is performed to detect planes in these separated above ground components. Four features are subsequently utilised to recognise building façades among the above ground components. These features are the area, the verticality, the width and the height of a detected plane (Rutzinger et al., 2009). The verticality is the angle between the normal of the detected plane and the vertical direction. When all four feature values of a planar component exceed a threshold, the component is classified as a building façade.
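
A minimal sketch of the four-feature test is given below. It assumes that a plane and its normal have already been produced by surface growing (not shown), approximates the area by the bounding rectangle of the plane, and leaves the thresholds to the caller, since their values are not specified here. All function names are hypothetical.

```python
import math

def facade_features(points, normal):
    """Compute area, verticality, width and height of a planar segment.

    points: list of (x, y, z) on the plane; normal: (nx, ny, nz).
    Assumption: area is approximated by width * height.
    """
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    # Angle between the plane normal and the vertical direction, in degrees.
    verticality = math.degrees(math.acos(abs(nz) / length))
    horizontal = math.hypot(nx, ny)
    # Horizontal direction lying in the plane (perpendicular to the normal).
    ux, uy = (-ny / horizontal, nx / horizontal) if horizontal > 0 else (1.0, 0.0)
    along = [p[0] * ux + p[1] * uy for p in points]
    heights = [p[2] for p in points]
    width = max(along) - min(along)
    height = max(heights) - min(heights)
    return {"area": width * height, "verticality": verticality,
            "width": width, "height": height}

def is_facade(features, thresholds):
    """A segment is a facade only if every feature exceeds its threshold."""
    return all(features[name] > thresholds[name] for name in thresholds)
```

Requiring all four features to pass keeps small vertical planes, such as traffic sign boards, from being labelled as façades.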

Tree detection:
In the previous step, buildings are removed from the above ground components. Next, trees are identified. Trees can be extracted based on multiple return information (Rutzinger et al., 2010). In our study, the percentage of first-return points in an above ground component is used as a feature to detect trees. If the percentage is smaller than a threshold, the component is labelled as a tree. This threshold is fine-tuned to be 0.95. We use this method because the other methods, which are based on eigenvalue features, strongly rely on the quality of the point cloud.
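
The first-return test can be written in a few lines. The sketch below assumes that every point carries a return number and that the value 1 denotes the first return; the function names are hypothetical.

```python
def first_return_fraction(return_numbers):
    """Fraction of points in a component that are first returns."""
    first = sum(1 for r in return_numbers if r == 1)
    return first / len(return_numbers)

def is_tree(return_numbers, threshold=0.95):
    """Label a component as a tree if too few of its points are first returns."""
    return first_return_fraction(return_numbers) < threshold
```

A solid pole produces almost exclusively first returns, whereas the laser pulse is split by foliage and branches, which lowers the fraction for trees.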

Pole-like road furniture detection
After buildings and trees are detected and removed, pole-like road furniture is extracted from the above ground components in two steps. The first step is to identify pole-like road furniture candidates by using a slicing based method. Then the detected pole-like road furniture candidates are checked with cylinder masking.
In order to detect pole-like road furniture, we first cut above ground components into horizontal slices. Then slices with small diameters are selected and their centre points are calculated. A 2D connected component analysis is performed on the centre points of these selected slices. The number of centre points in every connected component is checked. If the number is larger than a threshold, the object is labelled as pole-like road furniture. The threshold is derived from the height of the slices and the length of the shortest pole.
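
The slicing check can be sketched as below for a single above ground component. The sketch makes several assumptions that are not taken from the paper: the slice diameter is approximated by the larger side of the horizontal bounding box, each height interval yields one slice, the numeric defaults are hypothetical, and the minimum slice count is computed as the shortest pole length divided by the slice height.

```python
import math
from collections import defaultdict

def pole_candidates(points, slice_height=0.2, max_diameter=0.4,
                    link_dist=0.2, min_pole_length=0.5):
    """Return groups of slice centres that are stacked like a pole."""
    slices = defaultdict(list)
    for x, y, z in points:
        slices[math.floor(z / slice_height)].append((x, y))
    # Keep the centres of thin slices only.
    centres = []
    for pts in slices.values():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        diameter = max(max(xs) - min(xs), max(ys) - min(ys))
        if diameter <= max_diameter:
            centres.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    # 2D connected component analysis on the slice centres.
    groups, unvisited = [], set(range(len(centres)))
    while unvisited:
        stack, group = [unvisited.pop()], []
        while stack:
            i = stack.pop()
            group.append(centres[i])
            linked = [j for j in unvisited
                      if math.dist(centres[i], centres[j]) <= link_dist]
            unvisited.difference_update(linked)
            stack.extend(linked)
        groups.append(group)
    # Assumed threshold: enough slices to cover the shortest pole.
    min_slices = math.ceil(min_pole_length / slice_height)
    return [g for g in groups if len(g) >= min_slices]
```

Only the count of thin, vertically aligned slices matters here, so a slice contributes equally whether it holds a handful of points or several hundred.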
After the slicing analysis, above ground components other than road furniture, such as undetected trees, still remain. To eliminate these objects, two coaxial cylinders are constructed as shown in Figure 2. The ratio of the number of points inside the inner cylinder C1 to the number of points inside the cylinder C2 is the discriminative feature used to decide whether these remaining above ground components are pole-like road furniture or not. For pole-like road furniture, this ratio is high. In contrast, for trees, which have many branches, this ratio is smaller. Therefore, the remaining trees can be eliminated. This step is similar to the cylinder masking described in Brenner et al. (2009) and Lehtomäki et al. (2010). The difference is that instead of using pole-like clusters retrieved from the profile information, we use connected slices to fit two coaxial cylinders. In this paper, r1 is set to be the median width of these connected slices, and r2 is set to be r1 + 0.5.
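
The ratio used for cylinder masking can be sketched as follows. The sketch assumes a vertical axis placed at the mean of the connected slice centres, counts the points of C1 as part of C2, and expects the caller to pass only the points within the height range of the connected slices. The function names are hypothetical.

```python
import math
from statistics import median

def fit_axis(slice_centres):
    """Assumed axis position: mean of the connected slice centres (x, y)."""
    n = len(slice_centres)
    return (sum(c[0] for c in slice_centres) / n,
            sum(c[1] for c in slice_centres) / n)

def inner_radius(slice_widths):
    """r1 is the median width of the connected slices, as stated in the text."""
    return median(slice_widths)

def cylinder_mask_ratio(points, axis_xy, r1, margin=0.5):
    """Ratio of points inside C1 (radius r1) to points inside C2 (radius r1 + margin)."""
    r2 = r1 + margin
    inner = outer = 0
    for x, y, _ in points:
        d = math.hypot(x - axis_xy[0], y - axis_xy[1])
        if d <= r2:
            outer += 1
            if d <= r1:
                inner += 1
    return inner / outer if outer else 0.0
```

A bare pole gives a ratio close to 1, while branches that fall between the two cylinders pull the ratio down. The decision threshold on the ratio is left open here because its value is not stated in the text.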
This step does not rely on point density or on the evenness of the point distribution, because we use the number of slices and 2D connected component analysis to detect the slices which belong to poles. Neither the slice count nor the cylinder masking depends strongly on point density or on the evenness of the point distribution. Therefore, high quality data is not needed for this method.

Occlusion analysis
In the previous step, some objects inside buildings are misdetected as pole-like road furniture. One important clue is that they are located behind detected building façades. According to the relative locations of these segments, the trajectory and the detected building façades, we exclude the segments behind the building façades.
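
One possible way to express the "behind a façade" test is a 2D line-of-sight check, sketched below. This is an assumed formulation rather than a description of our implementation: façades are reduced to footprint segments in the horizontal plane, and a candidate is considered occluded if the line from its nearest trajectory point to the candidate crosses any footprint.

```python
def _orient(a, b, c):
    """Signed area test: positive if c lies to the left of the line a -> b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p, q, a, b):
    """True if segment p-q properly crosses segment a-b (touching is ignored)."""
    d1, d2 = _orient(p, q, a), _orient(p, q, b)
    d3, d4 = _orient(a, b, p), _orient(a, b, q)
    return d1 * d2 < 0 and d3 * d4 < 0

def is_behind_facade(candidate_xy, trajectory_xy, facades):
    """facades: list of footprint segments ((x1, y1), (x2, y2))."""
    nearest = min(
        trajectory_xy,
        key=lambda t: (t[0] - candidate_xy[0]) ** 2 + (t[1] - candidate_xy[1]) ** 2,
    )
    return any(segments_cross(nearest, candidate_xy, a, b) for a, b in facades)
```

Candidates for which `is_behind_facade` returns `True` would be excluded from the final result.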

EXPERIMENTAL RESULTS
In Section 4.1, two test sites are described to evaluate the performance and reliability of our algorithm. We analyse the experimental results in Section 4.2.

Test sites
To evaluate the performance of our framework, two test sites are selected. These two datasets were collected in different countries. Moreover, the scanning geometry of these two datasets is also different. Dataset A is the Paris benchmark dataset, collected by the Stereopolis II system (IGN, 2013). It was collected by a time-of-flight ranging system and covers approximately 0.45 km of road scene. The point density of Dataset A ranges from 72 points per square metre to 500 points per square metre. The ratio of the distance between neighbouring points along scanlines to the distance between neighbouring scanlines ranges from 0.35 to 1.0.
Dataset B was collected in Espoo, the second largest city in Finland. It was acquired in 2009 by the ROAMER system, which consists of a Faro laser scanner and other sensors (Kukko et al., 2007; Kukko et al., 2012). A phase shift ranging system was adopted to collect Dataset B. It covers about 1.0 km of road scene. Datasets A and B were collected by different scanning systems with different scanning geometries. The point density of Dataset B ranges from 50 points per square metre to 250 points per square metre. The ratio of the distance between neighbouring points along scanlines to the distance between neighbouring scanlines ranges from 0.1 to 0.42. The point distribution of Dataset B is strongly uneven and sparse. Dataset B was collected without multi-echo information.

Results and analysis
Experiments are carried out with the two datasets introduced above. The aim is to detect pole-like road furniture higher than 0.5 metre. Experimental results are shown in Figure 3 and Figure 4. We evaluate the detection result by computing the completeness and correctness, which are provided in Table 1. The completeness of the detection is 0.86 and 0.80 in test site A and test site B, respectively. In both test sites, 87% of all detected objects are pole-like road furniture. The large segments on the right side of Figure 3 and the left side of Figure 4 are street lights connected to fences and trees. In our framework we are still able to detect connected pole-like road furniture such as street lights connected with fences. Most of the incorrect results come from connected objects. For example, in Dataset A, some pole-like road furniture is extremely close to building façades. In the left figure of Figure 5, there is a street light (red circle) close to the building façade. A road sign (red circle in the right figure of Figure 5) is connected to the building façade. The smallest distance between pole-like road furniture and a building façade in Figure 5 is only 0.01 m. It is difficult to extract such pole-like road furniture, as our algorithm groups points within 0.5 m. In Dataset B, some pole-like road furniture is connected with other objects (as indicated in the red cropped area of Figure 6). When we cut them into slices, the slices of the poles are still connected with the other objects unless we set the distance for connected component analysis to be very small. A small distance for connected component analysis nevertheless leads to fragments. Therefore, tuning the parameters of connected component analysis alone is insufficient to detect these poles.
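
For reference, the two evaluation measures can be computed as in the sketch below, assuming the usual definitions: completeness is the share of reference objects that are detected, and correctness is the share of detected objects that are truly pole-like road furniture. The counts in the usage line are purely illustrative and are not the counts of our test sites.

```python
def completeness(true_positives, false_negatives):
    """Share of reference objects that were detected."""
    return true_positives / (true_positives + false_negatives)

def correctness(true_positives, false_positives):
    """Share of detected objects that are correct."""
    return true_positives / (true_positives + false_positives)

# Illustrative counts only (hypothetical, not from Table 1).
print(round(completeness(40, 10), 2), round(correctness(40, 6), 2))
```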
Another case is that trees, building pillars and pedestrians are misdetected as pole-like road furniture. A tree with few branches in Dataset B (as shown in the right figure of Figure 6) is misdetected as pole-like road furniture. A few partially scanned pedestrians in both test sites are also misdetected as objects of interest. The reason for falsely classifying pillars and pedestrians as pole-like road furniture is that we only use the width and the ratio as features to eliminate them. Trees with few branches are not detected correctly because our method eliminates trees based on the ratio of multi-echo points and the ratio value in the cylinder masking. Such trees do not contain sufficient points with multiple returns, and their ratio value in the cylinder masking is high. These features are not enough to make bare trees, partially scanned pedestrians and building pillars discernible.
In Yang et al. (2016), the point repetition frequency used is 1100 kHz. The point repetition frequency we use is much lower, about 120 kHz. Point density in test site B ranges from 50 points per square metre to 250 points per square metre. The point density is considerably higher (1500 points per square metre) in Wang et al. (2017). The mirror frequency has a significant effect on the evenness of the point distribution: the higher the mirror frequency, the more even the point distribution. The mirror frequency of the laser scanner is above 70 Hz in Wang et al. (2017). Compared with their datasets, the point clouds used in our experiment are of lesser quality, collected at a mirror frequency of 30 Hz.

CONCLUSION AND FUTURE WORK
To conclude, our method is stable and performs consistently well on two different datasets representing different scanning geometries of mobile laser scanning systems. The completeness and correctness of the detection were 0.86 and 0.87 in test site A, and 0.80 and 0.87 in test site B. The features used in this method do not rely on high point density or evenly distributed points. This method works well in sparse and unevenly distributed mobile laser scanning data. Our algorithm can potentially be of practical use for pole-like road furniture extraction in sparse point clouds. An example of such sparse data is MLS data collected using Velodyne laser scanners, which are also used in autonomous vehicles. However, it is still difficult to separate pole-like road furniture connected to other objects such as trees and building façades. The use of imagery or multispectral LiDAR could further help in distinguishing between the objects. For example, a pole close to a façade should be detectable even with single-channel intensity when there is a large enough intensity gradient between the two. However, the use of colour information may lead to over-segmentation of single objects. How to separate objects and avoid over-segmentation by adding colour information remains to be explored in the future.

Figure 1. The flow chart of pole-like road furniture detection

Figure 3. Detected pole-like road furniture in Dataset A

Figure 5. Undetected pole-like road furniture in Dataset A.

Figure 6. Undetected pole-like road furniture and misdetected pole-like road furniture in Dataset B

Compared to the method presented in Lehtomäki et al. (2010), we are able to eliminate trees and retain the pole-like road furniture with traffic function. The detection rate of our method was 80% and the correctness was 87%, while on the same dataset their detection rate was 69.7% and their correctness was 86.7%. The detection rate is significantly improved. It should be noted that the reference data of the earlier work also contained objects other than road furniture, such as tree trunks. Compared to our result, trees and objects behind façades are not eliminated in the study by Cabo et al. (2014). Compared with the work of Li et al. (2014), our method is able to exclude falsely detected pole-like objects inside buildings. Wang et al. (2017) have achieved pole-like road furniture detection with high accuracy, approximately 95%. This, however, required high point density and point clouds of good quality. Two laser scanners are used to collect their experimental datasets, whereas only one scanner is used to collect our test datasets. Point repetition frequency strongly affects point density: normally, a higher point repetition frequency leads to a higher point density. The point repetition frequency is 1333 kHz in Wang et al. (2017).

Table 1. Detection evaluation in two test sites