ON-ROAD INFORMATION EXTRACTION FROM LIDAR DATA VIA MULTIPLE FEATURE MAPS

On-road information, including road boundaries, road markings, and road cracks, provides significant guidance or warning to all road users. Recently, on-road information extraction from LiDAR data has been widely studied. However, for LiDAR data with lower accuracy and higher noise, some detailed information, such as road boundaries, is difficult to extract correctly. Furthermore, most previous studies do not explore how to efficiently extract multiple types of on-road information within a single framework. In this paper, we propose a new framework that can simultaneously extract multiple types of on-road information from high accuracy LiDAR data and can also more robustly extract detailed road boundaries from low accuracy LiDAR data. First, we propose a Curb-Aware Ground Filter to extract ground points with rich curb structure features. Second, we transform the vertical density, elevation gradient, and intensity features of the ground points into multiple feature maps and extract multiple types of on-road information from the feature maps by employing a semantic segmentation network. Experimental results on three datasets with different data accuracy demonstrate that our method outperforms other recent competitive methods.


INTRODUCTION
On-road information, such as road boundaries, road markings, and road cracks, plays an important role in urban construction. Extracting on-road information is important in many applications, such as road maintenance (El-Halawany et al., 2012), city planning, intelligent driver assistance systems (Wen et al., 2016), high-definition (HD) maps (Ma et al., 2018), and traffic flow monitoring and prediction (Lv et al., 2015). Over the past few years, driven by the huge market for autonomous driving and intelligent cities, the LiDAR industry has developed rapidly, and the use of LiDAR data for on-road information extraction has been widely studied (Ma et al., 2018).
High accuracy mobile LiDAR sensors, such as the RIEGL VQ-450, RIEGL VUX-1HA, and Optech Lynx HS-600 Dual, can collect high accuracy point clouds. Some detailed on-road information, such as road boundaries, can be extracted from these point clouds via the structure features of curbs (Jaakkola et al., 2008, Yang et al., 2017). However, many LiDAR sensors have low accuracy, such as the Velodyne VLP-16 and HDL-64E. For LiDAR data with lower accuracy and a higher noise level, detailed on-road information is hard to extract correctly, as shown in Fig. 1. In addition, most of the related methods extract each type of on-road information separately and do not explore how to efficiently extract multiple types of on-road information within a single framework.
In this paper, we propose a multiple feature map-based on-road information extraction framework. The keys to the framework are two-fold. (1) We propose a Curb-Aware Ground Filter. Unlike other methods that extract ground points without curb information (Zai et al., 2017, Wen et al., 2019a), the Curb-Aware Ground Filter extracts both road surface points and curb points, which provide essential structure features for robust road boundary extraction. (2) By transforming both the structure and texture features of on-road information into multiple feature maps, and employing a semantic segmentation network on the feature maps, we can simultaneously extract road boundaries, road markings, and even road cracks from high accuracy LiDAR data, and can also more robustly extract detailed road boundary information from low accuracy LiDAR data.
We conducted extensive experiments on multiple datasets, including a part of the Coastal Ring Road (CRR) (Wen et al., 2019b), Paris-Lille-3D (Roynard et al., 2018), and the KITTI odometry dataset (Geiger et al., 2012). The CRR dataset was collected by a RIEGL VQ-450 LiDAR with millimeter-level accuracy. The Paris-Lille-3D and KITTI datasets were collected by Velodyne HDL-32E and Velodyne HDL-64E LiDAR sensors, respectively, both with centimeter-level accuracy. Experimental results on the three datasets demonstrate that our method outperforms other competitive methods.

RELATED WORK
A large amount of recent work has studied on-road information extraction from LiDAR data. This work mainly consists of road boundary extraction, road marking extraction, and road crack detection, which are detailed as follows.

Road Boundary Extraction
A variety of road boundary extraction methods convert 3D point clouds into 2D geo-referenced feature (GRF) images. Jaakkola et al. first converted the point cloud into a height image and then extracted gradient information from the height image. Finally, a pixel was selected as a curb point if it had a specified number of neighboring pixels with a specified gradient (Jaakkola et al., 2008). Based on two assumptions (the distance between the road surface and the 3D trajectory is constant, and the normal direction of the road surface is parallel to the z-direction), road surfaces were extracted from GRF images, and then the boundaries of all road surfaces were generated by the alpha-shape algorithm (Yang et al., 2017).
Some road boundary extraction methods operate on LiDAR data directly. Using derivatives of the Gaussian function applied to MLS point clouds, road edges were extracted with a parametric active contour and snake model (Kumar et al., 2013). The feature map of RANSAC-based normal directions was analyzed, and then road edges and road surfaces were obtained by Kalman filters (Hervieu, Soheilian, 2013).
Other methods use both point clouds and the scanning trajectory for road boundary extraction. Wang et al. first divided the point cloud into several parts along the trajectory, and then extracted and refined the road boundary from each part. Zai et al. detected a rough road boundary via supervoxels and the alpha-shape algorithm and then extracted the road boundary by applying graph cuts on the trajectory and the rough boundary (Zai et al., 2017).
With the development of deep learning, some works have used 3D deep learning to process point cloud data. Since road edge extraction can be regarded as a classification problem, Rachmadi et al. detected road edges from point clouds using an Encoder-Decoder Convolutional Network (Rachmadi et al., 2017). Liang et al. proposed a convolutional network that takes as input overhead LiDAR and camera imagery, as well as the gradient of the LiDAR's elevation value, and generates a road boundary feature map (Liang et al., 2019).

Road Marking Extraction
Road markings are highly retro-reflective materials painted on asphalt concrete pavements. Thus, threshold-based methods have been commonly used for road marking extraction. Wang et al. applied an adaptive binary thresholding method to binary raster images, which were generated from the filtered laser points based on their reflective properties. Yang et al. proposed an adaptive block and multi-threshold method to detect road markings based on the intensity information of MLS data (Yang et al., 2018). By using the MLS trajectory to discretize LiDAR data into smaller sections, Jung et al. extracted the road surface by employing a constrained Random Sample Consensus (RANSAC) algorithm. The road surface was then rasterized into a 2D intensity image to separate the lane markings. Finally, the remaining incorrect lane markings were detected and removed through a noise filtering phase using Dip test statistics (Jung et al., 2019).
Most of the above methods rely on one or a few types of road marking features, require prior knowledge, and are prone to omissions and false extractions. Deep learning is therefore a promising way to solve these problems. Wen et al. projected 3D MLS data onto a 2D intensity image and modified the U-Net model to segment road marking pixels (Wen et al., 2019a).

Road Crack Detection
Because road cracks have low retro-reflectivity on asphalt concrete pavements, the intensity feature of point clouds can be used to detect them. Yu et al. employed the Otsu thresholding algorithm to extract crack candidates. Next, a spatial density filtering algorithm was performed for outlier removal. Finally, crack points were clustered into crack-lines, and crack skeletons were extracted by performing an L1-medial skeleton extraction algorithm (Yu et al., 2014). Guan et al. developed ITVCrack to extract pavement cracks by using the iterative tensor voting (ITV) method (Guan et al., 2015). Chen et al. first generated a Digital Terrain Model (DTM) from MLS point clouds to detect cracks. Subsequently, local height changes that may relate to cracks were detected with a high-pass filter. Finally, a Gaussian-shaped kernel was used to extract crack features (Chen, Li, 2016). Li Q et al. proposed a sparse point grouping method to detect cracks from 3D point clouds.
There are some ways to extract road cracks using deep learning. Gavilán et al. proposed a seed-based approach by combining Multiple Directional Non-Minimum Suppression (MD-NMS) with asymmetry check, to extract road cracks (Gavilán et al., 2011).

OUR METHOD
To efficiently extract on-road information from LiDAR data, we transform structure and texture features of ground points into multiple feature maps and extract road boundaries, road markings, and road cracks by employing a semantic segmentation network on the feature maps. The proposed method consists of three functional blocks: (1) ground points extraction, (2) multiple feature maps generation, and (3) on-road information extraction. The framework of our method is illustrated in Fig. 2.
We provide a detailed introduction of each block in the following sections.

Ground Points Extraction
Raw point clouds can be segmented into two parts. One is the off-ground part, which consists of trees, buildings, poles, etc. The other is the ground part, which consists of road surfaces, boundaries, markings, cracks, and others. Filtering out off-ground points reduces the computational load and saves memory. Curb points are essential for road boundary extraction. However, most recent methods extract the road surface without curb information (Zai et al., 2017, Wen et al., 2019a). We propose a Curb-Aware Ground Filter to extract the ground points with the curb points retained. First, uninteresting points, such as moving cars in KITTI data (see Fig. 3(a)), are removed based on lateral density contrast. The lateral density of point $P_i$ is computed as follows, where $D_l(P_i)$ = lateral density of $P_i$, $I$ = indicator function, $P_i^x, P_i^y$ = coordinates of $P_i$, $N_j^x(P_i), N_j^y(P_i)$ = coordinates of the neighboring points of $P_i$, and $R$ = searching radius.
According to experimental results, if $D_l(P_i)$ is less than a quarter of the mean lateral density, $P_i$ is regarded as an uninteresting point. This step is unnecessary for high precision LiDAR data, because lateral density differences among points are tiny. Subsequently, we observe that the ground points always lie in the bottom area of the LiDAR data. Hence, in the proposed ground filter, illustrated in Fig. 3, if there is no other point between the inner radius and the outer radius (purple region), the current point is treated as a ground point. Owing to the gap between the inner radius and the outer radius, most of the curb points are retained. Finally, the ground points are clustered by Euclidean distance.
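As a minimal sketch of the lateral density filtering step, the following Python code counts, for each point, the other points lying within the searching radius R in the xy-plane, and discards points whose count is below a quarter of the mean. The counting rule (Euclidean xy-distance no larger than R), the function names `lateral_density` and `remove_sparse_points`, and the grid-hash neighbor search are assumptions made for illustration; they are not taken from the authors' implementation.

```python
import math
from collections import defaultdict


def lateral_density(points, radius):
    """For each (x, y, z) point, count the other points within `radius` in the xy-plane."""
    # Hash points into square cells of side `radius`, so that every neighbor
    # within `radius` lies in the 3x3 block of cells around the query point.
    cells = defaultdict(list)
    for idx, (x, y, _z) in enumerate(points):
        cells[(math.floor(x / radius), math.floor(y / radius))].append(idx)

    densities = []
    for i, (x, y, _z) in enumerate(points):
        cx, cy = math.floor(x / radius), math.floor(y / radius)
        count = 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j == i:
                        continue  # a point is not its own neighbor
                    px, py, _pz = points[j]
                    if math.hypot(px - x, py - y) <= radius:
                        count += 1
        densities.append(count)
    return densities


def remove_sparse_points(points, radius, ratio=0.25):
    """Keep points whose lateral density is at least `ratio` times the mean density."""
    if not points:
        return []
    densities = lateral_density(points, radius)
    mean_density = sum(densities) / len(densities)
    return [p for p, d in zip(points, densities) if d >= ratio * mean_density]
```

With the default `ratio=0.25`, the function applies the quarter-of-mean rule stated above; the searching radius is left as a parameter because its value depends on the dataset.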
The processed results of three steps on millimeter-level (CRR) and centimeter-level accuracy (KITTI) LiDAR data are shown in Table 1.

Table 2. The visualization results of the multiple feature maps on millimeter-level (CRR) and centimeter-level accuracy (KITTI) LiDAR data. Panels: Raw Data, Vertical Density, Elevation Gradient, and Intensity, for KITTI and CRR.

Multiple Feature Maps Generation
We transform the road texture and structure information into multiple feature maps. This is the key step that enables our method to simultaneously extract road boundaries, road markings, and even road cracks from high accuracy LiDAR data, and to more robustly extract detailed road boundary information from low accuracy LiDAR data. Specifically, the feature transformation is conducted from the perspectives of vertical density, elevation gradient, and intensity. All the features are projected onto the xy plane to generate multiple feature maps, so that more efficient image processing can be applied.
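The projection onto the xy plane can be sketched as a simple rasterization. The Python code below bins points into square grid cells and stores one feature value per cell. Averaging the values that fall into the same cell, filling empty cells with 0.0, and the function name `rasterize` are assumptions made for illustration; the 0.05 m default follows the projection grid size reported in the Parameters section.

```python
import math


def rasterize(points, values, cell_size=0.05):
    """Project per-point feature values onto an xy grid, averaging within each cell.

    `points` is a list of (x, y, z) tuples and `values` the matching feature values.
    Returns a 2D list indexed as grid[row][col], with rows along y and columns along x.
    """
    if not points:
        return []
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    cols = [int(math.floor((p[0] - min_x) / cell_size)) for p in points]
    rows = [int(math.floor((p[1] - min_y) / cell_size)) for p in points]
    height, width = max(rows) + 1, max(cols) + 1

    sums = [[0.0] * width for _ in range(height)]
    counts = [[0] * width for _ in range(height)]
    for r, c, v in zip(rows, cols, values):
        sums[r][c] += v
        counts[r][c] += 1

    # Empty cells stay at 0.0 (an assumption of this sketch).
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0 for c in range(width)]
        for r in range(height)
    ]
```

Calling `rasterize` once per feature (vertical density, elevation gradient, intensity) with the same points and cell size yields three aligned maps that can be stacked as input channels.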
The first process is to compute the vertical density of the ground points. The vertical density of point $P_i$ is defined as follows, where $D_v(P_i)$ = vertical density of $P_i$, $N_j(P_i)$ = vertical neighbor points of $P_i$, and $R$ = searching radius.
Then the elevation gradient is calculated by a differential filter (Fig. 3(c)), which consists of a convolution over the height values of neighboring grid cells (Hata et al., 2014). To eliminate the effect of density, we improve the method as follows, where $G(P_i)$ = elevation gradient of $P_i$, $C_x, C_y$ = convolution results in the x- and y-axis directions, and $n_x, n_y$ = numbers of points in the neighboring cells in the x- and y-axis directions.
As in most previous methods (Wen et al., 2019a), the intensity information $I(P_i)$ is projected onto the horizontal plane directly.
Finally, to accelerate the convergence of the semantic segmentation network (see Sec. 3.3), all the feature maps are enhanced by a gamma transform (Eq. 4), and all the features are normalized to [0,1]:

$$I = c\,r^{\gamma} \qquad (4)$$

where $I$ = output feature, $c$ = constant, $r$ = input feature, and $\gamma$ = parameter of enhancement.
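The enhancement step can be illustrated with a short Python sketch of the power-law (gamma) transform I = c * r^γ applied to a feature map. Applying min-max normalization before the transform, using c = 1 so that the output stays in [0,1], and the function names `normalize` and `gamma_transform` are assumptions of this sketch.

```python
def normalize(feature_map):
    """Min-max normalize a 2D feature map (list of rows) to the range [0, 1]."""
    flat = [v for row in feature_map for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in feature_map]
    return [[(v - lo) / span for v in row] for row in feature_map]


def gamma_transform(feature_map, gamma=2.0, c=1.0):
    """Apply I = c * r**gamma element-wise; inputs are expected to lie in [0, 1]."""
    return [[c * (r ** gamma) for r in row] for row in feature_map]


# Example: normalize first, then enhance with gamma = 2.
raw = [[0.0, 5.0], [10.0, 2.5]]
enhanced = gamma_transform(normalize(raw), gamma=2.0)
```

For inputs in [0,1], a gamma value above 1 pushes low and mid-range values toward 0, while a value below 1 lifts them toward 1.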
The visualization results of the three features on millimeter-level (CRR) and centimeter-level accuracy (KITTI) LiDAR data are shown in Table 2.

On-Road Information Extraction
After the multiple feature maps generation, we obtain three feature maps. Every pixel in the feature maps could be a road boundary point, a road marking point, a road crack point, or another point. Therefore, we treat the on-road information (road boundaries, road markings, road cracks) extraction task as a four-class classification problem. We employ the Attention U-Net semantic segmentation network (Oktay et al., 2018) to classify every pixel. Compared with U-Net (Ronneberger et al., 2015), Attention U-Net pays more attention to details owing to its added attention mechanism; the details are available in (Oktay et al., 2018).

Table 4. Road marking extraction results on the CRR and Paris-Lille-3D datasets. Our method achieves the best performance on both datasets compared with other competitive methods.

After semantic segmentation, we obtain on-road information points, including road boundaries, road markings, and road cracks. All the on-road information is converted back to point clouds by referring to the previously extracted ground points. The ground point closest to the pixel point in the vertical direction is determined as an on-road information point.

Datasets and Evaluation Metrics
To verify the effectiveness of our proposed method, we conducted experiments on three datasets: a part of the Coastal Ring Road (CRR) (Wen et al., 2019b), Paris-Lille-3D (Roynard et al., 2018), and the KITTI odometry dataset (Geiger et al., 2012). The CRR dataset was collected by a RIEGL VMX-450 system, which is equipped with a RIEGL VQ-450 laser scanner. Because of the very high absolute accuracy of 5 mm and the high measuring rates, the collected point clouds are dense, accurate, and feature-rich. The Paris-Lille-3D dataset was collected by L3D2, an MLS prototype of the center for robotics of Mines ParisTech, which is equipped with a GPS (Novatel FlexPak 6), an IMU (Ixsea PHINS in LANDINS mode), and a Velodyne HDL-32E LiDAR. Due to the lower LiDAR accuracy, the road cracks in the Paris-Lille-3D dataset are hard to see even for a human. The last one is the KITTI odometry dataset, which was collected by an MLS system equipped with a Velodyne HDL-64E LiDAR and an OXTS RT3003 inertial and GPS navigation system. The point clouds provided by the KITTI dataset are noisy and have lower accuracy, so the road markings and road cracks are hard to distinguish. Therefore, we only conduct road boundary extraction on this dataset.
The visualization results of extracted on-road information from three datasets are illustrated in Fig. 4.
To compare the performance of our method with other methods, we adopt three widely used metrics (Zai et al., 2017): completeness, correctness, and quality, defined in Eq. 5:

$$\text{completeness} = \frac{TP}{TP + FN}, \quad \text{correctness} = \frac{TP}{TP + FP}, \quad \text{quality} = \frac{TP}{TP + FP + FN} \qquad (5)$$

where TP, FP, and FN refer to true positives, false positives, and false negatives, respectively. Note that all of the evaluations are conducted on the 2D xy plane, and the width of the road boundary is not considered.
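For reference, the three metrics can be computed directly from the counts. The sketch below assumes the standard definitions completeness = TP / (TP + FN), correctness = TP / (TP + FP), and quality = TP / (TP + FP + FN); the function name `extraction_metrics` and the convention of returning 0.0 for an empty denominator are illustrative choices.

```python
def extraction_metrics(tp, fp, fn):
    """Return (completeness, correctness, quality) from TP, FP, and FN counts."""
    completeness = tp / (tp + fn) if (tp + fn) else 0.0
    correctness = tp / (tp + fp) if (tp + fp) else 0.0
    quality = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return completeness, correctness, quality


# Example with hypothetical counts: 80 true positives, 10 false positives, 20 false negatives.
completeness, correctness, quality = extraction_metrics(80, 10, 20)
```

Quality is never larger than either completeness or correctness, since its denominator includes both error types.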

Experimental Results
Road Boundaries. To evaluate the effectiveness of our proposed method on road boundary extraction, we reproduced three previous methods (Jaakkola et al., 2008, Yang et al., 2017, Zai et al., 2017). The comparison results on the three datasets are given in Table 3.
Because the proposed method has a strong capacity to encode the structure and context features of road boundaries, it achieves the best correctness and quality on the three datasets. In particular, our method outperforms the other methods by a large margin on the KITTI dataset, which demonstrates its effectiveness.
Road Markings and Road Cracks. The extraction of road markings mostly depends on the performance of the segmentation network in our method. Instead of applying U-Net as the segmentation network (Wen et al., 2019a), we employ Attention U-Net, which pays more attention to the extraction of details. As a result, our method outperforms the other methods on both datasets. The results of road marking extraction are given in Table 4.
Through the proposed multiple feature maps, road cracks can also be detected from high accuracy LiDAR data. The results are shown in Table 5.

Parameters
To reduce the computing time, down-sampling of the dense point cloud is necessary. However, reducing the number of points means losing point cloud features. In our method, every point is projected onto the corresponding grid cell, and only one point in each grid cell is defined as a boundary point. Therefore, if a grid cell contains points, we should make sure that it still contains at least one point after down-sampling. We preprocessed the high precision LiDAR point cloud by voxel grid-based down-sampling with a threshold of 0.03 m, and the grid size of the projection is 0.05 m.
Through observation, the curb height in the CRR dataset is about 0.15 m. Therefore, the inner radius of the ground filter is set between 0 m and 0.15 m. The outer radius should be larger than the inner radius. The convolution grid size of the height gradient is determined by the inner radius. The estimated curb height of the KITTI point cloud is between 0.2 m and 0.25 m, allowing for errors of 0.05 m. Thus, the inner radius and outer radius of the ground filter are set to larger values. In the feature enhancement step, when the gamma value is less than 1, the area with lower values in the feature maps is enhanced, and the area with higher values is compressed. When the gamma value is greater than 1, the area with higher values in the feature maps is enhanced, and the area with lower values is compressed. In our experiments, we set the gamma value to 2 to improve the contrast of the feature maps. The parameters of our method are shown in Table 6.

Table 6. Parameters of experiments, where PL3D denotes the Paris-Lille-3D dataset. Gamma value of feature enhancement: 2.000 for all three datasets.
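Voxel grid-based down-sampling can be sketched as follows. The Python code groups points by the voxel they fall into and keeps one representative per voxel. Treating the 0.03 m threshold as the voxel edge length, using the centroid as the representative, and the function name `voxel_downsample` are assumptions of this sketch rather than details given in the paper.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size=0.03):
    """Replace all (x, y, z) points that share a voxel with their centroid."""
    buckets = defaultdict(list)
    for x, y, z in points:
        key = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        buckets[key].append((x, y, z))
    # zip(*pts) regroups the points of one voxel into per-axis coordinate tuples.
    return [
        tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for pts in buckets.values()
    ]
```

Choosing a voxel size smaller than the projection grid size keeps the down-sampled cloud denser than the raster it is later projected onto.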

Failure Cases Analysis
The proposed method works well on most of the LiDAR data. Nevertheless, due to the changeable environment and the limitations of LiDAR performance, there are also failure cases, some of which are shown in Fig. 4. Because of irregular variations in intensity, some road markings are falsely extracted, such as in areas (a), (b), and (c). There are also some inaccurately extracted road boundaries due to occlusion and unclear curbs, as shown in (d) and (e). By increasing the LiDAR scanning accuracy or introducing other supporting information, such on-road information could be extracted correctly.

CONCLUSIONS
In this paper, we proposed a new on-road information extraction framework. First, a Curb-Aware Ground Filter is developed to extract ground points with curb information. Second, we transform the structure and texture features of the extracted ground points into multiple feature maps and employ a deep learning network to robustly extract multiple types of on-road information from the feature maps. The experimental results on three datasets demonstrate the effectiveness of our proposed method.