AUTOMATIC DETECTION AND CHARACTERIZATION OF GROUND OCCLUSIONS IN URBAN POINT CLOUDS FROM MOBILE LASER SCANNING DATA

Occlusions cause serious problems that reduce the applicability of numerous algorithms. The aim of this work is to detect and characterize urban ground gaps based on the occluding object. The input point clouds have been acquired with Mobile Laser Scanning (MLS) and have been previously segmented into ground, buildings and objects, which have been classified. The method generates several raster images from the segmented point cloud elements and detects gaps within the ground based on their connectivity and the application of the hit-or-miss transform. The method has been tested in four real case studies in the cities of Vigo and Paris, and an accuracy of 99.6% has been obtained in occlusion detection and labelling. Cars caused 80.6% of the occlusions. Each car occluded an average ground area of 11.9 m². The proposed method makes it possible to know the percentage of occluded ground, and whether it would be reduced in successive multi-temporal acquisitions, based on the mobility characteristics of each object class.


INTRODUCTION
Occlusions are one of the main limitations of point clouds (Friedman and Stamos, 2012). Occlusions imply a lack of data and therefore a lack of knowledge about empty areas. Depending on their number and importance, occlusions can affect automatic algorithms that have not been designed robustly with this problem in mind. In object classification and recognition, occlusions distort feature extraction (Papazov and Burschka, 2010; Xu et al., 2017). In segmentation, occlusions break the continuity of structures and objects in space, producing over-segmentation. In registration, occlusions hinder the use of well-known algorithms such as ICP (Liu et al., 2012). When occlusions are located at the border of the point cloud, they make it difficult to know the contour of the acquired area. In general, depending on their location and size, occlusions cause undesirable behaviour in the algorithms, which may even render certain case studies unusable.
The assessment of whether a cloud is suitable, depending on the size and location of the occlusions, is a visual task. This can be done prior to the treatment of the cloud, or later, when processing errors are found in the result. Both options imply a cost in terms of hours invested by a human observer and also, in the second option, an unproductive processing time.
The existence of occlusions is closely related to the acquisition method and the experience of the people responsible for taking the data. In terrestrial laser scanning (TLS), several acquisitions are taken from different locations to eliminate occlusions. These point clouds, each with different occlusions, need to be referenced and merged together to generate a more complete point cloud that minimizes the occluded area (Chen and Yang, 2016). Although planning for TLS data collection is usually a manual process, there are also methods to automate it (Frías et al., 2019). In Mobile Laser Scanning (MLS), occlusions are reduced by the continuous displacement of the laser, although they do not disappear. This is because the MLS can only move along roads and must follow traffic regulations, so its freedom of movement is limited. In cities, objects near the MLS trajectory, such as cars and trees, cause important occlusions on façades and on the ground. Specific ground elements that are very relevant for pedestrian navigation, such as sidewalks, can even be completely hidden by occlusions (Balado et al., 2019a).
To minimize occlusions in urban environments, the use of multi-temporal MLS acquisition is common. Multi-temporal acquisition minimizes occlusions generated by dynamic objects (Schachtschneider et al., 2017), but occlusions caused by static objects remain. In addition, certain dynamic objects, such as parked vehicles, behave as static objects in the scene. Given the lack of parking places in cities, these have a high occupancy rate, so even if the vehicles change, the parking space remains occupied. Therefore, successive multi-temporal acquisitions do not always ensure that the occluded area is reduced, while they do multiply the acquisition costs.
There are different methods to complete occlusions in point clouds without acquiring more data. The most common are based on the same principles used in image processing (Arias et al., 2011; Buyssens et al., 2015); therefore, for application to 3D vector data, point clouds are transformed into depth images (Doria and Radke, 2012; Salamanca et al., 2008). However, these methods focus on very small study areas with a limited number of objects. Other authors rely on mathematical morphology (Balado et al., 2019a) or interpolation (Serna and Marcotegui, 2013) to complete urban ground. In order to obtain a complete model, the point cloud of façades can be combined and completed with pre-generated models called SmartBoxes (Nan et al., 2010). In (Feng et al., 2020), the shape of the buildings is completed on the basis of the topological relationship with superimposed trees. For applications, it is important to flag such interpolated data, as it has no intensity properties and no free line-of-sight between the points and the location of the laser emitter. Moreover, it cannot be guaranteed that data generated with the above-mentioned approaches coincides with the true data.
Very few works specialize in detecting occlusions and in finding which object generated them. This information is very relevant, since the size and shape of the occlusion are given by the object that generates it (Zhang et al., 2019). Occlusions in point clouds appear as an absence of points on the surfaces that form the point cloud, but these empty surfaces correspond to a volume that is hidden from the MLS view behind another object. Based on this premise, occluded areas can be detected by means of a visibility analysis, knowing the location of the laser emitter and the hypothetical occluded surface (Bonde et al., 2014; de Oliveira et al., 2018; Habib et al., 2009; Huang et al., 2017). Visibility analysis, especially in 3D, is a computationally expensive technique limited by the number of points (González de Santos et al., 2018).
The aim of this work is to design a method to detect automatically the existing ground occlusions of urban point clouds acquired with MLS, and to associate each one to the object class that caused it. In this way, it is possible to automatically evaluate the occluded ground surface, obtain statistical values of the size of the occlusions caused by each object, and efficiently plan successive multi-temporal acquisitions based on static, dynamic and temporary static objects. To the best of the authors' knowledge, no other works have been found that address this problem. This work starts from a segmented and classified urban street point cloud, and the method focuses on image processing techniques after rasterizing the point cloud, without the need to employ 3D visibility analysis.
The rest of this paper is organized as follows. Section 2 explains how the input data is generated. Section 3 presents the designed method. Section 4 is devoted to analysing the results. Finally, Section 5 concludes this work.

OVERVIEW OF SEGMENTATION AND OBJECT CLASSIFICATION OF URBAN POINT CLOUDS
The proposed method employs as input data a point cloud of an urban street, sectioned to contain only one line of façades. The point cloud must be segmented into ground, building façades and objects. Objects must be classified into the most common urban classes: cars, motorbikes, vegetation, pole-like objects, pedestrians, waste-bins, and others. Segmentation and classification have been treated extensively by many authors (Babahajiani et al., 2015; Balado et al., 2019b; Börcs et al., 2017; Roynard et al., 2018; Serna and Marcotegui, 2014; Soilán et al., 2019; Weinmann et al., 2015).
In this work, the method designed by (Balado et al., 2020) for segmentation and classification of urban objects is implemented. The method consists of the following processes. First, the point cloud is segmented into cross sections along the MLS trajectory. Second, ground and façade planes are detected for each cross section, and the points belonging to them are labelled. Third, the remaining points are considered as objects and are individualized by means of connected components. Fourth, objects are transformed into images and classified by means of an InceptionV3 network (Szegedy et al., 2016). The Convolutional Neural Network (CNN) was trained with a training set composed of 90% of images obtained from online sources and 10% of images obtained from point clouds, and a validation set composed of images obtained from point clouds. These four processing steps cover segmentation and classification. The classification reached an accuracy of 86%; the errors were corrected manually so that they would not influence the method proposed in this work. Finally, all previously segmented points are merged to generate the input data, a matrix of the form [X Y Z L], where XYZ are the 3D coordinates and L is the label: ground, building façade, cars, motorbikes, vegetation, pole-like objects, pedestrians, waste-bins, and others.

METHOD
The proposed method is based on the superposition and processing of raster images generated separately from the point clouds of ground, building façades and objects. The method is composed of three main processes: first, the processing of the clouds to eliminate the points not relevant for rasterization; second, the detection and individualization of gaps; and third, the assignment of the corresponding label to each gap.

Point cloud processing and rasterization
Not all points are candidates for generating occlusions on the ground. Only the points between the MLS and the ground can produce occlusions. In a cross-section view, the relevant points are located under a line l drawn from the location of the laser emitter to the height of the occluding objects and extended beyond them (Figure 1). This line l cannot be calculated without the MLS trajectory. For analysing ground occlusions, points with a Z coordinate under the MLS height are delimited as the Region of Interest (ROI), which improves processing time while preserving the relevant points belonging to ground, buildings and objects.
In addition, the three raster images (ground, façades and objects) are not geo-referenced, so they cannot be overlapped correctly if each one is generated directly from its own point cloud. Hence, common contour points that delimit a common processing area for the images must be added to each cloud (Figure 2). Once the point clouds have been converted into raster images, image processing techniques can be applied.
3.1.1 Calculation of ground altitude: Assuming a street without slope, where the ground is at constant altitude, the ground altitude gz is calculated as the average Z of the points of the ground point cloud. In the case of sloped streets, local ground altitudes can be estimated by segmenting the street into cross sections along the trajectory (Balado et al., 2017a).
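As an illustration of the ground altitude and ROI steps, the following minimal Python sketch averages the Z coordinate of the ground points and keeps only the points below a height threshold. The function names and the toy coordinates are hypothetical, and treating the ROI limit as gz + h (with h a height margin standing in for the MLS height) is an assumption, not a detail confirmed by the text.

```python
import numpy as np

def ground_altitude(ground_xyz):
    """Mean Z of the ground points (flat-street assumption)."""
    return float(np.mean(ground_xyz[:, 2]))

def crop_to_roi(xyz, labels, gz, h):
    """Keep points whose Z is at most gz + h (assumed ROI limit)."""
    keep = xyz[:, 2] <= gz + h
    return xyz[keep], labels[keep]

# Toy data: three ground points and two object points (label 3 is arbitrary).
ground = np.array([[0.0, 0.0, 10.0], [1.0, 0.0, 10.2], [2.0, 0.0, 9.8]])
objects = np.array([[0.5, 1.0, 10.5], [0.5, 1.0, 13.0]])
object_labels = np.array([3, 3])

gz = ground_altitude(ground)
roi_xyz, roi_labels = crop_to_roi(objects, object_labels, gz, h=2.0)
```

For a sloped street, the same two functions could be applied per cross section instead of to the whole cloud.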
3.1.4 Rasterization: The rasterization process reduces the dimensionality of the point cloud to an image, in this case along the Z axis (Balado et al., 2017b). Points are projected onto the XY plane and structured on a grid (image). Each grid value (pixel) is associated with the mode of the labels L of the points that fall on that pixel. The point clouds of ground, façades and objects are rasterized separately into three images. Two of them, the ground and façade images, are binary, and the third is a grayscale image that encodes the object labels.
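The rasterization step can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB code: the function name, the integer label codes and the use of a shared `origin` and `shape` (in place of the common contour points) to align the images are assumptions.

```python
import numpy as np

def rasterize_labels(xy, labels, origin, cell, shape):
    """Project points onto a grid; each pixel takes its most frequent label (0 = empty)."""
    cols = np.floor((xy[:, 0] - origin[0]) / cell).astype(int)
    rows = np.floor((xy[:, 1] - origin[1]) / cell).astype(int)
    inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
    image = np.zeros(shape, dtype=np.int32)
    flat = rows[inside] * shape[1] + cols[inside]   # linear pixel index per point
    lab = labels[inside]
    for pixel in np.unique(flat):
        values, counts = np.unique(lab[flat == pixel], return_counts=True)
        image.flat[pixel] = values[np.argmax(counts)]  # mode of the labels
    return image

# Three points fall in pixel (0, 0) with labels 2, 2, 5; one point in pixel (1, 2).
xy = np.array([[0.05, 0.05], [0.06, 0.04], [0.07, 0.03], [0.25, 0.15]])
labels = np.array([2, 2, 5, 7])
image = rasterize_labels(xy, labels, origin=(0.0, 0.0), cell=0.1, shape=(2, 3))
```

A binary image for ground or façades follows from the same function by giving every point the label 1.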

Detection and individualization of occlusions
The occlusion detection is based on the detection of gaps in the ground image (pixel value = 0). For this, it is necessary to know which areas of the image correspond to gaps and which areas correspond to the exterior of the case study. Since occlusions can reach the border between the ground and the façades, the façade image is used to delimit the study area (Figure 3). Once detected, occlusions must be individualized to analyse each one separately.

Generation of ground-façade image:
For correct subsequent individualisation, the continuity of gaps between the interior and the exterior of the study area must be broken. Façades can have discontinuities due to openings such as entrances and windows, or due to occlusions. To correct them, a morphological closing is applied to the façade image. Then, the closed façade image is combined with the ground image by means of the logical OR function.
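A minimal Python sketch of the closing and OR combination is given below, using `scipy.ndimage.binary_closing`. The toy images, the horizontal façade orientation and the 1×3 structuring element are assumptions chosen for illustration; the size of the element used in the paper is not stated in this section.

```python
import numpy as np
from scipy import ndimage

# Toy ground image: a strip of ground with a 2x2 gap inside it.
ground = np.zeros((7, 9), dtype=bool)
ground[2:6, 1:8] = True
ground[3:5, 3:5] = False

# Toy façade image: a horizontal façade line with a one-pixel opening.
facade = np.zeros((7, 9), dtype=bool)
facade[1, 1:8] = True
facade[1, 4] = False

# Closing along the façade direction bridges the opening.
structure = np.ones((1, 3), dtype=bool)
closed = ndimage.binary_closing(facade, structure=structure)

# Logical OR of the closed façade image and the ground image.
ground_facade = closed | ground
```

Wider openings would need a longer structuring element; the closing only bridges discontinuities shorter than the element.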

Occlusion detection:
Gaps in an image correspond to its regional minima that are not connected to the image border (Soille, 2013). Gaps are filled by applying a 2D geodesic transformation (Soille and Gratin, 1994). The gap raster image is obtained by subtracting the ground-façade image from the filled image.
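The gap detection can be sketched in Python with `scipy.ndimage.binary_fill_holes`, which fills background regions not connected to the image border. It is used here as a stand-in for the geodesic transformation cited above, and the toy image is hypothetical.

```python
import numpy as np
from scipy import ndimage

# Toy ground-façade image with one interior gap and one notch open to the exterior.
ground_facade = np.zeros((7, 9), dtype=bool)
ground_facade[1:6, 1:8] = True
ground_facade[3:5, 3:5] = False   # interior gap (occlusion)
ground_facade[5, 6] = False       # notch connected to the exterior: not a gap

# Fill every background region that does not reach the image border.
filled = ndimage.binary_fill_holes(ground_facade)

# Gap image = filled image minus original image.
gaps = filled & ~ground_facade
```

The notch is left out of the gap image because it is connected to the exterior, which is why an unbroken façade line matters for the detection.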

Occlusion individualization:
To analyse each gap separately, an identifier is associated to each occlusion through connected components (Kovalevsky, 2019). In this way, a binary image can be generated for each gap.
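As a sketch of the individualization, `scipy.ndimage.label` assigns an identifier to each connected component, from which one binary mask per gap can be extracted. The 8-connectivity and the toy gap image are assumptions; the cell size of 0.1 m is the raster grid size used in the tests.

```python
import numpy as np
from scipy import ndimage

# Toy gap image with two separate occlusions.
gaps = np.zeros((6, 8), dtype=bool)
gaps[1:3, 1:3] = True
gaps[4, 5:8] = True

# Connected components with 8-connectivity (assumption).
eight = np.ones((3, 3), dtype=bool)
labelled, n_gaps = ndimage.label(gaps, structure=eight)

# One binary image per gap, and its area in square metres.
gap_masks = [labelled == i for i in range(1, n_gaps + 1)]
cell = 0.1
areas_m2 = [float(mask.sum()) * cell * cell for mask in gap_masks]
```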

Label assignment for each occlusion
Not every object in the occlusion generates the occlusion. A distinction must be established between objects that cause the occlusion (occluding objects) and objects located in the occlusion without causing it. Objects that do not produce the occlusion are located inside it and behind the occluding object (from the MLS perspective). These objects are partially occluded by the occluding object. The occluding object overlaps the occlusion at the gap border closest to the MLS trajectory and farthest from the façade line (Figure 4). To detect the occluding objects, the hit-or-miss transform is applied to the gap raster image. Once detected, the occluding object's label is associated with the entire occluded area.

Border gap detection from occluding object:
The hit-or-miss transform detects binary shapes in an image according to a direction (Bhattacharya et al., 1995). It uses a mask M based on a 3×3 matrix in which the central element is fixed to 1 (occlusion exists), one other element, chosen according to the occlusion direction, is fixed to 0 (no occlusion), and the remaining elements are fixed to X (don't care). The direction is chosen on the raster image of the occlusion, based on the perpendicularity to the façades and with the activation of the most distant pixels (Figure 5). After applying the hit-or-miss transform, an individual binary raster image of the oriented gap border is obtained for each gap.
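The directional mask can be sketched in Python with `scipy.ndimage.binary_hit_or_miss`, where one structuring element marks the pixels that must be 1 (the centre) and a second one marks the pixel that must be 0 (the neighbour in the chosen direction); all other positions are ignored. The function name, the toy gap and the direction (1, 0), which assumes the façade along the top of the image and the MLS trajectory below it, are hypothetical; the automatic choice of direction is not reproduced here.

```python
import numpy as np
from scipy import ndimage

def oriented_border(gap, direction):
    """Gap pixels whose neighbour at offset `direction` (drow, dcol) is background."""
    hit = np.zeros((3, 3), dtype=bool)
    hit[1, 1] = True                                   # centre must be 1
    miss = np.zeros((3, 3), dtype=bool)
    miss[1 + direction[0], 1 + direction[1]] = True    # neighbour must be 0
    return ndimage.binary_hit_or_miss(gap, structure1=hit, structure2=miss)

# Toy gap: a 3x4 rectangle that does not touch the image edge.
gap = np.zeros((6, 6), dtype=bool)
gap[1:4, 1:5] = True

# Border of the gap on the side facing increasing row index.
border = oriented_border(gap, (1, 0))
```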

Label assignment to occlusion area:
The oriented-border image of each gap is multiplied pixel by pixel by the raster image of objects. Pixels with a non-zero value after the multiplication contain the label of the occluding object. That label is then associated with the whole individualized occlusion. Finally, all labelled occlusions are merged, generating an image of all occlusions with their corresponding labels.
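A minimal Python sketch of the label assignment is shown below. The function name and the label code 1 for cars are hypothetical, and taking the most frequent label when several appear on the border is an assumption, since the method is described for a single occluding object per gap.

```python
import numpy as np

def assign_label(border, gap, objects):
    """Give the whole gap the most frequent object label found on its oriented border."""
    hits = (border * objects)[border]      # pixel-wise product, read on the border
    hits = hits[hits > 0]
    if hits.size == 0:
        return np.zeros_like(objects)      # no object on the border: gap discarded
    values, counts = np.unique(hits, return_counts=True)
    return gap * values[np.argmax(counts)]

# Toy gap (3x4 pixels) and its border on the side facing the trajectory.
gap = np.zeros((6, 6), dtype=bool)
gap[1:4, 1:5] = True
border = np.zeros((6, 6), dtype=bool)
border[3, 1:5] = True

# Object image: a car footprint (label 1, hypothetical) overlapping that border.
objects = np.zeros((6, 6), dtype=int)
objects[2:4, 1:5] = 1

labelled_gap = assign_label(border, gap, objects)
empty_case = assign_label(border, gap, np.zeros_like(objects))
```

Since individual gaps do not overlap, the per-gap results could be merged into a single image with an element-wise maximum.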

Data
The proposed method has been tested in four real case studies in the cities of Vigo (Spain) and Paris (France). Part of the data comes from the TerraMobilita Contest (Vallet et al., 2015). Dataset 1 corresponds to 50 m of a façade line on Alfonso XII street in Vigo; the point cloud has 3.5 million points and several objects, including four parked cars. Dataset 2 corresponds to 50 m of a façade line on Elduayen street in Vigo; the point cloud has 6.3 million points and a large variety of objects, including motorbikes parked on the sidewalk and small trees. Dataset 3 corresponds to 90 m of a façade line on Madame street in Paris; the point cloud has 1.5 million points, a large number of parked cars and some pedestrians. Dataset 4 corresponds to 67 m of a façade line on Madame street in Paris; the point cloud has 1 million points, a large number of motorbikes and four cars.

Results and analysis
In the tests, h = 2 m and a raster grid size of 0.1 m were used. This grid size ensures that every non-occluded cell contains ground points, since the lowest point density, 1 point every 0.05 m, is found at the intersection of sidewalk and façade (the area farthest from the MLS trajectory). The code was run in MATLAB on an Intel Core i7 CPU at 2.8 GHz with 16 GB of RAM. The processing times for the four datasets were 4.4 s, 7.7 s, 4.2 s and 4.3 s, respectively.
The results of applying the method are shown in Figure 7. In the images, it can be seen that small gaps (approximately those represented by pixels without continuity) are not assigned any class. This is a consequence of assigning labels via the gap border: if the edge resulting from the hit-or-miss transform does not coincide with any object, the gap is discarded and considered not relevant. These small occlusions can be caused by thin objects or by a local lack of acquisition density. Table 1 accounts for acquired, occluded and classified areas. On average, only 4 m² were not allocated in all datasets due to small occlusions, only 1% of the total occlusions. Table 2 lists the number of objects per class, the number detected as occluding objects, and the total area assigned to each class. In Figure 7 and Table 2, it can be seen that, even though trees and pole-like objects exist, their occlusions were not counted due to their small size. The most relevant occlusions, produced by larger objects, were detected and correctly labelled with an accuracy of 99.6%. The only error occurred in dataset 2: a gap generated by a pedestrian was assigned to the class others (bollard) because the pedestrian shadow had discontinuities that broke the continuity with the pedestrian.

Discussion
The raster resolution directly influences the results. Higher resolution allows for more accurate gap detection, as small occlusions could maintain continuity and would not be eliminated. However, a higher raster resolution also requires a higher point density in the acquisition. If the raster resolution is increased while maintaining the point density, pixel voids unrelated to occlusions may appear between the points, and the size of the occlusions would be distorted.
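The effect of resolution on empty pixels can be illustrated with a short Python sketch. It idealizes the ground as a regular grid of points spaced 0.05 m apart (the lowest density reported in the tests) over a 1 m × 1 m patch without occlusions; the function name and this regular layout are assumptions, since real MLS sampling is irregular.

```python
import numpy as np

def empty_fraction(cell, spacing=0.05, extent=1.0):
    """Fraction of raster cells with no point, for a regular point grid."""
    n = int(round(extent / spacing))
    coords = (np.arange(n) + 0.5) * spacing          # point coordinates per axis
    x, y = np.meshgrid(coords, coords)
    size = int(round(extent / cell))
    occupied = np.zeros((size, size), dtype=bool)
    rows = np.floor(y / cell).astype(int).ravel()
    cols = np.floor(x / cell).astype(int).ravel()
    occupied[rows, cols] = True
    return 1.0 - occupied.mean()

coarse = empty_fraction(0.1)    # grid size used in the tests
fine = empty_fraction(0.02)     # finer grid, same point density
```

With these assumptions, the 0.1 m grid leaves no empty cell, whereas the 0.02 m grid leaves most cells empty even though nothing is occluded.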
In addition, the gap detection depends on the existence of a line of façades to separate occlusions from the exterior of the study area. In the case of no façade line, the gap detection would not be performed correctly. An alternative would be to implement a convex hull (Feng et al., 2020; Wang et al., 2017) on the ground binary image to delimit the study area. However, urban ground is usually delimited by buildings.
With regard to the label assignment, the most relevant urban object classes were selected. However, the class cars (and motorbikes) allows a differentiation between parked cars and cars in motion (dynamic objects). This differentiation can be performed based on geometric features, as the point clouds of both are notably different (Balado et al., 2019a), or on location, as parked vehicles are usually in parking slots. In datasets 1 and 2, two moving cars were acquired. Therefore, not all vehicles can always be considered as temporary static objects. This differentiation would allow the correct calculation of the area occluded by static, dynamic and temporary static objects.
As mentioned in the introduction, most authors opt for automatic occlusion correction without focusing on what produced the occlusion. In papers where there is a detection phase, detection and correction are usually based on visibility analysis (Friedman and Stamos, 2012), for which it is necessary to know the exact 3D TLS position or MLS trajectory. In this work, the trajectory has not been necessary as input data, since this data is often not available. The MLS position has been determined with respect to the façades. In addition, while visibility analyses are performed in 3D, which implies a higher computational cost, in this work point cloud processing has been used only to remove non-relevant information. After rasterization, image processing takes advantage of well-known and optimized mathematical morphology techniques and matrix operations.
However, with respect to a visibility analysis, the proposed method has some limitations. It is only applicable to the ground, as each gap has been assumed to be an occlusion generated by an object. Although a façade image can be obtained by changing the perspective of the rasterization, façades contain numerous gaps that do not correspond to occlusions generated by objects; some gaps are occlusions generated by the façade geometry, entrances or windows. Furthermore, there may be several occluding objects per occlusion, although this has not been observed in any of the four case studies. The present method has not been designed for such a situation, and the algorithm should be improved to divide each gap according to each occluding object.

CONCLUSIONS
In this work, an automatic method for the detection and labelling of existing occlusions in the urban ground of point clouds acquired with MLS has been presented. The input of the method is a point cloud segmented into ground, building façades and objects, with the objects classified. The method has correctly detected the occlusions corresponding to the largest occluding objects and has correctly assigned 99.6% of the labels. The proposed method has made it possible to measure the occluded area per object class, confirming numerically that cars and motorbikes, when they appear in the urban scene, are the objects that cause the greatest number of occlusions, each one occluding an average ground area of 11.9 m² and 2.3 m², respectively.
As future work, the method will be adapted for application to façades and improved to divide gaps generated by several occluding objects. It will also be studied how to estimate occlusions generated by static non-occluding objects when these are part of gaps generated by parked objects, in order to estimate the percentage of each gap that would persist if temporary static objects, such as parked cars, were removed.
made of the information it contains. The statements made herein are solely the responsibility of the authors.
The authors would like to thank Ordnance Survey GB (https://www.ordnancesurvey.co.uk) and 1Spatial (https://1spatial.com/) for sponsoring the publication of this paper.