SEGMENTATION OF HERITAGE BUILDING BY MEANS OF GEOMETRIC AND RADIOMETRIC COMPONENTS FROM TERRESTRIAL LASER SCANNING

Nowadays, the terrestrial laser scanning represents an integral source of data for cultural heritage 3D storage and access through digital communication tools. The achievement of 3D models requires the implementation of several tasks such as segmentation. Segmentation is the key step during the point cloud processing where all homogeneous areas are identified, which describe a building facade. Usually, a large part of the segmentation approach focuses on the geometric information contained in the point cloud data by exploiting mathematical representation of a parametric surface. However, due to the complexity of the architecture, such segmentation does not suffice. Henceforth, other approaches turn to the use of color and laser intensity components. Although a variety of algorithms have been developed in this sense, problems of over-segmentation or under-segmentation are observed. In this context, we propose a new approach for point cloud segmentation aiming at a more accurate result. This approach relies on all the components of a colored point --both geometric and radiometric-combining the RGB values, laser intensity and geometric data. Our process begins with the extraction of homogeneous planar segments using the RANSAC algorithm. Next, the result is subjected to a radiometric-based segmentation, first through color similarity as one of the homogeneity criteria of a region growing algorithm, then through the use of intensity similarity for segment fusion. Experiments are performed on a facade presenting an example of Moroccan classical architecture located in Casablanca’s Medina. Results show the importance of integrating all point cloud components, both geometric and radiometric.


INTRODUCTION
The rapid evolution of surveying techniques by terrestrial laser scanner enables the engineer to carry out complex projects with strict requirements of geometric accuracy, time optimization and product extraction.In Morocco, the main challenge for the Culture Ministry is about the preservation and rehabilitation of the old Medina (traditional city center), historic sites and buildings.The advent of terrestrial laser scanning (TLS) has solved the issue of recording and storing a large amount of 3D data, named point clouds.However, the amount of the data impedes their rapid processing and their direct integration in communication tools such as GIS.Also, the emergence of a variety of more user-friendly terrestrial laser scanners lengthens the time interval between the acquisition and the extraction of products.In the field of architecture and heritage, this unsatisfactory combination boosts the research and development of mathematical models and algorithms for the automation of processing tasks such as the extraction of wireframe and mesh 3D models.The 3D modelling derives from several processing steps of the point cloud.The segmentation is the main step that precedes and greatly influences the 3D modelling process.Segmentation usually means partition of space into characteristic zones with respect to homogeneity criteria.Several segmentation approaches are based on the geometric aspect, either through constraints of point co-normality and coplanarity or through the recognition of geometric shapes.When we examine a building facade, we are aware that geometrical aspect facilitates the identification of its constituting elements.This, at least, enables the segmentation of facade planes, characterized by well-defined primitives, thanks to plane recognition algorithm.However, when architectural details can be found in a same plane such as the window shutters, the geometric information is not sufficient.Figure 1 presents the former residence of General Lyautey which is a heritage building of the Casablanca city.This remarkable house dates back over 100 years.Most of the time, the window shutters are closed.The red frame drawn on figure 1 is a concrete example where a segmentation algorithm based only on geometric information will fail to discriminate the wall from the window shutters.The old residence is only one example of many buildings in the Casablanca Medina characterized by the same architecture.Our work thus combines the geometric and radiometric information derived from the TLS data in the segmentation process.We propose an algorithm in three steps.Firstly, the point cloud is segmented by means of geometric information.
The result is then analysed with the RGB color information.The

RELATED WORK
Literature identifies two categories of segmentation methods.The principle of the first category is the fusion of points into segments according to certain homogeneity criteria.The second category defines the best primitives fitting a point cloud.
The segmentation methods based on the fusion principle are limited in the case of unstructured point clouds that are point clouds with noise, outliers and different densities.Indeed, these methods depend on the validation of the identification of points susceptible to appear as noises or outliers.For example, the region growing algorithm is a process influenced by the presence of noise at the following two stages: the identification of the seed surface and growing phase (Pu and Vosselman, 2006).The method based on the clustering principle offers great flexibility in the definition of the attributes used to identify homogeneous elements.However, the clustering principle requires important computational time regarding multidimensional data (3D).This method is also sensitive to noisy data (Filin, 2002).Segmentation by means of the profiling technique, based on the fusion approach, also presents some limitations (Mapurisa and Sithole, 2012).The method is not appropriate for unstructured data with varying densities, which is the case in the real LIDAR data.
The segmentation methods based on geometric pattern recognition, is reliable even in the presence of a high proportion of noisy points.However, they show other kinds of problems.In literature, two pattern recognition algorithms are often used in the segmentation of point clouds: the Hough Transform and the RANSAC paradigm.The Hough transform is time-consuming (Borrman and al., 2011).The RANSAC approach is less efficient when points belonging to two adjacent planes are associated too early with the first defined plane (Huang & Brenner, 2011;Boulaassal, and al., 2009;Boulaassal, 2010).Moreover, in the architectural field, details cannot always be modelled into easily identifiable geometrical shapes.Besides, if some entities can be characterized by geometric properties, others are more readily distinguished by their color content (Barnea and Filin, 2013).Thus, multiple data sources including the color content should provide richer information for automatic interpretation.Hybrid segmentation approaches with geometric and radiometric components are currently timid (Pu and Vosselman, 2009;Strom and al., 2010).According to the authors, they require a careful choice of the color space and preprocessing.
In this work, a new approach is adopted for the segmentation of old Moroccan Medina buildings.This approach combines geometric and radiometric criteria, which allows us to overcome failures when using only the geometric aspect.

Study area
The city of Casablanca has a history that goes way back and has been influenced by various cultures: Roman, Phoenician, Arab, Berber, European and American.From there, we find a great diversity in the architecture around the city.Among the most popular architectural styles Art Deco, which characterize the old neighbourhood.Among these Art Deco buildings, the former residence of the Medina of Casablanca is more than 100 years old and served as home to General Lyautey during the French protectorate in Morocco in the early twentieth century.Part of the residence is now used as an office the Moroccan Labour Union (UMT).Another part is a public building, although closed to the public.

Equipment
The building has been scanned by a terrestrial laser scanner FARO Focus 3D.The acquisition mode of this scanner is based on the phase difference measurement.The scanner range reaches 120 m indoor or outdoor with low ambient light and normal incidence to a 90% reflective surface.Its accuracy is 2 mm for a range of 10 m.Its acquisition speed reaches 976 000 points per second.The wavelength is 905 nm.This system is equipped with a camera.The integrated color camera delivers 70 megapixels of photorealistic color data.
The RGB color range is coded on 8 bits [0 -255] and the laser intensity on 11 bits [-2047 -2048].Figure 2 show the appearance of point cloud by the laser intensity and RGB color level.During the registration process, we had chosen the first station as reference.As pre-processing we develop an automatic filtering based on RANSAC.To delete details that are not interesting for our study (tree, fountain, grass, etc. ...), we relied on the strength of the RANSAC algorithm in pattern recognition of the plane containing the maximum of points.This is the case of facade building (Figure 3).In the following, only points appearing in black on figure 3 are considered.

Methods
Our segmentation strategy exploits the geometric and radiometric information in three steps: extraction of the main planes using a RANSAC approach then segmentation of facade details through both a region growing and fusion processing.The first step is only based on geometry.The second one uses the RGB information and the last one exploits the laser intensity.Let us note PTS = {p 1 , p 2 … p m } the set of laser points.The first step based on the RANSAC method leads to a set of planes, ∑SP i = {SP 1 , SP 2 , …,SP n }.Each plan is processed by adopting a region growing using criteria of homogeneity, namely the RGB color similarity.Segmentation errors due to the brightness variations in RGB data are corrected owing to the laser intensity similarity.We obtain at the end of the process homogeneous surfaces in terms of point co-planarity and radiometric similarity: ∑SP ij = {SP i1 , SP i2 ,.., SP ik }.We now detail these three steps.

Extraction of main and secondary planes:
RANSAC (Random Sample Consensus) (Fischler and Bolles, 1981) is an optimization method that has proved its efficiency to recognize the geometric shape from a set of points, despite the presence of noise and outliers.The method is iterative, the recognition begins with the random sampling of a minimal number of points to estimate the parameters of the shape (plane, sphere, cylinder, etc.).The set of points at a certain distance from the model are then appointed inliers while the rest of the points are outliers.In the case of facades, the geometric shape is the plane and requires a minimum of three points for its estimation.The RANSAC approach uses the geometric components (X, Y and Z).Let us note ESS the subset of three random coplanar and non-collinear points from PTS: ESS = {p 1 , p 2 , p 3 }.The three points randomly selected are used to estimate the parameters of the mathematical model M defined as follows: where pr = {(a 1 , a 2 , a 3 , a 4 )} is the set of quadruplets corresponding to the four parameters that define a plane in the 3D space.
(a 1 , a 2 , a 3 ) = the normal vector.a 4 = the distance between plane and reference origin.a = the estimated solution from ESS. p = one point from ESS: p = (X, Y, Z) T .F M = the function describing the mathematical model: An important variable to evaluate the RANSAC paradigm result is the adequacy between the model and the other points of the cloud.Often, this fitting is expressed as a projection distance: Using this distance measurement, we can define the Consensus Set (CS).This set presents all points that are enough close to the estimated model from ESS, namely M(ESS).The authorized maximal distance to the model is the ds-threshold.The cardinal of CS represents the number of points lying on this plane: The choice of ds-parameter is discussed in part 4.
Computational time depends on the number, N, of iterations required to find the best theoretical plane (Harley and Zisserman, 2003).
where q = the probability of a point to be inliers.P = the probability of randomly selecting the sample initializing the right plane.
The N-parameter determination consists in adapting the estimation of q for each identified CS.It stops when CS gathers a maximum number of inliers.Thus, the facade main plane can be defined as the plane which contains maximum number of inliers.This plane, that is determined with three points in RANSAC process, is refined by least squares method using the whole set CS.It leads to new parameters: a' 1 , a' 2 , a' 3 and a' 4 .The randomness of ESS provides a variety of planes but certain planes have a wrong orientation compared to the facade main plane.To extract the secondary planes which are in adequate orientations to the façade main plane, we recall and implement the method presented in (Boulaassal, 2010).The method reorients the point cloud according to local new reference of adjusted main plane for recognition of the different secondary planes orientations.To define this new reference, we should identify the eigenvectors and eigenvalues of the variancecovariance matrix.The input of the variance-covariance matrix is the point cloud reduced to its average coordinates M new (Lay, 2004) (Boulaassal, 2010): with i  [1,…,Card(CS)] where Card(CS) = CS cardinal of main plane M new = point coordinate matrix reduced to the average (X av ; Y av Z av ) T .MVC 3*3 = variance-covariance matrix.
We shall thus have three directions, each of which corresponds to an eigenvector of MVC 3*3 .The degree of variation of points coordinates in three directions is proportional to the eigenvalues.In other words, the variation in the data is greater in the direction of the eigenvector associated with the largest eigenvalue and vice versa.Thus, if the eigenvalues are ordered: λ1> λ2> λ3.Then the two eigenvectors (1, 2) associated with the two largest eigenvalues, λ1   λ2, are the basis of the main plane and the third vector associated with the smallest eigenvalue (λ3) is its normal (Boulaassal, 2010).The point cloud is reoriented to (O', 1, 2, 3) reference, where O' is the reference origin.After this, extraction of secondary planes (parallel, inclined or perpendicular to the main plane) depends on added constraints to the values of a' 1 , a' 2 , a' 3 and a' 4 parameters.The extraction of main and secondary planes provides a set of planar segments ∑SP i , each segment will be subject of facade details extraction.

Extraction of facade details:
The purpose of this second step is the segmentation in each primary plane of segments corresponding to facade details.These details are contained in the same plane.Our segmentation approach is based on a region growing algorithm.This process begins with the identification of a seed point and evolves with respect to its neighbors to define a homogeneous region s i , checking a predicate P r .It relies on the verification of constraints such as the radiometric similarity or the radiometric variance of seed points.The segmentation operation SP i of SP i segment produces a set of homogeneous regions ∑SP ij defined by the following properties: The region growing algorithm is iterative and is performed in three steps.The first mission is to randomly select a seed point from the point cloud.The second step is the growing to a seed surface.The third stage is the seed surface growing to a homogeneous area.The selection of seed point in a noisy point cloud may distort the results of the region growing algorithm, especially if the seed point is confused with an outlier.The use of geometric extraction through RANSAC algorithm filters and directs the selection of seed points in the plane refined by CS.Thus, we exclude the risk to be positioned on an outlier.In our approach, the seed points are randomly selected from planes segments resulting of the previous geometric extraction.The growing of seed point to a seed surface depends on the verification of geometric criterion and radiometric homogeneity. The geometric criterion This criterion is used to find the nearest neighbors respecting radius, td, from a seed point in order to create an initial seed surface.The detected surface is accepted only if the color homogeneity criteria are satisfied.
 The color homogeneity criteria The distance between the seed point RGB color and the RGB color mean of the seed surface is small enough: Sim co < tr.Sim co is a color similarity measurement and tr is a threshold.We use the Euclidian distance in the RGB color space.So the RGB color mean of the seed surface is the gravity center of seed surface points in the RGB color space.
Its color variance (var) is small enough: var < Vr.Vr is a threshold.The variance of a seed surface should be considered along with the color similarity measurement.
where V R , V G and V B = the empirical variance of each color component of the seed surface.
Once acceptable seed surface is detected, the region growing process starts with respect of the color similarity measurement.The use of a seed surface, instead of a seed point makes the seed selection more robust.The color homogeneity criterion is the similarity measurement computed in the RGB color space, as previously defined.Here, the use of the previous geometric constraint is not significant since we cannot predict the spatial distribution of a homogeneous region.

Segment fusion:
The obtained segmentation presents some errors due to the brightness variation in RGB colors along the facade.In order to correct these errors, we propose to use the laser intensity that is not sensitive to illumination conditions since laser scanners are active systems.This phase consists in the fusion of similar segments.Here, the similarity is measured through the difference between the laser intensity averages supplied by each of the regions tested.The fusion is accepted if the difference is inferior to a threshold, f.The result is stored in the form of a matrix with Card(CS) as number of lines and 8 columns.Where Card(CS) is the cardinal of the Consensus Set of the considered main plane and the 8 columns correspond to (X, Y, Z, laser intensity, R, G, B, region index).

RESULTS
During the first step, our algorithm requires the identification, calculation or estimation of the ds-threshold that determines inliers and outliers.Its choice depends on the characteristics of the facades to be segmented.For example, in the case of walls, ds-value should depend on wall planarity, which reflects the level of perfection during the building construction.ds should also depend on modelling objectives.Indeed, the basic reconstruction of a building requires wider values.On the other hand, if the purpose is the verticality control, the choice should be more meticulous and sometimes variable from a zone to another.Our objective is limited to the original building architecture recognition.An erroneous choice of ds-value generates sub-segmentation if the value is too high; it will induce over-segmentation if the value is too small (Figure 4).The definition of an optimal threshold is difficult.Here the choice of the ds-value is heuristic in order to avoid oversegmentation and sub-segmentation.We have chosen ds=0.06m(Figure 5).
Figure 5. Recognition of main planes with ds = 0.06 m: main primary plane is the green part while the red part refers to outliers segmented into two other primary planes.
The extraction of the main facade allows the creation of a local coordinate system where the X-and Z-axes coincide with its plane and the Y-axis is normal.The analysis of the point cloud in this system leads to the extraction of the other planes (see 3.4.1). Figure 6 shows an example of other primary planes parallel to the main plane.Their normal parameters (a 2 , a 4 ) are different from zero and (a 1 , a 3 ) are close to zero.
Figure 6.Other primary planes parallel to the main primary plane (green).
In the second processing step, four thresholds are chosen: td, tr and Vr during the seed surface delineation, and tr 2 during the region growing from the seed surface.Similar to the RANSAC algorithm, the threshold identification depends on the data to be processed and objectives.For example, in our case, the following values lead to appropriate results: td = 0.20 m, tr = 30, Vr = 900 and tr 2 = 60 (Figure 7).As mentioned above, shadow and illumination conditions induce unexpected RGB variations, which cause oversegmentation.The red triangle is an example of the brightness variation.It represents an area of high brightness (Figure 2).In most cases, it will be difficult to find suitable thresholds, especially with a large variation in brightness.Furthermore, the decreasing threshold values produce remarkable defects in the segmentation result (Figure 8). Figure 8 shows the distribution of ten segments which form the main plane of the facade.For a set of points composing each segment, we calculate the average of its laser intensity.Table 1 shows the average laser intensity and the number of points resulting from segments of figure 8.If we consider the average intensities, we observe that the intensity of the wall, corresponding to segment index 1 (white concrete) is similar to several segments representing the same entity (segment indices 1, 2, 3, 5, 7, 8, 9, 10).The same remark applies for the windows and door frame (brown wood) (segment index 4).Index 6 presents noise segment.7.
This observation explains our post-processing approach, segment fusion, based on the ability of laser intensity to differentiate the different types of materials.This step requires a laser intensity similarity threshold, named f.The threshold should be chosen according to the degree of intensity variation.For these data, the threshold is f = 200.The result shows an adjustment of previous imperfections.We shall shrink the number of regions from 10 down to 3 (Figure 9 and

DISCUSSION
As a comparison, nineteen measurements of two window shutters and of the elements around an ornamental door frame have been carried out manually in the point cloud and the segmentation product.The mean difference is 4 mm and its standard deviation 6 mm, which partially shows the quality of the segmentation.The approach tested on other facades of the old Medina with architecture similar to the residence had led to similar results.It validates the reliability of the segmentation approach integrating all the components of the point cloud for 3D building modelling this specific architecture.However, the choice of the different thresholds in our approach may influence topological and radiometric factors in the results, among which: -The number of homogeneous segments obtained.
-For each segment, the number of connected components.
-The number of small sizes connected components.
-The homogeneity of color and intensity in each segment.The study of these factors will be held in future works to limit the sensitivity to parameter choice.

CONCLUSION
The wealth of information contained in a point cloud presents great opportunities in the process of heritage building segmentation especially in the old Medina.The very architecture of these buildings makes it uneasy to discriminate the different elements located in one plane.The segmentation approach we have proposed in this work benefits from the complementarity between the different components of the point clouds.Geometric information is essential in the identification of planar segments.These segments present the input of radiometric segmentation process, through adding RGB color and laser intensity criteria.RGB color data permit the classification of elements with the same geometric definition.Laser intensity solves the imperfections due to unexpected RGB color variations.In the next step, the segmentation results will be modelled in wire, mesh or geometric shape allowing the storage of the architectural models of the patrimonial buildings of Moroccan old Medina.These archives facilitate the rehabilitation of heritage sites as well as the maintenance of prominent works-of-arts in compliance with the requirements of the Culture Ministry and the urban agencies.

Figure 1 .
Figure 1.Former residence of General Lyautey.The red frame shows a surface sample where two elements are located in the same plane (wall and window shutters).

Figure 2 .
Figure 2. The appearance of point cloud according to: (a) the laser intensity level, (b) the RGB color level 3.3 Point cloud acquisition and pre-processing The point cloud was obtained using 2 stations of the laser scanner and spherical targets for local consolidation.The two point clouds have been registered together.The resulting point cloud is non-georeferenced.It is tied to the local coordinates of the scanning station chosen as reference station.During the registration process, we had chosen the first station as reference.As pre-processing we develop an automatic filtering based on RANSAC.To delete details that are not interesting for our study (tree, fountain, grass, etc. ...), we relied on the strength of the RANSAC algorithm in pattern recognition of the plane containing the maximum of points.This is the case of facade building (Figure3).In the following, only points appearing in black on figure 3 are considered.
the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-5/W1, 2013 XXIV International CIPA Symposium, 2 -6 September 2013, Strasbourg, FranceThis contribution has been peer-reviewed.The double-blind peer-review was conducted on the basis of the full paper.

Figure 3 .
Figure 3. Point cloud filtered by RANSAC Paradigm.Black colored points are those recognized as facade points and red colored points can be rejected.

Figure 4 .
Figure 4.The ds-threshold influences the results observed during the first phase of the segmentation process.(a) Subsegmentation with ds = 0.15 m, (b) Over-segmentation with ds = 0.03 m.Green-colored points are belonging to the main primary plane.

Figure 8 .
Figure 8. Brightness variation effect on the results of the segmentation: result with td = 0.2m, tr = 10, Vr = 100 and tr 2 = 40.10 segments (appearing as different colors) are found.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-5/W1, 2013 XXIV International CIPA Symposium, 2 -6 September 2013, Strasbourg, France final segmentation is obtained through the introduction of the laser intensity component.Careful attention was paid to the appropriate algorithm for each step.

Table 1 :
Result of the region growing stage by means of the threshold values in figure