A POINT-BASED LEVEL-SET APPROACH FOR THE EXTRACTION OF 3D ENTITIES FROM POINT CLOUDS – APPLICATION IN GEOMORPHOLOGICAL CONTEXT

Abstract: The commonly used level-set models have been proven efficient in delineating and detecting objects in a variety of image processing applications. Such models do not require training or labelling, need little prior information as to existing objects, and require little adaptation between scenes. Despite their great potential, their utility for point cloud processing has been limited to 2D space. In this paper, we introduce a novel level-set-based approach which extracts entities directly from the point cloud while retaining their three-dimensional point form. To do so, we adapt the level-set scheme to 2D smooth manifolds represented by unstructured points in 3D space. Surface derivatives are then computed using a local surface parametrization based on a weighted least squares approximation. This alleviates the need for a triangulated mesh and facilitates the level-set evolution within the point cloud. As a driving force, we utilise visual saliency to focus on pertinent regions. As the saliency estimation is performed pointwise, the proposed model is completely point-based, resulting in three-dimensional entities extracted by their original points. We apply the proposed method to extract geomorphological entities in two fundamentally different scenes. Such entities present a challenge to existing extraction schemes, as they are embedded within their surroundings and do not conform to closed parametric forms. The proposed approach enables the detection of various entities simultaneously, without prior knowledge of the scene, and regardless of their position. This promotes flexibility of form and provides new ways to quantitatively describe morphological phenomena and characterise their shape.


INTRODUCTION
Level-set based models are widely used as a detection and delineation framework in various image processing applications (e.g., Yang et al., 2020a; Wang et al., 2020a; Yu et al., 2021). Unlike traditional multistep approaches, which are composed of consecutive steps to achieve specific tasks, level-set approaches formulate the detection as a minimisation problem under an explicit set of assumptions. Therefore, they allow detection without training or labelling data, and only little adaptation is required for different sets of data. Such flexibility is even more pronounced with the effortless addition of constraints, phases, and a-priori information to the cost function (Wang et al., 2020b; Biswas and Hazra, 2021). Additionally, level-set methods are especially efficient in delineating the boundaries of complex topologies. This is achieved simultaneously for the entire image, without the need for prior information as to the number of entities in the scene (Wang et al., 2020b; Biswas and Hazra, 2021). For these reasons, utilisation of such a framework in 3D point cloud analysis will enable efficient and flexible detection applications. Adapted to point clouds, a point-based application will facilitate the detection of complex 3D entities without being limited to a specific plane. Furthermore, extracting the points that compose the entities, instead of a modelled version of them, will enable their characterisation without introducing biases. Being controlled by geometry, such an approach will allow detection regardless of the acquisition technique or the size of the entity.
Despite its great potential, only a few works have utilised level-set methods for the extraction of entities in point clouds. Kim and Shan (2011) extracted roof boundaries using the multiphase region-based Chan-Vese model (Vese and Chan, 2002). The authors used each element of the normal vector as a different phase in the cost function. Minimisation was defined so as to reach homogeneity of each phase inside and outside the curve. In order to segment a noisy point cloud, Awadallah et al. (2014) projected the 3D point cloud onto a 2D plane and applied the edge-based geometric active contours (GAC, Caselles et al., 1997) to delineate the existing objects. Arav and Filin (2016) proposed to use GAC with gradient vector flow (GVF, Xu and Prince, 1998) to detect sinkhole rims. In a later work, the authors proposed a region-based model to delineate geomorphological entities (Arav and Filin, 2021). Instead of using low-level features, such as curvature or normal, the authors suggested a saliency measure to assess the distinctness of a point. Seeking homogeneity in saliency values inside and outside the curve, the level-set function evolved based on a rasterized map of the computed saliency. In these approaches, however, the detected objects were still limited to the 2D domain, described by their planar boundaries.
Studies that have sought to extract 3D entities from point clouds still used two-dimensional slices of the data according to which the level-set function was developed. Tang et al. (2012) sliced the point cloud at different heights and detected the boundaries of trees at each slice using the Chan-Vese model. The detected curves were then connected to establish a 3D segmentation. Similarly, Khattak et al. (2013) extracted the three-dimensional shape of complex buildings from depth maps created from the point cloud. To that end, the authors built a series of depth maps from which a sequence of contours per slice was extracted. The contours were then connected to a 3D mesh. These practices follow the commonly used approach in level-set-based MRI segmentation (e.g., Ramudu and Babu, 2018;Khalil et al., 2020;Yang et al., 2020b). Thus, they do not capitalise on the advantages of the minute sampling of the surface, nor do they consider characteristic features of the point cloud, such as occlusions or varying resolutions within the data. Additionally, the resulting entities are meshes, which may introduce distortion to the extracted shape.
In this paper, we present a region-based level-set approach to extract entities in their 3D point form directly from the point cloud. Taking advantage of the variational form of the level-set model, the proposed method can be readily applied to extract multiple types of entities, in various scenes, regardless of the point cloud acquisition technology. The detection framework is materialised by adapting the level-set method to 2D smooth manifolds represented by unstructured points in 3D space. Surface derivatives are then computed using a local surface parametrization, based on a weighted least squares approximation. This alleviates the need for a triangulated mesh to compute the derivatives and facilitates the evolution within the point cloud. By this we transfer the level-set method from the continuous domain to point clouds. To reduce computational overhead, we employ a narrow-band approach. This way, the computational complexity scales with the number and size of the entities rather than the total size of the point cloud. The proposed method requires no specific initialisation or starting points, but is driven by visual saliency that is estimated pointwise. The extracted entities are formed by the original points that describe them. Thus, no new artefacts are introduced to the data.
We test the application of the proposed method on two demonstrative examples. The first depicts a complex cave scanned by a terrestrial laser scanner. Being a complete three-dimensional scene, we show that standard methods, let alone level-set based ones, cannot be easily applied, but require a thorough adaptation. The second dataset features a terrain disturbed by gullies and sinkholes. This dataset was scanned by an airborne laser scanner and is approximately flat. We show that the proposed model successfully enables the extraction of the entities of interest in both scenarios, without the need for any special manipulation of the data.

Detection framework
In the standard 2D form of the level-set detection framework, the zero-level-set of the function φ : Ω → ℝ defines a curve

C := {x ∈ Ω | φ(x) = 0} (1)

on a connected subset Ω ⊂ ℝ² of the 2D Cartesian plane. The level-set function φ serves only as a tool to extract the curve; its non-zero values are generally not of interest. Therefore, it is customary to choose φ as the signed distance function to the curve. The zero-level-set then delineates the inside and outside parts of Ω,

Ω = Ωin ∪ Ωout. (2)

In this context, the level-set method can be seen as an approximation method for computing the curve that minimises a certain energy functional. Starting from an initial curve, the level-set function, and thus the curve, evolves in time according to an Euler-Lagrange equation for the energy functional. Being less sensitive to noise and to the initial curve, we follow the region-based scheme proposed by Chan and Vese (2001). In this approach, we seek boundaries around regions that share a certain similarity, which can be represented by quantifiable cues (discussed further in Sec. 2.4). Let ξ̄in and ξ̄out denote the mean value of the cue in each region and |C| the length of the curve C. The energy functional is expressed by

E(C, ξ̄in, ξ̄out) = ν0 |C| + µ1 ∫_Ωin (ξ(x) − ξ̄in)² dx + µ2 ∫_Ωout (ξ(x) − ξ̄out)² dx, (3)

where ν0, µ1, µ2 are constant non-negative weighting scalars. This functional formulates the conditions for the curve we seek by means of a level-set function. The value of the energy E(C, ξ̄in, ξ̄out) depends on the curve, and its minimum is achieved when the curve is optimal. The data terms (ξ(x) − ξ̄in)² and (ξ(x) − ξ̄out)² ensure the approximation to the cue map, while the regularisation term, |C|, guarantees that the boundary between the regions has minimal length. The statistical nature of this formulation, in the form of minimum variance, implies that no preliminary smoothing is needed. In essence, it also translates into little dependence on the curve initialisation (Chan and Vese, 2001).
Solving this minimisation problem through a level-set formulation requires defining the curve C as a function of φ. We utilise the Heaviside function

H(z) = 1 if z ≥ 0, 0 if z < 0, (4)

and the Dirac delta δ = H′ to recover Ωin, Ωout, and C from the level-set function φ. The energy equation then becomes

E(φ, ξ̄in, ξ̄out) = ν0 ∫_Ω δ(φ) |∇φ| dx + µ1 ∫_Ω (ξ(x) − ξ̄in)² H(φ) dx + µ2 ∫_Ω (ξ(x) − ξ̄out)² (1 − H(φ)) dx. (5)

This functional is similar to Eq. (3), with the distinction that the energy is computed over the entire domain, while the Heaviside function controls the influence of the "inside" and "outside" terms. The definition of the Heaviside function readily enables the computation of the mean cue values inside and outside the curve,

ξ̄in = ∫_Ω ξ(x) H(φ) dx / ∫_Ω H(φ) dx,  ξ̄out = ∫_Ω ξ(x) (1 − H(φ)) dx / ∫_Ω (1 − H(φ)) dx. (6)

Practically, regions where both the Heaviside function and the Dirac delta are zero will result in zero motion, and will not propagate from the initial function. Instead, it is common to utilise an approximate, smooth form of both the Heaviside function and the Dirac delta, defined by

Hε(z) = 1/2 (1 + (2/π) arctan(z/ε)), (7)

δε(z) = Hε′(z) = ε / (π (ε² + z²)). (8)

Keeping ξ̄in, ξ̄out fixed and minimising Eq. (5) with respect to φ, we obtain the time-dependent Euler-Lagrange equation

φt = δε(φ) [ν0 div(∇φ/|∇φ|) − µ1 (ξ − ξ̄in)² + µ2 (ξ − ξ̄out)²], (9)

where φt denotes the time derivative. Eq. (9) defines the evolution of the level-set function towards the optimum; it is discretised by an explicit Euler scheme,

φ^(i+1) = φ^(i) + ∆t · φt^(i), (10)

where i is the iteration index and ∆t the temporal step-size.
One sees that Eq. (9) has a region-competition form: if the local cue ξ(x) at point x is more similar to the average of the interior, then x is assigned to the interior (the curve moves outwards), and vice versa. The solution is reached by the curves formed at the minimum, when φt = 0.
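As an illustration of the region-competition evolution (Eqs. 9-10) in the standard 2D grid setting, one iteration can be sketched as follows. This is a minimal sketch, not the authors' implementation; the arctan-based smoothing of the Heaviside function is one common choice.

```python
import numpy as np

def heaviside(z, eps=1.0):
    """Smoothed Heaviside function (one common arctan-based form)."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(z / eps))

def dirac(z, eps=1.0):
    """Smoothed Dirac delta, the derivative of the smoothed Heaviside."""
    return eps / (np.pi * (eps**2 + z**2))

def curvature(phi):
    """div(grad(phi)/|grad(phi)|) via central differences on a 2D grid."""
    py, px = np.gradient(phi)
    norm = np.sqrt(px**2 + py**2) + 1e-8
    div_y, _ = np.gradient(py / norm)
    _, div_x = np.gradient(px / norm)
    return div_x + div_y

def chan_vese_step(phi, xi, dt=0.1, nu0=1.0, mu1=1.0, mu2=1.0, eps=1.0):
    """One explicit Euler step of the region-competition evolution (Eq. 9-10):
    the cue xi competes against the mean cue inside and outside the curve."""
    h = heaviside(phi, eps)
    xi_in = (xi * h).sum() / (h.sum() + 1e-8)              # mean cue inside
    xi_out = (xi * (1 - h)).sum() / ((1 - h).sum() + 1e-8)  # mean cue outside
    phi_t = dirac(phi, eps) * (nu0 * curvature(phi)
                               - mu1 * (xi - xi_in)**2
                               + mu2 * (xi - xi_out)**2)
    return phi + dt * phi_t
```

Note that the curve itself is never tracked explicitly; only φ is updated, and the zero crossing moves accordingly.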

Adaptation to 3D point clouds
The evolution of the level-set (Eq. 9) requires the computation of derivatives of φ at each iteration. In the standard case, where Ω ⊂ ℝ² is assumed to be a planar surface, it can be discretised by a uniform Cartesian grid. This enables the use of finite differences to numerically compute the required derivatives. We now consider a two-dimensional smooth manifold Ω ⊂ ℝ³ embedded in 3D space, i.e., an arbitrary surface. Derivatives on this manifold can be estimated by finite difference approximations defined on a triangulated mesh (Sethian and Vladimirsky, 2000). Here, we aim to avoid triangulation, with its associated introduction of artefacts and computational overhead. Instead, we use a moving least squares approach (similar to Ho et al. (2005)) that allows defining surfaces in 3D space by the points that lie on them rather than by triangulation. Specifically, the surface Ω is represented by a point cloud X := {xi ∈ Ω, i = 1, . . . , n}. In order for this to be an adequate representation of the surface, we assume that the surface is sufficiently covered with points. Formally, we require that

max_{x∈Ω} min_{xi∈X} |x − xi| ≤ hmax. (11)

The cues ξ(x) are given for all x ∈ X. The level-set function φ, however, need not be computed for all points, but only for those within a certain distance from the zero-level-set, similar to classical narrow-band methods. Practically, this means that only points near a zero crossing are updated:

{x ∈ X | ∃y ∈ X : |x − y| < hmax ∧ φ(x)φ(y) ≤ 0}. (12)

This observation is essential, as otherwise processing large point clouds would be computationally infeasible.
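The narrow-band selection of Eq. (12) can be sketched directly. A brute-force pairwise search is used here for clarity; a spatial index (e.g., a k-d tree) would be used for large clouds.

```python
import numpy as np

def narrow_band(points, phi, h_max):
    """Select points near a zero crossing of phi (Eq. 12): a point belongs
    to the narrow band if some point within distance h_max carries the
    opposite sign of phi (or phi vanishes). Brute-force O(n^2) distances."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    close = dist < h_max                  # |x - y| < h_max
    opposite = np.outer(phi, phi) <= 0.0  # phi(x) * phi(y) <= 0
    return np.any(close & opposite, axis=1)
```

Only the points flagged by this mask need to be updated in each iteration, so the cost per iteration scales with the band size rather than the cloud size.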
To obtain a discretisation of the surface derivatives occurring in Eq. (9), we employ weighted least-squares approximations.
For an arbitrary function f : Ω → ℝ, the projection of f onto the two-dimensional tangent plane spanned by ti1 and ti2 at a point xi is approximated by a 2D quadratic polynomial

f̃i(x) = ai,0 + ai,1 ui + ai,2 vi + ai,3 ui² + ai,4 ui vi + ai,5 vi², (13)

where ui, vi are functions of x given by

ui(x) = ⟨x − xi, ti1⟩,  vi(x) = ⟨x − xi, ti2⟩. (14)

The coefficients ai are determined via a weighted least-squares fit over neighbouring points,

ai = argmin_a Σ_{xj∈X} wh(|xj − xi|) (f̃i(xj) − f(xj))², (15)

with wh the Wendland weighting function. Since wh vanishes outside of the interval [0, h], it suffices to consider a neighbourhood of radius h for each point. We will use h = .
Note that f̃i : ℝ³ → ℝ can be interpreted as a 3D extension of f at xi which is constant along the normal direction. This is used to define an approximate gradient operator ∇̃ by ∇̃fi := ∇f̃i, so that ∇f(xi) ≈ ∇̃fi(xi).
The gradient of the polynomial f̃i can be calculated analytically.
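The local weighted least-squares fit and its analytic gradient can be sketched as follows. The quadratic basis follows the description above; the explicit Wendland form is an assumption consistent with the text, which only states that wh vanishes outside [0, h].

```python
import numpy as np

def wendland(r, h):
    """Compactly supported weight vanishing outside [0, h]
    (assumed Wendland C2 form: (1 - r/h)^4 (4r/h + 1))."""
    q = np.clip(r / h, 0.0, 1.0)
    return (1.0 - q)**4 * (4.0 * q + 1.0)

def fit_tangent_quadratic(center, t1, t2, neighbors, f_values, h):
    """Weighted least-squares fit of a 2D quadratic in the tangent plane
    spanned by t1, t2 at `center` (Eqs. 13-15), with the assumed basis
    1, u, v, u^2, u*v, v^2. Returns the six coefficients."""
    d = neighbors - center
    u = d @ t1                      # tangent-plane coordinates (Eq. 14)
    v = d @ t2
    A = np.column_stack([np.ones_like(u), u, v, u**2, u * v, v**2])
    w = np.sqrt(wendland(np.linalg.norm(d, axis=1), h))
    a, *_ = np.linalg.lstsq(A * w[:, None], f_values * w, rcond=None)
    return a

def tangent_gradient(a):
    """Analytic gradient of the fitted polynomial at the expansion point
    (u = v = 0): simply the linear coefficients."""
    return np.array([a[1], a[2]])
```

Since the weight has compact support, only neighbours within radius h contribute, which keeps each local fit cheap.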

Distance regularisation
To avoid numerical errors or instability, classical level-set methods require redistancing to keep the level-set function φ an approximate signed distance function. For surfaces, redistancing involves computing the shortest path (i.e., the geodesic) between points and the zero-level-set (Kimmel and Sethian, 1998). Various methods exist for that task, such as the widely used fast marching algorithm (Sethian, 1996). Instead of employing an explicit redistancing step, Li et al. (2010) proposed to include a distance regularisation term directly in the level-set evolution equation. Adapting this method to the 3D case, the full level-set evolution equation is then given by

φt = δε(φ) [ν0 div(∇φ/|∇φ|) − µ1 (ξ − ξ̄in)² + µ2 (ξ − ξ̄out)²] + λ div(d(|∇φ|) ∇φ), (18)

where d : ℝ → ℝ is defined as

d(s) = 1 − 1/s. (19)

In Eq. (18), µ1, µ2 control the smoothness of the curve and ν0 constrains its length, measured in dataset units. The strength of the regularisation is determined by λ. We use µ1 = µ2 = 1 and ν0 > 0, λ > 0 in all our computations.
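A grid-based sketch of the distance-regularisation term of Li et al. (2010) follows. The single-well choice d(s) = 1 − 1/s is one of the potentials proposed in that paper and is an assumption here; it drives |∇φ| towards 1, i.e., towards a signed distance function, without explicit redistancing.

```python
import numpy as np

def distance_regularisation(phi):
    """Distance-regularisation term div(d(|grad phi|) grad phi) on a 2D grid,
    with d(s) = 1 - 1/s (assumed single-well potential of Li et al., 2010).
    For a signed distance function (|grad phi| = 1) the term vanishes."""
    py, px = np.gradient(phi)
    mag = np.sqrt(px**2 + py**2) + 1e-8
    d = 1.0 - 1.0 / mag
    div_y, _ = np.gradient(d * py)
    _, div_x = np.gradient(d * px)
    return div_x + div_y
```

In the full evolution this term is simply added to the update, weighted by λ, so that redistancing happens continuously during the iteration.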

Cues for detection
We expect good geometric cues to highlight interesting regions within a point cloud of a terrain. Such regions are characterised by a change in the surface. More specifically, the distinctness of a point or a region is estimated by measuring the deviations in both normal and curvature of a wider surround from a narrower centre (Arav and Filin, 2020). Denoting by K the neighbourhood size and by ni and κi the estimated normal and curvature at point i, respectively, the deviations are computed with the weighting function

Wij = exp(−(|xi − xj| − ρ)² / (2σ²)).

This scheme is based on a Normal distribution whose centre is at ρ, while σ defines the size of the surrounding. This way, a centre-surround operator is established, so that the surroundings at distance ρ are intensified, while the weights of close and far regions are lowered. Eventually, dn and dκ express the weighted mean difference between the point x and the surrounding surface in normal and curvature, respectively. The saliency is then given by a combination of dn and dκ, in which exponents act as a normalisation operator. The resulting saliency-based cue at each point is used for evolving the level-set function (Eq. 18).
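The centre-surround estimation can be sketched as follows. The Gaussian weighting centred at ρ follows the description above, while the exact difference measures and the exponential normalisation are illustrative assumptions; the precise forms are given in Arav and Filin (2020).

```python
import numpy as np

def centre_surround_saliency(points, normals, curvatures, rho, sigma):
    """Pointwise centre-surround distinctness sketch: neighbours at distance
    ~rho are emphasised by a Gaussian weight centred at rho. The weighted mean
    differences in normal (dn) and curvature (dk) and their exponential
    combination are illustrative assumptions, not the published formulas."""
    n = len(points)
    saliency = np.empty(n)
    for i in range(n):
        d = np.linalg.norm(points - points[i], axis=1)
        w = np.exp(-(d - rho)**2 / (2.0 * sigma**2))  # centre-surround weight
        w[i] = 0.0                                    # exclude the point itself
        w_sum = w.sum() + 1e-12
        dn = (w * np.linalg.norm(normals - normals[i], axis=1)).sum() / w_sum
        dk = (w * np.abs(curvatures - curvatures[i])).sum() / w_sum
        saliency[i] = 1.0 - np.exp(-(dn + dk))        # assumed normalisation
    return saliency
```

Because the estimation uses only the points that were actually recorded, occluded regions simply contribute nothing, in line with the behaviour reported in the cave dataset.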

RESULTS AND DISCUSSION
We show the utilisation of the proposed method to extract different types of geomorphological entities. The use of three-dimensional point clouds for the documentation and study of geomorphological processes has gained popularity in recent years. This has to do with the growing understanding that landforms are better studied in their natural 3D form and with the growing availability of 3D point cloud acquisition platforms (Sofia, 2020). The variety of geomorphological entities, forms, scenes, and sizes presents, however, a challenge to existing extraction schemes. Therefore, common approaches usually target specific landforms in heuristic and localised ways, and require the development of designated models for each scene and object (e.g., Xu et al., 2017; Wu et al., 2018; Dong et al., 2020).
Here we show that the proposed method allows a more generalised detection. We begin with the extraction of morphological features in a terrestrially scanned cave and then move to the extraction of large-scale entities from an airborne laser scan. Each dataset represents a different aspect of the proposed method: the first exemplifies the three-dimensionality of the model, while the second offers a basis for comparison, as it was previously analysed in Arav and Filin (2021). By covering different scenes, features, and acquisition techniques, we demonstrate the flexibility of the proposed method.
In all our computations, simple kNN neighbourhoods with k = 12 were used to compute the normals and tangent planes. The weighting parameters for the level-set evolution (Eq. 18) were set to µ1 = µ2 = 1.0 and λ = 0.001 throughout. The parameters h and ν0 depended on the dataset. Note that the step size ∆t in Eq. (10) was chosen based on the Courant-Friedrichs-Lewy (CFL) condition (Li et al., 2010); that is, the higher the spatial resolution, the smaller the time-step. The initial level-set function was chosen as a simple voxelization of a given size d, with the zero-level-set lying at the voxel borders. Our quantitative evaluations of the extraction were performed in reference to manually delineated entities. These were extracted by a user not associated with the research, to assess the entities extracted in Arav and Filin (2021). Two main measures were used,

precision = TP / (TP + FP),  recall = TP / (TP + FN),

where TP refers to true positives, FP to false positives, and FN to false negatives.
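For point-wise extraction, the two evaluation measures can be computed over point indices, for example:

```python
def precision_recall(extracted, reference):
    """Precision and recall of an extracted point set against a manual
    reference, with points identified by their indices in the cloud."""
    extracted, reference = set(extracted), set(reference)
    tp = len(extracted & reference)   # true positives
    fp = len(extracted - reference)   # false positives
    fn = len(reference - extracted)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```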
The model was implemented in Python 3.8, under Ubuntu 20.04, on an 11th Gen Intel® Core™ i7 CPU @ 2.80 GHz with 16 GB RAM.

Pockets, niches and blocks in a cave
The first dataset features a point cloud acquired by a terrestrial laser scanner (Riegl VZ-2000) in the Untere Traisenbacher Höhle, a small cave located on the steep northern slope of Ebenberg, Austria (lat. 47°57′, long. 15°42′, Fig. 1). Fig. (1) shows shaded representations of the cave, as acquired from a single scanning position. It can be seen that the dataset includes occlusions (marked by red rectangles). These are characteristic of terrestrial laser scanning in general, and of cave measurements in particular, adding complexity to the analysis of the data. Mapping and characterising the morphological features on the cave surface would normally require isolating the individual planes and rasterizing each one of them. Fig. (2) presents the projection of the right wall of the cave onto the yz-plane (a) and its rasterization using nearest neighbour interpolation at a 1 mm grid size (b). Since the wall is not rectangular and has occluded regions, the rasterization requires interpolation over large areas. Note that due to the 2D projection, the floor, ceiling, and the other walls cannot be represented in the rasterized version.
To allow a 2D level-set extraction, we define the cues by visual saliency. Fig. (3) shows the results when applying the Fine Grain (Wang and Dudek, 2014) approach on the rasterized wall. Here we used the implementation of OpenCV (OpenCV, 2015). It shows that salient regions lie mostly at the edges of the wall or at occluded regions. Therefore, applying the level-set method to this saliency map will delineate the interpolated areas rather than the morphological entities in the wall (e.g., pockets and niches). Moreover, this procedure would have to be repeated for the ceiling, floor, left, and rear walls, and the results would then have to be combined in a suitable manner to describe the cave. Instead, we propose to characterise the entire cave at once, without the need to divide it into sections.

Figure 3. Saliency estimation on the raster, using the Wang and Dudek (2014) method. The salient regions refer mostly to occluded areas that were interpolated in order to create the raster.
We set the minimal object size (ρ) to 0.3 m, this being the average size of the larger blocks on the ground. Fig. (4) presents the estimated saliency map. It can be seen that on the floor of the cave, blocks larger than 0.3 m were marked as salient. Pockets and niches on the walls and ceiling were also detected.
Since the saliency here is computed point-wise, no interpolation was needed, contrary to the rasterized approach. Moreover, as the estimation is carried out according to the available neighbourhood, occluded regions have little to no effect on the saliency estimation (Fig. 5). Next, we apply the proposed method for the extraction of the entities. The level-set function was initialized to a voxel grid of d = 0.1 m. The time-step was set to ∆t = 0.05, based on the spatial resolution. The majority of the development is achieved within the first 50 iterations. Then, convergence slows down and only small modifications fine-tune the extraction. Fig. (6) presents an extracted pocket and niche from the right wall. These allow morphological characterisation of elements that would otherwise be difficult to obtain. Note that the entities were extracted from the entire cave, regardless of their location within it, i.e., they can be part of the ceiling, floor, or walls. This is carried out in a single process, with an arbitrary initialization. Moreover, a full extraction was materialised even in occluded regions, since some points were recorded on the surface, satisfying Eq. (11).

Gullies and sinkholes in flat terrain
The second dataset features an airborne laser scan acquired over the Ze'elim alluvial fan at the Dead Sea, Israel (lat. 31°22′, long. 35°24′). The featured point cloud spans 480 × 375 m² at a point density of 8 pts/m². Fig. (7a) presents the analysed region. Two main gullies dissect the mudflat. These gullies fork several times, creating smaller and shallower gullies. Therefore, they vary in both width (5-9 m) and depth (2-6 m). Sixty-six sinkholes are scattered along the fan, also of varying sizes, with perimeters between 5 m and 40 m. We aim to extract the points that describe gullies and sinkholes from the point cloud. Note that common approaches target one feature per detection model, and even then focus not on the points but on polygons or polylines (e.g., Hofierka et al., 2018; Wu et al., 2018). We use a 2 m minimal object size (ρ) to estimate the saliency, which is presented in Fig. (7b). It can be seen that even relatively shallow sinkholes and gullies are marked as salient.
Fig. (8) shows the development of the level-set function, which was initialized with a voxel grid of d = 10 m. Though the current dataset inherently differs from the previous one, both in acquisition technique and in resolution, only three parameters required adaptation: h, which is given by the point cloud resolution, is set to 1.5; ν0, which constrains the curve length, is set to 0.025 to better fit the expected characteristics of the entities of interest; and the step size ∆t, which was changed to 10. It should be noted that the initialisation has little bearing on the final result; nonetheless, it affects the number of iterations required for convergence.
The final extraction is presented in Fig. (9). It can be seen that both gullies and sinkholes were extracted in full. Quantitative analysis shows 91% precision and 89% recall. The lower recall is due to missing points on the sinkhole walls, which create a discontinuity of the surface Ω and violate the assumption in Eq. (11). Additionally, sinkholes smaller than 3 m were not detected. This is probably a result of the support interval of the Wendland weighting function (Eq. 15).
A comparative analysis between the results of the proposed method and the approach of Arav and Filin (2021) shows similar detection precision (92% vs. 91% in the current work). However, the proposed method extracts the entities themselves rather than just their delineation (Fig. 9 vs. Fig. 10). This allows an in-depth morphological characterisation of the entire entity, rather than of its contours alone.

CONCLUSIONS
In this paper we presented a 3D point-based level-set approach to extract entities from point clouds. In the proposed method, we used a region-based model and a moving least squares approach to reconstruct the local surface and perform the numerical differentiation. This allowed the extraction of the original points that belong to salient entities. To the best of our knowledge, a definition of a level-set model over a 3D point cloud has not been introduced before, let alone used as an extraction approach for such datasets. The pointwise nature of the model and of its result means that no new artefacts are introduced and no preprocessing is required. Moreover, as the detection is driven by visual saliency, the entities are found simultaneously, without the need for training or labelling data, and with an arbitrary initialisation.
We demonstrated the application of the proposed methodology by extracting geomorphological entities in a complex cave and in open terrain. The proposed method allowed the extraction of various geomorphological entities in both scenes, regardless of their form, size, or position. This was done with little adaptation between datasets and reached high precision and recall rates.
It should be noted, however, that as the proposed method assumes continuous surfaces, it may still be sensitive to occlusions. Such regions are common in point cloud data, and should therefore be further investigated. In this regard, a better definition of a point neighbourhood than the simple kNN neighbourhood may better define the surface, and thus allow an improved solution. The proposed extraction method is independent of the saliency and can use other cues. These can be computed either by machine-learning tools, or with the introduction of non-geometric information (e.g., radiometric).