SMART POINT CLOUD: DEFINITION AND REMAINING CHALLENGES

Dealing with coloured point clouds acquired from terrestrial laser scanners, this paper identifies remaining challenges for a new data structure: the smart point cloud. This concept arises from the observation that massive and discretized spatial information from active remote sensing technology is often underused due to data mining limitations. The generalisation of point cloud data, combined with the heterogeneity and temporality of such datasets, is the main issue regarding structure, segmentation, classification, and interaction for an immediate understanding. We propose to use both point cloud properties and human knowledge through machine learning to rapidly extract pertinent information, using user-centered information (smart data) rather than raw data. We review feature detection, machine learning frameworks and database systems indexed both for mining queries and data visualisation. Based on existing approaches, we propose a new 3-block flexible framework around device expertise, analytic expertise and domain base reflexion. This contribution serves as the first step towards the realisation of a comprehensive smart point cloud data structure.


INTRODUCTION
The democratization of capturing devices across a growing number of industries has made point clouds a mainstream output of spatial sensors. Unmanned aerial vehicles (UAV), light detection and ranging (LiDAR) devices, terrestrial laser scanners (TLS) and depth sensors are common remote sensors or platforms within the geospatial, robotics, architecture, engineering and construction (AEC), mining, topography, cultural heritage and archaeology fields. Passive sensors such as thermal, infrared and RGB cameras have even become common tools for creating point clouds via photogrammetry and computer vision, making this technique a promising way to get quick and colour-balanced point clouds. While data acquisition methods and their applications are multiplying, point clouds suffer from several structural limitations that cause indirect exploitation through human-interpreted deliverables (e.g. meshes).
In this paper, we address the challenges of unstructured point clouds. While answering a specific need, denaturising the original dataset is time-consuming, error-prone and, most importantly, leads to a loss of crucial information that could be directly exploited. Interpreting point clouds requires specific knowledge and analytical skills in order to extract pertinent information for the end user, indicating a necessity to attach domain information through visual and semantic variables. If all the necessary information could be found within the point cloud and its attributes, and easily conveyed without the need for redundant tasks, it would become a more intelligent structure. Following the continuum defined by (Otepka et al., 2013), we hold that keeping the native point cloud rather than an interpolation is advantageous for many applications, including semantically rich systems.
Solutions can be found at different processing levels and from different approaches. However, the amount of data involved outlines new challenges in terms of integration, abstraction, structuration and algorithmic complexity. Organizing, segmenting and handling billions of points, including outliers, is no trivial task. Indeed, we need to retain only relevant observations and avoid data saturation. This implies identifying internal and external sources of influence over sensed data, then classifying and structuring point clouds through indexing schemes and database management systems (Cura et al., 2015). Yet segmentation relies on abstracting data into consistent indicators and feature descriptors which can describe the essential information, both spatial and semantic. This challenge remains highly contextual: to detect relevant objects given a specific context, one must understand which descriptors should be used to recognize an object composed of several points within a scene. Hence, classifying a subpart of an entity means determining which observations lie within an interval, defined spatially and semantically. Our work thus adopts a global vision over point cloud processing, involving many research fields that relate closely to the problems of data mining. Underlying tasks known as registration, georeferencing, segmentation, and structuration have a major influence on attaching knowledge to points. Linking spatial concepts and semantic information through collection, analysis and domain adaptation demands a highly functional data structure. But to define a new approach structuring this third type of data (one being raster, the other vector (van Oosterom et al., 2015)), we need to identify remaining challenges.
We propose a framework for the development of a new structure: the smart point cloud (SPC). This is the first contribution of a PhD research project combining laser scanning and big data management. The first objective of this paper is to examine 3D capture applications, and to outline current limitations in the workflows and the challenges that a new structure would face given the variety of capturing methods and platforms. While our proposed approach can be extended to all types of point cloud, we will illustrate it on a multisensory coloured TLS point cloud. Then, we will review feature description, segmentation and classification from both a geometrical and an artificial intelligence (AI) algorithmic point of view, delineating the automation, performance, completeness and relevance of the methods regarding the context and domain adaptation (transferring knowledge from the source to the target). Existing multi-dimensional data management systems for point clouds are described, while reviewing the best indexing methods for spatial, semantic and visual queries that allow real-time scenarios as a global need for the new structure. We will then discuss requirements for a contextually dependent SPC structure built on a topological level of detail (LoD) retaining both spatial and semantic relationships for intelligent data mining, followed by our future work in the area.

3D data capture: means and applications
Due to the rapid development of surveying and reality capture technologies, the acquisition of point clouds continues to become easier and faster while incurring lower costs. The evolution and expansion to a wider audience is driven mainly by scene understanding, where tasks like navigation, grasping or scene manipulation are essential to its applications. Image-based reconstructions prove useful in cases ranging from archaeological sites to full towns and complex architectural reconstruction, making this technique a favourable way to get quick and visually pleasant point clouds. Although precision at medium to large scales keeps improving, remote sensing via active sensors is favoured in the industry as the precision characterization is more accessible. Whether computer vision will replace light detection and ranging (LiDAR) is debated (Leberl et al., 2010), but practical cases tend towards a merging of both techniques with preferred applications for each, combining the strength of independence from natural light with low-cost and highly visual image-based reconstruction. Overall, the evolution of the technology through TLS, mobile laser scanning (MLS) and solid-state LiDAR, extended to indoor mapping using HMLS (James and Quinton, 2014), MMS systems (Thomson et al., 2013) or, more recently, MMBS (Lauterbach et al., 2015), demands massive automation and structuration to limit task-specific manual processing and storage problems (Figure 1).
Figure 1. Common TLS workflow vs smart workflow
3D data capture workflows would benefit from semantically rich point clouds. Connectivity, material properties, date stamps or even descriptions of chunk-wise groups of points can also be useful for mesh derivation. Managing highly dimensional data and heterogeneous sources therefore requires the definition of efficient features that describe the nature and properties of a point cloud sample, in order to classify and establish relations between point segments in the new data structure. However, the versatility in acquisition methodology and sensors is a first challenge that a device expertise block must address to obtain level 1 features according to the signal classification of (Otepka et al., 2013).

Sensors & device expertise
The sensor choice mainly depends on the context, the precision and the resolution that the specific application domain demands. To accurately describe a scene composed of scattered observations, extracting high-fidelity descriptors from sensors becomes essential. This process requires device expertise including application definition, data acquisition methodology, and sensor data fusion (Klein, 2004). Indeed, the combination of different sensors generating different yet complementary signatures provides relevant information without the limitations of a single sensor and creates a multisensory system. Overlaying colour on TLS point clouds is a representative example of such a practice, which allows the spatial position of an element to be correlated with its colour where the latter plays an important role. Minimizing fusion and accidental errors will, however, not correct flaws in the acquisition methodology. Indeed, as noted in (Dimitrov and Golparvar-Fard, 2015), missing or erroneous data, misadjusted density, clutter and occlusion are problems that can arise from an improper or impossible set-up on the scene. Filtering techniques and knowledge-based interpretation are possible solutions to these problems to get the most complete description of the scene. In addition, getting a high number of representative signal descriptors and parameters for each point, including through the fusion of passive sensor data, permits a richer physical and semantic description, which is important for classification and domain adaptation. Improving the quality of these descriptors through flawless acquisition, multisensory fusion, normalization and filtering is a necessary first step, but it depends on the capture context. Therefore, feature extraction needs to address calibrated signal characteristics with regard to the domain.

Domain & analytic expertise
Automation in detecting objects by grouping points that share a similarity and a decisive criterion is the basis for segmentation, and thus classification. This step is crucial since the accuracy of the subsequent processes depends on the validity of the segmentation results (Pu and Vosselman, 2009). We distinguish analytical segmentation from domain featuring. While analytical segmentation plays on geometrical properties and available information directly computed on the point cloud, domain featuring relies essentially on similarity between detected or already classified objects and the point cloud currently being processed. Combining both approaches to overcome their limitations is a major SPC perspective, thus we review recent research in the area.

Analytical and geometry featuring
A first step in automation is to detect specific shapes in the point cloud. The random sample consensus (RANSAC) and the Hough transform are widely used in shape retrieval, for instance in (Ochmann et al., 2016; Schnabel et al., 2008). Their efficiency for the 3D detection of geometrically simple parameterized shapes such as cylinders, spheres, cones, tori, planes and cubes has been proven, providing an efficient shape descriptor with insight into the geometrical properties of a point cloud sample. As they fall short for complex shapes or fully automated implementations, exploiting the richness of surface geometry through local descriptors provides a better solution. The pertinent paper of (Dimitrov and Golparvar-Fard, 2015) presents a region growing (RG) algorithm to automatically segment point clouds of AEC piping facilities. The method tackles the capture condition flaws discussed in Section 2.2. They identify two complex problems, namely the abstraction of 3D shapes and the choice of the initial scale factor, which heavily influence the offline computational need. These are research directions that domain adaptation can address. RG results rely on a smoothness constraint regarding edge-based, top-down and bottom-up surface segmentation or scan-line, as stated in (Rabbani et al., 2006), which is mainly determined through the similarity criterion conditioned by the initial threshold. Normals are powerful local or global descriptors used as a base for RG, but their representativity heavily relies on the neighbourhood selection, which encapsulates 3D points from a search definition that is commonly spherical, cylindrical, or based on the k nearest points in 2D or 3D (Weinmann et al., 2015). Results depend on the size adaptation, and therefore on the scale of the search, which will generate a feature descriptor that is either global or local. Other local feature descriptors (e.g. curvature, moments) are widely studied and used for their good fit to point cloud analysis (Rusu et al., 2008) but will not be detailed here.
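To make the RANSAC principle mentioned above concrete, the following is a minimal sketch for the simplest parameterized shape, a plane. It is not the implementation of any of the cited works: the function names, the distance threshold and the iteration count are illustrative assumptions, and only the Python standard library is used.

```python
import random


def plane_from_points(p, q, r):
    """Return ((a, b, c), d) for the plane a*x + b*y + c*z + d = 0 through
    three points, with a unit normal, or None if the points are collinear."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    # The cross product of two in-plane vectors gives the plane normal.
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (nx * nx + ny * ny + nz * nz) ** 0.5
    if norm < 1e-12:
        return None
    n = (nx / norm, ny / norm, nz / norm)
    d = -(n[0] * p[0] + n[1] * p[1] + n[2] * p[2])
    return n, d


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Hypothetical helper: repeatedly fit a plane to 3 random points and
    keep the candidate supported by the most inliers (point-to-plane
    distance <= threshold). Returns (model, inlier_indices)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate (collinear) sample, draw again
        (a, b, c), d = model
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

The threshold plays the role of the similarity criterion discussed above: it must be chosen with respect to sensor noise and point density, which is one reason such detectors are difficult to fully automate.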
The structural approach of connected component labelling, implemented by (Girardeau-Montaut et al., 2005) in CloudCompare, allows the 3D extraction of connected components based on a proximity criterion in an octree-based space. This is interesting as it could potentially extract a raw topology that could be used for further relationship determination. The work of (Douillard and Underwood, 2011) proposes a set of methods and empirically shows the advantages of extracting the ground prior to per-object segmentation. A voxel-based segmentation is then used, retaining per-voxel features (means, variance and density). The contextual approach developed is an interesting vision that could be extended through different structural searches, and gravity-based computations provide relevant relationship estimators for terrestrial applications. In this paper, we refer to this structural generalization segmentation as abstraction-based segmentation, working on global features extracted from a generalized spatial node such as a voxel or a spatially identified sub-group. Indeed, unstructured point clouds can benefit from structural properties used as part of a segmentation process, with examples in (Aijazi et al., 2013), creating "super-voxels", or (Okorn et al., 2010), extracting floor plans via 2D voxel projection for histograms.
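As an illustration of abstraction-based features, the sketch below groups points into a regular voxel grid and derives per-voxel means, variances and density, the kind of features mentioned above. It is an assumption-laden toy example rather than the method of (Douillard and Underwood, 2011): the function name, the dictionary layout and the cubic voxel size are hypothetical choices.

```python
from collections import defaultdict
from math import floor


def voxel_features(points, voxel_size):
    """Hypothetical helper: bin (x, y, z) points into cubic voxels of edge
    voxel_size and return, per voxel index, the point count, the density
    (points per unit volume), and the per-axis mean and variance."""
    cells = defaultdict(list)
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        cells[key].append((x, y, z))

    features = {}
    for key, pts in cells.items():
        n = len(pts)
        mean = tuple(sum(p[k] for p in pts) / n for k in range(3))
        # Population variance along each axis within the voxel.
        variance = tuple(sum((p[k] - mean[k]) ** 2 for p in pts) / n for k in range(3))
        features[key] = {
            "count": n,
            "density": n / voxel_size ** 3,
            "mean": mean,
            "variance": variance,
        }
    return features
```

Neighbouring voxels with similar features could then be merged, which is one possible way to obtain the generalized spatial nodes discussed above; the merge criterion itself remains context dependent.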
An important contribution to point cloud classification and feature estimation is brought by (Weinmann et al., 2015), discussing the suitability of features, which should privilege quality over quantity. This shows a need to prioritize and find robust and relevant features to address the heterogeneity in a point cloud structure. We propose to build on the classification of 3D descriptors by (Weinmann, 2016) into three categories: point attributes (sensor descriptors obtained through measurements), shape features and local features. We can extend the definition by adding structure descriptors, which include global descriptors and structure generalization through abstraction features for segmentation. While geometry-based segmentation algorithms can be implemented without retrieving information from a knowledge source or higher-level descriptors, doing so greatly limits their validity, extensibility and computational efficiency. Every analytical and geometrical descriptor could be further used as a basis on which domain expertise builds a classification/validation process: a similarity criterion extracted from acquired or available knowledge. This relates to machine learning, either supervised or unsupervised, and to conceptual inference. Learning from the data itself and from representative features is a potential solution, therefore we review recent developments and their limits.

From human to artificial intelligence
With a great deal of data to be collected and analysed, there is a great need to study Big Data practices. (Liu et al., 2015) pertinently state the problem linked with big data collection, quality and usage: the information is often incomplete, inconsistent and unreliable. A lack of validation and normalization needs to be addressed. This requires reliable data analysis, including data collection through more detailed representation and multi-scale analysis to enhance reliability and ethics reflexion. Multi-sensory systems, while providing more classifiers, should consider data fusion principles to normalize properties so that they become representative. Although quality, completeness and usage have been reviewed previously, data-driven predictions allow the validity and consistency of results to be statistically inferred. Classifiers that learn from previous or available knowledge differ in their approach, and thus in their results. They are usually categorized in three groups: supervised learning (from a set of features to labelled data), unsupervised learning (structure detection by pattern recognition) and reinforcement learning (functional inference through a set of states and actions). In this paragraph, we essentially focus on feature-based automation. (Belgiu and Drăguţ, 2016) give a good overview of how the random decision forest (RF) classifier (randomly building multiple trees in subspaces) is used. Mostly supervised, RF classifiers provide great accuracy and are able to handle high data dimensionality. However, while new approaches combining new features could improve the classification results, a feature elimination procedure can drastically increase the final accuracy, as stated by (Belgiu and Drăguţ, 2016). (Valentin et al., 2015) extended the concept of RF to Streaming Random Forests (SRF) by building a powerful machine learning routine based on feature detection while interactively labelling data in real time. This very interesting implementation is user-centered, allowing an intuitive classification to be defined based on specific user needs. (Wang et al., 2015) focus on obtaining discriminative shape features to directly describe the point cloud via unsupervised clustering prior to classification. They introduce multiscale and hierarchical point clusters (MHPCs) to extract geometric features from the point cloud, refined using the Bayesian Latent Dirichlet Allocation (LDA) model, and finally classify the point cloud into four classes (people, car, tree, building) using the AdaBoost classifier. Hierarchically clustering point groups allows information to be kept at multiple levels, giving better results than k-means, one of the most used unsupervised clustering models (mainly used for single-scale point cloud simplification). As seen, multiple classifiers or ensemble learners outperform single classifiers in terms of accuracy, making them a great tool for multisensory data. (Koppula et al., 2011) propose a 3D point cloud labelling method based on support vector machines (SVM), empirically validated on an indoor scene captured by an RGB-D sensor (Kinect). They focus on three pertinent properties (visual appearance, local shape and geometrical context) to detect nine classes in the scenes using a large-margin learning approach. The developed model is close to the Conditional Random Field (CRF) used in (Xiong et al., 2013). One identified limitation is the assumption of shapes estimated via planar patches as a primary criterion. (Niemeyer et al., 2014) proposed a context-based CRF classifier without segmentation over aerial LiDAR data to label the point cloud in four separate classes. The results show a high potential for urban classification, benefiting from contextual information such as topology, which greatly extends the scope of the included RF classifier. (Garstka and Peters, 2015) presented a reinforcement learning framework that is able to adapt the number of object classes dynamically through a finite Markov decision process. It is ongoing research that yields great preliminary results, and a potentially powerful base for inference reasoning. Other promising types of learning approach that need to be explored further for point clouds are neural networks (Soares et al., 2015) and genetic algorithms. To better classify with high automation while validating using statistical principles, machine learning is a supplement of choice to reinforce geometrical reasoning. Learning algorithms, whether supervised, unsupervised or by reinforcement, highly depend on the number and accuracy of samples and features. To get better descriptors, we can leverage available knowledge through existing databases. Any existing knowledge-representation system via ontologies would allow the use of a broad range of new inference rules.

Domain adaptation
As stated by (Tangelder and Veltkamp, 2007), "any fully-fledged system should apply as much domain knowledge as possible, in order to make shape retrieval effective". With the rise of online solutions, we have seen a great potential in using knowledge databases for classification to analogically associate shapes and groups of points with similar features. This association through analogy "is carried out by rational thinking and focuses on structural/functional similarities between two things and hence their differences. Thus, analogy helps us understand the unknown through the known and bridge gap between an image and a logical model" (Nonaka et al., 1996). This introduces the concept of data association for data mining, and of relationships between seemingly unrelated data in a relational database or other information repositories. The use of domain knowledge over point cloud data, by separating domain knowledge from operational knowledge, refers to ontologies, although knowledge-based applications do not always rely on ontology reasoning.
An important contribution is made by (Rusu et al., 2008), who turn a kitchen point cloud into a meaningful representation for robot interaction and recognition called an object map. Their algorithm includes data acquisition, geometrical mapping and functional mapping. As an end product, it produces a mesh via functional reasoning. It is interesting to note that the hierarchical representation allows segmentation to be validated and refinement constraints to be added. While the concept includes topological concepts, this paper bases functional reasoning on common-sense knowledge that is mostly hardcoded. (Lai and Fox, 2010) developed an interesting web object recognition workflow over point clouds using web data from Google warehouse and a domain adaptation framework in which they learn a set of 3D distance functions between training data and classified segments. The implementation shows that the unsupervised use of Web data that differ in nature, such as 3D models, can decrease the accuracy of the point cloud classification when working without domain adaptation. The contribution of (Xiong et al., 2013) aims at solving the point cloud to BIM (Building Information Model) pipeline, and the paper presents results for geometrically simple elements such as walls, ceilings and rectangular openings. It allows the reconstruction of a semantically rich 3D model, based on a context-based learning method studying relationships between planar patches. (Ochmann et al., 2016) propose an automatic reconstruction retaining wall connectivity by detecting limits through a labelling process per scan. This implementation is a concrete use case benefiting from internal relationships as well as geometrical and domain knowledge. However, it could be further extended by addressing its assumption on the scan methodology, where each room is considered to be captured from one inside scan position. In order to improve robot-vision understanding and navigation, (Galindo et al., 2005) propose an approach combining an abstracted spatial hierarchy graph and a semantic hierarchy that models domain concepts. The proposed conceptual metric-topologic-semantic multi-hierarchical map allows deeper comprehension; however, the semantics are hardcoded, making it hard to extend more broadly. (Derrac and Schockaert, 2015) extensively describe conceptual spaces and present some advanced principles of semantic reasoning. Semantic relations, beyond class retrieval, could assess the credibility of new regions. Therefore, it is essential for users to be able to validate inferred solutions.
Similarity-based solutions apply to point cloud segmentation techniques by searching a database for similar models while assessing segmentation in a context-dependent manner, but the objects within the point cloud should be well defined with precise attributes. (Kassimi and Beqqali, 2011) use shape indexes to induce a semantic ontology-based model that is included in a learning process to directly label the 3D model, increasing automation. However, while proposing ways to infer knowledge in segmentation and classification methods, these papers rarely cover the topic of data structuration. To keep a record and use ontologies throughout the analysis process, the point cloud needs to be structured so as to retain the spatial and relational information that is deduced or useful for classification and segmentation. For data visualisation, it is also very important to work with a structure as flexible as possible to handle billions of records and queries over different attributes for validation through visual perception.

Structuring and interacting with point clouds
The large datasets that point clouds constitute cannot directly fit in main memory, demanding a new structure that can be exploited through a Database Management System (DBMS). While data heterogeneity rises, existing DBMS still rely on a limited number of data models to efficiently manage the variability and redundancy of the observations. Efficiently handling these massive unstructured datasets (heterogeneous and from different sources) demands high scalability, speed (when data must be processed or mined in near real time or real time) and computational adaptation (cloud computing) to answer specific needs. This relates to Big Data problematics, or how to efficiently process big semi-structured or unstructured datasets. While disclosing existing data mining limitations, Big Data mining techniques introduce new challenges due to the high volume and heterogeneity of the massive datasets. Finding hidden patterns and information for knowledge discovery requires complex multimodal systems. (Dobos et al., 2014) pertinently state that geographic information systems (GIS), whose dimensionality support is limited to k-dimensional data with k ≤ 3, struggle to index higher dimensionality. Therefore, data interaction needs flexibility and scalability for different tasks: processing, data management and visualisation. To solve these challenges, spatial indexing and storage are essential. The system should be able to scale up to multiple servers and be optimized for sequential and parallel disk access or for CPU/GPU-intensive tasks. Relational Database Management Systems (RDBMS) and NoSQL DBMS for such applications exist, but we are confronted with several identified problems.

Management system for high dimensional data
The large volume and high resolution of point cloud data make it suitable for LoD management and rendering. The data model that defines the logical structure of a database determines how the data can be stored, organized, and manipulated. While file-based systems managed through hierarchical-like database models are common for storing point clouds, sharing, compatibility, query efficiency and data retrieval are the main limitations of these models. (Otepka et al., 2013) extensively reviewed existing large point cloud data structures, including attribute and geometrical information organization. They rightfully state that secondary storage access limits data-intensive tasks, which could be solved through streaming algorithms that keep small parts in memory. However, this implies pre-sorting and structuring the data a priori. In their discussion, they propose to separate coordinates from features in the DBMS to permit efficient attribute updates as well as georeferencing and spatial reorganization. Their vision of a flexible model could be realised using relational database models (RDBMS), the most used in GIS. These are very attractive because their robust data model provides a layer of abstraction over the file system using a dedicated data retrieval language (SQL). However, multiple relational tables with one point per row would easily reach billions of tuples, which becomes problematic. (Dobos et al., 2014) introduced the concept of a point cloud database for scientific applications. After stating the common dimensionality reduction techniques used to treat k-dimensional data, namely space reduction and PCA, they propose an approach based on relational tables. Classical RDBMS for such applications exist, but the limited scalability of binary trees, which struggle with huge dataset sizes, and non-adapted vectorisation and indexation schemes, often specific to one usage, make them hard to exploit across many different servers. Building on this, they point out the requirements of the structure for the analysis of point clouds, mainly filtering capabilities, key look-up and nearest-neighbour (NN) search, cluster analysis, outlier identification, histogram and density estimation, random sampling, interactive visualisation, data loading, inserts and updates. (van Oosterom et al., 2015) extend the concept by defining a third type of spatial representation, point cloud data (the first one being vector data (row-like, Single Feature Specification) and the second raster data (multipoint object)). Their work focuses on benchmarking several available commercial Point Cloud Data Management Systems (PCDMS) (block model and flat model of PostgreSQL-PostGIS, block model and flat table model of Oracle, flat model of MonetDB, file-based LAStools) to define which one is best suited to point cloud management. While some improvements need to be implemented to fix issues in available solutions, each provides a benefit compared to the others, but none can efficiently answer combined queries, data I/O and real-time visualisation. Interoperability remains essential to combine point cloud data with vector data and raster data. They also clearly show the need to link user needs and user type with user experience to define a standard in point cloud design and implementation. NoSQL databases, robust to massive data with weak relationships, can scale up to many computers, but their functionalities are today very limited.
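The idea of separating coordinates from features in a relational model, discussed above following (Otepka et al., 2013), can be sketched with an in-memory SQLite database. This is only an illustrative assumption of what such a schema might look like: the table names, column names and helper functions are hypothetical, and a flat one-row-per-point layout like this one would not scale to billions of tuples, as the text points out.

```python
import sqlite3


def build_demo_db():
    """Create a toy schema in which geometry and attributes live in separate
    tables, so attributes can be updated without touching coordinates."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE point (id INTEGER PRIMARY KEY, x REAL, y REAL, z REAL);
        CREATE TABLE attribute (
            point_id INTEGER REFERENCES point(id),
            name     TEXT,
            value    REAL,
            PRIMARY KEY (point_id, name)
        );
        CREATE INDEX point_xyz ON point (x, y, z);
    """)
    con.executemany(
        "INSERT INTO point VALUES (?, ?, ?, ?)",
        [(1, 0.0, 0.0, 0.0), (2, 1.0, 0.0, 0.2), (3, 5.0, 5.0, 1.0)],
    )
    con.executemany(
        "INSERT INTO attribute VALUES (?, ?, ?)",
        [(1, "intensity", 0.8), (2, "intensity", 0.3), (3, "intensity", 0.9)],
    )
    return con


def ids_in_box_with_attribute(con, box, name, min_value):
    """Combined spatial and attribute query: ids of points inside the
    axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax) whose named
    attribute is at least min_value."""
    xmin, ymin, zmin, xmax, ymax, zmax = box
    rows = con.execute(
        """
        SELECT p.id
        FROM point p JOIN attribute a ON a.point_id = p.id
        WHERE p.x BETWEEN ? AND ? AND p.y BETWEEN ? AND ? AND p.z BETWEEN ? AND ?
          AND a.name = ? AND a.value >= ?
        ORDER BY p.id
        """,
        (xmin, xmax, ymin, ymax, zmin, zmax, name, min_value),
    ).fetchall()
    return [row[0] for row in rows]
```

Block or patch models, which group many points per row, address the tuple-count problem at the cost of coarser attribute access; the sketch above deliberately ignores that trade-off.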
Based on the available pgPointCloud approach, which defines patches in an XML scheme that cluster points within the PostgreSQL RDBMS, (Cura et al., 2015) let the user create patches, allowing contextual groups while benefiting from high compression through the Point Cloud Server (PCS). The final implementation shows efficient loading, storing, processing, exporting and web visualisation, and provides a very interesting RDBMS that could be extended when the grouped points retain a specific relationship. In a working paper, (Cura et al., 2016) extend the PCS implementation by adding a MidOc LoD reordering that works on the barycentre of each cell bounding box. While this method allows better visualisation, its main interest resides in creating a structure flexible enough to allow a fast classification based on a multi-scale dimensionality descriptor.
In their book, (Ben Hmida et al., 2012b) propose a strategy decomposing knowledge into 3D processing and domain. They structure such information in an ontology, keeping information ranging from data source to object characteristics, hierarchy, geometrical topology, processing algorithms, … The final goal is to extract geometrical shapes describing the point cloud that retain all the information, such as relationships and topology. The major contribution of (Ben Hmida et al., 2012a) is a knowledge-based detection approach that creates objects by grouping points using the OWL-SWRL ontology languages. They developed a WiDOP prototype to efficiently manage point cloud data. Interestingly, they decompose object knowledge into Deutsche Bahn scene knowledge (classes and relevant information about objects), geometric knowledge (geometrical and physical characteristics) and topological knowledge (adjacency relations between the scene elements), which provides an automatic and robust framework inferring from prior domain knowledge. While this provides some solutions to the integration of domain expertise through a priori or a posteriori knowledge, the efficiency and extensibility to production processes depend on the underlying structure for efficient processing, analysis and visualisation. As stated by (Otepka et al., 2013), naïve strategies, especially considering the O(n²) query complexity of neighbour search, are unrealistic for industrial applications. The available memory cannot handle the amount of information, so the system must rely on an indexation scheme that efficiently allows data retrieval and reactive queries.

Spatial indexation techniques for editing / visualisation
Indexing 3D point clouds via spatial indices that subdivide space through different approaches is a solution to reduce the overhead by loading memory in chunks. The exhaustive paper presented by (Richter and Döllner, 2013) states that the spatial subdivision of k-d trees is not suited to update operations (e.g. add, remove) over point clouds because the tree structure becomes unbalanced. However, k-d trees perform well for nearest neighbour (NN) searches by efficiently eliminating large portions of the search space (≈ O(log n)). Octree structures, a 3D analogy of the quadtree (Yang and Huang, 2014), perform well for update operations, as opposed to k-d trees, thanks to their uniform spatial subdivision, which makes them particularly interesting considering the varying resolution, distribution and density of point clouds. However, as stated by (Zhu et al., 2007), octrees are "not able to dynamically adjust the tree structure according to the actual object layout. As a result, the tree depth is high where there are many objects, and this also results in unstable query performance". In their paper, (Gong et al., 2012) propose a data management indexing scheme based on a 3D R-tree, avoiding unbalanced structure and overlapping. As the 3D R-tree adjusts the index structure based on the real structure, the object distribution has a relatively low impact. However, node overlapping, which creates multipath queries, is a major challenge, which optimizations in previous work have addressed. They developed the promising 3DOR-Tree, a hybrid approach combining the strengths of both octrees and 3D R-trees to manage LoD point clouds. Other approaches such as modified nested octrees and sparse voxel octrees (Scheiblauer and Wimmer, 2011) will also be investigated.
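The pruning behaviour that makes k-d trees efficient for NN searches can be sketched in a few lines. The following is a minimal, illustrative 3D k-d tree built once from a static point list (median split, no rebalancing, no insert or delete), with hypothetical function names; it is not taken from any of the cited systems.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Recursively split the points at the median along x, y, z in turn."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),
        "right": build_kdtree(pts[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return the stored point closest to `query`.

    The far subtree is visited only if the splitting plane is closer than the
    best match found so far, which is what discards large parts of the space.
    """
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    axis = node["axis"]
    diff = query[axis] - node["point"][axis]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    if diff * diff < sq_dist(best, query):
        best = nearest(far, query, best)
    return best
```

Because the tree is built from sorted medians, adding or removing points afterwards would require partial rebuilding to stay balanced, which is the update limitation noted above.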
Essentially, data visualisation techniques for analytical and interactive tasks build on indexing techniques that provide efficient LoD and rendering structures. In the context of point clouds, semantics and domain can strongly influence the type of rendering used in order to directly transmit the correct information to the end user. A point cloud representation avoids interpolating or approximating a set of unorganized points, and benefits from a theoretically unlimited depth in the LoD. The vario-scale LoD research of (van Oosterom et al., 2015) would even avoid the "block effect" of discrete LoD scenes. (Beserra Gomes et al., 2013) developed an approach for building an LoD hierarchy over point cloud data so that it can be handled more easily in real time. They introduce the foveated point cloud through top-down (the task guides attention) and bottom-up (external stimuli drive attention) approaches. (Richter and Döllner, 2010) discuss the importance of out-of-core real-time rendering systems for the interactive exploration of point clouds. Based on LoD concepts that aggregate points according to their attributes, they define a new class-attached out-of-core point cloud rendering system that stores points in a layered multi-resolution k-d tree. Object class information computed a priori for the point cloud allows different rendering techniques, such as silhouette rendering and splatting, depending on the visual information that needs to be communicated. (Richter and Döllner, 2013) present a system architecture to manage massive point clouds, including database integration, interactive rendering and visualisation through class-based rendering. In that paper, the authors clearly identify important advantages of point clouds over models: they are LoD adaptive and allow fast updates without the need to remodel or re-extract information. Therefore, point clouds can require much less effort for specific integration and comparison tasks. (Liu and Boehm, 2014) introduce a novel method for interactively segmenting point clouds through a max-flow/min-cut algorithm, based solely on interaction: the user draws to assign a background and to highlight wanted objects. (Valentin et al., 2015) developed an extremely powerful machine learning and feature detection program in which interactive semantic classification is performed in real time. Being user focused, it allows a classification to be defined based on real user needs. These interactions provide a direct immersion in a coherent structure, bridging the virtual environment with 3D capture.

PROPOSED FRAMEWORK
"Intelligent environment" (Novak, 1997) as an interactive and smart structure to transparently communicate relevant information to users is an attractive solution for virtual reconstructions, especially point clouds.The overview of current practices showed a need to improve automation, data management and interaction.Identifying links and relations within segmented objects becomes essential to truly understand how each spatial entity relates to its surroundings redefining big point cloud data as smart data.Independently considered, each reviewed paper provides a solution to one part of the global point cloud processing chain.But combining different approaches produce a more powerful and robust segmentation, classification and information extraction workflow.For example, merging shape matching methods with feature based methods (fast computation, pseudo-metric, discriminative abilities, robustness) and structure-based methods (partial matching, abstraction, connexion) provide a more exhaustive representation of the data, allowing automation and validation.Also combining geometry and topology with multisensory data is important to estimate higher level features, while normalization and ethics should be integrated in the pipeline (Liu et al., 2015).Certain approaches such as (Ben Hmida et al., 2012a) and (Ben Hmida et al., 2012b) provide an opening on domain knowledge integration to complete this analytical expertise but most of the paper hardly consider leveraging available information through the semantic web, with some experiment showing the negative effect over accuracy when too far from point cloud characteristics (Lai and Fox, 2010).But one of the main struggle that many paper address concerns the data integration and the management system for unstructured point clouds.Existing PCDBMS and indexing techniques provide a solution to storing, compressing and managing the data (Dobos et al., 2014;Richter and Döllner, 2013;van Oosterom et al., 2015), but efficiency 
and extensibility to dynamic semantic update and ontological reasoning stays limited.Structural and visualisation queries over octree derived indexing techniques can provide an efficient solution for out-ofcore rendering and parallel processing, but data structuration cannot efficiently include context adaptation and inference reasoning.Building on this we propose a global solution that classify, organise, structure and validate objects detected through a flexible and highly contextual structure that can adapt to different domain and device expertise created by: 1. Integration of the raw multisensory data; 2. NN-oriented indexation including filtering and initial normalization (N0); 3. Smart recognition including segmentation, classification, validation and refinement; 4. Object-LoD Smart structuring which temporally takes place in parallel while recognizing objects; 5. Contextual SPC abstraction Level X definition.Such a structure should therefore retain the three main expertise processes described being sensor, analytic and domain expertise that we group as shown in Figure 2. The ability to handle neighbour's search, cluster analysis are fundamental tasks that appear early for defining higher level features.Therefore, indexing while filtering and normalizing with available knowledge and data is a main concern in the architecture.Moreover, the pipeline should allow an easy update of these features, therefore a smart way of storing domain information.Hence, we propose a global SPC pipeline creation as shown in Figure 3.
First, we define an operational block around the captured subject, where the context and properties of the scene heavily determine the chosen methodology. The device expertise conditions which attributes to filter and normalize, and a weighted adjustment process based on the technical details of the output data could be of interest. The raw captured data passes through normalization and filtering steps. The importance of interactivity and user-centered interaction, as seen in (Liu and Boehm, 2014), introduces a validation process to exclude the points in question. The data indexing scheme needs to retain critical information while anticipating structuration searches (attribute-key, NN, semantics). k-d trees and octree-derived structures such as the 3DOR-Tree perform well for such intensive tasks; this analytical structure, prior to final classification, will therefore be further investigated for storing points. However, attributes and features are to be stored in a different structure while retaining a direct link to the correct point/patch index, as discussed in (Weinmann, 2016). The structure should avoid any costly rebuilding operation when deleting or inserting points, for example. The domain block can store information in an ontology structure, directly exchanging information with the analytical block regarding:
- Completeness: analytical results, both features and semantics, will be transferred directly to the domain;
- Validation: completeness will be evaluated statically and by context/domain comparison to extract relevant information, which will be directly injected to be analytically accepted for local organization;
- Refinement: when validation fails, the data is reinjected into an analytical loop to be iteratively benchmarked and further analysed, or rejected when the probabilities are too low.
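The exchange between the analytical and domain blocks described above can be read as a small control loop. The sketch below only illustrates that control flow and is not a specification of the SPC: the `Candidate` class, the `accept` and `reject` thresholds, the `max_iter` bound, the dictionary of domain rules and the `reanalyse` callback are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Analytical result for one segment: a label, its probability, its features."""
    label: str
    probability: float
    features: dict = field(default_factory=dict)


def validation_loop(candidate, domain_rules, reanalyse,
                    accept=0.8, reject=0.2, max_iter=5):
    """Validate a candidate against domain rules, refining it on failure.

    domain_rules: maps a class label to a predicate over the feature dict
                  (stand-in for the ontology of the domain block).
    reanalyse:    callable returning a new Candidate (stand-in for the
                  analytical loop); both are assumptions of this sketch.
    Returns ("accepted" | "rejected", last candidate).
    """
    for _ in range(max_iter):
        # Rejection: the probability is too low to be worth refining.
        if candidate.probability < reject:
            return "rejected", candidate
        # Validation: confident enough and consistent with the domain rule.
        rule = domain_rules.get(candidate.label)
        if rule is not None and candidate.probability >= accept and rule(candidate.features):
            return "accepted", candidate
        # Refinement: send the data back to the analytical block.
        candidate = reanalyse(candidate)
    return "rejected", candidate
```

Bounding the number of iterations is a design choice of this sketch so that a candidate that never converges is eventually rejected rather than looping indefinitely.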

Figure 3 Workflow for Smart Point Cloud structuration
We differentiate geometrical reasoning from statistical and machine learning, with a priori relationship estimation. While the validation takes place, we propose to detect and independently treat these special objects:
- The ground: because it is contextually gravity dependent, the SPC considers the ground as the foundation to which every element is then referenced.
- Elements: which can be iteratively refined considering the desired level X of abstraction, but also the resolution and precision of the (multi)sensor data.
- Boundaries: which can be either walls or ceilings, both including structural definition and elements.
Because the workflow is context dependent, the user defines the level of detail desired in the final structure, which determines the refinement of the segmentation level, and we propose to build on a common gravity-based definition:
- Level 0: Ground and boundaries;
- Level 1: Each independent detected object, analysed independently to extract host and guest;
- Level 2: The first guest, for example the cup of coffee (L2) on a table (L1) that stands on the floor (L0).
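The gravity-based levels above amount to counting host/guest links down to the ground or a boundary. A minimal sketch, assuming the host of each object is already known and stored in a hypothetical `supported_by` mapping (the function name and the `level0` tuple are likewise assumptions of this example):

```python
def abstraction_level(name, supported_by, level0=("floor", "wall", "ceiling")):
    """Return the gravity-based level of an object.

    Level 0 objects are the ground and boundaries; every other object sits one
    level above its host, found by following the `supported_by` mapping.
    Raises KeyError if an object has no recorded host, and ValueError if the
    host relations form a cycle.
    """
    level = 0
    seen = set()
    while name not in level0:
        if name in seen:
            raise ValueError("cyclic host/guest relation")
        seen.add(name)
        name = supported_by[name]  # move from the guest to its host
        level += 1
    return level


# Example from the text: a cup on a table that stands on the floor.
supported_by = {"table": "floor", "cup": "table"}
print(abstraction_level("floor", supported_by))  # 0
print(abstraction_level("table", supported_by))  # 1
print(abstraction_level("cup", supported_by))    # 2
```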
Working at all possible scales for feature descriptors (sub-space / global), structuration and visualisation is essential. The concept of vario-scale (Huang et al., 2016), which provides near-continuous capabilities, is interesting and should be studied for its fit to our proposed SPC. Some objects may then be described in many classes, which influences how deep the selectivity can go. Studying topological relationships for point clouds is complicated here when no direct contact can be observed through TLS measurements. Retaining relations and organizing hierarchically via a topological LoD is a final step towards a smart data structure. These conditions can infer physical descriptions and combine many possible analyses, e.g. recreating occluded zones through topology, reasoning about position in time and space using domain knowledge and semantics, or conducting structural analysis. This whole pipeline describes the creation of an intelligent environment, where the end user directly has access to relevant information based on an update and refinement process that keeps track of the interactions and needs of end users, which can be deprecated in the structure through a knowledge update. In our future work, we will specifically address point cloud segmentation for determining the topological LoD data structure. This will be based on connected component and variant analysis, including abstraction-featuring, to determine new methods for ground and boundary recognition, as well as hierarchical iterative element refinement. The architecture of the domain expertise relies on a server database keeping two different entities: the training data and the expert knowledge intelligence. Each of these modules will participate in the classification and organization to obtain labelled data, which will be stored in another partition, retaining descriptors for each class. These classes will be context dependent, and the assessment of predictions will directly link the domain with the data to extract applications relevant to specific features. Reinforcement learning can handle more complex environments than supervised learning, and theoretically provides a more powerful framework for modelling streaming data, which will be part of future investigations. The challenge of data structuration will also be addressed, combining implementations of ontology and RDBMS variants to efficiently store the point cloud while retaining specific device expertise, analytical expertise and domain expertise. Use cases extended to structure from motion while aggregating topographic data will be discussed, in order to define an SPC structure for smart data.

CONCLUSION
This paper gives a definition of a knowledge-based point cloud structure, contextually subdivided according to classification results. The research aims to present a general framework for the development of smart point clouds. Fusing data from both active and passive sensors provides additional information that is relayed through precise high-level feature descriptors for contextual classification. However, semantization relies on geometrical descriptors as well as domain analogy and validation to extract and define a new structuration of the point cloud data through correct indexing techniques. This implies a separation between relationships / topology and spatial / attribute information to provide efficient data mining capabilities. The targeted analysis of current limitations in analytics, domain featuring and data structuration has led us to establish a research agenda. Adopting the described framework, the next work to be undertaken will deal with contextual LoD segmentation.

Figure 2 The Smart Point Cloud constitution