MODEL FOR SEMANTICALLY RICH POINT CLOUD DATA

ABSTRACT: This paper proposes an interoperable model for managing high-dimensional point clouds while integrating semantics. Point clouds from sensors are a direct source of information physically describing the 3D state of the recorded environment. As such, they are an exhaustive representation of the real world at every scale: 3D reality-based spatial data. Their generation is increasingly fast, but processing routines and data models lack the knowledge needed to reason from information extraction rather than interpretation. The enhanced Smart Point Cloud model brings intelligence to point clouds via three connected meta-models, while linking available knowledge and classification procedures that permit semantic injection. Interoperability drives the adaptation of the model to potentially many applications through specialized domain ontologies. A first prototype is implemented in Python and PostgreSQL and combines semantic and spatial concepts for basic hybrid queries on different point clouds.


INTRODUCTION
Among the ever-increasing flood of data collected every day, geospatial data occupies a large portion. While the well-known sentence "80% of the data is geographic" (Hahmann et al., 2011) is an arguable empirical statement, it delineates a need to find new or improved ways to derive value from big geospatial data (Li et al., 2015). This expansion's potential to benefit many applications such as construction, emergency response, planning, monitoring of critical infrastructures and transportation makes it a favourable way to address many societal issues. Extracting information from spatial concepts helps us better understand our world and take informed decisions. Although geospatial data "stricto sensu" is the bedrock of many decision-making processes, the injection of semantics enhances the representativeness of the data while enlarging possible applications. GIS research targets this enrichment by finding solutions to make the storage, retrieval, and manipulation of spatial data easier and more representative.
For a decade, computing power has been advanced enough to allow GIS systems to be extended to manage 3D models hosting metadata. The Open Geospatial Consortium (OGC) work and standards have brought stable references that are adopted internationally. The most used examples for planning and construction are CityGML, IndoorGML and the BIM-IFC ISO standards, which allow many applications that 3D geospatial data alone (i.e. meshes, CAD) could not answer. But these models are derived from a more direct source of information, and as our means to capture all 3 dimensions of our world evolve, we increasingly deal with point clouds from LiDAR, TLS, MLS, HMLS, MMSS and dense image-matching. They constitute geospatial data that are increasingly produced through various means and platforms, but their democratization does not follow the same growth curve. Indeed, on top of the captured 3D geometry, each application requires additional, often domain-linked semantics to allow a direct exploitation of the collected information. For this reason, it is highly impractical to base reasoning on point cloud spatial attributes alone.
Our paper aims at solving this issue to allow a straightforward integration and a better interfacing of point cloud data in our computerized environment. On top of an efficient data management system that can handle the ever-growing data size, complexity and heterogeneity, the addition of knowledge to such a structure is very interesting for meaningful information integration (Poux et al., 2016a). Enabling semantic injection into point clouds is a first step in undertaking the creation of intelligent environments (Novak, 1997) that allow digital copies of the world to be used as decision-making tools. However, this demands a highly functional reasoning engine that allows both spatial and semantic queries to efficiently interpret natural language and spatial operations. Moreover, the needed metadata are "typically drawn from distributed sources and often are thematically and spatially fragmented. Thus, for a given geographic region, data differ in quality and modelled semantic aspects" (Stadler and Kolbe, 2007). Therefore, spatio-semantic coherence is mandatory to enable a valid and representative modeling of the environment. This article is based on the previous work of (Poux et al., 2016a), which defines a Smart Point Cloud (SPC) workflow that enables a connection between point cloud data and three identified knowledge sources: device, analytic and domain knowledge. We extend this concept by proposing a conceptual data model composed of three meta-models acting at three different conceptual levels to efficiently manage massive point cloud data (and by extension any complex 3D data) while integrating semantics coherently.
In the first part, we review the use of semantics for enhancing 3D data regarding domain applications and existing standards for managing multi-dimensional geospatial data. Then we present the extended modular Smart Point Cloud model, which allows a more direct interoperable integration of semantics in point-based virtual environments. Finally, we discuss and illustrate the proposed conceptual model with a prototype in Python and PostgreSQL that manages an indoor point cloud captured by a terrestrial laser scanner, enabling reasoning from information extraction.

SEMANTICS & 3D DATA
Communities working with 3D data are very different, which implies a wide diversity in how the data is used. This explicitly demands that data-driven applications enable targeted information extraction specific to each use case. For example, if you work with a mesh representing what looks like a table, while you know it is a table, you need to digitally attach some extra information if you want the computer to carry out deeper operations rather than interpreting on the fly yourself (brain work). Indeed, once metadata is attached to 3D data, you can grasp the data more easily, or even make calculations that were impossible before (e.g. count the number of tables in the scene, or compute the surface area of each table …). These are very basic examples of how metadata enhances the use of 3D data. But of course, while this is rather convenient for one use case, making a general rule that applies to all models is a daunting task. Each application requires its own semantic and geometric information. In this first part, we study the attempts, standards and existing reflections to define a common scheme for exchanging relevant 3D information.

Explicit 3D spatial information systems
Datasets that explicitly include spatial information are typically distinguished regarding the data models and structures used to create, manage, process, and visualize the data. We consider three analogically distinct categories of 3D data environments (Ross, 2010):
- 3D GIS: GIS systems typically model the world itself, retaining information about networks, connectivity, conductivity, associativity and topology. This enables spatial analysis, often carried out on large collections of 3D instances stored in virtual data warehouses with coordinates expressed in a frame of reference.
- 3D CAD: CAD/CAM techniques model objects in the real world through parametric and triangular modelling tools. The topology is often planar or limited (although vendors extend functions to include semantics and higher descriptive topology (Zlatanova and Rahman, 2002)) and the retained information usually plays on a visual scale. The distinction between storage and visualization is not as well defined as in 3D GIS systems, and typically files are stored as single complex 3D objects (one file). CAD files carry visualization information that is not relevant to the data itself. A simplistic difference consists in thinking of 3D GIS systems as 3D spatial databases whereas 3D CAD models are rather related to 3D drawings. The coordinate system is therefore linked to a defined point of interest (often the centroid) in the scene.
- BIM: it constitutes working methods and a 3D parametric digital model that contains "intelligent" and structured data initially for planning and management purposes. It is often studied for its integration with 3D GIS systems, with an extensive review in (Liu et al., 2017), but their parallel evolution (conditioned by temporal and hermetic domain research) and fundamentally different application scopes are slowing down their common assimilation. BIM models share many properties with 3D CAD models, including their expression of coordinates in a local system, but benefit from a higher semantic integration.
The emergence of new data sources and the evolution of data models constantly call into question the suitability of these categorizations.
Established and emerging data types, as well as their integration and characterization, can be difficult to fit into one of these categories. For example, more primal spatial data from a more direct data source, such as 3D point clouds, could benefit from their own category. Indeed, they have very little direct integration in these groups, but rather serve as a support for the creation of CAD/CAM models, BIM models or 3D GIS systems. In some advanced cases, the information included in 3D point clouds can help extract metadata for the future data model.
However, it is important to note that while the barriers between these categories were well defined five years ago, the improvements and added functionalities in each category, as well as interoperability and integration research and standardization, play a major part in blurring the respective frontiers.

Interoperability and ontologies for 3D semantically-rich data
"Semantic interoperability is the technical analogue to human communication and cooperation" (Kuhn, 2005). This sentence pertinently summarizes the drive in GIS research to formalize semantics in order to facilitate the communication of data among different communities. Different levels of interoperability exist, and in this paper we look at the technical parts without looking at societal issues raised by enterprise-oriented information sharing (Harvey et al., 1999). However, the conceptualization of interoperability in our computerized environment remains a challenge at different levels:
1. The nature of the concepts that define interoperability should not arise from simplistic assumptions, as notions evolve with time;
2. The ever-growing use of 3D data makes it very hard to define a common language to be spoken by all professionals;
3. The knowledge involved is sparse enough to constrain natural language extension in a computerized formalism;
4. Standardization efforts need international cooperation to represent reality as thoroughly as possible and benefit from effective coordination;
5. Retaining semiotic relationships between concept, symbol and entity, as in the semantic triangle (Ogden et al., 1923).
In a narrower context, 3D data as 3D models are largely used for a high number of applications, which vary in scope and scale as well as enlargement.Therefore, semantic schemes as generic as possible provide a potential solution for interoperability.
Ontologies are a good way to explicitly define knowledge in order to address the semantic heterogeneity problems arising from this large variety. However, independent work and research limit their extension to a broader audience, especially regarding 3D content. But the rise in usage demands that specific solutions allow 3D data to be exchanged and used as thoroughly as possible. Independent development and uncoordinated actions in the research field of ontologies applied to GIS are addressed by entities such as the World Wide Web Consortium (W3C), the International Organization for Standardization (ISO), the Open Geospatial Consortium (OGC), the International Alliance for Interoperability (IAI) and the rise of open-source developments and repositories. Clarifying standardization processes over 3D data is especially important, with issues arising at both a technical level and a consideration level (how is 3D data considered by the community?). In general, a standard defines a data model at two levels: properties and geometry. A well-known example is the standard GML3 issued by the OGC, which is used by the CityGML data model describing the geometrical, topological, and semantic aspects of 3D city models (Kolbe et al., 2005). The specification and the decomposition in Levels of Detail (LoD), as well as the current 2.0 version allowing the definition of semantic concepts, have made the integration of city models easier and applicable to a wider range of use cases (Biljecki et al., 2015). Indeed, this gives decision makers the possibility to impose a specific "abstraction figure" (LoD1, LoD2 …) that characterizes the granularity level of the wanted geometry and semantic concepts. This interoperability "tool" is a leap forward in the democratization of the standardized data model CityGML.
However, its integration with other standards or ontologies is still being discussed and studied, where a discrete number of LoDs with 'unconnected' (potentially uneven) levels could be a concern (Karim et al., 2017). This illustrates the need to find interoperable systems between already established standards to benefit from a higher integration of semantics and topology that enhances our comprehension and usability of 3D data.
The Semantic Web is a great tool, standardized through Semantic Web 3.0, that is able to create links between already established standards, which encourages the use of web-based data formats and exchange protocols, with the Resource Description Framework (RDF) as the basic format. Indeed, this has the potential to greatly reduce the gap between the categories previously defined in 2.1, and to better integrate knowledge within 3D spatial data. This is especially efficient if we better integrate 3D point clouds, from which we derive so many systems and data models today. Indeed, at first they could serve as transition data, but given time they could provide all the necessary information if correctly integrated.

Transition to point clouds
The work of (Janowicz et al., 2010) constitutes a pertinent analysis of identified interoperability problems and semantic enrichment routines. The authors outline semantic challenges for geospatial applications, namely discovery, access, registration, processing and visualisation. They propose a Semantic Enrichment Layer which includes interesting functionalities, in particular a reasoning module to load and query specific ontologies.
Semantic injection into point clouds constitutes an opening to gather interoperable data with a reasoning potential. It relates to complex and active point cloud research vigorously driven by governmental, industrial and academic needs, such as sustainable planning, self-driving cars or VR teaching. It includes problems that arise from point cloud data management in terms of storage and efficient structuration for semantic and spatial demands. Indeed, existing PCDBMS (point cloud database management systems) and indexing techniques provide a solution to storing, compressing and managing the data (Dobos et al., 2014; Richter and Döllner, 2013; van Oosterom et al., 2015), but efficiency and extensibility to dynamic semantic updates and ontological reasoning stay limited. Queries over octree-derived indexing techniques can provide an efficient solution for out-of-core rendering and parallel processing, but the data structuration cannot efficiently include context adaptation and inference reasoning.
Therefore, identifying links and relations within segmented objects becomes essential to truly understand how each spatial entity relates to its surroundings, and to connect GIS, CAD and BIM concepts (as seen in 2.1) to 3D point clouds. Certain approaches such as (Ben Hmida et al., 2012a) and (Ben Hmida et al., 2012b) provide an opening on domain knowledge integration. An important contribution is made by (Rusu et al., 2008), who turn a kitchen point cloud - and by extension any environment - into a meaningful representation for robot interaction and recognition called an object-map. Their algorithm includes data acquisition, geometrical mapping and functional mapping. Geometrical mapping is composed of outlier removal, persistent feature estimation, point cloud registration and then a resampling step. Building on this, a segmentation process and model fitting based on the geometrical mapping constitute the basis for region identification and object hypotheses, which make up the functional mapping. As an end product, the algorithm produces a mesh via functional reasoning. It is interesting to note that the intangible hierarchy representation makes it possible to validate the segmentation and add constraints to refine the results. Although the concept is a first step toward topological concepts, this paper bases functional reasoning on mostly hardcoded commonsense knowledge. While these constitute pertinent examples of knowledge and semantic enhancing capabilities, no clear and defined structure is developed. The work of (Poux et al., 2016a) is a first step in this direction: it proposes a global framework that classifies, organises, structures and validates detected objects through a flexible and highly contextual structure that can adapt to three identified knowledge sources: domain, device and analytic knowledge. This lays the groundwork for the development of a new data model - the smart point cloud - that can address previously identified issues while retaining a high level of interoperability with existing standards.
Poux et al. (2016a) propose a definition for a point cloud knowledge-based structure contextually subdivided according to classification results. The semantization process relies on geometrical descriptors as well as a domain analogy integrated in a new structuration of the point cloud data through correct indexing techniques. This implies a separation between relationships / topology and spatial / attribute information to provide efficient data mining capabilities. At a higher conceptual level, the creation of an intelligent virtual environment from point clouds can be inspired by our cognitive system: recognizing an object means accessing symbolic units stored in a semantic memory, which are abstracted from our previous experiences while being independent of any context. Whether we have digital copies of the real world, inventions / conceptions of "things" to be integrated in the world, or a combination of both, we refer to geometries from the "physical space" and the "fictional space" (immaterial, concept-based) as in (Billen et al., 2012). In their paper, they propose an ontology of space in order to facilitate an explicit definition of CityGML. By extending the formalism, it constitutes a basis for semantic injection into point clouds.

THE SMART POINT CLOUD
However, the study of ethnophysiography as well as human cognition of geospatial information is mandatory for defining information system ontologies. Indeed, the closer (and the richer) the model is to the domain concept, the better (and more extensible) the ontology will be. But the question of how detailed an ontology should be depends on the level of interoperability that is envisioned.

Conceptual SPC (Smart Point Cloud) Model
The purpose of the SPC characterization is to represent the real world spatially described by point clouds in a computerized form: a user-centered frame representation serving an intelligent environment. The definition of a generic model that applies to a general purpose is very complex, as the domains that benefit from 3D semantically rich models and point clouds range from neuro-psychiatry to economics or geo-information. Our approach was designed to allow maximum flexibility by defining a conceptualization to which different domain formalizations can be attached (Figure 1).
Figure 1 Meta-model articulation for the creation of a SPC
It is for this reason that we wanted to clearly illustrate a privileged domain of application: indoor environments (for BIM, emergency response, inventory management, UAV collision detection …). Therefore, we divided the characterization (knowledge representation and data modelling) into different hierarchical levels of abstraction to (1) avoid overlap with existing models, and (2) enhance the flexibility and opening to all possible formalized structures. The core rule is that the lower levels are closer to a domain representation than the higher levels (level-0 being the highest level), but the higher levels impose their constraints. The overall structure can be seen as a pyramidal assembly, allowing the resolution of thematic problems at lower levels with reference to constraints formally imposed by the higher levels.
Knowledge integration is essential to the creation of the SPC structure, as it constitutes the necessary source for the meaning and adaptation of different entities within the pyramidal model. By default, we integrate a core external algorithmic module that extracts a raw relationship graph based on a voxel-based element mining routine inspired by (Gorte and Pfeifer, 2004; Wang et al., 2017). This was chosen because it does not require any external semantic information other than pure spatial information, which encourages flexibility and adaptation. However, more domain-verse classification modules such as (Xiong et al., 2013) provide potentially enhanced workflows. As seen beforehand, while RDBMS are a great fit logically speaking, they do not perform well considering the very high number of rows. Clustering via indexing schemes is mandatory for interactive visualisation as well as efficient data loading, inserts and updates. Building this spatial structure over an object-based binary host/guest structure would provide powerful analysis and visualization exploitation. In parallel, an ontology structure allows inference reasoning and semantics retention, and is directly linked to the spatial structure, thus defining relationships and topology for points and objects. The top level, called level-0, gathers data, information, and knowledge about the core SPC components.

Level-0: Generalized SPC meta-model
For clarity, we specifically target point clouds, but the model can easily be extended to all kinds of data gathered from our physical world, and in an extended version provide an opening for 3D meshes or parametric model integration. The different meta-models are formalised in UML, and provide a conceptual definition for implementations. We therefore modelled with the goal of providing a clear vision and comprehension of the underlying system, but the database creation slightly differs to favour performance; adaptations are therefore made at the relational schema modelling level.
Figure 2 Level-0 Generalized SPC meta-model (UML). A point cloud constituted of points is block-wise organized through semantic patches. These can be pure spatial conglomerates or retain a coherent semantic relationship between constituting points. Generalizations via different schemes are possible using the generalisation structure to provide additional analysis flexibility.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume IV-4/W5, 2017. 12th 3D Geoinfo Conference 2017, 26-27 October 2017, Melbourne, Australia.
The generalized SPC meta-model (Figure 2) formalizes the core components needed for constituting semantic point patches. It starts with the most primitive geometry: a point. It has a position defined by three coordinates in Euclidean space (R^3): X, Y and Z. Each point has a limited number of attributes, which in the example of Figure 2 are derived from 3 different sources: device knowledge (scan angle, intensity …), analytic knowledge (normal, curvature, roughness …) or domain knowledge (definition, representativeness …). While the UML model shows a one-to-many relationship, to avoid too many SQL joins and for the sake of performance, the attributes can be integrated directly within the point table (the same applies to semanticPatch). However, it is important to note that one point can have many sets of attributes (as can a semanticPatch).
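The point entity described above can be sketched in Python as follows. This is a minimal illustration and not the prototype's actual schema: the class name `Point`, the method `flat_row` and the attribute keys are assumptions chosen for the example. It shows how attribute sets from the three knowledge sources can be flattened into a single row, mimicking a point table that avoids joins.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Point:
    """A level-0 point: a position in R^3 plus named attribute sets."""
    x: float
    y: float
    z: float
    # One attribute set per knowledge source (device, analytic, domain).
    # The keys used below are illustrative, not a fixed schema.
    attributes: Dict[str, Dict[str, object]] = field(default_factory=dict)

    def flat_row(self) -> Dict[str, object]:
        """Flatten all attribute sets into one row, as a denormalized
        point table would store them to avoid SQL joins."""
        row: Dict[str, object] = {"x": self.x, "y": self.y, "z": self.z}
        for source, values in self.attributes.items():
            for name, value in values.items():
                row[f"{source}_{name}"] = value
        return row


p = Point(1.0, 2.0, 0.5, {
    "device": {"intensity": 0.82, "scan_angle": 12.5},
    "analytic": {"curvature": 0.01},
    "domain": {"definition": "table"},
})
row = p.flat_row()
```

Prefixing each column with its knowledge source keeps the origin of an attribute visible even after flattening.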
Figure 3 Example of a basic LOD n-1 Generalization of 3 SemanticPatches from a point cloud with color attributes only.
A collection of points sharing the same type of dimensions (spatial and semantic) constitutes a point cloud. This is a data-driven aggregation, as the point cloud object parameters will differ depending on the definition of the dataset. However, one dataset often represents a coherent point aggregation which serves a domain purpose. This point cloud entity also benefits from a knowledge source identifier to identify which knowledge source it relates to (if multiple domain-specific ontologies are connected to the model). To cope with heterogeneity in point cloud sources, a schema is defined and attached to all point clouds that share a similar dimension number, dimension type, scale and offset, similarly to (Cura et al., 2017). Each point cloud is then parsed into semantic patches, according to available knowledge and an adapted subdivision technique. Arbitrarily, such a technique could be point-number related, geometry related or position related. While the existing PostgreSQL plugin pgpointcloud, which defines patches in an XML schema, provides spatial patches, we propose to greatly enhance such an approach by constructing semantic patches, which retain both spatial and semantic properties. A semantic patch constitutes a small spatial subset of points that share a relationship based on available (and injected) knowledge. By default, our proposed voxel-based subdivision method groups points using geometrical and topological properties that implicitly relate to abstract conceptualizations of our mind (such as geometric shapes to group points belonging to a plane, others floating above it …). As such, they are indirectly semantized. "semanticPatch" retains many attributes, with an emphasis on two specifics: a classification status (which can be 0: unclassified, 1: one class only or 2: many classes), and a confidence level for the classification. These are computed through a segmentation and classification routine as described in 4, which is developed independently from the proposed point cloud data model. In order to speed up computations, allow enhanced spatial and semantic searches and provide new generalization possibilities to better address our representation of the data, a LoD generalization structure definition is directly linked to the semantic patches (Figure 3). It defines the indexation scheme used, the different levels (if any), its node spatial extent and neighbours, associated geometries (if any) and other generalized attributes derived from statistical computations (average, Gaussian mixture …). (Poux et al., 2016a) suggest a 3DOR-Tree as defined in (Gong and Ke, 2011) for improved performance, but hashing and implicit storage (Cura et al., 2016) can also greatly improve the internal coherence.
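To make the patch concept concrete, the sketch below groups points into patches with a plain regular voxel grid and derives the classification status (0, 1 or 2) and a centroid as a simple statistical generalization. It is a simplified stand-in under stated assumptions: the paper's subdivision also uses geometrical and topological properties, and the confidence value here (share of points carrying the majority label) is a hypothetical placeholder for the output of the classification routine. The names `SemanticPatch` and `build_patches` are illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import floor
from typing import Dict, List, Optional, Tuple

XYZ = Tuple[float, float, float]
VoxelKey = Tuple[int, int, int]


@dataclass
class SemanticPatch:
    voxel: VoxelKey
    points: List[XYZ]
    labels: List[Optional[str]]  # None means the point is unclassified

    @property
    def classification_status(self) -> int:
        """0: unclassified, 1: one class only, 2: many classes."""
        classes = {label for label in self.labels if label is not None}
        return min(len(classes), 2)

    @property
    def confidence(self) -> float:
        """Placeholder confidence: share of points with the majority label."""
        known = [label for label in self.labels if label is not None]
        if not known:
            return 0.0
        top = max(set(known), key=known.count)
        return known.count(top) / len(self.labels)

    def centroid(self) -> XYZ:
        """Average position, one possible generalized attribute of a patch."""
        n = len(self.points)
        return (sum(p[0] for p in self.points) / n,
                sum(p[1] for p in self.points) / n,
                sum(p[2] for p in self.points) / n)


def build_patches(points: List[XYZ], labels: List[Optional[str]],
                  voxel_size: float) -> Dict[VoxelKey, SemanticPatch]:
    """Group points by the voxel of a regular grid that contains them."""
    groups: Dict[VoxelKey, Tuple[List[XYZ], List[Optional[str]]]] = \
        defaultdict(lambda: ([], []))
    for point, label in zip(points, labels):
        key = (floor(point[0] / voxel_size),
               floor(point[1] / voxel_size),
               floor(point[2] / voxel_size))
        groups[key][0].append(point)
        groups[key][1].append(label)
    return {k: SemanticPatch(k, pts, lbs) for k, (pts, lbs) in groups.items()}


points = [(0.1, 0.1, 0.1), (0.4, 0.2, 0.3), (1.2, 0.1, 0.1),
          (1.4, 0.3, 0.2), (1.6, 0.5, 0.5), (2.5, 0.5, 0.5)]
labels = ["floor", "floor", "table", None, "chair", None]
patches = build_patches(points, labels, voxel_size=1.0)
```

In this toy dataset the first voxel holds a single class, the second mixes two classes and the third contains only unclassified points, which covers the three status values.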

Level-1: Connection-layer meta-model
The connection-layer meta-model (i.e. the strict framework that drives the use of a formalism and resolves any ambiguities about the use of its concepts) plays the role of a plug system: an interface between the core SPC level-0 generalized meta-model, and a domain ontology that formalizes the domain-specialization of a generic ontology.It is constituted of two sub-levels, L1-1 and L1-2 (Figure 4).
The core element in this meta-model is created by an aggregation of one or many patches that define a connected element (connectedElements). These are the entities that closely relate to classified objects, retaining both spatial and metadata coherence. Connected elements transparently describe a portion of the space that is by default indirectly influenced by analytical knowledge and device knowledge, from the underlying patch organization. Connected elements have a spatial extent computed from the aggregation of patches, as well as one or several geometries that can be obtained by topological calculations from the patches. Aside from geometrical attributes including a spatial generalization (which can be for example the barycentre of the spatial extent, but also a more representative statistical generalization), they retain raw semantics from the underlying patch aggregation rule. This can be ConnectedElement, VerticalElement or HorizontalElement, but the integration of domain knowledge gives the opportunity at this level to deepen the representativeness of a connected element. Nevertheless, one connected element can have different spatio-semantic interests depending on the application. Therefore, aggregated elements constitute an aggregation of connected elements, which provides additional granularity and flexibility (a chair, with 4 feet and one sitting area, is either 5 connected elements or one aggregated element). In the same spirit, each connected element retains basic relationships with its surrounding environment. We therefore detect and store host and guest relationship information (the table is the guest of the floor, and the floor is the host of the table). These strong concepts have an influence on how deep the selectivity can go. Retaining relations and organizing hierarchically through topological relations refers to mereology, applied to the generalization of point cloud objects regarding DE-9IM (Clementini and Di Felice, 1997). Therefore, a double structural definition retaining generalization and point primitives (Level-0) allows new analyses combining multi-LoD definitions. This pyramidal graph relationship formalization gives easy access to a spatially connected graph for reasoning engines that interpret topological relations. These conditions can be used to infer a physical description and combine many possible analyses, for example the possibility to recreate occluded zones, reason about position in time and space and conduct structural investigations.
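The host/guest relationship and the aggregation of connected elements can be illustrated with a small Python sketch. The detection rule used here (the bottom of the guest touches the top of the host and their footprints overlap, tested on axis-aligned extents) is an assumption made for the example, not the detection method of the model, and the names `ConnectedElement`, `link_host_guest` and `aggregate_extent` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Axis-aligned extent: (xmin, ymin, zmin, xmax, ymax, zmax)
Box = Tuple[float, float, float, float, float, float]


@dataclass
class ConnectedElement:
    name: str
    extent: Box
    raw_semantics: str = "ConnectedElement"
    hosts: List[str] = field(default_factory=list)   # elements it rests on
    guests: List[str] = field(default_factory=list)  # elements resting on it


def footprints_overlap(a: Box, b: Box) -> bool:
    """True when the XY projections of two extents intersect."""
    return a[0] <= b[3] and b[0] <= a[3] and a[1] <= b[4] and b[1] <= a[4]


def link_host_guest(elements: List[ConnectedElement],
                    tolerance: float = 0.05) -> None:
    """Record host/guest pairs with a simple 'rests on' heuristic."""
    for guest in elements:
        for host in elements:
            if guest is host:
                continue
            touching = abs(guest.extent[2] - host.extent[5]) <= tolerance
            if touching and footprints_overlap(guest.extent, host.extent):
                guest.hosts.append(host.name)
                host.guests.append(guest.name)


def aggregate_extent(elements: List[ConnectedElement]) -> Box:
    """Extent of an aggregated element built from connected elements."""
    boxes = [e.extent for e in elements]
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            min(b[2] for b in boxes), max(b[3] for b in boxes),
            max(b[4] for b in boxes), max(b[5] for b in boxes))


floor = ConnectedElement("floor", (0.0, 0.0, -0.1, 5.0, 4.0, 0.0),
                         "HorizontalElement")
table = ConnectedElement("table", (1.0, 1.0, 0.0, 2.0, 2.0, 0.75))
link_host_guest([floor, table])

# A chair seen either as separate connected elements or as one aggregate.
chair_parts = [
    ConnectedElement("seat", (1.0, 1.0, 0.45, 1.5, 1.5, 0.5)),
    ConnectedElement("leg", (1.0, 1.0, 0.0, 1.05, 1.05, 0.45)),
]
chair_extent = aggregate_extent(chair_parts)
```

Storing the relationship on both sides (hosts and guests) lets a reasoning engine traverse the graph in either direction without recomputing geometry.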
Connected elements also have additional properties and specific attributes inherited from the patches that they relate to. While every point can retain a date stamp, a connected element can be influenced by temporal variations, but duplicating a physical description of the connected element at every discrete temporal interval would not be sufficient. Therefore, connectedElements can have a temporal modifier that describes the different modifications from the in-base initial state. They can also relate to multiple spaces, which define a set of dimensions in R, R^2 … R^n (for example X/Y/Z in R^3).
"Space" and "ConnectedElements" are connected to a lower abstraction level L1-2 within the connection-layer. A space can have many subspaces defined with respect to the space dimensions. In a spatial context, it is interesting to note that they are mostly fiat subspaces in the sense of (Smith and Varzi, 2000). Indeed, bona fide boundaries represent physical separators whereas fiat boundaries describe a fictional border, and most subspaces for human cognition have a fictional border (e.g. a room with an open door). The topological inward relation makes it possible to constitute different subspace LoDs (we can consider a building, or the first floor of that building, or the room 2/43 of that first floor …). "subSpace" therefore retains a domain knowledge source pointer that can be dedicated to one or many specific domains (it can be a subspace in regard to the ontology of buildings, to the archaeology temporal findings in Australia …). The concept of world objects results from the definition of (Billen et al., 2012), which is a mind conceptualization of an object that also follows the categorization of (Smith and Varzi, 2000). "worldObject" is a specialisation of "connectedElements" retaining a domain-related semantic pointer similarly to "subSpace" (a knowledge source mirroring the domain conceptualization). Geometries attached to these entities are useful for topological calculations, and the direct link to "subSpace" allows many possible queries for information extraction (testing the inclusion of a world object in a subspace, testing the intersection of two object geometries with a fiat boundary from a subspace …). "subSpace" and "worldObject" constitute the entry points on which domain ontologies can plug themselves to adapt to a specific application.

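The nesting of subspaces and the inclusion test between a world object and a subspace can be sketched in a few lines of Python. This is only an illustration under stated assumptions: geometries are reduced to axis-aligned boxes, and the class names `Box`, `SubSpace` and `WorldObject`, their fields, and the coordinates are hypothetical, not the schema of the prototype.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box used as a stand-in geometry (an assumption)."""
    lo: Point3
    hi: Point3

    def contains(self, other: "Box") -> bool:
        # Every axis interval of `other` lies inside the interval of `self`.
        return all(a <= c and d <= b
                   for a, b, c, d in zip(self.lo, self.hi, other.lo, other.hi))

    def intersects(self, other: "Box") -> bool:
        # The intervals overlap on every axis.
        return all(a <= d and c <= b
                   for a, b, c, d in zip(self.lo, self.hi, other.lo, other.hi))


@dataclass
class SubSpace:
    name: str
    geometry: Box
    domain_source: str                    # pointer to a domain knowledge source
    parent: Optional["SubSpace"] = None   # topological "inward" relation

    def lod_chain(self):
        """Names from the coarsest enclosing subspace down to this one."""
        chain, node = [], self
        while node is not None:
            chain.append(node.name)
            node = node.parent
        return list(reversed(chain))


@dataclass
class WorldObject:
    name: str
    geometry: Box
    domain_source: str


# Illustrative values only.
building = SubSpace("building", Box((0, 0, 0), (20, 10, 9)), "indoor-ontology")
floor1 = SubSpace("first floor", Box((0, 0, 3), (20, 10, 6)),
                  "indoor-ontology", building)
room = SubSpace("room 2/43", Box((0, 0, 3), (5, 4, 6)),
                "indoor-ontology", floor1)
duct = WorldObject("duct", Box((1, 1, 5.5), (4, 1.3, 5.8)), "indoor-ontology")
pipe = WorldObject("pipe", Box((4, 1, 4), (7, 1.2, 4.2)), "indoor-ontology")

print(room.lod_chain())
print(room.geometry.contains(duct.geometry))
print(room.geometry.contains(pipe.geometry),
      room.geometry.intersects(pipe.geometry))
```

Here the pipe crosses the fiat boundary of the room, so it intersects the subspace without being included in it. In a PostGIS setting the same predicates would normally be delegated to the spatial extension rather than hand-written.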
Level-2: Domain adaptation
As stated by (Tangelder and Veltkamp, 2007), "any fully-fledged system should apply as much domain knowledge as possible, in order to make shape retrieval effective". With the rise of online solutions, we have seen a great potential in using knowledge databases for classification to analogically associate shapes and groups of points with similar features. This association through analogy "is carried out by rational thinking and focuses on structural/functional similarities between two things and hence their differences. Thus, analogy helps us understand the unknown through the known and bridge gap between an image and a logical model" (Nonaka et al., 1996). This introduces the concept of data association for data mining, and relationships between seemingly unrelated data in a relational database or other information repositories. Domain ontologies enable the use and analysis of domain knowledge through explicit domain assumptions, while separating domain knowledge from operational knowledge. This shares interoperability notions with our proposed SPC structure: while one domain meta-model formalization is suited for some applications, another can be better adapted to others and produce different results that will be used differently. These will dictate how the final point cloud data model should be used (for which application).
Figure 4 Level-1 Connection-layer meta-model. It is directly linked to the Level-0: semantic patches constitute connectedElements. aggregatedElements and topological notions give flexibility to the depth of an element characterization. ConnectedElements can relate to one or multiple spaces defined by their dimensions. These are subsequently divided into subspaces regarding a concept from a domain knowledge characterization, similarly to the world objects (being a specialization of connectedElements).
Therefore, the level-2 meta-model is directly linked to different knowledge sources, which are specified in the level-1(-2) meta-model interfaces: "subSpace" and "worldObject". Their conceptual abstraction in between pure spatial data (point clouds) and specific domain-verse data constitutes a generic door for the potential connection to many level-2 domain specializations. This allows great flexibility and context adaptation to a very wide range of applications, limited only by the underlying domain ontology. Ultimately, the domain meta-model attached to the connection-layer meta-model, and indirectly to the generalization meta-model, constitutes the SPC model.
In a simple example (Figure 5), we illustrate over a basic indoor ontology the connection of a level-2 meta-model to the connection-layer meta-model. It contains 2 class elements (separatorElement and internalElement) specialized in 8 classes (transitionSeparator, verticalSeparator, horizontalSeparator, livingElement, mepElement, madeMadeStructElement, moveableElement, noise), which can also be specialized in a refinement process to get as close as possible to the abstract idea that the human mind has of a concrete or abstract object of thought. This crude example is inspired by already established BIM standards and is used on a simple test case to enable rapid perception of natural language requests. As such, one selected "worldObject" can be specialized and identified as an internalElement, a mepElement (Mechanical, Electrical, Plumbing), specifically a duct of the "subspace" room 4 in the higher LoD level "subspace" building 7, and attached by an "externalFixture" next to the exit "door". The possibility to play on all possible scales therefore opens the way to a flexible system that can be adapted to many real-world applications.
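The successive specializations described above can be mimicked with a simple parent map. The sketch below is hedged: the class names come from the text, but the assignment of the eight classes to separatorElement or internalElement is assumed from their names, and the helper functions `lineage` and `is_a` are hypothetical.

```python
# Hypothetical parent links: the grouping of the eight classes under
# separatorElement / internalElement is assumed from their names.
PARENT = {
    "separatorElement": "worldObject",
    "internalElement": "worldObject",
    "transitionSeparator": "separatorElement",
    "verticalSeparator": "separatorElement",
    "horizontalSeparator": "separatorElement",
    "livingElement": "internalElement",
    "mepElement": "internalElement",
    "madeMadeStructElement": "internalElement",
    "moveableElement": "internalElement",
    "noise": "internalElement",
    # One further refinement step, following the duct example in the text.
    "duct": "mepElement",
}


def lineage(concept):
    """Return the chain from a concept up to the level-1 entry point."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def is_a(concept, ancestor):
    """True if `ancestor` appears in the specialization chain of `concept`."""
    return ancestor in lineage(concept)


print(" -> ".join(lineage("duct")))
print(is_a("duct", "internalElement"), is_a("duct", "separatorElement"))
```

A full domain ontology would carry relations other than specialization (for instance the "externalFixture" attachment), which this flat map does not represent.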

PROTOTYPE & DISCUSSIONS
Any modeling choice is arbitrary and depends on the conscious or unconscious aspirations of the designer. Although our work responds to a concern for generalization at a spatio-semantic level, it is not totally independent of a certain context. For this reason, we wanted to clearly illustrate a privileged domain of application, indoor environments (for BIM, emergency response, inventory management, UAV collision detection …). This choice makes it possible to explore different scales and configurations to test our developments thoroughly. It is also ideal for the definition of new virtual spaces, and the GIS demand associated with such environments is ever increasing. Therefore, as the formalization of domains constantly evolves, the modeling and direct integration of level-2 domain meta-models will be explored in future research.
A first prototype is implemented that addresses the level-0 and level-1 SPC conceptual models. The implementation was developed under Linux (Ubuntu), using several Python, C++ and SQL libraries including psycopg, PDAL, GDAL, PCL, CCLib, pgPointcloud and PostGIS. We integrated the results in the open source DBMS PostgreSQL. Different information (spatial and radiometric) was fused following (Poux et al., 2016b), and a multi-scale voxel structure is computed to enable recognition of independent spatial entities through a voxel adjacency study. While the SPC model at Level-0 does not need any semantics to function (spatial attributes {X, Y, Z} only), the pertinence of semantics to domain applications greatly orients the creation of patches. Therefore, point grouping rules influence the performance of information extraction. While this is done spatially in the prototype via a voxel-based topological analysis as in (Poux et al., 2017), other criteria will be explored. Each detected element is parsed into semantic patches, arbitrarily subdivided with a maximum of 800 points per patch, and directly integrated in PostgreSQL. The database is populated and we obtain semantic patches that constitute connectedElements, retaining both spatial and semantic information. Using PostGIS and SQL statements (i.e. Table 1), information extraction is possible.

Table 1 Basic SQL statements for level-0/1 abstraction

The integration of semantics to create a more intelligent structure, such as "The connected element CC0065 is a chair named 'rocking_chair' made in 1962 for relaxing", expressed as an SQL statement (INSERT INTO moveableelement (connectedelement_id, type, title, date_prod, kind) VALUES ('65', 'chair', 'rocking_chair', '1990-07-13', 'relaxing');), also takes advantage of the SPC structure. Then, dual spatio-semantic queries leveraging the linked domain concepts from a level-2 meta-model mirror our real-world information gathering (e.g. the natural language query "I want to locate all cypress in the garden, calculate the min. distance to surrounding buildings and if the distance to buildings is inferior to their height, then I want to calculate how much length I can cut if they are not protected", illustrated in Figure 6). Tests were conducted with static point data only, but varying positions in space and time present additional challenges that will be investigated. While the present paper is a proof of concept that provides direct integration of hard-coded or computed domain knowledge, our next work will include the extensibility of the proposed model to other data types, as well as a better integration of learning routines and ontologies as knowledge sources. A direct workflow to bring intelligence in real time is currently being investigated.
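To make the patch subdivision and the combination of semantic and spatial criteria more concrete, the following sketch chains both steps. It is a minimal illustration under explicit assumptions: Python's built-in `sqlite3` stands in for PostgreSQL/PostGIS, the `patch` table layout, the bounding-box columns, the synthetic points and the query window are hypothetical, and only the 800-point limit and the `moveableelement` row are taken from the text.

```python
import sqlite3

MAX_POINTS_PER_PATCH = 800  # patch size limit stated in the text


def split_into_patches(points, max_points=MAX_POINTS_PER_PATCH):
    """Cut a list of (x, y, z) points into consecutive patches."""
    return [points[i:i + max_points]
            for i in range(0, len(points), max_points)]


def bounding_box(points):
    """Return (xmin, ymin, zmin, xmax, ymax, zmax) of a patch."""
    xs, ys, zs = zip(*points)
    return min(xs), min(ys), min(zs), max(xs), max(ys), max(zs)


# Synthetic element of 2000 points along a line (illustrative data only).
points = [(i * 0.001, 1.0, 0.5) for i in range(2000)]
patches = split_into_patches(points)

# sqlite3 is only a stand-in for PostgreSQL; the schema is hypothetical.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE patch (
        patch_id INTEGER PRIMARY KEY,
        connectedelement_id INTEGER,
        n_points INTEGER,
        xmin REAL, ymin REAL, zmin REAL, xmax REAL, ymax REAL, zmax REAL
    );
    CREATE TABLE moveableelement (
        connectedelement_id INTEGER, type TEXT, title TEXT,
        date_prod TEXT, kind TEXT
    );
""")
for patch in patches:
    db.execute(
        "INSERT INTO patch (connectedelement_id, n_points, xmin, ymin, zmin,"
        " xmax, ymax, zmax) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (65, len(patch), *bounding_box(patch)),
    )
db.execute(
    "INSERT INTO moveableelement VALUES (?, ?, ?, ?, ?)",
    (65, "chair", "rocking_chair", "1990-07-13", "relaxing"),
)

# Hybrid query: semantic filter (type = 'chair') plus a spatial filter
# (patch bounding box overlapping an axis-aligned query window).
window = (0.0, 0.0, 0.0, 1.0, 2.0, 1.0)  # xmin, ymin, zmin, xmax, ymax, zmax
rows = db.execute(
    """
    SELECT m.title, COUNT(*) AS n_patches, SUM(p.n_points) AS n_points
    FROM moveableelement AS m
    JOIN patch AS p USING (connectedelement_id)
    WHERE m.type = 'chair'
      AND p.xmin <= ? AND p.xmax >= ?
      AND p.ymin <= ? AND p.ymax >= ?
      AND p.zmin <= ? AND p.zmax >= ?
    GROUP BY m.title
    """,
    (window[3], window[0], window[4], window[1], window[5], window[2]),
).fetchall()
print([len(p) for p in patches], rows)
```

The bounding-box comparison is a coarse substitute for the geometric predicates a spatial extension provides; it only shows how a semantic attribute and a spatial constraint can be resolved in a single statement.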

CONCLUSION
Through the definition of articulated meta-models, we propose a new data model giving point clouds the possibility to retain semantic concepts. Considering the variability in their definition due to heterogeneous knowledge sources, we propose an all-round solution to semantic injection and spatio-semantic queries. Its conception through different conceptual levels allows better integration of point clouds in existing workflows, retaining a high interoperability potential with existing and future standards. While our proposed approach is illustrated on point cloud data, it can be extended to all kinds of complex data that represent physical components. A special focus was laid on the dual aggregation hierarchies of semantic feature types and geometric decompositions. Future work will include the integration of the multi-level meta-model with point cloud processing frameworks for knowledge discovery.

Figure 5 Level-2 meta-model example. separatorElement and internalElement are connected to the Level-1 meta-model directly through "worldObject" and "subSpace". It is a succession of specializations describing an indoor environment.

Figure 6 Example of a point cloud spatio-semantic query.

While 3-dimensional spaces are strongly inferred in the SPC model, 4-dimensional spaces integrating time or, by extension, n-dimensional spaces are possible characterizations for greater interoperability.
