IMPROVING SEMANTIC UPDATING METHOD ON 3D CITY MODELS USING HYBRID SEMANTIC-GEOMETRIC 3D SEGMENTATION TECHNIQUE

Urban entities such as building structures are becoming more complex as modern civilization continues to evolve. The ability to plan and manage territory, especially urban areas, is very important to every government. Planning and managing cities and urban areas based on printed maps and 2D data is increasingly insufficient and inefficient for coping with the complexity of new developments in big cities. The emergence of 3D city models has boosted efficiency in analysing and managing urban areas, as 3D data are proven to represent real-world objects more accurately. 3D city modelling has since been adopted as the new trend in building and urban management and planning applications. Nowadays, many countries around the world are generating virtual 3D representations of their major cities. The growing interest in improving the usability of 3D city models has resulted in the development of various analysis tools based on them. Today, 3D city models are generated for purposes such as tourism, location-based services, disaster management and urban planning. Meanwhile, modelling 3D objects is getting easier with the emergence of user-friendly 3D modelling tools on the market. Generating 3D buildings with high accuracy has also become easier with the availability of airborne Lidar and terrestrial laser scanning equipment. The availability and accessibility of this technology make it more sensible to analyse buildings in urban areas using 3D data, as they accurately represent real-world objects. The Open Geospatial Consortium (OGC) has accepted the CityGML specification as one of the international standards for representing and exchanging spatial data, making it easier to visualize, store and manage 3D city model data efficiently.
CityGML is able to represent the semantics, geometry, topology and appearance of 3D city models in five well-defined Levels of Detail (LoD), namely LoD0 to LoD4. The accuracy and structural complexity of the 3D objects increase with the LoD, where LoD0 is the simplest (2.5D; Digital Terrain Model (DTM) plus building footprint or roof print) and LoD4 is the most complex (architectural details with interior structures). Semantic information is one of the main components of CityGML and 3D city models, and provides important information for any analysis. However, more often than not, the semantic information is not available for the 3D city model because of the unstandardized modelling process. For example, a building is normally generated as one object, without specific feature layers such as Roof, Ground floor, Level 1, Level 2, Block A, Block B, etc. This research attempts to develop a method to improve the semantic data updating process by segmenting the 3D building into simpler parts, which makes it easier for users to select and update the semantic information. The methodology is implemented for 3D buildings in LoD2, where the buildings are generated without architectural details but with distinct roof structures. This paper also introduces a hybrid semantic-geometric 3D segmentation method that performs hierarchical segmentation of a 3D building based on its semantic values and surface characteristics, fitted by one of the predefined primitives. For future work, the segmentation method will be implemented as part of a change detection module that can detect changes on the 3D buildings, store and retrieve semantic information of the changed structures, automatically update the 3D models and visualize the results in a user-friendly graphical user interface (GUI).


INTRODUCTION
3D city modelling is fast becoming the new trend in building and urban management, while modelling 3D objects is getting easier with the emergence of user-friendly 3D modelling tools on the market. Nowadays, many countries around the world are generating virtual 3D representations of their major cities. Previously, these 3D city models were generated only for visualization purposes. With the involvement of major industry players such as Google and Microsoft, and with CityGML adopted by the OGC as a standard format for spatial data exchange (Kolbe et al., 2009), interest in 3D city modelling is increasing significantly. The growing interest in improving the usability of 3D city models has resulted in the development of various analysis tools based on them. Today, 3D city models are generated for purposes such as tourism, location-based services, disaster management and urban planning.
Theoretically, every spatial feature in each 3D city model object should be accompanied by its corresponding semantic information to ensure that the model is spatially and semantically correct. This can be seen in the CityGML format where, aside from the five levels of detail used to represent the spatial features of a city object, semantic data are also supported, which highlights their importance. To further emphasize the importance of coherent modelling of the semantic and geometrical features of a 3D city model object, CityGML supports semantic data not only within the CityGML file but also from other sources, such as external code lists for enumerative attributes, external references to other databases or sources, and generic attributes.
However, most 3D city models do not contain any semantic data because of unstandardized modelling techniques and a lack of tools to assist in this process, especially in the post-modelling stage.
This paper focuses on the development of a semantic data updating method for 3D buildings in CityGML format based on a 3D segmentation technique. Section 2 discusses past research related to 3D modelling in CityGML and 3D segmentation techniques. Section 3 describes the methodology used in developing the semantic updating process, and the result is presented at the end of that section. The last section discusses future work and the conclusions that can be drawn from this paper.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume II-2/W1, ISPRS 8th 3DGeoInfo Conference & WG II/2 Workshop, 27-29 November 2013, Istanbul, Turkey. This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper.

3D Buildings Generation
The emergence of better hardware and software in the computer-related industries makes it easier for users to generate 3D buildings. Some researchers have even focused on developing an automated process for generating 3D city models (Takase et al., 2003; Sugihara and Hayashi, 2008; Steinhage et al., 2010). The 3D building generation process is important in change detection and analysis since it provides the input for these applications. Isikdag and Zlatanova (2010) introduced an approach to draw and visualize simple geometric representations of 3D buildings directly in the Google Earth environment. Urban planning is known to be a complex and tedious process that involves many parties and joint decision making. They suggested that with this approach all parties involved can easily access and view the data and get a better picture of the proposed structures and how they relate to the environment on the actual site, rather than just looking at an architectural model that lacks information on the surrounding area.
Other research by Isikdag and Zlatanova (2009) focused on defining a framework to automatically generate buildings in CityGML using Building Information Models (BIM). Seamless integration between Building Information Models and GIS models is still scarce, and researchers are attempting to use IFC and CityGML for this purpose. Even though several studies have demonstrated data transfer from IFC models into CityGML, they still lack a formal and descriptive framework for the automatic generation of buildings in CityGML from IFC models. Isikdag and Zlatanova presented preliminary ideas for defining a semantic mapping that would allow automatic transformations between IFC and CityGML models.
Even though IFC models are more accurate, detailed and semantically rich, they tend to be very large, which increases the difficulty of storing and managing the data. Since the structures provided in IFC models are very complex, the models are prone to contain an excess of unimportant details. Furthermore, most of the knowledge fields and applications that use 3D models are not going to use the additional data except for very specific applications and analyses.
Kim et al. (2008) presented a method to automatically generate Digital Building Models (DBM) with complex structures (parts with different slopes, sizes and shapes) from LiDAR point clouds. The method consists of four steps. First, the points are classified as ground or non-ground based on a visibility analysis among ground and non-ground points in a synthesized perspective view. Then, the non-ground points are analysed and used to generate hypotheses of building instances based on the point attributes and the spatial relationships among the points. Next, each building is segmented into a group of planar patches. The intermediate boundaries of the segmented clusters are produced using a modified convex hull algorithm. These boundaries are used as initial approximations of the planar surfaces comprising the building model of a given hypothesis. Finally, the initial boundaries are used to derive a refined set of boundaries, which are connected to produce a wireframe representing the DBM.
The research by Kim et al. (2008) provides an interesting idea on how to automatically acquire high-quality 3D models with complex structures as input for detecting and analysing the changes that occur on 3D buildings, especially for building management purposes.
Wang et al. (2010) conducted research on building detection from high-resolution PolSAR data at the rectangle level by combining region and edge information. They proposed a new approach at the rectangle feature level to extract buildings from high-resolution polarimetric synthetic aperture radar (PolSAR) data, using both region-based and edge-based information. They start by employing low-level detectors to provide raw region and edge information of the scene. The rectangle features are then extracted from the edge detection results and further optimized to best fit the rough region-based building detection results. In the last step, a novel Markov random field (MRF) framework for rectangles is proposed, in which the data energy term of the rectangles is defined from the region information while the smoothness term is defined according to contextual prior knowledge about the buildings. Under this framework, the building rectangles are identified from the optimized rectangle candidates by minimizing the total energy. The effectiveness of the method was verified using real fully polarimetric SAR data.
Ledoux and Meijers (2009) carried out research on extruding buildings from their footprints in order to create topologically consistent 3D city models. They acknowledged that extruding a building from its footprint is a well-known technique and easy to implement, but if the topological relationships between footprints are ignored, the 3D model will most probably be topologically inconsistent, which consequently makes the model unusable for analysis.

Semantic Data Modelling in CityGML
CityGML is a data model for representing and exchanging spatial data, i.e. 3D city models, especially urban objects. According to Kolbe (2009), CityGML is an XML-based format and an application schema for the Geography Markup Language version 3.1.1 (GML3). The Open Geospatial Consortium (OGC) has accepted the CityGML specification as one of the international standards for representing and exchanging spatial data, along with GML3. CityGML is able to represent the semantics, geometry, topology and appearance of 3D city models in five well-defined Levels of Detail (LoD), namely LoD0 to LoD4. The accuracy and structural complexity of the 3D objects increase with the LoD, where LoD0 is the simplest (2.5D Digital Terrain Model (DTM)) while LoD4 is the most complex (architectural details with interior structures). Kolbe (2009) discussed in detail the role of CityGML in exchanging and representing 3D city models, the aim of CityGML development, its modelling aspects, recent applications and its relation to other 3D standards such as IFC and KML.
Previously, 3D city models were used mainly for visualization purposes, but the rapid development of 3D city modelling has prompted applications such as facilities management, building information modelling and simulation to use additional information about the city objects with standardized representations, as suggested by Kolbe (2009). Stadler and Kolbe (2007) discussed spatio-semantic coherence in the integration of 3D city models. Since most GIS data are thematically and spatially fragmented, they suggested that straightforward joining of 3D objects would result in geometrical inconsistencies such as cracks if the spatial and semantic entities are not coherent. They also suggested that the spatial and semantic information can be used together only when both kinds of data (semantic and geometric) share the same structure and are considered coherent. Spatio-semantic coherence plays an important role in validating the 3D city model and supporting data integration in order to establish spatial interoperability.

3D Segmentation on 3D buildings
3D model segmentation has been used in various fields such as medical technology, computer vision and geospatial applications. In all of them it serves the same purpose: to break down an object into simpler parts that can be manipulated for applications such as object analysis, feature extraction and classification, object recognition, model reconstruction and generalization. Although most of the segmentation methods used in geospatial and building-related applications are based on 2D segmentation (Mian et al., 2006; Tolt et al., 2006; Miliaresis and Kokkas, 2007; Sampath and Shan, 2010; Cheng et al., 2010), some research has delved into 3D segmentation, as presented by You et al. (2003), Hu et al. (2004), Thiemann and Sester (2004), Poupeau and Bonin (2006), Garcia (2009) and Manferdini and Remondino (2010).
Thiemann and Sester (2004) presented research on the segmentation of 3D buildings for generalization that used an adaptation of the algorithm by Ribelles et al. (2001). Ribelles et al. proposed a segmentation process that detects holes, bumps and notches on a 3D building model by segmenting the model with one or more planes of its boundary. The features are separated from the rest of the body: each plane divides the space into two half-spaces, with the space behind the plane defined as solid and the space in front of the plane as empty. The three main operations are detecting protrusions, detecting holes and detecting complex holes. A protruding feature is defined as the difference of the body and a half-space. Figure 1 shows two different segmentations of the same body, where a protrusion (left) is detected with a quality value of 3 while a complex hole (right) is detected with a better quality value of 3/7.
Figure 1. Two different segmentations on the same geometry with different quality values (Thiemann and Sester, 2004).
However, Thiemann and Sester (2004) considered the algorithm by Ribelles et al. (2001) to be a "brute force" method: trying all splits with all combinations of planes increases the complexity of the algorithm, and running it with four or more splitting planes makes it extremely time-consuming.
To counter these problems, Thiemann and Sester (2004) introduced an extension of the original algorithm based on the premise that reducing the number of Boolean operations reduces the complexity of the algorithm and its processing time.
They suggested that only one split-plane is used at first, and that two or more split-planes are used only if this yields no result. To avoid separating a poor protrusion feature before a good complex hole, they also introduced a heuristic in which only parts with a quality value smaller than 1 are considered valid. Figure 2 shows the segmentation of a building with 34 different split-planes.
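As a minimal sketch of how the quality value of Ribelles et al. (2001) and the validity heuristic of Thiemann and Sester (2004) might be applied, the Python snippet below computes the ratio of the new area of the splitting face to the old area of the facets lying in the split-plane, and keeps only candidates whose value is below 1. The function names and the area figures are hypothetical; the areas were chosen only so that the two candidates reproduce the values 3 and 3/7 quoted for Figure 1.

```python
from fractions import Fraction


def quality_value(new_area, old_area):
    """New area of the splitting face divided by the old area of the
    facets lying in the split-plane (smaller is better)."""
    if old_area <= 0:
        raise ValueError("old_area must be positive")
    return Fraction(new_area) / Fraction(old_area)


def valid_features(candidates, threshold=1):
    """Keep candidates whose quality value is below the threshold,
    sorted with the best (smallest) value first."""
    scored = [(quality_value(new, old), name) for name, new, old in candidates]
    return sorted((q, name) for q, name in scored if q < threshold)


# Hypothetical (name, new_area, old_area) triples for illustration only.
candidates = [("protrusion", 3, 1), ("complex hole", 3, 7)]
print(valid_features(candidates))
```

Under these assumed areas the protrusion is rejected because its value is not below 1, while the complex hole is retained.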
In other research, You et al. (2003) presented a methodology that indirectly supports the method proposed in this research, namely that fitting primitives can be used both for modelling and for partitioning 3D buildings. In You et al. (2003), several buildings are automatically modelled from LiDAR data by estimating fitted primitives. Several basic primitives that are often used in building designs (i.e. planes, cylinders, spheres) are fitted to the LiDAR data through surface and edge fitting. The method also supports high-order modelling primitives (i.e. ellipsoids and superquadrics) for irregular building structures. Figure 3 shows an example of a fairly complex building that is made up of several basic primitives.
Figure 3. Example of a building made up of several primitives (You et al., 2003).
Region growing segmentation is used on the LiDAR data to separate the buildings from other features (i.e. terrain, trees, lamp posts, etc.). Once all the primitives that make up the whole building are identified, the primitives are combined based on the CSG data model to generate a complete 3D building. Since there is no limit on the number of primitives used, the method can even generate a complex building automatically. You et al. (2003) evaluated the method by embedding the constructed models in the original LiDAR data to compare the primitives with the real-world LiDAR data of the buildings. The comparison between the constructed models based on primitive fitting and the LiDAR data is shown in Figure 4, and the constructed models for the whole area are shown in Figure 5 (You et al., 2003).
Figure 5. The primitive-based models for the whole study area (You et al., 2003).
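To make the idea of fitting a primitive and checking it against the measured points more concrete, the following Python sketch fits the simplest primitive, a plane, to a small set of 3D points by ordinary least squares and reports the root-mean-square residual. This is an illustrative stand-in, not the surface and edge fitting procedure of You et al. (2003): the model z = a*x + b*y + c, the function names and the sample points are all assumptions, and this form cannot represent vertical planes such as walls.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c.
    Returns ((a, b, c), rms_residual_in_z)."""
    n = len(points)
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    normal_matrix = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = det3(normal_matrix)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point set (too few or collinear in x, y)")
    coeffs = []
    for col in range(3):  # Cramer's rule: replace one column by the rhs
        m = [row[:] for row in normal_matrix]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    rms = math.sqrt(sum((a * x + b * y + c - z) ** 2 for x, y, z in points) / n)
    return (a, b, c), rms


# Hypothetical roof-like points lying on z = 0.5*x + 2.
roof_points = [(0, 0, 2.0), (2, 0, 3.0), (0, 2, 2.0), (2, 2, 3.0), (1, 1, 2.5)]
(a, b, c), rms = fit_plane(roof_points)
print(a, b, c, rms)
```

A small residual suggests that the points are well approximated by the plane; a large one suggests that another primitive, or a further split, is needed.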

3D Segmentation
Segmentation is basically a method to partition or break down an object into simpler parts for various objectives (object detection, analysis and management, texture mapping, etc.). Currently, 3D segmentation is heavily used in computer vision and medical technology. However, developments in 3D GIS have triggered the need to use it for geospatial-related applications (You et al., 2003; Hu et al., 2004; Thiemann and Sester, 2004). Generally, there are two principal types of segmentation: surface-type and part-type (Agathos et al., 2007; Shamir, 2008). Surface-type segmentation uses primitives such as planes, cylinders and spheres as an approximation of the mesh to create distinct surface regions. Part-type segmentation, on the other hand, creates volumetric parts by partitioning the mesh into meaningful or semantic components. Figure 6 shows the results of part-type and surface-type segmentation.

Region Growing Segmentation:
A number of segmentation techniques have been introduced in different fields. According to the survey by Shamir (2008), the simplest segmentation technique is region growing segmentation. Meanwhile, a comparative study of segmentation techniques by Attene et al. (2006) and work by several other researchers (Garland et al., 2001; You et al., 2003; Hu et al., 2004; Manferdini and Remondino, 2010) show that the suitable segmentation algorithms for man-made objects are those based on primitive fitting or 3D volumetric approaches.
Region growing is also known as a local-greedy approach: it starts with a seed element, examines its neighbouring elements and grows a sub-mesh incrementally by determining, based on predefined criteria, whether the adjacent elements should be added to the seed's cluster. Shamir (2008) provides pseudocode for the region growing algorithm. The concept of region growing is illustrated in Figure 7, taken from Agathos et al. (2007). Seed point A examines all its neighbouring elements and adds the qualifying elements to its cluster based on the predefined criteria.
Figure 7. Seed point A grows by adding adjacent elements to its segmentation region (Agathos et al., 2007).
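The region growing loop described above can be sketched in Python as follows, keeping the seed, queue and merge steps of the pseudocode attributed to Shamir (2008). The data layout (a dictionary of face normals and a dictionary of face adjacency) and the clustering criterion (the angle between a face normal and the seed normal) are assumptions made for illustration, and a plain FIFO queue is used; the source does not state which criterion or queue ordering is intended.

```python
import math
from collections import deque


def angle_between(n1, n2):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm = math.sqrt(sum(a * a for a in n1)) * math.sqrt(sum(b * b for b in n2))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))


def region_growing(normals, neighbours, max_angle_deg=10.0, min_size=1):
    """normals: face id -> normal vector; neighbours: face id -> adjacent ids.
    Returns a list of clusters, each a sorted list of face ids."""
    cluster_of = {}
    clusters = []
    for seed in normals:  # loop until all elements are clustered
        if seed in cluster_of:
            continue
        cid = len(clusters)
        clusters.append([])
        queue = deque([seed])
        while queue:  # loop until Q is empty
            s = queue.popleft()
            if s in cluster_of:
                continue
            # Assumed criterion: normal close to the seed's normal.
            if angle_between(normals[seed], normals[s]) <= max_angle_deg:
                cluster_of[s] = cid
                clusters[cid].append(s)
                queue.extend(n for n in neighbours.get(s, ()) if n not in cluster_of)
    # Merge clusters smaller than min_size into an adjacent cluster, if any.
    for cid, members in enumerate(clusters):
        if 0 < len(members) < min_size:
            target = next((cluster_of[n] for f in members
                           for n in neighbours.get(f, ())
                           if cluster_of[n] != cid), None)
            if target is not None:
                for f in members:
                    cluster_of[f] = target
                clusters[target].extend(members)
                clusters[cid] = []
    return [sorted(c) for c in clusters if c]


# Hypothetical strip of four faces: two facing up, two facing sideways.
normals = {"a": (0, 0, 1), "b": (0, 0, 1), "c": (1, 0, 0), "d": (1, 0, 0)}
neighbours = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(region_growing(normals, neighbours))
```

Faces rejected by one seed stay unclustered and later become seeds themselves, so every face ends up in exactly one cluster.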

3D Segmentation Based on Semantic-Geometric Decomposition
The proposed segmentation technique introduces a hybrid of part-type (often known as semantic) and surface-type (geometric) elements into the segmentation process, so that the segmentation result is based not only on fitted primitives but also on semantic properties. The technique is able to segment a 3D model directly in vector data format, which gives a more accurate result than segmenting raster data.
The data used in this research are generated to the LoD 2 standard. Even though CityGML supports higher levels of detail (namely LoD 3 and LoD 4), which include the architectural and interior features of a building respectively, the main structures are still the same as those represented in LoD 2. In other words, LoD 3 is basically the LoD 2 model with added architectural details such as doors, windows and openings. Thus, it is deemed sufficient for the 3D buildings to be generated in LoD 2, since the main structures of the building (which are important for the proposed technique) remain the same. However, while LoD 2 normally represents only limited semantic data of a building by default, i.e. RoofSurface and WallSurface, the proposed technique is able to add more specialized semantic information on top of the existing data, based on the segmented model. Figure 8 shows the schema for an LoD 2 building as part of the CityGML building schema (Kolbe, 2009).

Semantic Segmentation:
In order to preserve the semantic data during the segmentation process, the 3D building first needs to be segmented based on its semantic attributes.
The semantic data can be provided together with the geometry in the CityGML file of the 3D building. Figure 9 shows a CityGML file that contains the semantic data of a building.

Figure 9. Semantic attributes in CityGML file
In this phase, the geometries that represent the semantic data are differentiated from each other, regardless of the shapes they form. The parts segmented on the basis of semantics are known as parent geometries. The child geometries are obtained using the segmentation technique based on fitting primitives.
Where the semantics are present in the CityGML structure, the proposed segmentation method classifies surfaces such as WallSurface according to their semantics, recognizes them as parent geometries and uses them as the starting point for geometric segmentation. This also enables the method to retain the semantic information of the child objects, which is inherited from the parent geometries. If the model is not a semantic model, the segmentation method attempts to derive the semantics from user-predefined criteria for certain features (if any), such as the roof, before proceeding with the geometric segmentation. Figure 10 shows the result of the semantic-type segmentation.
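A minimal sketch of the semantic segmentation step is given below: the boundary surfaces of a building are read from a CityGML fragment and grouped by their semantic class, so that each group can serve as a parent geometry. The sketch uses Python's standard xml.etree.ElementTree module. The CityGML 2.0 building namespace, the sample fragment, its identifiers and the function name group_by_semantics are assumptions for illustration; the paper does not state which CityGML version or parser was used.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed namespaces (CityGML 2.0 building module and GML).
BLDG = "http://www.opengis.net/citygml/building/2.0"
GML = "http://www.opengis.net/gml"

# Hypothetical, heavily simplified LoD 2 building without geometry.
SAMPLE = f"""<bldg:Building xmlns:bldg="{BLDG}" xmlns:gml="{GML}" gml:id="BuildingA">
  <bldg:boundedBy><bldg:RoofSurface gml:id="roof_1"/></bldg:boundedBy>
  <bldg:boundedBy><bldg:WallSurface gml:id="wall_1"/></bldg:boundedBy>
  <bldg:boundedBy><bldg:WallSurface gml:id="wall_2"/></bldg:boundedBy>
</bldg:Building>"""


def group_by_semantics(xml_text):
    """Map each semantic class (e.g. WallSurface) to the ids of its surfaces."""
    root = ET.fromstring(xml_text)
    parents = defaultdict(list)
    for bounded in root.iter(f"{{{BLDG}}}boundedBy"):
        for surface in bounded:
            semantic_class = surface.tag.split("}")[1]  # strip the namespace
            parents[semantic_class].append(surface.get(f"{{{GML}}}id"))
    return dict(parents)


print(group_by_semantics(SAMPLE))
```

Each key of the returned dictionary corresponds to one parent geometry; the geometric segmentation would then run separately within each group.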

Geometric Segmentation:
Since the model has already been segmented based on its semantics, the geometric segmentation starts with each segmented parent instead of with the whole model. Segmenting the parent geometries allows complex shapes to be broken down into simpler parts, making it easier to analyse the segmented structures thoroughly while the semantic data remain intact. Segmentation by fitting primitives attempts to identify shapes from a small family of primitives, such as cuboids and cylinders, in the complex shapes of the parent geometries.
In this phase, the segmentation method extracts the boundary nodes and classifies them according to their parent geometry and surface ID. This means that every geometric face in the parent geometries is given a unique ID. Then, each node of every face in the parent geometries is projected orthogonally onto the relevant boundary lines.
The process then moves on to a connecting face that shares a boundary with the previous face and repeats the same steps, but this time the newly accepted nodes are also taken into account if they were projected onto the common boundary. Figure 11 shows the selection of faces in the segmentation process.
Figure 11. The selection of faces.
This process is carried out face by face until all faces have been visited. After visiting the last face, the segmentation validates all projected nodes to ensure that all accepted nodes have been projected onto every eligible boundary. Figure 12 shows the results of the segmentation process.
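The orthogonal projection of a node onto a boundary line, which is the basic operation of this phase, can be sketched in Python as follows. The acceptance rule used here (a projected node is accepted only if its foot point falls within the boundary segment) is an assumption for illustration, since the text does not spell out the criterion; the function names and coordinates are likewise hypothetical.

```python
def project_onto_line(p, a, b):
    """Orthogonally project point p onto the line through a and b.
    Returns (foot_point, t), where foot_point = a + t * (b - a)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        raise ValueError("boundary line is degenerate (a == b)")
    t = sum(x * y for x, y in zip(ap, ab)) / denom
    foot = tuple(ai + t * c for ai, c in zip(a, ab))
    return foot, t


def accepted_projection(p, a, b, eps=1e-9):
    """Return the projected node if it lies on segment a-b, else None
    (assumed acceptance rule)."""
    foot, t = project_onto_line(p, a, b)
    return foot if -eps <= t <= 1 + eps else None


# Hypothetical boundary edge along the x axis and two candidate nodes.
edge_start, edge_end = (0.0, 0.0, 0.0), (4.0, 0.0, 0.0)
print(accepted_projection((2.0, 3.0, 0.0), edge_start, edge_end))
print(accepted_projection((6.0, 1.0, 0.0), edge_start, edge_end))
```

The first node projects onto the middle of the edge and is accepted; the second projects beyond the end of the edge and is rejected under the assumed rule.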

Updating the Semantic Data from the Segmented Model
The segmented model contains a number of segments based on a group of primitive shapes; in other words, it is a combination of the primitive shapes that make up the whole building. Since the model has previously been segmented by its semantic data, any specialized semantic data for a particular segment can be entered at this stage. For example, in Figure 13 the Wall Surface of Building A contains two main blocks, Block A1 and Block A2. Within Block A1 there are two more segments that represent different parts of the block, namely the Lobby and the Foyer. By selecting the segmented parts, users can enter their semantic information, such as name, function, type, capacity, etc.
Figure 13. Hierarchy of specialized semantic data
Since most 3D building models are generated without proper semantic data for each of their components, the proposed method appears able to solve part of the problem, in that semantic data and their related spatial features (the segmented parts) can be added in the post-modelling process. Figure 14 shows the result of the segmentation process for semantic updating.
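The hierarchy of specialized semantic data described above can be represented as a simple tree in which each segment holds its own attributes and its child segments. The Python sketch below is one possible representation, not the authors' data model: the Segment class, its methods and the attribute values entered for the Lobby are hypothetical, while the segment names follow the Building A example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Segment:
    name: str
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    parent: Optional["Segment"] = field(default=None, repr=False, compare=False)

    def add_child(self, name, **attributes):
        """Create a child segment that inherits its place in the hierarchy."""
        child = Segment(name, dict(attributes), parent=self)
        self.children.append(child)
        return child

    def path(self):
        """Semantic path from the root down to this segment."""
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return "/".join(reversed(names))

    def find(self, name):
        """Depth-first search for a segment by name; None if absent."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None


building = Segment("Building A")
walls = building.add_child("Wall Surface")
block_a1 = walls.add_child("Block A1")
walls.add_child("Block A2")
block_a1.add_child("Lobby")
block_a1.add_child("Foyer")

# A user selects the Lobby segment and enters semantic data (made-up values).
lobby = building.find("Lobby")
lobby.attributes.update(function="reception", capacity=50)
print(lobby.path(), lobby.attributes)
```

Keeping a reference to the parent lets each child report the semantics it inherits, which mirrors the parent and child geometries of the segmentation.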

CONCLUSIONS AND FUTURE WORK
The concept of 3D city models has gained a lot of interest and is widely accepted, especially in geospatial-related disciplines. The concept has been strengthened by the acceptance of CityGML as its standard, and the participation of big market players has resulted in the rapid development and production of hardware and software that efficiently assist the 3D city modelling process. The cost and time needed to generate a 3D city model object have been cut down significantly because modelling can now be done not just by trained users but also by the general public. CityGML is able to represent the geometrical features of 3D city model objects in five levels of detail, and it also emphasizes the corresponding semantic information to ensure that the model is semantically and geometrically coherent. However, more often than not, the semantic information of 3D city model objects is not available. The problem occurs because of inconsistencies in modelling techniques and a lack of tools to assist semantic information management. This paper proposed a method to add semantic information to a 3D city model object by using a hybrid semantic-geometric 3D segmentation method. Basically, the proposed method segments the 3D model based on the existing semantic data and then segments it further based on its geometry. This enables users to add more specialized information on top of the existing semantic data. If no semantic data are available in the CityGML file, users can provide the data through other sources such as code lists, external references or generic attributes. If there are still no data available, the method proceeds by segmenting the model geometrically. The proposed method attempts to improve the semantic updating process for 3D city models by giving users the capability to enhance the semantic data in the post-modelling stage. The segmentation method used in this paper is able to preserve the semantic data of the structure as well as to break down the semantically segmented parts further based on their geometry. The method can also be used for other analyses such as spatio-temporal analysis of urban development, 3D generalization and automatic feature extraction.
For future work, the proposed method will be implemented in change detection for 3D city models in order to analyse the changed structures thoroughly. The results of the analysis will be stored in a 3D Geo-DBMS and visualized in a 3D city model environment.
The same polyhedron can be segmented with different operations and combinations of planes, which results in many different features. To determine the best segmentation result, Ribelles et al. (2001) used the equation below, where a smaller value indicates a better feature:
Quality value = New area (of the splitting face) / Old area (of facets lying in the split-plane)

Figure 8. Schema for LoD 2 building in CityGML (Kolbe, 2009)

Region growing pseudocode (Shamir, 2008):
    Initialize a queue Q of elements
    Loop until all elements are clustered
        Choose a seed element and insert to Q
        Create a cluster C from seed
        Loop until Q is empty
            Get the next element s from Q
            If s can be clustered into C
                Cluster s into C
                Insert s neighbours to Q
    Merge small clusters into neighbouring ones

Figure 10. The result of the semantic-type segmentation
Figure 12. The results of the segmentation process

Figure 14. Segmentation result for semantic data updating process