VOLUME COMPARISON OF AUTOMATICALLY RECONSTRUCTED MULTI-LOD BUILDING MODELS FOR URBAN PLANNING APPLICATIONS

3D city models are playing a growing role worldwide as sources of integrated information upon which different urban applications are developed. In the context of urban planning and design, semantic 3D city models can provide plenty of qualitative and quantitative information about the urban context and the area(s) to be transformed. This paper builds on and continues recently published work in which several design parameters and Key Performance Indicators are computed from a semantic 3D city model, and later used in a GIS-supported urban design process to develop a new area. As many of these parameters are derived from the gross volume of the building stock, this paper investigates whether and to what extent different building stock models might affect the estimation of the gross volume. The study is carried out in anticipation of the upcoming LoD2-based, country-wide model of the Netherlands that is being finalised by our team. At the same time, the paper investigates whether, and which, information can be obtained regarding the quality of the LoD2 model from a comparison with the LoD1 one, with a focus on volume calculation.


INTRODUCTION
Holistic urban planning requires much qualitative and quantitative knowledge of the urban context and of the area(s) to be transformed. This applies not only to the built-up areas, but also to open spaces and existing infrastructures (above and below ground), and covers both the current situation and the estimation of impacts by the envisioned scenarios. Creating and sharing this knowledge among the different stakeholders and practitioners can, however, still be a rather complex and time-consuming process. At the same time, the number of cities in the world creating and using 3D city models as "digital geo-twins" (Lehner and Dorffner, 2020) has been growing continuously in the last decade, also thanks to the steady advances in all geomatics-related disciplines. The adoption of a semantic 3D city model is associated with several beneficial effects for a city, as it can be seen as a source of integrated and harmonized spatial and non-spatial information to be used for many applications. As of today, different applications based on and exploiting the added value of virtual 3D city models have been documented in literature. They range from noise mapping and augmented reality to energy simulation tools (e.g. Stoter et al., 2020; Blut and Blankenbach, 2021; Wang et al., 2020; Rossknecht and Airaksinen, 2020). Biljecki et al. (2015) provide a review of applications based on 3D city models. Further, more recent examples are described in Bao et al. (2020) and HosseiniHaghigh et al. (2020).
In the context of urban planning, this paper builds on and continues the work recently published by Agugiaro et al. (2020), in which a new approach was proposed to support the computer-assisted urban planning and design process by exploiting a semantic 3D city model. Additionally, a prototypic software tool was also presented and described. According to the authors, urban transformations of the future, generally defined as "the city of tomorrow", always relate to the existing city (i.e. "the city of today"). The idea is then to extract a set of Key Performance Indicators (KPIs) from a 3D city model and to use them, on the one hand, to analyse the "city of today" and, on the other hand, to provide quantitative support to the urban planner in creating a number of design proposals for a new development area. More specifically, the proposed approach elaborates on a widely used notion in urban planning, i.e., the number of households per hectare, as one of the common units of measure for urban housing density (Torrents and Alberti, 2000). Generally, the term "household" refers to the number of people sharing the same living space (e.g., a family), and, although often used in urban planning, the actual size of the physical space used by a household (i.e., a dwelling) is seldom considered, neither in 2D nor in 3D. Therefore, a set of parameters computed from the semantic 3D city model is introduced to estimate the volume of the physical space "used" by a dwelling, both in terms of residential and non-residential spaces. These volumetric parameters are then used to support the successive urban design process. As such parameters are derived from the gross volume of the buildings, it is crucial that the building stock is accurately represented by the city model, as it will affect the successive analyses and design phases. The aforementioned prototypic software tool computes the KPIs from a semantic 3D city model of Amsterdam (more details about it in section 3).
The estimation of the gross volume of the building stock by means of the 3D city model and, more specifically, from different LoDs (Levels of Detail) is the starting point of this paper. The topic has already been investigated to different extents by other authors (e.g. Macay Moreira et al., 2013; Wate et al., 2016; Biljecki et al., 2018). Here we will restrict the reasoning to the CityGML "world", as the number of possibilities tied to the different LoDs and the rather loose definition of LoD in the CityGML specifications can already lead to rather different results. Biljecki et al. (2016), for example, have suggested a more refined classification of LoDs. This classification is now widely accepted and used, especially in the Netherlands, and is presented schematically in Figure 1. Another common problem associated with 3D city models is the lack of information (metadata) documenting, for example, the source data and the reconstruction process that has led to the final product (Labetski et al., 2018). This regards all possible objects in a city model, both in terms of geometry and associated thematic data, and therefore also the computation of the gross volume of the building stock. As Figure 1 exemplifies, even in 3D city models with LoD1.x buildings the value of the enclosed volume can vary considerably, depending on the reconstruction technique. Still, it is generally accepted that LoD1 (building) models or, more precisely, LoD1.0 to LoD1.2 models can be rather easy to obtain, depending on the available data. The most common approaches are to extrude the footprint by a certain height value, which is either obtained from spatial data, e.g. lidar point clouds or a normalised DSM (nDSM), or computed by multiplying the number of floors by a certain offset. The overall loss of accuracy in terms of enclosed volume is counterbalanced by the possibility to generate such models rather quickly and for large areas.
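The two extrusion variants mentioned above can be illustrated with a minimal sketch. It is not taken from any of the cited works: the function names, the rectangular toy footprint and the 3 m storey height are assumptions made only for illustration.

```python
def shoelace_area(ring):
    """Planar area of a simple polygon given as [(x, y), ...] (ring not closed)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def extruded_volume(ring, height):
    """Volume of a block obtained by extruding the footprint by one height."""
    return shoelace_area(ring) * height


def height_from_floors(n_floors, floor_height=3.0):
    """Height estimate from the number of floors; 3.0 m per floor is an assumption."""
    return n_floors * floor_height


# Hypothetical 10 m x 8 m footprint (coordinates in metres).
footprint = [(0.0, 0.0), (10.0, 0.0), (10.0, 8.0), (0.0, 8.0)]

v_measured = extruded_volume(footprint, 9.5)                  # height from lidar/nDSM
v_floors = extruded_volume(footprint, height_from_floors(3))  # height from floor count
```

Even on this toy building the two height sources give different volumes (760 m³ and 720 m³), which is the kind of spread between LoD1.x variants that the text refers to.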
Still, automatic generation of city-wide LoD2 models is a current topic of research worldwide, as both the quality of available surveyed data and the development of new 3D reconstruction algorithms steadily improve (Rottensteiner et al., 2014; Lafarge, 2015). In terms of data availability for 3D city modelling, the Netherlands have a long tradition of providing high-quality, country-wide, up-to-date open (geo)data that can be accessed mainly via a centralised national portal (PDOK, 2021). Thanks to the already available datasets, and their continuous improvements, basic data needed for a country-wide semantic 3D "city" model are already available. In particular, when it comes to the modelling of the building stock, an LoD1 model of the whole Netherlands, consisting of over 10.2 million buildings, has been available since 2019. It is called 3D BAG (Dukai et al., 2019) and it extends the Dutch BAG (Basisregistratie Adressen en Gebouwen) dataset (BAG, 2021), which contains information about each address in a building, its current main use (residential, commercial, industrial, etc.), the year of construction and the registration status. The 3D Geoinformation group at TU Delft is currently working on releasing a new version of the 3D BAG, which will contain automatically reconstructed LoD2.2 buildings for the whole Netherlands, as well as LoD1.2 and LoD1.3 buildings derived from the LoD2.2 models by means of progressive 3D generalisation. The final version of this new 3D BAG dataset is expected to be released in the course of 2021. The LoD2.2, LoD1.3 and LoD1.2 building models that are used in this paper are a preliminary preview of this new dataset. A brief overview of the underlying building reconstruction process is given in section 2.
The main question this paper deals with is whether the LoD2.2 geometries provide better volume estimations than LoD1.3 (and LoD1.2). As a test case, the aforementioned city model of Amsterdam will be used as reference. At the same time, the idea is to compare the enclosed volume of each LoD model in the new multi-LoD dataset to assess its quality and check whether potential 3D reconstruction errors can be automatically (or semi-automatically) identified and classified.

AUTOMATIC 3D RECONSTRUCTION OF LOD2.2 (AND LOD1.X) BUILDINGS
The building geometries used in this paper were reconstructed using an improved version of the LoD1.3 reconstruction method described in Stoter et al. (2020). This method creates building solids automatically from a set of building footprints and a classified aerial lidar point cloud. Following is a description of the source data and a summary of the method.
All source data are taken from PDOK, the aforementioned national open-geodata portal of the Netherlands. The building footprints come from the BAG dataset, containing information on all buildings in the Netherlands and the associated addresses. The building footprints are 2D polygons and have a minimum planimetric accuracy of 30 cm (BAG, 2021). The source for the elevation data is the AHN3 (Actueel Hoogtebestand Nederland) dataset. This is a classified aerial lidar point cloud that was collected in 2014-2016 in the study area. It has an average point density of 6-10 points/m², a planimetric accuracy of at most 13-23 cm and an elevation accuracy of at most 10-20 cm (AHN, 2021). For the reconstruction process, only the AHN3 points that are classified as ground or as buildings are used. From the BAG dataset only those footprints are selected that are current and for which elevation data are available in the AHN3 dataset. This means that all buildings constructed or demolished after 2014-2016 are filtered out.
The general principle of the developed building reconstruction method consists in creating a partition of the building footprint into roof parts, where each roof part is labelled with a roof plane. This is called the roof partition. The line geometries that induce this partitioning are extracted from the point cloud and a solid geometry for the building can be obtained by simply extruding this partitioned footprint. The overall method is illustrated in Figure 2. The first three steps (a-c) are identical to the procedure described in Stoter et al. (2020). First, the lidar points for each building are selected using a point-in-polygon procedure with the BAG footprints (Figure 2a). Then a region-growing plane detection algorithm is used to identify roof and wall planes in the point cloud (Figure 2b). From the detected roof planes, the boundary lines are extracted (Figure 2c). These lines, as well as the detected planes and the original lidar points, are then used to create the roof partition (Figure 2d) by means of a graph-cut optimisation approach, as described in Zebedin et al. (2008). The LoD2.2 building solid is created by extrusion from the roof partition (Figure 2e). The LoD1.3 building solid (Figure 2f) is also created by extrusion, but from a generalised roof partition. This generalisation is performed by setting each roof part to the 70th percentile of the lidar points of the corresponding roof plane and by merging roof parts with a height discontinuity smaller than 3 m (i.e. the approximate height of a floor). The LoD1.2 building solids are created by extruding the BAG footprint to the 70th percentile of all lidar points in the roof planes. The elevation of the building ground surfaces is determined by taking the 5th percentile of the elevation values of the lidar ground points in a radius of 4 m around the building.
In addition, in case the point cloud indicates that there are ground planes within the building footprint, these parts are removed from the roof partition. For example, this can happen with underground parking lots and can lead to the creation of a multi-part building from a single footprint. With the described method applied to the city model of The Hague, the root mean square error (RMSE) between the lidar points and the reconstructed building solid surfaces is less than 30 cm for 98% of the points and less than 10 cm for 63% of the points.
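The percentile-based height rules used for the LoD1.2 solids (70th percentile of the roof points, 5th percentile of the nearby ground points) can be sketched as follows. This is only an illustration, not the reconstruction software itself: the linear-interpolation percentile definition, the function names and the elevation values are assumptions.

```python
def percentile(values, q):
    """q-th percentile (0-100), linear interpolation between closest ranks.

    The interpolation rule is an assumption; the source does not say which
    percentile definition the reconstruction software uses.
    """
    if not values:
        raise ValueError("percentile() needs at least one value")
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def lod12_heights(roof_z, ground_z):
    """Return (ground elevation, roof elevation, extrusion height) in metres."""
    ground = percentile(ground_z, 5)   # 5th percentile of nearby ground points
    roof = percentile(roof_z, 70)      # 70th percentile of all roof-plane points
    return ground, roof, roof - ground


# Hypothetical elevations (m) of lidar points on the roof planes and of
# ground points within the search radius around the footprint.
roof_z = [10.0, 10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, 15.0]
ground_z = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

ground, roof, height = lod12_heights(roof_z, ground_z)
```

Using a high roof percentile and a low ground percentile makes the extrusion height fairly robust to outliers, but it also means that for pitched roofs the block reaches above the mean roof height, which tends to increase the enclosed volume.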

VOLUME COMPARISON AND EVALUATION OF THE RECONSTRUCTED BUILDINGS
This section compares the above-mentioned LoDx building models (i.e. LoD2.2, LoD1.3, LoD1.2) with a so-called volumetric building model, which is based on a normalised DSM, was generated for the city of Amsterdam and is used here as ground truth. For the sake of clarity, from now on the 4 models will be referred to as the nDSM, LoD22, LoD13 and LoD12 models. First, a brief description of the 3D city model used as reference, i.e. the nDSM model, is given. Then the analyses are carried out and the results are presented.
The nDSM model was generated by gathering heterogeneous, spatial, and non-spatial datasets that were then harmonised and integrated. The complete list of datasets used, as well as details regarding the data integration process, can be found in Agugiaro et al. (2020). Here only the most important characteristics will be mentioned. The nDSM model contains all buildings of Amsterdam (circa 171000), modelled as single-part buildings. As said before, the city model is "frozen" to the year 2016 due to the last availability of the AHN3 data (and derived products such as the DSM and the DTM). The nDSM model follows the CityGML 2.0 specifications and is stored in a 3D City Database instance (Yao et al., 2018). Several candidate datasets were initially considered to model the building geometries. However, issues were found in both the existing 3D BAG LoD1 model and the LoD2 model provided by the Municipality of Amsterdam. Therefore, and given the relevance of estimating the enclosed volume of buildings, a different approach was adopted: the focus was put less on the building "outer" shape and more on the enclosed volume, which was derived from the normalised DSM, i.e. the difference between the available DSM and DTM datasets (also available via PDOK). The volume is computed by intersecting the nDSM with the BAG footprints. As a result, the so-called volumetric city model was generated, as this was found to be the least error-prone approach, and to best serve as input for the urban planning application built on top of it. A visual example is given in Figure 3.
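One possible way to compute such an nDSM-based volume is sketched below. It is a simplified illustration and not the procedure of Agugiaro et al. (2020): the rasters are plain nested lists, a cell counts as inside the footprint when its centre does, row 0 is assumed to lie at the lower edge of the grid, and all names and values are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test for a simple polygon given as [(x, y), ...]."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def ndsm_volume(dsm, dtm, ring, cell_size, origin=(0.0, 0.0)):
    """Sum (DSM - DTM) * cell area over cells whose centre is in the footprint.

    Assumptions: both grids share the same layout, row 0 is at origin y and
    rows grow with y, and negative nDSM values are clamped to zero.
    """
    ox, oy = origin
    cell_area = cell_size * cell_size
    volume = 0.0
    for r, (dsm_row, dtm_row) in enumerate(zip(dsm, dtm)):
        for c, (z_surface, z_terrain) in enumerate(zip(dsm_row, dtm_row)):
            cx = ox + (c + 0.5) * cell_size
            cy = oy + (r + 0.5) * cell_size
            if point_in_polygon(cx, cy, ring):
                volume += max(z_surface - z_terrain, 0.0) * cell_area
    return volume


# Hypothetical 4 x 4 grids with 1 m cells: flat terrain at 1 m and a
# 6 m high building covering the four central cells.
dtm = [[1.0] * 4 for _ in range(4)]
dsm = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 7.0, 7.0, 1.0],
    [1.0, 7.0, 7.0, 1.0],
    [1.0, 1.0, 1.0, 1.0],
]
footprint = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]

volume = ndsm_volume(dsm, dtm, footprint, cell_size=1.0)
```

Because the volume is integrated directly from the raster, anything the DSM records above the footprint contributes to it, while the exact roof shape plays no role.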
From a thematic point of view, national and local datasets containing building information (e.g. year of construction, building usage, etc.) were fused and integrated with the geometries. A set of rules was defined to classify all buildings into 5 main classes, corresponding to fully "Residential", "Mixed-use", "Non-residential (single-function)", "Non-residential (multi-function)", or "Unknown". In order to align the reference nDSM model with the 3 LoDx models, only those buildings identified by the same ID were chosen, resulting in circa 167000 buildings to be used in the successive comparison analyses. The reason for using the whole city model of Amsterdam is that it offers a significant number of heterogeneous buildings, covering different shapes, sizes, usages and years of construction, and it has been cleaned up sufficiently to be trusted as a reasonably accurate ground truth.
Table 2 presents an overview of the distribution of the buildings (and their gross volume) according to their function classes, as well as the sum of the gross volume for each model. It shows that the residential class is the most relevant, both in terms of number of buildings and gross volume, followed by mixed-use and non-residential buildings. Buildings of unknown function are numerically significant (26.4%) but their contribution in terms of volume is only 8.1%. In addition, looking at the average size of the footprint, it can be seen that the class of function "unknown" contains buildings with rather small(er) footprints. From a simple visual inspection, it can be confirmed that most of them are garages, sheds or other ancillary, small buildings. In the nDSM model, residential buildings account for 34.3% of the gross volume, followed by non-residential (single-function) and mixed-use buildings (26.3% and 24.2%, respectively). Finally, non-residential (multi-function) buildings represent 7.1% of the whole gross volume. Table 2 also contains the corresponding values for the LoDx models in terms of gross volume, volume distribution and (in the coloured columns) volumetric difference with regard to the nDSM model. Within each model, the distribution of the volume over the classes is approximately the same, and is comparable with that of the nDSM model. In terms of volume differences, the LoD22 and the LoD13 models both yield circa 11% larger volumes than the nDSM model. Looking at the class-specific volumetric differences of residential buildings, the values are quite similar for both LoD22 and LoD13 (+4.5% and +3.7% respectively) and indicate a rather good correspondence between the nDSM and these 2 models. The differences are considerably larger in the case of non-residential and mixed-use buildings, where all values are in the range between 13.0% and 18.6%.
This could indicate that the reconstruction process of the buildings in these classes might be more problematic, as it deals with buildings having on average larger footprints and (likely) more irregular shapes, i.e. these are models with a higher complexity. Finally, all figures in the LoD12 model are generally worse than the others. The overall gross volume difference, for example, reaches 19.6%.
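The class-specific differences discussed above are aggregated figures, i.e. the summed model volume of a class compared with the summed nDSM volume of the same class. A minimal sketch of that aggregation is given below; the record layout, the field names and the toy volumes are assumptions made for illustration and are not values from Table 2.

```python
from collections import defaultdict


def class_volume_differences(buildings, reference="ndsm", model="lod22"):
    """Aggregated volume difference (in %) per function class.

    For each class: 100 * (sum of model volumes - sum of reference volumes)
    / sum of reference volumes.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # class -> [reference, model]
    for b in buildings:
        totals[b["function"]][0] += b[reference]
        totals[b["function"]][1] += b[model]
    return {
        cls: 100.0 * (mod - ref) / ref
        for cls, (ref, mod) in totals.items()
    }


# Made-up gross volumes in cubic metres, for illustration only.
buildings = [
    {"function": "Residential", "ndsm": 1000.0, "lod22": 1040.0},
    {"function": "Residential", "ndsm": 1000.0, "lod22": 1050.0},
    {"function": "Mixed-use", "ndsm": 2000.0, "lod22": 2300.0},
]

differences = class_volume_differences(buildings)
```

An aggregated difference of this kind is dominated by the largest buildings of a class, which is why it is complemented by a per-building analysis.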
As the next step, an analysis of the volume differences between the LoDx and the nDSM models was carried out. Table 3 contains the percentile-based characterisation of the distribution of the volume differences (in %, always referred to the nDSM). Figure 5 shows the distribution of the differences of the 3 LoDx models grouped in intervals of growing size. In general, all 3 distributions are shifted towards the positive x-axis, which is in line with the previous findings in terms of aggregated volume differences. The RMSE is similar for the LoD22 and LoD13 models (31.2% and 33.9%, respectively), and is 58% for the LoD12 model. However, if we consider only the 5-95 percentile interval, the RMSE values drop to 6.4% for the LoD22 model, 10.9% for the LoD13 model and 15.6% for the LoD12 model. In particular, Figure 6 shows that over 50% of the LoD22 buildings differ volumetrically no more than +5% from the corresponding nDSM value. If the interval is extended to all differences between -10% and +10%, which can be considered an acceptable threshold in terms of maximum volume difference, the percentage grows to over 80% of all buildings in the LoD22 model. If we keep the same interval, i.e. [-10%, +10%], the share of buildings drops to circa 55% in the LoD13 model, and to circa 45% in the LoD12 model.
Table 3. Percentile-based characterisation of the distribution of gross volumes differences (in %) in the LoDx models with respect to the nDSM model. The RMSE is computed on the whole sample and for the 5-95 percentile interval.
A specific investigation of the remaining 20% of the LoD22 buildings, i.e. those with a volume difference larger than ±10%, was carried out. Most of the buildings in this category are multi-part buildings, and the graph in Figure 7, analogous to the one in Figure 6, shows that only circa 10% of the LoD22 multi-part buildings are within the ±10% threshold of volume differences. The percentage is slightly higher in the LoD13 model, and smaller in the LoD12 model. This can be considered a first simple indicator of where to look for possible problems in the 3D reconstruction process. In the city model of Amsterdam, the number of multi-part buildings is rather small (only 0.2%, corresponding to 319 buildings out of 167248); nevertheless, their contribution to the total gross volume differs significantly according to the model: 3.6% in the nDSM, 9.9% in the LoD22, 9.6% in the LoD13 and 5.8% in the LoD12 models.
A visual inspection was carried out with particular attention to the multi-part buildings. Two examples are presented in Figure 8. After further investigation, it turned out that the wrongly reconstructed building shown in Figure 8 (bottom) was the result of a bug in the reconstruction software which has since been fixed. This exemplifies the value of the analysis carried out here also as a means to quickly verify the correctness of the building reconstruction implementation. In order to further investigate volume differences between the models, more aspects were included in the analysis, i.e. the footprint area, the year of construction, and the building usage (class).
For space reasons, the following sections will contain the results of the comparison only between the LoD22 and the nDSM models.
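The per-building statistics used in this section can be sketched as follows. The sketch rests on an interpretation rather than on the authors' actual code: the RMSE is taken over the percentage differences themselves, and the "5-95 percentile interval" variant simply discards differences outside the 5th and 95th percentiles. Function names and toy volumes are hypothetical.

```python
import math
import statistics


def pct_differences(model_volumes, ref_volumes):
    """Per-building volume difference in %, referred to the reference model."""
    return [100.0 * (m - r) / r for m, r in zip(model_volumes, ref_volumes)]


def rmse(values):
    """Root mean square of the differences (reference difference is 0 %)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def trimmed_rmse(values):
    """RMSE restricted to values between the 5th and 95th percentiles."""
    cuts = statistics.quantiles(values, n=20, method="inclusive")
    p5, p95 = cuts[0], cuts[-1]
    return rmse([v for v in values if p5 <= v <= p95])


def share_within(values, threshold=10.0):
    """Percentage of buildings whose difference lies in [-threshold, +threshold]."""
    return 100.0 * sum(1 for v in values if abs(v) <= threshold) / len(values)


# Made-up volumes (m3): nine plausible buildings and one gross outlier.
ref = [100.0] * 10
lod = [95.0, 98.0, 100.0, 102.0, 103.0, 104.0, 105.0, 108.0, 112.0, 300.0]

diffs = pct_differences(lod, ref)
rmse_all = rmse(diffs)
rmse_5_95 = trimmed_rmse(diffs)
within_10 = share_within(diffs)
```

In this toy sample a single outlier raises the RMSE from about 5.8% to about 63.5%, the same qualitative effect as the drop from 31.2% to 6.4% reported for the LoD22 model once the tails are excluded.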

Volume comparisons based on footprint size
All footprint areas were classified into size intervals. Table 4 gives an overview of the main characteristics associated with each class. Figure 9 shows that the classes with larger variations are at the opposite sides of the graph: either buildings with very small footprints (≤20 m²) or with rather large footprints (≥2000 m²). Although the number of small-footprint buildings is considerable (23.8%), their aggregated volume accounts for less than 0.4% in both the nDSM and the LoD22 models. As this does not hold for the larger-size buildings, the analysis was further refined by also considering the usage of the buildings, as shown in Figure 10. Here the only class which greatly differs from the others in terms of variability of volume differences is the mixed-use one, especially in the case of very large-footprint buildings (≥10000 m²). All other classes show similar characteristics, save for buildings of class "unknown" with a footprint size smaller than 100 m², for which, however, the same considerations hold as written before.
Table 4. Distribution of buildings (and associated gross volume) according to the footprint size.
Figure 9. Distribution of volume differences (in %) between LoD22 and nDSM models based on footprint size. Top and bottom whiskers correspond to the 95th and 5th percentile, respectively.
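The grouping behind a plot such as Figure 9 can be sketched as below. The class boundaries are an assumption: only the 20, 100, 2000 and 10000 m² limits are mentioned in the text, so the 500 m² edge, the upper-inclusive binning and the sample values are purely illustrative.

```python
import statistics
from bisect import bisect_left
from collections import defaultdict

# Hypothetical class edges in square metres (upper bound inclusive).
EDGES = [20, 100, 500, 2000, 10000]
LABELS = ["<=20", "20-100", "100-500", "500-2000", "2000-10000", ">10000"]


def size_class(area):
    """Label of the footprint-size class an area (m2) falls into."""
    return LABELS[bisect_left(EDGES, area)]


def whiskers_by_class(records):
    """Per size class: (5th percentile, median, 95th percentile) of differences.

    `records` is an iterable of (footprint area, volume difference in %).
    Classes with fewer than two buildings are skipped.
    """
    groups = defaultdict(list)
    for area, diff in records:
        groups[size_class(area)].append(diff)
    result = {}
    for label, diffs in groups.items():
        if len(diffs) < 2:
            continue
        cuts = statistics.quantiles(diffs, n=20, method="inclusive")
        result[label] = (cuts[0], statistics.median(diffs), cuts[-1])
    return result


# Made-up (area, difference) pairs: small sheds scatter widely, mid-size
# buildings agree well with the reference.
records = [
    (8.0, -40.0), (12.0, -10.0), (15.0, 0.0), (18.0, 20.0), (20.0, 60.0),
    (150.0, 1.0), (300.0, 2.0), (450.0, 3.0),
]

summary = whiskers_by_class(records)
```

Reporting the 5th and 95th percentiles instead of the minimum and maximum keeps a handful of gross reconstruction errors from dominating the picture for a whole size class.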

Volume comparisons based on building age
All buildings were classified into 5 age classes depending on their year of construction. Table 5 contains information about the distribution of buildings and the respective volume for each age class. Also in this case, the usage classes were considered for each age class, and the result is shown in Figure 11. Looking at the graph, buildings of "unknown" function are the only ones that differ considerably in terms of variation of volume differences, while the differences for all other age and function classes are less striking. Recalling that buildings of "unknown" class account for only circa 8% of the whole gross volume in the city model of Amsterdam, there seems to be no clear dependency on the year of construction when it comes to the differences between the LoD22 and the nDSM models.
Table 5. Distribution of buildings (and associated gross volume) according to the age class.
Figure 11. Distribution of volume differences (in %) between the LoD22 and nDSM models based on age class and building function. Top and bottom whiskers correspond to the 95th and 5th percentile, respectively.

Spatial analysis and visualisation
The map presented in Figure 12 shows for each neighbourhood the distribution of the buildings according to the volume difference intervals described before and shown in Figure 6. The map shows that the buildings with the largest differences are located approximately in the belt around the city centre of Amsterdam, where larger shopping centres and other large-footprint facilities are most likely located. Secondly, the map shows all volume differences aggregated at neighbourhood level. This allows us to quickly recognise neighbourhoods with the biggest volume differences and to check for potential problems in the 3D reconstruction process. Of course, a similar map detailing information at single-building level can also be obtained, but it is left out here for space reasons. As a positive side effect, the visual inspection of the (volumetrically speaking) "problematic" neighbourhoods has led to the discovery of other, initially unexpected, phenomena. An example is shown for neighbourhood "N73D", highlighted in Figure 12 by means of a dashed polygon. As Figure 13a shows, the footprint heights of the buildings in the nDSM model (represented as grey boxes) and the LoD22 model (as red wireframe) are misaligned, and this leads to a high share of volume differences between 10% and 30%. Figure 13b gives an example and provides some hints on the reasons: the presence of uneven terrain surrounding the buildings. However, as the error could originate in the nDSM model, in the 3D reconstruction process, or in both, further investigation is currently being carried out.

Figure 13.
[a] Example of misalignment between footprint heights in the nDSM buildings (as grey boxes) and the LoD22 buildings (in red wireframe).
[b] Example of a building surrounded by terrain points at different heights.

CONCLUSIONS
In this paper, a volume comparison of different building models in the city of Amsterdam was carried out, namely between an existing volumetric city model, derived in a previous work from a normalised DSM, and a set of LoD2.2, LoD1.3 and LoD1.2 models that are currently being prepared and are going to update the 3D BAG dataset, available for the whole Netherlands. The reasons that motivate the work presented in this paper are manifold. On the one hand, the current, nDSM-based city model is used to extract a number of volume-based KPIs used for urban planning and design. The first question is therefore whether the newer LoDx datasets (chiefly the LoD2.2 one) offer a significant improvement in terms of estimation of the enclosed gross volume of the building stock. On the other hand, the availability of a whole city model which has been cleaned up and already used for past works represents an opportunity to check whether potential errors in the new LoDx models can be quickly spotted automatically (or semi-automatically). Hence the second question, i.e. whether information on the accuracy of the 3D reconstruction process (in terms of volume) can be collected and used to further improve it.
It is crucial to recall that in this work the nDSM-based model has been used as reference. Although this is in general an acceptable assumption, the nDSM model cannot be expected to be completely error-free. As a matter of fact, some errors were found, e.g. when it comes to small buildings (e.g. sheds) that might be completely covered by vegetation or, as shown in Figure 13, in the case of misaligned building footprint heights. Still, using it as a reference for comparing the other LoDx models has proven useful to quickly find some issues in the building reconstruction pipeline, and to reason on how to address them. Generating a "clean" DSM by first filtering out the vegetation from the AHN3 point cloud (which is currently not the case) could lead to an improvement of the resulting nDSM as ground truth, but at the cost of additional pre-processing steps.
In conclusion, from the analysis carried out using the 3D city model of Amsterdam, we can say that:
• The LoD22 model is the closest to the nDSM model and, from a volumetric point of view, it differs globally by circa +11%. Circa 60% of the LoD22 buildings differ in volume no more than +5% from the corresponding ones in the nDSM model, and the figure reaches 80% of the buildings if the interval ±10% is considered, which is an acceptable threshold in terms of gross volume accuracy. In terms of errors, multi-part buildings resulting from the 3D reconstruction process tend to be more error-prone, despite their relatively small number within the 3D city model.
• If the different functions in the building stock are considered, residential buildings (which represent numerically circa 56% of the whole model) are the class with the smallest volume difference: globally the LoD22 residential buildings differ in volume only +4.5% from the nDSM model. Non-residential and mixed-use buildings (together circa 17% of the building stock) show larger volume differences, ranging circa from +15 to +18% with regard to the nDSM model.
• The LoD13 model is quite similar to the LoD22 model, and the same considerations made above still hold. On the other hand, the LoD12 model shows larger volume differences, both globally (+19.6%) and in terms of building classes. Residential buildings, for example, are still the class with the smallest volume difference, but it now amounts to +12.6%.
• If we refine the analysis by classifying the buildings according to the size of the footprint, it is mostly at the extremes that there are major volume differences: small-footprint buildings (≤20 m²), e.g. garages, sheds, etc., or very large footprints (from 2000 m² upwards), where the role of mixed-use and non-residential buildings becomes more significant.
• If we focus on the year of construction, there are no particular trends that seem to imply a dependency of the reconstruction success rate on the age of the building, with the exception of the buildings of class "unknown", but they make up only circa 8% of the total gross volume of the whole dataset.
From an urban planning point of view, and, more specifically, when it comes to the extraction of volume-dependent design parameters and KPIs, it can be concluded from this analysis that the nDSM-based 3D city model can be used as a good proxy in case an LoD22 city model is not available, especially if the focus is on residential buildings. This means that advantages in terms of data availability and an easier reconstruction process may compensate for the lack of a geometrically more detailed LoD22 building model, which in general is still less widely available and more complex to produce. The LoD13 model might also represent a good alternative, while the LoD12 model shows too large volume deviations to be used in this context.
By means of the comparison between the nDSM and the LoD22 footprints, a misalignment of the absolute footprint heights has been found in certain areas. Theoretically, the footprint heights should coincide or be very close to each other, so this comparison has turned out to be a quick way to identify potentially problematic buildings. However, the reasons for this issue are still the subject of investigation and so far no conclusive findings have been collected. The plan is therefore to further investigate and report on them in future publications.