EXTRACTING AND EVALUATING CLUSTERS IN DINSAR DEFORMATION DATA ON SINGLE BUILDINGS

In the past two decades, persistent scatterer interferometry (PSI) has become a well-understood and powerful method to monitor the deformations of man-made structures. By analysing space-borne SAR data, PSI can derive displacement histories of thousands of scattered points on a single building with an accuracy of a few millimetres per year. In this paper, we present a method to cluster PS points on a single building into segments which show the same deformation behaviour. The spatial distribution of those clusters gives an insight into the structural behaviour of a building. We use dimensionality reduction to visualize the clusters in the deformation space. The comparison of our extracted displacement patterns with ground truth data from precise levelling and 3D tachymetry confirms the plausibility of our remote sensing method.


INTRODUCTION
Large subterranean civil engineering projects, such as tunnel constructions, require careful planning and monitoring of expected deformations of the terrain above and next to the construction site, especially when executed under dense urban areas. Typically, a risk analysis is performed, considering the geological situation and professional experience from similar projects. After localizing potentially endangered areas, evidence measurements are carried out before the beginning of construction activities and are repeated throughout the whole building period. Such measurements include precise levelling, continuous total station measurements, the installation of water level devices and other in-situ monitoring instrumentation.
Besides ground-based techniques, remote sensing techniques such as space-borne synthetic aperture radar (SAR) have developed into an acknowledged way to augment and verify terrestrial monitoring. Its ability to capture a wide area makes SAR a unique and valuable technique for urban building monitoring. Differential interferometric SAR (DInSAR) analysis of high-resolution X-Band data, such as that of the TerraSAR-X satellite, is able to measure deformations at the millimetre scale (Hanssen, 2001; Maccabiani et al., 2017). As Crosetto et al. (2016) conclude, the persistent scatterer interferometry (PSI) approach in particular has developed into a reliable and well-understood method for urban deformation monitoring and is implemented in several software packages. By analysing a time series of coherent SAR images, PSI is able to derive the 3D position of persistent scatterers and their deformation over time, projected onto the satellite's radial line-of-sight (LOS). In urban environments, these persistent scatterers (PS) are mainly caused by man-made structures, since trihedral corners and metal parts foster the backscatter of radar signals. As Schunert et al. (2012) have shown, depending on the type of façade, buildings in high-resolution X-Band data can often have more than 1 PS per m², which leads to hundreds of points on small houses and can easily exceed thousands on large office buildings. The interpretation and visualization of such a high number of points can be challenging. One often needs civil engineering experts to interpret the deformation patterns and link them to the actual construction process; therefore, a simple representation and pre-analysis of these large numbers of time series is necessary.
In this paper, we propose a new technique to interpret the results of PSI. We combine OpenStreetMap (OSM) data with DInSAR analysis to separate single buildings and to find substructures within them. Identifying such substructures may help to understand the physical movement of a building and can point out rigid structures which, in a monitoring campaign, need to be measured individually. The analysis of substructure motions on a single building can help to detect potentially risky deformation patterns, e.g., if two adjoining parts of a building are moving in opposite directions. We evaluate our findings against in-situ 3D tachymeter measurements and precise levelling by comparing them to the deformation time series we derived from the satellite data. In our experiments, we use TerraSAR-X High Resolution Spotlight data acquired over Stuttgart, Germany, during a time span of almost 3 years. The ground truth data were measured in the course of the ongoing construction of a new underground main station in the city centre. Similar work has been done by Zhu et al. (2018) and Costantini et al. (2018), who suggest a clustering algorithm to find temporal deformation patterns in COSMO-SkyMed DInSAR results. Their work focuses on extracting motion patterns in clusters of PS pairs (PSP) (Costantini et al., 2014) on buildings. Their proposed deformation model automatically separates linear processes from periodic annual patterns and helps to find potentially dangerous movements. In contrast to Zhu et al., we use ground truth deformation data as well as a 3D city model to evaluate our findings. Furthermore, by mapping the high-dimensional deformation space onto a 2D one, we are able to find clusters in the deformation patterns more reliably by fine-tuning our clustering parameters and distance metrics.
In Section 2 of this paper, we first describe the SAR data we use and provide information about the spatial and temporal sampling of the satellite images. We then give an overview of the monitoring approach in a construction project, characterizing the current ground-based monitoring methods and their accuracy requirements. In Section 3, the PSI algorithm is briefly explained with its strengths and limitations, as well as the results one usually expects from this method. Our clustering method is then presented in detail, explaining how we define the similarity of PS points and our approach to hyperparameter estimation. We briefly outline the embedding method used to visualize the patterns in high-dimensional space. In Section 4, we present the clusters found in the motion patterns on a large office building. We compare the deformations to our ground truth data and visualize the distribution of the points on the façades. Finally, we discuss our findings, point out possible applications of our technique and give an outlook on our future work towards a citywide, house-wise risk analysis.
The central idea of our method is an automatic approach to find and cluster PS points on rigid structures of buildings, so that these segments represent individual parts of a building. To this end, we partition the deformation patterns of PS points on a building. In contrast to single points, such clusters can be considered as redundant observations of the same deformation process and can provide civil engineers with valuable information about the static relationships within large structures. This method could be a useful tool for evaluating previous analyses based purely on structural engineering considerations. Because this remote sensing technology covers wide areas, we are able to automatically derive information about deformations and movements in a large area. This might help to use ground-based observation devices more effectively and enable civil engineers to take countermeasures to avoid damage to buildings.

SAR Images
The SAR data we use were acquired by the German X-Band SAR satellite TerraSAR-X (TSX). The 88 "HighRes SpotLight 300 MHz" (HS300) images (Airbus, 2017) were captured over a time span of almost three years (September 2016 to June 2019). The repeat orbit period of TSX is 11 days. The slant range × azimuth resolution of HS300 images is 0.6 m × 1.1 m. The maximum spatial baseline relative to the master image from January 2018 does not exceed 400 m (see Fig. 1). The scene, shown in Figure 2, covers the inner city of Stuttgart, Germany. The corresponding OSM data contain about 38,000 building instances in this area.

Ground truth measurements
Before the start of a construction project, all buildings within a predefined buffer around the construction site are observed using classical surveying methods. These measurements play a major role in understanding the normal behaviour of the buildings before any construction influences take place. Moreover, when the construction activities begin, extensive deformation surveys are carried out for these chosen buildings. The choice of the type of instrumentation and the repetition rate of the measurements is directly related to the objectives of keeping the buildings safe, continuously monitoring the potential construction influences and controlling them step by step during construction. The structure of the investigated building consists of an extended roof erected over two supporting pillars. This type of structure and the different construction activities in its vicinity call for extensive surveying observations, including precise levelling and tachymetry. Table 1 lists the measurement accuracy of each method.

Table 1.
Measurement methodology | Measurement accuracy
Precise levelling | 0.3 mm per 1 km double levelling and invar bar
3D tachymetry | Distance: 1 − 2 mm + 2 ppm; Angle: 1" (0.3 mgon)

All measurement points of the above-mentioned systems are located on the side of the building facing the construction site.
On the lower part the precise levelling points are installed while prisms for the 3D tachymetry are located on higher positions (see Fig. 5). It is important to point out that the temperature values show a direct correlation with the measurement values.
The results of the ground-based measurements are further used as ground truth for the PSI data.

METHODOLOGY
In the following section, we give an overview of our approach. The workflow in Figure 3 describes the sequential procedure. After deriving the displacement history for all points in the SAR scene via PSI, we use OSM building footprints to analyse these time series on single building instances with DBSCAN, using a correlation-based distance definition. Afterwards, we visualize these clusters by reducing the dimensionality via t-SNE and evaluate their plausibility by fusing airborne laser scanning (ALS) data with the clusters. Finally, we compare the results of ground-based monitoring systems to our results.
Figure 3. Workflow (recoverable labels): PSI (X, Y, Z, dn(t)); compare time series to ground truth; ground truth from levelling and 3D tachymeter close to cluster center.
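The step of restricting the PS points to single building instances can be sketched as a point-in-polygon test against the footprints. The following is a minimal illustration only: the function names, the toy rectangular footprints and the PS coordinates are hypothetical, and the text does not state how the assignment was actually implemented (for example, whether a buffer around the footprint was used to capture façade points).

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(p: Point, polygon: Sequence[Point]) -> bool:
    """Ray-casting test: True if p lies inside the closed polygon."""
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal ray through p?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_points_to_buildings(
    ps_xy: Sequence[Point], footprints: Dict[str, Sequence[Point]]
) -> Dict[str, List[int]]:
    """Map each footprint id to the indices of the PS points inside it."""
    members: Dict[str, List[int]] = {bid: [] for bid in footprints}
    for idx, p in enumerate(ps_xy):
        for bid, poly in footprints.items():
            if point_in_polygon(p, poly):
                members[bid].append(idx)
                break  # assumption: footprints do not overlap
    return members


# Hypothetical footprints and PS positions in a local metric frame
footprints = {
    "building_a": [(0, 0), (10, 0), (10, 5), (0, 5)],
    "building_b": [(20, 0), (30, 0), (30, 5), (20, 5)],
}
ps_xy = [(1.0, 1.0), (25.0, 2.0), (15.0, 2.0), (9.5, 4.5)]
members = assign_points_to_buildings(ps_xy, footprints)
```

Each list in `members` then indexes the deformation time series that would be clustered together for one building; the third point lies between the two footprints and is not assigned.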

Persistent scatterer interferometry
Persistent scatterer interferometry (PSI) is an advanced InSAR technique. The main idea of this approach is the detection of temporally coherent pixels in a stack of co-registered SAR images. By analysing the phases of such pixels in each image of the stack, relative to a master image, the line-of-sight (LOS) movement history and the 3D position of the scatterer can be estimated (Ferretti et al., 2000, 2001). Crosetto et al. (2016) give a very good overview of the history and the capabilities of PSI algorithms, and we highly recommend reading this article for more detailed insights. PSI works extremely well in dense urban areas, since man-made structures, especially those located on house façades and roofs, act as good reflectors. Even though the PSI algorithm does not need a precise DSM, it is able to estimate the scatterers' X-Y positions to within the order of the pixel size, while the height component is typically somewhat less accurate (Chang and Hanssen, 2014).
It is worth mentioning that InSAR techniques are limited by the LOS geometry (Pepe and Calò, 2017). As Figure 4 illustrates, the real movement of a scatterer is projected onto the vector towards the satellite's position. Therefore, the measured displacement depends on the geometry between the real movement and the observation angle. If a mostly vertical movement is assumed, the LOS values can be scaled depending on the satellite's incidence angle (Ketelaar, 2009).
Since PSI analyses time series of multiple SAR images, one of its results is the displacement history of each scatterer. For every point PSn we obtain its relative deformation dn(tm) as a time series with one measurement for each SAR acquisition (m ∈ N | 1 ≤ m ≤ 88) (see Eq. (1)):
\[ d_n(t) = \left[ d_n(t_1), d_n(t_2), \dots, d_n(t_M) \right], \quad M = 88 \quad (1) \]
As Maccabiani et al. (2017) stated, the accuracy of each measurement can be better than 2 mm.
The deformation time series represents the most advanced PSI product and is the basis for our clustering approach. As Gernhardt et al. (2010) and Crosetto et al. (2015) have shown, PSI time series derived from high-resolution TerraSAR-X data are able to capture the annual movements of buildings. They confirm thermal expansion of buildings of up to several millimetres in amplitude. We exploit this fact for our clustering, under the assumption that each segment of a building shows a characteristic movement behaviour.
Figure 4. The actual displacement DREAL is projected onto the line-of-sight (LOS) towards the satellite. Depending on the incidence angle θ, different movements are measured. Movements perpendicular to the LOS cannot be detected. Note that this is only an issue for non-vertical movements. To compare the deformation to levelling data, or if DREAL is assumed to be mostly vertical, it can be obtained by scaling DLOS with 1/cos(θ) (see Eq. (3)).

DBSCAN with sample correlation distance and k-nearest neighbour ε-estimation
Density-based spatial clustering of applications with noise (DBSCAN) is one of the most common data clustering algorithms (Ester et al., 1996). Given a set of points in some space, it groups points with many nearby neighbours (dense areas) and marks points in low-density areas as outliers. Its result depends on two hyperparameters: ε specifies the maximum distance between a point and a cluster member for the point to become part of that cluster, and minPts sets the minimum number of points in one cluster. DBSCAN is able to find an unknown number of arbitrarily shaped clusters while ignoring outliers and noise.
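To make the roles of ε and minPts concrete, the following is a minimal sketch of the textbook DBSCAN procedure operating on a precomputed distance matrix. It is not the implementation used for the experiments; the function name and the toy one-dimensional data are illustrative assumptions. Here a point is treated as a core point if its ε-neighbourhood (including the point itself) contains at least `min_pts` points.

```python
from typing import List, Sequence

NOISE = -1
UNVISITED = -2


def dbscan_precomputed(
    dist: Sequence[Sequence[float]], eps: float, min_pts: int
) -> List[int]:
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    n = len(dist)
    labels = [UNVISITED] * n

    def neighbours(i: int) -> List[int]:
        # eps-neighbourhood of point i, including i itself
        return [j for j in range(n) if dist[i][j] <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] != UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] != UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nb)
    return labels


# Toy example: two dense groups and one isolated point on a line
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 20.0]
dist = [[abs(a - b) for b in points] for a in points]
labels = dbscan_precomputed(dist, eps=0.15, min_pts=2)
```

In this toy example the two dense groups receive the labels 0 and 1, and the isolated point is marked as noise (-1). Because the function only needs pairwise distances, any metric can be plugged in through the distance matrix.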
We treat the deformation histories dn(t) as points in an M-dimensional space. Each point dn ∈ R^M is defined by its M measurements. In general, there are several different metrics to define the distance between two points, and the choice is particularly critical for high-dimensional data (Aggarwal et al., 2001). We use "1 − the sample correlation" r of two time series (e.g. dx(t) and dy(t)) as their distance metric:
\[ \mathrm{dist}(d_x, d_y) = 1 - r = 1 - \frac{s_{xy}}{s_x s_y} \quad (2) \]
where s_x, s_y = sample standard deviations and s_xy = sample covariance.
We also experimented with other metrics (Euclidean and City Block), but Equation (2) seemed to work best for this kind of data, since this distance metric corresponds to the nature of the time series. By being invariant to an amplitude scaling of the deformation histories, it rates points on a uniformly moving rigid structure as very close.
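The behaviour of the correlation-based distance can be checked with a few lines of code. The sketch below computes "1 − the sample correlation" for two series; the function name and the synthetic sinusoidal series are assumptions made for illustration and are not data from this study.

```python
import math
from typing import Sequence


def correlation_distance(dx: Sequence[float], dy: Sequence[float]) -> float:
    """Return 1 - r, with r the sample (Pearson) correlation of dx and dy."""
    m = len(dx)
    if m != len(dy) or m < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(dx) / m
    mean_y = sum(dy) / m
    # Sample covariance and sample standard deviations (divisor m - 1)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(dx, dy)) / (m - 1)
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in dx) / (m - 1))
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in dy) / (m - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return 1.0 - s_xy / (s_x * s_y)


# Synthetic periodic signal sampled at 12 epochs (illustrative values only)
base = [math.sin(2 * math.pi * k / 12) for k in range(12)]
scaled = [3.0 * v + 1.5 for v in base]   # same shape, larger amplitude, offset
opposite = [-v for v in base]            # movement in the opposite direction

d_same = correlation_distance(base, scaled)    # close to 0
d_opp = correlation_distance(base, opposite)   # close to 2
```

Two series that differ only by a positive amplitude factor and an offset end up at a distance near 0, while series moving in opposite directions end up near the maximum distance of 2.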
We address the problem of automatic ε-estimation, similar to Liu et al. (2007), by analysing the k-distance graph (k = 5). This enables us to run the analysis automatically on all buildings of interest. The minimum number of points per cluster minPts and the ε parameter of DBSCAN were fine-tuned by embedding the deformation histories from the M-dimensional space R^M into R^2 using t-distributed stochastic neighbour embedding (t-SNE). Figure 6 is an example of such an embedding of the high-dimensional deformation space.
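A k-distance graph can be built directly from the pairwise distance matrix. The sketch below sorts the distance of every point to its k-th nearest neighbour and picks a knee of that curve as ε. The knee criterion used here (the point farthest below the straight line joining the ends of the curve) is an assumption chosen for illustration; the text does not specify how the knee is located, and the toy data and k = 3 are likewise hypothetical.

```python
from typing import List, Sequence


def k_distances(dist: Sequence[Sequence[float]], k: int) -> List[float]:
    """Sorted distance of every point to its k-th nearest neighbour."""
    out = []
    for i, row in enumerate(dist):
        others = sorted(d for j, d in enumerate(row) if j != i)
        out.append(others[k - 1])
    return sorted(out)


def estimate_eps(kdist: Sequence[float]) -> float:
    """Pick the knee of the sorted k-distance curve.

    Heuristic (an assumption, not taken from the paper): the knee is the
    point that lies farthest below the chord joining the curve's end points.
    """
    n = len(kdist)
    if n < 2:
        raise ValueError("need at least two k-distances")
    y0, y1 = kdist[0], kdist[-1]
    best_i, best_gap = 0, -1.0
    for i, y in enumerate(kdist):
        chord = y0 + (y1 - y0) * i / (n - 1)
        gap = chord - y
        if gap > best_gap:
            best_i, best_gap = i, gap
    return kdist[best_i]


# Toy example: ten densely spaced points plus two outliers on a line
points = [i / 10 for i in range(10)] + [5.0, 9.0]
dist = [[abs(a - b) for b in points] for a in points]
eps = estimate_eps(k_distances(dist, k=3))
```

For the toy data the estimate lands at the largest k-distance inside the dense group (about 0.3), so the two outliers fall outside every ε-neighbourhood of the group.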

t-SNE visualization
We use t-distributed stochastic neighbour embedding (t-SNE) to visualize our clustering results (van der Maaten and Hinton, 2008), which helps to fine-tune the DBSCAN parameters. The t-SNE method has gained a lot of attention, since it is well suited for embedding high-dimensional data in a low-dimensional space of two or three dimensions while preserving its local structure. The main idea is to define a probability distribution over pairs of points in the high-dimensional space, such that similar points have a high probability of being picked as a pair while dissimilar points are extremely unlikely to be picked. t-SNE then iteratively finds a representation in the low-dimensional space that has the same probability properties. To this end, it minimizes the Kullback-Leibler divergence between the two distributions, summed over all data points, using a gradient descent method. As a similarity metric we choose the sample correlation as described in Equation (2).
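For reference, the objective described above can be written out in the standard form given by van der Maaten and Hinton (2008). The symbols are introduced here for illustration: δ_ij denotes the pairwise distance in the high-dimensional space (assumed to be the correlation-based distance in our setting, whereas the original formulation uses the Euclidean distance), y_i are the embedded points, N is the number of points, and σ_i is the bandwidth set per point through the perplexity parameter.

```latex
p_{j|i} = \frac{\exp\left(-\delta_{ij}^2 / 2\sigma_i^2\right)}
               {\sum_{k \neq i} \exp\left(-\delta_{ik}^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j|i} + p_{i|j}}{2N}

q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

C = \mathrm{KL}(P \,\|\, Q) = \sum_{i} \sum_{j \neq i} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

The heavy-tailed Student-t kernel in q_ij is what allows moderately distant points to be placed far apart in the embedding, which is also why distances between well-separated groups in the 2D plot should not be read as physical quantities.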
The results of this embedding usually preserve clusters and local neighbourhoods much better than other dimensionality reduction methods (van der Maaten, 2009); it can therefore help to judge the quality of the DBSCAN cluster extraction (Figure 6). Clustering in the output of t-SNE was also considered in the course of this work, but since the embedding does not preserve larger structures in the data and the low-dimensional space cannot be interpreted physically (Wattenberg et al., 2016; Linderman and Steinerberger, 2017), we decided to use conventional clustering in the high-dimensional space as described in Section 3.2.

Line-of-sight to vertical deformation
As illustrated in Figure 4, the displacement time series represent the projection of the real deformation onto the satellite's radial line-of-sight (LOS). Since we want to compare the deformation to the precise levelling results, the assumption of a purely vertical movement is a suitable deformation model. Under these conditions, the LOS deformation time series dn(t) can be transformed into the vertical deformations ∆dn(t) by dividing it by the cosine of the incidence angle θ (Sanabria et al., 2014):
\[ \Delta d_n(t) = \frac{d_n(t)}{\cos(\theta)} \quad (3) \]
To compare the 3D tachymeter measurements with the PSI deformation histories, the 3D deformation vector has to be projected onto the line-of-sight (Vollrath et al., 2017).
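Both conversions can be illustrated with a short sketch. The incidence angle of 60° and the LOS unit vector below are hypothetical values chosen to give round numbers; the actual acquisition geometry is not stated in the text. The LOS unit vector is assumed to be expressed in the same local frame as the tachymeter displacement, and the function names are illustrative.

```python
import math
from typing import List, Sequence


def los_to_vertical(d_los: Sequence[float], incidence_deg: float) -> List[float]:
    """Scale LOS displacements to vertical ones (purely vertical motion assumed)."""
    c = math.cos(math.radians(incidence_deg))
    return [d / c for d in d_los]


def project_onto_los(d_xyz: Sequence[float], los_unit: Sequence[float]) -> float:
    """Project a 3D displacement onto the LOS direction via a dot product."""
    # Normalise defensively in case los_unit is not exactly of unit length
    norm = math.sqrt(sum(u * u for u in los_unit))
    return sum(d * u for d, u in zip(d_xyz, los_unit)) / norm


theta = 60.0  # hypothetical incidence angle in degrees
# Hypothetical LOS direction in a local frame whose third axis is vertical
los_unit = (math.sin(math.radians(theta)), 0.0, math.cos(math.radians(theta)))

uplift = (0.0, 0.0, 3.0)                           # 3 mm purely vertical motion
d_los = project_onto_los(uplift, los_unit)         # what the satellite would see
d_vertical = los_to_vertical([d_los], theta)[0]    # recovers the 3 mm
```

The round trip shows why the purely vertical assumption matters: a 3 mm uplift appears as only 1.5 mm along this LOS, and dividing by cos(θ) recovers the original value only if no horizontal motion is present.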

RESULTS
For our case study, we chose a building close to the construction site of a new underground train station, where, due to the construction work, continuous precise levelling and 3D tachymeter measurements have been carried out over several years. In Figure 5, we show the PS point cloud next to the LIDAR representation of the building, together with the positions of the levelling points and the 3D tachymeter prism on the building.
The detected clusters in deformation space are visualized via a t-SNE embedding in Figure 7, which shows the difference between the embedded clusters and the results from DBSCAN. Figure 6 shows the position of the extracted clusters on the building. Even though no spatial information has been considered in the clustering process, the clusters seem to plausibly segment the building into rigid structures. The choice of the DBSCAN parameter ε seems to have a big impact on the segmentation results and has to be estimated for each building via the k-distance graph. Figure 8 shows the corresponding time series of each cluster, as well as a temperature profile on the building. The individual displacement histories show a clear correlation with the temperature, but linear trends and single change events can also be detected for some clusters. The clusters in Figures 6 and 7 correspond to the time series in Figure 8 and have to be considered in relation to each other.
In Figure 9, we compare the ground truth to the cluster closest to the levelling points (Fig. 8c). The deformation history extracted from PSI and the deformation measured on the ground seem to match very well. The difference in amplitude, especially for the fast movement in December 2018, could be explained by a non-vertical deformation direction.
In Figure 11, we show the 3D tachymeter results. The 3D deformation measurements have been projected from a local reference frame onto the line-of-sight (see Fig. 10). The resulting deformation matches the cluster on the corresponding part of the building.
Figure 8. Clusters of Figure 6 as time series in colour, all members of the cluster in black. The temperature on the building is shown in the bottom plot. Plot (c) is compared to the precise levelling ground truth in Figure 9. Plot (d) is compared to the LOS projection of the 3D tachymeter in Figure 11.
ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume V-3-2020, 2020 XXIV ISPRS Congress (2020 edition)
Figure 9. Top: Ground truth from two precise levelling points underneath the building (see Fig. 5). Bottom: Deformation of the (spatially) closest PSI cluster (c) to the levelling points.

CONCLUSION
The segmentation of PS points on a building into clusters which represent rigid bodies can help to understand the deformation of a structure. We used the displacement histories obtained by high-resolution DInSAR to find clusters, i.e. groups of points that show a similar displacement behaviour over time. By visualizing these high-dimensional clusters in a two-dimensional t-SNE embedding, we can assess the quality of our clustering approach. The positions of the clusters on the building seem to show plausible segments of the building, even though no spatial information has been considered in the clustering process. The resulting displacement time series reveal the different movement behaviour of each cluster. By comparing the time series to ground truth measurements from traditional monitoring approaches, such as levelling and 3D tachymeter observations, we could show that our extracted clusters represent these displacements very well.
Figure 11. Deformation time series from the 3D tachymeter. The location of the prism is shown in Figure 5. The direction "cross" points towards the wall, "along" indicates horizontal movements along the wall and "vertical" shows the down-up movement (see Fig. 10). "LOS-Projection" shows the projection of this displacement onto the satellite's line-of-sight direction; it matches the displacement history of the closest cluster (d).
Especially the comparison to the 3D tachymeter data highlights the limitations of single-orbit DInSAR observations. The one-dimensional line-of-sight observation cannot represent a complex 3D deformation.
Future work will explore the possibility of clustering in the t-SNE embedding, to overcome the complexities which come with defining distances and thresholds in high-dimensional spaces. By choosing a high perplexity as t-SNE parameter, clusters are clearly separated in the embedding, so DBSCAN has a higher chance of distinguishing noise from clusters. First experiments show very promising results for this approach, but it might need a good strategy to deal with the non-preservation of larger structures in the t-SNE embedding. Another interesting follow-up to this work will be a citywide analysis of all buildings. After finding clusters on each house, we can analyse their relative movements to find buildings with critical, potentially damaging displacement patterns.
measurements for our investigations. Furthermore we would like to thank the State Office for Spatial Information and Land Development Baden-Württemberg (LGL) for providing citywide ALS data as well as high resolution orthophotos to colourize these points.