PSDefoPAT – TOWARDS AUTOMATIC MODEL BASED PSI POST-PROCESSING

: The deformation of the surface of the earth is triggered by numerous naturally occurring and artificial processes such as global isostatic adjustment, aseismic and coseismic movement, varying amounts of groundwater or gas stored underground, and soil consolidation due to urbanization. Monitoring these surface deformations is essential to understand the underlying processes and provide authorities and the public with hazard assessments. Remote sensing techniques, such as Persistent Scatterer Interferometry (PSI), have the capability of mapping these deformations. Besides a spatial analysis of the deformation patterns, PSI also provides information on their temporal evolution. Post-processing strategies to analyze the displacement time series have gained interest in recent years. This paper presents our PSI post-processing strategy, which incorporates different deformation models and automatically chooses the best-fitting one based on statistical tests.


INTRODUCTION
The surface of the earth is constantly deforming.The processes that trigger the deformation are categorized as large-or smallscale and naturally occurring or artificial processes.The main large-scale naturally occurring processes are global isostatic adjustment, aseismic and coseismic movement, i.e., movement occurring without seismic activity and movement directly associated with earthquakes.The basic concept of these processes is that the surface of the earth consists of several plates moving relative to one another (Minster et al., 1978).An example of aseismic creep alongside plate boundaries is presented in the case study of the North Anatolian Fault (NAF) in Turkey by Cetin et al. (2014).Persistent Scatterer Interferometry (PSI) was used to map the deformation of the NAF between 2003 and 2010.The deformation map shows alternating zones of steady-state creep and locked zones alongside the fault.Knowing the spatial and temporal evolution of these deformations is essential for seismic hazard assessments of affected regions.Locked faults accumulate strain, which is later released abruptly during earthquakes.At the same time, aseismic creep has the effect that less strain is accumulated over time, and thus the likelihood of significant earthquakes shrinks notably.For example, historical data of aseismic creep at Ismetpasa shows that the NAF deformation rate decays exponentially since the earthquake in 1944 (Cakir et al., 2005).Human-induced deformations are caused by, for example, varying amounts of groundwater or gas storage (Béjar-Pizarro et al., 2017;Even et al., 2020;Teatini et al., 2011).For example, the aquifer of Madrid caused alternating uplift and subsidence between 1992 and 2010 in response to groundwater extraction during draughts and recovery periods with excess precipitation (Béjar-Pizarro et al., 2017).Underground gas storage causes similar ground deformation.Gas is stored in depleted hydrocarbon reservoirs in many parts of Europe and North America.The gas is injected into the reservoir during summer resulting in uplift and extracted in winter leading to subsidence (Even et al., 2020;Teatini et al., 2011).Another example of human-induced deformation is presented in Solari et al. (2016).They mapped the deformation of the urban area of Pisa between 1978 and 2013.Comparing the subsidence rates of individual buildings with their age revealed a direct relationship between the age of the building and its subsidence rates.The case study also demonstrated that urbanization accelerates ground consolidation.Monitoring ongoing surface deformation is essential to understand the underlying processes and provide authorities and the public with hazard assessments.Surface deformations need to be considered for developments of infrastructure and human settlements (Raspini et al., 2018).The studies mentioned above showed that PSI is capable of mapping long-term and slowly developing deformations.Most PSI algorithms provide a map of mean deformation velocity and the associated displacement time series for each identified persistent scatterer (PS) (Ferretti et al., 2001;Hooper et al., 2004).Studies such as Navarro et al. (2019) use the mean deformation velocity map to identify areas of active deformation.The mean deformation velocity is estimated based on the assumption that the deformation is dominated by a linear trend (Ferretti et al., 2001;Hooper et al., 2018).However, above listed studies showed that this assumption is not always correct.For example, storage of gas leads to a seasonal horizontal and vertical displacement (Teatini et al., 2011), displacement rates of aseismic creep or subsidence due to urbanization can change over time resulting in either a piecewise linear trend or higher polynomial trend (Cetin et al., 2014;Solari et al., 2016).Assuming a linear trend might lead to the over-or underestimation of the velocity of the deformation or, in the case of a purely seasonal deformation pattern to misclassification as a non-active area.Berti et al. (2013) presented an algorithm that distinguished between linear, quadratic, and piecewise linear deformation patterns based on the displacement time series of each identified PS.However, they do not consider seasonal deformation patterns.The PSI post-processing strategy presented by Costantini et al. (2018) considers only PS identified on buildings, clusters the PS for each building, and estimates each cluster's average displacement time series.The average displacement time series are labeled as either piecewise linear or a combined seasonal and linear displacement time series.PS not included in one of the clusters are considered outliers and excluded from the analysis.The algorithm assumes that the area of interest (AOI) is an urban area with a dense PS grid.The AOI presented in this study includes both rural areas with a sparse PS grid and urban areas with a dense PS grid.The deformation processes expected in the area include landslides, aseismic creep, subsidence in urbanized areas, and large engineering structures, such as dams, in rural areas.Therefore, it is not appropriate to only consider PS identified on building structures, exclude every PS that does not deform in the same manner as its neighboring PS, and only consider a selection of possible deformation models.In a previous study, we presented the Matlab-based tool Persistent Scatterer Deformation Pattern Analysis Tool (PSDefoPAT) (Evers et al., 2021).The goal of PSDefoPAT is to describe the behavior of an individual PS over time with a mathematical model.The tool decomposes the time series into its trend, seasonal, and noise components.The trend component is approximated by either a linear, quadratic, cubic, or piecewise linear regression model, while the seasonal component is assumed to be a sine function.The quality of the models is judged based on the statistical measurements.The previously presented version of PSDefoPAT relies on user input.The tool uses preparatory steps such as outlier or change point (CP) detection, which require the user to adjust the settings for each time series.The provided options for these methods are chosen based on a visual inspection of the individual time series.Processing an entire PSI dataset this way would be cumbersome.Therefore, an automatized version of PSDefoPAT would be preferable.In order to automatize PSDefoPAT, steps that require user input need to be adjusted so that the required user input is minimized or the user input can be generalized for an entire data set.The steps that need to be adjusted to reduce the demand for user input and facilitate an automatized post-processing of each PS displacement time series within a PSI data set are outlier and CP detection.The original version of PSDefoPAT only identified extreme outliers in the time series based on statistical assumptions selected by the user.Furthermore, it only allowed the time series to be divided into two segments.Incorporating wavelet transformation based de-noising and a top-down segmentation algorithm reduces the noise obscuring the underlying signal and lifts the limitation on the number of segments used in a piecewise linear representation (PLR) of the time series.In this paper, we present our new PSI post-processing strategy.The paper is structured into five sections.The dataset and study area are presented in Section 2. An overview of the previously used concept for PSDefoPAT and the alterations implemented for the new version are presented in Section 3. In Section 4, the performance of the old and new versions of PSDefoPAT are compared based on three exemplarily displacement time series.Our conclusions are summarized in Section 5.

Dataset
In order to compare the time series used in this study with the ones presented in Evers et al. (2021), we decided to examine the same area of interest (AOI) and use the same data set.The AOI is located on the Peloponnese Peninsula in Greece.The largest city in the area is Patras.The Gulf of Patras encloses the city in the North, the Hellenic subduction zone in the West, the Erymanthos and Panachaiko mountains in the East and the South.Vertical and horizontal displacements are expected to be observed due to numerous landslides and active faults in the area (Sakkas et al., 2018;Chalkias et al., 2014).In order to observe the deformation occurring between January 2019 and January 2021 in the area, 122 Sentinel-1 images were acquired.The images were recorded with descending acquisition geometry and the acquisition mode Interferometric Wide Swath.

Persistent Scatterer Interferometry Processing
PSI was developed to reduce the effect of decorrelation and atmospheric phase delay on the displacement rates derived from interferograms (Ferretti et al., 2001;Hooper et al., 2004).PSI algorithms, such as the Standford Method for Persistent Scatterers (StaMPS) use a time series of interferograms to identify pixels with a low noise level.These pixels are referred to as Persistent Scatterers (PS).Only the PS pixels are used to determine the displacement rates for the observation period (Hooper et al., 2004).Figure 1 illustrates the mean velocities in LOS for Patras and the surrounding region.The range of the mean velocities extends from -12 mm/a (blue) to +10 mm/a (red).The sign marks the direction of the movement.A negative velocity indicates movement away from the sensor, while a positive velocity represents a movement towards the sensor.Even though Figure 1 shows many areas of surface deformation, we will concentrate on the three areas marked with red rectangles in Figure 1.Area A covers the center and southern part of Patras, while Areas B and C consist of more rural regions.Area B covers the mountainous area to the South-East of Patras, and Area C contains a dam and freshwater reservoir.

METHODS FOR PS DISPLACEMENT ANALYSIS
Generally speaking, a time series is a sequence of observations of a particular variable in chronological order.The aim of time series analysis is to characterize the time series and decompose it into its trend, seasonal, and noise component.Then, a mathematical model that describes the behavior of the variable over time can be built based on this information (Neusser, 2016;Montgomery et al., 2015).The variable observed in this study is the displacement of individual PS generated with the PSI algorithm StaMPS.This chapter is divided into three parts.The first part summarizes the original version of the Matlab-based tool PSDefoPAT (Evers et al., 2021) and explains the adaptations made to facilitate its automatic application.The second-and third-part focus on denoising time series based on its wavelet transformation and time series segmentation algorithms.Both aspects are newly incorporated in our approach of time series analysis.Each step can be found in a separate tab in Area III.The second step, prepare data, gives the user the option to smooth the time series or detect and replace the outliers.Five different methods were included in PSDefoPAT: The alternative to smoothing the time series is to detect the outliers in the time series.
A data point is considered as an outlier if: • it deviates more than three times the scaled median absolute deviation from the median, • it deviates more than three times the standard deviation from the mean, or • it is situated more than 1.5 interquartile ranges above the upper quartile or below the lower quartile.

De-noising with Wavelet Transformation
A wavelet basis consists of wave-like functions that oscillate around zero for a limited time.The basic building block of a wavelet family is the mother wavelet ().Every other wavelet of the family is constructed by dilating or translating the mother wavelet.
The scaling parameter a and the translation parameter b determine the position of the wavelet in time and the width of its non-zero-part.The non-zero-part of a wavelet is referred to as its support.The wavelet is a compressed version of the mother wavelet in case of  < 1 and a stretched version in case of  > 1 (Walczak et al., 1997;Motard et al., 2013).Well-known wavelets bases are the Daubechies-Wavelets (Daubechies, 1988) and Meyer-Wavelets (Meyer, 1992).Similar to how sine and cosine functions are used in Fourier transformation to represent a uniformly periodic signal, wavelet bases describe piecewise regular signals.The projection of the signal onto wavelet basis functions is called wavelet transform (WT) (Mallat, 1999).As well as any transformation, WT shifts the signal from its original domain into a new one, in which operations such as noise reduction or signal compression may be easier to carry out (Walczak et al., 1997).
The first step of noise reduction is to employ wavelets to decompose the signal into its wavelet representation.The Discrete Wavelet Transformation (DWT) decomposes the signal into a finite set of wavelet coefficients.The number of wavelet coefficients depends on the wavelet basis used to decompose the signal.The wavelet coefficients are used to define two filters: (1) a scaling filter, which is a low-pass filter, and (2) a wavelet filter, which resembles a high-pass filter.Both filters consist of the same wavelet coefficients, only with altering signs and in reversed order (Walczak et al., 1997).The filters are used for the recursive pyramid decomposition algorithm, first introduced by Mallat (1989).The algorithm offers a hierarchal multiresolution representation of the analyzed signal.The filters are applied to a signal f(t) with N data points resulting in N/2 detail coefficients and N/2 approximation coefficients at level 1.The approximation coefficients are then subjected to the scaling and wavelet filter resulting in N/4 detail coefficients and N/4 approximation coefficients at level 2. This step is repeated until the preferred level is reached (Walczak et al., 1997).
After the decomposition has been carried out, thresholding is applied to the coefficients   at each level (Walczak et al., 1997).
The universal threshold  first introduced by Donoho et al.
(1995) can be written as: The thresholding can be applied as hard thresholding or soft thresholding.In the case of hard thresholding, the coefficient   is dismissed if its value is less than  and kept if it surpasses .
In the case of soft thresholding, the coefficient   is also dismissed if its value is smaller than , but the coefficient is shifted towards zero if its value surpasses  by subtracting  from the absolute value of the coefficient.
After thresholding, the signal is reconstructed based on the remaining wavelet coefficients.
For our purposes, we decided to use the third Daubechies wavelet for the WT and soft thresholding for the de-noising step.Softthresholding provides a smooth and continuous time series reconstruction (Donoho, 1995).

Segmentation Algorithms
Regardless of which specific model is selected for the trend component, its characteristic parameters may change over time, such as the slope in case of a linear trend.The representation of a piecewise linear time series with a regression model is referred to as a piecewise linear representation (PLR).The PLR approximates a given time series of length N with K straight lines (Keogh et al., 2004).An example is shown in Figure 3.The black dots represent the time series, while the linear approximation of the time series is depicted with a blue line and the PLR with three consecutive red lines.The PLR in Figure 3 is characterized by three segments and two change points (CP), which are transition points between two consecutive segments.A segmentation algorithm takes a time series as input and returns the segments and CPs.The literature distinguishes between offline and online algorithms.Online algorithms allow data points to be added parallel to the execution of the algorithm and thus do not have access to the entire time series to produce the best PLR.The functional principle of offline algorithms, on the other hand, is that the data set remains unchanged during the execution of the algorithm, and the entire time series is taken into consideration to find the best PLR (Lovrić et al., 2014).The concrete task of a segmentation algorithm can be understood in three different ways: 1. Produce the best PLR for the provided time series with K segments.2. Produce the best PLR of the provided time series so that the maximum error of each approximated segment does not exceed a previously specified threshold.3. Produce the best PLR of the provided time series so that the maximum combined error of all approximated segments does not exceed a previously specified threshold (Keogh et al., 2004).The Sliding-Window algorithm starts with the first couple of data points from the given time series as a segment and adds to the segment as long as the error of the segment does not exceed a user-specified threshold.Should the segment exceed the threshold, the recently added data point is removed from the segment, and the algorithm starts to form a new segment beginning with that data point.Since the algorithm never uses the entire time series to determine the boundaries of the segments, it is considered an online algorithm.The Top-Down and Bottom-Up algorithms, on the other hand, are offline algorithms.The Top-Down algorithm starts with one segment representing the entire time series.This segment is then divided into two segments if the error of the segment exceeds a user-specified threshold.After that, the algorithm recursively tests each segment and divides the time series into additional segments until the error of each segment is less than the threshold.In contrast, the Bottom-Up algorithm starts with the finest possible segmentation of the given time series, i.e., the algorithm assumes that each combination of two neighboring data points is a segment and then merges adjacent segments if the error of the resulting segment does not exceed a user-specified threshold (Keogh et al., 2004).We decided to use a top-down algorithm with the second formulation of the problem at hand in mind for our purposes.The Sliding-Window approach is not used because it delivers poor results.And although it is reported in literature that the bottom-up approach outperforms the top-down approach (Keogh et al., 2004), in our case, the bottom-up approach provided a to fine segmentation of the displacement time series.To evaluate the segmentation, we used the mean squared error.The threshold is set to 15 % of the standard deviation of the analyzed time series.

RESULTS & DISCUSSION
In the following section, the models for the LOS displacement time series d PS of three different PS are presented.The exemplarily PS are the same ones that were examined in the previous study by Evers et al. (2021).Therefore, the selected models d PS are compared to the previously estimated models u PS .The PS are located within the red rectangles in Figure 1 (Area A, B, and C).They were selected because they are subject to different deformation phenomena: (1) subsidence due to the consolidation of building material, (2) varying uplift and subsidence, and (3) a landslide with varying velocity.PS 1 is located in Area C on the dam body of the Parapeiros-Peiros dam.The construction of the dam was finished in early 2019 and has been storing water since September 2019.The dam body is expected to subside due to the dead load of its weight and the stored water (Evers et al.,2020;Hunter et al., 2003).PS 2 is located in Area A, which covers the center and South of the city of Patras.It is a mainly residential area.Therefore, subsidence due to urbanization and deformation due to varying groundwater levels are expected.PS 3 is located in Area B. Area B is located southeast of Patras.It is a mountainous and landslide-prone area (Del Soldato et al., 2018).PS 3 belongs most likely to a landslide.Thus, a linear or piecewise linear deformation is expected to be observed.Figure 4 shows the original displacement time series of PS 1 (black) and the de-noised time series (blue).The time series was de-noised based on its WT with the third Daubechies-Wavelet and using soft-thresholding.The model d PS 1 was developed based on the de-noised signal.The displacement time series exhibits a quadratic trend with the following coefficients: (5) A seasonal component was not estimated because the periodogram of the de-noised time series did not reveal a significant frequency.Figure 5 shows the original time series (black), the trend (blue), and residual (red) components.The mean absolute deviation (MAD) of the model d PS 1 is 2.55 mm.Previously, the time series was approximated with a linear trend and seasonal component instead of a quadratic trend.The model u PS 1 has the following coefficients (Evers et al., 2021): The lower estimation for the interceptor and the selection of a different model are probably due to de-noising the time series before the model d PS 1 was estimated instead of only removing three outliers based on user input.However, a quadratic trend is in better agreement with the expected deformation phenomena than a linear trend combined with a seasonal component.The PS is experiencing subsidence due to the consolidation of the building and foundation material of the dam body.The subsidence was expected to start after the construction finished, which was in Spring 2019, and was expected to exhilarate with an increasing water level in the reservoir (Hunter et al., 2003).
The InSAR time series starts in January 2019 and end in January 2021 covering the time period in which the subsidence is expected to exhilarate.The MAD of the previous model u PS 1 is 2.18 mm. Figure 6 shows the original displacement time series and the denoised signal of PS 2. The estimated model d PS 2 consists of a seasonal and a linear trend component.back to a varying groundwater level in the area.However, groundwater level measurements would be needed to confirm this hypothesis.Figure 8 shows the original time series and the de-noised signal of PS 3.Both time series exhibit a piecewise linear trend with one CP.Figure 9a presents the results from the top-down segmentation algorithm.The algorithm divided the time series into two segments.The CP, i.e., the transition from the first to the second segment, occurs after 444 days.The first segment has a slope of -9.1 mm/year, and for the second segment, the slope changed to -1.95 mm/year.In both cases, the time series analysis revealed that PS 3 slowed down its movement.

CONCLUSION
In this paper, we presented the adaptations made to PSDefoPAT, which facilitate the automatization of its application.The required user input is reduced to one variable, which is the threshold to find the best PLR of the time series.The smaller the threshold is selected, the more segments are used for the PLR of the time series.
The estimated coefficients and calculated MAD values of all models are summarized in Table 1.The adjusted version of PSDefoPAT delivered similar results as the original one, except for PS 1.The deviations of the models estimated for PS 1 are probably based on the de-noising step introduced in the adjusted version of PSDefoPAT.However, we also outlined that the now selected quadratic model is in better agreement with the expected deformation phenomena than the previously estimated model.The next step is to apply the algorithm to an entire PSI data set and define a reliability value for each PS based on the discrepancy of the displacement time series and the estimated model for each PS. Cetin

Figure 1 .
Figure 1.Mean veolocity map of Patars and the surroundig region for the time between January 2019 and January 2021.
PSDefoPAT aims to characterize a given time series and provide a mathematical model that describes the temporal behavior of the displacement rates of each identified PS.The initially implemented version of PSDefoPAT was used manually and relied on user-specified input.Figure2presents the user interface of the original version of PSDefoPAT.The interface is structured into three areas.The map of estimated mean velocities is shown in Area I. General functions such as selecting a specific PS to be analyzed or exporting the results are provided in Area II.The examination of the displacement time series is divided into five steps: 1. Exploratory Data Analysis 2. Prepare Data 3. Trend Estimation 4. Examine Seasonality 5. Decomposed Displacement Time Series

Figure 2 .
Figure 2. PSDefoPAT userinterface structured into Area I, II and III

Figure 3 .
Figure 3. Exemplarily time series (black dots) approximated with a linear regression (blue line) and a piecewise linear representation (PLR) (red line).The PLR is defined by Segments A, B, and C, as well as the change points (CP)  and  2 .

Figure 7 .
Figure 7. Displacement time series of PS 2(a) decomposed into a linear trend and cyclic component, (b) the estimated additive model.

Figure 4 .
Figure 4. Original displacement time series of PS 1 (black stars) and the de-noised signal (blue).

Figure 5 .
Figure 5. Displacement time series of P1 2 (black) with a quadratic trend (blue) and the residual component (red).

Figure 6 .
Figure 6.Original displacement time series of PS 2 (black stars) and the de-noised signal (blue).

Figure 9 .
Figure 7a presents the separate components of the time series.The seasonal component is represented by the blue line, the trend component by the green line, and the residual component by the red line.Figure 7b presents the additive model of the time series, i.e., the trend and seasonal component are summed up.The model agrees with the original data.The average discrepancy between the additive model and the displacement time series is 3.02 mm.The average difference between the previously estimated model and the original time series is 4.73 mm.Previously, the time series was also approximated with a seasonal and a linear trend component.However, the coefficients of the model u PS 2 differ from the currently estimated ones.The period and the amplitude of the seasonal component were estimated to be 359.5 days and 5.44 mm.The slope and interceptor of the linear trend component were estimated to be -2.57mm/year and 3.03 mm(Evers et al., 2021).  2 = −2.57mm year  + 3.03 mm + 5.44 mm • sin ( 2 359.5 days * ( − 46 days)) (8)
(Evers et al., 2021)ecewise linear model d PS 3 fitted to the denoised signal.The MAD of the model is 2.89 mm, while the previously estimated model u PS 3 has a MAD of 2.46 mm.The previously selected model is also a PLR of the time series.However, a different CP was estimated.The CP occurred after 416 days.The slope changed from -12.05 mm/year to -1.679 mm/year(Evers et al., 2021).