MULTI-FREQUENCY POLINSAR DATA ARE ADVANTAGEOUS FOR LAND COVER CLASSIFICATION - A VISUAL AND QUANTITATIVE ANALYSIS

This paper investigates the enhanced potential of using multi-frequency PolInSAR data for land cover classification. In order to enable a descriptive analysis that goes beyond the mere comparison of classification accuracies, a two-step classification process is applied. First, polarimetric and interferometric features are extracted and projected into a 3-dimensional feature space by using the supervised dimension reduction algorithm Uniform Manifold Approximation and Projection (UMAP). Subsequently, based on the expressive 3-dimensional representation, a simple yet sufficient k-nearest neighbors (KNN) classifier is applied to assign a land cover class to each pixel. In this way, besides the simplified classification, the visualization of the underlying data structure is possible and contributes to a better explanation and analysis of classification results. The data analyzed in this way are airborne L- and S-band PolInSAR data acquired by the F-SAR system. The visual analysis of reduced feature spaces as well as the quantitative analysis of classification results reveal the benefits of combining both frequencies with regard to class separability.


INTRODUCTION
The possibility to acquire images regardless of cloud cover and daylight makes Polarimetric Synthetic Aperture Radar (PolSAR) systems a powerful tool for earth observation. By transmitting and receiving differently polarized microwave pulses, information-rich measurements are obtained that describe scattering processes occurring in the target area. The analysis of the polarization state of backscattered signals allows deriving information about the geometric shape, orientation and geophysical parameters of observed scatterers on the ground. Thus, PolSAR systems provide a rich data basis to generate accurate and up-to-date land cover maps. These maps play an important role in various application domains such as efficient planning and management of urban and agricultural land use or environmental monitoring (Lee and Pottier, 2017).
If an interferometric constellation (InSAR) is used for imaging, the phase difference of SAR image pairs yields additional valuable information. For example, the coherence of scattering mechanisms can be quantified, which provides insights into the structure and temporal variability of observed scatterers. Hence, the combination of polarimetry and interferometry (PolInSAR) results in a powerful observation space and offers the potential to improve land cover classification even further.
Current airborne PolSAR systems, such as F-SAR or AIRSAR, additionally enable the simultaneous acquisition in different frequency bands. Since electromagnetic waves penetrate media to different depths depending on their frequency, various observation frequencies result in different backscattering responses for the same observed area. The question that arises in the context of land cover classification is whether the combination of different frequency bands contributes to an improved or more fine-grained class separation.
The complementary information content of multi-frequency data was elaborated in detail in (Baronti et al., 1995) using airborne PolSAR images acquired in P-, L-, and C-band. First evidence of the added value of using this type of multi-frequency PolSAR data for land cover classification is demonstrated in (Chen et al., 1996). Based on classification results, derived by using a dynamic learning network, the superior discrimination capability of multi-frequency over single-frequency data sets is shown. In (Turkar et al., 2012), L-, C- and X-band data captured by spaceborne sensors are combined, resulting in an improvement in classification performance as well. Various methodical approaches have been proposed for the classification of multi-frequency data. These include the unsupervised H/A/α-Wishart classifier for dual-frequency PolSAR data introduced in (Ferro-Famil et al., 2001) and supervised machine learning approaches such as the use of a Random Forest (Hagensieker and Waske, 2018), a Support Vector Machine (Lardeux et al., 2009) or a Stacked Auto-Encoder as presented in (De et al., 2018).
In many studies (Chen et al., 1996, Shang et al., 2009, Turkar et al., 2012, Hagensieker and Waske, 2018) addressing the advantages of multi-frequency PolSAR or PolInSAR data for land cover classification, the potential and performance are evaluated solely in terms of classification accuracies. However, these accuracies are highly influenced by the selection of training and test areas. Furthermore, considering quantitative classification results only does not support the explainability of misclassifications. This limits the application-specific improvement of the classifiers. Aiming at explainability, the main challenge is the high number of PolInSAR features that are included in the classification. For human observers it is difficult to get a complete overview of all relevant features and their interaction and influence on the class separability. In addition to the difficulty of interpretation, the large dimension of the feature space complicates automatic classification. Since the volume of the feature space grows exponentially with the number of features, the training of a classifier requires significantly more labeled data. In addition, redundant information that may be present in the feature representation can negatively affect the performance of a classifier. To counteract this problem, feature selection or dimension reduction approaches have been proposed in the context of PolSAR data classification. For example, in (Chen et al., 1996) redundant features are identified and removed based on a correlation analysis. Additionally, methods for dimension reduction such as ISOMAP (Ainsworth and Lee, 2004), Laplacian Eigenmaps (Tu et al., 2012) and t-distributed Stochastic Neighbor Embedding (He et al., 2020) are proposed in order to represent relevant polarimetric information in just a few components.
In this paper the approach of dimension reduction in conjunction with a simple classifier is chosen to assess the advantages of multi-frequency PolInSAR data for land cover classification. Thereby, both highlighted challenges are overcome at the same time. First, reducing the feature space to only three dimensions allows a visualization of the structure of the data underlying the classification. This enables an instructive visual analysis of class separability as well as the explainability of classification results. Second, classification is greatly simplified by using a low-dimensional yet expressive representation of the data. In contrast to existing approaches, the method applied in this paper employs Uniform Manifold Approximation and Projection (UMAP) for dimension reduction. This algorithm is chosen, since, according to our analysis presented in (Schmitz et al., 2021), it is well suited for finding a 3-dimensional representation of PolInSAR data that preserves class separability.
The paper is structured as follows: In Section 2, the study site and the acquired data are presented. The methods to perform dimension reduction, classification and visualization are described in Section 3. In Section 4, a visual analysis and comparison of single- and multi-frequency data are presented based on UMAP results, and quantitative classification results are discussed. An additional focus in Section 4 is on the analysis of misclassifications. In Section 5, a summary is given and conclusions are drawn.

DATA SET AND STUDY SITE
The multi-frequency PolInSAR data set considered in this study originated from an airborne measurement campaign conducted in August 2020 on the German North Sea coast. The data were acquired by the F-SAR system, developed by the German Aerospace Center (Horn et al., 2009). This airborne SAR system is capable of capturing fully polarimetric data in five different frequency bands, four of which can be used at the same time. Interferometric measurements can be realized in single-pass and repeat-pass mode. Thus, the F-SAR system enables the acquisition of a highly informative data set. The PolInSAR image pairs used in this study are fully polarimetric SAR images, acquired simultaneously in L- and S-band in repeat-pass configuration. The resolution of single-look products is 0.6 m × 1.29 m in azimuth and range direction for L-band images and 0.5 m × 0.65 m for S-band images. In both cases the incidence angles vary between 25° (near range) and 55° (far range).
The captured scene considered in this study depicts a coastal strip on the German Wadden Sea between Neuharlingersiel and Carolinensiel. The geographic location and the processed PolSAR image in Pauli representation (S-band) are shown in Figure 1. The tide was low at the time of data acquisition, so that dry fallen tidal flats, musselbeds and water-filled tideways are visible on the seaward side. On the landward side, the tidal flat is adjoined by lane systems, salt marshes and two small sandy beaches. Beyond the dike, a large part of the depicted area represents agricultural land. In addition, two densely built-up residential areas and isolated farm yards are mapped.

METHODS
The approach used to classify single- and multi-frequency PolInSAR data comprises three main steps. First, based on filtered SAR images, polarimetric and interferometric features are extracted on pixel level that span a high-dimensional feature space. The following supervised dimension reduction is performed by applying UMAP. In this step the high-dimensional feature representation of each pixel is projected into a 3-dimensional Euclidean space, while retaining the local and global topological structure of the data. By making use of label information, data points representing the same class are projected close to each other in the reduced feature space, whereas data points referenced by different classes are separated. On the basis of the 3-dimensional feature representations, a k-nearest neighbors (KNN) classifier is used to assign a class label from a predefined set of land cover classes to each data point. One result of this approach is the classification map itself. The additional result is a visually presentable feature space that gives insights into the underlying data structure and provides support for analyzing classification results. In the following, the three main components of the classification approach and the generation of human-interpretable visualizations are described in detail.

Feature extraction
Each PolSAR image pixel is represented by its 2×2 scattering matrix. In order to extract polarimetric features from the L-band and S-band images, 3×3 coherency matrices are calculated and averaged over three azimuth bins. For further reduction of speckle noise, a Refined-Lee filter using a 7×7 window is applied. To decrease the influence of the incidence angle, covariance matrices are projected to the wave front plane using the γ0-convention. Based on the resulting corrected matrices, several polarimetric features, summarized in Table 1, are extracted using the PolSARPro Software (Pottier et al., 2018). The interferometric coherence, which describes the local phase correlations between two complex SAR images, was delivered along with the F-SAR data.
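To make the step from scattering matrix to eigenvalue-based features concrete, the following is a minimal NumPy sketch. It is not the PolSARPro implementation used in this study: it assumes a reciprocal target (S_hv = S_vh), uses plain sample averaging instead of the Refined-Lee filter, and all function names are hypothetical. The formulas for the Pauli vector, entropy, anisotropy and mean alpha angle follow the standard Cloude-Pottier definitions.

```python
import numpy as np

def pauli_vector(s_hh, s_hv, s_vv):
    """Pauli scattering vector, assuming reciprocity (S_hv == S_vh)."""
    return np.array([s_hh + s_vv, s_hh - s_vv, 2.0 * s_hv]) / np.sqrt(2.0)

def coherency_matrix(samples):
    """Average the outer products k k^H over (s_hh, s_hv, s_vv) samples."""
    t = np.zeros((3, 3), dtype=complex)
    for s_hh, s_hv, s_vv in samples:
        k = pauli_vector(s_hh, s_hv, s_vv)
        t += np.outer(k, k.conj())
    return t / len(samples)

def h_a_alpha(t):
    """Entropy, anisotropy and mean alpha angle (degrees) of a 3x3 coherency matrix."""
    eigvals, eigvecs = np.linalg.eigh(t)           # eigenvalues in ascending order
    eigvals = np.clip(eigvals[::-1], 0.0, None)    # descending, no tiny negatives
    eigvecs = eigvecs[:, ::-1]
    p = eigvals / eigvals.sum()                    # pseudo-probabilities
    nonzero = p > 0
    entropy = -np.sum(p[nonzero] * np.log(p[nonzero])) / np.log(3.0)
    denom = eigvals[1] + eigvals[2]
    anisotropy = (eigvals[1] - eigvals[2]) / denom if denom > 0 else 0.0
    # Alpha angle of each eigenvector from the magnitude of its first component.
    alphas = np.arccos(np.clip(np.abs(eigvecs[0, :]), 0.0, 1.0))
    mean_alpha = np.degrees(np.sum(p * alphas))
    return entropy, anisotropy, mean_alpha

# Illustrative synthetic samples (random numbers, not F-SAR data).
rng = np.random.default_rng(0)
samples = [tuple(rng.normal(size=3) + 1j * rng.normal(size=3)) for _ in range(9)]
t_avg = coherency_matrix(samples)
entropy, anisotropy, mean_alpha = h_a_alpha(t_avg)
```

The trace of the averaged coherency matrix equals the averaged total scattered power, which is a convenient sanity check when implementing this step.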

Table 1. Extracted polarimetric features.
Magnitudes of coherency matrix elements [dB]
$\mathrm{SPAN}(S) = |S_{hh}|^2 + 2|S_{hv}|^2 + |S_{vv}|^2$: Total scattered power [dB]
H, A, α: Entropy, Anisotropy (and 4 combinations of these) and mean alpha angle derived from the Eigenvalue Decomposition (Cloude and Pottier, 1997)

Extracted features from L- and S-band images are projected to ground-range geometry on a common 1 m × 1 m raster. In order to reduce the high dynamic range that is observed for several features, a feature-wise clipping based on the 3rd and 97th percentile is performed. Subsequently, each feature is normalized to the value range [0, 1]. By stacking the features of a single-frequency band, each pixel can be represented by a 46-dimensional feature vector. For the multi-frequency representation, concatenating the vectors consequently results in a 92-dimensional vector.
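The clipping, normalization and stacking described above can be sketched in a few lines of NumPy. This is an illustrative sketch under assumptions: the function name is hypothetical, the percentiles are computed over whatever array is passed in, and the random arrays merely stand in for the 46 features per band.

```python
import numpy as np

def clip_and_normalize(features, lower_pct=3.0, upper_pct=97.0):
    """Clip each feature (column) to its percentile range and scale it to [0, 1].

    features: array of shape (n_pixels, n_features).
    """
    lo = np.percentile(features, lower_pct, axis=0)
    hi = np.percentile(features, upper_pct, axis=0)
    clipped = np.clip(features, lo, hi)
    scale = np.where(hi > lo, hi - lo, 1.0)   # avoid division by zero for constant features
    return (clipped - lo) / scale

# Placeholder data standing in for the 46 features of each band.
rng = np.random.default_rng(0)
l_band = rng.normal(size=(1000, 46))
s_band = rng.lognormal(size=(1000, 46))

# Multi-frequency representation: concatenate the per-band vectors (46 + 46 = 92).
multi = np.hstack([clip_and_normalize(l_band), clip_and_normalize(s_band)])
```

Clipping before scaling keeps a few extreme values from compressing the bulk of a feature's distribution into a narrow part of the [0, 1] range.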

Supervised dimension reduction
The large number of features used to describe scattering properties not only poses difficulties for direct interpretation by human observers, but likewise increases the complexity of classifiers for automatic analysis. Some of the selected features have closely related physical interpretations. Thus, redundancies arise, which are ideally eliminated without loss of relevant information. For this step, the nonlinear dimension reduction method UMAP is applied. The UMAP algorithm, proposed and described in (McInnes et al., 2018), is based on neighborhood graphs and makes use of ideas from topological data analysis. The algorithm can be divided into two main steps: graph construction and graph projection. In the first step the topology of the high-dimensional data is approximated and represented by a fuzzy simplicial complex. This representation essentially provides a weighted graph, whose edge weights indicate how likely two data points are to be connected. Thereby, the connection probability depends on the distance of a point to its k nearest neighbors, where k is a hyperparameter. The second step involves determining a low-dimensional topological representation of the data that is as similar as possible to the high-dimensional representation in terms of local and global structure. This optimization problem is formulated as the minimization of the cross-entropy between the two topological representations. In addition to unsupervised dimension reduction, which is based solely on the data itself, the UMAP algorithm can be used for supervised dimension reduction that includes label information from reference data. For this purpose, in addition to the topological approximation of the high-dimensional data, a second fuzzy simplicial complex is constructed, which is based exclusively on the categorical distance of the reference labels.
Using the intersection of both complexes, one that relates to the data and another that relates to the reference labels, a joint topological representation is generated and subsequently a low-dimensional representation is optimized. As a result of supervised dimension reduction, a projection rule is obtained by which high-dimensional feature representations can be projected to a low-dimensional space. This is done in such a way that the data structure is preserved while different classes are separated from each other. The obtained projection rule can subsequently be applied to new unseen data without a known class label.
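For reference, the edge weights and the cross-entropy objective mentioned above can be written out as follows. The notation follows the formulation in (McInnes et al., 2018); the symbols are introduced here for illustration and are not used elsewhere in this paper. Here d is the chosen distance, ρ_i is the distance from x_i to its nearest neighbor, σ_i is a per-point scale chosen so that the weights over the k nearest neighbors sum to log2(k), and w_h and w_l denote the edge weights of the high- and low-dimensional representations.

```latex
% Directed membership strength of the edge from x_i to its j-th nearest neighbor x_{i_j}
w(x_i, x_{i_j}) = \exp\!\left( -\frac{\max\bigl(0,\; d(x_i, x_{i_j}) - \rho_i\bigr)}{\sigma_i} \right)

% Symmetrization of the directed weights (fuzzy set union)
w_{ij} = w_{i \to j} + w_{j \to i} - w_{i \to j}\, w_{j \to i}

% Fuzzy set cross-entropy minimized over the low-dimensional coordinates
C = \sum_{e \in E} \left[ w_h(e) \log \frac{w_h(e)}{w_l(e)}
    + \bigl(1 - w_h(e)\bigr) \log \frac{1 - w_h(e)}{1 - w_l(e)} \right]
```

The first term of C pulls points together that are strongly connected in the high-dimensional graph, while the second term pushes weakly connected points apart.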
In this study, supervised UMAP with Euclidean distance is applied to project high-dimensional feature representations of single- or multi-frequency PolInSAR data to a 3-dimensional Euclidean space (in the following referred to as reduced feature space). Thus, the visualization and intuitive capture of the data structure of the different data sets is enabled and the subsequent automatic classification is facilitated.

Classification
The output of the dimension reduction is not yet a classification result, but only a compact representation of the data. However, by using supervised dimension reduction, a data representation is already learned that separates different classes. Assuming that the learned representation is also capable of separating unseen test data correctly, classifying the data based on this representation is a straightforward task. Therefore, a simple k-nearest neighbors vote is used as a classifier that assigns a land cover class to each data point. This pixel-based classification does not include any spatial features. Thus, the level of noise still present in the feature images is directly reflected in the classification image. To eliminate isolated pixel classifications, i.e. pixels assigned to class y_i surrounded by pixels assigned to class y_j, a majority filter is applied in a 3×3 window in a post-processing step. Thus, the final result of the whole processing chain is a land cover map of the area covered by the PolInSAR data.
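The two operations of this step, the nearest-neighbor vote and the 3×3 majority filter, can be sketched as follows. This is a minimal NumPy illustration and not the implementation used in this study: function names are hypothetical, the vote uses uniform weights, ties go to the lowest class index, and image borders are handled by edge replication, all of which are assumptions.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    d2 = ((query_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]          # indices of the k closest points
    votes = train_y[nearest]                         # shape (n_query, k)
    n_classes = int(train_y.max()) + 1
    counts = np.stack([(votes == c).sum(axis=1) for c in range(n_classes)], axis=1)
    return counts.argmax(axis=1)

def majority_filter(label_map, n_classes):
    """Replace each pixel by the most frequent label in its 3x3 neighborhood."""
    padded = np.pad(label_map, 1, mode="edge")
    h, w = label_map.shape
    counts = np.zeros((n_classes, h, w), dtype=int)
    for dy in range(3):
        for dx in range(3):
            window = padded[dy:dy + h, dx:dx + w]    # shifted view of the label map
            for c in range(n_classes):
                counts[c] += (window == c)
    return counts.argmax(axis=0)

# Toy example in a 3-dimensional reduced feature space with two classes.
train_x = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                    [1.0, 1.0, 1.0], [0.9, 1.0, 1.0], [1.0, 0.9, 1.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])
pred = knn_predict(train_x, train_y, np.array([[0.05, 0.05, 0.0], [0.95, 0.95, 1.0]]), k=3)

# A single isolated pixel of class 1 is removed by the majority filter.
noisy = np.zeros((5, 5), dtype=int)
noisy[2, 2] = 1
smoothed = majority_filter(noisy, n_classes=2)
```

The brute-force distance computation is only practical for small arrays; for full scenes a tree-based or approximate neighbor search would be the usual choice.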

Visualization of reduced feature space
As a result of applying supervised UMAP, each PolInSAR image pixel is characterized by three components and can thus be visualized in a 3-axis scatterplot. The components plotted on the x-, y- and z-axes cannot be interpreted directly in physical terms. However, the relative position of the data points to each other in the reduced feature space is meaningful. Data points describing similar scattering processes, and thus probably belonging to the same land cover class, are adjacent, while differences in scattering responses are characterized by a large distance in the reduced feature space. Consequently, the class separability as well as intra-class variances in the underlying PolInSAR data become visible. For the interpretation of scatterplots, it should be noted that the inclusion of class information may lead to an artificial separation of two similar classes.
In order to visualize the low-dimensional representation of the PolInSAR data within a spatial context, colored images, in the following referred to as RGB visualization, are generated. For this purpose, the three components are scaled linearly, yielding a range of values from 0 to 255, and are interpreted as red, green and blue intensities. Each image pixel is colored based on its resulting associated RGB value. The RGB visualizations provide an intuitive way to rapidly identify which land cover classes are similar or dissimilar in the PolInSAR data set under consideration.
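The mapping from the three components to an RGB image can be sketched as below. The sketch assumes that each component is scaled independently to 0..255 and that the embedding rows are stored in row-major pixel order; neither detail is specified in the text, and the function name is hypothetical.

```python
import numpy as np

def embedding_to_rgb(embedding, height, width):
    """Map a (height*width, 3) embedding to a uint8 RGB image of shape (height, width, 3)."""
    lo = embedding.min(axis=0)
    hi = embedding.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)        # guard against constant components
    scaled = (embedding - lo) / scale * 255.0      # linear scaling per component
    return np.round(scaled).astype(np.uint8).reshape(height, width, 3)

# Placeholder embedding for a 3 x 4 pixel image.
rng = np.random.default_rng(0)
rgb = embedding_to_rgb(rng.normal(size=(12, 3)), height=3, width=4)
```

Because the scaling depends on the extent of each embedding, colors are only comparable within one visualization, not across separately computed embeddings.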

RESULTS
UMAP projection and classification are applied to PolInSAR data acquired simultaneously in L- and S-band. A visual analysis based on reduced feature spaces and a quantitative analysis based on classification results are performed. The objective of the investigations is to compare S- versus L-band data and to compare single- versus multi-frequency data with respect to their potential for land cover classification. By means of the classification results, poorly distinguishable classes are identified for the different data sets. The 3-dimensional representations of these classes are considered in detail in order to better understand the underlying nature of the issues.

Experimental setup
In order to apply the classification process, annotated training and test data are required. To this end, the study area is labeled manually, supported by optical data, defining ten land cover classes. The annotation and division of the scene into training and test areas are depicted in Figure 2. The number of annotated pixels per class varies greatly. The classes sand and asphalt are underrepresented in the training data with approximately 47,000 and 18,000 pixels, respectively. To reduce the strong imbalance between the classes, a subset of at most 100,000 labeled training pixels per class is used to determine projection rules for dimension reduction and to train the classifiers. This represents a compromise between class balance and the inclusion of valuable label information. To identify suitable hyperparameters of KNN classifiers (number of neighbors k and weight function for prediction), a 5-fold cross validation is performed using subsets of the training data. Subsequently, the trained models for dimension reduction and classification are applied to the unseen test data.
Figure 2. Manually generated reference labels for training and testing, covering ten land cover classes.
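The per-class cap on training pixels can be illustrated with a short NumPy sketch. How the subset was drawn in this study is not stated, so uniform random sampling without replacement is an assumption here, and the function name is hypothetical.

```python
import numpy as np

def subsample_per_class(labels, max_per_class, seed=0):
    """Return sorted indices keeping at most max_per_class samples of every class."""
    rng = np.random.default_rng(seed)
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        if idx.size > max_per_class:
            # Assumed strategy: uniform random draw without replacement.
            idx = rng.choice(idx, size=max_per_class, replace=False)
        keep.append(idx)
    return np.sort(np.concatenate(keep))

# Toy label vector: class 1 is underrepresented and is kept completely.
labels = np.array([0] * 50 + [1] * 5 + [2] * 20)
selected = subsample_per_class(labels, max_per_class=10)
counts = np.bincount(labels[selected])
```

Classes with fewer samples than the cap, such as sand and asphalt in this study, are kept in full, so the imbalance is reduced but not removed.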

UMAP visualization
The results of the supervised dimension reduction applied to the single-frequency data sets (L-band, S-band) and the multi-frequency data set (L- & S-band) are shown in Figure 3. For each data set, data points projected to the reduced feature space are displayed using the RGB visualization described in Section 3.4, and four scatterplots. In the first scatterplot (considered from left to right) projected data points belonging to the training set are mapped and colored according to their reference labels. The second plot depicts projected data points belonging to the test set, also colored according to their reference labels. These two plots provide information about the class separability and show how well the 3-dimensional representation, learned from the training data, can be transferred to unseen test data. The following two scatterplots show the same sets of projected data points, but this time colored according to their position in the reduced feature space. For colorization the axes are interpreted as red-, green- and blue-intensity scales, thus representing the color scheme of the RGB visualization shown above the scatterplots. The color schemes derived for the three data sets differ significantly, i.e. the same land cover classes appear in different colors. This is due to the fact that the projection rules used to derive the 3-dimensional representations were determined individually for the different data sets.
In the resulting representation only the relative position of the points to each other is meaningful. Thus, the absolute position of a point, which determines the color in the RGB visualization, is not directly related to the land cover class.
Based on Figure 3, the following observations are made: For all data sets, the learned 3-dimensional representations generalize sufficiently well to unseen data. This is concluded from the fact that the layout of the test data replicates that of the training data, including the internal structure of the clusters. For the most part, the formation of projected test data points into separated clusters follows the reference label well. Comparing the scatterplots of the three data sets with each other, the superiority of the multi-frequency data set in terms of class separability is evident. Using only L-band data, the musselbed class (orange) shows high intra-class variance and does not form a compact cluster. In addition, test data referenced as man-made (red) are mixed up with data referenced as high vegetation (dark green) and confusion occurs between farmland (brown) and meadow (light green). Compared to the L-band data, the structure of the S-band data generally shows better class separation. However, overlapping clusters and confusion between classes of test data are still present. As already highlighted, the most separable clusters are formed when using the multi-frequency data set. The only remaining challenge arises for test points belonging to the classes farmland (brown) or meadow (light green), that are located between the corresponding clusters in the reduced feature space.
In addition to the visual analysis of class separability, the exploration of the reduced feature spaces reveals further characteristics of the investigated data. One interesting observation is the split of the water (dark blue) reference class into two separated clusters, which arises for the single- and multi-frequency data sets. With additional consideration of the RGB visualizations, it can be concluded that this behavior is related to the incidence angle dependence of polarimetric features. As a pure surface scatterer, the water class is particularly affected, so that water surfaces appear different in the near and far range. This example illustrates even further the benefits of visualizing the data structure for the understanding and analysis of the data.

Classification results
The 3-dimensional feature representations, shown and discussed in the previous section, provide the input data for the KNN classifier that is used to assign a land cover class to each pixel. The achieved classification results based on single- and multi-frequency data are given in Figure 4 in the form of confusion matrices. Some observations that have already emerged from the preceding visual analysis are also reflected in the quantitative results. The two classes water and mudflats, which form visually separable clusters in the reduced feature spaces, are classified accurately for all three data sets. The classification based on L-band data frequently fails to identify the class musselbed, which has already been indicated by the high intra-class variance of this class observed from the RGB visualization. High error rates also occur between the three classes meadow, farmland and sand. Overall, the classification based on S-band data provides significantly better results. Compared to the L-band based results, the classification accuracy increases for the majority of classes. The remaining weakness is the confusion of sand and asphalt and the poor separation of meadow and farmland, which have nevertheless improved compared to the L-band results. One of the exceptions, where the L-band data set provides better results, is the assignment of the class man-made.
Based on the S-band data set, data points incorrectly classified as man-made occur frequently in areas of low vegetation. With the aid of resulting classification images, it was determined that the affected areas are salt marshes, which are crossed by narrow water ditches. A possible explanation is that double reflections occur between the water surface and the edge of the ditch. Since double reflections typically occur between the ground and building walls, there is confusion between the two classes low vegetation and man-made. The overall accuracy of the classification based on the multi-frequency data set, namely 88.84 %, does not significantly exceed the classification result based on the S-band data alone (88.23 %). However, a detailed review of the confusion matrix reveals that there is a substantial improvement for certain classes. By using the multi-frequency data, especially the classes man-made and low vegetation are better identified than by using only a single-frequency band. Only the class meadow is more often confused with the class farmland compared to the S-band based classification.
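The relation between overall accuracy and class-wise results can be made explicit with a small sketch. The numbers below are toy values, not results from this study, and the function names are hypothetical; rows of the confusion matrix are assumed to correspond to reference labels and columns to predicted labels.

```python
import numpy as np

def confusion_matrix(reference, predicted, n_classes):
    """Count matrix with reference labels along rows and predictions along columns."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(reference), np.asarray(predicted)), 1)
    return cm

def overall_accuracy(cm):
    """Fraction of all pixels that were labeled correctly."""
    return np.trace(cm) / cm.sum()

def per_class_accuracy(cm):
    """Fraction of the reference pixels of each class that were labeled correctly.

    Assumes every class occurs at least once in the reference labels.
    """
    return np.diag(cm) / cm.sum(axis=1)

# Toy labels for three classes.
reference = [0, 0, 0, 1, 1, 2]
predicted = [0, 0, 1, 1, 1, 0]
cm = confusion_matrix(reference, predicted, n_classes=3)
oa = overall_accuracy(cm)
pca = per_class_accuracy(cm)
```

Since the overall accuracy is dominated by classes with many pixels, a clear gain for a small class may change it only marginally, which is why the class-wise entries of the confusion matrix are examined in addition.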

Analysis of misclassification
In the following, classes which are often incorrectly assigned in the classification are considered in more detail. For this purpose, their representations in the 3-dimensional feature space are used. For the L-band data set, confusion between the classes farmland, meadow and sand (Figure 5) is analyzed. For the S-band data set, confusion between sand and asphalt (Figure 6) as well as between farmland and meadow (Figure 7) is explored. For the multi-frequency data set, the challenge of distinguishing farmland and meadow (Figure 8) is investigated. From the figures, two phenomena are observed that cause confusion in the classification. The first phenomenon becomes apparent in Figures 5 and 6. Test data points referenced by class y_i are projected to areas in which a cluster of another class y_j is formed by projected training data. In the case of L-band data (Figure 5), high-dimensional feature representations with the reference label farmland are projected to positions in the 3-dimensional space where the cluster meadow or sand is established for the training data set. A similar behaviour is observed in the reduced feature space based on S-band data for the classes sand and asphalt (Figure 6). The second phenomenon arises if there is no clear border between two classes but rather a smooth transition from one class to another. This is illustrated by the examples shown in Figures 7 and 8 for the classes farmland and meadow. In the case of S-band data (Figure 7), an overlapping area between the two classes is already developed in the representation of the training data. The misclassifications of test data arise exclusively in this transition area. For the multi-frequency data set, data points corresponding to farmland or meadow are well separated for the training data. However, several test data points are projected to regions between the two distinct clusters, resulting in a smooth transition here as well.
A continuous transition of the two classes is comprehensible for the data used in this study. The areas marked with the reference label farmland differ among themselves. While some areas are plowed fields, other fields have not yet been cleared of vegetation or crop residues. Especially in the area of unplowed fields, mixed pixels result, in which meadow and farmland fall into one resolution cell. The projection of such pixels to the area between the clusters or to the transition area of the two clusters is thus reasonable.

CONCLUSION
In this paper, it has been demonstrated in an illustrative and comprehensible manner that multi-frequency PolInSAR data are beneficial for land cover classification. For this purpose, a classification approach, which consists of supervised dimension reduction using UMAP followed by a KNN classifier, was applied to single- and multi-frequency (S- and L-band) PolInSAR data. Using this method provides a meaningful visualization of the feature space underlying the classification, and a simplified class assignment by reducing redundant information and using the resulting 3-dimensional feature representation. Based on the reduced feature spaces, the better separability of land cover classes for multi-frequency data becomes visually apparent and is equally reflected in the quantitative analysis of the classification results. The benefit of the provided visualizations was additionally demonstrated in the analysis of misclassifications. Future work will focus on evaluating the added value of simplified classification through the use of low-dimensional feature representations. In particular, the hypothesis that the applied method requires less training data while maintaining the classification performance will be investigated.