NON-LOCAL MEANS FILTER FOR POLARIMETRIC SAR SPECKLE REDUCTION-EXPERIMENTS USING TERRASAR-X DATA

The speckle is omnipresent in synthetic aperture radar (SAR) images as an intrinsic characteristic. However, it is unwanted in certain applications. Therefore, intelligent filters for speckle reduction are of great importance. It has been demonstrated in several literatures that the non-local means filter can reduce noise while preserving details. This paper discusses non-local means filter for polarimetric SAR (PolSAR) speckle reduction. The impact of different similarity approaches, weight kernels, and parameters in the filter were analysed. A data-driven adaptive weight kernel was proposed. Combined with different similarity measures, it is compared with existing algorithms, using fully polarimetric TerraSAR-X data acquired during the commissioning phase. The proposed approach has overall the best performance in terms of speckle reduction, detail preservation, and polarimetric information preservation. This study suggests the high potential of using the developed nonlocal means filer for speckle reduction of PolSAR data acquired by the next generation SAR missions, e.g. TanDEM-L and TerraSAR-X NG. * Corresponding author.


INTRODUCTION
Polarimetric synthetic aperture radar (PolSAR) has been widely applied in Earth observation, including land classification, snow cover mapping, surface geophysical parameter estimation, and so on. However, in certain applications, the presence of speckle limits the usage of PolSAR data, making speckle reduction a prerequisite.
Speckle filtering, or most of the filtering problems, can be viewed as a weighted averaging in spatial domain. The samples been averaged can be drawn within a local neighbourhood to the target pixel or be drawn anywhere in the whole image. The latter was introduced as the non-local means filter in (Buades et al., 2005), which is demonstrated to have great speckle reduction as well as detail preservation power. This paper analyses the non-local means filter in PolSAR image, with special attention on a new weight kernel combining with different similarity measures. To bring some background knowledge, a briefly introduction of the state-of-the-art of speckle filters is as follows.

Local filter
Local filter selects samples within a local window. The most straightforward one is the Boxcar filter, which averages all the pixels in a sliding window.
Instead of averaging all the samples in the local window, one could also adaptively select them. One well-known local adaptive filter is the Lee filter introduced by (Lee, 1980), which takes into account the statistics of the local window. In (Lee, 1981), Lee introduced the improved Lee filter by changing the square window into a rectangular or triangular window, whose exact shape is adaptive to the edge in the image. In this way, the orientation of structures is well preserved. (Vasile et al., 2006) applied homogenous region recognition using local statistics. After regions are recognized, they applied the strategy of the Lee filter for speckle reduction. (D'Hondt et al., 2013) utilised both spatial information and radiometric information (intensity in SAR and covariance matrix in PolSAR) to carry out weighted averaging in the local window. They not only consider the spatial relation of two pixels but also the information the pixels carry. (Buades et al., 2005) introduced a new technique called nonlocal means filter. This filter broke through the local limitation of conventional filters. Experiments and research have shown that the non-local means filter is capable of strong noise reduction while preserving details.

Non-local filter
The key studies in the non-local means filter lies in the similarity measure of two pixels. (Chen et al., 2011) treats it as a detection problem by modelling the PolSAR covariance matrix of a pixel in homogenous area as a Wishart distribution. A criterion derived from the likelihoods of the covariance matrices to the model measures how similar two pixels are. The similarity measures are mapped to weights by a kernel. (Liu and Zhong, 2014) introduced an additional prior term in the likelihood. (Deledalle et al., 2014b) extended the algorithm to general multi-pass SAR data, with local adaptation of parameters, and post-processing for bias reduction.

PolSAR data format
With horizontal and vertical linear polarizations, the PolSAR data can be expressed by the scattering matrix. HH HV The covariance and coherence matrix can be defined from the Lexicographic and the Pauli vector, respectively as follows: Covariance matrix: Coherence matrix: where H is the Hermitian transpose, and  denotes the expectation.
The joint distribution of Ω is usually modelled as a multivariate complex circular Gaussian distribution.

Workflow of non-local means filter
The non-local means filter selects similar pixels in the whole image, whereas in practise only a large enough search window is used, in order to reduce the computational cost.
While searching for similar pixels, not only the two pixels are compared, but also their respective neighbourhood, called patches. Comparing patches instead of just single pixels better preserves the structures.
Through certain comparison, the similarity value can be calculated. It is mapped into weights through a kernel, which are used in the weighted average. (Deledalle et al., 2014a) categorised similarity measures into three classes: the detection approach, the geometric approach, and the information approach. One example from each category and another similarity introduced by (Guo et al., 2015) are introduced in this section and analysed in the following.

Similarity measures
Detection approach compares the likelihoods of the pixels covariance matrices based on a statistical model. (Chen et al., 2011) models the covariance matrices as a Wishart distribution and employs a generalized likelihood ratio test. The criterion (6) describes the similarity of two PolSAR pixels.
where C i = PolSAR covariance matrix i = indicate different pixel L = number of looks |.| = determinate of matrix Δ = similarity value of two pixels Geometric approach does not assume any specific model, but rather utilises the geometric distance of two PolSAR covariance matrices to describe their similarity. One example (D'Hondt et al., 2013) of the geometric approach adopts an affine invariant metric (7).
where ||.|| F =Frobenius norm log=matrix logarithm Information approach originates from information theory. One example (D'Hondt et al., 2013) of the information approach utilises the Kullback-Leibler divergence, which measures the discrepancy of two probability distributions. Based on the assumption that the PolSAR data following multivariate complex circular normal distribution, they constructed a symmetric measurement of Kullback-Leibler divergence as follows: where tr=trace of matrix d=dimension of PolSAR covariance matrix Lastly, the trace approach is introduced by (Guo et al., 2015). It measures the difference of PolSAR covariance matrices using the criterion (9) 2 1 2 1 2 All the four similarity measures are defined in   , 0  , with 0 implies the highest simililarity, i.e. the analysed pixel is identical to the target pixel.

Data
All experiments in this work exploit fully polarimetric TerraSAR-X data acquired in the commissioning phase. The experimental data are shown in Figure 1 in the Pauli basis.

Analyses of similarity approaches
Similarity measures greatly influence the performance of the non-local means filter. In this section, different similarity measures are analysed in terms of homogeneous pixel recognition and detail differentiation.
For the experiment, two test areas are selected from the data, one homogeneous, and the other with regular pattern. Only the latter is shown in this paper ( Figure 2). One pixel from each test area is chosen as the target pixel. The target similarity images (TSI) of the selected target pixels are computed, which show the similarity values between the target pixel and every other pixel in the test area.

Test area one
The expected TSI is a constant zero for a homogeneous area. Therefore the mean and standard deviation of the similarity values in the TSIs are the matrices of the performance of different similarity measures. The closer the mean value to zero and the smaller the standard deviation, the better the similarity measures perform.

Test area two
The test area is shown in Figure 2. The pixel marked by the red dot in Figure 2 is the selected target pixel and the TSIs are shown in Figure 3. It can be observed that the detection approach, the geometric approach, and the information approach are capable of recognizing the selected pattern.
Among them the geometric approach can even detect the variance in the selected area and shows obvious contrast.  Figure 2 that the first three intersection points are very similar while the fourth and fifth ones are less similar. This fact is only precisely detected by the geometric approach. The similarity values along the red line also verify that the geometric approach is capable of detecting the contrast.
According to these two analyses, the geometric approach has the best performance on differentiation ability among these evaluated similarity approaches.

Exponential kernel
The most common exponential weighting kernel used in (Buades et al., 2005): where w = weight value Δ = similarity value h = filtering parameter Since the kernel function is determined, the filtering parameter is the most important factor, which influences the filtered results. Therefore, Figure 5 shows the impact of different filtering parameters on the filtered results. Choices of filtering parameter influence the speckle reduction and detail preservation. These two aspects always form a trade-off. It is obvious that selecting appropriate filtering parameter is crucial for the weight kernel.

Piecewise kernel
In addition to the exponential kernel as (10), there also exists the piecewise kernel (11), where the original filtering parameter acts as a threshold.
while the exponential kernel assigns small weight values even to dissimilar pixels, the piecewise kernel excludes these pixels from the filtering process. Hence, the piecewise kernel avoids the risk that, for isolated and very bright scatterers, which are quite common in SAR, even though these pixels are assigned small weights, they will have a considerable contribution when computing the weighted mean. Moreover, if the search window is large and few similar pixels of the target pixel are found, the piecewise kernel limits the filtering only to similar pixels and gives less biased results.
The selection of the filtering parameter is vital for both types of kernel. The best selection strategy is capable of differentiating similar pixel pairs and dissimilar pairs regardless of the employed similarity approach and patch size.

Proposed kernel
Based on the piecewise kernel, we propose a new kernel which adaptively determines the threshold. A user-selected homogeneous area and heterogeneous area are sampled and for both areas all similarity values are calculated. The two probability density functions of the similarity values are then used for determining the threshold. Figure 6. Probability density functions of similarity values calculated from homogeneous and heterogeneous areas Figure 6 shows the PDFs calculated from the samples of TerraSAR-X data to exemplify the selection of filtering parameter. The red curve and blue curve in Figure 6 are the PDFs of similarity values calculated from the heterogeneous area and the homogeneous area, respectively. These two curves have an intersection point T. On the left side of point T, the corresponding similarity values are very small, and are most likely calculated from dissimilar pixel pairs. Conversely on the right side of point T, the PDF of similarity values from homogeneous area has a higher probability.
This paper treats the similarity value corresponding to the intersection point T as a boundary to separate similarity values representing similar pixel pairs and dissimilar pixel pairs. This value is selected as the filtering parameter, similarity threshold, for the piecewise kernel.

Experiments
In this section, the proposed kernel is examined using TerraSAR-X data in conjunction with the detection approach, the geometric approach, and the information approach. These three filtered results are compared to the NLSAR (Deledalle et al., 2014b), DSM (Liu and Zhong, 2014), Pretest (Chen et al., 2011), Refined Lee (Lee, 1981), and IDAN (Vasile et al., 2006), where the first three are non-local means filters, and the last two are conventional local filters. The results of Refined Lee and IDAN are produced by PolSARpro (European Space Agency, 2011). The results of NLSAR, DSM, and Pretest are produced without any post-processing and parameter adaptive strategy. Figure 7 shows the filtered results of all these methods.

Evaluation
The results are evaluated based on three aspects: speckle reduction, detail preservation, and polarimetric information preservation.

Speckle reduction
The equivalent number of looks (ENL) is used as the index to evaluate these methods' performance on speckle reduction (Lee & Pottier, 2009). It is calculated with respect to three selected homogenous areas from experimental TerraSAR-X data for all three polarimetric channels. This index is shown TerraSAR-X data Higher ENL implies better speckle reduction. The best two filters: NLSAR and the newly designed kernel combined with the information approach, are marked in bold in Table 2.

Detail preservation
The measure of detail preservation is according to the edge preservation degree of ratio of average (EPD-ROA) (Feng et al., 2011): where m=number of pixels in selected area i=ith pixel in selected area I D1 , I D2 =adjacent pixels' values of filtered data I O1 , I O2 =adjacent pixels' values of original data For three areas which include edge structures are selected from experimental TerraSAR-X data, filtered by all these algorithms, EPD-ROA is calculated. Every polarimetric channel of these filtered data are evaluated using EPD-ROA. The values of the index are shown in Table 3.
While using EPD-ROA evaluates detail preservation, if the value of the index is closer to one, it means that the corresponding algorithm performs better on detail preservation. The best three filters regarding detail preservation, the designed kernel with the geometric approach, the designed kernel with the information approach, and DSM, are marked as bold in Table 3. One interesting coincidence is that DSM and the designed kernel with the information approach perform exactly the same on detail preservation, although they have total different similarity approach and weight kernel.  Table 3. EPD-ROA of filtered data

Polarimetric information preservation
The decomposed parameters, entropy and alpha, and the unsupervised classification using entropy-alpha are utilised to evaluate the filters' performance on polarimetric information preservation.
The entropy and alpha are derived from the eigenvalues decomposed from PolSAR covariance matrix. The entropy carries the information of the number of targets and the alpha carries the information of the scattering mechanism.  Figure 9 shows the alpha and the entropy of the pattern in Figure 2 respectively. By visual evaluation of the pattern preservation of alpha and entropy, the designed kernel with geometric approach outperforms the other filters, because its result shows the best preservation of the pattern structure.
Furthermore, the entropy-alpha unsupervised classification (Cloude & Pottier, 1997) is also used to evaluate the preservation of polarimetric information. The unsupervised classification separates the entropy-alpha plane into nine regions ( Figure 10), with each region associates to a class (Z1:Branch/Crown structure, Z2:Cloud of anisotropic needles, Z3:No feasible region, Z4:Forestry double bounce, Z5:Vegetation, Z6:Surface roughness propagation effect, Z7:Dihedral scatterer, Z8:Dipole, Z9:Bragg surface). Figure 10. Entropy-alpha plane and categories of the entropyalpha unsupervised classification (Cloude & Pottier, 1997) In the filtered TerraSAR-X data, a vegetation area and a double bounce area are selected according to the optical image of the same area, which belong to class Z5 and Z7 respectively. The entropy-alpha of each pixel in the two areas filtered by the eight filters are mapped into the entropy-alpha plane as depicted in Figure 11, which is then used to evaluate their performance concerning polarimetric information preservation.

Figure 11. Entropy-alpha unsupervised classification test on vegetation scatterers and double bounce scatterers
According to the unsupervised classification, the designed kernel with the three similarity measures and NLSAR produce correct and compact classification results that means better preservation of polarimetric information.
To sum up, firstly, Pretest, NLSAR, and the designed kernel with detection approach utilise the same similarity approach. Pretest performs worse compared to the other two filters on all three aspects. NLSAR slightly outperforms the designed kernel with detection approach on speckle reduction, but it is not comparable to the designed kernel with detection approach on detail preservation. The two filters have barely notable differences on the evaluation of the entropy-alpha unsupervised classification. However, the designed kernel with the detection approach has a much better performance concerning pattern preservation of the polarimetric information. So, the designed. kernel outperforms the kernels used in the methods of Pretest and NLSAR.
Secondly, according to the three evaluated aspects, the designed kernel with the geometric approach and the information approach are capable of strong speckle reduction while preserving details and polarimetric information.
At last, the designed kernel enjoys some advantages originating from its definition. The kernel has the advantages of the piecewise kernel as discussed in section 3.4. It is also data adaptive because its filtering parameter is automatically determined from data samples.

CONCLUSION AND OUTLOOK
In this paper, the non-local means filter for speckle reduction of PolSAR data is analysed and a new weighting kernel is proposed.
The similarity approaches, including the detection approach, the geometric approach, the information approach and the trace approach, are analysed. In summary, the information approach and the geometric approach are comparable and outperform the detection approach. Among these two, the geometric approach gives overall better performance.
The designed kernel is adaptive to the data. Experiments using TerraSAR-X data show that it is capable of strong speckle reduction while preserving detail and the polarimetric information.
The PDF used in this work is calculated from samples of the real data, which have to be manually selected. In the future, a fully automatic sampling scheme should be developed.