A COMPARISON OF SUB-PIXEL MAPPING METHODS FOR COASTAL AREAS

This paper compares three soft classification methods and three sub-pixel mapping methods for the classification of coastal areas at sub-pixel level. SPOT-7 multispectral images covering the coastal area of Perth are selected as the experimental dataset. For the soft classification, the linear spectral unmixing model, the supervised fully-fuzzy classification method and the support vector machine are applied to generate the fraction maps. For the sub-pixel mapping, the sub-pixel/pixel attraction model, pixel swapping and the wavelets method are compared. In addition, the influence of the correct fraction constraint is explored, and a post-processing step is implemented according to the known spatial knowledge of coastal areas. The accuracy assessment of the fraction values indicates that the support vector machine generates the most accurate fraction result. For sub-pixel mapping, the wavelets method outperforms the other two methods, with an overall classification accuracy of 91.79% and a Kappa coefficient of 0.875 after the post-processing step; it also performs best for waterline extraction, with a mean distance of 0.71 m to the reference waterline. In this experiment, the use of the correct fraction constraint decreases the accuracy of both sub-pixel mapping and waterline extraction. Finally, the post-processing step improves the accuracy of the sub-pixel mapping methods, especially those with the correct fraction constraint. The most significant improvement in overall accuracy is 4%, for the sub-pixel/pixel attraction model with the correct fraction constraint.


INTRODUCTION
Coastal image classification is important for monitoring changes of coastal features, such as the shoreline position and the coverage of coastal water bodies, sandy beaches and vegetation. Compared with other data sources such as aerial images and UAV-acquired images, satellite images have the advantage of large spatial coverage, which is efficient for large-scale coastal monitoring. However, the biggest disadvantage of most satellite images is their relatively low spatial resolution, which limits the accuracy of coverage estimation and boundary positioning. Sub-pixel mapping may be a solution to relieve this limitation. Generally, to realise classification at sub-pixel level based on the original pixel-level images, two main steps are implemented: soft classification, which predicts the percentage of each class inside a pixel, and sub-pixel mapping, which determines the distribution of sub-pixel labels.
Soft classification, also called sub-pixel classification, allows multiple class membership for each pixel and is designed to overcome the mixed pixel problem (Mertens, 2008). The estimation of class proportions inside each pixel leads to the generation of multiple fraction/abundance images, which are required for the following sub-pixel mapping step. Heremans and Van Orshoven (2015) compared 6 commonly applied machine learning methods, i.e. Multilayer Perceptron, Support Vector Regression, Least-Squares SVM (LS-SVM), Bagged Regression Trees, Boosted Regression Trees and Random Forests, for sub-pixel land cover classification with 8-day MODIS NDVI images based on multiple criteria. They found that SVMs outperform the other methods when time and training data are not considered. Other soft classification methods not belonging to machine learning include linear spectral unmixing (Settle and Drake, 1993) and fuzzy classification (Bastin, 1997). The soft classification step is very important in determining the final classification accuracy at sub-pixel level (Thornton et al., 2006, Mertens, 2008). Therefore, the comparison and selection of soft classification methods is implemented in this paper.
In this paper, Section 2 introduces the principles of the selected soft classification methods and sub-pixel mapping methods. Section 3 describes the experiment with SPOT-7 multispectral images and gives a comparison and discussion of the results. Section 4 gives the conclusions and future work.

Selected Soft Classification Methods
In this paper, linear spectral unmixing, fully-fuzzy classification and SVM method are selected and compared for the soft classification.
2.1.1 Linear Spectral Unmixing (LSU): The linear mixture model (LMM) represents any kind of spectral response as a linear combination of substances called endmembers, which are orthogonal to or independent of each other. Assuming there are M endmembers, the LMM can be expressed as:
$$x = \sum_{i=1}^{M} a_i s_i + w$$
where $x$ is the pixel spectrum vector, $s_i$ represents each endmember, $a_i$ is the fractional abundance of endmember $i$ and $w$ is the noise vector. To be physically meaningful, two constraints are usually applied. Firstly, the fractional abundances should be no less than zero. Secondly, the sum of the fraction values for each pixel should be one. More details of LSU can be found in the paper by Keshava and Mustard (2002).
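As an illustration of the two constraints, the following minimal Python sketch estimates the abundances of one pixel by projected gradient descent on the least-squares error, projecting onto the set of non-negative abundances that sum to one after every step. The solver choice, the function names and the three four-band endmember spectra are assumptions made only for this example; they are not the procedure or the data used in the experiment.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_i >= 0, sum(a) = 1}."""
    u = sorted(v, reverse=True)
    running_sum, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        running_sum += u_k
        candidate = (running_sum - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # the last k that satisfies the test defines theta
    return [max(v_i - theta, 0.0) for v_i in v]


def unmix(x, endmembers, iterations=2000):
    """Estimate abundances a so that sum_i a_i * s_i approximates x."""
    n_end, n_bands = len(endmembers), len(x)
    # The sum of squared endmember values bounds the largest eigenvalue of
    # S^T S from above, so 1 / step_bound is a safe gradient step size.
    step_bound = sum(value * value for s in endmembers for value in s)
    a = [1.0 / n_end] * n_end
    for _ in range(iterations):
        residual = [sum(a[i] * endmembers[i][b] for i in range(n_end)) - x[b]
                    for b in range(n_bands)]
        gradient = [sum(endmembers[i][b] * residual[b] for b in range(n_bands))
                    for i in range(n_end)]
        a = project_to_simplex([a[i] - gradient[i] / step_bound
                                for i in range(n_end)])
    return a


# Hypothetical endmember spectra (four bands) and a noise-free mixed pixel.
water = [0.05, 0.04, 0.03, 0.01]
sand = [0.40, 0.45, 0.50, 0.55]
vegetation = [0.05, 0.10, 0.06, 0.60]
truth = [0.2, 0.5, 0.3]
pixel = [sum(t * s[b] for t, s in zip(truth, (water, sand, vegetation)))
         for b in range(4)]
print([round(f, 3) for f in unmix(pixel, [water, sand, vegetation])])
```

With noisy pixels or poorly separated endmembers the estimate will deviate from the true fractions, which is one reason the choice of endmembers matters.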
This method has been widely applied in sub-pixel mapping, especially for hyperspectral images with many spectral bands. The commonly used software ENVI even has a linear spectral unmixing tool to extract the endmembers. However, the main drawback of LSU is the definition of endmembers and the assumption of orthogonality (Tompkins et al., 1997, Maselli, 2001). Nevertheless, this method is tested here because it is simple to understand and easy to implement.
2.1.2 Fully-fuzzy Classification (FFC): Bastin (1997) showed that fuzzy classification is more accurate than linear mixture modelling and the maximum likelihood classification method for Landsat TM imagery. Further, the fully-fuzzy classification method used by Zhang and Foody (2001) relaxes the requirements for training pixels compared with partially-fuzzy classification: the training pixels do not need to be pure, which can be beneficial for images with relatively low spatial resolution. It outperforms the partially-fuzzy classification method as it increases the degree of overlap between pure pixels. Their method is based on the fuzzy c-means algorithm (Bezdek, 2013), but instead of the mean and covariance of the samples, the fuzzy mean $v_i$ and fuzzy covariance matrix $C_i$ for class $i$ are calculated from the membership-weighted training samples (Wang, 1990), where $n$ is the number of classes. The optimisation is then realised by minimising an error function in which $m$ is the weighting exponent controlling the degree of fuzziness.
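For orientation, the fuzzy statistics and the fuzzy c-means error function are commonly written in the following form. This is a sketch in assumed notation rather than a reproduction of the equations of this paper: $N$ (number of training pixels), $x_k$ (spectrum of training pixel $k$), $u_{ik}$ (membership of pixel $k$ in class $i$) and the Mahalanobis form of the distance $d_{ik}$ are assumptions introduced here.

```latex
v_i = \frac{\sum_{k=1}^{N} u_{ik}\, x_k}{\sum_{k=1}^{N} u_{ik}},
\qquad
C_i = \frac{\sum_{k=1}^{N} u_{ik}\,(x_k - v_i)(x_k - v_i)^{T}}{\sum_{k=1}^{N} u_{ik}}

J_m = \sum_{k=1}^{N} \sum_{i=1}^{n} u_{ik}^{\,m}\, d_{ik}^{2},
\qquad
d_{ik}^{2} = (x_k - v_i)^{T}\, C_i^{-1}\, (x_k - v_i)
```

In this form a mixed training pixel contributes to the statistics of every class in proportion to its membership, which is why pure training pixels are not required.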
2.1.3 Support Vector Machine (SVM): Mountrakis et al. (2011) review the increasing number of recent works using SVM in remote sensing and conclude that SVM has good generalisation ability even with a limited number of training samples. SVM can be used not only for classification but also for regression; for soft classification, it is used for regression. The basic principle is to fit a model that predicts the possibility of each pixel belonging to each class based on the training samples (Heremans and Van Orshoven, 2015). The training samples are projected to a higher-dimensional feature space by applying an appropriate kernel function. Several kernel functions can be used, including the linear, polynomial, radial basis function (RBF) and sigmoid functions. In this paper, the RBF is applied. In the new feature space, a linear model is then fitted with maximal margin and minimal error. Therefore, the non-linear regression problem is solved as a linear regression problem in the higher-dimensional feature space (Smola and Schölkopf, 2004). Two parameters mostly affect the performance of the soft classification, i.e. the parameter $\gamma$ inherited from the kernel function and the error penalty parameter $c$. Details of the algorithm are not given in this paper; readers can refer to the tutorial by Smola and Schölkopf (2004).
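To make the role of the kernel parameter concrete, the short Python sketch below evaluates the RBF kernel in the form exp(-gamma * ||x - y||^2), which is the form used by LIBSVM. The two feature vectors are hypothetical values chosen only for illustration.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors already scaled to [-1, 1].
pixel_a = [0.2, -0.5, 0.1, 0.9]
pixel_b = [0.1, -0.4, 0.0, 0.7]
for gamma in (0.25, 4.0):
    print(gamma, rbf_kernel(pixel_a, pixel_b, gamma))
```

A larger gamma makes the similarity decay faster with spectral distance, so the fitted regression becomes more local; this is why gamma is usually tuned together with the penalty parameter.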

Selected Sub-pixel Mapping Methods
Three popular sub-pixel mapping methods, i.e. the sub-pixel/pixel spatial attraction model, pixel swapping and the ANN predicted Wavelet Transform, are tested and compared in this paper. These methods are introduced below.

Sub-pixel/pixel Attraction Model (SAM):
Spatial dependence, i.e. the tendency of spatially close observations to be more alike than spatially distant ones, is the basic assumption for sub-pixel mapping (Mertens et al., 2006). For spatial attraction sub-pixel mapping models, spatial dependence is expressed by the attraction of neighbouring pixels or sub-pixels. Based on the variation of attraction targets, different models such as the sub-pixel/sub-pixel attraction model (Liguo et al., 2011), the sub-pixel/pixel model (Mertens et al., 2006) and the pixel/pixel attraction model (Wang et al., 2012) were developed. The main advantages of these models are their simplicity and ease of understanding (Mertens et al., 2006). In this paper, the method proposed by Mertens et al. (2006), which involves the interaction between pixels and sub-pixels, is used. For each sub-pixel $p_{a,b}$ with sub-pixel coordinates $(a, b)$, its attraction value for class $c$ is calculated from the fraction values $P_{i,j}(c)$ of its neighbouring pixels with pixel coordinates $(i, j)$, each weighted by the inverse of the distance between the sub-pixel and that neighbouring pixel. The distance between the sub-pixel and its neighbouring pixels is illustrated in Figure 1.
For the definition of neighbouring pixels, there are three neighbourhood models (Mertens, 2008); the 'surrounding' model, where all the 8-connected pixels are considered in the calculation of attraction, is selected for this paper.
Figure 1. Illustration of coordinates of pixel and sub-pixel and distance calculation between them (adapted from Mertens (2008))
After the calculation of the attraction value for each class, one simple method to derive the sub-pixel mapping result is to directly label each sub-pixel with the class with the largest attraction value. Alternatively, the correct fraction (CF) constraint, which forces the number of sub-pixels for each class inside a pixel to be consistent with the fraction value of the soft classification result (Mertens et al., 2004), can be applied. In this paper, both SAM and SAM with CF constraint (SAM CF) are tested.
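The difference between the two labelling strategies can be sketched in Python as follows. The attraction is taken here as the sum over the 8 surrounding pixels of fraction divided by distance, with distances measured between centres in sub-pixel units. The function names, the largest-remainder rounding of the sub-pixel counts and the fixed class order used in the CF allocation are assumptions of this sketch, not details taken from the cited papers.

```python
import math


def attraction_values(fractions, row, col, scale):
    """Attraction of every sub-pixel of coarse pixel (row, col) to each class.

    fractions[c][i][j] is the fraction of class c in coarse pixel (i, j).
    """
    n_classes = len(fractions)
    n_rows, n_cols = len(fractions[0]), len(fractions[0][0])
    result = {}
    for a in range(row * scale, (row + 1) * scale):
        for b in range(col * scale, (col + 1) * scale):
            values = [0.0] * n_classes
            for i in range(max(row - 1, 0), min(row + 2, n_rows)):
                for j in range(max(col - 1, 0), min(col + 2, n_cols)):
                    if (i, j) == (row, col):
                        continue  # only the surrounding pixels attract
                    distance = math.hypot(a + 0.5 - scale * (i + 0.5),
                                          b + 0.5 - scale * (j + 0.5))
                    for c in range(n_classes):
                        values[c] += fractions[c][i][j] / distance
            result[(a, b)] = values
    return result


def label_by_largest_attraction(attractions):
    """Label each sub-pixel with the class of largest attraction (no CF)."""
    return {sp: max(range(len(v)), key=lambda c: v[c])
            for sp, v in attractions.items()}


def label_with_cf(attractions, pixel_fractions, scale):
    """Label sub-pixels so that class counts match the pixel fractions."""
    total = scale * scale
    exact = [f * total for f in pixel_fractions]
    counts = [int(e) for e in exact]
    by_remainder = sorted(range(len(exact)),
                          key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_remainder[:total - sum(counts)]:
        counts[c] += 1
    labels = {}
    for c, count in enumerate(counts):  # classes handled in a fixed order
        free = [sp for sp in attractions if sp not in labels]
        free.sort(key=lambda sp: attractions[sp][c], reverse=True)
        for sp in free[:count]:
            labels[sp] = c
    return labels


# Hypothetical 3x3 fraction images: water on the left, land on the right.
water = [[1.0, 0.5, 0.0], [1.0, 0.25, 0.0], [1.0, 0.5, 0.0]]
land = [[1.0 - value for value in line] for line in water]
att = attraction_values([water, land], 1, 1, 4)
print(sum(1 for v in label_by_largest_attraction(att).values() if v == 0))
print(sum(1 for v in label_with_cf(att, [0.25, 0.75], 4).values() if v == 0))
```

In this toy case the unconstrained labelling marks half of the centre pixel as water because it ignores the pixel's own fraction, whereas the CF variant keeps exactly the 4 of 16 sub-pixels implied by the fraction of 0.25.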

Pixel Swapping (PS):
In PS, the attraction value $A_i$ of each sub-pixel $x_i$ for class $k$ is calculated by a distance-weighted function of its neighbouring sub-pixels:
$$A_i = \sum_{j=1}^{m} \lambda_{ij}\, z_k(x_j)$$
where $m$ is the number of neighbouring sub-pixels, $z_k(x_j)$ indicates whether the neighbouring sub-pixel $x_j$ is labelled as class $k$ or not (1 or 0), and $\lambda_{ij}$ is a distance-dependent weight that decreases with the distance $h_{ij}$ between sub-pixel $x_i$ and its neighbouring sub-pixel $x_j$. Iteratively, in every pixel, the sub-pixel with the minimum attraction value among those classified as '1' and the sub-pixel with the maximum attraction value among those classified as '0' are swapped if the spatial correlation is increased by the swap (Atkinson, 2001, Atkinson, 2005). Since the proportion of each class inside each pixel is kept throughout the iteration (Thornton et al., 2006), PS is a method with CF constraint.
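One swap inside a single coarse pixel can be sketched in Python as below for a binary class image. The exponential weight exp(-h / a) with range parameter a, the square search window and the swap test (the weakest '1' has lower attraction than the strongest '0') are assumptions of this sketch; they follow the general idea of pixel swapping rather than the exact settings of the experiment.

```python
import math


def attraction(grid, r, c, radius=2, a=1.0):
    """Distance-weighted sum of the labels of the neighbouring sub-pixels."""
    total = 0.0
    for rr in range(max(r - radius, 0), min(r + radius + 1, len(grid))):
        for cc in range(max(c - radius, 0), min(c + radius + 1, len(grid[0]))):
            if (rr, cc) == (r, c):
                continue
            h = math.hypot(rr - r, cc - c)
            total += math.exp(-h / a) * grid[rr][cc]
    return total


def swap_once(grid, top, left, scale, radius=2, a=1.0):
    """Try one swap inside the coarse pixel whose corner is (top, left)."""
    def score(cell):
        return attraction(grid, cell[0], cell[1], radius, a)

    cells = [(r, c) for r in range(top, top + scale)
             for c in range(left, left + scale)]
    ones = [cell for cell in cells if grid[cell[0]][cell[1]] == 1]
    zeros = [cell for cell in cells if grid[cell[0]][cell[1]] == 0]
    if not ones or not zeros:
        return False
    weakest_one = min(ones, key=score)
    strongest_zero = max(zeros, key=score)
    if score(weakest_one) < score(strongest_zero):
        grid[weakest_one[0]][weakest_one[1]] = 0
        grid[strongest_zero[0]][strongest_zero[1]] = 1
        return True
    return False


# Two coarse pixels of 4x4 sub-pixels: the left one is pure class 1, the
# right one holds two class-1 sub-pixels placed far from the left pixel.
grid = [[1, 1, 1, 1, 0, 0, 0, 1],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 0, 0, 0, 1]]
print(swap_once(grid, 0, 4, 4))
print(sum(grid[r][c] for r in range(4) for c in range(4, 8)))
```

Because a swap only exchanges two labels inside the same pixel, the number of class-1 sub-pixels in that pixel is unchanged, which is the CF property noted above.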

ANN Predicted Wavelet Transform (ANN WT):
Wavelet transform (WT) of a 2D image decomposes the source image into approximation images at successively lower spatial resolutions and detail coefficients describing the difference between successive approximations (Ranchin and Wald, 2000). Inversely, with the approximation fraction image and the estimated detail coefficients, the higher-resolution fraction image can ideally be reconstructed without any information loss (Mertens, 2008, Mertens et al., 2004). Mertens (2008) introduced ANN to model the wavelet coefficients, considering ANN's noise resistance and modelling capability. Their experimental results indicate that the ANN WT method generally performs better than the Genetic Algorithm and the spatial attraction method. The higher-resolution fraction image is derived from the coarse fraction image by WT decomposition, estimation of the detail coefficients and WT reconstruction. Similar to the SAM and SAM CF methods, there are two variants, i.e. ANN predicted WT without or with the CF constraint, which are abbreviated as ANN WT and ANN WT CF respectively in the following.
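The lossless reconstruction property can be illustrated with a one-level 2D Haar transform in Python. The Haar wavelet with averaging normalisation is an assumption made for this sketch, since the wavelet family is not stated here, and the detail coefficients are computed from a known fine image, whereas in ANN WT they have to be predicted.

```python
def haar_decompose(image):
    """One-level 2D Haar transform using 2x2 block averages and differences."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    approx = [[0.0] * cols for _ in range(rows)]
    horizontal = [[0.0] * cols for _ in range(rows)]
    vertical = [[0.0] * cols for _ in range(rows)]
    diagonal = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            tl, tr = image[2 * r][2 * c], image[2 * r][2 * c + 1]
            bl, br = image[2 * r + 1][2 * c], image[2 * r + 1][2 * c + 1]
            approx[r][c] = (tl + tr + bl + br) / 4.0
            horizontal[r][c] = (tl - tr + bl - br) / 4.0
            vertical[r][c] = (tl + tr - bl - br) / 4.0
            diagonal[r][c] = (tl - tr - bl + br) / 4.0
    return approx, (horizontal, vertical, diagonal)


def haar_reconstruct(approx, details):
    """Invert haar_decompose exactly from approximation and details."""
    horizontal, vertical, diagonal = details
    rows, cols = len(approx), len(approx[0])
    image = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            a, h = approx[r][c], horizontal[r][c]
            v, d = vertical[r][c], diagonal[r][c]
            image[2 * r][2 * c] = a + h + v + d
            image[2 * r][2 * c + 1] = a - h + v - d
            image[2 * r + 1][2 * c] = a + h - v - d
            image[2 * r + 1][2 * c + 1] = a - h - v + d
    return image


# Hypothetical fine-resolution indicator image of one class (1 = present).
fine = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 1]]
coarse, details = haar_decompose(fine)
print(coarse)
print(haar_reconstruct(coarse, details))
```

With this normalisation the approximation image is exactly the class fraction of each 2x2 block, so sub-pixel mapping amounts to estimating the three detail images that the coarse fraction image does not contain.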

Dataset
A set of SPOT-7 pan-sharpened sample images over Port Beach in Perth, Australia was downloaded from the website http://www.geo-airbusds.com/en/23-sample-imagery. The images were acquired on 24 October 2014 with a spatial resolution of 1.5 m and four spectral bands. A spatial subset of 1024×512 pixels is selected and the location of the studied area is shown in Figure 3. To simulate the coarse-resolution image at pixel level, each of the four spectral bands is degraded to a pixel size of 6 m, which means the scale factor is 4. The reference classification map at sub-pixel level is manually created using the original multispectral images. The corresponding soft classification map at pixel level is then generated by calculating the percentage of each class inside every block of 4×4 original-resolution pixels.
For SVM, the same set of training pixels is selected. Apart from the intrinsic attributes of the original images, i.e. the pixel values of the multispectral bands, other attributes including the Normalised Difference Vegetation Index (NDVI) and the Normalised Difference Water Index (NDWI) (Gao, 1996) are also added to the input feature vectors to improve the accuracy of the soft classification. Moreover, the position of the pixel, i.e. its row and column, is included as well, considering the spatial distribution of different objects. All features are scaled to the range [-1, 1] to prevent features with much larger numeric values from dominating the others. The open source package LIBSVM developed by Chang and Lin (2011), distributed on the website https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/, was downloaded and used for the processing. To find the optimal combination of the parameters $\gamma$ and $c$, the parameter optimisation tool applying cross validation (CV) to the training samples is used. According to the optimisation result, the optimal parameters are 4.0 and 0.25 respectively.
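The generation of the pixel-level fraction maps from the sub-pixel reference map can be sketched in Python as follows. The function name and the tiny label map are hypothetical; a scale factor of 2 is used in the example only to keep it small, whereas the experiment uses 4.

```python
def fraction_maps(labels, n_classes, scale):
    """Fraction of each class inside every scale x scale block of labels.

    Returns fractions[c][i][j] for class c and coarse pixel (i, j).
    """
    rows, cols = len(labels) // scale, len(labels[0]) // scale
    fractions = [[[0.0] * cols for _ in range(rows)] for _ in range(n_classes)]
    share = 1.0 / (scale * scale)  # contribution of a single sub-pixel
    for r in range(rows * scale):
        for c in range(cols * scale):
            fractions[labels[r][c]][r // scale][c // scale] += share
    return fractions


# Hypothetical reference map with three classes (0, 1, 2).
reference = [[0, 0, 1, 1],
             [0, 1, 1, 1],
             [2, 2, 0, 0],
             [2, 2, 0, 1]]
for class_id, fraction_image in enumerate(fraction_maps(reference, 3, 2)):
    print(class_id, fraction_image)
```

By construction the fractions of all classes sum to one in every coarse pixel, which is the property that the CF constraint later relies on.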
To be consistent with the SVM and LSU methods, the same number of training pixels as for the SVM method is selected for FFC. However, some of those training pixels are changed to mixed pixels, as shown in Figure 4(b), with fraction values all above 0.9. The optimal value of 2.3 for the weighting exponent $m$ is derived by initial analysis to minimise the mean root mean square error (RMSE) of the five classes. Table 1 gives the accuracy of the soft classification results, including the RMSE and correlation coefficient (CC) for each class. It can be seen that SVM performs best whether judged by RMSE or by correlation coefficient. Specifically, it generates the most accurate result for the classification of water body, with an RMSE of 0.048 and a correlation coefficient of 0.997. LSU is generally better than FFC except for the classification of water body. This can be explained by the intra-class variability between the still water body and the nearshore water body with highly reflective foam and waves. The selection of training pixels within this area can significantly affect the resultant endmember spectrum.
The FFC method is almost as accurate as SVM for the classification of water body, with an RMSE of 0.052 and a correlation coefficient of 0.995, while for the other classes it generates the poorest result. Although the FFC method does not generate results as good as those of SVM, it has the advantage of relaxing the requirements for training pixels. This can be crucial when it is difficult to choose pure pixels, or when there are no pure pixels at all for some classes, a situation that violates the basic assumption of many algorithms. Since the soft classification result directly determines the final results, the SVM result is used in the following.
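The two accuracy measures used for the fraction images can be sketched in Python as below. The correlation coefficient is taken to be the Pearson coefficient, which is an assumption of this sketch, and the fraction values are made-up numbers for illustration only.

```python
import math


def rmse(estimated, reference):
    """Root mean square error between two equally long fraction lists."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimated, reference)) / n)


def correlation(estimated, reference):
    """Pearson correlation coefficient between two fraction lists."""
    n = len(reference)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    covariance = sum((e - mean_e) * (r - mean_r)
                     for e, r in zip(estimated, reference))
    spread_e = math.sqrt(sum((e - mean_e) ** 2 for e in estimated))
    spread_r = math.sqrt(sum((r - mean_r) ** 2 for r in reference))
    return covariance / (spread_e * spread_r)


# Hypothetical water fractions of five coarse pixels.
reference_water = [1.0, 0.75, 0.5, 0.25, 0.0]
estimated_water = [0.95, 0.80, 0.45, 0.30, 0.05]
print(rmse(estimated_water, reference_water))
print(correlation(estimated_water, reference_water))
```

The two measures are complementary: RMSE reflects the absolute size of the fraction errors, while the correlation coefficient reflects how well the spatial pattern of the fractions is reproduced.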

Sub-pixel Mapping
In order to compare the sub-pixel mapping techniques with pixel-level classification, the hard classification result of SVM at pixel level is resized to the same dimensions as the sub-pixel result; this result is abbreviated as HC for convenience. In addition, to improve the performance of the PS method, the ANN WT CF result is used as its initial input instead of a random distribution. The results of the selected sub-pixel mapping methods are shown in Figure 5. It can be seen that, for all the methods, some parts of the wet sandy beach are misclassified as low-reflection objects.
Although many small objects are lost by the methods without CF constraint, especially SAM, the methods with CF constraint seem to be less accurate since they generate many scattered pixels or patches. Cohen's Kappa coefficient and the overall accuracy (OA) for those methods are presented in Table 2. Firstly, ANN WT generates the most accurate result, with OA and Kappa coefficient of 91.3% and 0.867 respectively, followed by SAM with 91.21% and 0.865. However, even for the two most accurate methods, i.e. ANN WT and SAM, the accuracy does not improve much compared with the hard classification result using SVM: the increase in OA is only 0.21% and 0.12% respectively. Secondly, consistent with visual judgment, the methods without CF constraint generate more accurate results than both the methods with CF and HC. By applying the CF constraint, the accuracy decreases considerably and becomes even worse than that of HC. Taking the most accurate method, ANN WT, as an example, its accuracy drops to the lowest of all methods after applying CF. Thirdly, among the three methods with CF constraint, PS is the most accurate, with OA and Kappa coefficient of 88.79% and 0.831 respectively.
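For reference, OA and Cohen's Kappa coefficient can be computed from a confusion matrix as in the following Python sketch. The two-class matrix is a made-up example and does not correspond to any entry of Table 2.

```python
def overall_accuracy(confusion):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Agreement corrected for the agreement expected by chance."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    row_sums = [sum(confusion[i]) for i in range(n)]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical confusion matrix (rows: reference, columns: classified).
matrix = [[45, 5],
          [10, 40]]
print(overall_accuracy(matrix))
print(cohen_kappa(matrix))
```

Kappa is lower than OA whenever part of the agreement could be obtained by chance, which is why the two measures are reported together.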

Post-processing
To improve the final results, a post-processing step incorporating known geographic knowledge is implemented. Firstly, the water body is assumed to form a single connected region, and isolated water body areas are reclassified. While this might not hold for some special areas, it can reduce misclassification in most coastal areas.
Secondly, very small patches are regarded as misclassified. Verhoeye and De Wulf (2002) used a smoothing filter to reduce the isolated sub-pixels in their sub-pixel mapping results, which is a reverse step of sub-pixel mapping (Mertens, 2008); in this paper, filtering is avoided. Instead, the sub-pixels in these patches are re-labelled according to the available information. For ANN WT and ANN WT CF, each of these sub-pixels is reclassified to the class with the second largest fraction value. Similarly, for SAM and SAM CF, they are reclassified to the class with the second largest attraction value. For PS, since no information at sub-pixel level is available, each sub-pixel is reclassified to the class with the largest number of sub-pixels in its neighbourhood window.
Thirdly, the areas classified as high-reflection or low-reflection objects between the sandy beach and the water body, or inside the water body, are assumed to be misclassified and are re-labelled as water or sandy beach. This misclassification might be caused by foam and waves, which are highly reflective and more similar to high-reflection objects than to the water body, or by wet sand, which is dark and more similar to low-reflection objects. This phenomenon is hard to avoid even with some training points in those ambiguous areas. The sub-pixel mapping results after post-processing are shown in Figure 6. The accuracy and its improvement compared with the results without post-processing are shown in Table 3. After post-processing, ANN WT is still the most accurate and is slightly better than HC. The accuracy of HC is also improved, and HC is more accurate than all other methods except ANN WT. The methods with CF constraint are improved considerably; the improvement of SAM CF is the most significant, with 4.00% in OA and 0.06 in Kappa coefficient, making it even more accurate than SAM. SAM CF and ANN WT CF are comparable with the HC result after post-processing, which indicates that the methods with CF constraint can potentially be improved with proper post-processing.
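The first post-processing rule, a single connected water body, can be sketched in Python with a breadth-first search over the class map. The use of 4-connectivity and the replacement class passed by the caller are assumptions of this sketch, since the class to which isolated water areas are reassigned is not specified above.

```python
from collections import deque


def connected_regions(labels, target):
    """List the 4-connected regions of cells whose label equals target."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != target or seen[r][c]:
                continue
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == target):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def keep_largest_region(labels, target, replacement):
    """Relabel every region of target except the largest one."""
    cleaned = [row[:] for row in labels]
    regions = connected_regions(labels, target)
    if not regions:
        return cleaned
    largest = max(regions, key=len)
    for region in regions:
        if region is not largest:
            for y, x in region:
                cleaned[y][x] = replacement
    return cleaned


WATER, SAND = 0, 1  # hypothetical class codes
class_map = [[0, 0, 1, 1],
             [0, 0, 1, 0],
             [0, 1, 1, 1]]
for row in keep_largest_region(class_map, WATER, SAND):
    print(row)
```

The same region search could also be used to find the very small patches of the second rule, by relabelling regions below a size threshold instead of all but the largest.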
Since the waterline is a very important feature of coastal areas, the waterlines are extracted from the post-processed classification maps. The distance of each pixel in the extracted waterline to the reference waterline is recorded. The mean value and RMSE of the distances for each extracted waterline are shown in Table 4. From this table, it can be seen that ANN WT not only performs best at classification, but also generates the most accurate waterline, with a mean distance of 0.47 pixels, an RMSE of 0.67 pixels and a maximum distance of 2.68 pixels. The other methods are still not as good as HC, which is consistent with the classification accuracy assessment. In addition, all the methods with CF constraint generate poorer results than those without CF constraint, which indicates that the CF constraint is not beneficial for the mapping accuracy of boundaries between gradually changing classes, such as the waterline along sandy beaches. Under the CF constraint, the errors inherited from the soft map force sub-pixels to be assigned to the corresponding classes and thus result in many small patches around the boundary position. Therefore, it is expected that CF is more appropriate for areas with many small objects. Besides, an accurate soft classification map is crucial.
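The waterline statistics can be sketched in Python as below. The nearest-point definition of the distance to the reference waterline and the point lists are assumptions of this sketch; the conversion to metres uses the 1.5 m sub-pixel size of the dataset.

```python
import math


def nearest_distances(extracted, reference):
    """Distance from each extracted point to its nearest reference point."""
    return [min(math.hypot(er - rr, ec - rc) for rr, rc in reference)
            for er, ec in extracted]


def waterline_errors(distances_px, pixel_size_m):
    """Mean, RMSE and maximum distance in pixels, plus the mean in metres."""
    n = len(distances_px)
    mean_px = sum(distances_px) / n
    rmse_px = math.sqrt(sum(d * d for d in distances_px) / n)
    return {"mean_px": mean_px, "rmse_px": rmse_px,
            "max_px": max(distances_px), "mean_m": mean_px * pixel_size_m}


# Hypothetical waterline points given as (row, column) in sub-pixel units.
reference_line = [(0, 1), (1, 1), (2, 1), (3, 2)]
extracted_line = [(0, 1), (1, 2), (2, 2), (3, 4)]
print(waterline_errors(nearest_distances(extracted_line, reference_line), 1.5))

# The reported mean distance of 0.47 pixels corresponds to about 0.71 m.
print(round(0.47 * 1.5, 3))
```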
In the future, real coarse-resolution images instead of simulated images will be used for the objective assessment of sub-pixel mapping techniques. Besides, MRF-based methodologies such as the fully spatially adaptive sub-pixel mapping method (Aghighi et al., 2014) will be compared with the ANN WT method. Moreover, the soft classification accuracy is expected to be improved by incorporating other information such as texture. Finally, more factors should be considered in the future in order to assess a sub-pixel mapping method objectively and comprehensively. For example, Ling et al. (2008) found that the pixel swapping method generated very different results with different scale factors, while in this paper only the scale factor of 4 is tested.

CONCLUSIONS
In this work, a set of SPOT-7 multispectral images is used to test the performance of soft classification methods and sub-pixel mapping methods for the classification of coastal areas at sub-pixel level. Firstly, for the soft classification, the LSU, FFC and SVM methods are used to generate the fraction maps. The RMSE and correlation coefficient of the fraction values both indicate that SVM is the most accurate among the three methods. Then, using the soft map generated by SVM, the SAM, SAM CF, ANN WT, ANN WT CF and PS methods are tested for their sub-pixel mapping performance. A post-processing step based on the known spatial knowledge of coastal areas is then implemented to improve the results. According to the accuracy assessment of the classification and waterline extraction results, ANN WT is the most accurate sub-pixel mapping method among those tested. Specifically, the overall classification accuracy of ANN WT is 91.79% with a Kappa coefficient of 0.875 after the post-processing step, and the mean distance of the extracted waterline to the reference waterline is 0.47 pixels, i.e. 0.71 m. Besides, it is found that the CF constraint decreases the accuracy of both sub-pixel mapping and waterline extraction for the studied coastal area. Finally, post-processing can be very important for some methods, especially those with CF constraint: the most significant improvement in overall accuracy is as much as 4%, for the SAM CF method.

Pixel Swapping (PS): The PS method was first proposed and tested on simulated imagery by Atkinson (2001) and then developed by other researchers such as Thornton et al. (2006), Luciani and Chen (2011) and Yuan-Fong et al. (2012). Firstly, the binary classification image at sub-pixel level is generated by random allocation according to the fractions of classes. Then, the attraction value $A_i$ of each sub-pixel from its neighbouring sub-pixels for class $k$ is calculated by a distance-weighted function.

To derive the fraction image $F_{j+1}$ at the higher resolution level $j+1$ from the input coarse fraction image $F_j$ at resolution level $j$, four steps are needed, as shown in Figure 2. Firstly, WT decomposition is applied to the fraction image $F_j$ at resolution level $j$. The coefficient images at the same resolution level as the input fraction image are then generated. Finally, the fraction image $F_{j+1}$ at the higher resolution level $j+1$ is generated by the WT reconstruction step.

Figure 3. Studied area
Figure 4. (a) Training pixels for LSU and SVM. (b) Training pixels for FFC.

Table 1. Accuracy assessment of soft classification methods

Table 2. Accuracy assessment of sub-pixel mapping methods

Table 3. Accuracy of post-processed results