META-LEARNING FOR WETLAND CLASSIFICATION USING A COMBINATION OF SENTINEL-1 AND SENTINEL-2 IMAGERY

ABSTRACT: In wetland mapping, much of the uncertainty stems from selecting an appropriate classification approach. Although individual models for this task are available and well established in the literature, combination approaches have recently become popular. Selecting an appropriate method, whether an individual model or a combination, is therefore challenging. In this work, a meta-learning study is performed to show that combining the results of individual machine learning models can outperform the best single model, and to investigate the applicability of meta-learning to wetland classification. We first explore the importance of the extracted features for each model. Then, the most important features are fed to each model with well-tuned hyper-parameters. Finally, a voting classifier is adopted as the meta-learning approach to improve the classification result. The classification map of the study area reached the highest accuracy (Overall Accuracy = 93.9% and Kappa = 0.92) when the proposed ensemble classifier was employed. The results show the superiority of a combination of methods over simple model selection approaches and can provide new insights for researchers seeking new combination strategies to improve classification results.


INTRODUCTION
Wetlands are areas submerged or saturated by water either year-round or throughout part of the year (Tiner, 2016; M. Mahdianpari et al., 2020b). Due to the complex nature of wetlands and the internal variability of wetland classes, discrimination of such areas is challenging (Mahdianpari et al., 2019; M. Mahdianpari et al., 2020a). Analyzing different classification strategies can therefore mitigate these issues and help to better map and monitor such vital regions. As no two algorithms are alike, it becomes impossible to attribute a positive performance to a particular modeling choice. Evaluating the combined use of individual algorithms can thus serve as guidance for taking advantage of several methods at the same time. A meta-learning analysis could yield insights by linking the properties of algorithms and data features to deliver promising results. The core idea behind a meta-learning strategy is to combine the outcomes of several classifiers, ranging from a single base classifier to complicated models, to achieve a better classification result (Jafarzadeh et al., 2021). In remote sensing (RS) studies, the use of multi-source Earth observation data collected in different portions of the electromagnetic spectrum to improve classification accuracy has increased considerably in the past decade (Briem et al., 2001; Jafarzadeh et al., 2021). Hence, in this study, freely available optical and Synthetic Aperture Radar (SAR) data are used to explore and analyze a meta-learning approach and improve the accuracy of wetland classification.

Study Area
Wetlands compose a significant portion of the Canadian landscape, especially in the province of Newfoundland and Labrador (NL). The study area encompasses the Avalon region on the island of Newfoundland, located on the easternmost coast of Canada. This region includes four main wetland types: bog, fen, marsh, and swamp. The wetland ground-truth dataset used in this research was obtained via field campaigns conducted in the summers of 2015-2020.

Methodology
Several machine learning approaches have been proposed in the literature (Dietterich, 2000; Rokach, 2005). Here, the goal is to investigate a meta-learning approach to wetland classification. To this end, several well-known decision tree-based models that are widely used for classifying RS data (Masoud Mahdianpari et al., 2020; Georganos et al., 2018), especially for wetland classification, are selected: Decision Tree (DT), Random Forest (RF), Gradient Boosting Machine (GBM), and Extreme Gradient Boosting (XGBoost). These models are discussed in detail in Jafarzadeh et al. (2021). The first step in using these models is hyper-parameter tuning. Following the assessments reported in Jafarzadeh et al. (2021), the tuning parameters are set as shown in Table 2.
Table 2. Hyper-parameter tuning of classifiers.
In the second step, the feature importance for each model is computed. As listed in Table 2, ten features were computed and extracted in GEE for each set of imagery (i.e., ten features for each of Sentinel-1 and Sentinel-2). The relative importance of the features for each model is depicted in Figure 1. As illustrated in Figure 1, out of the 20 features, the most important were Red, EVI, GCVI, NIR, and SpanHH_HV for the DT method; EVI, NDVI, NIR, Green, and HV for the RF method; NIR, Blue, DVI, EVI, and Red for the GBM method; and EVI, NIR, NDWI, Blue, and DVI for the XGBoost method.
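For readers unfamiliar with the optical indices named above, the following minimal sketch computes them for a single pixel. The formulas are the commonly used textbook definitions and are an assumption here, since the exact definitions (for example, which NDWI variant) are not stated in this section; the function name and the reflectance values are hypothetical.

```python
def spectral_indices(blue, green, red, nir):
    """Return common optical indices for one pixel of surface reflectance (0-1).

    The definitions below are the widely used ones and are assumed, not taken
    from the paper.
    """
    return {
        "NDVI": (nir - red) / (nir + red),
        # McFeeters (Green/NIR) variant of NDWI; assumed
        "NDWI": (green - nir) / (green + nir),
        "DVI": nir - red,
        "GCVI": nir / green - 1.0,
        "EVI": 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0),
    }


# Hypothetical reflectances for a vegetated pixel.
pixel = spectral_indices(blue=0.04, green=0.08, red=0.06, nir=0.40)
```

In practice these indices would be computed per band image in GEE rather than per pixel in plain Python; the sketch only makes the arithmetic explicit.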

Classification maps
Wetland classification is performed using the five most important features for each model. Figure 2 shows the classification maps for each model. The next subsection presents the evaluation of each model.
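The selection of the five most important features per model can be sketched as a simple ranking step. This is an illustrative sketch only: the function name `top_k_features` and the importance scores are hypothetical and are not the values behind Figure 1.

```python
def top_k_features(importances, k=5):
    """Return the k feature names with the largest importance, highest first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical importance scores for one model (made up for illustration).
example_importances = {
    "EVI": 0.21, "NIR": 0.18, "NDWI": 0.12, "Blue": 0.10,
    "DVI": 0.09, "Green": 0.05, "HV": 0.04, "Red": 0.03,
}
selected = top_k_features(example_importances, k=5)
```

Each model would then be retrained on only the columns listed in `selected`.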

Confusion matrices
The confusion matrices for the obtained classification maps are shown in Figure 3. Comparing the results shows that these classifiers are capable of classifying most non-wetland classes correctly. However, as mentioned earlier, the complexity of the wetland categories impedes class discrimination. Although over 93% of non-wetland classes are classified correctly, all methods perform worst on the swamp and marsh classes. The DT method classifies only about 45% of the swamp and marsh classes correctly.
The RF (with about 58% and 61% of correctly classified swamp and marsh classes) and GBM (with about 58% and 54% of correctly classified swamp and marsh classes) have better results than DT. Overall, the XGBoost approach has the best classification result.
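The per-class percentages quoted above are read from the rows of a confusion matrix. A minimal sketch of that calculation follows, assuming rows hold reference labels and columns hold predicted labels; the matrix counts are invented for illustration and are not the data in Figure 3.

```python
def per_class_accuracy(matrix, labels):
    """Fraction of each reference class that was predicted correctly.

    Assumes matrix[i][j] counts samples of reference class i predicted as j.
    """
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(matrix[i])
        result[label] = matrix[i][i] / row_total if row_total else float("nan")
    return result


# Hypothetical counts, not taken from the paper.
labels = ["swamp", "marsh", "non-wetland"]
matrix = [
    [45, 30, 25],
    [20, 45, 35],
    [3, 4, 93],
]
accuracy = per_class_accuracy(matrix, labels)
```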

Meta-learning result
Here, the voting classifier (VC) is adopted as the meta-learning approach for the final classification. This method applies majority-rule voting to the class labels predicted by the individual models. It is a simple strategy that aims to produce a better classification map by combining the results of several methods, so we expect it to reach a high accuracy in wetland classification. The classification map and confusion matrix of the VC are illustrated in Figure 4. Like the previous methods, the VC achieves high accuracy in classifying non-wetland areas. As expected, however, the wetland classification accuracy also increases with this approach. Comparing its performance in discriminating the swamp and marsh classes with that of the XGBoost method (the best among the ensemble methods used) shows that the classification accuracy has increased.
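Majority-rule voting over predicted labels can be sketched in a few lines. The sketch below is not the implementation used in this study: it assumes the four tree-based models are the voters, the labels are invented, and the tie-breaking rule (ties go to the label that appears first among the voters) is an arbitrary choice made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by hard voting.

    predictions: list of equal-length label lists, one per classifier.
    Ties are resolved in favor of the label seen first (an assumption).
    """
    voted = []
    for sample_votes in zip(*predictions):
        counts = Counter(sample_votes)
        voted.append(counts.most_common(1)[0][0])
    return voted


# Hypothetical predictions for four samples from four classifiers.
dt = ["bog", "swamp", "marsh", "fen"]
rf = ["bog", "marsh", "marsh", "fen"]
gbm = ["fen", "marsh", "swamp", "fen"]
xgb = ["bog", "marsh", "marsh", "bog"]
final_labels = majority_vote([dt, rf, gbm, xgb])
```

With an even number of voters, ties are possible, so a production setup would need an explicit and documented tie-breaking rule.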

Quantitative evaluation
To assess the overall performance of the classification maps, the Overall Accuracy (OA) and Kappa metrics are selected. These metrics are well known and used in a variety of RS applications.
Table 3. Accuracy assessment of the classification algorithms.
According to the quantitative evaluation, the DT algorithm predicted the classes (wetland and non-wetland areas) with an OA of 88.32% and a kappa of 0.85. RF, GBM, and XGBoost achieved better results than DT, in ascending order of accuracy. The OA and kappa for the VC approach were 93.9% and 0.925, respectively, the highest in this study.
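Both metrics follow directly from a confusion matrix: OA is the share of samples on the diagonal, and kappa corrects that share for the agreement expected by chance from the row and column totals. The sketch below uses a small made-up matrix, not the results in Table 3, and the function name is hypothetical.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix.
oa, kappa = overall_accuracy_and_kappa([[40, 10], [5, 45]])
```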

CONCLUSION
Wetland classification and the discrimination of wetland types are challenging tasks using Earth observation data. An appropriate classification approach is therefore needed to deal with this complexity. This study compares individual classifiers with a meta-learning approach that combines classifiers via a majority vote rule, applied to wetland classification. The results show that the classification map of the Avalon area reached the highest accuracy when the voting classifier was employed as the meta-learning strategy, demonstrating the superiority of a combination of methods over single-model selection approaches. For future work, combining more complex classification models for meta-learning is suggested. Moreover, designing new strategies for meta-learning may prove to be an essential research path for improving classification accuracy.