MONITORING AND ASSESSMENT OF AGRI-URBAN LAND CONVERSION USING MULTI-SENSOR REMOTE SENSING AND GIS TECHNIQUES

Continuous agricultural land conversion poses threat to food security but this has not been monitored due to ineffectual policies. One of the Philippine provinces with a high rate of conversion is the rice-producing province of Cavite. To assess the spatiotemporal dynamics of agricultural land conversion in Cavite, this study aims to develop an operational methodology to produce Land Use and Land Cover (LULC) change maps using a multi-sensor remote sensing approach for decision making and planning. LULC maps were generated using Random Forest Classification of Landsat 8 and Sentinel-1 image collections. Spectral indices, combinations of radar polarizations (VV, VH), and their principal components were included to improve its accuracy. Conversion maps were generated by taking the bi-annual difference of LULC maps from 2016 to 2019. Accuracy was assessed using visual inspection with Google Earth Pro. Classification was carried out using single-sensor (optical or radar) and multi-sensor (optical and radar) approach in combination with three feature selection algorithms, namely, Sandri and Zuccolotto (2006), Liaw and Wiener (2015), Kursa and Rudnicki (2010). Multi-sensor and single sensor yielded similarly high overall accuracies (OA = 96%) with the exception of single-sensor radar approach (OA = 53%). Multi-sensor approaches exhibit high accuracies (Cumulative Accuracy = 91%) in detecting agricultural to built-up LULC change up to 5,000 square meters unlike single-sensor optical approach (Cumulative Accuracy = 76%). Among the multi-sensor approaches, the method of Liaw and Wiener (2015) remains to be superior as it only uses eight (8) variables.


INTRODUCTION
As an agricultural country, the Philippines had been reliant on rice, its most important agricultural commodity, for its economic growth and food security as it constitutes for about 15% of the gross value added in the agriculture industry while serving as the main source of food (David & Balisacan, 1995;Abdullah, Ito, & Adhana, 2006). Its economic contribution is exhibited by the overall land use of the country as nearly two-thirds of the country's farm lands produce rice (David & Balisacan, 1995). Moreover, the demand for rice has been increasing based from the average growth rate of the consumption per capita at 0.73% from 1996 to 2003 (David & Balisacan, 1995;Abdullah, Ito, & Adhana, 2006). To help support and sustain this valuable industry in the country, especially in attaining food-security, implies the need for effective monitoring systems and tools, and one way to implement that is through Remote Sensing and GIS technologies (Ravanera, 2018;David & Balisacan, 1995).
Various studies have provided substantial findings on the use of remote sensing in agricultural monitoring applications such as that of Torbick et al. (2017) which utilized optical imagery to monitor rice crop lands. A research of the Philippine Rice Research Institute in 2019, on the other hand, developed a model utilizing radar-based imagery to detect and monitor rice crops. However, using optical or radar imagery alonehas been found to have limitations. Therefore, for agricultural RS applications, it was proposed to integrate both imageries as they could produce better results (Torbick et al, 2017;Boschetti et al, 2015;Joshi et al, 2016, Dusseux et al, 2014. * Corresponding author This study therefore aims to support the agriculture industry in the country by developing an operational methodology to produce Land Use and Land Cover (LULC) change maps, specifically focusing on agricultural land conversion, using optical and radar remote sensing datasets. This paper presents the following: (1) generation of LULC change maps of the study area using multi-sensor and single-sensor image analysis; (2) comparison of the approaches and results based on the accuracies of the generated LULC change maps; and (3) assessment of the spatio-temporal dynamics of agricultural land conversion in the study area.
Since free access to radar images was realized only recently, the timeframe of this study has been limited to 2016 and 2019 only. Moreover, the study focused on agricultural to built-up LULC changes only. Agricultural land, in this study, is defined to include cultivated and uncultivated croplands, grasslands and fallow lands. These sub-categories of agricultural lands are not distinguished in this study.

Integration of Optical and SAR Satellite Imagery
While both types of datasets have been utilized in LULC mapping applications, each has its own advantages and limitations. Optical remote sensing uses visible, near infrared (NIR), shortwave infrared (SWIR) sensors, and their derivative indices to assess and monitor vegetation growth and land cover dynamics (Lefsky and Cohen, 2003;Shaw and Burke, 2003;Dusseux et al., 2014;Fieuzal, 2011). However, for temporal analysis, compositing/mosaicking is needed as optical imagery is more prone to cloud cover. This leads to sparse time series and ambiguous pixels due to the temporal differences and atmospheric variations of the optical images being mosaicked (Dusseux et al, 2014;Erasmi and Twele, 2009;Joshi et al., 2016;Torbick et al., 2017).
Radar remote sensing can capture surface information and produce dense time series regardless of weather and the Sun's presence (Idol, Haack and Mahabir, 2015;Erasmi and Twele, 2009). With varying dielectric constant to the surface, radar remote sensing can monitor soil conditions (Lefsky and Cohen, 2003;Fieuzal et al., 2011). However, it is limited to a single microwave. The inherent presence of speckle in the images results to uncertainties and poor accuracies during analysis (Idol, Haack and Mahabir, 2015;Joshi et al., 2016).
The combination of optical and radar-based datasets for land use and land cover mapping is really suggested as the synergism between both datasets can be utilized for improved analysis and results (Joshi et al., 2016;Erasmi and Twele, 2009).

Dimensionality Reduction and Variable Selection
Combining optical and radar-based datasets directly usually results to high data dimensionality and redundant information. This combination compromises interpretability. Principal Components (PC) analysis and Random Forest (RF) variable selection help in simplifying datasets with high dimensionality (King and Jackson, 1999;Genuer, Poggi and Malot, 2010;Argamosa et al., 2018). PC analysis summarizes variances of variables into uncorrelated dimensions producing different components with newly extracted information which are weighed using eigenvalues (King and Jackson, 1999). On the other hand, RF designates variables with importance scores based on the increase in the mean of the error of a tree in a forest when the variable is employed during a regression or classification process (Genuer, Poggi and Malot, 2010). A high variable importance signifies a greater contribution to a model's accuracy while a low variable importance suggests that a variable has no significant contribution to the model regardless of the permutation applied (Argamosa et al., 2018).
Various studies utilize RF for variable selection. In 2006, Sandri and Zuccolotto proposed RF variable importance which is measured by calculating the centroid of the variable importance values then selecting variables that are within a given threshold/average distance. This method has been effective as the noise variables tend to cluster together while significant variables appear as outliers (Sandri and Zuccolotto, 2006). On the other hand, Argamosa et al. (2018) uses a different implementation of variable selection following Liaw and Wiener (2015). In their study, the variables were arranged by increasing variable importance scores and grouped into different regions based on the similarity of scores as defined by a relatively flat slope (Liaw and Wiener, 2015). An abrupt change in slope denotes a new region and for each region, the variable having the highest variable importance score was selected (Liaw and Wiener, 2015). A variable was not selected in the first region due to its low importance (Liaw and Wiener, 2015). Kursa and Rudnicki (2010) used "shadow variables" which are generated with random pixel values. Z-scores of the variables are computed and all variables with an importance lower than the maximum Zscore value among the shadow variables were omitted (Kursa and Rudnicki, 2010).

Random Forest Classification
For multi-sensor remote sensing applications, Random Forest (RF) classifiers have been proven to produce superior and consistent accuracies even for large sets of data with high dimensionality (Torbick et al, 2016;Pal, 2003;Baghdadi and Zribi, 2016;Pelletier et al, 2016). Furthermore, RF requires shorter training time and easy stable parameterization because its parameters cause only small influence on the classification accuracy (Pelletier et al, 2016). However, RF shows possible overfitting of data but is still advantageous due to its ability to produce higher accuracies and efficiently process large and diverse datasets (Pal, 2003).

Dataset Derivatives
Various derivatives can be utilized for improved classification resulting in better LULC maps. For optical imagery, Normalized Difference Vegetation Index (NDVI), Normalized Difference Water Index (NDWI), Normalized Difference Built-up Index (NDBI) and the Normalized Difference Bareness Index (NDBal) are among the indices which help discriminate land features (Ghosh, Chatterjee and Dinda, 2018;Li and Chen, 2014;Sharma, Ghosh and Joshi, 2012). NDVI utilizes the Red and Near infrared regions to emphasize greenness of land features making vegetation more visible compared to other features (Torbick et al, 2017;Szabo, Gacsi and Balazs, 2016). For water features, NDWI takes advantage of the near and short-wave infrared regions (Szabo, Gacsi and Balazs, 2016). NDBI is another index used to extract built-up features and lastly, the NDBal index discriminates bare soil features however, alternative to this index is the Bare Soil Index (BI), developed to model Forest Canopy Density (FCD) and exhibited better discrimination of the builtup and bare soil features (Macarof and Statescu, 2017;Li and Chen,2014;Sharma, Ghosh, and Joshi, 2012;Rikimaru, Roy and Miyatake, 2002).
Similarly, radar-based imagery also has its derivatives and among the commonly used are the different combinations of polarizations such as the difference, mean and ratio of vertical and horizontal polarizations. These have been found to significantly increase classification accuracy (Abdikan et al, 2016).

Study Area
The area of study is the rice-producing province of Cavite, located in the CALABARZON region of the Philippines ( Figure  3). CALABARZON, one of the main producers of rice, fears for food security as agricultural lands are continuously converted due to urban development or expansion (Cabildo, 2017; Philippine Statistics Authority CAF, 2012).

Data Processing
Cloud Computing -Google Earth Engine (GEE): Preprocessing of satellite imagery up to generation of conversion maps were all performed in Google Earth Engine. Google Earth Engine (GEE) was used as a tool for cloud computing and processing most especially for researchers who lack highperformance computational resources (Torbick et al, 2017).
Google Earth Pro: Generated KML files were overlaid in Google Earth Pro to validate conversion. Said Regions-ofinterest (ROIs) were checked individually to determine conversion mapping accuracy for each category.

Satellite Data
This study utilized Landsat 8 and Sentinel-1 satellites. With a 16day repeat cycle Earth coverage and having a total of 11 bands ranging from visible to infrared wavelengths with an average 30m spatial resolution, Landsat is able to provide timely and high-quality optical multispectral data which can be utilized for global and regional change detection and surface characterization applications (United States Geological Survey, 2020). Sentinel-1 offers SAR data with a 6 day repeat cycle at the equator which can be used for improved temporal change detection and analysis (ESA, 2019).
The integration of the optical and radar-based datasets requires the same temporal coverage for the images. The Google Earth Engine Data Catalog provides an overlap of satellite imagery for both sensors starting mid-2015 pushing the timeframe of the study from 2016 to 2019.
Optical-based Data: USGS Landsat 8 Surface Reflectance Tier 1 imagery derived from the Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) were utilized in this study. Images were grouped by year to generate annual cloud-free composite images. Two image collections were formed, one for the year 2016 and another for 2019. The 'pixel_qa' band derived from the CFMASK algorithm detects pixels with cloud and cloud shadow attributes masking clouds in the surface reflectance image. Annual cloud-free composites were then formulated by taking the median of the multiple intra-year pixel values as its value.
where RED = pixel values of the red band BLUE = pixel values of the green band GREEN = pixel values of the green band NIR = pixel values of the near infrared band SWIR = pixel values of the shortwave infrared band Radar-based Data: Level 1 C-band Synthetic Aperture Radar Ground Range Detected (SAR GRD) of Sentinel was utilized. Images were grouped by month before radiometric and geometric calibration. The polarizations 'VV' and 'VH' were selected before taking the mean value of each group. This was used to compute for other derivatives of the radar-based dataset.

Land Use and Land Cover Mapping
Land cover classes need to be defined before generating LULC maps (Baghdadi et al., 2016;Jin et al., 2018;Torbick et al., 2017). For this study, the land cover classes to be used are (1) Agriculture: cultivated and uncultivated croplands, grasslands and fallow lands, (2) Built-up: settlements, lands used for artificial or industrial use such as roads, (3) Bare soil: bare fields and mining sites, (4) Pure and mixed forests, and (5) Water: inland waters, lakes, reservoirs, and flooded lands.
According to Jensen and Lulla (1987), the minimum number of training pixels should be at least 10 times the number of variables used in the classification model. Since there are five land cover classes, the minimum training population was 50 pixels. However, to improve training for the LULC classification model, 250 training pixels were selected and a total of 150 pixels were used for testing. Training pixels were marked through high resolution satellite imagery as seen from Google Earth Pro.
Marking of these features were also based on the familiarity and personal knowledge of the researchers on the study area eliminating confusion in classification of some pixels.

Principal Component Analysis (PCA)
Before Random Forest Classification, the number of variables for a single composite was seventy (70): ten (10) from the optical dataset and sixty (60) from the radar dataset. This was computationally expensive and time consuming for Google Earth Engine; thus, dimensionality reduction was implemented.
PCA allowed the creation of a linear combination of variables while retaining the influence of variables that were most important to the classification (King & Jackson, 1999). Google Earth Engine (GEE), however, has a limit of twelve (12) input variables when performing PCA. This was performed first on the optical dataset as it only had ten (10) variables, although this did not produce positive results as each variable was unique to its counterparts. The radar dataset was split into five (5) groups: one for each polarization derivative consisting of the month values intra-year. It appeared as that the variables in each group showed more relationship as a single Principal Component (PC) eigenvalue already accounted for more than 95% of the sum of the eigenvalues of all the PCs.

Random Forest Classification
Number of Decision Trees: The optimal number of trees was determined by slowly increasing it and watching for out-of-the bag error (OOB) convergence (Jin et al., 2018). Its effect on the overall accuracy of the model was also observed.

Models based on Variable Importance:
The original seventy (70) variables were reduced to just twenty-six variables (26) after the implementation of PCA.
A model was constructed based on the method of dictating which variables were deemed to be important, some of which led to the reduction of many variables while one, none. Five (5) models were created: (a) the method of Sandri and Zuccolotto (2006), (b) the method of Liaw and Wiener (2015), (c) the method of Kursa and Rudnicki (2010), (d) a method inspecting whether a single-sensor (optical) would suffice to accurately perform land cover classification; (e) a combination of the three (3) former methods by taking the mode of the generated composites.

Land Cover Change Detection
Conversion Map Generation: Conversion maps were produced by taking the bi-annual differences of LULC maps from 2016 to 2019. Pixels that did not change land cover or were not of agricultural land cover in the year 2016 were masked out or given a value of 0 representing no agricultural conversion present. Four (4) conversion maps were generated; this was done to compare conversion maps using the different models.
Spatial Filtering: Spatial filtering simplified raster maps before converting them into a vector format. A majority filter changes a raster value based on its contiguous neighbouring cells while boundary filter uses the expand and shrink method to clean ragged edges between zones (ESRI, 2014).

Categorizing Conversion:
Final raster maps were converted into vector format. Shape areas of individual polygons were computed and categorized according to size: (a) 50,000 sqm and above; (b) 10,000 sqm to 50,000 sqm; (c) 5,000 sqm to 10,000 sqm; (d) 1,000 sqm to 5,000 sqm; (e) 500 sqm to 1,000 sqm. Each category was exported as a separate KML file and was imported into Google Earth Pro for accuracy assessment.

Principal Component Analysis (PCA):
The first principal component (PC1) of the VH, VV, Mean Backscatter, and Backscatter Ratio groups represented more than 90% of the total variation of all data. Thus, for these groups, the respective PC1s were used. This is not the case for the backscatter difference group, for which at least ten (10) PCs are needed to sufficiently describe variability. The twelve (12) backscatter difference variables were therefore retained instead of using the PCs to preserve all the information. Table 1 shows the final pool of variables for the Random Forest Classification. The optical imagery dataset comprised of ten (10) variables: six (6) spectral bands and four (4) spectral indices. The radar-based imagery dataset had sixteen (16) variables: four (4) principal components and twelve (12) backscatter differences, one for each month of the year. The total number of variables in the pool was twenty-six (26).   Table  3 shows the accuracy assessment of the different methods under Random Forest Classification. All methods used the same number of decision trees (1000). The first four (4) methods showed very close overall accuracies (OA), suggesting that any one of these four (4) methods can be used to produce accurate LULC maps.

Summary of Image Classification Accuracies:
One thing to note, however, is the number of variables used in each method, the least of which is the method of Liaw & Wiener (2015) with just eight (8) variables. On the other hand, the method of Kursa and Rudnicki (2010) produced the best overall accuracy but required all twenty-six (26) variables to attain such accuracy.
Comparing the confusion matrices, Sandri and Zuccolotto's method (2006) presented a lower error of commission for agriculture and built-up land cover. Kursa and Rudnicki's method (2010) inched over the other methods in terms of commission error for bare soil land cover. The opposite happened when analyzing the omission errors. Kursa and Rudnicki's method (2010) performed best in classifying agriculture and built-up land cover.
Consequently, Sandri and Zuccolotto's method (2006) dominated the rest for bare soil land cover. The optical based dataset led in terms of omission error for the forest land cover. All methods yielded the same accuracy for water land cover as the first three (3) indicated fewer numbers of unmasked pixels due to the inclusion of the SAR dataset.
The single-sensor radar-based dataset exhibited very poor results. LULC change maps for this method were not generated due to its low classification accuracies.  Table 3. Random Forest Classification Statistics

Integration of Different Variable Selection Methods:
An integrated LULC map was also generated in this study. It used the methods of Sandri and Zuccolotto (2006), Kursa and Rudnicki (2010) and Liaw and Wiener (2015) and were compared to one another at pixel level. The integrated map is based on the mode of the land cover class pixel values from the three LULC maps. As seen in Table 4, the different LULC maps agreed mostly with one another. Pixels which showed little or no agreement could be attributed to mixed land cover present in one pixel or uncertainties during random forest classification.   Figure 3 illustrates accurately classified agricultural land conversions within the study area. As seen in the figure, the region-of-interest (ROI) must have been agricultural land before or in 2016; by 2019, it must have been built-up land. The conversion in the ROI must have the same area, more or less, as the classified ROI. Errors in area overlaps and gaps were attributed to limitations in spatial resolution as long as they were kept to a minimum. The LULC conversion maps were not entirely accurate. Accuracy assessment of the detected conversion of agricultural lands to built-up was implemented using visual inspection with Google Earth Pro. Misclassifications of converted agricultural lands were also detected. Various cases can be attributed due to changes in image color despite showing the same land cover. Another factor was differences in phenological stages. These were attributed to the compositing of the optical datasets. The compositing of different Landsat 8 images with different temporal description resulted in the ambiguity of some regions in the output composite image.

Mapping of Conversion of Agricultural Lands to Builtup
Aside from misclassification, there were also cases of omitted agricultural land conversions. For some, while there were detected conversions, the size of the resulting ROI did not reflect the actual size of the conversion. This was attributed to the misclassification of the model between built-up and bare soil features. The limitation of the study to detect agricultural land conversions to built-up only could have resulted in the omission of other agricultural land conversion to built-up which were rather classified as conversion to bare soil due to the misclassification of the produced RF classification model. From 2016 to 2019, the number of agriculturally converted lands continued to increase. Looking at the maps above, the locations of these lands are in the northern part of Cavite bounded in the same direction by National Capital Region (NCR), the seat of government and a metropolitan area in the country. Agri-urban conversions were classified by size to assess accuracy as detections become smaller. Table 7 shows that along with the number of detected conversions to increase, it can be implied that the total area of conversion also increases.  Figure 6. Cumulative Accuracies of different methods in detecting agri-urban conversions. Figure 6 illustrates a graphical representation of the cumulative accuracies of the different methods as the sizes of the regions-ofinterest (ROIs) decreased from greater than 5 hectares or 50,000 square meters down to a range from 500 square meters to 1,000 square meters. The four (4) multi-sensor methods: specifically, Sandri and Zuccolotto (2006), Liaw and Wiener (2015), Kursa and Rudnicki (2010) and the integration of methods, all began from an accuracy of unity and seemed to decrease linearly until ROIs of size 500 to 1,000 square meters where their accuracies lay within the range of 75.97 % to 79.23%. Comparing these methods, Liaw and Wiener's method (2015) and Sandri and Zuccolotto's method (2006) had the highest and lowest final cumulative accuracy, respectively. The use of an exclusively optical dataset remained at the bottom compared to all the methods used in the study; nevertheless, its use was still deemed acceptable for detecting conversions larger than 50,000 square meters.

Accuracy Assessment of Detecting Agri-Urban Conversions
The overall accuracies of the multi-sensor methods remained close to one another, and the generated LULC Maps and Conversion maps appeared similar. The range of their neighbouring accuracies signified that using any of these methods can be used interchangeably without sacrificing too much on its accuracy.
One thing that distinguished the multi-sensor methods from one another is the number of variables used in forming the model. Sandri and Zuccolotto's method (2006) used twelve (12) variables while Liaw and Wiener's method (2015) utilized eight (8) variables. These two (2) methods proved to be the most efficient as fewer number of variables translate to a lower time and computational complexity. Kursa and Rudnicki's method (2010) proved to be exhausting as it uses all twenty-six (26) variables for it to produce relatively the same results. Finally, the integration of the mentioned three (3) methods was the most burdensome as it required performing the first three methods before taking the mode of the generated LULC maps. Nevertheless, it remained superior in detecting land conversions from agricultural to built-up land cover up in areas 5,000 to 10,000 square meters.

Conclusion
The research was able to develop a multi-sensor remote sensing approach using Landsat 8 and Sentinel-1 imageries to generate LULC maps of the province of Cavite, Philippines from 2016 to 2019 with overall accuracies at 96%. By employing GIS techniques, LULC change maps were then produced allowing the assessment of the spatio-temporal dynamics of agri-urban land conversion through the dynamic detection and quantification of conversion of agricultural lands to built-up. Using the produced LULC change maps, it was also found out that while singlesensor approach, specifically using optical satellite imagery, can also produce similar image classification accuracies to that of the multi-sensor approach, the developed multi-sensor approach proved to be superior when producing the LULC change maps having the capability to detect agri-urban conversions across ranges of areas.

Recommendations
As the study is limited to land cover change for agricultural to built-up only, it is recommended to extend the analysis of land cover change to other land cover classes as well. The methodology can also be further improved by developing a component that would allow identification of cultivated and uncultivated agricultural lands. Time-series analysis can also be integrated which may be used for estimation of crop yield. By doing so, the potential of the study could be maximized especially in generating comprehensive spatio-temporal analysis of agricultural lands which could be used by the government, most especially those involved in the agricultural sector in addressing the problem of food security and self-sufficiency in the country.