ESTIMATION OF MANGROVE FOREST ABOVEGROUND BIOMASS USING MULTISPECTRAL BANDS, VEGETATION INDICES AND BIOPHYSICAL VARIABLES DERIVED FROM OPTICAL SATELLITE IMAGERIES: RAPIDEYE, PLANETSCOPE AND SENTINEL-2

Aboveground biomass (AGB) estimation is essential in determining the environmental and economic values of mangrove forests. Biomass prediction models can be developed through integration of remote sensing, field data and statistical models. This study aims to assess and compare the biomass prediction potential of multispectral bands, vegetation indices and biophysical variables that can be derived from three optical satellite systems: Sentinel-2 with 10 m, 20 m and 60 m resolution, RapidEye with 5 m resolution and PlanetScope with 3 m ground resolution. Field data for biomass were collected from a Rhizophoraceae-dominated mangrove forest in Masinloc, Zambales, Philippines, where 30 test plots (1.2 ha) and 5 validation plots (0.2 ha) were established. Prior to the generation of indices, images from the three satellite systems were pre-processed using atmospheric correction tools in SNAP (Sentinel-2), ENVI (RapidEye) and Python (PlanetScope). The major predictor bands tested are Blue, Green, Red and NIR, which are present in the three systems, and the Red-edge band from Sentinel-2 and RapidEye. The tested vegetation index predictors are the Normalized Difference Vegetation Index (NDVI), Soil-adjusted Vegetation Index (SAVI), Green-NDVI (GNDVI), Simple Ratio (SR) and Red-edge Simple Ratio (SRre). The study generated prediction models through conventional linear regression and multivariate regression. Higher coefficient of determination (r²) values were obtained using multispectral band predictors for Sentinel-2 (r² = 0.89) and PlanetScope (r² = 0.80), and vegetation indices for RapidEye (r² = 0.92). Multivariate Adaptive Regression Splines (MARS) models performed better than the linear regression models, with r² ranging from 0.62 to 0.92. Based on the r² and root-mean-square error (RMSE) values, the best biomass prediction model per satellite was chosen and maps were generated. The accuracy of the predicted biomass maps was high for both Sentinel-2 (r² = 0.92) and RapidEye data (r² = 0.91).


INTRODUCTION
Mangroves have a wide range of economic, social and environmental benefits, often referred to as ecosystem services. Like other vegetated coastal ecosystems, mangroves are important blue carbon sinks with a storage capacity between 990 and 1074 t C ha⁻¹ (Donato et al., 2011). In the tropics, mangroves are among the most carbon-rich forests, with an average storage of 1023 t C ha⁻¹ (Laffoley & Grimsditch, 2009). The greatest carbon pool in a tree is the aboveground biomass, which refers to the living biomass above the soil including the stems, bark, branches, foliage and seeds. It is usually measured for carbon flux monitoring (Vashum & Jayakumar, 2012), carbon stock quantification (Kumar and Mutanga, 2017) and for developing carbon policies and forest management protocols.
The traditional approach to field biomass estimation of mangroves is limited by the spatial constraints of data collection and the inaccessibility of mangrove stands. A common non-destructive approach is the use of allometric equations derived from parameters such as diameter at breast height (DBH). Remote sensing serves as a non-destructive alternative for a more robust, continuous and spatially explicit biomass assessment (Herold and Johns, 2007). The availability of different remote sensing systems has led to increased capability for biomass estimation. Optical remote sensing systems offer global coverage, which is often cost effective. At regional scale, aboveground biomass estimation is usually carried out using optical platforms such as Landsat (Shao & Zhang, 2016; Gleason & Im, 2011), IKONOS and MODIS (Yin et al., 2015). With newer moderate-resolution satellite systems, plot-level biomass estimates can also be achieved through improved imaging sensors with shorter revisit times. Among these new platforms are RapidEye (2008), Sentinel-2 (2015, 2017) and PlanetScope (2014). Sentinel-2 is a land-monitoring constellation of two identical satellites with novel spectral capabilities, a swath width of 290 km and a frequent revisit time of 5 days. The optical payload it carries has visible, near-infrared and infrared sensors, which provide a total of 13 spectral bands with 10 m, 20 m and 60 m ground spatial resolution (ESA). Compared to Sentinel-2, RapidEye has a higher resampled spatial resolution of 5 m with a revisit time of just one day. It is known as the first commercial satellite with a red-edge band in addition to the blue, green, red and NIR bands. Prediction models using RapidEye bands were found to explain biomass variation better than Landsat (Ramoelo and Cho, 2014). PlanetScope has the fewest bands (blue, green, red and NIR) but the highest spatial resolution of 3 m. Fewer studies on biomass estimation have been conducted using PlanetScope data than with the other satellite imageries. No previous studies have compared the performance of these three satellite data using prediction models developed from the same field data, with focus on the common bands, indices and biophysical factors that can be derived from these systems.
This study aimed to evaluate the biomass prediction efficiency of multispectral bands, vegetation indices and biophysical variables derived from RapidEye, PlanetScope and Sentinel-2. Specifically, different biomass prediction models using linear regression and non-linear multivariate regression algorithms were developed in this study. Furthermore, the accuracy of each prediction model as well as the accuracy of the predicted aboveground biomass maps were assessed using field validation plots.

Study Site
The test area is a mangrove plantation located in the village of Baloganon, Masinloc in the province of Zambales (Figure 1). The site is dominated by Rhizophora species such as R. mucronata and R. apiculata. Diameter at breast height (DBH), tree heights and other field data needed to compute the aboveground biomass were collected in November 2015. A total of 1.2 ha consisting of thirty 20 m x 20 m plots was selected as the training data, while another 0.2 ha was set aside as map validation data.

Satellite Data Collection and Pre-processing
The available Sentinel-2, PlanetScope and RapidEye data acquired closest to the field data collection were selected (see Table 1). The Sentinel-2 MultiSpectral Instrument (MSI) Level-1C image covering Baloganon, Masinloc was downloaded from the Sentinel Scientific Data Hub (ESA). The product is already orthorectified, georeferenced and radiometrically calibrated into top-of-atmosphere (ToA) reflectance data. Atmospheric correction was carried out using the Sen2Cor standalone tool, but the image can alternatively be processed in the S2A Toolbox of the Sentinel Application Platform (SNAP). This processor uses image-based retrievals with look-up tables (LUTs) pre-calculated from the libRadtran model to minimize or remove atmospheric effects from Level-1C images (Main-Knorn et al., 2015). All Level-2A bands were stacked and resampled to 10 m pixel size using the SNAP (ver. 5.0) geometric operation tool.
Table 1. Product levels and satellite acquisition dates
The downloaded RapidEye Level 3A orthoproduct has undergone radiometric, sensor and geometric correction using Ground Control Points (GCPs) and fine Digital Elevation Models (DEMs). The image was then atmospherically corrected using Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) in ENVI 5.0. Image center, illumination azimuth angle, spacecraft view angle and other correction parameters are incorporated in the RapidEye image. The PlanetScope image was downloaded as an Ortho Scene product, which is an orthorectified, scaled top-of-atmosphere radiance image product (Level 3B) delivered as an analytic 4-band product (Planet Team, 2017). Conversion to a ToA reflectance image was made using a Planet Labs Python guide (www.planet.com/docs/guides/quickstartndvi).
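The radiance-to-reflectance step for PlanetScope can be sketched as follows. This is a minimal illustration that assumes, as in the Planet guide cited above, that each band is multiplied by a per-band reflectance coefficient read from the scene metadata. The function name, the pixel values and the coefficients below are hypothetical and are not taken from the image used in this study.

```python
def radiance_to_reflectance(band_values, coefficient):
    """Scale the scaled-radiance values of one band to ToA reflectance."""
    return [value * coefficient for value in band_values]


# Hypothetical scaled-radiance values for two pixels per band.
scene = {
    "blue": [5000, 5200],
    "green": [4800, 5100],
    "red": [3000, 3300],
    "nir": [9000, 9500],
}

# Hypothetical per-band coefficients; real ones come from the scene metadata.
coefficients = {"blue": 2.0e-5, "green": 2.1e-5, "red": 2.3e-5, "nir": 3.4e-5}

reflectance = {
    name: radiance_to_reflectance(values, coefficients[name])
    for name, values in scene.items()
}
print(reflectance["nir"])
```

The resulting reflectance values are unitless fractions, which is the form expected by the band ratios and indices used later.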

Multispectral Bands
There are four multispectral bands (Blue, Green, Red and NIR) common to Sentinel-2, RapidEye and PlanetScope, and five bands (with the addition of Red-edge) common to Sentinel-2 and RapidEye. The Red-edge 1 band of Sentinel-2 (705 nm central wavelength) was chosen as it is the closest to the wavelength range of RapidEye's Red-edge band (690-730 nm). Additional Sentinel-2 bands, including Red-edge 1-3 and SWIR 1-2, were tested as a separate group of predictor inputs.

Vegetation Indices
The indices selected as biomass model inputs are the Normalized Difference Vegetation Index (NDVI), Soil-adjusted Vegetation Index (SAVI), Green-NDVI (GNDVI), Simple Ratio (SR) and Red-edge Simple Ratio (SRre). The first four indices were generated from all three satellite data, while SRre can only be generated from Sentinel-2 and RapidEye since it requires a Red-edge band. These indices are combinations of visible, red-edge and NIR bands. The formula for each index is shown in Table 2.
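To show how the five indices relate to the band reflectances, the following minimal Python sketch computes them for a single pixel. It assumes the standard published definition of each index. The soil adjustment factor L = 0.5 and the reflectance values are illustrative assumptions, not values reported in this study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted Vegetation Index; soil_factor is L (0.5 is an assumed default)."""
    return (nir - red) / (nir + red + soil_factor) * (1 + soil_factor)


def gndvi(nir, green):
    """Green NDVI: the Green band replaces Red in the NDVI formula."""
    return (nir - green) / (nir + green)


def simple_ratio(nir, red):
    """Simple Ratio (SR)."""
    return nir / red


def red_edge_simple_ratio(nir, red_edge):
    """Red-edge Simple Ratio (SRre); needs a Red-edge band."""
    return nir / red_edge


# Hypothetical surface reflectance of one vegetated pixel.
pixel = {"green": 0.08, "red": 0.05, "red_edge": 0.20, "nir": 0.40}

indices = {
    "NDVI": ndvi(pixel["nir"], pixel["red"]),
    "SAVI": savi(pixel["nir"], pixel["red"]),
    "GNDVI": gndvi(pixel["nir"], pixel["green"]),
    "SR": simple_ratio(pixel["nir"], pixel["red"]),
    "SRre": red_edge_simple_ratio(pixel["nir"], pixel["red_edge"]),
}
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

NDVI, SAVI and GNDVI are bounded ratios, whereas SR and SRre are unbounded above, which is one reason they can behave differently over dense canopies.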

Table 2. Vegetation indices and formulas
Vegetation index                                 Formula
Normalized Difference Vegetation Index (NDVI)    (NIR - Red) / (NIR + Red)
Soil-adjusted Vegetation Index (SAVI)            [(NIR - Red) / (NIR + Red + L)] x (1 + L)
Green-NDVI (GNDVI)                               (NIR - Green) / (NIR + Green)
Simple Ratio (SR)                                NIR / Red
Red-edge Simple Ratio (SRre)                     NIR / Red-edge

Biophysical Variables
Biophysical variables describe the spatial distribution of vegetation state and dynamics and are thus useful for biomass estimation (Widlowski et al., 2004). Leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR) and fraction of vegetation cover (FVC) are the main variables computed by the SNAP toolbox using tested, generic algorithms based on specific radiative transfer models. The generation of the variables was composed of four main steps: (1) normalization of the inputs, (2) implementation of the artificial neural network (ANN) algorithm, (3) denormalization of the output and (4) generation of quality indicators (Weiss and Baret, 2016).
RapidEye and PlanetScope have no biophysical variables included in their products. To facilitate comparison of these variables among the three satellite data, products such as LAI, FAPAR and Ca were derived for RapidEye and PlanetScope using available equations. LAI was obtained by first deriving FVC from NDVI using Equation 1 (Zeng et al., 2000), as implemented by previous studies (Ali, 2015; Zeng et al., 2003). The NDVIs (bare soil) and NDVIv (dense vegetation) values were selected through histogram evaluation, so that FVC is scaled between the lowest NDVI (NDVIs) and the highest NDVI (NDVIv).

$$FVC_{NDVI} = \frac{NDVI - NDVI_s}{NDVI_v - NDVI_s} \qquad (1)$$

where $FVC_{NDVI}$ = fractional vegetation cover (NDVI-derived), $NDVI_s$ = NDVI value for bare soil and $NDVI_v$ = NDVI value for dense vegetation.

The output FVC derived from NDVI was then used for computing the LAI using a logarithmic equation (Norman et al., 1995) tested on large-scale field experiments:

$$LAI_{NDVI} = -\frac{\ln(1 - FVC_{NDVI})}{k(\theta)} \qquad (2)$$

where $LAI_{NDVI}$ = NDVI-derived leaf area index, $FVC_{NDVI}$ = fractional vegetation cover and $k(\theta)$ = light extinction coefficient for a given solar zenith angle.

The light extinction coefficient k(θ) is a measure of the attenuation of radiation in the canopy, determined by the angle and spatial arrangement of the leaves. It can range between 0.4 and 0.65 in a variety of mangrove canopies. The average value of this range, approximately 0.5, was used in this paper as suggested by Clough (1997) and Perera et al. (2013).
Alternative data for comparison with Sentinel-2's Chlorophyll-a (Ca) were obtained by using the Green Chlorophyll Index model (CIGreen) developed by Gitelson (2003):

$$CI_{Green} = \frac{NIR}{Green} - 1 \qquad (3)$$

Equations 1-3 were also applied to the Sentinel-2 bands to compare the correlation between the modeled and the SNAP-generated biophysical products. These were labelled Sentinel-2s for the SNAP-generated variables and Sentinel-2m for the modeled variables.
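The chain from NDVI to FVC to LAI, together with the Green Chlorophyll Index, can be sketched in a few lines of Python. This is an illustrative sketch only: the NDVI end-members and reflectance values are hypothetical, and the clamping of FVC to the range 0 to 0.99 is an assumption added here to keep the logarithm defined, not a step described in the text.

```python
import math


def fvc_from_ndvi(ndvi, ndvi_soil, ndvi_veg):
    """Equation 1: scale NDVI between the bare-soil and dense-vegetation values."""
    fvc = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return min(max(fvc, 0.0), 1.0)  # assumed clamp to the physical range


def lai_from_fvc(fvc, extinction=0.5, max_fvc=0.99):
    """Equation 2: LAI = -ln(1 - FVC) / k(theta), with k(theta) = 0.5 by default."""
    fvc = min(fvc, max_fvc)  # assumed cap so that ln(1 - FVC) stays finite
    return -math.log(1.0 - fvc) / extinction


def ci_green(nir, green):
    """Equation 3: Green Chlorophyll Index."""
    return nir / green - 1.0


# Hypothetical pixel and histogram-derived NDVI end-members.
fvc = fvc_from_ndvi(ndvi=0.6, ndvi_soil=0.2, ndvi_veg=0.8)
lai = lai_from_fvc(fvc)
chl_index = ci_green(nir=0.40, green=0.08)
print(f"FVC = {fvc:.3f}, LAI = {lai:.3f}, CIGreen = {chl_index:.3f}")
```

Because LAI grows without bound as FVC approaches 1, pixels near the dense-vegetation end-member are very sensitive to the choice of NDVIv.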

Model Development, Regression and Analysis
Mean values of the bands, vegetation indices and biophysical variables were obtained through zonal statistics extracted using thirty 20 m x 20 m polygons. The first analysis was done using linear regression models between the measured AGB and the biomass predictors. The coefficients of determination (r²) and RMSEs were recorded and compared between input groups and among the satellite data.
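The linear regression step and its two summary statistics can be illustrated with a short Python sketch using only the standard library. It assumes the plot means have already been extracted by zonal statistics. The Simple Ratio and AGB values below are hypothetical and are not the study data.

```python
import math
from statistics import mean


def fit_linear(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_mean, y_mean = mean(x), mean(y)
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def r2_and_rmse(observed, predicted):
    """Coefficient of determination and root-mean-square error."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(observed))


# Hypothetical plot-mean Simple Ratio values and field AGB (Mg ha^-1).
plot_sr = [4.1, 5.0, 5.8, 6.5, 7.2, 8.0]
plot_agb = [35.0, 48.0, 55.0, 70.0, 74.0, 90.0]

slope, intercept = fit_linear(plot_sr, plot_agb)
predicted = [intercept + slope * sr for sr in plot_sr]
r2, rmse = r2_and_rmse(plot_agb, predicted)
print(f"AGB = {intercept:.2f} + {slope:.2f} * SR (r2 = {r2:.2f}, RMSE = {rmse:.2f})")
```

The same two functions can be reused for each predictor in turn, which mirrors the comparison of input groups described above.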
The second analysis was carried out using Multivariate Adaptive Regression Splines (MARS). MARS is a regression and data mining technique developed by Friedman in 1991. This method uses basis functions in modeling the predictor and response variables. The generated basis functions are then used as the new set of predictor variables to generate the final model. The initial step of MARS is a forward algorithm, which selects all possible basis functions and their corresponding knots. Then, a backward algorithm discards the basis functions that do not contribute significantly to the accuracy of the fit (Friedman, 1991). The final MARS model consists of a collection of basis functions including nonlinear and interaction relationships among the predictor variables (Bilgili et al., 2010). Friedman (1991) suggested a maximum of 15 basis functions. In this study, a maximum of 10 basis functions was set to avoid overcomplexity of the models. Standard MARS parameter values were used for the three satellite data. Other MARS parameters are the minimum observation between knots (1), maximum interaction (10), ridge value (-7), degree of freedom for knot optimization (1) and speed factor (1). This was implemented in Salford Predictive Modeler v.8 (Salford Systems, San Diego, California, USA). The r² and RMSE per test input group were obtained.
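The idea of a knot search can be illustrated with a deliberately simplified Python sketch. It fits a single hinge function on one predictor by trying every observed value as a knot and keeping the candidate with the smallest sum of squared errors. This is not the Salford Predictive Modeler implementation and it omits the backward pruning step, interactions and the other parameters listed above. All names and data are hypothetical.

```python
def hinge(x, knot, direction=1):
    """max(0, x - knot) for direction = +1, max(0, knot - x) for direction = -1."""
    return max(0.0, direction * (x - knot))


def fit_line(u, y):
    """Least squares fit of y = intercept + slope * u; also returns the SSE."""
    n = len(u)
    u_mean = sum(u) / n
    y_mean = sum(y) / n
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    sxy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    slope = sxy / sxx
    intercept = y_mean - slope * u_mean
    sse = sum((yi - (intercept + slope * ui)) ** 2 for ui, yi in zip(u, y))
    return intercept, slope, sse


def best_single_hinge(x, y):
    """Try each observed value as a knot, in both directions, and keep the best."""
    best = None
    for knot in sorted(set(x)):
        for direction in (1, -1):
            u = [hinge(xi, knot, direction) for xi in x]
            if max(u) == min(u):
                continue  # constant basis function, nothing to fit
            intercept, slope, sse = fit_line(u, y)
            if best is None or sse < best["sse"]:
                best = {"knot": knot, "direction": direction,
                        "intercept": intercept, "slope": slope, "sse": sse}
    return best


# Hypothetical data: flat response up to x = 2, then rising by 2 per unit.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 1.0, 1.0, 3.0, 5.0, 7.0]
model = best_single_hinge(x, y)
print(model)
```

A full MARS forward pass repeats this kind of search over all predictors and adds basis functions in pairs, and the backward pass then removes those that contribute least to the fit.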

Aboveground Biomass Map and Accuracy Assessment
The best linear or multivariate biomass prediction model per satellite data was chosen based on r² and RMSE values. The basis functions per model were converted to individual bands in ENVI 5.0 by applying the mathematical equation to the important variables per model. The stacked basis function bands were used for the final equation. Ratios and multipliers were used to convert the output biomass product (Mg/plot) to biomass per hectare (Mg ha⁻¹) for each pixel of the satellite data. These data were used in creating the aboveground biomass maps for Baloganon, Masinloc. The accuracy was obtained using five 20 m x 20 m validation plots located outside the training plots used for model development. The correlation between the measured AGB from the validation plots and the predicted AGB generated using the RapidEye, PlanetScope and Sentinel-2 models was examined.
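The mapping step can be sketched in Python as follows. The sketch evaluates a sum of hinge basis functions for one pixel and rescales the per-plot prediction to a per-hectare value, assuming that the conversion is a simple area ratio for a 20 m x 20 m (400 m²) plot, which gives a multiplier of 25. The model coefficients, knots and pixel values are hypothetical and are not those of Table 6.

```python
PLOT_AREA_M2 = 20 * 20      # one 20 m x 20 m field plot
M2_PER_HECTARE = 10_000


def per_plot_to_per_hectare(agb_mg_per_plot, plot_area_m2=PLOT_AREA_M2):
    """Rescale biomass from Mg per plot to Mg per hectare by the area ratio."""
    return agb_mg_per_plot * M2_PER_HECTARE / plot_area_m2


def predict_agb_per_plot(predictors, intercept, terms):
    """Evaluate intercept + sum(coef * hinge) for one pixel.

    Each term is (coefficient, predictor name, knot, direction), where
    direction +1 means max(0, x - knot) and -1 means max(0, knot - x).
    """
    total = intercept
    for coefficient, name, knot, direction in terms:
        total += coefficient * max(0.0, direction * (predictors[name] - knot))
    return total


# Hypothetical two-term model and one pixel of index values.
example_terms = [(0.5, "SR", 4.0, 1), (-0.2, "NDVI", 0.7, -1)]
pixel = {"SR": 8.0, "NDVI": 0.6}

agb_plot = predict_agb_per_plot(pixel, intercept=1.0, terms=example_terms)
agb_hectare = per_plot_to_per_hectare(agb_plot)
print(f"{agb_plot:.2f} Mg/plot -> {agb_hectare:.1f} Mg/ha")
```

Applying the same function to every pixel of the stacked predictor bands yields the biomass raster from which the maps are drawn.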

RESULTS AND DISCUSSION
The study compared the correlation between field-measured AGB and predicted AGB for three satellite data using two statistical regression methods: linear regression and the MARS algorithm.

Linear Regression of Aboveground Biomass
Six input groups were tested in the regression analysis: PlanetScope multispectral bands, PlanetScope-derived indices, RapidEye bands, RapidEye-derived indices, Sentinel-2-derived indices and Sentinel-2 bands. The best linear regression models for each satellite data are the PlanetScope SR-based model (r² = 0.56), the RapidEye NIR-based model (r² = 0.71) and the Sentinel-2 SR-based model (Table 3, Figure 2). The highest r² and lowest RMSE (9.75 Mg ha⁻¹) among all data inputs were obtained with the NIR band of RapidEye. SR was seen to be an efficient biomass predictor index, providing the highest r² for both Sentinel-2 and PlanetScope data.
Among the multispectral band inputs, the highest coefficient of determination was obtained with RapidEye (r² = 0.71), while Sentinel-2 and PlanetScope have equal values (r² = 0.49). The NIR band is the most effective predictor band for both RapidEye and PlanetScope data. The SWIR 1, SWIR 2 and Red-edge bands of Sentinel-2 gave the lowest coefficients of determination (r² = 0.03, 0.003 and 0.013). Similar results were reported by Castillo et al. (2017), where negative correlations (r) with biomass were observed. Higher r² values were obtained with the Blue, Red and NIR bands. The Green band gave the lowest coefficient of determination for PlanetScope and RapidEye, with r² = 0.20 and 0.06, respectively.
The efficiencies of all satellite data as biomass predictors are relatively higher with the use of vegetation indices, with mean r² increases of 0.2, 0.14 and 0.19 for PlanetScope, RapidEye and Sentinel-2, respectively. This is driven by the potential of the vegetation indices to highlight plant intrinsic properties that are well related to biomass accumulation, such as leaf greenness and vigor. Each index has its specific expression, which can represent green vegetation properties better than individual bands.
With MARS, the best predictive models were obtained with the multispectral bands of RapidEye, with the highest r² value of 0.89 and the smallest RMSE of 4.96 Mg ha⁻¹ (Table 4). The three important variables of RapidEye are the NIR (100%), Blue (42%) and Red (38%) bands. This conforms to the results of Huang et al. (2017), who reported that NIR was the most important RapidEye band for biomass estimation.
The best model for Sentinel-2 was generated with the vegetation indices SR, SRre and NDVI (r² = 0.89; RMSE = 5.69 Mg ha⁻¹) (Figure 3, top). Simple Ratio has 100% variable importance in the final model. Also known as the Ratio Vegetation Index, SR does not have a normal distribution, unlike indices such as NDVI. High efficiency of SR as a biomass predictor is commonly observed in areas with closed and dense vegetation cover, including biomass estimates in tropical forests (Clerici et al., 2016). SR is also an important variable in the RapidEye index-based model and in the two best linear regression models. Meanwhile, NDVI is known to perform best in the estimation of leaf biomass (Kross et al., 2015) and is usually a successful biomass predictor for a wide range of satellite data. The vegetation indices of PlanetScope generated the best model for this satellite system (r² = 0.80; RMSE = 7.68 Mg ha⁻¹), although the coefficient of determination is lower than that of Sentinel-2 and RapidEye. Among the important variables, GNDVI had 100% importance in the model, followed by NDVI (89%) and SAVI (56%). GNDVI was also significant for the RapidEye model, while NDVI was consistently an important index for the three satellite data.

Regression of Biophysical Variables
Linear regression of LAI, FVC and CIGreen for the three satellite systems resulted in low r² values of LAI and FVC with measured AGB (LAI r² = 0.01 to 0.44; FVC r² = 0.01 to 0.45), while high r² values were obtained between AGB and CIGreen (r² = 0.42 to 0.69).
The weak r² values between AGB and LAI and between AGB and FVC were considered to be affected by the presence of undergrowth vegetation and layering in the study site. The same explanation was given by Russel and Tompkins (2005) and Heiskanen (2006) for their test areas. It is important to note that among the modeled LAI and FVC data, PlanetScope produced higher r² than RapidEye, while RapidEye produced higher r² than Sentinel-2m. This observation suggests that canopy and understory mixing was partly reduced by using higher resolution data. Unlike these two parameters, Chlorophyll-a values derived for each pixel may not have been significantly affected by undergrowth vegetation; thus, fewer errors were introduced. The highest coefficient of determination among the CIGreen-based models was generated with RapidEye data (r² = 0.69). Sentinel-2m models performed better than Sentinel-2s models for both LAI and FVC inputs. Between the modeled and the SNAP-generated biophysical variables, only the CIGreen values have high correlation (r = 0.81, p < 0.001).
Table 5. Important variables and correlation of measured aboveground biomass and biophysical variables using MARS
MARS regression of AGB and the biophysical variables resulted in higher r² values (Table 5) than the linear regression, except for Sentinel-2m, where the correlation between AGB and CIGreen is lower. CIGreen was the sole important variable for Sentinel-2 and RapidEye, while all three variables were used for PlanetScope.

Basis Functions and Final Models
The best biomass prediction models for each satellite data were chosen based on the highest r² and lowest RMSE value. These models are the Sentinel-2 index-based model, the PlanetScope index-based model and the RapidEye multispectral bands-based model (Figure 3). The basis functions and final equations were generated by MARS. Table 6 shows the basis functions and final models for predicting the biomass using the observed AGB data.
Basis functions (BFs) represent each distinct interval of the predictors and take the form below, where x is a predictor variable and k is a knot:

$$BF_n = \max(0, x - k) \quad \text{or} \quad BF_n = \max(0, k - x)$$

... models depend more on the range of data values and their degree of autocorrelation than on the effect of spectral pre-treatment on the band reflectance data. The advantages of MARS compared to other machine learning algorithms include predictive accuracy, computational speed and simplicity of interpretation (Mina and Barrios, 2010). Compared to linear regression, MARS can transform variables and identify higher order interactions between variables. MARS also performed better than multiple linear regression when applied to LiDAR-based biomass estimation (Laurin et al., 2016).
Table 6. Basis functions and final model generated for each satellite data to predict aboveground biomass of mangroves

Aboveground Biomass Maps
The basis functions of the best model per satellite data were applied to either bands or vegetation indices. The maps were generated using the native resolution of the input data (Figure 4) after converting the aboveground biomass per plot to AGB per hectare for each generated pixel. The ideal output of the pixel-based maps is that the total biomass per hectare of four Sentinel-2 pixels equals the total biomass of 16 RapidEye pixels and 44.44 PlanetScope pixels, each set covering the same area of 20 m x 20 m. Aboveground biomass totals of 690 Mg ha⁻¹, 613 Mg ha⁻¹ and 793 Mg ha⁻¹ were recorded for Sentinel-2, RapidEye and PlanetScope data, respectively.
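The pixel counts quoted above follow directly from the plot and pixel dimensions, as the short Python check below shows. It assumes square pixels of 10 m, 5 m and 3 m for the resampled Sentinel-2, RapidEye and PlanetScope data. The function name is hypothetical.

```python
def pixels_per_plot(pixel_size_m, plot_side_m=20.0):
    """Number of square pixels needed to cover a square plot of the given side."""
    return (plot_side_m / pixel_size_m) ** 2


pixel_sizes_m = {"Sentinel-2": 10.0, "RapidEye": 5.0, "PlanetScope": 3.0}
counts = {name: pixels_per_plot(size) for name, size in pixel_sizes_m.items()}

for name, count in counts.items():
    print(f"{name}: {count:.2f} pixels per 20 m x 20 m plot")
```

The non-integer PlanetScope count reflects that 3 m pixels do not tile a 20 m plot exactly, so plot-level totals for that sensor necessarily include partial pixels.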

Map Accuracy Assessment
The accuracy of the maps generated by the MARS algorithm was assessed using five validation plots with an area of 20 m x 20 m each. These plots were not included in the training data, so that the predictive mapping efficiency of the generated models could be tested on a response variable outside the training sample. The accuracy of predicted biomass maps cannot be evaluated by inter-comparison of maps; thus, validation data are needed. The RMSEs and coefficients of determination (r²) between predicted values and field measurements were recorded (Table 7).

CONCLUSION
Reliable biomass estimates are essential for obtaining the net primary production in ecological studies. Forest aboveground biomass is one of the baseline data needed for carbon stock assessment and climate change studies.
The relationships between AGB and the set of predictor variables were established. This study has demonstrated the efficiency of the multispectral bands, vegetation indices and biophysical variables derived from three novel optical satellite imageries: Sentinel-2, PlanetScope and RapidEye. Multispectral bands are the preferred input predictors for RapidEye, while derived vegetation indices are recommended when Sentinel-2 and PlanetScope are employed. Simple Ratio consistently provided high r² for RapidEye and Sentinel-2 through both linear and multivariate regression. The NIR band is the most effective predictor band for RapidEye and PlanetScope. For Sentinel-2, the important bands are Blue and Red-edge 1. Weak linear correlations were observed between AGB and the other Red-edge bands and the two SWIR bands. However, the addition of these bands (Sentinel-2 Set-B) increased the coefficients from 0.62 to 0.84 in the case of MARS regression. The biophysical variables generated from Sentinel-2, PlanetScope and RapidEye gave low coefficient of determination values, except for the Green Chlorophyll Index (CIGreen).
The multivariate non-parametric MARS is a robust regression method that can be used in remote sensing analysis. It is efficient in determining the relevant variables, with good predictive accuracy, computational speed and flexibility in the choice of parameter values such as the penalty parameter and the degree of freedom for knot optimization, among others. In this study, MARS models performed better than the linear regression models. As MARS was reported to be sensitive to data size and outliers, we recommend that further studies introduce more test and validation datasets to the algorithm. This paper is one of the few studies on mangrove biomass prediction conducted using PlanetScope data. Improvement of the results obtained with the PlanetScope-based models is also recommended, for example by assessing other vegetation indices and band ratios.
Overall, the study recommends both Sentinel-2 and RapidEye for mangrove biomass prediction due to their consistently high coefficient of determination and low RMSE values based on test and validation data. Through the methods employed in this study, plot-level and pixel-based aboveground biomass estimates can be generated, which can aid in mangrove management and conservation.

Table 7. Results of accuracy assessment through regression of the predicted AGB raster data and the field validation dataset