ESTIMATION OF LAND SURFACE ALBEDO FROM GCOM-C/SGLI SURFACE REFLECTANCE

This paper examines algorithms for estimating terrestrial albedo from the products of the Global Change Observation Mission – Climate (GCOM-C) / Second-generation Global Imager (SGLI), which was launched in December 2017 by the Japan Aerospace Exploration Agency. We selected two algorithms: one based on a bidirectional reflectance distribution function (BRDF) model and one based on multi-regression models. The former determines kernel-driven BRDF model parameters from multiple sets of reflectance and estimates the land surface albedo from those parameters. The latter estimates the land surface albedo from a single set of reflectance with multi-regression models. The multi-regression models are derived for an arbitrary geometry from datasets of simulated albedo and multi-angular reflectance. In experiments using in situ multi-temporal data for barren land, deciduous broadleaf forests, and paddy fields, the albedos estimated by the BRDF-based and multi-regression-based algorithms achieve reasonable root-mean-square errors. However, the latter algorithm requires information about the land cover of the pixel of interest, and the variance of its estimated albedo is sensitive to the observation geometry. We therefore conclude that the BRDF-based algorithm is more robust and can be applied to SGLI operational albedo products for various applications, including climate-change research.


INTRODUCTION
The Japan Aerospace Exploration Agency (JAXA) initiated the Global Change Observation Mission (GCOM) to collect global-scale observations for analyzing climate change and water circulation mechanisms. Under this project, GCOM-Climate (GCOM-C) was launched successfully in December 2017, and the Second-generation Global Imager (SGLI) onboard GCOM-C is expected to measure reflectance and radiation from visible to infrared wavelengths (GCOM-C, 2021a). The Moderate Resolution Imaging Spectroradiometer (MODIS) sensor was launched in December 1999 and is still in operation. SGLI and MODIS have equivalent spatial resolutions (i.e., 250 m and 1 km), but SGLI has more bands than MODIS, including three polarimetric bands for a better understanding of atmospheric properties.
SGLI is designed to provide operational products for land, atmosphere, ocean, and cryosphere. Terrestrial albedo is one of the most important physical parameters for understanding the global circulation of water and heat, and climate-change research requires long-term, global albedo datasets. Albedo is defined as the ratio of upwelling to downwelling irradiance, where irradiance is derived from the integral of radiance over a given hemisphere. Using the SGLI data, we aim to develop an algorithm for operational terrestrial albedo products. Therefore, in this paper we examine an algorithm for the stable estimation of daily SGLI-based albedo from surface reflectance.
The MODIS albedo products are generated by using the bidirectional reflectance distribution function (BRDF)-driven method (Strahler et al., 1999). Kernel-driven BRDF models are regarded as robust and semi-empirical and can be applied to any type of land cover (Lucht et al., 2000). The kernel is a function determined by the viewing and illumination geometries. BRDF-driven albedo estimation requires a certain number of observations. For example, MODIS operational BRDF/albedo products use at least seven good-quality reflectance data points obtained within 16 days. However, this requirement is challenging for tropical and subtropical climate regions, where optical images suffer from cloud contamination. Simulation analysis has shown that the more observations are contaminated by noise, the more unstable the estimated BRDF model parameters become (Susaki et al., 2004). Consequently, in such regions there is often little or no data available for MODIS BRDF/albedo products; however, the available products have an acceptable accuracy (Susaki et al., 2007). Therefore, to obtain stable albedo products, a technique is required for estimating albedo from fewer observations. Such an approach would increase the temporal resolution of albedo products.
Cui et al. (2009) presented a method that utilizes an empirical relation between bidirectional reflectance and albedo by using Polarization and Directionality of the Earth's Reflectances (POLDER) data. It provides a direct estimate of land-surface broadband albedo from a single bidirectional observation. Similar approaches using multi-regression models were adopted by Liang et al. (2013). Qu et al. (2014) also extended this method for estimating broadband albedo from MODIS data and multi-angular POLDER-3 data. They generated regression models with the MODIS reflectance as the dependent variable and the multiple channels of the POLDER-3 reflectance as the independent variables. They implemented landcover classification by using reflectance in the blue channel and the normalized difference vegetation index (NDVI), and pixels of interest were classified into three landcovers, i.e., vegetation, non-vegetation, and snow/ice. The multi-regression models for each landcover class estimate the broadband albedo from MODIS reflectance data observed in a single measurement. The algorithm estimates daily albedo with improved temporal resolution. However, that technique requires landcover classification for pixels of interest in order to select optimal multi-regression models, and such classification may hamper accurate albedo estimation. This is because (i) the classification may include errors and (ii) the landcover classes used for the multi-regression models are quite coarse (i.e., vegetation and non-vegetation), and thus the multi-regression models may not represent the landcover of interest. Compared to this multi-regression-based method, the aforementioned BRDF-based method has the advantage that no landcover information is required. Therefore, in this paper we examine two methods, one based on a BRDF model and one based on multi-regression models, for the stable estimation of SGLI-based albedo from surface reflectance.
Table 1 gives the specifications of the SGLI bands.
SGLI was designed to measure ocean color, land cover, vegetation, snow, ice, clouds, and aerosols (Shimoda, 2018). The visible and near-infrared non-polarization channels are observed using a push-broom scanner, while the shortwave and thermal infrared channels are measured using an optical mechanical scanner. Herein, "channels" and "bands" refer to the same concept.

SGLI Data
The publicly available BRDF data follow the land-cover types proposed by the International Geosphere-Biosphere Programme (IGBP). The original IGBP land-cover map contains 17 classes, and the monthly and yearly databases of the 16 classes other than water bodies are available. The monthly databases contain the best-quality BRDFs for each month independently, whereas the yearly databases contain the high-quality pixels from a full year with the aim of monitoring the annual cycle of surface reflectance and the directional signature. The central wavelengths of the publicly available channels are 490, 565, 670, 765, 865, and 1020 nm (Lacaze et al., 2009).

Field Data
We measured the broadband albedo and bidirectional reflectance factor (BRF) data at two sites in Japan, namely the Tottori Sand Dune, which is classified as barren land, and the Yamashiro Test Field (Ataka et al., 2014), which is classified as deciduous broadleaf forest. The site details are presented in Table 2.
We measured the BRF data at wavelengths of 300-2500 nm by using a portable spectroradiometer (Field Spec 3, Analytical Spectral Devices, USA) under clear-sky conditions (ASD, 2021). The measurements were conducted at a height of 1.5 m for barren land and of 28.5 m for the deciduous broadleaf forest. In the measurement of BRF data, we set the relative azimuth angles of the solar and viewing directions to 15°, 45°, 90°, 135°, and 180° to save on measurement time. We chose 15° instead of 0° to avoid contamination from the equipment shadow. The viewing angle was set to 0°, 15°, 30°, and 45°. We used a 5° field-of-view lens in the measurement. We also measured the broadband albedo for wavelengths of 285-3000 nm by using a four-component net radiometer (MR-60; EKO Instruments, Japan) (EKO, 2021). The instrument mounts two hemisphere pyranometers, one on the upper flat side and the other on the lower flat side.
Table 2. Names and geolocations of study areas.

METHODS
Flowcharts of the two methods chosen for estimating albedo from land surface reflectance are shown in Figure 1, namely, the BRDF-based method [Figure 1(a)] and the multi-regression-based method [Figure 1(b)]. Before explaining the two methods, we explain the primary modules used therein.

Kernels
A kernel used for estimating the BRDF and terrestrial albedo is a function, determined by the viewing and illumination geometries, that describes one component of the bidirectional reflectance (Pinty et al., 1991; Wanner and Strahler, 1995; Hao et al., 2020). In general, we consider two types of scattering observed from an object on the terrestrial surface, namely volumetric scattering and geometric scattering.
For the volumetric-scattering kernel, Ross (1981) developed a kernel for the directional reflectance above a horizontally homogeneous plant canopy. Roujean et al. (1992) derived the Ross-Thick kernel, which was designed for use in areas with large values of leaf area index and does not consider hotspot effects. The geometric-scattering kernel calculates the scattering from sunlit and shaded objects and backgrounds; for example, tree crowns are approximated as spheroids to calculate their surface scattering. The semi-empirical kernel-driven BRDF model expressed by Equation (1) has been used for the operational BRDF and albedo products of MODIS (Roujean et al., 1992; Strahler et al., 1999):
$$R(\theta_i, \theta_v, \phi) = f_{iso} + f_{vol} K_{vol}(\theta_i, \theta_v, \phi) + f_{geo} K_{geo}(\theta_i, \theta_v, \phi) \qquad (1)$$
Here, $R$ is the bidirectional reflectance; $\theta_i$, $\theta_v$, and $\phi$ are the solar zenith angle, viewing zenith angle, and relative azimuth angle, respectively; $K_{vol}$ and $K_{geo}$ are the volumetric- and geometric-scattering kernels; and $f_{iso}$, $f_{vol}$, and $f_{geo}$ are unknown coefficients. Users can select any combination of volumetric- and geometric-scattering kernels, the values of which are determined once the illumination and viewing geometries are given.
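As an illustration of how the kernels in Equation (1) depend only on geometry, the following minimal sketch evaluates the Ross-Thick kernel and a Li-Sparse kernel in Python. It assumes the reciprocal form of the Li-Sparse kernel with crown shape parameters h/b = 2 and b/r = 1, as commonly used for MODIS; the function names and these parameter choices are assumptions of this sketch, not details taken from the SGLI processing.

```python
import math

def ross_thick(sza, vza, raa):
    """Ross-Thick volumetric kernel; all angles in radians."""
    cos_xi = (math.cos(sza) * math.cos(vza)
              + math.sin(sza) * math.sin(vza) * math.cos(raa))
    cos_xi = max(-1.0, min(1.0, cos_xi))
    xi = math.acos(cos_xi)  # phase angle between sun and view directions
    return (((math.pi / 2 - xi) * cos_xi + math.sin(xi))
            / (math.cos(sza) + math.cos(vza)) - math.pi / 4)

def li_sparse_r(sza, vza, raa, h_b=2.0, b_r=1.0):
    """Reciprocal Li-Sparse geometric kernel (assumed variant and shape parameters)."""
    # Equivalent zenith angles for spheroidal crowns
    sza_p = math.atan(b_r * math.tan(sza))
    vza_p = math.atan(b_r * math.tan(vza))
    cos_xi = (math.cos(sza_p) * math.cos(vza_p)
              + math.sin(sza_p) * math.sin(vza_p) * math.cos(raa))
    sec_s, sec_v = 1.0 / math.cos(sza_p), 1.0 / math.cos(vza_p)
    tan_s, tan_v = math.tan(sza_p), math.tan(vza_p)
    d2 = max(tan_s * tan_s + tan_v * tan_v
             - 2.0 * tan_s * tan_v * math.cos(raa), 0.0)
    cos_t = h_b * math.sqrt(d2 + (tan_s * tan_v * math.sin(raa)) ** 2) / (sec_s + sec_v)
    cos_t = max(-1.0, min(1.0, cos_t))
    t = math.acos(cos_t)
    # Overlap between the sunlit and viewed crown shadows
    overlap = (t - math.sin(t) * cos_t) * (sec_s + sec_v) / math.pi
    return overlap - sec_s - sec_v + 0.5 * (1.0 + cos_xi) * sec_s * sec_v

def kernel_brdf(f_iso, f_vol, f_geo, sza, vza, raa):
    """Reflectance predicted by the kernel-driven model of Equation (1)."""
    return (f_iso + f_vol * ross_thick(sza, vza, raa)
            + f_geo * li_sparse_r(sza, vza, raa))
```

With this convention a relative azimuth of 0 corresponds to the backscattering (hot-spot) side, and both kernels vanish when the sun and sensor are at nadir, so the isotropic coefficient equals the nadir reflectance under overhead illumination.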
The three unknown coefficients in Equation (1) are determined by minimizing the squared error between the observed and modeled reflectances. In operational application, MODIS BRDF products are produced as follows. If at least seven cloud-free observations of the surface are available during a 16-day period, then a full model inversion is attempted. First, the available data are evaluated to discard any outliers, and additional checks are performed to ensure positive kernel weights. If the data pass these evaluations, then a full inversion, or a normal inversion, is performed to establish the BRDF parameter weights that provide the best root-mean-square error (RMSE) fit.
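The inversion described above can be sketched as an ordinary least-squares problem with three unknowns. The example below is a simplified illustration: it takes precomputed kernel values for each observation, forms the normal equations, and solves them by Gaussian elimination. It omits the outlier screening and positivity checks of the operational MODIS procedure, and the function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: geometries do not constrain all kernels")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def invert_kernel_weights(k_vol, k_geo, reflectance):
    """Least-squares estimate of (f_iso, f_vol, f_geo) from multi-angular reflectance."""
    if not (len(k_vol) == len(k_geo) == len(reflectance)):
        raise ValueError("inputs must have the same number of observations")
    if len(reflectance) < 3:
        raise ValueError("at least three observations are needed for three unknowns")
    rows = [[1.0, kv, kg] for kv, kg in zip(k_vol, k_geo)]
    # Normal equations: (A^T A) f = A^T r
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(rows, reflectance)) for i in range(3)]
    return solve_linear(ata, atb)
```

Three observations are the mathematical minimum; the seven-observation requirement quoted for MODIS provides redundancy against noise, which a caller could enforce before invoking the inversion.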

Narrowband and Broadband Albedo Estimation
With the estimated BRDF model parameters, we estimate the narrowband albedo, which is an albedo for a relatively narrow wavelength range. In most cases, the wavelength range is equivalent to the bands designed for satellite sensors. Before estimating the narrowband albedo, we start with the narrowband black-sky and white-sky albedos. The black-sky albedo is a virtual albedo in the absence of a diffuse component, while the white-sky albedo is a virtual albedo in the absence of a direct component when the diffuse component is isotropic. The actual albedo at a given wavelength is expressed as a linear combination of the black-sky and white-sky albedos by using the atmospheric optical depth (Strahler et al., 1999).
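The linear combination of black-sky and white-sky albedos can be illustrated with a short sketch. Here the weight is written as a diffuse skylight fraction, which in practice would be derived from the atmospheric optical depth and solar zenith angle; passing it in directly, and the function name itself, are assumptions of this example.

```python
def actual_albedo(black_sky, white_sky, diffuse_fraction):
    """Blend black-sky and white-sky albedos by the fraction of diffuse illumination."""
    if not 0.0 <= diffuse_fraction <= 1.0:
        raise ValueError("diffuse_fraction must lie in [0, 1]")
    return (1.0 - diffuse_fraction) * black_sky + diffuse_fraction * white_sky
```

A diffuse fraction of 0 returns the black-sky albedo and a fraction of 1 returns the white-sky albedo, matching the two limiting cases described above.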
The broadband albedo is defined for wider wavelength ranges, such as 0.3-3.0 μm or 0.3-5.0 μm. However, no sensor measures the radiance over such a wide wavelength range; therefore, it is impossible to estimate the broadband albedo directly from observed sensor data. Instead, the broadband albedo is estimated by extrapolation, expressed as a linear regression model of several narrowband albedos (Liang, 2000). This conversion is known as narrow-to-broadband (NTB) conversion.
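A minimal sketch of NTB conversion as a linear model is given below. The band names follow the SGLI naming used in this paper, but the coefficient values in the usage note are placeholders chosen only for illustration; they are not the coefficients derived in this study.

```python
def narrow_to_broadband(narrowband, coefficients, intercept=0.0):
    """Broadband albedo as a linear combination of narrowband albedos.

    narrowband   -- dict mapping band name to narrowband albedo
    coefficients -- dict mapping band name to regression coefficient
    """
    missing = set(coefficients) - set(narrowband)
    if missing:
        raise KeyError("missing narrowband albedo for: " + ", ".join(sorted(missing)))
    return intercept + sum(c * narrowband[band] for band, c in coefficients.items())
```

For example, with hypothetical coefficients {"VN5": 0.3, "VN8": 0.3, "SW3": 0.4} that sum to one, a surface whose narrowband albedos all equal 0.2 yields a broadband albedo of 0.2.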

Method Based on BRDF Model
As shown in Figure 1(a), the BRDF-based method starts with multiple sets of SGLI reflectance observed during several observations. First, the unknown coefficients used in the kernel-based BRDF models in Equation (1)

Landcover Classification
The approach of estimating albedo via BRDF model parameters has the advantage of requiring no information about land-cover type in the area of interest. In contrast, the approach of estimating albedo from a set of reflectances requires a priori information about land-cover type because the BRDF shape depends strongly on land cover; accordingly, the coefficients of the multi-regression models, which are explained in the following section, depend on land-cover type. Therefore, selecting the optimal multi-regression models requires land-cover information. In operational albedo products, it may be possible to refer to the IGBP land-cover map, among others. However, using such a land-cover map can cause errors due to misclassification or the mixed-pixel problem in pixels containing more than one land-cover type. Therefore, we decided to classify land cover into three types: vegetation, non-vegetation, and ice/snow. Figure 2 shows a scattergram of NDVI versus reflectance at the wavelength of 443 nm, which is the center wavelength of VN3 of SGLI. The data were generated as simulated SGLI data from POLDER-3 data. Three classes of data were used for the scattergram, namely, (i) deciduous broadleaf forest (IGBP-4), (ii) snow/ice (IGBP-15), and (iii) barren or sparsely vegetated (IGBP-16). Figure 2 shows that the combination of NDVI and reflectance at 443 nm can be used to classify pixels into these three classes, although some snow/ice points with higher NDVI and lower reflectance at 443 nm may be difficult to separate. The approach taken in this research is to classify the SGLI datasets into three land-cover classes beforehand.
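The three-class separation by NDVI and 443 nm reflectance can be sketched as a simple threshold rule. The thresholds below are hypothetical values chosen for illustration only; the decision boundaries actually used in this study are those indicated by Figure 2 and are not reproduced here.

```python
def ndvi(red, nir):
    """Normalized difference vegetation index from red and near-infrared reflectance."""
    total = red + nir
    return 0.0 if total == 0 else (nir - red) / total

def classify_land_cover(blue_443, red, nir, ndvi_veg=0.3, blue_snow=0.3):
    """Assign one of three classes; both thresholds are illustrative assumptions."""
    if blue_443 >= blue_snow:
        return "snow/ice"        # bright at 443 nm
    if ndvi(red, nir) >= ndvi_veg:
        return "vegetation"      # dark at 443 nm with high NDVI
    return "non-vegetation"
```

A rule of this kind will mislabel the snow/ice points with higher NDVI and lower 443 nm reflectance mentioned above, which is one source of the classification error discussed in this paper.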

EXPERIMENTS
In this research, we define the broadband albedo as that measured in the wavelength range of 0.285-3.0 μm because the albedo meter used in the field measurements covered those wavelengths.

Conversion from POLDER-3 to SGLI bands
The measured digital numbers of the samples were converted into reflectances of the bands defined for SGLI by using the SGLI relative spectral response (RSR) and the digital numbers of a reference. The reflectances of the bands for POLDER-3 were also calculated by the same procedure except using the POLDER-3 RSR.
As for the in situ data measured by Field Spec 3, the reflectances of SGLI and POLDER-3 channels were calculated. With the set of reflectances, a multi-regression model was generated that used six channels of POLDER-3 reflectances as the independent variables and a specific channel of SGLI reflectance as the dependent variable.
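One plausible way to compute a band reflectance from spectroradiometer readings is an RSR-weighted ratio of the sample and reference signals, sketched below. The sketch assumes that the sample spectrum, reference spectrum, and RSR are sampled on the same wavelength grid and that the reference panel has a spectrally constant reflectance; these assumptions and the function name are illustrative and may differ from the procedure actually used.

```python
def band_reflectance(sample_dn, reference_dn, rsr, reference_reflectance=1.0):
    """RSR-weighted band reflectance from sample and reference digital numbers."""
    if not (len(sample_dn) == len(reference_dn) == len(rsr)):
        raise ValueError("spectra and RSR must share one wavelength grid")
    weighted_sample = sum(w * s for w, s in zip(rsr, sample_dn))
    weighted_reference = sum(w * r for w, r in zip(rsr, reference_dn))
    if weighted_reference == 0:
        raise ValueError("reference signal is zero within the band")
    return reference_reflectance * weighted_sample / weighted_reference
```

Applying the same function with the SGLI RSR and with the POLDER-3 RSR gives paired band reflectances, from which a regression between the two sensors' channels could then be fitted.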

NTB Conversion
For the kernel-driven BRDF model in Equation (1), we used the Ross-Thick kernel for volumetric scattering and the Li-Sparse kernel for geometric scattering (Strahler et al., 1999). We collected in situ BRF data and albedo data for several types of land cover and used them to determine the coefficients of NTB conversion. We constrained the coefficients to be non-negative and determined them by minimizing Akaike's information criterion (AIC) (Akaike, 1974). For a given model, AIC assesses its performance by evaluating both the number of variables used in the model and the sum of the errors. Note that VN6 was excluded from the calculation because its radiance may be saturated in land cover with high reflectance. The obtained model of NTB conversion is given by Equation (2), which is a linear combination of the narrowband albedos of the individual bands i.
For calibration, we used the reflectances of barren land (11 points), deciduous broadleaf forest (three points), and paddy fields (one point), and the RMSE for the 15 points was 0.010. To validate Equation (2), we used barren land (nine points), deciduous broadleaf forest (three points), and grassland (two points); the RMSE was 0.023, which is acceptable. As reported by Susaki et al. (2020), Figure 3 shows a scattergram of actual albedo versus that estimated by NTB conversion.
Figure 3. Scattergram of actual albedo versus that estimated by narrow-to-broadband (NTB) conversion for SGLI. "Calibration" denotes the calibration data used to determine Equation (2), and "Validation" denotes the validation data (Susaki et al., 2020).
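To make the role of AIC in model selection concrete, the sketch below scores candidate regression models by the common least-squares form AIC = n ln(RSS/n) + 2k, where n is the number of samples, RSS is the residual sum of squares, and k is the number of fitted parameters. Whether this exact form was used in this study is an assumption, and the function names are hypothetical.

```python
import math

def aic_least_squares(observed, predicted, n_params):
    """AIC of a least-squares fit, up to an additive constant."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if rss <= 0.0:
        raise ValueError("perfect fit: AIC is undefined for zero residuals")
    return n * math.log(rss / n) + 2 * n_params

def select_model(observed, candidates):
    """Return the label of the candidate with the lowest AIC.

    candidates maps a label to a tuple (predicted values, number of parameters).
    """
    return min(
        candidates,
        key=lambda name: aic_least_squares(observed, candidates[name][0], candidates[name][1]),
    )
```

Because each extra parameter adds 2 to the score, a model with more bands is preferred only if it reduces the residuals enough to offset that penalty.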

Multi-regression Model Estimation for Albedo Estimation
We determined the coefficients of the multi-regression models from the dataset of surface reflectance and albedo for each specific geometry. We set the interval for the solar and sensor zenith angles as 5° and that for the relative azimuth angle as 15°. The latter is larger because the coefficients are less sensitive to the relative azimuth angle than to the two zenith angles.
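The geometry-specific models imply a lookup from an observation's angles to the model derived for the matching angular bin. A minimal sketch of such a binning is shown below; assigning angles to bins by flooring (rather than by nearest bin center) and folding the relative azimuth into 0-180° are assumptions of this example.

```python
def geometry_bin(sza_deg, vza_deg, raa_deg, zenith_step=5.0, azimuth_step=15.0):
    """Map solar zenith, sensor zenith, and relative azimuth (degrees) to a bin index."""
    raa = abs(raa_deg) % 360.0
    if raa > 180.0:
        raa = 360.0 - raa  # fold the relative azimuth into [0, 180]
    return (
        int(sza_deg // zenith_step),
        int(vza_deg // zenith_step),
        int(raa // azimuth_step),
    )
```

The returned tuple could serve as a dictionary key for retrieving the regression coefficients of the corresponding geometry.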

Accuracy Assessment
First, we assessed the multi-regression models by using in situ data measured in barren land and deciduous broadleaf forest. We examined the possible combinations of SGLI bands as independent variables of the regression model and found VN5, VN8, VN11, and SW1 to be the best combination in terms of AIC. The RMSE was calculated by using the residual between the simulated and estimated albedo. Table 3 gives the RMSE for each geometry, and Figure 4 shows a contour map of the RMSE of the estimated albedo when compared with the actual in situ albedo. Figure 4 was generated by applying the kriging technique to the results of Table 3. The final RMSEs for all geometries are 0.020 for barren land and 0.044 for deciduous broadleaf forest. Note that some results for specific geometries are excluded from Table 3(b) because strong reflectances were observed that may have been due to scattering by the tower.
We then examined the validity of both the BRDF-based and multi-regression-based methods for estimating surface albedo by applying them to SGLI data. Because the spatial resolution of SW3 (i.e., 250 m) is finer than that of the other shortwave bands (i.e., 1 km), we again examined the possible combinations of SGLI bands as independent variables of the regression model under the condition that SW3 is included in the independent variables. We found VN8, VN11, and SW3 to be the best combination in terms of AIC. Table 4 gives the RMSEs of the temporal albedo estimated by the two different approaches. One approach is to use multi-regression models generated from the data of vegetated areas. The other approach is to use the BRDF model parameters provided in SGLI atmospherically corrected land surface reflectance (RSRF) products (GCOM-C, 2021b). The BRDF model parameters are generated as a by-product of RSRF, and the processing uses the kernels reported by Maignan et al. (2004). Note that we did not include the assessment results for Tottori because valid SGLI data were available only for October 28, 2019.
Figure 4. Contour maps of root-mean-square error (RMSE) of multi-regression model for IGBP LC16, barren or sparsely vegetated. The RMSE was calculated by using the residual between the simulated and estimated albedo. The solar zenith angle was set to 0° and 30° for (a) and (b), respectively.

Multi-regression Model Estimation for Albedo Estimation
In this research, we examined a method for estimating albedo from a single set of surface reflectances by using multi-regression models. The models were generated from multi-angular simulated SGLI surface reflectance. To improve the accuracy of the albedo estimation, we examined the performance of multi-regression models having a number of independent variables among the following 10 channels: VN1, VN2, VN3, VN5, VN6, VN8, VN11, SW1, SW3, and SW4. We found that the models with more independent variables do not always generate better accuracy. For example, the validation results show that some models with six or seven independent variables have an RMSE of approximately 0.06 for barren land, which is much worse than that in Table 3. Also, cases were observed in which the absolute values of the coefficients were far too sensitive to changes in the sensor zenith angle, as were the signs of the coefficients (i.e., positive or negative). These observations may be the result of POLDER-3 data having relatively low spatial resolution (approximately 6 km); the model estimation uses the albedo simulated based on POLDER-3 data.
Figure 6. Temporal albedo of Yamashiro, Japan, estimated from SGLI data. "Regression" denotes the results applying multi-regression models for vegetation class to SGLI surface reflectance, and "BRDF" denotes the results using the BRDF model parameters provided in SGLI surface reflectance products.
Table 4. Root-mean-square errors of estimated temporal albedo for Yamashiro, Japan, shown in Figure 6. "Regression" denotes the results applying multi-regression models for vegetation class to SGLI surface reflectance, and "BRDF" denotes the results using the BRDF model parameters provided in SGLI surface reflectance products.
The POLDER-3 data measured in 2008 were aggregated monthly, and it is possible to derive monthly-based multi-regression models. We examined the performance of such monthly-based models and found the estimated albedos to be less accurate. In reality, it can be reasonable to apply monthly-based multi-regression models because some vegetation (e.g., deciduous forests) has seasonally changing BRDF. However, when we derived monthly models, fewer samples were used than those used to derive yearly multi-regression models, thereby leading to unstable model estimation. Table 4 shows that the yearly multi-regression models generate acceptable albedo accuracy; therefore, it is reasonable to use yearly multi-regression models to generate operational albedo products.

Sensitivity Analysis
Next, we discuss the sensitivity of the multi-regression models. The dark areas in Figure 4(a) and (b) represent the geometries for which a worse RMSE was obtained for barren land. We set the solar zenith angle to 0° and 30° for Figure 4(a) and (b), respectively. The dark areas correspond to the hot spot of the measurement. This feature is shared by deciduous broadleaf forests and indicates that the simulated SGLI surface reflectance near the hot-spot geometry may have larger variance than the reflectance for other geometries.
This interpretation is supported by Figure 5(a) and (b), which show that the errors of the estimated albedo obtained using the in situ data of those solar zenith angles were not identical. However, because most of the measurements were conducted between 9:00 and 11:00 in the morning, similar solar zenith angles were observed. There are dark areas around the viewing zenith angle of approximately 15° in the principal plane in Figure 5(a) and around that of approximately 10° in the principal plane in Figure 5(b). Consequently, it should be noted that the albedo near a hot-spot geometry estimated by the multi-regression models may be less accurate than those near other geometries.

CONCLUSIONS
In this paper, we examined two algorithms for generating GCOM-C/SGLI surface albedo products, namely, one based on a BRDF model using several sets of reflectance and one based on multi-regression models using a single set of reflectances.
Regarding the latter algorithm, we simulated the multi-angular SGLI surface reflectance from the POLDER-3 surface reflectance, and we generated datasets of simulated albedo and multi-angular reflectance via a kernel-driven BRDF model. We derived multi-regression models at an arbitrary geometry of solar zenith, sensor zenith, and relative azimuth angles for three landcover classes, namely, vegetation, non-vegetation, and snow/ice. The experimental results show that the former algorithm generates an albedo with an acceptable RMSE of 0.047, whereas the albedos estimated by the multi-regression-based algorithm have an acceptable RMSE of 0.039. The latter algorithm requires landcover classification and, more importantly, may be affected by larger variance when the surface reflectance near a hot-spot geometry is used. Therefore, we conclude that the BRDF-based algorithm can be applied to SGLI operational albedo products.