USING MULTI-TEMPORAL REMOTE SENSING DATA TO ANALYZE THE SPATIO-TEMPORAL PATTERNS OF DRY SEASON RICE PRODUCTION IN BANGLADESH

Remote sensing in the optical domain is widely used in agricultural monitoring; however, such initiatives pose a challenge for developing countries due to a lack of high quality in situ information. Our proposed methodology could help developing countries bridge this gap by demonstrating the potential to quantify patterns of dry season rice production in Bangladesh. To analyze approximately 90,000 km of cultivated land in Bangladesh at 30 m spatial resolution, we used two decades of remote sensing data from the Landsat archive and Google Earth Engine (GEE), a cloud-based geospatial data analysis platform built on Google infrastructure and capable of processing petabyte-scale remote sensing data. We reconstructed the seasonal patterns of vegetation indices (VIs) for each pixel using a harmonic time series (HTS) model, which minimizes the effects of missing observations and noise. Next, we combined the seasonality information of VIs with our knowledge of rice cultivation systems in Bangladesh to delineate rice areas in the dry season, which are predominantly hybrid and High Yielding Varieties (HYV). Based on historical Landsat imagery, the harmonic time series of vegetation indices (HTS-VIs) model estimated 4.605 million ha, 3.519 million ha, and 4.021 million ha of rice production for Bangladesh in 2005, 2010, and 2015 respectively. Fine spatial scale information on HYV rice over the last 20 years will greatly improve our understanding of double-cropped rice systems, current status of production, and potential for HYV rice adoption in Bangladesh during the dry season.


INTRODUCTION
Rice is an important food staple for more than 2 billion people globally (Khush 2005;Muthayya et al. 2014).Therefore mapping the extent of rice-growing areas, understanding diverse rice-farming systems, characterizing rice adoption or abandonment, and evaluating its potential for improvement is crucial to current and future food security goals, as well as environmental concerns such as greenhouse gas emissions (Kuenzer & Knauer 2013;Smith et al. 2008;van Groenigen et al. 2013;Whitcraft et al. 2015).
Researchers have identified remote sensing as one of the most effective methods to monitor rice production, especially at regional and national scales (Whitcraft et al. 2015).Over the last three decades, different methods for rice mapping and monitoring have been developed using remote sensing data (McCloy et al. 1987;Fang et al. 1998;Dong et al. 2016).The spatial extents of these studies range from experimental plots to continental scales and employ unsupervised, supervised, rule-based, and/or time series algorithms (Boschetti et al. 2017;Dong et al. 2016).The Landsat and Moderate Resolution Imaging Spectroradiometer (MODIS) constellations have been the most widely used because the spectral information they record is particularly suitable for agricultural characteristics (Okamoto 1999;Whitcraft et al. 2015).
MODIS images have been used much more extensively in agricultural and rice monitoring applications at larger regional scales because of the faster re-visit time (~1 day) and relatively smaller datasets resulting from its lower resolution (Becker-Reshef et al. 2010;Boschetti et al. 2017;Duveiler et al. 2015;Xiao et al. 2005Xiao et al. , 2006;;Zhang et al. 2017).Nelson et al. (2014) also combined MODIS time series images with SAR active sensor data for rice monitoring, exemplifying new initiatives and novel techniques becoming available with the advent of free access to remotely sensed datasets, and improved expert understanding of regional rice production systems.
Earlier analyses of Landsat imagery were limited to only a single image or a few images within smaller regions due to the size of datasets and previous computational limitations ( McCloy et al. 1986;Panigrahy & Parihar 1992).More recently, Dong et al. (2016) analyzed hundreds of Landsat images over Northeast Asia, which has only become possible with recent innovations in cloud-based computing technology and easy access to the Landsat archive.
Rice mapping and monitoring using remote sensing techniques hinges on the creation of algorithms that can adequately account for spatial and temporal aspects of production, while simultaneously providing flexibility across the many environments suitable for rice (Nelson et al. 2014).This is especially challenging when trying to maintain automation and minimizing expert user inputs.Boschetti et al. (2017) showed how to use expert knowledge of temporal growing season information and rice phenology to improve accuracy in rice mapping using their automated PhenoRice algorithm.A number of other studies have mapped rice, even at national and sub-continental scales (Dong et al. 2016;Zhang et al. 2017).These important studies illustrate the evolution of rice mapping in recent years and each has also discussed gaps and limitations to the various methods employed.These concerns are related to validating results and tradeoffs in the implementation of different methodological approaches.
In previous studies, scientists made significant tradeoffs between the spatial and temporal resolutions of different sensors, as well as the extent of selected study areas.The predominant limitation was the large computational power required to analyze high resolution imagery across large land areas and over longer periods of time.For example, while much MODIS imagery has been successfully and widely used to map rice across large areas through time (Boschetti et al. 2017;Xiao et al. 2005Xiao et al. , 2006;;Zhang et al. 2017), the resolution is only at 250 m, which limits analysis to homogenous landscapes with little fragmentation.However, rice production often occurs in heterogeneous, fragmented landscapes, particularly in places such as Bangladesh where most farmers are small holders with an average farm size of 0.24 hectares (Rapsomanikis 2015).
Other studies such as Dong et al. (2016) present large area analyses of Landsat 30 m data for rice mapping, but limit their study temporally to one year, demonstrating the power of Google Earth Engine which is a cloud-computing platform.The natural next step with this newfound computational capacity is to conduct higher resolution analyses over longer time periods and across larger land areas.This would bridge the existing gap in linking finer resolution imagery, such as 30 m Landsat, with long-term (multi-year) time series rice mapping at the country scale.Specifically, this study quantifies dry season rice production in Bangladesh from the late 1990s (Landsat 5), to the 2000s (Landsat 5 and 7), to the mid-2010s (Landsat 8).Moreover, we provide a new method of integrating spatio-temporal aspects of classification into a procedure which is regionally flexible and requires few input parameters to model.Thus, the objectives of this study are: (1) to demonstrate the use of a HTS model with VIs (HTS-VIs) in classifying dry season rice production in Bangladesh, (2) to compare and validate the results of this model with district-wise and national rice production statistics from Bangladesh, (3) to evaluate temporal changes in dry season rice production at the districtlevel based on the results of the HTS-VIs classification, and (4) to assess the limitations of the HTS-VIs model and discuss potential improvements for future work.

Remote Sensing data:
In this study, ortho-rectified Landsat 5 (LT5), Landsat 7 (LE7) and Landsat 8 (LC8) Top of Atmosphere (TOA) reflectance imagery are used to analyze dry season rice production in Bangladesh.LT5, LE7, and LC8 satellite platforms image(d) the entire earth every 16 days with a pixel resolution of 30 m and multiple spectral bands (Wulder et al. 2016).The TOA images are filtered spatially to cover all regions within the modern borders of Bangladesh.All Bangladeshi territory is covered by approximately 24 Landsat tiles.In total, 2396 LT5 images are included for years 1998-2011, 729 LE7 images for 1999-2002, and 1368 LC8 images for 2013 -2017.These ortho-rectified time series images provide a platform for analyzing pixel-level changes from yearto-year and season-to-season with a re-visit time of 16 days.In Figure (1), we show a map of the fraction of cloud-free observations during the dry season (December to May) for each Landsat platform.The cloud cover maps were derived based on the ratio of used ('good') pixels to the total number of pixels for each time series of images between December 1 st and May 31 st .It's important to note that a different cloud detection algorithm was used for LT5, which identified a significantly higher number of cloud pixels.As a result, more pixels were masked in LT5.
Figure 1.Percentage of cloud-free observations available for LE7, LT5, and LC8 for corresponding years.

Reference data from official statistics:
There are 9 divisions (provinces/states) and 64 districts in Bangladesh.District-wise data for boro rice production area were generated based on reports from the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) and Bangladesh Bureau of Statistics (BBS).ICRISAT data is available between 2004 and 2010 (ICRISAT 2004(ICRISAT -2010)), and BBS data was used for 2011 to 2015 (BBS 2011(BBS -2015)); these datasets will be referred to as the reference data for the rest of the manuscript.The reference data are aggregated areal estimates at district level and do not provide explicit spatial details or locationspecific management details.

Google Earth Engine platform
Google Earth Engine (GEE) is a cloud-based geospatial data analysis platform built on Google infrastructure and capable of processing petabyte-scale remote sensing data (GEE 2017;Moore & Hansen 2011).In previous studies, Landsat imagery could not be analyzed on larger regional or national scales because of high computational requirements.Instead, regional and national scale analyses were limited to moderate or lowresolution imagery platforms (Dong et al. 2016).To address this, GEE has opened the way for big geo-data analysis and the potential for investigation of higher resolution imagery such as Landsat in time series models.The GEE platform provides easy access to archives of remotely sensed data through either a JavaScript or Python API while simultaneously hiding the complexities of parallel computing and cyber-infrastructure.

Rice Phenology and Spectral Characteristics
Paddy rice, which accounts for more than 90% of rice production worldwide and is the main focus of this study (Kuenzer & Knauer 2013), typically follows three stages during the growing season: (1) a flooding and transplanting phase, (2) a flowering and tillering phase, and (3) a grain-filling and harvest stage (Dong & Xiao 2016;Mahmood 1997;Wassman et al. 2009).During the flooding and transplanting phase, rice fields have exposed soil with little to no vegetative cover, followed by water infiltration and flooding, which slowly transitions to light vegetation after rice seedlings are transplanted.As the seedlings grow, the rice canopy begins to cover the flooded soil below and vegetation rapidly increases to usher in the flowering and tillering phase.This second phase is defined by rapid vegetative growth, rice flowering, germination, and pollination, as well as panicle greening, lengthening, and thickening.Finally, the grain-filling and harvest stage follows as rice matures, browns, and dries.When the grain reaches ~20% moisture or less, harvest time commences and rice is cut from the fields, which return to brown exposed soil once again (Shelley et al. 2016).Importantly, the time-frame for each of these primary phenological phases changes depending on the rice varietal planted and the local planting times, as determined by weather and management decisions.Thus, any model for rice identification using these phenological stages must be flexible to changes in the length of the growing season, e.g., ~90 -120 days for an HYV or Hybrid versus 130 + days for many traditional varieties, as well as flexible to the commencement of the transplanting.
The majority of previous rice-mapping research attempted to identify pixels that show the characteristic temporal pattern of flooding followed by transplanting using remote sensing indicators such as vegetation indices (VIs; Boschetti et al. 2017;Xiao et al. 2005Xiao et al. , 2006;;Zhang et al. 2017).Specifically, these VIs are the Normalized Difference Vegetation Index (NDVI; Tucker 1979) and Enhanced Vegetation Index (EVI; Huete et al. 1997Huete et al. , 2002)), which are sensitive to leaf chlorophyll content or plant greenness, and the Land Surface Water Index (LSWI; Xiao et al. 2004), which is sensitive to leaf water content and soil moisture.These indices are calculated based on the following formulas: (3) When LSWI+0.05 > NDVI or LSWI+0.05> EVI, the signature represents a transition from plant cover or dry soil to primarily water or saturated soil and indicates the likely timeframe during which rice transplanting takes place.We refer to this as the flood signature.More details about this process can be found in Xiao et al. (2005Xiao et al. ( , 2006)).

Fitting a Harmonic Time Series Model
Cloud cover predominates over many regions of Bangladesh throughout the year, especially during the monsoon season between late June and October.Because of this, there are missing observations in addition to the atmospheric and sensor noise in the recorded bands of the imagery, which has to be dealt with in the estimation of VIs (Roy et al. 2002;Verbesselt et al. 2010).To correct for these issues, a HTS model was employed to interpolate NDVI, EVI and LSWI values over the time period for each pixel.The harmonic model was used to estimate each VI value ( * ) at each time () based on the following equation: where  represents a constant,  1 is a coefficient for the overall trend in ,  2 is a coefficient for the sin function at the frequency  of time , and  3 is a coefficient for the cosine function at the frequency  of time .In this model, two harmonic frequencies per annum account for the two seasons of rice that occur in a given crop-calendar year in Bangladesh.
For Landsat 8, a single HTS model was fitted to the time interval between 2013 and 2017, i.e., the entire four years of available Landsat 8 imagery.For Landsat 7, again a single HTS model was fitted for the years 1999 -2002.However, for the Landsat 5 imagery, a HTS model was fitted corresponding to advancing five-year windows between 1996 and 2013 with the middle (third) year being the focus of analysis, e.g., for analyzing 2010, a HTS model was constructed using the image between 2008-2012.For each year, the rice classification timeframe was specified between December 1 st and May 31 st , which includes the boro season across the entire country.Due to cloud cover, time series observations of VIs were irregular.
To ensure temporal comparability across regions, a pseudomodel of the HTS-VIs was generated starting December 1 st and advanced every 16 days through May 31 st .This analysis resulted in regular time series VI values (total of 12) for each pixel at 16-day intervals for the dry season.
The flooding signature was used to determine the rice transplanting phase using the regular HTS-VIs values.LT5, LE7, and LC8 collect slightly different spectral band widths and have lower radiometric resolutions compared to LC8.We found that EVI is generally more sensitive to the spectral characteristics of rice for LT5 and LE7, so we assumed the singular consideration of LSWI + 0.05 > EVI for LT5 and LE7 was sufficient for identifying rice paddy in the transplanting phase for these platforms.However, for LC8 two VIs for vegetation (NDVI/EVI) were used.We estimated the total number of observations for which the conditions are satisfied (0 being the minimum while 12 is the maximum).
In the time series plot below (Figure 2), LSWI+0.05 is greater than EVI or NDVI for five consecutive counts; this pixel would be classified as rice for the boro season of the year plotted.(Since these types of time series plots depict the phenology of rice, these will be identified as 'pheno-plots' for later reference).Given that planting dates vary year-to-year and regionally, and phenological stages change by variety, we attempted to develop a flexible model for rice pixel identification without prior detailed knowledge of ricegrowing seasons.Three counts (~16-32 days) of the flood signature were used as the minimum for a rice-producing pixel, and eight counts (~80-96 days) were used as the maximum for a rice-producing pixel.Any pixel with a count of three to eight counts of LSWI greater than EVI/NDVI would indicate flooding/transplanting for at least 16 days and a maximum of 96 days.We assume that less than three observations might be residual flooding from the previous aman season or rainfall and more than eight might be permanent water, wetland areas, or non-rice flooding.All the Landsat imagery analysis described so far was performed on the GEE platform.Each of these HTS-VIs classifications was exported for each year and further analyzed in R statistical software.

Optimization of area estimates using reference data
The area covered by 'possible' rice pixels (any pixel with 3 to 8 counts of the flood signature) was extracted within each of these 64 districts using the 'raster' package in R statistical software (Hijmans & van Etten 2012).We estimated different areas under rice with the following assumptions about the flooding and transplanting duration: 1) that the rice cultivation system for any specific district is homogeneous and flooding signatures can be observed for only one of the following flooding day intervals: 16, 32, 48, 64, 80 or 96 (3 to 8 observations of the flooding signature); and then, 2) that the rice cultivation system is heterogeneous and multiple flooding durations can be present within a district.Based on these assumptions, we estimated areas under different flooding durations using all possible (21) combinations of flood signatures, i.e., 21 models of rice-producing area estimation for each district each year during the dry season.Next, the calculated areas based on these 21 estimations per district per year were compared with those in the district-level reference data to find the combination which provided the closest estimation of rice-producing hectares to the reference data.For example, if the difference between the 3 -4 counts (model) of the flooding signature (the total area of 'possible' rice pixels in a district between 16 to 32 days of the flood duration) was smaller than all other combinations, then we accepted that scenario as most representative of boro rice production in that district.Based on the minimum difference models, an accuracy assessment was conducted for the rice classified areas derived from the HTS-VIs model within each district against the district reports on annual boro rice areas from the reference data.For accuracy assessment, the Mean Absolute Percentage Error (MAPE) was calculated between the reference data and the HTS-VIs model results (Armstrong & Collopy 1992;Hyndman & Koehler 2006).This assessment provides critical understanding for how rice production in fragmented landscapes may be mapped spatio-temporally, which could lead to improvements in classification at finer scale resolutions in the future.

Model Fitting and Spatial Variability of Dry Season Rice Production
The HTS-VIs model estimated 4.021 million ha, 3.519 million ha, and 4.605 million ha of rice production at the national level for Bangladesh in 2015, 2010, and 2005, respectively, based on the closest values of LC8 and LT5 compared to the reference data.Notably, the LT5 analysis over-estimated rice areas in some years and under-estimated in others due to the implementation of a different cloud-masking algorithm from the one employed with LC8, which resulted in a lower number of cloud-free observations for fitting the harmonic time series model.

Harmonic fitting of the Landsat VIs:
For the purposes of this study, we focus on the boro season between December/January and April/May.The EVI and NDVI signals are strong here and there are fewer missing values during this time period.The flood signal during the transplant phase shows up strongly, followed by a rapid increase and peak in EVI/NDVI, which indicates this small test area is most likely under rice production.Each year produces a similar pattern for this test, which suggests farmers are consistently planting a boro season crop in this region.In Figure (3), we present the fitted harmonic model for LC8 at a test site near Mymensingh.There are generally two seasons of rice, aman followed by boro.Overall, there is less cloud cover during the boro season, so this study focuses on the second season.The rice signature for the boro season is clear in this representation of the harmonic model, but due to missing values in both February and May, the model would be difficult to compare with other regions or across years and could easily misclassify the rice signature.To resolve this issue and make all regions and time periods more comparable, we generated a common time interval and equal observations for all VIs, an example of which is presented in Figure (4).what is known about regional differences in rice production patterns in Bangladesh, namely that there is a stronger, more predictable boro season in the north and there are often flooding and drainage issues during aman in the coastal region.

Generation of VI at equal-intervals starting
The pseudo HTS-VIs model was used for all estimations of rice-producing area in order to maintain this systematic comparability.The next section explores this spatial and temporal variability more explicitly.Figure 4. Generated VIs at equal-intervals starting December 1 st in the Mymensingh example.

Comparison of rice-area estimates from remote sensing analysis and reference data
The In our validation assessment, we compared our closest rice area estimation for boro rice areas provided by the reference data using MAPE, also known as Mean Absolute Percentage Error (Hyndman & Koehler 2006).We found accuracy levels of 10. 14, 8.42, and 8.49 percent for 201314, 8.42, and 8.49 percent for , 201414, 8.42, and 8.49 percent for , and 2015 in comparing LC8 in comparing LC8 with the reference data.For LT5, we found accuracy levels of 8. 60, 7.84, 8.21, 7.88, 7.47, 7.71, 7.78, 7.91, 7.86, and 8.23 percent for each year from 2004-2013 in the HTS-VIs national estimates.

Advantages:
The proposed methodology provides a flexible platform for mapping rice production in heterogeneous landscapes.In previous rice mapping efforts, expert knowledge on detailed rice phenology has been a key feature of the models.In other cases, where expert knowledge has not been included, the focus of rice mapping has been in relatively simple, homogenous production systems.This methodology allows one to fit a rice-identification scheme flexible to regional differences in production systems and varying landholder sizes.When combined with Landsat data at 30 m resolution as in this study, the methodology can account for rice in fragmented and disjointed landscapes.While the model shows promising results in this introductory example for Bangladesh, there are some limitations to the model in its present form and likewise, there are ways to improve it in future implementations.

Limitations/constraints:
The primary limitation of this methodology is the dependence on reference data in identifying the best rice classification for each district.We identified the HTS-VIs model based on the least difference model from the district reference data, which illustrates the robustness of the model to estimate rice-growing areas across heterogeneous agricultural practices, season lengths, and fragmented geographies.However, this dependence on the reference data hinders the models application in un-referenced regions or years.This gap might be bridged in future work, which is discussed in more detail below.
Another limitation prevalent in this model is its sensitivity to cloud cover.The model does not perform consistently in the aman season in Bangladesh because of the monsoonal cloud cover during this time.In some cases, the model adjusts to missing data via the interpolation of the HTS-VIs, but given a scenario where two to three consecutive observations are missing, the model prediction will be poor.Because of this, we introduce this model primarily for dry season mapping in Bangladesh and suggest it should primarily be used in regions where extended periods of cloud cover are less widespread.Cloud cover was more problematic for LT5 estimates due to a less nuanced cloud-masking algorithm, which could potentially be reduced in future work.In this case, we used all LT5 district estimates within 2.5 standard deviations, e.g., 95%, of the sample to remove outlier estimates caused by the excessive cloud masking.In the validations, this reduced errors significantly, but in future work this step may not be necessary.
The difference between our models and the reference data may also be due to the influence of small, fragmented plots of i.e., less than 30 m pixels, or omission errors from the influences of cloud cover as described above.Instead, we propose that the HTS-VIs model may identify rice production more correctly than those collected in situ at the district level, though this requires further confirmation with higher resolution imagery or better field observations than are currently available.At this point, higher resolution imagery is limited by time and/or cost, and field observations in rural Bangladesh are difficult to obtain for previous years.

Future research pathways:
Future work could compare these results with previously implemented MODIS rice classification algorithms to identify disparities based on resolution.It is possible that without a maximum/minimum difference threshold for VIs, commission errors may occur in wetland areas where similar vegetative characteristics exist alongside watery land surfaces.
Similarly, we use a minimum flooding duration of 16 to 32 days, which could lead to omission errors where the flooding and transplanting phase is shorter.The decision to omit flooding duration less than 16 days was based on the assumption that many areas in Bangladesh could remain flooded for 16 days with normal precipitation patterns given the low-lying and relatively flat landscape.On the other end of the spectrum, the maximum flooding duration included is 96 days to ensure that longer flooding and transplanting phases are captured.In some regions, flooding persists from the aman season into the dry season when farmers may begin transplanting rice.
In this study, we limited ourselves to introducing the HTS-VIs model with Bangladesh as an example in order to demonstrate its effectiveness at capturing rice-producing areas in a fragmented landscape with diverse agricultural systems, but further validation and testing is needed for the HTS-VIs model.

CONCLUSIONS
In this study, we demonstrate the implementation of a harmonic time series (HTS) model with EVI, NDVI, and LSWI to identify rice production areas during the boro season in Bangladesh.To our knowledge, this is the first time a pixelbased time series model has been applied to Landsat at the national level on a decadal time scale.We found that, the HTS-VIs model has the potential to map rice production across fragmented landscapes and heterogeneous production practices with comparable accuracy to other methods but without expert knowledge inputs.For LT5, this method shows approximately 25 -40 % difference from the reference data for districts in Bangladesh-generally less than the estimates made by local agricultural offices.For LC8, our results were within 5 -15% of reference data at the district level, which is a significant improvement over the LT5 predictions likely due to improved quality bands, higher radiometric resolution, and the consequent larger number of good quality observations.As agricultural monitoring via remote sensing becomes more widespread and important in meeting global food security needs, we suggest that models like the HTS-VIs introduced here could improve the accuracy and efficiency with which scientists are able to do so, giving policy-makers and development practitioners an enhanced platform for their work.

Figure 2 .
Figure 2. Rice phenology for the boro season with a flood signature of '5' for LC8.
December 1 st : In Figure (4), we present the same test area from Figure (3), but this iteration includes a generation of equal-intervals for the VIs beginning on December 1 st and ending May 31 st in the year 2017.Importantly, the observations are consistent every 16 days with no missing observations because they are interpolated based on real observations in the original HTS model.In the pseudo HTS model, the signature for rice is an improvement over the fitted model with original values.This is demonstrated by the fact that we see a clear harvest time in late December for aman rice, followed by the flood signature for boro season planting, and a peak and decline in the flowering and tillering phase in May.These signatures can exemplify
HTS-VIs model results show that indeed central and northern districts in Bangladesh have increased the hectares of rice production during the dry season, i.e., during boro and aus.According to the reference data, approximately 4.064 million ha were planted in 2005, 4.707 million ha were planted in 2010, and 4.090 ma hectares were planted in 2015 at the national level.Below, we compare our best fitting HTS-VIs results for 2005, 2010, and 2015 with these reference data from BBS and ICRISAT, both nationally via MAPE and at the district level via mapping.In Figure (5), we see that the district-level rice growing spatial patterns as well as magnitude in hectares planted are relatively consistent.The upper row represents the reference data and the lower row represents the closest HTS-VIs estimations for that year.

Figure 5 .
Figure 5. District maps of dry season rice production: Reference vs. HTS-VIs closest model.