THE RECONSTRUCTION OF NDVI TIME SERIES USING SPATIO-TEMPORAL INFORMATION

Due to the influence of cloud cover, atmospheric disturbance and many other factors, normalized difference vegetation index (NDVI) corrupted by noises has a negative effect on the downstream applications. To this end, researchers have developed a large number of methods to reconstruct NDVI time series. The harmonic analysis of time series (HANTS) is one of the most widely used approaches of NDVI reconstruction. In this paper, HANTS algorithm was improved by the utilization of spatio-temporal information of NDVI time series with spatial filling and filtering, which makes up the deficiency of HANTS that only uses temporal information of NDVI time series. The simulation experiments on moderate resolution imaging spectroradiometer (MODIS) NDVI time series have proved that our method has effectively improved the quantitative and qualitative reconstruction performance of HANTS algorithm.


INTRODUCTION
Normalized difference vegetation index (NDVI) time series data derived from numerous satellite sensors have been widely applied in assessing the global ecological environment (Pettorelli et al., 2005), monitoring and information extraction of vegetation phenology (Atkinson et al., 2012;Zhang et al., 2003) and dynamic change of land cover (De Beurs et al., 2004;De Fries et al., 1998). However, NDVI time series are interfered by sensor performance, cloud cover, changes of sun angle, reflection in two directions of ground objects, and so on Li et al., 2019).
Although the maximum value compositing (MVC) has been used to alleviate the contamination, residual noise still remains. So, it is necessary to reconstruct NDVI time series before the applications.
Since 1986, more than 20 denoising methods have been proposed to smooth NDVI time series data from various satellite sensors. The main reconstruction methods can generally be divided into three categories: 1) Threshold methods, which set a threshold to control the smoothness of the reconstructed results, mainly including the best index slope extraction algorithm (BISE) (Viovy et al., 1992) and its improved version (modified BISE) (Lovel and Graetz., 2001); 2) Filter methods, which use a filter to reconstruct NDVI in a moving window, such as Savitzky-Golay filter (SG) (Chen et al., 2004) and mean-value iterative filter (MVI) (Ma et al., 2006); 3) Function curve fitting methods, which choose a function curve to fit the vegetation growth, such as asymmetric gaussian fitting (AG) (Jonsson et al., 2002), double logistic function fitting (DL) (Beck et al., 2006), Fourier transform (FT) (Sellers et al., 1996), harmonic analysis of time series (HANTS) (Roerink et al., 2000) and moving weighted harmonic analysis (MWHA) method .
In addition, some researchers proposed to combine two or more algorithms to reconstruct high-quality NDVI time series. For example, Bradley et al. (2007) combined FT with spline-based method to derive inter-annual phenology; Zhang et al. (2011) combined inverse distance weight (IDW) interpolation with discrete wavelet transform (DWT) to denoise MODIS NDVI data in Hebei Plain, China. Meanwhile, the performance of each method has been evaluated. Hird et al. (2009) made an in-depth comparison of the reconstruction results of MODIS NDVI products by six reconstruction methods including AG, DL, SG, 4523H, MVI and ARMD3-ARMA5 filtering, finding that all the six reconstruction methods behave relatively good ability of denoising. Additionally, the performances of AG and DL are better than those of ARMD3-ARMA5. Atkinson et al. (2012) compared the results of MERIS MTCI time series data from 2004 to 2006 by four methods, finding that Whittaker is the best, and AG and DL are not suitable in areas with multiple-growthseason vegetation.
Although the above studies have compared these reconstruction techniques in many aspects, the results can be distinctly different because the methods have obvious merits and demerits and can be affected by vegetation types, study area and noisy pixel percentage.
Among these methods, HANTS is a widely used and one of the best methods in NDVI time series reconstruction. However, since HANTS only depends on the temporal correlation of NDVI time series, it may have poor performance when there are large data gaps in the time series. In order to make use of the spatio-temporal information of NDVI time series, HANTS is improved with spatial filling and filtering in this paper.
It is vital to evaluate the performances of denoising methods, and ground observation is the most effective way of verification. However, this kind of reference data, which need to be spatially appropriate, is hard to obtain. In this case, the reference NDVI, which is regarded as noise-free data, was synthetized by the 11year original NDVI time series in this paper. Besides, the rootmean-square error (Eq. (1), abbreviated as RMSE) between the reconstruction and reference NDVI was calculated as a statistical indicator to evaluate the denoising performances of our method and HANTS algorithm.
(1) where = reconstructed NDVI time series = reference NDVI time series = the number of pixels The rest of the paper is organized as follows. The data set and the proposed reconstruction algorithm are introduced in Section 2. The results of our method are compared with HANTS in Section 3. Finally, a conclusion for the current work and the plan for future work are provided in Section 4.

Data
The study area of this paper locates between 114°0'7''-114°35'52'' E and 31°0'7''-31°35'52'' N, which is a 72.5km square area in the north of Wuhan city, China. The geomorphology is dominated by mountains and hills, and the climate is typical subtropical monsoon climate, with large cloud cover in summer and autumn. The data used in this study is MOD13A1, one of the moderate resolution imaging spectroradiometer (MODIS) products of EOS/Terra satellite, including the 500m resolution NDVI time series synthesized by maximum value composite (MVC) method in 16 days and its quality control data VI_Quality from 2008 to 2018. The product is subjected to geometric correction and atmospheric correction. The data can be downloaded freely from the official website of NASA (https://ladsweb.modaps.eosdis.nasa.gov/). In this work, all data processing steps were handled by MATLAB R2017b and ENVI 5.3.

Method
Since HANTS only uses the temporal information of NDVI series, it may perform poorly when there are long-time gaps in the NDVI time series. In order to overcome this disadvantage, spatial filling and filtering are applied to the NDVI series before HANTS algorithm. Spatial filling and filtering utilize the spatial information of the data set, and they are completely independent of time gaps.
The processing flow of our method is shown as Figure 1. Firstly, the noise series are generated according to the quality control data (VI_Quality) and a cloud mask derived from the QA SDS. In this study, we use the first two bits of the VI_Quality value to identify the noise pixels. As shown in Table 1, '00' represents the pixels of high quality while others are regarded as the contaminated pixels. Besides, the pixels with high quality are labelled as '1' while the others are labelled as '0' in the corresponding location of cloud mask. Secondly, the NDVI time series with its corresponding noise series are used to obtain reference NDVI. Thirdly, the noise series are inserted into the corresponding reference NDVI to generate the reference NDVI with noise. Finally, the reconstructed NDVI is obtained after spatial filling, spatial filtering, and HANTS algorithm. The detailed operation of each step will be introduced in the following part.
In addition, the noise series of the original NDVI data in this study are concentrated in several images of the whole time series data. According to our statistics, the proportion of noise in these images can reach more than 90% while others may only have nearly 10%. Besides, there are continuous images with large amount of deletion in the time series. In general, the HANTS algorithm gets poor performance under the circumstances. And this can definitely test the validity of our method in enhancing the reconstruction performance of HANTS algorithm. For now, we only test our method in this type of noise series and more types of noise series will be verified in the future work.

Reference NDVI Calculation
The noise-free reference NDVI is a benchmark for evaluating the reconstructed NDVI (Liu et al., 2006). It is essential to generate reference time series to calculate the reconstruction deviation of our method.
In this study, the reference NDVI was obtained from the original 2008-2018 NDVI data set, as shown in Figure 2. Firstly, the average NDVI of 11 years NDVI time series is calculated. In this step, all NDVIs in the same day of year (DOY) from 2008 to 2018 were averaged to generate the reference NDVI for DOY. Secondly, if the number of valid years of some point is greater than 4 according to the noise series which are generated by the corresponding quality control data (VI_Quality), the average NDVI of this point is regarded as its reference NDVI. If not, this point will be handled by nearest neighbour interpolation, and the result of interpolation is regarded as its reference NDVI. Through these procedures, most of the contamination in the original NDVI time series can be alleviated. And the reference NDVI time series are regarded as the true value when assessing the reconstruction performances of our method. Figure 2. Reference NDVI calculation.

Image Reorganization
To overcome the limitation of HANTS algorithm which performs poorly in the reconstruction of long-time gaps, we proposed an image reorganization method to obtain a 2dimensional image, in which the adjacent pixels had a strong positive correlation.
Before the utilization of spatial filling algorithm, the reference NDVI with noise was reorganized, as shown in Figure 3. The basic idea is to reshape one image with the size of m rows and n columns into one single column. For l images, the l columns are concatenated to a new image with the size of m×n rows and l columns. Then the spatial and temporal correlation of the images can be used by patch operator. But this reorganization only uses unidimensional spatial information. In order to utilize the correlation better, the reorganization will be done twice, one reshaping by row and the other reshaping by column. Moreover, the mean value of these two reorganization images after spatial filling will be regarded as the final filled NDVI in the next step. Figure 3. Image reorganization.

Spatial Filling Algorithm
The prototype of spatial filling algorithm in this paper comes from a digital image inpainting algorithm proposed by Criminisi et al. (2004). The basic idea of the spatial filling algorithm can be simply described as using the value of the most similar valid areas in the reorganized image to fill the invalid areas.
First of all, the reorganized image was divided into known areas with valid NDVI and unknown areas to be reconstructed according to the noise series. Then, the whole image was uniformly divided into a series of p×p blocks, and the block with the least number of unknown pixels will be filled first by the most similar block obtained from the valid areas.
The specific local filling process can be shown as Figure 4. In Figure 4, the black pixels represent the unknown areas and the pixels with other colors represent the known areas. Assume that block1 is the one with the least number of unknown pixels in the whole image and ready to be filled first. In block1, there are both valid pixels and unknown areas. Then, another block2 is searched in the known areas of the whole images, whose NDVI value is the most similar to the corresponding valid NDVI value in block1 by calculating their root mean square error (RMSE). A linear function is established based on the known parts of the two blocks. And a new block2 is obtained from fitting block2 by the linear function. Finally, the unknown pixels of block1 are filled by the NDVI value of the corresponding pixels of the new block2. The previous operations are circularly carried out until the unknown part of whole image filled. It should be noted that the filled image will be put back to the original l images with the size of m×n by the inverse process of image reorganization in this step.
Through the abovementioned process, filled NDVI time series are obtained. In this way, the original missing pixels are replaced by some approximate valid values. Although this kind of values may have some deviation from the true values, they can provide useful information to alleviate the interference of long-time gaps in the original NDVI time series and improve the performance of HANTS algorithm effectively as well.

Spatial Filtering
After spatial filling, the spatial filtering is applied. It can not only use the spatial information to enhance the details of the image, but also remove the self-existing noises and those caused by spatial filling. Many excellent spatial filtering methods have been developed in recent years. For brevity and effectivity, the adaptive median filter is adopted in this study.
When using adaptive median filter, the window size of the filter changes constantly during the filtering process according to the noise density. In general, a large window size is needed for regions where there are numerous missing pixels and a small window size is required for regions with little missing data. Besides, the window size of the filter will not change when no noise is detected and the valid value will be retained. In this way, the details of the NDVI time series can be maintained while eliminating noises.

The Harmonic Analysis of Time Series (HANTS) Algorithm
After spatial filtering, the harmonic analysis of time series (HANTS) algorithm is used to reconstruct the time-series NDVI. Harmonic analysis is a basic means of signal processing because any periodic function of time can be expanded into the Fourier series, i.e. the sum of an infinite number of sine and cosine functions. The core of HANTS algorithm is the two-dimensional Fourier transform. The reconstructed NDVI time series can be expressed as the following formula: (2) where = reconstructed NDVI = the number of NDVI time series = the NDVI value of the sample point t = the number of harmonics = the number of layers of samples 0 = the coefficient of zero frequency Based on the abovementioned theory, the harmonic analysis of time series (HANTS) algorithm has been applied to reconstruct vegetation index time series data with residual noise after preprocessing. And the reconstruction process of HANTS algorithm can be described as the following three steps: 1. Identifying the noise pixels outside the valid range of time series data. Generally, the lower boundary of valid value is set to -0.2 while the upper boundary of valid value is set to 1. And all the pixels outside this range will be considered as unavailable data; 2. Reconstructing the rest of valid pixels in time series by Fourier transform to obtain the initial fitted time series; 3. If the deviation between the reconstructed time series and the valid pixels exceeds the threshold called Fit Error Tolerance (FET) and the number of the rest pixels is less than the minimum number of valid pixels necessary for the subsequent reconstruction procedure, then abandon the pixels with deviation larger than half of the FET value and return to step 2. If not, end the procedure.
The parameters of HANTS algorithm are variable when the algorithm is applied to reconstruct different NDVI time series. However, there is no acknowledged principle to set these parameters up to now. In general, the parameters of HANTS algorithm are determined by the experience of users. In this paper, these parameters are set as shown in Table 2

RESULTS
In this paper, the final reconstructed NDVI time series of our method were compared with those reconstructed by HANTS algorithm. In both two methods, the parameter settings of HANTS algorithm were the same (DOD = 5; FET = 0.05; Nof = -0.2; Min = -0.2; Max = 1.0; Hilo = low). The two methods were not only evaluated from the reconstructed NDVI series of 2008-2013 in our study area by quantitative indexes including linearly dependent coefficient (R) and root mean square error (RMSE), but also by the visual effect. The linearly dependent coefficient (R) reflects the level of similarity between reconstructed NDVI and original time series, while the root mean square error (RMSE) reflects the deviation between them. The annual average R and RMSE of the results reconstructed by HANTS and our method from 2008-2013 were calculated by the reference NDVI, as shown in Table 3. As you can see, our method has higher R values and lower RMSE values than HNATS for every year, which means our method retains more valid pixels and has a better reconstruction performance than HANTS algorithm in our study area.  Table 3. The quantitative comparison of HANTS and our method.
As for the visual effect of the results, two groups of reconstruction results of HANTS algorithm and our method are shown as examples. Figure 5 shows the first group of reconstructed NDVI. Figure 5(a) and Figure 5(b) represent the reference NDVI and cloud mask, respectively. The cloud mask is extracted from noise series, in which the black part labelled as '0' stands for the areas to be reconstructed while the white part labelled as '1' stands for the valid pixels in the corresponding NDVI time series. The function of cloud mask is to mark the missing parts of the NDVI time series. Compared with the result of HANTS [ Figure 5(c)], the result of our method [ Figure 5(d)] is more similar to the reference NDVI. Additionally, the result of HANTS includes obvious noise, as shown in the red rectangles of Figure 5(c).
The second group of reconstructed NDVI is shown in Figure 6. As shown in Figure 6(c), there is obvious noise in the red elliptic region of the reconstructed result by HANTS. There are three blocks of water area in the red rectangular region of Figure 6(c), where there are only two blocks of water areas in the same position of reference NDVI [ Figure 6(a)]. By contrast, our method [ Figure 6(d)] has a better performance than HANTS, which is more similar to the reference NDVI in details. Additionally, the water area in the yellow rectangular region in Figure 6(c) is obviously expanded compared with the reference NDVI, while our method does not.
Through the abovementioned comparison between our method and HANTS algorithm, the new technique proposed in this study, which to some extend improved the performance of HANTS, can effectively remove the majority of noise in NDVI time series while preserving the valid pixels in the original data in our simulation experiments.

CONCLUSIONS
Based on MODIS NDVI time series with the resolution of 16 days and 500m in typical subtropical environments, this paper proposed an improved HANTS method to reconstruct NDVI time series. Our proposed method makes use of the spatial and temporal information of NDVI time series together. After image reorganization, the spatial and temporal correlations of NDVI are utilized by spatial filling and filtering, which are beneficial to the subsequent HANTS. For both quantitative and qualitative evaluations, the proposed method performs better than HANTS, which is only based on the temporal information of NDVI time series. However, this paper only conducted simulated experiments. In the future, the proposed method will be tested by more real experiments to validate the effectivity.