ESTIMATING CORN YIELD IN THE UNITED STATES WITH MODIS EVI AND MACHINE LEARNING METHODS

ABSTRACT: Satellite remote sensing is commonly used to monitor crop yield over wide areas. Because many parameters are necessary for crop yield estimation, modelling the relationships between these parameters and crop yield is generally complicated. Several methodologies using machine learning have been proposed to solve this issue, but the accuracy of county-level estimation remains to be improved. In addition, estimating county-level crop yield across an entire country has not yet been achieved. In this study, we applied a deep neural network (DNN) to estimate corn yield. We evaluated the estimation accuracy of the DNN model by comparing it with other models trained by different machine learning algorithms. We also prepared two time-series datasets differing in duration and confirmed the feature extraction performance of the models by inputting each dataset. As a result, the DNN estimated county-level corn yield for the entire area of the United States with a coefficient of determination (R²) of 0.780 and a root mean square error (RMSE) of 18.2 bushels/acre. In addition, our results showed that estimation models trained by a neural network extracted features from the input data better than an existing machine learning algorithm.


INTRODUCTION
Both population growth and increasing incomes are expected to increase food demand. Global food production has to be increased by more than 70% between 2005 and 2050 to feed the projected world population of 9.1 billion people in 2050 (FAO, 2011). Therefore, the proper management of agricultural production is vital to mitigate the risk of food shortages. The accurate estimation of crop yields is essential for decision-making regarding regional and global food security issues (Wang and Zhang, 2013). Satellite remote sensing plays an important role in monitoring crop yields at the global scale (Hall and Badhwar, 1987). It is highly useful for monitoring large-scale crop areas because it can acquire the information needed for managing croplands over large areas simultaneously. The Enhanced Vegetation Index (EVI) derived from MODIS satellite data has been applied to observe crop conditions (Galford et al., 2008; Wardlow et al., 2007). Although the MODIS-based EVI is often affected by cloud contamination, several methods have been proposed to improve the data for monitoring seasonal changes in vegetation, such as smoothing the time-series variation by applying the wavelet transform (Sakamoto et al., 2005).
Traditionally, statistical methods have been used for estimating yields of various crop types. However, such methods are not useful when many factors and relationships must be considered (Paswan and Begum, 2013). When estimating crop yield at regional and national scales, estimation accuracy is degraded and uncertainty is increased by the heterogeneity of environmental conditions, including the irrigation system, fertilizer application rate, meteorological conditions, and soil (Conradt et al., 2014).
To handle the complicated factors and relationships in estimating crop production, machine learning techniques such as the support vector machine (SVM) and the artificial neural network (ANN) have been applied (Karimi et al., 2008; Paswan and Begum, 2013). However, those techniques still require human effort to identify features for accurate estimation. In contrast, the deep learning (DL) technique does not require such human effort because it builds a multi-layered neural network from multiple restricted Boltzmann machines or autoencoders (Erhan et al., 2010). Although DL has the potential to be applied to crop yield estimation, no such applications have been demonstrated so far.
Corn is an important staple crop that is cultivated globally; it has a huge impact on food security (HLPE, 2013). In this paper, we report on the performance of DL applied to county-level corn yield estimation across the United States by using MODIS-based EVI and daily meteorological data, and compare it with that of SVM and ANN.

Materials
We used Daymet and EVI calculated from MOD09A1 as the input data for the corn yield estimation model (Table 1). We used corn yield data as the target data of the corn yield estimation model. Because this study targets corn yield, the cornfield layer extracted from the cropland data layer (CDL) was used to mask the weather data and EVI.

MODIS EVI
Satellite-based vegetation indices are often used for estimating agricultural production. The EVI is a vegetation index with improved sensitivity in high-biomass regions (Huete et al., 2002). We calculated EVI from MOD09A1, which is an 8-day surface reflectance dataset composited for the best possible observation coverage, low view angle, absence of clouds or cloud shadow, and aerosol loading (Vermote, 2015). The spatial resolution of MOD09A1 is 500 m; the data were acquired from the Land Processes Distributed Active Archive Center (LP DAAC; https://lpdaac.usgs.gov/). We calculated EVI by equation (1):
$$\mathrm{EVI} = G \times \frac{\rho_{NIR} - \rho_{red}}{\rho_{NIR} + C_1 \rho_{red} - C_2 \rho_{blue} + L} \qquad (1)$$
where ρNIR, ρred, and ρblue are the surface reflectances in the near-infrared, red, and blue bands. The coefficients for the MODIS data are G = 2.5, L = 1, C1 = 6, and C2 = 7.5.
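The EVI calculation of equation (1) can be illustrated with a short sketch. It is not the processing code of this study: the function name and the sample reflectances are hypothetical, and the inputs are assumed to be MOD09A1 band values already scaled to unitless surface reflectance.

```python
def evi(nir, red, blue, G=2.5, L=1.0, C1=6.0, C2=7.5):
    """Enhanced Vegetation Index from surface reflectances (unitless).

    The default coefficients are the MODIS values quoted in the text.
    """
    denominator = nir + C1 * red - C2 * blue + L
    return G * (nir - red) / denominator


# Hypothetical reflectances for a single vegetated pixel (assumed values).
sample = evi(nir=0.50, red=0.10, blue=0.05)
print(round(sample, 4))
```

Keeping the coefficients as keyword arguments makes it easy to check the sensitivity of the index to each term without changing the function.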
Time-series MODIS EVI data typically contain noise induced by cloud contamination and atmospheric variability. Previous studies used the wavelet transform to smooth time-series vegetation index data for better identification of crop phenological stages (Sakamoto et al., 2005). We applied the wavelet transform to remove this noise.

Daymet
The Daymet dataset provides gridded estimates of daily weather parameters for North America (Thornton et al., 2014). The spatial resolution of Daymet is 1 km. We used daily surfaces of minimum and maximum temperature, precipitation, humidity, shortwave radiation, and snow water equivalent as the input data of the corn yield estimation models.

Methodology
Figure 4 shows a flow diagram of our methodology. We prepared two types of input datasets that differ in duration and used several machine learning algorithms to evaluate each estimation method.

Smoothed MODIS EVI
Wavelet shrinkage is a nonlinear method (Donoho, 1995) comprising three steps: (1) compute the wavelet coefficients from the original signals; (2) replace the coefficients with 0 if their absolute values are smaller than the threshold; and (3) reconstruct the signals by using the inverse wavelet transform. This method is often used for data compression and signal denoising (Aggarwal and Rathore, 2011). The signal, f(x), is transformed by the wavelet transform as in equation (2). In this study, Coiflet 2 was used as the mother wavelet function. We used the hard thresholding method (Donoho, 1995) to remove noise from MODIS EVI. The threshold is calculated by equations (3-5). Under the hard threshold, a wavelet coefficient is set to 0 if its absolute value is smaller than the threshold and is otherwise left unchanged.
Figure 5 shows time-series MODIS EVI and EVI smoothed by the hard threshold method. Figure 6 shows a map of smoothed EVI across the United States.
Data preprocessing was essential to normalize the input data. In this study, the standard score was applied for normalization and computed by using equation (6):
$$z_t = \frac{x_t - \mu_t}{\sigma_t} \qquad (6)$$
where x_t = accumulation value during period t, µt = mean of the accumulation value during period t, and σt = standard deviation of the accumulation value during period t. The standard score converts a group of data to a distribution with a mean of 0 and a standard deviation of 1.
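The thresholding step of the wavelet shrinkage described in this section can be sketched as follows. This is a minimal illustration, not the implementation used in this study. It assumes the universal threshold commonly attributed to Donoho, σ·sqrt(2 ln n), with σ estimated as MAD/0.6745 from the detail coefficients. The constant 0.6745, the function names, and the sample coefficients are assumptions, and the Coiflet 2 decomposition and reconstruction (steps 1 and 3) are left to a wavelet library.

```python
import math
from statistics import median


def universal_threshold(detail_coeffs):
    """Universal threshold sigma * sqrt(2 ln n), with sigma estimated from the MAD."""
    n = len(detail_coeffs)
    # MAD taken as the median of |coefficients| (assumes a zero-median detail band).
    mad = median(abs(c) for c in detail_coeffs)
    sigma = mad / 0.6745  # assumed consistency constant for Gaussian noise
    return sigma * math.sqrt(2.0 * math.log(n))


def hard_threshold(coeffs, threshold):
    """Set coefficients whose magnitude is below the threshold to 0; keep the rest."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


# Hypothetical detail coefficients: small noise terms plus two large signal terms.
coeffs = [0.02, -0.01, 0.03, -0.02, 0.90, 0.01, -0.75, 0.02]
lam = universal_threshold(coeffs)
denoised = hard_threshold(coeffs, lam)
print(lam, denoised)
```

Hard thresholding leaves the surviving coefficients unchanged, so large seasonal signals keep their amplitude while small fluctuations are removed.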
Daymet contains daily meteorological data, and MODIS EVI is interpolated into daily data by wavelet smoothing. Because applying the standard score requires calculating the mean and variance of period t (equation (6)), we used two input datasets: a daily dataset and a 5-day accumulation dataset. For the daily dataset, we calculated the mean and variance of date t for 5 years (2008-2013). For the 5-day accumulation dataset, we calculated the mean and variance of every 5 days from January 1 to December 31 for 5 years. The daily input dataset has 2552 dimensions, and the 5-day accumulation input dataset has 512.
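The preparation of the 5-day accumulation dataset and its normalization by the standard score (equation (6)) can be sketched as below. This is an illustration under stated assumptions, not the preprocessing code of this study: accumulation is assumed to be a sum over consecutive 5-day blocks, an incomplete final block is dropped, the population standard deviation is used, and the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def accumulate(daily, window=5):
    """Sum consecutive `window`-day blocks; an incomplete tail block is dropped."""
    usable = len(daily) - len(daily) % window
    return [sum(daily[i:i + window]) for i in range(0, usable, window)]


def standard_scores(values_by_year):
    """Standardize each period t with the mean and spread of period t over all years."""
    n_periods = len(values_by_year[0])
    scored = [[0.0] * n_periods for _ in values_by_year]
    for t in range(n_periods):
        column = [year[t] for year in values_by_year]
        mu, sigma = mean(column), pstdev(column)
        for y, value in enumerate(column):
            # A constant period carries no information; map it to 0 (assumption).
            scored[y][t] = (value - mu) / sigma if sigma > 0 else 0.0
    return scored


# Hypothetical daily precipitation (mm) for two years of ten days each.
years = [[1.0] * 10, [3.0] * 5 + [1.0] * 5]
periods = [accumulate(year) for year in years]
print(standard_scores(periods))
```

Standardizing per period rather than over the whole year removes the seasonal cycle from each variable, so the model sees anomalies relative to the typical value for that time of year.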

Masking the input data
Using the three datasets together is problematic because the spatial resolutions of Daymet and MODIS EVI are 1 km and 500 m, respectively, both finer than that of the county-level corn yield (Figure 2). To extract data on cornfields, we resampled MODIS EVI and CDL to 1 km resolution by nearest-neighbour resampling to match the resolution of Daymet, and calculated the mean value of each variable over the cornfields in every county. The extent of cornfields was identified from the CDL (Figure 3).
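The masking and county-level averaging described above can be sketched on small in-memory grids. This is a simplified illustration, not the geoprocessing used in this study: the grids are assumed to be perfectly aligned, nearest-neighbour resampling is approximated by keeping one source pixel per 2 x 2 block, and the function names, county identifiers, and values are hypothetical.

```python
def resample_nearest(grid, factor=2):
    """Coarsen a grid by keeping one source pixel per factor x factor block."""
    return [row[::factor] for row in grid[::factor]]


def county_means(values, corn_mask, county_ids):
    """Mean of `values` over cornfield pixels, grouped by county identifier."""
    sums, counts = {}, {}
    for value_row, mask_row, id_row in zip(values, corn_mask, county_ids):
        for value, is_corn, county in zip(value_row, mask_row, id_row):
            if is_corn:
                sums[county] = sums.get(county, 0.0) + value
                counts[county] = counts.get(county, 0) + 1
    return {county: sums[county] / counts[county] for county in sums}


# Hypothetical 500 m EVI grid (4 x 4) coarsened to a 1 km grid (2 x 2).
evi_500m = [
    [0.25, 0.30, 0.50, 0.55],
    [0.20, 0.35, 0.45, 0.60],
    [0.75, 0.70, 1.00, 0.95],
    [0.80, 0.65, 0.90, 0.85],
]
evi_1km = resample_nearest(evi_500m)      # [[0.25, 0.50], [0.75, 1.00]]
corn_mask = [[1, 0], [1, 1]]              # 1 = cornfield pixel in the CDL
county_ids = [[17, 17], [17, 19]]         # hypothetical county codes
print(county_means(evi_1km, corn_mask, county_ids))
```

Counties without any cornfield pixel simply do not appear in the result, which makes missing inputs explicit instead of silently producing a value.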

Support Vector Machine
The SVM (Vapnik, 1995) is a supervised nonparametric statistical learning algorithm. SVM has been used in numerous applications for aerial-satellite remote sensing (Mountrakis et al., 2011), such as estimating vegetation characteristics as a regression problem. We used the radial basis function as the kernel function and optimized the hyperparameters with a grid search.
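The two ingredients named above, the radial basis function kernel and an exhaustive grid search, can be sketched independently of any particular SVM library. The hyperparameter names C and gamma, the candidate values, and the scoring function are assumptions for illustration only. In practice the scoring function would train an SVM with the given hyperparameters and return a validation error; here a toy stand-in is used so that the sketch is self-contained.

```python
import math
from itertools import product


def rbf_kernel(x, z, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def grid_search(score, grid):
    """Evaluate every parameter combination and return the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        current = score(**params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


def toy_validation_error(C, gamma):
    """Stand-in for 'train with (C, gamma) and return the validation RMSE'."""
    return (math.log10(C) - 1.0) ** 2 + (math.log10(gamma) + 2.0) ** 2


# Hypothetical search grid on a logarithmic scale.
grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]}
best_params, best_score = grid_search(toy_validation_error, grid)
print(best_params, best_score)
```

A logarithmic grid is a common choice because the useful ranges of such hyperparameters usually span several orders of magnitude.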

Artificial Neural Network
Remotely sensed data have been used to develop crop yield estimation models with ANN (Jiang et al., 2004; Li et al., 2007). ANN is a computational model that imitates the neural network of the brain. It is expected to model strongly nonlinear relationships between different parameters and crop production.
DL is a new machine learning method built on a multi-layered neural network. Researchers using DL have had great success in image recognition and other complex problems that are difficult to solve with earlier methods (Le et al., 2011).
We developed deep neural network (DNN) models with six hidden layers (Figure 7); each hidden layer contains 4000 neurons. We also developed a corn yield estimation model by using an autoencoder, which is a neural network that has a small central layer (Figure 8). This small central layer is trained to reconstruct a high-dimensional input vector and is used to represent the more important features. This method is called pre-training and provides a great advantage by beginning the computation with better parameters. Pre-training was done before the actual computation.
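To give a sense of the size of the DNN described above, the number of trainable parameters of a fully connected network can be counted from its layer widths. The sketch below uses the six hidden layers of 4000 neurons and the two input dimensions stated in the text; the single output neuron and the presence of one bias per neuron are assumptions, and the function name is hypothetical.

```python
def dense_parameter_count(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


HIDDEN = [4000] * 6   # six hidden layers of 4000 neurons (from the text)
OUTPUT = 1            # assumed: one regression output for corn yield

for name, n_inputs in [("daily", 2552), ("5-day accumulation", 512)]:
    sizes = [n_inputs] + HIDDEN + [OUTPUT]
    print(name, dense_parameter_count(sizes))
```

Under these assumptions the network has tens of millions of parameters, far more than the number of training samples, which is one reason a good starting point for the parameters, such as that provided by pre-training, can matter.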
We used the mini-batch method (Cotter et al., 2011) to train the ANN models.
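The mini-batch method updates the model parameters on small random subsets of the training data instead of on the full dataset at once. A minimal sketch of how such batches can be formed is given below; the batch size of 100, the fixed seed, and the function name are assumptions, since the settings used in this study are not stated.

```python
import random


def minibatches(n_samples, batch_size, seed=0):
    """Yield shuffled index lists that together cover every sample once per epoch."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    for start in range(0, n_samples, batch_size):
        yield indices[start:start + batch_size]


# 7800 training samples (from the text) split into hypothetical batches of 100.
batches = list(minibatches(n_samples=7800, batch_size=100))
print(len(batches), len(batches[0]))
```

Each yielded index list would select the rows of the input matrix and the matching yield values for one parameter update.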

Evaluation
The performance of each corn yield estimation model developed by SVM and ANN was evaluated with the root mean square error (RMSE) and the coefficient of determination (R²). In this study, the total number of data was 9676. Approximately 80% of the dataset (7800) was used for training, and the remainder was used for testing.
The RMSE provides a general-purpose error metric for numerical predictions. RMSE is calculated by using equation (7):
$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \qquad (7)$$
where N = sample number of the test dataset, y_i = actual corn yield data acquired from the USDA, and ŷ_i = estimated corn yield.
R² is a measure of how well a model fits a dataset; we calculated it by using equation (8):
$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2} \qquad (8)$$
where ȳ = mean of corn yield.
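The two evaluation metrics can be computed as in the sketch below. It assumes the usual definitions, RMSE as the square root of the mean squared difference between actual and estimated yield, and R² as one minus the ratio of the residual sum of squares to the total sum of squares about the mean yield. The function names and the sample yields (bushels/acre) are hypothetical.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between actual and estimated values."""
    n = len(actual)
    return math.sqrt(sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated)) / n)


def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(actual) / len(actual)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated))
    ss_tot = sum((y - y_bar) ** 2 for y in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical county yields (bushels/acre) and model estimates.
actual = [150.0, 160.0, 170.0, 180.0]
estimated = [152.0, 158.0, 171.0, 177.0]
print(rmse(actual, estimated), r_squared(actual, estimated))
```

RMSE is expressed in the units of the yield, whereas R² is unitless, so the two metrics are complementary when comparing models.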

RESULTS
The corn yield estimation model developed by SVM with the 5-day accumulation input dataset had the best accuracy in this study (Figure 9, Table 2). However, when the SVM model was trained with the daily input dataset, the accuracy was worse. In contrast, the estimation model trained by an autoencoder with the daily input dataset had better accuracy than the autoencoder model with the 5-day accumulation input dataset (Figure 10). For DNN (the ANN with six hidden layers), the model trained with the 5-day accumulation input dataset had higher accuracy than the model trained with the daily input dataset (Figure 11), but the result with the daily input dataset was better than that of the estimation model trained by SVM.
Corn yield estimation models trained by ANN allowed us to extract significant features from high-dimensional data. In addition, the method of designing the input data affected the accuracy of the models developed by machine learning algorithms. For counties that had small or few cornfields in the cropland data layer, the difference between the actual and estimated corn yield exceeded 50 bushels/acre. This difference was smaller for the major corn-growing areas.

CONCLUSION
Our results showed that accurate estimation of crop yield across a wide area is possible by using machine learning and remote sensing data.
Corn yield estimation models trained by DNN and an autoencoder better extracted features from the high-dimensional input data. Therefore, DNN and autoencoders are promising methods for improving the accuracy of crop yield estimation by integrating more data, such as soil properties, irrigation, and fertilization, into the input dataset.
In recent years, the digitalization of agricultural information has advanced, and various types of agricultural data are stored. These datasets can be utilized for estimation models with machine learning. However, standardization of information is one of the greatest challenges, particularly the linkage with legacy data and systems developed in existing and future research. Agricultural ontologies are expected to promote integration among different systems and data (Nagai et al., 2014).
Crop yield estimation at county level is not sufficient for estimating crop yield at smaller scales. In this study, the spatial resolution of the input data was 1 km, which is finer than the spatial resolution of the target data. By using the crop yield estimation model developed in this study, we can downscale the estimated crop yield by inputting the data at 1 km spatial resolution.

Figure 4: Flow diagram of methodology used in this study
Symbols in equations (3-5): σ = variance of noise; n = sample number of the signals; MAD = median absolute deviation

Figure 5: MODIS EVI and EVI smoothed by the wavelet transform

Figure 7: Diagram of the deep neural network used in this study

Figure 8: Diagram of the autoencoder used in this study

Table 1: The dataset in this study

Table 2: Estimation results with several models