ASSESSMENT OF CROPPING SYSTEM DIVERSITY IN THE FERGANA VALLEY THROUGH IMAGE FUSION OF LANDSAT 8 AND SENTINEL-1

In the transitioning agricultural societies of the world, food security is an essential element of livelihood and economic development with the agricultural sector very often being the major employment factor and income source. Rapid population growth, urbanization, pollution, desertification, soil degradation and climate change pose a variety of threats to a sustainable agricultural development and can be expressed as agricultural vulnerability components. Diverse cropping patterns may help to adapt the agricultural systems to those hazards in terms of increasing the potential yield and resilience to water scarcity. Thus, the quantification of crop diversity using indices like the Simpson Index of Diversity (SID) e.g. through freely available remote sensing data becomes a very important issue. This however requires accurate land use classifications. In this study, the focus is set on the cropping system diversity of garden plots, summer crop fields and orchard plots which are the prevalent agricultural systems in the test area of the Fergana Valley in Uzbekistan. In order to improve the accuracy of land use classification algorithms with low or medium resolution data, a novel processing chain through the hitherto unique fusion of optical and SAR data from the Landsat 8 and Sentinel-1 platforms is proposed. The combination of both sensors is intended to enhance the object ́s textural and spectral signature rather than just to enhance the spatial context through pansharpening. It could be concluded that the Ehlers fusion algorithm gave the most suitable results. Based on the derived image fusion different object-based image classification algorithms such as SVM, Naïve Bayesian and Random Forest were evaluated whereby the latter one achieved the highest classification accuracy. Subsequently, the SID was applied to measure the diversification of the three main cropping systems.


Agricultural vulnerability and cropping systems
As a very dynamic system, irrigation agriculture in arid regions is exposed to many threats such as droughts, soil degradation, soil salinization, water scarcity and population growth.Many developing regions of the world are relying on a single cash crop type which fosters the economic dependency on the world market and decreases the local crop production that ensures food security.Monocropping has always been a negative contributor to a rural population´s response capability to droughts or unexpected climate conditions.An important indicator for the performance of an agricultural system and the adaptivity of an agricultural society is the measure of cropping systems diversity.Each cropping system bears different characteristics to adapt to a certain threat and diversified cropping systems can decrease the level of vulnerability to the above mentioned threats (Jackson et al., 2012).Furthermore, the application of different cropping types will contribute to an increased land use sustainability and decrease the economic risks of the local population through diversified yields, especially through orchard-based agroforestry (Bühlmann, 2006) where the combination of crop types within the same farming plot is made possible (e.g.fruit trees and wheat).The irrigated lowlands of Central Asia belong to those regions substantially challenged by the aforementioned threats (Orlovsky et al., 2005).Especially population growth and land degradation put a high pressure on natural land and water resources.However, despite high demands for sustainable planning reliable data which would allow the analysis of the cropping system are absent.Remote sensing can support the * Corresponding author establishment of data bases in irrigation systems of Central Asia.For instance, the identification of different major crop types covering extensive fields has been conducted through NDVI time series from different data sources (e.g.RapidEye, ASTER, SPOT 5) (Conrad et al., 2010(Conrad et al., , 2013)).However, only very few studies have been done to assess the land use fraction of smallholding farming parcels, garden plots within populated areas as well as larger orchard plantations that contribute to crop diversity.This paper focuses on the assessment of diversity of cropping systems that are prevalent in the Fergana Valley of Uzbekistan.Moreover, by fusing optical (Landsat 8) and SAR (Sentinel-1) data it will demonstrate the extraction of the three main cropping systems: orchard plots, garden plots and summer crop fields which were the dominant crop type during the data acquisition.The major crops that are amalgamated within the superclass "summer crops" are cotton and maize.At the time of preparation of this paper, no approach of Image Fusion of data from the Landsat satellites and Sentinel-1 has been known hitherto.

Study area
Figure 1.Three cropping systems: orchards (left), garden plots (middle) and summer crop fields (right).Source: Google Earth The study area is located within the surroundings of the alluvial fan south of the city of Kokand (districts Uchkuprik, Baghdad and Rishton) in the southern part of the Fergana Valley which is located between eastern Uzbekistan, northern Tajikistan, and western Kyrgyzstan, characterized by a semi-arid climate and a rapidly growing population of about 14 Mio people (as of 2015).
Agriculture is the main source of employment, the key economic sector and is of existential importance for the people in this region.Major crops and land use systems such as winter wheat, cotton, rice, melons and orchards containing apricot and apple trees were identified by applied remote sensing-based monitoring (Löw et al., 2015).Due to arid climate conditions the intensive agricultural land use is only possible by irrigation, which in turn consumes considerable amounts of freshwater from external sources, i.e. from the Naryn-Syrdarya river system.

Data used
The input data comprised a Sentinel-1 image and a Landsat 8 (Operational Land Imager) image, both acquired on August 8 th 2015.The SAR image consists of a single C-band (wavelength λ=5.54cm), containing vertical signal polarization in Interferometric Wide Swath Mode (IW).This mode implicates a spatial resolution of 10m per pixel.The data is provided as a Ground Range Detected (GRD) product, multi-looked and projected to ground range using an Earth ellipsoid model.The Landsat 8 surface reflectance product was selected.It is atmospherically corrected and contains 7 out of the 11 bands as shown in Table 1.

Bands
Wavelength λ in nm

Radiometric Calibration
Pre-processing of SAR imagery takes a very important part because the aim is to derive the actual intensity of the emitted microwave signal that is received by the sensor.It can be expressed as the intensity image of the received signal.A conversion of the intensity values to Sigma Nought σ0 values is necessary to derive the radiometrically calibrated backscattering coefficient of the reflected surface (Furuta et al., 2011).

Speckle filtering
Although radar imagery is almost entirely non-affected by cloud coverage and atmospheric conditions, it is very influenced by speckle noise.The presence of speckle can be advantageous, when the detection of specific materials and surfaces can be supported by the analysis of noise patterns and disturbances.In most cases however, speckle noise is desired to be reduced through filtering to derive a more readable information from the SAR image.Different image filtering methods have been developed and tested on a variety of data in the past 25 years such as Mean, Lee, Frost and Gamma, Kalman and Kuan filter algorithms (Huang et al., 1996).They can be described as the standard smoothing filters due to their relatively low complexity.
In most cases, a speckle reduction method can be described as a convolutional morphologic filter that reduces the image contrast and the high frequency information, performed by a moving window, e.g. by a 3x3 kernel.For the analysis of the Sentinel-1 image in this paper, the Gamma filter was chosen to be the most efficient as it reduces speckle while preserving object edges as proposed by other authors (Gagnon et al., 1997).It is defined as a Maximum a Posteriori (MAP) adaptive filter based on Bayesian analysis of the image statistics by assuming a Gamma distribution of the input data.
Figure 2. Calibrated Sentinel-1 image (upper left) and Gamma MAP Speckle Filter with different kernel sizes: 3x3, 7x7 and 11x11

Terrain correction
The topographic correction and image registration of a radiometrically calibrated SAR image is converted from the Ground Range Geometry into a coordinate system.Geometry distortion effects such as foreshortening and layover are rectified through a SRTM 3 arc seconds digital elevation model using the Sentinel-1 Toolbox.Artefacts however, remain in the interpolated areas which is not illuminated by the sensor.The image is then co-registered into WGS 84 / UTM zone 42N coordinate system, the same as the Landsat image.

METHODOLOGY
The basic idea of the methodology is the application of open source data from the Landsat 8 and Sentinel-1 earth observation satellites.Image fusion is performed to enhance the spatial resolution of the multispectral Landsat image and to test the ingestion of the SAR backscatter into the optical data in order to derive more detailed land use information with a spatial resolution of 10m per pixel.Four different classifiers are tested for the optimum classification accuracy of the three cropping systems (winter-summer crops, orchards, and smallholder garden plots) by applying pixel-and object-based image analysis.This study investigates if image fusion increases the classification accuracy significantly.Smaller garden plots are expected to be delineated accurately through image segmentation.Finally, the Simpson Index of Diversity (SID) is calculated to assess the cropping system diversity accordingly.

Image Fusion
This paper demonstrates one of the first experiments to fuse Sentinel-1 and Landsat 8 data.As both images were acquired on the same date with the same conditions, changes of the backscatter signal, due to changed climate conditions, soil moisture fluctuations or changes in the surface roughness through plant growth are neglected.This work is based on the idea of the fusion of SAR (AirSAR) and Landsat TM to fuse the imaginary part of the dielectric constant, (Yonghong et al. 2008).The fusion of optical and SAR data enables the separation and identification of surface characteristics not only based on spectral signatures, but also on the dielectric properties and the surface roughness of the observed objects that is expressed through the backscattering values.Combining information from two different sensors enhances the accuracy and radiometric level of information from remote sensing images.The advantage of radar data over multispectral data is demonstrated through textural measures of the different polarization bands (Sarker et al., 2012).The analysis of the SAR backscatter allows the parametrization of the roughness of the different cropping systems.
The multi-layer classification approach by stacking the multispectral bands with the SAR backscatter band was neglected due to two reasons: (1) the enhancement of the spatial resolution from 30m to 10m was desired to improve the object-based edge detection between agricultural parcels, thus an image fusion approach seemed more logical and (2) as the data comes from two different sensors with different measurement scales and variation, the normalization of the optical and SAR data ranges was preferred to be avoided, in order to retain the original data, (Tso et al., 2009).Image fusion techniques have been researched and developed for a variety of applications to increase the retrievable information and to sharpen lower resolution.During our experiments we have tested different algorithms against each other such as the Ehlers method, Principal Component Analysis (PCA), Discrete Wavelet Transform (DWT) and Gram-Schmidt spectral sharpening (GS) algorithms because of their higher performance as described by other authors (Klonus and Ehlers, 2008;Abdikan, 2012).
Pansharpening is often applied by enhancing the spatial resolution of multispectral images with a single pan channel.One of the most frequently used image fusion techniques in Remote Sensing is the IHS transformation method that transforms a high resolution greyscale image into the IHS color space (Intensity, Hue, Saturation) of the multispectral image.The common IHS Transform algorithm combines both images, but assumes a spectral overlap and similarity between the sensor typesit is therefore not suitable to merge SAR and optical data.Particular focus has been laid on the Ehlers Fusion technique as it is expected to achieve the highest correlation between the original multispectral image and the fused image with the lowest RMS error.This method is based on an IHS transform but it retains the multispectral information of all bands of the input image.It does not include the spectral information of the higher resolution image (in this case the backscatter signal of the SAR image), but it ingests the roughness and backscatter values as a textural information.The Ehlers algorithm transforms all bands of the Landsat image to the IHS color space which allows to separate it into its spatial (I) and spectral features (H and S).The intensity component is transformed further into the frequency space through a Fast Fourier Transform (FFT).The SAR image is now adapted into the greyscale range of the intensity component and transformed through FFT as well and filtered through a High Pass Filter in order to enhance detailed structures.
In parallel, the intensity image is filtered through a Low Pass Filter which retains the spectral information.Both components are added and adapted at intensity levels which replace the original intensity channel.Finally, all three channels are transformed back to the RGB space.This process is iterated for all remaining multispectral bands (Klonus and Rosso, 2008).Through the DWT method, the SAR image is decomposed into its low and high frequency components by keeping the geographic information.It is resampled to the resolution of the Landsat image.Three bands of the latter one (near infrared, red and green) are converted into the IHS color space and then reconstructed by replacing the low frequency portions of the wavelet by the multispectral bands (Lemeshewsky, 1999).
The PCA is a dimensionality reduction that orthogonally transforms the variance of all input bands (7 Landsat + 1 SAR) into the main image components where the first component contains the majority of variance and all subsequent bands contain the diminished, uncorrelated information with decreasing common variance.The Eigenvalue of the first component contains the variance of the SAR backscatter and the multispectral Landsat brightness values.The GS method computes a synthetic lower resolution SAR band by weighted averaging of the seven Landsat bands.The bands are then decorrelated through vector orthogonalization.The synthetic SAR band is now replaced by the gain and bias adjusted SAR band and all Landsat bands are up-sampled accordingly (Maurer, 2013).

Multiresolution Image Segmentation
Image segmentation is a process of identifying and grouping homogeneous pixels into real-world objects by retaining object edges.By contrast, a pixel-based approach leads to so called salt and pepper effects with a noisy image representation.
To maximize the expected classification accuracy, the attribution of the segmented objects contains spectral, topological, geometrical and textural parameters that are used as input for the classification.The segmentation has been performed via the image processing software eCognition.A variety of spectral, textural and geometric attributes can be extracted for each segment as shown in

Classification of cropping systems
In this paper the authors took advantage of the Scikit-Learn module for Machine Learning that is integrated in an open source Python programming environment.Four supervised classification algorithms were tested in order to identify the most accurate method to extract three different cropping system classes (summer crop fields, gardens and orchards): K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Random Forest (RF) and Naïve Bayesian (NB).150 training samples have been selected for the region of interest with the following class labels: garden plots, summer crops fields, orchards, urban, bare soil and water.The image object classification was performed based on the segmentation results.The KNN classification algorithm has found a wide application range in Remote Sensing due to its simplicity.It is a non-parametric technique that classifies an object by the majority vote of its neighbours with the class being assigned to the class most common amongst its K-nearest neighbours measured by a distance function (Fix et el., 1951).A drawback of this method is the very high sensitivity to weak training data.The SVM classifier is based on the notion of separating classes in a higher dimensional features space by fitting on optimal separating hyperplane between them, focusing on those training samples that lie at the edge of the class distributions, the so-called support vectors (Foody et al. 2004).Random Forest was introduced in 2001 by Breiman as an enhanced Ensemble learning method.A forest of randomized trees is a random composition of single decision trees that try to find the best fit between the original and the sampling data.By contrast to other Ensemble methods, each node of a tree is split using the best among a subset of predictors randomly chosen at that node rather than choosing the best split among all predictors.Each tree in the ensemble is built from a sample drawn with replacement (bootstrap) from the training set.In addition, when splitting a node during the construction of the tree, the split that is chosen is no longer the best split among all features.Instead, the split that is picked is the best split among a random subset of the features.As a result of this randomness, the bias of the forest usually slightly increases but, due to averaging, its variance also decreases, usually more than compensating for the increase in bias, hence yielding an overall better model.In addition, the algorithm is also robust against overfitting (Breiman 2001).The number of trees that usually achieve good results and are considered adequate is 500 (Gislason et al., 2006).The NB method is based on applying Bayes' theorem with the "naïve" assumption of independence between every pair of features.It is a probabilistic approach and requires a small amount of samples to estimate the necessary parameters and is very fast compared to more complex methods.In Bayesian analysis a prior knowledge or prior probability is needed for training.Each pixel is then assessed according to its likelihood to belong to a certain class.The pixels with the highest probability score are chosen for the final classification decision (Murphy 2006).In this approach, the Gaussian Naïve Bayes has been used, assuming the likelihood of the features to be Gaussian.

Cropping systems diversity
The Simpson Index marks one of the oldest means for the assessment of diversity, first presented in 1949 in Nature, and has since been commonly used in diversity research.Originally used in biology, it displays diversity as the simple concept of the chance to draw from a population two individuals of the same species.It ranges from 1 to 1/z, where z is the number of species present in the area (Simpson, 1949).In the case of this case study, z equals the number of the different cropping systems.For this purpose, the extended version of the index is calculated by subtracting the value from its maximum of 1 (Peet, 1974): Accordingly, a value of 0 within a given area (cell in a regular grid of 1x1km) indicates the minimum possible diversity: all agricultural land (pi) is used by a single cropping system only.An index value approaching 1 on the other hand indicates a highly diverse area where the available land is more evenly distributed between all z appearing cropping systems.

Image fusion quality assessment
As it can be seen in Figure 3, additional experiments were undertaken with 3 other image fusion techniques to examine the most suitable method: Discrete Wavelet Transform (DWT), Principal Component Analysis (PCA) and Gram-Schmidt spectral sharpening (GS).The evaluation of the image fusion results in the following subsections has shown that the data fused by the Ehlers algorithm gave the best results.(1) A variety of methods to assess the spatial and spectral quality of the fused images exist as demonstrated in previous research works (Yakhdani et al., 2010).Specifically, the spectral correlation, spectral discrepancy and textural measures such as entropy and homogeneity are assessed.The spectral quality is assured by retaining the intensity information of the SAR image while the multispectral information is kept.

Spectral correlation:
The estimation accuracy of the spectral values between the fused and original image is given through the correlation coefficient r², indicating the quality of the fit between two compared datasets.A higher value indicates spectral similarity between both images.As shown in Figure 4, the highest score for spectral correlation is achieved through the Ehlers Fusion.The GLCM (Grey Level Co-Occurrence Matrix) is one of the most popular methods to analyse texture in an image.The matrix defines the probability that grey level i occurs at a specific distance and direction from grey level j in the texture image (Parekh 2011).Textural analysis between the original and fused images is needed to assess the change of the spectral information and thus, to identify if the fused image bears additional spectral information through the ingestion of the SAR backscatter.
Entropy measures: Entropy ε is a measure of disorder and randomness within an image segment.It is defined by the following formula:  = − ∑ (, )log ((, )) , A higher entropy in the fused image reveals that the level of the spectral detail has increased.Homogeneity measures: Homogeneity measures the uniformity of the image segments, meaning that there are a lot of pixels with the same or very similar grey level value.The larger the changes in grey values, the lower the homogeneity value (Gebejes et al., 2013).It is defined by the following formula: The GS method yields the lowest homogeneity scores, whereas the PCA achieves the highest ones.At the same time however, it has a very low entropy value which may lead to information loss.Conversely, a higher entropy and a medium homogeneity scores as proven through the Ehlers Fusion method is preferred.Too low homogeneity scores as performed by the GS method, may result in image noise and the capability reduction to detect image object edges such as parcel boundaries.This may affect the image segmentation and feature extraction procedures negatively. (2) (3) Spectral discrepancy: Considering the results in tables 3 and 4 it seems that the Gram-Schmidt method gave the best results with increased entropy and decreased homogeneity in relation to the original image, meaning a gain in textural and spectral information.An additional quality measurement, the spectral discrepancy, however, reveals that the GS fusion is most suitable due to the relative greater spectral similarity and higher fidelity to the original image.The Ehlers method on the other side is indicated by a higher spectral correlation coefficient score, by still retaining high entropy and homogeneity.It also achieves a stretching between the mean values of the individual bands, which leads to a higher spectral information gain and higher feature separability.Table 5. Spectral discrepancy between Ehlers, DWT and GS fusion results according to three selected bands for analysis (green, red and near-infrared).Table 6.Mean Bias comparison between the Ehlers and GS fusion method with the GS scoring best.
The Universal Quality Index (UQI) measures the amount of transformation of relevant data from the Landsat image into the fused image.The optimum value is close to 1 and indicates similarity (Alparone et al., 2008): where σLandσF are the standard deviations of the original and fused image respectively.
Image UQI Ehlers 0.28 GS 0.43 Table 7. Universal Quality Index comparison between the Ehlers and GS fusion method with the GS scoring best.
The quality assessment demonstrated that the GS method yielded the best spectral similarity scores whereas the Ehlers method yielded the best spectral correlation scores.The latter one also achieves a better balance between descriptive textural measures such as entropy and homogeneity.

Object-based image classification
The segmentation was performed using the multiresolution segmentation algorithm through the software eCognition.The edge detection within the original Landsat 8 image performs very well.The recognition of object borders in the fused image however is more detailed due to the higher spatial resolution, allowing for a better delineation of agricultural plots.Class colours: light green (garden plots), green (summer crop fields), dark green (orchards), red (urban), grey (other).

Classification accuracy assessment
The (5) yielded second best scores whereas the overall classification accuracy of the DWT was the lowest, similar to the results of the PCA-based fusion.In this case, the PCA method would be preferred over the DWT approach due to its higher performance and less complex algorithm setup.The overall accuracy was assessed by point stratification with 20 points for 3 classes each (garden plots, summer crops fields, orchards).The overall accuracies of the classifications are shown in Garden plots usually contain a variety of crop types including the same type of fruit trees that are grown within the larger orchard plots and therefore difficult to separate through their spectral signatures.Very often, intercropping in the orchards and urban gardens takes place, even with sub-tree cultivation.Due to mixed pixel values that are composed of the scattering characteristics of the Sentinel-1 C-band within the tree canopy, diffuse scattering from the ground between fruit trees and the corner reflectors from nearby buildings or urban features that on the other hand affect the urban garden plots the class separability towards the larger orchard plantations is increased during the classification of the fused image.Of course, geometric attributes such as shape and area support the class separation between the larger orchards and smaller urban gardens, in many cases however, the image segment size is not crucial.

CROPPING SYSTEM DIVERSITY
With a mean SID of 0.65, the examined research area has a relatively high cropping systems diversity.This undermines the situation within the Fergana Valley that faces a high population density and at the same time must meet the demands for a higher crop diversification that is herein addressed as cropping systems diversity.Due to the larger summer crop fields that are a result of the Soviet agricultural planning system, the most suitable integration of the SID has proven to be through regular grid of 1x1km.Garden plots account for 45,72% of the agricultural acreage in the examined scene, meaning a high subsistence farming rate within the rural population as well as a high urbanized agricultural structure.Table 9. Mean SID and cropping system acreage in %, calculated for the region of interest.

CONCLUSION
Sentinel-1 SAR and Landsat 8 optical data have been fused successfully in order to derive three different cropping system classes and to assess their diversification in the test area of the Fergana Valley.The Ehlers Fusion has been chosen as the one with the best fusion performance according to the fusion quality and classification accuracy assessment with the Random Forest classification algorithm outperforming other classifiers.The exact delineation of objects on parcel level is a very difficult issue, even after enhancing the spatial resolution to 10m after the image fusion.A detailed mapping of crop types remains therefore challenging.Further research needs to be done on the performance of the classification algorithms and the amount of training data.However, combined with time series analysis, especially the derivation of NDVI throughout the year, could give an exact information of the phenological processes for each cropping system.Furthermore, a time series could give information about water scarcity, irrigation patterns and vegetation condition.This supports the understanding of agricultural vulnerability patterns.The SID was calculated and gave the conclusion that a high cropping system diversity exists in the region, with a high share of garden plots, especially in the peripheral areas of the rural settlements and towns.Those urban gardens are mainly used for subsistence farming and require a high amount of the available water for irrigation due to perennial usage.The delineation of garden plots in semi-arid areas such as the Fergana Valley with non-existing cadastral data is of major importance in order to estimate water use efficiency and food security indicators.Further research is needed to generate a largescale land use database for the whole Fergana Valley.Altogether, this analysis indicates the possibilities of utilizing combined optical and SAR data to better understand land use patterns and cropping system diversity in heterogeneous irrigation systems of Central Asia.

Figure 4 .
Figure 4. Scatter plots and correlation coefficients for three image fusion results referring to the original image.The Ehlers fusion method yields the highest r² score.

Figure 6 .
Figure 6.Image segmentation experiments with the original Landsat data (left) and the Ehlers Fusion image (right).

Figure 7 .
Figure 7. Results from the Random Forest classification of the Ehlers fusion image (left) and the original Landsat image (right).Class colours: light green (garden plots), green (summer crop fields), dark green (orchards), red (urban), grey (other).
evaluation of the different classifications of different image fusion datasets was that the classification of the Ehlers Fusion image through a Random Forest classifier gave the highest score in the overall accuracy.On the other hand, the NB classifier outperformed other methods for the classification of the original Landsat image.The classification results of the GS fusion image

Figure 8 .
Figure 8. Regular grid of 1x1km showing the SID, ranging from red (low) to green (high).Diversity decreases on the scene borders and in proximity to settlements.

Table 1 .
Selected Landsat 8 bands with wavelength values

Table 1 .
Segmentation attributes table

Table 3 .
Table3shows the mean Entropy for all fusion results.Entropy measurements of the image fusion results and the original image.Here, the Gram-Schmidt method achieves the highest entropy value.

Table 4 .
Homogeneity measurements of the image fusion resultsand the original image.The PCA method achieves the highest homogeneity values.

Table 8 .
Overall accuracies of the classification results in %.Best accuracy is achieved by the Random Forest classification of the Ehlers Fusion image.