SENTINEL-2 DATA FOR THE DETERMINATION OF GROUNDWATER WITHDRAWAL IN THE MAGHREB REGION

Agriculture plays an important role in the economy of the Maghreb region. Most of the water needed for irrigation comes from pumping of the aquifers. A controlled pumping of the groundwater resources does not exist yet, thus, estimating the total water consumption for agricultural use only with in situ data is nearly impossible. In order to overcome this lack of information, Copernicus data are used for determining the groundwater withdrawal through agriculture in the Maghreb region. This paper presents an approach for estimating and monitoring crop water requirements in Tunisia based on multitemporal Sentinel-2 data. Using this multitemporal information, a thorough analysis of the different culture types over time is possible, from which a set of additional multitemporal features is deduced for crop type classification. In this paper, the contribution of those features is analyzed, showing a classification accuracy enhanced by 10% with the multitemporal features. Furthermore, relying on existing methods and FAO standards for the estimation of crop water needs, the methodology aims to estimate the specific crop water consumption. The results of the water estimates are validated against delimited areas where estimates of the water consumption are available from the authorities. Finally, as the study is conducted within the framework of an international technical cooperation, the methodology aims to be reproducible and sustainable for local authorities. The particularity of the results presented here is that they are achieved through automatic processing and using exclusively Open Source solutions, deployable on simple workstations.


INTRODUCTION
Water scarcity is an important challenge in several regions of the world.Especially in the Maghreb region, the pressure on the water resources is very high, the principal consumer of water being the agricultural sector (Jacobs et al., 2012).Knowledge of the amount of water used for agriculture is thus of paramount importance for an organized management of the available water resources.
The determination of the water needs involves the determination and monitoring of the specific agricultural practices and crop types of the region, in a high spatial and temporal coverage.Remote sensing, and in particular optical remote sensing, is a tool that has proven its performance for agricultural crop mapping (Baghdadi et al., 2016;Bégué et al., 2018), as it is a cost-effective way to derive large-scale information about diverse agricultural parameters.
A parameter of tremendous importance for water estimation via crop mapping is the crop evapotranspiration, which directly relates to a crop types specific water needs (Allen, 1998).As the water need of a crop varies during the season depending on its specific growing stage, a crop calendar in combination with specific crop coefficients is very helpful to relate crop growing stage and characteristic water need at a specific time of year (Casa et al, 2009).Yet, the respective duration of the different crop growing stages may differ depending on the regions.A multitemporal analysis of the crops based on remote sensing data permits thus to update and adjust the respective growing stages of the different crops, and helps to identify an optimal temporal window for crop classification as well as the optimal number of acquisition dates (Conrad, 2014).
Existing approaches for crop mapping rely mainly on commercial high-resolution satellite data (RapidEye data (Conrad, 2014)) or Open Source medium resolution data (Landsat 7 ETM+ (Casa et al, 2009)).The 30m resolution of the later can be too coarse in specific areas of the Maghreb, where the farmers sometimes cultivate very small fields or only parts of the fields depending on the climatic and financial situations.The launch of the Copernicus Sentinel-2 satellites (June 2015 for Sentinel-2A and March 2017 for Sentinel-2B) permits to achieve a better resolution (up to 10m), whilst providing the data in an open source policy.The repeat cycle (10 days for one satellite, 5 days for the two satellites constellation) allows a very regular monitoring and ensures the availability of enough cloud free images for crop monitoring.First crop mapping approaches using Sentinel-2 data have been performed by (Immitzer et al., 2014), using only the spectral characteristics of the Sentinel-2 bands at a single date.In (Belgiu et al., 2018), crop mapping has been performed using time-weighted dynamic warping.This method analyses the temporal evolution of the different crop types, and uses it as weight for the classification.A dynamic cropland mask distinguishing cropland from other areas (Valero et al., 2016) and a crop type map (Matton et al., 2015) are produced within the Sen2-Agri system (Sen2-Agri, 2017).For both crop masks and crop type mapping, the authors use specific temporal features derived from the NDVI (Normalized Difference Vegetation Index) time series of the Sentinel-2 data.Unfortunately, the system requirements for computation are still not practicable for sustainable application within the technical cooperation.
Diverse methods and equations exist to determinate the evapotranspiration of crops, depending on the available data (Allen, 1998).While (Casa et al, 2009) rely on a simplified equation due to missing meteorological parameters, (Le Page et al., 2012) use a linear relationship between NDVI of MODIS data and the crop specific coefficient.The latter estimation was however performed for broad land cover classes, distinguishing only between yearly and seasonal cultures.
In this paper, an approach for the determination of water needs based on the classification of Sentinel-2 images is presented.Particularly, the temporal characteristics of the different crops are analyzed in order to deduce the most suitable multitemporal features for classification.Two different classification methods are tested, and the classification results using different band combinations are thoroughly compared.From the classification results, the specific water needs of the region are determined.The novelty of this approach resides in the specific adjustment of the crop coefficient using the classification results and the derived temporal features.The particularity of this approach lies in the transferability to other projects and regions of the technical cooperation, where computing resources are limited, as everything was implemented using SNAP, QGIS and a spreadsheet.This paper is structured as follows: first, the test area and the data are described (Section 2); secondly, the adopted methodology for crop mapping and estimation of water needs is outlined (Section 3).The results are presented and discussed in Section 4.

TEST AREA AND DATA
The test area presented in this paper is the plain of Nebhana, situated in the Northeast of Tunisia (Figure 1), situated between the cities of Nadour in the North and Kairouan in the South.This region is characterized by an intensive agriculture.The water for irrigation either comes from the Nebhana dam, through a complex pipeline system, or is pumped directly from the underlying aquifers.Due to long periods of drought, the water supply from the dam is not always guaranteed.The last substantial drought year was 2016, where the water supply from the dam was shut down and is drastically restricted since.Consequently, most of the water used for agricultural irrigation comes from direct pumping of the aquifers.In order to limit the transport distances, most farmer drill a borehole near their agricultural plots.Since only a few of them register the boreholes, the local authorities encounter problems to keep count of the boreholes and particularly of the amount of water pumped.This is where the use of remote sensing adds important information, in order to help the local authorities to evaluate the total amount of water used.For the study, the water needs of the region for two different agricultural seasons are analyzed, i.e. winter 2016/2017 and summer 2017.The winter season in Tunisia lasts generally from October to April, and the summer season from May to September.Important crops during winter are cereals, forages, small vegetables (mostly peas) and tree plantations.In summer, only vegetables, trees and some forages are relevant for the water balance.In order to perform a detailed temporal analysis of the different crop types, monthly ground truth data were acquired, in order to analyze the specific evolution of the different cultures over the region.A total of 357 reference plots are observed and monitored each month, for a total surface of 1221 km² and an agricultural surface of 481 km².As those ground truth data are acquired considering 55 different crop types (in the following subclasses), a few plots only represent each crop type.Therefore, in order to ensure a good learning of the classifier, more data are used for the training as for the validation: 70% of each class is used for training and the other 30% for validation.Free and open, available Copernicus data of Sentinel-2 are used in order to obtain a regular and high temporal coverage of the area for following the evolution of the different crop types.The considered acquisitions and respective ground truth data are listed in Table 1.For the winter season, all available cloud free images have been considered, whereas for the summer season, only about one cloud-free dataset per month is considered -  corresponding to the date of the specific ground truth campaigns -in order to keep the computation time low.As mentioned earlier, especially in summer in this region, less distinct crop types are expected as in winter, making a multi-temporal analysis of the different crop types based on a monthly data rate possible.

METHODOLOGY
The methodological workflow is shown in Figure 2. Considering a stack of Sentinel-2 data for one season, the data are first preprocessed in order to correct atmosphere and relief influence, and a cropland mask is used in order to consider only the agricultural areas for further processing steps (3.1).In a second step, the NDVI is estimated for each dataset (3.2).Based on the acquired ground truth data and the NDVI time series, NDVI profiles for the different crop types are created (3.3).The analysis of the profiles leads to the determination and creation of specific multitemporal features that allow to better distinguish the different crop types from each other (3.4).Those additional features are then used for the land use classification, which aims the differenciation between major crop classes corresponding to different water needs (3.5).Finally, the results of the classification are the input for the crop water requirement estimation (3.6).

Pre-Processing
For time series of multispectral data acquired over a whole year, it is necessary to perform atmospheric correction in order to make them comparable.As this work is focused on the use of Open Source solutions in order to ensure the sustainability of the approach, the Sentinel Application Platform (SNAP) is used for the pre-processing of the data, and more particularly the Sentinel-2 Toolbox and associated plugin Sen2Cor for atmospheric and relief correction.This plugin aims the retrieval of Bottom-of-Atmosphere reflectance values performing corrections of aerosol optical thickness, water vapor retrieval, cirrus correction as well as relief correction using a DEM.Since April 2017, already corrected Sentinel-2 Level 2 data are available for all Europe and the Mediterranean region, but as data from October 2016 onwards are used in this approach, pre-processing is performed for all data with the same algorithm, for sake of unity and better comparability.
In order to focus on the specific water need for agricultural use, the Cropland mask of the ESA CCI land cover -S2 prototype land cover 20m map of Africa 2016 (CCI Land Cover, 2017) is used, which is well defined for the test area (Figure 3).This map was created combining Random Forest and other Machine Learning algorithms.We resampled it at 10m resolution.Another approach would be to calculate the agricultural mask based on longer time series and vegetation statistics.

Vegetation index
In a second step, for each pre-processed data, the NDVI is calculated, the band indicated below are correct for Sentinel-2: (1) Where L is a factor varying from 0 (high vegetation density) to 1 (low vegetation density).Figure 4 shows the difference of the temporal profiles between NDVI and SAVI (L=0.5) for different cereal crops (oat, wheat and barley) during winter season.It is obvious that both indices show similar characteristics for each crop type.A smaller stretching of the values is observed for SAVI, due to the additional factor L. L=0.5 has been set arbitrarily, having no further insight about the real density of vegetation in our area.However, a single factor has the drawback that the vegetation density should be the same over all the area of interest.In our case, the vegetation density can be very heterogeneous, depending on the agricultural practices of the farmers and the varying water availability.In order to develop a sustainable and reproducible approach, we focused in the following on the NDVI.For each date, an NDVI image using SNAP is calculated, yielding a NDVI time series.

NDVI profiles
From the NDVI time series, NDVI profiles are created using the acquired ground truth (Figure 5).Each node of the NDVI profiles corresponds to one acquisition date as shown in Table 1.Five principal classes are considered based on the consolidated 55 ground truth classes: cereals, forages, trees, vegetables and bare soil.Those five classes are identified as corresponding to specific differing water needs following FAO (Allen, 1998).Table 2 shows an overview of the principal subclasses regrouped in those macro-classes (MC) for each season.The subclasses of one macro-class mostly have very similar water needs, during the same period (Table 3).For each macro class, the mean NDVI is calculated using the corresponding ground truth data (Table 1).

Additional multi-temporal features
As it can be a problem to differentiate between crop types considering only their spectral characteristics for one date, it is relevant to define some additional features to increase their separability during classification, based on the analysis of their respective growing stages.Those features can be defined using the temporal behavior of the different crop types (Figure 5).For example, even if cereals and forages look very similar, they have different cycle lengths (Table 3), permitting to distinguish them spectrally over time.The analysis of the created NDVI profiles leads to the derivation of four such temporal features, suitable for our region, resumed in the following.For the sake of sustainability and reproducibility, all features are calculated using simple GIS operations, in QGIS.

3.4.1
Maximum NDVI As from the profiles (Figure 5), the maximum NDVI can be used for differentiating trees from bare soil for both seasons: bare soil has a lower NDVI value as trees.Also the group cereals/forages/vegetables can be distinguished from trees and bare soil in winter as all those three classes show much higher maximum NDVI as trees or bare soil.The maximum NDVI of those seasonal classes is very high, characterizing the date of maturity of the crops.Even there, another distinction can be made, as vegetables present a slightly lower maximum NDVI as forages and cereals.In summer, the NDVI of cereals corresponds to that of bare soil as no cereals are cultivated and the fields are bare.The summer forages are almost new cuts of clovers, explaining a higher NDVI value as in case of cereals.
Where i is the number of the acquisition, N the total number of acquisitions.

3.4.2
Difference Maximum-Minimum NDVI The difference maximum-minimum NDVI helps to differentiate between crops whose NDVI changes to a large amount over the season (in winter: cereals, forages and vegetables, in summer: forages and vegetables), trees whose NDVI changes to a lower amount in winter and is constant in summer, and bare soil which NDVI remains steady low all over the seasons.This difference is particularly helpful in summer to recognize vegetables, as they are the only class changing over this time period.
i is the number of the acquisition, N the total number of acquisitions.

Maximal Slope
In order to differentiate between cereals and vegetables in winter, the slope of the NDVI is considered.Indeed, whereas the vegetables show a steadily growing NDVI from October to mid-March, the cereals seem to have an abrupt growing stage, thus a steeper slope, from mid-December to mid-March.To extract this information in a robust way and get rid of possible outliers in the reference data, the slope over four consecutive acquisitions is determined.This has the drawback that the time span for slope determination can vary a lot depending on the date of the first acquisitions, as the considered acquisitions do not have a regular time interval in winter, but has the advantage of smoothing such irregularities due to the longer time span.From all calculated slopes, the maximum positive slope is then determined, creating a new feature image that permits to distinguish the different slopes and thus helps distinguish between cereals and vegetables.
i is the number of the acquisition, N the total number of acquisitions, and t the time of the corresponding acquisition.

Emergence Date
The emergence date (EMD) corresponds to the date when a crop starts its growing phase.Usually, it is the period where the crop's water needs increase drastically up to maturity.This is also the date where the NDVI value starts to increase, e.g. the date that corresponds to the first inflexion point of the NDVI temporal profile.
As here a simple GIS calculator is used and no regression analysis can be made, we decided to set this date as being the date corresponding to the first acquisition of the maximal slope feature.Considering the NDVI profiles, the emergence date feature will allow to distinguish between forages and cereals.To this goal, the date of maximum NDVI could also help.Indeed in Figure 5 the cereals seem to have a later maturity date as the forages.However, the detection of this feature depends highly on the harvesting date of the crops (Table 3), which may vary a lot between the different farmers.Thus, in this approach, the difference of emergence dates between forages and cereals is preferred, assuming that the seeding period differs less than the harvesting period.

89
:!, ;<=>= i is the number of the acquisition, and t the time of the corresponding acquisition.
This feature is more interesting in winter than in summer.Indeed, in summer, the water consuming crops can already be distinguished easily using the three previous features and the emerging crops are reduced to the vegetables.
All those features are represented schematically in Figure 6, and are determined separately for winter and summer seasons.
Looking at Figure 6, other temporal features can be defined, such as the duration of the growing stage, considering the time span between maximum NDVI and the emergence date, or the date of maximum NDVI, corresponding to the crop maturity before harvesting.However, the determination of those features was not considered as robust enough regarding the specific agricultural practices of each farmer.Table 3: Principal growing stages of the different crop classes, and their respective crop coefficient Kc, from (Allen, 1998).

Crop type classification
Based on the spectral properties of the data and on the described multitemporal features, crop type classification is performed, in order to differentiate the previously mentioned crop macroclasses, which correspond each to a specific water need (Table 3).This step is performed using the Open Source systems QGIS and SAGA GIS.Starting with the winter season, two standard classification methods are compared: Maximum Likelihood (ML) and Support Vector Machine (SVM).The later is used as it permits accurate classifications even with training samples of medium quality (mixed pixels, small training samples).For the summer seasons, only ML classification is performed but the influence of the multitemporal features for the classification is analyzed.

Estimation of crop water requirements
The estimation of crop water requirements happens subsequently to the crop type classification.A tool based on the FAO method, described in (Allen, 1998) is set up.This method uses the Penman-Monteith equations to determinate the crop evapotranspiration: where ET0 is the reference evapotranspiration in [mm.day-1], corresponding to the water need of a reference grass under ideal conditions.Rn is the net radiation at crop surface, G the soil heat flux density, T the mean daily air temperature at 2m height, u2 the wind speed at 2m height, es the saturation vapor pressure, ea the actual vapor pressure, ∆ the slope vapor pressure curve and γ the psychrometric constant.All those climatic parameters can be calculated or approximated using either look-up-tables or standard parameters of weather stations: max, min and mean temperature, and precipitation values (Allen, 1998).Usually, an estimation of ET0 is made at a monthly rate, using monthly averages of temperatures and rainfall.The reference evapotranspiration serves as input for the calculation of the plant specific evapotranspiration ETc, defined as: where Kc is the crop coefficient, a crop specific parameter with no dimension, serving as factor for adjusting the water need depending on the specific crop characteristics.In this work, the single crop coefficient approach is used, considering the combined effects of crop transpiration and soil evaporation.Kc values given from the FAO are used for the different crop stages (Table 3).In order to adjust them to the region to improve the estimation of the water needs, the map of emergence dates is used in combination with the results of the crop type classification.Indeed, even within one macro-class, the emergence date of the different crops may differ slightly, depending on the agricultural exploitations and farmer's practices.Consequently, the real Kc may vary from plot to plot.In order to consider this variation and make a realistic estimation of the water needs at regional level, we computed for each macro-class and each emergence date (monthly rate) the corresponding interpolated Kc for the month for which the water need should be determined.For each macroclass individually, the surface corresponding to each possible emergence date is considered.The final estimated Kc for one macro-class for the month of interest is then a surface weighted average of the Kc of the different emerging dates for this macroclass.Figure 7 shows a numerical example of this calculation for the winter seasons, as in equation ( 9).
S Q ]4-^, Y is the interpolated Kc at month m for the considered emergence date em.SMC is the surface of the considered macroclass MC, and SMC,em the surface of the same macro-class having the emergence date em.
From this result, a monthly ETc can be calculated for each macroclass and specific ETc map can be created for the whole area (Section 4): The estimation of the monthly total water need for the region is the summation of all ETc after removing the effective rainfall:

RESULTS AND DISCUSSION
In this section, the results of the crop type classification as well as the estimation of the water needs are shown and discussed.The considered bands, as well as the classification results are presented in Figure 8.For the winter classification, five classes are considered and the classification algorithms Maximum Likelihood (ML) and Support Vector Machine (SVM) for our area are compared.ML achieves a slightly better overall accuracy than SVM.A closer look at the confusion matrix, in order to analyze the quality of the differentiation between cereals and forages, shows a better producer's accuracy for the forages using ML (67%) than SVM (54%), for equivalent user accuracies.For the cereals, ML achieves a slightly better user's accuracy (76%) than SVM (71%) for equivalent producer's accuracies.A higher producer's accuracy shows a higher correctness of the classification whereas a higher user's accuracy stand for a higher reliability.This is important as it shows that forages and cereals can be well distinguished using the proposed approach, even if they are spectrally very similar.Only for the vegetables, SVM shows better producer's and user's accuracies than with ML.This is probably due to the relatively small amount of training areas for vegetables compared to the other classes, showing that SVM can better cope with small training samples and mixed pixels than ML.
As for the other classes ML outperforms SVM, it is used for the classification of the summer crops.There, only four classes are considered, as cereals are not cultivated in summer and the fields are bare.Different band combinations and multitemporal features (Figure 8) are analyzed.The best overall accuracy (85.74%) was achieved using 8 bands of Sentinel-2 (Figure 8e) and the multitemporal features , !%% and )*+,-.Using only the four principal bands (Figure 8d) and the same features, the overall accuracy is similar (84.69%).As using less bands permits a faster classification processing, the use of only four spectral bands is retained.In order to analyze the contribution of the multitemporal features, different analyses are performed: the use of the four spectral bands only (Figure 8c) provides an accuracy of 74.84%, which is 10% less than using the spectral bands together with the multitemporal features.Especially for the trees, the producer's and user's accuracies are in this case of about 37%, which is also visible in the classification results, as most of the tree plantations in the West have been classified as vegetables.The user's accuracy of vegetables is in this case only 18%.The use of the four spectral bands of all summer acquisitions (Figure 8f) instead of the multi-temporal features yields worse overall accuracy (68.25%).On the contrary, using only the multitemporal features for classification (Figure 8g), leaving apart the spectral bands and the emergence date feature, yields a very good overall accuracy of 84.19%.Using additionally the temporal feature 89 (Figure 8h) slightly deteriorate the accuracy (79.47%).This can be explained as the emergence date may not always characterize a specific crop type, but depends principally on the sewing date, which depends on the farmer practice.Therefore, the emergence date is a useful information for the authorities to know when a crop will need more water intake, but should be used as an additional information to the crop type classification, and not directly for the classification.A closer look at the confusion matrix of the two best classification results (not shown here) reveals a very good classification of forages and bare soil.Also the producer's accuracies of trees and vegetables are very high (around 90%).The user's accuracy of vegetables is around 60% and the user accuracy of trees around 50%, meaning that in summer, only 50% of the classified trees are really trees.As for the winter, the user's accuracy of the trees is around 80% and tree plantations do not change from year to year, it is preferable to use the tree mask extracted during the winter classification, as it is more reliable.
Based on the classification results, water needs for a specific month or a specific season can be calculated, following the approach explained in Section 3.6.Using the developed tool in combination with free available climatological data, a total water volume of 20 Mm³ has been estimated for the month of March 2017 for the considered area.No direct validation is possible for 2017.However, the water consumption of specific zones within this area is known by the water authorities for the winter season 2015-2016.Even if the cultures were probably not exactly the same as for the winter season 2016-2017, we compared the water consumption of one of those zones in March 2016 with the estimated water need for March 2017, in order to validate the order of magnitude.For this zone with a surface of 478ha, the reference of March 2016 indicates a water consumption of 120 919 m³.For the same area in March 2017, the calculation yields 149 186 m³, which is in the same order of magnitude.This result is very encouraging, especially as the amount of water indicated in March 2016 corresponds to the volume of water which has been charged by the water providers, and may be slightly underestimated compared to the real consumption due the potential presence of non listed boreholes.More reference information concerning the real water consumption will be acquired and used in the future in order to complete the validation.Using the classification results, a map of the emergence date characterizing the time of year where the crops start to need water (Figure 9a), and a map of the water need for a specific month (Figure 9c), derived from the month specific Evapotranspiration (8? Q,Y , Figure 9b) can be produced.

CONCLUSION
In this paper, an approach for the determination of water need of agricultural areas based on optical Sentinel-2 data is developed and validated.Best crop mapping results are achieved using spectral bands and additional multitemporal features defined from crop multitemporal NDVI profiles.The calculated water needs are coherent with the available reference information.Depending on the considered period and on the crop types, other temporal features could be determined for classification (Valero et al., 2016).Using the emergence date, further distinctions could be made, especially concerning the trees: e.g.citrus trees have in general a higher NDVI value than olive trees, leading to different maximum NDVI.Future work will consider the additional use of RADAR data for which some preliminary tests were very promising as they show an increase of the overall accuracy by about 10%.

Figure 1 :
Figure 1: Location of test area and evolution of the water level in the Nebhana dam from 2011 to 2017 (source: GoogleEarth & Sentinel-2).

Figure 6 :
Figure 6: Schematic representation of the different multitemporal features

Figure 7 :
Figure 7: Numerical example of the calculation of the adjusted Kc.

Figure 8 :
Figure 8: Classification results; classification performed for winter on the acquisition from 05.03.2017 and for summer on the acquisition from 19.08.2017, dates providing the best separability between the classes according to the NDVI profiles.

Table 1 :
Overview of the acquired Sentinel-2 data and ground truth

Table 2 :
Principal crop classes and their respective macroclass