MULTITEMPORAL SEGMENTATION OF SENTINEL-2 IMAGES IN AN AGRICULTURAL INTENSIFICATION REGION IN BRAZIL

: With the recent evolution in the sensor's spatial resolution, such as the MultiSpectral Imager (MSI) of the Sentinel-2 mission, the need to use segmentation techniques in satellite images has increased. Although the advantages of image segmentation to delineate agricultural fields in images are already known, the literature shows that it is rarely used to consider temporal changes in highly managed regions with the intensification of agricultural activities. Therefore, this work aimed to evaluate a multitemporal segmentation method based on the coefficient of variation of spectral bands and vegetation indices obtained from Sentinel-2 images, considering two agricultural years (2018-2019 and 2019-2020) in an area with agricultural intensification. Images of the coefficient of variation represented the spectro-temporal dynamics within the study area. These images were also used to apply an edge detection filter (Sobel) to verify their performance. The region-based algorithm Watershed Segmentation (WS) was used in the segmentation process. Subsequently, to assess the quality of the segmentation results produced, the metrics Potential Segmentation Error (PSE), Number-of-Segments Ratio (NSR), and Euclidean Distance 2 (ED2) were calculated from manually delineated reference objects. The segmentation achieved its best performance when applied to the unfiltered coefficient of variation images of spectral bands with an ED2 equal to 7.289 and 2.529 for 2018-2019 and 2019-2020, respectively. There was a tendency for the WS algorithm to produce over-segmentation in the study area; however, its use proved to be effective in identifying objects in a dynamic area with the intensification of agricultural activities.


INTRODUCTION
Agricultural intensification could be characterized by an increase in the production of food, fuel, or fibre per unit of land, which is achieved through the adoption of management, technology, or increment of inputs (Tilman et al., 2011;Garrett et al., 2018). This practice results in an increase in the number of crops on lands already cultivated and in the carrying capacity of the number of animals obtained when the conditions of the pastures are improved (Garnett, 2011). Therefore, areas with agricultural intensification are more dynamic, which makes their mapping difficult for the purposes of quantification, management, and planning of land use. Thus, the development of new technologies in remote sensing has facilitated the monitoring of these areas, providing images with better spatial and temporal resolutions, frequent updating of data, and more efficient mapping methodologies (Hu et al., 2019).
In this context, earth observation satellites, such as those provided by the Sentinel-2 mission of the European Space Agency (ESA) (Drusch et al., 2012), which systematically monitor the same territory, constitute an excellent source of data for the development of products for the monitoring of the spatial distribution of agricultural activities (Xiong et al., 2017;Hu et al., 2019). In particular, time series of optical images are potentially useful to deal with changes caused by phenological differences in the crop development cycle, whose representation can be seen in the temporal profiles of vegetation indices (Arvor et al., 2011;Brown et al., 2013;Müller et al., 2015;Jakimow et al., 2018;Griffiths et al., 2019). However, it is still a challenge to segment agricultural fields considering their seasonal changes based on spectro-temporal information of satellite images, especially in dynamic regions with agricultural intensification and smaller land sizes of properties. In addition, some land uses that are conceptualized based on temporal characteristics, such as integrated crop-livestock systems (Manabe et al., 2018), whose fields are intensively managed with agricultural crops and pastures in different seasons, would be better identified through a multitemporal approach. In this case, the different agricultural production systems in a region result in greater heterogeneity at the pixel level, which is a problem in image classification (Li et al., 2015;Vieira et al., 2012).
Most land-use mapping studies have focused on pixel-level classification, and with the recent gain in spatial resolution of sensors such as the MultiSpectral Imager (MSI) coupled to Sentinel-2 mission platforms, there was the possibility of applying object-based image analysis (OBIA) to identify types of agricultural crops from multitemporal images (Belgiu and Csillik, 2018;Petitjean et al., 2012). OBIA is a promising approach in detecting areas with different land uses, which combines remote sensing and image segmentation techniques (Bueno et al., 2019;Khiali et al., 2019).
Image segmentation is the process of subdividing an image into homogeneous segments by grouping pixels based on previously determined criteria of homogeneity and heterogeneity (Haralick and Shapiro, 1985). In this way, each segment groups pixels that have similar radiometric characteristics and separates their adjacencies that are significantly different in relation to the same characteristics (Khiali et al., 2019). It is already known that not all features extracted from the images result in improvement in segmentation accuracy (Li et al., 2015). Thus, in the selection of image characteristics, a crucial step in image analysis, spectral bands, and vegetation indices must be tested in order to better represent the landscape elements in question (Li et al., 2015).
In view of the above, this study aims to evaluate a Sentinel-2 multitemporal image segmentation method based on the coefficient of variation in an area of agricultural intensification. For this, the methodological workflow was focused on the delimitation of agricultural fields through the representation of spectro-temporal data using images of the coefficient of variation and the application of an edge detection filter (Sobel).

Study area
The study area corresponds to an area of 83.11 km², located in the municipality of Caiuá, in the State of São Paulo (Figure 1), where the Campina Farm is situated, which is considered a Technological Reference Unit for integrated crop-livestock production systems (Cordeiro et al., 2020). Except for Campina Farm, the properties in the study area have a small and mediumsized landholding structure, where several settlements are present. The region has heterogeneity in land use, such as pastures, perennial agriculture (lipstick tree, papaya, citrus, coconut), semi-perennial agriculture (sugarcane, cassava, napier grass), annual agriculture (soybean, corn, sorghum, crotalaria), forestry (eucalyptus) and native vegetation.

Methodological Flow
The methodological flow comprises the following steps: (1) Sentinel-2 temporal image acquisition; (2) image processing, which consists of preparing them for segmentation; (3) segmentation, which involves the process of creating unchanging reference objects and the segmentation process; (4) segmentation assessment, which consists of applying metrics that assess the quality of segmentation ( Figure 2).

Image Acquisition
Monthly cloud-free images of Sentinel-2 (Level-2A -Bottom of Atmosphere) were downloaded from the European Space Agency database (ESA, 2020). These images correspond to the period of two agricultural years: 2018-2019 and 2019-2020. To follow the region's agricultural calendar, the initial image was defined in September of one year, and the final image was defined in August of the following year, totaling 24 multispectral images, 12 for each agricultural year.

Image Processing
After image acquisition, the spectral bands (Blue, Green, NIR, RED, Red Edge 1, Red Edge 2, Red Edge 3, SWIR1, SWIR2) were separated and ten vegetation indices (Chlorophyll Red-Edge, Enhanced Vegetation Index, Green Normalized Difference Vegetation Index, Moisture Stress Index, Normalized Difference Red-Edge, Normalized Difference Vegetation Index, Red-edgebased Normalized Difference Vegetation Index, Red edge 1, Red edge 2, Soil Adjusted Vegetation Index), available and referenced on the Index DataBase website (Henrich et al., 2012), were calculated in sen2r toolbox (Ranghetti et al., 2020). For each spectral band and index, the coefficient of variation per agricultural year was calculated. The next processing step was based on the method proposed by Watkins and van Niekerk (2019a), in which the Sobel filter is applied to the coefficient of variation images to detect the edges of agricultural fields in the study region. Subsequently, these images are normalized, using minimum and maximum values, to obtain the mean image of the filtered images of spectral bands and vegetation indices, respectively ( Figure 3). Furthermore, to test the segmentation process, images of the normalized coefficient of variation of bands and indices, without filter application, were also used.
The edge detection algorithm is designed to detect and highlight points in a digital image where brightness changes markedly, so high grey levels indicate a sharp discontinuity with adjacent pixels. Thus, the algorithm's main objective is to capture events and important changes in the image (Sharma and Mahajan, 2017).

Segmentation
To perform the segmentation, Watershed Segmentation (WS) was used, a region-based algorithm that uses local minima as central points and expands objects outward with increasing levels of intensity until they reach the edges of another object (Salman, 2006). WS is implemented in Orfeo ToolBox -an open-source tool to process remote sensing images - (Grizonnet et al., 2017), version 7.3.0, and has two segmentation parameters: Depth Threshold and Flood Level. For this work, based on the segmentation tests performed, it was decided to use the Flood Level parameter kept at 0.12 in filtered images and 0.10 in unfiltered images, varying only the Depth Threshold values until finding the best performance, according to the results of the segmentation assessment and visual inspection.

Segmentation Assessment
To assess the quality of the segmentation results produced, a hundred reference objects were handly digitised for each agricultural year (Figure 4), which represent agricultural fields with a uniform spectro-temporal pattern throughout the investigated period. Thus, these reference objects were compared with the outputs of the segmentation process in AssesSeg, a free tool developed by Novelli et al. (2017). Through AssesSeg, metrics that are associated with segmentation quality were generated. The main metric is a modified version of the supervised discrepancy measure called Euclidean Distance (ED2), proposed by Liu et al. (2012), which in turn depends on two other metrics: Potential Segmentation Error (PSE) and Number-of-Segments Ratio (NSR). The PSE (Eq. 1) is a geometric measure that calculates the ratio between the total area of the subsegments and the total area of the reference objects; thus, a value equal to zero of this metric indicates that there are no subsegments. On the other hand, NSR (Eq. 2) is an arithmetic measure that calculates the ratio of the absolute difference of the number of reference objects and the number of corresponding segments to the number of reference objects. Thus, a value equal to zero of this metric indicates that there is a preferential one-to-one relationship between the reference objects and the corresponding segments; that is, it represents the arithmetic discrepancy in the over-segmentation situation (Liu et al., 2012). ED2 (Eq. 3) is a measure that considers geometric and arithmetic discrepancies; therefore, a value equal to zero can indicate a good segmentation quality (Liu et al., 2012).
However, as highlighted by Novelli et al. (2017), when a reference object does not have a corresponding segment, the actual number of features employed is less than the original. Thus, to consider this polarization effect, it was proposed that the PSE and NSR values be increased in such a way to consider the real number of reference objects employed. Thus, the new values of PSE (Eq. 4) and NSR (Eq. 5) were calculated as follows: where max(| − | is the maximum under-segmented area found for a single reference object; represents the maximum number of corresponding segments found for one single reference object; Σ| | represents the total area of the mn reference objects.

RESULTS AND DISCUSSIONS
For the segmentation process, in cases where the Sobel filter was applied to the images, the WS segmentation algorithm was executed using the Flood Level parameter constant at 0.12, while for cases where the filter was not used, the same parameter was kept at 0.10. The definition of the choice of values for this parameter was mainly considered the visual inspection , seeking a balance to avoid sub-segmentation or over-segmentation situations.
Furthermore, for both casesapplying the edge detection filter and not applying the filterthe NSR, PSE, and ED2 metrics implemented in the AssesSeg tool were calculated to assess the segmentation quality. In general, the segmentation processes that generated lower numbers of segments also had lower ED2 values.

Agricultural year: 2018-2019
For the 2018-2019 agricultural year, considering the images in which the edge detection filter was applied, the best segmentation occurred in the mean image of spectral bands since the results were closer to zero.
Thus, for the mean image of spectral bands, the Depth Threshold and Flood Level parameters were set at 0.022 and 0.12, respectively, and the number of segments generated in the segmentation process was 1129, which resulted in an ED2 equal to 10.482 (Table 1). For the mean image of vegetation indices, the same parameters were defined at 0.026 and 0.12, respectively, with the number of generated segments equal to 1145 and ED2 equal to 11.158 (Table 1). In this way, according to the definition of the ED2 parameter, the segmentation quality was more satisfactory in the case of the mean image of spectral bands.
When considering the case where no filter was applied, the segmentation process that came closest to the expected also corresponded to the segmentation of the mean image of spectral bands since this segmentation resulted in an ED2 of 7.289 (Table  1). Table 1. Modified ED2 and parameters tested in mean images of spectral bands and vegetation indices for the 2018-2019 agricultural year.
Thus, considering the mean image of the vegetation indices, the best segmentation result was obtained when the Sobel edge detection filter was applied (Table 1). On the other hand, when the filter was not used, the result was closer to zero for the mean image of the spectral bands (Table 1). In both cases analysedapplication and non-application of the Sobel filtersegmentation was more efficient in spectral band images than in vegetation index images.

Agricultural year: 2019-2020
Considering the application of the Sobel filter, the mean image of the spectral bands for the 2019-2020 agricultural year, in the same way as for the 2018-2019 agricultural year, showed lower values of ED2 when compared to the mean image of the vegetation indices (Table 2). Table 2. Modified ED2 and parameters tested in mean images of spectral bands and vegetation indices for the 2019-2020 agricultural year.
In the case of non-application of the filter, the mean image of the spectral bands produced a segmentation with 308 segments, the smallest number of segments among all segmented images, which also generated a lower ED2 value, equivalent to 2.529. On the other hand, the mean image of vegetation indices presented ED2 higher than when the Sobel filter was applied (Table 2).
Considering the mean image of vegetation indices for this agricultural year, the best segmentation result was obtained when the Sobel edge detection filter was applied ( Table 2). As for the non-application of the filter, the result was closer to zero for the mean image of the spectral bands (Table 2). Therefore, the segmentation proved to be more efficient in spectral band images than in vegetation index images, as also occurred in the other analysed agricultural year.

Final Segmentation
Considering all the results found for the Euclidean Distance 2 metric, the segmentation achieved its best performance when applied to the unfiltered spectral band mean image for the 2018-2019 and 2019-2020 agricultural years ( Figure 5) with an ED2 equal to 7.289 and 2.529, respectively.
These results are more evident when looking at some regions of the maps of both years, for example, the areas located on the lower left and lower right sides of the segmentation maps, which had their agricultural fields more segmented (more objects) in the 2018-2019 agricultural year compared to the 2019-2020 agricultural year ( Figure 5). Another point to be highlighted is the values close to the PSE of both segmentations, which corresponded to 1.471 and 1.482 for the 2018-2019 and 2019-2020 agricultural years, respectively. However, the NSR values were more distant, corresponding to 7.139 (2018-2019 agricultural year) and 2.050 (2019-2020 agricultural year). Thus, the NSR contributed to the ED2 value of the first analysed agricultural year being higher than the ED2 value for the second analysed crop agricultural year since this metric is associated with the number of corresponding segments created in the segmentation process. We emphasize that, as it is a region where the properties are small and medium-sized and with agricultural intensification, a greater demand in the segmentation process was already expected.
Mainly when compared to other regions with larger properties and more defined shapes. Thus, these difficulties contributed to the ED2 values remaining above zero since dynamic regions, as in this study, are more difficult to segment. In general, the results of the NSR metric, related to the number of segments created, had a greater influence on the results of ED2, which corroborates the results of Watkins and Van Niekerk (2019b). These same authors, when using the WS segmentation algorithm to delineate agricultural field boundaries from Sentinel-2 multitemporal images, also found that this algorithm tends to produce many small objects, being more prone to over-segmentation.
The AssesSeg tool proved to be very useful when combined with semi-automatic segmentation algorithms to assess segmentation quality, as it was able to produce all the results for different parameters established in the segmentation step. Furthermore, it does not depend on specific targeting software, which makes its use easier and wider.
In this study, it is important to note that the use of the coefficient of variation of spectral bands and vegetation indices allowed considering the temporal characteristic, intrinsically, within the segmentation process in the study area, since its use brought information about the spectral variation of each pixel during a agricultural year.

CONCLUSIONS
The products generated from the coefficient of variation of the spectral bands and the vegetation indices of the Sentinel-2 monthly images can represent the multitemporal characteristic of the segmentation process. These products are capable of delineating agricultural fields that present a spectro-temporal pattern. Although there was a tendency for the Watershed Segmentation (WS) algorithm to over-segment the study area, its use proved to be effective for identifying objects in a dynamic landscape, where the intensification of agricultural activities is present.
Among all the tests performed -with or without edge detection filter (Sobel) -the segmentation with the best performance uses the mean image of the unfiltered spectral bands, evidenced in the year 2019-2020. In addition, it is worth noting that an increase in the number of reference objects can contribute to reduce uncertainty in the assessment of segmentation quality.