SHORELINE EXTRACTION USING TIME SERIES OF SENTINEL-2 SATELLITE IMAGES BY GOOGLE EARTH ENGINE PLATFORM

In recent decades, global warming, sea level rise, population growth, and the intensification of human activities have directly affected coasts, so monitoring coastal accretion and retreat has become a concern for coastal countries. This study compares two supervised classification algorithms for classifying Sentinel-2 satellite imagery for shoreline extraction. Monthly median images from 2020/01 to 2021/12 are classified with the Random Forest (RF) and Support Vector Machine (SVM) algorithms. Validation of the maps shows that RF is more accurate: averaged over all maps, RF achieves an overall accuracy (OA) of 97.18% and a kappa coefficient (KC) of 0.97, whereas SVM achieves 85.15% and 0.79, respectively. After the shorelines are extracted, the Digital Shoreline Analysis System (DSAS) is used to calculate the displacement rate. The Linear Regression Rate (LRR) shows that the shoreline retreats towards land in 91% of transects (166 transects). In 54% of them, the average rate of retreat is 5.42 meters per year. Accretion towards the sea is seen in only 9% of transects (16 transects).


INTRODUCTION
Coastal areas are the parts of the earth that are affected by marine processes and are subject to erosion, sedimentation, and pollution (Ngowo, Ribeiro, and Pereira 2021). Globally, sea level rise and flooding are expected to increase significantly by the middle of this century, with far-reaching potential consequences for coastal cities. In the United States, where 23 to 25 populous coastal cities are located, the combination of storms and rising sea levels has endangered the valuable assets of many people. According to the World Resources Institute (WRI), about 2.2 billion people, or 39 percent of the world's population, live within 100 kilometers of the coast (Mitra 2013). It is predicted that by the end of the 21st century, about 6,000 to 17,000 square kilometers of land will be eroded globally (Hagenaars et al. 2017). Therefore, continuous monitoring of changes in coastal areas is important for national development and environmental protection (Rasuly, Naghdifar, and Rasoli 2010). The shoreline is defined as the land-water contact line and is one of the 27 most important geographic features recognized by the International Geographic Data Committee (IGDC) (Kuleli et al. 2011). Shoreline change is driven by natural factors such as sea level rise, large storms, and tidal effects, as well as human factors such as large-scale construction (ports and piers) and the fish farming industry (Lin et al. 2013).
In recent decades, remote sensing techniques have been used to extract the position of the shoreline and the extent of its changes. The advantages of these techniques include wide spatial coverage, low cost, fast information retrieval, no need for physical presence at the site, and repeated satellite revisits over the study area. Satellite imagery is therefore a very suitable choice for shoreline extraction (Ngowo, Ribeiro, and Pereira 2021). Multispectral imagery is straightforward to interpret, which makes it well suited to detecting the land-water edge. Many studies have relied on Landsat imagery from the Multi-Spectral Scanner (MSS), Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+) and Operational Land Imager (OLI) sensors (Yuan et al. 2005).
Extensive research has been done on shoreline changes. In 2010, Kuleli et al. examined the shoreline changes of five important wetlands in Turkey, namely Yumurtalik, Goksu, Gediz, Kizilirmak and Yesilirmak, using a time series of Landsat satellite images. They first preprocessed the images, then computed the NDWI spectral index, and then converted the result to binary images by applying the Otsu threshold. They used DSAS to quantify the displacement and the rate of change. Erosion and sedimentation rates were calculated using the End Point Rate (EPR) and Weighted Linear Regression (WLR) techniques (Kuleli et al. 2011). In 2018, Qiao et al. used Landsat MSS, TM, ETM+, and OLI imagery together with panchromatic aerial photographs to detect and study changes along the Shanghai coast from 1960 to 2015. The aerial photographs, dating from 1960 to 1980, were resampled to 30 meters to match the pixel size of the Landsat imagery, and object-based classification was used to extract the shoreline from them automatically. The shoreline was extracted from Landsat imagery acquired from 1980 to 2015 using the MNDWI index, and the rate of change was then measured with DSAS using the WLR and EPR techniques (Qiao et al. 2018). Ngowo et al. examined shoreline changes in the Mnazi Bay-Ruvuma Estuary Marine Park in Tanzania. The data comprised 9 images from the Landsat-5 (TM) and Landsat-8 (OLI) satellites acquired from 1991 to 2019. Using the SVM supervised classification method, they classified the study area into six classes: "water", "mangrove forests", "non-mangrove forests", "agricultural lands", "buildings" and "sand". The maps were then reclassified into only water and land classes. DSAS was used to calculate the displacement rate in this area, with the LRR as the statistical parameter (Ngowo, Ribeiro, and Pereira 2021).
The present study calculates the shoreline displacement rate on the shore of Chabahar bay using a time series of Sentinel-2 satellite imagery on the Google Earth Engine platform. These images were chosen because, in addition to being free, they have a higher spatial resolution and a shorter revisit interval than Landsat images. We compare the accuracy of two supervised classification algorithms, random forest and support vector machine, in classifying the image series. After selecting the more accurate algorithm, we present the series of classified maps and calculate the shoreline displacement rate using the LRR statistic in the DSAS extension for ArcMap.

Study Area
The study area is on the northeastern shores of Chabahar bay in Sistan and Baluchestan province, Iran. Chabahar bay is the largest bay on the coast of the Oman Sea and is called "omega bay" because of its ring-like shape. Geographically, politically, and economically, it has a strategic location and deserves special attention. Compared to other areas on the southern coast of Iran, Chabahar has a privileged position for shipping and maritime transport. The deep waters within this large bay make Chabahar (latitude 25°22′5″ N to 25°25′59″ N, longitude 60°33′54″ E to 60°38′4″ E) suitable for mooring large ships and an economically sound location for investment in port facilities.

Dataset
In this research, a time series of freely available Sentinel-2A and Sentinel-2B satellite images with a spatial resolution of 10 meters, acquired from 2020/01/01 to 2021/12/31, has been used. A median image was computed for each month, giving 24 images in total (Table 1). Eight features, including water body extraction indices, were selected for each image (Table 2). All processing, including image acquisition and classification, is done on the Google Earth Engine platform. Google Earth Engine is a scientific platform for processing, analyzing, and visualizing satellite imagery, supported by organizations such as NASA and ESA. With its extensive catalog of free satellite data and images, this cloud platform allows users to process satellite images quickly.
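The monthly median compositing and a water index can be sketched outside Earth Engine as follows. This is a minimal pure-Python illustration, not the script used in the study: the small reflectance grids are invented, and NDWI computed from the green and near-infrared bands is assumed here only as one plausible example of a water body extraction index (the eight features actually used are those of Table 2).

```python
from statistics import median

def median_composite(images):
    """Per-pixel median over a list of equally sized 2D grids."""
    rows, cols = len(images[0]), len(images[0][0])
    return [[median(img[r][c] for img in images) for c in range(cols)]
            for r in range(rows)]

def ndwi(green, nir):
    """NDWI = (green - nir) / (green + nir), computed per pixel."""
    return [[(g - n) / (g + n) if (g + n) != 0 else 0.0
             for g, n in zip(g_row, n_row)]
            for g_row, n_row in zip(green, nir)]

# Three hypothetical acquisitions of a 1x2 scene within one month.
# The second acquisition has a bright, cloud-like value in pixel 1.
green_scenes = [[[0.10, 0.08]], [[0.12, 0.30]], [[0.11, 0.09]]]
nir_scenes = [[[0.02, 0.30]], [[0.03, 0.45]], [[0.02, 0.28]]]

green_med = median_composite(green_scenes)
nir_med = median_composite(nir_scenes)
index = ndwi(green_med, nir_med)  # positive suggests water, negative land
```

The median is used rather than the mean so that a single outlying acquisition, such as the bright value in the second scene, does not shift the composite.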

METHODOLOGY
This research compares the performance of two supervised machine learning classifiers, support vector machine and random forest. According to Figure 3, the research consists of four steps: (1) acquiring Sentinel-2 images and selecting water body extraction features for them, (2) selecting training data and classifying the images with the random forest and support vector machine algorithms, (3) validating the classified maps with testing samples and selecting the more accurate algorithm, and (4) extracting the shoreline and calculating its movement rate. Based on the land-cover features present in the study area, we labeled the images with four classes: "water", "soil", "mangrove" and "wetland". According to Table 3, 70% of the samples were used as training data and passed to the classification algorithms as input. After classification, the remaining 30% of the samples were used to validate the accuracy of the classified maps. Two indicators, the kappa coefficient and overall accuracy, were used for validation. After comparing the accuracy of the random forest and support vector machine algorithms, we selected the more accurate one. The final step is to extract the shoreline and calculate its movement rate, for which the DSAS extension in ArcGIS was used. Of the several methods this extension provides for calculating the shoreline movement rate, we used the Linear Regression Rate (LRR).
Table 3. Training and testing samples.
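The two validation indicators can be computed from a confusion matrix of the testing samples. The sketch below assumes the standard definitions, overall accuracy as the share of correctly classified samples and kappa as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The ten reference and predicted labels are invented for illustration and are not taken from Table 3.

```python
def confusion_matrix(reference, predicted, classes):
    """Rows are reference classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[index[ref]][index[pred]] += 1
    return matrix

def overall_accuracy(matrix):
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def kappa_coefficient(matrix):
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = overall_accuracy(matrix)
    # Chance agreement: sum over classes of (row total * column total)
    expected = sum(
        sum(matrix[i]) * sum(matrix[r][i] for r in range(n))
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1 - expected)

classes = ["water", "soil", "mangrove", "wetland"]
# Hypothetical testing samples (not the study's data)
reference = ["water"] * 4 + ["soil"] * 3 + ["mangrove"] * 2 + ["wetland"]
predicted = (["water"] * 4 + ["soil", "soil", "wetland"]
             + ["mangrove"] * 2 + ["wetland"])

cm = confusion_matrix(reference, predicted, classes)
oa = overall_accuracy(cm)       # 9 of 10 samples correct
kappa = kappa_coefficient(cm)
```

Kappa is lower than the overall accuracy here because part of the agreement would be expected by chance given the class proportions.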

Random Forest
Random forest is an ensemble learning approach developed by Breiman to solve regression and classification problems. Ensemble learning is a machine learning scheme that increases accuracy by combining several models to solve a problem. Multiple classifiers take part in the classification to obtain more accurate results than a single classifier; in other words, merging multiple classifiers reduces the variance and may provide more reliable results. A voting scheme then assigns a label to each unlabeled sample. The most common scheme is the majority vote, which assigns to the sample the label that receives the most votes among all the classifiers. The popularity of majority voting is due to its convenience and effectiveness. Ensemble learning methods include two types, "boosting" and "bagging". Random forest was the first successful bagging-based ensemble approach; Breiman developed it by combining bagging, random decision forests, and random feature selection, which had been introduced independently.
The random forest algorithm is a supervised classifier built from trees with high variance and low bias. A new unlabeled sample is evaluated by all decision trees, each tree votes for a class, and the sample is assigned to the class that receives the most votes. On average, about two-thirds of the training data is used to train each bagged tree (in-bag), and the rest (out-of-bag) is used to validate and evaluate the model quality (Sheykhmousa et al. 2020; Ali et al. 2012; Zarei, Hasanlou, and Mahdianpari 2021).
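The bagging and voting steps described above can be illustrated in a few lines. This is a sketch of the two mechanisms only, not a random forest implementation: no trees are trained, the tree votes are invented, and the sample size and random seed are arbitrary assumptions.

```python
import random
from collections import Counter

def bootstrap_indices(n_samples, rng):
    """Draw n_samples indices with replacement (one tree's in-bag set)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def majority_vote(votes):
    """Return the label that received the most votes."""
    return Counter(votes).most_common(1)[0][0]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
n_samples = 1000
in_bag = set(bootstrap_indices(n_samples, rng))
out_of_bag = [i for i in range(n_samples) if i not in in_bag]
in_bag_fraction = len(in_bag) / n_samples   # close to 1 - 1/e

# Hypothetical votes of five trees for one pixel
tree_votes = ["water", "water", "wetland", "water", "soil"]
label = majority_vote(tree_votes)
```

For large samples the in-bag fraction of a bootstrap draw approaches 1 - 1/e, roughly 63%, which is the "about two-thirds" referred to in the text; the remaining samples form the out-of-bag set.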

Support Vector Machine
The support vector machine is a nonparametric supervised statistical learning technique developed by Cortes and Vapnik (Cortes and Vapnik 1995). It is used for classification and regression problems and makes no assumptions about the distribution of the underlying data. The training data set is labeled. The goal of the SVM training algorithm is to find a hyperplane that separates the data set into predefined classes in an n-dimensional space. The term separating hyperplane refers to the decision boundary that minimizes misclassifications; it should be placed so that its distance to the nearest training samples of the two classes is as large as possible. In a two-dimensional space the hyperplane is a line, in a three-dimensional space it is a two-dimensional plane, and so on. SVM is, in its simplest form, a binary classifier. In remote sensing classification, the samples to be labeled are pixels of multispectral and hyperspectral images. Figure 4 is a simple sketch of a two-class classification problem in a two-dimensional space. A general property of SVMs is that not all training data is used to determine the separating hyperplane. The points lying on the margin are called support vectors, and they alone determine the maximum-margin hyperplane (Mountrakis, Im, and Ogole 2011; Maji, Berg, and Malik 2008). However, in some cases the problem is not linearly separable, meaning that the training data cannot be separated with a single hyperplane. In this case, SVM can still be a good option with the help of a nonlinear kernel such as the radial basis function (RBF), which maps the data to a higher-dimensional space (Figure 5).
Figure 5. Transfer of nonlinearly distributed data from two-dimensional space to three-dimensional space (Muhammed et al. 2020).
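The idea of Figure 5 can be reproduced with a toy example. Two concentric rings of points cannot be separated by a line in two dimensions, but after adding a third coordinate z = x² + y² a horizontal plane separates them. The explicit mapping, the ring radii, and the threshold are assumptions chosen for illustration; an RBF kernel performs a comparable mapping implicitly rather than through an explicit formula like this one.

```python
import math

def ring(radius, n_points):
    """Points evenly spaced on a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n_points),
             radius * math.sin(2 * math.pi * k / n_points))
            for k in range(n_points)]

def lift(point):
    """Map (x, y) to (x, y, x^2 + y^2)."""
    x, y = point
    return (x, y, x * x + y * y)

inner = ring(1.0, 12)   # class A, surrounded by class B
outer = ring(3.0, 12)   # class B

threshold = 4.0         # the plane z = 4 in the lifted space
inner_below = [lift(p)[2] < threshold for p in inner]
outer_above = [lift(p)[2] > threshold for p in outer]
```

Every inner point lifts to z close to 1 and every outer point to z close to 9, so the plane z = 4 plays the role of the separating hyperplane in the higher-dimensional space.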

DSAS system
DSAS is an extension for ArcGIS produced by the USGS that provides a robust tool for evaluating and calculating the shoreline movement rate. First, the shorelines are drawn and supplied as a shapefile. Then a baseline is drawn parallel to the shorelines. Transects are then cast perpendicular to the baseline so that they intersect all shorelines. The length of the transects depends on the goal of the researcher.
There are several methods for calculating shoreline movement rates, including Net Shoreline Movement (NSM), End Point Rate (EPR), and Linear Regression Rate (LRR). NSM is the total distance between the oldest and newest shorelines on each transect. EPR is the ratio of the distance between the oldest and newest shoreline positions to the time elapsed between them on each transect. LRR is determined by fitting a least-squares regression line to all shoreline points on a transect; the slope of the line is the rate of change. In this research, the LRR method has been used to calculate the shoreline displacement rate (Himmelstoss et al. 2018; Toorani et al. 2021; Qiao et al. 2018).
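The three statistics can be written out for a single transect as follows. This is a sketch of the definitions given above, not DSAS code. It assumes that shoreline positions are expressed as distances in meters from an onshore baseline, so that a decreasing distance means retreat, and that dates are decimal years; the five positions are invented.

```python
def net_shoreline_movement(dates, distances):
    """Distance between the oldest and the newest shoreline (m)."""
    oldest = min(range(len(dates)), key=lambda i: dates[i])
    newest = max(range(len(dates)), key=lambda i: dates[i])
    return distances[newest] - distances[oldest]

def end_point_rate(dates, distances):
    """NSM divided by the time between oldest and newest shoreline (m/yr)."""
    return net_shoreline_movement(dates, distances) / (max(dates) - min(dates))

def linear_regression_rate(dates, distances):
    """Slope of the least-squares line through all shoreline points (m/yr)."""
    n = len(dates)
    mean_t = sum(dates) / n
    mean_d = sum(distances) / n
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(dates, distances))
    sxx = sum((t - mean_t) ** 2 for t in dates)
    return sxy / sxx

# Hypothetical shoreline intersections on one transect
dates = [2020.0, 2020.5, 2021.0, 2021.5, 2022.0]
distances = [100.0, 97.0, 95.0, 92.0, 90.0]

nsm = net_shoreline_movement(dates, distances)
epr = end_point_rate(dates, distances)
lrr = linear_regression_rate(dates, distances)
```

EPR uses only the two end shorelines, whereas LRR uses every shoreline on the transect, which makes it less sensitive to a single unusual position when many dates are available.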

Validation of random forest results
In the random forest algorithm, 150 decision trees were used. This number was chosen for its processing speed on the Google Earth Engine platform and because of the high number of classes. The SVM algorithm uses a linear kernel, which was chosen because it gave higher accuracy than the radial basis function (RBF) kernel. Validating the maps from both methods with the kappa coefficient and overall accuracy showed that the random forest algorithm classified all images with higher accuracy (Tables 4, 5).

Validation of SVM results
According to Table 5, the maps from the SVM classification are less accurate than those from the random forest method. The highest accuracy belongs to Map No. 6, with an overall accuracy of 97.32% and a kappa coefficient of 0.9617. Map No. 24, by contrast, has an overall accuracy of only 44.64% and a kappa coefficient of 0.2321. The average overall accuracy and kappa coefficient of all maps with the SVM algorithm were 85.15% and 0.7899, respectively.

Comparing the accuracy of the maps in Tables 4 and 5 shows that the random forest algorithm performs much better in classifying the images than the SVM algorithm. The overall accuracy of every map classified with random forest is above 95%, whereas the SVM maps are less accurate; map number 24, for example, has a very low overall accuracy of 44.64%. We attribute the poor performance of the SVM algorithm in this research to the high number of classes: the more classes there are, the lower the performance of the SVM algorithm. SVM works best in two-class classification, where it can convert an image to a two-class map with a linear kernel.
Figure 6 shows a false-color image (A), the four-class map obtained from it with the random forest classifier, with the classes "water", "soil", "mangrove forest" and "wetland" (B), and the map after the pixels were reclassified into two classes, "water" and "land" (C).
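The reclassification from four classes to two can be sketched on a small label grid. The lookup table assumes that every non-water class is merged into "land", which the text implies but does not spell out, and the grid is invented. The boundary function is an additional illustration of where the land-water contact lies on such a map; in the study itself the shorelines were drawn in ArcMap.

```python
# Assumed merge rule: every class other than water becomes land
LAND_WATER = {"water": "water", "soil": "land",
              "mangrove": "land", "wetland": "land"}

def reclassify(class_map, lookup):
    """Replace each label in a 2D grid using the lookup table."""
    return [[lookup[label] for label in row] for row in class_map]

def boundary_pixels(binary_map):
    """Water pixels with at least one 4-connected land neighbour."""
    rows, cols = len(binary_map), len(binary_map[0])
    edge = []
    for r in range(rows):
        for c in range(cols):
            if binary_map[r][c] != "water":
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= rr < rows and 0 <= cc < cols
                   and binary_map[rr][cc] == "land"
                   for rr, cc in neighbours):
                edge.append((r, c))
    return edge

# Hypothetical 3x4 classified map
four_class = [
    ["water", "water", "soil", "soil"],
    ["water", "water", "wetland", "mangrove"],
    ["water", "water", "water", "soil"],
]
two_class = reclassify(four_class, LAND_WATER)
shore = boundary_pixels(two_class)
```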

Extracting shorelines
To extract the shorelines, all the maps were transferred to ArcMap, and each shoreline was drawn as a polyline shapefile. The date of each shoreline was then entered in the shapefile attribute table. Figure 7 shows the true-color Sentinel-2 image from 2020/01 with the extracted shorelines. After the shorelines are drawn, a baseline is drawn approximately parallel to them, and transects are then defined. Transects are lines perpendicular to the baseline whose length and spacing are specified by the user. They intersect all the drawn shorelines, and the amount of shoreline accretion and retreat is determined on each transect.
The transect spacing in this study is 50 meters, because the coast of the region changes rapidly and the changes need to be known at short intervals.
The transects must also be long enough to cross all the shorelines; the maximum transect length in this study is 220 meters. The shoreline movement on each transect is calculated from its intersections with the shorelines (Figure 8).
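The geometry of the transects can be sketched for a straight baseline. The function below is a hypothetical helper, not part of DSAS: it places a transect every 50 m along the baseline and extends each one 220 m along the left-hand normal. The baseline coordinates are invented, and a real baseline would be a polyline in projected coordinates rather than a single segment.

```python
import math

def make_transects(start, end, spacing, length):
    """Transects perpendicular to the segment start-end, one per spacing step."""
    (x0, y0), (x1, y1) = start, end
    base_len = math.hypot(x1 - x0, y1 - y0)
    ux, uy = (x1 - x0) / base_len, (y1 - y0) / base_len  # along-baseline unit vector
    nx, ny = -uy, ux                                     # left-hand unit normal
    transects = []
    count = int(base_len // spacing) + 1
    for k in range(count):
        px, py = x0 + k * spacing * ux, y0 + k * spacing * uy
        transects.append(((px, py), (px + length * nx, py + length * ny)))
    return transects

# Hypothetical 200 m baseline running east, in meters
transects = make_transects((0.0, 0.0), (200.0, 0.0), spacing=50.0, length=220.0)
```

Each transect would then be intersected with every dated shoreline to obtain the distances from which the movement on that transect is calculated.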

Calculation of shoreline movement rate
According to Table 6, a positive LRR indicates accretion, while negative values correspond to coastal erosion. Out of 182 transects, 91% (166 transects) show shoreline retreat towards land and only 9% (16 transects) show accretion towards the sea. Among the retreating transects, 50.54% of transects (92 transects) have an average change of -5.42 m/yr, 24.72% (45 transects) have an average retreat of about -14 m/yr, and 10.43% (19 transects) have a retreat rate of -22.69 m/yr. The largest changes, between -32 m/yr and -62 m/yr, occur in 10 transects, and only 1.65% of transects show the maximum change (-62 m/yr).
Among the accreting transects, 4.44% (8 transects) average 3.47 m/yr, 4 transects show accretion of about 17 to 24 m/yr, and a single transect has an accretion rate of 65.29 m/yr. In Figure 9, the transects are categorized using a color map: red and orange transects indicate retreat of the shoreline towards land, and blue transects indicate shoreline accretion. The intensity of the colors shows the amount of change.
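The shares reported above follow from the transect counts, as the short check below shows. The counts are the ones given in the text; the sign rule in `classify_lrr` restates the convention of Table 6, and the "stable" case for a rate of exactly zero is an assumption added for completeness. Percentages rounded here to two decimals may differ from the reported ones in the last digit.

```python
TOTAL_TRANSECTS = 182

# Retreat groups as reported in the text: description -> number of transects
retreat_groups = {
    "average -5.42 m/yr": 92,
    "about -14 m/yr": 45,
    "-22.69 m/yr": 19,
    "-32 to -62 m/yr": 10,
}

def share(count, total=TOTAL_TRANSECTS):
    """Percentage of all transects, rounded to two decimals."""
    return round(100 * count / total, 2)

def classify_lrr(rate):
    """Positive LRR is accretion, negative LRR is retreat."""
    if rate > 0:
        return "accretion"
    if rate < 0:
        return "retreat"
    return "stable"

retreat_total = sum(retreat_groups.values())
accretion_total = TOTAL_TRANSECTS - retreat_total
retreat_share = share(retreat_total)
accretion_share = share(accretion_total)
```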

CONCLUSION
This research compared two supervised classification algorithms, random forest and SVM, for shoreline extraction, using 24 monthly median Sentinel-2 images. After classification with both methods, the maps were validated with overall accuracy and the kappa coefficient. The random forest method, with an average overall accuracy of 97.18% and a kappa coefficient of 0.9685, was selected. All maps were then transferred to ArcMap, and the shoreline displacement rate was calculated with the DSAS extension. In 91% of transects the shoreline retreated towards land, and in 9% of transects it accreted.

Figure 2. Flowchart of the methodology used for extraction of shorelines.

Class           Water   Soil   Mangrove   Wetland
Training data   469     342    167        132

Figure 4. The specification of hyperplane and support vectors to separate data (Mountrakis, Im, and Ogole 2011).

Figure 6. (A) False color image from 2020/01, (B) four-class map resulting from the image, (C) reclassified map with water and land classes.

Figure 8. Drawn transects at a distance of 50 m perpendicular to the baseline.

Figure 9. The rate of change along transects using color map (negative values in red indicate retreat and positive values in blue indicate accretion).

According to Table 4, the highest accuracy belongs to Map No. 20, with an overall accuracy of 99.40% and a kappa coefficient of 0.9914, and the lowest accuracy belongs to Map No. 22, with an overall accuracy of 96.13% and a kappa coefficient of 0.9440. The average overall accuracy and kappa coefficient of the maps were 97.18% and 0.9685, respectively.

Table 4. Accuracy of maps classified by random forest method.

Table 5. Accuracy of maps classified by SVM method.

Table 6. Calculation of movement rate with LRR factor.