USING TIME SERIES OF SENTINEL-1 IMAGES TO PRODUCE DRY BATHYMETRY OF RIVERS

A method is presented to produce river cross section profiles from a time series of Sentinel-1 images paired with water level data. Four programs are presented that generate river width data and cross section profiles from SAR or optical images. The programs generate a river bank and island width database with minimum manual intervention. Water level data from in situ stations are interpolated to match the width data and create elevation points from which the cross section profiles are produced. The method is fully described and tested on the São Francisco River in Brazil. The width data are plotted against discharge data to compare their progression. Over 1700 cross sections were produced and classified by their shape. Potential and limitations are presented.


INTRODUCTION
Since 2014 the Copernicus Program and the European Space Agency have launch a series of satellites belonging to the Sentinel family. In particular, the Sentinel-1 (A and B) and -2 (A and B) missions provide high resolution images (10 m) that combined can provide an image every few days (5-6) almost anywhere on Earth. The Sentinel-1 mission with its SAR sensor can supply images regardless of cloud or sun illumination which is a great advantage for acquiring data during the cloud infected rain season. More then ever, remote sensing is now a major supplier of data for hydrological application such as hydrological and hydrodynamic models (Caloz, Puech, 1996, Olsson, Pilesjö, 2002, Mertes et al., 2004, Sichangi et al., 2018. Radar remote sensing (both imaging sensors and altimeters) is specially well suited for hydrological applications (Lewis, 1998). It can provide data on water surface, water level and land use surrounding water bodies. River width is an important hydrological variable needed for instance for applying the Manning equation. Coupled with water level, water width can be used to estimate river discharge through a rating curve (Tourian et al., 2017, Sichangi et al., 2018. In rivers with significant discharge variations, the river width/level pair can produce partial cross sections useful to estimate variables such as Manning's roughness coefficient or the hydraulic radius and is also required to estimate propagation speed. This article describes a method to produce time series of river width over a 430 km reach of the São Francisco River in Brazil. The hypothesis is that these widths data can be paired with water level data to create partial cross sections of the river, a process we call "dry bathymetry". In a similar approach, satellite images and satellite altimetry have been joined to produce the bathymetry of a lake (Arsen et al., 2014) or to calculate volume differences in a reservoir (Abreu, Maillard, 2014). * Corresponding author 2. METHOD

River Width Extraction
Pavelsky and Smith (2008) have published and made publicly available a program called RivWidth for the calculation of the width of a river based on satellite image data. Although this program is freely available, we have chosen to create our own application based on the same general principle but differing in the details and operation mode. Our programs were developed to integrate the whole process from the image processing to the production of river cross sections from time series of satellite images. A Graphical User Interface (GUI) was also created for each of the four programs. The four programs were written in Python 3.5 and are publicly available at http://www.igc.ufmg.br/ under the Remote Sensing section (Geography). A flow chart of the river extraction programs (first three) is presented in Figure 1.
Six Sentinel-1 images covering the 12/2015-12/2016 period were corrected and processed to produce the width data. All images suffered the same sequence of correction and pre-processing: 1) thermal noise removal, 2) calibration (σ•) and 3) terrain correction using SRTM data. The sequence of four programs was used to convert the image data in a width database. The first program (SRW River Preparation where SRW stands for Satellite River Width) applies a threshold to the images to extract the water surface which is then filtered, segmented, cleaned and transformed in vector format using both in-house functions and the OTB (Orpheo Tool Box, https://www.orfeo-toolbox.org/) Python library. The user can alter the search radius and the minimum segment size to account for the size of the river or the resolution of the images being used. A river bank and an island vector file are produced by this first program. Depending on the complexity of the river network, some manual cleaning might be necessary. In the case of the São Francisco River, this was not necessary since this was the only major river in the S-1 images. Only the tributaries had to be "chopped off".
The river bank file (without the islands) is then processed by the second program (SRW River CentreLine) to produce a smooth river centre-line. Because the width data forming the time series needs to be based on the same points and orientations, the same centre-line is used for all the image dates. The date with the lowest water level is chosen to ensure that the centre-line is always located within the "wet" cross section of the river. The centre-line is produced by squeletonizing the raster version the river bank through the creation of a water-to-land distance map. The centre-line is then extracted using a maximum value focal operator. The resulting pixels are then sorted in a upstreamdownstream order. This is indicated by the user that can either pick one of eight starting points (N, NW, W, SW, S, SE, E, NE) or use the cursor and click on a starting point on the image. The formation of the centre-line is then the result of four rules: 1. the following point must be at a minimum distance set by the user; 2. if more than one point obeys the rule, the farthest is chosen (to solve cases of two pixel ridges); 3. the angle formed by the two previous points with the following point must be within the range set by the user (usually 0 to 20 degrees); 4. if the angle is zero and the maximum distance criterion is not exceeded, the point is not kept.
These rule ensure a relatively smooth centre-line that is unlikely to create width lines that cross each other or measure width in a direction off from the general flow of the river. Figure 2 shows the kind of problem a centre-line with too much details can create.
The third program (SRW River Width) then combines the centreline, river bank and islands vector files to generate the width data at regular distance intervals. The program first creates a point vector file at regular distance intervals (set by the user) along the centre-line starting from the upstream beginning of the river reach. The river width is always measured perpendicular to the centre-line considering a line formed by the previous and following points to avoids generating width measurements problems like illustrated in Figure 2. Furthermore, the width is (a) (b) Figure 2. Problems created by a rough centre-line with too much details: (a) the dent in the midle of the line produces an unlikely width measurement, (b) the smooth version is much more likely to be representative of the river width.
measured from the centre-line to the first crossing of the river bank on each side of the centre-line. The right-and left-hand sides of the width are kept separated for future geomorphological applications. In our case the distance between width measurements was set to 250 m. The width measurements are saved in a shapefile's attributes spreadsheet that includes the total width, the width of islands, the total width less the island and the cumulative distance from the start of the reach. Although only the first crossing of the width direction line with the river bank is considered (i.e. a meander could imply more line crossings), more than one islands of sand bar could be found on a cross section and so all these are summed up to compute the width of islands. All the island crossing widths were recorded and eventually subtracted from the river width. The program displays a figure showing the results. Figure 3 illustrates a detailed section of the results overlaying the SAR image. Note that the width line are split on the river centre-line or in the middle of the island when it is the case. The edges of the island are also recorded.
The cumulative distance is the attribute used to join the interpolated river level with the width data. The water level data were taken from seven in situ hydrological stations from Brazil's water agency (Agência National deÁguas -ANA) for the six dates of the images and linearly interpolated using the cumulative distance. An average distance of 83 km separates the in situ stations and an average elevation difference of 53.16 m separates the first upstream and last downstream stations. Table 1 shows some of the important characteristics of the in situ stations and their data. Figure 4 illustrates the varying width of a river section at the extreme North of the study area for the six image dates.
The fourth program (SRW River CrossSection) combines the width data with the interpolated water level data to produce cross sections of any sector of the river reach.

River Cross Section Generation
The crossings between the width lines (perpendicular lines from the river centre-line), islands and river banks generates a series of points that are then associated with the corresponding water level interpolated from the in situ stations. Each of the six S-1 images were sensed during a few seconds at around 8H30 AM of the day which does not account for the propagation time of about eight days between the first and the last station (41135000 -45298000). This implies that it is possible that highest water level (or widest width) at a station does not corresponds to the highest water level at another station. The further apart the station the least likely it is that these levels correspond. In addition, it was noted that some inconsistencies were generated by the water surface extraction procedures from the S-1 imagesm (the same σ0 threshold was used to segment the S-1 images considering the images have been calibrated, but this might not necessarily be the best approach and further work will focus on this issue). Together these problems have generated some inversions in the width data by which the river width from a higher water level might be narrower than at a lower level. The crossing points from the six dates can then be joined to create a (partial) cross section. The outer (river bank) profile is joined automatically by the program which attempts to join all the points if they build a consistent positive slope. At this point, the island points were joined manually to complete the cross section. Figure 5 shows the consecutive steps needed to produce the cross section. The first three steps (a-c) are done automatically by the program (SRW CrossSection) whereas the last two are completed manually at this point. Bézier curves were used to smooth the profile.

Validation
In situ validation was not possible at this time but since cross section profiles of the ANA fluviometric stations is produced yearly, a pseudo validation was made possible at the six stations covered by the São Francisco River reach. These cross sections are produced in order to measure the water flow velocity across the river and at regular depth intervals to generate or update rating curves linking the water level with the discharge. However, while discussing the method to produced these cross sections with the agency responsibles (the Brasilian geological survey -CPRM), it appears that these are not always produced at exactly the same location nor was it possible to guaranty that what was considered "perpendicular" to the flow is the same as what was use by the RWS method. Furthermore, the cross section profiles generated here are only partial and do not necessarily have the same distance reference but do have the same altimetric reference since the ANA stations used all had an absolute elevation reference. So it was decided to produce this validation only in qualitative terms by comparing the overall shape described instead of an error measurement of some kind.

River Width Data
The three programs that generate the width data worked well in all six situations (dates) and very little manual intervention was needed. Mostly it was vector cleaning that needed manual editing (after the SRW River Preparation program) but this was greatly reduced by applying a rough mask around the river. To ease processing, the OTB vectorization routine divides large images in tiles which creates artefacts breaks in the river bank and islands vector files. Our image data was divided in two large images, a South image (21000 lines × 7000 columns) and a North image (16500 lines × 9000 columns) that the vectorization program split in 15 to 25 tiles each. The SRW River Preparation programs attempts to join the tile sections by itself through a topological assembly which is usually successful except in some cases. The most common exception is brought by the presence of a bridge that splits the river. The islands are another exception that, when split, are not recognised as islands but as the end of a section with a "Y" shape and need to be manually merged and added to the island vector file.
Inconsistencies were considered whenever a wider width is associated with a lower water level or vice versa. These represented approximately 20% of the river bank and island data points and were mostly concentrated on two dates: 1 April 2016 and 15 December 2016. Furthermore these inconsistencies were more frequent in the upstream section of the river reach characterised by mostly rectangular cross sections (see next item).
The initial RivWidth program presented by Pavelsky and Smith (2008) did not consider the islands which might be acceptable for the braided streams they presented. In straight to meandering rivers with permanent islands like the São Francisco, it was found to represent an important source of errors when attempting, for instance, to estimate the discharge through width data. The graph presented in Figure 6 shows the difference in width created by the presence of islands. It also shows that the middle portion of the reach has the greatest proportion of islands, especially large ones. This is also the portion that meanders most whereas the downstream sector is mostly straight. Figure 7 shows the two width sets representing the lowest and highest water levels along with a second degree trend line. Both the raw data and the trend lines are in agreement with the expectation that higher water levels correspond with wider river banks. The hydrological year 2015-16 was the last year with a relatively "normal" behaviour since 2011-12 which was the last hydrological year with a flood and was chosen for this reason. The years 2013-2019 are regarded as a drought period for the São Francisco River. The hydrological year 2019-2020 is showing signs to bring a break to this drought period as the Três Marias Reservoirs (upstream from the reach) is already almost full so that the river can return to receiving its normal flow.
To further test the consistency of the width data it was represented with the discharge data from the seven in situ stations inside the river stretch (Figure 8 and Figure 9). In Figure 8 the three dates were selected for showing an apparent relationship between the width trend lines and the discharge. The top graph represents the greatest discharge situation and displays a monotonic increasing trend for the two lines. The middle and bottom graphs show anomalies especially with regard of the discharge line. These anomalies can only be interpreted as flow propagation phenomena by which the peak discharge measured at the Note that some points were left out to avoid negative slopes. third (bottom) and fourth (middle) stations did not reach yet the downstream ones. To a certain extent the phenomenon is also observable in the width data when compared with the top graph.
In Figure 9, the relationship is much less visible and probably suffers from the problem of flow propagation as well. These three dates are also the ones causing most inversion problems in building the cross sections but can also be related to differences in the water surface extraction either during the segmentation or because of calibration differences between the images.
A number of authors have attempted in the last two decades to estimate discharge from width either by using the Manning equation (Sichangi et al., 2018), the hydraulic geometry equations (Gleason et al., 2014) or by statistical inference (Tourian et al., 2017). The width dataset created here tend to support the idea of being able to estimate discharge data from width but also serves to illustrate the eventual problems that can emerge with such endeavour. The next section on cross section profiles brings more substance to this question.

Cross Section Profiles
A total of 1760 partial cross section profiles were generated for the São Francisco River reach. The majority (60.1%) of  these profiles indicated predominantly rectangular cross sections. The remaining cross sections were mostly trapezoidal asymmetric (37.1%) with very few symmetrical ones (2.8%). Of all the cross sections, 17.7% had islands reaching the highest water level or sand bars (9.7%) when they were submerged. Figure 11 shows a selected sample of cross section profiles that illustrates all the situations encountered. Most cross sections have a definite rectangular profile (160 -220. 620, 670, 680, 710, 820) in which the water levels from the December 2015 -December 2016 period have very little effect on the width, hence the difficulty often reported to model discharge from width data. Seven cross section profiles (610, 650, 730, 740 770, 790 and 830) have an asymmetrical trapezoidal shape while only one (640) can be considered trapezoidal symmetrical. Five cross sections have an island or two (700) and three (190,220,680) have a submerged island that could be a sand bar. Note that in most cases, some points had to be left out to avoid creating negative slopes. As mentioned above, these points might either be the result of an imprecise water surface extraction or due to a flow propagation phenomenon, especially since the water level resulted from a linear interpolation process with very distant anchor points. The photographs in Figure 10 were taken from a boat in the middle of the river and illustrate well an asymmetrical cross section with a very steep left bank and a gently sloping right bank. Figure 12 shows a series of complete cross section profiles from the six in situ ANA stations used to estimate water level of all the width data. The width data generated in this study have been superimposed on these cross sections (blue dots) in an effort to compare the shapes these data points suggest with the actual cross sections. The first example of the six (station 42210000) is perhaps the worse as the profile depicted by the dots is clearly wider than the measured one. Nevertheless a rectangular shape is still represented by both the continuous line and the dots. The remaining five profiles appear to be relatively well represented by the dots. Although small differences are visible (which can also be caused different locations of a different interpretation of what is perpendicular to the flow), all shapes are clearly respected. Even though this is not a true validation, it still shows the method had the merit of producing valuable partial descriptions of the real cross section profiles.

DISCUSSION
The results presented here suggest that partial bathymetry can be achieved through synchronised time series of SAR images and water level. The six dates considered were chosen for having both a relatively wide range of water level (a difference of about 5 m between low and high waters) and images match- ing the same dates. The probability of achieving this kind of combination is relatively recent with freely available high resolution SAR data. The Sentinel-1 mission only started imaging the Brazilian territory in 2015. The reach of the São Francisco River under scrutiny is a mostly straight to moderately meandering channel characterised with a predominance of rectangular bank shapes (≈60%) in which the width vary little with water level. Conversely, symmetrical and asymmetrical trapezoidal cross sections are likely to show width variations with water level (hence discharge). These cross sections are also very useful for identifying and distinguishing islands and sand bars.
In the present case the asymmetrical trapezoidal cross sections were clearly associated with the sinuosity of the river and their identification can help identify locations where discharge estimates through width are more likely to yield good results. In the longitudinal direction, sequences of cross section profiles could provide valuable information on geomorphological characteristics of the river channel such as the riffles-pools morphology.
In longer time series (five years or more) the dynamics of the river channel, islands and sand bars could also be monitored. The shape of the cross sections can also be related to the flood risk by estimating the capacity of the channel to accept a discharge surge.
Although only partial, the cross section profiles produced using the proposed method can provide reasonable estimates of Manning's roughness coefficient or the hydraulic radius of the channel, both critical in hydrological and hydrodynamic models. A number of studies have already succeeded to estimate discharge and flow propagation data from width time series , Gleason et al., 2014, Tourian et al., 2017, Sichangi et al., 2018. Future efforts will be invested in the integration of satellite altimetry from multi-mission altimetry satellites (i.e. CryoSat-2, Sentinel-3, Sentinel-6, SWOT) with in situ stations to create better water level time series. A larger number of images from different dates and a greater range of water stages would improve the results obtained from this method. Another aspect that will requires more thorough investigation is related to the method for extracting the water surface from SAR images. A probabilistic approach to improve the water proportion in edge pixels has already shown promising results .

CONCLUSIONS
The proposed method made it possible to generate over 10000 (1760×6) width measurements from six different dates of a  Figure 12. Validation cross sections used to compare with the dry bathymetry partial cross sections produced with the width data. From top to bottom: station 42210000, 43200000, 44200000, 44290002, 44500000 and 45298000. The blue dots represent the dry bathymetry data generated from river width. Note that the cross sections were adjusted laterally since no common distance reference was available.
430 km reach of the São Francisco River using only Sentinel-1 SAR images. Some inconsistencies were observed that were attributed to the flow propagation but that could also be due to calibration differences between the images or caused by the segmentation process. These width measurements were used to create partial cross section profiles of the river we called dry bathymetry. The analysis of the data showed the importance of excluding the islands from the width measurements. Plotting the width progression in conjunction with discharge values indicated the effect of the flow propagation speed. A comparison between the width data and the actual cross section profiles of six fluviometric stations located nearby showed the width data fitted relatively well the shape of the profiles. Further work will mainly concentrate on the precision of the width data extraction process.
The data generated by this work brings a valuable contribution to a series of studies conducted on the São Francisco River to estimate discharge from remote sensing data. It complements a number of other studies that have used remote sensing image data to estimate flood return period  and altimetry data to estimate water level (Maillard et al., 2015) and slope (Martins et al., 2019).