ASSESSMENT OF STEREO-EXTRACTED DSM FROM WORLDVIEW-3 OVER DIFFERENT LAND COVERS DEPENDING ON THE IMAGING GEOMETRY.

: WorldView-3's 0.31m resolution in panchromatic mode, makes it one of the highest resolution commercial satellite in the world. This fact, together with its excellent stereo capabilities, make this satellite ideal for digital surface model (DSM) extraction working on very complex morphologies where a higher level of detail is required. In this communication we assess the quality (both completeness and vertical accuracy) of DSM extracted from WorldView-3 stereo pairs depending on the image geometry and sun position. Three different land covers (bare soil, urban areas and agricultural plastic greenhouses) have been tested in the Southeast of Spain (Almería). The well-known semiglobal matching (SGM) algorithm was used for all the extracted DSM. A clear relationship between DSM completeness and the WorldView-3 stereo pair imaging geometry measured as convergence angle was found. The completeness values decreased as convergence angles increased, especially in complex reliefs. In fact, convergence angles lower than 16 degrees is recommended when working on urban or greenhouse land covers. Moreover, sun light can cause glint effect in greenhouse areas. In this land cover, the attained results suggest to use stereo pairs taken when the sun presented a very low elevation. In Almería, the last happens in winter. The best results in all the tested land covers can be obtained by fusing DSM extracted from more than one stereo pair.


INTRODUCTION
In the last 20 years, with the advent of Very High Resolution (VHR) optical satellites such as IKONOS or QuickBird, accurate Digital Surface Models (DSMs) can be quickly and conveniently generated. The stereo capabilities of VHR satellites, together with their agile pointing ability, enable the generation of geometrically robust and radiometrically consistent along-track stereo images which can be acquired for any place on Earth (e.g., Zhang and Gruen, 2006, Dowman et al., 2012, Poli and Caravaggi, 2012, Aguilar et al., 2014a. Several researches about DSMs generation from VHR satellite imagery have been published over different land cover such as urban areas (Di Rita et al., 2017;Mandanici et al., 2019), flat bare soil (Aguilar et al., 2014a), mountainous areas (Fratarcangeli et al., 2016), deciduous forest (DeWitt et al., 2017), glaciated regions (Noh and Howat, 2015), herb and grass (Hobi and Ginzler, 2012), even over greenhouse covered areas Aguilar et al., 2014bAguilar et al., , 2019Aguilar et al., , 2020.
In practice, the third-order Rational Function Model (RFM), which builds the object-to-image space mapping through 78 parameters called Rational Polynomial Coefficients (RPC) initially derived from the satellite navigation system, is the sensor model usually applied for WorldView-1/2/3/4 imagery Mandanici et al., 2019;Loghin et al., 2020). The original vendor supplied RPC orientation has to be refined using tie points (relative correction) or accurate ground control points (GCPs, for absolute correction) (Grodecki and Dial, 2003).
There are many commercial software packages being able to extract DSM from VHR stereo images such as MATCH-T, supplied by Trimble, LPS eATE, embedded into ERDAS, or Socet Set ATE, by BAE Systems. Among these, OrthoEngine, the photogrammetric module of Geomatica (CATALYST), has been the most used in research works, serving as a benchmark for others packages in comparison tests (Fratarcangeli et al., 2016;Di Rita et al., 2017). Regarding the image matching algorithm, several researchers point out to semi-global matching (SGM) (Hirschmüller, 2008) as the best option, particularly working on urban areas. For instance, Han et al. (2020), Aguilar et al. (2019) and Nemmaoui et al. (2019) reported that SGM yielded better results than the traditional image matching method based on areabased matching and the cross-correlation threshold.
Satellite imaging stereo geometry, measured as convergence angle (Li et al., 2007), plays a significant role both in the final DSM vertical accuracy (Li et al., 2009;Aguilar et al., 2014a) and completeness of the extracted DSM (Aguilar et al., 2014a;Mandanici et al., 2019). Mandanici et al. (2019), working in urban areas, reported that the completeness achievable with only one WorldView-3 stereo pair is extremely variable (ranging from 50% to 90%), due to the combined effect of the geometry of acquisition and the specific urban texture. In this sense, the authors recommended to use more than one stereo pair to obtain better results.
The goal of this communication is to assess the quality (vertical accuracy and completeness) of DSMs extracted from WorldView-3 imagery over bare soil, urban, and plastic greenhouse land cover. Since three WorldView-3 images (along track triplet) were available, we also researched the influence of the stereo imaging geometry and the sun position in the quality of each extracted DSM. Also, a fusion approach based on the score channel values was proposed to merge three single DSMs attained from individual WorldView-3 stereo pairs.

Study Site
This work was carried out in Almería, Southeast of Spain. The study area comprised a rectangle of 5,700 by 10,000 m centred on the WGS84 geographic coordinates 36° 46' 21.58"N and 2° 40' 21.78"W ( Fig. 1 and Fig. 2). This area presented a smooth relief.
Three subareas of 250 by 250 m within the study area were selected. One corresponded to an urban area, concretely in the village of La Mojonera, framed in a blue box (Fig. 2). The second area, framed in a red box, represented a bare soil area. Finally, the third subarea (green square) included plastic greenhouses land cover (Fig. 2).

WorldView-3 PAN images
WorldView-3 (WV3) is a VHR satellite successfully launched on August 13, 2014. This satellite provides optical images with 0.31 m GSD at nadir in PAN mode.
Three WV3 PAN images were taken in Ortho Ready Standard Level-2A (ORS2A) format on December 25, 2020. The catalogue ID for the raw data, the acquisition time and other characteristics of these images are shown in Table 1. The final PAN products delivered presented a GSD value of 0.3 m. The images were coded with 1, 2 and 3 in the same order of their acquisition.
They were acquired in forward direction in the same satellite track, setting a WV3 triplet. Figure 3 shows the sky plot with satellite azimuth and elevation angles at the time of acquisition for the three WV3 images. The sun was located in a very low position (28 of sun elevation).
Note that, combining the three original PAN single WV3 images shown in Table 1, three different stereo pairs could be made up. The convergence angle can be defined as the angle between two rays that intersect at a common ground point, one from the fore image and one from the aft image, measured along the convergence or epipolar plane. It can be calculated based on the azimuth and the elevation angles provided in the WV3 metadata, according to the formulas available in the literature (Li et al., 2009). The convergence angle is a good predictor of DSM quality. The first stereo pair composed of images 1 and 2 (1-2) presented the lowest convergence angle (15.35). The second stereo pair (2-3) had an intermediate convergence angle of 22.54. The last couple of images (1-3) presented the highest convergence angle (37.89).

Figure 2.
Study area over a PAN WV3 orthoimage and the three selected subareas of 250 by 250 m on urban (blue square), bare soil (red square) and plastic greenhouse (green square) land cover.

Ground truth LiDAR data
The LiDAR data used as ground truth in this study was provided by the PNOA (Spanish National Plan for Aerial Orthophotography) as a point cloud in LAS binary file, format v. 1.2, containing orthometric elevations. It was captured on 23 September 2015 by using a Leica ALS60 discrete return sensor with up to four returns measured per pulse and an average flight height of 2,700 m. The registered point density of the test area, taking into account the overlapping, turned out to be 0.97 points/m 2 (all returns).
The original density of the LiDAR point cloud was significantly reduced to extract a representative and yet manageable set of LiDAR points. After this, LiDAR data from the three selected subareas were carefully edited by manually removing the points that did not belong to the DSM. Finally, an evenly distributed ground truth LiDAR edited data over each subarea of around 0.2 points/m 2 was obtained for validation. This same LiDAR edited data was used by Aguilar et al. (2019) and Aguilar et al. (2020), where the reader can find more details about the edition process.

DSM Extraction from VHR Satellite Stereo Pairs
In this work, three WV3 stereo pairs (1-2, 2-3 and 1-3) were used independently to extract the DSM in the study area. The hierarchical SGM approach based on the widely known algorithm proposed by Hirschmüller (2008), implemented in the software OrthoEngine (Geomatica v. 2018 (PCI Geomatics, Richmond Hill, ON, Canada)), was used to generate the disparity maps after applying an epipolar rectification process to the original stereo images.
Previously, the sensor orientation of WV3 images was carried out using the RFM refined with a zero-order polynomial adjustment (RPC0). According to Aguilar et al. (2013), seven GPS-RTK surveyed GCPs evenly distributed over the working area were used to compute the sensor orientation into OrthoEngine.
Three grid spacing format DSMs were extracted from each WV3 stereo pair. The output was composed by a DSM for each stereocouple and a score channel, which classifies elevation values in three levels of confidence, depending on the goodness of matching. According to Mandanici et al. (2019) and considering the complex morphology of urban and greenhouse land covers, an extremely high level of details is required to correctly describe the shape of buildings and agricultural infrastructures. For this reason, the DSM extraction was performed with the highest possible resolution, setting the grid spacing to 0.3 m and without interpolation. All DSMs were extracted in orthometric elevations using the EGM2008 geoid. A total of 9 DSMs were extracted corresponding to the three subareas (bare soil, urban, and greenhouse) and the three stereo pairs (1-2, 2-3 and 1-3).
Mandanici et al. (2019) proposed a method for fusing individual DSM attained through OrthoEngine based on the score channel.
In their work, an average DSM was generated, in which the elevation value in each pixel was computed as a weighted average of the values measured in each stereo pairs, using the score channels to derive weights. We use this approach in the current work, although without applying the weighted derived from the score channel that only presented values of 0, 99, 100 and 101. We only averaged those elevations that presented values of 99 or more in the score channel. Finally, three averaged DSM (we called them MultiView DSM), one for each land cover, were extracted by fusing the single DSMs attained from the three individual WV3 stereo pairs (1-2, 2-3 and 1-3). In the special case of plastic greenhouses land cover, and extra MultiView DSM was calculated by removing the stereo pair 1-3.

Quality assessment of the extracted DSMs
The quality of the extracted DSMs was assessed by computing their completeness and vertical accuracy. The completeness was computed for the three studied subareas as the ratio between the number of correctly matched points for the considered DSMs (1-2, 2-3, 1-3 and MultiView) and the maximum possible number of points corresponding to the 0.3 m DSM grid spacing.
The vertical accuracy assessment of the DSMs derived from the WV3 images was carried out by using the 3D points from LiDAR data (see Section 2.3) as independent check points (ICPs), computing vertical residual (z-residual) at each corresponding point by subtracting the LiDAR height from the WV3 DSM derived height. Note that each ICP will produce a z-residual only if the WV3 DSM presents height information in the area around the planimetric position of this ICP. About 16,000, 12,000 and 9,000 ICPs were finally obtained for bare soil, urban and greenhouse land covers, respectively.
The computed accuracy measures included mean, standard deviation (SD) and the root mean square error in Z (RMSEZ). These statistics were computed to assess the final vertical accuracy after removing outliers from the z-residuals populations by applying the three-sigma rule (Daniel and Tennant, 2001). The percentage of outliers for each DSM was also calculated. In addition, and due to the likely presence of outliers in the measurements taken from the original dataset, an additional robust statistic such as normalized median absolute deviation (NMAD) (Höhle and Höhle, 2009) was computed over the residuals. NMAD is a measure of scale or variability that can be considered a consistent estimator for the estimation of standard deviation, also offering the advantage of being very insensitive to the presence of outliers. Note that the three-sigma rule was not applied in the case of the NMAD calculation.

Visual inspection
Figures 4, 5 and 6 show the RGB orthoimage and threedimensional shaded relief for the different WV3 derived DSMs produced in this work (i.e., 1-2, 2-3, 1-3 and MultiView) for each considered land cover.  Regarding the bare soil land cover (Fig. 4), the quality of the DSMs derived from single WV3 stereo pairs (Fig. 4c, Fig. 4d and Fig. 4 e) appear similar to the quality of the LiDAR derived DSM, always represented with a grid spacing of 1 m. The completeness values were greater than 99% for all the stereo pairs. The MultiView DSM (Fig. 4f), attained from fusing the three single DSMs show in Fig. 4c, 4d and 4e, presented the best rate of matching points and, visually, was slightly better than the other options. Figure 5 shows the DSMs over the plastic greenhouses land cover obtained from the different strategies tested in this work. The two first DSMs derived from single stereo pairs (Fig. 5c, 5d) had quite good visual quality. However, in the DSM from stereo pair 1-3 ( Fig. 5e) there were many outliers (colour purple and blue) not detected by the score channel. The errors in these points also affected to the fusion of individual DSMs (MultiView DSM in Fig. 5f). Because of this fact, and only in the case of plastic greenhouses land cover, we generated a new MultiView DSM merging only the stereo pairs 1-2 and 2-3 (Fig. 5g).  Lastly, the DSMs achieved on the subsample over urban area are shown in Figure 6. In this complex relief, the stereo pairs 1-2 and 2-3 had better visual results than the stereo pair 1-3. Again, the MultiView DSM yielded better results than the individual DSMs  (1-2, 2-3 and 1-3).

DSM completeness
In this section we analysed the completeness achieved in all the DSMs extracted in this work. We also compared the current results with those obtained by Aguilar et al. (2019)   In the case of bare soil (Table 2), the completeness percentages were extremely high (>99%) for the three individual DSMs, being the values slightly related with the convergence angle (also depicted in Table 2) of each stereo pair. In the stereo pair 1-3, with the highest convergence angle (37.89), a few data was missed in the corresponding DSM, achieving a completeness value of 99.03%. In the case of 1-2 and 2-3 stereo pairs, the obtained completeness values of 99.98% and 99.86% resulted to be very similar to the values reported by Aguilar et al. (2019). The MV DSM strategy set the completeness value at 100%.
In the urban land cover (  (2019) reported that, with a single stereo pair in an urban area, the achievable completeness varies between 50% and 90%, while better results can be obtained using more than one stereo pair (using three WV3 images the average completeness doubles, with six images reaches 99%). A clear negative correlation of completeness and convergence angle was already observed by Mandanici et al. (2019) where building density was high. They recommended the stereo pairs with smaller convergence angles, in the range 8-16 degrees. This finding is consistent with the results found in this work.  Table 2. Completeness values of DSMs derived from stereo pairs and triplet (MultiView, MV) for each land cover and convergence angle.

99.30
Regarding plastic greenhouse land cover, the relief usually is complex and, in addition, the transparency of the plastic materials provokes a lot of matching errors. Also, Aguilar et al. (2019) reported that, in certain situations, the plastic cover of the greenhouses may induce specular reflection of sun light, thus causing unusually bright pixel digital values (sun glint effect). This effect contributes to increase the number of missing image matching points. All these facts make DSM extraction in these unique land cover a real challenge. The completeness values for the current work are shown in Table 2. We had very good results when the individual DSMs with lower convergence angles (1-2 and 2-3) were applied (completeness values of 98.21% and 97.25%). However, the completeness value for the stereo pair 1-3, with the highest convergence angle, was of 75.49%, presenting a lot of outliers. The MV strategies got to improve again the completeness. Table 3 shows the vertical accuracy measures (NMAD, SD and RMSEz) calculated in each DSM derived from the individual WV3 stereo pairs and the merged ones (MV).

DSM vertical accuracy
In the bare soil land cover, the SD values (random error) were ranging from 0.231 m in the stereo pair 1-2 to 0.202 m in the stereo pair 1-3. The Mean values (systematic error) were very variables ranging from -0.502 m to 0.004 m, although this figures were not depicted in Table 3. RMSEz had values between 0.321 m to 0.547 m. Also, MV strategy performed very good results in vertical accuracy measured as SD (0.202 m). The convergence angle had little impact on the vertical accuracy for this land cover. Bare soil is usually the land cover where the best vertical accuracy are obtained when relief is very smooth . The SD values reported by Aguilar et al. (2019) were always lower than the GSD of the satellite images (0.3 m for WV3 and 0.5 m for WV2). More robust results were achieved between different stereo pairs using NMAD, especially for greenhouse land cover.
In the urban subarea, the SD rose to 1.473 m when the three WV3 images (three stereo pairs) were used in MV strategy. This error is typical for some points located on facades. When individual stereo pairs were used, the SD values were ranging from 1.  In greenhouse land cover, the SD values attained for stereo pair 1-2 (0.573 m), stereo pair 2-3 (0.574 m) and for MV strategy removing the stereo pair 1-3 (0.551 m) were excellent if compared with the SD values of 0.84 m and 1.10 m achieved by Aguilar et al. (2019) using the WV3 and WV2 stereo pairs, respectively. It could be mainly attributed to the stereo pairs viewing geometry and its relationship with the sun position. In plastic greenhouse areas, as well as in urban areas, the use of stereo pairs with convergence angles smaller than 15 is extremely important. Moreover, an excessive sun elevation could produce glint effect on the plastic surfaces, causing problems to the matching algorithm. In the WV3 triplet used in this work taken on Christmas day, 2020, the sun elevation was only of 28, while in the stereo pairs used by Aguilar et al. (2019) and taken in July, the sun elevation was set around of 69. In Figure 7 we can see the DSM in an extended area of 2,500 by 2,500 m mainly covered by plastic greenhouses and urban land cover (in the lower left corner). A lot of errors appear on the plastic greenhouses in the WV3 DSM of 2015 (Fig. 7b) due to its high convergence angle (32.1) and its high sun elevation (69.3). By contrast, when the DSM was extracted from the 1-2 WV3 stereo pair, taken on 25 December, 2020 (Fig. 7a), presenting a convergence angle of 15.35 and a sun elevation of 28, only a couple of errors resulted to be significant on plastic covers. The results over the city of La Mojonera, located in the lower left corner, were very similar for both DSMs. Figure 8 shows the score channel corresponding to the DSMs depicted in Figure 7. Blue pixels represented values of the score channel = 0. In other words, pixels where the matching procedure did not work properly, while pixels in green had very good values of score channel ranging from 99 or 101. a) WV3 1-2 (25 December, 2020) b) WV3 (5 July, 2015) Figure 8. Score channel corresponding to DSM from WV3 stereo pairs over a square area of 2,500 m x 2,500 m: a) WV3 stereo pair 1-2 taken on 25 December, 2020; b) WV3 stereo pair taken on 5 July, 2015 and used by Aguilar et al. (2019).
Regarding the percentage of detected outliers after applied the three-sigma rule (Table 3), the results show that the urban land cover had the highest percentage of outliers with values ranging from 2.31% to 2.66% for the individual stereo pairs. Values of around 2% were obtained in greenhouses. Lastly, values between 1.51% and 1.66% were achieved in bare soil.

CONCLUSIONS
The high spatial resolution of WV3 stereo pairs in PAN mode are very interesting to obtain accurate DSM in very complex reliefs such as urban or plastic greenhouse areas.
A clear relationship between DSM completeness and the WV3 stereo pair imaging geometry measured as convergence angle was found. The completeness values decreased as convergence angles increased, especially in complex reliefs. In fact, convergence angles lower than 16 is recommended when working on urban or greenhouse land covers.
Moreover, in greenhouse areas, the plastic cover can produce specular reflection of sun light causing glint effect. In that way, the stereo pair viewing geometry as well as the sun position at the time of image acquisition should be taken into account. In this land cover, we strongly recommended to use stereo pairs with very low convergence angle (<16) and taken for a very low sun elevation. In Almería, the last happens in winter.
Bearing in mind the importance of the satellite viewing geometry and its relationship with the sun position in the greenhouse land cover, the use of triplet on this unique landscape (i.e., more than one stereo pair) can improve the DSM quality in terms of both vertical accuracy and, particularly, completeness.
With the forthcoming rapid development of the plastic greenhouse areas over the world, future works should be focused on evaluate strategies for extracting accurate DSM from VHR satellite imagery.