VALIDATION OF ‘ AW 3 D ’ GLOBAL DSM GENERATED FROM ALOS PRISM

Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM), one of onboard sensors carried by Advanced Land Observing Satellite (ALOS), was designed to generate worldwide topographic data with its optical stereoscopic observation. It has an exclusive ability to perform a triplet stereo observation which views forward, nadir, and backward along the satellite track in 2.5 m ground resolution, and collected its derived images all over the world during the mission life of the satellite from 2006 through 2011. A new project, which generates global elevation datasets with the image archives, was started in 2014. The data is processed in unprecedented 5 m grid spacing utilizing the original triplet stereo images in 2.5 m resolution. As the number of processed data is growing steadily so that the global land areas are almost covered, a trend of global data qualities became apparent. This paper reports on up-to-date results of the validations for the accuracy of data products as well as the status of data coverage in global areas. The accuracies and error characteristics of datasets are analyzed by the comparison with existing global datasets such as Ice, Cloud, and land Elevation Satellite (ICESat) data, as well as ground control points (GCPs) and the reference Digital Elevation Model (DEM) derived from the airborne Light Detection and Ranging (LiDAR). * Corresponding author


INTRODUCTION
The Digital Elevation Model (DEM) which represents the 3-D information on the ground is one of essential layers in the field of geographic information systems.Its applications extend over wide ranges in the scientific fields, e.g., hydrology, geomorphology, ecology, etc., as well as in the practical use, e.g., infrastructure design, disaster monitoring, environmental monitoring, natural resources survey, etc.For measuring the 3-D information on the ground various methods are used depending on target scales or accuracies.In recent years global DEM datasets derived from spaceborne remote-sensing techniques are being widely used with its wide coverage and homogeneous data quality.The Shuttle Radar Topography Mission (SRTM) was the first global datasets made by spaceborne radar instruments and was released in 2003 first.The data has 3 arcsec (90 m) pixel spacing at the first release in which the absolute and relative height accuracies are ~9 m and ~10 m respectively (90 % errors) (Rodriguez et al., 2006).The global elevation data derived from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER/GDEM) derived from the optical stereo sensors was released in 2009 first.It has the height accuracy of 13 m (1) in 1 arcsec (30 m) pixel spacing (Tachikawa et al., 2011).Panchromatic Remote-sensing Instrument for Stereo Mapping (PRISM) was an optical sensor onboard the Advanced Land Observing Satellite (ALOS) operated from 2006 to 2011, and was designed to generate worldwide elevation data.The sensor consists of three independent panchromatic radiometers for viewing forward, nadir, and backward producing in-track triplet stereoscopic images in 2.5 m ground resolution (Tadono et al. 2009).To utilize the global data archives observed during the five year mission life of the sensor we started to generate new global elevation datasets, named 'Advanced World 3D (AW3D)', which have finer ground resolution and higher accuracy than those existing ones (Takaku et al., 2014).The data is Digital Surface Model (DSM) processed in unprecedented 5 m grid spacing utilizing the original triplet stereo images in 2.5 m resolution.The target accuracy of the product was set to 5 m (rms) in vertical and also 5 m (rms) in horizontal.The process-chain of datasets is full-automatic to process approx.one million sets of stereo images which cover 35 km square each on the ground.As the first version of global data processing was completed at the end of March 2016, a trend of global data qualities became apparent.This paper reports on the latest status of the global DSM processing with validation results on some test areas.The accuracies and error characteristics of datasets are analyzed by the comparison with existing global datasets such as Ice, Cloud, and land Elevation Satellite (ICESat) data, as well as with Ground Control Points (GCPs) and a reference airborne LiDAR/DEM.

DATA PROCESSING STATUS
The DSM datasets are generated on 1°x1° tiles of geographic latitude/longitude grids as the final products.Total number of output tiles to be processed is approx.twenty-three thousand for global land areas, while the total number of input stereo images is approx.one million sets.The stereo images are processed in a unit of scene on 35 km x 35 km first to generate "DSM-scenes," and then, they are mosaicked onto "DSM-tiles" on 1°x1°.

Processing algorithms
Figure 1 shows the processing flow of whole software which consists of the scene data processing and the tile data processing.For scene data processing we have developed software called DSM and Ortho-rectified image (ORI) Generation Software for ALOS PRISM (DOGS-AP) (Takaku et al., 2009).The software briefly consists of the tie-point generation, the image orientation, the image matching for height-calculation, and the masking of outlier areas.The software works full-automatically and does not need any GCPs to give the planimetric accuracy of 5 m (rms) thanks to the well-calibrated PRISM physical sensor model.Only relative orientation is performed to fix the epipolar geometry with tie-points among stereo images.It is equipped with an exclusive matching engine to generate the DSM data in 5 m grid spacing by utilizing the unique triplet stereo images.It also generates the ORI data of nadir image on original 2.5 m pixel spacing.In the masking process the invalid areas for the image matching, i.e., cloud, snow, ice, desert or water, are automatically masked by analysing the matching correlations (Takaku et al. 2014).The existing global water-body-data in public domain such as SRTM Water Body Data (SWBD) (NASA/NGA, 2003) or Global Self-consistent, Hierarchical, High-resolution Shoreline Database (GSHHS) (Wessel et al., 1996) are utilized as initial masks.In the mosaicking process the DSM-scenes are automatically mosaicked on 1°x1° DSM-tiles in 0.15 arcsec spacing (approx.5 m at equator) as final dataset products, where the longitude spacing varies depending on the latitude-zones, i.e., 0.3 arcsec for 60°~70°, 0.45 arcsec for 70°~80° and 0.9 arcsec for 80°~90°.The software briefly consists of vertical-shift-correction, stacking/mosaicking, interpolation of height in water mask areas, tile-framing, and Quality Control / Quality Assurance (QC/QA) (Takaku et al. 2014).The DSM-scenes include absolute vertical shift-errors of up to 40 m (Takaku et. al. 2009); they are corrected with existing global height control reference such as ICESat or SRTM, while maintaining the relative accuracy of details.In the stacking/mosaicking process all available grid data of DSM-scenes are stacked and are averaged for each grid of the DSM-tiles after discarding outliers, while filling the void (i.e., mask-areas derived from cloud, snow, ice, desert, etc.) mutually (Takaku et al. 2013).The height data in inland-water masks are then filled with their surrounding valid heights, i.e., a fixed height of their averages in lakes or their interpolated heights calculated by the inverse distance weight method in rivers.The classification between the lake-and the river-masks are based on standard deviations of their surrounding heights.In the QC/QA process the absolute accuracy index is calculated with globally available ICESat data and is annotated in the product metadata.To check the large errors or obvious missing masks we perform manual visual inspections on sample tiles.

Processing system
To process global datasets within almost two years we implemented the software to a parallel processing cluster system.It is equipped with more than six hundred CPUs on a Linux operating system and is capable of processing maximum three thousand DSM-scenes and one hundred DSM-tiles per day.The output DSM-tiles and DSM-/ORI-scenes, as well as the input stereo images with ancillary data (i.e., satellite orbit data, high-frequency attitude data, etc.), existing datasets (i.e., SWBD, GSHHS, SRTM, and ICESat), and intermediate processing data, are archived at a storage-system of approx.four Peta bytes.

Data coverage
We have processed more than 23K 1°x1° DSM-tiles which cover most of global land areas that lay between 83 degrees north latitude and 82 degrees south latitude.The tiles may include voids where the valid data could not be extracted from the stacking of all available scenes due to cloud, snow, etc.An index of data coverage, i.e., the coverage-rate of valid data, in all processed land-areas for each tile is calculated as follows: where R is the coverage-rate (%), N v is the number of valid data which have no masks, N iv is the number of voids which have invalid masks, and i is an identifier of tiles.In valid data areas the stacking number of DSM-scene data is counted on each grid so that a stack-average of a tile will be calculated as a quality index of each tile.Namely: (2)

Scene-data processing (DOGS-AP) Tile-data processing
where S is the stack-average, and N s is the stacking number of valid data j in a tile.Figure 2 (a) and (b) show the distribution of the coverage-rates R and the stack-averages S respectively for all processed DSM-tiles.Table 1 shows the total values of R and S for tiles in latitude-zones of 20 degree intervals, as well as in whole areas, except for Greenland and Antarctica which have exceptional trends due to the large ice sheet.The values for Greenland and Antarctica are separately shown in Table 2.
From Figure 2 it was confirmed that the rates/stacks are relatively high in the middle of Eurasia, North America, the southern part of South America, and Australia, thanks to their fine weather condition and sufficient ground textures.They are reflected to the R and S for the latitude-ranges of N70-N10 and S10-S50 in Table 1, which indicates 90.12 % or higher and 3.20 or higher respectively.The maximum stack-average is 19.2 at the tile-ID S26E133 (the south-west corner of the 1°x1° tile) in Australia.On the other hand, in the equatorial zone the rates/stacks are apparently low due to the heavy cloud coverage on the tropical rainforest areas.They are reflected to the R and S of N10-S10 in Table 1, which shows 75.12 % and 2.69 respectively.In northern Eurasia they are low as well due to snow/ice, while in some parts of northern Africa similar trends are confirmed due to the large desert.The low rate of S50-S70 in Table 1, which shows 78.84 %, is due to too few tile samples which have low rates at south tip of South America.Moreover, in Figure 2 (b) some systematic gaps along lines of latitudes are visible in Eurasia and Africa; they were due to satellite operations affected by a restriction of data-downlinks.Nonetheless, the total rate and stack in whole processed tiles, except for Greenland and Antarctica, are 91.58 % and 4.62 respectively.In Greenland the rate is poor due to the large ice sheet where there is no enough texture/contrast for the image matching.It indicates 37.89% in Table 2 taking account of missing tiles at the central area of the island shown in Figure 2.
In Antarctica the rate is also poor due to the ice sheet but is slightly better than that of Greenland though the coverage is limited to the latitude of 82 degrees south due to the satellite paths.It indicates 48.61 % in the continent except for the polar region over the latitude of 82 degrees south.

VALIDATION
The perspective global absolute accuracy of the processed tiles is being routinely monitored by comparison with existing global height reference i.e., ICESat data, while the detailed relative accuracy is independently evaluated at validation sites of GCPs and LiDAR/DEM in valid data areas.

ICESat
ICESat data products GLA14 are used in the validation (Zwally et al. 2012).The absolute accuracy of the data is less than one meter for the points selected in optimal conditions (Duong et al., 2009).To compare the different spacing data the 0.15 arcsec (5m) grid data of DSM-tiles under the 65 m footprint of ICESat are averaged first.And then the averaged points, in which the standard deviations in the footprint were larger than 5m, are omitted from the comparison because the points in the steep/rough terrain may have less reliability (Huber et al., 2009).
After the comparison the points, where the heights of ICESat were higher in more than 100 m, are associated with outliers due to the cloud reflections or saturated waveforms in ICESat data and are excluded from the results (Carabajal et al. 2006).
In the validation total 262,163,604 points of ICESat data, which correspond to approx.12K points for each tile on average, were selected on approx.23K tiles.Figure 3 (a) and (b) show the distribution of the mean difference and the standard deviation respectively from the ICESat data in each tile, while Table 3 show the statistics of height difference between the DSM-tiles and the ICESat data for tiles in latitude-zones of 20 degree intervals as well as in whole areas.In the comparison 90 % of the tiles have mean differences within +/-3.09m and standard deviations within 5.31 m among whole tiles, while the average error and standard deviation of whole ICESat data are -0.08 m and 3.26 m respectively.This means that the DSM-tiles have enough absolute/relative accuracy on a continental scale in comparison with the target accuracy of 5 m rms.However, there are some systematic mean-errors in Figure 3 (a), e.g., the positive error spot in the central part of South America and negative error clusters in Eurasia and North America.The maximum and minimum mean errors are +14.4m of tile-ID S16W068 and -8.7 m of tile-ID N33W079 respectively for the latitude range between +/-60 degrees.These are derived from the differences between ICESat and SRTM which was used as  the reference of absolute height for DSM-scenes (Huber et al., 2009).We have a plan to correct these errors in future work.In Figure 3 (b) the tiles, which have the standard deviations of over 5m, are distributed in equatorial zone and in northern Eurasia, and appear to be correlated with the rate of data coverage and stacking numbers shown in Figure 2.They are reflected to the relatively high standard deviations for the latitude-ranges of N70-N50 and N10-S10 in Table 3, which indicates 3.93 m for both.This implies that the areas in which the data coverage is relatively poor may have less accuracy due to the lack of data stacks which can mitigate blunders of the image matching or missing masks.

GPS-Track
The GCPs, which were used in the geometric calibration of the PRISM sensor model (Takaku et al. 2009), are used in the validation of DSM-tiles as well.The absolute accuracy of the GCPs derived from GPS tracks is better than 1 m (rms) in both of planimetry and height.The height of the DSM-tiles at the planimetric position of each GCP is interpolated from its surrounding grid data for their comparison.Total 4,628 GCPs, which are distributed among 15 areas, were used in the validation and their distribution is shown in Figure 4. Table 4 shows a statistics of height errors for each area as well as for whole areas, while Figure 5 shows the histogram of errors for whole GCPs.The number of GCPs has wide variety among the areas, while Japan has most of them so that it contributes well to the whole results.The rms error of whole GCPs is 3.28 m and is enough consistent with the target accuracy of 5 m.However there are large rms errors in some areas, i.e., Bhutan, Mexico, Norway, South Korea and USA (DC /NY), in which the rms errors are 8.58 m, 6.87 m, 5.35 m, 5.55 m, and 5.11 m, respectively.In Bhutan the steep rugged terrain, which the absolute height of GCPs ranges over 5,000 m, may cause the large height errors because slight planimetric positional errors may also cause the height errors, as well as the accuracy of image matching may decrease in those areas.In Mexico the large average error of 5.88 m causes the large rms error; they may be due to the errors of vertical-shift-correction, which the SRTM was used as the reference.In Norway the rate of data coverage is relatively low, so that the height accuracies may follow it as mentioned in 3.1.In South Korea and USA (DC/NY) their respective large standard deviations of 5.52 m and 4.95 m may be due to effects of smoothed sheer edges at the buildings in urban areas.

LiDAR
The reference LiDAR/DEM is located in a part of Kenai Peninsula in Alaska, USA and was provided by Kenai Watershed Forum through Alaska Satellite facility (Peterson, et al., 2014).The data is Digital Terrain Model (DTM) which represents heights of bare-earth as if trees or artificial structures were all removed on the ground.Hence the reference data for this validation was selected on almost bare areas so that it can be used in the comparison with the DSM-tiles.Table 5 shows the basic specification of the dataset.The reference data was down-sampled to the geographic latitude/longitude grid of the DSM-tiles at the site area, which corresponds to approx.5 m x 5 m, to calculate their height differences.The errors depending on different slope angles are evaluated as well as the one in the whole site area.The slope angle in each pixel on the downsampled LiDAR/DEM is calculated with the third-order finite difference weighed by reciprocal of squared distance (Horn, 1981).Figure 6 shows the DSM-tile data and the height difference image between the DSM-tile data and the reference LiDAR/DEM data on the site area.Table 6 shows the statistics of the height difference between the DSM tile data and the LiDAR/DEM data for four different ranges of slope angles, i.e., 0°~10°, 10°~20°, 20°~30°, and 30°~, as well as for whole data, while Figure 7 shows the histogram of normalized frequency for the differences.The results, which indicate positive average errors from +2 m to +3 m, seem to include some systematic errors derived from the difference between DSM and DTM on remaining vegetation areas which we could not exclude sufficiently.Nonetheless, the rmse in whole areas, which indicates 2.70 m, is enough consistent with the results from ICESat and GCP reference.The errors increase as the slope angles become larger.In the area where the slope angles are over 30 degrees the rmse is 6.14 m and is slightly exceeding the target errors, whereas in other areas they are up to 3.60 m.These trends are almost consistent with past same validations performed with a LiDAR/DSM reference in Japan (Takaku et al., 2014).

CONCLUSIONS
The results of the validations for the 'AW3D' global DSM datasets derived from ALOS/PRISM are presented as well as their processing status.In March 2016 we completed to process more than 23K 1°x1° DSM-tiles, which cover almost global land areas in unprecedented 5m grid spacing, with the total coverage rate of about 90 %.The perspective and detailed accuracies of the products were evaluated with their respective appropriate reference data, i.e., the ICESat, GCPs, and the LiDAR/DEM.All the results were almost consistent and met the target height accuracy of 5 m rms except for some extreme terrains.One of our important tasks remaining is to fill the voids of about 10 % in the global coverage.We use stereo images of some commercial satellites (e.g., WorldView series) to fill the voids without quality gaps in the DSM for commercial purposes; however, they are still far from filling the entire voids.Japanese next optical satellite mission is one of the potential sources to overcome the problem though its details are not yet fixed so far.We will continue to discuss about the issue.
The original datasets which have 0.15 arcsec (5 m) spacing will be released basically in commercial base, while the low resolution data which have 1 arcsec (30 m) spacing are generated by a mean/median filter of a 7x7 kernel on the original ones and will be provided to public via an internet site of JAXA free of charge.

Figure 2 .
Figure 2. Distribution of coverage rates R (a) and stack averages S (b) for DSM-tiles.Table1.Total values of R and S for tiles in latitude-zones of 20 degree intervals, as well as in whole areas, except for Greenland and Antarctica.

Figure 3 .
Figure 3. Distribution of mean errors (a) and error standard deviations (b) from ICESat data points for DSM-tiles.

Figure 4 .
Figure 4. Distribution of 4,628 GCPs in 15 areas used in the validation of DSM-tiles

Figure 6 .
Figure 6.Painted relief image of the DSM-tile data on the validation area, which has the height range of approx.0~900 m, selected from the reference LiDAR/DEM (left) and the height difference image between DSM-tile data and the reference LiDAR/DEM data (DSM-tile minus LiDAR/DEM) (right).The black areas in the painted relief image represent the mask of the sea, while ones in the difference image represent masks which were not used in the validation.

Figure 7 .
Figure 7. Histogram of height differences for different ranges of slope angles as well as for whole data.The depicted difference-range corresponds to +/-3 from the  for the distribution in the slope range of '30<='.

Table 2 .
The R and S for tiles in Greenland and Antarctica.

Table 6 .
Statistics of height difference between the DSM tile data and the LiDAR/DEM (DSM tile data minus LiDAR) for different ranges of slope angles as well as for whole data.

Table 5 .
Basic specification of the LiDAR/DEM dataset in Kenai Peninsula