CNN SEMANTIC SEGMENTATION TO RETRIEVE PAST LAND COVER OUT OF HISTORICAL ORTHOIMAGES AND DSM: FIRST EXPERIMENTS
- LASTIG, Université Gustave Eiffel, ENSG, IGN, F-94160 Saint-Mandé, France
Keywords: Archival aerial images, semantic segmentation, CNN, land cover, fusion
Abstract. Images from archival aerial photogrammetric surveys are a unique and relatively unexplored means of chronicling 3D land-cover changes that have occurred since the mid-20th century. They provide a relatively dense temporal sampling of territories at very high spatial resolution, and thus offer time series that can support a wide variety of long-term environmental monitoring studies. Moreover, these surveys are generally stereoscopic, making it possible to derive 3D information (Digital Surface Models, DSM). In recent years, they have often been digitized, making them more suitable for automatic analysis processes. Some photogrammetric software packages make it possible to retrieve their geometry (pose and camera calibration) and to generate the corresponding DSM and orthophotomosaic. Archival aerial photogrammetric surveys therefore appear to be a powerful remote sensing data source for studying land use/cover evolution over the last century. However, several difficulties must be overcome before they can be used in automatic analysis processes. Surveys available over a study area can exhibit very different characteristics: survey pattern, focal length, spatial resolution, modality (panchromatic, colour, infrared…). The planimetric and altimetric accuracies of the derived products strongly depend on these characteristics, so analysis processes have to cope with these uncertainties. Another important obstacle lies in the lack of training data. Deep learning methods, and especially Convolutional Neural Networks (CNN), are at present the most efficient semantic segmentation methods, provided that a sufficient training dataset is available. However, the temporal gap between existing databases and archival data can be very large. In this study, two custom variants of simple yet effective deep learning architectures inspired by U-Net and Deconv-Net are developed to process orthoimage and DSM based information. They are then trained on ground truth derived from a recent database in order to process archival datasets.
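As an illustration only: the abstract does not state how orthoimage and DSM information are combined. The following minimal sketch assumes two commonly used options, early fusion (the DSM stacked as an extra input channel) and late fusion (class scores from two separate branches blended per pixel). The function names, the `weight` parameter, and the nested-list data layout are hypothetical and are not taken from the study.

```python
# Hypothetical sketch: two generic ways to combine an orthoimage with a DSM.
# Names, parameters, and data layout are illustrative assumptions only.

def early_fusion(ortho, dsm):
    """Stack the DSM height as an extra channel on every orthoimage pixel.

    ortho: H x W x C nested lists (e.g. C = 1 for a panchromatic image).
    dsm:   H x W nested lists of heights on the same grid.
    Returns an H x W x (C + 1) nested list, usable as input to one network.
    """
    same_rows = len(ortho) == len(dsm)
    same_cols = all(len(o) == len(d) for o, d in zip(ortho, dsm))
    if not (same_rows and same_cols):
        raise ValueError("orthoimage and DSM must share the same grid")
    return [
        [list(pixel) + [height] for pixel, height in zip(o_row, d_row)]
        for o_row, d_row in zip(ortho, dsm)
    ]


def late_fusion(scores_ortho, scores_dsm, weight=0.5):
    """Blend per-pixel class scores produced by two separate branches.

    scores_ortho, scores_dsm: H x W x K nested lists of class scores.
    weight: share given to the orthoimage branch (assumed parameter).
    Returns an H x W map holding the index of the winning class.
    """
    labels = []
    for o_row, d_row in zip(scores_ortho, scores_dsm):
        row = []
        for s_o, s_d in zip(o_row, d_row):
            blended = [weight * a + (1.0 - weight) * b for a, b in zip(s_o, s_d)]
            # Index of the highest blended score (first one wins on ties).
            row.append(max(range(len(blended)), key=blended.__getitem__))
        labels.append(row)
    return labels
```

Early fusion lets a single network learn joint radiometric and height features, whereas late fusion keeps the two sources independent until the decision step, which may be more tolerant of a DSM whose altimetric accuracy varies from one survey to another.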