Volume IV-2/W7
ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-2/W7, 153–160, 2019
https://doi.org/10.5194/isprs-annals-IV-2-W7-153-2019
© Author(s) 2019. This work is distributed under
the Creative Commons Attribution 4.0 License.

16 Sep 2019

SEN12MS – A CURATED DATASET OF GEOREFERENCED MULTI-SPECTRAL SENTINEL-1/2 IMAGERY FOR DEEP LEARNING AND DATA FUSION

M. Schmitt¹, L. H. Hughes¹, C. Qiu¹, and X. X. Zhu¹,²
  • ¹Signal Processing in Earth Observation, Technical University of Munich, Munich, Germany
  • ²Remote Sensing Technology Institute, German Aerospace Center (DLR), Oberpfaffenhofen, Wessling, Germany

Keywords: Data Fusion, Dataset, Machine Learning, Remote Sensing, Multi-Spectral Imagery, Synthetic Aperture Radar (SAR), Optical Remote Sensing, Sentinel-1, Sentinel-2, Deep Learning

Abstract. The availability of curated large-scale training data is a crucial factor in developing well-generalizing deep learning methods for extracting geoinformation from multi-sensor remote sensing imagery. Although a number of datasets have already been published by the community, most of them suffer from strong limitations, e.g. regarding spatial coverage, diversity, or simply the number of available samples. Exploiting the freely available data acquired by the Sentinel satellites of the Copernicus program implemented by the European Space Agency, as well as the cloud computing facilities of Google Earth Engine, we provide a dataset consisting of 180,662 triplets of dual-pol synthetic aperture radar (SAR) image patches, multi-spectral Sentinel-2 image patches, and MODIS land cover maps. Because all patches are fully georeferenced at a 10 m ground sampling distance and cover all inhabited continents during all meteorological seasons, we expect the dataset to support the community in developing sophisticated deep learning-based approaches for common tasks such as scene classification or semantic segmentation for land cover mapping.