ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-3-2022, 271–279, 2022
https://doi.org/10.5194/isprs-annals-V-3-2022-271-2022
17 May 2022

INVESTIGATING 2D AND 3D CONVOLUTIONS FOR MULTITEMPORAL LAND COVER CLASSIFICATION USING REMOTE SENSING IMAGES

M. Voelsen¹, M. Teimouri¹,², F. Rottensteiner¹, and C. Heipke¹
  • ¹ Institute of Photogrammetry and GeoInformation, Leibniz Universität Hannover, Germany
  • ² Department of Photogrammetry and Remote Sensing, K. N. Toosi University of Technology, Tehran, Iran

Keywords: land cover classification, remote sensing, FCN, multi-temporal images, 3D-CNN

Abstract. With the availability of large amounts of satellite image time series (SITS), the identification of different materials of the Earth’s surface is possible with a high temporal resolution. One of the basic tasks is the pixel-wise classification of land cover, i.e. the task of identifying the physical material of the Earth’s surface in an image. Fully convolutional neural networks (FCN) are successfully used for this task. In this paper, we investigate different FCN variants, using different methods for the computation of spatial, spectral, and temporal features. We investigate the impact of 3D convolutions in the spatial-temporal as well as in the spatial-spectral dimensions in comparison to 2D convolutions in the spatial dimensions only. Additionally, we introduce a new method to generate multi-temporal input patches by using time intervals instead of fixed acquisition dates. For each time interval, we choose the image that is closest in time to the middle of that interval, which makes our approach more flexible with respect to the requirements for the acquisition of new data. Using these multi-temporal input patches, generated from Sentinel-2 images, we improve the classification of land cover by 4% in the mean F1-score and 1.3% in the overall accuracy compared to a classification using mono-temporal input patches. Furthermore, the usage of 3D convolutions instead of 2D convolutions improves the classification performance by a small amount of 0.4% in the mean F1-score and 1.2% in the overall accuracy.
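
The interval-based selection of acquisition dates described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `select_per_interval`, the use of equally long half-open intervals, the restriction of candidates to images acquired inside each interval, and the `None` placeholder for empty intervals are all assumptions made for illustration only.

```python
from datetime import date


def select_per_interval(acquisitions, start, end, n_intervals):
    """Pick, for each time interval, the acquisition closest to its midpoint.

    Assumptions (not taken from the paper): the period [start, end) is split
    into n_intervals equally long half-open intervals, only acquisitions that
    fall inside an interval are candidates for it, and an interval without
    any acquisition yields None.
    """
    span = (end - start).days / n_intervals  # interval length in days
    selected = []
    for i in range(n_intervals):
        lower = start.toordinal() + i * span  # inclusive lower bound
        upper = lower + span                  # exclusive upper bound
        middle = lower + span / 2
        candidates = [d for d in acquisitions if lower <= d.toordinal() < upper]
        if candidates:
            # Choose the date with the smallest distance to the interval middle.
            best = min(candidates, key=lambda d: abs(d.toordinal() - middle))
        else:
            best = None
        selected.append(best)
    return selected


# Hypothetical acquisition dates, for illustration only.
dates = [
    date(2020, 1, 3),
    date(2020, 1, 14),
    date(2020, 1, 28),
    date(2020, 2, 20),
    date(2020, 3, 25),
]
chosen = select_per_interval(dates, date(2020, 1, 1), date(2020, 3, 31), 3)
```

Because only the interval boundaries are fixed, a newly acquired image can stand in for any other image in the same interval, which is the flexibility the abstract refers to.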