ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume II-4/W2
ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., II-4/W2, 35–41, 2015

10 Jul 2015
M. McDermott1, S. K. Prasad1, S. Shekhar2, and X. Zhou3
  • 1Department of Computer Science, Georgia State University, USA
  • 2Department of Computer Science, University of Minnesota, USA
  • 3Department of Management Sciences, University of Iowa, USA

Keywords: GPU, MapReduce, Hadoop, Parallelism, GIS, Spatial, Spatio-Temporal, Data Mining

Abstract. Discovery of interesting paths and regions in spatio-temporal data sets is important to many fields, such as the earth and atmospheric sciences, GIS, public safety, and public health, both as a goal in itself and as a preliminary step in a larger series of computations. This discovery is usually an exhaustive procedure that quickly becomes extremely time-consuming under traditional paradigms and hardware, and, given the rapidly growing sizes of today’s data sets, the required computation is quickly outpacing the growth of computational capacity. In our previous work (Prasad et al., 2013a) we achieved a 50-fold speedup over a sequential implementation using a single GPU. By using Apache Hadoop to distribute the workload across multiple GPU nodes, we achieved near-linear speedup over this result on interesting path discovery. Leveraging the parallel architecture of GPUs, we drastically reduced the computation time of a 3-dimensional spatio-temporal interest region search on a single tile of normalized difference vegetation index (NDVI) data for Saudi Arabia, and we further observed an almost linear speedup in compute performance by distributing this workload across several GPUs with a simple MapReduce model. This increases processing speed 10-fold over the comparable sequential implementation while simultaneously increasing the amount of data processed 384-fold, allowing us to process the entire selected data set instead of a constrained window.
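To illustrate the decomposition the abstract describes, the sketch below shows a minimal, hypothetical MapReduce-style workflow: a raster of NDVI values is partitioned into tiles, each tile is scored independently (the role a GPU node plays in the paper), and the per-tile results are reduced into a global answer. The tile size and the interest score (here, simply the tile maximum) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the tile-based MapReduce model described in the
# abstract. Not the authors' code: tile geometry and the per-tile
# "interest" score are stand-in assumptions.

def split_tiles(grid, tile_rows, tile_cols):
    """Partition a 2-D list of values into non-overlapping tiles."""
    tiles = []
    for r in range(0, len(grid), tile_rows):
        for c in range(0, len(grid[0]), tile_cols):
            tiles.append([row[c:c + tile_cols] for row in grid[r:r + tile_rows]])
    return tiles

def map_tile(tile):
    """Map phase: stand-in for the per-tile GPU kernel.
    Here it simply returns the tile's maximum NDVI value."""
    return max(v for row in tile for v in row)

def reduce_scores(scores):
    """Reduce phase: combine per-tile scores into a global result."""
    return max(scores)

if __name__ == "__main__":
    ndvi = [
        [0.10, 0.20, 0.15, 0.05],
        [0.30, 0.80, 0.25, 0.10],
        [0.05, 0.40, 0.60, 0.20],
        [0.10, 0.15, 0.35, 0.50],
    ]
    tiles = split_tiles(ndvi, 2, 2)        # four 2x2 tiles
    scores = [map_tile(t) for t in tiles]  # map: one score per tile/node
    print(reduce_scores(scores))           # reduce: prints 0.8
```

Because each tile is processed independently, the map phase parallelizes naturally across GPU nodes under Hadoop, which is consistent with the near-linear scaling the abstract reports.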