ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-4/W2, 125-130, 2017
© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 4.0 License.
19 Oct 2017
F. Xiao Administrative Information Center, National Administration of Surveying, Mapping and Geoinformation of China, Engineer, 100830 Beijing, China
Keywords: Spatial Data, Spark, Index, Spatial Operations

Abstract. In this paper, a novel Apache Spark-based framework for spatial data processing is proposed. It comprises four layers: spatial data storage, spatial RDDs, spatial operations, and a spatial query language. The spatial data storage layer uses HDFS to store large volumes of spatial vector and raster data across the distributed cluster. The spatial RDDs are abstract logical datasets of spatial data types, and can be transferred to the Spark cluster to undergo Spark transformations and actions. The spatial operations layer provides a series of operations on spatial RDDs, such as range query, k nearest neighbour, and spatial join. The spatial query language is a user-friendly interface that gives people unfamiliar with Spark a convenient way to invoke the spatial operations. Compared with other Spark-based spatial frameworks, a distinguishing feature is that spatial indexes such as grid and R-tree are used for data storage and query. Extensive experiments on a real system prototype and real datasets show that better performance can be achieved.
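To make the idea of index-assisted spatial operations concrete, the following is a minimal sketch of a uniform grid index with a range query, plus a brute-force k nearest neighbour baseline. It is written in plain Python on in-memory points and is not the paper's implementation: the names `GridIndex`, `range_query`, `knn`, and the `cell_size` parameter are illustrative assumptions, and in a Spark setting logic of this kind would presumably run within each partition of a spatial RDD.

```python
from collections import defaultdict
from math import floor, hypot


class GridIndex:
    """Hypothetical uniform grid over 2D points (not the paper's code).

    Each grid cell maps to the list of points that fall inside it.
    """

    def __init__(self, points, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)
        for x, y in points:
            self.cells[self._cell(x, y)].append((x, y))

    def _cell(self, x, y):
        # Integer cell coordinates of the cell containing (x, y).
        return (floor(x / self.cell_size), floor(y / self.cell_size))

    def range_query(self, xmin, ymin, xmax, ymax):
        """Return points in the closed rectangle, scanning only overlapping cells."""
        cx0, cy0 = self._cell(xmin, ymin)
        cx1, cy1 = self._cell(xmax, ymax)
        hits = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                for x, y in self.cells.get((cx, cy), ()):
                    # Cells on the border may hold points outside the rectangle.
                    if xmin <= x <= xmax and ymin <= y <= ymax:
                        hits.append((x, y))
        return hits


def knn(points, query, k):
    """Brute-force k nearest neighbours: the baseline that an index would speed up."""
    qx, qy = query
    return sorted(points, key=lambda p: hypot(p[0] - qx, p[1] - qy))[:k]


if __name__ == "__main__":
    pts = [(0.5, 0.5), (1.5, 1.5), (2.5, 0.5), (3.5, 3.5), (1.2, 0.8)]
    index = GridIndex(pts, cell_size=1.0)
    print(index.range_query(0.0, 0.0, 2.0, 2.0))
    print(knn(pts, (1.0, 0.9), 2))
```

The grid only narrows the candidate set; the exact rectangle test still runs on every candidate, which is the usual filter-and-refine pattern. An R-tree would play the same filtering role with data-adaptive bounding boxes instead of fixed cells.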

Citation: Xiao, F.: A SPARK BASED COMPUTING FRAMEWORK FOR SPATIAL DATA, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., IV-4/W2, 125-130, 2017.
