Spatio-temporal Change Modeling of LULC: A Semantic Kriging Approach

Spatio-temporal land-use/land-cover (LULC) change modeling is important for forecasting the future LULC distribution, which may facilitate natural resource management, urban planning, etc. The spatio-temporal trend in LULC often exhibits non-linear behavior due to various dynamic factors, such as human intervention (e.g., urbanization) and environmental factors. Hence, proper forecasting of the LULC distribution should involve the study and trend modeling of historical data. The existing literature reports that meteorological attributes (e.g., NDVI, LST, MSI) are semantically related to the terrain. Being influenced by terrestrial dynamics, the temporal changes of these attributes depend on the LULC properties. Hence, incorporating meteorological knowledge into the temporal prediction process may help in developing an accurate forecasting model. This work studies the change in the inter-annual LULC pattern and the distribution of different meteorological attributes of a region in Kolkata (a metropolitan city in India) during the years 2000-2010, and forecasts the future spread of LULC using a semantic kriging (SemK) approach. A new variant of time-series SemK, namely Rev-SemKts, is proposed to capture the multivariate semantic associations between different attributes. Empirical analysis suggests that augmenting the spatio-temporal modeling of meteorological attributes with semantic knowledge facilitates more precise forecasting of the LULC pattern.


INTRODUCTION
One of the major challenges in monitoring environmental changes is to predict the meteorological pattern with the highest possible accuracy. Further, forecasting the land-use/land-cover (LULC) change helps in assessing the impact of urbanization, city planning, and other socio-economic activities. It has a direct impact on several environmental threats, such as urban heat islands, drought, and flood. Several studies have reported that the meteorological attributes near the earth's surface (for example, land surface temperature (LST), normalized difference vegetation index (NDVI), moisture stress index (MSI), etc.) are influenced by the behavior of LULC and are positively correlated with its distribution pattern [(Hengl et al., 2012) (Bhattacharjee et al., 2014)]. Hence, their spatio-temporal analysis may incorporate the knowledge of LULC, through semantic analysis of the terrain, to achieve better accuracy. The meteorological attributes and the LULC can therefore be regarded as inter-dependent factors of the terrain, which together influence the environmental changes. This work focuses on the spatio-temporal forecasting of the LULC distribution pattern of a spatial region by learning the past behavior of meteorological data.
It has been hypothesized that the twenty-first century will be an era of predicting weather/meteorological patterns with well-defined mathematical equations (Shukla, 1998). However, a model that can deal with all the spatial uncertainties with high accuracy is still lacking. Although the prediction/forecasting of meteorological attributes has been studied extensively in the fields of remote sensing and geographic information systems, accuracy remains a major research challenge. In most cases of meteorological forecasting, the auxiliary information in the terrain is overlooked, although it has a large impact on the primary attribute to be predicted and often contributes to better accuracy. Hence, a shift of the research paradigm from the univariate to the multivariate scenario is indispensable for the accurate mapping of environmental changes.
For a data-driven analysis of the past behavioral pattern of meteorological attributes, kriging-based interpolation methods (Humme et al., 2006) have been reported in the literature as the most popular, widely used, and suitable techniques for handling the spatial variability of the terrain. One way to deal with this variability is to make use of secondary information while predicting the primary variable/attribute. In that case, the secondary variables have to be correlated with the primary variable and must influence it significantly. In the geostatistical literature, the commonly known kriging estimators that can accommodate such multivariate information in the interpolation process are co-kriging, kriging with external drift, regression kriging, etc. However, most of these methods (both univariate and multivariate) do not model the semantic LULC knowledge of the terrain when predicting meteorological attributes. In the previously proposed interpolation methods, namely semantic kriging (SemK) (Bhattacharjee et al., 2014) and time-series semantic kriging (SemKts) (Bhattacharjee and Ghosh, 2015), the LULC-based knowledge is incorporated in the prediction equation for more accurate estimation of different meteorological attributes. There, the traditional two-dimensional spatial auto-correlation model, the semivariogram, has been extended to a spatio-semantic semivariogram model for analyzing and quantifying the impact of semantic LULC knowledge on other meteorological attributes.
The proposed work also focuses on spatio-temporal forecasting; however, the distribution of LULC is considered as the primary variable to be predicted. It is a new interpolation-based approach, namely Rev-SemKts, which can be regarded as reverse time-series semantic kriging, where different meteorological attributes are considered as auxiliary information for predicting the land-cover pattern. Hence, the proposed approach belongs to the family of multivariate kriging, where the LULC pattern is the primary prediction attribute and two correlated meteorological attributes (Bhattacharjee and Ghosh, 2015), namely land surface temperature (LST) and normalized difference vegetation index (NDVI), are considered as the secondary information. The broad objectives of the work can be stated as follows:
• extending time-series semantic kriging to the multivariate scenario for the augmentation of the auxiliary spatial information of the terrain
• modeling a reverse SemK approach, Rev-SemKts, for quantifying the temporal change of semantic knowledge (or, the change in LULC pattern) with respect to other meteorological attributes
• modeling the spatio-temporal semivariograms and cross-semivariograms for analyzing the correlation among different factors
• modeling the estimation equation for Rev-SemKts to forecast the LULC distribution pattern at future time instances
• experimenting with real meteorological data to demonstrate the efficacy of the proposed method
The rest of the paper is organized as follows. Section 2 presents the state of the art related to LULC change modeling, forecasting of the LULC pattern, the use of this knowledge for the prediction of meteorological attributes, etc. The preliminary work on semantic kriging and its time-series extension is presented in Section 3. The framework for forecasting the future LULC pattern using the proposed Rev-SemKts approach is presented in Section 4. The experimental results yielded by the proposed approach are presented in Section 5. Finally, the conclusion is drawn in Section 6.

BACKGROUND
Many studies have reported the importance of LULC change modeling for the detection of global environmental dynamism, operational planning of different biospheric and atmospheric aspects, urban and city planning, disaster management, etc. Schilling et al. (Schilling et al., 2010) have studied how the change in LULC influences the rapid expansion of soybean cultivation in the Mississippi River basin, and found it to be highly influential for the regional water and climate patterns. Muñoz-Villers and López-Blanco (Muñoz-Villers and López-Blanco, 2008) have monitored the LULC changes between 1990 and 2003 in a tropical mountainous watershed using Landsat TM imagery. They have used an RS-GIS approach and achieved above 75% classification accuracy. Hoek et al. (Hoek et al., 2008) have used land-use regression methods for modeling the annual mean concentrations of different meteorological attributes, such as NO2, NOx, PM2.5, etc., in European and North American cities.
Very few works have been reported on modeling the LULC knowledge for the interpolation-based prediction of meteorological attributes, and vice versa. Although some articles state the importance of this knowledge for different meteorological analyses, they fail to quantify and incorporate this property of the terrain into the prediction process. Hengl et al. (Hengl et al., 2012) have reported the land-cover pattern to be one of the influencing factors for predicting the land surface temperature trend. Several other studies [(Lambin and Geist, 2006) (Mahmood et al., 2010) (Sertel et al., 2011)] have also mentioned the significance of this knowledge for weather/climatic patterns and their dynamics. Janssen et al. (Janssen et al., 2008) have proposed a detrended kriging model, namely RIO, for air pollution measurement. They have used CORINE LULC data for developing the land-use indicator. Petrişor et al. (Petrişor et al., 2010) have examined whether a study region is affected at the macro scale through changes of land-cover or land-use. They have used the ordinary kriging method for capturing the environmental changes and their effect on land-use.
To the best of our knowledge, no article has been reported on the reverse analysis of meteorological attributes, such as LST, NDVI, etc., using multivariate spatial interpolation for forecasting the future LULC pattern. The previous works on semantic kriging [(Bhattacharjee et al., 2014) (Bhattacharjee and Ghosh, 2014)] are among the initial attempts to incorporate this knowledge into the interpolation process for meteorological attribute prediction. SemK has been further extended to the temporal domain, resulting in univariate time-series semantic kriging (SemKts) (Bhattacharjee and Ghosh, 2015), for forecasting. The proposed method can be considered as the reverse SemKts process in the multivariate scenario. The overall idea of the proposed approach and its relationship with the existing works on semantic kriging are depicted in Fig. 1. The subsequent sections present an overview of the basic SemK technique, mainly the quantification of semantic LULC knowledge, and its temporal extension.

SEMANTIC KRIGING (SEMK) AND ITS TIME-SERIES EXTENSION
Semantic kriging (SemK), initially proposed in (Bhattacharjee et al., 2014), attempts to quantify the semantic LULC knowledge of the terrain for the prediction of meteorological attributes through spatial interpolation. It extends the traditional semivariogram model into a three-dimensional spatio-semantic semivariogram by incorporating the distribution pattern and influence of the surrounding spatial features into the variance modeling. One of the most popular kriging-based methods, ordinary kriging (OK) (Stein, 1999), has been extended with the quantified semantic knowledge for modeling the relatedness between the sample points. For quantifying this new knowledge of the terrain, the association between the representative spatial features of the sample points is mapped such that it adheres to Tobler's law of spatial proximity (Tobler, 1970). That is, semantically similar and correlated features should be assigned more weight than dissimilar features, and vice versa.
The relations among the different land-cover classes are measured by organizing them into an ontology hierarchy [(Bhattacharjee et al., 2014) (Bhattacharjee et al., 2012)]. It is built with the exhaustive feature set (LULC classes) of the spatial region of interest (RoI). Figure 2 presents an example ontology hierarchy for the spatial study region Kolkata, India. It is constructed with reference to the LULC classification classes reported by NRDMS, India in (Mendiratta et al., 2008). The "is-a" (hyponym) semantic relation is considered here to map the type of association between classes; however, any semantic relation could have been used, depending on the application requirement.
For the quantification of this semantic knowledge, each of the sampled locations is then mapped to the most appropriate representative leaf feature in the hierarchy. Two metrics have been proposed for analyzing the association between a pair of leaf features in the hierarchy, namely semantic similarity and spatial importance. For the actual interpolation process, these two metrics are used for mapping the traditional covariance measure into a higher semantic dimension.

Semantic Similarity
This metric is evaluated between any pair of sample points, or rather between their representative leaf LULC classes in the ontology hierarchy. A modified context resemblance method (Manning et al., 2008) has been used here for modeling the metric. The semantic similarity between the i-th and j-th sample points (or their representative features $f_i$ and $f_j$ in the ontology) is denoted as $SS_{ij}$. It is computed from $|f_i|$ and $|f_j|$, the total numbers of LULC classes in the i-th and j-th feature paths, and from $m_i$ and $m_j$, the numbers of LULC classes in the respective paths that match in both paths of the ontology.
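As a rough illustration of how a path-based score of this kind behaves, the following Python sketch compares root-to-leaf paths of an ontology. It is not the SemK equation: the averaging of two (1 + m)/(1 + |f|) ratios is an assumption loosely modeled on context resemblance, a single match count m (the length of the shared root prefix) stands in for $m_i$ and $m_j$, and the function name, class names, and paths are hypothetical.

```python
def semantic_similarity(path_i, path_j):
    """Path-overlap similarity between two leaf classes (assumed form).

    path_i, path_j: lists of class names from the ontology root to each leaf.
    """
    # m: number of classes shared by both paths, counted from the root down
    m = 0
    for a, b in zip(path_i, path_j):
        if a != b:
            break
        m += 1
    # Context-resemblance-style ratios, averaged over both paths (assumption)
    ratio_i = (1 + m) / (1 + len(path_i))
    ratio_j = (1 + m) / (1 + len(path_j))
    return 0.5 * (ratio_i + ratio_j)


# Hypothetical "is-a" paths; the class names are illustrative only
forest = ["LULC", "vegetation", "forest"]
agriculture = ["LULC", "vegetation", "agriculture"]
builtup = ["LULC", "built-up"]

print(semantic_similarity(forest, forest))
print(semantic_similarity(forest, agriculture))
print(semantic_similarity(forest, builtup))
```

Under this assumed form, identical paths score 1, and the score falls as the two paths diverge closer to the root, which is the qualitative behavior the metric is meant to capture.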

Spatial Importance
This metric is the more pragmatic one, as it considers real-time sample points from the terrain. Hence, it may change with respect to the study region, the time, and the attribute to be predicted. The spatial importance between each pair of leaf LULC classes is measured by correlation analysis between the sampled locations represented by these features, with respect to the prediction attribute. First, the entire RoI is divided into k non-overlapping zones $R_1, \ldots, R_k$ such that $\bigcup_{i=1}^{k} R_i = \mathrm{RoI}$, and k pairs of sample points (representing the pair of features) are chosen, one pair from each zone. Next, the pairwise correlation score is measured over these k pairs of sample points, and this score is termed the spatial importance between the pair of features. The spatial importance between the i-th and j-th sample points, or their representative features $f_i$ and $f_j$ in the ontology, is denoted as $SI_{ij}$. In its definition, $Z(f_p^q)$ represents the random field value of the q-th sample point representing the feature $f_p$; $\bar{Z}(f_p)$ represents the average of the random field values of the feature $f_p$ over the k sample points; and MP is the meteorological attribute to be predicted.
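The correlation step described above can be sketched in Python as follows. The sketch assumes that the "pairwise correlation score" is the Pearson correlation coefficient over the k paired samples, which is consistent with the sample values and per-feature averages named in the text but is not confirmed by it. The function name and the LST readings are hypothetical.

```python
import math


def spatial_importance(values_i, values_j):
    """Pearson correlation between k paired samples of two LULC classes.

    values_i[q], values_j[q]: attribute values of the q-th sample pair,
    one pair per zone (assumed pairing).
    """
    if len(values_i) != len(values_j) or len(values_i) < 2:
        raise ValueError("need k >= 2 paired samples")
    k = len(values_i)
    mean_i = sum(values_i) / k
    mean_j = sum(values_j) / k
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(values_i, values_j))
    var_i = sum((a - mean_i) ** 2 for a in values_i)
    var_j = sum((b - mean_j) ** 2 for b in values_j)
    if var_i == 0 or var_j == 0:
        raise ValueError("correlation is undefined for constant samples")
    return cov / math.sqrt(var_i * var_j)


# Hypothetical LST readings (degrees Celsius), one pair per zone, k = 5
lst_builtup = [34.1, 35.0, 33.2, 36.4, 34.8]
lst_wasteland = [32.9, 33.8, 32.1, 35.0, 33.5]
print(round(spatial_importance(lst_builtup, lst_wasteland), 3))
```

Because the score depends on the sampled values, rerunning it for another region, year, or attribute can give a different value for the same pair of classes, in line with the remark that this metric is data dependent.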

Time-series Semantic Kriging (SemKts)
For the time-series augmentation of the semantic kriging approach, the two semantic metrics mentioned above, namely semantic similarity and spatial importance (Bhattacharjee et al., 2014), are extended to the temporal dimension. This modeling is presented in (Bhattacharjee and Ghosh, 2015). Of the two semantic metrics, the evaluation of the semantic similarity depends solely on the structure of the ontology hierarchy. As this hierarchy is built with the exhaustive spatial feature set, it remains static with respect to time. Hence, the time-series covariance (more generally, the relatedness) between any LULC classes with respect to this metric is the same as the spatial covariance between them. Therefore, its temporal evaluation (time-series semantic similarity) is the same as the spatial semantic similarity (Bhattacharjee et al., 2014).
However, as the spatial importance uses real-time data for its evaluation, the temporal lag between the sample points is highly significant for this metric. Given any two sampled locations $(x_i, t_{-i})$ and $(x_j, t_{-j})$, represented by the land-cover classes $f_i$ and $f_j$ respectively, the whole study region is again divided into k random zones. The k pairs of sample points are chosen and measured for both $f_i$ and $f_j$, but at different time instances, $t_{-i}$ and $t_{-j}$, respectively. The correlation analysis is carried out between these k pairs. The temporal extension of the spatial importance metric is termed time-series importance ($SI^{ts}$). In the evaluation of the time-series importance $SI^{ts}_{ij}$ between any i-th and j-th sample points (or their representative LULC classes $f_i$ and $f_j$), $Z_{t_{-u}}(f_p^q)$ represents the attribute value of the q-th sample point representing the feature $f_p$ at time instance $t_{-u}$, and $\bar{Z}_{t_{-u}}(f_p)$ is the average attribute value of the feature $f_p$ over the k sample points at the time instance $t_{-u}$.
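For orientation, one way to write such a score is the Pearson correlation between the two classes, with each class sampled at its own time instance. The form below is an assumption built only from the symbols defined in this subsection; it should not be read as the equation of (Bhattacharjee and Ghosh, 2015).

```latex
SI^{ts}_{ij} =
\frac{\sum_{q=1}^{k}
      \bigl(Z_{t_{-i}}(f_i^{q}) - \bar{Z}_{t_{-i}}(f_i)\bigr)
      \bigl(Z_{t_{-j}}(f_j^{q}) - \bar{Z}_{t_{-j}}(f_j)\bigr)}
     {\sqrt{\sum_{q=1}^{k}\bigl(Z_{t_{-i}}(f_i^{q}) - \bar{Z}_{t_{-i}}(f_i)\bigr)^{2}}\;
      \sqrt{\sum_{q=1}^{k}\bigl(Z_{t_{-j}}(f_j^{q}) - \bar{Z}_{t_{-j}}(f_j)\bigr)^{2}}}
```

When $t_{-i} = t_{-j}$, this assumed form reduces to a purely spatial correlation between the two classes.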

REV-SEMKTS: REVERSE TIME-SERIES SEMANTIC KRIGING
This section presents a detailed description of the proposed reverse time-series semantic kriging (Rev-SemKts) framework.
The overall process flow of the framework is depicted in Figure 3. In this framework, the meteorological data and the LULC distribution of the terrain are considered as the input. After quantification of the semantic knowledge (Bhattacharjee et al., 2014), the temporal and spatial inter-relationships among the meteorological attributes and the different LULC classes are evaluated. Considering the 3D spatio-semantic and temporal-semantic semivariogram models, the future LULC distribution pattern is forecast using the proposed Rev-SemKts approach. It can be considered as the multivariate extension of the SemKts technique (Bhattacharjee and Ghosh, 2015).
This work assumes that if there are N sampled locations at past time instances, any i-th sample point can be characterized as $(x_i, \{Z1_i, Z2_i, \ldots, Zp_i\}, LC_i, t_{-i})$, that is, the sampled location $x_i$ measured at the past time instance $t_{-i}$. The p meteorological attributes are measured as $Z1_i, Z2_i, \ldots, Zp_i$ at the sample point $x_i$, and its corresponding land-cover $LC_i$ (or $f_i$) is also known. In our analysis, two meteorological attributes, LST and NDVI, are considered; hence p = 2. For the forecasting of $LC_0$ at the interpolation point $x_0$, the past land-cover information alone could have been used. However, the incorporation of other meteorological knowledge may yield a more accurate estimation. The proposed framework extends time-series semantic kriging (SemKts) (Bhattacharjee and Ghosh, 2015) to the multivariate domain, where multiple secondary attributes are considered for predicting the primary LULC information of the terrain. The general assumption in the multivariate scenario is that the secondary attributes must be correlated with the primary attribute, and that they are estimated jointly using a BLUE (best linear unbiased estimator) (Henderson, 1975). In the univariate scenario, the estimation equation of semantic kriging can be given as
$$\hat{Z}(x_0) = \sum_{i=1}^{N} w_i \, Z(x_i),$$
where $\hat{Z}(x_0)$ is the estimated primary attribute value at the prediction point $x_0$, $N$ is the total number of interpolating points, $Z(x_i)$ is the attribute value at the i-th interpolating point, and $w_i$ is the weight assigned to that point. In multivariate semantic kriging, as two time series [LST (Z1) and NDVI (Z2)] are considered as the secondary attributes for the estimation, two multivariate estimation equations are formed over the weights $wp_i$. Being an extension of ordinary kriging, both equations are subject to the ordinary kriging constraints on the weights. The weight $wp_i$ assigned to each interpolating point is calculated from the semivariogram model.
In the spatial context, a plot of semivariance versus distance between known sample points is termed a semivariogram. In the Rev-SemKts framework, the relationship between the meteorological attributes and the LULC classes is determined using such semivariogram models. Therefore, the concept of the semivariogram is further extended to the semantic and temporal dimensions, resulting in spatio-semantic and temporal-semantic semivariogram models. In our application, the prediction attribute Z being the land-use/land-cover of the terrain, the quantification of this knowledge is carried out using the methods specified in Section 3. Once this knowledge is quantified, the corresponding changes in the primary attribute with respect to the secondary attributes, in the space and time domains, are evaluated using the semivariogram models. For the two individual auxiliary time series, the spatio-semantic and temporal-semantic semivariogram models are depicted in Figures 4, 5, 6, and 7, respectively. These semivariograms are created by taking the distance/temporal lag (X axis) and the change of the meteorological attribute (Y axis) as the independent variables, and the change of LULC as the dependent variable (Z axis).
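The classical spatial building block behind these plots, the empirical semivariogram, can be sketched in Python as follows. The sketch uses the standard estimator, half the mean squared difference of values per distance bin. It does not reproduce the three-dimensional spatio-semantic or temporal-semantic surfaces of Figures 4 to 7, and the function name, coordinates, and values are hypothetical.

```python
import math
from collections import defaultdict


def empirical_semivariogram(points, values, bin_width):
    """Return {lag-bin centre: semivariance} for 2-D sample points.

    Classical estimator: gamma(h) = sum((z_i - z_j)^2) / (2 * N(h)),
    where N(h) is the number of point pairs whose distance falls in the bin.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(points[i], points[j])  # Euclidean distance
            b = int(h // bin_width)              # index of the lag bin
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    return {(b + 0.5) * bin_width: sums[b] / (2 * counts[b])
            for b in sorted(counts)}


# Hypothetical sample locations (km) and NDVI values
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.5), (2.0, 2.0)]
ndvi = [0.62, 0.58, 0.40, 0.21]
for lag, gamma in empirical_semivariogram(points, ndvi, bin_width=1.0).items():
    print(f"lag {lag:.1f} km: semivariance {gamma:.4f}")
```

A temporal variant would bin pairs by time lag instead of distance, and a model curve fitted to the binned values would then supply the semivariances used in the kriging system.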
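However, the knowledge of both meteorological attributes can also be used to evaluate how the LULC changes with respect to the change in both attributes together. In that case, the semivariograms between two or more semivariograms (cross-semivariograms) (Vauclin et al., 1983) should be modeled. Figures 8 and 9 depict the change in LULC with respect to both LST and NDVI together, in the spatial and temporal domains, respectively.
Once these semivariograms are modeled, the SemKts approach is carried out. The traditional spatio-temporal semivariance matrices $C^t$, also referred to as $W^t_1$ (with respect to the sampled locations), and $D^t$ (with respect to the sampled and unsampled locations) are modified with the semantic semivariogram measures, resulting in the $C^{ts}$ and $D^{ts}$ matrices for LULC, respectively. Each element of the two matrices, i.e., $C^t_{ij}$ in $C^t$ and $D^t_{0i}$ in $D^t$, is modified as $C^{ts}_{ij} = \Delta f_{ij}$ and $D^{ts}_{0i} = \Delta f_{0i}$, where $\Delta f_{pq}$ denotes the amount of change in LULC, given by $SI^{ts}_{pq} \cdot SS_{pq}$. Hence, as already derived in (Bhattacharjee et al., 2014), the weight vector produced by the proposed Rev-SemKts framework is given as
$$W_{\text{Rev-SemK}_{ts}} = \left(C^{ts}\right)^{-1}\left[D^{ts} - \lambda^{ts}\mathbf{1}\right],$$
where $\lambda^{ts}$ is the time-series Lagrange multiplier of Rev-SemKts.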
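A weight vector of the form $W = C^{-1}[D - \lambda\mathbf{1}]$ can be computed as in the following Python sketch. It assumes the usual ordinary kriging condition that the weights sum to one, which fixes $\lambda = (\mathbf{1}^T C^{-1} D - 1) / (\mathbf{1}^T C^{-1} \mathbf{1})$. The matrix entries are hypothetical stand-ins for the $SI^{ts} \cdot SS$ products, and the function names are illustrative.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def kriging_weights(c_ts, d_ts):
    """Return (W, lambda) with W = C^{-1}[D - lambda*1] and sum(W) = 1."""
    n = len(d_ts)
    c_inv_d = solve(c_ts, d_ts)
    c_inv_1 = solve(c_ts, [1.0] * n)
    lam = (sum(c_inv_d) - 1.0) / sum(c_inv_1)  # enforces sum(W) = 1
    return [u - lam * v for u, v in zip(c_inv_d, c_inv_1)], lam


# Hypothetical 3-point example; entries stand in for SI_ts * SS products
c_ts = [[1.0, 0.6, 0.2],
        [0.6, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
d_ts = [0.7, 0.5, 0.3]
weights, lam = kriging_weights(c_ts, d_ts)
print(weights, sum(weights), lam)
```

The resulting weights would then enter the linear estimator as a weighted sum of the values at the interpolating points.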

EXPERIMENTAL RESULTS
For the performance analysis of the proposed approach, an empirical experiment has been carried out with real meteorological data for the study region Kolkata (a metropolitan city in India, with central coordinates 22.567° N, 88.367° E). Satellite remote sensing imagery (Landsat ETM+) offered by the United States Geological Survey (USGS) 1 has been used for this study. The imagery is processed further with standard satellite image processing tools to obtain the land surface temperature (LST) and normalized difference vegetation index (NDVI) data of the terrain. A supervised classification of the satellite imagery yields the LULC distribution map of the study region. The signatures of six different land-cover classes have been considered for classifying the terrain: forest, agriculture, wastelands, built-up, waterbodies, and wetlands. For the time-series analysis, eleven years of data have been considered, covering 2000-2010. In order to best fit the semivariogram models, the data of the first ten years (2000-2009) have been used for forecasting the LULC distribution map in the year 2010.
1 http://www.usgs.gov/
The forecasting has been carried out for some unsampled locations of the terrain by the proposed Rev-SemKts approach. Two zones in the region Kolkata are further subdivided into five subzones, where the predictions have been carried out separately. Three types of external drift have been considered: LST, NDVI, and both LST and NDVI. One standard error metric, namely root mean square error (RMSE), is considered to quantify the prediction accuracy. It may be observed from the case study that the proposed approach with an external drift of more than one correlated meteorological attribute forecasts the future LULC distribution pattern better than the other two cases (drift with a single attribute).
As mentioned, two spatial zones have been considered from the whole study area, as shown in Table 1 (the bounding box is specified as [lower left corner, upper right corner]). The forecast LULC imagery of both zones is also depicted in Table 1. The actual LULC image is shown along with the imagery predicted by Rev-SemKts using the drift of LST, NDVI, and both LST and NDVI, respectively. It has been found that Rev-SemKts with a larger number of meteorological attributes as drift generates a better forecast image and results in a higher PSNR (peak signal-to-noise ratio) than with a single-attribute drift (≈10-15 dB higher).
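The two error measures named in this section can be computed as in the following Python sketch. It uses the standard definitions, RMSE as the square root of the mean squared error and PSNR as 20·log10(MAX/RMSE). The peak value of 255 assumes 8-bit imagery, which the text does not state, and the function names and pixel values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long pixel sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


def psnr(actual, predicted, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value = 255 assumes 8-bit pixels."""
    err = rmse(actual, predicted)
    if err == 0:
        return math.inf  # identical images
    return 20 * math.log10(max_value / err)


# Hypothetical flattened 8-bit images (actual vs. two forecasts)
actual = [10, 50, 50, 200, 200, 120]
single_drift = [10, 80, 50, 200, 150, 120]
double_drift = [10, 55, 50, 200, 195, 120]
print(psnr(actual, single_drift), psnr(actual, double_drift))
```

A higher PSNR corresponds to a lower RMSE against the actual image, so the two measures rank competing forecasts of the same zone in the same order.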

CONCLUSION
Urban and city planning has drawn attention in various fields of study over the last few decades. It primarily focuses on the distribution pattern of the land-use/land-cover (LULC) of the terrain. Hence, anticipating the future behavioral change of LULC is one of the major research challenges in the field of remote sensing and GIS. This work focuses on forecasting the future LULC pattern of a region based on different correlated attribute information.

Figure 3. Rev-SemKts framework.
Figure 4. Spatio-semantic semivariogram with respect to LST.
Figure 5. Spatio-semantic semivariogram with respect to NDVI.