GRAPH-BASED URBAN LAND USE MAPPING FROM HIGH RESOLUTION SATELLITE IMAGES

Due to the dynamic character of urban land use (e.g. urban sprawl) there is a demand for frequent updates for monitoring, modeling, and controlling purposes. Urban land use is an added value that can be indirectly derived with the help of various properties of land cover classes that describe a certain area and create a distinguishable structure. The goal of this project is to extract land use (LU) classes out of a structure of land cover (LC) classes from high resolution Quickbird data and additional LiDAR building height models. The study area is Rostock, a German city with more than 200.000 inhabitants. To model the properties of urban land use a graph based approach is adapted from other disciplines (industrial image processing, medicine, informatics). A graph consists of nodes and edges while nodes describe the land cover and edges define the relationship of neighboring objects. To calculate the adjacency that describes which nodes are combined with an edge several distance ranges and building height properties are tested. Furthermore the information value of planar versus non-planar graph types is analyzed. After creating the graphs specific indices are computed that evaluate how compact or connected the graphs are. In this work several graph indices are explained and applied to training areas. Results show that the distance of buildings and building height are reliable indicators for LU-categories. The separability of LU-classes improves when properties of land cover classes and graph indices are combined to a LU-signature.


INTRODUCTION
The UN Habitat announced in their annual report in 2010 that 52% of the global population is living in cities.By 2030 it is forecasted that 60% of the world will be urban (UN Habitat, 2011).Urbanization is the least reversible human dominated LU-type.The consequences range from land cover change to climate impacts, habitat loss, or extinction of species.Additionally it influences transportation developments, energy demand, or the automobile market (Seto et al., 2011).The main drivers in Europe, mentioned by the European Environment Agency, are e.g.population increase, rising living standards and improvement of quality of life, economic growth and globalization, policies and regu-lations, low transport costs, availability of roads et cetera (European Environment Agency, 2010).Therefore there is a need for urban LU as an important information for various planning applications, for political decision making as well as for the monitoring of ecosystem changes, disaster management, or the analysis of quality of living.Urban LU-types are "characterized by the arrangements, activities and inputs people undertake in a certain land cover type to produce, change or maintain it" (Di Gregorio, 2005).Currently urban LU-types are collected from surveying, mapping, digitizing of aerial imagery, population statistics, or inquiries.The benefits of a derivation from satellite images are the high temporal resolution coincidental with a constant coverage and an area wide availability.The trained human eye can distin-guish between a residential area, an industrial area, or a city core in high resolution satellite images.This is because the human perception works in a holistic way.The Gestalt-theory defines rules that describe how the human brain interprets complex structures.Proximity and similarity for example suggest patterns.Furthermore we try to simplify complex objects into basic geometrical shapes.The capability of the human brain can handle many of the rules at the same time.The challenging part is the implementation and combination of Gestalt rules and the emulation of the human brain for automated image interpretation.Antrop and Van Eetvelde (2000) analyzed landscape metrics in suburban areas with regards to holistic characteristics.The aim of the project described in this paper is the extraction of urban LU-categories using holistic land cover properties for graph generation and analysis.Graphs, containing nodes and edges, are an abstract concept of real world phenomena representing objects and their relationship among each other.Land cover objects (e.g.building, tree, impervious surface) define the nodes while the edges are the connection in relation to properties like distance, area or perimeter, building height, etc..After the description of the study site and the available datasets the graph theory is introduced.Some graph indices that characterize the connectivity or complexity of the graph are defined followed by a description how the graphs in the study area are computed and analyzed.The results are discussed and the future work is suggested.

RELATED WORK
Medicine and informatics disciplines concerning image processing and interpretation have been dealing with graph theory for structural analysis and classification approaches for a long time.Deuker et al (2009) described various graph metrics of complex networks relating to the application in human brain networks.Magnetoencephalography (MEG)-Images from human brains in a resting state and solving a memory exercise were analyzed.Different frequency interval networks were generated and global graph metrics were calculated (clustering coefficient, path length, small-worldness, assortativity, hierarchy, etc.) and divided into first and second order graph metrics.Gunduz, Yener, and Gultekin (2004) accomplished a classification of brain cancer cells (glioma) based on topological properties in the tissue image.The graphs were created using an exponential function of the Euclidean distance and the Waxman model.Diverse graph metrics and an artificial neural network (ANN) classification were performed resulting in the classes: healthy, cancerous, and inflamed.Graph based derivation of urban LU-types was established by Barnsley and Barr (1997).They developed the data model named XRAG (eXtended Relational Attribute Graph) which consists not only of nodes and edges but also of properties associated with the nodes (area, perimeter) and edges (distance, direction) as well as labeling of the land cover, grouping, and the probability of the assigned land cover class.The approach was based on raster data of a topographic map where land cover classes and their properties were derived.Morphological, relational, and spatial properties were analyzed to distinguish between several urban areas with buildings from different decades.Bauer and Steinnocher (2001) used XRAG implemented in SAMS (Structural Analyzing and Mapping System) as a prerequisite for the definition of rules within eCognition software (Benz et al., 2004) to derive LU on a rule based method.De Almeida, Morley, and Dowman (2007) describe a graph based approach to analyze topological structures of urban regions.LiDAR data was triangulated and the slope of the resulting triangles was computed, followed by a classification of flat and steep polygons.Afterwards, depth-first search (DFS) and breadth-first search (BFS) algorithms were used to analyze how informative the graph-trees for the topological prop-erty of containment are.Anders, Sester, and Fritsch (1999) presented a graph clustering technique to infer higher level structures from detailed cadastral map data.A relative neighborhood graph (RNG) was computed by using a Delaunay Triangulation (DT) and outlier edges were removed.Furthermore the local neighbor density was calculated by using the size of the Voronoi cell around a node and the mean distance from the DT of linked nodes as an estimator.

STUDY SITE AND DATASET
The study area is Rostock, a German Hanseatic city with more than 200.000 inhabitants built along the Warnow river until its embouchure in the Baltic Sea.A cloud-free 16-bit Quickbird satellite image with four channels (blue, green, red, and nearinfrared) from September 2009 was used (Figure 1).After atmospheric corrections a pan-sharpening of the multi-spectral image data was accomplished, resulting in a spatial resolution of 60 cm per pixel at the ground.LiDAR data from 2006 with a point density of two points per square meter was normalized with the digital elevation model to obtain the relative building and vegetation heights (nDSM).Additional cadastral parcel and building polygons from January 2011 and an urban land use map, digitized from orthophotos from 2007, were beneficial for testing the methods on ideal training areas and will be helpful for the validation of the results .The Quickbird Ortho Ready Standard Imagery product did not include topographic correction and had a CE90 (Circular Error with 90% level of confidence) of 23 m or higher, depending on the terrain and the viewing angle (Cheng et al., 2003).For this reason a projective transformation with 30 well distributed ground control points from the cadastral building polygons and 3D information from the LiDAR elevation model was accomplished.In the first step ideal test areas for the different LU-classes were chosen.Therefore the essential land cover classes were combined from cadastral building polygons and a fusion of classes from the land use map and the nDSM.The following land cover categories were derived: Built up, low built up, paved, trees, shrub, grass, water, and open land.

GRAPH THEORY
Graph theory is a branch of mathematics that had its initiation most likely in 1736 when Euler abstracted the problem of the "Bridges of Königsberg" into a simple graph and proved that a walk around Königsberg crossing every bridge uniquely is impossible.It is an interdisciplinary science with various applications e.g.routing problems (shortest path algorithms, Traveling Salesman problem), social and cognitive networks, technological networks like the Internet, cell networks or ecological networks, infrastructure networks, flow models, minimal costs computation, and many more.A graph consists of vertexes (nodes) and edges (links) (Caldarelli, 2007).A graph can be described with an adjacency matrix.The rows and columns of the adjacency matrix represent the vertexes while the entries denote if the vertexes are connected (1) or not (0).Graph indices represent structural properties of graphs.The beta index ( 1) is a measure of connectivity and is computed by the ratio of edges (e) over vertexes (v) (Rodrigue et al., 2009).
The gamma index is a ratio of the observed edges over the number of maximum possible edges.Its value ranges between 0 and 1, where 1 represents a complete graph with no subgraphs.The formulas depend on the graph type.A planar graph (2) has no edge intersection outside of a vertex.In a non-planar graph (3) edge intersection must not result in a vertex followed by a higher number of possible links (Rodrigue et al., 2009, Stuttgart, 2011). (3) Figure 2 illustrate the difference between planar and non-planar graphs of a residential detached house area.The vertexes were generated from the centroids of the building polygons.The planar graph is the outcome of a Delaunay Triangulation.Another measure for connectivity is the alpha index which specifies the redundancy of a net.It is defined as the number of cycles (meshes) over the maximum possible cycles.The achieved values vary between 0 and 1, while 1 represents a completely connected graph.The formulas depend on the planar (5) and nonplanar (6) graph type as well.A mesh is a chain where the start and end node are the same and does not cross a link more than once.The number of cycles (u) ( 4) is estimated by a combination of nodes, links, and number of sub-graphs (p) which are subsets of a graph (Rodrigue et al., 2009, Stuttgart, 2011).
More extensive graph measures that consider the internal complexity of a graph could reveal structural differences between networks.The clustering coefficient, also referred to as transitivity in social network analysis, specifies if the neighbors of a vertex are also neighbors of each other.It unveils cluster or communities which have a similar degree of links.The clustering coefficient is divided into local and global, while the local clustering coefficient defines the clustering of a specific node, the global describes the clustering of the total graph.The local clustering coefficient (7) of a node is the proportion of edges of the vertexes in its neighborhood (e jk ) by the maximum possible edges that could occur between them.Therefore the degree of the vertex (ki) is essential, which indicates the number of outgoing edges or the number of adjacent nodes (Rodrigue, 2009).
The global clustering coefficient (8) indicates the overall probability of a graph that neighboring nodes are connected.It is defined as the proportion of observed number of closed triplets over the maximum number of triplets.A triplet is a set of 3 nodes that is connected by two (open) or three (closed) edges (Rodrigue, 2009).
The assortativity ( 9) is a coefficient of the Pearson correlation between node degrees.The derivation is well described in Caldarelli (2007) or Newman (2002).It refers to the preference of a vertex to connect to vertexes of similar degree.The values range from -1 (disassortative graph) to 1 (assortative graph).In disassortative graphs nodes with very different degrees are connected and they often occur in strong hierarchical structures while social networks are typically assortative networks (Bassett et al., 2011, Deuker et al., 2009, Caldarelli, 2007, Rodrigue, 2009).The network assortativity function is stated as where ji and ki are the degrees of the nodes at the end of the ith edge and E as the total number of edges (Newman, 2002, Bassett et al., 2011).

DATA ANALYSIS
The first goal of the project was to define urban typology and to test the graph measures on significant training areas with the help of the properties of ideal cadastral and topographic data.Six meaningful urban LU-classes relevant for German urban fabric were established and listed in Table 1.Specific characteristics such as imperviousness or green area index were transfered from Banzhaf and Höfer (2008) .The German Federal Land Utilization Ordinance (Bundesministerium der Justiz, n.d.) constitutes the upper limit of the measure of the building area (10) and of the floor area (11).Both indices are allocated to the urban LU-classes in Table 1.

Analysis of planar and non-planar building graphs
To generate the graphs and compute the graph indices Matlab R in combination with the additional Matgraph and the Brain Connectivity Toolbox were used.After importing the land cover polygons the centroids, representing the graph nodes, were computed to approximate the objects.To create planar graphs a Delaunay Triangulation of the building centroids was performed, which represents the spatial relationship between the building objects and can be referred to as relative neighborhood graph (RNG).
Afterwards edges with a distance above 50 m were thinned out.
Non-planar graphs were generated with the help of the distance matrix containing all distances from one building node to all other building nodes.To compare the graph types edges above 50 m were deleted respectively.The graph indices were computed and the results are summarized in Table 2 and Table 3 The absent values for allotment areas resulted from lacking building objects over 3 m in height.Comparing graphs containing low built up objects the allotment areas had high values of beta, gamma, and alpha indices while most of the other LU-classes had zero entries.Table 2 shows that the beta, gamma, and alpha indices correlate with each other and that the city center had the highest values, representing the dense building structure.The lowest values were achieved by the industrial area.This indicates that the distance of 50 m is not sufficient for a well connected graph.Because of the low assortativity values the planar graphs of all LU-classes were neither assortative nor disassortative.The administrative district had the highest clustering value for planar and non-planar graphs which denoted the occurrence of several communities.The gamma and alpha indices of nonplanar graphs correlated while the beta index was not in line with the other two connectivity measures.Nevertheless, the ranking of the LU-classes regarding the values of the beta index was the same as with planar graphs.The values for assortativity increased immensely for non-planar graphs especially for the city center, which is a sign for an evenly distributed node degree and a homogeneous structure.

Comparison of different distance ranges of planar building graphs
For the second approach planar building graphs were tested with different distance ranges.After a Delaunay Triangulation of the Centroids of the LU-training classes the distribution of the distances of the edges were visualized in a box plot (Figure 3) to derive the distance ranges from the lower and upper quartile and the median.The overlapping distance ranges of 0-30 m, 10-40 m, 30-60 m, and 30-80 m were achieved to allow us investigate the separability.The edge outliers of the graphs were thinned out and the measures were computed.Looking at the different distance ranges the graph structure changes.E.g. in the industrial area the graph changes from a sparsely connected in the 0-30 m range to a multiple connected graph in the 30-80 m range.
The geometric graph visualization of the industrial area during the different ranges is shown in Figure 4.

Using graph based measures for creating a LU-signature
Because beta, gamma, and alpha index correlate in planar graphs just the beta index was discussed.Furthermore the idea of a beta index-based LU-signature was introduced.In Figure 5 the beta index of the LU-training areas for the 4 distance ranges was presented.Administrative district, city center, and residential single family houses started with a higher beta index and showed a decline in the classes with the broader distances.The beta index of the industrial area and residential block buildings showed the opposite behavior and rose with broader distances.A straight line combines the values to distinguish the separability.The approach for the future work is to develop a land use signature for several graph measures and to assign the LU-classes on the basis of their membership to a specific signature.
Also the assortativity showed significant changes passing the dis-    The conclusion from the values in Table 4 was that residential single family houses and the city center have similar densities for red and dark roofs.The industrial area and the administrative district appeared to have low or no red roofs.A similar study has to be done for metallic roofs because we expect dense structures for industrial areas.

CONCLUSION AND FUTURE WORK
The introduced graph measures applied on ideal training areas showed potential for land use class extraction.The beta index of planar and non-planar graphs identified the city center as the densest structure while the clustering coefficient showed the highest values for the administrative district.Beta, gamma, and alpha index correlated in planar graphs.In contrast to non-planar graphs where gamma and alpha index correlated, the beta index did not correlate.The assortativity values for non-planar graphs increased compared to those from planar graphs.The city center appeared as an assortative graph which was an indicator for an evenly distributed node degree and hence a homogeneous structure.The beta index signature of distance range-graphs was a significant characteristic showing the development of graphs according to the edge attribute of distance.Single family houses and the city center showed a decrease of the beta index with broader distances while industrial areas and residential block buildings have the opposite properties.The significance of red and dark roof types was investigated revealing that all residential urban LU-classes (single family buildings, block buildings, city center) had similar densities for red and dark roofs.Industrial areas and administrative areas had a denser structure of dark roofs and an empty graph (no edges) for red roofs (denoting the absence or isolated occurrence).A study with metallic (bright) roofs should be performed assuming a frequent appearance in industrial areas.
Further research has to be done concerning graphs with several node (land cover type, area, compactness, etc.) and edge (distances, orientation, weights, etc.) attributes.The indices have to be applied on a set of reference areas as well as on derived land cover datasets from high resolution satellite images.Additional to the distance as a spatial attribute also the building heights, areas, and direct neighborhood will be analyzed.Allotment areas are well distinguished with respect to the building height as a relative characteristic for graph generation.The green area index, imperviousness, building area as well as floor area index as indicated in the urban land use definitions should be converted into graph structure characteristics.The mentioned land use signature has to be developed further based on the combination of various graph properties in order to achieve a better LU-class separability.The adoption of supervised machine learning methods will be utilized to analyze various graphs and graph measures simultaneously.

Figure 2 :
Figure 2: planar(a) and non-planar(b) graph of a residential detached house area

Figure 5 :
Figure 5: Beta index of planar graphs for different distance ranges

Figure 6 :
Figure 6: Assortativity of planar graphs for different distance ranges

Table 1 :
Urban land use classes

Table 3 :
. Graph measures of non-planar building graphs

Table 4 :
Beta index of different roof type graphs