DEFINITION OF CONTOUR LINES INTERPOLATION OPTIMAL METHODS FOR E-GOVERNMENT SOLUTIONS

Citizens expect high-quality and rich e-Government solutions. Map-based applications can be significantly improved by using digital elevation models (DEMs). DEMs can be produced using expensive modern remote sensing solutions (e.g., LIDAR), but such data are not available to the general public. Thus, it is very effective to use contour line maps to derive high-quality DEMs. For many cities around the world, contour line maps are available under liberal licenses. Defining an optimal algorithm for contour line interpolation and evaluating the quality of the resulting DEM is an important challenge. In this article, we propose a simple method allowing users to evaluate the quality of a DEM produced from a contour line map and to define an optimal interpolation algorithm. The method was tested on Turin (Italy) data sets. The results were confirmed by visual analysis. The approach is utilized in existing e-Government WebGIS services. This work introduces an information theory-based approach to DEM quality assessment. The results can be utilized in various domains related to DEM quality assurance.


INTRODUCTION
In contrast to regular e-Government solutions, e-Government WebGIS services provide map-based applications. Users expect high-quality data in e-Government WebGIS. In the frame of the WeGovNow (We Government Now) project (Boella et al., 2018), funded by the EU Horizon-2020 program, a number of map-based web applications are being developed. WeGovNow is a research and innovation action focused on civic participation in local government. It aims at using state-of-the-art digital technologies in community engagement platforms to involve citizens in decision-making processes within their local neighborhood. WeGovNow enables a new type of interactivity in the coproduction of citizen-centered services and in the co-development of strategic approaches to community development. Several e-Government components and related services are provided in the frame of the project. These components provide the following facilities: coordination of and collaboration on citizens' urban activities, reporting of local issues to a public administration, opinion formation on a given issue, and web mapping tools. All components implement map-centered web applications.
One of the important data types required for such applications is a digital elevation model (DEM) or digital terrain model (DTM). Nowadays, researchers and practitioners obtain DEMs produced by remote sensing methods (e.g., LIDAR). Such data are quite expensive and rarely accessible to the general public. In e-Government solutions, it is very important to use freely accessible data. Thus, DEMs for e-Government solutions are very often derived from contour line maps. In many cases, contour line maps are available for wide areas under open data licenses. This makes it possible to provide higher-quality map-based web services for citizens. Accessible DEMs improve map visualization, allow users to build 2.5D and 3D maps, extend the abilities for data processing and analysis, etc.
The variety of available algorithms for vector data interpolation may confuse the user. The user wants to be sure that the selected algorithm produces the best DEM possible, or at least a satisfactory one. In this work, we propose an information theory-based approach for optimal algorithm determination. The approach uses the entropy of Voronoi areas to assess the quality of a calculated DEM.
The approach allows the user to automatically detect an optimal algorithm for contour line interpolation. The rest of the article is organized as follows. In Section 2, related work is described. Section 3 introduces the methods. Results are provided in Section 4. A discussion is presented in Section 5.

RELATED WORK
Ardiansyah and Yokoyama (2002) noted that, in comparison to other data sources, contours are a cheap data source because, in most countries, they cover the whole area at different scales. They proposed to interpolate a DEM along the steepest slope perpendicular to a contour. The problem of DEM generation from contour lines remains popular and still attracts attention (Li et al., 2017). Currently, a number of tools are available for creating DEMs from contours (e.g., ANUDEM (Hutchinson, 1989), TAPES-G (Gallant and Wilson, 1996), and TOPOG (Vertessy et al., 1994)). All these methods can create DEMs from contours, providing results that are generally satisfactory.
In this work, we provide a framework for the assessment and comparison of different methods of producing DEMs from contour lines. Only free open source implementations delivered by GRASS GIS (Neteler, 2008) are considered. The interpolation tools provided by GRASS GIS are accessible to the general public and can be easily used on various operating systems for any purpose.

Information Theory
Among promising solutions to our problem, information theory is distinguished by its ability to derive essential information properties of data and to compare data in different forms. Information theory is a basic data communication theory that applies to the technical processes of encoding a signal for transmission and provides a statistical description of the message produced by the code (Businessdictionary). It was proposed by the mathematicians Claude Shannon (Shannon, C., 1948) and Warren Weaver (Weaver, 1949); it focuses on how to transmit data more efficiently and economically, and how to detect errors in its transmission and reception. Information theory belongs to applied mathematics, electrical engineering, and computer science. It includes the quantification of information. Information theory was developed to find fundamental limits for signal processing operations such as compressing data and for reliably storing and communicating data. It has since broadened and found applications in many other areas, e.g., statistical inference, natural language processing, cryptography, neurobiology, the evolution and function of molecular codes, model selection in ecology, thermal physics, quantum computing, linguistics, plagiarism detection, and other forms of data analysis. A key measure of information is entropy, which is usually expressed as the average number of bits needed to store or communicate one symbol in a message. Entropy quantifies the uncertainty involved in predicting the value of a random variable.
In the frame of the theory, logarithmic measures are used intensively. Hartley (Hartley, R., 1928) mentioned that logarithmic measures can be used for several reasons: parameters of engineering importance such as time, bandwidth, number of relays, etc., tend to vary linearly with the logarithm of the number of possibilities; and the logarithmic measure is mathematically more useful. For computer processing, it is natural to use bits: N bits store $2^N$ states, and $\log_2 2^N = N$. In real life, we normally use decimal digits. The correspondence between logarithms to base 2 and base 10 is presented in Eq. 1:

$$\log_2 x = \frac{\log_{10} x}{\log_{10} 2} \approx 3.32 \log_{10} x \tag{1}$$
As mentioned, the most important quantities of information are entropy, the information in a random variable, and mutual information, the amount of information in common between two random variables. The former quantity indicates how easily message data can be compressed, while the latter can be used to find the communication rate across a channel. In this section, only discrete systems are described, because they are the primary case in the context of GIS information. The entropy concept has been derived from the capacity of the discrete channel (C). It can be defined as follows:

$$C = \lim_{T \to \infty} \frac{\log N(T)}{T} \tag{2}$$

where N(T) is the number of allowed signals of duration T. As mentioned, the entropy, usually denoted H, of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X. The general equation can be written as follows:

$$H(X) = -\sum_{x} p(x) \log p(x) \tag{3}$$

where p(x) is the probability of state x. The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing (X, Y). It is defined as follows:

$$H(X, Y) = -\sum_{x, y} p(x, y) \log p(x, y) \tag{4}$$

It implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies.
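The entropy and joint entropy defined above can be illustrated with a minimal Python sketch. The function names and the two example distributions are assumptions chosen for illustration; they do not come from the cited works. The sketch checks numerically that the joint entropy of two independent variables equals the sum of their individual entropies.

```python
import math

def entropy(probabilities, base=2.0):
    """Shannon entropy H = -sum(p * log(p)); states with p == 0 contribute nothing."""
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)

def independent_joint(px, py):
    """Joint distribution of two independent variables: p(x, y) = p(x) * p(y)."""
    return [a * b for a in px for b in py]

# Hypothetical distributions of two independent discrete variables.
px = [0.5, 0.25, 0.25]
py = [0.5, 0.5]

h_x = entropy(px)
h_y = entropy(py)
h_xy = entropy(independent_joint(px, py))

print(h_x, h_y, h_xy)
```

With base 2 the values are expressed in bits; passing `base=math.e` or `base=10` changes only the unit, not the additivity property.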

Information Theory and GIS
Applications of information theory in cartography have focused primarily on defining the type, location, and number of graphic elements (Sukhov, 1970; Pipkin, 1975). Unfortunately, with this type of analysis, there is an issue with the reader's subjectivity and a lack of congruence between the elements on the map and the amount of information extracted from those elements (Salichtchev, 1973). Recent reconsiderations of the value of this approach can be found in (Neumann, 1994; Tobler, 1997; Li and Huang, 2002; Clarke and Battersby, 2001; Noskov and Doytsher, 2014). In general, however, the attention given to the theory in cartography peaked in the 1970s and then languished until the resurgence starting in the late 1990s. Of the works mentioned above, the most critical to information theory is that of Shannon. In Shannon's paper, the potential rate of communication across an imperfect communication channel was quantified. In this, a value was assigned to the quantity of information transferred from a transmitter to a receiver. The quantity of information depends on the number of possible states for each bit of information; when more states are possible, the information potential is greater.
Information theory is very popular for estimating the result of generalization because, theoretically, the entropy of maps with different degrees of generalization derived from one detailed map should be almost identical if only extraneous data is removed. Generalization is a process for removing extraneous detail while still maintaining the characteristics of objects. A good example of the task was demonstrated by Clarke and Battersby (2001).
The recently developed Coordinate Digit Density (CDD) function measures redundancy. The removal of extraneous data from a dataset should increase information content through the reduction of redundancies in the coordinate set. This hypothesis was tested by applying two common generalization algorithms in GIS: the Douglas-Peucker method and ESRI's Bendsimplify. The Douglas-Peucker algorithm was designed to reduce the vertices needed to represent a line, while the Bendsimplify algorithm was designed more to preserve the shape of the original line. When the information contents of generalized datasets are compared using the CDD, a pattern of increased information content is seen with both generalization algorithms. The information content does not increase at the same rate for the two methods. While both datasets show increased information content, there are several extreme spikes in the Bendsimplify data. Closer inspection of the effects of line simplification shows that the Bendsimplify algorithm adds nodes in locations where previously there were no nodes. When this happens, the precision of the original coordinates (all nodes were recorded to seven decimal places of precision) is lost, and the increasing randomness in the decimal places of the data becomes a factor. A 1:24000 line dataset for the central coast region of California was used to examine the CDD. Instead of generalizing with the same tolerance for each dataset, it was generalized to specific levels of point reduction: 50%, 20%, 10%, 5%, and 2% of the original points remaining.
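For readers unfamiliar with the Douglas-Peucker method mentioned above, the following Python sketch shows the usual recursive formulation. It is an illustration only, not the implementation used by Clarke and Battersby (2001); the function names and the sample polyline are assumptions. This variant measures the distance to the baseline segment rather than to the infinite line, which is a common choice.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (all points are (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Keep the end points; keep the farthest vertex only if it exceeds the tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, -1.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            index, max_dist = i, d
    if max_dist <= tolerance:
        return [first, last]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # the split vertex appears in both halves; keep it once

# Hypothetical polyline: a nearly flat start followed by a sharp peak.
line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
simplified = douglas_peucker(line, tolerance=0.1)
print(simplified)
```

The tolerance controls how many vertices survive, which is why the study above tuned it to reach fixed percentages of remaining points rather than using one tolerance for all datasets.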
Another domain of information theory in GIS is estimating the quantity of information presented on a map. The Voronoi diagram is used to implement such tasks. Two approaches can be distinguished: Voronoi neighbors and Voronoi areas. To calculate the entropy of Voronoi neighbors, we use the number of neighbors of each centroid; for Voronoi areas, we use the areas of the polygons. An example of the calculation of the quantity of information was presented by Li and Huang (2002).
In (Wang et al., 2010), another application of information theory is presented. According to the paper, information characteristics should be calculated in different ways for the three main types of geometric features: points, polylines, and polygons. Basic information parameters can be divided into four categories:
1. Spatial distribution: entity-related spatial measure information, also known as geometric information.
2. Property information: describes the substance and non-spatial characteristics, such as the administrative class of a residential area or the types of roads, residing in the attribute tables of GIS dataset layers.
3. Topological information: a description of object coherences. Today, many GIS applications allow the user to manipulate the topology of datasets.
According to (Fu, 2007), the entropy of two independent values X and Y can be written as H(XY). In (Wang, 2010), the amount of information on topographic maps has been calculated in different ways using three types of models: (1) general geometric information measurements, such as length or area, (2) Voronoi graph based measuring, and (3) shape or form information measuring.
Summarizing the literature review, one can conclude that entropy represents the amount of information provided by a map regardless of the coordinate system, orientation, and scale of the map. In theory, maps created at slightly different scales using different cartographic projections and orientations (rotations) should provide a similar quantity of information (entropy) if they represent the same objects. For the generalization case, the entropy of maps should vary gradually and smoothly with changing map scale; steep differences indicate imperfections of the generalization. From this, we can conclude that entropy is a suitable metric for the quality assessment of interpolated contour lines because it allows comparison of source and interpolated contour lines regardless of the position of the polylines.
Moreover, it should be mentioned that, in addition to entropy, other distance-calculation methods (Hamza and Krim, 2006) can be useful for such tasks. For instance, the Kullback-Leibler divergence (Kullback and Leibler, 1951) has been used in many applications including indexing and image retrieval. The Jensen-Shannon divergence is defined between an arbitrary number of probability distributions (Lin, 1991).
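The two divergences mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the distributions are invented, and the Jensen-Shannon divergence is shown only for the special case of two equally weighted distributions, although Lin (1991) defines it for an arbitrary number.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence of two equally weighted distributions, in bits."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical distributions, e.g. normalized area shares on two maps.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

print(kl_divergence(p, q), kl_divergence(q, p), js_divergence(p, q))
```

Unlike the Kullback-Leibler divergence, the Jensen-Shannon divergence is symmetric and, with base-2 logarithms, bounded by 1, which makes it easier to compare across pairs of maps.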

METHODS
To prepare a satisfactory DEM, we need to choose a suitable interpolation algorithm and its parameters. A contour line data file covering the eastern part of the city of Turin was utilized. The file is disseminated under an open data license and is available for free download. To reduce computation time, a sample data set was defined by the extent 412869, 5000841, 416330, 4997814 in meters of the EPSG:32632 projection. The sample data set (see Figure 1) represents relatively complex relief, allowing us to evaluate the interpolation methods comprehensively. According to the metadata, the contour lines were extracted from 1:10000 topographic maps. This corresponds approximately to a spatial resolution of 1.25 meters. In the present article, we decided to use a resolution of 5 meters for the resulting DEM. To apply the different methods, the sample data set is used in three forms: vector lines, rasterized contour lines, and extracted vector vertices (i.e., points).
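The size of the sample extent can be checked with a few lines of Python. The sketch assumes that the four numbers are given in the order west, north, east, south, which is not stated explicitly in the text; the variable names are illustrative.

```python
import math

# Extent of the sample data set in EPSG:32632 (meters); the order
# west, north, east, south is an assumption.
west, north, east, south = 412869, 5000841, 416330, 4997814
resolution = 5  # DEM resolution in meters, as chosen in the article

width = east - west     # east-west extent in meters
height = north - south  # north-south extent in meters
area = width * height   # extent area in square meters

# Number of 5 m cells needed to cover the extent completely.
cols = math.ceil(width / resolution)
rows = math.ceil(height / resolution)

print(width, height, area, cols, rows)
```

Under this reading the extent is 3461 m by 3027 m, and the resulting area of 10476447 square meters agrees with the value of S reported later for the sample.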
The following four algorithms are offered by the standard GRASS GIS (see the GRASS GIS manual for more information):
• r.surf.idw and v.surf.idw: provide surface interpolation from raster or vector point data by inverse distance squared weighting,
• v.surf.bspline: performs bicubic or bilinear spline interpolation with Tikhonov regularization,
• r.surf.contour: generates a surface raster map from rasterized contours,
• v.surf.rst: performs surface interpolation from a vector points map by splines (spatial approximation and topographic analysis from given point or contour line data in vector format to floating point raster format using regularized spline with tension).
We decided to use eight different interpolation methods, all of which are encountered quite often: vsurfidw24, vsurfidw64, rsurfidw24, rsurfidw64, vsurfbspline_bilin, vsurfbspline_bicub, rsurfcontour, and vsurfrst.
In general, the proposed method for the assessment and comparison of the interpolation results can be described as follows. For the source contour lines, we calculate a grown map. "Growing" is a raster analog of the vector "buffering" technique (i.e., adding the area around an object according to a predefined buffer size). Because it is implemented in a raster environment, it avoids the generation of multiple vector artifacts. "Growing" is used here as a tessellation technique: it fills the empty space of a contour line map, so that the whole map area is covered by polygons derived from the rasterized contour lines. This enables us to calculate the entropy of Voronoi areas. This entropy represents the quantity of information provided by a contour line map. From the resulting DEMs, we can extract contour lines.
As mentioned, entropy can be used to retrieve the quantity of essential information provided by a map. Thus, we extract contour lines from the resulting DEMs using different initial heights but the same interval, and compare them with the source map. In theory, the extracted contour lines should have the same entropy as the source map. To eliminate the impact of insignificant variations between the source and resulting contour lines, we divide the contour line interval by 10, obtaining the delta value, and then extract 10 contour line maps with the initial contour line equal to minh + delta * i, where minh is the height of the lowest contour line of the source map and i is the index of the map. Then, for each extracted map, we calculate the entropy and compare it with the entropy of the source map. Using all differences, we calculate the overall difference. The smaller the overall difference, the better the quality of the interpolation result.
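The shifted contour levels described above can be generated as in the following Python sketch. The function name and the three source levels are assumptions for illustration; only the lowest level of 185 m and the average interval of 9 m are taken from the sample data reported in this article.

```python
def shifted_levels(source_levels, interval, steps=10):
    """Return one list of contour levels per step i = 0 .. steps - 1.

    Each source level h is shifted to h + i * delta, with delta = interval / steps.
    """
    delta = interval / steps
    return [[h + i * delta for h in source_levels] for i in range(steps)]

# Hypothetical source levels starting at the lowest contour of the sample (185 m).
source_levels = [185.0, 194.0, 203.0]
levels_per_step = shifted_levels(source_levels, interval=9.0)

for i, levels in enumerate(levels_per_step):
    print(i, levels)
```

Step 0 reproduces the source levels, while steps 1 to 9 probe the DEM between the source contours, where interpolation errors are expected to be largest.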
In Figure 3, the calculated DEMs are represented in 3D. As can be seen, the vsurf* and rsurf* methods generate very similar DEMs, so they are represented by only one 3D model (i.e., one 3D model represents the results of interpolating both vector and raster source data with the different implementations of the corresponding method). The algorithm for determining the best method is presented in the listing. The code is written in the Tcl programming language. Commands in the format "x.y.z" are the names of GRASS GIS 7 modules. A detailed description of the listing is provided below.
Tcl is a simple programming language. Its core component is the command. Every instruction in Tcl consists of a command name and parameters in the following format: command_name parameter1 ... parameterN. Curly brackets and double quotes group the elements of a list. Text inside square brackets is executed as a command. The set command assigns the value given as the second parameter to the variable named by the first parameter. expr evaluates math expressions. proc defines functions. The if, foreach, global, split, incr, lindex, lappend, and lsort commands are used for conditions, loops, accessing a variable in the global scope, splitting strings, incrementing numbers, extracting a value by index, appending an element to a list, and sorting a list, respectively. The module db.select evaluates SQL code. An array is a type of collection; it supports unique indexes.
The contours have been interpolated into DEMs using the described methods. To evaluate the results, techniques close to the estimation of Voronoi areas have been developed. In GRASS GIS, Voronoi polygons can be calculated for points and polygons only. To calculate Voronoi polygons for linear features, the lines need to be converted to areas, e.g., using a small buffer. However, in this project, it was decided to grow the rasterized contour lines instead. This gives results very similar to Voronoi polygon features. To calculate the entropy, we use the following equation (entropy of Voronoi areas):

$$I = \sum_{i=1}^{n} \frac{S_i}{S} \cdot \frac{\ln S_i}{\ln S} \tag{5}$$

The equation is implemented in lines 1-7 of the listing. The procedure returns the entropy of the areas of the grown contour lines. In the equation, I is the resulting entropy, S_i is the area of an individual polygon, S is the area of the map extent, and n is the number of polygons. We calculate I for the grown source contours.
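As a cross-check of the entropy of Voronoi areas, the following Python sketch mirrors the getI procedure of the listing (lines 1-7) as far as it can be read from the published code. The function name and the example areas are assumptions; areas and extent must be expressed in the same units.

```python
import math

def area_entropy(areas, extent_area):
    """Sum of (S_i / S) * (ln S_i / ln S) over all polygon areas S_i."""
    return sum(
        (s / extent_area) * (math.log(s) / math.log(extent_area))
        for s in areas
    )

# Hypothetical tessellations of an extent of 100 square units.
single = area_entropy([100.0], 100.0)       # one polygon covers everything
quarters = area_entropy([25.0] * 4, 100.0)  # four equal polygons

print(single, quarters)
```

In this formulation the value is 1 when a single polygon fills the extent and decreases as the extent is split into smaller polygons, so a denser contour map yields a lower value of I.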
The function getDeltaI (lines 17-26) returns the deviation of entropies between the original sample contour map and the map under consideration. The procedure sortByDeltaI (lines 28-38) compares two elements by this deviation; it is used as the comparison function for sorting. In lines 40-43, some important global variables are set.
S is the area of the extent of the data sample under consideration (10476447 m²). minv and maxv define the minimal and maximal levels of the sample data set (185 m and 570 m, respectively). mainI is the entropy of the grown contour lines of the sample data set (0.7237).
In lines 45-56, the average interval between the contour lines of the original sample is calculated. It is weighted by the length of the contour lines. This value is divided by 10 in line 56 and called "delta".
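The length-weighted average interval can be sketched in Python as follows. The sketch follows the logic of the corresponding loop in the listing: each pair of neighboring contour levels contributes its height difference, weighted by the summed length of both levels. The function name and the input rows are assumptions; they are not values from the Turin data set.

```python
def weighted_mean_interval(rows):
    """rows: (elevation, total contour length) pairs sorted by elevation."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (h_prev, len_prev), (h_next, len_next) in zip(rows, rows[1:]):
        weight = len_prev + len_next
        weighted_sum += weight * abs(h_next - h_prev)
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical contour levels (m) with the total length of their lines (m).
rows = [(185.0, 1000.0), (195.0, 3000.0), (200.0, 2000.0)]

mean_interval = weighted_mean_interval(rows)
delta = round(mean_interval) / 10.0  # the interval is rounded before dividing by 10

print(mean_interval, delta)
```

Weighting by length keeps short, isolated contours with unusual spacing from dominating the estimate of the typical interval.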
The main calculations are implemented in lines 58-75. A loop is executed for ten steps from 0 to 9 (line 62). For each contour level of the sample map, a corresponding contour line in an interpolated map (DEM) is calculated. The level of the corresponding contour line is calculated according to Eq. 6:

$$h_i = h + i \cdot \mathit{delta} \tag{6}$$

where h is the current contour line level, i is the step, and delta is 1/10 of the average contour line interval (0.9 m in our case, i.e., the average interval equals 9 m). A contour line map is calculated for each step (i ∈ {0, ..., 9}). A corresponding grown contour map and an entropy value are calculated as well. For every interpolated map, ten entropies are derived. Then, the deviation between the derived contour line maps and the original sample contour line map is calculated as follows:

$$\Delta I = \sqrt{\frac{1}{n} \sum_{i=0}^{n-1} \left(\mathit{mainI} - \mathit{curI}_i\right)^2} \tag{7}$$

where curI_i is the entropy of the map extracted at step i, which is individual for every interpolated map, and n is the number of steps (i). It equals ten.
In lines 77-81, the entropy values are collected in a list, which is sorted by ∆I. A lower ∆I represents a higher quality of the DEM.
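The deviation and the final ranking can be sketched in Python as follows. The entropy values and the method names method_a and method_b are invented for illustration and are not results of this study; only the source entropy of 0.7237 is taken from the article. For brevity the example uses two steps per method instead of ten, and the sketch sorts the best method first, whereas the listing sorts in decreasing order of ∆I.

```python
import math

def delta_i(source_entropy, extracted_entropies):
    """Root mean square deviation of the extracted entropies from the source entropy."""
    n = len(extracted_entropies)
    squared = sum((source_entropy - e) ** 2 for e in extracted_entropies)
    return math.sqrt(squared / n)

def rank_methods(source_entropy, entropies_by_method):
    """Method names ordered from the smallest to the largest deviation."""
    return sorted(
        entropies_by_method,
        key=lambda name: delta_i(source_entropy, entropies_by_method[name]),
    )

# Invented example values; a real run would provide ten entropies per method.
main_i = 0.7237
entropies = {
    "method_a": [0.72, 0.73],
    "method_b": [0.60, 0.65],
}

ranking = rank_methods(main_i, entropies)
print(ranking)
```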

RESULTS
The results of the calculations are presented in Table 1 in order of increasing quality of the resulting DEMs. The entropy of the grown source contours equals 0.7237. According to the entropy comparison, rsurfcontour delivers the most appropriate result and vsurfidw64 delivers the worst result. The quality of the DEMs increases in the following order: vsurfidw64, rsurfidw64, rsurfidw24, vsurfidw24, vsurfrst, vsurfbspline_bilin, vsurfbspline_bicub, and rsurfcontour. Visual analysis and review of the 2D and 3D maps allow us to conclude that the results are satisfactory and that the approach works well.

CONCLUSIONS
In this work, an approach to the automatic determination of the optimal interpolation method is presented. The approach allows the user to select a concrete module for contour line interpolation. Information theory is applied to resolve the problem.
The developed method detected that rsurfcontour provides an optimal algorithm. A DEM covering Turin was prepared. This DEM will be utilized by e-Government solutions in the frame of the WeGovNow project. The use of open data and the provided source code enables one to evaluate the developed approach.

Listing. Tcl code for determining the best interpolation method (line numbers are those referenced in the text; lines marked as reconstructed were lost in the published listing).
 1  proc getI {Ss S} {
 2    set sm 0
 3    foreach curS $Ss {
 4      set sm [expr {$sm + ($curS/double($S)) * (log($curS)/log($S))}]
 5    }                        ;# lines 5-8 reconstructed
 6    return $sm
 7  }
 8
 9  proc calcAreas {map} {
10    r.grow $map out=grown$map r=1100 --o
11    r.to.vect grown$map out=grown$map type=area --o
12    v.db.addcolumn grown$map col=a\ double\ precision
13    v.to.db grown$map col=a op=area
14    return grown$map
15  }
16
17  proc getDeltaI {I} {
18    global mainI             ;# lines 18-22 reconstructed
19    set sumDelta 0
20    set numb 0
21    foreach el $I {
22      incr numb
23      set sumDelta [expr {$sumDelta+pow($mainI-$el,2)}]
24    }
25    return [expr {sqrt($sumDelta/double($numb))}]
26  }
27
28  proc sortByDeltaI {a b} {
29    set aDeltaI [getDeltaI [lindex $a 1]]
30    set bDeltaI [getDeltaI [lindex $b 1]]
31    if {$aDeltaI==$bDeltaI} {
32      return 0
33    } elseif {$aDeltaI<$bDeltaI} {
34      return -1
35    } elseif {$aDeltaI>$bDeltaI} {
36      return 1               ;# lines 36-39 reconstructed
37    }
38  }
39
40  set S [expr {[lindex [g.region -e] 2]*[lindex [g.region -e] 5]}]
41  set minv [db.select sql=SELECT\ min(QUOTA)\ FROM\ contours -c]
42  set maxv [db.select sql=SELECT\ max(QUOTA)\ FROM\ contours -c]
43  set mainI [getI [v.db.select [calcAreas contours] col=a sep=\  -c] $S]
44
45  set lensum 0
46  set hights [db.select "sql=SELECT QUOTA,sum(SHAPE_LEN) FROM contours \
47    GROUP BY QUOTA ORDER BY QUOTA" -c]
48  set curval 0
49  for {set i 1} {$i<[llength $hights]} {incr i} {
50    lassign [split [lindex $hights [expr {$i-1}]] |] hp lp
51    lassign [split [lindex $hights $i] |] hn ln
52    set curl [expr {$lp+$ln}]
53    set curval [expr {$curval+$curl*abs($hp-$hn)}]
54    set lensum [expr {$lensum+$curl}]
55  }
56  set delta [expr {round($curval/double($lensum))/10.0}]
57
58  array set Is {}
59  foreach map {vsurfidw24 vsurfidw64 rsurfidw24 rsurfidw64 \
60    vsurfbspline_bilin vsurfbspline_bicub rsurfcontour vsurfrst} {
61    # setting delta of minimal contour line elevation;
62    foreach i {0 1 2 3 4 5 6 7 8 9} {
63      set levlist {}         ;# added: reset the level list for every step
64      foreach cur [db.select "sql=SELECT DISTINCT QUOTA FROM contours ORDER BY QUOTA" -c] {
65        lappend levlist [expr {$cur+$i*$delta}]
66      }
67      r.contour $map out=cont$map levels=[join $levlist ,] --o
68      v.category inp=cont$map op=del cat=-1 --o
69      v.category inp=cont$map op=add type=line --o
70      v.to.rast cont$map out=cont$map use=cat --o
71      set grownVectorPolygons [calcAreas cont$map]
72      lappend Is($map) [getI [v.db.select $grownVectorPolygons col=a \
73        sep=\  -c] $S]
74    }
75  }
76
77  set listOfI ""
78  foreach {map ltsI} [array get Is] {
79    lappend listOfI [list $map $ltsI]
80  }
81  lsort -decreasing -command sortByDeltaI $listOfI

Table 1. Results

REFERENCES
Ardiansyah, P.O.D. and Yokoyama, R., 2002. DEM generation method from contour lines based on the steepest slope segment chain and a monotone interpolation function. ISPRS Journal of Photogrammetry and Remote Sensing, 57(1-2), pp. 86-101. doi: 10.1016/S0924-2716(02)00117-X