MEASURES AND INDICATORS OF VGI QUALITY: AN OVERVIEW

The evaluation of VGI quality has been a popular issue amongst academics and researchers. Various metrics and indicators have been proposed for evaluating VGI quality elements, and various efforts have focused on the use of well-established methodologies for the evaluation of VGI quality elements against authoritative data. In this paper, a number of research papers have been reviewed and summarized in a detailed report on measures for each spatial data quality element. Emphasis is given to the methodology followed and the data used in order to assess and evaluate the quality of the VGI datasets. However, as the use of authoritative data is not always possible, many researchers have turned their focus to the analysis of new quality indicators that can function as proxies for the understanding of VGI quality. In this paper, the difficulties in using authoritative datasets are briefly presented and the newly proposed quality indicators are discussed, as recorded through the literature review. We classify these new indicators into four main categories that relate to: i) data, ii) demographics, iii) socio-economic situation and iv) contributors. This paper presents a dense, yet comprehensive overview of the research in this field and provides the basis for the ongoing academic effort to create a practical quality evaluation method through the use of appropriate quality indicators.


INTRODUCTION
For a long period of time, the creation of spatial data has been both a duty and a privilege of National Mapping Agencies (NMAs) or large commercial companies active in the geospatial domain. Collecting, modelling, managing and updating geospatial datasets is a difficult and complicated task. In order to achieve the best possible quality, standardization procedures have been developed and closely followed. Consequently, the spatial products and services produced are accompanied by a certain level of guarantee that stems, among other things, from the reputation and the credibility of the issuing authority. However, such authoritative products and services usually come with high costs and restrictive licensing terms.
During the last decade, the emergence of Volunteered Geographic Information (VGI) (Goodchild, 2007) provided an alternative source of spatial data. The grassroots mechanisms of data collection and the nature of VGI can give competitive advantages over authoritative datasets. Local knowledge (Craglia et al., 2008), timely creation (Goodchild, 2007) and free use of data are just some of the characteristics that attracted the attention of researchers and of the private sector alike. VGI can enrich, complement or update authoritative datasets and products or even be the single source for creating new ones (Antoniou, 2011).
Although there are cases and examples of real-world use of VGI data, especially in crisis and disaster management where authoritative data might be out of date (Haklay et al., 2014), there is little diffusion of such data into the mainstream geospatial world. Perhaps the most compelling issue that VGI faces, and one that in a sense limits its broader mainstream diffusion (see for example AGI and PwC, 2010), is quality evaluation. Early on, Flanagin and Metzger (2008) realised that it is of high importance to identify methods and techniques to adequately evaluate VGI quality, and Goodchild (2008) highlighted the challenge of re-defining the assessment of spatial accuracy in the VGI era. Moreover, theoretical frameworks have been proposed to better evaluate VGI data (Brando and Bucher, 2010). In this context, the paper reviews the quality indicators that researchers and academics have explored in their effort to answer fundamental questions about VGI quality.
In terms of empirical studies on VGI quality, one of the most studied cases is OpenStreetMap (OSM). Although the authors realise that OSM cannot be equated with all VGI datasets, OSM is treated here as a proxy for VGI data. Hence, the terms OSM and VGI datasets are used interchangeably. Moreover, as the topic of VGI quality has drawn increased academic interest over the last decade, the paper is by no means exhaustive. It provides a dense, yet comprehensive overview of all major topics around VGI quality and supports this overview with an adequate number of citations.
In this context, the paper is structured as follows: Section 2 provides an overview of the research efforts to use authoritative data as the reference dataset in order to evaluate one or more quality elements each time. The quality elements studied follow the nomenclature used by the "International Organization for Standardization" (ISO). Then, Section 3 discusses the problems and inefficiencies that arise from this line of research and turns the focus to efforts to discover and document new quality indicators that can function as proxies for the overall quality of VGI datasets. Section 4 discusses the findings of each line of research and highlights important issues that future research in this field should take into account. The paper ends with conclusions on the academic research to date on the topic of VGI quality.

MEASURES OF VGI QUALITY: VGI VS. AUTHORITATIVE DATA
Understanding and documenting quality is an important factor when working with spatial data, especially with VGI as there are no specifications for data creation. In the field of geoinformation, ISO principles and guidelines can be taken into account for quality assessment. Relatively recently, the older standards ISO 19113 (ISO, 2005a) and ISO 19114 (ISO, 2005b) have been replaced by the ISO 19157 (ISO, 2013) standard. The updated standard defines the following data quality elements: completeness, logical consistency, positional accuracy, temporal quality, thematic accuracy and usability.
Using the guidelines provided by the above-mentioned ISO standards, a number of studies have tried to assess VGI quality based on the comparison of VGI with proprietary data provided by NMAs or commercial companies (see for example Haklay, 2010; Zielstra and Zipf, 2010; Girres and Touya, 2010; Antoniou, 2011). This comparison is based on the belief that the authoritative data is always of an accepted quality and created according to high standards. Thus, it is reasonable to assume that authoritative data can play the role of reference datasets during a quality evaluation process of VGI datasets. In these studies, a number of measures are adopted that exist in the literature and are traditionally used to compare geographical data in processes such as quality assessment, data matching, generalisation evaluation etc. The remainder of this section provides a review of the work that has taken place for each of the spatial data quality elements.

Logical consistency
Logical consistency is measured with percentages (%) that express the consistency of different database objects with other objects of the same theme (intra-theme consistency) or objects of other themes (inter-theme consistency) (Girres and Touya, 2010). In addition to this, hierarchical ordering and outlier spotting are used to check administrative data integrity (Ali and Schmid, 2014), mathematical techniques determine topological consistency (Corcoran et al., 2010) and spatial similarity in multi-representation is used to assess topological relationships (Hashemi et al., 2015). Moreover, a number of techniques deal with semantic similarity between the tags (Vandecasteele and Devillers, 2015) and the identification of entities with inappropriate classification (Ali et al., 2014). The semantic quality can also be improved by using ontologies such as OSMonto in data tagging (Codescu et al., 2011) or a tag recommendation system such as OSMantic (Vandecasteele and Devillers, 2015), which automatically suggests relevant tags to contributors during the editing process.
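As a rough illustration of how such a consistency percentage can be computed, the sketch below applies one rule to a set of features and reports the share that satisfies it. The feature representation (a plain tag dictionary), the rule `bridge_has_layer` and the sample features are assumptions made for this example; they are not taken from the cited studies.

```python
from typing import Callable, Iterable, Mapping

# Hypothetical representation: a feature reduced to its tag dictionary.
Feature = Mapping[str, str]


def consistency_percentage(features: Iterable[Feature],
                           rule: Callable[[Feature], bool]) -> float:
    """Return the share (%) of features that satisfy a consistency rule."""
    features = list(features)
    if not features:
        raise ValueError("no features to check")
    passed = sum(1 for f in features if rule(f))
    return 100.0 * passed / len(features)


def bridge_has_layer(feature: Feature) -> bool:
    """Illustrative intra-theme rule: a road tagged as a bridge should also carry a layer tag."""
    if feature.get("bridge") != "yes":
        return True  # the rule only constrains bridges
    return "layer" in feature


# Made-up sample features, for illustration only.
roads = [
    {"highway": "residential"},
    {"highway": "primary", "bridge": "yes", "layer": "1"},
    {"highway": "secondary", "bridge": "yes"},  # violates the rule
    {"highway": "footway"},
]
print(consistency_percentage(roads, bridge_has_layer))
```

An inter-theme check would follow the same pattern, with a rule that looks at features from two themes at once (for instance roads against water bodies).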

Temporal accuracy
Studying the evolution of VGI data in time is considered a measure of temporal accuracy (see for example Girres and Touya, 2010; Arsanjani et al., 2013). In the cases of geo-tagged photos from explicit (e.g. Geograph) or implicit (e.g. Flickr) sources, temporal accuracy is measured as the time difference between the time of the photo capturing and the time of the photo uploading (Antoniou et al., 2010).
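The capture-to-upload difference can be sketched in a few lines. The function name `upload_lag_hours` and the timestamps below are invented for illustration; a real analysis would read both timestamps from the photo metadata of the source in question.

```python
from datetime import datetime
from statistics import median


def upload_lag_hours(captured: datetime, uploaded: datetime) -> float:
    """Time between capturing and uploading a photo, in hours."""
    return (uploaded - captured).total_seconds() / 3600.0


# Made-up (captured, uploaded) pairs, for illustration only.
photos = [
    (datetime(2010, 5, 1, 10, 0), datetime(2010, 5, 1, 22, 0)),    # 12 hours
    (datetime(2010, 5, 2, 9, 0), datetime(2010, 5, 4, 9, 0)),      # 48 hours
    (datetime(2010, 5, 3, 15, 30), datetime(2010, 5, 3, 16, 30)),  # 1 hour
]

lags = [upload_lag_hours(captured, uploaded) for captured, uploaded in photos]
print(lags, median(lags))
```

Reporting the median rather than the mean is a design choice of this sketch: upload lags tend to be skewed by a few photos uploaded long after capture.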

Usability
The last quality element is usability. According to ISO 19157 (ISO, 2013), usability is based on user requirements and it can be evaluated by all quality elements. As a result, all the aforementioned measures can be used depending on the end-user aims. However, usability evaluation may be based on specific user requirements that cannot be described using the quality elements described above. In this case, the usability element is used to describe specific quality information about a dataset's suitability for a particular application or conformance to a set of requirements. Of all quality elements, usability is the one most suited to VGI since it is related to the term "fitness for use" used in VGI literature to describe data quality.

INDICATORS OF VGI QUALITY
Classic quality evaluation of VGI against authoritative data is not always possible, due to limited data availability, contradictory licensing restrictions or high procurement costs of the authoritative data. Moreover, internal or external quality (ISO, 2005) cannot be easily assessed, as the implementation of ISO standards is not a straightforward process due to the wiki-based nature of VGI data that results in the absence of data specifications (Antoniou, 2011; Mooney and Corcoran, 2012) or gazetteers (Antoniou et al., 2015). Also, the very nature of grassroots participation introduces biases in the participation patterns of contributors and consequently in the volunteered content created, as it has been observed that contributors show a preference both for certain areas and for specific features (Antoniou and Schlieder, 2014). Participation biases can be further influenced and enhanced by several factors such as internet access, knowledge of language, users' available time or their technical capability (Holloway et al., 2007). In the same vein, Zook and Graham (2007) argue that cultural differences can create biases in participation patterns.
Interestingly enough, as VGI datasets get more and more detailed over time, in some areas it becomes less and less clear whether the use of authoritative data as the reference dataset for quality evaluation is a valid choice. In other words, as Vandecasteele and Devillers (2015) note, a challenge with the use of such traditional data quality assessment methods is that VGI datasets are now, in many parts of the world, more complete and accurate than authoritative datasets. This violates the basic assumption of the existing quality assessment methods of VGI.
Thus, in a context where well-established quality evaluation methods are not sufficient to provide solid answers about VGI quality, academic research started to focus on revealing quality indicators that are more intrinsic and, consequently, more applicable to VGI data. These indicators cover a wide range of possible proxy quality elements as researchers try to understand the engineering behind VGI datasets. Goodchild and Li (2012) note that intrinsic methods and mechanisms can be applied to ensure VGI data quality through data analysis in three domains: i) Crowdsourcing revision, where data quality can be ensured by multiple contributors, ii) Social measures, which focus on the assessment of contributors themselves as a proxy measure for the quality of their contributions, and iii) Geographic consistency, through an analysis of the consistency of contributed entities. Here, a different classification of quality indicators is presented.

Data Indicators
The direct evaluation of VGI internal quality can be problematic, since usually there are no detailed specifications, or the evaluation against authoritative data might not be possible outside an academic environment. Hence, researchers have focused on efforts that could reveal data quality by solely examining VGI data. For example, Ciepłuch et al. (2011) have used features' length and point density in a square-based grid to analyse the OSM data quality. Keßler and Groot (2013) examined feature-level attributes such as the number of versions, the stability against changes and the corrections and rollbacks of features in order to infer OSM features' quality. Also working at the feature level, Van Exel et al. (2010) focused on the provenance of OSM features as an indicator of their quality. Finally, Barron et al. (2014) present a framework, named iOSMAnalyzer, that provides more than 25 methods and indicators and allows OSM quality assessment based solely on data history.
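A grid-based indicator of this kind can be sketched as follows: points are binned into square cells and counted per cell. This is a minimal sketch under assumptions, not the procedure of any cited study: coordinates are taken to be in a projected system with metric units, and the names `cell_of` and `points_per_cell`, the cell size and the sample points are all invented for the example.

```python
from collections import Counter
from math import floor
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def cell_of(x: float, y: float, cell_size: float) -> Cell:
    """Index of the square grid cell that contains the point (x, y)."""
    return (floor(x / cell_size), floor(y / cell_size))


def points_per_cell(points: Iterable[Tuple[float, float]],
                    cell_size: float) -> Dict[Cell, int]:
    """Count points in each occupied cell of a square grid."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    return dict(Counter(cell_of(x, y, cell_size) for x, y in points))


# Made-up point coordinates in metres, for illustration only.
points = [(10, 10), (20, 80), (150, 30), (990, 990), (-5, 5)]
counts = points_per_cell(points, cell_size=100)
print(counts)
```

Because all cells have the same area, the per-cell count is proportional to point density; dividing by the cell area gives density in points per unit area. Cells with unusually low counts compared with their neighbours could then be flagged for closer inspection.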

Demographic Indicators
As VGI is user-generated content, many researchers have argued that a correlation between data quality and demographic data might exist (e.g. Tulloch, 2008; Elwood, 2008). Giving empirical evidence, Mullen et al. (2014) worked on the correlation between the demographics of an area and the completeness and positional accuracy of the data. Similarly, Zielstra and Zipf (2010) showed that low population density (i.e. rural areas) has a direct impact on the completeness of VGI data. Also, it has been shown that population density positively correlates with the number of contributions, thus affecting data completeness (Zielstra and Zipf, 2010; Haklay, 2010) or positional accuracy (Haklay et al., 2010).
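Such a relationship is commonly quantified with a correlation coefficient. The sketch below computes Pearson's r between population density and the number of contributions per area. The choice of Pearson's r is an assumption of this example rather than the method of the cited studies, and the figures are invented purely to show the calculation.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Made-up values per area, for illustration only.
population_density = [50, 200, 1000, 4000]  # inhabitants per km^2
contributions = [3, 10, 40, 120]            # edits recorded in the area

r = pearson(population_density, contributions)
print(round(r, 3))
```

A rank-based coefficient such as Spearman's may be preferable when the relationship is monotonic but not linear.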

Socio-economic Indicators
The grassroots engineering and the bottom-up process of VGI turned the focus of research to socio-economic factors, as it has been presumed that they might influence the overall quality. Indeed, Elwood et al. (2013) noted that various social processes might have a different impact on quality. Moreover, empirical studies have shown that social deprivation and the underlying socio-economic reality of an area considerably affect the completeness and positional accuracy of OSM data (Haklay et al., 2011; Antoniou, 2011). Similarly, Girres and Touya (2010) note that factors such as high income and low population age result in a higher number of contributions.

Contributors' Indicators
The research on VGI quality indicators could not exclude analysis of the VGI contributors themselves, as understanding their motivation drivers can give a better insight into user-generated data. For example, Nedović-Budić and Budhathoki (2010) suggested that the contributors' motivation can affect the generated content. Data quality indicators also include the history and the profiling of contributors (Ciepłuch et al., 2011) as well as the experience, recognition and local knowledge of the individual (Van Exel et al., 2010). Additionally, in line with Web 2.0 principles (O'Reilly, 2005), of special interest in the evolution of VGI is the collective intelligence that can be achieved by the work of several contributors on specific areas or spatial features. Thus, the number of contributors on certain areas or features has been examined and in several cases it has been positively correlated with data completeness and positional accuracy (e.g. Haklay et al., 2010; Antoniou, 2011; Keßler and Groot, 2013). Finally, an evaluation model for the contributor's reputation and data trustworthiness is presented by D'Antonio et al. (2014), which propagates a trust value to corresponding features.
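The number of contributors per feature can be derived directly from an edit history. The sketch below assumes a simplified history given as (feature id, user name) pairs, one pair per edit. This input format, the function name `contributors_per_feature` and the sample records are hypothetical and chosen only to illustrate the count.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def contributors_per_feature(edits: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct users who edited each feature."""
    users_by_feature: Dict[str, Set[str]] = defaultdict(set)
    for feature_id, user in edits:
        users_by_feature[feature_id].add(user)
    return {fid: len(users) for fid, users in users_by_feature.items()}


# Made-up edit records, for illustration only.
edits = [
    ("way/1", "alice"),
    ("way/1", "bob"),
    ("way/1", "alice"),  # a repeat edit by the same user is counted once
    ("node/7", "carol"),
]
print(contributors_per_feature(edits))
```

The same grouping can be applied per grid cell or administrative area instead of per feature, which yields the area-level counts that the studies above relate to completeness and positional accuracy.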

Comparing VGI against authoritative datasets
Although VGI datasets have fundamental differences in terms of the production processes when compared with those of the authoritative data, it still makes sense to use long-standing and well-established quality evaluation methods to assess the quality of crowdsourced spatial datasets. Comprehensive toolsets are provided by the ISO standards that provide guidelines for the evaluation of the most fundamental characteristics of a spatial dataset. Following this line of research, a large number of academics and researchers tried to understand, document and convey VGI quality through measures known as spatial data quality elements. All the papers reviewed in this section follow the common and classical approach of assessing the overall quality of VGI datasets by comparing VGI to an authoritative dataset of the same area that acts as reference/ground-truth data.
None of the studies use field observations and there is no report on the reliability and the validity of the results reported.
From the measures review, it becomes obvious that despite the novelty of VGI, many studies have analyzed VGI datasets covering different geographical areas and contexts. Studies mostly use OSM data for cities in the following countries: Germany, United Kingdom, Austria, Sweden, Hungary, Italy, Romania, Ireland, France, Australia, Greece, Latvia, Estonia and Switzerland. Germany is the most studied area (8 studies) and only three studies cover more than two countries (i.e. Mooney et al., 2010; Ali and Schmid, 2014; Arsanjani and Vaz, 2015). It is interesting to note that the majority of the studies are for European countries. A possible explanation, taking also into account the digital divide, is that OSM started in London, UK and has mainly gained the interest of European researchers. Additionally, the bulk uploads of authoritative data (e.g. the TIGER dataset for the USA) in OSM databases might deter researchers from turning their focus to such areas.
Regarding the geographical coverage of the studies, it seems that measures are applied in relatively small areas, experimentally and not systematically. Complete country coverage is provided only in a few studies (Haklay et al., 2010; Mooney et al., 2010; Zielstra and Zipf, 2010; Antoniou, 2011).
The aforementioned quality measures are mostly applied to data from the OSM project when compared to a number of authoritative datasets. A number of different thematic layers are studied. The majority of studies deal with the road network (10 studies) and the rest of them with other thematic layers such as POI (Points of Interest), "green" areas like gardens and parks, land use etc. OSM roads are compared against a number of authoritative datasets provided by NMAs such as: ITN, OS Master Map and OS Meridian2 from UK, OS Ireland, Hellenic Military Geographic Service, IGN BD topo from France and BKG from Germany. In other cases proprietary datasets have been used from Navteq and TomTom. Other OSM thematic layers studied in terms of quality include POI vs. IGN BD topo (Girres and Touya, 2010), schools vs. official data (Jackson et al., 2013), waterways and coastlines vs. IGN BD topo (Girres and Touya, 2010), building footprints vs. ATKIS (Fan et al., 2014), parks vs. government data for Victoria, Australia (FOI, features of interest) (Kalantari and La, 2015), POI vs. Navteq and Yelp (Mashhadi et al., 2014), administrative units (Ali and Schmid, 2014), OSM "green" areas such as gardens (Ali and Schmid, 2014; Ali et al., 2014), and OSM land use vs. GMESUA (Global Monitoring for Environment and Security Urban Atlas) (Arsanjani and Vaz, 2015; Arsanjani et al., 2015). Additionally, data from other VGI projects have been tested in terms of quality, such as geocoded addresses from OA (Open Address) vs. proprietary Web Mapping Services (e.g. Bing Maps, Google Maps and Yahoo! Maps) (Stark, 2011).
The reviewed studies do not cover the six elements of data quality equally. While VGI positional accuracy assessment has received significant attention, fewer efforts have looked at the semantic quality of VGI (i.e. Mooney and Corcoran, 2012a).
According to the studies reviewed in this paper, completeness is assessed in 15 of them, logical consistency in 7, positional accuracy in 14, temporal accuracy in 3 and thematic accuracy in 9. Moreover, a consensus has not been reached as to which measures are more adequate for each quality element.
Although a number of measures are introduced and applied, according to the studies analysed, there is no study that assesses all the elements of data quality. Some studies cover a number of quality elements (see for example Girres and Touya, 2010; Antoniou, 2011; Koukoletsos et al., 2011; Arsanjani et al., 2013). A synthetic approach that will produce integrated results is missing. Only a few of these studies express the overall VGI quality by combining and integrating the measures (Arsanjani et al., 2013; Forghani and Delavar, 2014; Barron et al., 2014). As a result, it is difficult to draw a conclusion about the degree of adherence to a specific set of data quality standards or to express usability.
Finally, in most cases the reporting of the quality evaluation results is made with arbitrary means and methods. ISO provides rigorous guidelines on the suggested way to unambiguously report the result of a quality evaluation method (see for example Antoniou, 2011).

VGI quality Indicators
Despite the work and empirical research available on the subject of VGI quality, a solid framework for assessing the quality of crowdsourced spatial data is far from being established. Perhaps the major limitation is the fact that existing tools (such as those described by ISO) are not inclusive enough or appropriate to adequately evaluate VGI data. First, the nature of VGI is fundamentally different from what geospatial experts have dealt with so far. The largely unknown social factor, which is the driving force behind public contribution and thus considerably affects VGI creation, has never been considered before. Participation biases have started to emerge that affect all levels of data granularity (i.e. from feature level up to national level datasets). While meticulous sampling methods are provided by ISO standards, unbiased data creation and the existence of rigorous specifications were assumed to be in place throughout the production process of every NMA or enterprise. Second, VGI comes in many flavours. On the one hand, there are explicit (e.g. OSM, Geograph etc.) and implicit (e.g. Flickr, Twitter etc.) sources of spatial content. On the other, geographic information retrieval methodologies have advanced and can now extract meaningful datasets from a variety of content available on the Web (e.g. geo-tagged photos, tweets, micro-blogging etc.). This combination creates a great variety of VGI content (e.g. noise maps, data about emotions etc.) that is different from the traditional and authoritative spatial products. Furthermore, as VGI is considered either as a replacement of authoritative data or as a way to enrich them, the use of authoritative data for the evaluation process does not make much sense other than for research purposes.
These limitations create a context that pushes researchers to explore new ways to determine VGI quality without the need for authoritative data. Thus, the aim is to discover quality indicators intrinsic to VGI in order to facilitate the understanding of such data.
This paper groups some of the existing research efforts into four major categories. While this is an arbitrary typology, it covers all the main factors that can influence or give a better insight into the quality of a VGI dataset. First is the data factor, where the efforts focus mainly on a feature-level examination. By examining either the lineage (e.g. versions) or the basic characteristics (e.g. length, expected attributes) of each feature it is possible to paint a picture of the overall quality of a VGI dataset. Then, there are the demographic indicators. Here, the assumption is that as VGI datasets are mainly user-generated content, the monitoring of demographics can function as a proxy for some quality elements, especially for completeness and positional accuracy. The third group includes indicators related to the underlying socio-economic reality of the area studied. Empirical research shows that social processes are not unrelated to data contribution both in terms of quantity and quality. The final group of research efforts focuses on the contributors themselves. The human factor is undoubtedly very important in any crowdsourced effort. Monitoring, understanding and modeling different contributors' behaviors can help to infer the level of different data quality elements.
As the debate is still open on how to build a solid framework that will efficiently assess VGI quality, this paper sheds light on the existing efforts. More groups of indicators might be needed to better analyse VGI data. For example, the scope of each VGI project (e.g. humanitarian efforts) or the modes of user engagement available (e.g. gamification) can equally be influential in the effort to build such a framework.

CONCLUSIONS
VGI has been a growing phenomenon for over a decade now. While the popularity of VGI sources and datasets receives a lot of interest from academics and researchers, the diffusion of VGI data in the mainstream Geomatics domain is still sparse and slow. Perhaps the most important factor that hinders VGI diffusion is the lack of a stable and standardized way to evaluate data quality. As has been presented here, existing and well-established methods and processes for spatial data quality evaluation, while still valid, are not always applicable to VGI datasets. Realising this problem, researchers and academics turned their focus to discovering new methods to adequately answer the pressing question "how good is VGI data?".
The nature and the creation mechanisms of VGI led to the analysis of a number of factors. However, research is still far from providing concrete answers and methods regarding the evaluation of VGI quality. This paper provides an overview of the ongoing academic effort to create a practical quality evaluation method through the use of appropriate quality measures and indicators.