A review of volunteered geographic information quality assessment methods
1. Introduction
- The authors present an extensive review of existing state-of-the-art methods for assessing the quality of map-, image-, and text-based VGI.
- To the best of their knowledge, no survey of existing methods had been carried out before.
- This review provides an overview of methods that have been built based on theories and discussions in the literature.
- In Section 2, the authors describe the different quality measures and indicators for VGI.
2. Measures and indicators for VGI quality
- Quality of VGI can be described by quality measures and quality indicators (Antoniou and Skopeliti 2015).
- Quality measures, mainly adhering to the ISO principles and guidelines, refer to those elements that can be used to ascertain the discrepancy between the contributed spatial data and the ground truth (e.g., completeness of data), mainly by comparing to authoritative data.
- When authoritative data is no longer usable for comparisons, and the established measures become no longer adequate to assess the quality of VGI, researchers have explored more intrinsic ways to assess VGI quality by looking into other proxies for quality measures.
- In the following, these quality measures and indicators are described in detail.
- The review of quality assessment methods in Section 5 is based on these various quality measures and indicators.
2.1. Quality measures for VGI
- Completeness describes the relationship between the represented objects and their conceptualizations.
- Consistency is the coherence in the data structures of the digitized spatial data.
- Errors resulting from a lack of consistency are described in terms of (i) conceptual consistency, (ii) domain consistency, (iii) format consistency, and (iv) topological consistency.
- Accuracy refers to the degree of closeness between a measurement of a quantity and the accepted true value of that quantity, and it is in the form of positional accuracy, temporal accuracy and thematic accuracy.
- In both cases, the discrepancies can be numerically estimated.
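
As a rough illustration of how such discrepancies might be estimated numerically, the following minimal sketch computes a length-based completeness ratio and a positional root-mean-square error against a reference dataset. The function names, the assumption that VGI and reference points are already matched one to one, and the sample values are hypothetical and are not taken from the reviewed paper.

```python
import math

def completeness_ratio(vgi_length_km, reference_length_km):
    """Share of the reference feature length that is present in the VGI dataset."""
    if reference_length_km <= 0:
        raise ValueError("reference length must be positive")
    return vgi_length_km / reference_length_km

def positional_rmse(vgi_points, reference_points):
    """RMSE between matched (x, y) points, assumed to be in a projected CRS (metres)."""
    if len(vgi_points) != len(reference_points) or not vgi_points:
        raise ValueError("need two non-empty, equally sized point lists")
    squared = [
        (vx - rx) ** 2 + (vy - ry) ** 2
        for (vx, vy), (rx, ry) in zip(vgi_points, reference_points)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Illustrative values only: two matched points and made-up road lengths.
vgi = [(0.0, 0.0), (10.0, 0.0)]
ref = [(3.0, 4.0), (10.0, 0.0)]
rmse = positional_rmse(vgi, ref)        # first point is 5 m off, second matches
ratio = completeness_ratio(80.0, 100.0)  # 80 km mapped out of 100 km in the reference
```

Both functions presuppose that an authoritative reference exists; where it does not, the indicators in the next section are the usual fallback.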
2.2. Quality indicators for VGI
- As part of the ISO standards, geographic information quality can be further assessed through qualitative quality indicators, such as the purpose, usage, and lineage.
- Purpose describes the intended usage of the dataset.
- In addition, where ISO standardized measures and indicators are not applicable, the authors have found in the literature more abstract quality indicators to imply the quality of VGI.
- In assessing the credibility of data as a quality indicator, one therefore needs to consider factors that contribute to trustworthiness and expertise.
- Maué (2007) further argues that, similar to the eBay rating system, the created geographic features on various VGI platforms can be rated, tagged, discussed, and annotated, which affects the data contributor's reputation value.
3. Map, image, and text-based VGI: definitions and quality issues
- The effective utilization of VGI is strongly associated with data quality, and this varies depending primarily on the type of VGI, the way data is collected on the different VGI platforms, and the context of usage.
- The following sections describe the selected forms of VGI: (1) map, (2) image, and (3) text, their uses, and how data quality issues arise.
- These three types of VGI are chosen based on the methods that are used to capture the data (maps: as GPS points and traces, image: as photos, text: as plain text), and because they are the most popular forms of VGI currently used.
- This section further lays the groundwork for the subsequent sections on the various quality measures and indicators, and on the quality assessment methods used for these three types of VGI.
3.1. Map-based VGI
- Map-based VGI concerns all VGI sources that include geometries as points, lines, and polygons, the basic elements to design a map.
- Each tag describes a specific geographic entity from different perspectives.
- This open classification scheme can lead to misclassification and reduction in data quality.
- Map-based VGI is commonly used for purposes, such as navigation and POI search.
- In addition to accuracy, the reliability of such services is affected by data completeness, namely feature, attribute, and model completeness.
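
To make the misclassification risk of an open tagging scheme more concrete, here is a minimal sketch of a domain-consistency check that flags tag values outside an expected vocabulary. The feature structure, the function name, and the short lists of allowed values are assumptions made for illustration; they are only a small subset of values that appear in OSM-style tagging, not a complete or official vocabulary.

```python
# Hypothetical, deliberately tiny vocabulary of expected tag values.
ALLOWED_VALUES = {
    "highway": {"motorway", "primary", "secondary", "residential", "footway"},
    "building": {"yes", "house", "school", "hospital"},
}

def find_unexpected_tags(features, allowed=ALLOWED_VALUES):
    """Return (feature_id, key, value) for tag values outside the expected vocabulary."""
    issues = []
    for feature in features:
        for key, value in feature["tags"].items():
            # Keys without a known vocabulary (e.g. free-text names) are skipped.
            if key in allowed and value not in allowed[key]:
                issues.append((feature["id"], key, value))
    return issues

features = [
    {"id": 1, "tags": {"highway": "residential"}},
    {"id": 2, "tags": {"highway": "residental"}},  # misspelled value
    {"id": 3, "tags": {"building": "yes", "name": "Town Hall"}},
]
issues = find_unexpected_tags(features)
```

A check like this only catches values that fall outside the vocabulary; a valid but wrongly chosen value (a footway tagged as a primary road) would still pass.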
3.2. Image-based VGI
- Not only the GPS precision and accuracy errors resulting from various devices, but also other factors influence the quality of image-based VGI.
- Instead of stating the position from where the photo was taken (photographer position), some contributors geotag the photo with the position of the photo content, which could be several kilometers away from where the photo originated, causing positional accuracy issues (as also discussed in Keßler et al. 2009).
- This is a problem when these photos are to be used, for example, in human trajectory analysis.
- Such contents are not fit for use for tasks, such as disaster management, environmental monitoring, or pedestrian navigation.
- Citizen science projects, such as GeoTag-X, have machine learning and crowdsourcing methods in place to discover unauthentic material and clean it.
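
The gap between photographer position and content position can be sketched as a simple distance check. The example below assumes that both a device-recorded position and a user-assigned geotag are available for a photo, which is not guaranteed in practice; the function names and the 1 km threshold are hypothetical choices for illustration, and the haversine formula is a standard great-circle approximation rather than a method prescribed by the reviewed paper.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 coordinates."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_suspect_geotag(device_position, geotag, threshold_km=1.0):
    """Flag a photo whose geotag lies further than threshold_km from the device position."""
    return haversine_km(*device_position, *geotag) > threshold_km

# One degree of longitude along the equator, roughly 111 km.
distance = haversine_km(0.0, 0.0, 0.0, 1.0)
flagged = is_suspect_geotag((0.0, 0.0), (0.0, 1.0))
unflagged = is_suspect_geotag((52.0, 13.0), (52.0, 13.0))
```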
4. The literature review methodology
- Figure 2 shows the distribution of the reviewed papers for VGI quality assessment methods.
- Evidently, the publication of papers on this topic gained momentum in 2010; for the most part, the papers discuss methods for map-based VGI.
5. Existing methods for assessing the quality of VGI
- The authors have reviewed state-of-the-art methods to assess various quality measures and indicators of VGI.
- Comparing with satellite imagery is a method to assess the positional accuracy of maps.
- The methods found have mostly been implemented conceptually for a particular use case.
- These methods have been reviewed mainly based on the type of VGI, the quality measures and indicators supported, and the approaches followed to develop the method.
5.1. Distribution of selected literature
- Out of the 56 papers that the authors reviewed, 40 papers discuss methods for assessing the quality of map-based VGI, in most cases taking OSM data as the VGI source.
- Eighteen papers introduce methods for text-based VGI taking mainly Twitter, Wikipedia, and Yahoo! answers as the VGI source.
5.2. Type of quality measures, indicators, and their associated methods
- In a rather different approach, Canavosio-Zuzelski et al. (2013) apply a photogrammetric approach to assess the positional accuracy of OSM road features using stereo imagery and a vector adjustment model.
- Their method applies analytical measurement principles to compute accurate real-world geo-locations of OSM road vectors.
- The proposed approach was tested on several urban gridded city streets from the OSM database, with the results showing that the post-adjusted shape points improved positional accuracy by 86%.
- Furthermore, the vector adjustment was able to recover 95% of the actual positional displacement present in the database.
- Brando and Bucher (2010) present a generic framework to manage the quality of ISO standardized quality measures by using formal specifications and reference datasets.
5.2.3. Quality assessment in text-based VGI
- Hasan Dalip et al. (2009), on the other hand, use text length, structure, style, readability, revision history, and social network as indicators of text content quality in Wikipedia articles.
- They further use regression analysis to combine these weighted quality values into a single value that represents an overall aggregated metric for text content quality.
- Bordogna et al. (2014) measure the validity of text data by measuring the number of words, proportion of correctly spelled words, language intelligibility, diffusion of words, and the presence of technical terms as indicators of text content quality.
- They further explored quality indicators such as experience, recognition, and reputation to determine the quality of VGI.
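
The text-based indicators above can be sketched in a few lines. The example below computes a word count, the proportion of correctly spelled words, and the share of technical terms, and then combines two of them with a weighted sum. The function names, the toy dictionary, and the hand-set weights are assumptions for illustration only; in particular, the weights here are fixed by hand, whereas Hasan Dalip et al. (2009) are described as learning the combination through regression analysis.

```python
import re

def text_indicators(text, dictionary, technical_terms):
    """Compute simple content indicators for a short text contribution."""
    words = re.findall(r"[a-z']+", text.lower())
    n = len(words)
    if n == 0:
        return {"word_count": 0, "spelled_ok": 0.0, "technical": 0.0}
    return {
        "word_count": n,
        # Share of words found in the reference dictionary.
        "spelled_ok": sum(w in dictionary for w in words) / n,
        # Share of words that belong to a domain-specific term list.
        "technical": sum(w in technical_terms for w in words) / n,
    }

def aggregate_score(indicators, weights):
    """Weighted sum over the indicators named in `weights` (weights are hand-set here)."""
    return sum(weights[name] * indicators[name] for name in weights)

# Toy inputs: "brige" is a deliberate misspelling.
dictionary = {"flood", "near", "the", "bridge"}
technical_terms = {"flood"}
indicators = text_indicators("Flood near the brige", dictionary, technical_terms)
score = aggregate_score(indicators, {"spelled_ok": 0.7, "technical": 0.3})
```

Keeping the indicators separate from the aggregation step makes it easy to swap the hand-set weights for coefficients fitted on labelled data.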
5.2.4. Generic approaches
- Table 2 shows a summary matrix of all quality measures and indicators observed in the literature review, with various methods that can be applied to assess these quality measures/indicators.
- Following this matrix, one can learn which methods can be applied to solve various quality issues within map-, text-, and image-based VGI.
- This should be followed with caution, as the authors present here only what they discovered through the literature review, and the presented methods could be applied beyond their discovery, and therefore need to be further explored.
6. Discussion and future research perspectives in VGI quality
- As further evident from this review, there is no holy grail that could solve all types of quality issues in VGI.
- Addressing these limitations, and thereby improving the existing methods, paves the way for new contributions on this topic that should be recognized as valid scientific contributions in the VGI community.
7. Conclusions
- The authors have taken a critical look at the quality issues within map, image, and text VGI types.
- The review shows the increasing utilization of implicit VGI for geospatial research.
- This explains the use of indicators, such as reputation, trust, credibility, vagueness, experience, recognition, or local knowledge as quality indicators.
- In addition, the implicit nature of the geography contributed in most of these VGI types is yet another reason for the insufficiency of quality assessment methods for text- and image-based VGI.
- The authors have further discovered data mining as an additional approach in the literature that extends Goodchild and Li's (2012) classification.