
A review of volunteered geographic information quality assessment methods

TL;DR: The article reviews various quality measures, indicators, and existing quality assessment methods for selected types of VGI, and introduces data mining as an additional approach for quality handling in VGI.
Abstract: With the ubiquity of advanced web technologies and location-sensing handheld devices, citizens, regardless of their knowledge or expertise, are able to produce spatial information. This phenomenon is known as volunteered geographic information (VGI). During the past decade VGI has been used as a data source supporting a wide range of services, such as environmental monitoring, events reporting, human movement analysis, disaster management, etc. However, these volunteer-contributed data also come with varying quality. Reasons for this are: data is produced by heterogeneous contributors, using various technologies and tools, having different levels of detail and precision, serving heterogeneous purposes, and a lack of gatekeepers. Crowd-sourcing, social, and geographic approaches have been proposed and later followed to develop appropriate methods to assess the quality measures and indicators of VGI. In this article, we review various quality measures and indicators for selected types of VGI and existing quality assessment methods. As an outcome, the article presents a classification of VGI with current methods utilized to assess the quality of selected types of VGI. Through these findings, we introduce data mining as an additional approach for quality handling in VGI.

Summary

1. Introduction

  • The authors present an extensive review of the existing methods in the state of the art to assess the quality of map-, image-, and text-based VGI.
  • To the best of their knowledge, no such survey of existing methods has been conducted so far.
  • This review provides an overview of methods that have been built based on theories and discussions in the literature.
  • In Section 2, the authors describe the different quality measures and indicators for VGI.

2. Measures and indicators for VGI quality

  • Quality of VGI can be described by quality measures and quality indicators (Antoniou and Skopeliti 2015).
  • Quality measures, mainly adhering to the ISO principles and guidelines, refer to those elements that can be used to ascertain the discrepancy between the contributed spatial data and the ground truth (e.g., completeness of data), mainly by comparing to authoritative data.
  • When authoritative data is not usable for comparisons and the established measures are no longer adequate to assess the quality of VGI, researchers have explored more intrinsic ways to assess VGI quality by looking into other proxies for quality measures.
  • In the following, these quality measures and indicators are described in detail.
  • The review of quality assessment methods in Section 5 is based on these various quality measures and indicators.

2.1. Quality measures for VGI

  • Completeness describes the relationship between the represented objects and their conceptualizations.
  • Consistency is the coherence in the data structures of the digitized spatial data.
  • Errors resulting from a lack of consistency are indicated by (i) conceptual consistency, (ii) domain consistency, (iii) format consistency, and (iv) topological consistency.
  • Accuracy refers to the degree of closeness between a measurement of a quantity and the accepted true value of that quantity, and it is in the form of positional accuracy, temporal accuracy and thematic accuracy.
  • In each of these cases, the discrepancies can be numerically estimated; a minimal example for one such measure (completeness) is sketched after this list.
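To make these measures concrete, the following minimal sketch (not taken from the reviewed paper) estimates completeness by counting errors of omission and commission, matching contributed points against an authoritative reference dataset within a fixed distance tolerance. The coordinates, the matching tolerance, and the helper names are hypothetical choices for illustration only.

```python
# Minimal sketch: completeness as errors of omission/commission, assuming two
# hypothetical point datasets (VGI vs. an authoritative reference) in metres.
from math import hypot

def completeness(vgi_pts, ref_pts, tol=10.0):
    """Count reference points missing from the VGI (omission) and
    VGI points with no reference counterpart (commission)."""
    def has_match(p, candidates):
        return any(hypot(p[0] - q[0], p[1] - q[1]) <= tol for q in candidates)

    omission = sum(1 for r in ref_pts if not has_match(r, vgi_pts))
    commission = sum(1 for v in vgi_pts if not has_match(v, ref_pts))
    return {
        "omission_rate": omission / len(ref_pts),
        "commission_rate": commission / len(vgi_pts),
    }

vgi = [(0.0, 0.0), (52.0, 40.0), (300.0, 300.0)]   # contributed features
ref = [(2.0, 1.0), (50.0, 41.0), (120.0, 80.0)]    # authoritative features
print(completeness(vgi, ref))  # one omission and one commission out of three
```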

2.2. Quality indicators for VGI

  • As part of the ISO standards, geographic information quality can be further assessed through qualitative quality indicators, such as the purpose, usage, and lineage.
  • Purpose describes the intended usage of the dataset.
  • In addition, where ISO standardized measures and indicators are not applicable, the authors have found in the literature more abstract quality indicators to imply the quality of VGI.
  • Therefore, in assessing the credibility of data as a quality indicator, one needs to consider factors that contribute to trustworthiness and expertise.
  • Maué (2007) further argues that, similar to the eBay rating system, the created geographic features on various VGI platforms can be rated, tagged, discussed, and annotated, which affects the data contributor's reputation value.
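As a rough illustration of such a rating-based reputation value (a sketch only, not Maué's proposal or eBay's actual mechanism), the snippet below aggregates hypothetical positive/negative peer ratings per contributor into a smoothed reputation score; the rating scale and the smoothing are assumptions made for this example.

```python
# Minimal sketch of an eBay-style reputation value, assuming hypothetical peer
# ratings (+1 positive / -1 negative) attached to a contributor's features.
from collections import defaultdict

def reputation_scores(ratings):
    """ratings: iterable of (contributor_id, rating) with rating in {+1, -1}.
    Returns the smoothed share of positive ratings per contributor, so that
    contributors with very few ratings start near a neutral 0.5."""
    pos = defaultdict(int)
    total = defaultdict(int)
    for contributor, rating in ratings:
        total[contributor] += 1
        if rating > 0:
            pos[contributor] += 1
    # Laplace smoothing: (positives + 1) / (ratings + 2)
    return {c: (pos[c] + 1) / (total[c] + 2) for c in total}

ratings = [("alice", +1), ("alice", +1), ("alice", -1), ("bob", -1)]
print(reputation_scores(ratings))   # alice = 0.6, bob ≈ 0.33
```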

3. Map, image, and text-based VGI: definitions and quality issues

  • The effective utilization of VGI is strongly associated with data quality, and this varies depending primarily on the type of VGI, the way data is collected on the different VGI platforms, and the context of usage.
  • The following sections describe the selected forms of VGI: (1) map, (2) image, and (3) text, their uses, and how data quality issues arise.
  • These three types of VGI are chosen based on the methods used to capture the data (map: GPS points and traces; image: photos; text: plain text), and because they are the most popular forms of VGI currently in use.
  • This section further lays the ground work to understand the subsequent section on various quality measures and indicators, and quality assessment methods used for these three types of VGI.

3.1. Map-based VGI

  • Map-based VGI concerns all VGI sources that include geometries as points, lines, and polygons, the basic elements to design a map.
  • Each tag describes a specific geographic entity from different perspectives.
  • This open classification scheme can lead to misclassification and reduction in data quality.
  • Map-based VGI is commonly used for purposes, such as navigation and POI search.
  • In addition to accuracy, providing reliable services is affected by data completeness: feature, attribute, and model completeness.

3.2. Image-based VGI

  • Not only the GPS precision and accuracy errors resulting from various devices, but also other factors influence the quality of image-based VGI.
  • Instead of stating the position from where the photo was taken (the photographer position), some contributors tend to geotag the photo with the position of the photo content, which could be several kilometers away from where the photo originated, causing positional accuracy issues (as also discussed in Keßler et al. 2009).
  • This is a problem when the authors want to utilize these photos, for example, in human trajectory analysis.
  • Such contents are not fit for use for tasks, such as disaster management, environmental monitoring, or pedestrian navigation.
  • Citizen science projects, such as GeoTag-X, have in place machine learning and crowd-sourcing methods to discover unauthentic material and remove it.
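A minimal, hypothetical sketch of one such automated check (not the actual GeoTag-X pipeline) is shown below: it flags a photo whose geotag lies implausibly far from the location of its depicted content, using the article's Brandenburg Gate example (a photo of the Gate geotagged in Jakarta); the 5 km threshold is an arbitrary assumption.

```python
# Minimal sketch: flag photos whose geotag lies far from the location of the
# depicted content (here a landmark coordinate assumed to come from, e.g.,
# the photo's title or tags).
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two WGS84 points in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def flag_suspect_geotag(photo_latlon, content_latlon, max_km=5.0):
    """True if the geotag is implausibly far from the depicted location."""
    return haversine_km(*photo_latlon, *content_latlon) > max_km

# Geotag placed in Jakarta vs. the Brandenburg Gate as the depicted content.
print(flag_suspect_geotag((-6.2088, 106.8456), (52.5163, 13.3777)))  # True
```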

4. The literature review methodology

  • Figure 2 shows the distribution of the reviewed papers for VGI quality assessment methods.
  • Evidently, the publication of papers on this topic gained momentum in 2010; for the most part, papers discuss methods for map-based VGI.

5. Existing methods for assessing the quality of VGI

  • The authors have reviewed state-of-the-art methods to assess various quality measures and indicators of VGI.
  • Comparison with satellite imagery is one method to assess the positional accuracy of maps.
  • Most of the methods found have been implemented conceptually for a particular use case.
  • These methods have been reviewed mainly based on the type of VGI, the quality measures and indicators supported, and the approaches followed to develop the method.

5.1. Distribution of selected literature

  • Out of the 56 papers that the authors reviewed, 40 papers discuss methods for assessing the quality of map-based VGI, in most cases taking OSM data as the VGI source.
  • Eighteen papers introduce methods for text-based VGI taking mainly Twitter, Wikipedia, and Yahoo! answers as the VGI source.

5.2. Type of quality measures, indicators, and their associated methods

  • In a rather different approach, Canavosio-Zuzelski et al. (2013) apply a photogrammetric method to assess the positional accuracy of OSM road features using stereo imagery and a vector adjustment model.
  • Their method applies analytical measurement principles to compute accurate real world geo-locations of OSM road vectors.
  • The proposed approach was tested on several urban gridded city streets from the OSM database with the results showing that the post adjusted shape points improved positional accuracy by 86%.
  • Furthermore, the vector adjustment was able to recover 95% of the actual positional displacement present in the database.
  • Brando and Bucher (2010) present a generic framework to manage the quality of ISO standardized quality measures by using formal specifications and reference datasets.

5.2.3. Quality assessment in text-based VGI

  • Hasan Dalip et al. (2009), on the other hand, use text length, structure, style, readability, revision history, and social network as indicators of text content quality in Wikipedia articles.
  • They further use regression analysis to combine various such weighted quality values into a single quality value that represents an overall aggregated quality metric for text content quality (a sketch of this kind of aggregation follows this list).
  • Bordogna et al. (2014) assess the validity of text data using the number of words, the proportion of correctly spelled words, language intelligibility, diffusion of words, and the presence of technical terms as indicators of text content quality.
  • They further explored quality indicators such as experience, recognition, and reputation to determine the quality of VGI.
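The following sketch illustrates the general idea of combining several text-quality indicators into one aggregated score with least-squares regression; the feature set, training values, and learned weights are entirely hypothetical and do not reproduce the models of Hasan Dalip et al. (2009) or Bordogna et al. (2014).

```python
# Minimal sketch: learn weights that combine text-quality indicators into a
# single score via least-squares regression on small hypothetical training data.
import numpy as np

# Each row: [text_length, n_sections, readability, n_revisions]; y: assessed quality.
X = np.array([
    [1200, 5, 0.7, 30],
    [300,  1, 0.4,  2],
    [2500, 9, 0.8, 80],
    [800,  3, 0.6, 10],
], dtype=float)
y = np.array([0.8, 0.2, 0.9, 0.5])

# Standardise the features and add an intercept column.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
A = np.hstack([Xs, np.ones((len(Xs), 1))])
weights, *_ = np.linalg.lstsq(A, y, rcond=None)

def quality_score(features):
    """Aggregate quality value for a new article, using the learned weights."""
    fs = (np.asarray(features, dtype=float) - X.mean(axis=0)) / X.std(axis=0)
    return float(np.append(fs, 1.0) @ weights)

print(quality_score([1500, 6, 0.75, 40]))  # a single aggregated quality value
```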

5.2.4. Generic approaches

  • Table 2 shows a summary matrix of all quality measures and indicators observed in the literature review, with various methods that can be applied to assess these quality measures/indicators.
  • Following this matrix, one can learn which methods can be applied to solve various quality issues within map-, text-, and image-based VGI.
  • This should be interpreted with caution, as the authors present here only what they discovered through the literature review; the presented methods could be applicable beyond the contexts in which they were found and therefore need to be explored further.

6. Discussion and future research perspectives in VGI quality

  • As further evident from this review, there is no holy grail that could solve all types of quality issues in VGI.
  • Addressing these limitations and thereby improving the existing methods already paves the way for new contributions on this topic, which should be recognized as valid scientific contributions in the VGI community.

7. Conclusions

  • The authors have taken a critical look at the quality issues within map, image, and text VGI types.
  • The review shows the increasing utilization of implicit VGI for geospatial research.
  • This explains the use of indicators, such as reputation, trust, credibility, vagueness, experience, recognition, or local knowledge as quality indicators.
  • In addition, the implicit nature of the geography that is contributed in most of these VGI is yet another reason for the insufficiency of quality assessment methods for text- and image-based VGI.
  • The authors have further discovered data mining as an additional approach in the literature that extends Goodchild and Li's (2012) classification.


A review of volunteered geographic information quality assessment methods

Hansi Senaratne (a), Amin Mobasheri (b), Ahmed Loai Ali (c,d), Cristina Capineri (e) and Mordechai (Muki) Haklay (f)

(a) Data Analysis and Visualization Group, University of Konstanz, Konstanz, Germany; (b) GIScience Research Group, Heidelberg University, Heidelberg, Germany; (c) Bremen Spatial Cognition Center, University of Bremen, Bremen, Germany; (d) Information System Department, Assiut University, Assiut, Egypt; (e) Faculty of Political Sciences, University of Sienna, Sienna, Italy; (f) Department of Geomatic Engineering, University College London, London, UK

Published in: International Journal of Geographical Information Science 31(1), 2017, pp. 139-167. https://dx.doi.org/10.1080/13658816.2016.1189556
ABSTRACT
With the ubiquity of advanced web technologies and location-sensing handheld devices, citizens, regardless of their knowledge or expertise, are able to produce spatial information. This phenomenon is known as volunteered geographic information (VGI). During the past decade VGI has been used as a data source supporting a wide range of services, such as environmental monitoring, events reporting, human movement analysis, disaster management, etc. However, these volunteer-contributed data also come with varying quality. Reasons for this are: data is produced by heterogeneous contributors, using various technologies and tools, having different levels of detail and precision, serving heterogeneous purposes, and a lack of gatekeepers. Crowd-sourcing, social, and geographic approaches have been proposed and later followed to develop appropriate methods to assess the quality measures and indicators of VGI. In this article, we review various quality measures and indicators for selected types of VGI and existing quality assessment methods. As an outcome, the article presents a classification of VGI with current methods utilized to assess the quality of selected types of VGI. Through these findings, we introduce data mining as an additional approach for quality handling in VGI.
1. Introduction
Volunteered geographic information (VGI) is where citizens, often untrained, and regardless of their expertise and background, create geographic information on dedicated web platforms (Goodchild 2007), e.g., OpenStreetMap (OSM), Wikimapia, Google MyMaps, Map Insight, and Flickr. In a typology of VGI, the works of Antoniou et al. (2010) and Craglia et al. (2012) classified VGI based on the type of explicit/implicit geography being captured and the type of explicit/implicit volunteering. In explicit-VGI, contributors are mainly focused on mapping activities. Thus, the contributor explicitly annotates the data with geographic contents (e.g., geometries in OSM, Wikimapia, or Google). Data that is
implicitly associated with a geographic location could be any kind of media: text, image, or video referring to or associated with a specific geographic location. For example, geotagged microblogs (e.g., Tweets), geotagged images from Flickr, or Wikipedia articles that refer to geographic locations. Craglia et al. (2012) further elaborated that for each type of implicit/explicit geography and volunteering, there are potentially different approaches for assessing the quality.
Due to the increased potential and use of VGI (as demonstrated in the works of Liu
et al. 2008, Jacob et al. 2009, McDougall 2009, Bulearca and Bulearca 2010, Sakaki et al.
2010, MacEachren et al. 2011, Chunara et al. 2012, Fuchs et al. 2013), it becomes
increasingly important to be aware of the quality of VGI, in order to derive accurate
information and decisions. Due to a lack of standardization, quality in VGI has shown to
vary across heterogeneous data sources (text, image, maps, etc.). For example, as seen in
Figure 1, a photograph of the famous tourist site the Brandenburg Gate in Berlin is
incorrectly geotagged in Jakarta, Indonesia on the photo-sharing platform Flickr. On the
other hand, OSM has also shown heterogeneity in coverage between different places (Haklay 2010). These factors give rise to variable quality in VGI. This can be explained by the fact
that humans perceive and express geographic regions and spatial relations imprecisely,
and in terms of vague concepts (Montello et al. 2003). This vagueness in human
conceptualization of location is due not only to the fact that geographic entities are
continuous in nature, but also due to the quality and limitations of spatial knowledge
(Hollenstein and Purves 2014).
Figure 1. A photograph of the Brandenburg Gate in Berlin is incorrectly geotagged in Jakarta,
Indonesia on the popular photo-sharing platform Flickr.

Providing reliable services or extracting useful information requires data that meets a fitness-for-use quality standard. Incorrect (as seen in Figure 1) or malicious geographic annotations could be minimized by putting in place appropriate quality indicators and measures for these various VGI contributions.
Goodchild and Li (2012) have discussed three approaches for assuring the quality of VGI: crowd-sourcing (the involvement of a group to validate and correct errors that have been made by an individual contributor), social approaches (trusted individuals who have earned a good reputation with their contributions to VGI can, for example, act as gatekeepers to maintain and control the quality of other VGI contributions), and geographic approaches (use of laws and knowledge from geography, such as Tobler's first law, to assess the quality). Many works have developed methods to assess the quality of VGI based on these approaches.
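As a concrete, hypothetical illustration of a geographic approach (a sketch only, not a method drawn from the reviewed literature), the snippet below applies the intuition behind Tobler's first law: a contributed attribute value that deviates sharply from the values of its nearest neighbours is flagged for review. The data, the number of neighbours, and the deviation threshold are invented for illustration.

```python
# Minimal sketch of a Tobler-style plausibility check: flag a contribution whose
# value differs strongly from its nearest neighbours. All numbers are made up.
from math import hypot

def tobler_flag(new_point, new_value, neighbours, k=3, max_dev=2.0):
    """neighbours: list of ((x, y), value). Flag the contribution if its value
    deviates from the mean of the k nearest neighbours by more than max_dev
    standard deviations of those neighbours."""
    nearest = sorted(neighbours, key=lambda nv: hypot(new_point[0] - nv[0][0],
                                                      new_point[1] - nv[0][1]))[:k]
    values = [v for _, v in nearest]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5 or 1.0
    return abs(new_value - mean) > max_dev * std

# e.g. a contributed elevation of 950 m surrounded by values around 110-130 m
obs = [((0, 0), 120.0), ((1, 1), 110.0), ((2, 0), 130.0), ((10, 10), 500.0)]
print(tobler_flag((0.5, 0.5), 950.0, obs))  # True: implausible given its neighbours
```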
In this article, we present an extensive review of the existing methods in the state-of-
the-art to assess the quality of map-, image-, and text-based VGI. As an outcome of the
review, we identify data mining as one more stand-alone approach to assess VGI quality
by utilizing computational processes for discovering patterns and learning purely from
data, irrespective of the laws and knowledge from geography, and independent from
social or crowd-sourced approaches. Extending the spectrum of approaches will sprout
more quality assessment methods in the future, especially for VGI types that have not
been extensively researched so far. To the best of our knowledge, surveys on existing
methods have not been done so far. This review provides an overview of methods that
have been built based on theories and discussions in the literature. Furthermore, this
survey gives the reader a glimpse of the practical applicability of all identified approaches. The remainder of this article unfolds as follows. In Section 2, we describe the different quality measures and indicators for VGI. In Section 3, we describe the main
types of VGI that we consider for our survey, and in Section 4, we describe the
methodology that was followed for the selection of literature for this survey. Section 5
summarizes the findings of the survey, and Section 6 discusses the limitations and future research perspectives. Finally, we conclude our findings in Section 7.
2. Measures and indicators for VGI quality
Quality of VGI can be described by quality measures and quality indicators (Antoniou and
Skopeliti 2015). Quality measures, mainly adhering to the ISO principles and guidelines
refer to those elements that can be used to ascertain the discrepancy between the
contributed spatial data and the ground truth (e.g., completeness of data) mainly by
comparing to authoritative data. When authoritative data is not usable for comparisons and the established measures are no longer adequate to assess the quality of VGI, researchers have explored more intrinsic ways to assess VGI quality by looking into other proxies for quality measures. These are called quality indicators, which rely on various participation biases, contributor expertise or the lack of it, background, etc., that influence the quality of VGI but cannot be directly measured (Antoniou and
Skopeliti 2015). In the following, these quality measures and indicators are described in
detail. The review of quality assessment methods in Section 5 is based on these various
quality measures and indicators.

2.1. Quality measures for VGI
The International Organization for Standardization (ISO) defined geographic information quality as the 'totality of characteristics of a product that bear on its ability to satisfy stated and implied needs'. ISO/TC 211 (Technical Committee) developed a set of international standards that define the measures of geographic information quality (standard 19138, as part of the metadata standard 19115). These quantitative quality measures are: completeness, consistency, positional accuracy, temporal accuracy, and thematic accuracy.
Completeness describes the relationship between the represented objects and their
conceptualizations. This can be measured as the absence of data (errors of omission) and
presence of excess data (errors of commission). Consistency is the coherence in the data
structures of the digitized spatial data. The errors resulting from the lack of it are
indicated by (i) conceptual consistency, (ii) domain consistency, (iii) format consistency,
and (iv) topological consistency. Accuracy refers to the degree of closeness between a
measurement of a quantity and the accepted true value of that quantity, and it is in the
form of positional accuracy, temporal accuracy and thematic accuracy. Positional accuracy is indicated by (i) absolute or external accuracy, (ii) relative or internal accuracy, (iii) gridded data position accuracy. Thematic accuracy is indicated by (i) classification correctness, (ii) non-quantitative attribute correctness, (iii) quantitative attribute accuracy. In both cases, the discrepancies can be numerically estimated. Temporal accuracy is
indicated by (i) accuracy of a time measurement: correctness of the temporal references
of an item, (ii) temporal consistency: correctness of ordered events or sequences, (iii)
temporal validity: validity of data with regard to time.
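As an illustration of how such discrepancies can be numerically estimated, the following minimal sketch computes absolute (external) positional accuracy as the root-mean-square error between matched VGI and reference points; it assumes hypothetical, already-matched point pairs in a projected (metric) coordinate system and is not taken from any of the reviewed methods.

```python
# Minimal sketch: absolute (external) positional accuracy as the RMSE of the
# planar offsets between contributed points and matched reference points.
from math import sqrt

def positional_rmse(vgi_pts, ref_pts):
    """RMSE (metres) of the offsets between matched VGI and reference points."""
    sq = [(vx - rx) ** 2 + (vy - ry) ** 2
          for (vx, vy), (rx, ry) in zip(vgi_pts, ref_pts)]
    return sqrt(sum(sq) / len(sq))

vgi = [(10.0, 20.0), (105.0, 203.0), (301.0, 399.0)]
ref = [(12.0, 21.0), (100.0, 200.0), (300.0, 400.0)]
print(f"positional RMSE: {positional_rmse(vgi, ref):.2f} m")  # about 3.7 m
```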
2.2. Quality indicators for VGI
As part of the ISO standards, geographic information quality can be further assessed
through qualitative quality indicators, such as the purpose, usage, and lineage. These
indicators are mainly used to express the quality overview for the data. Purpose
describes the intended usage of the dataset. Usage describes the application(s) in
which the dataset has been utilized. Lineage describes the history of a dataset from
collection, acquisition to compilation and derivation to its form at the time of use (Hoyle
2001, Guinée 2002, Van Oort and Bregt 2005). In addition, where ISO standardized
measures and indicators are not applicable, we have found in the literature more
abstract quality indicators to imply the quality of VGI. These are: trustworthiness, credibility, text content quality, vagueness, local knowledge, experience, recognition, and reputation. Trustworthiness is a receiver judgment based on subjective characteristics, such as
reliability or trust (good ratings on the creations, and the higher frequency of usage of
these creations indicate this trustworthiness) (Flanagin and Metzger 2008). In assessing
the credibility of VGI, the source of information plays a crucial role, as it is what
credibility is primarily based upon. However, this is not straightforward. Due to the
non-authoritative nature of VGI, the source may be unavailable, concealed, or missing (this is avoided by gatekeepers in authoritative data). Credibility was defined by Hovland et al. (1953) as the believability of a source or message, which comprises primarily two dimensions: the trustworthiness (as explained earlier) and expertise. Expertise contains
objective characteristics such as accuracy, authority, competence, or source credentials (Flanagin and Metzger 2008). Therefore, in assessing the credibility of data as a quality indicator one needs to consider factors that contribute to the trustworthiness and expertise. Metadata about the origin of VGI can provide a foundation for the source credentials of VGI (Frew 2007). Text content quality (mostly applicable for text-based VGI) describes the quality of text data by the use of text features, such as the text length, structure, style, readability, revision history, topical similarity, the use of technical terminology, etc. Vagueness is the ambiguity with which the data is captured (e.g., vagueness caused by low resolutions) (De Longueville et al. 2010). Local knowledge is the contributor's familiarity with the geographic surroundings that she/he is implicitly or explicitly mapping. Experience is the involvement of a contributor with the VGI platform that she/he contributes to. This can be expressed by the time that the contributor has been registered with the VGI portal, the number of global positioning system (GPS) tracks contributed (e.g., in OSM), the number of features added and edited, or the amount of participation in online forums to discuss the data (Van Exel et al. 2010). Recognition is the acknowledgement given to a contributor based on tokens achieved (e.g., in gamified VGI platforms) and the reviewing of their contributions among their peers (Van Exel et al. 2010). Maué (2007) described reputation as a tool to ensure the validity of VGI. Reputation is assessed by, for example, the history of past interactions between collaborators. Resnick et al. (2000) described contributors' abilities and dispositions as features upon which this reputation can be based. Maué (2007) further argues that, similar to the eBay rating system, the created geographic features on various VGI platforms can be rated, tagged, discussed, and annotated, which affects the data contributor's reputation value.
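To illustrate how some of these indicator-style features can be derived for text-based VGI (a hypothetical sketch, not a method proposed in the article), the snippet below computes a few simple text content quality features such as length, a spelling ratio against a toy vocabulary, and the revision count; downstream methods could combine such features into an overall quality value.

```python
# Minimal sketch of indicator-style features for text content quality; the tiny
# vocabulary and the chosen features are purely illustrative assumptions.
import re

KNOWN_WORDS = {"the", "bridge", "is", "closed", "due", "to", "flooding",
               "near", "station", "main"}   # stand-in for a real dictionary

def text_quality_indicators(text, n_revisions):
    words = re.findall(r"[a-zA-Z']+", text.lower())
    spelled_ok = sum(1 for w in words if w in KNOWN_WORDS)
    return {
        "text_length": len(words),
        "spelling_ratio": spelled_ok / len(words) if words else 0.0,
        "revision_count": n_revisions,
    }

print(text_quality_indicators("The bridge near Main Station is closed due to floding", 3))
# spelling_ratio drops because of the misspelt 'floding'
```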
3. Map, image, and text-based VGI: definitions and quality issues
The effective utilization of VGI is strongly associated with data quality, and this varies depending primarily on the type of VGI, the way data is collected on the different VGI
platforms, and the context of usage. The following sections describe the selected forms
of VGI: (1) map, (2) image, and (3) text, their uses, and how data quality issues arise.
These three types of VGI are chosen based on the methods that are used to capture the
data (maps: as GPS points and traces, image: as photos, text: as plain text), and because
they are the most popular forms of VGI currently used. This section further lays the
ground work to understand the subsequent section on various quality measures and
indicators, and quality assessment methods used for these three types of VGI.
3.1. Map-based VGI
Map-based VGI concerns all VGI sources that include geometries as points, lines, and
polygons, the basic elements to design a map. Among others, OSM, Wikimapia, Google
Map Maker, and Map Insight are examples of map-based VGI projects. However, OSM is
the most prominent project due to the following reasons: (i) it aims to develop a free
map of the world accessible and obtainable for everyone; (ii) it has millions of registered
contributors; (iii) it has active mapper communities in many locations; and (iv) it provides
free and flexible contribution mechanisms for data (useful for map provision, routing,

Citations
Journal ArticleDOI
TL;DR: In this paper, the role of informal volunteers in emergency and disaster management is reviewed and it is argued that there is an overemphasis on volunteering within, and for, state and formal organizations.
Abstract: Despite highly specialised and capable emergency management systems, ordinary citizens are usually first on the scene in an emergency or disaster, and remain long after official services have ceased. Citizens often play vital roles in helping those affected to respond and recover, and can provide invaluable assistance to official agencies. However, in most developed countries, emergency and disaster management relies largely on a workforce of professionals and, to varying degrees, volunteers affiliated with official agencies. Those who work outside of such systems have tended to be viewed as a nuisance or liability, and their efforts are often undervalued. Given increasing disaster risk worldwide due to population growth, urban development and climate change, it is likely that 'informal' volunteers will provide much of the additional surge capacity required to respond to more frequent emergencies and disasters in the future. This paper considers the role of informal volunteers in emergency and disaster management. Definitions of volunteerism are reviewed and it is argued that there is an overemphasis on volunteering within, and for, state and formal organisations. We offer a broader definition of 'informal volunteerism' that recognises the many ways ordinary citizens volunteer their time, knowledge, skills and resources to help others in times of crisis. Two broad types of informal volunteerism are identified - emergent and extending - and the implications for emergency and disaster management are considered. Particular attention is given to increasing 'digital volunteerism' due to the greater accessibility of sophisticated but simple information and communication technologies. Culture and legal liability are identified as key barriers to greater participation of informal volunteers. We argue that more adaptive and inclusive models of emergency and disaster management are needed to harness the capacities and resilience that exist within and across communities.

314 citations

Journal ArticleDOI
TL;DR: It is shown that most fundamental Arctic infrastructure and population will be at high hazard risk even if the Paris Agreement target is achieved, and fundamental engineering structures at risk by 2050 are quantified.
Abstract: Degradation of near-surface permafrost can pose a serious threat to the utilization of natural resources, and to the sustainable development of Arctic communities. Here we identify at unprecedentedly high spatial resolution infrastructure hazard areas in the Northern Hemisphere's permafrost regions under projected climatic changes and quantify fundamental engineering structures at risk by 2050. We show that nearly four million people and 70% of current infrastructure in the permafrost domain are in areas with high potential for thaw of near-surface permafrost. Our results demonstrate that one-third of pan-Arctic infrastructure and 45% of the hydrocarbon extraction fields in the Russian Arctic are in regions where thaw-related ground instability can cause severe damage to the built environment. Alarmingly, these figures are not reduced substantially even if the climate change targets of the Paris Agreement are reached.

279 citations

Journal ArticleDOI
10 Aug 2017 - PLOS ONE
TL;DR: Two complementary, independent methods are used to assess the completeness of OSM road data in each country in the world and find that globally, OSM is ∼83% complete, and more than 40% of countries—including several in the developing world—have a fully mapped street network.
Abstract: OpenStreetMap, a crowdsourced geographic database, provides the only global-level, openly licensed source of geospatial road data, and the only national-level source in many countries. However, researchers, policy makers, and citizens who want to make use of OpenStreetMap (OSM) have little information about whether it can be relied upon in a particular geographic setting. In this paper, we use two complementary, independent methods to assess the completeness of OSM road data in each country in the world. First, we undertake a visual assessment of OSM data against satellite imagery, which provides the input for estimates based on a multilevel regression and poststratification model. Second, we fit sigmoid curves to the cumulative length of contributions, and use them to estimate the saturation level for each country. Both techniques may have more general use for assessing the development and saturation of crowd-sourced data. Our results show that in many places, researchers and policymakers can rely on the completeness of OSM, or will soon be able to do so. We find (i) that globally, OSM is ∼83% complete, and more than 40% of countries-including several in the developing world-have a fully mapped street network; (ii) that well-governed countries with good Internet access tend to be more complete, and that completeness has a U-shaped relationship with population density-both sparsely populated areas and dense cities are the best mapped; and (iii) that existing global datasets used by the World Bank undercount roads by more than 30%.

259 citations


Cites background from "A review of volunteered geographic ..."

  • ...[21, 22]; for a more comprehensive review, see [23])....


Journal ArticleDOI
TL;DR: Critical areas for the development of the field include integration of different types of information in data mashups, development of quality assurance procedures and ethical codes, improved integration with existing methods, and assurance of long-term, free and easy-to-access provision of public social media data for future environmental researchers.
Abstract: The analysis of data from social media and social networking sites may be instrumental in achieving a better understanding of human-environment interactions and in shaping future conservation and environmental management. In this study, we systematically map the application of social media data in environmental research. The quantitative review of 169 studies reveals that most studies focus on the analysis of people’s behavior and perceptions of the environment, followed by environmental monitoring and applications in environmental planning and governance. The literature testifies to a very rapid growth in the field, with Twitter (52 studies) and Flickr (34 studies) being most frequently used as data sources. A growing number of studies combine data from multiple sites and jointly investigates multiple types of media. A broader, more qualitative review of the insights provided by the investigated studies suggests that while social media data offer unprecedented opportunities in terms of data volume, scale of analysis, and real-time monitoring, researchers are only starting to cope with the challenges of data’s heterogeneity and noise levels, potential biases, ethics of data acquisition and use, and uncertainty about future data availability. Critical areas for the development of the field include integration of different types of information in data mashups, development of quality assurance procedures and ethical codes, improved integration with existing methods, and assurance of long-term, free and easy-to-access provision of public social media data for future environmental researchers.

203 citations

Journal ArticleDOI
01 Jan 2020
TL;DR: A comprehensive and systematic review of existing research on four core algorithmic issues in spatial crowdsourcing: (1) task assignment, (2) quality control, (3) incentive mechanism design, and (4) privacy protection.
Abstract: Crowdsourcing is a computing paradigm where humans are actively involved in a computing task, especially for tasks that are intrinsically easier for humans than for computers. Spatial crowdsourcing is an increasing popular category of crowdsourcing in the era of mobile Internet and sharing economy, where tasks are spatiotemporal and must be completed at a specific location and time. In fact, spatial crowdsourcing has stimulated a series of recent industrial successes including sharing economy for urban services (Uber and Gigwalk) and spatiotemporal data collection (OpenStreetMap and Waze). This survey dives deep into the challenges and techniques brought by the unique characteristics of spatial crowdsourcing. Particularly, we identify four core algorithmic issues in spatial crowdsourcing: (1) task assignment, (2) quality control, (3) incentive mechanism design, and (4) privacy protection. We conduct a comprehensive and systematic review of existing research on the aforementioned four issues. We also analyze representative spatial crowdsourcing applications and explain how they are enabled by these four technical issues. Finally, we discuss open questions that need to be addressed for future spatial crowdsourcing research and applications.

185 citations

References
Proceedings ArticleDOI
26 Apr 2010
TL;DR: This paper investigates the real-time interaction of events such as earthquakes in Twitter and proposes an algorithm to monitor tweets and to detect a target event and produces a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location.
Abstract: Twitter, a popular microblogging service, has received much attention recently. An important characteristic of Twitter is its real-time nature. For example, when an earthquake occurs, people make many Twitter posts (tweets) related to the earthquake, which enables detection of earthquake occurrence promptly, simply by observing the tweets. As described in this paper, we investigate the real-time interaction of events such as earthquakes in Twitter and propose an algorithm to monitor tweets and to detect a target event. To detect a target event, we devise a classifier of tweets based on features such as the keywords in a tweet, the number of words, and their context. Subsequently, we produce a probabilistic spatiotemporal model for the target event that can find the center and the trajectory of the event location. We consider each Twitter user as a sensor and apply Kalman filtering and particle filtering, which are widely used for location estimation in ubiquitous/pervasive computing. The particle filter works better than other comparable methods for estimating the centers of earthquakes and the trajectories of typhoons. As an application, we construct an earthquake reporting system in Japan. Because of the numerous earthquakes and the large number of Twitter users throughout the country, we can detect an earthquake with high probability (96% of earthquakes of Japan Meteorological Agency (JMA) seismic intensity scale 3 or more are detected) merely by monitoring tweets. Our system detects earthquakes promptly and sends e-mails to registered users. Notification is delivered much faster than the announcements that are broadcast by the JMA.

3,976 citations


"A review of volunteered geographic ..." refers background in this paper

  • ...…increased potential and use of VGI (as demonstrated in the works of Liu et al. 2008, Jacob et al. 2009, McDougall 2009, Bulearca and Bulearca 2010, Sakaki et al. 2010, MacEachren et al. 2011, Chunara et al. 2012, Fuchs et al. 2013), it becomes increasingly important to be aware of the quality of…...


Journal ArticleDOI
TL;DR: In recent months, there has been an explosion of interest in using the Web to create, assemble, and disseminate geographic information provided voluntarily by individuals as mentioned in this paper, and the role of the amateur in geographic observation has been discussed.
Abstract: In recent months there has been an explosion of interest in using the Web to create, assemble, and disseminate geographic information provided voluntarily by individuals. Sites such as Wikimapia and OpenStreetMap are empowering citizens to create a global patchwork of geographic information, while Google Earth and other virtual globes are encouraging volunteers to develop interesting applications using their own data. I review this phenomenon, and examine associated issues: what drives people to do this, how accurate are the results, will they threaten individual privacy, and how can they augment more conventional sources? I compare this new phenomenon to more traditional citizen science and the role of the amateur in geographic observation.

3,633 citations


"A review of volunteered geographic ..." refers methods in this paper

  • ...…(VGI) is where citizens, often untrained, and regardless of their expertise and background create geographic information on dedicated web platforms (Goodchild 2007), e.g., OpenStreetMap (OSM),1 Wikimapia,2 Google MyMaps,3 Map Insight4 and Flickr.5 In a typology of VGI, the works of Antoniou et al.…...


Journal ArticleDOI
TL;DR: The Internet offers vast new opportunities to interact with total strangers; these interactions can be fun, informative, even profitable, but they also involve risk.
Abstract: The Internet offers vast new opportunities to interact with total strangers. These interactions can be fun, informative, even profitable. But they also involve risk. Is the advice of a self-proclaimed expert at expertcentral.com reliable? Will an unknown dotcom site or eBay seller ship items promptly with appropriate packaging? Will the product be the same one described online? Prior to the Internet, such questions were answered, in part, through personal and corporate reputations. Vendors provided references, Better Business Bureaus tallied complaints, and past personal experience and person-to-person gossip told you on whom you could rely and on whom you could not. Participants' standing in their communities, including their roles in church and civic organizations, served as a valuable hostage. Internet services operate on a vastly larger scale

2,410 citations

Book
31 May 2002
TL;DR: The Guide to LCA is a guide to the management of LCA projects: procedures and guiding principles for the present Guide, which aims to clarify goal and scope definition, impact assessment, and interpretation.
Abstract: Preface. Foreword. Part 1: LCA in Perspective. 1. Why a new Guide to LCA? 2. Main characteristics of LCA. 3. International developments. 4. Guiding principles for the present Guide. 5. Reading guide. Part 2a: Guide. Reading guidance. 1. Management of LCA projects: procedures. 2. Goal and scope definition. 3. Inventory analysis. 4. Impact assessment. 5. Interpretation. Appendix A: Terms, definitions and abbreviations. Part 2b: Operational annex. List of tables. Reading guidance. 1. Management of LCA projects: procedures. 2. Goal and scope definition. 3. Inventory analysis. 4. Impact assessment. 5. Interpretation. 6. References. Part 3: Scientific background. Reading guidance. 1. General introduction. 2. Goal and scope definition. 3. Inventory analysis. 4. Impact assessment. 5. Interpretation. 6. References. Annex A: Contributors. Appendix B: Areas of application of LCA. Appendix C: Partitioning economic inputs and outputs to product systems.

2,383 citations


"A review of volunteered geographic ..." refers background in this paper

  • ...Lineage describes the history of a dataset from collection, acquisition to compilation and derivation to its form at the time of use (Hoyle 2001, Guinée 2002, Van Oort and Bregt 2005)....


Proceedings ArticleDOI
28 Mar 2011
TL;DR: There are measurable differences in the way messages propagate, that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.
Abstract: We analyze the information credibility of news propagated through Twitter, a popular microblogging service. Previous research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often unintentionally. In this paper we focus on automatic methods for assessing the credibility of a given set of tweets. Specifically, we analyze microblog postings related to "trending" topics, and classify them as credible or not credible, based on features extracted from them. We use features from users' posting and re-posting ("re-tweeting") behavior, from the text of the posts, and from citations to external sources. We evaluate our methods using a significant number of human assessments about the credibility of items on a recent sample of Twitter postings. Our results show that there are measurable differences in the way messages propagate, that can be used to classify them automatically as credible or not credible, with precision and recall in the range of 70% to 80%.

2,123 citations


"A review of volunteered geographic ..." refers background or methods in this paper

  • ...…as an information foraging source (MacEachren et al. 2011), in journalism to disseminate data to the public in near real-time basis (O’Connor 2009, Castillo et al. 2011), detect disease spreading (Chunara et al. 2012), event detection (Bosch et al. 2013), and for gaining insights on social…...


  • ...Castillo et al. (2011) employed users on mechanical turk12 to classify pre-classified ‘news-worthy events’ and ’informal discussions’ on Twitter according to several classes of credibility [(i) almost certainly true, (ii) likely to be false, . . .]....


  • ...Most of these content-based features are taken from Castillo et al. (2011)....


  • ...These features differ somewhat to the features extracted through the supervised classification of Castillo et al. (2011)....


  • ...Their approach is similar to that of Castillo et al. (2011), but the authors proposed a new technique to re-rank the Tweets based on a Pseudo Relevance Feedback....
