
Showing papers on "Crowdsourcing published in 2012"


Journal ArticleDOI
TL;DR: In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative.
Abstract: 'Crowdsourcing' is a relatively recent concept that encompasses many practices. This diversity leads to the blurring of the limits of crowdsourcing that may be identified virtually with any type of internet-based collaborative activity, such as co-creation or user innovation. Varying definitions of crowdsourcing exist, and therefore some authors present certain specific examples of crowdsourcing as paradigmatic, while others present the same examples as the opposite. In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative. Based on these existing definitions, an exhaustive and consistent definition for crowdsourcing is presented and contrasted in 11 cases.

1,616 citations


Journal ArticleDOI
TL;DR: It is argued that under certain circumstances crowdsourcing transforms distant search into local search, improving the efficiency and effectiveness of problem solving.
Abstract: We argue that under certain circumstances crowdsourcing transforms distant search into local search, improving the efficiency and effectiveness of problem solving. Under such circumstances a firm may choose to crowdsource problem solving rather than solve the problem internally or contract it to a designated supplier. These circumstances depend on the characteristics of the problem, the knowledge required for the solution, the crowd, and the solutions to be evaluated.

999 citations


Journal ArticleDOI
TL;DR: In this paper, a real-world comparison of ideas actually generated by a firm's professionals with those generated by users in the course of an idea generation contest is presented. The findings suggest that, at least under certain conditions, crowdsourcing might constitute a promising method to gather user ideas that can complement those of a firm's professionals at the idea generation stage in NPD.

881 citations


Posted Content
TL;DR: In this paper, the authors outline a framework that will enable crowd work that is complex, collaborative, and sustainable, and lay out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
Abstract: Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

803 citations


Journal ArticleDOI
TL;DR: Studying Dell's IdeaStorm community, serial ideators are found to be more likely than consumers with only one idea to generate an idea the organization finds valuable enough to implement, but they are unlikely to repeat their early success once their ideas are implemented.
Abstract: Several organizations have developed ongoing crowdsourcing communities that repeatedly collect ideas for new products and services from a large, dispersed "crowd" of non-experts (consumers) over time. Despite its promises, little is known about the nature of an individual's ideation efforts in such an online community. Studying Dell's IdeaStorm community, serial ideators are found to be more likely than consumers with only one idea to generate an idea the organization finds valuable enough to implement, but they are unlikely to repeat their early success once their ideas are implemented. As ideators with past success attempt to again come up with ideas that will excite the organization, they instead end up proposing ideas similar to their ideas that were already implemented (i.e., they generate less diverse ideas). The negative effects of past success are somewhat mitigated for ideators with diverse commenting activity on others' ideas. These findings highlight some of the challenges in maintaining an ongoing supply of quality ideas from the crowd over time.

697 citations


Journal ArticleDOI
01 Jul 2012
TL;DR: This work proposes a hybrid human-machine approach in which machines are used to do an initial, coarse pass over all the data and people are used to verify only the most likely matching pairs, and develops a novel two-tiered heuristic approach for creating batched tasks.
Abstract: Entity resolution is central to data integration and data cleaning. Algorithmic approaches have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a more accurate but expensive (and slow) way to bring human insight into the process. Previous work has proposed batching verification tasks for presentation to human workers but even with batching, a human-only approach is infeasible for data sets of even moderate size, due to the large numbers of matches to be tested. Instead, we propose a hybrid human-machine approach in which machines are used to do an initial, coarse pass over all the data, and people are used to verify only the most likely matching pairs. We show that for such a hybrid system, generating the minimum number of verification tasks of a given size is NP-Hard, but we develop a novel two-tiered heuristic approach for creating batched tasks. We describe this method, and present the results of extensive experiments on real data sets using a popular crowdsourcing platform. The experiments show that our hybrid approach achieves both good efficiency and high accuracy compared to machine-only or human-only alternatives.

499 citations
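
The hybrid pipeline above is easy to illustrate. Below is a minimal sketch, not the authors' implementation: a cheap machine similarity pass prunes the space of pairs, and only pairs scoring above a threshold are packed into fixed-size batches for crowd verification. The Jaccard similarity, threshold, and batch size are illustrative assumptions.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Cheap machine-pass similarity between two records (token overlap)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def hybrid_candidates(records, threshold=0.4, batch_size=2):
    """Machine pass: score all pairs, keep only likely matches, then pack
    them into fixed-size batches for crowd verification. The paper shows
    optimal batching is NP-Hard and uses a two-tiered heuristic instead."""
    likely = [(i, j) for i, j in combinations(range(len(records)), 2)
              if jaccard(records[i], records[j]) >= threshold]
    return [likely[k:k + batch_size] for k in range(0, len(likely), batch_size)]

records = ["iPhone 4 16GB white", "Apple iPhone 4 white 16 GB", "Galaxy S II"]
for batch in hybrid_candidates(records):
    print("ask crowd to verify:", [(records[i], records[j]) for i, j in batch])
```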


Proceedings ArticleDOI
05 Sep 2012
TL;DR: A new model for privacy is introduced, namely privacy as expectations, which involves using crowdsourcing to capture users' expectations of what sensitive resources mobile apps use and a new privacy summary interface that prioritizes and highlights places where mobile apps break people's expectations.
Abstract: Smartphone security research has produced many useful tools to analyze the privacy-related behaviors of mobile apps. However, these automated tools cannot assess people's perceptions of whether a given action is legitimate, or how that action makes them feel with respect to privacy. For example, automated tools might detect that a blackjack game and a map app both use one's location information, but people would likely view the map's use of that data as more legitimate than the game's. Our work introduces a new model for privacy, namely privacy as expectations. We report on the results of using crowdsourcing to capture users' expectations of what sensitive resources mobile apps use. We also report on a new privacy summary interface that prioritizes and highlights places where mobile apps break people's expectations. We conclude with a discussion of implications for employing crowdsourcing as a privacy evaluation technique.

491 citations
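
A toy sketch of how such crowd judgments could drive the summary interface, using an invented data shape rather than the paper's instruments: aggregate workers' expected/unexpected responses per (app, resource) pair and surface the largest expectation violations first.

```python
from collections import defaultdict

# Each response: (app, resource, expected), where `expected` is True if the
# crowd worker expected the app to access that resource. Data is invented.
responses = [
    ("blackjack", "location", False), ("blackjack", "location", False),
    ("blackjack", "location", True),
    ("maps", "location", True), ("maps", "location", True),
]

def violation_scores(responses):
    """Fraction of workers who did NOT expect each (app, resource) access;
    higher scores would be surfaced first in a privacy summary."""
    counts = defaultdict(lambda: [0, 0])  # (app, resource) -> [unexpected, total]
    for app, resource, expected in responses:
        counts[(app, resource)][0] += (not expected)
        counts[(app, resource)][1] += 1
    return sorted(((u / n, app, res) for (app, res), (u, n) in counts.items()),
                  reverse=True)

for score, app, res in violation_scores(responses):
    print(f"{app}/{res}: {score:.0%} of workers surprised")
```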


Proceedings ArticleDOI
06 Nov 2012
TL;DR: This paper introduces a taxonomy for spatial crowdsourcing, and focuses on one class of this taxonomy, in which workers send their locations to a centralized server and thereafter the server assigns to every worker his nearby tasks with the objective of maximizing the overall number of assigned tasks.
Abstract: With the ubiquity of mobile devices, spatial crowdsourcing is emerging as a new platform, enabling spatial tasks (i.e., tasks related to a location) to be assigned to and performed by human workers. In this paper, for the first time we introduce a taxonomy for spatial crowdsourcing. Subsequently, we focus on one class of this taxonomy, in which workers send their locations to a centralized server and thereafter the server assigns to every worker his nearby tasks with the objective of maximizing the overall number of assigned tasks. We formally define this maximum task assignment (or MTA) problem in spatial crowdsourcing, and identify its challenges. We propose alternative solutions to address these challenges by exploiting the spatial properties of the problem space. Finally, our experimental evaluations on both real-world and synthetic data verify the applicability of our proposed approaches and compare them by measuring both the number of assigned tasks and the travel cost of the workers.

484 citations
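
For intuition about the MTA objective, here is a hedged greedy baseline, not the paper's solutions: each task goes to the nearest worker that is in range and still under capacity, which maximizes the number of assigned tasks only approximately. All coordinates, the capacity, and the radius are invented.

```python
from math import hypot

# Invented toy data: workers and tasks as (x, y) points; each worker accepts
# at most `capacity` tasks within radius `r`.
workers = {"w1": (0.0, 0.0), "w2": (5.0, 5.0)}
tasks = {"t1": (0.5, 0.2), "t2": (0.2, 0.8), "t3": (5.2, 4.9), "t4": (9.0, 9.0)}

def greedy_assign(workers, tasks, capacity=2, r=2.0):
    """Greedy baseline for maximum task assignment: give each task to the
    nearest in-range worker with spare capacity."""
    load = {w: 0 for w in workers}
    assignment = {}
    for t, tp in tasks.items():
        options = sorted(
            (hypot(tp[0] - wp[0], tp[1] - wp[1]), w)
            for w, wp in workers.items() if load[w] < capacity)
        for dist, w in options:
            if dist <= r:
                assignment[t] = w
                load[w] += 1
                break
    return assignment

print(greedy_assign(workers, tasks))  # t4 stays unassigned: out of range
```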


Proceedings ArticleDOI
16 Apr 2012
TL;DR: A probabilistic framework is developed to make sensible decisions about candidate links and to identify unreliable human workers, improving the quality of the links while limiting the amount of work performed by the crowd.
Abstract: We tackle the problem of entity linking for large collections of online pages. Our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.

454 citations
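
A minimal sketch in the spirit of ZenCrowd's probabilistic framework, under a simplified model the paper does not commit to: iterate between a reliability-weighted vote per candidate link and re-estimating each worker's reliability from agreement with the consensus. Votes and the starting reliability are invented.

```python
votes = {  # worker -> {candidate_link: accepted?}  (invented data)
    "w1": {"dbpedia:Berlin": True, "dbpedia:Bern": False},
    "w2": {"dbpedia:Berlin": True, "dbpedia:Bern": True},
    "w3": {"dbpedia:Berlin": True, "dbpedia:Bern": False},
}

def em_links(votes, iters=5):
    reliability = {w: 0.7 for w in votes}          # uniform starting belief
    links = {l for v in votes.values() for l in v}
    decision = {}
    for _ in range(iters):
        for l in links:                            # weighted majority per link
            score = sum((reliability[w] if v[l] else 1 - reliability[w])
                        for w, v in votes.items() if l in v)
            decision[l] = score > sum(1 for v in votes.values() if l in v) / 2
        for w, v in votes.items():                 # agreement -> reliability
            agree = sum(v[l] == decision[l] for l in v)
            reliability[w] = (agree + 1) / (len(v) + 2)  # smoothed estimate
    return decision, reliability

print(em_links(votes))
```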




Proceedings ArticleDOI
11 Feb 2012
TL;DR: It is shown that volunteers are motivated by a complex framework of factors that dynamically change throughout their cycle of work on scientific projects; this motivational framework is strongly affected by personal interests as well as external factors such as attribution and acknowledgment.
Abstract: Online citizen science projects engage volunteers in collecting, analyzing, and curating scientific data. Existing projects have demonstrated the value of using volunteers to collect data, but few projects have reached the full collaborative potential of scientists and volunteers. Understanding the shared and unique motivations of these two groups can help designers establish the technical and social infrastructures needed to promote effective partnerships. We present findings from a study of the motivational factors affecting participation in ecological citizen science projects. We show that volunteers are motivated by a complex framework of factors that dynamically change throughout their cycle of work on scientific projects; this motivational framework is strongly affected by personal interests as well as external factors such as attribution and acknowledgment. Identifying the pivotal points of motivational shift and addressing them in the design of citizen-science systems will facilitate improved collaboration between scientists and volunteers.

Journal ArticleDOI
TL;DR: A taxonomy that classifies the mobile crowdsourcing field and three new applications that optimize location-based search and similarity services based on crowd-generated data are illustrated.
Abstract: Smartphones can reveal crowdsourcing's full potential and let users transparently contribute to complex and novel problem solving. This emerging area is illustrated through a taxonomy that classifies the mobile crowdsourcing field and through three new applications that optimize location-based search and similarity services based on crowd-generated data. Such applications can be deployed on SmartLab, a cloud of more than 40 Android devices deployed at the University of Cyprus that provides an open testbed to facilitate research and development of smartphone applications on a massive scale.

Proceedings Article
03 Dec 2012
TL;DR: By choosing the prior properly, both BP and MF perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
Abstract: Crowdsourcing has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of aggregating the crowdsourced labels provided by a collection of unreliable annotators. We approach this problem by transforming it into a standard inference problem in graphical models, and applying approximate variational methods, including belief propagation (BP) and mean field (MF). We show that our BP algorithm generalizes both majority voting and a recent algorithm by Karger et al. [1], while our MF method is closely related to a commonly used EM algorithm. In both cases, we find that the performance of the algorithms critically depends on the choice of a prior distribution on the workers' reliability; by choosing the prior properly, both BP and MF (and EM) perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
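
A compact mean-field-style sketch under strong simplifying assumptions (binary labels, one-coin workers), meant only to show where the worker-reliability prior enters; the paper's BP and MF derivations are more general.

```python
labels = {  # item -> {worker: observed label in {0, 1}}  (invented data)
    "i1": {"w1": 1, "w2": 1, "w3": 0},
    "i2": {"w1": 0, "w2": 1, "w3": 0},
}

def mean_field(labels, a=2.0, b=1.0, iters=20):
    """Alternate between the posterior over true labels and a MAP-style
    worker reliability under a Beta(a, b) prior; the prior choice is the
    paper's key lever for making MF/EM competitive."""
    workers = {w for obs in labels.values() for w in obs}
    rel = {w: a / (a + b) for w in workers}        # prior mean reliability
    post = {}
    for _ in range(iters):
        for item, obs in labels.items():           # P(true label = 1)
            p1 = p0 = 1.0
            for w, l in obs.items():
                p1 *= rel[w] if l == 1 else 1 - rel[w]
                p0 *= rel[w] if l == 0 else 1 - rel[w]
            post[item] = p1 / (p1 + p0)
        for w in workers:                          # Beta-posterior update
            match = sum(post[i] if l == 1 else 1 - post[i]
                        for i, obs in labels.items()
                        for ww, l in obs.items() if ww == w)
            n = sum(1 for obs in labels.values() if w in obs)
            rel[w] = (a - 1 + match) / (a + b - 2 + n)
    return post, rel

print(mean_field(labels))
```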

Proceedings ArticleDOI
04 Jun 2012
TL;DR: A set of Bayesian predictive models is constructed from data, and it is described how the models operate within an overall crowdsourcing architecture that combines the efforts of people and machine vision on the task of classifying celestial bodies defined within a citizen science project named Galaxy Zoo.
Abstract: We show how machine learning and inference can be harnessed to leverage the complementary strengths of humans and computational agents to solve crowdsourcing tasks. We construct a set of Bayesian predictive models from data and describe how the models operate within an overall crowdsourcing architecture that combines the efforts of people and machine vision on the task of classifying celestial bodies defined within a citizen science project named Galaxy Zoo. We show how learned probabilistic models can be used to fuse human and machine contributions and to predict the behaviors of workers. We employ multiple inferences in concert to guide decisions on hiring and routing workers to tasks so as to maximize the efficiency of large-scale crowdsourcing processes based on expected utility.
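
The fusion step can be illustrated with a hedged naive-Bayes sketch (not the paper's learned models): treat the machine-vision output as a prior and update it with conditionally independent worker votes. The worker accuracy value is invented.

```python
def fuse(machine_p_spiral, votes, worker_acc=0.8):
    """Combine a machine prior with worker votes via Bayes' rule.
    machine_p_spiral: classifier's P(galaxy is spiral).
    votes: list of booleans, True = worker says 'spiral'."""
    p_spiral, p_ellip = machine_p_spiral, 1 - machine_p_spiral
    for v in votes:
        p_spiral *= worker_acc if v else 1 - worker_acc
        p_ellip *= 1 - worker_acc if v else worker_acc
    return p_spiral / (p_spiral + p_ellip)

# A confident machine prior can be overturned by enough disagreeing workers:
print(fuse(0.9, [False, False, False]))  # ~0.12
```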

Proceedings Article
22 Jul 2012
TL;DR: This work presents a two-phase exploration-exploitation assignment algorithm and proves that it is competitive with respect to the optimal offline algorithm which has access to the unknown skill levels of each worker.
Abstract: We explore the problem of assigning heterogeneous tasks to workers with different, unknown skill sets in crowdsourcing markets such as Amazon Mechanical Turk. We first formalize the online task assignment problem, in which a requester has a fixed set of tasks and a budget that specifies how many times he would like each task completed. Workers arrive one at a time (with the same worker potentially arriving multiple times), and must be assigned to a task upon arrival. The goal is to allocate workers to tasks in a way that maximizes the total benefit that the requester obtains from the completed work. Inspired by recent research on the online adwords problem, we present a two-phase exploration-exploitation assignment algorithm and prove that it is competitive with respect to the optimal offline algorithm which has access to the unknown skill levels of each worker. We empirically evaluate this algorithm using data collected on Mechanical Turk and show that it performs better than random assignment or greedy algorithms. To our knowledge, this is the first work to extend the online primal-dual technique used in the online adwords problem to a scenario with unknown parameters, and the first to offer an empirical validation of an online primal-dual algorithm.
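
A rough simulation of the two-phase idea, with invented skills and budgets and without the paper's primal-dual machinery: early arrivals are assigned to random open tasks to estimate per-(worker, task) skill, and later arrivals are routed to their best-estimated open task.

```python
import random
random.seed(0)

def simulate(true_skill, budget, arrivals, explore_frac=0.3):
    """Two-phase exploration-exploitation assignment (illustrative only)."""
    est = {k: [0, 0] for k in true_skill}         # (worker, task) -> [ok, n]
    left = dict(budget)                           # task -> completions wanted
    benefit = 0
    for t, worker in enumerate(arrivals):
        open_tasks = [task for task in left if left[task] > 0]
        if not open_tasks:
            break
        if t < explore_frac * len(arrivals):      # exploration phase
            task = random.choice(open_tasks)
        else:                                     # exploitation phase
            task = max(open_tasks, key=lambda task: (
                est[(worker, task)][0] / est[(worker, task)][1]
                if est[(worker, task)][1] else 0.5))
        ok = random.random() < true_skill[(worker, task)]
        est[(worker, task)][0] += ok
        est[(worker, task)][1] += 1
        left[task] -= 1
        benefit += ok
    return benefit

true_skill = {("alice", "label"): 0.9, ("alice", "translate"): 0.3,
              ("bob", "label"): 0.4, ("bob", "translate"): 0.8}
arrivals = [random.choice(["alice", "bob"]) for _ in range(40)]
print(simulate(true_skill, {"label": 15, "translate": 15}, arrivals))
```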

Proceedings ArticleDOI
11 Feb 2012
TL;DR: This paper investigates whether timely, task-specific feedback helps crowd workers learn, persevere, and produce better results on micro-task platforms, and discusses interaction and infrastructure approaches for integrating real-time assessment into online work.
Abstract: Micro-task platforms provide massively parallel, on-demand labor. However, it can be difficult to reliably achieve high-quality work because online workers may behave irresponsibly, misunderstand the task, or lack necessary skills. This paper investigates whether timely, task-specific feedback helps crowd workers learn, persevere, and produce better results. We investigate this question through Shepherd, a feedback system for crowdsourced work. In a between-subjects study with three conditions, crowd workers wrote consumer reviews for six products they own. Participants in the None condition received no immediate feedback, consistent with most current crowdsourcing practices. Participants in the Self-assessment condition judged their own work. Participants in the External assessment condition received expert feedback. Self-assessment alone yielded better overall work than the None condition and helped workers improve over time. External assessment also yielded these benefits. Participants who received external assessment also revised their work more. We conclude by discussing interaction and infrastructure approaches for integrating real-time assessment into online work.

BookDOI
09 Aug 2012
TL;DR: The 20 chapters in this book explore both the theories and applications of crowdsourcing for geographic knowledge production, with three sections focusing on VGI, Public Participation, and Citizen Science; Geographic Knowledge Production and Place Inference; and Emerging Applications and New Challenges.
Abstract: The phenomenon of volunteered geographic information is part of a profound transformation in how geographic data, information, and knowledge are produced and circulated. By situating volunteered geographic information (VGI) in the context of the big-data deluge and data-intensive inquiry, the 20 chapters in this book explore both the theories and applications of crowdsourcing for geographic knowledge production, with three sections focusing on (1) VGI, Public Participation, and Citizen Science; (2) Geographic Knowledge Production and Place Inference; and (3) Emerging Applications and New Challenges. This book argues that future progress in VGI research depends in large part on building strong linkages with diverse geographic scholarship. Contributors of this volume situate VGI research in geography's core concerns with space and place, and offer several ways of addressing persistent challenges of quality assurance in VGI. This book positions VGI as part of a shift toward hybrid epistemologies, and potentially a fourth paradigm of data-intensive inquiry across the sciences. It also considers the implications of VGI and the exaflood for further time-space compression, new forms and degrees of digital inequality, the renewed importance of geography, and the role of crowdsourcing for geographic knowledge production.

Journal ArticleDOI
01 Jun 2012
TL;DR: A quality-sensitive answering model is introduced, which guides the crowdsourcing query engine in the design and processing of the corresponding crowdsourcing jobs, and effectively reduces the processing cost while maintaining the required query answer quality.
Abstract: Some complex problems, such as image tagging and natural language processing, are very challenging for computers, where even state-of-the-art technology is not yet able to provide satisfactory accuracy. Therefore, rather than relying solely on developing new and better algorithms to handle such tasks, we look to the crowdsourcing solution -- employing human participation -- to make good the shortfall in current technology. Crowdsourcing is a good supplement to many computer tasks. A complex job may be divided into computer-oriented tasks and human-oriented tasks, which are then assigned to machines and humans respectively. To leverage the power of crowdsourcing, we design and implement a Crowdsourcing Data Analytics System, CDAS. CDAS is a framework designed to support the deployment of various crowdsourcing applications. The core part of CDAS is a quality-sensitive answering model, which guides the crowdsourcing engine to process and monitor the human tasks. In this paper, we introduce the principles of our quality-sensitive model. To satisfy the user-required accuracy, the model guides the crowdsourcing query engine in the design and processing of the corresponding crowdsourcing jobs. It provides an estimated accuracy for each generated result based on the human workers' historical performances. When verifying the quality of the result, the model employs an online strategy to reduce waiting time. To show the effectiveness of the model, we implement and deploy two analytics jobs on CDAS, a Twitter sentiment analytics job and an image tagging job. We use real Twitter and Flickr data as our queries respectively. We compare our approaches with state-of-the-art classification and image annotation techniques. The results show that the human-assisted methods can indeed achieve a much higher accuracy. By embedding the quality-sensitive model into the crowdsourcing query engine, we effectively reduce the processing cost while maintaining the required query answer quality.
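
One concrete piece of a quality-sensitive model can be sketched as follows, under the simplifying assumption of independent workers with a known average historical accuracy p (not CDAS's exact formulation): compute how many majority votes are needed before the estimated result accuracy meets the user's requirement.

```python
from math import comb

def majority_accuracy(p: float, n: int) -> float:
    """P(majority of n independent workers with accuracy p is correct)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n // 2) + 1, n + 1))

def workers_needed(p: float, target: float, n_max: int = 99) -> int:
    """Smallest odd number of votes whose majority meets the target."""
    for n in range(1, n_max + 1, 2):              # odd n avoids ties
        if majority_accuracy(p, n) >= target:
            return n
    raise ValueError("target not reachable with n_max workers")

# Number of 70%-accurate workers needed for 95% majority accuracy:
print(workers_needed(0.7, 0.95))
```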

Proceedings ArticleDOI
11 Feb 2012
TL;DR: It is argued that Turkomatic's collaborative approach can be more successful than the conventional workflow design process and implications for the design of collaborative crowd planning systems are discussed.
Abstract: Preparing complex jobs for crowdsourcing marketplaces requires careful attention to workflow design, the process of decomposing jobs into multiple tasks, which are solved by multiple workers. Can the crowd help design such workflows? This paper presents Turkomatic, a tool that recruits crowd workers to aid requesters in planning and solving complex jobs. While workers decompose and solve tasks, requesters can view the status of worker-designed workflows in real time; intervene to change tasks and solutions; and request new solutions to subtasks from the crowd. These features lower the threshold for crowd employers to request complex work. During two evaluations, we found that allowing the crowd to plan without requester supervision is partially successful, but that requester intervention during workflow planning and execution improves quality substantially. We argue that Turkomatic's collaborative approach can be more successful than the conventional workflow design process and discuss implications for the design of collaborative crowd planning systems.

Proceedings ArticleDOI
25 Mar 2012
TL;DR: It is proved that the proposed incentive protocol can make the website operate close to Pareto efficiency; an alternative scenario is also examined, in which the protocol designer aims at maximizing the revenue of the website, and the performance of the optimal protocol is evaluated.
Abstract: Crowdsourcing websites (e.g. Yahoo! Answers, Amazon Mechanical Turk, etc.) emerged in recent years that allow requesters from all around the world to post tasks and seek help from an equally global pool of workers. However, intrinsic incentive problems reside in crowdsourcing applications as workers and requesters are selfish and aim to strategically maximize their own benefit. In this paper, we propose to provide incentives for workers to exert effort using a novel game-theoretic model based on repeated games. As there is always a gap in the social welfare between the non-cooperative equilibria emerging when workers pursue their self-interests and the desirable Pareto efficient outcome, we propose a novel class of incentive protocols based on social norms which integrates reputation mechanisms into the existing pricing schemes currently implemented on crowdsourcing websites, in order to improve the performance of the non-cooperative equilibria emerging in such applications. We first formulate the exchanges on a crowdsourcing website as a two-sided market where requesters and workers are matched and play gift-giving games repeatedly. Subsequently, we study the protocol designer's problem of finding an optimal and sustainable (equilibrium) protocol which achieves the highest social welfare for that website. We prove that the proposed incentive protocol can make the website operate close to Pareto efficiency. Moreover, we also examine an alternative scenario, where the protocol designer aims at maximizing the revenue of the website, and evaluate the performance of the optimal protocol.
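
A toy repeated-game simulation of the social-norm idea, with invented payoffs and detection/forgiveness probabilities rather than the paper's formal model: because requesters withhold the gift from low-reputation workers, always exerting effort dominates always shirking in the long run.

```python
import random
random.seed(1)

COST_EFFORT, PAY = 1.0, 2.0   # invented payoffs

def lifetime_payoff(strategy, rounds=100):
    """Simulate one worker under a social-norm protocol: shirking is
    detected and punished by resetting reputation to 0, which excludes
    the worker from tasks until (rarely) forgiven."""
    reputation, total = 1, 0.0
    for _ in range(rounds):
        if reputation == 0:       # social norm: shirkers are excluded
            reputation = 1 if random.random() < 0.1 else 0  # slow forgiveness
            continue
        effort = strategy()
        total += (PAY - COST_EFFORT) if effort else PAY
        if not effort:
            reputation = 0        # shirking detected -> punished
    return total

print("always work :", lifetime_payoff(lambda: True))
print("always shirk:", lifetime_payoff(lambda: False))
```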

Journal ArticleDOI
TL;DR: The components that comprise Geo-Wiki are outlined, along with how they are integrated in the architectural design and the lessons learned to date, in particular the need to add a mechanism for feedback and interaction as part of community building, and the need to address issues of data quality.
Abstract: Land cover derived from remotely sensed products is an important input to a number of different global, regional and national scale applications including resource assessments and economic land use models. During the last decade three global land cover datasets have been created, i.e. the GLC-2000, MODIS and GlobCover, but comparison studies have shown that there are large spatial discrepancies between these three products. One of the reasons for these discrepancies is the lack of sufficient in-situ data for the development of these products. To address this issue, a crowdsourcing tool called Geo-Wiki has been developed. Geo-Wiki has two main aims: to increase the amount of in-situ land cover data available for training, calibration and validation, and to create a hybrid global land cover map that provides more accurate land cover information than any current individual product. This paper outlines the components that comprise Geo-Wiki and how they are integrated in the architectural design. An overview of the main functionality of Geo-Wiki is then provided along with the current usage statistics and the lessons learned to date, in particular the need to add a mechanism for feedback and interaction as part of community building, and the need to address issues of data quality. The tool is located at geo-wiki.org.

Proceedings Article
01 Dec 2012
TL;DR: The key observation is that drawing a bounding box is significantly more difficult and time consuming than giving answers to multiple choice questions, so quality control through additional verification tasks is more cost effective than consensus based algorithms.
Abstract: A large number of images with ground truth object bounding boxes are critical for learning object detectors, which is a fundamental task in computer vision. In this paper, we study strategies to crowdsource bounding box annotations. The core challenge of building such a system is to effectively control the data quality with minimal cost. Our key observation is that drawing a bounding box is significantly more difficult and time consuming than giving answers to multiple choice questions. Thus, quality control through additional verification tasks is more cost effective than consensus based algorithms. In particular, we present a system that consists of three simple sub-tasks — a drawing task, a quality verification task and a coverage verification task. Experimental results demonstrate that our system is scalable, accurate, and cost-effective.
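
The three-subtask workflow can be sketched as a simple control loop; the worker functions below are stubs standing in for real crowd tasks, and the toy data is invented.

```python
def annotate_image(image, draw, verify_quality, verify_coverage, max_rounds=10):
    """Accept a box only after a cheap quality check; request more boxes
    until a coverage check says every instance in the image is boxed."""
    boxes = []
    for _ in range(max_rounds):
        if verify_coverage(image, boxes):     # multiple-choice: all covered?
            break
        box = draw(image, boxes)              # expensive: draw one new box
        if verify_quality(image, box):        # multiple-choice: box tight?
            boxes.append(box)
    return boxes

# Toy stand-ins simulating crowd answers for an image with 2 objects:
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
draw = lambda img, boxes: truth[len(boxes)]
verify_quality = lambda img, box: box in truth
verify_coverage = lambda img, boxes: len(boxes) == len(truth)
print(annotate_image("img.jpg", draw, verify_quality, verify_coverage))
```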

Journal ArticleDOI
TL;DR: Crowdsourcing appears to be a useful and effective tool in the context of smart city innovation, but should be thoughtfully used and combined with other user involvement approaches and within broader frameworks such as Living Labs.
Abstract: Within this article, the strengths and weaknesses of crowdsourcing for idea generation and idea selection in the context of smart city innovation are investigated. First, smart cities are defined next to similar but different concepts such as digital cities, intelligent cities or ubiquitous cities. It is argued that the smart city-concept is in fact a more user-centered evolution of the other city-concepts, which seem to be more technologically deterministic in nature. The principles of crowdsourcing are explained and the different manifestations are demonstrated. By means of a case study, the generation of ideas for innovative uses of ICT for city innovation by citizens through an online platform is studied, as well as the selection process. For this selection, a crowdsourcing solution is compared to a selection made by external experts. The comparison of both indicates that using the crowd as gatekeeper and selector of innovative ideas yields a long list with high user benefits. However, the generation of ideas in itself appeared not to deliver extremely innovative ideas. Crowdsourcing thus appears to be a useful and effective tool in the context of smart city innovation, but should be thoughtfully used and combined with other user involvement approaches and within broader frameworks such as Living Labs.

Proceedings ArticleDOI
20 May 2012
TL;DR: Deterministic and probabilistic algorithms are developed to optimize the expected cost (i.e., number of questions) and expected error, and can form an integral part of any query processor that uses human computation.
Abstract: Given a large set of data items, we consider the problem of filtering them based on a set of properties that can be verified by humans. This problem is commonplace in crowdsourcing applications, and yet, to our knowledge, no one has considered the formal optimization of this problem. (Typical solutions use heuristics to solve the problem.) We formally state a few different variants of this problem. We develop deterministic and probabilistic algorithms to optimize the expected cost (i.e., number of questions) and expected error. We experimentally show that our algorithms provide definite gains with respect to other strategies. Our algorithms can be applied in a variety of crowdsourcing scenarios and can form an integral part of any query processor that uses human computation.
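
The cost/error trade-off the paper optimizes can be illustrated with a fixed-majority strategy, simpler than the adaptive strategies studied: ask n workers per item and take the majority, assuming a known, independent worker error rate e.

```python
from math import comb

def expected_error(e: float, n: int) -> float:
    """P(majority vote is wrong) with n independent workers of error rate e."""
    return sum(comb(n, k) * e**k * (1 - e)**(n - k)
               for k in range((n // 2) + 1, n + 1))

for n in (1, 3, 5, 7):   # expected cost is n questions per item
    print(f"n={n}: cost={n}, expected error={expected_error(0.2, n):.3f}")
```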

Journal ArticleDOI
TL;DR: In this paper, the authors present a case study of an integrated tool set that engages multiple types of users (from targeted citizen-based observation networks, expert-driven focused monitoring, and opportunistic crowdsourcing efforts) in monitoring a forest disease in the western United States.
Abstract: The interface between neogeography and citizen science has great potential for environmental monitoring, but this nexus has been explored less often than each subject individually. In this article we review the emerging role of volunteered geographic information in citizen science and present a case study of an integrated tool set that engages multiple types of users (from targeted citizen-based observation networks, expert-driven focused monitoring, and opportunistic crowdsourcing efforts) in monitoring a forest disease in the western United States. We first introduce the overall challenge of data collection in environmental monitoring projects and then discuss the literature surrounding an emergent integration of citizen science and volunteered geographical information. We next explore how these methods characterize and underpin knowledge discovery and how multimodal interaction is supported so that a large spectrum of contributors can be included. These concepts are summarized in a conceptual model that ...

Journal ArticleDOI
TL;DR: In this article, the authors contrast the vertically integrated innovation model to open innovation, user innovation, as well as other distributed processes (cumulative innovation, communities or social production, and co-creation), while also discuss open source software and crowdsourcing as applications of the perspectives.
Abstract: Research from a variety of perspectives has argued that innovation no longer takes place within a single organization, but rather is distributed across multiple stakeholders in a value network. Here we contrast the vertically integrated innovation model to open innovation, user innovation, as well as other distributed processes (cumulative innovation, communities or social production, and co-creation), while we also discuss open source software and crowdsourcing as applications of the perspectives. We consider differences in the nature of distributed innovation, as well as its origins and its effects. From this, we contrast the predictions of the perspectives on the sources, motivation and value appropriation of external innovation, and thereby provide a framework for the strategic management of distributed innovation.

Proceedings ArticleDOI
12 Aug 2012
TL;DR: This paper proposes and investigates a new methodology for discovering topic experts in the popular Twitter social network that leverages Twitter Lists, which are often carefully curated by individual users to include experts on topics that interest them and whose meta-data provides valuable semantic cues to the experts' domain of expertise.
Abstract: Finding topic experts on microblogging sites with millions of users, such as Twitter, is a hard and challenging problem. In this paper, we propose and investigate a new methodology for discovering topic experts in the popular Twitter social network. Our methodology relies on the wisdom of the Twitter crowds -- it leverages Twitter Lists, which are often carefully curated by individual users to include experts on topics that interest them and whose meta-data (List names and descriptions) provides valuable semantic cues to the experts' domain of expertise. We mined List information to build Cognos, a system for finding topic experts in Twitter. Detailed experimental evaluation based on a real-world deployment shows that: (a) Cognos infers a user's expertise more accurately and comprehensively than state-of-the-art systems that rely on the user's bio or tweet content, (b) Cognos scales well due to built-in mechanisms to efficiently update its experts' database with new users, and (c) Despite relying only on a single feature, namely crowdsourced Lists, Cognos yields results comparable to, if not better than, those given by the official Twitter experts search engine for a wide range of queries in user tests. Our study highlights Lists as a potentially valuable source of information for future content or expert search systems in Twitter.
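
The List-mining signal is simple to sketch with invented data; Cognos itself adds topic normalization, ranking, and database-update machinery on top of this core idea.

```python
from collections import Counter

lists = [  # (list owner, list name, member)  (invented data)
    ("u1", "machine learning", "@alice"),
    ("u2", "ML researchers",   "@alice"),
    ("u3", "machine learning", "@alice"),
    ("u4", "food bloggers",    "@alice"),
    ("u5", "machine learning", "@bob"),
]

def topic_profile(lists, member):
    """Infer a user's expertise from the names of the Lists that other
    users placed them on, by counting topic cues across those names."""
    words = Counter()
    for _, name, m in lists:
        if m == member:
            words.update(name.lower().split())
    return words.most_common()

print(topic_profile(lists, "@alice"))  # 'machine'/'learning' dominate
```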

Journal ArticleDOI
TL;DR: A framework for a company-internal application of crowdsourcing methods is proposed, and a set of five goals companies can pursue employing internal crowdsourcing is presented.
Abstract: Crowdsourcing is typically associated with the incorporation of company-external stakeholders such as customers in the value creating process. This article proposes a framework for a company-internal application of crowdsourcing methods. It presents a set of five goals companies can pursue employing internal crowdsourcing. The practical approach of an Austrian medium-sized technology company is described in detail, including insights on software design and appropriate procedures.

Proceedings ArticleDOI
20 May 2012
TL;DR: It is shown that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard, and heuristic functions are provided to select the maximum given evidence, and to select additional votes.
Abstract: We consider a crowdsourcing database system that may cleanse, populate, or filter its data by using human workers. Just like a conventional DB system, such a crowdsourcing DB system requires data manipulation functions such as select, aggregate, maximum, average, and so on, except that now it must rely on human operators (that, for example, compare two objects) with very different latency, cost and accuracy characteristics. In this paper, we focus on one such function, maximum, that finds the highest ranked object or tuple in a set. In particular, we study two problems: given a set of votes (pairwise comparisons among objects), how do we select the maximum? And how do we improve our estimate by requesting additional votes? We show that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard. We then provide heuristic functions to select the maximum given evidence, and to select additional votes. We experimentally evaluate our functions to highlight their strengths and weaknesses.
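
A hedged heuristic sketch in the spirit of the two problems studied (exact selection being NP-Hard): score objects by pairwise win counts to pick a likely maximum, and request the next vote on the least-compared pair among the current leaders. The votes and the leader cutoff are invented.

```python
from collections import Counter
from itertools import combinations

votes = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]  # (winner, loser)

def pick_max_and_next_vote(objects, votes):
    """Evidence-based max by win count, plus a simple additional-vote rule."""
    wins = Counter(w for w, _ in votes)
    best = max(objects, key=lambda o: wins[o])
    seen = Counter(frozenset(v) for v in votes)
    # Ask for another vote on the least-compared pair among top candidates:
    leaders = sorted(objects, key=lambda o: -wins[o])[:3]
    pair = min(combinations(leaders, 2), key=lambda p: seen[frozenset(p)])
    return best, pair

print(pick_max_and_next_vote(["a", "b", "c"], votes))
```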

Posted Content
TL;DR: An active learning/adaptive testing scheme based on a greedy minimization of expected model entropy is devised, which allows a more efficient resource allocation by dynamically choosing the next question to be asked based on the previous responses.
Abstract: We propose a new probabilistic graphical model that jointly models the difficulties of questions, the abilities of participants and the correct answers to questions in aptitude testing and crowdsourcing settings. We devise an active learning/adaptive testing scheme based on a greedy minimization of expected model entropy, which allows a more efficient resource allocation by dynamically choosing the next question to be asked based on the previous responses. We present experimental results that confirm the ability of our model to infer the required parameters and demonstrate that the adaptive testing scheme requires fewer questions to obtain the same accuracy as a static test scenario.
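
A simplified sketch of the adaptive scheme, substituting a 1-D Rasch-style response model for the paper's joint graphical model: with the current ability estimate fixed, the most entropy-reducing question to ask next is (approximately) the one whose predicted probability of a correct answer is closest to 0.5. Difficulties and the ability value are invented.

```python
from math import exp

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch-style probability that the participant answers correctly."""
    return 1.0 / (1.0 + exp(difficulty - ability))

def next_question(ability, difficulties, asked):
    """Greedy pick: the unasked question with P(correct) nearest 0.5,
    a proxy for maximal expected entropy reduction."""
    candidates = [q for q in difficulties if q not in asked]
    return min(candidates,
               key=lambda q: abs(p_correct(ability, difficulties[q]) - 0.5))

difficulties = {"q1": -2.0, "q2": 0.1, "q3": 2.5}
print(next_question(ability=0.0, difficulties=difficulties, asked=set()))  # q2
```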