
Showing papers on "Crowdsourcing published in 2011"


Journal ArticleDOI
TL;DR: The practice of crowdsourcing is transforming the Web and giving rise to a new field of inquiry into systems that aim to provide real-time information about events in a democratic manner.

1,165 citations


Journal ArticleDOI
TL;DR: It is concluded that the use of these labor portals is an efficient and appropriate alternative to a university participant pool, despite small differences in personality and socially desirable responding across the samples.
Abstract: Online contract labor portals (i.e., crowdsourcing) have recently emerged as attractive alternatives to university participant pools for the purposes of collecting survey data for behavioral research. However, prior research has not provided a thorough examination of crowdsourced data for organizational psychology research. We found that, as compared with a traditional university participant pool, crowdsourcing respondents were older, were more ethnically diverse, and had more work experience. Additionally, the reliability of the data from the crowdsourcing sample was as good as or better than the corresponding university sample. Moreover, measurement invariance generally held across these groups. We conclude that the use of these labor portals is an efficient and appropriate alternative to a university participant pool, despite small differences in personality and socially desirable responding across the samples. The risks and advantages of crowdsourcing are outlined, and an overview of practical and ethical guidelines is provided.

979 citations


Proceedings ArticleDOI
07 May 2011
TL;DR: This work classifies human computation systems to help identify parallels between different systems and reveal "holes" in the existing work as opportunities for new research.
Abstract: The rapid growth of human computation within research and industry has produced many novel ideas aimed at organizing web users to do great things. However, the growth is not adequately supported by a framework with which to understand each new system in the context of the old. We classify human computation systems to help identify parallels between different systems and reveal "holes" in the existing work as opportunities for new research. Since human computation is often confused with "crowdsourcing" and other terms, we explore the position of human computation with respect to these related topics.

865 citations


Journal ArticleDOI
TL;DR: The advantages and disadvantages of crowdsourcing applications applied to disaster relief coordination are described, along with several challenges that must be addressed to make crowdsourcing a useful tool that can effectively facilitate the relief progress in coordination, accuracy, and security.
Abstract: This article briefly describes the advantages and disadvantages of crowdsourcing applications applied to disaster relief coordination. It also discusses several challenges that must be addressed to make crowdsourcing a useful tool that can effectively facilitate the relief progress in coordination, accuracy, and security.

817 citations


Proceedings ArticleDOI
12 Jun 2011
TL;DR: The design of CrowdDB is described; a major change is that the traditional closed-world assumption for query processing does not hold for human input, and important avenues for future work in the development of crowdsourced query processing systems are outlined.
Abstract: Some queries cannot be answered by machines only. Processing such queries requires human input for providing information that is missing from the database, for performing computationally difficult functions, and for matching, ranking, or aggregating results based on fuzzy criteria. CrowdDB uses human input via crowdsourcing to process queries that neither database systems nor search engines can adequately answer. It uses SQL both as a language for posing complex queries and as a way to model data. While CrowdDB leverages many aspects of traditional database systems, there are also important differences. Conceptually, a major change is that the traditional closed-world assumption for query processing does not hold for human input. From an implementation perspective, human-oriented query operators are needed to solicit, integrate and cleanse crowdsourced data. Furthermore, performance and cost depend on a number of new factors including worker affinity, training, fatigue, motivation and location. We describe the design of CrowdDB, report on an initial set of experiments using Amazon Mechanical Turk, and outline important avenues for future work in the development of crowdsourced query processing systems.

688 citations
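A minimal sketch of the open-world idea described in the CrowdDB entry above: when a query touches a value the database does not have, fall back to the crowd and cache the cleansed answer. This is an illustration only, not CrowdDB's actual CrowdSQL syntax or operators; the `post_hit_and_collect` helper and the `companies` table are hypothetical stand-ins for an MTurk round-trip.

```python
# Illustrative sketch (not CrowdDB's implementation): an open-world lookup that
# falls back to a crowdsourcing platform when the database has no value.
import sqlite3

def post_hit_and_collect(prompt: str) -> str:
    """Hypothetical crowd call: post a few HITs and majority-vote the answers."""
    answers = ["mountain view", "mountain view", "palo alto"]  # stubbed worker answers
    return max(set(answers), key=answers.count)

def crowd_lookup(conn, company: str) -> str:
    row = conn.execute(
        "SELECT hq_city FROM companies WHERE name = ?", (company,)
    ).fetchone()
    if row and row[0]:                      # closed-world answer already stored
        return row[0]
    # Open-world fallback: ask the crowd, then cache the cleansed answer.
    answer = post_hit_and_collect(f"What city is the headquarters of {company}?")
    conn.execute("UPDATE companies SET hq_city = ? WHERE name = ?", (answer, company))
    return answer

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE companies (name TEXT PRIMARY KEY, hq_city TEXT)")
conn.execute("INSERT INTO companies VALUES ('Acme Corp', NULL)")
print(crowd_lookup(conn, "Acme Corp"))      # -> crowd-supplied value
```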


Proceedings ArticleDOI
04 Jan 2011
TL;DR: By examining a variety of project characteristics, this work identified five types (Action, Conservation, Investigation, Virtual, and Education) that differ in primary project goals and the importance of physical environment to participation.
Abstract: Citizen science is a form of research collaboration involving members of the public in scientific research projects to address real-world problems. Often organized as a virtual collaboration, these projects are a type of open movement, with collective goals addressed through open participation in research tasks. Existing typologies of citizen science projects focus primarily on the structure of participation, paying little attention to the organizational and macrostructural properties that are important to designing and managing effective projects and technologies. By examining a variety of project characteristics, we identified five types (Action, Conservation, Investigation, Virtual, and Education) that differ in primary project goals and the importance of physical environment to participation.

590 citations


Proceedings Article
01 Jan 2011
TL;DR: This work adapts different models from classic motivation theory, work motivation theory and Open Source Software Development to crowdsourcing markets and finds that the extrinsic motivational categories have a strong effect on the time spent on the platform.
Abstract: The payment in paid crowdsourcing markets like Amazon Mechanical Turk is very low, yet the collected demographic data show that the participants are a very diverse group, including highly skilled full-time workers. Many existing studies on their motivation are rudimentary and not grounded in established motivation theory. Therefore, we adapt different models from classic motivation theory, work motivation theory and Open Source Software Development to crowdsourcing markets. The model is tested with a survey of 431 workers on Mechanical Turk. We find that the extrinsic motivational categories (immediate payoffs, delayed payoffs, social motivation) have a strong effect on the time spent on the platform. For many workers, however, intrinsic motivation aspects are more important, especially the different facets of enjoyment-based motivation like “task autonomy” and “skill variety”. Our contribution is a preliminary model based on established theory, intended for the comparison of different crowdsourcing platforms.

546 citations


Proceedings ArticleDOI
07 May 2011
TL;DR: This empirical study of "digital volunteers" in the aftermath of the Haiti earthquake describes their behaviors and mechanisms of self-organizing in the information space of a microblogging environment, where collaborators were newly found and distributed across continents.
Abstract: This empirical study of "digital volunteers" in the aftermath of the January 12, 2010 Haiti earthquake describes their behaviors and mechanisms of self-organizing in the information space of a microblogging environment, where collaborators were newly found and distributed across continents. The paper explores the motivations, resources, activities and products of digital volunteers. It describes how seemingly small features of the technical environment offered structure for self-organizing, while considering how the social-technical milieu enabled individual capacities and collective action. Using social theory about self-organizing, the research offers insight about features of coordination within a setting of massive interaction.

539 citations


Proceedings Article
12 Dec 2011
TL;DR: This paper gives a new algorithm for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers, and shows that the algorithm significantly outperforms majority voting and is asymptotically optimal through comparison to an oracle that knows the reliability of every worker.
Abstract: Crowdsourcing systems, in which tasks are electronically distributed to numerous "information piece-workers", have emerged as an effective paradigm for human-powered solving of large scale problems in domains such as image classification, data entry, optical character recognition, recommendation, and proofreading. Because these low-paid workers can be unreliable, nearly all crowdsourcers must devise schemes to increase confidence in their answers, typically by assigning each task multiple times and combining the answers in some way such as majority voting. In this paper, we consider a general model of such crowdsourcing tasks, and pose the problem of minimizing the total price (i.e., number of task assignments) that must be paid to achieve a target overall reliability. We give a new algorithm for deciding which tasks to assign to which workers and for inferring correct answers from the workers' answers. We show that our algorithm significantly outperforms majority voting and, in fact, is asymptotically optimal through comparison to an oracle that knows the reliability of every worker.

494 citations
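The sketch below illustrates the style of inference described in the entry above: iterative message passing on the worker-task graph, with a majority-voting baseline for comparison. The worker model, parameters, and fixed iteration count are illustrative assumptions, not the authors' exact algorithm or experimental setup.

```python
# Sketch of iterative inference on a worker-task bipartite graph, compared with
# majority voting. Binary tasks with +/-1 answers; A[i, j] is worker j's answer
# to task i (0 where the task was not assigned to that worker).
import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_workers, per_task = 200, 60, 5
truth = rng.choice([-1, 1], size=n_tasks)
reliability = rng.uniform(0.55, 0.95, size=n_workers)   # P(correct) per worker

A = np.zeros((n_tasks, n_workers))
for i in range(n_tasks):
    for j in rng.choice(n_workers, size=per_task, replace=False):
        A[i, j] = truth[i] if rng.random() < reliability[j] else -truth[i]

majority = np.sign(A.sum(axis=1))                        # baseline estimate

# Iterative message passing: x[i, j] = task i's message to worker j,
# y[i, j] = worker j's message to task i.
mask = (A != 0)
x = mask.astype(float)
for _ in range(10):
    ax = A * x
    y = mask * (ax.sum(axis=0, keepdims=True) - ax)      # worker reliability messages
    ay = A * y
    x = mask * (ay.sum(axis=1, keepdims=True) - ay)      # task belief messages
    x /= np.abs(x).max() + 1e-12                         # keep values bounded

estimate = np.sign((A * y).sum(axis=1))
print("majority voting accuracy:", (majority == truth).mean())
print("iterative accuracy      :", (estimate == truth).mean())
```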


Journal ArticleDOI
TL;DR: It is found that intrinsic motivation was more important than extrinsic motivation in inducing participation in crowdsourcing contests, and it is suggested that crowdsourcing contest tasks should preferably be highly autonomous, explicitly specified, and less complex, as well as require a variety of skills.
Abstract: Firms can seek innovative external ideas and solutions to business tasks by sponsoring co-creation activities such as crowdsourcing. To get optimal solutions from crowdsourcing contest participants, firms need to improve task design and motivate contest solvers' participation in the co-creation process. Based on the theory of extrinsic and intrinsic motivation as well as the theory of job design, we developed a research model to explain participation in crowdsourcing contests, as well as the effects of task attributes on intrinsic motivation. Subjective and objective data were collected from 283 contest solvers at two different time points. We found that intrinsic motivation was more important than extrinsic motivation in inducing participation. Contest autonomy, variety, and analyzability were positively associated with intrinsic motivation, whereas contest tacitness was negatively associated with intrinsic motivation. The findings suggest a balanced view of extrinsic and intrinsic motivation in order to encourage participation in crowdsourcing. We also suggest that crowdsourcing contest tasks should preferably be highly autonomous, explicitly specified, and less complex, as well as require a variety of skills.

477 citations


Journal ArticleDOI
TL;DR: A typology of Crowdsourcing practices is proposed based on two criteria: the integrative or selective nature of the process and the type of tasks that are crowdsourced (simple, complex and creative tasks).
Abstract: The word Crowdsourcing, a compound contraction of Crowd and Outsourcing, was used by Howe to define outsourcing to the crowd. The Crowdsourcing phenomenon covers heterogeneous situations and it has inspired a number of authors. However, we are still lacking a general and synthetic view of this concept. The aim of our work is to characterize Crowdsourcing in its various aspects. First, we define Crowdsourcing and provide examples that illustrate the diversity of Crowdsourcing practices, and we present similarities and differences between Crowdsourcing and established theories (Open Innovation, User Innovation and Open Source Software). Then, we propose and illustrate a typology of Crowdsourcing practices based on two criteria: the integrative or selective nature of the process and the type of tasks that are crowdsourced (simple, complex and creative tasks). Finally, we present some potential benefits and pitfalls of Crowdsourcing.

Proceedings ArticleDOI
07 May 2011
TL;DR: This work presents a general purpose framework for micro-task markets that provides a scaffolding for more complex human computation tasks which require coordination among many individuals, such as writing an article.
Abstract: Micro-task markets such as Amazon's Mechanical Turk represent a new paradigm for accomplishing work, in which employers can tap into a large population of workers around the globe to accomplish tasks in a fraction of the time and money of more traditional methods. However, such markets have been primarily used for simple, independent tasks, such as labeling an image or judging the relevance of a search result. Here we present a general purpose framework for accomplishing complex and interdependent tasks using micro-task markets. We describe our framework, a web-based prototype, and case studies on article writing, decision making, and science journalism that demonstrate the benefits and limitations of the approach.
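One way such a framework can coordinate many small contributions into a larger artifact, such as an article, is a partition/map/reduce-style decomposition. The sketch below illustrates that general idea only; the `ask_crowd` stub and the prompts are hypothetical, and it is not presented as the paper's actual framework.

```python
# Illustrative decomposition of a complex task (writing a short article) into
# micro-tasks: partition the work, map each piece to workers, reduce the results.
def ask_crowd(instruction: str) -> str:
    """Hypothetical stand-in for posting a micro-task and collecting the answer."""
    return f"[worker response to: {instruction!r}]"

def write_article(topic: str) -> str:
    # Partition: one worker proposes section headings (parsing is stubbed here).
    _outline = ask_crowd(f"List three section headings for an article about {topic}")
    sections = ["Background", "Current approaches", "Open problems"]

    # Map: independent workers draft one section each.
    drafts = [ask_crowd(f"Write one paragraph on '{s}' for an article about {topic}")
              for s in sections]

    # Reduce: a final worker merges the drafts into a coherent article.
    return ask_crowd("Combine these paragraphs into one article:\n" + "\n".join(drafts))

print(write_article("crowdsourcing"))
```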

Proceedings ArticleDOI
01 Oct 2011
TL;DR: A structured view of the research on crowdsourcing to date is provided, categorized according to applications, algorithms, performance and datasets.
Abstract: Crowdsourcing has evolved as a distributed problem-solving and business production model in recent years. In the crowdsourcing paradigm, tasks are distributed to networked people to complete, so that a company's production cost can be greatly reduced. In 2003, Luis von Ahn and his colleagues pioneered the concept of "human computation", which utilizes human abilities to perform computation tasks that are difficult for computers to process. The term "crowdsourcing" was later coined by Jeff Howe in 2006. Since then, a lot of work on crowdsourcing has focused on its different aspects, such as computational techniques and performance analysis. In this paper, we give a survey of the literature on crowdsourcing, categorized according to applications, algorithms, performance and datasets. This paper provides a structured view of the research on crowdsourcing to date.

Journal ArticleDOI
TL;DR: A comprehensive review of the overlapping domains of the Sensor Web, citizen sensing and human-in-the-loop sensing in the era of Mobile and Social Web, and the roles these domains can play in environmental and public health surveillance and crisis/disaster informatics can be found in this article.
Abstract: 'Wikification of GIS by the masses' is a phrase-term first coined by Kamel Boulos in 2005, two years earlier than Goodchild's term 'Volunteered Geographic Information'. Six years later (2005-2011), OpenStreetMap and Google Earth (GE) are now full-fledged, crowdsourced 'Wikipedias of the Earth' par excellence, with millions of users contributing their own layers to GE, attaching photos, videos, notes and even 3-D (three dimensional) models to locations in GE. From using Twitter in participatory sensing and bicycle-mounted sensors in pervasive environmental sensing, to creating a 100,000-sensor geo-mashup using Semantic Web technology, to the 3-D visualisation of indoor and outdoor surveillance data in real-time and the development of next-generation, collaborative natural user interfaces that will power the spatially-enabled public health and emergency situation rooms of the future, where sensor data and citizen reports can be triaged and acted upon in real-time by distributed teams of professionals, this paper offers a comprehensive state-of-the-art review of the overlapping domains of the Sensor Web, citizen sensing and 'human-in-the-loop sensing' in the era of the Mobile and Social Web, and the roles these domains can play in environmental and public health surveillance and crisis/disaster informatics. We provide an in-depth review of the key issues and trends in these areas, the challenges faced when reasoning and making decisions with real-time crowdsourced data (such as issues of information overload, "noise", misinformation, bias and trust), the core technologies and Open Geospatial Consortium (OGC) standards involved (Sensor Web Enablement and Open GeoSMS), as well as a few outstanding project implementation examples from around the world.


Proceedings ArticleDOI
05 Dec 2011
TL;DR: The results suggest that crowdsourcing is a highly effective QoE assessment method not only for online video, but also for a wide range of other current and future Internet applications.
Abstract: This paper addresses the challenge of assessing and modeling Quality of Experience (QoE) for online video services that are based on TCP streaming. We present a dedicated QoE model for YouTube that takes into account the key influence factors (such as stalling events caused by network bottlenecks) that shape quality perception of this service. As a second contribution, we propose a generic subjective QoE assessment methodology for multimedia applications (like online video) that is based on crowdsourcing, a highly cost-efficient, fast and flexible way of conducting user experiments. We demonstrate how our approach successfully leverages the inherent strengths of crowdsourcing while addressing critical aspects such as the reliability of the experimental data obtained. Our results suggest that crowdsourcing is a highly effective QoE assessment method not only for online video, but also for a wide range of other current and future Internet applications.
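For intuition about how stalling events can be mapped to perceived quality, the sketch below uses an exponential decay in the number and length of stalls. The functional form and coefficients are illustrative placeholders only, not the fitted YouTube QoE model from the paper.

```python
# Illustrative only: a MOS-like score that drops exponentially as stalling grows.
# The coefficients are hypothetical placeholders, not the paper's fitted model.
import math

def mos_estimate(num_stalls: int, stall_len_s: float,
                 a: float = 3.5, b: float = 0.15, c: float = 1.5) -> float:
    """Return a score in roughly [c, a + c] that decreases with total stalling."""
    return a * math.exp(-b * num_stalls * stall_len_s) + c

for n in (0, 1, 2, 4):
    print(n, "stalls of 3 s ->", round(mos_estimate(n, 3.0), 2))
```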

Proceedings ArticleDOI
05 Jul 2011
TL;DR: Results suggest that intrinsic motivation can indeed improve the quality of workers’ output, confirming the hypothesis, and reveal a synergistic interaction between intrinsic and extrinsic motivators that runs contrary to previous literature suggesting “crowding out” effects.
Abstract: Crowdsourced labor markets represent a powerful new paradigm for accomplishing work. Understanding the motivating factors that lead to high quality work could have significant benefits. However, researchers have so far found that motivating factors such as increased monetary reward generally increase workers’ willingness to accept a task or the speed at which a task is completed, but do not improve the quality of the work. We hypothesize that factors that increase the intrinsic motivation of a task – such as framing a task as helping others – may succeed in improving output quality where extrinsic motivators such as increased pay do not. In this paper we present an experiment testing this hypothesis along with a novel experimental design that enables controlled experimentation with intrinsic and extrinsic motivators in Amazon’s Mechanical Turk, a popular crowdsourcing task market. Results suggest that intrinsic motivation can indeed improve the quality of workers’ output, confirming our hypothesis. Furthermore, we find a synergistic interaction between intrinsic and extrinsic motivators that runs contrary to previous literature suggesting “crowding out” effects. Our results have significant practical and theoretical implications for crowd work.

Proceedings Article
01 Jan 2011
TL;DR: A new taxonomic framework for crowdsourcing processes is proposed, following a method of IS taxonomy development; it focuses exclusively on an organizational perspective and on the mechanisms available to crowdsourcing organizations.
Abstract: Crowdsourcing is an umbrella term for a variety of approaches that tap into the potential of a large and open crowd of people. So far, there is no systematic understanding of the processes used to source and aggregate contributions from the crowd. In particular, crowdsourcing organizations striving to achieve a specific goal should be able to evaluate the mechanisms that impact these processes. Following a method of IS taxonomy development we propose a new taxonomic framework for crowdsourcing processes. In contrast to previous work, this classification scheme focuses exclusively on an organizational perspective and on the mechanisms available to these organizations. The resulting dimensions are preselection of contributors, accessibility of peer contributions, aggregation of contributions, and remuneration for contributions. By classifying the processes of 46 crowdsourcing examples, we identify 19 distinct process types. A subsequent cluster analysis shows general patterns among these types and indicates a link to certain applications of crowdsourcing.

Posted Content
TL;DR: A “taxonomic theory” of crowdsourcing is developed by analyzing 103 well-known crowdsourcing websites using content analysis methods and the hermeneutic reading principle, and organizing the empirical variants into nine distinct forms of crowdsourcing models.
Abstract: In this paper, we first provide a practical yet rigorous definition of crowdsourcing that incorporates “crowds,” outsourcing, and social web technologies. We then analyze 103 well-known crowdsourcing websites using content analysis methods and the hermeneutic reading principle. Based on our analysis, we develop a “taxonomic theory” of crowdsourcing by organizing the empirical variants in nine distinct forms of crowdsourcing models. We also discuss key issues and directions, concentrating on the notion of managerial control systems.

Journal ArticleDOI
TL;DR: In this paper, the authors argue that the firm-level concept of co-opetition might also be relevant for an innovation's success on the individual level within contest communities, as numerous user discussions and comments improve the quality of submitted ideas and allow the future potential of an idea to shine through the so-called "wisdom of the crowd".
Abstract: Following the concepts of crowdsourcing, co-creation or open innovation, companies are increasingly using contests to foster the generation of creative solutions. Currently, online idea and design contests are enjoying a resurgence through the usage of new information and communication technologies. These virtual platforms allow users both to competitively disclose their creative ideas to corporations and also to interact and collaborate with like-minded peers, communicating, discussing and sharing their insights and experiences, building social networks and establishing a sense of community. Little research has considered that contest communities both promote and benefit from simultaneous co-operation and competition and that both types of relationships need to be emphasized at the same time. In this article, it is argued that the firm-level concept of co-opetition might also be relevant for an innovation's success on the individual level within contest communities. Our concept of ‘communitition’ should include the elements of competitive participation without disabling the climate for co-operation, as numerous user discussions and comments improve the quality of submitted ideas and allow the future potential of an idea to shine through the so-called ‘wisdom of the crowd’.

Proceedings ArticleDOI
16 Oct 2011
TL;DR: Results of the evaluations show that PlateMate is nearly as accurate as a trained dietitian and easier to use for most users than traditional self-reporting.
Abstract: We introduce PlateMate, a system that allows users to take photos of their meals and receive estimates of food intake and composition. Accurate awareness of this information can help people monitor their progress towards dieting goals, but current methods for food logging via self-reporting, expert observation, or algorithmic analysis are time-consuming, expensive, or inaccurate. PlateMate crowdsources nutritional analysis from photographs using Amazon Mechanical Turk, automatically coordinating untrained workers to estimate a meal's calories, fat, carbohydrates, and protein. We present the Management framework for crowdsourcing complex tasks, which supports PlateMate's nutrition analysis workflow. Results of our evaluations show that PlateMate is nearly as accurate as a trained dietitian and easier to use for most users than traditional self-reporting.
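As a toy illustration of one aggregation step in this kind of workflow, the sketch below combines several independent calorie estimates with a median, which resists occasional wild guesses better than a mean. The numbers are hypothetical and this is not PlateMate's actual pipeline.

```python
# Toy illustration (not PlateMate's workflow): aggregate noisy calorie estimates
# from several untrained workers; the median shrugs off the outlier.
from statistics import mean, median

worker_estimates = [520, 480, 610, 495, 1500]      # hypothetical estimates, one outlier
print("mean  :", round(mean(worker_estimates)))    # pulled up by the outlier
print("median:", median(worker_estimates))         # closer to the typical answer
```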

Proceedings ArticleDOI
29 Jun 2011
TL;DR: The path towards a more open, connected and smart cultural heritage is shown: open (the data is open, shared and accessible), connected ( the use of linked data allows for interoperable infrastructures, with users and providers getting more and more connected), and smart (the use of knowledge and web technologies allows us to provide interesting data to the right users).
Abstract: Galleries, Libraries, Archives and Museums (short: GLAMs) around the globe are beginning to explore the potential of crowdsourcing, i.e. outsourcing specific activities to a community through an open call. In this paper, we propose a typology of these activities, based on an empirical study of a substantial number of projects initiated by relevant cultural heritage institutions. We use the Digital Content Life Cycle model to study the relation between the different types of crowdsourcing and the core activities of heritage organizations. Finally, we focus on two critical challenges that will define the success of these collaborations between amateurs and professionals: (1) finding sufficient knowledgeable and loyal users; (2) maintaining a reasonable level of quality. We thus show the path towards a more open, connected and smart cultural heritage: open (the data is open, shared and accessible), connected (the use of linked data allows for interoperable infrastructures, with users and providers getting more and more connected), and smart (the use of knowledge and web technologies allows us to provide interesting data to the right users, in the right context, anytime, anywhere, both with involved users/consumers and providers). It leads to a future cultural heritage that is open, has intelligent infrastructures and has involved users, consumers and providers.

Journal ArticleDOI
TL;DR: In this paper, the authors explore how the motivation and knowledge of individuals participating in innovation projects broadcast on the Internet affect their contribution performance and identify the most valuable contributors as those who combine high levels of intrinsic enjoyment in contributing with a cognitive base fed from diverse knowledge domains.


Proceedings ArticleDOI
06 Jun 2011
TL;DR: This advanced seminar will provide a comprehensive overview of this new and exciting paradigm for monitoring the urban landscape known as participatory sensing, and outline the major research challenges.
Abstract: The recent wave of sensor-rich, Internet-enabled, smart mobile devices such as the Apple iPhone has opened the door for a novel paradigm for monitoring the urban landscape known as participatory sensing. Using this paradigm, ordinary citizens can collect multi-modal data streams from the surrounding environment using their mobile devices and share the same using existing communication infrastructure (e.g., 3G service or WiFi access points). The data contributed from multiple participants can be combined to build a spatiotemporal view of the phenomenon of interest and also to extract important community statistics. Given the ubiquity of mobile phones and the high density of people in metropolitan areas, participatory sensing can achieve an unprecedented level of coverage in both space and time for observing events of interest in urban spaces. Several exciting participatory sensing applications have emerged in recent years. For example, GPS traces uploaded by drivers and passengers can be used to generate real time traffic statistics. Similarly, street-level audio samples collected by pedestrians can be aggregated to create a citywide noise map. In this advanced seminar, we will provide a comprehensive overview of this new and exciting paradigm and outline the major research challenges.
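As a toy illustration of the aggregation step described above (combining contributions from many participants into a spatiotemporal view), the sketch below bins hypothetical GPS speed samples by road segment and hour to produce simple traffic statistics.

```python
# Toy illustration: aggregate GPS speed samples contributed by many phones into
# per-segment, per-hour average speeds (a minimal "traffic map").
from collections import defaultdict

# (road_segment_id, hour_of_day, speed_kmh): hypothetical contributed samples
samples = [("seg-12", 8, 22.0), ("seg-12", 8, 18.5), ("seg-12", 9, 41.0),
           ("seg-07", 8, 55.0), ("seg-07", 8, 58.5)]

buckets = defaultdict(list)
for seg, hour, speed in samples:
    buckets[(seg, hour)].append(speed)

for (seg, hour), speeds in sorted(buckets.items()):
    print(f"{seg} @ {hour:02d}:00 -> {sum(speeds) / len(speeds):.1f} km/h "
          f"({len(speeds)} samples)")
```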

Journal ArticleDOI
01 Sep 2011
TL;DR: This paper describes how MTurk tasks for processing datasets with humans are currently designed with significant reimplementation of common workflows and ad-hoc selection of parameters such as price to pay per task, and proposes a number of optimizations, including task batching, replacing pairwise comparisons with numerical ratings, and pre-filtering tables before joining them.
Abstract: Crowdsourcing markets like Amazon's Mechanical Turk (MTurk) make it possible to task people with small jobs, such as labeling images or looking up phone numbers, via a programmatic interface. MTurk tasks for processing datasets with humans are currently designed with significant reimplementation of common workflows and ad-hoc selection of parameters such as price to pay per task. We describe how we have integrated crowds into a declarative workflow engine called Qurk to reduce the burden on workflow designers. In this paper, we focus on how to use humans to compare items for sorting and joining data, two of the most common operations in DBMSs. We describe our basic query interface and the user interface of the tasks we post to MTurk. We also propose a number of optimizations, including task batching, replacing pairwise comparisons with numerical ratings, and pre-filtering tables before joining them, which dramatically reduce the overall cost of running sorts and joins on the crowd. In an experiment joining two sets of images, we reduce the overall cost from $67 in a naive implementation to about $3, without substantially affecting accuracy or latency. In an end-to-end experiment, we reduced cost by a factor of 14.5.
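To make the effect of the listed optimizations concrete, the sketch below counts HITs for a crowd join under naive pairwise comparison, batching, and pre-filtering before joining. The set sizes, batch size, and filter selectivity are hypothetical; only the counting logic is shown, not Qurk's optimizer.

```python
# Back-of-the-envelope HIT counts for a crowd join of two image sets under the
# optimizations mentioned above. All numbers are hypothetical; the point is how
# batching and pre-filtering shrink the number of paid tasks.
def naive_pairwise(n_left: int, n_right: int) -> int:
    return n_left * n_right                      # one HIT per pair

def batched(n_left: int, n_right: int, pairs_per_hit: int) -> int:
    total_pairs = n_left * n_right
    return -(-total_pairs // pairs_per_hit)      # ceiling division

def prefiltered(n_left: int, n_right: int, selectivity: float,
                pairs_per_hit: int) -> int:
    filter_hits = n_left + n_right               # one cheap labeling HIT per item
    surviving = int(n_left * selectivity) * int(n_right * selectivity)
    return filter_hits + -(-surviving // pairs_per_hit)

L, R = 40, 40
print("naive        :", naive_pairwise(L, R), "HITs")
print("batched (10) :", batched(L, R, pairs_per_hit=10), "HITs")
print("filter+batch :", prefiltered(L, R, selectivity=0.3, pairs_per_hit=10), "HITs")
```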

Proceedings Article
01 Jan 2011
TL;DR: This paper presents an automated quality assurance process that is inexpensive and scalable, and finds that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
Abstract: Crowdsourcing is an effective tool for scalable data annotation in both research and enterprise contexts. Due to crowdsourcing's open participation model, quality assurance is critical to the success of any project. Present methods rely on EM-style post-processing or manual annotation of large gold standard sets. In this paper we present an automated quality assurance process that is inexpensive and scalable. Our novel process relies on programmatic gold creation to provide targeted training feedback to workers and to prevent common scamming scenarios. We find that it decreases the amount of manual work required to manage crowdsourced labor while improving the overall quality of the results.
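A minimal sketch of the gold-based screening idea, assuming a fixed accuracy threshold: known-answer (gold) items are interleaved with real work, and workers whose gold accuracy falls below the threshold are flagged and their judgments discarded. The data, threshold, and helper names are hypothetical, not the paper's system.

```python
# Minimal sketch of gold-based quality assurance: workers who miss too many
# known-answer ("gold") items are excluded. Thresholds and data are hypothetical.
GOLD_ACCURACY_THRESHOLD = 0.7

gold_answers = {"g1": "cat", "g2": "dog", "g3": "car"}

# worker -> {item_id: answer}; gold items are interleaved with normal work
worker_judgments = {
    "w1": {"g1": "cat", "g2": "dog", "g3": "car", "t9": "tree"},
    "w2": {"g1": "dog", "g2": "dog", "g3": "bus", "t9": "rock"},
}

def gold_accuracy(judgments: dict) -> float:
    graded = [(gid, ans) for gid, ans in judgments.items() if gid in gold_answers]
    correct = sum(1 for gid, ans in graded if ans == gold_answers[gid])
    return correct / len(graded) if graded else 0.0

trusted = {w for w, j in worker_judgments.items()
           if gold_accuracy(j) >= GOLD_ACCURACY_THRESHOLD}
print("trusted workers:", trusted)        # judgments from the rest are discarded
```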

Proceedings ArticleDOI
22 May 2011
TL;DR: To automate crowdMOS testing, a set of freely distributable, open-source tools for Amazon Mechanical Turk, a platform designed to facilitate crowdsourcing, is offered, providing researchers with a user-friendly means of performing subjective quality evaluations without the overhead associated with laboratory studies.
Abstract: MOS (mean opinion score) subjective quality studies are used to evaluate many signal processing methods. Since laboratory quality studies are time consuming and expensive, researchers often run small studies with less statistical significance or use objective measures which only approximate human perception. We propose a cost-effective and convenient measure called crowdMOS, obtained by having internet users participate in a MOS-like listening study. Workers listen and rate sentences at their leisure, using their own hardware, in an environment of their choice. Since these individuals cannot be supervised, we propose methods for detecting and discarding inaccurate scores. To automate crowdMOS testing, we offer a set of freely distributable, open-source tools for Amazon Mechanical Turk, a platform designed to facilitate crowdsourcing. These tools implement the MOS testing methodology described in this paper, providing researchers with a user-friendly means of performing subjective quality evaluations without the overhead associated with laboratory studies. Finally, we demonstrate the use of crowdMOS using data from the Blizzard text-to-speech competition, showing that it delivers accurate and repeatable results.
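One generic way to detect and discard inaccurate scores in a MOS-style study is to drop raters whose scores correlate poorly with the mean of the remaining raters. The sketch below illustrates that generic idea with hypothetical ratings and cutoff; it is not necessarily the specific screening procedure used by crowdMOS.

```python
# Generic screening sketch for MOS-style ratings: discard raters whose scores
# correlate poorly with the average of the other raters. The data and the 0.5
# cutoff are hypothetical.
import numpy as np

# rows = raters, columns = sentences, values = 1..5 opinion scores
ratings = np.array([
    [5, 4, 3, 2, 1],
    [4, 4, 3, 2, 2],
    [1, 2, 3, 4, 5],   # a rater whose scores run opposite to everyone else's
    [5, 5, 3, 1, 1],
], dtype=float)

def keep_mask(scores: np.ndarray, min_corr: float = 0.5) -> np.ndarray:
    keep = np.ones(len(scores), dtype=bool)
    for r in range(len(scores)):
        others = scores[np.arange(len(scores)) != r].mean(axis=0)
        keep[r] = np.corrcoef(scores[r], others)[0, 1] >= min_corr
    return keep

mask = keep_mask(ratings)
print("kept raters:", np.where(mask)[0])
print("screened MOS per sentence:", ratings[mask].mean(axis=0))
```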

Proceedings ArticleDOI
05 Jun 2011
TL;DR: This paper introduces a framework to address the problem of moderating online content using crowdsourced ratings, and presents efficient algorithms to accurately detect abuse that only require knowledge about the identity of a single 'good' agent, who rates contributions accurately more than half the time.
Abstract: A large fraction of user-generated content on the Web, such as posts or comments on popular online forums, consists of abuse or spam. Due to the volume of contributions on popular sites, a few trusted moderators cannot identify all such abusive content, so viewer ratings of contributions must be used for moderation. But not all viewers who rate content are trustworthy and accurate. What is a principled approach to assigning trust and aggregating user ratings, in order to accurately identify abusive content? In this paper, we introduce a framework to address the problem of moderating online content using crowdsourced ratings. Our framework encompasses users who are untrustworthy or inaccurate to an unknown extent --- that is, both the content and the raters are of unknown quality. With no knowledge whatsoever about the raters, it is impossible to do better than a random estimate. We present efficient algorithms to accurately detect abuse that only require knowledge about the identity of a single 'good' agent, who rates contributions accurately more than half the time. We prove that our algorithm can infer the quality of contributions with error that rapidly converges to zero as the number of observations increases; we also numerically demonstrate that the algorithm has very high accuracy for much fewer observations. Finally, we analyze the robustness of our algorithms to manipulation by adversarial or strategic raters, an important issue in moderating online content, and quantify how the performance of the algorithm degrades with the number of manipulating agents.
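A minimal sketch of the core idea only, not the paper's algorithm or its guarantees: estimate each rater's reliability from agreement with the single known 'good' agent on co-rated items, then take a reliability-weighted vote on each contribution. All identifiers and votes below are hypothetical.

```python
# Sketch: calibrate raters against one known-good rater, then weight their votes
# by estimated accuracy. Votes are +1 (acceptable) / -1 (abusive).
from collections import defaultdict

good_agent = "g"
votes = {                                   # item -> {rater: vote}, hypothetical
    "post1": {"g": -1, "a": -1, "b": 1},
    "post2": {"g": 1, "a": 1, "b": -1},
    "post3": {"a": 1, "b": -1, "c": 1},
    "post4": {"g": -1, "a": -1, "c": -1},
}

# Reliability estimate: agreement rate with the good agent on co-rated items.
agree, total = defaultdict(int), defaultdict(int)
for item_votes in votes.values():
    if good_agent not in item_votes:
        continue
    for rater, v in item_votes.items():
        if rater != good_agent:
            total[rater] += 1
            agree[rater] += int(v == item_votes[good_agent])

def weight(rater: str) -> float:
    if rater == good_agent:
        return 1.0
    if total[rater] == 0:
        return 0.0                          # never co-rated: no influence
    return 2 * agree[rater] / total[rater] - 1   # accuracy 0.5 -> 0, 1.0 -> 1

for item, item_votes in votes.items():
    score = sum(weight(r) * v for r, v in item_votes.items())
    print(item, "->", "abusive" if score < 0 else "ok")
```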

Proceedings ArticleDOI
20 Jun 2011
TL;DR: This work presents an approach for live learning of object detectors, in which the system autonomously refines its models by actively requesting crowd-sourced annotations on images crawled from the Web, and introduces a novel part-based detector amenable to linear classifiers.
Abstract: Active learning and crowdsourcing are promising ways to efficiently build up training sets for object recognition, but thus far techniques are tested in artificially controlled settings. Typically the vision researcher has already determined the dataset's scope, the labels “actively” obtained are in fact already known, and/or the crowd-sourced collection process is iteratively fine-tuned. We present an approach for live learning of object detectors, in which the system autonomously refines its models by actively requesting crowd-sourced annotations on images crawled from the Web. To address the technical issues such a large-scale system entails, we introduce a novel part-based detector amenable to linear classifiers, and show how to identify its most uncertain instances in sub-linear time with a hashing-based solution. We demonstrate the approach with experiments of unprecedented scale and autonomy, and show it successfully improves the state-of-the-art for the most challenging objects in the PASCAL benchmark. In addition, we show our detector competes well with popular nonlinear classifiers that are much more expensive to train.
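The live-learning loop described above (train a linear detector, find its most uncertain unlabeled examples, request crowd annotations, retrain) can be sketched as pool-based uncertainty sampling with a linear SVM. The data and the `crowd_label` oracle are stand-ins, and the paper's part-based detector and hashing-based sub-linear search are not reproduced here.

```python
# Sketch of the live-learning loop: train a linear classifier, pick the unlabeled
# examples closest to the decision boundary, and send those to the crowd.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
X_pool = rng.normal(size=(500, 20))
hidden_truth = (X_pool @ rng.normal(size=20) > 0).astype(int)   # synthetic labels

def crowd_label(idx: int) -> int:
    """Stand-in for posting an annotation HIT; here the 'crowd' is perfect."""
    return int(hidden_truth[idx])

labels = {int(i): crowd_label(int(i))
          for i in rng.choice(len(X_pool), size=20, replace=False)}

for round_ in range(5):
    ids = list(labels)
    clf = LinearSVC(C=1.0).fit(X_pool[ids], [labels[i] for i in ids])
    unlabeled = [i for i in range(len(X_pool)) if i not in labels]
    # Uncertainty sampling: smallest |distance to the hyperplane| first.
    margins = np.abs(clf.decision_function(X_pool[unlabeled]))
    for i in np.argsort(margins)[:10]:
        labels[unlabeled[i]] = crowd_label(unlabeled[i])   # would post HITs here
    print(f"round {round_}: {len(labels)} labels, "
          f"pool accuracy {clf.score(X_pool, hidden_truth):.3f}")
```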