
Showing papers on "Crowdsourcing published in 2012"


Journal ArticleDOI
TL;DR: In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative.
Abstract: 'Crowdsourcing' is a relatively recent concept that encompasses many practices. This diversity leads to the blurring of the limits of crowdsourcing that may be identified virtually with any type of internet-based collaborative activity, such as co-creation or user innovation. Varying definitions of crowdsourcing exist, and therefore some authors present certain specific examples of crowdsourcing as paradigmatic, while others present the same examples as the opposite. In this article, existing definitions of crowdsourcing are analysed to extract common elements and to establish the basic characteristics of any crowdsourcing initiative. Based on these existing definitions, an exhaustive and consistent definition for crowdsourcing is presented and contrasted in 11 cases.

1,616 citations


Journal ArticleDOI
TL;DR: It is argued that under certain circumstances crowdsourcing transforms distant search into local search, improving the efficiency and effectiveness of problem solving.
Abstract: We argue that under certain circumstances crowdsourcing transforms distant search into local search, improving the efficiency and effectiveness of problem solving. Under such circumstances a firm may choose to crowdsource problem solving rather than solve the problem internally or contract it to a designated supplier. These circumstances depend on the characteristics of the problem, the knowledge required for the solution, the crowd, and the solutions to be evaluated.

999 citations


Journal ArticleDOI
TL;DR: In this paper, a real-world comparison of ideas actually generated by a firm's professionals with those generated by users in the course of an idea generation contest is presented. The findings suggest that, at least under certain conditions, crowdsourcing might constitute a promising method to gather user ideas that can complement those of a firm's professionals at the idea generation stage in NPD.

881 citations


Posted Content
TL;DR: In this paper, the authors outline a framework that will enable crowd work that is complex, collaborative, and sustainable, and lay out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.
Abstract: Paid crowd work offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale. But it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework. Can we foresee a future crowd workplace in which we would want our children to participate? This paper frames the major challenges that stand in the way of this goal. Drawing on theory from organizational behavior and distributed computing, as well as direct feedback from workers, we outline a framework that will enable crowd work that is complex, collaborative, and sustainable. The framework lays out research challenges in twelve major areas: workflow, task assignment, hierarchy, real-time response, synchronous collaboration, quality control, crowds guiding AIs, AIs guiding crowds, platforms, job design, reputation, and motivation.

803 citations


Journal ArticleDOI
TL;DR: Studying Dell's IdeaStorm community, serial ideators are found to be more likely than consumers with only one idea to generate an idea the organization finds valuable enough to implement, but they are unlikely to repeat their early success once their ideas are implemented.
Abstract: Several organizations have developed ongoing crowdsourcing communities that repeatedly collect ideas for new products and services from a large, dispersed "crowd" of non-experts (consumers) over time. Despite its promises, little is known about the nature of an individual's ideation efforts in such an online community. Studying Dell's IdeaStorm community, serial ideators are found to be more likely than consumers with only one idea to generate an idea the organization finds valuable enough to implement, but they are unlikely to repeat their early success once their ideas are implemented. As ideators with past success attempt to again come up with ideas that will excite the organization, they instead end up proposing ideas similar to their ideas that were already implemented (i.e., they generate less diverse ideas). The negative effects of past success are somewhat mitigated for ideators with diverse commenting activity on others' ideas. These findings highlight some of the challenges in maintaining an ongoing supply of quality ideas from the crowd over time.

697 citations


Journal ArticleDOI
01 Jul 2012
TL;DR: This work proposes a hybrid human-machine approach in which machines are used to do an initial, coarse pass over all the data and people are used to verify only the most likely matching pairs, and develops a novel two-tiered heuristic approach for creating batched tasks.
Abstract: Entity resolution is central to data integration and data cleaning. Algorithmic approaches have been improving in quality, but remain far from perfect. Crowdsourcing platforms offer a more accurate but expensive (and slow) way to bring human insight into the process. Previous work has proposed batching verification tasks for presentation to human workers but even with batching, a human-only approach is infeasible for data sets of even moderate size, due to the large numbers of matches to be tested. Instead, we propose a hybrid human-machine approach in which machines are used to do an initial, coarse pass over all the data, and people are used to verify only the most likely matching pairs. We show that for such a hybrid system, generating the minimum number of verification tasks of a given size is NP-Hard, but we develop a novel two-tiered heuristic approach for creating batched tasks. We describe this method, and present the results of extensive experiments on real data sets using a popular crowdsourcing platform. The experiments show that our hybrid approach achieves both good efficiency and high accuracy compared to machine-only or human-only alternatives.

499 citations
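
The hybrid pipeline above is easy to illustrate. Below is a minimal sketch, not the authors' implementation: a cheap machine similarity pass prunes the space of pairs, and only pairs scoring above a threshold are packed into fixed-size batches for crowd verification. The Jaccard similarity, threshold, and batch size are illustrative assumptions.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Cheap machine-pass similarity between two records (token overlap)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def hybrid_candidates(records, threshold=0.4, batch_size=2):
    """Machine pass: score all pairs, keep only likely matches, then pack
    them into fixed-size batches for crowd verification. The paper shows
    optimal batching is NP-Hard and uses a two-tiered heuristic instead."""
    likely = [(i, j) for i, j in combinations(range(len(records)), 2)
              if jaccard(records[i], records[j]) >= threshold]
    return [likely[k:k + batch_size] for k in range(0, len(likely), batch_size)]

records = ["iPhone 4 16GB white", "Apple iPhone 4 white 16 GB", "Galaxy S II"]
for batch in hybrid_candidates(records):
    print("ask crowd to verify:", [(records[i], records[j]) for i, j in batch])
```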


Proceedings ArticleDOI
05 Sep 2012
TL;DR: A new model for privacy is introduced, namely privacy as expectations, which involves using crowdsourcing to capture users' expectations of what sensitive resources mobile apps use and a new privacy summary interface that prioritizes and highlights places where mobile apps break people's expectations.
Abstract: Smartphone security research has produced many useful tools to analyze the privacy-related behaviors of mobile apps. However, these automated tools cannot assess people's perceptions of whether a given action is legitimate, or how that action makes them feel with respect to privacy. For example, automated tools might detect that a blackjack game and a map app both use one's location information, but people would likely view the map's use of that data as more legitimate than the game's. Our work introduces a new model for privacy, namely privacy as expectations. We report on the results of using crowdsourcing to capture users' expectations of what sensitive resources mobile apps use. We also report on a new privacy summary interface that prioritizes and highlights places where mobile apps break people's expectations. We conclude with a discussion of implications for employing crowdsourcing as a privacy evaluation technique.

491 citations
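
A toy sketch of how such crowd judgments could drive the summary interface, using an invented data shape rather than the paper's instruments: aggregate workers' expected/unexpected responses per (app, resource) pair and surface the largest expectation violations first.

```python
from collections import defaultdict

# Each response: (app, resource, expected), where `expected` is True if the
# crowd worker expected the app to access that resource. Data is invented.
responses = [
    ("blackjack", "location", False), ("blackjack", "location", False),
    ("blackjack", "location", True),
    ("maps", "location", True), ("maps", "location", True),
]

def violation_scores(responses):
    """Fraction of workers who did NOT expect each (app, resource) access;
    higher scores would be surfaced first in a privacy summary."""
    counts = defaultdict(lambda: [0, 0])  # (app, resource) -> [unexpected, total]
    for app, resource, expected in responses:
        counts[(app, resource)][0] += (not expected)
        counts[(app, resource)][1] += 1
    return sorted(((u / n, app, res) for (app, res), (u, n) in counts.items()),
                  reverse=True)

for score, app, res in violation_scores(responses):
    print(f"{app}/{res}: {score:.0%} of workers surprised")
```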


Proceedings ArticleDOI
06 Nov 2012
TL;DR: This paper introduces a taxonomy for spatial crowdsourcing, and focuses on one class of this taxonomy, in which workers send their locations to a centralized server and thereafter the server assigns to every worker his nearby tasks with the objective of maximizing the overall number of assigned tasks.
Abstract: With the ubiquity of mobile devices, spatial crowdsourcing is emerging as a new platform, enabling spatial tasks (i.e., tasks related to a location) to be assigned to and performed by human workers. In this paper, for the first time we introduce a taxonomy for spatial crowdsourcing. Subsequently, we focus on one class of this taxonomy, in which workers send their locations to a centralized server and thereafter the server assigns to every worker his nearby tasks with the objective of maximizing the overall number of assigned tasks. We formally define this maximum task assignment (or MTA) problem in spatial crowdsourcing, and identify its challenges. We propose alternative solutions to address these challenges by exploiting the spatial properties of the problem space. Finally, our experimental evaluations on both real-world and synthetic data verify the applicability of our proposed approaches and compare them by measuring both the number of assigned tasks and the travel cost of the workers.

484 citations
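
For intuition about the MTA objective, here is a hedged greedy baseline, not the paper's solutions: each task goes to the nearest worker that is in range and still under capacity, which maximizes the number of assigned tasks only approximately. All coordinates, the capacity, and the radius are invented.

```python
from math import hypot

# Invented toy data: workers and tasks as (x, y) points; each worker accepts
# at most `capacity` tasks within radius `r`.
workers = {"w1": (0.0, 0.0), "w2": (5.0, 5.0)}
tasks = {"t1": (0.5, 0.2), "t2": (0.2, 0.8), "t3": (5.2, 4.9), "t4": (9.0, 9.0)}

def greedy_assign(workers, tasks, capacity=2, r=2.0):
    """Greedy baseline for maximum task assignment: give each task to the
    nearest in-range worker with spare capacity."""
    load = {w: 0 for w in workers}
    assignment = {}
    for t, tp in tasks.items():
        options = sorted(
            (hypot(tp[0] - wp[0], tp[1] - wp[1]), w)
            for w, wp in workers.items() if load[w] < capacity)
        for dist, w in options:
            if dist <= r:
                assignment[t] = w
                load[w] += 1
                break
    return assignment

print(greedy_assign(workers, tasks))  # t4 stays unassigned: out of range
```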


Proceedings ArticleDOI
16 Apr 2012
TL;DR: A probabilistic framework is developed to make sensible decisions about candidate links and to identify unreliable human workers, improving the quality of the links while limiting the amount of work performed by the crowd.
Abstract: We tackle the problem of entity linking for large collections of online pages. Our system, ZenCrowd, identifies entities from natural language text using state-of-the-art techniques and automatically connects them to the Linked Open Data cloud. We show how one can take advantage of human intelligence to improve the quality of the links by dynamically generating micro-tasks on an online crowdsourcing platform. We develop a probabilistic framework to make sensible decisions about candidate links and to identify unreliable human workers. We evaluate ZenCrowd in a real deployment and show how a combination of both probabilistic reasoning and crowdsourcing techniques can significantly improve the quality of the links, while limiting the amount of work performed by the crowd.

454 citations
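
A minimal sketch in the spirit of ZenCrowd's probabilistic framework, under a simplified model the paper does not commit to: iterate between a reliability-weighted vote per candidate link and re-estimating each worker's reliability from agreement with the consensus. Votes and the starting reliability are invented.

```python
votes = {  # worker -> {candidate_link: accepted?}  (invented data)
    "w1": {"dbpedia:Berlin": True, "dbpedia:Bern": False},
    "w2": {"dbpedia:Berlin": True, "dbpedia:Bern": True},
    "w3": {"dbpedia:Berlin": True, "dbpedia:Bern": False},
}

def em_links(votes, iters=5):
    reliability = {w: 0.7 for w in votes}          # uniform starting belief
    links = {l for v in votes.values() for l in v}
    decision = {}
    for _ in range(iters):
        for l in links:                            # weighted majority per link
            score = sum((reliability[w] if v[l] else 1 - reliability[w])
                        for w, v in votes.items() if l in v)
            decision[l] = score > sum(1 for v in votes.values() if l in v) / 2
        for w, v in votes.items():                 # agreement -> reliability
            agree = sum(v[l] == decision[l] for l in v)
            reliability[w] = (agree + 1) / (len(v) + 2)  # smoothed estimate
    return decision, reliability

print(em_links(votes))
```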




Proceedings ArticleDOI
11 Feb 2012
TL;DR: It is shown that volunteers are motivated by a complex framework of factors that dynamically change throughout their cycle of work on scientific projects; this motivational framework is strongly affected by personal interests as well as external factors such as attribution and acknowledgment.
Abstract: Online citizen science projects engage volunteers in collecting, analyzing, and curating scientific data. Existing projects have demonstrated the value of using volunteers to collect data, but few projects have reached the full collaborative potential of scientists and volunteers. Understanding the shared and unique motivations of these two groups can help designers establish the technical and social infrastructures needed to promote effective partnerships. We present findings from a study of the motivational factors affecting participation in ecological citizen science projects. We show that volunteers are motivated by a complex framework of factors that dynamically change throughout their cycle of work on scientific projects; this motivational framework is strongly affected by personal interests as well as external factors such as attribution and acknowledgment. Identifying the pivotal points of motivational shift and addressing them in the design of citizen-science systems will facilitate improved collaboration between scientists and volunteers.

Journal ArticleDOI
TL;DR: A taxonomy that classifies the mobile crowdsourcing field and three new applications that optimize location-based search and similarity services based on crowd-generated data are illustrated.
Abstract: Smartphones can reveal crowdsourcing's full potential and let users transparently contribute to complex and novel problem solving. This emerging area is illustrated through a taxonomy that classifies the mobile crowdsourcing field and through three new applications that optimize location-based search and similarity services based on crowd-generated data. Such applications can be deployed on SmartLab, a cloud of more than 40 Android devices deployed at the University of Cyprus that provides an open testbed to facilitate research and development of smartphone applications on a massive scale.

Proceedings Article
03 Dec 2012
TL;DR: By choosing the prior properly, both BP and MF perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
Abstract: Crowdsourcing has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of aggregating the crowdsourced labels provided by a collection of unreliable annotators. We approach this problem by transforming it into a standard inference problem in graphical models, and applying approximate variational methods, including belief propagation (BP) and mean field (MF). We show that our BP algorithm generalizes both majority voting and a recent algorithm by Karger et al. [1], while our MF method is closely related to a commonly used EM algorithm. In both cases, we find that the performance of the algorithms critically depends on the choice of a prior distribution on the workers' reliability; by choosing the prior properly, both BP and MF (and EM) perform surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions.
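
A compact mean-field-style sketch under strong simplifying assumptions (binary labels, one-coin workers), meant only to show where the worker-reliability prior enters; the paper's BP and MF derivations are more general.

```python
labels = {  # item -> {worker: observed label in {0, 1}}  (invented data)
    "i1": {"w1": 1, "w2": 1, "w3": 0},
    "i2": {"w1": 0, "w2": 1, "w3": 0},
}

def mean_field(labels, a=2.0, b=1.0, iters=20):
    """Alternate between the posterior over true labels and a MAP-style
    worker reliability under a Beta(a, b) prior; the prior choice is the
    paper's key lever for making MF/EM competitive."""
    workers = {w for obs in labels.values() for w in obs}
    rel = {w: a / (a + b) for w in workers}        # prior mean reliability
    post = {}
    for _ in range(iters):
        for item, obs in labels.items():           # P(true label = 1)
            p1 = p0 = 1.0
            for w, l in obs.items():
                p1 *= rel[w] if l == 1 else 1 - rel[w]
                p0 *= rel[w] if l == 0 else 1 - rel[w]
            post[item] = p1 / (p1 + p0)
        for w in workers:                          # Beta-posterior update
            match = sum(post[i] if l == 1 else 1 - post[i]
                        for i, obs in labels.items()
                        for ww, l in obs.items() if ww == w)
            n = sum(1 for obs in labels.values() if w in obs)
            rel[w] = (a - 1 + match) / (a + b - 2 + n)
    return post, rel

print(mean_field(labels))
```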

Proceedings ArticleDOI
04 Jun 2012
TL;DR: A set of Bayesian predictive models is constructed from data, and it is described how the models operate within an overall crowdsourcing architecture that combines the efforts of people and machine vision on the task of classifying celestial bodies defined within a citizen science project named Galaxy Zoo.
Abstract: We show how machine learning and inference can be harnessed to leverage the complementary strengths of humans and computational agents to solve crowdsourcing tasks. We construct a set of Bayesian predictive models from data and describe how the models operate within an overall crowdsourcing architecture that combines the efforts of people and machine vision on the task of classifying celestial bodies defined within a citizen science project named Galaxy Zoo. We show how learned probabilistic models can be used to fuse human and machine contributions and to predict the behaviors of workers. We employ multiple inferences in concert to guide decisions on hiring and routing workers to tasks so as to maximize the efficiency of large-scale crowdsourcing processes based on expected utility.
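
The fusion step can be illustrated with a hedged naive-Bayes sketch (not the paper's learned models): treat the machine-vision output as a prior and update it with conditionally independent worker votes. The worker accuracy value is invented.

```python
def fuse(machine_p_spiral, votes, worker_acc=0.8):
    """Combine a machine prior with worker votes via Bayes' rule.
    machine_p_spiral: classifier's P(galaxy is spiral).
    votes: list of booleans, True = worker says 'spiral'."""
    p_spiral, p_ellip = machine_p_spiral, 1 - machine_p_spiral
    for v in votes:
        p_spiral *= worker_acc if v else 1 - worker_acc
        p_ellip *= 1 - worker_acc if v else worker_acc
    return p_spiral / (p_spiral + p_ellip)

# A confident machine prior can be overturned by enough disagreeing workers:
print(fuse(0.9, [False, False, False]))  # ~0.12
```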

Proceedings Article
22 Jul 2012
TL;DR: This work presents a two-phase exploration-exploitation assignment algorithm and proves that it is competitive with respect to the optimal offline algorithm which has access to the unknown skill levels of each worker.
Abstract: We explore the problem of assigning heterogeneous tasks to workers with different, unknown skill sets in crowdsourcing markets such as Amazon Mechanical Turk. We first formalize the online task assignment problem, in which a requester has a fixed set of tasks and a budget that specifies how many times he would like each task completed. Workers arrive one at a time (with the same worker potentially arriving multiple times), and must be assigned to a task upon arrival. The goal is to allocate workers to tasks in a way that maximizes the total benefit that the requester obtains from the completed work. Inspired by recent research on the online adwords problem, we present a two-phase exploration-exploitation assignment algorithm and prove that it is competitive with respect to the optimal offline algorithm which has access to the unknown skill levels of each worker. We empirically evaluate this algorithm using data collected on Mechanical Turk and show that it performs better than random assignment or greedy algorithms. To our knowledge, this is the first work to extend the online primal-dual technique used in the online adwords problem to a scenario with unknown parameters, and the first to offer an empirical validation of an online primal-dual algorithm.
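
A rough simulation of the two-phase idea, with invented skills and budgets and without the paper's primal-dual machinery: early arrivals are assigned to random open tasks to estimate per-(worker, task) skill, and later arrivals are routed to their best-estimated open task.

```python
import random
random.seed(0)

def simulate(true_skill, budget, arrivals, explore_frac=0.3):
    """Two-phase exploration-exploitation assignment (illustrative only)."""
    est = {k: [0, 0] for k in true_skill}         # (worker, task) -> [ok, n]
    left = dict(budget)                           # task -> completions wanted
    benefit = 0
    for t, worker in enumerate(arrivals):
        open_tasks = [task for task in left if left[task] > 0]
        if not open_tasks:
            break
        if t < explore_frac * len(arrivals):      # exploration phase
            task = random.choice(open_tasks)
        else:                                     # exploitation phase
            task = max(open_tasks, key=lambda task: (
                est[(worker, task)][0] / est[(worker, task)][1]
                if est[(worker, task)][1] else 0.5))
        ok = random.random() < true_skill[(worker, task)]
        est[(worker, task)][0] += ok
        est[(worker, task)][1] += 1
        left[task] -= 1
        benefit += ok
    return benefit

true_skill = {("alice", "label"): 0.9, ("alice", "translate"): 0.3,
              ("bob", "label"): 0.4, ("bob", "translate"): 0.8}
arrivals = [random.choice(["alice", "bob"]) for _ in range(40)]
print(simulate(true_skill, {"label": 15, "translate": 15}, arrivals))
```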

Proceedings ArticleDOI
11 Feb 2012
TL;DR: This paper investigates whether timely, task-specific feedback helps crowd workers learn, persevere, and produce better results on micro-task platforms, and discusses interaction and infrastructure approaches for integrating real-time assessment into online work.
Abstract: Micro-task platforms provide massively parallel, on-demand labor. However, it can be difficult to reliably achieve high-quality work because online workers may behave irresponsibly, misunderstand the task, or lack necessary skills. This paper investigates whether timely, task-specific feedback helps crowd workers learn, persevere, and produce better results. We investigate this question through Shepherd, a feedback system for crowdsourced work. In a between-subjects study with three conditions, crowd workers wrote consumer reviews for six products they own. Participants in the None condition received no immediate feedback, consistent with most current crowdsourcing practices. Participants in the Self-assessment condition judged their own work. Participants in the External assessment condition received expert feedback. Self-assessment alone yielded better overall work than the None condition and helped workers improve over time. External assessment also yielded these benefits. Participants who received external assessment also revised their work more. We conclude by discussing interaction and infrastructure approaches for integrating real-time assessment into online work.

BookDOI
09 Aug 2012
TL;DR: The 20 chapters in this book explore both the theories and applications of crowdsourcing for geographic knowledge production, with three sections focusing on VGI, Public Participation, and Citizen Science; Geographic Knowledge Production and Place Inference; and Emerging Applications and New Challenges.
Abstract: The phenomenon of volunteered geographic information is part of a profound transformation in how geographic data, information, and knowledge are produced and circulated. By situating volunteered geographic information (VGI) in the context of the big-data deluge and data-intensive inquiry, the 20 chapters in this book explore both the theories and applications of crowdsourcing for geographic knowledge production, with three sections focusing on (1) VGI, Public Participation, and Citizen Science; (2) Geographic Knowledge Production and Place Inference; and (3) Emerging Applications and New Challenges. This book argues that future progress in VGI research depends in large part on building strong linkages with diverse geographic scholarship. Contributors of this volume situate VGI research in geography's core concerns with space and place, and offer several ways of addressing persistent challenges of quality assurance in VGI. This book positions VGI as part of a shift toward hybrid epistemologies, and potentially a fourth paradigm of data-intensive inquiry across the sciences. It also considers the implications of VGI and the exaflood for further time-space compression, new forms and degrees of digital inequality, the renewed importance of geography, and the role of crowdsourcing for geographic knowledge production.

Journal ArticleDOI
01 Jun 2012
TL;DR: A quality-sensitive answering model is introduced, which guides the crowdsourcing query engine in the design and processing of the corresponding crowdsourcing jobs, and effectively reduces the processing cost while maintaining the required query answer quality.
Abstract: Some complex problems, such as image tagging and natural language processing, are very challenging for computers, where even state-of-the-art technology is not yet able to provide satisfactory accuracy. Therefore, rather than relying solely on developing new and better algorithms to handle such tasks, we look to the crowdsourcing solution -- employing human participation -- to make good the shortfall in current technology. Crowdsourcing is a good supplement to many computer tasks. A complex job may be divided into computer-oriented tasks and human-oriented tasks, which are then assigned to machines and humans respectively. To leverage the power of crowdsourcing, we design and implement a Crowdsourcing Data Analytics System, CDAS. CDAS is a framework designed to support the deployment of various crowdsourcing applications. The core part of CDAS is a quality-sensitive answering model, which guides the crowdsourcing engine to process and monitor the human tasks. In this paper, we introduce the principles of our quality-sensitive model. To satisfy the user-required accuracy, the model guides the crowdsourcing query engine in the design and processing of the corresponding crowdsourcing jobs. It provides an estimated accuracy for each generated result based on the human workers' historical performances. When verifying the quality of the result, the model employs an online strategy to reduce waiting time. To show the effectiveness of the model, we implement and deploy two analytics jobs on CDAS, a Twitter sentiment analytics job and an image tagging job. We use real Twitter and Flickr data as our queries respectively. We compare our approaches with state-of-the-art classification and image annotation techniques. The results show that the human-assisted methods can indeed achieve a much higher accuracy. By embedding the quality-sensitive model into the crowdsourcing query engine, we effectively reduce the processing cost while maintaining the required query answer quality.
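
One concrete piece of a quality-sensitive model can be sketched as follows, under the simplifying assumption of independent workers with a known average historical accuracy p (not CDAS's exact formulation): compute how many majority votes are needed before the estimated result accuracy meets the user's requirement.

```python
from math import comb

def majority_accuracy(p: float, n: int) -> float:
    """P(majority of n independent workers with accuracy p is correct)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n // 2) + 1, n + 1))

def workers_needed(p: float, target: float, n_max: int = 99) -> int:
    """Smallest odd number of votes whose majority meets the target."""
    for n in range(1, n_max + 1, 2):              # odd n avoids ties
        if majority_accuracy(p, n) >= target:
            return n
    raise ValueError("target not reachable with n_max workers")

# Number of 70%-accurate workers needed for 95% majority accuracy:
print(workers_needed(0.7, 0.95))
```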

Proceedings ArticleDOI
11 Feb 2012
TL;DR: It is argued that Turkomatic's collaborative approach can be more successful than the conventional workflow design process and implications for the design of collaborative crowd planning systems are discussed.
Abstract: Preparing complex jobs for crowdsourcing marketplaces requires careful attention to workflow design, the process of decomposing jobs into multiple tasks, which are solved by multiple workers. Can the crowd help design such workflows? This paper presents Turkomatic, a tool that recruits crowd workers to aid requesters in planning and solving complex jobs. While workers decompose and solve tasks, requesters can view the status of worker-designed workflows in real time; intervene to change tasks and solutions; and request new solutions to subtasks from the crowd. These features lower the threshold for crowd employers to request complex work. During two evaluations, we found that allowing the crowd to plan without requester supervision is partially successful, but that requester intervention during workflow planning and execution improves quality substantially. We argue that Turkomatic's collaborative approach can be more successful than the conventional workflow design process and discuss implications for the design of collaborative crowd planning systems.

Proceedings ArticleDOI
25 Mar 2012
TL;DR: It is proved that the proposed incentive protocol can make the website operate close to Pareto efficiency; an alternative scenario is also examined, in which the protocol designer aims at maximizing the revenue of the website, and the performance of the optimal protocol is evaluated.
Abstract: Crowdsourcing websites (e.g. Yahoo! Answers, Amazon Mechanical Turk, etc.) emerged in recent years that allow requesters from all around the world to post tasks and seek help from an equally global pool of workers. However, intrinsic incentive problems reside in crowdsourcing applications as workers and requesters are selfish and aim to strategically maximize their own benefit. In this paper, we propose to provide incentives for workers to exert effort using a novel game-theoretic model based on repeated games. As there is always a gap in the social welfare between the non-cooperative equilibria emerging when workers pursue their self-interests and the desirable Pareto efficient outcome, we propose a novel class of incentive protocols based on social norms which integrates reputation mechanisms into the existing pricing schemes currently implemented on crowdsourcing websites, in order to improve the performance of the non-cooperative equilibria emerging in such applications. We first formulate the exchanges on a crowdsourcing website as a two-sided market where requesters and workers are matched and play gift-giving games repeatedly. Subsequently, we study the protocol designer's problem of finding an optimal and sustainable (equilibrium) protocol which achieves the highest social welfare for that website. We prove that the proposed incentive protocol can make the website operate close to Pareto efficiency. Moreover, we also examine an alternative scenario, where the protocol designer aims at maximizing the revenue of the website, and evaluate the performance of the optimal protocol.
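
A toy repeated-game simulation of the social-norm idea, with invented payoffs and detection/forgiveness probabilities rather than the paper's formal model: because requesters withhold the gift from low-reputation workers, always exerting effort dominates always shirking in the long run.

```python
import random
random.seed(1)

COST_EFFORT, PAY = 1.0, 2.0   # invented payoffs

def lifetime_payoff(strategy, rounds=100):
    """Simulate one worker under a social-norm protocol: shirking is
    detected and punished by resetting reputation to 0, which excludes
    the worker from tasks until (rarely) forgiven."""
    reputation, total = 1, 0.0
    for _ in range(rounds):
        if reputation == 0:       # social norm: shirkers are excluded
            reputation = 1 if random.random() < 0.1 else 0  # slow forgiveness
            continue
        effort = strategy()
        total += (PAY - COST_EFFORT) if effort else PAY
        if not effort:
            reputation = 0        # shirking detected -> punished
    return total

print("always work :", lifetime_payoff(lambda: True))
print("always shirk:", lifetime_payoff(lambda: False))
```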

Journal ArticleDOI
TL;DR: The components that comprise Geo-Wiki are outlined, along with how they are integrated in the architectural design and the lessons learned to date, in particular the need to add a mechanism for feedback and interaction as part of community building, and the need to address issues of data quality.
Abstract: Land cover derived from remotely sensed products is an important input to a number of different global, regional and national scale applications including resource assessments and economic land use models. During the last decade three global land cover datasets have been created, i.e. the GLC-2000, MODIS and GlobCover, but comparison studies have shown that there are large spatial discrepancies between these three products. One of the reasons for these discrepancies is the lack of sufficient in-situ data for the development of these products. To address this issue, a crowdsourcing tool called Geo-Wiki has been developed. Geo-Wiki has two main aims: to increase the amount of in-situ land cover data available for training, calibration and validation, and to create a hybrid global land cover map that provides more accurate land cover information than any current individual product. This paper outlines the components that comprise Geo-Wiki and how they are integrated in the architectural design. An overview of the main functionality of Geo-Wiki is then provided along with the current usage statistics and the lessons learned to date, in particular the need to add a mechanism for feedback and interaction as part of community building, and the need to address issues of data quality. The tool is located at geo-wiki.org.

Proceedings Article
01 Dec 2012
TL;DR: The key observation is that drawing a bounding box is significantly more difficult and time consuming than giving answers to multiple choice questions, so quality control through additional verification tasks is more cost effective than consensus based algorithms.
Abstract: A large number of images with ground truth object bounding boxes are critical for learning object detectors, which is a fundamental task in computer vision. In this paper, we study strategies to crowdsource bounding box annotations. The core challenge of building such a system is to effectively control the data quality with minimal cost. Our key observation is that drawing a bounding box is significantly more difficult and time consuming than giving answers to multiple choice questions. Thus, quality control through additional verification tasks is more cost effective than consensus based algorithms. In particular, we present a system that consists of three simple sub-tasks — a drawing task, a quality verification task and a coverage verification task. Experimental results demonstrate that our system is scalable, accurate, and cost-effective.
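
The three-subtask workflow can be sketched as a simple control loop; the worker functions below are stubs standing in for real crowd tasks, and the toy data is invented.

```python
def annotate_image(image, draw, verify_quality, verify_coverage, max_rounds=10):
    """Accept a box only after a cheap quality check; request more boxes
    until a coverage check says every instance in the image is boxed."""
    boxes = []
    for _ in range(max_rounds):
        if verify_coverage(image, boxes):     # multiple-choice: all covered?
            break
        box = draw(image, boxes)              # expensive: draw one new box
        if verify_quality(image, box):        # multiple-choice: box tight?
            boxes.append(box)
    return boxes

# Toy stand-ins simulating crowd answers for an image with 2 objects:
truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
draw = lambda img, boxes: truth[len(boxes)]
verify_quality = lambda img, box: box in truth
verify_coverage = lambda img, boxes: len(boxes) == len(truth)
print(annotate_image("img.jpg", draw, verify_quality, verify_coverage))
```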

Journal ArticleDOI
TL;DR: Crowdsourcing appears to be a useful and effective tool in the context of smart city innovation, but should be thoughtfully used and combined with other user involvement approaches and within broader frameworks such as Living Labs.
Abstract: Within this article, the strengths and weaknesses of crowdsourcing for idea generation and idea selection in the context of smart city innovation are investigated. First, smart cities are defined next to similar but different concepts such as digital cities, intelligent cities or ubiquitous cities. It is argued that the smart city-concept is in fact a more user-centered evolution of the other city-concepts, which seem to be more technologically deterministic in nature. The principles of crowdsourcing are explained and the different manifestations are demonstrated. By means of a case study, the generation of ideas for innovative uses of ICT for city innovation by citizens through an online platform is studied, as well as the selection process. For this selection, a crowdsourcing solution is compared to a selection made by external experts. The comparison of both indicates that using the crowd as gatekeeper and selector of innovative ideas yields a long list with high user benefits. However, the generation of ideas in itself appeared not to deliver extremely innovative ideas. Crowdsourcing thus appears to be a useful and effective tool in the context of smart city innovation, but should be thoughtfully used and combined with other user involvement approaches and within broader frameworks such as Living Labs.

Proceedings ArticleDOI
20 May 2012
TL;DR: Deterministic and probabilistic algorithms are developed to optimize the expected cost (i.e., number of questions) and expected error, and can form an integral part of any query processor that uses human computation.
Abstract: Given a large set of data items, we consider the problem of filtering them based on a set of properties that can be verified by humans. This problem is commonplace in crowdsourcing applications, and yet, to our knowledge, no one has considered the formal optimization of this problem. (Typical solutions use heuristics to solve the problem.) We formally state a few different variants of this problem. We develop deterministic and probabilistic algorithms to optimize the expected cost (i.e., number of questions) and expected error. We experimentally show that our algorithms provide definite gains with respect to other strategies. Our algorithms can be applied in a variety of crowdsourcing scenarios and can form an integral part of any query processor that uses human computation.
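
The cost/error trade-off the paper optimizes can be illustrated with a fixed-majority strategy, simpler than the adaptive strategies studied: ask n workers per item and take the majority, assuming a known, independent worker error rate e.

```python
from math import comb

def expected_error(e: float, n: int) -> float:
    """P(majority vote is wrong) with n independent workers of error rate e."""
    return sum(comb(n, k) * e**k * (1 - e)**(n - k)
               for k in range((n // 2) + 1, n + 1))

for n in (1, 3, 5, 7):   # expected cost is n questions per item
    print(f"n={n}: cost={n}, expected error={expected_error(0.2, n):.3f}")
```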

Journal ArticleDOI
TL;DR: In this paper, the authors present a case study of an integrated tool set that engages multiple types of users (from targeted citizen-based observation networks, expert-driven focused monitoring, and opportunistic crowdsourcing efforts) in monitoring a forest disease in the western United States.
Abstract: The interface between neogeography and citizen science has great potential for environmental monitoring, but this nexus has been explored less often than each subject individually. In this article we review the emerging role of volunteered geographic information in citizen science and present a case study of an integrated tool set that engages multiple types of users (from targeted citizen-based observation networks, expert-driven focused monitoring, and opportunistic crowdsourcing efforts) in monitoring a forest disease in the western United States. We first introduce the overall challenge of data collection in environmental monitoring projects and then discuss the literature surrounding an emergent integration of citizen science and volunteered geographical information. We next explore how these methods characterize and underpin knowledge discovery and how multimodal interaction is supported so that a large spectrum of contributors can be included. These concepts are summarized in a conceptual model that ...

Journal ArticleDOI
TL;DR: In this article, the authors contrast the vertically integrated innovation model to open innovation, user innovation, as well as other distributed processes (cumulative innovation, communities or social production, and co-creation), while also discuss open source software and crowdsourcing as applications of the perspectives.
Abstract: Research from a variety of perspectives has argued that innovation no longer takes place within a single organization, but rather is distributed across multiple stakeholders in a value network. Here we contrast the vertically integrated innovation model to open innovation, user innovation, as well as other distributed processes (cumulative innovation, communities or social production, and co-creation), while we also discuss open source software and crowdsourcing as applications of the perspectives. We consider differences in the nature of distributed innovation, as well as its origins and its effects. From this, we contrast the predictions of the perspectives on the sources, motivation and value appropriation of external innovation, and thereby provide a framework for the strategic management of distributed innovation.

Proceedings ArticleDOI
12 Aug 2012
TL;DR: This paper proposes and investigates a new methodology for discovering topic experts in the popular Twitter social network that leverages Twitter Lists, which are often carefully curated by individual users to include experts on topics that interest them and whose meta-data provides valuable semantic cues to the experts' domain of expertise.
Abstract: Finding topic experts on microblogging sites with millions of users, such as Twitter, is a hard and challenging problem. In this paper, we propose and investigate a new methodology for discovering topic experts in the popular Twitter social network. Our methodology relies on the wisdom of the Twitter crowds -- it leverages Twitter Lists, which are often carefully curated by individual users to include experts on topics that interest them and whose meta-data (List names and descriptions) provides valuable semantic cues to the experts' domain of expertise. We mined List information to build Cognos, a system for finding topic experts in Twitter. Detailed experimental evaluation based on a real-world deployment shows that: (a) Cognos infers a user's expertise more accurately and comprehensively than state-of-the-art systems that rely on the user's bio or tweet content, (b) Cognos scales well due to built-in mechanisms to efficiently update its experts' database with new users, and (c) Despite relying only on a single feature, namely crowdsourced Lists, Cognos yields results comparable to, if not better than, those given by the official Twitter experts search engine for a wide range of queries in user tests. Our study highlights Lists as a potentially valuable source of information for future content or expert search systems in Twitter.
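
The List-mining signal is simple to sketch with invented data; Cognos itself adds topic normalization, ranking, and database-update machinery on top of this core idea.

```python
from collections import Counter

lists = [  # (list owner, list name, member)  (invented data)
    ("u1", "machine learning", "@alice"),
    ("u2", "ML researchers",   "@alice"),
    ("u3", "machine learning", "@alice"),
    ("u4", "food bloggers",    "@alice"),
    ("u5", "machine learning", "@bob"),
]

def topic_profile(lists, member):
    """Infer a user's expertise from the names of the Lists that other
    users placed them on, by counting topic cues across those names."""
    words = Counter()
    for _, name, m in lists:
        if m == member:
            words.update(name.lower().split())
    return words.most_common()

print(topic_profile(lists, "@alice"))  # 'machine'/'learning' dominate
```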

Journal ArticleDOI
TL;DR: A framework for a company-internal application of crowdsourcing methods is proposed, and a set of five goals companies can pursue employing internal crowdsourcing is presented.
Abstract: Crowdsourcing is typically associated with the incorporation of company-external stakeholders such as customers in the value creating process. This article proposes a framework for a company-internal application of crowdsourcing methods. It presents a set of five goals companies can pursue employing internal crowdsourcing. The practical approach of an Austrian medium-sized technology company is described in detail, including insights on software design and appropriate procedures.

Proceedings ArticleDOI
20 May 2012
TL;DR: It is shown that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard, and heuristic functions are provided to select the maximum given evidence, and to select additional votes.
Abstract: We consider a crowdsourcing database system that may cleanse, populate, or filter its data by using human workers. Just like a conventional DB system, such a crowdsourcing DB system requires data manipulation functions such as select, aggregate, maximum, average, and so on, except that now it must rely on human operators (that, for example, compare two objects) with very different latency, cost and accuracy characteristics. In this paper, we focus on one such function, maximum, that finds the highest ranked object or tuple in a set. In particular, we study two problems: given a set of votes (pairwise comparisons among objects), how do we select the maximum? And how do we improve our estimate by requesting additional votes? We show that in a crowdsourcing DB system, the optimal solution to both problems is NP-Hard. We then provide heuristic functions to select the maximum given evidence, and to select additional votes. We experimentally evaluate our functions to highlight their strengths and weaknesses.
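
A hedged heuristic sketch in the spirit of the two problems studied (exact selection being NP-Hard): score objects by pairwise win counts to pick a likely maximum, and request the next vote on the least-compared pair among the current leaders. The votes and the leader cutoff are invented.

```python
from collections import Counter
from itertools import combinations

votes = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]  # (winner, loser)

def pick_max_and_next_vote(objects, votes):
    """Evidence-based max by win count, plus a simple additional-vote rule."""
    wins = Counter(w for w, _ in votes)
    best = max(objects, key=lambda o: wins[o])
    seen = Counter(frozenset(v) for v in votes)
    # Ask for another vote on the least-compared pair among top candidates:
    leaders = sorted(objects, key=lambda o: -wins[o])[:3]
    pair = min(combinations(leaders, 2), key=lambda p: seen[frozenset(p)])
    return best, pair

print(pick_max_and_next_vote(["a", "b", "c"], votes))
```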

Posted Content
TL;DR: An active learning/adaptive testing scheme based on a greedy minimization of expected model entropy is devised, which allows a more efficient resource allocation by dynamically choosing the next question to be asked based on the previous responses.
Abstract: We propose a new probabilistic graphical model that jointly models the difficulties of questions, the abilities of participants and the correct answers to questions in aptitude testing and crowdsourcing settings. We devise an active learning/adaptive testing scheme based on a greedy minimization of expected model entropy, which allows a more efficient resource allocation by dynamically choosing the next question to be asked based on the previous responses. We present experimental results that confirm the ability of our model to infer the required parameters and demonstrate that the adaptive testing scheme requires fewer questions to obtain the same accuracy as a static test scenario.
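
A simplified sketch of the adaptive scheme, substituting a 1-D Rasch-style response model for the paper's joint graphical model: with the current ability estimate fixed, the most entropy-reducing question to ask next is (approximately) the one whose predicted probability of a correct answer is closest to 0.5. Difficulties and the ability value are invented.

```python
from math import exp

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch-style probability that the participant answers correctly."""
    return 1.0 / (1.0 + exp(difficulty - ability))

def next_question(ability, difficulties, asked):
    """Greedy pick: the unasked question with P(correct) nearest 0.5,
    a proxy for maximal expected entropy reduction."""
    candidates = [q for q in difficulties if q not in asked]
    return min(candidates,
               key=lambda q: abs(p_correct(ability, difficulties[q]) - 0.5))

difficulties = {"q1": -2.0, "q2": 0.1, "q3": 2.5}
print(next_question(ability=0.0, difficulties=difficulties, asked=set()))  # q2
```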