Topic: Crowdsourcing

About: Crowdsourcing is a research topic. Over its lifetime, 12,889 publications have been published within this topic, receiving 230,638 citations.


Papers
Proceedings ArticleDOI
07 May 2016
TL;DR: An unsupervised system that captures dominating user behaviors from clickstream data and visualizes the detected behaviors in an intuitive manner, effectively identifying previously unknown behaviors, e.g., dormant users and hostile chatters.
Abstract: Online services are increasingly dependent on user participation. Whether it's online social networks or crowdsourcing services, understanding user behavior is important yet challenging. In this paper, we build an unsupervised system to capture dominating user behaviors from clickstream data (traces of users' click events), and visualize the detected behaviors in an intuitive manner. Our system identifies "clusters" of similar users by partitioning a similarity graph (nodes are users; edges are weighted by clickstream similarity). The partitioning process leverages iterative feature pruning to capture the natural hierarchy within user clusters and produce intuitive features for visualizing and understanding captured user behaviors. For evaluation, we present case studies on two large-scale clickstream traces (142 million events) from real social networks. Our system effectively identifies previously unknown behaviors, e.g., dormant users, hostile chatters. Also, our user study shows people can easily interpret identified behaviors using our visualization tool.
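As an illustration of the similarity-graph clustering described above, the sketch below clusters a few hypothetical clickstream traces using Jaccard similarity over click-event bigrams and off-the-shelf average-linkage hierarchical clustering. It is a minimal sketch of the idea, not the authors' system (the paper's iterative feature pruning is omitted).

```python
# Minimal sketch, not the paper's system: cluster users by clickstream similarity.
from itertools import combinations

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

def bigrams(clicks):
    """Represent a user's clickstream as the set of consecutive event pairs."""
    return {(a, b) for a, b in zip(clicks, clicks[1:])}

def jaccard(a, b):
    """Jaccard similarity between two bigram sets."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical clickstream traces: one click-event sequence per user.
users = {
    "u1": ["login", "browse", "post", "logout"],
    "u2": ["login", "browse", "browse", "logout"],
    "u3": ["login", "chat", "chat", "chat"],
    "u4": ["login", "chat", "chat", "logout"],
}

names = list(users)
feats = [bigrams(users[u]) for u in names]
n = len(names)

# Pairwise distances (1 - similarity) play the role of edge weights in the similarity graph.
dist = np.zeros((n, n))
for i, j in combinations(range(n), 2):
    dist[i, j] = dist[j, i] = 1.0 - jaccard(feats[i], feats[j])

# Average-linkage hierarchical clustering loosely mirrors the natural hierarchy of user clusters.
labels = fcluster(linkage(squareform(dist), method="average"), t=2, criterion="maxclust")
print(dict(zip(names, labels)))  # e.g. {'u1': 1, 'u2': 1, 'u3': 2, 'u4': 2}
```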

211 citations

Proceedings ArticleDOI
31 May 2014
TL;DR: This case study highlights a number of challenges that arise when crowdsourcing software development at a multinational corporation, and finds that crowdsourcing works better for specific software development tasks that are less complex and stand-alone, without interdependencies.
Abstract: Crowdsourcing is an emerging and promising approach which involves delegating a variety of tasks to an unknown workforce - the crowd. Crowdsourcing has been applied quite successfully in various contexts from basic tasks on Amazon Mechanical Turk to solving complex industry problems, e.g. InnoCentive. Companies are increasingly using crowdsourcing to accomplish specific software development tasks. However, very little research exists on this specific topic. This paper presents an in-depth industry case study of crowdsourcing software development at a multinational corporation. Our case study highlights a number of challenges that arise when crowdsourcing software development. For example, the crowdsourcing development process is essentially a waterfall model and this must eventually be integrated with the agile approach used by the company. Crowdsourcing works better for specific software development tasks that are less complex and stand-alone without interdependencies. The development cost was much greater than originally expected, overhead in terms of company effort to prepare specifications and answer crowdsourcing community queries was much greater, and the time-scale to complete contests, review submissions and resolve quality issues was significant. Finally, quality issues were pushed later in the lifecycle given the lengthy process necessary to identify and resolve quality issues. Given the emphasis in software engineering on identifying bugs as early as possible, this is quite problematic.

207 citations

Journal ArticleDOI
TL;DR: The LIVE In the Wild Image Quality Challenge Database, as discussed by the authors, contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices, and has been used to conduct a very large-scale, multi-month subjective image quality assessment study.
Abstract: Most publicly available image quality databases have been created under highly controlled conditions by introducing graded simulated distortions onto high-quality photographs. However, images captured using typical real-world mobile camera devices are usually afflicted by complex mixtures of multiple distortions, which are not necessarily well-modeled by the synthetic distortions found in existing databases. The originators of existing legacy databases usually conducted human psychometric studies to obtain statistically meaningful sets of human opinion scores on images in a stringently controlled visual environment, resulting in small data collections relative to other kinds of image analysis databases. Towards overcoming these limitations, we designed and created a new database that we call the LIVE In the Wild Image Quality Challenge Database, which contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we have used to conduct a very large-scale, multi-month image quality assessment subjective study. Our database consists of over 350,000 opinion scores on 1,162 images evaluated by over 7,000 unique human observers. Despite the lack of control over the experimental environments of the numerous study participants, we demonstrate excellent internal consistency of the subjective dataset. We also evaluate several top-performing blind Image Quality Assessment algorithms on it and present insights on how mixtures of distortions challenge both end users as well as automatic perceptual quality prediction models.
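As a simple illustration of the two computations at the heart of such a study, the sketch below aggregates hypothetical crowdsourced ratings into mean opinion scores (MOS) and then checks how well a hypothetical blind IQA model's predictions rank-correlate with them. The data, rating scale, and model scores are assumptions, not the paper's.

```python
# Sketch with assumed data: MOS aggregation plus a rank-correlation check for a blind IQA model.
from collections import defaultdict
from statistics import mean

from scipy.stats import spearmanr

# Hypothetical crowdsourced ratings: (image_id, opinion score on a 1-100 scale).
ratings = [
    ("img_001", 72), ("img_001", 65), ("img_001", 70),
    ("img_002", 34), ("img_002", 41), ("img_002", 38),
    ("img_003", 88), ("img_003", 90), ("img_003", 85),
]

by_image = defaultdict(list)
for image_id, score in ratings:
    by_image[image_id].append(score)

# Mean opinion score per image.
mos = {image_id: mean(scores) for image_id, scores in by_image.items()}

# Hypothetical quality predictions from a blind IQA algorithm for the same images.
predicted = {"img_001": 0.68, "img_002": 0.35, "img_003": 0.91}

images = sorted(mos)
srocc, _ = spearmanr([mos[i] for i in images], [predicted[i] for i in images])
print("MOS:", mos)
print(f"SROCC between MOS and model predictions: {srocc:.3f}")
```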

207 citations

Journal ArticleDOI
TL;DR: In this paper, a 5W (What, Where, When, Who, and Why) model is proposed to detect and describe real-time urban emergency events, and the results show the accuracy and efficiency of the proposed method.
Abstract: Crowdsourcing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by a diversity of sources in urban spaces, such as sensors, devices, vehicles, buildings, and humans. Nowadays, no country, community, or person is immune to urban emergency events. Detection of urban emergency events, e.g., fires, storms, and traffic jams, is of great importance to protecting the safety of humans. Recently, social media feeds have been rapidly emerging as a novel platform for providing and disseminating information that is often geographic. The content from social media usually includes references to urban emergency events occurring at, or affecting, specific locations. In this paper, in order to detect and describe real-time urban emergency events, the 5W (What, Where, When, Who, and Why) model is proposed. Firstly, users of social media are set as the target of crowdsourcing. Secondly, spatial and temporal information from social media is extracted to detect real-time events. Thirdly, a GIS-based annotation of the detected urban emergency events is shown. The proposed method is evaluated with extensive case studies based on real urban emergency events. The results show the accuracy and efficiency of the proposed method.
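As a toy illustration of the kind of extraction the 5W framing implies (not the paper's implementation), the sketch below pulls What, Where, and When signals out of a social-media post using an assumed event-keyword lexicon and a naive location pattern, treating the posting user as Who and leaving Why for downstream analysis.

```python
# Toy 5W extraction sketch with an assumed keyword lexicon; not the paper's method.
import re

EVENT_KEYWORDS = ("fire", "storm", "flood", "traffic jam", "accident")  # assumed lexicon

def extract_5w(post_text, user, timestamp, geotag=None):
    """Return a 5W record for one social-media post."""
    text = post_text.lower()
    what = next((kw for kw in EVENT_KEYWORDS if kw in text), None)
    # Prefer the platform's geotag; otherwise fall back to a naive "at/in/near <Place>" pattern.
    where = geotag
    if where is None:
        match = re.search(r"\b(?:at|in|near)\s+([A-Z]\w+(?:\s+[A-Z]\w+)*)", post_text)
        where = match.group(1) if match else None
    return {
        "what": what,          # detected event type
        "where": where,        # spatial information
        "when": timestamp,     # temporal information from post metadata
        "who": user,           # the reporting social-media user
        "why": None,           # typically inferred later from aggregated context
    }

print(extract_5w("Huge fire near Central Station, avoid the area!",
                 user="@citizen42", timestamp="2016-03-01T14:05:00Z"))
```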

206 citations

Proceedings ArticleDOI
02 May 2017
TL;DR: Revolt eliminates the burden of creating detailed label guidelines by harnessing crowd disagreements to identify ambiguous concepts and create rich structures (groups of semantically related items) for post-hoc label decisions.
Abstract: Crowdsourcing provides a scalable and efficient way to construct labeled datasets for training machine learning systems. However, creating comprehensive label guidelines for crowdworkers is often prohibitive even for seemingly simple concepts. Incomplete or ambiguous label guidelines can then result in differing interpretations of concepts and inconsistent labels. Existing approaches for improving label quality, such as worker screening or detection of poor work, are ineffective for this problem and can lead to rejection of honest work and a missed opportunity to capture rich interpretations about data. We introduce Revolt, a collaborative approach that brings ideas from expert annotation workflows to crowd-based labeling. Revolt eliminates the burden of creating detailed label guidelines by harnessing crowd disagreements to identify ambiguous concepts and create rich structures (groups of semantically related items) for post-hoc label decisions. Experiments comparing Revolt to traditional crowdsourced labeling show that Revolt produces high-quality labels without requiring label guidelines, in exchange for an increase in monetary cost. This up-front cost, however, is mitigated by Revolt's ability to produce reusable structures that can accommodate a variety of label boundaries without requiring new data to be collected. Further comparisons of Revolt's collaborative and non-collaborative variants show that collaboration reaches higher label accuracy with lower monetary cost.
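A minimal sketch of the disagreement signal such a workflow builds on, assuming hypothetical crowd labels and an arbitrary entropy cutoff: items with high label entropy are flagged as ambiguous and set aside for post-hoc label decisions. This is an illustration, not the Revolt system itself.

```python
# Sketch with assumed labels and threshold: flag ambiguous items via label entropy.
from collections import Counter
from math import log2

def label_entropy(labels):
    """Shannon entropy of a list of crowd labels; 0 means full agreement."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical crowd labels for a binary labeling task.
crowd_labels = {
    "item_01": ["yes", "yes", "yes", "yes", "yes"],  # clear-cut
    "item_02": ["yes", "no", "yes", "no", "no"],     # workers disagree
    "item_03": ["no", "no", "no", "yes", "no"],
}

AMBIGUITY_THRESHOLD = 0.8  # assumed cutoff, in bits

for item, labels in crowd_labels.items():
    h = label_entropy(labels)
    status = "ambiguous -> set aside for post-hoc decision" if h >= AMBIGUITY_THRESHOLD else "confident"
    print(f"{item}: entropy={h:.2f} ({status})")
```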

205 citations


Network Information
Related Topics (5)
Social network: 42.9K papers, 1.5M citations (87% related)
User interface: 85.4K papers, 1.7M citations (86% related)
Deep learning: 79.8K papers, 2.1M citations (85% related)
Cluster analysis: 146.5K papers, 2.9M citations (85% related)
The Internet: 213.2K papers, 3.8M citations (85% related)
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    637
2022    1,420
2021    996
2020    1,250
2019    1,341
2018    1,396