
Showing papers on "Crowdsourcing published in 2020"


Journal ArticleDOI
TL;DR: This work presents a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images, and proposes a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set.
Abstract: Deep learning methods for image quality assessment (IQA) are limited by the small size of existing datasets. Extensive datasets require substantial resources both for generating publishable content and for annotating it accurately. We present a systematic and scalable approach to creating KonIQ-10k, the largest IQA dataset to date, consisting of 10,073 quality-scored images. It is the first in-the-wild database aiming for ecological validity with regard to the authenticity of distortions, the diversity of content, and quality-related indicators. Through crowdsourcing, we obtained 1.2 million reliable quality ratings from 1,459 crowd workers, paving the way for more general IQA models. We propose a novel deep learning model (KonCept512) that shows excellent generalization beyond the test set (0.921 SROCC) to the current state-of-the-art database LIVE-in-the-Wild (0.825 SROCC). The model derives its core performance from the InceptionResNet architecture, being trained at a higher resolution than previous models ($512\times 384$). Correlation analysis shows that KonCept512 performs similarly to having 9 subjective scores for each test image.
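
The SROCC values quoted above are Spearman rank-order correlation coefficients between predicted and subjective quality scores. As a quick illustration (not the paper's code), SROCC is simply the Pearson correlation computed on rank values:

```python
def rank(values):
    # Assign 1-based ranks, averaging ranks over ties (as in Spearman's rho).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def srocc(x, y):
    # Spearman rank-order correlation: Pearson correlation of the ranks.
    rx, ry = rank(x), rank(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Perfect monotonic agreement between predicted and subjective scores:
print(round(srocc([1.2, 3.4, 2.2, 5.0], [10, 30, 20, 50]), 3))  # 1.0
```

A model whose predictions preserve the ranking of subjective scores gets SROCC 1.0 even if its absolute values are off, which is why IQA papers report it alongside linear correlation.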

299 citations


Journal ArticleDOI
TL;DR: The study explores the possible services across various city dimensions that can make a city smart, and suggests a multi-dimensional service classification along with the required basic infrastructure development.

224 citations


Journal ArticleDOI
TL;DR: In this paper, in order to detect and describe real-time urban emergency events, the 5W (What, Where, When, Who, and Why) model is proposed; results show the accuracy and efficiency of the proposed method.
Abstract: Crowdsourcing is a process of acquisition, integration, and analysis of big and heterogeneous data generated by a diversity of sources in urban spaces, such as sensors, devices, vehicles, buildings, and humans. Nowadays, no country, community, or person is immune to urban emergency events. Detecting urban emergency events, e.g., fires, storms, and traffic jams, is of great importance to protect human security. Recently, social media feeds have rapidly emerged as a novel platform for providing and disseminating information that is often geographic. Social media content usually includes references to urban emergency events occurring at, or affecting, specific locations. In this paper, in order to detect and describe real-time urban emergency events, the 5W (What, Where, When, Who, and Why) model is proposed. Firstly, users of social media are set as the target of crowdsourcing. Secondly, spatial and temporal information is extracted from social media to detect events in real time. Thirdly, a GIS-based annotation of the detected urban emergency event is shown. The proposed method is evaluated with extensive case studies based on real urban emergency events. The results show the accuracy and efficiency of the proposed method.
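
The spatial/temporal extraction step can be pictured with a toy sketch. The event keywords and regular expressions below are illustrative assumptions, not the paper's actual 5W pipeline:

```python
import re

# Toy sketch of 5W extraction (What/Where/When/Who) from a social media post.
# The keyword list and regexes are made up for illustration only.
EVENT_KEYWORDS = ("fire", "storm", "traffic jam")

def extract_5w(post, author):
    # What: first matching event keyword in the text.
    what = next((kw for kw in EVENT_KEYWORDS if kw in post.lower()), None)
    # Where: a capitalized place name following "at ...".
    where = (re.search(r"\bat ([A-Z]\w+(?: [A-Z]\w+)*)", post) or [None, None])[1]
    # When: a simple HH:MM time mention.
    when = (re.search(r"\b(\d{1,2}:\d{2})\b", post) or [None, None])[1]
    return {"what": what, "where": where, "when": when, "who": author}

w = extract_5w("Huge fire at Central Station around 14:30, stay away!", "@user42")
print(w)
```

A production system would replace the regexes with named-entity recognition and geocoding, but the output shape (a structured 5W record per post, ready for GIS annotation) is the same.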

206 citations


Journal ArticleDOI
01 Jan 2020
TL;DR: A comprehensive and systematic review of existing research on four core algorithmic issues in spatial crowdsourcing: (1) task assignment, (2) quality control, (3) incentive mechanism design, and (4) privacy protection.
Abstract: Crowdsourcing is a computing paradigm where humans are actively involved in a computing task, especially for tasks that are intrinsically easier for humans than for computers. Spatial crowdsourcing is an increasingly popular category of crowdsourcing in the era of mobile Internet and the sharing economy, where tasks are spatiotemporal and must be completed at a specific location and time. In fact, spatial crowdsourcing has stimulated a series of recent industrial successes including the sharing economy for urban services (Uber and Gigwalk) and spatiotemporal data collection (OpenStreetMap and Waze). This survey dives deep into the challenges and techniques brought by the unique characteristics of spatial crowdsourcing. In particular, we identify four core algorithmic issues in spatial crowdsourcing: (1) task assignment, (2) quality control, (3) incentive mechanism design, and (4) privacy protection. We conduct a comprehensive and systematic review of existing research on these four issues. We also analyze representative spatial crowdsourcing applications and explain how they are enabled by these four technical issues. Finally, we discuss open questions that need to be addressed for future spatial crowdsourcing research and applications.
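
To make the first algorithmic issue (task assignment) concrete, here is a minimal greedy sketch in which each spatiotemporal task is matched to the nearest still-available worker within a travel radius. Real systems use matching or flow algorithms; this only illustrates the problem shape:

```python
from math import hypot

# Greedy spatial task assignment sketch: each task goes to the nearest
# still-available worker, provided that worker lies within `radius`.
def assign(tasks, workers, radius):
    free = dict(workers)              # worker_id -> (x, y)
    plan = {}                         # task_id -> worker_id
    for tid, (tx, ty) in tasks.items():
        best = min(free, key=lambda w: hypot(free[w][0] - tx, free[w][1] - ty),
                   default=None)
        if best is not None and hypot(free[best][0] - tx, free[best][1] - ty) <= radius:
            plan[tid] = best
            del free[best]            # each worker handles at most one task
    return plan

plan = assign({"t1": (0, 0), "t2": (5, 5)}, {"w1": (1, 0), "w2": (5, 4)}, radius=2)
print(plan)  # {'t1': 'w1', 't2': 'w2'}
```

The greedy order already shows why assignment quality depends on arrival order and distance bounds, which is exactly the design space the survey reviews.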

185 citations


Journal ArticleDOI
TL;DR: A privacy-preserving approach for learning effective personalized models on distributed user data while guaranteeing the differential privacy of user data is proposed and the experimental results demonstrate that the proposed approach is robust to user heterogeneity and offers a good tradeoff between accuracy and privacy.
Abstract: To provide intelligent and personalized services on smart devices, machine learning techniques have been widely used to learn from data, identify patterns, and make automated decisions. Machine learning processes typically require a large amount of representative data that are often collected through crowdsourcing from end users. However, user data can be sensitive in nature, and training machine learning models on these data may expose sensitive information of users, violating their privacy. Moreover, to meet the increasing demand for personalized services, these learned models should capture users' individual characteristics. This article proposes a privacy-preserving approach for learning effective personalized models on distributed user data while guaranteeing the differential privacy of user data. Practical issues in a distributed learning system, such as user heterogeneity, are considered in the proposed approach. In addition, the convergence property and privacy guarantee of the proposed approach are rigorously analyzed. The experimental results on realistic mobile sensing data demonstrate that the proposed approach is robust to user heterogeneity and offers a good tradeoff between accuracy and privacy.
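
A standard way to guarantee differential privacy of per-user updates (shown here as a generic sketch, not the paper's specific mechanism) is to clip each user's gradient to a fixed L2 bound and add calibrated Gaussian noise; the `clip` and `sigma` parameters below are illustrative:

```python
import random, math

# Generic DP sanitization sketch: clip the gradient to L2 norm `clip`,
# then add Gaussian noise scaled by `sigma * clip`. Parameter values are
# illustrative, not taken from the paper.
def dp_sanitize(grad, clip=1.0, sigma=0.5, rng=random.Random(0)):
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, sigma * clip) for g in grad]

g = dp_sanitize([3.0, 4.0])   # norm 5 -> clipped to norm 1, then noised
print([round(x, 3) for x in g])
```

Clipping bounds any single user's influence on the aggregate, which is what makes the added noise sufficient for a formal privacy guarantee.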

141 citations


Journal ArticleDOI
Saide Zhu1, Zhipeng Cai1, Huafu Hu, Yingshu Li1, Wei Li1 
TL;DR: This article proposes an innovative hybrid blockchain crowdsourcing platform, named zkCrowd, which integrates a hybrid blockchain structure, smart contract, dual ledgers, and dual consensus protocols to secure communications, verify transactions, and preserve privacy.
Abstract: Blockchain, a promising decentralized paradigm, can be exploited not only to overcome the shortcomings of traditional crowdsourcing systems, but also to bring technical innovations, such as decentralization and accountability. Nevertheless, some critical inherent limitations of blockchain have rarely been addressed in the literature when it is incorporated into crowdsourcing, which may yield a performance bottleneck in crowdsourcing systems. To further leverage the superiority of combining blockchain and crowdsourcing, in this article we propose an innovative hybrid blockchain crowdsourcing platform, named zkCrowd. Our zkCrowd integrates a hybrid blockchain structure, smart contract, dual ledgers, and dual consensus protocols to secure communications, verify transactions, and preserve privacy. Both theoretical analysis and experiments are performed to evaluate the advantages of zkCrowd over the state of the art.

130 citations


Posted Content
TL;DR: This article proposes to integrate federated learning and local differential privacy (LDP) to enable crowdsourcing applications to train a machine learning model, and proposes four LDP mechanisms to perturb the gradients generated by vehicles.
Abstract: Internet of Vehicles (IoV) is a promising branch of the Internet of Things. IoV supports a large variety of crowdsourcing applications such as Waze, Uber, and Amazon Mechanical Turk. Users of these applications report real-time traffic information to a cloud server, which trains a machine learning model on the reported information for intelligent traffic management. However, crowdsourcing application owners can easily infer users' location information, which raises severe location privacy concerns. In addition, as the number of vehicles increases, the frequent communication between vehicles and the cloud server incurs substantial communication cost. To avoid the privacy threat and reduce the communication cost, in this paper we propose to integrate federated learning and local differential privacy (LDP) so that crowdsourcing applications can train the machine learning model collaboratively. Specifically, we propose four LDP mechanisms to perturb the gradients generated by vehicles. The Three-Outputs mechanism introduces three different output possibilities to deliver high accuracy when the privacy budget is small, and its outputs can be encoded with two bits to reduce the communication cost. To maximize performance when the privacy budget is large, we propose an optimal piecewise mechanism (PM-OPT), and further a suboptimal mechanism (PM-SUB) with a simple formula and utility comparable to PM-OPT. Finally, we build a novel hybrid mechanism by combining Three-Outputs and PM-SUB.
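
The core idea of a three-output gradient perturbation can be sketched with a toy unbiased three-level quantizer: a gradient value in [-1, 1] is mapped to one of {-1, 0, +1}, encodable in two bits, such that the expected output equals the input. The paper's Three-Outputs mechanism additionally calibrates these probabilities to satisfy ε-LDP; that calibration is omitted here:

```python
import random

# Toy three-level quantizer: maps x in [-1, 1] to {-1, 0, +1} so that
# E[out] = x (unbiased). NOT the paper's exact mechanism: the LDP
# probability calibration is deliberately left out.
def three_outputs(x, rng):
    assert -1.0 <= x <= 1.0
    return (1 if x > 0 else -1) if rng.random() < abs(x) else 0

rng = random.Random(42)
n = 100_000
est = sum(three_outputs(0.3, rng) for _ in range(n)) / n
print(round(est, 2))  # close to 0.3
```

Unbiasedness is what lets the server average many perturbed gradients and still recover the true mean update, while each individual report carries at most two bits.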

102 citations


Journal ArticleDOI
TL;DR: In this paper, a new generation of big data analytics (BDA) companies are crowdsourcing large volumes of online consumer reviews by means of controlled ad hoc online experiments and advanced machine learning (ML) techniques to forecast demand and determine the market potential for new products in several industries.

90 citations


Journal ArticleDOI
TL;DR: This work shows an incentive-based interaction between the crowdsourcing platform and the participating clients' independent strategies for training a global learning model, where each side maximizes its own benefit, and proposes a novel crowdsourcing framework to leverage FL that accounts for communication efficiency during parameter exchange.
Abstract: Federated learning (FL) rests on the notion of training a global model in a decentralized manner. Under this setting, mobile devices perform computations on their local data before uploading the required updates to improve the global model. However, when the participating clients implement an uncoordinated computation strategy, the difficulty is to handle the communication efficiency (i.e., the number of communications per iteration) while exchanging the model parameters during aggregation. Therefore, a key challenge in FL is how users participate to build a high-quality global model with communication efficiency. We tackle this issue by formulating a utility maximization problem, and propose a novel crowdsourcing framework to leverage FL that accounts for communication efficiency during parameter exchange. First, we show an incentive-based interaction between the crowdsourcing platform and the participating clients' independent strategies for training a global learning model, where each side maximizes its own benefit. We formulate a two-stage Stackelberg game to analyze such a scenario and find the game's equilibria. Second, we formalize an admission control scheme for participating clients to ensure a level of local accuracy. Simulation results demonstrate the efficacy of our proposed solution with up to 22% gain in the offered reward.
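
The two-stage Stackelberg structure can be sketched numerically with toy utilities (these quadratic cost functions are illustrative assumptions, not the paper's model): the platform (leader) posts a reward r, each client (follower) best-responds with effort maximizing its own payoff, and the platform then searches for the r that maximizes its utility given those best responses:

```python
# Toy two-stage Stackelberg sketch with illustrative utilities.
# Follower i chooses effort x maximizing r*x - c_i*x^2, so x_i* = r / (2*c_i).
def best_response(r, costs):
    return [r / (2 * c) for c in costs]

# Leader values total effort at `value_per_unit` and pays r per unit of effort.
def platform_utility(r, costs, value_per_unit=3.0):
    efforts = best_response(r, costs)
    return value_per_unit * sum(efforts) - sum(r * x for x in efforts)

costs = [1.0, 2.0, 4.0]
# Grid-search the leader's reward over (0, 5].
best_r = max((round(i * 0.01, 2) for i in range(1, 501)),
             key=lambda r: platform_utility(r, costs))
print(best_r)  # 1.5 (analytically, r* = value_per_unit / 2)
```

With these utilities the leader's payoff is S(3r - r^2) for S = sum(1/(2c_i)), so the equilibrium reward r* = 1.5 is independent of the cost profile, a simplification the paper's richer model does not share.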

77 citations


Journal ArticleDOI
TL;DR: From experimental results, it can be inferred that the proposed worker-selection incentive mechanism can inspire users to participate in crowd tasks and maximize the utility of mobile crowdsourcing systems effectively.

71 citations


Proceedings ArticleDOI
15 Oct 2020
TL;DR: In this article, the authors show that conventional crowdsourcing algorithms struggle in this user feedback setting, and present a new algorithm, SURF, that can cope with this nonresponse ambiguity.
Abstract: Supervised learning classifiers inevitably make mistakes in production, perhaps mislabeling an email, or flagging an otherwise routine transaction as fraudulent. It is vital that the end users of such a system are provided with a means of relabeling data points that they deem to have been mislabeled. The classifier can then be retrained on the relabeled data points in the hope of performance improvement. To reduce noise in this feedback data, well-known algorithms from the crowdsourcing literature can be employed. However, the feedback setting provides a new challenge: how do we know what to do in the case of user non-response? If a user provides us with no feedback on a label then it can be dangerous to assume they implicitly agree: a user can be busy, lazy, or no longer a user of the system! We show that conventional crowdsourcing algorithms struggle in this user feedback setting, and present a new algorithm, SURF, that can cope with this non-response ambiguity.
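
The naive baseline that SURF improves on can be sketched as a vote aggregator in which non-response is treated as missing rather than as implicit agreement (SURF itself models non-response probabilistically; this is only the comparison point the abstract alludes to):

```python
# Baseline sketch: aggregate explicit user feedback on a predicted binary
# label ("spam"/"ham"); None means the user gave no response and is ignored
# rather than counted as agreement.
def aggregate(predicted, feedback):
    agree = sum(1 for v in feedback.values() if v == "agree")
    disagree = sum(1 for v in feedback.values() if v == "disagree")
    if agree >= disagree:          # covers the all-None case: keep prediction
        return predicted
    return "ham" if predicted == "spam" else "spam"

print(aggregate("spam", {"u1": "disagree", "u2": "disagree", "u3": None}))  # ham
```

Note the pathology the paper points out: if `None` were instead counted as "agree", a single lazy user could outvote an explicit correction.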

Posted Content
TL;DR: An annotation schema and detailed annotation instructions are defined, reflecting the perspectives of journalists, fact-checkers, policymakers, government entities, social media platforms, and society as a whole on fighting the first global infodemic.
Abstract: With the outbreak of the COVID-19 pandemic, people turned to social media to read and to share timely information including statistics, warnings, advice, and inspirational stories. Unfortunately, alongside all this useful information, there was also a new blending of medical and political misinformation and disinformation, which gave rise to the first global infodemic. While fighting this infodemic is typically thought of in terms of factuality, the problem is much broader as malicious content includes not only fake news, rumors, and conspiracy theories, but also promotion of fake cures, panic, racism, xenophobia, and mistrust in the authorities, among others. This is a complex problem that needs a holistic approach combining the perspectives of journalists, fact-checkers, policymakers, government entities, social media platforms, and society as a whole. Taking them into account we define an annotation schema and detailed annotation instructions, which reflect these perspectives. We performed initial annotations using this schema, and our initial experiments demonstrated sizable improvements over the baselines. Now, we issue a call to arms to the research community and beyond to join the fight by supporting our crowdsourcing annotation efforts.

Journal ArticleDOI
TL;DR: Crowdsourcing efforts are currently underway to collect and analyze data from patients with cancer who are affected by the COVID-19 pandemic, filling key knowledge gaps to tackle crucial clinical questions on the complexities of infection with the causative coronavirus SARS-CoV-2 in the large, heterogeneous group of vulnerable patients with cancer.
Abstract: Crowdsourcing efforts are currently underway to collect and analyze data from patients with cancer who are affected by the COVID-19 pandemic. These community-led initiatives will fill key knowledge gaps to tackle crucial clinical questions on the complexities of infection with the causative coronavirus SARS-CoV-2 in the large, heterogeneous group of vulnerable patients with cancer.

Journal ArticleDOI
TL;DR: A privacy-aware task allocation and data aggregation scheme (PTAA) is proposed leveraging bilinear pairing and homomorphic encryption and security analysis shows that PTAA can achieve the desirable security goals.
Abstract: Spatial crowdsourcing (SC) enables task owners (TOs) to outsource spatial-related tasks to a SC-server who engages mobile users in collecting sensing data at some specified locations with their mobile devices. Data aggregation, as a specific SC task, has drawn much attention in mining the potential value of the massive spatial crowdsensing data. However, the release of SC tasks and the execution of data aggregation may pose considerable threats to the privacy of TOs and mobile users, respectively. Besides, it is nontrivial for the SC-server to allocate numerous tasks efficiently and accurately to qualified mobile users, as the SC-server has no knowledge about the entire geographical user distribution. To tackle these issues, in this paper, we introduce a fog-assisted SC architecture, in which many fog nodes deployed in different regions can assist the SC-server to distribute tasks and aggregate data in a privacy-aware manner. Specifically, a privacy-aware task allocation and data aggregation scheme (PTAA) is proposed leveraging bilinear pairing and homomorphic encryption. PTAA supports representative aggregate statistics (e.g., sum, mean, variance, and minimum) with efficient data update while providing strong privacy protection. Security analysis shows that PTAA can achieve the desirable security goals. Extensive experiments also demonstrate its feasibility and efficiency.
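
The privacy goal of aggregate statistics without exposing individual readings can be illustrated with a toy additive-secret-sharing sketch: each user splits a reading into random shares for two non-colluding aggregators, so neither sees an individual value, yet the combined totals reveal the sum. The paper itself uses bilinear pairing and homomorphic encryption; this only illustrates the aggregation goal:

```python
import random

# Toy additive secret sharing over a prime field: value = (s1 + s2) mod P,
# where s1 is uniformly random, so each share alone reveals nothing.
P = 2**31 - 1

def share(value, rng):
    s1 = rng.randrange(P)
    return s1, (value - s1) % P

rng = random.Random(7)
readings = [12, 30, 7]
shares = [share(v, rng) for v in readings]
sum1 = sum(s1 for s1, _ in shares) % P   # aggregator 1's partial total
sum2 = sum(s2 for _, s2 in shares) % P   # aggregator 2's partial total
print((sum1 + sum2) % P)  # 49
```

Homomorphic encryption achieves the same "compute on hidden values" effect with a single untrusted aggregator, at the cost of heavier cryptography.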

Journal ArticleDOI
TL;DR: This paper observes the development trend of blockchain technology from the perspective of global governments and enterprises, and surveys the main technologies for security, privacy, and trust in crowdsourcing services together with related application scenarios.
Abstract: Blockchain is a new decentralized distributed technology, which guarantees trusted transactions in untrustworthy environments by realizing a value-transfer network. Because of its importance in leading human society from the information-transmission internet era to the value-transmission internet era, it has attracted the attention of researchers in crowdsourcing services. This paper first observes the development trend of blockchain technology from the perspective of global governments and enterprises. We then briefly review the related concepts and the basic model of blockchain. On this basis, we comprehensively summarize the state of blockchain research based on recently published articles. To show its functional value, we further investigate the main technologies for security, privacy, and trust in crowdsourcing services and the application scenarios related to this field. Finally, the advantages and challenges of blockchains are discussed. We hope this provides a useful reference for future research on blockchain technology in crowdsourcing services.

Journal ArticleDOI
TL;DR: This paper devises a grid-based location protection method, which can protect the locations of workers and tasks while keeping distance-aware information on the protected locations, so that the distance between tasks and workers can still be quantified.
Abstract: Privacy leakage is a serious issue in spatial crowdsourcing in various scenarios. In this paper, we study privacy protection in spatial crowdsourcing. The main challenge is to efficiently assign tasks to nearby workers without needing to know the exact locations of tasks and workers. To address this problem, we propose a privacy-preserving framework without online trusted third parties. We devise a grid-based location protection method, which can protect the locations of workers and tasks while keeping the distance-aware information on the protected locations such that we can quantify the distance between tasks and workers. We propose an efficient task assignment algorithm, which can instantly assign tasks to nearby workers on encrypted data. To protect the task content, we leverage both attribute-based encryption and symmetric-key encryption to establish secure channels through servers, which ensures that the task is delivered securely and accurately by any untrusted server. Moreover, we analyze the security properties of our method. We conducted extensive experiments on real-world datasets. Experimental results show that our method outperforms existing approaches.
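
The grid idea can be sketched in a few lines: workers and tasks report only the cell they fall in, and the server ranks candidates by inter-cell distance. Cell size trades privacy against assignment accuracy (a larger cell hides location more but blurs distances). This is illustrative only; the paper combines it with encryption of task content:

```python
from math import hypot

# Grid-based location protection sketch: report the cell index instead of
# the exact coordinate.
def cell(x, y, size):
    return (int(x // size), int(y // size))

# Distance between cell centers, a proxy for the true worker-task distance.
def cell_distance(c1, c2, size):
    return hypot((c1[0] - c2[0]) * size, (c1[1] - c2[1]) * size)

task = cell(12.3, 45.6, size=10)      # -> (1, 4)
worker = cell(18.9, 41.2, size=10)    # -> (1, 4): same cell
print(task, worker, cell_distance(task, worker, 10))  # distance proxy 0.0
```

The server can thus find "nearby" workers while the true coordinates (12.3, 45.6) vs (18.9, 41.2) are never revealed, only their shared cell.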

Proceedings ArticleDOI
21 Apr 2020
TL;DR: Twitter A11y increases access to social media platforms for people with visual impairments by providing high-quality automatic descriptions for user-posted images by increasing alt-text coverage from 7.6% to 78.5%, before crowdsourcing descriptions for the remaining images.
Abstract: Social media platforms are integral to public and private discourse, but are becoming less accessible to people with vision impairments due to an increase in user-posted images. Some platforms (e.g., Twitter) let users add image descriptions (alternative text), but only 0.1% of images include them. To address this accessibility barrier, we created Twitter A11y, a browser extension that adds alternative text on Twitter using six methods. For example, screenshots of text are common, so we detect textual images and create alternative text using optical character recognition. Twitter A11y also leverages services to automatically generate alternative text or reuses existing descriptions from across the web. We compare the coverage and quality of Twitter A11y's six alt-text strategies by evaluating the timelines of 50 self-identified blind Twitter users. We find that Twitter A11y increases alt-text coverage from 7.6% to 78.5%, before crowdsourcing descriptions for the remaining images. We estimate that 57.5% of returned descriptions are high-quality. We then report on the experiences of 10 participants with visual impairments who used the tool during a week-long deployment. Twitter A11y increases access to social media platforms for people with visual impairments by providing high-quality automatic descriptions for user-posted images.

Journal ArticleDOI
TL;DR: Although crowdsourcing is effective at improving behavioral outcomes, more research is needed to understand effects on clinical outcomes and costs and to develop artificial intelligence systems in medicine.
Abstract: Crowdsourcing is used increasingly in health and medical research. Crowdsourcing is the process of aggregating crowd wisdom to solve a problem. The purpose of this systematic review is to summarize quantitative evidence on crowdsourcing to improve health. We followed Cochrane systematic review guidance and systematically searched seven databases up to September 4, 2019. Studies were included if they reported on crowdsourcing and related to health or medicine. Studies were excluded if recruitment was the only use of crowdsourcing. We determined the level of evidence associated with review findings using the GRADE approach. We screened 3508 citations, accessed 362 articles, and included 188 studies. Ninety-six studies examined effectiveness, 127 examined feasibility, and 37 examined cost. The most common purposes were to evaluate surgical skills (17 studies), to create sexual health messages (seven studies), and to provide layperson cardio-pulmonary resuscitation (CPR) out-of-hospital (six studies). Seventeen observational studies used crowdsourcing to evaluate surgical skills, finding that crowdsourcing evaluation was as effective as expert evaluation (low quality). Four studies used a challenge contest to solicit human immunodeficiency virus (HIV) testing promotion materials and increase HIV testing rates (moderate quality), and two of the four studies found this approach saved money. Three studies suggested that an interactive technology system increased rates of layperson-initiated CPR out-of-hospital (moderate quality). However, studies analyzing crowdsourcing to evaluate surgical skills and layperson-initiated CPR were only from high-income countries. Five studies examined crowdsourcing to inform artificial intelligence projects, most often related to annotation of medical data. Crowdsourcing was evaluated using different outcomes, limiting the extent to which studies could be pooled. Crowdsourcing has been used to improve health in many settings. Although crowdsourcing is effective at improving behavioral outcomes, more research is needed to understand effects on clinical outcomes and costs. More research is needed on crowdsourcing as a tool to develop artificial intelligence systems in medicine. PROSPERO: CRD42017052835. December 27, 2016.

Journal ArticleDOI
TL;DR: This article proposes the Markov and Collaborative filtering-based Task Recommendation (MCTR) model and, based on the Walrasian equilibrium, derives the optimum solution to maximize the social welfare of mobile crowdsourcing systems.
Abstract: With the rapid development of Industry 5.0 and mobile devices, mobile crowdsensing networks have become an important research focus. Task allocation is a key problem in mobile crowdsourcing systems: it determines how to inspire crowd workers to participate in crowd tasks and to provide truthful sensed data, which still poses many challenges. In this article, based on the Markov model and the collaborative filtering model, similarities, trajectory prediction, dwell time, and trust degree are considered to propose the Markov and Collaborative filtering-based Task Recommendation (MCTR) model. Then, based on the Walrasian equilibrium, the optimum solution is derived to maximize the social welfare of mobile crowdsourcing systems. Finally, comparison experiments are carried out to evaluate the performance of the proposed multiobjective optimization and Markov-based task allocation against other methods. These experiments show that the proposed task allocation improves the efficiency and adaptability of mobile crowdsourcing systems.
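
One ingredient of MCTR, trajectory prediction, can be sketched as a first-order Markov chain: estimate transition counts from a worker's past region sequence and predict the most likely next region. (Illustrative only; MCTR also combines collaborative filtering, dwell time, and trust degree.)

```python
from collections import Counter, defaultdict

# First-order Markov sketch of trajectory prediction: count region-to-region
# transitions in the observed trajectory.
def fit(trajectory):
    trans = defaultdict(Counter)
    for a, b in zip(trajectory, trajectory[1:]):
        trans[a][b] += 1
    return trans

# Predict the most frequent successor of the current region (None if unseen).
def predict_next(trans, current):
    return trans[current].most_common(1)[0][0] if trans[current] else None

trans = fit(["A", "B", "A", "B", "C", "A", "B"])
print(predict_next(trans, "A"))  # 'B'
```

Predicting where a worker will be next is what lets the recommender offer tasks along the worker's likely route instead of at their last known location.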

Journal ArticleDOI
22 Jun 2020
TL;DR: This article introduces approval voting to utilize the expertise of workers who have partial knowledge of the true answer, couples it with two strictly proper scoring rules, and establishes attractive optimality and uniqueness properties of the scoring rules.
Abstract: The growing need for labeled training data has made crowdsourcing a vital tool for developing machine learning applications. Here, workers on a crowdsourcing platform are typically shown a list of unlabeled items, and for each of these items, are asked to choose a label from one of the provided options. The workers in crowdsourcing platforms are not experts, thereby making it essential to judiciously elicit the information known to the workers. With respect to this goal, there are two key shortcomings of current systems: (i) the incentives of the workers are not aligned with those of the requesters; and (ii) the interface does not allow workers to convey their knowledge accurately by forcing them to make a single choice among a set of options. In this article, we address these issues by introducing approval voting to utilize the expertise of workers who have partial knowledge of the true answer and coupling it with two strictly proper scoring rules. We additionally establish attractive properties of optimality and uniqueness of our scoring rules. We also conduct preliminary empirical studies on Amazon Mechanical Turk, and the results of these experiments validate our approach.
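
The point of a strictly proper scoring rule is that a worker's expected score is uniquely maximized by reporting true beliefs. The paper's rules score approval sets; as a generic illustration of strict propriety, here is the standard quadratic (Brier) scoring rule over a reported probability vector:

```python
# Quadratic (Brier) scoring rule: score(report, y) = 2*report[y] - sum_i report[i]^2.
# Strictly proper: expected score is uniquely maximized by reporting true beliefs.
def brier_score(report, true_label):
    return 2 * report.get(true_label, 0.0) - sum(p * p for p in report.values())

# Expected score when the true label is drawn from the worker's belief.
def expected_score(report, belief):
    return sum(belief[y] * brier_score(report, y) for y in belief)

belief = {"cat": 0.7, "dog": 0.3}          # what the worker actually believes
honest = belief
hedged = {"cat": 0.5, "dog": 0.5}
overconfident = {"cat": 1.0, "dog": 0.0}
print(expected_score(honest, belief) > max(expected_score(hedged, belief),
                                           expected_score(overconfident, belief)))  # True
```

Both hedging and overconfidence lose expected score, which is exactly the incentive alignment the article wants from its approval-voting interface.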

Journal ArticleDOI
TL;DR: The last few years have seen the emergence of two new ways in which firms interact with outside stakeholders, namely crowdsourcing and crowdfunding service providers.
Abstract: In the last few years, we have seen the emergence of two new ways in which firms interact with outside stakeholders, namely crowdsourcing and crowdfunding service providers. In this article, we def...

Journal ArticleDOI
TL;DR: The approximation performance of the proposed Secure Reverse Auction (SRA) protocol is analyzed and it is proved that it has some desired properties, including truthfulness, individual rationality, computational efficiency, and security.
Abstract: In this paper, we study a new type of spatial crowdsourcing, namely competitive detour tasking, where workers can make detours from their original travel paths to perform multiple tasks, and each worker is allowed to compete for preferred tasks by strategically claiming his/her detour costs. The objective is to make suitable task assignment by maximizing the social welfare of crowdsourcing systems and protecting workers’ private sensitive information. We first model the task assignment problem as a reverse auction process. We formalize the winning bid selection of reverse auction as an $n$-to-one weighted bipartite graph matching problem with multiple 0-1 knapsack constraints. Since this problem is NP-hard, we design an approximation algorithm to select winning bids and determine corresponding payments. Based on this, a Secure Reverse Auction (SRA) protocol is proposed for this novel spatial crowdsourcing. We analyze the approximation performance of the proposed protocol and prove that it has some desired properties, including truthfulness, individual rationality, computational efficiency, and security. To the best of our knowledge, this is the first theoretically provable secure auction protocol for spatial crowdsourcing systems. In addition, we also conduct extensive simulations on a real trace to verify the performance of the proposed protocol.
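
The winner-selection step of a reverse auction can be sketched greedily: accept bids in increasing order of claimed cost per unit of value until a knapsack-style budget is exhausted. The paper's approximation algorithm and its truthful payment rule are more involved; this shows only the selection shape:

```python
# Greedy winning-bid selection sketch for a reverse auction.
# bids: (bidder, claimed_cost, value); `capacity` caps total accepted cost.
def select_winners(bids, capacity):
    chosen, spent = [], 0.0
    # Cheapest cost-per-value first, i.e. best "bang per buck" for the buyer.
    for bidder, cost, value in sorted(bids, key=lambda b: b[1] / b[2]):
        if spent + cost <= capacity:
            chosen.append(bidder)
            spent += cost
    return chosen

bids = [("w1", 4.0, 10.0), ("w2", 9.0, 9.0), ("w3", 2.0, 3.0)]
print(select_winners(bids, capacity=7.0))  # ['w1', 'w3']
```

A truthful mechanism would additionally pay each winner a threshold price (the highest bid at which they would still win) rather than their claimed cost, so that misreporting detour costs cannot help.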


Posted Content
TL;DR: ‘ETHOS’ (multi-labEl haTe speecH detectiOn dataSet), a textual dataset with two variants, binary and multi-label, based on YouTube and Reddit comments validated using the Figure-Eight crowdsourcing platform, is presented together with the annotation protocol used to create it.
Abstract: Online hate speech is a recent problem in our modern society, growing at a steady rate by exploiting weaknesses of the corresponding regimes that characterise several social media platforms. The phenomenon is mainly cultivated through comments, either during users' interaction or on posted multimedia content. Nowadays, giant companies own platforms where many millions of users log in daily. Thus, protecting their users from exposure to similar phenomena, both for keeping up with the corresponding law and for retaining a high quality of offered services, seems mandatory. Having a robust and reliable mechanism for identifying and preventing the uploading of related material would have a huge effect on several aspects of our daily life, while its absence would heavily deteriorate the total user experience and its erroneous operation might raise several ethical issues. In this work, we present a protocol for creating a dataset that is more suitable in both its informativeness and representativeness aspects, favouring the safer capture of hate speech occurrences without restricting its applicability to other classification problems. Moreover, we produce and publish a textual dataset with two variants, binary and multi-label, called ‘ETHOS’, based on YouTube and Reddit comments validated through the Figure-Eight crowdsourcing platform. Our assumption about the production of more compatible datasets is further investigated by applying various classification models and recording their behaviour over several appropriate metrics.

Proceedings ArticleDOI
20 Apr 2020
TL;DR: This work studies a novel spatial crowdsourcing problem, namely Predictive Task Assignment (PTA), which aims to maximize the number of assigned tasks by taking into account both current and future workers/tasks that enter the system dynamically with location unknown in advance and proposes a two-phase data-driven framework.
Abstract: With the rapid development of mobile networks and the widespread usage of mobile devices, spatial crowdsourcing, which refers to assigning location-based tasks to moving workers, has drawn increasing attention. One of the major issues in spatial crowdsourcing is task assignment, which allocates tasks to appropriate workers. However, existing works generally assume static offline scenarios, where the spatio-temporal information of all workers and tasks is determined and known a priori. Ignoring the dynamic spatio-temporal distributions of workers and tasks can often lead to poor assignment results. In this work we study a novel spatial crowdsourcing problem, namely Predictive Task Assignment (PTA), which aims to maximize the number of assigned tasks by taking into account both current and future workers/tasks that enter the system dynamically with locations unknown in advance. We propose a two-phase data-driven framework. The prediction phase combines different learning models to predict the locations and routes of future workers and designs a graph embedding approach to estimate the distribution of future tasks. In the assignment phase, we propose both a greedy algorithm for large-scale applications and an optimal algorithm with graph-partition-based decomposition. Extensive experiments on two real datasets demonstrate the effectiveness of our framework.

Proceedings ArticleDOI
20 Apr 2020
TL;DR: A novel privacy mechanism based on Hierarchically Well-Separated Trees (HSTs) is designed and extensive experiments show that online task assignment under this privacy mechanism is notably more effective in terms of total distance than under prior differentially private mechanisms.
Abstract: With spatial crowdsourcing applications such as Uber and Waze deeply penetrating everyday life, there is a growing concern to protect user privacy in spatial crowdsourcing. In particular, the locations of workers and tasks should be properly processed via a privacy mechanism before being reported to the untrusted spatial crowdsourcing server for task assignment. Privacy mechanisms typically perturb the location information, which tends to make task assignment ineffective. Prior studies only provide guarantees on privacy protection without assuring the effectiveness of task assignment. In this paper, we investigate privacy protection for online task assignment with the objective of minimizing the total distance, an important task assignment formulation in spatial crowdsourcing. We design a novel privacy mechanism based on Hierarchically Well-Separated Trees (HSTs). We prove that the mechanism is $\varepsilon$-Geo-Indistinguishable and show that there is a task assignment algorithm with a competitive ratio of $O\left(\frac{1}{\varepsilon^4}\log N \log^2 k\right)$, where $\varepsilon$ is the privacy budget, N is the number of predefined points on the HST, and k is the matching size. Extensive experiments on synthetic and real datasets show that online task assignment under our privacy mechanism is notably more effective in terms of total distance than under prior differentially private mechanisms.
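As an illustration of the geo-indistinguishability notion referenced above, here is a minimal sketch of the standard planar Laplace mechanism (note: this is the canonical mechanism for the definition, not the paper's HST-based mechanism; the function name and parameters are illustrative):

```python
import math
import random

def planar_laplace(x, y, eps):
    """Report a perturbed 2-D location satisfying eps-geo-indistinguishability.

    Planar (polar) Laplace mechanism: pick a uniform direction, then a
    radius from Gamma(2, 1/eps), the radial marginal of the planar
    Laplace density, so the expected displacement is 2 / eps.
    """
    theta = random.uniform(0.0, 2.0 * math.pi)  # uniform direction
    r = random.gammavariate(2, 1.0 / eps)       # expected radius: 2 / eps
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

Smaller `eps` means stronger privacy but larger expected displacement, which is exactly the privacy/effectiveness tension the paper's task assignment analysis addresses.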

Journal ArticleDOI
TL;DR: This paper studies a destination-aware task assignment problem that concerns the optimal strategy of assigning each task to a proper worker such that the total number of completed tasks is maximized while all workers can reach their destinations before their deadlines after performing the assigned tasks.
Abstract: With the proliferation of GPS-enabled smart devices and the increased availability of wireless networks, spatial crowdsourcing (SC) has recently been proposed as a framework to automatically request workers (i.e., smart device carriers) to perform location-sensitive tasks (e.g., taking scenic photos, reporting events). In this paper, we study a destination-aware task assignment problem that concerns the optimal strategy of assigning each task to a proper worker such that the total number of completed tasks is maximized while all workers can reach their destinations before their deadlines after performing the assigned tasks. Finding the globally optimal assignment turns out to be an intractable problem, since it does not imply an optimal assignment for each individual worker. Observing that the task assignment dependency only exists amongst subsets of workers, we utilize a tree-decomposition technique to separate workers into independent clusters and develop an efficient depth-first search algorithm with progressive bounds to prune non-promising assignments. To make our proposed framework applicable to more scenarios, we further optimize the original framework by proposing strategies to reduce the overall travel cost and to allow each task to be assigned to multiple workers. Extensive empirical studies verify that the proposed technique and optimization strategies perform effectively and address the problem nicely.
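A depth-first search with a progressive bound, in the spirit of the algorithm above, can be sketched in a deliberately simplified setting (assumptions: each worker takes at most one task, and a caller-supplied `feasible(w, t)` predicate stands in for the destination/deadline constraint; the paper's actual algorithm additionally works on tree-decomposed worker clusters):

```python
def best_assignment(workers, tasks, feasible):
    """Maximize the number of completed tasks via DFS with a simple bound.

    workers, tasks: sequences (only their lengths matter here);
    feasible(w, t): True if worker index w can perform task index t
    and still reach their destination in time.
    """
    best = [0]

    def dfs(w, used, count):
        remaining = len(workers) - w
        if count + remaining <= best[0]:   # bound: even assigning every
            return                         # remaining worker can't improve
        if w == len(workers):
            best[0] = count
            return
        for t in range(len(tasks)):        # try each unassigned feasible task
            if t not in used and feasible(w, t):
                dfs(w + 1, used | {t}, count + 1)
        dfs(w + 1, used, count)            # or leave worker w unassigned

    dfs(0, frozenset(), 0)
    return best[0]
```

The bound prunes any branch whose best possible completion count cannot beat the incumbent, which is the core idea behind "progressive bounds" even though the real algorithm's bound is tighter.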

Proceedings ArticleDOI
01 Jul 2020
TL;DR: It is shown that co-attention models which explicitly encode dialogue history outperform models that don't, achieving state-of-the-art performance, and a challenging subset (VisDialConv) of the VisDial val set is proposed with a benchmark NDCG of 63%.
Abstract: Visual Dialogue involves "understanding" the dialogue history (what has been discussed previously) and the current question (what is asked), in addition to grounding information in the image, to accurately generate the correct response. In this paper, we show that co-attention models which explicitly encode dialogue history outperform models that don't, achieving state-of-the-art performance (72% NDCG on the val set). However, we also expose shortcomings of the crowdsourced dataset collection procedure, by showing that dialogue history is indeed only required for a small amount of the data, and that the current evaluation metric encourages generic replies. To that end, we propose a challenging subset (VisDialConv) of the VisDial val set with a benchmark NDCG of 63%.

Journal ArticleDOI
TL;DR: In this paper, the authors classify the recent solutions into four different categories: matrix factorization based models (MF-based models), gradient boosting tree based models, deep learning based models and ranking based models.
Abstract: The rapid development of Community Question Answering (CQA) satisfies users’ quest for professional and personal knowledge about anything. In CQA, one central issue is to find users with the expertise and willingness to answer the given questions. Expert finding in CQA often exhibits very different challenges compared to traditional settings. The new features of CQA (such as huge volume, sparse data and crowdsourcing) violate fundamental assumptions of traditional recommendation systems. This paper focuses on reviewing and categorizing the current progress on expert finding in CQA. We classify the recent solutions into four different categories: matrix factorization based models (MF-based models), gradient boosting tree based models (GBT-based models), deep learning based models (DL-based models) and ranking based models (R-based models). We find that MF-based models outperform other categories of models in the crowdsourcing situation. Moreover, we use innovative diagrams to clarify several important concepts of ensemble learning, and find that ensemble models combining several specific single models can further boost performance. Further, we compare the performance of different models on different types of matching tasks, including text vs. text, graph vs. text, audio vs. text and video vs. text. The results will help with model selection for expert finding in practice. Finally, we explore some potential future issues in expert finding research in CQA.
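As a rough sketch of the MF-based family surveyed above (a generic latent-factor model, not any specific paper's method), the following factorizes a small user-question score matrix with plain SGD; `None` marks unobserved entries:

```python
import random

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02):
    """SGD matrix factorization of a user-question score matrix R.

    R: list of rows, with None for unobserved (user, question) pairs.
    Returns factor matrices U (n_users x k) and Q (n_questions x k);
    the dot product U[i] . Q[j] predicts user i's score on question j,
    which can be used to rank candidate experts for a question.
    """
    random.seed(0)                      # deterministic init for the sketch
    n, m = len(R), len(R[0])
    U = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(n)]
    Q = [[random.gauss(0, 0.1) for _ in range(k)] for _ in range(m)]
    for _ in range(steps):
        for i in range(n):
            for j in range(m):
                if R[i][j] is None:     # only observed entries drive updates
                    continue
                err = R[i][j] - sum(U[i][f] * Q[j][f] for f in range(k))
                for f in range(k):
                    u, q = U[i][f], Q[j][f]
                    U[i][f] += lr * (err * q - reg * u)
                    Q[j][f] += lr * (err * u - reg * q)
    return U, Q
```

Because only observed entries enter the loss, this style of model copes with the data sparsity the survey highlights: missing (user, question) pairs are predicted from the learned factors rather than treated as zeros.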

Journal ArticleDOI
TL;DR: An automated method for design concept assessment that provides a possible avenue to rate design concepts deterministically and hints at bias in human design concept selection is developed and demonstrated.
Abstract: In order to develop novel solutions for complex systems and in increasingly competitive markets, it may be advantageous to generate large numbers of design concepts and then to identify the most novel and valuable ideas. However, it can be difficult to process, review, and assess thousands of design concepts. Based on this need, we develop and demonstrate an automated method for design concept assessment. In the method, machine learning technologies are first applied to extract ontological data from design concepts. Then, a filtering strategy and quantitative metrics are introduced that enable creativity rating based on the ontological data. This method is tested empirically. Design concepts are crowd-generated for a variety of actual industry design problems/opportunities. Over 4000 design concepts were generated by humans for assessment. Empirical evaluation assesses: (1) the correspondence of the automated ratings with human creativity ratings; (2) whether concepts selected using the method are highly scored by another set of crowd raters; and finally (3) whether high-scoring designs correlate with industrial technology development. The method provides a possible avenue to rate design concepts deterministically. A highlight is that a subset of designs selected automatically out of a large set of candidates was scored higher than a subset selected by humans when evaluated by a set of third-party raters. The results hint at bias in human design concept selection and encourage further study of this topic.