Pushing the boundaries of crowd-enabled databases with query-driven schema expansion

doi:10.14778/2168651.2168655

Open Access, Journal Article

Joachim Selke, +2 more

Vol. 5, Iss. 6, pp. 538-549

TLDR

This paper extends crowd-enabled databases with flexible query-driven schema expansion, which allows new attributes to be added to the database at query time, and leverages the user-generated data found in the Social Web to build perceptual spaces.

Abstract:

By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities such as completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases with flexible query-driven schema expansion, allowing new attributes to be added to the database at query time. However, the number of crowd-sourced mini-tasks needed to fill in missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: by exploiting user ratings, we build perceptual spaces, i.e., highly compressed representations of the opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowd-sourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments show that our approach can boost both the performance and the quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
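The pipeline the abstract outlines (user ratings, then a perceptual space, then a few labeled samples, then values for every item) can be illustrated with a small sketch. The abstract does not say which factorization or learner the authors use, so everything below is an assumption made for illustration: a rank-2 alternating-least-squares factorization stands in for the perceptual space, a nearest-centroid rule stands in for the trained extractor, and all function names and the toy ratings are hypothetical.

```python
import random


def solve_2x2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] @ [x, y] = [e, f] with Cramer's rule."""
    det = a * d - b * c
    return [(e * d - b * f) / det, (a * f - e * c) / det]


def fit_side(ratings, fixed, n_rows, by_row, reg):
    """Ridge-fit one 2-d vector per user (by_row=True) or per item (by_row=False)."""
    out = []
    for i in range(n_rows):
        a = d = reg          # regularized diagonal of the normal equations
        b = e = f = 0.0
        for j, v in enumerate(fixed):
            r = ratings[i][j] if by_row else ratings[j][i]
            if r is None:    # unrated user/item pair: skip it
                continue
            a += v[0] * v[0]
            b += v[0] * v[1]
            d += v[1] * v[1]
            e += v[0] * r
            f += v[1] * r
        out.append(solve_2x2(a, b, b, d, e, f))
    return out


def perceptual_space(ratings, n_users, n_items, reg=0.1, sweeps=20, seed=0):
    """Assumed stand-in for a perceptual space: rank-2 ALS item vectors."""
    rng = random.Random(seed)
    users = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n_users)]
    for _ in range(sweeps):
        items = fit_side(ratings, users, n_items, by_row=False, reg=reg)
        users = fit_side(ratings, items, n_users, by_row=True, reg=reg)
    return fit_side(ratings, users, n_items, by_row=False, reg=reg)


def expand_attribute(item_vecs, labeled):
    """Predict a new boolean attribute for all items from a few labeled ones.

    `labeled` maps item index -> bool and must contain at least one True
    and one False example (the crowd-sourced training sample).
    """
    def centroid(flag):
        pts = [item_vecs[i] for i, y in labeled.items() if y == flag]
        return [sum(p[k] for p in pts) / len(pts) for k in range(2)]

    def dist2(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    pos, neg = centroid(True), centroid(False)
    return [dist2(v, pos) < dist2(v, neg) for v in item_vecs]


# Hypothetical toy data: 6 users x 8 movies; movies 0-3 are comedies, 4-7 dramas.
# Each user gives one rating to every comedy and another to every drama.
tastes = [(5, 1), (4, 2), (5, 2), (1, 5), (2, 4), (1, 4)]
ratings = [[c] * 4 + [d] * 4 for c, d in tastes]

vecs = perceptual_space(ratings, n_users=6, n_items=8)
# Two crowd judgments stand in for the small training sample.
is_comedy = expand_attribute(vecs, {0: True, 4: False})
```

In this toy setting only two crowd judgments are requested, and the `is_comedy` column for the remaining six movies is filled in from the rating-derived vectors rather than by issuing one crowd task per value. Real rating data is sparse and noisy, so a practical system would need a higher-dimensional space and a stronger learner than this sketch assumes.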

Citations

Pick-a-crowd: tell me what you like, and i'll tell you what to do

Using the crowd for top-k and group-by queries

Large-scale linked data integration using probabilistic reasoning and crowdsourcing

Skyline queries in crowd-enabled databases

An online cost sensitive decision-making method in crowdsourcing systems

Related Papers (5)

CrowdDB: answering queries with crowdsourcing

CrowdER: crowdsourcing entity resolution

Human-powered sorts and joins

Crowdsourcing systems on the World-Wide Web

CDAS: a crowdsourcing data analytics system