Pushing the boundaries of crowd-enabled databases with query-driven schema expansion
Joachim Selke, Christoph Lofi, Wolf-Tilo Balke
Vol. 5, Iss. 6, pp. 538-549
TLDR
This paper extends crowd-enabled databases with flexible query-driven schema expansion, allowing new attributes to be added to the database at query time, and leverages user-generated data found in the Social Web to build perceptual spaces.
Abstract:
By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases by flexible query-driven schema expansion, allowing the addition of new attributes to the database at query time. However, the number of crowd-sourced mini-tasks needed to fill in missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: By exploiting user ratings, we build perceptual spaces, i.e., highly compressed representations of the opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowd-sourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments show that our approach can boost both the performance and the quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
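The pipeline the abstract outlines (ratings, then a perceptual space, then attribute values extrapolated from a few labeled samples) can be sketched roughly as follows. The abstract does not say which factorization or learning method the paper uses, so this sketch substitutes plain SGD matrix factorization for building the space and a 1-nearest-neighbor lookup for extracting values. The names `build_perceptual_space` and `fill_attribute`, the toy ratings, and the `is_comedy` attribute are hypothetical and chosen only for illustration.

```python
import random


def build_perceptual_space(ratings, n_users, n_items, dim=2,
                           epochs=200, lr=0.02, reg=0.02, seed=0):
    """Factorize (user, item, rating) triples with plain SGD.

    Returns one `dim`-dimensional coordinate vector per item. This is an
    assumed stand-in for the paper's perceptual-space construction.
    """
    rng = random.Random(seed)
    users = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    items = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(a * b for a, b in zip(users[u], items[i]))
            err = r - pred
            for k in range(dim):
                uk, ik = users[u][k], items[i][k]
                # Gradient step on squared error with L2 regularization.
                users[u][k] += lr * (err * ik - reg * uk)
                items[i][k] += lr * (err * uk - reg * ik)
    return items


def fill_attribute(space, labeled):
    """Give every item the label of its nearest labeled item in the space.

    `labeled` maps item index -> value obtained from a few crowd judgments.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    filled = []
    for i, vec in enumerate(space):
        if i in labeled:
            filled.append(labeled[i])
        else:
            nearest = min(labeled, key=lambda j: dist2(vec, space[j]))
            filled.append(labeled[nearest])
    return filled


# Toy data (hypothetical): users 0-2 rate items 0-2 highly and items 3-5
# poorly; users 3-5 do the opposite.
ratings = [(u, i, 5.0 if (u < 3) == (i < 3) else 1.0)
           for u in range(6) for i in range(6)]
space = build_perceptual_space(ratings, n_users=6, n_items=6)

# Only two items carry a crowd-sourced label for the new attribute "is_comedy".
is_comedy = fill_attribute(space, {0: True, 3: False})
print(is_comedy)
```

In this toy setting, only two crowd judgments are requested; the remaining four values are inferred from rating behavior, which is the cost saving the abstract describes. A real deployment would need a larger, sparse rating matrix and a stronger learner than 1-nearest-neighbor.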
Citations
Journal Article
Crowdsourcing for Query Processing on Web Data: A Case Study on the Skyline Operator
TL;DR: This paper presents some heuristics-based approaches and compares them to crowdsourcing-based approaches using sophisticated optimization techniques, focusing especially on result correctness, and demonstrates how such optimizations can be performed for the popular skyline operator for preference queries.
Book Chapter
Retaining Rough Diamonds: Towards a Fairer Elimination of Low-Skilled Workers
Kinda El Maarry, Wolf-Tilo Balke
TL;DR: It is shown how current quality control measures misjudge and subsequently discriminate against honest workers with lower skill levels, and it is discussed how to distinguish between basically honest workers, who might just be lacking educational skills, and plainly unethical workers.
Proceedings Article
Social opportunistic sensing and social centric networking: enabling technology for smart cities
Stephan Sigg, Xiaoming Fu
TL;DR: The concepts of Social Opportunistic Sensing (SOS) and Social Centric Networking (SCN) are introduced and the former has the potential to enhance offline social networks in Internet of Things (IoT) enhanced Smart Cities by connecting individuals based on their automatically updated profile via context-based routing.
Proceedings Article
Visual orchestration and autonomous execution of distributed and heterogeneous computational biology pipelines
TL;DR: A declarative meta-language, called VisFlow, for requirement specification, and a translator for mapping requirements into executable queries in a variant of SQL augmented with integration artifacts are presented.
Proceedings Article
Colledge: a vision of collaborative knowledge networks
TL;DR: This work combines all the pieces, envisioning a social knowledge network that enables collaborative knowledge generation and exchange by extracting information at query time, personalizing queries, and integrating user feedback.