Open Access Journal Article

Pushing the boundaries of crowd-enabled databases with query-driven schema expansion

TLDR
This paper extends crowd-enabled databases with flexible query-driven schema expansion, allowing new attributes to be added to the database at query time, and leverages the user-generated data found in the Social Web to build perceptual spaces.
Abstract
By incorporating human workers into the query execution process, crowd-enabled databases facilitate intelligent, social capabilities like completing missing data at query time or performing cognitive operators. But despite all their flexibility, crowd-enabled databases still maintain rigid schemas. In this paper, we extend crowd-enabled databases by flexible query-driven schema expansion, allowing the addition of new attributes to the database at query time. However, the number of crowd-sourced mini-tasks needed to fill in missing values may often be prohibitively large, and the resulting data quality is doubtful. Instead of simple crowd-sourcing to obtain all values individually, we leverage the user-generated data found in the Social Web: by exploiting user ratings, we build perceptual spaces, i.e., highly compressed representations of opinions, impressions, and perceptions of large numbers of users. Using a few training samples obtained by expert crowd-sourcing, we can then extract all missing data automatically from the perceptual space with high quality and at low cost. Extensive experiments show that our approach can boost both the performance and the quality of crowd-enabled databases, while also providing the flexibility to expand schemas in a query-driven fashion.
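
The pipeline outlined in the abstract (build a perceptual space from user ratings, then learn a new attribute from a few labeled items and extract it for all others) can be illustrated with a minimal sketch. The abstract does not state which factorization or learner is used, so the truncated SVD and ridge regression below are stand-ins chosen for brevity, and the function names `build_perceptual_space` and `expand_attribute` as well as the synthetic data are hypothetical.

```python
import numpy as np

def build_perceptual_space(ratings, dims):
    """Embed items in a low-dimensional space (assumption: truncated SVD
    of a dense user-by-item rating matrix stands in for the real method)."""
    # Center each item's ratings so the embedding reflects differences in
    # perception rather than average rating level.
    centered = ratings - ratings.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    # One row of coordinates per item, shape (n_items, dims).
    return (vt[:dims] * s[:dims, None]).T

def expand_attribute(space, labeled_idx, labeled_values, ridge=1e-3):
    """Fit ridge regression on a few labeled items (e.g. expert judgments)
    and predict the new attribute for every item in the space."""
    X = np.hstack([space, np.ones((space.shape[0], 1))])  # add bias column
    Xl = X[labeled_idx]
    w = np.linalg.solve(Xl.T @ Xl + ridge * np.eye(X.shape[1]),
                        Xl.T @ np.asarray(labeled_values))
    return X @ w

# Synthetic demo: 40 items with two hidden perceptual traits, 150 users.
rng = np.random.default_rng(0)
traits = rng.uniform(0, 1, size=(40, 2))
prefs = rng.normal(size=(150, 2))
ratings = prefs @ traits.T + 0.05 * rng.normal(size=(150, 40))

space = build_perceptual_space(ratings, dims=2)
labeled = np.arange(10)                      # only 10 items are labeled
predicted = expand_attribute(space, labeled, traits[labeled, 0])
print(np.corrcoef(predicted, traits[:, 0])[0, 1])
```

In this toy setting the new attribute is, by construction, a linear function of the latent item coordinates, which is why a handful of labels suffices; real rating data is sparse and noisy, so a production system would need a factorization that handles missing ratings and a more robust learner.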



Citations
Proceedings Article

"I would like to watch something like 'The Terminator'…" Cooperative Query Personalization Based on Perceptual Similarity

TL;DR: This paper presents a privacy-preserving query personalization system for experience items like movies, music, games, or books; to steer the query process, it relies on a high-dimensional feature space mined from the rating data of a large number of users.
Journal Article

Exploratory product search using top-k join queries

TL;DR: Algorithms for processing exploratory top-k joins that adopt the pull-bound framework for rank-join processing are presented and a novel algorithm (XRJN) which employs a more efficient bounding scheme and allows earlier termination of query processing is introduced.
Book Chapter

Making Collective Wisdom Wiser

TL;DR: This talk gives an overview of the MoDaS project, which investigates how database technology can be put to work to effectively gather information from the public, efficiently moderate the process, and identify questionable input with minimal human interaction.

Perceptual relational attributes: Navigating and discovering shared perspectives from user-generated reviews

TL;DR: This paper discusses how to use unstructured user reviews to build a structured semantic representation of database items such that these perceptual attributes are represented and usable for navigational queries and argues that a central challenge when extracting perceptual attributes from social judgments is respecting the subjectivity of expressed opinions.
References
Journal Article

Indexing by Latent Semantic Analysis

TL;DR: A new method for automatic indexing and retrieval that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Journal Article

A tutorial on support vector regression

TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Journal Article

Learning from Imbalanced Data

TL;DR: A critical review of the nature of the problem, the state-of-the-art technologies, and the current assessment metrics used to evaluate learning performance under the imbalanced learning scenario is provided.
Proceedings Article

Support Vector Regression Machines

TL;DR: This work compares support vector regression (SVR) with a committee regression technique (bagging) based on regression trees and with ridge regression done in feature space, and expects that SVR will have advantages in high-dimensional spaces because SVR optimization does not depend on the dimensionality of the input space.
Book

Semi-Supervised Learning

TL;DR: Semi-supervised learning (SSL) is the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given).