Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

doi:10.1007/S10994-017-5651-7

Open Access Journal Article

Pedram Daee, +3 more

- 12 Jul 2017 -

Machine Learning

- Vol. 106, Iss: 9, pp 1599-1620

TLDR

In this paper, the authors formulate knowledge elicitation as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions, and propose an algorithm and computational approximation for fast and efficient interaction.

Abstract:

Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.
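The sequential querying idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a plain Gaussian prior instead of a sparsity-inducing one, assumes the expert reports a noisy value of a single regression coefficient, and selects the next feature by the closed-form Gaussian information gain. The function names (`posterior`, `next_query`) and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np


def posterior(X, y, noise_var, prior_var, fb_idx, fb_val, fb_var):
    """Gaussian posterior over weights given data and coefficient feedback.

    Assumed model (simplified): w ~ N(0, prior_var * I),
    y = X w + noise, and expert feedback f_j = w_j + feedback noise.
    """
    p = X.shape[1]
    precision = np.eye(p) / prior_var + X.T @ X / noise_var
    rhs = X.T @ y / noise_var
    for j, f in zip(fb_idx, fb_val):
        # Each answer acts like one extra noisy observation of w_j.
        precision[j, j] += 1.0 / fb_var
        rhs[j] += f / fb_var
    cov = np.linalg.inv(precision)
    return cov @ rhs, cov


def next_query(cov, fb_var, asked):
    """Pick the unasked feature with the largest expected information gain.

    For a Gaussian posterior, observing w_j with noise variance fb_var
    yields a gain of 0.5 * log(1 + cov[j, j] / fb_var).
    """
    gain = 0.5 * np.log1p(np.diag(cov) / fb_var)
    for j in asked:
        gain[j] = -np.inf  # never ask about the same feature twice
    return int(np.argmax(gain))


# Simulated "small n, large p" data with a simulated expert (all assumed).
rng = np.random.default_rng(0)
n, p = 10, 30
w_true = np.zeros(p)
w_true[:3] = [2.0, -1.5, 1.0]
X = rng.normal(size=(n, p))
y = X @ w_true + rng.normal(scale=0.5, size=n)
noise_var, prior_var, fb_var = 0.25, 1.0, 0.01

asked, answers = [], []
mean0, cov0 = posterior(X, y, noise_var, prior_var, asked, answers, fb_var)
mean, cov = mean0, cov0
for _ in range(5):
    j = next_query(cov, fb_var, asked)
    asked.append(j)
    # The simulated expert knows the true coefficient up to small noise.
    answers.append(w_true[j] + rng.normal(scale=np.sqrt(fb_var)))
    mean, cov = posterior(X, y, noise_var, prior_var, asked, answers, fb_var)

print("queried features:", asked)
print("total posterior variance before/after:", np.trace(cov0), np.trace(cov))
```

In this Gaussian simplification the rule reduces to querying the coefficient with the largest posterior variance. The paper instead works with a sparse regression model and a computational approximation, so this sketch only conveys the overall loop of query, answer, and posterior update.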

Citations

Journal Article

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge

Iiris Sundin, +14 more

TL;DR: In this paper, a probabilistic framework is proposed to incorporate expert feedback about the impact of genomic measurements on the outcome of interest, together with a novel approach for collecting the feedback efficiently, based on Bayesian experimental design.

Proceedings Article

User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

Pedram Daee, +3 more

TL;DR: In this paper, a user modelling methodology is proposed to guard against double use of data and overfitting in human-in-the-loop machine learning, where the user provides information beyond that in the training data.

Proceedings Article

Teacher-aware active robot learning

Mattia Racca, +2 more

TL;DR: This paper proposes a learning strategy that aims to minimize the user's workload by taking into account the flow of the questions and reports results from both the robot's performance and the human teacher's perspectives, observing how the hybrid strategy represents a good compromise between learning performance and user's experienced workload.

Journal Article

Preference Elicitation within Framework of Fully Probabilistic Design of Decision Strategies

Miroslav Kárný, +1 more

- 01 Jan 2019 -

IFAC-PapersOnLine

TL;DR: The paper provides a general choice of the best ideal closed-loop model reflecting the agent's preferences and illustrates the general solution on a regulation task with a linear Gaussian model describing the agent's environment.

A Survey of Domain Knowledge Elicitation in Applied Machine Learning

Daniel Kerrigan, +2 more

TL;DR: This article developed a taxonomy to characterize elicitation approaches according to the elicitation goal, elicitation target, and use of elicited knowledge, and identified opportunities for adding rigor to these elicitation methods.

References

Book

Pattern Recognition and Machine Learning

Christopher M. Bishop

TL;DR: Probability distributions, linear models for regression, linear models for classification, neural networks, graphical models, mixture models and EM, sampling methods, continuous latent variables, and sequential data are studied.

Book

Bayesian Data Analysis

Andrew Gelman, +5 more

TL;DR: Detailed notes on Bayesian computation, basics of Markov chain simulation, regression models, and asymptotic theorems are provided.

Active Learning Literature Survey

Burr Settles

TL;DR: This report provides a general introduction to active learning and a survey of the literature, including a discussion of the scenarios in which queries can be formulated, and an overview of the query strategy frameworks proposed in the literature to date.

Journal Article

Variable selection via Gibbs sampling

Edward I. George, +1 more

- 01 Sep 1993 -

Journal of the American Statistical Asso...

TL;DR: In this paper, the Gibbs sampler is used to indirectly sample from the multinomial posterior distribution on the set of possible subset choices to identify the promising subsets by their more frequent appearance in the Gibbs sample.

Proceedings Article

Biographies, Bollywood, Boom-boxes and Blenders: Domain Adaptation for Sentiment Classification

John Blitzer, +2 more

TL;DR: This work extends to sentiment classification the recently-proposed structural correspondence learning (SCL) algorithm, reducing the relative error due to adaptation between domains by an average of 30% over the original SCL algorithm and 46% over a supervised baseline.
