Open Access | Journal Article

Knowledge elicitation via sequential probabilistic inference for high-dimensional prediction

TLDR
In this paper, the authors formulate knowledge elicitation as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions, and propose an algorithm and computational approximation for fast and efficient interaction.
Abstract
Prediction in a small-sized sample with a large number of covariates, the “small n, large p” problem, is challenging. This setting is encountered in multiple applications, such as in precision medicine, where obtaining additional data can be extremely costly or even impossible, and extensive research effort has recently been dedicated to finding principled solutions for accurate prediction. However, a valuable source of additional information, domain experts, has not yet been efficiently exploited. We formulate knowledge elicitation generally as a probabilistic inference process, where expert knowledge is sequentially queried to improve predictions. In the specific case of sparse linear regression, where we assume the expert has knowledge about the relevance of the covariates, or of values of the regression coefficients, we propose an algorithm and computational approximation for fast and efficient interaction, which sequentially identifies the most informative features on which to query expert knowledge. Evaluations of the proposed method in experiments with simulated and real users show improved prediction accuracy already with a small effort from the expert.
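
The sequential querying idea in the abstract can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the paper: a plain Gaussian prior on the regression coefficients instead of a sparsity-inducing prior, exact conjugate updates instead of the authors' computational approximation, and an expert who reports a noisy value of one coefficient per query. The function names (`posterior`, `most_informative_feature`, `update_with_feedback`, `elicit`) and the parameters `noise_var`, `prior_var`, and `feedback_var` are hypothetical. In this simplified setting, the information gained about the coefficients by asking about feature j is 0.5 * log(1 + Sigma_jj / feedback_var), so the most informative query is the coefficient with the largest posterior variance.

```python
import numpy as np

def posterior(X, y, noise_var=1.0, prior_var=1.0):
    """Gaussian posterior over weights for Bayesian linear regression (assumed model)."""
    p = X.shape[1]
    precision = X.T @ X / noise_var + np.eye(p) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def most_informative_feature(cov, feedback_var, asked):
    """Pick the unasked feature whose feedback carries the most information about w."""
    gain = 0.5 * np.log1p(np.diag(cov) / feedback_var)
    for j in asked:
        gain[j] = -np.inf  # never ask about the same feature twice
    return int(np.argmax(gain))

def update_with_feedback(mean, cov, j, value, feedback_var):
    """Condition the Gaussian posterior on feedback value = w_j + noise."""
    c = cov[:, j]
    k = c / (cov[j, j] + feedback_var)
    return mean + k * (value - mean[j]), cov - np.outer(k, c)

def elicit(X, y, expert, n_queries, noise_var=1.0, prior_var=1.0, feedback_var=0.01):
    """Query the expert n_queries times, one feature per round."""
    mean, cov = posterior(X, y, noise_var, prior_var)
    asked = []
    for _ in range(n_queries):
        j = most_informative_feature(cov, feedback_var, asked)
        mean, cov = update_with_feedback(mean, cov, j, expert(j), feedback_var)
        asked.append(j)
    return mean, cov, asked

# Hypothetical usage: 10 samples, 30 covariates, a simulated expert who knows the truth.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 30))
w_true = np.zeros(30)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.normal(size=10)
mean, cov, asked = elicit(X, y, lambda j: w_true[j], n_queries=5)
```

Each round shrinks the posterior covariance by a rank-one term, so uncertainty about the coefficients can only decrease as the expert answers more queries. A closer reproduction of the paper's setting would replace the Gaussian prior with a sparse one and score queries by their expected effect on predictions rather than on the coefficients.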


Citations
Journal Article

Improving genomics-based predictions for precision medicine through active elicitation of expert knowledge

TL;DR: This paper proposes a probabilistic framework for incorporating expert feedback about the impact of genomic measurements on the outcome of interest, and presents a novel approach to collecting the feedback efficiently, based on Bayesian experimental design.
Proceedings Article

User Modelling for Avoiding Overfitting in Interactive Knowledge Elicitation for Prediction

TL;DR: In this paper, a user modelling methodology is proposed to guard against double use of data and overfitting in human-in-the-loop machine learning, where the user provides information beyond that in the training data.
Proceedings Article

Teacher-aware active robot learning

TL;DR: This paper proposes a learning strategy that aims to minimize the user's workload by taking into account the flow of the questions. It reports results from both the robot's performance and the human teacher's perspective, and observes that the hybrid strategy is a good compromise between learning performance and the user's experienced workload.
Journal Article

Preference Elicitation within Framework of Fully Probabilistic Design of Decision Strategies

TL;DR: The paper provides a general choice of the best ideal closed-loop model reflecting the agent's preferences, and illustrates the general solution on a regulation task with a linear Gaussian model describing the agent's environment.

A Survey of Domain Knowledge Elicitation in Applied Machine Learning

TL;DR: This article developed a taxonomy to characterize elicitation approaches according to the elicitation goal, elicitation target, and use of elicited knowledge, and identified opportunities for adding rigor to these elicitation methods.