Been Kim

Researcher at Google

Publications -  75
Citations -  13180

Been Kim is an academic researcher at Google. The author has contributed to research on topics including interpretability and computer science. The author has an h-index of 38 and has co-authored 70 publications receiving 8631 citations. Previous affiliations of Been Kim include Massachusetts Institute of Technology and the Allen Institute for Artificial Intelligence.

Papers
Posted Content

Learning About Meetings

TL;DR: This work provides tentative evidence that it is possible to automatically detect when a key decision is taking place during a meeting, and that it is often possible to predict whether a proposal made during a meeting will be accepted or rejected based entirely on the language used by the speaker.
Journal Article

Inferring team task plans from human meetings: a generative modeling approach with logic-based prior

TL;DR: In this article, a hybrid approach combines probabilistic generative modeling with logical plan validation, which is used to compute a highly structured prior over possible plans. This makes it possible to perform inference over a large solution space with only a small amount of noisy data from the team planning session.
Journal Article

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models

TL;DR: This article showed that representation denoising does not provide any insight into which model MLP layer would be best to edit in order to override an existing stored fact with a new one.
Journal Article

Impossibility Theorems for Feature Attribution

TL;DR: The authors show that for moderately rich model classes (easily satisfied by neural networks), any feature attribution method that is complete and linear (e.g., Integrated Gradients and SHAP) can provably fail to improve on random guessing for inferring model behavior.
Posted Content

Interpreting Black Box Predictions using Fisher Kernels

TL;DR: The authors use Fisher kernels as the defining feature embedding of each data point and combine them with Sequential Bayesian Quadrature (SBQ) to efficiently select training examples that explain a black-box model's test predictions.