Reporting bias and knowledge acquisition
Jonathan Gordon, Benjamin Van Durme
pp. 25–30
TLDR
This paper questions the idea that the frequency with which people write about actions, outcomes, or properties is a reflection of real-world frequencies or the degree to which a property is characteristic of a class of individuals.

Abstract
Much work in knowledge extraction from text tacitly assumes that the frequency with which people write about actions, outcomes, or properties is a reflection of real-world frequencies or the degree to which a property is characteristic of a class of individuals. In this paper, we question this idea, examining the phenomenon of reporting bias and the challenge it poses for knowledge extraction. We conclude with discussion of approaches to learning commonsense knowledge from text despite this distortion.
Citations
Proceedings Article
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
TL;DR: This investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs, and suggests that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.
Book Chapter
Women Also Snowboard: Overcoming Bias in Captioning Models
TL;DR: The authors propose a new Equalizer model that encourages equal gender probability when gender evidence is occluded in a scene and confident predictions when gender evidence is present; it can be added to any description model to mitigate the impact of unwanted bias in a description dataset.
Proceedings Article
Social IQa: Commonsense Reasoning about Social Interactions
TL;DR: Social IQa is a large-scale benchmark for commonsense reasoning about social situations, containing 38,000 multiple-choice questions for probing emotional and social intelligence in a variety of everyday situations.
Posted Content
WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale
TL;DR: The authors introduced WinoGrande, a large-scale dataset of 44k problems, inspired by the original Winograd Schema Challenge (WSC) design, but adjusted to improve both the scale and the hardness of the dataset.
Posted Content
HellaSwag: Can a Machine Really Finish Your Sentence?
TL;DR: HellaSwag is a commonsense NLP dataset in which a series of discriminators iteratively selects an adversarial set of machine-generated wrong answers. The key insight is to scale up the length and complexity of the dataset examples toward a critical "Goldilocks" zone where generated text is ridiculous to humans yet often misclassified by state-of-the-art models.
References
Journal Article
WordNet : an electronic lexical database
TL;DR: The book presents the WordNet lexical database, including chapters on nouns in WordNet, a semantic network of English verbs, and applications of WordNet such as building semantic concordances.
Book Chapter
Logic and Conversation
TL;DR: Grice was interested in Quine's logical approach to language, although he differed from Quine over certain specific questions, such as the viability of the distinction between analytic and synthetic statements.
Book
Computational analysis of present-day American English
Henry Kučera, W. Nelson Francis, W. F. Twaddell, Mary Lois Marckworth, Laura M. Bell, John Bissell Carroll, et al.
Proceedings Article
Generating Typed Dependency Parses from Phrase Structure Parses
TL;DR: A system for extracting typed dependency parses of English sentences from phrase structure parses that captures inherent relations occurring in corpus texts that can be critical in real-world applications is described.