Open Access Proceedings Article

Large-scale Semantic Parsing via Schema Matching and Lexicon Extension

Qingqing Cai, +1 more
- pp 423-433
TLDR
A semantic parser for Freebase is developed based on a reduction to standard supervised training algorithms, schema matching, and pattern learning that is capable of parsing questions with an F1 that improves by 0.42 over a purely-supervised learning algorithm.
Abstract
Supervised training procedures for semantic parsers produce high-quality semantic parsers, but they have difficulty scaling to large databases because of the sheer number of logical constants for which they must see labeled training data. We present a technique for developing semantic parsers for large databases based on a reduction to standard supervised training algorithms, schema matching, and pattern learning. Leveraging techniques from each of these areas, we develop a semantic parser for Freebase that is capable of parsing questions with an F1 that improves by 0.42 over a purely-supervised learning algorithm.

Citations
Book Chapter

VQuAnDa: Verbalization QUestion ANswering DAtaset

TL;DR: This work aims to fill a gap in Question Answering datasets by providing VQuAnDa, the first QA dataset that includes a verbalization of each answer. It is based on a commonly used large-scale QA dataset, LC-QuAD, to support compatibility and continuity with previous work.
Proceedings Article

Natural Language Data Management and Interfaces: Recent Development and Open Challenges

TL;DR: This tutorial explores two areas of overlap that are especially relevant to the database community: (1) managing natural language text data in a relational database, and (2) developing natural language interfaces to databases.
Proceedings Article

Meta-Learning for Domain Generalization in Semantic Parsing

TL;DR: The authors use a meta-learning framework that targets zero-shot domain generalization for semantic parsing. It constructs virtual train and test sets from disjoint domains, which encourages a parser to generalize to unseen target domains.
Journal Article

Poisoning Web-Scale Training Datasets is Practical

TL;DR: In this paper, the authors introduce two new dataset poisoning attacks that intentionally introduce malicious examples to degrade a model's performance. The attacks are immediately practical and could, today, poison 10 popular datasets.
Posted Content

Logician: A Unified End-to-End Neural Approach for Open-Domain Information Extraction

TL;DR: Li et al. propose a unified knowledge expression form, SAOKE, to express four types of valuable intermediate structures (Relation, Attribute, Description, and Concept), and train an end-to-end neural model called Logician, using the sequence-to-sequence paradigm, to transform sentences into facts.
References
Proceedings Article

Freebase: a collaboratively created graph database for structuring human knowledge

TL;DR: MQL provides an easy-to-use object-oriented interface to the tuple data in Freebase and is designed to facilitate the creation of collaborative, Web-based data-oriented applications.
Journal Article

A survey of approaches to automatic schema matching

TL;DR: A taxonomy is presented that distinguishes between schema-level and instance-level, element-level and structure-level, and language-based and constraint-based matchers. It is intended to be useful when comparing different approaches to schema matching, when developing a new match algorithm, and when implementing a schema matching component.
Proceedings Article

Toward an architecture for never-ending language learning

TL;DR: This work proposes an approach and a set of design principles for an intelligent computer agent that runs forever, and describes a partial implementation of such a system that has already learned to extract a knowledge base containing over 242,000 beliefs.
Proceedings Article

Open information extraction from the web

TL;DR: Open Information Extraction (OIE) is a new extraction paradigm in which the system makes a single data-driven pass over its corpus and extracts a large set of relational tuples without requiring any human input.
Proceedings Article

Identifying Relations for Open Information Extraction

TL;DR: Two simple syntactic and lexical constraints on binary relations expressed by verbs are introduced in the ReVerb Open IE system, which more than doubles the area under the precision-recall curve relative to previous extractors such as TextRunner and WOEpos.