Journal ArticleDOI

Knowledge graph augmented advanced learning models for commonsense reasoning

01 Jan 2021-Vol. 1022, Iss: 1, pp 012038
TL;DR: This paper proposes to answer commonsense questions with a textual inference framework that consults external, structured commonsense knowledge graphs for explainable inference, and reports state-of-the-art accuracy on CommonsenseQA, a large commonsense reasoning benchmark, using ConceptNet as the only external resource for BERT-based models.
Abstract: Machine learning is the key to many AI problems, but learned models depend heavily on their specific training data. While a Bayesian setup can combine a learner with prior knowledge, such models still cannot access structured world knowledge on demand. The primary objective is to enable machines to estimate and make presumptions in ordinary, everyday situations the way humans do. In this paper we propose to answer such commonsense questions with a textual inference framework that consults external, structured commonsense knowledge graphs for explainable inference. For each question-answer pair, the framework grounds a schema graph, a linked subgraph of the external knowledge base, mapping the problem from the semantic space of text into the symbolic space of the knowledge graph. A novel graph network module then encodes the schema graph and reasons over its graph representation. Our model is built on LSTMs and graph networks with hierarchical attention-based pooling; it is flexible and interpretable through its intermediate attention scores, which make its predictions more trustworthy. We also achieve state-of-the-art accuracy on CommonsenseQA, a large commonsense reasoning benchmark, using ConceptNet as the only external resource for BERT-based models.
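To make the architecture above concrete (LSTM encoding of concept paths, a graph network over the grounded schema graph, and hierarchical attention-based pooling), here is a minimal sketch. It assumes PyTorch; the module names, tensor shapes, and wiring are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class SchemaGraphScorer(nn.Module):
    """Scores one (question, answer) pair using its grounded schema graph.

    Illustrative only: an LSTM encodes concept paths, a one-layer graph
    network encodes schema-graph nodes, and attention pools the paths
    into a single plausibility score.
    """

    def __init__(self, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.path_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.gcn_weight = nn.Linear(emb_dim, hidden_dim)
        self.attn = nn.Linear(hidden_dim, 1)          # attention over paths
        self.score = nn.Linear(2 * hidden_dim, 1)     # graph + path summary -> score

    def forward(self, path_tokens, node_feats, adj):
        # path_tokens: (num_paths, path_len, emb_dim) concept/relation embeddings
        # node_feats:  (num_nodes, emb_dim) schema-graph node embeddings
        # adj:         (num_nodes, num_nodes) normalized adjacency matrix
        _, (h_n, _) = self.path_lstm(path_tokens)     # h_n: (1, num_paths, hidden)
        path_vecs = h_n.squeeze(0)                    # (num_paths, hidden)

        # One round of message passing over the schema graph.
        node_vecs = F.relu(adj @ self.gcn_weight(node_feats))   # (num_nodes, hidden)
        graph_vec = node_vecs.mean(dim=0)                       # (hidden,)

        # Hierarchical attention: weight paths, then pool.
        alpha = torch.softmax(self.attn(path_vecs), dim=0)      # (num_paths, 1)
        path_summary = (alpha * path_vecs).sum(dim=0)           # (hidden,)

        return self.score(torch.cat([graph_vec, path_summary], dim=-1))


# Toy usage with random inputs.
model = SchemaGraphScorer()
paths = torch.randn(4, 6, 100)      # 4 paths, 6 tokens each
nodes = torch.randn(10, 100)        # 10 schema-graph nodes
adj = torch.eye(10)                 # trivial adjacency, for illustration only
print(model(paths, nodes, adj))     # one plausibility logit

In a full system one such score would be produced per answer candidate and the candidates compared with a softmax; the intermediate attention weights over paths are what make the prediction inspectable.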
References
Posted Content
TL;DR: A scalable approach for semi-supervised learning on graph-structured data, based on an efficient variant of convolutional neural networks that operates directly on graphs and outperforms related methods by a significant margin.
Abstract: We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.
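The propagation rule sketched in this abstract reduces, per layer, to multiplying node features by a symmetrically renormalized adjacency matrix with self-loops and then by a weight matrix. Below is a minimal dense sketch assuming PyTorch; the published implementation uses sparse operations to scale linearly in the number of edges, and the toy graph and dimensions here are illustrative.

import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """One graph convolution: H' = act(D^-1/2 (A + I) D^-1/2 H W).

    Illustrative sketch of the first-order propagation rule,
    not the reference implementation.
    """

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, feats, adj):
        # feats: (num_nodes, in_dim), adj: (num_nodes, num_nodes) binary adjacency
        a_hat = adj + torch.eye(adj.size(0))                  # add self-loops
        deg = a_hat.sum(dim=1)                                # node degrees
        d_inv_sqrt = torch.diag(deg.pow(-0.5))                # D^-1/2
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt              # renormalized adjacency
        return torch.relu(a_norm @ self.linear(feats))


# Toy two-layer GCN on a 4-node path graph.
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
x = torch.randn(4, 8)
h = GCNLayer(8, 16)(x, adj)
out = GCNLayer(16, 2)(h, adj)      # e.g. 2-class node logits
print(out.shape)                   # torch.Size([4, 2])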

15,696 citations

Posted Content
TL;DR: In this paper, the authors propose to use a soft-searching model to find the parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly.
Abstract: Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize the translation performance. The models proposed recently for neural machine translation often belong to a family of encoder-decoders and consist of an encoder that encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. With this new approach, we achieve a translation performance comparable to the existing state-of-the-art phrase-based system on the task of English-to-French translation. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
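The (soft-)search described here is additive attention: an alignment score between the decoder state and each encoder annotation is normalized with a softmax and used to form a weighted context vector. A minimal sketch, assuming PyTorch and illustrative dimensions:

import torch
import torch.nn as nn


class AdditiveAttention(nn.Module):
    """Bahdanau-style soft alignment between a decoder state and encoder states."""

    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.w_enc = nn.Linear(enc_dim, attn_dim, bias=False)
        self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, enc_states, dec_state):
        # enc_states: (src_len, enc_dim), dec_state: (dec_dim,)
        energy = self.v(torch.tanh(self.w_enc(enc_states) + self.w_dec(dec_state)))
        weights = torch.softmax(energy.squeeze(-1), dim=0)    # (src_len,) alignment
        context = weights @ enc_states                        # (enc_dim,) context vector
        return context, weights


# Toy usage: 7 source positions, random states.
attn = AdditiveAttention(enc_dim=32, dec_dim=64, attn_dim=16)
context, weights = attn(torch.randn(7, 32), torch.randn(64))
print(weights)    # soft alignment over the source sentence

The same attention pattern, applied hierarchically over paths and nodes, is what the citing paper uses to keep its graph-based reasoning interpretable.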

14,077 citations

Posted Content
TL;DR: It is found that BERT was significantly undertrained, and can match or exceed the performance of every model published after it, and the best model achieves state-of-the-art results on GLUE, RACE and SQuAD.
Abstract: Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging. Training is computationally expensive, often done on private datasets of different sizes, and, as we will show, hyperparameter choices have significant impact on the final results. We present a replication study of BERT pretraining (Devlin et al., 2019) that carefully measures the impact of many key hyperparameters and training data size. We find that BERT was significantly undertrained, and can match or exceed the performance of every model published after it. Our best model achieves state-of-the-art results on GLUE, RACE and SQuAD. These results highlight the importance of previously overlooked design choices, and raise questions about the source of recently reported improvements. We release our models and code.
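Since the citing paper fine-tunes RoBERTa-style encoders on CommonsenseQA, a minimal fine-tuning sketch may help. It assumes the Hugging Face transformers library; the checkpoint name, example question, and training details are illustrative rather than the paper's exact setup.

import torch
from transformers import RobertaTokenizer, RobertaForMultipleChoice

# Illustrative fine-tuning step for a 5-way multiple-choice question.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMultipleChoice.from_pretrained("roberta-base")

question = "Where would you find a bottle of water in a desert?"
choices = ["oasis", "grocery store", "refrigerator", "ocean", "backpack"]

# Encode the question paired with every candidate answer.
enc = tokenizer([question] * len(choices), choices,
                padding=True, return_tensors="pt")
inputs = {k: v.unsqueeze(0) for k, v in enc.items()}   # (batch=1, num_choices, seq_len)
labels = torch.tensor([0])                             # index of the assumed gold answer

outputs = model(**inputs, labels=labels)
outputs.loss.backward()                                # one gradient step of fine-tuning
print(outputs.logits)                                  # per-choice scores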

13,994 citations


"Knowledge graph augmented advanced ..." refers methods in this paper

  • ...The interesting fact is that the initial fine-tuned RoBERTa system, pre-trained on corpora much larger than BERT's, still gives the best performance on the OStest Collection....


  • ...We think that RoBERTa's fine-tuning has reached its limit, given our aforementioned analysis of failure cases in the dataset and the absence of comparative reasoning strategies....


  • ...Other methods are used for orthogonal performance improvements; the most recent submissions with public information on the leaderboard (as of November 2019) use larger additional text data and fine-tune larger pre-trained encoders such as XLNet [37] and RoBERTa [38]....


Journal ArticleDOI
TL;DR: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list.
Abstract: Standard alphabetical procedures for organizing lexical information put together words that are spelled alike and scatter words with similar or related meanings haphazardly through the list. Unfortunately, there is no obvious alternative, no other simple way for lexicographers to keep track of what has been done or for readers to find the word they are looking for. But a frequent objection to this solution is that finding things on an alphabetical list can be tedious and time-consuming. Many people who would like to refer to a dictionary decide not to bother with it because finding the information would interrupt their work and break their train of thought.
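The alternative this reference introduces is WordNet, a lexicon organized by meaning: words are grouped into synonym sets linked by semantic relations such as hypernymy. A minimal sketch of querying it, assuming NLTK is installed and its WordNet data can be downloaded:

import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)   # fetch the WordNet data once

# Look a word up by meaning rather than by spelling: each synset is one sense.
for synset in wn.synsets("mouse"):
    print(synset.name(), "-", synset.definition())

# Navigate semantic relations instead of scanning an alphabetical list.
rodent_sense = wn.synset("mouse.n.01")
print([h.name() for h in rodent_sense.hypernyms()])   # e.g. ['rodent.n.01']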

5,038 citations

Proceedings ArticleDOI
09 Jun 2008
TL;DR: MQL provides an easy-to-use object-oriented interface to the tuple data in Freebase and is designed to facilitate the creation of collaborative, Web-based data-oriented applications.
Abstract: Freebase is a practical, scalable tuple database used to structure general human knowledge. The data in Freebase is collaboratively created, structured, and maintained. Freebase currently contains more than 125,000,000 tuples, more than 4000 types, and more than 7000 properties. Public read/write access to Freebase is allowed through an HTTP-based graph-query API using the Metaweb Query Language (MQL) as a data query and manipulation language. MQL provides an easy-to-use object-oriented interface to the tuple data in Freebase and is designed to facilitate the creation of collaborative, Web-based data-oriented applications.
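An MQL query is a JSON template in which nulls and empty lists mark the slots the graph store should fill in. The sketch below only serializes such a request body in Python; the public Freebase API has since been retired, and the property names are assumptions meant to illustrate the query-by-example style.

import json

# Query-by-example in MQL: nulls and empty lists are slots Freebase fills in.
# Property names here are illustrative of the MQL style, not a verified schema.
query = [{
    "type": "/music/artist",     # restrict to nodes of this type
    "name": "The Beatles",       # constrain by name
    "album": [],                 # ask for all values of the 'album' property
    "id": None,                  # ask for the node's id
}]

# The request would have been sent to Freebase's HTTP mqlread endpoint;
# the service is now retired, so we only print the serialized body.
print(json.dumps({"query": query}, indent=2))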

4,813 citations


"Knowledge graph augmented advanced ..." refers background in this paper

  • ...Freebase: a collaboratively created graph database for structuring human knowledge (2008)....


  • ...There are also knowledge bases created by humans, such as Freebase [8] and WordNet [9]....
