
Showing papers by "Hiroyuki Shindo published in 2017"


Journal ArticleDOI
TL;DR: This paper proposes a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities, achieving state-of-the-art results on sentence textual similarity, entity linking, and factoid question answering.
Abstract: We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address various NLP tasks with ease. We train the model using a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluated the model on three important NLP tasks (i.e., sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings. As a result, we achieved state-of-the-art results on all three of these tasks. Our code and trained models are publicly available for further academic research.
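The training signal described in the abstract, scoring KB entities against a text representation, can be illustrated with a minimal sketch. The embedding tables, vocabulary, and entity names below are hypothetical toy values, not the paper's learned parameters:

```python
import random

random.seed(0)
DIM = 8

def rand_vec():
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]

# Toy embedding tables; in the paper these are learned jointly from
# Wikipedia texts and their entity annotations.
word_vecs = {w: rand_vec() for w in ["the", "theory", "of", "relativity"]}
entity_vecs = {e: rand_vec() for e in ["Albert_Einstein", "Niels_Bohr"]}

def text_vec(words):
    # Represent a text as the average of its word vectors.
    vecs = [word_vecs[w] for w in words]
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_entities(words):
    # Score every KB entity against the text; training would push the
    # entities annotated for this text to the top of the ranking.
    t = text_vec(words)
    return sorted(entity_vecs, key=lambda e: dot(t, entity_vecs[e]),
                  reverse=True)

print(rank_entities(["theory", "of", "relativity"]))
```

In the actual model the two embedding spaces are trained so that relevant text/entity pairs receive high dot-product scores; the sketch only shows the scoring side of that objective.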

101 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: A model based on grid-type recurrent neural networks that automatically induces features sensitive to multi-predicate interactions from the word sequence information of a sentence and outperforms previous syntax-dependent models.
Abstract: The performance of Japanese predicate argument structure (PAS) analysis has improved in recent years thanks to the joint modeling of interactions between multiple predicates. However, this approach relies heavily on syntactic information predicted by parsers, and suffers from error propagation. To remedy this problem, we introduce a model that uses grid-type recurrent neural networks. The proposed model automatically induces features sensitive to multi-predicate interactions from the word sequence information of a sentence. Experiments on the NAIST Text Corpus demonstrate that without syntactic information, our model outperforms previous syntax-dependent models.

26 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This paper proposes an approach that exploits shared information between morphosyntactic tagging tasks by jointly modeling them with a multi-task learning framework, along with a method of incorporating tag dictionary information into the neural models by combining word representations with representations of the sets of possible tags.
Abstract: Part-of-speech (POS) tagging for morphologically rich languages such as Arabic is a challenging problem because of their enormous tag sets. One reason for this is that in the tagging scheme for such languages, a complete POS tag is formed by combining tags from multiple tag sets defined for each morphosyntactic category. Previous approaches in Arabic POS tagging applied one model for each morphosyntactic tagging task, without utilizing shared information between the tasks. In this paper, we propose an approach that utilizes this information by jointly modeling multiple morphosyntactic tagging tasks with a multi-task learning framework. We also propose a method of incorporating tag dictionary information into our neural models by combining word representations with representations of the sets of possible tags. Our experiments showed that the joint model with tag dictionary information results in an accuracy of 91.38% on the Penn Arabic Treebank data set, with an absolute improvement of 2.11% over the current state-of-the-art tagger.
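The tag-dictionary feature described above, combining a word representation with a representation of the word's possible tags, can be sketched as a simple concatenation. The tag set, dictionary entries, and function names below are hypothetical illustrations, not the authors' implementation:

```python
# Hypothetical tag dictionary mapping words to their possible POS tags.
TAGSET = ["NOUN", "VERB", "ADJ", "PRT"]
tag_dict = {"kitab": {"NOUN"}, "qaraa": {"VERB", "NOUN"}}

def tag_dict_vec(word):
    # Binary indicator over the tag set; all-ones when the word is
    # unseen, i.e. the dictionary provides no constraint.
    tags = tag_dict.get(word, set(TAGSET))
    return [1.0 if t in tags else 0.0 for t in TAGSET]

def word_repr(word, embedding):
    # Model input: the word embedding concatenated with the
    # representation of the word's set of possible tags.
    return embedding + tag_dict_vec(word)

print(word_repr("kitab", [0.1, 0.2]))  # [0.1, 0.2, 1.0, 0.0, 0.0, 0.0]
```

A learned embedding of the tag set (rather than a raw binary vector) would be a natural refinement, but the concatenation step is the same.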

25 citations


Posted Content
TL;DR: Describes a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities, designed to be generic and able to address various NLP tasks with ease.
Abstract: We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address various NLP tasks with ease. We train the model using a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluated the model on three important NLP tasks (i.e., sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings. As a result, we achieved state-of-the-art results on all three of these tasks. Our code and trained models are publicly available for further academic research.

23 citations


Proceedings Article
01 Nov 2017
TL;DR: This work presents Segment-level Neural CRF, which combines neural networks with a linear chain CRF for segment-level sequence modeling tasks such as named entity recognition (NER) and syntactic chunking.
Abstract: We present Segment-level Neural CRF, which combines neural networks with a linear chain CRF for segment-level sequence modeling tasks such as named entity recognition (NER) and syntactic chunking. Our segment-level CRF can consider higher-order label dependencies compared with conventional word-level CRF. Since it is difficult to consider all possible variable-length segments, our method uses a segment lattice constructed from the word-level tagging model to reduce the search space. Performing experiments on NER and chunking, we demonstrate that our method outperforms conventional word-level CRF with neural networks.
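The segment-lattice pruning idea can be sketched as follows: enumerate only spans up to a maximum length whose words a word-level model marks as plausible segment material. The function name, mask representation, and example sentence are illustrative assumptions, not the authors' implementation:

```python
def build_segment_lattice(words, candidate_mask, max_len=3):
    """Enumerate candidate segments (start, end) of length <= max_len,
    keeping only spans fully covered by the word-level model's candidate
    mask. This prunes the exponential space of variable-length segments
    that an exact segment-level CRF would otherwise have to score."""
    segments = []
    n = len(words)
    for i in range(n):
        for j in range(i, min(i + max_len, n)):
            if all(candidate_mask[i:j + 1]):
                segments.append((i, j))
    return segments

lattice = build_segment_lattice(
    ["Barack", "Obama", "visited", "Tokyo"],
    [True, True, False, True])
print(lattice)  # [(0, 0), (0, 1), (1, 1), (3, 3)]
```

The CRF then scores labelings over this reduced lattice instead of over all O(n * max_len) spans regardless of plausibility.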

23 citations


Proceedings ArticleDOI
03 Jun 2017
TL;DR: This paper presents a new proposal in which neural networks with an attention mechanism are applied to all the civil law articles when deciding the truth value of the query t2, taking every article into account by computing their weighted sum.
Abstract: This year’s COLIEE has two tasks, called phases 1 and 2. Phase 1 requires finding the relevant article given a query t2, and phase 2 requires answering yes or no to the given query t2 according to the Japanese civil law articles. This paper presents our proposals for the phase 2 task. Two methods are presented. The first follows the standard approach taken by many authors: the relevant article t1 is selected by its similarity to the query t2 at the requirement (condition) and effect (conclusion) descriptions of the articles. The second is our new proposal, in which neural networks with an attention mechanism are applied to all the civil law articles when deciding the truth value of the query t2. This method takes all the articles into account by properly computing their weighted sum.
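The attention-based weighted sum over articles can be sketched generically: score each article against the query, normalize the scores with a softmax, and sum the article vectors under those weights. This is a minimal illustration of attention with toy vectors, not the paper's trained network:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attend(query_vec, article_vecs):
    # Attention over all articles: dot-product score per article,
    # softmax normalization, then a weighted sum of article vectors,
    # so every article contributes in proportion to its relevance.
    scores = [sum(q * a for q, a in zip(query_vec, av))
              for av in article_vecs]
    weights = softmax(scores)
    dim = len(query_vec)
    context = [sum(w * av[d] for w, av in zip(weights, article_vecs))
               for d in range(dim)]
    return weights, context

weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In the paper's setting, the context vector summarizing all articles would feed a classifier that decides yes/no for the query.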

14 citations


Proceedings Article
01 Nov 2017
TL;DR: A neural network model for coordination boundary detection that improves identification of clause-level coordination using bidirectional RNNs that incorporate two properties of conjuncts, similarity and replaceability, as features.
Abstract: We propose a neural network model for coordination boundary detection. Our method relies on two common properties of conjuncts, similarity and replaceability, in order to detect both similar and dissimilar pairs of conjuncts. The model improves identification of clause-level coordination using bidirectional RNNs that incorporate the two properties as features. We show that our model outperforms the existing state-of-the-art methods on the coordination-annotated Penn Treebank and Genia corpora without any syntactic information from parsers.
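The conjunct-similarity property can be illustrated with a minimal sketch that compares the averaged word vectors of two candidate conjunct spans; in the paper this signal is captured inside bidirectional RNNs rather than computed directly, so the helper names below are illustrative:

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 for a zero vector.
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0

def avg(vecs):
    # Average a list of equal-length word vectors.
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def conjunct_similarity(left_vecs, right_vecs):
    # Similarity feature for a candidate conjunct pair ("A and B"):
    # cosine between the averaged word vectors of the two spans.
    return cosine(avg(left_vecs), avg(right_vecs))
```

A high score suggests the two spans are plausible conjuncts of the same coordination; the replaceability property would add a complementary signal for dissimilar but syntactically interchangeable conjuncts.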

10 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work constructs a corpus that ensures consistency between dependency structures and MWEs, including named entities, and explores models that predict both MWE-spans and an MWE-aware dependency structure.
Abstract: Because syntactic structures and spans of multiword expressions (MWEs) are independently annotated in many English syntactic corpora, they are generally inconsistent with respect to one another, which is harmful to the implementation of an aggregate system. In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including named entities. Further, we explore models that predict both MWE-spans and an MWE-aware dependency structure. Experimental results show that our joint model using additional MWE-span features achieves an MWE recognition improvement of 1.35 points over a pipeline model.

4 citations


Posted Content
TL;DR: Proposes an approach wherein the outputs of multiple neural network classifiers are combined using a supervised machine learning model to enhance the ranking results in entity retrieval tasks.
Abstract: This paper describes our approach for the triple scoring task at the WSDM Cup 2017. The task required participants to assign a relevance score for each pair of entities and their types in a knowledge base in order to enhance the ranking results in entity retrieval tasks. We propose an approach wherein the outputs of multiple neural network classifiers are combined using a supervised machine learning model. The experimental results showed that our proposed method achieved the best performance in one out of three measures (i.e., Kendall's tau), and performed competitively in the other two measures (i.e., accuracy and average score difference).
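The supervised combination of classifier outputs can be sketched as a linear stacking model. The weights and probabilities below are toy values, and the actual meta-model used in the paper may differ:

```python
def combine_scores(classifier_probs, weights, bias=0.0):
    # Linear meta-model over the base classifiers' outputs: each base
    # neural classifier emits a relevance probability for an
    # (entity, type) pair, and the supervised combiner (with weights
    # learned on held-out data) blends them into one relevance score.
    return bias + sum(w * p for w, p in zip(weights, classifier_probs))

# Three base classifiers agree the triple is relevant; the combiner
# trusts the second classifier most.
score = combine_scores([0.9, 0.8, 0.7], weights=[0.2, 0.5, 0.3])
print(round(score, 2))  # 0.79
```

Any supervised regressor could play the combiner's role; a weighted linear blend is simply the smallest instance of the stacking pattern.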

Book ChapterDOI
17 Apr 2017
TL;DR: A novel model based on the long short-term memory network that formulates the detection of coordinations as a ranking problem, fully exploiting the differences between training instances to remedy data imbalance.
Abstract: Coordinations refer to phrases such as “A and/but/or/... B”. Their detection remains a major problem due to the complexity of their components. Existing work typically classified training instances into two categories, correct and incorrect, which often caused data imbalance and inevitably hurt the performance of the models used. To remedy this problem, we propose to fully exploit the differences between training instances by formulating the detection of coordinations as a ranking problem. We develop a novel model based on the long short-term memory network. Experiments on the Penn Treebank and Genia corpora verified the effectiveness of the proposed model.
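The ranking formulation can be illustrated with a standard pairwise hinge loss, under the assumption (ours, not necessarily the paper's exact objective) that the correct coordination candidate should outscore each incorrect candidate by a margin:

```python
def pairwise_ranking_loss(score_correct, score_incorrect, margin=1.0):
    # Hinge-style ranking loss: zero once the correct coordination
    # boundary outscores the incorrect candidate by at least `margin`.
    # Because candidates are compared in pairs, the imbalance between
    # one correct span and many incorrect spans no longer skews
    # training the way binary classification does.
    return max(0.0, margin - (score_correct - score_incorrect))

# Correct candidate already wins by more than the margin: no loss.
print(pairwise_ranking_loss(2.0, 0.5))  # 0.0
# Correct candidate barely ahead: the model is still penalized.
print(pairwise_ranking_loss(0.5, 0.4))
```

Training would sum this loss over pairs of (correct, incorrect) candidate boundaries scored by the LSTM.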