
Showing papers by "Hiroyuki Shindo published in 2017"


Journal ArticleDOI
TL;DR: This paper proposes a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities, achieving state-of-the-art results on sentence textual similarity, entity linking, and factoid question answering.
Abstract: We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address various NLP tasks with ease. We train the model using a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluated the model on three important NLP tasks (i.e., sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings. As a result, we achieved state-of-the-art results on all three of these tasks. Our code and trained models are publicly available for further academic research.
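The training signal described in the abstract, scoring KB entities against a text representation, can be illustrated with a minimal sketch. The embedding tables, vocabulary, and entity names below are hypothetical toy values, not the paper's learned parameters:

```python
import random

random.seed(0)
DIM = 8

def rand_vec():
    return [random.uniform(-0.5, 0.5) for _ in range(DIM)]

# Toy embedding tables; in the paper these are learned jointly from
# Wikipedia texts and their entity annotations.
word_vecs = {w: rand_vec() for w in ["the", "theory", "of", "relativity"]}
entity_vecs = {e: rand_vec() for e in ["Albert_Einstein", "Niels_Bohr"]}

def text_vec(words):
    # Represent a text as the average of its word vectors.
    vecs = [word_vecs[w] for w in words]
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rank_entities(words):
    # Score every KB entity against the text; training would push the
    # entities annotated for this text to the top of the ranking.
    t = text_vec(words)
    return sorted(entity_vecs, key=lambda e: dot(t, entity_vecs[e]),
                  reverse=True)

print(rank_entities(["theory", "of", "relativity"]))
```

In the actual model the two embedding spaces are trained so that relevant text/entity pairs receive high dot-product scores; the sketch only shows the scoring side of that objective.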

101 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: A model based on grid-type recurrent neural networks that automatically induces features sensitive to multi-predicate interactions from the word sequence information of a sentence and outperforms previous syntax-dependent models.
Abstract: The performance of Japanese predicate argument structure (PAS) analysis has improved in recent years thanks to the joint modeling of interactions between multiple predicates. However, this approach relies heavily on syntactic information predicted by parsers, and suffers from error propagation. To remedy this problem, we introduce a model that uses grid-type recurrent neural networks. The proposed model automatically induces features sensitive to multi-predicate interactions from the word sequence information of a sentence. Experiments on the NAIST Text Corpus demonstrate that without syntactic information, our model outperforms previous syntax-dependent models.

26 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This paper proposes an approach that exploits shared information between morphosyntactic tagging tasks by jointly modeling them with a multi-task learning framework, along with a method of incorporating tag dictionary information into the neural models by combining word representations with representations of the sets of possible tags.
Abstract: Part-of-speech (POS) tagging for morphologically rich languages such as Arabic is a challenging problem because of their enormous tag sets. One reason for this is that in the tagging scheme for such languages, a complete POS tag is formed by combining tags from multiple tag sets defined for each morphosyntactic category. Previous approaches in Arabic POS tagging applied one model for each morphosyntactic tagging task, without utilizing shared information between the tasks. In this paper, we propose an approach that utilizes this information by jointly modeling multiple morphosyntactic tagging tasks with a multi-task learning framework. We also propose a method of incorporating tag dictionary information into our neural models by combining word representations with representations of the sets of possible tags. Our experiments showed that the joint model with tag dictionary information results in an accuracy of 91.38% on the Penn Arabic Treebank data set, with an absolute improvement of 2.11% over the current state-of-the-art tagger.
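The tag-dictionary feature described above, combining a word representation with a representation of the word's possible tags, can be sketched as a simple concatenation. The tag set, dictionary entries, and function names below are hypothetical illustrations, not the authors' implementation:

```python
# Hypothetical tag dictionary mapping words to their possible POS tags.
TAGSET = ["NOUN", "VERB", "ADJ", "PRT"]
tag_dict = {"kitab": {"NOUN"}, "qaraa": {"VERB", "NOUN"}}

def tag_dict_vec(word):
    # Binary indicator over the tag set; all-ones when the word is
    # unseen, i.e. the dictionary provides no constraint.
    tags = tag_dict.get(word, set(TAGSET))
    return [1.0 if t in tags else 0.0 for t in TAGSET]

def word_repr(word, embedding):
    # Model input: the word embedding concatenated with the
    # representation of the word's set of possible tags.
    return embedding + tag_dict_vec(word)

print(word_repr("kitab", [0.1, 0.2]))  # [0.1, 0.2, 1.0, 0.0, 0.0, 0.0]
```

A learned embedding of the tag set (rather than a raw binary vector) would be a natural refinement, but the concatenation step is the same.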

25 citations


Posted Content
TL;DR: Describes a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities, designed to be generic and able to address various NLP tasks with ease.
Abstract: We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train our proposed model to predict entities that are relevant to the text. Our model is designed to be generic with the ability to address various NLP tasks with ease. We train the model using a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluated the model on three important NLP tasks (i.e., sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings. As a result, we achieved state-of-the-art results on all three of these tasks. Our code and trained models are publicly available for further academic research.

23 citations


Proceedings Article
01 Nov 2017
TL;DR: This work presents Segment-level Neural CRF, which combines neural networks with a linear chain CRF for segment-level sequence modeling tasks such as named entity recognition (NER) and syntactic chunking.
Abstract: We present Segment-level Neural CRF, which combines neural networks with a linear chain CRF for segment-level sequence modeling tasks such as named entity recognition (NER) and syntactic chunking. Our segment-level CRF can consider higher-order label dependencies compared with conventional word-level CRF. Since it is difficult to consider all possible variable-length segments, our method uses a segment lattice constructed from the word-level tagging model to reduce the search space. Performing experiments on NER and chunking, we demonstrate that our method outperforms conventional word-level CRF with neural networks.
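The segment-lattice pruning idea can be sketched as follows: enumerate only spans up to a maximum length whose words a word-level model marks as plausible segment material. The function name, mask representation, and example sentence are illustrative assumptions, not the authors' implementation:

```python
def build_segment_lattice(words, candidate_mask, max_len=3):
    """Enumerate candidate segments (start, end) of length <= max_len,
    keeping only spans fully covered by the word-level model's candidate
    mask. This prunes the exponential space of variable-length segments
    that an exact segment-level CRF would otherwise have to score."""
    segments = []
    n = len(words)
    for i in range(n):
        for j in range(i, min(i + max_len, n)):
            if all(candidate_mask[i:j + 1]):
                segments.append((i, j))
    return segments

lattice = build_segment_lattice(
    ["Barack", "Obama", "visited", "Tokyo"],
    [True, True, False, True])
print(lattice)  # [(0, 0), (0, 1), (1, 1), (3, 3)]
```

The CRF then scores labelings over this reduced lattice instead of over all O(n * max_len) spans regardless of plausibility.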

23 citations


Proceedings ArticleDOI
03 Jun 2017
TL;DR: This paper presents a new proposal in which neural networks with an attention mechanism are applied to all the civil law articles when deciding the truth value of the query t2, taking every article into account by computing their weighted sum.
Abstract: This year’s COLIEE has two tasks, called phases 1 and 2. Phase 1 requires finding the relevant article given a query t2, and phase 2 requires answering yes or no to the given query t2 according to the Japanese civil law articles. This paper presents our proposals for the phase 2 task. Two methods are presented. The first follows the standard approach taken by many authors: the relevant article t1 is selected by its similarity to the query t2 at the requirement (condition) and effect (conclusion) descriptions of the articles. The second is our new proposal, in which neural networks with an attention mechanism are applied to all the civil law articles when deciding the truth value of the query t2. This method takes all the articles into account by properly computing their weighted sum.
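The attention-based weighted sum over articles can be sketched generically: score each article against the query, normalize the scores with a softmax, and sum the article vectors under those weights. This is a minimal illustration of attention with toy vectors, not the paper's trained network:

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attend(query_vec, article_vecs):
    # Attention over all articles: dot-product score per article,
    # softmax normalization, then a weighted sum of article vectors,
    # so every article contributes in proportion to its relevance.
    scores = [sum(q * a for q, a in zip(query_vec, av))
              for av in article_vecs]
    weights = softmax(scores)
    dim = len(query_vec)
    context = [sum(w * av[d] for w, av in zip(weights, article_vecs))
               for d in range(dim)]
    return weights, context

weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In the paper's setting, the context vector summarizing all articles would feed a classifier that decides yes/no for the query.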

14 citations


Proceedings Article
01 Nov 2017
TL;DR: A neural network model for coordination boundary detection that improves identification of clause-level coordination using bidirectional RNNs that incorporate two properties of conjuncts, similarity and replaceability, as features.
Abstract: We propose a neural network model for coordination boundary detection. Our method relies on two common properties of conjuncts, similarity and replaceability, in order to detect both similar and dissimilar pairs of conjuncts. The model improves identification of clause-level coordination using bidirectional RNNs that incorporate the two properties as features. We show that our model outperforms the existing state-of-the-art methods on the coordination-annotated Penn Treebank and Genia corpora without any syntactic information from parsers.
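The conjunct-similarity property can be illustrated with a minimal sketch that compares the averaged word vectors of two candidate conjunct spans; in the paper this signal is captured inside bidirectional RNNs rather than computed directly, so the helper names below are illustrative:

```python
import math

def cosine(u, v):
    # Cosine similarity between two vectors; 0.0 for a zero vector.
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0

def avg(vecs):
    # Average a list of equal-length word vectors.
    return [sum(xs) / len(vecs) for xs in zip(*vecs)]

def conjunct_similarity(left_vecs, right_vecs):
    # Similarity feature for a candidate conjunct pair ("A and B"):
    # cosine between the averaged word vectors of the two spans.
    return cosine(avg(left_vecs), avg(right_vecs))
```

A high score suggests the two spans are plausible conjuncts of the same coordination; the replaceability property would add a complementary signal for dissimilar but syntactically interchangeable conjuncts.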

10 citations


Proceedings ArticleDOI
01 Jul 2017
TL;DR: This work constructs a corpus that ensures consistency between dependency structures and MWEs, including named entities, and explores models that predict both MWE-spans and an MWE-aware dependency structure.
Abstract: Because syntactic structures and spans of multiword expressions (MWEs) are independently annotated in many English syntactic corpora, they are generally inconsistent with respect to one another, which is harmful to the implementation of an aggregate system. In this work, we construct a corpus that ensures consistency between dependency structures and MWEs, including named entities. Further, we explore models that predict both MWE-spans and an MWE-aware dependency structure. Experimental results show that our joint model using additional MWE-span features achieves an MWE recognition improvement of 1.35 points over a pipeline model.

4 citations


Posted Content
TL;DR: Proposes an approach wherein the outputs of multiple neural network classifiers are combined using a supervised machine learning model to enhance the ranking results in entity retrieval tasks.
Abstract: This paper describes our approach for the triple scoring task at the WSDM Cup 2017. The task required participants to assign a relevance score for each pair of entities and their types in a knowledge base in order to enhance the ranking results in entity retrieval tasks. We propose an approach wherein the outputs of multiple neural network classifiers are combined using a supervised machine learning model. The experimental results showed that our proposed method achieved the best performance in one out of three measures (i.e., Kendall's tau), and performed competitively in the other two measures (i.e., accuracy and average score difference).
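The supervised combination of classifier outputs can be sketched as a linear stacking model. The weights and probabilities below are toy values, and the actual meta-model used in the paper may differ:

```python
def combine_scores(classifier_probs, weights, bias=0.0):
    # Linear meta-model over the base classifiers' outputs: each base
    # neural classifier emits a relevance probability for an
    # (entity, type) pair, and the supervised combiner (with weights
    # learned on held-out data) blends them into one relevance score.
    return bias + sum(w * p for w, p in zip(weights, classifier_probs))

# Three base classifiers agree the triple is relevant; the combiner
# trusts the second classifier most.
score = combine_scores([0.9, 0.8, 0.7], weights=[0.2, 0.5, 0.3])
print(round(score, 2))  # 0.79
```

Any supervised regressor could play the combiner's role; a weighted linear blend is simply the smallest instance of the stacking pattern.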

Book ChapterDOI
17 Apr 2017
TL;DR: A novel model based on the long short-term memory network that formulates the detection of coordinations as a ranking problem, fully exploiting the differences between training instances to remedy data imbalance.
Abstract: Coordinations refer to phrases such as “A and/but/or/... B”. Their detection remains a major problem due to the complexity of their components. Existing work typically classified training instances into two categories, correct and incorrect, which often caused data imbalance and inevitably hurt the performance of the models used. To remedy this problem, we propose to fully exploit the differences between training instances by formulating the detection of coordinations as a ranking problem. We develop a novel model based on the long short-term memory network. Experiments on the Penn Treebank and Genia corpora verified the effectiveness of the proposed model.
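The ranking formulation can be illustrated with a standard pairwise hinge loss, under the assumption (ours, not necessarily the paper's exact objective) that the correct coordination candidate should outscore each incorrect candidate by a margin:

```python
def pairwise_ranking_loss(score_correct, score_incorrect, margin=1.0):
    # Hinge-style ranking loss: zero once the correct coordination
    # boundary outscores the incorrect candidate by at least `margin`.
    # Because candidates are compared in pairs, the imbalance between
    # one correct span and many incorrect spans no longer skews
    # training the way binary classification does.
    return max(0.0, margin - (score_correct - score_incorrect))

# Correct candidate already wins by more than the margin: no loss.
print(pairwise_ranking_loss(2.0, 0.5))  # 0.0
# Correct candidate barely ahead: the model is still penalized.
print(pairwise_ranking_loss(0.5, 0.4))
```

Training would sum this loss over pairs of (correct, incorrect) candidate boundaries scored by the LSTM.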