
Showing papers by "Hiroyuki Shindo" published in 2020


Posted Content
TL;DR: New pretrained contextualized representations of words and entities based on the bidirectional transformer, and an entity-aware self-attention mechanism that considers the types of tokens (words or entities) when computing attention scores are proposed.
Abstract: Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT. The task involves predicting randomly masked words and entities in a large entity-annotated corpus retrieved from Wikipedia. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer, and considers the types of tokens (words or entities) when computing attention scores. The proposed model achieves impressive empirical performance on a wide range of entity-related tasks. In particular, it obtains state-of-the-art results on five well-known datasets: Open Entity (entity typing), TACRED (relation classification), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), and SQuAD 1.1 (extractive question answering). Our source code and pretrained representations are available at https://github.com/studio-ousia/luke.
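The entity-aware self-attention described above can be pictured with a minimal single-head sketch in which the query projection depends on the (query type, key type) pair; the matrix names, shapes, and per-pair parameterization below are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def entity_aware_attention(x, token_is_entity, W_q, W_k, W_v, d_head):
    """Single-head sketch of attention whose scores depend on token types.

    x:               (seq_len, d_model) hidden states of word and entity tokens
    token_is_entity: list of bools, True where the token is an entity
    W_q:             dict mapping (query_type, key_type) -> (d_model, d_head) matrix
    W_k, W_v:        shared key/value projections of shape (d_model, d_head)
    """
    keys, values = x @ W_k, x @ W_v
    types = ["entity" if is_ent else "word" for is_ent in token_is_entity]
    scores = torch.empty(len(types), len(types))
    for i, q_type in enumerate(types):
        for j, k_type in enumerate(types):
            q = x[i] @ W_q[(q_type, k_type)]      # query depends on both token types
            scores[i, j] = q @ keys[j] / d_head ** 0.5
    return F.softmax(scores, dim=-1) @ values

# Toy usage with random parameters: four word tokens followed by one entity token.
d_model = d_head = 8
W_q = {(q, k): torch.randn(d_model, d_head)
       for q in ("word", "entity") for k in ("word", "entity")}
W_k, W_v = torch.randn(d_model, d_head), torch.randn(d_model, d_head)
x = torch.randn(5, d_model)
out = entity_aware_attention(x, [False, False, False, False, True], W_q, W_k, W_v, d_head)
```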

288 citations


Proceedings ArticleDOI
02 Oct 2020
TL;DR: This article proposes new pretrained contextualized representations of words and entities based on the bidirectional transformer, which treats words and entities in a given text as independent tokens and outputs contextualized representations of them.
Abstract: Entity representations are useful in natural language tasks involving entities. In this paper, we propose new pretrained contextualized representations of words and entities based on the bidirectional transformer. The proposed model treats words and entities in a given text as independent tokens, and outputs contextualized representations of them. Our model is trained using a new pretraining task based on the masked language model of BERT. The task involves predicting randomly masked words and entities in a large entity-annotated corpus retrieved from Wikipedia. We also propose an entity-aware self-attention mechanism that is an extension of the self-attention mechanism of the transformer, and considers the types of tokens (words or entities) when computing attention scores. The proposed model achieves impressive empirical performance on a wide range of entity-related tasks. In particular, it obtains state-of-the-art results on five well-known datasets: Open Entity (entity typing), TACRED (relation classification), CoNLL-2003 (named entity recognition), ReCoRD (cloze-style question answering), and SQuAD 1.1 (extractive question answering). Our source code and pretrained representations are available at https://github.com/studio-ousia/luke.
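A short usage sketch of the released pretrained representations, assuming the studio-ousia/luke-base checkpoint is loaded through the Hugging Face Transformers integration; the example text and entity spans are arbitrary.

```python
from transformers import LukeModel, LukeTokenizer

# Assumption: the checkpoint from the linked repository is available through
# the Hugging Face Transformers integration under this name.
tokenizer = LukeTokenizer.from_pretrained("studio-ousia/luke-base")
model = LukeModel.from_pretrained("studio-ousia/luke-base")

text = "Beyoncé lives in Los Angeles."
entity_spans = [(0, 7), (17, 28)]  # character spans of "Beyoncé" and "Los Angeles"

inputs = tokenizer(text, entity_spans=entity_spans, return_tensors="pt")
outputs = model(**inputs)

word_repr = outputs.last_hidden_state            # contextualized word representations
entity_repr = outputs.entity_last_hidden_state   # contextualized entity representations
```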

142 citations


Proceedings ArticleDOI
01 Oct 2020
TL;DR: Wikipedia2Vec, a Python-based open-source tool for learning the embeddings of words and entities from Wikipedia, is presented and achieves a state-of-the-art result on the KORE entity relatedness dataset, and competitive results on various standard benchmark datasets.
Abstract: The embeddings of entities in a large knowledge base (e.g., Wikipedia) are highly beneficial for solving various natural language tasks that involve real world knowledge. In this paper, we present Wikipedia2Vec, a Python-based open-source tool for learning the embeddings of words and entities from Wikipedia. The proposed tool enables users to learn the embeddings efficiently by issuing a single command with a Wikipedia dump file as an argument. We also introduce a web-based demonstration of our tool that allows users to visualize and explore the learned embeddings. In our experiments, our tool achieved a state-of-the-art result on the KORE entity relatedness dataset, and competitive results on various standard benchmark datasets. Furthermore, our tool has been used as a key component in various recent studies. We publicize the source code, demonstration, and the pretrained embeddings for 12 languages at https://wikipedia2vec.github.io/.
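A minimal usage sketch following the interfaces documented in the linked repository; the dump and model file names are placeholders.

```python
# Training (a single shell command, with a Wikipedia dump file as the argument):
#   wikipedia2vec train enwiki-latest-pages-articles.xml.bz2 enwiki_embeddings.bin

# Querying the learned embeddings from Python:
from wikipedia2vec import Wikipedia2Vec

model = Wikipedia2Vec.load("enwiki_embeddings.bin")        # placeholder file name
word_vec = model.get_word_vector("language")               # embedding of a word
entity_vec = model.get_entity_vector("Natural language processing")  # embedding of an entity
neighbors = model.most_similar(model.get_entity("Natural language processing"), 5)
```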

140 citations


Posted Content
TL;DR: A new length-controllable abstractive summarization model that incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings to generate an informative and length-controlled summary.
Abstract: We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length embeddings in the decoder module for controlling the summary length. Although the length embeddings can control where to stop decoding, they do not decide which information should be included in the summary within the length constraint. Unlike the previous models, our length-controllable abstractive summarization model incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings. Our model generates a summary in two steps. First, our word-level extractor extracts a sequence of important words (we call it the "prototype text") from the source text according to the word-level importance scores and the length constraint. Second, the prototype text is used as additional input to the encoder-decoder model, which generates a summary by jointly encoding and copying words from both the prototype text and source text. Since the prototype text is a guide to both the content and length of the summary, our model can generate an informative and length-controlled summary. Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
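The first step, extracting the prototype text under a length constraint, can be sketched as follows; the word-level scoring model is abstracted away and the function is a simplified illustration, not the paper's extractor.

```python
def extract_prototype(source_tokens, importance_scores, length_budget):
    """Keep the highest-scoring words up to the length budget,
    then restore source order so the prototype text stays readable.

    source_tokens:     tokens of the source text
    importance_scores: one word-level importance score per token
                       (e.g. produced by a learned tagger; assumed given here)
    length_budget:     desired prototype length in tokens
    """
    ranked = sorted(range(len(source_tokens)),
                    key=lambda i: importance_scores[i], reverse=True)
    kept = sorted(ranked[:length_budget])          # original word order
    return [source_tokens[i] for i in kept]

source = "the quick brown fox jumps over the lazy dog near the river".split()
scores = [0.1, 0.7, 0.6, 0.9, 0.8, 0.2, 0.1, 0.3, 0.9, 0.2, 0.1, 0.5]
print(extract_prototype(source, scores, length_budget=5))
# -> ['quick', 'brown', 'fox', 'jumps', 'dog']
```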

21 citations


Proceedings ArticleDOI
20 Mar 2020
TL;DR: A deep-learning-based D2DB inspection that can distinguish a defect deformation from a normal deformation by learning the luminosity distribution in normal images is proposed, and it is shown that this inspection can detect unseen defects.
Abstract: In the drive toward sub-10-nm semiconductor devices, manufacturers have been developing advanced lithography technologies such as extreme ultraviolet lithography and multiple patterning. However, these technologies can cause unexpected defects, and a high-speed inspection is thus required to cover the entire surface of a wafer. A Die-to-Database (D2DB) inspection is commonly known as a high-speed inspection. The D2DB inspection compares an inspection image with a design layout, so it does not require a reference image for comparing with the inspection image, unlike a die-to-die inspection, thereby achieving a high-speed inspection. However, conventional D2DB inspections suffer from erroneous detection because the manufacturing processes deform the circuit pattern from the design layout, and such deformations will be detected as defects. To resolve this issue, we propose a deep-learning-based D2DB inspection that can distinguish a defect deformation from a normal deformation by learning the luminosity distribution in normal images. Our inspection detects outliers of the learned luminosity distribution as defects. Because our inspection requires only normal images, we can train the model without defect images, which are difficult to obtain with enough variety. In this way, our inspection can detect unseen defects. Through experiments, we show that our inspection can detect only the defect region on an inspection image.
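One simple way to realize "learn the luminosity distribution of normal images and detect outliers" is a convolutional autoencoder trained only on defect-free images, flagging pixels with large reconstruction error; the architecture and threshold below are illustrative, not the network used in the paper.

```python
import torch
import torch.nn as nn

class LuminosityAutoencoder(nn.Module):
    """Toy autoencoder trained on normal (defect-free) grayscale inspection images."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def defect_map(model, image, threshold=0.1):
    """Per-pixel anomaly map: large reconstruction error suggests a defect, while
    normal process deformations reconstruct well and stay below the threshold."""
    with torch.no_grad():
        reconstruction = model(image)
    return (image - reconstruction).abs() > threshold
```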

10 citations


Patent
05 Mar 2020
TL;DR: A pattern inspection system is proposed that inspects an image of an inspection target pattern of an electronic device using an identifier built by machine learning, based on the pattern image and the design data used to manufacture the pattern.
Abstract: A pattern inspection system inspects an image of an inspection target pattern of an electronic device using an identifier constituted by machine learning, based on the image of the inspection target pattern of the electronic device and data used to manufacture the inspection target pattern. The system includes a storage unit which stores a plurality of pattern images of the electronic device and pattern data used to manufacture a pattern of the electronic device, and an image selection unit which selects a learning pattern image used in the machine learning from the plurality of pattern images, based on the pattern data and the pattern image stored in the storage unit.
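The image selection step can be pictured with a deliberately simple sketch; grouping images by the design data they were manufactured from is a hypothetical selection criterion introduced here for illustration and is not stated in the patent abstract.

```python
def select_learning_images(pattern_images, pattern_data, per_design=1):
    """Hypothetical selection of training images: group stored pattern images by
    the design data they correspond to and keep a few images per design group,
    so the training set covers the stored design variety.

    pattern_images: dict mapping image_id -> pattern image
    pattern_data:   dict mapping image_id -> design-layout identifier
    """
    selected, counts = [], {}
    for image_id, design_id in pattern_data.items():
        if counts.get(design_id, 0) < per_design:
            selected.append(pattern_images[image_id])
            counts[design_id] = counts.get(design_id, 0) + 1
    return selected
```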

1 citation


01 Apr 2020
TL;DR: This paper proposes a length-controllable abstractive summarization model that incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings.
Abstract: We propose a new length-controllable abstractive summarization model. Recent state-of-the-art abstractive summarization models based on encoder-decoder models generate only one summary per source text. However, controllable summarization, especially of the length, is an important aspect for practical applications. Previous studies on length-controllable abstractive summarization incorporate length embeddings in the decoder module for controlling the summary length. Although the length embeddings can control where to stop decoding, they do not decide which information should be included in the summary within the length constraint. Unlike the previous models, our length-controllable abstractive summarization model incorporates a word-level extractive module in the encoder-decoder model instead of length embeddings. Our model generates a summary in two steps. First, our word-level extractor extracts a sequence of important words (we call it the "prototype text") from the source text according to the word-level importance scores and the length constraint. Second, the prototype text is used as additional input to the encoder-decoder model, which generates a summary by jointly encoding and copying words from both the prototype text and source text. Since the prototype text is a guide to both the content and length of the summary, our model can generate an informative and length-controlled summary. Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
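The second step, "jointly encoding and copying words from both the prototype text and source text", resembles a pointer-generator decoder; the mixing below is a standard copy-mechanism sketch offered as an assumption about how that copying could be realized, not the paper's exact formulation.

```python
import torch

def copy_distribution(p_gen, vocab_dist, attn_weights, input_ids, vocab_size):
    """Mix the vocabulary distribution with attention-based copying over the
    concatenated prototype-and-source input at one decoding step.

    p_gen:        (batch,) probability of generating from the fixed vocabulary
    vocab_dist:   (batch, vocab_size) softmax over the fixed vocabulary
    attn_weights: (batch, input_len) attention over the joint input
    input_ids:    (batch, input_len) vocabulary ids of the joint input
    """
    copy_dist = torch.zeros(vocab_dist.size(0), vocab_size)
    copy_dist.scatter_add_(1, input_ids, attn_weights)   # route attention mass to input words
    return p_gen.unsqueeze(1) * vocab_dist + (1 - p_gen).unsqueeze(1) * copy_dist
```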

1 citation


Proceedings ArticleDOI
01 Dec 2020
TL;DR: The method can identify coordination boundaries without training on labeled data, can be applied even if coordination structure annotations are not available, and is comparable to a recent supervised method when the coordinator conjoins simple noun phrases.
Abstract: We propose a simple method for nominal coordination boundary identification. As the main strength of our method, it can identify the coordination boundaries without training on labeled data, and can be applied even if coordination structure annotations are not available. Our system employs pre-trained word embeddings to measure the similarities of words and detects the span of coordination, assuming that conjuncts share syntactic and semantic similarities. We demonstrate that our method yields good results in identifying coordinated noun phrases in the GENIA corpus and is comparable to a recent supervised method for the case when the coordinator conjoins simple noun phrases.
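The core idea, that conjuncts around a coordinator resemble each other in embedding space, can be sketched as follows; the candidate enumeration and span averaging are a simplification of the described system, and the embedding lookup is assumed given.

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def span_vector(tokens, embeddings):
    """Average pretrained word embeddings over a candidate conjunct span."""
    return np.mean([embeddings[t] for t in tokens], axis=0)

def find_conjunct_spans(tokens, coord_index, embeddings, max_len=4):
    """Choose the left/right spans around the coordinator (e.g. 'and') whose
    averaged embeddings are most similar, assuming conjuncts resemble each other."""
    best, best_score = None, -1.0
    for left_len in range(1, max_len + 1):
        for right_len in range(1, max_len + 1):
            left = tokens[max(0, coord_index - left_len):coord_index]
            right = tokens[coord_index + 1:coord_index + 1 + right_len]
            if not left or not right:
                continue
            score = cosine(span_vector(left, embeddings), span_vector(right, embeddings))
            if score > best_score:
                best, best_score = (left, right), score
    return best
```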