Showing papers by "Hiroyuki Shindo" published in 2019

Open Access

Posted Content•

Global Entity Disambiguation with Pretrained Contextualized Embeddings of Words and Entities

Ikuya Yamada, Koki Washio¹, Hiroyuki Shindo², Yuji Matsumoto•Institutions (2)

University of Tokyo¹, Nara Institute of Science and Technology²

01 Sep 2019-arXiv: Computation and Language

TL;DR: This work proposes a new global entity disambiguation (ED) model, based on a bidirectional transformer encoder, that produces contextualized embeddings for words and entities in the input text and achieves new state-of-the-art results on all but one of six standard ED datasets.

Abstract: We propose a new global entity disambiguation (ED) model based on contextualized embeddings of words and entities. Our model is based on a bidirectional transformer encoder (i.e., BERT) and produces contextualized embeddings for words and entities in the input text. The model is trained using a new masked entity prediction task that aims to train the model by predicting randomly masked entities in entity-annotated texts obtained from Wikipedia. We further extend the model by solving ED as a sequential decision task to capture global contextual information. We evaluate our model using six standard ED datasets and achieve new state-of-the-art results on all but one dataset.
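
The masked entity prediction task described in this abstract can be illustrated with a minimal sketch. The `[MASK]` symbol, the masking rate, and the function name `mask_entities` are assumptions made for illustration; the abstract does not specify them, and the actual model also consumes the surrounding words.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def mask_entities(entities, mask_prob=0.3, rng=None):
    """Randomly hide entities and remember what was hidden.

    Returns (masked_sequence, targets), where targets maps each masked
    position to the original entity a model would be trained to predict.
    The masking rate is an illustrative assumption.
    """
    if rng is None:
        rng = random.Random()
    masked, targets = [], {}
    for position, entity in enumerate(entities):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[position] = entity
        else:
            masked.append(entity)
    return masked, targets


# Entity annotations of one toy passage (illustrative values only).
entities = ["Tokyo", "Japan", "Mount_Fuji", "Honshu"]
masked, targets = mask_entities(entities, mask_prob=0.5, rng=random.Random(0))
print(masked, targets)
```

A training loop would feed `masked` (together with the words) to the encoder and compute a prediction loss only at the positions stored in `targets`.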

31 citations

Proceedings Article•DOI•

Neural Attentive Bag-of-Entities Model for Text Classification

Ikuya Yamada¹, Hiroyuki Shindo²•Institutions (2)

Keio University¹, Nara Institute of Science and Technology²

03 Sep 2019

TL;DR: A Neural Attentive Bag-of-Entities model is proposed, a neural network model that performs text classification using entities in a knowledge base. It combines simple high-recall, dictionary-based entity detection with a novel neural attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities.

Abstract: This study proposes a Neural Attentive Bag-of-Entities model, which is a neural network model that performs text classification using entities in a knowledge base. Entities provide unambiguous and relevant semantic signals that are beneficial for text classification. We combine simple high-recall entity detection based on a dictionary, to detect entities in a document, with a novel neural attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities. We tested the effectiveness of our model using two standard text classification datasets (i.e., the 20 Newsgroups and R8 datasets) and a popular factoid question answering dataset based on a trivia quiz game. As a result, our model achieved state-of-the-art results on all datasets. The source code of the proposed model is available online at https://github.com/wikipedia2vec/wikipedia2vec.
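
The idea of attending to a few relevant entities among many dictionary matches can be sketched as a softmax-weighted sum of entity vectors. This is a generic attention sketch, not the paper's exact mechanism: the relevance scores are assumed to be given, and the names `softmax` and `attentive_bag` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_bag(entity_vectors, scores):
    """Combine entity vectors into one document vector.

    entity_vectors: one embedding (list of floats) per detected entity.
    scores: one relevance score per entity (assumed to come from some
            learned scoring function, which is not modeled here).
    """
    weights = softmax(scores)
    dim = len(entity_vectors[0])
    doc_vector = [
        sum(w * vec[d] for w, vec in zip(weights, entity_vectors))
        for d in range(dim)
    ]
    return doc_vector, weights


# Three candidate entities; the second has a much higher score.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
doc_vector, weights = attentive_bag(vectors, [0.1, 3.0, 0.2])
print(weights, doc_vector)
```

Because the weights sum to one, entities with low scores contribute little to the document vector, which is how a high-recall detector with many spurious matches can still yield a focused representation.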

27 citations

Proceedings Article•DOI•

Playing by the Book: An Interactive Game Approach for Action Graph Extraction from Text

Ronen Tamari, Hiroyuki Shindo, Dafna Shahaf, Yuji Matsumoto

01 Jun 2019

TL;DR: Text2Quest is an interactive game-based approach for action-graph extraction from materials science papers, in which procedural text is interpreted as instructions for an interactive game and a learning agent completes the game by executing the procedure correctly.

Abstract: Understanding procedural text requires tracking entities, actions and effects as the narrative unfolds. We focus on the challenging real-world problem of action-graph extraction from materials science papers, where language is highly specialized and data annotation is expensive and scarce. We propose a novel approach, Text2Quest, where procedural text is interpreted as instructions for an interactive game. A learning agent completes the game by executing the procedure correctly in a text-based simulated lab environment. The framework can complement existing approaches and enables richer forms of learning compared to static texts. We discuss potential limitations and advantages of the approach, and release a prototype proof-of-concept, hoping to encourage research in this direction.

20 citations

Proceedings Article•DOI•

Stochastic Tokenization with a Language Model for Neural Text Classification

Tatsuya Hiraoka, Hiroyuki Shindo¹, Yuji Matsumoto²•Institutions (2)

Nara Institute of Science and Technology¹, Tohoku University²

01 Jul 2019

TL;DR: The proposed model incorporates a language model for unsupervised tokenization into a text classifier and trains both simultaneously, achieving better sentiment-analysis performance than previous methods.

Abstract: For unsegmented languages such as Japanese and Chinese, tokenization of a sentence has a significant impact on the performance of text classification. Sentences are usually segmented into words or subwords by a morphological analyzer or byte pair encoding and then encoded with word (or subword) representations for neural networks. However, segmentation is potentially ambiguous, and it is unclear whether the segmented tokens achieve the best performance for the target task. In this paper, we propose a method to simultaneously learn tokenization and text classification to address these problems. Our model incorporates a language model for unsupervised tokenization into a text classifier and then trains both models simultaneously. To make the model robust against infrequent tokens, we sample a segmentation for each sentence stochastically during training, which improves text classification performance. We conducted experiments on sentiment analysis as a text classification task and show that our method achieves better performance than previous methods.
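
Stochastic sampling of a segmentation can be sketched with forward filtering and backward sampling over a unigram language model. This is a minimal illustration under stated assumptions: the probability table `VOCAB`, the helper `toy_prob`, and the `max_len` limit are invented for the example, whereas in the paper the language model is trained jointly with the classifier.

```python
import random

# Toy unigram probabilities (illustrative values, not from the paper).
VOCAB = {"東京": 0.2, "京都": 0.2, "東京都": 0.1, "都": 0.1, "東": 0.05, "京": 0.05}


def toy_prob(token):
    """Hypothetical unigram model: unknown single characters get a tiny
    probability, unknown longer strings get zero."""
    return VOCAB.get(token, 1e-6 if len(token) == 1 else 0.0)


def sample_segmentation(text, unigram_prob, max_len=4, rng=None):
    """Sample one segmentation with probability proportional to the
    product of its token probabilities."""
    if rng is None:
        rng = random.Random()
    n = len(text)
    # alpha[i] = total probability of all segmentations of text[:i].
    alpha = [0.0] * (n + 1)
    alpha[0] = 1.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            alpha[end] += alpha[start] * unigram_prob(text[start:end])
    if alpha[n] == 0.0:
        raise ValueError("no segmentation has non-zero probability")
    # Walk backwards, choosing where the last token starts each time.
    tokens, end = [], n
    while end > 0:
        starts = list(range(max(0, end - max_len), end))
        weights = [alpha[s] * unigram_prob(text[s:end]) for s in starts]
        start = rng.choices(starts, weights=weights, k=1)[0]
        tokens.append(text[start:end])
        end = start
    tokens.reverse()
    return tokens


for seed in range(3):
    print(sample_segmentation("東京都", toy_prob, rng=random.Random(seed)))
```

Calling the sampler once per sentence per training step exposes the classifier to several plausible segmentations of the same sentence instead of a single fixed one.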

19 citations

Journal Article•DOI•

Distant Supervision for Relation Extraction via Piecewise Attention and Bag-Level Contextual Inference

Van-Thuy Phi¹, Joan Santoso², Van-Hien Tran¹, Hiroyuki Shindo¹, Masashi Shimbo¹, Yuji Matsumoto¹•Institutions (2)

Nara Institute of Science and Technology¹, Sepuluh Nopember Institute of Technology²

30 Jul 2019-IEEE Access

TL;DR: A novel neural RE model is proposed that combines a bidirectional gated recurrent unit model with a form of hierarchical attention better suited to RE, together with a contextual inference method that can infer the most likely positive examples of an entity pair in bags with very limited contextual information.

Abstract: Distant supervision (DS) has become an efficient approach for relation extraction (RE) to alleviate the lack of labeled examples in supervised learning. In this paper, we propose a novel neural RE model that combines a bidirectional gated recurrent unit model with a form of hierarchical attention that is better suited to RE. We demonstrate that an additional attention mechanism called piecewise attention, which builds itself upon segment level representations, significantly enhances the performance of the distantly supervised relation extraction task. Our piecewise attention mechanism not only captures crucial segments in each sentence but also reflects the direction of relations between two entities. Furthermore, we propose a contextual inference method that can infer the most likely positive examples of an entity pair in bags with very limited contextual information. In addition, we provide an annotated dataset without false positive examples based on the Riedel testing dataset, and report on the actual performance of several RE models. The experimental results show that our proposed methods outperform the previous state-of-the-art baselines on both original and annotated datasets for the distantly supervised RE task.

13 citations

Posted Content•

Gated Graph Recursive Neural Networks for Molecular Property Prediction

Hiroyuki Shindo¹, Yuji Matsumoto¹•Institutions (1)

Nara Institute of Science and Technology¹

31 Aug 2019-arXiv: Learning

TL;DR: This work proposes a simple and powerful graph neural network for molecular property prediction that models a molecule as a directed complete graph in which each atom has a spatial position, and introduces a recursive neural network with a simple gating function.

Abstract: Molecular property prediction is a fundamental problem for computer-aided drug discovery and materials science. Quantum-chemical simulations such as density functional theory (DFT) have been widely used for calculating molecular properties; however, because of their heavy computational cost, it is difficult to search a huge number of potential chemical compounds. Machine learning methods for molecular modeling are attractive alternatives; however, the development of expressive, accurate, and scalable graph neural networks for learning molecular representations is still challenging. In this work, we propose a simple and powerful graph neural network for molecular property prediction. We model a molecule as a directed complete graph in which each atom has a spatial position, and introduce a recursive neural network with a simple gating function. We also feed input embeddings to every layer as skip connections to accelerate training. Experimental results show that our model achieves state-of-the-art performance on the standard benchmark dataset for molecular property prediction.

12 citations

Proceedings Article•DOI•

Deep learning's impact on contour extraction for design based metrology and design based inspection

Ryo Yumiba¹, Masayoshi Ishikawa¹, Shinichi Shinoda¹, Shigetoshi Sakimura¹, Yasutaka Toyoda¹, Hiroyuki Shindo¹, Masayuki Izawa¹•Institutions (1)

Hitachi¹

26 Mar 2019

TL;DR: Contour extraction using deep learning offers high noise immunity and excellent pattern recognition ability, and performs well on low-SN SEM images and multi-layer pattern images.

Abstract: With the miniaturization of devices, hot spots caused by wafer topology are becoming a problem in addition to hot spots resulting from design, mask and wafer process, and hot spot evaluation of a wide area in a chip is becoming necessary. Although DBM (Design Based Metrology) is an effective method for evaluating systematic defects of EUV lithography and multi-patterning, it requires a long time to evaluate because it is necessary to acquire a high-SN SEM image captured under low-speed SEM scanning conditions. We therefore developed a contour extraction method for DBM that can handle low-SN SEM images captured under high-speed SEM scanning conditions. Contour extraction using deep learning offers high noise immunity and excellent pattern recognition ability, and performs well on low-SN SEM images and multi-layer pattern images. The proposed method is composed of an annotation operation on SEM image samples, a training process using the annotation data and SEM image samples, and a contour extraction process using the trained outcome. In the evaluation experiment, we confirmed that satisfactory contours are extracted from low-SN SEM images and multi-layer pattern images.

11 citations

Proceedings Article•DOI•

Decomposed Local Models for Coordinate Structure Parsing.

Hiroki Teranishi¹, Hiroyuki Shindo¹, Yuji Matsumoto²•Institutions (2)

Nara Institute of Science and Technology¹, Tohoku University²

01 Jun 2019

TL;DR: This work proposes a simple and accurate model for coordination boundary identification that makes use of probabilities of coordinators and conjuncts in the CKY parsing to find the optimal combination of coordinate structures.

Abstract: We propose a simple and accurate model for coordination boundary identification. Our model decomposes the task into three sub-tasks during training: finding a coordinator, identifying the inside boundaries of a pair of conjuncts, and selecting its outside boundaries. For inference, we use the probabilities of coordinators and conjuncts in CKY parsing to find the optimal combination of coordinate structures. Experimental results demonstrate that our model achieves state-of-the-art results while ensuring that the global structure of coordinations is consistent.

10 citations

Posted Content•

Improving Multi-Word Entity Recognition for Biomedical Texts.

Hamada A. Nayel, H. L. Shashirekha, Hiroyuki Shindo, Yuji Matsumoto

15 Aug 2019-arXiv: Computation and Language

TL;DR: This paper proposes FROBES, a segment representation (SR) model that extends IOBES to improve BioNER performance and outperforms other SR models for multi-word entities with length greater than two.

Abstract: Biomedical Named Entity Recognition (BioNER) is a crucial step in analyzing biomedical texts; it aims at extracting biomedical named entities from a given text. Different supervised machine learning algorithms have been applied to BioNER by various researchers. The main requirement of these approaches is an annotated dataset used for learning the parameters of the machine learning algorithms. Segment Representation (SR) models comprise different tag sets used for representing the annotated data, such as IOB2, IOE2 and IOBES. In this paper, we propose an extension of the IOBES model to improve the performance of BioNER. The proposed SR model, FROBES, improves the representation of multi-word entities. We used a Bidirectional Long Short-Term Memory (BiLSTM) network, an instance of Recurrent Neural Networks (RNN), to design a baseline system for BioNER and evaluated the new SR model on two datasets: the i2b2/VA 2010 challenge dataset and the JNLPBA 2004 shared task dataset. The proposed SR model outperforms other models for multi-word entities with length greater than two. Further, the outputs of different SR models have been combined using a majority voting ensemble method, which outperforms the baseline models' performance.
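
To make the notion of a segment representation concrete, the sketch below encodes the same entity spans with two of the tag sets named in the abstract, IOB2 and IOBES. FROBES itself is not reproduced because the abstract does not define its tags. The function name `encode_tags`, the example sentence, and the entity labels are assumptions chosen for illustration.

```python
def encode_tags(num_tokens, spans, scheme="IOBES"):
    """Convert entity spans into per-token tags.

    spans: list of (start, end, label) with `end` exclusive.
    scheme: "IOB2" or "IOBES".
    """
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        if scheme == "IOB2":
            # B- marks the first token, I- every following token.
            tags[start] = "B-" + label
            for i in range(start + 1, end):
                tags[i] = "I-" + label
        elif scheme == "IOBES":
            if end - start == 1:
                # Single-token entities get their own tag.
                tags[start] = "S-" + label
            else:
                # Multi-token entities mark both boundaries explicitly.
                tags[start] = "B-" + label
                for i in range(start + 1, end - 1):
                    tags[i] = "I-" + label
                tags[end - 1] = "E-" + label
        else:
            raise ValueError("unsupported scheme: " + scheme)
    return tags


tokens = ["IL-2", "activates", "human", "T", "cells"]
spans = [(0, 1, "protein"), (2, 5, "cell_type")]
for scheme in ("IOB2", "IOBES"):
    print(scheme, list(zip(tokens, encode_tags(len(tokens), spans, scheme))))
```

IOBES distinguishes the last token of a multi-word entity and single-token entities, while IOB2 does not; all tokens between the first and last of a long entity still share the same `I-` tag in both schemes.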

9 citations

Proceedings Article•DOI•

Relation Classification Using Segment-Level Attention-based CNN and Dependency-based RNN.

Van-Hien Tran¹, Van-Thuy Phi¹, Hiroyuki Shindo¹, Yuji Matsumoto²•Institutions (2)

Nara Institute of Science and Technology¹, Tohoku University²

01 Jun 2019

TL;DR: This paper proposes a new model combining Segment-level Attention-based Convolutional Neural Networks (SACNNs), which selectively focus on the important information segment of the raw sequence, and Dependency-based Recurrent Neural Networks (DepRNNs), which handle long-distance relations using the shortest dependency path of the relation entities.

Abstract: Recently, relation classification has gained much success by exploiting deep neural networks. In this paper, we propose a new model effectively combining Segment-level Attention-based Convolutional Neural Networks (SACNNs) and Dependency-based Recurrent Neural Networks (DepRNNs). While SACNNs allow the model to selectively focus on the important information segment from the raw sequence, DepRNNs help to handle the long-distance relations from the shortest dependency path of relation entities. Experiments on the SemEval-2010 Task 8 dataset show that our model is comparable to the state-of-the-art without using any external lexical features.

8 citations

Proceedings Article•DOI•

Deep learning's impact on pattern matching for design based metrology and design based inspection

Dou Shuyang¹, Shinichi Shinoda¹, Masayoshi Ishikawa¹, Ryou Yumiba¹, Yasutaka Toyoda¹, Hiroyuki Shindo¹, Masayuki Izawa¹, Masanori Ouchi¹•Institutions (1)

Hitachi¹

26 Mar 2019

TL;DR: Experimental results showed that the proposed method could estimate the design layout from the low-SN SEM image and improve the pattern matching success rate, and it is expected that this method will be advantageous for evaluating mass systematic defects during the process development.

Abstract: With the miniaturization of devices, hot spot evaluation of a wide area of a wafer for small change points such as wafer topology is required. DBM (Design Based Metrology) is an effective method for evaluating systematic defects of multiple patterning and EUV lithography. However, it takes a long time to evaluate because it is necessary to acquire a high-SN SEM image captured by low-speed SEM scanning conditions. Therefore, we developed a new pattern matching method of DBM by utilizing deep learning technology. Our proposed method can handle low-SN SEM images captured under high-speed SEM scanning conditions. In the proposed method, we use deep learning to estimate design layout from SEM image, and then perform pattern matching between this estimated design layout and the true design layout. The proposed method is particularly effective for pattern matching of low-SN SEM images and circuit pattern distorted during manufacturing process. It is expected that this method will be advantageous for evaluating mass systematic defects during the process development. Experimental results showed that the proposed method could estimate the design layout from the low-SN SEM image and improve the pattern matching success rate.

Posted Content•

Neural Attentive Bag-of-Entities Model for Text Classification

Ikuya Yamada¹, Hiroyuki Shindo²•Institutions (2)

Keio University¹, Nara Institute of Science and Technology²

03 Sep 2019-arXiv: Computation and Language

TL;DR: The authors propose a Neural Attentive Bag-of-Entities model, which combines simple high-recall, dictionary-based detection of entities in a document with a novel neural attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities.

Abstract: This study proposes a Neural Attentive Bag-of-Entities model, which is a neural network model that performs text classification using entities in a knowledge base. Entities provide unambiguous and relevant semantic signals that are beneficial for capturing semantics in texts. We combine simple high-recall entity detection based on a dictionary, to detect entities in a document, with a novel neural attention mechanism that enables the model to focus on a small number of unambiguous and relevant entities. We tested the effectiveness of our model using two standard text classification datasets (i.e., the 20 Newsgroups and R8 datasets) and a popular factoid question answering dataset based on a trivia quiz game. As a result, our model achieved state-of-the-art results on all datasets. The source code of the proposed model is available online at this https URL.

Patent•

Image evaluation method and image evaluation device

Shinichi Shinoda¹, Masayoshi Ishikawa¹, Yasutaka Toyoda¹, Yuichi Abe¹, Hiroyuki Shindo¹•Institutions (1)

Hitachi¹

18 Jan 2019

TL;DR: A machine learning model is trained to generate a design data image from an inspection target image, using the corresponding design data image as the teacher; comparing the predicted design data image with the actual design data image allows systematic defects to be detected without using defect images.

Abstract: The image evaluation device includes a design data image generation unit that images design data; a machine learning unit that creates a model for generating a design data image from an inspection target image, using the design data image as a teacher and using the inspection target image corresponding to the design data image; a design data prediction image generation unit that predicts the design data image from the inspection target image, using the model created by the machine learning unit; a design data image generation unit that images the design data corresponding to the inspection target image; and a comparison unit that compares a design data prediction image generated by the design data prediction image generation unit and the design data image. As a result, it is possible to detect a systematic defect without using a defect image and without frequently generating misinformation.

Journal Article•DOI•

Development of a computer-assisted Japanese functional expression learning system for Chinese-speaking learners

Jun Liu¹, Hiroyuki Shindo¹, Yuji Matsumoto¹•Institutions (1)

Nara Institute of Science and Technology¹

01 Oct 2019-Educational Technology Research and Development

TL;DR: This paper presents Jastudy, a computer-assisted language learning (CALL) system designed specifically for Chinese-speaking learners studying Japanese functional expressions, which selects example sentences using a ranking system that gives easier sentences a higher rank.

Abstract: Because a large number of Chinese characters are commonly used in both Japanese and Chinese, Chinese-speaking learners of Japanese as a second language (JSL) find it more challenging to learn Japanese functional expressions than to learn other Japanese vocabulary. To address this challenge, we have developed Jastudy, a computer-assisted language learning (CALL) system designed specifically for Chinese-speaking learners studying Japanese functional expressions. Given a Japanese sentence as an input, the system automatically detects Japanese functional expressions using a character-based bidirectional long short-term memory with a conditional random field (BiLSTM-CRF) model. The sentence is then segmented and the parts of speech (POS) are tagged (word segmentation and POS tagging) by a Japanese morphological analyzer, MeCab ( http://taku910.github.io/mecab/ ), trained using a CRF model. In addition, the system provides JSL learners with appropriate example sentences that illustrate Japanese functional expressions. The system uses a ranking system, which gives easier sentences a higher rank, when selecting example sentences. A support vector machine for ranking (SVMRank) algorithm estimates the readability of example sentences, using Japanese-Chinese common words as an important feature. A k-means clustering algorithm is used to cluster example sentences that contain functional expressions with the same meanings, based on part-of-speech, conjugation form, and semantic attributes. Finally, to evaluate the usefulness of the system, we have conducted experiments and reported on a preliminary user study involving Chinese-speaking JSL learners.

Journal Article•DOI•

Scientific Article Search System Based on Discourse Facet Representation

Yuta Kobayashi¹, Hiroyuki Shindo¹, Yuji Matsumoto¹•Institutions (1)

Nara Institute of Science and Technology¹

17 Jul 2019

TL;DR: This paper presents a browser-based scientific article search system based on triples of distributed representations of articles, where each element of a triple represents one scientific discourse facet (Objective, Method, or Result) using both text and citation information.

Abstract: We present a browser-based scientific article search system with graphical visualization. This system is based on triples of distributed representations of articles, each triple representing a scientific discourse facet (Objective, Method, or Result) using both text and citation information. Because each facet of an article is encoded as a separate vector, the similarity between articles can be measured by considering the articles not only in their entirety but also on a facet-by-facet basis. Our system provides three search options: a similarity ranking search, a citation graph with facet-labeled edges, and a scatter plot visualization with facets as the axes.
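
Facet-by-facet comparison can be sketched by storing one vector per facet and computing a similarity for each. The use of cosine similarity, the averaging used for the whole-article score, and the names `cosine`, `facet_similarities`, and `overall_similarity` are assumptions for illustration; the abstract does not state which similarity measure the system uses.

```python
import math

FACETS = ("objective", "method", "result")


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def facet_similarities(article_a, article_b):
    """Similarity of two articles on each facet separately."""
    return {f: cosine(article_a[f], article_b[f]) for f in FACETS}


def overall_similarity(article_a, article_b):
    """Whole-article score, taken here as the mean over facets
    (an illustrative choice, not the paper's definition)."""
    sims = facet_similarities(article_a, article_b)
    return sum(sims.values()) / len(sims)


# Toy facet vectors: same method, different objective and result.
paper_a = {"objective": [1.0, 0.0], "method": [0.5, 0.5], "result": [0.0, 1.0]}
paper_b = {"objective": [0.0, 1.0], "method": [1.0, 1.0], "result": [1.0, 0.0]}
print(facet_similarities(paper_a, paper_b))
print(overall_similarity(paper_a, paper_b))
```

Keeping the facets separate lets a search rank articles that share a method with the query even when their objectives and results differ, which a single whole-article vector would blur.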