Shinsuke Mori

Researcher at Kyoto University

Publications -  102
Citations -  1743

Shinsuke Mori is an academic researcher from Kyoto University. The author has contributed to research in topics: Text segmentation & Language model. The author has an h-index of 20 and has co-authored 96 publications receiving 1567 citations.

Papers
Proceedings Article

Pointwise Prediction for Robust, Adaptable Japanese Morphological Analysis

TL;DR: A pointwise approach to Japanese morphological analysis (MA) that ignores structure information during learning and tagging is presented; it outperforms the current state-of-the-art structured approach and achieves accuracy similar to that of structured predictors using the same feature set.
Proceedings Article

A new method of N-gram statistics for large number of n and automatic extraction of words and phrases from large text data of Japanese

TL;DR: A new algorithm for computing n-gram statistics over large text data for arbitrarily large n is developed; n-grams of Japanese text data containing between two and thirty million characters were computed successfully within a relatively short time.
Proceedings Article

Training Conditional Random Fields Using Incomplete Annotations

TL;DR: A parameter estimation method for Conditional Random Fields (CRFs) is proposed that makes it possible to use incomplete annotations in corpus-building situations where completely annotating the whole corpus is time-consuming and unrealistic.
Proceedings Article

Flow Graph Corpus from Recipe Texts

TL;DR: This paper presents an attempt at annotating procedural texts with a flow graph as a representation of understanding, focusing on cooking recipes, and details the annotation framework and some statistics on the corpus.
Proceedings Article

An Unsupervised Model for Joint Phrase Alignment and Extraction

TL;DR: An unsupervised model for joint phrase alignment and extraction using non-parametric Bayesian methods and inversion transduction grammars (ITGs) is presented, which matches the accuracy of the traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of the original size.