Author
Satoshi Kobayashi
Bio: Satoshi Kobayashi is an academic researcher at Keio University who has contributed to research on grammar induction and language models. The author has an h-index of 2 and has co-authored 2 publications receiving 11 citations.
Papers
01 Jan 2004
6 citations
Book
19 Oct 2006
TL;DR: Conference proceedings on grammatical inference, with contributions including Grammatical Inference for Syntax-Based Statistical Machine Translation and Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses.
Abstract: Invited Papers.- Parsing Without Grammar Rules.- Classification of Biological Sequences with Kernel Methods.- Regular Papers.- Identification in the Limit of Systematic-Noisy Languages.- Ten Open Problems in Grammatical Inference.- Polynomial-Time Identification of an Extension of Very Simple Grammars from Positive Data.- PAC-Learning Unambiguous NTS Languages.- Incremental Learning of Context Free Grammars by Bridging Rule Generation and Search for Semi-optimum Rule Sets.- Variational Bayesian Grammar Induction for Natural Language.- Stochastic Analysis of Lexical and Semantic Enhanced Structural Language Model.- Using Pseudo-stochastic Rational Languages in Probabilistic Grammatical Inference.- Learning Analysis by Reduction from Positive Data.- Inferring Grammars for Mildly Context Sensitive Languages in Polynomial-Time.- Planar Languages and Learnability.- A Unified Algorithm for Extending Classes of Languages Identifiable in the Limit from Positive Data.- Protein Motif Prediction by Grammatical Inference.- Grammatical Inference in Practice: A Case Study in the Biomedical Domain.- Inferring Grammar Rules of Programming Language Dialects.- The Tenjinno Machine Translation Competition.- Large Scale Inference of Deterministic Transductions: Tenjinno Problem 1.- A Discriminative Model of Stochastic Edit Distance in the Form of a Conditional Transducer.- Learning n-Ary Node Selecting Tree Transducers from Completely Annotated Examples.- Learning Multiplicity Tree Automata.- Learning DFA from Correction and Equivalence Queries.- Using MDL for Grammar Induction.- Characteristic Sets for Inferring the Unions of the Tree Pattern Languages by the Most Fitting Hypotheses.- Learning Deterministic DEC Grammars Is Learning Rational Numbers.- Iso-array Acceptors and Learning.- Poster Papers.- A Merging States Algorithm for Inference of RFSAs.- Query-Based Learning of XPath Expressions.- Learning Finite-State Machines from Inexperienced Teachers.- Suprasymbolic Grammar Induction by Recurrent Self-Organizing Maps.- Graph-Based Structural Data Mining in Cognitive Pattern Interpretation.- Constructing Song Syntax by Automata Induction.- Learning Reversible Languages with Terminal Distinguishability.- Grammatical Inference for Syntax-Based Statistical Machine Translation.
5 citations
Cited by
TL;DR: The authors develop an evaluation metric for Optimality Theory that allows a learner to induce a lexicon and a phonological grammar from unanalyzed surface forms, and show that the learner succeeds in obtaining this kind of knowledge and is better equipped to do so than other existing learners in the literature.
Abstract: We develop an evaluation metric for Optimality Theory that allows a learner to induce a lexicon and a phonological grammar from unanalyzed surface forms. We wish to model aspects of knowledge such as the English-speaking child’s knowledge that the aspiration of the first segment of khaet is predictable and the French-speaking child’s knowledge that the final l of table ‘table’ is optional and can be deleted while that of parle ‘speak’ cannot. We show that the learner we present succeeds in obtaining this kind of knowledge and is better equipped to do so than other existing learners in the literature.
22 citations
TL;DR: The analysis shows that the grammars induced by the algorithm are, in theory, capable of modelling context-free features of natural language syntax and can potentially outperform the state of the art in unsupervised parsing on the WSJ10 corpus.
Abstract: Recently, different theoretical learning results have been found for a variety of context-free grammar subclasses through the use of distributional learning [1]. However, these results have not yet been extended to probabilistic grammars. In this work, we give a practical algorithm, with some proven properties, that learns a subclass of probabilistic grammars from positive data. A minimum satisfiability solver is used to direct the search towards small grammars. Experiments on well-known context-free languages and artificial natural language grammars give positive results. Moreover, our analysis shows that the grammars induced by our algorithm are, in theory, capable of modelling context-free features of natural language syntax. One of our experiments shows that our algorithm can potentially outperform the state-of-the-art in unsupervised parsing on the WSJ10 corpus.
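The core distributional idea behind this line of work can be illustrated with a small sketch. This is our own toy example, not the paper's MinSAT-guided algorithm: it merely shows how substrings that occur in the same (left, right) contexts in a positive sample become candidates for sharing a nonterminal.

```python
# Toy illustration of distributional learning from positive data (not the
# paper's algorithm): collect the contexts of every substring in the sample;
# substrings with identical context sets are candidates for one nonterminal.
from collections import defaultdict

def contexts(corpus):
    """Map each substring (a tuple of words) to the set of (prefix, suffix)
    contexts in which it occurs across the positive sample."""
    ctx = defaultdict(set)
    for s in corpus:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                ctx[s[i:j]].add((s[:i], s[j:]))
    return ctx

corpus = [("the", "cat", "sleeps"), ("the", "dog", "sleeps")]
ctx = contexts(corpus)
# "cat" and "dog" share the single context (("the",), ("sleeps",)),
# so a distributional learner would group them under one nonterminal.
print(ctx[("cat",)] == ctx[("dog",)])  # True
```

The actual algorithm in the paper additionally weighs hypotheses probabilistically and uses a minimum satisfiability solver to prefer small grammars; the sketch above only captures the substitutability criterion that distributional learning builds on.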
13 citations
21 Jul 2009
TL;DR: The terms grammatical inference and grammar induction both seem to suggest that techniques for building grammatical formalisms from information about a language are not concerned with automata or other finite-state machines; this is far from true, as many of the more important results in grammatical inference rely heavily on automata formalisms, and particularly on their specific use of determinism.
Abstract: The terms grammatical inference and grammar induction both seem to indicate that techniques aiming at building grammatical formalisms when given some information about a language are not concerned with automata or other finite state machines. This is far from true, and many of the more important results in grammatical inference rely heavily on automata formalisms, and particularly on the specific use of determinism that is made. We survey here some of the main ideas and results in the field.
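The determinism the survey emphasises shows up concretely in state-merging inference algorithms such as RPNI, which all start from the same deterministic structure: the prefix tree acceptor built from the positive sample. A minimal sketch (our illustration, not code from the survey):

```python
# Minimal sketch of a prefix tree acceptor (PTA), the deterministic automaton
# that state-merging grammatical inference algorithms (e.g. RPNI) start from.

def build_pta(positive_samples):
    """Build a deterministic prefix tree acceptor from positive strings.

    Returns (transitions, accepting): transitions maps (state, symbol) -> state,
    and accepting is the set of states reached by complete sample strings.
    """
    transitions = {}
    accepting = set()
    next_state = 1  # state 0 is the root, representing the empty prefix
    for word in positive_samples:
        state = 0
        for symbol in word:
            if (state, symbol) not in transitions:
                transitions[(state, symbol)] = next_state
                next_state += 1
            state = transitions[(state, symbol)]
        accepting.add(state)
    return transitions, accepting

def accepts(transitions, accepting, word):
    """Run a word through the automaton; determinism makes this a single pass."""
    state = 0
    for symbol in word:
        if (state, symbol) not in transitions:
            return False
        state = transitions[(state, symbol)]
    return state in accepting

transitions, accepting = build_pta(["ab", "abb", "ba"])
print(accepts(transitions, accepting, "ab"))  # True
print(accepts(transitions, accepting, "a"))   # False: a proper prefix only
```

A state-merging learner then generalises by merging PTA states while preserving determinism and consistency with any negative data, which is exactly where the "specific use of determinism" mentioned in the abstract does its work.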
6 citations
TL;DR: In this paper, the existence of a canonical form for semi-deterministic transducers with sets of pairwise incomparable output strings is proved. Based on this form, an algorithm is developed that learns such transducers from translation queries, and it is also shown that no learning algorithm for semi-deterministic transducers can rely on domain knowledge alone.
Abstract: We prove the existence of a canonical form for semi-deterministic transducers with sets of pairwise incomparable output strings. Based on this, we develop an algorithm which learns semi-deterministic transducers given access to translation queries. We also prove that there is no learning algorithm for semi-deterministic transducers that uses only domain knowledge.
5 citations
01 Jan 2016
TL;DR: By giving better control over the information to which one has access, this setting provides a better understanding of the hardness of learning tasks and makes it possible to solve practical learning situations for which new algorithms are needed.
Abstract: When learning languages or grammars, an attractive alternative to using a large corpus is to learn by interacting with the environment. This can allow us to deal with situations where data is scarce or expensive, but testing or experimenting is possible. The situation, which arises in a number of fields, is formalised in a setting called active learning or query learning. By controlling better the information to which one has access, this setting provides us with a better understanding of the hardness of learning tasks. But the setting also allows us to solve practical learning situations, for which new algorithms are needed.
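The power of query learning over corpus-based learning can be seen in a toy example (ours, not from the paper): to identify the unknown threshold k of the language { aⁿ : n ≥ k }, a passive learner needs examples around the boundary, while an active learner recovers k with logarithmically many membership queries.

```python
# Toy illustration of active/query learning: binary search with membership
# queries identifies the threshold k of the language { a^n : n >= k }.

def learn_threshold(membership_query, upper_bound):
    """Identify k in [0, upper_bound] using O(log upper_bound) membership queries."""
    lo, hi = 0, upper_bound
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if membership_query("a" * mid):  # "a"*mid accepted means k <= mid
            hi = mid
        else:
            lo = mid + 1
    return lo, queries

secret_k = 37
oracle = lambda w: len(w) >= secret_k  # the teacher answering membership queries
k, used = learn_threshold(oracle, 1000)
print(k, used)  # recovers k = 37 in roughly log2(1000), i.e. about 10, queries
```

This is the trade-off the abstract describes: when data is scarce but experimenting is possible, asking well-chosen questions of the environment can replace a large corpus.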
4 citations