Author

Naoyuki Tokuda

Bio: Naoyuki Tokuda is an academic researcher from Utsunomiya University. The author has contributed to research in topics: Parsing & Artificial neural network. The author has an h-index of 13 and has co-authored 54 publications receiving 780 citations.


Papers
Patent
08 May 2001
TL;DR: A computer-based information search and retrieval system and method for retrieving textual digital objects that projects documents onto both the reduced document space characterized by the singular value decomposition-based latent semantic structure and its orthogonal space.
Abstract: A computer-based information search and retrieval system and method for retrieving textual digital objects that makes full use of the projections of the documents onto both the reduced document space characterized by the singular value decomposition-based latent semantic structure and its orthogonal space. The resulting system and method has increased robustness, reducing the instability of the traditional keyword search engine caused by synonymy and/or polysemy in natural language, and is therefore particularly suitable for web document searching over a distributed computer network such as the Internet.
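
The idea of using both the reduced space and its orthogonal complement can be illustrated with a small sketch. This is not the patented method: the toy term-document matrix, the choice of `k`, and the helper names `lsi_basis` and `split_vector` are all assumptions made for illustration, and how the two components are combined into a ranking score is not specified in the abstract.

```python
import numpy as np

def lsi_basis(term_doc, k):
    """Top-k left singular vectors of a term-document matrix (hypothetical helper)."""
    u, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    return u[:, :k]

def split_vector(basis, vec):
    """Split vec into its projection onto span(basis) and the orthogonal residual."""
    inside = basis @ (basis.T @ vec)
    residual = vec - inside
    return inside, residual

# Toy matrix (assumed data): rows are terms, columns are documents.
A = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

basis = lsi_basis(A, k=2)
query = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
inside, residual = split_vector(basis, query)

# The two components are orthogonal and sum back to the original vector.
print(np.linalg.norm(inside), np.linalg.norm(residual))
```

The length of `inside` measures how much of the query the latent semantic structure explains, while the length of `residual` measures what the reduced space misses.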

218 citations

Journal Article
TL;DR: A combined use of the projections on and the distances to the DLSI spaces introduced from the differential document vectors improves the adaptability of the LSI (latent semantic indexing) method by capturing unique characteristics of documents.
Abstract: We have developed a new effective probabilistic classifier for document classification by introducing the concept of differential document vectors and DLSI (differential latent semantic indexing) spaces. A combined use of the projections on and the distances to the DLSI spaces introduced from the differential document vectors improves the adaptability of the LSI (latent semantic indexing) method by capturing unique characteristics of documents. Using the intra- and extra-document statistics, both a simple posteriori calculation on a small example and an experiment on a large Reuters-21578 database demonstrate the advantage of the DLSI space-based probabilistic classifier over the LSI space-based classifier in classification performance.

195 citations

Patent
12 Feb 2002
TL;DR: An accurate grammar analyzer based on a so-called POST (part-of-speech tagged) parser and a learners' model, for use in automated language learning applications such as the template-based ICALL (intelligent computer assisted language learning) system.
Abstract: An accurate grammar analyzer based on a so-called POST (part-of-speech tagged) parser and a learners' model for use in automated language learning applications such as the template-based ICALL (intelligent computer assisted language learning) system.

51 citations

Patent
12 Feb 2002
TL;DR: A template automaton and the LSI principle are used together to narrow down the solution space efficiently from among the many example sentences of the databases in a target language, each exploiting its own search space reduction function.
Abstract: A new, more efficient memory translation algorithm facilitating the acquisition of a most appropriate translation in a target language from among those of nearly narrowed-down candidates of translation by separately applying the so-called dimension reducing functions of a template automaton and the LSI (latent semantic indexing) technique. Both the template automaton and the LSI principle play an important role in implementing an efficient process of narrowing down an efficient solution space from among the many example sentences of the databases in a target language by exploiting their respective unique search space reduction function. Once developed into a fully operational system, an expert editor rather than an expert translator can tune up the translation memory system, markedly widening the range of available experts who can utilize the system.

24 citations

Journal Article
TL;DR: A new general probabilistic model of regional voting and national voting is developed, in which the percentage of a candidate's supporters in the nation is regarded as the probability of a voter voting for the candidate.
Abstract: By discarding the previous restrictive weak average distribution assumption on region sizes, we have developed a new general probabilistic model of regional voting (typically, the electoral college) and national voting (known as "direct popular voting" in political science), where we regard the percentage of a candidate's supporters in the nation as the probability of a voter voting for the candidate. Our analysis demonstrates that regional voting is always more stable than national voting, and that the stability margin of regional voting always increases as the size of the partitioned regions decreases down to a certain critical region size. Below that critical size the stability margin starts to decrease, asymptoting to the national voting level as the region size approaches the unit voting cell, because localizing the effects of noise into a restricted number of smaller areas is then no longer effective. Our stability analysis remains valid over the entire range of region sizes for regional voting. We show that regional voting asymptotes to national voting in two extreme limiting cases: when the region size decreases to the voting cell size and when the region size increases to the size of the nation.
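
The probabilistic setup in this abstract, where each voter independently supports the candidate with probability equal to the national support share, can be sketched as follows. This is only a minimal illustration under simplifying assumptions: equal-sized regions, odd counts to avoid ties, and no noise term. It does not reproduce the paper's stability margin, and the function names are hypothetical.

```python
from math import comb

def majority_prob(n, p):
    """Probability that more than half of n independent voters (n odd) back the candidate."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

def national_win_prob(total_voters, p):
    """National voting: one majority count over all voters."""
    return majority_prob(total_voters, p)

def regional_win_prob(num_regions, region_size, p):
    """Regional voting: each equal-sized region is decided by majority,
    then the candidate must carry a majority of regions."""
    q = majority_prob(region_size, p)  # chance of carrying one region
    return majority_prob(num_regions, q)

p = 0.55  # assumed national support share
print(national_win_prob(81, p))
print(regional_win_prob(9, 9, p))

# Limiting cases named in the abstract: one region the size of the nation,
# or regions shrunk to a single voting cell, both reduce to national voting.
print(regional_win_prob(1, 81, p), regional_win_prob(81, 1, p))
```

In this simplified setting the two limiting cases coincide with national voting by construction, which mirrors the asymptotic behavior described above.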

21 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Patent
11 Jan 2011
TL;DR: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions.
Abstract: An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.

1,462 citations

Proceedings Article
06 Jul 2001
TL;DR: This model transforms a source-language parse tree into a target-language string by applying stochastic operations at each node, and produces word alignments that are better than those produced by IBM Model 5.
Abstract: We present a syntax-based statistical translation model. Our model transforms a source-language parse tree into a target-language string by applying stochastic operations at each node. These operations capture linguistic differences such as word order and case marking. Model parameters are estimated in polynomial time using an EM algorithm. The model produces word alignments that are better than those produced by IBM Model 5.

924 citations

Patent
19 Oct 2007
TL;DR: Devices that, in at least certain embodiments, include one or more sensors providing data relating to user activity and at least one processor that causes the device to respond based on the user activity determined, at least in part, through the sensors.
Abstract: The various methods and devices described herein relate to devices which, in at least certain embodiments, may include one or more sensors for providing data relating to user activity and at least one processor for causing the device to respond based on the user activity which was determined, at least in part, through the sensors. The response by the device may include a change of state of the device, and the response may be automatically performed after the user activity is determined.

844 citations