Moshe Dubiner
Researcher at Google
Publications - 9
Citations - 275
Moshe Dubiner is an academic researcher at Google. The author has contributed to research on the following topics: k-nearest neighbors algorithm & Probability distribution. The author has an h-index of 5 and has co-authored 9 publications receiving 266 citations.
Papers
Proceedings Article
Large Scale Parallel Document Mining for Machine Translation
TL;DR: A distributed system is described that reliably mines parallel text from large corpora as cross-language near-duplicate detection, enabled by an initial, low-quality batch translation.
Journal Article
Bucketing Coding and Information Theory for the Statistical High-Dimensional Nearest-Neighbor Problem
TL;DR: Bucketing information is defined and proven to bound the performance of all bucketing codes, and it is shown that on the order of 1/p+ε comparisons suffice, for any ε > 0.
Patent
Parallel document mining
TL;DR: A collection of documents in multiple languages is provided; a group of candidate documents is identified from the collection, where each candidate document in the group shares multiple corresponding rare features; and pairs of candidate documents are evaluated using multiple common features present in the collection.
Patent
Identifying nearest neighbors for machine translation
TL;DR: Technologies relating to identifying nearest neighbors are described. In one implementation, a method includes using first and second collections of n-grams and their associated probabilities to generate a plurality of randomized ranked collections.
Journal Article
A Heterogeneous High-Dimensional Approximate Nearest Neighbor Algorithm
TL;DR: An old-style probabilistic formulation is introduced instead of the more general locality sensitive hashing (LSH) formulation, and it is shown that, at least for sparse problems, it recognizes much more efficient algorithms than the sparseness-destroying LSH random projections.