Showing papers by "Ran El-Yaniv" published in 2005

Open Access

Proceedings Article•DOI•

Multi-way distributional clustering via pairwise interactions

Ron Bekkerman¹, Ran El-Yaniv², Andrew McCallum¹•Institutions (2)

University of Massachusetts Amherst¹, Technion – Israel Institute of Technology²

07 Aug 2005

TL;DR: An extensive empirical study of two-way, three-way and four-way applications of the multi-way distributional clustering (MDC) scheme on six real-world datasets, including 20 Newsgroups and the Enron email collection, shows that the algorithms consistently and significantly outperform previous state-of-the-art information-theoretic clustering algorithms.

Abstract: We present a novel unsupervised learning scheme that simultaneously clusters variables of several types (e.g., documents, words and authors) based on pairwise interactions between the types, as observed in co-occurrence data. In this scheme, multiple clustering systems are generated aiming at maximizing an objective function that measures multiple pairwise mutual information between cluster variables. To implement this idea, we propose an algorithm that interleaves top-down clustering of some variables and bottom-up clustering of the other variables, with a local optimization correction routine. Focusing on document clustering, we present an extensive empirical study of two-way, three-way and four-way applications of our scheme using six real-world datasets, including the 20 Newsgroups (20NG) and the Enron email collection. Our multi-way distributional clustering (MDC) algorithms consistently and significantly outperform previous state-of-the-art information-theoretic clustering algorithms.
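
Illustration (not from the paper): the objective in the abstract is built from mutual information between pairs of cluster variables. The sketch below shows one such pairwise term computed from co-occurrence counts. The function name `clustered_mutual_information`, the toy counts, and the cluster labels are all hypothetical, and the clustering algorithm itself (the interleaved top-down and bottom-up steps and the correction routine) is not reproduced here.

```python
import math
from collections import defaultdict

def clustered_mutual_information(counts, row_cluster, col_cluster):
    """Mutual information (in bits) between two cluster variables.

    counts[(x, y)] is a co-occurrence count (e.g. document x, word y);
    row_cluster and col_cluster map each x and y to a cluster label.
    Hypothetical helper for illustration only.
    """
    # Aggregate raw co-occurrence counts into cluster-level counts.
    joint = defaultdict(float)
    for (x, y), n in counts.items():
        joint[(row_cluster[x], col_cluster[y])] += n
    total = sum(joint.values())

    # Marginal distributions of the two cluster variables.
    p_row = defaultdict(float)
    p_col = defaultdict(float)
    for (cx, cy), n in joint.items():
        p_row[cx] += n / total
        p_col[cy] += n / total

    # I = sum over cluster pairs of p(cx, cy) * log2(p(cx, cy) / (p(cx) p(cy))).
    mi = 0.0
    for (cx, cy), n in joint.items():
        if n > 0:
            p = n / total
            mi += p * math.log2(p / (p_row[cx] * p_col[cy]))
    return mi

# Toy data (made up): d1, d2 use only w1, w2; d3, d4 use only w3, w4.
counts = {(d, w): 1 for d in ("d1", "d2") for w in ("w1", "w2")}
counts.update({(d, w): 1 for d in ("d3", "d4") for w in ("w3", "w4")})

words = {"w1": "P", "w2": "P", "w3": "Q", "w4": "Q"}
docs_matched = {"d1": "A", "d2": "A", "d3": "B", "d4": "B"}
docs_mixed = {"d1": "A", "d3": "A", "d2": "B", "d4": "B"}

good = clustered_mutual_information(counts, docs_matched, words)
bad = clustered_mutual_information(counts, docs_mixed, words)
print(good, bad)
```

On this toy input, the document clustering that matches the word usage keeps more mutual information with the word clusters than the mixed one. A multi-way objective of the kind the abstract describes would combine such terms over the interacting pairs of types; how the terms are combined is not stated in the abstract.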

122 citations

Journal Article•DOI•

Effective transductive learning via objective model selection

Ran El-Yaniv¹, Leonid Gerzon¹•Institutions (1)

Technion – Israel Institute of Technology¹

01 Oct 2005-Pattern Recognition Letters

TL;DR: Empirical examination of a recent clustering-based transductive learning approach, implemented with spectral clustering, on a suite of benchmark datasets from the UCI repository indicates that the new approach is effective and comparable with one of the best-known transductive learning algorithms to date.

14 citations

Journal Article•DOI•

Correcting BLAST e-values for low-complexity segments.

Itai Sharon¹, Aaron Birkland, Kuan Y. Chang, Ran El-Yaniv, Golan Yona•Institutions (1)

Technion – Israel Institute of Technology¹

04 Oct 2005-Journal of Computational Biology

TL;DR: A model is presented, based on divergence measures and statistics of the alignment structure, that corrects BLAST e-values for low-complexity sequences without filtering or excluding them and generates scores that are more effective in distinguishing true similarities from chance similarities.

Abstract: The statistical estimates of BLAST and PSI-BLAST are of extreme importance to determine the biological relevance of sequence matches. While being very effective in evaluating most matches, these es...

12 citations