
Showing papers by "Ran El-Yaniv published in 2002"


Journal ArticleDOI
TL;DR: A statistical learning algorithm for synthesizing new random instances of natural sounds using a granular method of sonic analysis, which views sound as a series of short, distinct bursts of energy.
Abstract: Natural sounds are complex phenomena because they typically contain a mixture of events localized in time and frequency. Moreover, dependencies exist across different time scales and frequency bands, which are important for proper sound characterization. Historically, acoustical theorists have represented sound in numerous ways. Our research has focused on a granular method of sonic analysis, which views sound as a series of short, distinct bursts of energy. Using that theory, this article presents a statistical learning algorithm for synthesizing new random instances of natural sounds.
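The granular view described above treats a sound as many short, windowed bursts of energy that are overlap-added back into a signal. A minimal sketch of that idea follows; the function name and parameters are hypothetical, and grains are drawn uniformly at random purely for illustration, whereas the paper's algorithm learns the statistics governing grain selection:

```python
import numpy as np

def granular_resynthesis(source, n_grains=200, grain_len=441, hop=220, seed=0):
    """Naive granular sketch: draw short windowed grains at random
    positions in a source signal and overlap-add them into a new
    signal. (Illustrative only; a learned model would choose grains
    statistically rather than uniformly.)"""
    rng = np.random.default_rng(seed)
    window = np.hanning(grain_len)          # taper each grain to avoid clicks
    out = np.zeros(hop * n_grains + grain_len)
    for i in range(n_grains):
        start = rng.integers(0, len(source) - grain_len)
        grain = source[start:start + grain_len] * window
        pos = i * hop                        # grains overlap by grain_len - hop
        out[pos:pos + grain_len] += grain
    return out
```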

85 citations


Journal ArticleDOI
TL;DR: A hierarchical clustering algorithm is devised that employs the basic bipartition algorithm in a straightforward divisive manner and copes with the model validation problem using a general cross-validation approach, which may be combined with various hierarchical clustering methods.
Abstract: We present a novel pairwise clustering method. Given a proximity matrix of pairwise relations (i.e., pairwise similarity or dissimilarity estimates) between data points, our algorithm extracts the two most prominent clusters in the data set. The algorithm, which is completely nonparametric, iteratively employs a two-step transformation on the proximity matrix. The first step of the transformation represents each point by its relation to all other data points, and the second step re-estimates the pairwise distances using a statistically motivated proximity measure on these representations. Using this transformation, the algorithm iteratively partitions the data points, until it finally converges to two clusters. Although the algorithm is simple and intuitive, it generates a complex dynamics of the proximity matrices. Based on this bipartition procedure we devise a hierarchical clustering algorithm, which employs the basic bipartition algorithm in a straightforward divisive manner. The hierarchical clustering algorithm copes with the model validation problem using a general cross-validation approach, which may be combined with various hierarchical clustering methods. We further present an experimental study of this algorithm. We examine some of the algorithm's properties and performance on some synthetic and 'standard' data sets. The experiments demonstrate the robustness of the algorithm and indicate that it generates a good clustering partition even when the data is noisy or corrupted.
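The two-step transformation in the abstract can be sketched as follows. This is a toy rendition with illustrative names: the re-estimation measure here (a Gaussian of squared Euclidean distance between relation profiles) is a stand-in for the paper's statistically motivated proximity measure, and reading the final bipartition off a leading eigenvector is an illustrative shortcut rather than the paper's convergence criterion:

```python
import numpy as np

def bipartition(similarity, n_iters=20):
    """Toy two-cluster extraction. Step 1: represent each point by its
    (normalized) row of relations to all other points. Step 2:
    re-estimate pairwise proximity between these representations.
    Iterating tends to polarize the matrix into two blocks."""
    S = np.asarray(similarity, dtype=float)
    for _ in range(n_iters):
        # Step 1: each point's representation is its relation profile.
        P = S / S.sum(axis=1, keepdims=True)
        # Step 2: re-estimate proximities between profiles
        # (stand-in measure, not the paper's).
        D = np.square(P[:, None, :] - P[None, :, :]).sum(axis=2)
        S = np.exp(-D / (D.mean() + 1e-12))
    # Read two clusters off the sign pattern of the leading
    # eigenvector of the mean-centered proximity matrix.
    C = S - S.mean()
    _, vecs = np.linalg.eigh(C)
    return (vecs[:, -1] > 0).astype(int)
```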

40 citations


Journal ArticleDOI
TL;DR: Two new families of list accessing algorithms are presented, one consisting of optimal, 2-competitive, deterministic online algorithms; the degree of locality is shown to have a considerable influence on the algorithms' absolute and relative costs, as well as on their rankings.
Abstract: This paper concerns the online list accessing problem. In the first part of the paper we present two new families of list accessing algorithms. The first family consists of optimal, 2-competitive, deterministic online algorithms. This family, called the MRI (MOVE-TO-RECENT-ITEM) family, includes as members the well-known MOVE-TO-FRONT (MTF) algorithm and the recent, more "conservative" algorithm TIMESTAMP due to Albers. So far MOVE-TO-FRONT and TIMESTAMP were the only algorithms known to be optimal in terms of their competitive ratio. This new family contains a sequence of algorithms {A(i)}, i ≥ 1, where A(1) is equivalent to TIMESTAMP and the limit element A(∞) is MTF. Further, in this class, for each i, the algorithm A(i) is more conservative than algorithm A(i+1) in the sense that it is more reluctant to move an accessed item to the front, thus giving a gradual transition from the conservative TIMESTAMP to the "reckless" MTF. The second new family, called the PRI (PASS-RECENT-ITEM) family, is also infinite and contains TIMESTAMP. We show that most algorithms in this family attain a competitive ratio of 3. In the second, experimental, part of the paper we report the results of an extensive empirical study of the performance of a large set of online list accessing algorithms (including members of our MRI and PRI families). The algorithms' access cost performance was tested with respect to a number of different request sequences. These include sequences of independent requests generated by probability distributions and sequences generated by Markov sources to examine the influence of locality. It turns out that the degree of locality has a considerable influence on the algorithms' absolute and relative costs, as well as on their rankings. In another experiment we tested the algorithms' performance as data compressors in two variants of the compression scheme of Bentley et al.
In both experiments, members of the MRI and PRI families were found to be among the best performing algorithms.
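The MOVE-TO-FRONT rule at one end of the MRI family is easy to state concretely. A minimal sketch (the function name and the position-based cost model are illustrative; the standard convention charges an access its 1-based position in the current list):

```python
def mtf_access_cost(request_sequence, initial_list):
    """Serve requests with MOVE-TO-FRONT: after each access, the
    requested item moves to the head of the list. The cost of an
    access is the item's 1-based position before the move."""
    lst = list(initial_list)
    total_cost = 0
    for item in request_sequence:
        pos = lst.index(item)     # 0-based position of the request
        total_cost += pos + 1     # access cost = position in the list
        lst.pop(pos)              # move the accessed item...
        lst.insert(0, item)       # ...to the front
    return total_cost
```

The "conservative" family members such as TIMESTAMP differ only in being more reluctant to perform that final move to the front.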

36 citations


Book ChapterDOI
19 Aug 2002
TL;DR: In this paper, the authors propose and study a new technique for aggregating an ensemble of bootstrapped classifiers, based on an optimization technique from Mathematical Finance known as Markowitz Mean-Variance Portfolio Optimization.
Abstract: We propose and study a new technique for aggregating an ensemble of bootstrapped classifiers. In this method we seek a linear combination of the base-classifiers such that the weights are optimized to reduce variance. Minimum variance combinations are computed using quadratic programming. This optimization technique is borrowed from Mathematical Finance where it is called Markowitz Mean-Variance Portfolio Optimization. We test the new method on a number of binary classification problems from the UCI repository using a Support Vector Machine (SVM) as the base-classifier learning algorithm. Our results indicate that the proposed technique can consistently outperform Bagging and can dramatically improve the SVM performance even in cases where the Bagging fails to improve the base-classifier.
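The core computation described above is finding combination weights that minimize the variance of the aggregated output. A hedged sketch follows: with only the sum-to-one constraint (and no sign constraints, which would require the quadratic programming the paper uses), the Markowitz minimum-variance weights have a closed form, w ∝ Σ⁻¹1. The function name and the error-matrix input are illustrative, not the paper's interface:

```python
import numpy as np

def min_variance_weights(pred_errors):
    """pred_errors: (n_samples, n_classifiers) matrix of base-classifier
    errors (e.g. residuals on held-out data). Returns the weights
    minimizing the variance of the weighted combination, subject only
    to the weights summing to 1: w = inv(Cov) 1 / (1' inv(Cov) 1)."""
    cov = np.cov(pred_errors, rowvar=False)   # classifier error covariance
    cov += 1e-8 * np.eye(cov.shape[0])        # regularize for invertibility
    ones = np.ones(cov.shape[0])
    w = np.linalg.solve(cov, ones)
    return w / (ones @ w)
```

Note how the weights automatically favor low-variance classifiers and exploit negative correlations between them, which is the intuition behind using the portfolio formulation for ensemble aggregation.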

32 citations


Journal Article
TL;DR: The results indicate that the proposed technique can consistently outperform Bagging and can dramatically improve the SVM performance even in cases where the Bagging fails to improve the base-classifier.
Abstract: We propose and study a new technique for aggregating an ensemble of bootstrapped classifiers. In this method we seek a linear combination of the base-classifiers such that the weights are optimized to reduce variance. Minimum variance combinations are computed using quadratic programming. This optimization technique is borrowed from Mathematical Finance where it is called Markowitz Mean-Variance Portfolio Optimization. We test the new method on a number of binary classification problems from the UCI repository using a Support Vector Machine (SVM) as the base-classifier learning algorithm. Our results indicate that the proposed technique can consistently outperform Bagging and can dramatically improve the SVM performance even in cases where the Bagging fails to improve the base-classifier.

25 citations


01 Jan 2002
TL;DR: A statistical learning algorithm is presented for synthesizing new random instances of natural sounds by applying wavelet analysis to view sound as a series of short, distinct bursts of energy.
Abstract: Natural sounds are complex phenomena because they typically contain a mixture of events localized in time and frequency. Moreover, dependencies exist across different time scales and frequency bands, which are important for proper sound characterization. Historically, acoustical theorists have represented sound in numerous ways. Our research has focused on a granular method of sonic analysis, which views sound as a series of short, distinct bursts of energy. Using that theory, this article presents a statistical learning algorithm for synthesizing new random instances of natural sounds. Applying wavelet analysis, our algorithm captures the