Rajeev Rastogi

Researcher at Amazon.com

Publications -  272
Citations -  22554

Rajeev Rastogi is an academic researcher from Amazon.com. The author has contributed to research in topics including approximation algorithms and data stream mining. The author has an h-index of 68 and has co-authored 271 publications receiving 21676 citations. Previous affiliations of Rajeev Rastogi include Alcatel-Lucent and the University of Manitoba.

Papers
Proceedings Article

CURE: an efficient clustering algorithm for large databases

TL;DR: This work proposes a new clustering algorithm called CURE that is more robust to outliers and identifies clusters having non-spherical shapes and wide variances in size. It demonstrates that random sampling and partitioning enable CURE not only to outperform existing algorithms but also to scale well for large databases without sacrificing clustering quality.
Journal Article

Efficient algorithms for mining outliers from large data sets

TL;DR: A novel formulation for distance-based outliers is proposed, based on the distance of a point from its kth nearest neighbor. Points are ranked by this distance, and the top n points in the ranking are declared to be outliers.
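
The ranking described in this summary can be illustrated with a minimal brute-force sketch. It is not the paper's efficient algorithm: it compares every pair of points, assumes Euclidean distance, and breaks ties by the lower index. The function names `kth_nn_distance` and `top_n_outliers` are hypothetical, chosen for illustration only.

```python
import math


def kth_nn_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor (excluding itself)."""
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]


def top_n_outliers(points, k, n):
    """Return the indices of the n points with the largest k-th NN distance.

    Brute force: O(N^2) distance computations. Ties are broken by the
    lower index (an assumption made here, not taken from the paper).
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scored = [(kth_nn_distance(points, i, k), i) for i in range(len(points))]
    # Largest k-th NN distance first; lower index first on ties.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scored[:n]]


if __name__ == "__main__":
    # Four points in a tight square plus one far-away point.
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(top_n_outliers(data, k=2, n=1))
```

In the toy data, the point at (10, 10) is far from its second-nearest neighbor, while every point in the square has a second-nearest neighbor at distance 1, so it ranks first.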
Journal Article

ROCK: a robust clustering algorithm for categorical attributes

TL;DR: This paper develops a robust hierarchical clustering algorithm, ROCK, that employs links and not distances when merging clusters, and indicates that ROCK not only generates better quality clusters than traditional algorithms but also exhibits good scalability properties.
Proceedings Article

ROCK: a robust clustering algorithm for categorical attributes

TL;DR: This work develops a robust hierarchical clustering algorithm, ROCK, that employs links and not distances when merging clusters, and shows that ROCK not only generates better quality clusters than traditional algorithms, but also exhibits good scalability properties.
Journal Article

Cure: an efficient clustering algorithm for large databases

TL;DR: It is demonstrated that random sampling and partitioning enable CURE not only to outperform existing algorithms but also to scale well for large databases without sacrificing clustering quality.