Tapas Kanungo
Researcher at Microsoft
Publications - 85
Citations - 9156
Tapas Kanungo is an academic researcher at Microsoft. The author has contributed to research on topics including optical character recognition and image segmentation. The author has an h-index of 29 and has co-authored 84 publications receiving 8528 citations. Previous affiliations of Tapas Kanungo include Pennsylvania State University and the University of Maryland, College Park.
Papers
Journal Article
An efficient k-means clustering algorithm: analysis and implementation
Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, Angela Y. Wu
TL;DR: This work presents a simple and efficient implementation of Lloyd's k-means clustering algorithm, which it calls the filtering algorithm, and establishes the practical efficiency of the algorithm's running time.
Proceedings Article
A local search approximation algorithm for k-means clustering
Tapas Kanungo, David M. Mount, Nathan S. Netanyahu, Christine D. Piatko, Ruth Silverman, Angela Y. Wu
TL;DR: This work considers the question of whether there exists a simple and practical approximation algorithm for k-means clustering, and presents a local improvement heuristic based on swapping centers in and out that yields a (9+ε)-approximation algorithm.
Proceedings Article
SemTag and seeker: bootstrapping the semantic web via automated semantic annotation
Stephen Dill, Nadav Eiron, David Gibson, Daniel Gruhl, Ramanathan V. Guha, Anant Jhingran, Tapas Kanungo, Sridhar Rajagopalan, Andrew Tomkins, John A. Tomlin, Jason Zien
TL;DR: It is argued that automated large-scale semantic tagging of ambiguous content can bootstrap and accelerate the creation of the semantic web.
Proceedings Article
Document structure analysis algorithms: a literature survey
TL;DR: This paper provides a detailed survey of past work on document structure analysis algorithms and summarizes the limitations of past approaches.
Proceedings Article
Global and local document degradation models
TL;DR: An illumination model is described to account for the nonlinear intensity change occurring across a page in a perspective-distorted document.