Author

Timothy M. Chan

Bio: Timothy M. Chan is an academic researcher from University of Illinois at Urbana–Champaign. The author has contributed to research in topics: Computational geometry & Convex hull. The author has an h-index of 52 and has co-authored 293 publications receiving 8508 citations. Previous affiliations of Timothy M. Chan include Johns Hopkins University & University of Waterloo.


Papers
Journal ArticleDOI
TL;DR: This work presents simple output-sensitive algorithms that construct the convex hull of a set of n points in two or three dimensions in worst-case optimal O(n log h) time and O(n) space, where h denotes the number of vertices of the convex hull.
Abstract: We present simple output-sensitive algorithms that construct the convex hull of a set of n points in two or three dimensions in worst-case optimal O(n log h) time and O(n) space, where h denotes the number of vertices of the convex hull.
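
To make "output-sensitive" concrete, here is a minimal Python sketch of gift wrapping (Jarvis march), whose O(nh) running time also depends on the hull size h. It is not the algorithm of the paper above, which achieves the better O(n log h) bound; the function names `cross` and `gift_wrap` are chosen here for illustration only.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrap(points: List[Point]) -> List[Point]:
    """Return hull vertices in counterclockwise order, in O(n * h) time."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    start = pts[0]  # leftmost point (lowest among ties) is always on the hull
    hull = []
    current = start
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            # Prefer p if it lies to the right of current->candidate, or if it
            # is collinear with it but farther away (skips non-vertex points).
            if turn < 0 or (turn == 0 and dist2(current, p) > dist2(current, candidate)):
                candidate = p
        current = candidate
        if current == start:
            return hull
```

Each of the h iterations of the outer loop scans all n points once, which is where the O(nh) bound comes from.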

355 citations

Journal ArticleDOI
TL;DR: A cohort of 279 head and neck cancers with next generation RNA and DNA sequencing is profiled to provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.
Abstract: Previous studies have established that a subset of head and neck tumors contains human papillomavirus (HPV) sequences and that HPV-driven head and neck cancers display distinct biological and clinical features. HPV is known to drive cancer by the actions of the E6 and E7 oncoproteins, but the molecular architecture of HPV infection and its interaction with the host genome in head and neck cancers have not been comprehensively described. We profiled a cohort of 279 head and neck cancers with next generation RNA and DNA sequencing and show that 35 (12.5%) tumors displayed evidence of high-risk HPV types 16, 33, or 35. Twenty-five cases had integration of the viral genome into one or more locations in the human genome with statistical enrichment for genic regions. Integrations had a marked impact on the human genome and were associated with alterations in DNA copy number, mRNA transcript abundance and splicing, and both inter- and intrachromosomal rearrangements. Many of these events involved genes with documented roles in cancer. Cancers with integrated vs. nonintegrated HPV displayed different patterns of DNA methylation and both human and viral gene expressions. Together, these data provide insight into the mechanisms by which HPV interacts with the human genome beyond expression of viral oncoproteins and suggest that specific integration events are an integral component of viral oncogenesis.

311 citations

Proceedings ArticleDOI
13 Jun 2011
TL;DR: A randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n log n) plus the output size is given, which resolves two open problems: given a set of n axis-aligned rectangles in the plane, the authors can report all k enclosure pairs in O(n lg n + k) expected time; and given a set of n points in 4-d, they can find all maximal points (points not dominated by any other points) in O(n lg n) expected time.
Abstract: We present a number of new results on one of the most extensively studied topics in computational geometry, orthogonal range searching. All our results are in the standard word RAM model. We present two data structures for 2-d orthogonal range emptiness. The first achieves O(n lg lg n) space and O(lg lg n) query time, assuming that the n given points are in rank space. This improves the previous results by Alstrup, Brodal, and Rauhe (FOCS'00), with O(n lg^ε n) space and O(lg lg n) query time, or with O(n lg lg n) space and O(lg^2 lg n) query time. Our second data structure uses O(n) space and answers queries in O(lg^ε n) time. The best previous O(n)-space data structure, due to Nekrich (WADS'07), answers queries in O(lg n / lg lg n) time. We give a data structure for 3-d orthogonal range reporting with O(n lg^{1+ε} n) space and O(lg lg n + k) query time for points in rank space, for any constant ε > 0. This improves the previous results by Afshani (ESA'08), Karpinski and Nekrich (COCOON'09), and Chan (SODA'11), with O(n lg^3 n) space and O(lg lg n + k) query time, or with O(n lg^{1+ε} n) space and O(lg^2 lg n + k) query time. Consequently, we obtain improved upper bounds for orthogonal range reporting in all constant dimensions above 3. Our approach also leads to a new data structure for 2-d orthogonal range minimum queries with O(n lg^ε n) space and O(lg lg n) query time for points in rank space. We give a randomized algorithm for 4-d offline dominance range reporting/emptiness with running time O(n log n) plus the output size. This resolves two open problems (both appeared in Preparata and Shamos' seminal book): (a) given a set of n axis-aligned rectangles in the plane, we can report all k enclosure pairs (i.e., pairs (r1, r2) where rectangle r1 completely encloses rectangle r2) in O(n lg n + k) expected time; (b) given a set of n points in 4-d, we can find all maximal points (points not dominated by any other points) in O(n lg n) expected time. The most recent previous development on (a) was reported back in SoCG'95 by Gupta, Janardan, Smid, and Dasgupta, whose main result was an O([n lg n + k] lg lg n) algorithm. The best previous result on (b) was an O(n lg n lg lg n) algorithm due to Gabow, Bentley, and Tarjan, from STOC'84! As a consequence, we also obtain the current-record time bound for the maxima problem in all constant dimensions above 4.
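
The maxima problem mentioned in the abstract can be illustrated with a brute-force Python sketch that follows the definition directly (a point is maximal if no other point dominates it). This quadratic-time version only clarifies the problem statement; it is not the O(n lg n) expected-time algorithm of the paper, and the names `dominates` and `maximal_points` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is at least q in every coordinate and p differs from q."""
    return all(a >= b for a, b in zip(p, q)) and tuple(p) != tuple(q)


def maximal_points(points: List[Point]) -> List[Point]:
    """Brute-force maxima in O(d * n^2) time: keep the undominated points."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    sample = [(1, 2, 3, 4), (2, 3, 4, 5), (5, 1, 1, 1), (0, 0, 0, 0)]
    print(maximal_points(sample))
```

The same `dominates` test works in any dimension, since it simply compares coordinates pairwise.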

233 citations

Proceedings ArticleDOI
11 Jun 2007
TL;DR: A new algorithm with running time approaching O(n^3 / log^2 n), which improves all known algorithms for general real-weighted dense graphs and is perhaps close to the best result possible without using fast matrix multiplication, modulo a few log log n factors.
Abstract: In the first part of the paper, we reexamine the all-pairs shortest paths (APSP) problem and present a new algorithm with running time approaching O(n^3 / log^2 n), which improves all known algorithms for general real-weighted dense graphs and is perhaps close to the best result possible without using fast matrix multiplication, modulo a few log log n factors. In the second part of the paper, we use fast matrix multiplication to obtain truly subcubic APSP algorithms for a large class of "geometrically weighted" graphs, where the weight of an edge is a function of the coordinates of its vertices. For example, for graphs embedded in Euclidean space of a constant dimension d, we obtain a time bound near O(n^{3-(3-ω)/(2d+4)}), where ω
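
For context on the cubic bound, the classical Floyd–Warshall algorithm solves APSP on dense real-weighted graphs in O(n^3) time. The Python sketch below shows that baseline only; it is not the algorithm of the paper, and it assumes the graph is given as an adjacency matrix with `math.inf` for missing edges and no negative cycles.

```python
import math
from typing import List


def floyd_warshall(weights: List[List[float]]) -> List[List[float]]:
    """Return the matrix of shortest-path distances in O(n^3) time.

    weights[i][j] is the weight of edge i->j, math.inf if there is no such
    edge, and weights[i][i] is 0.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):  # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist
```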

212 citations

Journal ArticleDOI
TL;DR: A different algorithm, based on geometric separators and requiring only linear space, is described; it can also be applied to piercing, yielding the first PTAS for that problem.

172 citations


Cited by
Proceedings ArticleDOI
23 May 1998
TL;DR: In this paper, the authors present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces; for data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d.
Abstract: We present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces. For data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sub-linear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the approximate minimum spanning tree. The article is based on the material from the authors' STOC'98 and FOCS'01 papers. It unifies, generalizes and simplifies the results from those papers.

4,478 citations

Journal ArticleDOI
TL;DR: FastJet is a C++ package that provides a broad range of jet finding and analysis tools, including efficient native implementations of all widely used 2→1 sequential recombination jet algorithms for pp and e+e− collisions.
Abstract: FastJet is a C++ package that provides a broad range of jet finding and analysis tools. It includes efficient native implementations of all widely used 2→1 sequential recombination jet algorithms for pp and e+e− collisions, as well as access to 3rd party jet algorithms through a plugin mechanism, including all currently used cone algorithms. FastJet also provides means to facilitate the manipulation of jet substructure, including some common boosted heavy-object taggers, as well as tools for estimation of pileup and underlying-event noise levels, determination of jet areas and subtraction or suppression of noise in jets.

3,713 citations

Proceedings Article
07 Sep 1999
TL;DR: Experimental results indicate that the novel scheme for approximate similarity search based on hashing scales well even for a relatively large number of dimensions, and provide evidence that the method improves running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition.
Abstract: The nearest- or near-neighbor query problems arise in a large variety of database applications, usually in the context of similarity searching. Of late, there has been increasing interest in building search/index structures for performing similarity search over high-dimensional data, e.g., image databases, document collections, time-series databases, and genome databases. Unfortunately, all known techniques for solving this problem fall prey to the "curse of dimensionality." That is, the data structures scale poorly with data dimensionality; in fact, if the number of dimensions exceeds 10 to 20, searching in k-d trees and related structures involves the inspection of a large fraction of the database, thereby doing no better than brute-force linear search. It has been suggested that since the selection of features and the choice of a distance metric in typical applications is rather heuristic, determining an approximate nearest neighbor should suffice for most practical purposes. In this paper, we examine a novel scheme for approximate similarity search based on hashing. The basic idea is to hash the points from the database so as to ensure that the probability of collision is much higher for objects that are close to each other than for those that are far apart. We provide experimental evidence that our method gives significant improvement in running time over other methods for searching in high-dimensional spaces based on hierarchical tree decomposition. Experimental results also indicate that our scheme scales well even for a relatively large number of dimensions (more than 50). Proceedings of the 25th VLDB Conference, Edinburgh, Scotland, 1999.
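
The basic idea of hashing so that nearby points collide more often can be sketched in Python with bit sampling over binary vectors under Hamming distance. This is a simplified illustration under stated assumptions (binary inputs, a hypothetical class name `BitSamplingLSH`, arbitrary parameter values), not a reproduction of the scheme evaluated in the paper.

```python
import random
from collections import defaultdict
from typing import List, Optional, Sequence, Tuple

Bits = Tuple[int, ...]


def hamming(p: Sequence[int], q: Sequence[int]) -> int:
    """Number of coordinates in which p and q differ."""
    return sum(a != b for a, b in zip(p, q))


class BitSamplingLSH:
    """Several hash tables, each keyed by a random subset of coordinates.

    Two vectors at Hamming distance r in dimension d agree on one sampled
    coordinate with probability 1 - r/d, so close vectors collide more often.
    """

    def __init__(self, dim: int, bits_per_table: int, num_tables: int, seed: int = 0):
        rng = random.Random(seed)
        self.projections = [
            [rng.randrange(dim) for _ in range(bits_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.points: List[Bits] = []

    def _key(self, point: Sequence[int], table_index: int) -> Bits:
        return tuple(point[i] for i in self.projections[table_index])

    def add(self, point: Sequence[int]) -> None:
        index = len(self.points)
        self.points.append(tuple(point))
        for t, table in enumerate(self.tables):
            table[self._key(point, t)].append(index)

    def query(self, point: Sequence[int]) -> Optional[Bits]:
        """Return the closest stored point among colliding candidates, if any."""
        candidates = set()
        for t, table in enumerate(self.tables):
            candidates.update(table.get(self._key(point, t), ()))
        if not candidates:
            return None
        return min((self.points[i] for i in candidates), key=lambda p: hamming(p, point))
```

A query inspects only the buckets it hashes to, so it may miss the true nearest neighbor; using more tables raises the chance of a collision with a close point at the cost of extra space.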

3,705 citations

Journal ArticleDOI
TL;DR: In this paper, it was shown that given an integer k ≥ 1, (1 + ϵ)-approximation to the k nearest neighbors of q can be computed in additional O(kd log n) time.
Abstract: Consider a set S of n data points in real d-dimensional space, R^d, where distances are measured using any Minkowski metric. In nearest neighbor searching, we preprocess S into a data structure, so that given any query point q ∈ R^d, the closest point of S to q can be reported quickly. Given any positive real ϵ, data point p is a (1 + ϵ)-approximate nearest neighbor of q if its distance from q is within a factor of (1 + ϵ) of the distance to the true nearest neighbor. We show that it is possible to preprocess a set of n points in R^d in O(dn log n) time and O(dn) space, so that given a query point q ∈ R^d, and ϵ > 0, a (1 + ϵ)-approximate nearest neighbor of q can be computed in O(c_{d,ϵ} log n) time, where c_{d,ϵ} ≤ d⌈1 + 6d/ϵ⌉^d is a factor depending only on dimension and ϵ. In general, we show that given an integer k ≥ 1, (1 + ϵ)-approximations to the k nearest neighbors of q can be computed in additional O(kd log n) time.
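
The definition of a (1 + ϵ)-approximate nearest neighbor can be checked directly by brute force. The Python sketch below only tests the definition against a linear scan; it is not the data structure of the paper, and the function names and the Minkowski exponent parameter `r` are chosen here for illustration.

```python
from typing import List, Sequence


def minkowski(p: Sequence[float], q: Sequence[float], r: float = 2.0) -> float:
    """Minkowski (L_r) distance between p and q; r = 2 is Euclidean."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)


def is_approx_nn(
    candidate: Sequence[float],
    query: Sequence[float],
    data: List[Sequence[float]],
    eps: float,
    r: float = 2.0,
) -> bool:
    """True if candidate is within a factor (1 + eps) of the true nearest distance."""
    true_nn_dist = min(minkowski(p, query, r) for p in data)
    return minkowski(candidate, query, r) <= (1.0 + eps) * true_nn_dist


if __name__ == "__main__":
    data = [(0.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
    query = (1.0, 0.0)
    # The true nearest neighbor is (0, 0) at distance 1; (3, 0) is at distance 2.
    print(is_approx_nn((3.0, 0.0), query, data, eps=1.5))
    print(is_approx_nn((3.0, 0.0), query, data, eps=0.5))
```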

2,813 citations

Journal ArticleDOI
TL;DR: Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.
Abstract: In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerged for reasoning about algorithms that work within these constraints on space, time, and number of passes. Some of the methods rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity. The applications for this scenario include IP network traffic analysis, mining text message streams and processing massive data sets in general. Researchers in Theoretical Computer Science, Databases, IP Networking and Computer Systems are working on the data stream challenges. This article is an overview and survey of data stream algorithmics and is an updated version of [1].
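
As one concrete example of a single-pass, sublinear-space computation of the kind the survey covers, the Python sketch below implements the classical Misra–Gries frequent-items summary. The choice of this particular algorithm is an assumption made for illustration; the abstract itself does not name it.

```python
from typing import Dict, Hashable, Iterable


def misra_gries(stream: Iterable[Hashable], k: int) -> Dict[Hashable, int]:
    """One pass over the stream using at most k - 1 counters.

    Every item occurring more than n/k times in a stream of length n is
    guaranteed to be among the returned keys; the counts are underestimates.
    """
    counters: Dict[Hashable, int] = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement all, dropping those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

With k = 2 the summary reduces to a single counter that retains any strict majority element of the stream.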

1,598 citations