# Andrew Rau-Chaplin

Other affiliations: Halifax, Carleton University, Bell-Northern Research

Bio: Andrew Rau-Chaplin is an academic researcher from Dalhousie University. He has contributed to research in topics including online analytical processing and parallel algorithms. He has an h-index of 21 and has co-authored 127 publications receiving 1,779 citations. Previous affiliations of Andrew Rau-Chaplin include Halifax and Carleton University.


##### Papers


01 Jul 1993

TL;DR: In this article, the authors present scalable algorithms for a number of geometric problems, such as the lower envelope of line segments, 2D-nearest neighbour, 3D-maxima, 2D-weighted dominance counting, and the area of the union of rectangles.

Abstract: Whereas most of the literature assumes that the number of processors p is a function of the problem size n, in scalable algorithms p becomes a parameter of the time complexity. This is a more realistic model of real parallel machines and yields optimal algorithms for the case that n ≥ H(p), where H is a function depending on the architecture of the interconnection network. In this paper we present scalable algorithms for a number of geometric problems, namely the lower envelope of line segments, 2D-nearest neighbour, 3D-maxima, 2D-weighted dominance counting, area of the union of rectangles, and 2D-convex hull. The main idea of these algorithms is to decompose the problem into p subproblems of size O(F(n,p) + f(p)), with f(p) ≤ F(n,p), which can be solved independently using optimal sequential algorithms. For each problem we present a spatial decomposition scheme based on some geometric observations. The decomposition schemes have in common that they can be computed by globally sorting the entire data set at most twice. The data redundancy of f(p) duplicates of data elements per processor does not increase the asymptotic time complexity and ranges, for the algorithms presented in this paper, from p to p². The algorithms do not depend on a specific architecture; they are easy to implement and efficient in practice, as experiments show.
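
As a concrete illustration of this decomposition scheme, here is a minimal single-process sketch in Python (hypothetical names; a simulation, not the paper's implementation) for one of the listed problems, the 2D upper convex hull: one global sort, p independent sequential subproblems, and a sequential combination of the partial results.

```python
# Sketch of the coarse-grained decomposition pattern (illustrative only):
# sort globally once, split into p slabs of ~n/p points, solve each slab
# with an optimal sequential algorithm, then combine the partial results.

def cross(o, a, b):
    """Cross product of vectors o->a and o->b (positive = left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def upper_hull(points):
    """Monotone-chain upper hull; expects points sorted by x."""
    hull = []
    for pt in points:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], pt) >= 0:
            hull.pop()
        hull.append(pt)
    return hull

def coarse_grained_upper_hull(points, p):
    pts = sorted(points)                          # one global sort
    size = (len(pts) + p - 1) // p                # ~n/p points per "processor"
    slabs = [pts[i * size:(i + 1) * size] for i in range(p)]
    local = [upper_hull(s) for s in slabs if s]   # independent sequential work
    merged = [q for h in local for q in h]        # concatenation stays x-sorted
    return upper_hull(merged)                     # sequential combination step
```

The combination step is valid because every point on the global upper hull is also on the upper hull of its own slab.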

187 citations


TL;DR: This work presents O(Tsequential/p + Ts(n, p)) time scalable parallel algorithms for several computational geometry problems; the algorithms use only a small number of very large messages, which greatly reduces the overhead of the communication protocol between processors.

Abstract: We study scalable parallel computational geometry algorithms for the coarse grained multicomputer model: p processors solving a problem on n data items, where each processor has O(n/p) ≫ O(1) local memory and all processors are connected via some arbitrary interconnection network (e.g. mesh, hypercube, fat tree). We present O(Tsequential/p + Ts(n, p)) time scalable parallel algorithms for several computational geometry problems, where Ts(n, p) refers to the time of a global sort operation. Our results are independent of the multicomputer's interconnection network. Their time complexities become optimal when Tsequential/p dominates Ts(n, p) or when Ts(n, p) is optimal. This is the case for several standard architectures, including meshes and hypercubes, and a wide range of ratios n/p that include many of the currently available machine configurations. Our methods also have some important practical advantages: for interprocessor communication, they use only a small fixed number of calls to one global routing operation, global sort, and all other programming is in the sequential domain. Furthermore, our algorithms use only a small number of very large messages, which greatly reduces the overhead of the communication protocol between processors. (Note, however, that our time complexities account for the lengths of messages.) Experiments show that our methods are easy to implement and give good timing results.
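
The communication structure described here is easy to simulate. The following Python sketch (assumed names; a single-machine stand-in, not MPI code) models one CGM "superstep": the only interprocessor operation is a global sort that redistributes data evenly, after which each simulated processor does purely sequential work on its O(n/p) share.

```python
def global_sort(parts, key=None):
    """Stand-in for the CGM global sort: merge all local data,
    sort it, and redistribute it evenly over the p 'processors'."""
    p = len(parts)
    everything = sorted((x for part in parts for x in part), key=key)
    chunk = (len(everything) + p - 1) // p
    return [everything[i * chunk:(i + 1) * chunk] for i in range(p)]

def cgm_step(parts, local_phase, key=None):
    """One superstep: global sort (Ts(n, p)) + local work (Tsequential/p)."""
    return [local_phase(part) for part in global_sort(parts, key)]

# Example: globally rank n items spread over p = 4 processors.
parts = [[9, 2], [7, 1], [8, 3], [5, 4]]
print(cgm_step(parts, local_phase=list))  # [[1, 2], [3, 4], [5, 7], [8, 9]]
```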

106 citations


TL;DR: The potential of parallelism when applied to the bounded-tree search phase of FPT algorithms is demonstrated, thereby allowing even larger problem instances to be solved in practice.

Abstract: Fixed-parameter tractability (FPT) techniques have recently been successful in solving NP-complete problem instances of practical importance which were too large to be solved with previous methods. In this paper, we show how to enhance this approach through the addition of parallelism, thereby allowing even larger problem instances to be solved in practice. More precisely, we demonstrate the potential of parallelism when applied to the bounded-tree search phase of FPT algorithms. We apply our methodology to the k-VERTEX COVER problem which has important applications in, for example, the analysis of multiple sequence alignments for computational biochemistry. We have implemented our parallel FPT method for the k-VERTEX COVER problem using C and the MPI communication library, and tested it on a 32-node Beowulf cluster. This is the first experimental examination of parallel FPT techniques. As part of our experiments, we solved larger instances of k-VERTEX COVER than in any previously reported implementations. For example, our code can solve problem instances with k≥400 in less than 1.5 h.
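
The bounded search tree being parallelized can be sketched compactly. Below is a minimal sequential Python version of the classic k-VERTEX COVER branching rule (illustrative; not the authors' C/MPI code): pick any uncovered edge (u, v); since one endpoint must be in the cover, branch on taking u or taking v with budget k − 1. The search tree has at most 2^k leaves, and it is these subtrees that the paper distributes across processors.

```python
# Minimal bounded search tree for k-VERTEX COVER (illustrative sketch).

def vertex_cover(edges, k):
    """Return a vertex cover of size <= k, or None if none exists."""
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = next(iter(edges))          # any uncovered edge
    for chosen in (u, v):             # branch: one endpoint must be chosen
        rest = {e for e in edges if chosen not in e}
        sub = vertex_cover(rest, k - 1)
        if sub is not None:
            return sub | {chosen}
    return None

# Example: a triangle needs two vertices.
triangle = {(1, 2), (2, 3), (1, 3)}
print(vertex_cover(triangle, 2))      # a cover such as {1, 2}
print(vertex_cover(triangle, 1))      # None
```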

103 citations


TL;DR: The method is a heuristic version of a fixed-parameter tractability (FPT) approach whose running time behaves similarly to that of FPT algorithms; it was able to quickly compute dSPR for the majority of trees that were part of a study of LGT in 144 prokaryotic genomes.

Abstract: The subtree prune and regraft distance (dSPR) between phylogenetic trees is important both as a general means of comparing phylogenetic tree topologies and as a measure of lateral gene transfer (LGT). Although there has been extensive study on the computation of dSPR and similar metrics between rooted trees, much less is known about SPR distances for unrooted trees, which often arise in practice when the root is unresolved. We show that unrooted SPR distance computation is NP-hard and verify which techniques from related work can and cannot be applied. We then present an efficient heuristic algorithm for this problem and benchmark it on a variety of synthetic datasets. Our algorithm computes the exact SPR distance between unrooted trees, and the heuristic element is only with respect to the algorithm's computation time. Our method is a heuristic version of a fixed-parameter tractability (FPT) approach and our experiments indicate that the running time behaves similarly to that of FPT algorithms. For real data sets, our algorithm was able to quickly compute dSPR for the majority of trees that were part of a study of LGT in 144 prokaryotic genomes. Our analysis of its performance, especially with respect to searching and reduction rules, is applicable to computing many related distance measures.

1. Introduction. Phylogenetic trees are used to describe evolutionary relationships. DNA or protein sequences are associated with the leaves of the tree and the internal nodes correspond to speciation or gene duplication events. In order to model ancestor-descendant relationships on the tree, a direction must be associated with its edges by assigning a root. Often, insufficient information exists to determine the root and the tree is left unrooted. Unrooted trees still provide a notion of evolutionary relationship between organisms even if the direction of descent remains unknown. The phylogenetic tree representation has recently come under scrutiny, with critics claiming that it is too simple to properly model microbial evolution, particularly in the presence of lateral gene transfer (LGT) events (Doolittle, 1999). An LGT is the transfer of genetic material between species by means other than inheritance, and thus cannot be represented in a tree as it would create a cycle. The prevalence of LGT events in microbial evolution can, however, still be studied using phylogenetic trees. Given a pair of trees describing the same sets of species, each constructed using different sets of genes, an LGT event corresponds to a displacement of a common subtree, referred to as an SPR operation. The SPR distance, denoted dSPR, is the minimum number of SPR operations that explain the topological differences between a pair of trees. It is equivalent to the number of transfers in the most parsimonious LGT scenario (Beiko and Hamilton, 2006). In general, dSPR can be used as a measure of the topological difference between two trees, e.g. for comparing the outputs of different tree construction algorithms. Tree bisection and reconnection (TBR) is a generalization of SPR that allows the pruned subtree to be rerooted before being regrafted. Computation of the TBR distance (dTBR) was shown to be NP-hard (nondeterministic polynomial-time hard) by Allen and Steel (2001), who also provided two rules that reduce two input trees to a size that is a linear function of dTBR without altering their distance. These rules, which reduce common chains and subtrees, also form the basis of algorithms that compute the SPR distance between rooted trees (drSPR) (Bordewich and Semple, 2004) as well as the hybridization number (h) (Bordewich et al., 2007); see Section 3.3. Such algorithms proceed as follows. First the distance problem is shown to be equivalent to counting components of a maximum agreement forest,
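
To give a flavor of the reduction rules mentioned above, here is a toy Python sketch (assumed nested-tuple representation of rooted binary trees; the paper itself works with unrooted trees and also reduces common chains) of the simplest subtree case: a cherry, i.e. a pair of leaves with a common parent, that appears in both trees is contracted to a single new leaf in each, without changing their distance.

```python
def cherries(tree, found=None):
    """Collect cherries: internal nodes whose two children are both leaves."""
    if found is None:
        found = set()
    if isinstance(tree, tuple):
        a, b = tree
        if isinstance(a, str) and isinstance(b, str):
            found.add(frozenset((a, b)))
        cherries(a, found)
        cherries(b, found)
    return found

def contract(tree, cherry, label):
    """Replace the matching cherry by a single leaf named `label`."""
    if isinstance(tree, str):
        return tree
    a, b = tree
    if isinstance(a, str) and isinstance(b, str) and frozenset((a, b)) == cherry:
        return label
    return (contract(a, cherry, label), contract(b, cherry, label))

def reduce_common_cherries(t1, t2):
    """Repeatedly contract cherries that appear in both trees."""
    n = 0
    common = cherries(t1) & cherries(t2)
    while common:
        c, label = common.pop(), "X%d" % n
        t1, t2, n = contract(t1, c, label), contract(t2, c, label), n + 1
        common = cherries(t1) & cherries(t2)
    return t1, t2

# Two trees that differ only above the shared cherry ("a", "b"):
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
print(reduce_common_cherries(t1, t2))
# ((('X0', 'c'), 'd'), (('X0', 'd'), 'c'))
```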

93 citations


TL;DR: This paper shows how intuitive notions can be translated into a well-founded Bayesian approach to object recognition, gives precise formulas for the optimal weight functions that should be used in hash space, and demonstrates the validity of the approach by performing similarity-invariant object recognition.

Abstract: Geometric hashing methods provide an efficient approach to indexing from image features into a database of models. The hash functions that have typically been used involve quantization of the values, which can result in nongraceful degradation of the performance of the system in the presence of noise. Intuitively, it is desirable to replace the quantization of hash values and the resulting binning of hash entries by a method that gives increasingly less weight to a hash table entry as a hashed feature becomes more distant from the hash entry position. In this paper, we show how these intuitive notions can be translated into a well-founded Bayesian approach to object recognition and give precise formulas for the optimal weight functions that should be used in hash space. These extensions allow the geometric hashing method to be viewed as a Bayesian maximum-likelihood framework. We demonstrate the validity of the approach by performing similarity-invariant object recognition using models obtained from drawings of military aircraft and automobiles and test images from real-world grayscale images of the same aircraft and automobile types. Our experimental results represent a complete object recognition system, since the feature extraction process is automated. Our system is scalable and works rapidly and very efficiently on an 8K-processor CM-2, and the quality of results using similarity-invariant model matching is excellent.
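
The core idea of replacing hard bins with smooth weights can be sketched briefly. The Python snippet below (hypothetical names; a Gaussian kernel is assumed for illustration, whereas the paper derives the optimal weight function from a Bayesian noise model) accumulates soft votes for each model instead of incrementing a single quantized bin.

```python
import math

def weighted_votes(observed, table, sigma=0.05):
    """Accumulate soft votes for each model given observed hash points.

    table: list of (hash_point, model_id) pairs built offline from models;
    observed: hash points computed from image features with the same basis.
    """
    votes = {}
    for q in observed:
        for (h, model) in table:
            d2 = (q[0] - h[0]) ** 2 + (q[1] - h[1]) ** 2
            w = math.exp(-d2 / (2 * sigma * sigma))   # smooth, not binned
            votes[model] = votes.get(model, 0.0) + w
    return max(votes, key=votes.get) if votes else None
```

As sigma → 0 this degenerates to hard binning, which is exactly the nongraceful behavior the weighting is meant to avoid.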

68 citations

##### Cited by


TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.

Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the "why," and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations


01 Jan 2006

TL;DR: This book introduces fixed-parameter algorithms and parameterized complexity theory, and surveys algorithmic methods and selected case studies.

Abstract: PART I: FOUNDATIONS 1. Introduction to Fixed-Parameter Algorithms 2. Preliminaries and Agreements 3. Parameterized Complexity Theory - A Primer 4. Vertex Cover - An Illustrative Example 5. The Art of Problem Parameterization 6. Summary and Concluding Remarks PART II: ALGORITHMIC METHODS 7. Data Reduction and Problem Kernels 8. Depth-Bounded Search Trees 9. Dynamic Programming 10. Tree Decompositions of Graphs 11. Further Advanced Techniques 12. Summary and Concluding Remarks PART III: SOME THEORY, SOME CASE STUDIES 13. Parameterized Complexity Theory 14. Connections to Approximation Algorithms 15. Selected Case Studies 16. Zukunftsmusik References Index

1,730 citations


01 Oct 1997

TL;DR: Geometric hashing, a technique originally developed in computer vision for matching geometric features against a database of such features, finds use in a number of other areas.

Abstract: Geometric hashing, a technique originally developed in computer vision for matching geometric features against a database of such features, finds use in a number of other areas. Matching is possible even when the recognizable database objects have undergone transformations or when only partial information is present. The technique is highly efficient and of low polynomial complexity.

618 citations