Author

Brenda S. Baker

Bio: Brenda S. Baker is an academic researcher from Bell Labs. The author has contributed to research on approximation algorithms and the bin packing problem. The author has an h-index of 27 and has co-authored 37 publications receiving 4721 citations. Previous affiliations of Brenda S. Baker include AT&T Labs and the University of California, Berkeley.

Papers
Journal ArticleDOI
Brenda S. Baker
TL;DR: A general technique that can be used to obtain approximation algorithms for various NP-complete problems on planar graphs, which includes maximum independent set, maximum tile salvage, partition into triangles, maximum H-matching, minimum vertex cover, minimum dominating set, and minimum edge dominating set.
Abstract: This paper describes a general technique that can be used to obtain approximation schemes for various NP-complete problems on planar graphs. The strategy depends on decomposing a planar graph into subgraphs of a form we call k-outerplanar. For fixed k, the problems of interest are solvable optimally in linear time on k-outerplanar graphs by dynamic programming. For general planar graphs, if the problem is a maximization problem, such as maximum independent set, this technique gives for each k a linear time algorithm that produces a solution whose size is at least k/(k + 1) optimal. If the problem is a minimization problem, such as minimum vertex cover, it gives for each k a linear time algorithm that produces a solution whose size is at most (k + 1)/k optimal. Taking k = ⌈c log log n⌉ or k = ⌈c log n⌉, where n is the number of nodes and c is some constant, we get polynomial time approximation algorithms whose solution sizes converge toward optimal as n increases. The class of problems for which this approach provides approximation schemes includes maximum independent set, maximum tile salvage, partition into triangles, maximum H-matching, minimum vertex cover, minimum dominating set, and minimum edge dominating set. For these and certain other problems, the proof of solvability on k-outerplanar graphs also enlarges the class of planar graphs for which the problems are known to be solvable in polynomial time.
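
To make the guarantees in this abstract concrete, the following minimal Python sketch only evaluates the stated bounds, k/(k + 1) for maximization and (k + 1)/k for minimization, and the two suggested choices of k. It does not implement the decomposition or the dynamic programming. The function names, the use of base-2 logarithms, and the default c = 1 are assumptions made for illustration; the abstract does not fix them.

```python
import math


def ratio_bounds(k: int) -> tuple[float, float]:
    """Return (lower bound for maximization, upper bound for minimization),
    both expressed as multiples of the optimal solution size."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return k / (k + 1), (k + 1) / k


def k_for_n(n: int, c: float = 1.0, double_log: bool = False) -> int:
    """Pick k as the ceiling of c*log(n), or of c*log(log(n)) if double_log.

    Assumption: base-2 logarithms; the result is clamped to at least 1
    so that the bounds above are defined for small n.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    value = math.log2(n)
    if double_log:
        value = math.log2(value)
    return max(1, math.ceil(c * value))


if __name__ == "__main__":
    # Both bounds approach 1 as the number of nodes n grows.
    for n in (16, 1024, 2**20):
        k = k_for_n(n)
        low, high = ratio_bounds(k)
        print(f"n={n}: k={k}, max >= {low:.3f} * OPT, min <= {high:.3f} * OPT")
```

Because k grows with n under either choice, both bounds tend to 1, which is the convergence toward optimal that the abstract describes.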

1,047 citations

Proceedings ArticleDOI
Brenda S. Baker
14 Jul 1995
TL;DR: A program called dup can be used to locate instances of duplication or near-duplication in a software system and is shown to be both effective at locating duplication and fast.
Abstract: This paper describes how a program called dup can be used to locate instances of duplication or near-duplication in a software system. Dup reports both textually identical sections of code and sections that are the same textually except for systematic substitution of one set of variable names and constants for another. Further processing locates longer sections of code that are the same except for other small modifications. Experimental results from running dup on millions of lines from two large software systems show dup to be both effective at locating duplication and fast. Applications could include identifying sections of code that should be replaced by procedures, elimination of duplication during reengineering of the system, redocumentation to include references to copies, and debugging.

800 citations

Journal ArticleDOI
TL;DR: Efficient approximation algorithms are devised, their limitations are studied, and worst-case bounds on the performance of the packings they produce are derived.
Abstract: We consider problems of packing an arbitrary collection of rectangular pieces into an open-ended, rectangular bin so as to minimize the height achieved by any piece. This problem has numerous applications in operations research and studies of computer operation. We devise efficient approximation algorithms, study their limitations, and derive worst-case bounds on the performance of the packings they produce.

676 citations

Proceedings ArticleDOI
01 Jun 1993
TL;DR: This paper develops a theory and algorithms for an application problem arising in software maintenance to track down duplication in a large software system, and gives efficient algorithms for constructing parametrized suffix trees and for reporting duplication over a threshold length.
Abstract: This paper develops a theory and algorithms for an application problem arising in software maintenance. The application is to track down duplication in a large software system. We want to find not only exact matches between sections of code, but parametrized matches, where a parametrized match between two sections of code means that one section can be transformed into the other by replacing the parameter names (e.g. identifiers and constants) of one section by the parameter names of the other via a one-to-one function. This paper formalizes this problem in terms of parametrized strings and parametrized pattern matching and defines a new data structure (parametrized suffix tree) suitable for parametrized pattern matching. It gives efficient algorithms for constructing this data structure, efficient algorithms for parametrized pattern matching, and an efficient algorithm for finding all maximal parametrized matches over a threshold length in a parametrized string. The algorithms for constructing parametrized suffix trees and for reporting duplication over a threshold length have been implemented. Tests on C code indicate that these algorithms should perform well in the application.
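
The definition of a parametrized match given in this abstract can be illustrated with a small Python sketch. It is not the suffix-tree algorithm the paper describes; it only tests whether two equal-length token sequences are related by a one-to-one renaming of parameters. The test replaces each parameter occurrence with the distance back to its previous occurrence (0 for a first occurrence) and compares the results. The names `prev_encode`, `p_match`, and `default_is_param`, and the rule that any identifier or digit string counts as a parameter, are assumptions made for this example.

```python
def default_is_param(token: str) -> bool:
    """Assumption for this sketch: identifiers and integer constants are
    parameters; everything else (operators, punctuation) is fixed text."""
    return token.isidentifier() or token.isdigit()


def prev_encode(tokens, is_param=default_is_param):
    """Encode a token sequence so that parameter names no longer matter.

    Each parameter occurrence becomes ("P", d), where d is the distance to
    the previous occurrence of the same parameter, or 0 if there is none.
    Fixed tokens become ("T", token) and must match exactly.
    """
    last_seen = {}
    encoded = []
    for i, tok in enumerate(tokens):
        if is_param(tok):
            encoded.append(("P", i - last_seen[tok] if tok in last_seen else 0))
            last_seen[tok] = i
        else:
            encoded.append(("T", tok))
    return encoded


def p_match(a, b, is_param=default_is_param):
    """Return True if a can be turned into b by a one-to-one renaming
    of parameters, with all fixed tokens left unchanged."""
    return len(a) == len(b) and prev_encode(a, is_param) == prev_encode(b, is_param)


if __name__ == "__main__":
    print(p_match("x = y + x".split(), "a = b + a".split()))  # consistent renaming
    print(p_match("x = y + x".split(), "a = b + b".split()))  # inconsistent renaming
```

Equal encodings force the same repetition pattern of parameters in both sequences, so the renaming is automatically one-to-one: two distinct names can never be mapped to the same name.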

185 citations

Journal ArticleDOI
TL;DR: Algorithms are given to construct a parameterized suffix tree in linear time and to find all maximal parameterized matches over a threshold length in a parameterized p-string in time linear in the size of the input plus the number of matches reported.
Abstract: As an aid in software maintenance, it would be useful to be able to track down duplication in large software systems efficiently. Duplication in code is often in the form of sections of code that are the same except for a systematic change of parameters such as identifiers and constants. To model such parameterized duplication in code, this paper introduces the notions of parameterized strings and parameterized matches of parameterized strings. A data structure called a parameterized suffix tree is defined to aid in searching for parameterized matches. For fixed alphabets, algorithms are given to construct a parameterized suffix tree in linear time and to find all maximal parameterized matches over a threshold length in a parameterized p-string in time linear in the size of the input plus the number of matches reported. The algorithms have been implemented, and experimental results show that they perform well on C code.

181 citations


Cited by
Journal ArticleDOI
TL;DR: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Finding statement-minimal slices is in general unsolvable, but data flow analysis is sufficient to find approximate slices.
Abstract: Program slicing is a method for automatically decomposing programs by analyzing their data flow and control flow. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a "slice," is an independent program guaranteed to represent faithfully the original program within the domain of the specified subset of behavior. Some properties of slices are presented. In particular, finding statement-minimal slices is in general unsolvable, but using data flow analysis is sufficient to find approximate slices. Potential applications include automatic slicing tools for debugging and parallel processing of slices.

3,163 citations

Book
01 Jan 2006
TL;DR: This paper discusses Fixed-Parameter Algorithms, Parameterized Complexity Theory, and Selected Case Studies, and some of the techniques used in this work.
Abstract: PART I: FOUNDATIONS 1. Introduction to Fixed-Parameter Algorithms 2. Preliminaries and Agreements 3. Parameterized Complexity Theory - A Primer 4. Vertex Cover - An Illustrative Example 5. The Art of Problem Parameterization 6. Summary and Concluding Remarks PART II: ALGORITHMIC METHODS 7. Data Reduction and Problem Kernels 8. Depth-Bounded Search Trees 9. Dynamic Programming 10. Tree Decompositions of Graphs 11. Further Advanced Techniques 12. Summary and Concluding Remarks PART III: SOME THEORY, SOME CASE STUDIES 13. Parameterized Complexity Theory 14. Connections to Approximation Algorithms 15. Selected Case Studies 16. Zukunftsmusik References Index

1,730 citations

Journal ArticleDOI
TL;DR: A new clone detection technique is proposed, consisting of a transformation of the input source text followed by a token-by-token comparison. In case studies, the resulting tool, CCFinder, found clones effectively, and the accompanying metrics identified characteristics of the systems studied.
Abstract: A code clone is a code portion in source files that is identical or similar to another. Since code clones are believed to reduce the maintainability of software, several code clone detection techniques and tools have been proposed. This paper proposes a new clone detection technique, which consists of the transformation of input source text and a token-by-token comparison. For its implementation with several useful optimization techniques, we have developed a tool, named CCFinder (Code Clone Finder), which extracts code clones in C, C++, Java, COBOL and other source files. In addition, metrics for the code clones have been developed. In order to evaluate the usefulness of CCFinder and metrics, we conducted several case studies where we applied the new tool to the source code of JDK, FreeBSD, NetBSD, Linux, and many other systems. As a result, CCFinder has effectively found clones and the metrics have been able to effectively identify the characteristics of the systems. In addition, we have compared the proposed technique with other clone detection techniques.

1,700 citations

Book
27 Jul 2015
TL;DR: This comprehensive textbook presents a clean and coherent account of most fundamental tools and techniques in Parameterized Algorithms and is a self-contained guide to the area, providing a toolbox of algorithmic techniques.
Abstract: This comprehensive textbook presents a clean and coherent account of most fundamental tools and techniques in Parameterized Algorithms and is a self-contained guide to the area. The book covers many of the recent developments of the field, including application of important separators, branching based on linear programming, Cut & Count to obtain faster algorithms on tree decompositions, algorithms based on representative families of matroids, and use of the Strong Exponential Time Hypothesis. A number of older results are revisited and explained in a modern and didactic way. The book provides a toolbox of algorithmic techniques. Part I is an overview of basic techniques, each chapter discussing a certain algorithmic paradigm. The material covered in this part can be used for an introductory course on fixed-parameter tractability. Part II discusses more advanced and specialized algorithmic ideas, bringing the reader to the cutting edge of current research. Part III presents complexity results and lower bounds, giving negative evidence by way of W[1]-hardness, the Exponential Time Hypothesis, and kernelization lower bounds. All the results and concepts are introduced at a level accessible to graduate students and advanced undergraduate students. Every chapter is accompanied by exercises, many with hints, while the bibliographic notes point to original publications and related work.

1,544 citations

Book
01 Jan 1997
TL;DR: The goal of this book is to provide a textbook which presents the basics of tree automata and several variants of tree automata which have been devised for applications in the aforementioned domains.
Abstract: Acknowledgments: Many people gave substantial suggestions to improve the contents of this book. These are, in alphabetic order, … Introduction: During the past few years, several of us have been asked many times about references on finite tree automata. On one hand, this is a witness to the liveliness of this field. On the other hand, it was difficult to answer. Besides several excellent survey chapters on more specific topics, there is only one monograph devoted to tree automata, by Gécseg and Steinby. Unfortunately, it is now impossible to find a copy of it, and a lot of work has been done on tree automata since the publication of this book. Actually, using tree automata has proved to be a powerful approach to simplify and extend previously known results, and also to find new results. For instance, recent works use tree automata for applications in abstract interpretation using set constraints, rewriting, automated theorem proving and program verification, databases and XML schema languages. Tree automata were designed a long time ago in the context of circuit verification. Many famous researchers contributed to this school, which was headed by A. Church in the late 50's and the early 60's: B. Trakhtenbrot, … Many new ideas came out of this program, for instance the connections between automata and logic. Tree automata also appeared first in this framework, following the work of Doner, Thatcher and Wright. In the 70's many new results were established concerning tree automata, which lost some of their connection with the applications and were studied for their own sake. In particular, a problem was the very high complexity of decision procedures for the monadic second order logic. Applications of tree automata to program verification revived in the 80's, after the relative failure of automated deduction in this field. It is possible to verify temporal logic formulas (which are particular Monadic Second Order Formulas) on simpler (small) programs.
Automata, and in particular tree automata, also appeared as an approximation of programs on which fully automated tools can be used. New results were obtained connecting properties of programs or type systems or rewrite systems with automata. Our goal is to fill in the existing gap and to provide a textbook which presents the basics of tree automata and several variants of tree automata which have been devised for applications in the aforementioned domains. We shall discuss only finite tree automata, and the …

1,492 citations