Author

A.J. Malton

Bio: A.J. Malton is an academic researcher at the University of Waterloo. His research focuses on source transformation and source code analysis. He has an h-index of 12 and has co-authored 16 publications receiving 575 citations.

Papers
Journal ArticleDOI
TL;DR: The basic features of modern TXL and its use in a range of software engineering applications are introduced, with an emphasis on how each task can be achieved by source transformation.
Abstract: Many tasks in software engineering can be characterized as source to source transformations. Design recovery, software restructuring, forward engineering, language translation, platform migration, and code reuse can all be understood as transformations from one source text to another. The tree transformation language, TXL, is a programming language and rapid prototyping system specifically designed to support rule-based source to source transformation. Originally conceived as a tool for exploring programming language dialects, TXL has evolved into a general purpose source transformation system that has proven well suited to a wide range of software maintenance and reengineering tasks, including the design recovery, analysis and automated reprogramming of billions of lines of commercial Cobol, PL/I, and RPG code for the Year 2000. In this paper, we introduce the basic features of modern TXL and its use in a range of software engineering applications, with an emphasis on how each task can be achieved by source transformation.
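To give a flavour of the rule-based style the abstract describes, here is a minimal, self-contained TXL program in the spirit of the classic calculator example from the TXL distribution; the toy grammar and rule name are illustrative, not taken from the paper.

    % Grammar for a tiny additive expression language
    define program
        [expression]
    end define

    define expression
        [number]
      | [expression] + [number]
    end define

    % Rewrite "number + number" to its sum, repeatedly,
    % until no match remains anywhere in the tree
    rule resolveAdditions
        replace [expression]
            N1 [number] + N2 [number]
        by
            N1 [+ N2]
    end rule

    function main
        replace [program]
            E [expression]
        by
            E [resolveAdditions]
    end function

Applied to the input 1 + 2 + 3, the rule fires twice and the output is the single expression 6.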

140 citations

Proceedings ArticleDOI
02 Oct 2001
TL;DR: CPPX (C Plus Plus eXtractor) is described, which transforms data conforming to a schema designed as the internals of a compiler (GCC) into a schema designed for software exchange (Datrix), as a pipelined sequence of sub-transformations.
Abstract: An extractor is a program which processes source code and outputs facts about the code in a software exchange format (SEF). An SEF can be further specified by a schema, analogous to a schema for a database. This paper explains how two such schemas can be combined into a union schema as the basis for creating an extractor. We describe CPPX (C Plus Plus eXtractor), which transforms a schema designed as the internals of a compiler (GCC) to a schema designed for software exchange (Datrix). CPPX performs this transformation as a pipelined sequence of sub-transformations. At each stage in the pipeline, the intermediate data conforms to the union of the two schemas.
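The paper describes the pipeline rather than showing code, but the shape of a single sub-transformation stage can be sketched in TXL. This sketch assumes a toy textual fact format relation(source, target) and invents the edge names gcc_calls and cLinks; the real CPPX stages work over the much richer GCC and Datrix schemas.

    % One pipeline stage: rename a compiler-schema edge to its
    % exchange-schema counterpart; all other facts pass through
    % untouched, so the output still conforms to the union schema
    define program
        [repeat fact]
    end define

    define fact
        [id] ( [id] , [id] )
    end define

    rule renameCallEdges
        replace $ [fact]
            'gcc_calls ( Caller [id] , Callee [id] )
        by
            'cLinks ( Caller , Callee )
    end rule

    function main
        replace [program]
            Facts [repeat fact]
        by
            Facts [renameCallEdges]
    end function

Because each stage both reads and writes facts that parse against the union of the two schemas, stages like this one can be chained freely, which is the point of the pipelined design.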

59 citations

Journal ArticleDOI
01 Oct 2003
TL;DR: The basic techniques of agile parsing in TXL are introduced, and several industry-proven techniques for exploiting agile parsing in software source analysis and transformation are discussed.
Abstract: Syntactic analysis forms a foundation of many source analysis and reverse engineering tools. However, a single standard grammar is not always appropriate for all source analysis and manipulation tasks. Small custom modifications to the grammar can make the programs used to implement these tasks simpler, clearer and more efficient. This leads to a new paradigm for programming these tools: agile parsing. In agile parsing the effective grammar used by a particular tool is a combination of two parts: the standard base grammar for the input language, and a set of explicit grammar overrides that modify the parse to support the task at hand. This paper introduces the basic techniques of agile parsing in TXL and discusses several industry proven techniques for exploiting agile parsing in software source analysis and transformation.
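The override mechanism is concrete TXL syntax: a redefine restates a nonterminal, with "..." standing for all of its existing alternatives, and adds new ones. The base grammar below is a one-line stand-in for a real standard grammar (which would normally be pulled in with an include such as include "C.grm"); only the redefine is the point.

    define program
        [repeat statement]
    end define

    % Toy stand-in for the standard base grammar
    define statement
        [id] = [number] ;
    end define

    % Agile-parsing override: keep everything the base grammar
    % accepts ("...") and add a task-specific form, so the tool
    % sees exit calls as distinct nodes in the parse tree
    redefine statement
        ...
      | [exit_statement]
    end redefine

    define exit_statement
        'exit ( [number] ) ;
    end define

    % Identity transform: this program simply reparses its
    % input under the overridden grammar
    function main
        match [program]
            P [program]
    end function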

58 citations

Proceedings ArticleDOI
07 Nov 2001
TL;DR: The paper presents an overview of LS/2000, a system that used design recovery to analyze source code for year 2000 risks and guide a source transformation that automatically remediated over 99% of the year 2000 risks in over three billion lines of production IT source.
Abstract: The year 2000 problem posed a difficult challenge for many IT shops worldwide. The most difficult part of the problem was not making the actual changes to ensure compliance, but finding and classifying the data fields that represent dates. This is a problem well suited to design recovery. The paper presents an overview of LS/2000, a system that used design recovery to analyze source code for year 2000 risks and to guide a source transformation that automatically remediated over 99% of the year 2000 risks in over three billion lines of production IT source.
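As the abstract notes, the analysis (finding and classifying date fields) was the hard part; the remediation itself is comparatively mechanical. A heavily simplified sketch of one common remediation, century windowing, is given below in TXL. It operates on bare two-digit values, assumes classification has already identified them as years, and uses an arbitrary pivot of 30; none of this is LS/2000 code.

    % Century windowing: two-digit years below the pivot are
    % taken to mean 20xx, the rest 19xx
    define program
        [repeat number]
    end define

    rule window20xx
        replace $ [number]
            YY [number]
        where
            YY [< 30]
        by
            YY [+ 2000]
    end rule

    rule window19xx
        replace $ [number]
            YY [number]
        where
            YY [>= 30]
        where
            YY [< 100]
        by
            YY [+ 1900]
    end rule

    function main
        replace [program]
            P [repeat number]
        by
            P [window20xx] [window19xx]
    end function

Applying window20xx first matters: once 5 has become 2005, it no longer satisfies the [< 100] guard of the second rule.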

50 citations

Proceedings ArticleDOI
12 May 2001
TL;DR: A simple technique, 'source factoring', is described by which a common structural decomposition of source text can address the many issues of preprocessing, macro processing, lexical analysis, design recovery, and automated transformation.
Abstract: Software source text is the raw material of program understanding and transformation systems. In order to share the results of source analyses, both between phases of a design recovery process, and between tools and systems in different processes, a source text interchange format is needed. The paper describes a simple technique, 'source factoring', by which a common structural decomposition of source text can address the many issues of preprocessing, macro processing, lexical analysis, design recovery, and automated transformation. Above all, source factorization allows the results of design analysis to be attached to source, and the results of source transformation to be reinstalled cleanly into the code base. This view of source text underlies the architecture of a successful software maintenance system which has processed billions of lines of legacy code in all major programming languages.
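The paper is architectural and shows no code, but its key claim, that analysis results can be attached to source and survive the round trip, can be illustrated with a toy annotation form in TXL. The @-annotation syntax, the grammar, and the dateField fact below are all invented for illustration; they are not the paper's factoring format.

    define program
        [repeat statement]
    end define

    define statement
        [annotation] [statement]
      | [id] = [number] ;
    end define

    % A fact attached to a statement by a design analysis
    define annotation
        @ [id]
    end define

    % Attach a fact to each assignment of a field of interest;
    % one-pass ($) so the annotated result is not re-annotated
    rule markDateAssignments
        replace $ [statement]
            'yy = N [number] ;
        by
            @ 'dateField
            'yy = N ;
    end rule

    function main
        replace [program]
            P [repeat statement]
        by
            P [markDateAssignments]
    end function

Because the annotation is part of the grammar, the annotated text still parses, and a later transformation can strip the annotations to reinstall the result cleanly into the code base.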

44 citations


Cited by
Proceedings ArticleDOI
10 Jun 2008
TL;DR: A new language-specific parser-based but lightweight clone detection approach exploiting a novel application of a source transformation system that is capable of finding near-miss clones with high precision and recall, and with reasonable performance.

Abstract: This paper examines the effectiveness of a new language-specific parser-based but lightweight clone detection approach. Exploiting a novel application of a source transformation system, the method accurately finds near-miss clones using an efficient text line comparison technique. The transformation system assists the method in three ways. First, using agile parsing it provides user-specified flexible pretty-printing to remove noise, standardize formatting and break program statements into parts such that potential changes can be detected as simple linewise text differences. Second, it provides efficient flexible extraction of potential clones to be compared, using island grammars and agile parsing to select granularities and enumerate potential clones. Third, using transformation rules it provides flexible code normalization to allow for local editing differences between similar code segments and filtering out of uninteresting parts of potential clones. In this paper we introduce the theory and practice of the framework and demonstrate its use in finding function clones in C code. Early experiments indicate that the method is capable of finding near-miss clones with high precision and recall, and with reasonable performance.
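Of the three roles the transformation system plays here, the normalization step is the easiest to make concrete. Below is a minimal TXL sketch of one such normalization under a toy grammar: renaming every identifier to a fixed placeholder, so that fragments differing only in naming become identical for the line-by-line comparison. The real method works over the full C grammar with much more selective rules.

    define program
        [repeat statement]
    end define

    % Toy statement form standing in for real C statements
    define statement
        [id] = [id] + [id] ;
    end define

    % One-pass ($) rule: without $, the rule would match its
    % own output ('x is itself an [id]) and never terminate
    rule normalizeIds
        replace $ [id]
            Name [id]
        by
            'x
    end rule

    function main
        replace [program]
            P [repeat statement]
        by
            P [normalizeIds]
    end function

After this pass, a = b + c ; and total = sum + n ; both pretty-print as x = x + x ; and compare equal as text lines.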

497 citations

Journal ArticleDOI
01 Aug 2006
TL;DR: The history, evolution and concepts of TXL are outlined, with emphasis on its distinctive style and philosophy, and examples of its use in expressing and applying recent new paradigms in language processing are given.
Abstract: TXL is a special-purpose programming language designed for creating, manipulating and rapidly prototyping language descriptions, tools and applications. TXL is designed to allow explicit programmer control over the interpretation, application, order and backtracking of both parsing and rewriting rules. Using first order functional programming at the higher level and term rewriting at the lower level, TXL provides for flexible programming of traversals, guards, scope of application and parameterized context. This flexibility has allowed TXL users to express and experiment with both new ideas in parsing, such as robust, island and agile parsing, and new paradigms in rewriting, such as XML mark-up, rewriting strategies and contextualized rules, without any change to TXL itself. This paper outlines the history, evolution and concepts of TXL with emphasis on its distinctive style and philosophy, and gives examples of its use in expressing and applying recent new paradigms in language processing.
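One of the features named here, parameterized context, is easy to show concretely: TXL rules take typed parameters, so a generic rewrite can be instantiated with context passed in from the caller. A small sketch with invented names:

    define program
        [repeat id]
    end define

    % A parameterized rule: rename one identifier to another,
    % in a single pass over whatever scope it is applied to
    rule renameId Old [id] New [id]
        replace $ [id]
            Old
        by
            New
    end rule

    function main
        replace [program]
            P [program]
        by
            P [renameId 'oldName 'newName]
    end function

The same mechanism lets rules carry symbol tables or other context as arguments, which is how contextualized rules are programmed without any change to TXL itself.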

367 citations

Journal ArticleDOI
TL;DR: This work identifies the problems with the current grammarware practices, the barriers that currently hamper research, and the promises of an engineering discipline for grammarware, its principles and the research challenges that have to be addressed.
Abstract: Grammarware comprises grammars and all grammar-dependent software. The term grammar is meant here in the sense of all established grammar formalisms and grammar notations including context-free grammars, class dictionaries, and XML schemas as well as some forms of tree and graph grammars. The term grammar-dependent software refers to all software that involves grammar knowledge in an essential manner. Archetypal examples of grammar-dependent software are parsers, program converters, and XML document processors. Despite the pervasive role of grammars in software systems, the engineering aspects of grammarware are insufficiently understood. We lay out an agenda that is meant to promote research on increasing the productivity of grammarware development and on improving the quality of grammarware. To this end, we identify the problems with the current grammarware practices, the barriers that currently hamper research, and the promises of an engineering discipline for grammarware, its principles, and the research challenges that have to be addressed.

295 citations

Book ChapterDOI
11 May 2018
TL;DR: This chapter on chemical graph theory forms part of the natural science and processes section of the handbook.
Abstract: This chapter on chemical graph theory forms part of the natural science and processes section of the handbook.

263 citations

Proceedings ArticleDOI
22 Oct 2006
TL;DR: The authors present Sourcerer, a search engine for open-source code that extracts fine-grained structural information from the code and stores it in a relational model to implement a basic notion of CodeRank and to enable search forms that go beyond conventional keyword-based searches.
Abstract: We present Sourcerer, a search engine for open-source code. Sourcerer extracts fine-grained structural information from the code and stores it in a relational model. This information is used to implement a basic notion of CodeRank and to enable search forms that go beyond conventional keyword-based searches.

258 citations