Author

Thomas J. Pennello

Bio: Thomas J. Pennello is an academic researcher from the University of California, Santa Cruz. His research topics include LALR parsers and parsing. He has an h-index of 5 and has co-authored 6 publications receiving 248 citations.

Papers
Journal ArticleDOI
TL;DR: Two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets are defined, and an efficient algorithm is presented to compute the sets in time linear in the size of the relations.
Abstract: Two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets are defined, and an efficient algorithm is presented to compute the sets in time linear in the size of the relations. In particular, for a PASCAL grammar, the algorithm performs fewer than 15 percent of the set unions performed by the popular compiler-compiler YACC. When a grammar is not LALR(1), the relations, represented explicitly, provide for printing user-oriented error messages that specifically indicate how the look-ahead problem arose. In addition, certain loops in the digraphs induced by these relations indicate that the grammar is not LR(k) for any k. Finally, an oft-discovered and used but incorrect look-ahead set algorithm is similarly based on two other relations defined for the first time here. The formal presentation of this algorithm should help prevent its rediscovery.
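
The linear-time computation described above is what is now commonly called the digraph algorithm: initial sets are propagated along the relation, and strongly connected components are collapsed so that every node on a cycle receives the same set. A minimal Python sketch, with our own illustrative encoding (the relation as an adjacency dict; the names are not the paper's):

```python
# Digraph algorithm sketch: given a relation R and an initial set-valued
# function F0, compute the smallest F satisfying
#     F(x) = F0(x) union the F(y) for every y with x R y.
# A Tarjan-style traversal gives all nodes in one SCC the same set.

def digraph(nodes, R, F0):
    INFINITY = float('inf')
    stack = []
    N = {x: 0 for x in nodes}          # 0 means unvisited
    F = {x: set() for x in nodes}

    def traverse(x):
        stack.append(x)
        d = len(stack)
        N[x] = d
        F[x] = set(F0.get(x, ()))
        for y in R.get(x, ()):
            if N[y] == 0:
                traverse(y)
            N[x] = min(N[x], N[y])
            F[x] |= F[y]               # one set union per edge
        if N[x] == d:                  # x is the root of its SCC
            while True:
                top = stack.pop()
                N[top] = INFINITY
                F[top] = F[x]          # whole SCC shares one set
                if top == x:
                    break

    for x in nodes:
        if N[x] == 0:
            traverse(x)
    return F
```

A nontrivial SCC found during this traversal is exactly the kind of loop the abstract mentions as evidence that the grammar is not LR(k) for any k.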

98 citations

Proceedings ArticleDOI
01 Jan 1978
TL;DR: A "forward move algorithm", together with some of its formal properties, is presented for use in a practical syntactic error recovery scheme for LR parsers, and an error recovery algorithm that uses the accumulated right context is proposed.
Abstract: A "forward move algorithm", and some of its formal properties, is presented for use in a practical syntactic error recovery scheme for LR parsers. The algorithm finds a "valid fragment" (comparable to a valid prefix) just to the right of a point of error detection. For expositional purposes the algorithm is presented as parsing arbitrarily far beyond the point of error detection in a "parallel" mode, as long as all parses agree on the read or reduce action to be taken at each parse step. In practice the forward move is achieved serially by adding "recovery states" to the LR machine. Based on the formal properties of the forward move we propose an error recovery algorithm that uses the accumulated right context. The performance of the recovery algorithm is illustrated in a specific case and discussed in general.

59 citations

Proceedings ArticleDOI
01 Jul 1986
TL;DR: LR parsers can be made to run 6 to 10 times as fast as the best table-interpretive LR parsers, and a factor of 2 to 4 increase in total table size can be expected, depending upon whether syntactic error recovery is required.
Abstract: LR parsers can be made to run 6 to 10 times as fast as the best table-interpretive LR parsers. The resulting parse time is negligible compared to the time required by the remainder of a typical compiler containing the parser. A parsing speed of 1/2 million lines per minute on a computer similar to a VAX 11/780 was achieved, up from an interpretive speed of 40,000 lines per minute. A speed of 240,000 lines per minute on an Intel 80286 was achieved, up from an interpretive speed of 37,000 lines per minute. The improvement is obtained by translating the parser's finite state control into assembly language. States become code memory addresses. The current input symbol resides in a register and a quick sequence of register-constant comparisons determines the next state, which is merely jumped to. The parser's push-down stack is implemented directly on a hardware stack. The stack contains code memory addresses rather than the traditional state numbers. The strongly-connected components of the directed graph induced by the parser's terminal and nonterminal transitions are examined to determine a typically small subset of the states that require parse-time stack-overflow-check code when hardware does not provide the check automatically. The increase in speed is at the expense of space: a factor of 2 to 4 increase in total table size can be expected, depending upon whether syntactic error recovery is required.
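
The paper compiles states to assembly; the flavor of "states become code addresses" can be sketched in Python via the closely related recursive-ascent formulation, where each state is a function and the call stack plays the role of the hardware stack. The grammar E -> E '+' n | n, the state names, and the encoding are our illustrative choices, not the paper's:

```python
# Recursive-ascent sketch of a directly executed LR parser for
#     E -> E '+' n | n
# Each state is a function; shifting calls the next state's function, and a
# reduction of a rule with k right-hand-side symbols unwinds k call frames
# (each frame decrements the pop count), after which the exposed state
# performs the goto on the reduced nonterminal.

def parse(tokens):
    toks = list(tokens) + ['$']
    pos = 0

    def next_tok():
        nonlocal pos
        t = toks[pos]
        pos += 1
        return t

    def s0():
        # state 0: shift 'n'; goto state 1 on E
        if next_tok() != 'n':
            raise SyntaxError('expected n')
        nt, pops = s2()
        while True:
            assert nt == 'E' and pops == 0
            r = s1()                   # goto on E
            if r == 'accept':
                return True
            nt, pops = r

    def s1():
        # state 1: accept on '$'; shift '+'
        t = next_tok()
        if t == '$':
            return 'accept'
        if t != '+':
            raise SyntaxError('expected + or end of input')
        nt, pops = s3()
        return (nt, pops - 1)          # keep unwinding toward the goto frame

    def s3():
        # state 3: shift 'n'
        if next_tok() != 'n':
            raise SyntaxError('expected n')
        nt, pops = s4()
        return (nt, pops - 1)

    def s2():
        return ('E', 0)                # reduce E -> n (this frame is the one pop)

    def s4():
        return ('E', 2)                # reduce E -> E '+' n (three symbols)

    return s0()
```

In the assembly version described by the abstract, the explicit stack of code addresses replaces this call stack, and the "quick sequence of register-constant comparisons" replaces the if-chains on the current token.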

55 citations

01 Jan 1979
TL;DR: Two relations are defined that capture the essential structure of the problem of computing LALR(1) look-ahead sets, and an efficient algorithm is presented to compute the sets in time linear in the size of the relations.
Abstract: We define two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets, and present an efficient algorithm to compute the sets in time linear in the size of the relations. In particular, for a PASCAL grammar, our algorithm performs fewer than 20% of the set unions performed by the popular compiler-compiler YACC.

20 citations

Proceedings ArticleDOI
01 Aug 1979
TL;DR: In this paper, the authors define two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets, and present an efficient algorithm to compute the sets in time linear in the size of the relations.
Abstract: We define two relations that capture the essential structure of the problem of computing LALR(1) look-ahead sets, and present an efficient algorithm to compute the sets in time linear in the size of the relations. In particular, for a PASCAL grammar, our algorithm performs fewer than 20% of the set unions performed by the popular compiler-compiler YACC.

16 citations


Cited by
Book
01 Jan 1993
TL;DR: This book presents the theory and practice of partial evaluation, covering flow chart, functional, and logic languages, a self-applicable Scheme specializer, and a guide to the literature.
Abstract: Functions, types and expressions; programming languages and their operational semantics; compilation; partial evaluation of a flow chart language; partial evaluation of a first-order functional language; the view from Olympus; partial evaluation of the lambda calculus; partial evaluation of Prolog; aspects of Similix, a partial evaluator for a subset of Scheme; partial evaluation of C; applications of partial evaluation; termination of partial evaluation; program analysis; more general program transformation; guide to the literature; the self-applicable Scheme specializer.
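
A toy illustration of the book's subject (the example is ours, not from the book): partial evaluation specializes a program with respect to the part of its input that is known statically. Specializing power(base, n) with the exponent known executes the loop at specialization time, leaving a residual program with no loop at all:

```python
# General program: both inputs dynamic.
def power(base, n):
    result = 1
    for _ in range(n):
        result *= base
    return result

# "Mix"-style specializer sketch: the static input n is consumed at
# specialization time by generating residual code.
def specialize_power(n):
    body = ' * '.join(['base'] * n) if n > 0 else '1'
    src = f'def power_{n}(base):\n    return {body}\n'
    env = {}
    exec(src, env)                 # compile the residual program
    return env[f'power_{n}']

power_3 = specialize_power(3)      # residual: def power_3(base): return base * base * base
```

A self-applicable specializer, as in the book's Scheme specializer, is one that can be applied to its own text, which is what makes automatic compiler generation by the Futamura projections possible.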

1,549 citations

Book ChapterDOI
Alfred V. Aho1
02 Jan 1991
TL;DR: This chapter discusses algorithms for solving string-matching problems that have proven useful in text-editing and text-processing applications. Several innovative, theoretically interesting algorithms have been devised that run significantly faster than the obvious brute-force method.
Abstract: Publisher Summary This chapter discusses the algorithms for solving string-matching problems that have proven useful for text-editing and text-processing applications. String pattern matching is an important problem that occurs in many areas of science and information processing. In computing, it occurs naturally as part of data processing, text editing, term rewriting, lexical analysis, and information retrieval. Many text editors and programming languages have facilities for matching strings. In biology, string-matching problems arise in the analysis of nucleic acids and protein sequences, and in the investigation of molecular phylogeny. String matching is also one of the central and most widely studied problems in theoretical computer science. The simplest form of the problem is to locate an occurrence of a keyword as a substring in a sequence of characters, which is called the input string. For example, the input string queueing contains the keyword ueuei as a substring. Even for this problem, several innovative, theoretically interesting algorithms have been devised that run significantly faster than the obvious brute-force method.
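
The brute-force method the chapter contrasts with faster algorithms tries every alignment of the keyword against the input string; algorithms such as Knuth-Morris-Pratt avoid re-examining characters by precomputing a failure function. A sketch of both (our code, not the chapter's):

```python
# Brute force: O(n * m) in the worst case.
def naive_find(text, keyword):
    n, m = len(text), len(keyword)
    for i in range(n - m + 1):
        if text[i:i + m] == keyword:
            return i
    return -1

# Knuth-Morris-Pratt: O(n + m) via a failure function that records, for each
# prefix of the keyword, the length of its longest proper border.
def kmp_find(text, keyword):
    if not keyword:
        return 0
    fail = [0] * len(keyword)
    k = 0
    for i in range(1, len(keyword)):
        while k and keyword[i] != keyword[k]:
            k = fail[k - 1]
        if keyword[i] == keyword[k]:
            k += 1
        fail[i] = k
    k = 0
    for i, c in enumerate(text):
        while k and c != keyword[k]:
            k = fail[k - 1]
        if c == keyword[k]:
            k += 1
            if k == len(keyword):
                return i - k + 1       # start index of the match
    return -1
```

On the abstract's example, both locate the keyword "ueuei" inside the input string "queueing" at index 1.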

413 citations

Journal Article
TL;DR: The construction of a very wide-coverage probabilistic parsing system for natural language (NL) based on LR parsing techniques is described; the system is intended to rank the large number of syntactic analyses produced by NL grammars according to the frequency of occurrence of the individual rules deployed in each analysis.
Abstract: We describe work toward the construction of a very wide-coverage probabilistic parsing system for natural language (NL), based on LR parsing techniques. The system is intended to rank the large number of syntactic analyses produced by NL grammars according to the frequency of occurrence of the individual rules deployed in each analysis. We discuss a fully automatic procedure for constructing an LR parse table from a unification-based grammar formalism, and consider the suitability of alternative LALR(1) parse table construction methods for large grammars. The parse table is used as the basis for two parsers: a user-driven interactive system that provides a computationally tractable and labor-efficient method of supervised training of the statistical information required to drive the probabilistic parser. The latter is constructed by associating probabilities with the LR parse table directly. This technique is superior to parsers based on probabilistic lexical tagging or probabilistic context-free grammar because it allows for a more context-dependent probabilistic language model, as well as use of a more linguistically adequate grammar formalism. We compare the performance of an optimized variant of Tomita's (1987) generalized LR parsing algorithm to an (efficiently indexed and optimized) chart parser. We report promising results of a pilot study training on 150 noun definitions from the Longman Dictionary of Contemporary English (LDOCE) and retesting on these plus a further 55 definitions. Finally, we discuss limitations of the current system and possible extensions to deal with lexical (syntactic and semantic) frequency of occurrence.

256 citations

Journal ArticleDOI
TL;DR: An algorithm is presented that solves the problem in time O(MN), where M and N are the lengths of A and R; it requires only O(N) space to deliver just the score of the best alignment, and is superior to an earlier algorithm by Wagner and Seiferas.
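
The paper concerns alignment against a regular expression, but the O(N)-space, score-only idea can be illustrated on the simpler case of aligning two plain strings: the dynamic-programming table is filled row by row, and only the previous row is retained, so the score (though not the alignment itself) comes out in linear space. A sketch under that simplification (unit costs are our illustrative choice):

```python
# Unit-cost edit distance between strings a and b in O(len(a) * len(b)) time
# and O(len(b)) space: keep only the previous row of the DP table.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))     # row 0: distance from "" to b[:j]
    for i, ca in enumerate(a, 1):
        curr = [i]                     # column 0: distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,              # delete ca
                            curr[j - 1] + 1,          # insert cb
                            prev[j - 1] + (ca != cb)  # match or substitute
                            ))
        prev = curr
    return prev[-1]
```

Recovering the best alignment itself, rather than just its score, is what normally forces the full table (or a divide-and-conquer refinement) to be kept.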

193 citations

Journal ArticleDOI
Gene Myers1
TL;DR: This work places a new worst-case upper bound on regular expression pattern matching using a combination of the node-listing and “Four-Russians” paradigms and provides an implementation that is faster than existing software for small regular expressions.
Abstract: Given a regular expression R of length P and a word A of length N, the membership problem is to determine if A is in the language denoted by R. An O(PN/lg N) time algorithm is presented that is based on a lg N speedup of the standard O(PN) time simulation of R's nondeterministic finite automaton on A, using a combination of the node-listing and "Four-Russians" paradigms. This result places a new worst-case upper bound on regular expression pattern matching. Moreover, in practice the method provides an implementation that is faster than existing software for small regular expressions.

135 citations