
Two ways of formalizing grammars

Mark Johnson
01 Jun 1994, Linguistics and Philosophy, Vol. 17, Iss. 3, pp. 221-248
Abstract
A grammar is a formal device which both identifies a certain set of utterances as well-formed, and which also defines a transduction relation between these utterances and their linguistic representations. This paper focuses on two widely-used "formal" or "logical" representations of grammars in computational linguistics, Definite Clause Grammars and Feature Structure Grammars, and describes the way in which they express the recognition problem (the problem of determining if an utterance is in the language generated by a grammar) and the parsing problem (the problem of finding the analyses assigned by a grammar to an utterance). Although both approaches are 'constraint-based', one of them is based on a logical consequence relation, and the other is based on satisfiability. The main goal of this paper is to point out the different conceptual bases of these two ways of formalizing grammars, and to discuss some of their properties.



MARK JOHNSON

TWO WAYS OF FORMALIZING GRAMMARS*

1. INTRODUCTION

A grammar is a formal device which both identifies a certain set of utterances as well-formed, and which also defines a transduction relation between these utterances and their linguistic representations. This paper focuses on two widely-used "formal" or "logical" representations of grammars in computational linguistics, Definite Clause Grammars and Feature Structure Grammars, and describes the way in which they express the recognition problem (the problem of determining if an utterance is in the language generated by a grammar) and the parsing problem (the problem of finding the analyses assigned by a grammar to an utterance). Although both approaches are 'constraint-based', one of them is based on a logical consequence relation, and the other is based on satisfiability. The main goal of this paper is to point out the different conceptual bases of these two ways of formalizing grammars, and to discuss some of their properties.
1.1. Definite-Clause Grammars, A Validity-based Approach

The definite-clause grammar (DCG) framework originates in Colmerauer's work on Metamorphosis Grammars in the 1970s (Colmerauer 1978) and was developed and popularized by Pereira and Warren (1981), Pereira and Shieber (1987), and others. In this approach, a grammar (here taken to include the lexicon) is conceived of as a set of axioms. The well-formedness of an utterance and the fact that it has a certain linguistic structure are theorems that follow from these axioms, so both the recognition and parsing problems are problems of determining whether certain types of formulae are logical consequences of these axioms. Thus the well-formedness or grammaticality of a particular utterance is expressed by the fact that the corresponding formula is a consequence of the grammar axioms, and ungrammaticality by the fact that the corresponding formula is not a consequence of the grammar axioms (even though it may be consistent with them).

* I would like to thank Edward Stabler and an anonymous L&P reviewer for their helpful comments on an earlier draft of this paper. Of course, all responsibility for errors in this paper rests with me.
That is, if the grammar axioms are D and the formula wf(u, s) asserts that the utterance u is well-formed with linguistic representation s (where s might be interpreted as a parse tree, etc.), then the recognition problem is the problem of determining if the following holds.¹

    D ⊢ ∃s wf(u, s).

The parsing problem is the problem of finding all of the s such that the following holds for the given utterance u.

    D ⊢ wf(u, s).

In general D is a finite set of closed formulae, so these problems are equivalent to the following validity problems, where D' is a conjunction of the members of D.

    ⊨ D' → ∃s wf(u, s).
    ⊨ D' → wf(u, s).

To summarize, in the DCG approach the intended interpretation ℳ_I is one in which linguistic representations and strings are conceptualized as individuals. The grammar axioms D state the essential properties of the intended interpretation ℳ_I, so if u is interpreted in ℳ_I as a grammatical utterance with linguistic structure s, then wf(u, s) is true in every model of D.
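To make the validity-based reading concrete, here is a minimal sketch in Prolog, the language in which DCGs are usually realized. The predicate wf/2, the term encodings, and the toy lexicon are illustrative assumptions of this sketch, not Johnson's own axiomatization; the point is only that recognition becomes provability of a goal from the clauses.

    % Sketch: grammar axioms D as definite clauses. wf(U, S) is provable
    % iff utterance U (a list of words) is well-formed with structure S.
    wf(U, s(NP, VP)) :-
        append(U1, U2, U),       % split the utterance into two substrings
        np(U1, NP),
        vp(U2, VP).

    np([uther],   np(uther)).
    np([knights], np(knights)).
    vp([sleeps],  vp(sleeps)).

    % Recognition problem: D |- exists S . wf(u, S).
    % ?- wf([uther, sleeps], S).
    % S = s(np(uther), vp(sleeps)).   % a proof exists: u is well-formed
    % ?- wf([sleeps, uther], _).
    % false.                          % no proof: u is not generated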
1.2. Feature Structures, A Satisfiability-based Approach

The second framework is the feature-structure (FS) approach, where a grammar (which includes the lexicon) is conceived of as a set of constraints, and a well-formed linguistic representation is any structure that satisfies these constraints.² Specifically, the grammar and the utterance both impose constraints that the linguistic structure must meet. The well-formed or grammatical structures are those that satisfy the constraints imposed by the grammar. An utterance is well-formed iff one of these structures also meets the additional constraint that it "corresponds" to the utterance in an appropriate way (e.g., the structure's yield (terminal string) is the string of words of that utterance). Thus grammaticality or well-formedness of an utterance corresponds to the satisfiability of a set of constraints, and ungrammaticality or ill-formedness corresponds to the unsatisfiability of that set.

¹ In this paper the following notation is used. Object-language expressions are written in sanserif font, e.g. x, y, etc., while meta-language variables (ranging over object-language expressions) are written in italic font, e.g. x, y, etc.

² The version considered here is similar to HPSG (Pollard and Sag 1987) in that it is expressive enough that no external phrase structure component is required - the phrase structure rules are encoded as feature structure constraints - and is a simplification of systems proposed by Carpenter (1992).
In formalizing the recognition and parsing problems in this approach, linguistic representations can be regarded as interpretations, and the constraints as expressions or formulae (from some language of constraints) which these interpretations must satisfy in order to be considered well-formed linguistic representations. That is, a well-formed linguistic representation is a model of these formulae (rather than an individual in a model as in the DCG approach), and the set of all models of the grammatical constraints is the set of all well-formed linguistic representations. Thus unlike the DCG approach, in general there is no single intended model of a set of feature structure constraints.
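A small worked example may help (the attribute-value notation here is illustrative, not a construct from the paper): the constraint cat ≈ np ∧ agr num ≈ sg is satisfied by the minimal structure [cat: np, agr: [num: sg]], but also by [cat: np, agr: [num: sg, pers: 3]] and by every further extension of it, so the constraint determines an infinite class of models rather than a single intended structure.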
Most of the work in this field has focussed on the development of specialized languages for expressing systems of constraints to be used as annotations on phrase-structure rules (e.g., the Feature Description Language of Kasper and Rounds (1990) and the attribute-value languages of Johnson (1988)). It seems that the language of first-order logic (in fact, usually decidable sublanguages thereof) is capable of expressing these kinds of constraints (Johnson 1990a, b, 1991a, b, Smolka 1992). Manaster-Ramer and Rounds (1987) and Carpenter (1992) propose extended versions of these systems that are expressive enough to be linguistically useful alone (i.e., without other descriptive devices such as a phrase-structure 'backbone'). This paper explores the degree to which such an extended feature system can be expressed in a first-order language.³ This also aids comparison with the DCG approach, which is formulated in the same language.
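To give a flavour of such a first-order rendering (a common style of encoding in this literature, though the exact signature here is an assumption of this note rather than a quotation from the paper), an attribute can be treated as a binary relation on nodes, so the attribute-value constraint "the agr num value of x is sg" becomes something like ∃y (arc(x, agr, y) ∧ arc(y, num, sg)), and satisfiability of the feature constraint reduces to satisfiability of a first-order formula.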
The recognition problem is the problem of determining the simultaneous satisfiability of both grammar and utterance constraints. That is, if F is a formula expressing the grammatical constraints that every well-formed linguistic structure must satisfy (i.e. that is true in exactly the well-formed structures) and yield(u) is a formula that is true in an interpretation (i.e., a linguistic structure) iff that interpretation corresponds to utterance u³ (say, has u as its phonological form), then utterance u is well-formed iff there exists a model ℳ such that the following is true.

    ℳ ⊨ F ∧ yield(u).

³ An interesting alternative not discussed in this paper is to extend a standard first-order language by adding 'feature-structure expressions' to that language. It seems that the most insightful semantics for such an extended language is based on abduction; see Höhfeld and Smolka (1988) and Chen and Warren (1989) for details.
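A rough Prolog sketch of this satisfiability view, under the simplifying assumption that constraint solving over attribute-value terms can be approximated by unification (real feature-structure formalisms use richer constraint languages); f/1 and yield/2 below are hypothetical encodings of F and yield(u):

    % F: the grammar constraint - subject and verb share one agreement value.
    f(s(np(_, Agr), vp(_, Agr))).

    % yield(u): the structure's terminal string is the utterance u.
    yield(s(np(W1, _), vp(W2, _)), [W1, W2]).

    % Lexical constraints on the substructures.
    lex(np(uther,   sg)).
    lex(np(knights, pl)).
    lex(vp(sleeps,  sg)).

    % Recognition: is F /\ yield(u) satisfiable? Any solution S is a model.
    recognize(U, S) :-
        f(S),
        yield(S, U),
        S = s(NP, VP),
        lex(NP),
        lex(VP).

    % ?- recognize([uther, sleeps], S).    % succeeds: a model exists
    % ?- recognize([knights, sleeps], S).  % fails: pl and sg cannot unify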
The parsing problem is the problem of describing or characterizing the set of models of the conjoined constraints. Since this set may be infinite, it is not in general possible to exhaustively enumerate these models. There are two standard techniques for describing the models of the constraints, both exploiting the observation that infinite sets can have finite descriptions (e.g. the infinite set of integers greater than 7 has the finite description "{x | x > 7}"). These two techniques are discussed in detail in sections 2.8 and 2.10 of Johnson (1988). The first technique exploits the observation that in cases where the possible constraints are restricted, it may be possible to show that the set of models possesses a certain structure, so that an infinite set of models can be finitely described, i.e., specified or identified with finite means.
Usually, attention is restricted to a certain type of interpretation, e.g. acyclic deterministic finite automata (DFA) in Kasper and Rounds (1990), and attribute-value structures (AVS) in Johnson (1988). Kasper and Rounds (1990) showed that the set of DFA satisfying any constraint expressible in their Feature Description Logic is a finite union of principal filters (generated by the "minimal models" with respect to the "subsumption" ordering), and Johnson (1988) showed that the set of AVSs satisfying any constraint expressible in an attribute-value language is a finite union of finite differences of finitely-generated principal filters;⁴ in both cases there are effective procedures for constructing these generators, which constitute a finite description of a (possibly infinite) set of models.
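The subsumption ordering has a familiar Prolog analogue, which may help intuition: one term subsumes another iff the second is an instance of the first, so a most general term generates a principal filter of instances. A minimal sketch, reusing the illustrative terms above and SWI-Prolog's standard subsumes_term/2:

    % The general term plays the role of a "minimal model": every instance
    % of it is also a model, so it finitely describes an infinite filter.
    % ?- subsumes_term(s(np(_, Agr), vp(_, Agr)),
    %                  s(np(uther, sg), vp(sleeps, sg))).
    % true.
    % ?- subsumes_term(s(np(_, Agr), vp(_, Agr)),
    %                  s(np(knights, pl), vp(sleeps, sg))).
    % false.   % outside the filter: the shared agreement value is violated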
The second technique is a variation of the first one; it is based on the observation that every formula identifies a set of interpretations, namely those that satisfy it. Thus the formula F ∧ yield(u) is a description of the set of its models (although perhaps not a very useful one). For some constraint languages (including those of Kasper and Rounds (1990) and Johnson (1988)) there exist algorithms that reduce an arbitrary formula to an equivalent formula in a "normal form", from which one can "read off" the important properties of the models (see sections 2.8 and 2.10 of Johnson (1988) for further discussion). Independently of the existence of normal forms, however, if it can be shown that

    F ∧ yield(u) ⊨ A

for some formula A, then A is true of every linguistic representation that satisfies the grammar constraints and corresponds to the utterance u, i.e. A is a description of the well-formed structures of u. Thus information about an utterance can be extracted by computing the logical consequences of the (grammar and utterance) constraints.⁵ For example, if the utterance u is ungrammatical, then

    F ∧ yield(u) ⊨ false

because there are no models of these constraints.

⁴ Because attribute-value languages can express negated constraints, Johnson (1988) requires "negative" minimal models (i.e. "inequality arcs") as well as "positive" minimal models.
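Continuing the unification sketch above, such a consequence A can be read off a solved form: for uther sleeps the binding Agr = sg is entailed by F ∧ yield(u), so "the subject is singular" is a (partial) description of every well-formed structure of that utterance.

    % ?- recognize([uther, sleeps], s(np(_, Agr), _)).
    % Agr = sg.   % entailed: true in every model of the constraints
    % ?- recognize([knights, sleeps], _).
    % false.      % F /\ yield(u) |= false: the utterance is ungrammatical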
2. FORMALIZING CONTEXT-FREE GRAMMARS

Both approaches are capable of expressing grammars considerably more complicated than the context-free grammars described in this section, but it is instructive to consider these simpler systems. This paper follows standard linguistic practice in assuming that the right-hand side of each production in the grammars being formalized is either a (possibly empty) sequence of non-terminals or else a single terminal. This assumption simplifies the formalization somewhat without restricting the class of languages that can be expressed. First, formalizations of the recognition problem for the following simple context-free grammar (based on the simple grammar of Shieber 1986) are presented. Then the axioms are modified so that a representation of the parse tree is produced as well. Finally, the axioms are further modified to include agreement features, so that ungrammatical utterances such as *Knights sleeps are not generated.

    S  → NP VP
    VP → V NP
    NP → uther
    NP → knights
    VP → sleeps
    V  → like
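For reference, this grammar transcribes directly into standard DCG notation (executable Prolog). Note that the bare context-free grammar still derives the ungrammatical *Knights sleeps; it takes the agreement features added later to rule it out.

    s  --> np, vp.
    vp --> v, np.
    np --> [uther].
    np --> [knights].
    vp --> [sleeps].
    v  --> [like].

    % ?- phrase(s, [knights, like, uther]).   % true
    % ?- phrase(s, [knights, sleeps]).        % true too: no agreement yet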
⁵ Not all the consequences A are informative, of course, since the set of consequences includes e.g. all tautologies. Correspondingly, not all of the logical consequences of the DCG axioms are of interest either.

References

Carpenter, B.: 1992, The Logic of Typed Feature Structures, Cambridge University Press, Cambridge.
Clark, K. L.: 1978, 'Negation as Failure', in H. Gallaire and J. Minker (eds.), Logic and Data Bases, Plenum Press, New York.
Lloyd, J. W.: 1987, Foundations of Logic Programming, second edition, Springer-Verlag, Berlin.
Pollard, C. and I. A. Sag: 1994, Head-Driven Phrase Structure Grammar, University of Chicago Press, Chicago.