
From Program Verification to Program Synthesis
Saurabh Srivastava
University of Maryland, College Park
saurabhs@cs.umd.edu
Sumit Gulwani
Microsoft Research, Redmond
sumitg@microsoft.com
Jeffrey S. Foster
University of Maryland, College Park
jfoster@cs.umd.edu
Abstract
This paper describes a novel technique for the synthesis of imper-
ative programs. Automated program synthesis has the potential to
make programming and the design of systems easier by allowing
programs to be specified at a higher-level than executable code. In
our approach, which we call proof-theoretic synthesis, the user pro-
vides an input-output functional specification, a description of the
atomic operations in the programming language, and a specifica-
tion of the synthesized program’s looping structure, allowed stack
space, and bound on usage of certain operations. Our technique
synthesizes a program, if there exists one, that meets the input-
output specification and uses only the given resources.
The insight behind our approach is to interpret program synthe-
sis as generalized program verification, which allows us to bring
verification tools and techniques to program synthesis. Our syn-
thesis algorithm works by creating a program with unknown state-
ments, guards, inductive invariants, and ranking functions. It then
generates constraints that relate the unknowns and enforces three
kinds of requirements: partial correctness, loop termination, and
well-formedness conditions on program guards. We formalize the
requirements that program verification tools must meet to solve
these constraints and use tools from prior work as our synthesizers.
We demonstrate the feasibility of the proposed approach by syn-
thesizing programs in three different domains: arithmetic, sorting,
and dynamic programming. Using verification tools that we previously
built in the VS3 project we are able to synthesize programs
for complicated arithmetic algorithms including Strassen’s matrix
multiplication and Bresenham’s line drawing; several sorting algo-
rithms; and several dynamic programming algorithms. For these
programs, the median time for synthesis is 14 seconds, and the ratio
of synthesis to verification time ranges from 1× to 92× (with
a median of 7×), illustrating the potential of the approach.
Categories and Subject Descriptors I.2.2 [Automatic Program-
ming]: Program Synthesis; F.3.1 [Logics and Meanings of Pro-
grams]: Specifying and Verifying and Reasoning about Programs
General Terms Languages, Theory.
Keywords Proof-theoretic program synthesis, verification.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
POPL’10, January 17–23, 2010, Madrid, Spain.
Copyright © 2010 ACM 978-1-60558-479-9/10/01. . . $10.00
1. Introduction
Automated program synthesis, despite holding the promise of sig-
nificantly easing the task of programming, has received little atten-
tion due to its difficulty. Being able to mechanically construct pro-
grams has wide-ranging implications. Mechanical synthesis yields
programs that are correct-by-construction. It relieves the tedium
and error associated with programming low-level details, can aid in
automated debugging and in general leaves the human programmer
free to deal with the high-level design of the system. Additionally,
synthesis could discover new non-trivial programs that are difficult
for programmers to build.
In this paper, we present an approach to program synthesis
that takes the correct-by-construction philosophy of program de-
sign [14, 18, 38] and shows how it can be automated. Program ver-
ification tools routinely synthesize program proofs in the form of
inductive invariants for partial correctness and ranking functions
for termination. We encode the synthesis problem as a verifica-
tion problem by encoding program guards and statements as logical
facts that need to be discovered. This allows us to use certain verifi-
cation tools for synthesis. The verification tool infers the invariants
and ranking functions as usual, but in addition infers the program
statements, yielding automated program synthesis. We call our ap-
proach proof-theoretic synthesis because the proof is synthesized
alongside the program.
We define the synthesis task as requirements on the output pro-
gram: functional requirements, requirements on the form of pro-
gram expressions and guards, and requirements on the resources
used. The key to our synthesis algorithm is the reduction from the
synthesis task to three sets of constraints. The first set consists of safety
conditions that ensure the partial correctness of the loops in the program.
The second set consists of well-formedness conditions on the program
guards and statements, ensuring that the output from the verification
tool (facts corresponding to program guards and statements)
corresponds to valid guards and statements in an imperative language.
The third set consists of progress conditions that ensure that the
program terminates. To our knowledge, our approach is the first
that automatically synthesizes programs and their proofs, while
previous approaches have either used given proofs to extract pro-
grams [27] or made no attempt to generate the proof. Some ap-
proaches, while not generating proofs, do ensure correctness for a
limited class of finitizable programs [29].
To illustrate our approach, we next show how to synthesize
Bresenham’s line drawing algorithm. This example is an ideal
candidate for automated synthesis because, while the program’s
requirements are simple to specify, the actual program is quite
involved.
1.1 Motivating Example
As a motivating example, we consider a well-known algorithm
from the graphics community called Bresenham’s line drawing
algorithm, shown in Figure 1(a). The algorithm computes (and

(a) Bresenhams(int X, Y) {
      v1 := 2Y - X; y := 0; x := 0;
      while (x ≤ X)
        out[x] := y;
        if (v1 < 0)
          v1 := v1 + 2Y;
        else
          v1 := v1 + 2(Y - X); y++;
        x++;
      return out;
    }

(b) Bresenhams(int X, Y) {
      []true → v1′ = 2Y - X ∧ y′ = 0 ∧ x′ = 0
      while (x ≤ X)
        []v1 < 0 → out′ = upd(out, x, y) ∧ v1′ = v1 + 2Y ∧ y′ = y ∧ x′ = x + 1
        []v1 ≥ 0 → out′ = upd(out, x, y) ∧ v1′ = v1 + 2(Y - X) ∧ y′ = y + 1 ∧ x′ = x + 1
      return out;
    }

(c) Invariant τ:
      0 < Y ≤ X ∧ v1 = 2(x+1)Y - (2y+1)X ∧ 2(Y - X) ≤ v1 ≤ 2Y
      ∧ ∀k : 0 ≤ k < x ⇒ 2|out[k] - (Y/X)k| ≤ 1
    Ranking function ϕ: X - x

Figure 1. (a) Bresenham's line drawing algorithm. (b) The algorithm written in transition system form, with statements as equality predicates, guarded appropriately. (c) The invariant and ranking function that prove partial correctness and termination, respectively.
writes to the output array out) the discrete best-fit line from (0, 0)
to (X, Y), where the point (X, Y) is in the NE half-quadrant, i.e.,
0 < Y ≤ X. The best-fit line is one that does not deviate more
than half a pixel away from the real line, i.e., |y - (Y/X)x| ≤ 1/2.
For efficiency, the algorithm computes the pixel values (x, y) of
this best-fit line using only linear operations, but the computation is
non-trivial and the correctness of the algorithm is also not evident.
The specification for this program is succinctly written in terms
of its precondition τ_pre and postcondition τ_post:

τ_pre  : 0 < Y ≤ X
τ_post : ∀k : 0 ≤ k ≤ X ⇒ 2|out[k] - (Y/X)k| ≤ 1
Notice that in the postcondition, we have written the assertion out-
side the loop body for clarity of presentation, but it can easily
be rewritten, as a quantifier-free assertion, inside. Bresenham pro-
posed the program shown in Figure 1(a) to implement this specifi-
cation. The question we answer is whether it is possible to synthe-
size the program given just the specification and a description of
the available resources (control flow, stack space and operations).
Let us stepwise develop the idea behind synthesis starting from the
verification problem for the given program.
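Before developing the synthesis view, the program of Figure 1(a) and its specification can be exercised concretely. The following Python transcription is an illustrative sketch (Python is not the paper's language); the postcondition is checked with the inequality multiplied through by X to stay in integer arithmetic.

```python
def bresenhams(X, Y):
    # Transcription of Figure 1(a); assumes the precondition
    # 0 < Y <= X (the NE half-quadrant).
    out = {}
    v1 = 2 * Y - X
    y, x = 0, 0
    while x <= X:
        out[x] = y
        if v1 < 0:
            v1 = v1 + 2 * Y
        else:
            v1 = v1 + 2 * (Y - X)
            y = y + 1
        x = x + 1
    return out

# Postcondition tau_post, cleared of the division by X:
# for all k in [0, X], |2*(X*out[k] - Y*k)| <= X.
X, Y = 7, 3
out = bresenhams(X, Y)
assert all(abs(2 * (X * out[k] - Y * k)) <= X for k in range(X + 1))
```

Running the check over several half-quadrant inputs confirms that every written pixel stays within half a pixel of the real line.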
Observe that we can write program statements as equality predicates
and acyclic fragments as transition systems. For example, we
can write x := e as x′ = e, where x′ is a renaming of x to its output
value. We will write statements as equalities between the output
(primed) versions of the variables and the expression (over the unprimed
versions of the variables). Also, guards that direct control
flow in an imperative program can now be seen as guards for statement
facts in a transition system. Figure 1(b) shows our example
written in transition system form. To prove partial correctness, one
can write down the inductive invariant for the loop and verify that
the verification condition for the program is in fact valid. The verification
condition consists of four implications for the four paths
corresponding to the entry, exit, and one each for the branches in
the loop. Using standard verification condition generation, with the
precondition τ_pre and postcondition τ_post, and writing the renamed
version of invariant τ as τ′, these are
τ_pre ∧ s_entry ⇒ τ′
τ ∧ ¬g_loop ⇒ τ_post
τ ∧ g_loop ∧ g_body1 ∧ s_body1 ⇒ τ′
τ ∧ g_loop ∧ g_body2 ∧ s_body2 ⇒ τ′        (1)
where we use symbols for the various parts of the program:

g_body1 : v1 < 0
g_body2 : v1 ≥ 0
g_loop  : x ≤ X
s_entry : v1′ = 2Y-X ∧ y′ = 0 ∧ x′ = 0
s_body1 : out′ = upd(out, x, y) ∧ v1′ = v1+2Y ∧ y′ = y ∧ x′ = x+1
s_body2 : out′ = upd(out, x, y) ∧ v1′ = v1+2(Y-X) ∧ y′ = y+1 ∧ x′ = x+1
With a little bit of work, one can validate that the invariant τ shown
in Figure 1(c) satisfies Eq. (1). Checking the validity of given in-
variants can be automated using SMT solvers [10]. In fact, pow-
erful program verification tools now exist that can generate fixed-
point solutions—inductive invariants such as τ —automatically us-
ing constraint-based techniques [6, 21, 32], abstract interpreta-
tion [9] or model checking [3]. There are also tools that can prove
termination [7]—by inferring ranking functions such as ϕ—and to-
gether with the safety proof provide a proof for total correctness.
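Short of a full SMT proof, the invariant τ and ranking function ϕ of Figure 1(c) can at least be sanity-checked at runtime on concrete inputs. In the sketch below (an illustrative Python harness, with assertions standing in for the prover), τ is asserted at every loop head and ϕ is checked to strictly decrease.

```python
def bresenhams_checked(X, Y):
    assert 0 < Y <= X                     # precondition tau_pre
    out, v1, y, x = {}, 2 * Y - X, 0, 0
    prev_phi = None
    while True:
        # Invariant tau of Figure 1(c), checked at the loop head:
        assert v1 == 2 * (x + 1) * Y - (2 * y + 1) * X
        assert 2 * (Y - X) <= v1 <= 2 * Y
        assert all(abs(2 * (X * out[k] - Y * k)) <= X for k in range(x))
        # Ranking function phi = X - x must strictly decrease:
        phi = X - x
        assert prev_phi is None or phi < prev_phi
        prev_phi = phi
        if not (x <= X):                  # loop guard g_loop
            break
        out[x] = y
        if v1 < 0:
            v1 += 2 * Y
        else:
            v1 += 2 * (Y - X)
            y += 1
        x += 1
    return out

assert bresenhams_checked(7, 3)[7] == 3   # all runtime checks pass
```

Such runtime checking only samples the proof on particular inputs; the tools cited above establish the implications of Eq. (1) for all inputs.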
The insight behind our paper is to ask the question: if we can
infer τ in Eq. (1), then is it possible to infer the guards g_i and the
statements s_i at the same time? We have found that we can indeed
infer guards and statements as well, by suitably encoding programs
as transition systems, asserting appropriate constraints, and
then leveraging program verification techniques to do a systematic
(lattice-theoretic) search for unknowns in the constraints. Here the
unknowns now represent both the invariants and the statements and
guards. It turns out that a direct solution to the unknown guards
and statements may be uninteresting, i.e., it may not correspond
to real programs. But we illustrate that we can impose additional
well-formedness constraints on the unknown guards and statements
such that any solution to this new set of constraints corresponds
to a valid, real program. Additionally, even if we synthesize valid
programs, it may be that the programs are non-terminating. There-
fore we need to impose additional progress constraints that ensure
that the synthesized programs are ones that we can actually run.
We now illustrate the need for these well-formedness and progress
constraints over our example.
Suppose that the statements s_entry, s_body1 and s_body2 are unknown.
A trivial satisfying solution to Eq. (1) may set all these
unknowns to false. If we use a typical program verification tool
that computes least fixed-points starting from ⊥, then indeed, it will
output this solution. On the other hand, let us make the conditional
guards g_body1 and g_body2 unknown. Again, g_body1 = g_body2 = false
is a satisfying solution. We get uninteresting solutions because the
unknowns are not constrained enough to ensure valid statements
and control-flow. Statement blocks are modeled as ⋀_i x_i′ = e_i,
with one equality for each output variable x_i′, where the expressions e_i
are over input variables. Therefore, false does not correspond to
any valid block. Similarly g_body1 = g_body2 = false does not correspond
to any valid conditional with two branches. For example,
consider if (g) S1 else S2 with two branches. Note how S1 and
S2 are guarded by g and ¬g, respectively, and g ∨ ¬g holds. For
every valid conditional, the disjunction of the guards is always a
tautology. In verification, the program syntax and semantics ensure
the well-formedness of acyclic fragments. In synthesis, we will
need to explicitly constrain well-formedness of acyclic fragments
(Section 3.4).
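The tautology requirement on guard disjunctions can be pictured with a toy check. The sketch below approximates it by sampling states; the function name and the sampling approach are our own illustration, whereas the paper's tools discharge such conditions symbolically.

```python
# Approximate well-formedness check for the guards of a conditional:
# the disjunction of the guards must be a tautology, i.e., every state
# must satisfy at least one guard.
def guards_cover(guards, samples):
    return all(any(g(s) for g in guards) for s in samples)

states = range(-50, 51)            # sampled values of v1
g_body1 = lambda v1: v1 < 0
g_body2 = lambda v1: v1 >= 0

assert guards_cover([g_body1, g_body2], states)        # valid conditional
assert not guards_cover([lambda v1: False], states)    # 'false' rejected
```

The degenerate solution g_body1 = g_body2 = false fails this check on every sampled state, which is exactly why the well-formedness constraints rule it out.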
Next, suppose that the loop guard g_loop is unknown. In this case
if we attempt to solve for the unknowns τ and g_loop, then one
valid solution assigns τ = g_loop = true, which corresponds to
a non-terminating loop. In verification, we were only concerned

with partial correctness and assumed that the program was termi-
nating. In synthesis, we will need to explicitly encode progress by
inferring appropriate ranking functions, like ϕ in Figure 1(c), to
prevent the synthesizer from generating non-terminating programs
(Section 3.5).
Note that our aim is not to solve the completely general synthe-
sis problem for a given functional specification. Guards and state-
ments are unknowns but they take values from given domains, spec-
ified by the user as domain constraints, so that a lattice-theoretic
search can be performed by existing program verification tools.
Also notice that we did not attempt to change the number of invari-
ants or the invariant position in the constraints. This means that we
assume a given looping or flowgraph structure, e.g., one loop for
our example. Lastly, as opposed to verification, the set of program
variables is not known, and therefore we need a specification of the
stack space available and also a bound on the type of computations
allowed.
We use the specifications to construct an expansion that is a pro-
gram with unknown symbols and construct safety conditions over
the unknowns. We then impose the additional well-formedness and
progress constraints. We call the new constraints synthesis condi-
tions and hope to find solutions to them using program verification
tools. The constraints generated are non-standard, and therefore to
solve them we need verification tools that satisfy certain properties.
Verification tools we developed in previous work [32, 21] indeed
have those properties. We use them to efficiently solve the syn-
thesis conditions to synthesize programs, with a very acceptable
slowdown over verification.
The guards, statements and proof terms for the example in
this section come from the domain of arithmetic. Therefore, a
program verification tool for arithmetic would be appropriate. For
programs whose guards and statements are more easily expressed
in other domains, a corresponding verification tool for that domain
should be used. In fact, we have employed tools for the domains of
arithmetic and predicate abstraction for proof-theoretic synthesis
with great success. Our objective is to reuse existing verification
technology—that started with invariant validation and progressed
to invariant inference—and push it further to program synthesis.
1.2 Contributions
This paper makes the following contributions:
- We present a novel way of specifying a synthesis task as a
  triple consisting of the functional specification, the domains of
  expressions and guards that appear in the synthesized program,
  and resource constraints that the program is allowed to use
  (Section 2).
- We view program synthesis as generalized program verification.
  We formally define constraints, called synthesis conditions,
  that can be solved using verification tools (Section 3).
- We present requirements that program verification tools must
  meet in order to be used for synthesis of program statements
  and guards (Section 4).
- We build synthesizers using verification tools and present synthesis
  results for the three domains of arithmetic, sorting and
  dynamic programming (Section 5).
2. The Synthesis Scaffold and Task
We now elaborate on the specifications that a proof-theoretic ap-
proach to synthesis requires and how these also allow the user to
specify the space of interesting programs.
We describe the synthesis problem using a scaffold of the form
⟨F, D, R⟩
The three components are as follows:
1. Functional Specification The first component F of a scaffold
describes the desired precondition and postcondition of the synthesized
program. Let ~v_in and ~v_out be the vectors containing the input
and output variables, respectively. Then a functional specification
F = (F_pre(~v_in), F_post(~v_in, ~v_out)) is a tuple containing the formulae
that hold at the entry and exit program locations. For example,
for the program in Figure 1, F_pre(X, Y) ≐ (0 < Y ≤ X) and
F_post(X, Y, out) ≐ ∀k : 0 ≤ k ≤ X ⇒ 2(Y/X)k - 1 ≤ 2out[k] ≤ 2(Y/X)k + 1.
2. Domain Constraints The second component D of the scaffold
describes the domains for expressions and guards in the synthesized
program. The domain specification D = (D_exp, D_grd) is a tuple that
constrains the respective components:
2a. Program Expressions: The expressions manipulated by the program
come from the domain D_exp.
2b. Program Guards: The logical guards (boolean expressions)
used to direct control flow in the program come from the domain D_grd.
For example, for the program in Figure 1, the domains D_exp, D_grd
are both linear arithmetic.
3. Resource Constraints The third component R of the scaffold
describes the resources that the synthesized program can use. The
resource specification R = (R_flow, R_stack, R_comp) is a triple of
resource templates that the user must specify for the flowgraph,
stack and computation, respectively:
3a. Flowgraph Template We restrict attention to structured programs
(those that are goto-less, or whose flowgraphs are reducible [22]).
The structured nature of such flowgraphs allows
us to describe them using simple strings. The user specifies a
string R_flow from the following grammar:

T ::= ◦ | ∗(T) | T;T        (2)

where ◦ denotes an acyclic fragment of the flow graph, ∗(T)
denotes a loop containing the body T and T;T denotes the
sequential composition of two flow graphs. For example, for
the program in Figure 1, R_flow = ◦;∗(◦).
3b. Stack Template A map R_stack : type → int indicating the
number of extra temporary variables of each type available
to the program. For example, for the program in Figure 1,
R_stack = (int, 1).
3c. Computation Template At times it may be important to put an
upper bound on the number of times an operation is performed
inside a procedure. A map R_comp : op → int of operations op
to the upper bound specifies this constraint. For example, for
the program in Figure 1, R_comp = ∅, which indicates that there
are no constraints on computation.
On the one hand, the resource templates make synthesis tractable
by enabling a systematic lattice-theoretic search, while on the other
they allow the user to specify the space of interesting programs and
can be used as a feature. For instance, the user may wish to reduce
memory consumption at the expense of a more complex flowgraph
and still meet the functional specification. If the user does not care,
then the resource templates can be considered optional and left
unspecified. In this case, the synthesizer can iteratively enumerate
possibilities for each resource and attempt synthesis with increas-
ing resources.
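Pulling the three components together, a scaffold ⟨F, D, R⟩ can be pictured as a small record. The encoding below is an illustrative Python sketch (the field names and the ASCII flowgraph notation, "o" for an acyclic fragment and "*(T)" for a loop, are our own, not the paper's concrete syntax), instantiated for the Figure 1 example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class Scaffold:
    pre: Callable[..., bool]        # F_pre over input variables
    post: Callable[..., bool]       # F_post over inputs and outputs
    dom_exp: str                    # D_exp
    dom_grd: str                    # D_grd
    flowgraph: str                  # R_flow
    stack: Dict[str, int]           # R_stack: extra temporaries per type
    comp: Optional[Dict[str, int]]  # R_comp: per-op bounds (None = unconstrained)

# The scaffold for the line-drawing program of Figure 1:
line_scaffold = Scaffold(
    pre=lambda X, Y: 0 < Y <= X,
    post=lambda X, Y, out: all(abs(2 * (X * out[k] - Y * k)) <= X
                               for k in range(X + 1)),
    dom_exp="linear arithmetic",
    dom_grd="linear arithmetic",
    flowgraph="o;*(o)",
    stack={"int": 1},
    comp=None,
)
assert line_scaffold.pre(7, 3) and not line_scaffold.pre(3, 7)
```

Encoding the precondition and postcondition as executable predicates also makes the functional specification directly testable against candidate outputs.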
2.1 Picking a proof domain and a solver for the domain
Our synthesis approach is proof-theoretic and we synthesize the
proof terms, i.e., invariants and ranking functions, alongside the
program. These proof terms will take values from a suitably chosen
proof domain D_prf. Notice that D_prf will be at least as expressive
as D_grd and D_exp. The user chooses an appropriate proof domain
and also picks a solver capable of handling that domain. We will
use program verification tools as solvers and typically, the user will
pick the most powerful verification tool available for the chosen
proof domain.
2.2 Synthesis Task
Given a scaffold ⟨F, D, R⟩, we call an executable program valid
with respect to the scaffold if it meets the following conditions.
- When called with inputs ~v_in that satisfy F_pre(~v_in) the program
  terminates, and the resulting outputs ~v_out satisfy F_post(~v_in, ~v_out).
  There are associated invariants and ranking functions that provide
  a proof of this fact.
- There is a program loop (with an associated loop guard g)
  corresponding to each loop annotation (specified by "∗") in
  the flowgraph template R_flow. The program contains statements
  from the following imperative language IML for each acyclic
  fragment (specified by "◦").

  S ::= skip | S;S | x := e | if g then S else S

  Where x denotes a variable, e denotes some expression, and g
  denotes some predicate. (Memory reads and writes are modeled
  using memory variables and select/update expressions.) The
  domain of expressions and guards is as specified by the scaffold,
  i.e., e ∈ D_exp and g ∈ D_grd.
- The program only uses as many local variables as specified by
  R_stack in addition to the input and output variables ~v_in, ~v_out.
- Each elementary operation only appears as many times as specified
  in R_comp.
EXAMPLE 1 (Square Root). Let us consider a scaffold with functional
specification F = (x ≥ 1, (i-1)² ≤ x < i²), which states
that the program computes the integral square root of the input x,
i.e., i-1 = ⌊√x⌋. Also, let the domain constraints D_exp, D_grd
be limited to linear arithmetic expressions, which means that the
program cannot use any native square root or squaring operations.
Lastly, let the R_flow, R_stack and R_comp be ◦;∗(◦);◦, {(int, 1)} and
∅, respectively. A program that is valid with respect to this scaffold
is the following:

IntSqrt(int x) {
  v:=1; i:=1;
  while_τ (v ≤ x)
    v:=v+2i+1; i++;
  return i-1;
}

Invariant τ: v = i² ∧ x ≥ (i-1)² ∧ i ≥ 1
Ranking function ϕ: x - (i-1)²

where v, i are the additional stack variable and loop iteration
counter (and reused in the output), respectively. Also, the loop
is annotated with the invariant τ and ranking function ϕ as shown,
which prove partial correctness and termination, respectively.
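The synthesized IntSqrt and its proof annotations can be replayed directly. The Python transcription below is an illustrative sketch in which assertions stand in for the verifier: the invariant τ is checked at each loop head and the postcondition at exit.

```python
import math

def int_sqrt(x):
    # Transcription of the synthesized IntSqrt of Example 1.
    assert x >= 1                        # F_pre
    v, i = 1, 1
    while v <= x:
        # Invariant tau: v = i^2  and  x >= (i-1)^2  and  i >= 1
        assert v == i * i and x >= (i - 1) ** 2 and i >= 1
        v = v + 2 * i + 1
        i += 1
    assert (i - 1) ** 2 <= x < i * i     # F_post: i-1 = floor(sqrt(x))
    return i - 1

assert int_sqrt(16) == 4
assert all(int_sqrt(n) == math.isqrt(n) for n in range(1, 500))
```

The loop maintains v = i² using only linear updates (v + 2i + 1 = (i+1)²), which is how the program respects the domain constraint that bars native squaring.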
In the next two sections, we formally describe the steps of
our synthesis algorithm. We first generate synthesis conditions
(Section 3), which are constraints over unknowns for statements,
guards, loop invariants and ranking functions. We then observe that
they resemble verification conditions, and we can employ verifica-
tion tools, if they have certain properties, to solve them (Section 4).
3. Synthesis Conditions
In this section, we define and construct synthesis conditions for
an input scaffold ⟨F, D, R⟩. Using the resource specification R,
we first generate a program with unknowns corresponding to the
fragments we wish to synthesize. Synthesis conditions then specify
constraints on these unknowns and ensure partial correctness, loop
termination and well-formedness of control-flow. We begin our
discussion by motivating the representation we use for acyclic
fragments in the synthesized program.
3.1 Using Transition Systems to Represent Acyclic Code
Suppose we want to infer a set of (straight-line) statements that
transform a precondition φ_pre to a postcondition φ_post, where the
relevant program variables are x and y. One approach might be to
generate statements that assign unknown expressions e_x and e_y to
x and y, respectively:

{φ_pre} x := e_x; y := e_y {φ_post}

Then we can use Hoare's axiom for assignment to generate the
verification condition φ_pre ⇒ (φ_post[y ↦ e_y])[x ↦ e_x]. However,
this verification condition is hard to automatically reason about
because it contains substitution into unknowns. Even worse, we
have restricted the search space by requiring the assignment to
y to follow the assignment to x, and by specifying exactly two
assignments.
Instead we will represent the computation as a transition system,
which provides a much cleaner mechanism for reasoning when program
statements are unknown. A transition in a transition system
is a (possibly parallel) mapping of the input variables to the output
variables. Variables have an input version and an output version (indicated
by primed names), which allows them to change state. For
our example, we can write a single transition:

{φ_pre} ⟨x′, y′⟩ = ⟨e_x, e_y⟩ {φ′_post}

Here φ′_post is the postcondition, written in terms of the output
variables, and e_x, e_y are expressions over the input variables. The
verification condition corresponding to this tuple is φ_pre ∧ x′ = e_x ∧ y′ = e_y ⇒ φ′_post.
Note that every state update (assignment) can always be written as a transition.
We can extend this approach to arbitrary acyclic program fragments.
A guarded transition (written []g → s) contains a statement
s that is executed only if the quantifier-free guard g holds. A
transition system consists of a set {[]g_i → s_i}_i of guarded transitions.
It is easy to see that a transition system can represent any
arbitrary acyclic program fragment by suitably enumerating the
paths through the acyclic fragment. The verification condition for
{φ_pre} {[]g_i → s_i}_i {φ′_post} is simply ⋀_i (φ_pre ∧ g_i ∧ s_i ⇒ φ′_post).
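This verification condition can be approximated by exhaustive checking over a small finite state space, which is useful for building intuition; a real tool discharges the same formula with a constraint solver, and all names below are illustrative.

```python
def check_vc(pre, guarded_transitions, post, states):
    # Checks /\_i (phi_pre /\ g_i /\ s_i => phi_post') by enumerating
    # states; each s_i is given as a function from state to next state.
    for s in states:
        if not pre(s):
            continue
        for guard, trans in guarded_transitions:
            if guard(s) and not post(trans(s)):
                return False
    return True

# pre: x >= 0; transitions: [x < 10 -> x' = x+1, x >= 10 -> x' = 0];
# post: x' >= 0.
assert check_vc(
    pre=lambda x: x >= 0,
    guarded_transitions=[(lambda x: x < 10, lambda x: x + 1),
                         (lambda x: x >= 10, lambda x: 0)],
    post=lambda x: x >= 0,
    states=range(-20, 21),
)
```

Because guards, statements, and pre/postconditions all appear as plain predicates here, any of them could in principle be left as the unknown to solve for, which is exactly the symmetry the next paragraph exploits.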
In addition to the simplicity afforded by the lack of any ordering,
the constraints from transition systems are attractive for synthesis
as the program statements s_i and guards g_i are facts just
like the pre- and postconditions φ_pre and φ′_post. Given the lack of
differentiation, any (or all) can be unknowns in these synthesis conditions.
This distinguishes them from verification conditions, which
can have only unknown invariants, or in which even the invariants must be
known as well.
Synthesis conditions can thus be viewed as generalizations of
verification conditions. Program verification tools routinely infer
fixed-point solutions (invariants) that satisfy the verification condi-
tions with known statements and guards. With our formulation of
statements and guards as just additional facts in the constraints, it
is possible to use (sufficiently general) verification tools to infer in-
variants and program statements and guards. Synthesis conditions
serve an analogous purpose to synthesis as verification conditions
do to verification. If a program is correct (verifiable), then its veri-
fication condition is valid. Similarly, if a valid program exists for a
scaffold, then its synthesis condition has a satisfying solution.
3.2 Expanding a flowgraph
We synthesize code fragments for each acyclic fragment and loop
annotation in the flowgraph template as follows:

Acyclic fragments: For each acyclic fragment annotation "◦",
we infer a transition system {g_i → s_i}_i, i.e., a set of assignments
s_i, stated as conjunctions of equality predicates, guarded
by quantifier-free first-order-logic (FOL) guards g_i such that the
disjunction of the guards is a tautology. Suitably constructed
equality predicates and quantifier-free FOL guards are later
translated to executable code—assignment statements and conditional
guards, respectively—in the language IML.
Loops: For each loop annotation "∗" we infer three elements.
The first is the inductive loop invariant τ, which establishes
partial correctness of each loop iteration. The second is the
ranking function ϕ, which proves the termination of the loop.
Both the invariant and ranking function take values from the
proof domain, i.e., τ, ϕ ∈ D_prf. Third, we infer a quantifier-free
FOL loop guard g.
Formally, the output of expanding flowgraphs will be a program
in the transition system language TSL (note the correspondence to
the flowgraph grammar from Eq. 2):

p ::= choose {[]g_i → s_i}_i | while_τ(g) do {p} | p;p

Here each s_i is a conjunction of equality predicates, i.e., ⋀_j (x_j = e_j).
We will use ~p to denote a sequence of program statements
in TSL. Note that we model memory reads and updates using
select/update predicates. Therefore, in x = e the variable x could
be a memory variable and e could be a memory select or update
expression.
Given a string for a flowgraph template, we define an expansion
function Expand : int × D_prf × R × D × R_flow → TSL that
introduces fresh unknowns for missing guards, statements and invariants
that are to be synthesized. Expand^{n,D_prf}_{D,R}(R_flow) expands a
flowgraph R_flow and is parametrized by an integer n that indicates
the number of transitions each acyclic fragment will be expanded
to, the proof domain, and the resource and domain constraints. The
expansion outputs a program in the language TSL.

Expand^{n,D_prf}_{D,R}(◦) = choose {[]g_i → s_i}_{i=1..n}    (g_i, s_i : fresh unknowns)
Expand^{n,D_prf}_{D,R}(∗(T)) = while_τ(g) { Expand^{n,D_prf}_{D,R}(T); }    (τ, ϕ, g : fresh unknowns)
Expand^{n,D_prf}_{D,R}(T1;T2) = Expand^{n,D_prf}_{D,R}(T1);Expand^{n,D_prf}_{D,R}(T2)
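The expansion can be sketched as a small recursive rewriter over the template string. The Python below is illustrative (ASCII "o" for an acyclic fragment, "*(T)" for a loop, and the tuple-based term representation are our own choices):

```python
import itertools

_fresh = itertools.count(1)

def expand(template, n=1):
    # Rewrites a flowgraph template (grammar T ::= o | *(T) | T;T) into
    # a TSL-like term, introducing fresh unknowns g/s for each acyclic
    # fragment and tau/phi/g for each loop.
    terms, i = [], 0
    while i < len(template):
        c = template[i]
        if c == 'o':            # acyclic fragment: n guarded transitions
            ts = [(f"g{next(_fresh)}", f"s{next(_fresh)}") for _ in range(n)]
            terms.append(("choose", ts))
            i += 1
        elif c == '*':          # loop *(T)
            depth, j = 1, i + 2
            while depth:        # locate the matching ')'
                depth += {'(': 1, ')': -1}.get(template[j], 0)
                j += 1
            terms.append(("while", f"tau{next(_fresh)}", f"phi{next(_fresh)}",
                          f"g{next(_fresh)}", expand(template[i + 2:j - 1], n)))
            i = j
        else:                   # ';' separator
            i += 1
    return terms

exp_sqrt = expand("o;*(o);o")   # the square-root template of Example 2
```

Expanding "o;*(o);o" yields three top-level terms: a guarded transition, a loop with fresh τ, ϕ, g unknowns wrapping another guarded transition, and a final guarded transition, mirroring the structure exp_sqrt of Example 2.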
Each unknown g, s, τ generated during the expansion has the fol-
lowing domain inclusion constraints.
τ D
prf
|
V
g D
grd
|
V
s
V
i
x
i
= e
i
where x
i
V, e
i
D
exp
|
V
Here V = ~v_in ∪ ~v_out ∪ T ∪ L is the set of variables: the input ~v_in
and output ~v_out variables, the set T of temporaries (local variables)
as specified by R_stack, and the set L of iteration counters and ranking
function tracker variables (which we elaborate on later), one
for each loop in the expansion. The restriction of the domains by
the variable set V indicates that we are interested in the fragment
of the domain over the variables in V. Also, the set of operations in
e_i is bounded by R_comp.
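The recursive structure of Expand can be sketched as a small function over a flowgraph-template AST. The tuple representation of templates and TSL programs, and the fresh-name scheme, are illustrative assumptions for this sketch, not the paper's implementation:

```python
import itertools

# Fresh-name supply shared by unknown guards (g), statements (s),
# loop invariants (tau) and ranking functions (phi).
_counter = itertools.count(1)

def fresh(kind):
    return f"{kind}{next(_counter)}"

# Flowgraph templates as nested tuples (hypothetical encoding):
#   "acyclic"        -- an acyclic fragment
#   ("loop", T)      -- a loop whose body is template T
#   ("seq", T1, T2)  -- sequential composition T1;T2
def expand(template, n):
    """Expand a flowgraph template into a TSL program with fresh unknowns.

    n is the number of guarded transitions each acyclic fragment
    expands to; the result is a nested-tuple TSL program.
    """
    if template == "acyclic":
        # choose {[]g_i -> s_i} for i = 1..n, all unknowns fresh
        return ("choose", [(fresh("g"), fresh("s")) for _ in range(n)])
    kind = template[0]
    if kind == "loop":
        # while_tau(g) { expand(body) } with fresh invariant tau,
        # ranking function phi, and loop guard g
        tau, phi, g = fresh("tau"), fresh("phi"), fresh("g")
        return ("while", tau, phi, g, expand(template[1], n))
    if kind == "seq":
        return ("seq", expand(template[1], n), expand(template[2], n))
    raise ValueError(f"unknown template node: {template!r}")

# A scaffold of the square-root shape: acyclic; loop(acyclic); acyclic
sqrt_template = ("seq", "acyclic", ("seq", ("loop", "acyclic"), "acyclic"))
prog = expand(sqrt_template, n=1)
```

The domain-inclusion constraints are not enforced here; in the actual system each fresh unknown additionally carries its membership constraint (τ ∈ D_prf|_V, and so on).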
The expansion has some similarities to the notion of a user-
specified sketch in previous approaches [31, 29]. However, the un-
knowns in the expansion here are more expressive than the integer
unknowns considered earlier, and this allows us to perform a lattice
search as opposed to the combinatorial approaches proposed ear-
lier. Notice that the unknowns τ, g, s, ϕ we introduce can all be in-
terpreted as boolean formulae (τ, g naturally; s using our transition
modeling; and ϕ as ϕ > c, for some constant c), and consequently
ordered in a lattice.
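As a concrete (hypothetical) instance of this ordering, conjunctions of predicates drawn from a candidate set form a lattice under implication: adding conjuncts strengthens a formula, conjunction is the meet, and dropping conjuncts common to neither is the join. A minimal sketch, with predicate strings chosen purely for illustration:

```python
# A conjunction of predicates, modeled as a frozenset of its conjuncts.
# Ordering by implication: a => b iff a contains every conjunct of b.
def implies(a, b):
    return b <= a

def meet(a, b):
    """a ∧ b: the weakest formula implying both (union of conjuncts)."""
    return a | b

def join(a, b):
    """The strongest formula implied by both (shared conjuncts)."""
    return a & b

weak   = frozenset({"x >= 0"})
strong = frozenset({"x >= 0", "v == 2*i + 1"})

assert implies(strong, weak)        # the stronger formula implies the weaker
assert meet(weak, strong) == strong
assert join(weak, strong) == weak
```

A lattice search for the unknowns can then move monotonically through this space (strengthening or weakening candidate formulae) rather than enumerating assignments combinatorially.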
EXAMPLE 2. Let us revisit the integral square root computation
from Example 1. Expanding the flowgraph template ◦;∗(◦);◦ with
n = 1 yields exp_sqrt:

    choose {[]g_1 → s_1};
    while_τ(g_0) {
      choose {[]g_2 → s_2};
    };
    choose {[]g_3 → s_3}

    τ ∈ D_prf|_V
    g_1, g_2, g_3 ∈ D_grd|_V
    s_1, s_2, s_3 ∈ ∧_i (x_i = e_i)    where x_i ∈ V, e_i ∈ D_exp|_V

where V = {x, i, r, v}. The variables i and r are the loop iteration
counter and ranking function tracker variable, respectively, and
v is the additional local variable. Also, the chosen domains for
proofs D_prf, guards D_grd and expressions D_exp are FOL facts over
quadratic expressions, FOL facts over linear arithmetic, and linear
arithmetic, respectively.
Notice that the expansion encodes everything specified by the do-
main and resource constraints and the chosen proof domain. The
only remaining specification is F, which we will use in the next
section to construct safety conditions over the expanded scaffold.
3.3 Encoding Partial Correctness: Safety Conditions
Now that we have the expanded scaffold, we need to collect the
constraints (safety conditions) for partial correctness implied by the
simple paths in the expansion. Simple paths (straight-line sequences
of statements) start at the program entry or a loop header and end
at a loop header or the program exit. The loop headers, program
entry and program exit are annotated with invariants, the precondition
F_pre and the postcondition F_post, respectively.
Let φ denote formulae that represent pre- and postconditions
and constraints. Then we define PathC : φ × TSL × φ → φ
as a function that takes a precondition, a sequence of statements
and a postcondition, and outputs safety constraints that encode the
validity of the Hoare triple. Let us first describe the simple cases of
constraints from a single acyclic fragment and loop:

    PathC(φ_pre, (choose {[]g_i → s_i}_i), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ φ_post′)

    PathC(φ_pre, (while_τ(g) {~p_l}), φ_post) =
        (φ_pre ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ (τ ∧ ¬g ⇒ φ_post′)
Here φ_post′ and τ′ are the postcondition φ_post and invariant τ but
with all variables renamed to their output (primed) versions. Since
the constraints need to refer to output postconditions and invariants,
the rule for a sequence of statements is a bit more involved. For sim-
plicity of presentation, we assume that acyclic fragments do not
appear in succession. This assumption holds without loss of gener-
ality because it is always possible to collapse consecutive acyclic
fragments, e.g., two consecutive acyclic fragments with n transi-
tions each can be collapsed into a single acyclic fragment with n²
transitions. For efficiency, it is prudent not to make this assumption
in practice, and the construction here generalizes easily. For a se-
quence of statements in TSL, under the above assumption, there
are three cases to consider. First, a loop followed by statements
~p. Second, an acyclic fragment followed by just a loop. Third, an
acyclic fragment, followed by a loop, followed by statements ~p.
Each of these generates the following, respective, constraints:
    PathC(φ_pre, (while_τ(g) {~p_l}; ~p), φ_post) =
        (φ_pre ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ PathC(τ ∧ ¬g, ~p, φ_post)

    PathC(φ_pre, (choose {[]g_i → s_i}_i; while_τ(g) {~p_l}), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ (τ ∧ ¬g ⇒ φ_post′)

    PathC(φ_pre, (choose {[]g_i → s_i}_i; while_τ(g) {~p_l}; ~p), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ PathC(τ ∧ ¬g, ~p, φ_post)
