
From Program Verification to Program Synthesis
Saurabh Srivastava
University of Maryland, College Park
saurabhs@cs.umd.edu
Sumit Gulwani
Microsoft Research, Redmond
sumitg@microsoft.com
Jeffrey S. Foster
University of Maryland, College Park
jfoster@cs.umd.edu
Abstract
This paper describes a novel technique for the synthesis of imper-
ative programs. Automated program synthesis has the potential to
make programming and the design of systems easier by allowing
programs to be specified at a higher-level than executable code. In
our approach, which we call proof-theoretic synthesis, the user pro-
vides an input-output functional specification, a description of the
atomic operations in the programming language, and a specifica-
tion of the synthesized program’s looping structure, allowed stack
space, and bound on usage of certain operations. Our technique
synthesizes a program, if there exists one, that meets the input-
output specification and uses only the given resources.
The insight behind our approach is to interpret program synthe-
sis as generalized program verification, which allows us to bring
verification tools and techniques to program synthesis. Our syn-
thesis algorithm works by creating a program with unknown state-
ments, guards, inductive invariants, and ranking functions. It then
generates constraints that relate the unknowns and enforces three
kinds of requirements: partial correctness, loop termination, and
well-formedness conditions on program guards. We formalize the
requirements that program verification tools must meet to solve
these constraints and use tools from prior work as our synthesizers.
We demonstrate the feasibility of the proposed approach by syn-
thesizing programs in three different domains: arithmetic, sorting,
and dynamic programming. Using verification tools that we previously
built in the VS3 project we are able to synthesize programs
for complicated arithmetic algorithms including Strassen’s matrix
multiplication and Bresenham’s line drawing; several sorting algo-
rithms; and several dynamic programming algorithms. For these
programs, the median time for synthesis is 14 seconds, and the ratio
of synthesis to verification time ranges from 1× to 92× (with
a median of 7×), illustrating the potential of the approach.
Categories and Subject Descriptors I.2.2 [Automatic Program-
ming]: Program Synthesis; F.3.1 [Logics and Meanings of Pro-
grams]: Specifying and Verifying and Reasoning about Programs
General Terms Languages, Theory.
Keywords Proof-theoretic program synthesis, verification.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
POPL’10, January 17–23, 2010, Madrid, Spain.
Copyright © 2010 ACM 978-1-60558-479-9/10/01. . . $10.00
1. Introduction
Automated program synthesis, despite holding the promise of sig-
nificantly easing the task of programming, has received little atten-
tion due to its difficulty. Being able to mechanically construct pro-
grams has wide-ranging implications. Mechanical synthesis yields
programs that are correct-by-construction. It relieves the tedium
and error associated with programming low-level details, can aid in
automated debugging and in general leaves the human programmer
free to deal with the high-level design of the system. Additionally,
synthesis could discover new non-trivial programs that are difficult
for programmers to build.
In this paper, we present an approach to program synthesis
that takes the correct-by-construction philosophy of program de-
sign [14, 18, 38] and shows how it can be automated. Program ver-
ification tools routinely synthesize program proofs in the form of
inductive invariants for partial correctness and ranking functions
for termination. We encode the synthesis problem as a verifica-
tion problem by encoding program guards and statements as logical
facts that need to be discovered. This allows us to use certain verifi-
cation tools for synthesis. The verification tool infers the invariants
and ranking functions as usual, but in addition infers the program
statements, yielding automated program synthesis. We call our ap-
proach proof-theoretic synthesis because the proof is synthesized
alongside the program.
We define the synthesis task as requirements on the output pro-
gram: functional requirements, requirements on the form of pro-
gram expressions and guards, and requirements on the resources
used. The key to our synthesis algorithm is the reduction from the
synthesis task to three sets of constraints. The first set consists of safety
conditions that ensure the partial correctness of the loops in the program.
The second set consists of well-formedness conditions on the program
guards and statements, ensuring that the output from the verification
tool (facts corresponding to program guards and statements)
corresponds to valid guards and statements in an imperative language.
The third set consists of progress conditions that ensure that the
program terminates. To our knowledge, our approach is the first
that automatically synthesizes programs and their proofs, while
previous approaches have either used given proofs to extract pro-
grams [27] or made no attempt to generate the proof. Some ap-
proaches, while not generating proofs, do ensure correctness for a
limited class of finitizable programs [29].
To illustrate our approach, we next show how to synthesize
Bresenham’s line drawing algorithm. This example is an ideal
candidate for automated synthesis because, while the program’s
requirements are simple to specify, the actual program is quite
involved.
1.1 Motivating Example
As a motivating example, we consider a well-known algorithm
from the graphics community called Bresenham’s line drawing
algorithm, shown in Figure 1(a). The algorithm computes (and

(a) Bresenhams(int X, Y) {
      v1 := 2Y - X; y := 0; x := 0;
      while (x ≤ X)
        out[x] := y;
        if (v1 < 0)
          v1 := v1 + 2Y;
        else
          v1 := v1 + 2(Y - X); y++;
        x++;
      return out;
    }

(b) Bresenhams(int X, Y) {
      []true → v1′ = 2Y - X ∧ y′ = 0 ∧ x′ = 0
      while (x ≤ X)
        []v1 < 0 → out′ = upd(out, x, y) ∧ v1′ = v1 + 2Y ∧ y′ = y ∧ x′ = x + 1
        []v1 ≥ 0 → out′ = upd(out, x, y) ∧ v1′ = v1 + 2(Y - X) ∧ y′ = y + 1 ∧ x′ = x + 1
      return out;
    }

(c) Invariant τ:
      0 < Y ≤ X ∧ v1 = 2(x+1)Y - (2y+1)X ∧ 2(Y - X) ≤ v1 ≤ 2Y
      ∧ ∀k : 0 ≤ k < x ⇒ 2|out[k] - (Y/X)k| ≤ 1
    Ranking function ϕ: X - x

Figure 1. (a) Bresenham's line drawing algorithm. (b) The algorithm written in transition system form, with statements as equality predicates, guarded appropriately. (c) The invariant and ranking function that prove partial correctness and termination, respectively.
writes to the output array out) the discrete best-fit line from (0, 0)
to (X, Y), where the point (X, Y) is in the NE half-quadrant, i.e.,
0 < Y ≤ X. The best-fit line is one that does not deviate more
than half a pixel away from the real line, i.e., |y - (Y/X)x| ≤ 1/2.
For efficiency, the algorithm computes the pixel values (x, y) of
this best-fit line using only linear operations, but the computation is
non-trivial and the correctness of the algorithm is also not evident.
The specification for this program is succinctly written in terms
of its precondition τ_pre and postcondition τ_post:

τ_pre  : 0 < Y ≤ X
τ_post : ∀k : 0 ≤ k ≤ X ⇒ 2|out[k] - (Y/X)k| ≤ 1
Notice that in the postcondition, we have written the assertion out-
side the loop body for clarity of presentation, but it can easily
be rewritten, as a quantifier-free assertion, inside. Bresenham pro-
posed the program shown in Figure 1(a) to implement this specifi-
cation. The question we answer is whether it is possible to synthe-
size the program given just the specification and a description of
the available resources (control flow, stack space and operations).
Let us stepwise develop the idea behind synthesis starting from the
verification problem for the given program.
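Before developing the synthesis view, the program of Figure 1(a) and its specification can be exercised concretely. The following Python transcription is an illustrative sketch (Python is not the paper's language); the postcondition is checked with the inequality multiplied through by X to stay in integer arithmetic.

```python
def bresenhams(X, Y):
    # Transcription of Figure 1(a); assumes the precondition
    # 0 < Y <= X (the NE half-quadrant).
    out = {}
    v1 = 2 * Y - X
    y, x = 0, 0
    while x <= X:
        out[x] = y
        if v1 < 0:
            v1 = v1 + 2 * Y
        else:
            v1 = v1 + 2 * (Y - X)
            y = y + 1
        x = x + 1
    return out

# Postcondition tau_post, cleared of the division by X:
# for all k in [0, X], |2*(X*out[k] - Y*k)| <= X.
X, Y = 7, 3
out = bresenhams(X, Y)
assert all(abs(2 * (X * out[k] - Y * k)) <= X for k in range(X + 1))
```

Running the check over several half-quadrant inputs confirms that every written pixel stays within half a pixel of the real line.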
Observe that we can write program statements as equality predicates
and acyclic fragments as transition systems. For example, we
can write x := e as x′ = e, where x′ is a renaming of x to its output
value. We will write statements as equalities between the output
(primed) versions of the variables and the expression (over the unprimed
versions of the variables). Also, guards that direct control
flow in an imperative program can now be seen as guards for statement
facts in a transition system. Figure 1(b) shows our example
written in transition system form. To prove partial correctness, one
can write down the inductive invariant for the loop and verify that
the verification condition for the program is in fact valid. The verification
condition consists of four implications for the four paths
corresponding to the entry, exit, and one each for the branches in
the loop. Using standard verification condition generation, with the
precondition τ_pre and postcondition τ_post, and writing the renamed
version of invariant τ as τ′, these are
τ_pre ∧ s_entry ⇒ τ′
τ ∧ ¬g_loop ⇒ τ_post
τ ∧ g_loop ∧ g_body1 ∧ s_body1 ⇒ τ′
τ ∧ g_loop ∧ g_body2 ∧ s_body2 ⇒ τ′        (1)
where we use symbols for the various parts of the program:

g_body1 : v1 < 0
g_body2 : v1 ≥ 0
g_loop  : x ≤ X
s_entry : v1′ = 2Y-X ∧ y′ = 0 ∧ x′ = 0
s_body1 : out′ = upd(out, x, y) ∧ v1′ = v1+2Y ∧ y′ = y ∧ x′ = x+1
s_body2 : out′ = upd(out, x, y) ∧ v1′ = v1+2(Y-X) ∧ y′ = y+1 ∧ x′ = x+1
With a little bit of work, one can validate that the invariant τ shown
in Figure 1(c) satisfies Eq. (1). Checking the validity of given in-
variants can be automated using SMT solvers [10]. In fact, pow-
erful program verification tools now exist that can generate fixed-
point solutions—inductive invariants such as τ —automatically us-
ing constraint-based techniques [6, 21, 32], abstract interpreta-
tion [9] or model checking [3]. There are also tools that can prove
termination [7]—by inferring ranking functions such as ϕ—and to-
gether with the safety proof provide a proof for total correctness.
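Short of a full SMT proof, the invariant τ and ranking function ϕ of Figure 1(c) can at least be sanity-checked at runtime on concrete inputs. In the sketch below (an illustrative Python harness, with assertions standing in for the prover), τ is asserted at every loop head and ϕ is checked to strictly decrease.

```python
def bresenhams_checked(X, Y):
    assert 0 < Y <= X                     # precondition tau_pre
    out, v1, y, x = {}, 2 * Y - X, 0, 0
    prev_phi = None
    while True:
        # Invariant tau of Figure 1(c), checked at the loop head:
        assert v1 == 2 * (x + 1) * Y - (2 * y + 1) * X
        assert 2 * (Y - X) <= v1 <= 2 * Y
        assert all(abs(2 * (X * out[k] - Y * k)) <= X for k in range(x))
        # Ranking function phi = X - x must strictly decrease:
        phi = X - x
        assert prev_phi is None or phi < prev_phi
        prev_phi = phi
        if not (x <= X):                  # loop guard g_loop
            break
        out[x] = y
        if v1 < 0:
            v1 += 2 * Y
        else:
            v1 += 2 * (Y - X)
            y += 1
        x += 1
    return out

assert bresenhams_checked(7, 3)[7] == 3   # all runtime checks pass
```

Such runtime checking only samples the proof on particular inputs; the tools cited above establish the implications of Eq. (1) for all inputs.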
The insight behind our paper is to ask the question: if we can
infer τ in Eq. (1), then is it possible to infer the guards g_i and the
statements s_i at the same time? We have found that we can indeed
infer guards and statements as well, by suitably encoding programs
as transition systems, asserting appropriate constraints, and
then leveraging program verification techniques to do a systematic
(lattice-theoretic) search for unknowns in the constraints. Here the
unknowns now represent both the invariants and the statements and
guards. It turns out that a direct solution to the unknown guards
and statements may be uninteresting, i.e., it may not correspond
to real programs. But we illustrate that we can impose additional
well-formedness constraints on the unknown guards and statements
such that any solution to this new set of constraints corresponds
to a valid, real program. Additionally, even if we synthesize valid
programs, it may be that the programs are non-terminating. There-
fore we need to impose additional progress constraints that ensure
that the synthesized programs are ones that we can actually run.
We now illustrate the need for these well-formedness and progress
constraints over our example.
Suppose that the statements s_entry, s_body1 and s_body2 are unknown.
A trivial satisfying solution to Eq. (1) may set all these
unknowns to false. If we use a typical program verification tool
that computes least fixed-points starting from ⊥, then indeed, it will
output this solution. On the other hand, let us make the conditional
guards g_body1 and g_body2 unknown. Again, g_body1 = g_body2 = false
is a satisfying solution. We get uninteresting solutions because the
unknowns are not constrained enough to ensure valid statements
and control-flow. Statement blocks are modeled as ⋀_i x_i′ = e_i,
with one equality for each output variable x_i′, where the expressions e_i
are over input variables. Therefore, false does not correspond to
any valid block. Similarly g_body1 = g_body2 = false does not correspond
to any valid conditional with two branches. For example,
consider if (g) S1 else S2 with two branches. Note how S1 and
S2 are guarded by g and ¬g, respectively, and g ∨ ¬g holds. For
every valid conditional, the disjunction of the guards is always a
tautology. In verification, the program syntax and semantics ensure
the well-formedness of acyclic fragments. In synthesis, we will
need to explicitly constrain well-formedness of acyclic fragments
(Section 3.4).
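The tautology requirement on guard disjunctions can be pictured with a toy check. The sketch below approximates it by sampling states; the function name and the sampling approach are our own illustration, whereas the paper's tools discharge such conditions symbolically.

```python
# Approximate well-formedness check for the guards of a conditional:
# the disjunction of the guards must be a tautology, i.e., every state
# must satisfy at least one guard.
def guards_cover(guards, samples):
    return all(any(g(s) for g in guards) for s in samples)

states = range(-50, 51)            # sampled values of v1
g_body1 = lambda v1: v1 < 0
g_body2 = lambda v1: v1 >= 0

assert guards_cover([g_body1, g_body2], states)        # valid conditional
assert not guards_cover([lambda v1: False], states)    # 'false' rejected
```

The degenerate solution g_body1 = g_body2 = false fails this check on every sampled state, which is exactly why the well-formedness constraints rule it out.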
Next, suppose that the loop guard g_loop is unknown. In this case
if we attempt to solve for the unknowns τ and g_loop, then one
valid solution assigns τ = g_loop = true, which corresponds to
a non-terminating loop. In verification, we were only concerned

with partial correctness and assumed that the program was termi-
nating. In synthesis, we will need to explicitly encode progress by
inferring appropriate ranking functions, like ϕ in Figure 1(c), to
prevent the synthesizer from generating non-terminating programs
(Section 3.5).
Note that our aim is not to solve the completely general synthe-
sis problem for a given functional specification. Guards and state-
ments are unknowns but they take values from given domains, spec-
ified by the user as domain constraints, so that a lattice-theoretic
search can be performed by existing program verification tools.
Also notice that we did not attempt to change the number of invari-
ants or the invariant position in the constraints. This means that we
assume a given looping or flowgraph structure, e.g., one loop for
our example. Lastly, as opposed to verification, the set of program
variables is not known, and therefore we need a specification of the
stack space available and also a bound on the type of computations
allowed.
We use the specifications to construct an expansion that is a pro-
gram with unknown symbols and construct safety conditions over
the unknowns. We then impose the additional well-formedness and
progress constraints. We call the new constraints synthesis condi-
tions and hope to find solutions to them using program verification
tools. The constraints generated are non-standard, and therefore to
solve them we need verification tools that satisfy certain properties.
Verification tools we developed in previous work [32, 21] indeed
have those properties. We use them to efficiently solve the syn-
thesis conditions to synthesize programs, with a very acceptable
slowdown over verification.
The guards, statements and proof terms for the example in
this section come from the domain of arithmetic. Therefore, a
program verification tool for arithmetic would be appropriate. For
programs whose guards and statements are more easily expressed
in other domains, a corresponding verification tool for that domain
should be used. In fact, we have employed tools for the domains of
arithmetic and predicate abstraction for proof-theoretic synthesis
with great success. Our objective is to reuse existing verification
technology—that started with invariant validation and progressed
to invariant inference—and push it further to program synthesis.
1.2 Contributions
This paper makes the following contributions:
- We present a novel way of specifying a synthesis task as a
  triple consisting of the functional specification, the domains of
  expressions and guards that appear in the synthesized program,
  and resource constraints that the program is allowed to use
  (Section 2).
- We view program synthesis as generalized program verification.
  We formally define constraints, called synthesis conditions,
  that can be solved using verification tools (Section 3).
- We present requirements that program verification tools must
  meet in order to be used for synthesis of program statements
  and guards (Section 4).
- We build synthesizers using verification tools and present synthesis
  results for the three domains of arithmetic, sorting and
  dynamic programming (Section 5).
2. The Synthesis Scaffold and Task
We now elaborate on the specifications that a proof-theoretic ap-
proach to synthesis requires and how these also allow the user to
specify the space of interesting programs.
We describe the synthesis problem using a scaffold of the form
⟨F, D, R⟩
The three components are as follows:
1. Functional Specification The first component F of a scaffold
describes the desired precondition and postcondition of the synthesized
program. Let ~v_in and ~v_out be the vectors containing the input
and output variables, respectively. Then a functional specification
F = (F_pre(~v_in), F_post(~v_in, ~v_out)) is a tuple containing the formulae
that hold at the entry and exit program locations. For example,
for the program in Figure 1, F_pre(X, Y) ≐ (0 < Y ≤ X) and
F_post(X, Y, out) ≐ ∀k : 0 ≤ k ≤ X ⇒ 2(Y/X)k - 1 ≤ 2out[k] ≤ 2(Y/X)k + 1.
2. Domain Constraints The second component D of the scaffold
describes the domains for expressions and guards in the synthesized
program. The domain specification D = (D_exp, D_grd) is a tuple that
constrains the respective components:
2a. Program Expressions: The expressions manipulated by the program
come from the domain D_exp.
2b. Program Guards: The logical guards (boolean expressions)
used to direct control flow in the program come from the domain D_grd.
For example, for the program in Figure 1, the domains D_exp, D_grd
are both linear arithmetic.
3. Resource Constraints The third component R of the scaffold
describes the resources that the synthesized program can use. The
resource specification R = (R_flow, R_stack, R_comp) is a triple of
resource templates that the user must specify for the flowgraph,
stack and computation, respectively:
3a. Flowgraph Template We restrict attention to structured programs
(those that are goto-less, or whose flowgraphs are reducible [22]).
The structured nature of such flowgraphs allows
us to describe them using simple strings. The user specifies a
string R_flow from the following grammar:

T ::= ◦ | ∗(T) | T;T        (2)

where ◦ denotes an acyclic fragment of the flow graph, ∗(T)
denotes a loop containing the body T and T;T denotes the
sequential composition of two flow graphs. For example, for
the program in Figure 1, R_flow = ◦;∗(◦).
3b. Stack Template A map R_stack : type → int indicating the
number of extra temporary variables of each type available
to the program. For example, for the program in Figure 1,
R_stack = (int, 1).
3c. Computation Template At times it may be important to put an
upper bound on the number of times an operation is performed
inside a procedure. A map R_comp : op → int of operations op
to the upper bound specifies this constraint. For example, for
the program in Figure 1, R_comp = ∅, which indicates that there
are no constraints on computation.
On the one hand, the resource templates make synthesis tractable
by enabling a systematic lattice-theoretic search, while on the other
they allow the user to specify the space of interesting programs and
can be used as a feature. For instance, the user may wish to reduce
memory consumption at the expense of a more complex flowgraph
and still meet the functional specification. If the user does not care,
then the resource templates can be considered optional and left
unspecified. In this case, the synthesizer can iteratively enumerate
possibilities for each resource and attempt synthesis with increas-
ing resources.
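Pulling the three components together, a scaffold ⟨F, D, R⟩ can be pictured as a small record. The encoding below is an illustrative Python sketch (the field names and the ASCII flowgraph notation, "o" for an acyclic fragment and "*(T)" for a loop, are our own, not the paper's concrete syntax), instantiated for the Figure 1 example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class Scaffold:
    pre: Callable[..., bool]        # F_pre over input variables
    post: Callable[..., bool]       # F_post over inputs and outputs
    dom_exp: str                    # D_exp
    dom_grd: str                    # D_grd
    flowgraph: str                  # R_flow
    stack: Dict[str, int]           # R_stack: extra temporaries per type
    comp: Optional[Dict[str, int]]  # R_comp: per-op bounds (None = unconstrained)

# The scaffold for the line-drawing program of Figure 1:
line_scaffold = Scaffold(
    pre=lambda X, Y: 0 < Y <= X,
    post=lambda X, Y, out: all(abs(2 * (X * out[k] - Y * k)) <= X
                               for k in range(X + 1)),
    dom_exp="linear arithmetic",
    dom_grd="linear arithmetic",
    flowgraph="o;*(o)",
    stack={"int": 1},
    comp=None,
)
assert line_scaffold.pre(7, 3) and not line_scaffold.pre(3, 7)
```

Encoding the precondition and postcondition as executable predicates also makes the functional specification directly testable against candidate outputs.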
2.1 Picking a proof domain and a solver for the domain
Our synthesis approach is proof-theoretic and we synthesize the
proof terms, i.e., invariants and ranking functions, alongside the
program. These proof terms will take values from a suitably chosen
proof domain D_prf. Notice that D_prf will be at least as expressive
as D_grd and D_exp. The user chooses an appropriate proof domain
and also picks a solver capable of handling that domain. We will
use program verification tools as solvers and typically, the user will
pick the most powerful verification tool available for the chosen
proof domain.
2.2 Synthesis Task
Given a scaffold ⟨F, D, R⟩, we call an executable program valid
with respect to the scaffold if it meets the following conditions.
- When called with inputs ~v_in that satisfy F_pre(~v_in) the program
  terminates, and the resulting outputs ~v_out satisfy F_post(~v_in, ~v_out).
  There are associated invariants and ranking functions that provide
  a proof of this fact.
- There is a program loop (with an associated loop guard g)
  corresponding to each loop annotation (specified by "∗") in
  the flowgraph template R_flow. The program contains statements
  from the following imperative language IML for each acyclic
  fragment (specified by "◦").

  S ::= skip | S;S | x := e | if g then S else S

  Where x denotes a variable, e denotes some expression, and g
  denotes some predicate. (Memory reads and writes are modeled
  using memory variables and select/update expressions.) The
  domain of expressions and guards is as specified by the scaffold,
  i.e., e ∈ D_exp and g ∈ D_grd.
- The program only uses as many local variables as specified by
  R_stack in addition to the input and output variables ~v_in, ~v_out.
- Each elementary operation only appears as many times as specified
  in R_comp.
EXAMPLE 1 (Square Root). Let us consider a scaffold with functional
specification F = (x ≥ 1, (i-1)² ≤ x < i²), which states
that the program computes the integral square root of the input x,
i.e., i-1 = ⌊√x⌋. Also, let the domain constraints D_exp, D_grd
be limited to linear arithmetic expressions, which means that the
program cannot use any native square root or squaring operations.
Lastly, let the R_flow, R_stack and R_comp be ◦;∗(◦);◦, {(int, 1)} and
∅, respectively. A program that is valid with respect to this scaffold
is the following:

IntSqrt(int x) {
  v:=1; i:=1;
  while_τ (v ≤ x)
    v:=v+2i+1; i++;
  return i-1;
}

Invariant τ: v = i² ∧ x ≥ (i-1)² ∧ i ≥ 1
Ranking function ϕ: x - (i-1)²

where v, i are the additional stack variable and loop iteration
counter (and reused in the output), respectively. Also, the loop
is annotated with the invariant τ and ranking function ϕ as shown,
which prove partial correctness and termination, respectively.
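The synthesized IntSqrt and its proof annotations can be replayed directly. The Python transcription below is an illustrative sketch in which assertions stand in for the verifier: the invariant τ is checked at each loop head and the postcondition at exit.

```python
import math

def int_sqrt(x):
    # Transcription of the synthesized IntSqrt of Example 1.
    assert x >= 1                        # F_pre
    v, i = 1, 1
    while v <= x:
        # Invariant tau: v = i^2  and  x >= (i-1)^2  and  i >= 1
        assert v == i * i and x >= (i - 1) ** 2 and i >= 1
        v = v + 2 * i + 1
        i += 1
    assert (i - 1) ** 2 <= x < i * i     # F_post: i-1 = floor(sqrt(x))
    return i - 1

assert int_sqrt(16) == 4
assert all(int_sqrt(n) == math.isqrt(n) for n in range(1, 500))
```

The loop maintains v = i² using only linear updates (v + 2i + 1 = (i+1)²), which is how the program respects the domain constraint that bars native squaring.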
In the next two sections, we formally describe the steps of
our synthesis algorithm. We first generate synthesis conditions
(Section 3), which are constraints over unknowns for statements,
guards, loop invariants and ranking functions. We then observe that
they resemble verification conditions, and we can employ verifica-
tion tools, if they have certain properties, to solve them (Section 4).
3. Synthesis Conditions
In this section, we define and construct synthesis conditions for
an input scaffold ⟨F, D, R⟩. Using the resource specification R,
we first generate a program with unknowns corresponding to the
fragments we wish to synthesize. Synthesis conditions then specify
constraints on these unknowns and ensure partial correctness, loop
termination and well-formedness of control-flow. We begin our
discussion by motivating the representation we use for acyclic
fragments in the synthesized program.
3.1 Using Transition Systems to Represent Acyclic Code
Suppose we want to infer a set of (straight-line) statements that
transform a precondition φ_pre to a postcondition φ_post, where the
relevant program variables are x and y. One approach might be to
generate statements that assign unknown expressions e_x and e_y to
x and y, respectively:

{φ_pre} x := e_x; y := e_y {φ_post}

Then we can use Hoare's axiom for assignment to generate the
verification condition φ_pre ⇒ (φ_post[y ↦ e_y])[x ↦ e_x]. However,
this verification condition is hard to automatically reason about
because it contains substitution into unknowns. Even worse, we
have restricted the search space by requiring the assignment to
y to follow the assignment to x, and by specifying exactly two
assignments.
Instead we will represent the computation as a transition system,
which provides a much cleaner mechanism for reasoning when program
statements are unknown. A transition in a transition system
is a (possibly parallel) mapping of the input variables to the output
variables. Variables have an input version and an output version (indicated
by primed names), which allows them to change state. For
our example, we can write a single transition:

{φ_pre} ⟨x′, y′⟩ = ⟨e_x, e_y⟩ {φ′_post}

Here φ′_post is the postcondition, written in terms of the output
variables, and e_x, e_y are expressions over the input variables. The
verification condition corresponding to this tuple is φ_pre ∧ x′ = e_x ∧ y′ = e_y ⇒ φ′_post.
Note that every state update (assignment) can always be written as a transition.
We can extend this approach to arbitrary acyclic program fragments.
A guarded transition (written []g → s) contains a statement
s that is executed only if the quantifier-free guard g holds. A
transition system consists of a set {[]g_i → s_i}_i of guarded transitions.
It is easy to see that a transition system can represent any
arbitrary acyclic program fragment by suitably enumerating the
paths through the acyclic fragment. The verification condition for
{φ_pre} {[]g_i → s_i}_i {φ′_post} is simply ⋀_i (φ_pre ∧ g_i ∧ s_i ⇒ φ′_post).
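This verification condition can be approximated by exhaustive checking over a small finite state space, which is useful for building intuition; a real tool discharges the same formula with a constraint solver, and all names below are illustrative.

```python
def check_vc(pre, guarded_transitions, post, states):
    # Checks /\_i (phi_pre /\ g_i /\ s_i => phi_post') by enumerating
    # states; each s_i is given as a function from state to next state.
    for s in states:
        if not pre(s):
            continue
        for guard, trans in guarded_transitions:
            if guard(s) and not post(trans(s)):
                return False
    return True

# pre: x >= 0; transitions: [x < 10 -> x' = x+1, x >= 10 -> x' = 0];
# post: x' >= 0.
assert check_vc(
    pre=lambda x: x >= 0,
    guarded_transitions=[(lambda x: x < 10, lambda x: x + 1),
                         (lambda x: x >= 10, lambda x: 0)],
    post=lambda x: x >= 0,
    states=range(-20, 21),
)
```

Because guards, statements, and pre/postconditions all appear as plain predicates here, any of them could in principle be left as the unknown to solve for, which is exactly the symmetry the next paragraph exploits.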
In addition to the simplicity afforded by the lack of any ordering,
the constraints from transition systems are attractive for synthesis
as the program statements s_i and guards g_i are facts just
like the pre- and postconditions φ_pre and φ′_post. Given the lack of
differentiation, any (or all) can be unknowns in these synthesis conditions.
This distinguishes them from verification conditions, which
can have only unknown invariants, or in which even the invariants must be
known as well.
Synthesis conditions can thus be viewed as generalizations of
verification conditions. Program verification tools routinely infer
fixed-point solutions (invariants) that satisfy the verification condi-
tions with known statements and guards. With our formulation of
statements and guards as just additional facts in the constraints, it
is possible to use (sufficiently general) verification tools to infer in-
variants and program statements and guards. Synthesis conditions
serve an analogous purpose to synthesis as verification conditions
do to verification. If a program is correct (verifiable), then its veri-
fication condition is valid. Similarly, if a valid program exists for a
scaffold, then its synthesis condition has a satisfying solution.
3.2 Expanding a flowgraph
We synthesize code fragments for each acyclic fragment and loop
annotation in the flowgraph template as follows:

Acyclic fragments: For each acyclic fragment annotation "◦",
we infer a transition system {g_i → s_i}_i, i.e., a set of assignments
s_i, stated as conjunctions of equality predicates, guarded
by quantifier-free first-order-logic (FOL) guards g_i such that the
disjunction of the guards is a tautology. Suitably constructed
equality predicates and quantifier-free FOL guards are later
translated to executable code—assignment statements and conditional
guards, respectively—in the language IML.
Loops: For each loop annotation "∗" we infer three elements.
The first is the inductive loop invariant τ, which establishes
partial correctness of each loop iteration. The second is the
ranking function ϕ, which proves the termination of the loop.
Both the invariant and ranking function take values from the
proof domain, i.e., τ, ϕ ∈ D_prf. Third, we infer a quantifier-free
FOL loop guard g.
Formally, the output of expanding flowgraphs will be a program
in the transition system language TSL (note the correspondence to
the flowgraph grammar from Eq. 2):

p ::= choose {[]g_i → s_i}_i | while_τ(g) do {p} | p;p

Here each s_i is a conjunction of equality predicates, i.e., ⋀_j (x_j = e_j).
We will use ~p to denote a sequence of program statements
in TSL. Note that we model memory reads and updates using
select/update predicates. Therefore, in x = e the variable x could
be a memory variable and e could be a memory select or update
expression.
Given a string for a flowgraph template, we define an expansion
function Expand : int × D_prf × R × D × R_flow → TSL that
introduces fresh unknowns for missing guards, statements and invariants
that are to be synthesized. Expand^{n,D_prf}_{D,R}(R_flow) expands a
flowgraph R_flow and is parametrized by an integer n that indicates
the number of transitions each acyclic fragment will be expanded
to, the proof domain, and the resource and domain constraints. The
expansion outputs a program in the language TSL.

Expand^{n,D_prf}_{D,R}(◦) = choose {[]g_i → s_i}_{i=1..n}    (g_i, s_i : fresh unknowns)
Expand^{n,D_prf}_{D,R}(∗(T)) = while_τ(g) { Expand^{n,D_prf}_{D,R}(T); }    (τ, ϕ, g : fresh unknowns)
Expand^{n,D_prf}_{D,R}(T1;T2) = Expand^{n,D_prf}_{D,R}(T1);Expand^{n,D_prf}_{D,R}(T2)
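The expansion can be sketched as a small recursive rewriter over the template string. The Python below is illustrative (ASCII "o" for an acyclic fragment, "*(T)" for a loop, and the tuple-based term representation are our own choices):

```python
import itertools

_fresh = itertools.count(1)

def expand(template, n=1):
    # Rewrites a flowgraph template (grammar T ::= o | *(T) | T;T) into
    # a TSL-like term, introducing fresh unknowns g/s for each acyclic
    # fragment and tau/phi/g for each loop.
    terms, i = [], 0
    while i < len(template):
        c = template[i]
        if c == 'o':            # acyclic fragment: n guarded transitions
            ts = [(f"g{next(_fresh)}", f"s{next(_fresh)}") for _ in range(n)]
            terms.append(("choose", ts))
            i += 1
        elif c == '*':          # loop *(T)
            depth, j = 1, i + 2
            while depth:        # locate the matching ')'
                depth += {'(': 1, ')': -1}.get(template[j], 0)
                j += 1
            terms.append(("while", f"tau{next(_fresh)}", f"phi{next(_fresh)}",
                          f"g{next(_fresh)}", expand(template[i + 2:j - 1], n)))
            i = j
        else:                   # ';' separator
            i += 1
    return terms

exp_sqrt = expand("o;*(o);o")   # the square-root template of Example 2
```

Expanding "o;*(o);o" yields three top-level terms: a guarded transition, a loop with fresh τ, ϕ, g unknowns wrapping another guarded transition, and a final guarded transition, mirroring the structure exp_sqrt of Example 2.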
Each unknown g, s, τ generated during the expansion has the fol-
lowing domain inclusion constraints.
τ D
prf
|
V
g D
grd
|
V
s
V
i
x
i
= e
i
where x
i
V, e
i
D
exp
|
V
Here V = ~v_in ∪ ~v_out ∪ T ∪ L is the set of variables: the input ~v_in
and output ~v_out variables, the set T of temporaries (local variables)
as specified by R_stack, and the set L of iteration counters and ranking
function tracker variables (which we elaborate on later), one
for each loop in the expansion. The restriction of the domains by
the variable set V indicates that we are interested in the fragment
of the domain over the variables in V. Also, the set of operations in
e_i is bounded by R_comp.
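The recursive structure of Expand can be sketched as a small function over a flowgraph-template AST. The tuple representation of templates and TSL programs, and the fresh-name scheme, are illustrative assumptions for this sketch, not the paper's implementation:

```python
import itertools

# Fresh-name supply shared by unknown guards (g), statements (s),
# loop invariants (tau) and ranking functions (phi).
_counter = itertools.count(1)

def fresh(kind):
    return f"{kind}{next(_counter)}"

# Flowgraph templates as nested tuples (hypothetical encoding):
#   "acyclic"        -- an acyclic fragment
#   ("loop", T)      -- a loop whose body is template T
#   ("seq", T1, T2)  -- sequential composition T1;T2
def expand(template, n):
    """Expand a flowgraph template into a TSL program with fresh unknowns.

    n is the number of guarded transitions each acyclic fragment
    expands to; the result is a nested-tuple TSL program.
    """
    if template == "acyclic":
        # choose {[]g_i -> s_i} for i = 1..n, all unknowns fresh
        return ("choose", [(fresh("g"), fresh("s")) for _ in range(n)])
    kind = template[0]
    if kind == "loop":
        # while_tau(g) { expand(body) } with fresh invariant tau,
        # ranking function phi, and loop guard g
        tau, phi, g = fresh("tau"), fresh("phi"), fresh("g")
        return ("while", tau, phi, g, expand(template[1], n))
    if kind == "seq":
        return ("seq", expand(template[1], n), expand(template[2], n))
    raise ValueError(f"unknown template node: {template!r}")

# A scaffold of the square-root shape: acyclic; loop(acyclic); acyclic
sqrt_template = ("seq", "acyclic", ("seq", ("loop", "acyclic"), "acyclic"))
prog = expand(sqrt_template, n=1)
```

The domain-inclusion constraints are not enforced here; in the actual system each fresh unknown additionally carries its membership constraint (τ ∈ D_prf|_V, and so on).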
The expansion has some similarities to the notion of a user-
specified sketch in previous approaches [31, 29]. However, the un-
knowns in the expansion here are more expressive than the integer
unknowns considered earlier, and this allows us to perform a lattice
search as opposed to the combinatorial approaches proposed ear-
lier. Notice that the unknowns τ, g, s, ϕ we introduce can all be in-
terpreted as boolean formulae (τ, g naturally; s using our transition
modeling; and ϕ as ϕ > c, for some constant c), and consequently
ordered in a lattice.
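As a concrete (hypothetical) instance of this ordering, conjunctions of predicates drawn from a candidate set form a lattice under implication: adding conjuncts strengthens a formula, conjunction is the meet, and dropping conjuncts common to neither is the join. A minimal sketch, with predicate strings chosen purely for illustration:

```python
# A conjunction of predicates, modeled as a frozenset of its conjuncts.
# Ordering by implication: a => b iff a contains every conjunct of b.
def implies(a, b):
    return b <= a

def meet(a, b):
    """a ∧ b: the weakest formula implying both (union of conjuncts)."""
    return a | b

def join(a, b):
    """The strongest formula implied by both (shared conjuncts)."""
    return a & b

weak   = frozenset({"x >= 0"})
strong = frozenset({"x >= 0", "v == 2*i + 1"})

assert implies(strong, weak)        # the stronger formula implies the weaker
assert meet(weak, strong) == strong
assert join(weak, strong) == weak
```

A lattice search for the unknowns can then move monotonically through this space (strengthening or weakening candidate formulae) rather than enumerating assignments combinatorially.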
EXAMPLE 2. Let us revisit the integral square root computation
from Example 1. Expanding the flowgraph template ◦;∗(◦);◦ with
n = 1 yields exp_sqrt:

    choose {[]g_1 → s_1};
    while_τ(g_0) {
      choose {[]g_2 → s_2};
    };
    choose {[]g_3 → s_3}

    τ ∈ D_prf|_V
    g_1, g_2, g_3 ∈ D_grd|_V
    s_1, s_2, s_3 ∈ ∧_i (x_i = e_i)    where x_i ∈ V, e_i ∈ D_exp|_V

where V = {x, i, r, v}. The variables i and r are the loop iteration
counter and ranking function tracker variable, respectively, and
v is the additional local variable. Also, the chosen domains for
proofs D_prf, guards D_grd and expressions D_exp are FOL facts over
quadratic expressions, FOL facts over linear arithmetic, and linear
arithmetic, respectively.
Notice that the expansion encodes everything specified by the do-
main and resource constraints and the chosen proof domain. The
only remaining specification is F, which we will use in the next
section to construct safety conditions over the expanded scaffold.
3.3 Encoding Partial Correctness: Safety Conditions
Now that we have the expanded scaffold, we need to collect the
constraints (safety conditions) for partial correctness implied by the
simple paths in the expansion. Simple paths (straight-line sequences
of statements) start at the program entry or a loop header and end
at a loop header or the program exit. The loop headers, program
entry and program exit are annotated with invariants, the precondition
F_pre and the postcondition F_post, respectively.
Let φ denote formulae that represent pre- and postconditions
and constraints. Then we define PathC : φ × TSL × φ → φ
as a function that takes a precondition, a sequence of statements
and a postcondition, and outputs safety constraints that encode the
validity of the Hoare triple. Let us first describe the simple cases of
constraints from a single acyclic fragment and loop:

    PathC(φ_pre, (choose {[]g_i → s_i}_i), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ φ_post′)

    PathC(φ_pre, (while_τ(g) {~p_l}), φ_post) =
        (φ_pre ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ (τ ∧ ¬g ⇒ φ_post′)
Here φ_post′ and τ′ are the postcondition φ_post and invariant τ but
with all variables renamed to their output (primed) versions. Since
the constraints need to refer to output postconditions and invariants,
the rule for a sequence of statements is a bit more involved. For sim-
plicity of presentation, we assume that acyclic fragments do not
appear in succession. This assumption holds without loss of gener-
ality because it is always possible to collapse consecutive acyclic
fragments, e.g., two consecutive acyclic fragments with n transi-
tions each can be collapsed into a single acyclic fragment with n²
transitions. For efficiency, it is prudent not to make this assumption
in practice, and the construction here generalizes easily. For a se-
quence of statements in TSL, under the above assumption, there
are three cases to consider. First, a loop followed by statements
~p. Second, an acyclic fragment followed by just a loop. Third, an
acyclic fragment, followed by a loop, followed by statements ~p.
Each of these generates the following, respective, constraints:
    PathC(φ_pre, (while_τ(g) {~p_l}; ~p), φ_post) =
        (φ_pre ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ PathC(τ ∧ ¬g, ~p, φ_post)

    PathC(φ_pre, (choose {[]g_i → s_i}_i; while_τ(g) {~p_l}), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ (τ ∧ ¬g ⇒ φ_post′)

    PathC(φ_pre, (choose {[]g_i → s_i}_i; while_τ(g) {~p_l}; ~p), φ_post) =
        ∧_i (φ_pre ∧ g_i ∧ s_i ⇒ τ′) ∧ PathC(τ ∧ g, ~p_l, τ) ∧ PathC(τ ∧ ¬g, ~p, φ_post)
