Syntax-Guided Synthesis

Rajeev Alur, Rastislav Bodik, Garvit Juniwal, Milo M. K. Martin, Mukund Raghothaman,
Sanjit A. Seshia, Rishabh Singh, Armando Solar-Lezama, Emina Torlak, Abhishek Udupa

University of Pennsylvania · University of California, Berkeley · Massachusetts Institute of Technology
Abstract—The classical formulation of the program-synthesis
problem is to find a program that meets a correctness specifica-
tion given as a logical formula. Recent work on program synthesis
and program optimization illustrates many potential benefits
of allowing the user to supplement the logical specification
with a syntactic template that constrains the space of allowed
implementations. Our goal is to identify the core computational
problem common to these proposals in a logical framework. The
input to the syntax-guided synthesis problem (SyGuS) consists
of a background theory, a semantic correctness specification
for the desired program given by a logical formula, and a
syntactic set of candidate implementations given by a grammar.
The computational problem then is to find an implementation
from the set of candidate expressions so that it satisfies the
specification in the given theory. We describe three different
instantiations of the counter-example-guided-inductive-synthesis
(CEGIS) strategy for solving the synthesis problem, report on
prototype implementations, and present experimental results on
an initial set of benchmarks.
I. INTRODUCTION
In program verification, we want to check if a program
satisfies its logical specification. Contemporary verification
tools vary widely in terms of source languages, verification
methodology, and the degree of automation, but they all
rely on repeatedly invoking an SMT (Satisfiability Modulo
Theories) solver. An SMT solver determines the truth of
a given logical formula built from typed variables, logical
connectives, and typical operations such as arithmetic and
array accesses (see [1], [2]). Despite the computational in-
tractability of these problems, modern SMT solvers are ca-
pable of solving instances with thousands of variables due
to sustained innovations in core algorithms, data structures,
decision heuristics, and performance tuning by exploiting
the architecture of contemporary processors. A key driving
force for this progress has been the standardization of a
common interchange format for benchmarks called SMT-LIB
(see smt-lib.org) and the associated annual competition (see
smtcomp.org). These efforts have proved to be instrumental
in creating a virtuous feedback loop between developers and
users of SMT solvers: with the availability of open-source
and highly optimized solvers, researchers from verification
and other application domains find it beneficial to translate
their problems into the common format instead of attempting
to develop their own customized tools from scratch, and the
limitations of the current SMT tools are constantly exposed by
the ever growing repository of different kinds of benchmarks,
thereby spurring greater innovation for improving the solvers.
In program synthesis, we wish to automatically synthesize
an implementation for the program that satisfies the given
correctness specification. A mature synthesis technology has
the potential of even greater impact on software quality than
program verification. Classically, program synthesis is viewed
as a problem in deductive theorem proving: a program is
derived from the constructive proof of the theorem that states
that for all inputs, there exists an output, such that the desired
correctness specification holds (see [3]). Our work is motivated
by a recent trend in synthesis in which the programmer, in
addition to the correctness specification, provides a syntactic
template for the desired program. For instance, in the pro-
gramming approach advocated by the SKETCH system, a pro-
grammer writes a partial program with incomplete details, and
the synthesizer fills in the missing details using user-specified
assertions as the correctness specification [4]. We call such
an approach to synthesis syntax-guided synthesis (SyGuS).
Besides program sketching, a number of recent efforts such as
synthesis of loop-free programs [5], synthesis of Excel macros
from examples [6], program de-obfuscation [7], synthesis of
protocols from the skeleton and example behaviors [8], synthe-
sis of loop-bodies from pre/post conditions [9], integration of
constraint solvers in programming environments for program
completion [10], and super-optimization by finding equivalent
shorter loop bodies [11], all are arguably instances of syntax-
guided synthesis. Also related are techniques for automatic
generation of invariants using templates and by learning [12]–
[14], and recent work on solving quantified Horn clauses [15].
Existing formalization of the SMT problem and the in-
terchange format does not provide a suitable abstraction for
capturing the syntactic guidance. The computational engines
used by the various synthesis projects mentioned above rely
on a small set of algorithmic ideas, but have evolved inde-
pendently with no mechanism for comparison, benchmarking,
and sharing of back-ends. The main contribution of this paper
is to define the syntax-guided synthesis (SyGuS) problem in
a manner that (1) captures the computational essence of these
recent proposals and (2) is based on more canonical formal
frameworks such as logics and grammars instead of features
of specific programming languages. In our formalization, the
correctness specification of the function f to be synthesized
is given as a logical formula ϕ that uses symbols from a
background theory T . The syntactic space of possible im-
plementations for f is described as a set L of expressions
built from the theory T , and this set is specified using a
grammar. The syntax-guided synthesis problem then is to find
an implementation expression e ∈ L such that the formula
ϕ[f/e] is valid in the theory T . To illustrate an application of
the SyGuS-problem, suppose we want to find a completion of
a partial program with holes so as to satisfy given assertions.
A typical SyGuS-encoding of this task will translate the

concrete parts of the partial program and the assertions into
the specification formula ϕ, while the holes will be represented
with the unknown functions to be synthesized, and the space
of expressions that can substitute the holes will be captured
by the grammar.
Compared to the classical formulation of the synthesis
problem that involves only the correctness specification, the
syntax-guided version has many potential benefits. First, the
user can use the candidate set L to limit the search-space for
potential implementations, and this has significant computa-
tional benefits for solving the synthesis problem. Second, this
approach gives the programmer the flexibility to express the
desired artifact using a combination of syntactic and semantic
constraints. Such forms of multi-modal specifications have the
potential to make programming more intuitive. Third, the set
L can be used to constrain the space of implementations for
the purpose of performance optimizations. For example, to
optimize the computation of the product of two two-by-two
matrices, we can limit the search space to implementations that
use only 7 multiplication operations, and such a restriction can
be expressed only syntactically. Fourth, because the synthesis
problem boils down to finding a correct expression from the
syntactic space of expressions, this search problem lends itself
to machine learning and inductive inference as discussed in
Section III. Finally, it is worth noting that the statement “there
exists an expression e in the language generated by a context-
free grammar G such that the formula ϕ[f/e] is valid in a
theory T” cannot be translated to determining the truth of a
formula in the theory T , even with additional quantifiers.
The rest of the paper is organized in the following manner.
In Section II, we formalize the core problem of syntax-
guided synthesis with examples. In Section III, we discuss
a generic architecture for solving the proposed problem us-
ing the iterative counter-example guided inductive synthesis
strategy [16] that combines a learning algorithm with a ver-
ification oracle. For the learning algorithm, we show how
three techniques from recent literature can be adapted for our
purpose: the enumerative technique generates the candidate
expressions of increasing size relying on the input-output
examples for pruning; the symbolic technique encodes parse
trees of increasing size using variables and constraints, and
it calls an SMT solver to find a parse tree consistent with
all the examples encountered so far; and the stochastic search
uniformly samples the set L of expressions as a starting point,
and then executes (probabilistic) traversal of the graph where
two expressions are neighbors if one can be obtained from
the other by a single edit operation on the parse tree. We
report on a prototype implementation of these three algorithms,
and evaluate their performance on a number of benchmarks in
Section IV.
II. PROBLEM FORMULATION
At a high level, the functional synthesis problem consists
of finding a function f such that some logical formula ϕ
capturing the correctness of f is valid. In syntax-guided
synthesis, the synthesis problem is constrained in three ways:
(1) the logical symbols and their interpretation are restricted
to a background theory, (2) the specification ϕ is limited
to a first order formula in the background theory with all
its variables universally quantified, and (3) the universe of
possible functions f is restricted to syntactic expressions
described by a grammar. We now elaborate on each of these
points.
Background Theory: The syntax for writing specifications is
the same as classical typed first-order logic, but the formulas
are evaluated with respect to a specified background theory
T . The theory gives the vocabulary used for constructing
formulas, the set of values for each type, and the interpretation
for each of the function and relation (predicate) symbols in
the vocabulary. We are mainly interested in theories T for
which well-understood decision procedures are available for
determining satisfaction modulo T (see [1] for a survey).
A typical example is the theory of linear integer arithmetic
(LIA) where each variable is either a boolean or an integer,
and the vocabulary consists of boolean and integer constants,
standard boolean connectives, addition (+), comparison (≤),
and conditionals (ITE). Note that the background theory can
be a combination of logical theories, for instance, LIA and the
theory of uninterpreted functions with equality.
Correctness Specification: For the function f to be syn-
thesized, we are given the type of f and a formula ϕ as
its correctness specification. The formula ϕ is a Boolean
combination of predicates from the background theory, in-
volving universally quantified free variables, symbols from the
background theory, and the function symbol f , all used in a
type-consistent manner.
Example 1: Assuming the background theory is LIA, consider the specification of a function f of type int × int ↦ int:

ϕ1 : f(x, y) = f(y, x) ∧ f(x, y) ≥ x.

The free variables in the specification are assumed to be universally quantified: a given function f satisfies the above specification if the quantified formula ∀x, y. ϕ1 holds, or equivalently, if the formula ϕ1 is valid.
Set of Candidate Expressions: In order to make the synthe-
sis problem tractable, the “syntax-guided” version allows the
user to impose structural (syntactic) constraints on the set of
possible functions f. The structural constraints are imposed
by restricting f to the set L of functions defined by a given
context-free grammar G
L
. Each expression in L has the same
type as that of the function f, and uses the symbols in the
background theory T along with the variables corresponding
to the formal parameters of f.
Example 2: Suppose the background theory is LIA, and the type of the function f is int × int ↦ int. We can restrict the set of expressions f(x, y) to be linear expressions of the inputs by restricting the body of the function to expressions in the set L1 described by the grammar below:

LinExp := x | y | Const | LinExp + LinExp

Alternatively, we can restrict f(x, y) to conditional expressions with no addition by restricting the body to terms from the set L2 described by:

Term := x | y | Const | ITE(Cond, Term, Term)
Cond := Term ≤ Term | Cond ∧ Cond | ¬Cond | (Cond)

Grammars can be conveniently used to express a wide range
of constraints, and in particular, to bound the depth and/or the
size of the desired expression.
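To make the role of the grammar concrete, the following sketch shows one way a candidate set such as L2 could be represented and explored in code. The dictionary encoding and the depth-bounded expander (written in Python, with sentences represented as flat token tuples) are our own illustration, not part of the SyGuS formalism or of the SYNTH-LIB format described later.

GRAMMAR_L2 = {
    'Term': [('x',), ('y',), ('Const',), ('ITE', 'Cond', 'Term', 'Term')],
    'Cond': [('<=', 'Term', 'Term'), ('and', 'Cond', 'Cond'), ('not', 'Cond')],
}

def expand(symbol, depth):
    """Yield all token sequences derivable from `symbol` within `depth` nonterminal expansions."""
    if symbol not in GRAMMAR_L2:      # terminal symbols stand for themselves
        yield (symbol,)
        return
    if depth == 0:
        return
    for rhs in GRAMMAR_L2[symbol]:
        # expand every symbol on the right-hand side, one level deeper
        partial = [()]
        for s in rhs:
            partial = [p + t for p in partial for t in expand(s, depth - 1)]
        yield from partial

# list(expand('Term', 1)) yields the leaves x, y, Const; conditional terms
# such as ('ITE', '<=', 'x', 'y', 'x', 'y') appear from depth 3 onward.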
SyGuS Problem Definition: Informally, given the correctness specification ϕ and the set L of candidates, we want to find an expression e ∈ L such that if we use e as an implementation of the function f, the specification ϕ is valid. Let us denote the result of replacing each occurrence of the function symbol f in ϕ with the expression e by ϕ[f/e]. Note that we need to take care of binding of input values during such a substitution: if f has two inputs that the expressions in L refer to by the variable names x and y, then the occurrence f(e1, e2) in the formula ϕ must be replaced with the expression e[x/e1, y/e2] obtained by replacing x and y in e by the expressions e1 and e2, respectively. Now we can define the syntax-guided synthesis problem, SyGuS for short, precisely:

Given a background theory T, a typed function symbol f, a formula ϕ over the vocabulary of T along with f, and a set L of expressions over the vocabulary of T and of the same type as f, find an expression e ∈ L such that the formula ϕ[f/e] is valid modulo T.
Example 3: For the specification ϕ1 presented earlier, if the set of allowed implementations is L1 as shown before, there is no solution to the synthesis problem. On the other hand, if the set of allowed implementations is L2, a possible solution is the conditional if-then-else expression ITE(x ≥ y, x, y).
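For instance, that ITE(x ≥ y, x, y) is indeed a solution can be confirmed mechanically by a validity check modulo LIA. The sketch below assumes the z3 Python bindings (the z3-solver package); it illustrates the check ϕ1[f/e] is valid, and is not part of the formalism itself.

from z3 import Ints, If, And, Not, Solver, unsat

x, y = Ints('x y')

def e(a, b):                   # the candidate expression ITE(a >= b, a, b)
    return If(a >= b, a, b)

# phi_1[f/e]: e(x, y) = e(y, x)  and  e(x, y) >= x
phi = And(e(x, y) == e(y, x), e(x, y) >= x)

s = Solver()
s.add(Not(phi))                # phi is valid iff its negation is unsatisfiable
assert s.check() == unsat      # no counterexample exists, so e is a solution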
In some special cases, it is possible to reduce the decision problem for syntax-guided synthesis to the problem of deciding formulas in the background theory using additional quantification. For example, every expression in the set L1 is equivalent to ax + by + c, for integer constants a, b, c. If ϕ is the correctness specification, then deciding whether there exists an implementation for f in the set L1 corresponds to checking whether the formula ∃a, b, c. ∀X. ϕ[f/ax + by + c] holds, where X is the set of all free variables in ϕ. This reduction was possible for L1 because the set of all expressions in L1 can be represented by a single parameterized expression in the original theory. However, the grammar may permit expressions of arbitrary depth which may not be representable in this way, as in the case of L2.
Synthesis of Multiple Functions: A general synthesis problem can involve more than one unknown function. In principle, adding support for problems with more than one unknown function is merely a matter of syntactic sugar. For example, suppose we want to synthesize functions f1(x1) and f2(x2), with corresponding candidate expressions given by grammars G1 and G2, with start non-terminals S1 and S2, respectively. Both functions can be encoded with a single function f12(id, x1, x2). The set of candidate expressions is described by the grammar that contains the rules of G1 and G2 along with a new production S := ITE(id = 0, S1, S2), with the new start non-terminal S. Then, every occurrence of f1(x1) in the specification can be replaced with f12(0, x1, ∗) and every call to f2(x2) can be replaced with f12(1, ∗, x2). Although adding support for multiple functions does not fundamentally increase the expressiveness of the notation, it does offer significant convenience in encoding real-world synthesis problems.
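As a small illustration of this encoding (again assuming the z3 Python bindings; body1 and body2 are hypothetical placeholders for candidate bodies drawn from G1 and G2):

from z3 import Int, If

x1, x2 = Int('x1'), Int('x2')

def f12(i, a1, a2, body1, body2):
    # the new production S := ITE(id = 0, S1, S2), realized as a dispatch on i
    return If(i == 0, body1(a1), body2(a2))

# a call f1(x1) is rewritten as f12(0, x1, *) and f2(x2) as f12(1, *, x2);
# the unused argument may be any term of the appropriate type
expr = f12(0, x1, x2, lambda a: a + 1, lambda b: 2 * b)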
Let Expressions in Grammar Productions: The SMT-LIB interchange format for specifying constraints allows the use of let expressions as part of the formulas, and this is supported by our language also: (let [var = e1] e2). While let-expressions in a specification can be desugared, the same does not hold when they are used in a grammar. As an example, consider the grammar below for the set of candidate expressions for the function f(x, y):

T := (let [z = U] z + z)
U := x | y | Const | U + U | U − U | (U)

The top-level expression specified by this grammar is the sum of two identical subexpressions built using arithmetic operators, and such a structure cannot be specified using a standard context-free grammar. In the example above, every let introduced by the grammar uses the same variable name. If the applications of let-expressions are nested in the derivation tree, the standard rules for shadowing of variable definitions determine which definition corresponds to which use of the variable.
SYNTH-LIB Input Format: To specify the input to the
SyGuS problem, we have developed an interchange format,
called SYNTH-LIB, based on the syntax of SMT-LIB2—the
input format accepted by the SMT solvers (see smt-lib.org).
The input for the SyGuS problem to synthesize the function f with the specification ϕ1 in the theory LIA, with the grammar for the language L1, is encoded in SYNTH-LIB as:

(set-logic LIA)
(synth-fun f ((x Int) (y Int)) Int
  ((Start Int (x y
               (Constant Int)
               (+ Start Start)))))
(declare-var a Int)
(declare-var b Int)
(constraint (= (f a b) (f b a)))
(constraint (>= (f a b) a))
(check-synth)
Optimality Criterion: The answer to our synthesis problem need not be unique: there may be two expressions e1 and e2 in the set L of allowed expressions such that both implementations satisfy the correctness specification ϕ. Ideally, we would like to associate a cost with each expression, and consider the problem of optimal synthesis, which requires the synthesis tool to return the expression with the least cost among the correct ones. A natural cost metric is the size of the expression. In the presence of let-expressions, the size directly corresponds to the number of instructions in the corresponding straight-line code, and thus such a metric can be used effectively for applications such as super-optimization.
III. INDUCTIVE SYNTHESIS
Algorithmic approaches to program synthesis range over a
wide spectrum, from deductive synthesis to inductive synthesis.
In deductive program synthesis (e.g., [3]), a program is synthe-
sized by constructively proving a theorem, employing logical
inference and constraint solving. On the other hand, inductive

synthesis [17]–[19] seeks to find a program matching a set
of input-output examples. It is thus an instance of learning
from examples, also termed as inductive inference or machine
learning [20], [21]. Many current approaches to synthesis
blend induction and deduction [22]; syntax guidance is usually
a key ingredient in these approaches.
Inductive synthesizers generalize from examples by search-
ing a restricted space of programs. In machine learning, this
restricted space is called the concept class, and each element
of that space is often called a candidate concept. The concept
class is usually specified syntactically. Inductive learning is
thus a natural fit for the syntax-guided synthesis problem
introduced in this paper: the concept class is simply the set L
of permissible expressions.
A. Synthesis via Active Learning
A common approach to inductive synthesis is to formulate
the overall synthesis problem as one of active learning using
a query-based model. Active learning is a special case of
machine learning in which the learning algorithm can control
the selection of examples that it generalizes from and can
query one or more oracles to obtain both examples as well as
labels for those examples. In our setting, we can consider the
labels to be binary: positive or negative. A positive example
is simply an interpretation to f in the background theory
T that is consistent with the specification ϕ; i.e., it is a
valuation to the arguments of the function symbol f along with
the corresponding valuation of f that satisfies ϕ. A negative
example is any interpretation of f that is not consistent with ϕ.
We refer the reader to a paper by Angluin [23] for an overview
of various models for query-based active learning.
In program synthesis via active learning, the query oracles
are often implemented using deductive procedures such as
model checkers or satisfiability solvers. Thus, the overall
synthesis algorithm usually comprises a top-level inductive
learning algorithm that invokes deductive procedures (query
oracles); e.g., in our problem setting, it is intuitive, although
not required, to implement an oracle using an SMT solver for
the theory T . Even though this approach combines induction
and deduction, it is usually referred to in the literature simply
as “inductive synthesis.” We will continue to use this terminology in the present paper.
Consider the syntax-guided synthesis problem of Sec. II.
Given the tuple (T , f , ϕ, L), there are two important choices
one must make to fix an inductive synthesis algorithm: (1)
search strategy: How should one search the concept class L?
and (2) example selection strategy: Which examples do we
learn from?
B. Counterexample-Guided Inductive Synthesis
Counterexample-guided inductive synthesis (CEGIS) [16],
[24] shown in Figure 1 is perhaps the most popular approach
to inductive synthesis today. CEGIS has close connections
to algorithmic debugging using counterexamples [19] and
counterexample-guided abstraction refinement (CEGAR) [25].
This connection is no surprise, because both debugging and
abstraction-refinement involve synthesis steps: synthesizing a
Fig. 1. Counterexample-Guided Inductive Synthesis (CEGIS). (Block diagram: the learning algorithm, initialized with the concept class and initial examples, passes a candidate concept to the verification oracle; the oracle either reports that learning succeeds or returns a counterexample; learning fails when no consistent candidate remains.)
repair in the former case, and synthesizing an abstraction
function in the latter (see [22] for a more detailed discussion).
The defining aspect of CEGIS is its example selection strat-
egy: learning from counterexamples provided by a verification
oracle. The learning algorithm, which is initialized with a
particular choice of concept class L and possibly with an initial
set of (positive) examples, proceeds by searching the space of
candidate concepts for one that is consistent with the examples
seen so far. There may be several such consistent concepts,
and the search strategy determines the chosen candidate, an
expression e. The concept e is then presented to the verification oracle OV, which checks the candidate against the correctness specification. OV can be implemented as an SMT solver that checks whether ϕ[f/e] is valid modulo the theory T. If the
candidate is correct, the synthesizer terminates and outputs this
candidate. Otherwise, the verification oracle generates a coun-
terexample, an interpretation to the symbols and free variables
in ϕ[f/e] that falsifies it. This counterexample is returned to
the learning algorithm, which adds the counterexample to its
set of examples and repeats its search; note that the precise
encoding of a counterexample and its use can vary depending
on the details of the learning algorithm employed. It is possible
that, after some number of iterations of this loop, the learning
algorithm may be unable to find a candidate concept consistent
with its current set of (positive/negative) examples, in which
case the learning step, and hence the overall CEGIS procedure,
fails.
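The loop of Figure 1 can be summarized by the following sketch (in Python); learn and verify are abstract placeholders rather than concrete procedures: learn returns an expression from L consistent with the examples seen so far (or None if none exists), and verify returns None on success or a counterexample otherwise.

def cegis(learn, verify, initial_examples=()):
    examples = list(initial_examples)
    while True:
        candidate = learn(examples)      # search the concept class L
        if candidate is None:
            return None                  # learning, and hence CEGIS, fails
        cex = verify(candidate)          # check phi[f/candidate] modulo T
        if cex is None:
            return candidate             # the candidate is correct
        examples.append(cex)             # learn from the counterexample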
Several search strategies are possible for learning a can-
didate expression in L, each with its pros and cons. In the
following sections, we describe three different search strategies
and illustrate the main ideas in each using a small example.
C. Illustrative Example
Consider the problem of synthesizing a program which returns the maximum of two integer inputs. The specification of the desired program max is given by:

max(x, y) ≥ x ∧ max(x, y) ≥ y ∧ (max(x, y) = x ∨ max(x, y) = y)

The search space is suitably defined by an expression grammar which includes addition, subtraction, comparison, conditional operators and the integer constants 0 and 1.
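A verification oracle for this specification can be sketched with an SMT solver; the snippet below assumes the z3 Python bindings and could play the role of verify in the CEGIS skeleton above (a candidate is passed as a function from two integer terms to an integer term).

from z3 import Ints, If, And, Or, Not, Solver, sat

x, y = Ints('x y')

def spec(m):                       # the max specification with f(x, y) = m
    return And(m >= x, m >= y, Or(m == x, m == y))

def verify(candidate):
    s = Solver()
    s.add(Not(spec(candidate(x, y))))
    if s.check() == sat:           # a model of the negation is a counterexample
        mdl = s.model()
        return (mdl.eval(x, model_completion=True),
                mdl.eval(y, model_completion=True))
    return None

# verify(lambda a, b: a) returns a counterexample such as (0, 1), while
# verify(lambda a, b: If(a <= b, b, a)) returns None.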

Expression to Verifier     Learned Test Input
x                          ⟨x = 0, y = 1⟩
y                          ⟨x = 1, y = 0⟩
1                          ⟨x = 0, y = 0⟩
x + y                      ⟨x = 1, y = 1⟩
ITE(x ≤ y, y, x)
TABLE I
A RUN OF THE ENUMERATIVE ALGORITHM
D. Enumerative Learning
The enumerative learning algorithm [8] adopts a dynamic
programming based search strategy that systematically enu-
merates concepts (expressions) in increasing order of complex-
ity. Various complexity metrics can be assigned to concepts,
the simplest being the expression size. The algorithm needs
to store all enumerated expressions, because expressions of
a given size are composed to form larger expressions in the
spirit of dynamic programming. The algorithm maintains a
set of concrete test cases, obtained from the counterexamples
returned by the verification oracle. These concrete test cases
are used to reduce the number of expressions stored at each
step by the dynamic programming algorithm.
We demonstrate the working of the algorithm on the illus-
trative example. Table I shows the expressions submitted to
the verification oracle (an SMT solver) during the execution of
the algorithm and the values for which the expression produces
incorrect results. Initially, the algorithm submits the expression
x to the verifier. The verifier returns a counterexample ⟨x = 0, y = 1⟩, corresponding to the case where the expression x violates the specification. The expression enumeration is started from scratch every time a counterexample is added. All enumerated expressions are checked for conformance with the accumulated (counter)examples before making a potentially expensive query to the verifier. In addition, suppose the algorithm enumerates two expressions e and e′ which evaluate to the same value on the examples obtained so far; then only one of e or e′ needs to be considered for the purpose of constructing larger expressions.

Proceeding with the illustrative example, the algorithm then submits the expression y and the constant 1 to the verifier. The verifier returns the values ⟨x = 1, y = 0⟩ and ⟨x = 0, y = 0⟩, respectively, as counterexamples to these expressions. The algorithm then submits the expression x + y to the verifier. The verifier returns the values ⟨x = 1, y = 1⟩ as a counterexample. The algorithm then submits the expression shown in the last row of Table I to the verifier. The verifier certifies it to be
correct and the algorithm terminates.
The optimization of pruning based on concrete counterex-
amples helps in two ways. First, it reduces the number of
invocations of the verification oracle. In the example we have
described, the correct expression was examined after only
four calls to the SMT solver, although about 200 expressions
were enumerated by the algorithm. Second, it reduces the
search space for candidate expressions significantly (see [8]
for details). For instance, in the run of the algorithm on
the example, although the algorithm enumerated about 200
expressions, only about 80 expressions were stored.
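The core of the enumerative search with this pruning can be sketched as follows; the tuple representation of expressions, the leaf set, and the size measure (which counts leaves only) are simplifying assumptions of ours rather than the prototype's actual data structures, and at least one example is assumed to be present.

def evaluate(e, env):
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return env[e]
    op, *args = e
    v = [evaluate(a, env) for a in args]
    if op == '+':
        return v[0] + v[1]
    if op == '-':
        return v[0] - v[1]
    if op == '<=':
        return v[0] <= v[1]
    if op == 'ite':
        return v[1] if v[0] else v[2]

def enumerate_pruned(examples, max_size):
    # two expressions that agree on all examples are interchangeable for
    # building larger expressions, so only one representative is kept
    signature = lambda t: tuple(evaluate(t, ex) for ex in examples)
    stored, seen = {1: []}, set()
    for leaf in ('x', 'y', 0, 1):
        sig = signature(leaf)
        if sig not in seen:
            seen.add(sig)
            stored[1].append(leaf)
            yield leaf
    for size in range(2, max_size + 1):
        stored[size] = []
        for left in range(1, size):
            for a in stored[left]:
                for b in stored[size - left]:
                    for t in (('+', a, b), ('-', a, b),
                              ('ite', ('<=', a, b), a, b)):
                        sig = signature(t)
                        if sig not in seen:
                            seen.add(sig)
                            stored[size].append(t)
                            yield t

# e.g. with examples [{'x': 0, 'y': 1}], the constants 0 and 1 are pruned
# because they evaluate like x and y, respectively, on that example.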
Production            Component
E → ITE(B, E, E)      Inputs: (i1 : B), (i2, i3 : E)
                      Output: (o : E)
                      Spec: o = ITE(i1, i2, i3)
B → E ≤ E             Inputs: (i1, i2 : E)
                      Output: (o : B)
                      Spec: o = (i1 ≤ i2)
TABLE II
COMPONENTS FROM PRODUCTIONS
E. Constraint-based Learning
The symbolic CEGIS approach uses a constraint solver
both for searching for a candidate expression that works
for a set of concrete input examples (concept learning) and
verification of validity of an expression for all possible inputs.
We use component based synthesis of loop-free programs
as described by Jha et al. [5], [7]. Each production in the
grammar corresponds to a component in a library. A loop-
free program comprising these components corresponds to an
expression from the grammar. Some sample components for
the illustrative example are shown in Table II along with their
corresponding productions.
The input/output ports of these components are typed and
only well-typed programs correspond to well-formed expres-
sions from the grammar. To ensure this, Jha et al.'s encoding [5] is extended with typing constraints. We illustrate the working of this algorithm on the maximum of two integers example. The library of allowed components is instantiated to contain one instance each of ITE and all comparison operators (≤, ≥, =), and the concrete example set is initialized with ⟨x = 0, y = 0⟩. The first candidate loop-free program synthesized corresponds to the expression x. This candidate is submitted to the verification oracle, which returns ⟨x = −1, y = 0⟩ as a counterexample. This counterexample is
added to the concrete example set and the learning algorithm
is queried again. The SMT formula for learning a candidate
expression is solved in an incremental fashion; i.e., the con-
straint for every new example is added to the list of constraints
from the previous examples. The steps of the algorithm on the
illustrative example are shown in Table III.
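The learning step can be sketched as a constraint problem as follows. We assume the z3 Python bindings and, instead of the full component-based encoding of Jha et al., fix a small ITE template whose argument slots are chosen among {x, y} by selector variables; for brevity all example constraints are added at once rather than incrementally, but the idea of solving for a candidate consistent with every example collected so far is the same.

from z3 import Int, If, Or, Solver, sat

sel = [Int('sel%d' % i) for i in range(4)]   # 0 selects x, 1 selects y

def slot(s, x, y):
    return If(s == 0, x, y)

def template(x, y):                          # ITE(s0 <= s1, s2, s3)
    return If(slot(sel[0], x, y) <= slot(sel[1], x, y),
              slot(sel[2], x, y), slot(sel[3], x, y))

examples = [(0, 0), (-1, 0), (0, -1)]        # inputs gathered from the oracle

s = Solver()
for v in sel:
    s.add(Or(v == 0, v == 1))
for (vx, vy) in examples:
    t = template(vx, vy)
    # the max specification instantiated on this concrete input
    s.add(t >= vx, t >= vy, Or(t == vx, t == vy))
if s.check() == sat:
    print(s.model())                         # selector values encode a candidate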
If synthesis fails for a component library, we add one in-
stance of every operator to the library and restart the algorithm
with the new library. We also tried a modification to the
original algorithm [5], in which, instead of searching for a
loop-free program that utilizes all components from the given
library at once, we search for programs of increasing length
such that every line can still select any component from the
library. The program length is increased in an exponential
Iteration   Loop-free program        Learned counterexample
1           o1 := x                  ⟨x = −1, y = 0⟩
2           o1 := x ≤ x              ⟨x = 0, y = −1⟩
            o2 := ITE(o1, y, x)
3           o1 := y ≥ x
            o2 := ITE(o1, y, x)
TABLE III
A RUN OF THE CONSTRAINT LEARNING ALGORITHM

References
J. R. Quinlan. Induction of Decision Trees. Machine Learning, 1986.
N. McKeown et al. OpenFlow: Enabling Innovation in Campus Networks. ACM SIGCOMM Computer Communication Review, 2008.
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
E. M. Gold. Language Identification in the Limit. Information and Control, 1967.
D. Angluin. Queries and Concept Learning. Machine Learning, 1988.