Tutorial
Sumit Gulwani - Dimensions in Program Synthesis
Program Synthesis, which is the task of discovering programs that realize user intent, can be useful in several scenarios: enabling people with no programming background to develop utility programs, helping regular programmers automatically discover tricky/mundane details, and even discovery of new algorithms.
This tutorial will describe three key dimensions in program synthesis: input constraints that express user intent, search space of programs, and search technique. (i) The input constraints can be in the form of logical relations between inputs and outputs, input-output examples, demonstrations, and/or partial/inefficient/related programs. (ii) The search space can be over imperative/functional programs (with possible restrictions on the control structure or the operator set), or over restricted models of computations such as regular/context-free transducers, or succinct logical representations. (iii) The search technique can be based on exhaustive search, probabilistic techniques (such as belief propagation or genetic algorithms), and/or logical reasoning. Most logical reasoning techniques involve two main steps: constraint generation, and constraint solving. (a) Constraint generation can be input-based, path-based, or invariant-based. (b) Constraint solving of resultant quantified formulas typically involves use of quantifier elimination techniques (such as Farkas Lemma, cover algorithms, sampling) that further enable efficient solving of quantifier-free constraints using off-the-shelf SAT/SMT solvers.
We will illustrate the above concepts by brief description of various program synthesis projects that target synthesis of a wide variety of programs: standard undergraduate textbook algorithms (e.g., sorting, dynamic programming), program inverses (e.g., decoders, deserializers), bit-vector manipulation routines, deobfuscated programs, graph algorithms, data transformers, algebraic algorithms, and mutual exclusion algorithms. Our tools today scale to synthesis of up to 25 lines of code within an hour.