Syntax-Guided Program Synthesis

Prof. Rajeev Alur, University of Pennsylvania

The classical synthesis problem is to find a program that meets a correctness specification given as a logical formula. Recent work on synthesis and optimization illustrates many potential benefits of allowing the user to supplement the logical specification with a syntactic template that constrains the space of allowed implementations. The formulation of the syntax-guided synthesis problem (SyGuS) is aimed at standardizing the core computational problem common to these proposals in a logical framework. In this talk, we first describe how a wide range of problems, such as automatic synthesis of loop invariants, program optimization, program repair to defend against timing-based attacks, and learning programs from examples, can be formalized as SyGuS instances. We then explain the counterexample-guided inductive synthesis (CEGIS) strategy for solving the SyGuS problem, its different instantiations, and experimental evaluation over benchmarks from a variety of applications. We conclude by describing the Sharingan system that uses program synthesis to generate network classification programs at the session layer from positive and negative traffic examples. Sharingan accepts raw network traces as inputs and reports potential patterns of the target traffic in a domain-specific language designed for specifying session-layer quantitative properties. Our experiments show that Sharingan is able to correctly identify patterns from a diverse set of network traces and generates explainable outputs, while achieving accuracy comparable to state-of-the-art learning-based systems.
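The counterexample-guided inductive synthesis strategy mentioned above can be sketched as a learner/verifier loop. The toy example below is illustrative only (real CEGIS implementations use SMT solvers for both search and verification, and restart the search after each counterexample); the grammar, specification, and domain are all made up for this sketch:

```python
# A minimal CEGIS-style loop for a toy SyGuS instance.
# Goal: find f in the syntactic template below such that, for all x and y
# in the domain, f(x, y) >= x, f(x, y) >= y, and f(x, y) is x or y
# (i.e., synthesize max).

CANDIDATES = [  # syntactic template: the allowed implementations
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x if x > y else y", lambda x, y: x if x > y else y),
]

def spec(f, x, y):
    return f(x, y) >= x and f(x, y) >= y and f(x, y) in (x, y)

def verify(f, domain):
    """Exhaustive 'verifier': return a counterexample, or None if valid."""
    for x in domain:
        for y in domain:
            if not spec(f, x, y):
                return (x, y)
    return None

def cegis(domain):
    examples = []  # inductive examples accumulated from counterexamples
    # 'Learner': take the first candidate consistent with all examples so far.
    # (A real learner would re-search after each counterexample; a single
    # ordered pass suffices for this tiny grammar.)
    for name, f in CANDIDATES:
        if all(spec(f, x, y) for x, y in examples):
            cex = verify(f, domain)
            if cex is None:
                return name
            examples.append(cex)
    return None

print(cegis(range(-3, 4)))  # -> 'x if x > y else y'
```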

Evolving Program Synthesis

Dr. Ashish Tiwari, Microsoft

Program synthesis is a powerful tool for enabling, or supporting, various tasks that programmers, data scientists, and end-users perform.

However, every application of program synthesis poses new requirements and challenges, which are often addressed by evolving the underlying technology. In this talk, we discuss such additional constraints, and how those have resulted in the evolution of our view on program synthesis.

The user-interaction model is the key driver of the evolution of program synthesis. The burden of writing formal specifications caused the switch to programming by example (PBE). Even with examples, there is a need to minimize interactions, which leads to active synthesis -- using the idea of distinguishing, or significant, inputs -- and to predictive synthesis. However, in many cases, even getting into a mode where the user implicitly expresses the intent to perform program synthesis is not acceptable. Here we have to perform so-called modeless program synthesis. Finally, the user may prefer natural language, rather than input-output examples, to express intent. This leads to new challenges and requires program synthesis to further evolve to program repair. This evolution of program synthesis is also characterized by an increase in ambiguity in the specification. This talk will discuss the above evolution of techniques alongside the applications.
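The idea of distinguishing inputs in active synthesis can be illustrated in a few lines: when two candidate programs both fit the current examples, an input on which they disagree is the most informative next question to ask the user. The sketch below is a hypothetical illustration (the candidate programs and input space are invented for this example):

```python
# Active synthesis via distinguishing inputs: find an input that separates
# two candidate programs that are both consistent with the examples so far.

def distinguishing_input(p, q, input_space):
    """Return an input on which candidates p and q disagree, or None."""
    for x in input_space:
        if p(x) != q(x):
            return x
    return None  # the candidates are equivalent on this input space

double = lambda x: x * 2
square = lambda x: x * x

# Both candidates agree on the example x = 2 (both output 4), so the
# synthesizer asks the user about an input that separates them instead.
print(distinguishing_input(double, square, range(10)))  # -> 1
```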

Program Synthesis with Sketch

Prof. Armando Solar-Lezama, MIT

In this tutorial, I will show how to use the basic features of the Sketch synthesis system. The talk will describe how to set up a synthesis problem by describing a space of possible programs using generators, and the different modes supported by Sketch for creating specifications. The talk will also describe some of the more recent additions to the language, including the new numerical solver, which allows one to solve sketches with real-valued unknowns.
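The core idea of sketching (a partial program with "holes" that a solver fills so that a harness assertion holds) can be conveyed with a Python analogue. This is not Sketch syntax; the template, hole range, and specification are invented for the illustration, and the constraint solving is replaced by brute-force search:

```python
# Python analogue of a sketch with one integer hole: the template computes
# x * ??, and the harness requires it to equal x + x on all tested inputs,
# so the solver should fill the hole with 2.

HOLE_RANGE = range(0, 8)  # bounded search space for the hole, as in Sketch

def template(x, hole):
    return x * hole

def harness(hole):
    # Specification given as an assertion over sample inputs.
    return all(template(x, hole) == x + x for x in range(-10, 11))

def solve():
    """Stand-in for the constraint solver: try each hole value in turn."""
    for h in HOLE_RANGE:
        if harness(h):
            return h
    return None

print(solve())  # -> 2
```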

Program Synthesis For Data Science

Prof. Yu Feng, Yanju Chen, UC Santa Barbara

In recent years, program synthesis has emerged as a powerful technology to automate many tedious tasks in data science, such as spreadsheet programming, data migration, SQL programming, and table transformations. In this tutorial, we introduce Trinity, a general-purpose framework that can be used to quickly build domain-specific program synthesizers for automating many tedious tasks that arise in data science. We illustrate the core techniques behind Trinity: 1) an efficient deductive engine that can learn from past mistakes, 2) a statistical model that is trained from hybrid specifications with both input-output examples and natural language, and 3) a deduction-guided reinforcement learning agent that can refine the search policy on-the-fly. Finally, we will demo how end-users can use Trinity to automate data wrangling tasks.
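The deductive pruning at the heart of such frameworks can be sketched in miniature. The example below is not Trinity's actual engine; it is a toy illustration of the general idea that a cheap abstract check (here, an invented "length effect" for each operator) rejects infeasible candidates before any concrete evaluation:

```python
# Deduction-guided search in miniature: each DSL operator carries an
# abstract 'length effect'; candidates whose abstract output length cannot
# match the example output are pruned without running them.

OPS = {
    # name: (concrete semantics, abstract length effect)
    "head":    (lambda xs: xs[:1],             lambda n: 1),
    "drop1":   (lambda xs: xs[1:],             lambda n: n - 1),
    "reverse": (lambda xs: list(reversed(xs)), lambda n: n),
    "sort":    (lambda xs: sorted(xs),         lambda n: n),
}

def synthesize(inp, out):
    stats = {"checked": 0, "pruned": 0}
    for name, (f, abs_len) in OPS.items():
        if abs_len(len(inp)) != len(out):  # deduction: cheap abstract check
            stats["pruned"] += 1
            continue
        stats["checked"] += 1              # concrete evaluation only if feasible
        if f(inp) == out:
            return name, stats
    return None, stats

# 'head' and 'drop1' are pruned abstractly; only two candidates are run.
print(synthesize([3, 1, 2], [1, 2, 3]))  # -> ('sort', {'checked': 2, 'pruned': 2})
```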

Interactive Debugging of Concurrent Programs Under Relaxed Memory Models

Prof. Subhajit Roy, Indian Institute of Technology, Kanpur

Programming environments for sequential programs provide strong debugging support. However, concurrent programs, especially under relaxed memory models, lack powerful interactive debugging tools. In this work, we present Gambit, an interactive debugging environment that uses gdb to run a concrete debugging session on a concurrent program while simultaneously running a symbolic execution of the program trace in the background. The symbolic trace is analysed by a theorem prover to answer queries from the programmer on possible scenarios arising from alternate thread interleavings or from reorderings under relaxed memory models, while synthesizing relevant repair suggestions.

Gambit can help programmers understand complex bug scenarios in alien environments, that is, when the program is deployed in the field under different concurrent schedulers and diverse memory models. While Gambit currently packages three memory models (PSO, TSO and SC) and provides support for constructs like fences and atomic blocks, the framework is extensible, allowing support for other memory models and programming constructs. We have evaluated Gambit on multiple programs and found that it responds to debug queries quite fast (about a second on average across our benchmark programs).

Program Synthesis for Automated Data Transformations - Challenges in Real Life

RD Naik, Sayandeep Mitra, TCS Research

For any business, Data Wrangling, Data Migration, and Data Integration are recurrent needs and requirements. While there are differences between them and the contexts in which each is needed, there is commonality not only in dealing with 'data' but in the underlying transformations of the data as well. Over the years, various techniques have been explored to model and implement these transformations, with programming-by-example (PBE) becoming popular in recent years. We are exploring a PBE-based approach in which a DSL drives the reduction of the search space of possible transformation programs, with the DSL defining the extent to which synthesis of the transformation programs can be automated. We describe our experiences, the limitations we observed with known approaches, and the challenges faced in real-life data migration scenarios.
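How a DSL bounds the search space in PBE can be shown with a deliberately tiny example. The DSL below is hypothetical (four invented string operators and a naive depth-bounded enumerator, nothing like a production data-wrangling DSL), but it shows the mechanism: the DSL fixes which programs exist at all, and the synthesizer returns the first composition consistent with the user's input-output examples:

```python
# Miniature DSL-driven PBE for string transformation: enumerate
# compositions of DSL operators up to a small depth and return the first
# program consistent with the input-output examples.

DSL = [
    ("upper", str.upper),
    ("lower", str.lower),
    ("strip", str.strip),
    ("first_word", lambda s: s.split()[0]),
]

def synthesize(examples, depth=2):
    """Enumerate operator compositions of length <= depth."""
    programs = [([], lambda s: s)]  # start from the identity program
    for _ in range(depth):
        # Extend every known program by one DSL operator.
        programs += [
            (names + [n], (lambda f, g: lambda s: g(f(s)))(f, g))
            for names, f in programs
            for n, g in DSL
        ]
        for names, f in programs:
            if all(f(i) == o for i, o in examples):
                return names  # first program matching all examples
    return None

examples = [("  john doe ", "JOHN"), (" ada lovelace", "ADA")]
print(synthesize(examples))  # -> ['upper', 'first_word']
```

Note that the enumerator returns the first consistent program in its search order, not necessarily the one a user would expect; disambiguating among consistent programs is exactly where additional examples, or ranking, come in.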

Learning to Detect API Best Practice Violations

Dr. Pranav Garg, Amazon

CodeGuru Reviewer is an AWS service that provides automated code review comments. One of its recommendation categories is API best practices. These recommendations are generated by several independent detectors, or rules, that detect violations of best practices over different APIs. In this talk, we will present our ongoing work to automatically synthesize such detectors from a few code examples that conform to or violate a given best practice. Based on the intuition that similar code changes repeated across different packages correspond to fixes of best practice violations, we obtain labeled code examples for synthesizing API best practice detectors by mining code changes from different repositories and clustering them. We will present our approach of synthesizing detectors from such code change clusters using decision tree and CNF learning, and discuss preliminary results that show the promise of our approach.