The participants of the summer school are invited to attend the co-located workshop on Trusted Automatic Programming on Friday, August 16. The workshop consists of the following lecture series:
Abhik Roychoudhury, National University of Singapore
Symbolic Execution for Testing, Debugging and Program Repair: Trusted Automatic Programming
Symbolic execution is a well-known core technology for analyzing program executions. It allows a program to be executed in the presence of unknown variable values, and can thus cover many concrete executions at once and achieve high behavioral coverage. Symbolic execution has been widely used for systematic software testing, including various industrial usages. At the same time, symbolic execution engines suffer from scalability and usability challenges, making it difficult to scale the testing task. In the first lecture, we will convey the importance of symbolic analysis and show how it can benefit biased random searches such as fuzz testing, making them more systematic. Fuzz testing engines have a low barrier to entry for developers and are routinely used for detecting software vulnerabilities. Ideas from symbolic execution can greatly enhance their effectiveness, as well as enable novel use cases for security testing.
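To make the core idea concrete, the following is a minimal, self-contained Python sketch of path exploration in the spirit of symbolic execution. The toy function `under_test`, the brute-force `solve` procedure, and the hard-coded path list are all illustrative assumptions; a real engine such as KLEE collects branch constraints automatically and discharges them with an SMT solver rather than by enumeration.

```python
# Illustrative sketch only: the function under test is a toy program
# with three feasible paths.
def under_test(x):
    if x > 10:
        if x < 20:
            return "bug"
        return "big"
    return "small"

def solve(constraints):
    """Naive 'decision procedure': brute-force search for an integer
    satisfying every path constraint. A real engine would instead
    query an SMT solver (e.g. Z3) here."""
    for candidate in range(-100, 100):
        if all(pred(candidate) for pred in constraints):
            return candidate
    return None  # path is infeasible

def explore():
    """Enumerate the paths of under_test with one symbolic input x,
    returning a concrete witness and outcome per feasible path."""
    # Each path is the conjunction of branch conditions taken along it
    # (hand-listed here; a symbolic executor would fork at each branch).
    paths = [
        [lambda x: x > 10, lambda x: x < 20],   # then, then
        [lambda x: x > 10, lambda x: x >= 20],  # then, else
        [lambda x: x <= 10],                    # else
    ]
    results = []
    for constraints in paths:
        witness = solve(constraints)
        if witness is not None:  # keep only feasible paths
            results.append((witness, under_test(witness)))
    return results

# explore() yields one witness per path, e.g. [(11, 'bug'), (20, 'big'), (-100, 'small')]
```

The point of the sketch is that a single symbolic input suffices to reach all three behaviors, including the hard-to-hit "bug" path, whereas random (fuzz) inputs would hit it only with probability proportional to the narrow range 11..19.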
In the second and third lectures, we will delve into symbolic execution beyond testing, and show how it can be used for specification inference when no formal specification of intended program behavior is available. Initially, we will focus on software debugging: given a buggy program that fails certain tests, we will see how analysis of the tests, as well as other program artifacts, can aid in fault localization. Subsequently, we will discuss the problem of automated program repair: given a buggy program and a test suite, our goal will be to find a (minimal) repair via symbolic analysis. We will study how the traditional scalability challenge of symbolic execution is effectively mitigated, allowing symbolic analysis-based repair to scale to large program sizes as well as large search spaces. Apart from extracting formal specifications from tests, symbolic execution also generalizes the given tests. We will discuss how such generalizations help alleviate the pernicious test-data overfitting problem.
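As an illustration of test-suite-driven repair and of the overfitting problem it raises, here is a hedged Python sketch. The buggy `buggy_max`, the operator-mutation search space, and both test suites are invented for illustration; semantic repair techniques derive patches from symbolic path constraints rather than the brute-force generate-and-validate loop shown here.

```python
import operator

# Buggy implementation: intended to return the maximum of a and b,
# but the comparison operator is wrong.
def buggy_max(a, b):
    return a if a < b else b   # bug: '<' should be '>'

# Tiny repair search space: mutate the comparison operator.
CANDIDATES = {"!=": operator.ne, ">": operator.gt,
              ">=": operator.ge, "<": operator.lt}

def make_patched(cmp):
    """Build a candidate patch that uses `cmp` in place of '<'."""
    def patched(a, b):
        return a if cmp(a, b) else b
    return patched

def repair(tests):
    """Return (operator_name, patched_fn) for the first candidate
    passing every test, or None if no candidate fits."""
    for name, cmp in CANDIDATES.items():
        fn = make_patched(cmp)
        if all(fn(a, b) == expected for (a, b), expected in tests):
            return name, fn
    return None

# A weak test suite admits an overfitted patch: '!=' happens to pass.
weak = [((5, 3), 5)]
# A richer suite, mimicking what generalizing beyond the given tests
# provides, rejects the overfitted candidate and finds the intended fix.
strong = [((5, 3), 5), ((3, 5), 5), ((4, 4), 4)]
```

Running `repair(weak)` returns the overfitted `'!='` patch (it satisfies the single test but computes the wrong result for inputs like `(3, 5)`), while `repair(strong)` returns `'>'`: the plausible-but-wrong patch is pruned once the test suite constrains more behaviors.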
In the last lecture, we will examine the use of testing, debugging and repair approaches to achieve the goal of trusted automatic programming. At a micro-scale, this involves trust issues in automatic programming, such as using automated repair to patch security vulnerabilities and establishing how such patches can be trusted in real-life settings. At a larger scale, these technologies are beginning to see usage in the automated improvement of code generated by Large Language Models (LLMs). We will conclude the discussions with forward-looking perspectives on the programming of the future.