Software Security via Program Analysis

Class: 16741 CS 6501 Section 008

Instructor:

Yonghwi Kwon (yongkwon@virginia.edu, http://yongkwon.info)

Time and Place:

Tu., Thurs., 9:30AM - 10:45AM @ Rice 340

We will use a Google Classroom site for homework assignments and announcements.

Google Classroom: https://classroom.google.com

Class code: 553hg39

Course Description:

Cyberattacks are becoming increasingly sophisticated. State-funded attackers spend tremendous time and effort infiltrating organizations (e.g., enterprises and government agencies) using stealthy and sophisticated attack mechanisms (e.g., zero-day exploits).

To fight back against these attackers, researchers and industry have proposed various advanced techniques. Because attackers break into systems in many ways, building fundamental protection against them requires techniques that span multiple layers of software, along with a fundamental understanding of both the system and the attackers.

  • This course will cover recent advances in cyberattack prevention and analysis via program analysis and reverse-engineering. In particular, we will focus on understanding recent advances in these topics by reading, presenting, and discussing the details of recent publications in top security conferences (S&P, USENIX Security, CCS, and NDSS).
  • You will learn (1) how to analyze vulnerabilities and exploits to understand root causes of the attack, and come up with fundamental solutions, (2) how to investigate sophisticated cyberattacks in order to pinpoint and discover how attackers infiltrate systems and what they did (e.g., leak secrets), and (3) how we can leverage program analysis techniques in order to automate the above tasks and make the software more secure.
  • You will (1) read recent academic papers carefully and present their essence, (2) learn how to implement advanced dynamic and static analyses used in attack prevention and investigation, and (3) learn how to conduct systems security research that builds fundamental software.
  • Basic knowledge of or experience with systems programming in C is recommended (assembly is a big plus). Experience with dynamic program analysis tools (e.g., Intel Pin, DynamoRIO), static program analysis tools (e.g., LLVM), and/or reverse-engineering tools (e.g., IDA Disassembler, OllyDbg, Immunity Debugger) is very welcome.

Opportunities:

  • If your individual project is well developed, I will help you turn it into a research paper. Once it is accepted (at a conference or workshop), the expenses for your travel to the conference (or workshop) and presentation will be covered.
  • If we both agree that your research interests and potential align well with mine, we may seek funding sources for your Ph.D. program.

Information flow can be tracked via program analysis, meaning that we can understand how Siri and Alexa laughed.

Schedule

Week 1: Class Introduction / Project Descriptions / How to Read/Critique/Present System Security Papers

Week 2: Program Tracing / (Project 1 and Pin introduction)

What is tracing? Why is it needed? How can we automatically trace a program?

  • Project 1 Start -- 10% of the grade; Paper Assignment
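As a rough illustration of what automatic tracing means, the sketch below records every function call made while a target function runs. It is only a minimal Python stand-in, assuming the interpreter's built-in `sys.settrace` hook; the course project itself uses Intel Pin on native binaries. The names `trace_calls`, `helper`, and `target` are hypothetical and chosen for this example only.

```python
import sys

def trace_calls(func, *args):
    """Run func(*args) and return (result, names of functions called)."""
    events = []

    def tracer(frame, event, arg):
        # The global trace hook fires with a "call" event for each new frame.
        if event == "call":
            events.append(frame.f_code.co_name)
        return None  # returning None disables per-line tracing in that frame

    previous = sys.gettrace()   # keep any tracer that was already installed
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier state
    return result, events

# Hypothetical program under analysis.
def helper(x):
    return x + 1

def target(x):
    return helper(x) * 2

result, events = trace_calls(target, 3)
print(result, events)
```

The tracer observes the program without modifying its source, which is the same idea binary instrumentation tools apply at the instruction level.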

Week 3: Dynamic Analysis (Dynamic Slicing / Information Flow)

What is dynamic analysis? What are slicing and information flow? Why do we need them? How can we perform them effectively? What can we do with them?
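To make information flow concrete, here is a minimal sketch of dynamic taint tracking over a toy three-address program. The instruction format, the function `run_with_taint`, and the variable names are all assumptions made for this illustration, not part of any course tool; real systems track taint at the level of registers and memory.

```python
def run_with_taint(program, inputs, tainted_inputs):
    """Execute a toy program while propagating a taint bit per variable.

    program: list of (dest, op, a, b) tuples, op is "add" or "mul".
    inputs: dict of initial variable values.
    tainted_inputs: names of the inputs considered sensitive.
    """
    values = dict(inputs)
    taint = {name: name in tainted_inputs for name in inputs}
    for dest, op, a, b in program:
        if op == "add":
            values[dest] = values[a] + values[b]
        elif op == "mul":
            values[dest] = values[a] * values[b]
        else:
            raise ValueError(f"unknown op: {op}")
        # The result is tainted if either operand is tainted.
        taint[dest] = taint[a] or taint[b]
    return values, taint

program = [
    ("t1", "add", "secret", "salt"),    # depends on the secret
    ("t2", "mul", "public", "public"),  # depends only on public data
    ("out", "add", "t1", "t2"),         # mixes both
]
values, taint = run_with_taint(
    program, {"secret": 7, "salt": 1, "public": 3}, {"secret"}
)
print(values["out"], taint["out"], taint["t2"])
```

Checking the taint bit of a value just before it leaves the program (for example, before a network send) is one way such tracking can flag a secret leak.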

Week 4: Dynamic Analysis (Information Flow) / Reverse Engineering (Disassembly)

Topic 1: How can we leverage our knowledge of information flow to build secure systems and make existing systems more secure?

Topic 2: What is reverse engineering? Basic principles of disassembly. Recovering semantics

Week 5: Reverse Engineering (Advanced Disassembly, Decompiler, Anti-Debugging Techniques)

Recovering semantics, decompilers, and anti-debugging techniques and countermeasures against them

Week 6: Static Analysis / (Project 2 and LLVM introduction)

What is static analysis? How does it compare to dynamic analysis? What are the common static analysis techniques for security applications? => Answer: Value-set Analysis, Control-flow Integrity, Data-flow Integrity. How can we implement those? => Answer: LLVM

  • Project 2 Start -- 10% of the grade.

Week 7: Static Analysis

Various algorithms, including value-set analysis, control-flow integrity, and data-flow integrity
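As one example of these techniques, control-flow integrity restricts each indirect control transfer to a set of targets computed ahead of time. The sketch below is a deliberately simplified Python model under that assumption: the "static" step is a hand-written call graph, and `build_policy`, `checked_call`, and the site and function names are hypothetical. A real implementation would derive the policy from a compiler pass (e.g., in LLVM) and insert the checks into the binary.

```python
def build_policy(call_graph):
    """Static step: map each indirect call site to its allowed targets."""
    return {site: frozenset(targets) for site, targets in call_graph.items()}

def checked_call(policy, site, target, functions, *args):
    """Runtime step: refuse any transfer the static policy does not allow."""
    if target not in policy.get(site, frozenset()):
        raise RuntimeError(f"CFI violation at {site}: {target}")
    return functions[target](*args)

# Hypothetical functions reachable through function pointers.
functions = {
    "log_message": lambda msg: f"logged: {msg}",
    "spawn_shell": lambda: "shell",
}

# Hand-written stand-in for the result of a static call-graph analysis.
policy = build_policy({"site_a": ["log_message"]})

ok = checked_call(policy, "site_a", "log_message", functions, "hello")
print(ok)

try:
    checked_call(policy, "site_a", "spawn_shell", functions)
    blocked = False
except RuntimeError:
    blocked = True
print(blocked)
```

The precision of the statically computed target sets determines how much room an attacker still has, which is why the pointer and alias analyses listed under Topics matter here.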

Week 8: Static Analysis + Dynamic Analysis / Probabilistic program analysis

Combining both analyses for effective and efficient problem solving. Runtime monitoring guided by static analysis / Static analysis guided by dynamic traces. Probabilistic approaches to improve program analysis techniques.

Week 9: Research proposal discussion

Open discussion sessions for the projects, with peer feedback.

Week 10-14: Research paper discussion (Topics are flexible and determined in the class)

A group of students will present a particular topic, which may cover a single research paper or multiple papers. Depending on the topic, the instructor may supplement the presentation to provide complete information.

  • Project 3 Start -- 10% of the grade

Week 15: Final Presentations (Outcomes of your individual projects)

Each student presents the final results of their individual course project -- 10% of the grade. (The report counts for a further 10%, and the project artifacts for 20%.)

Reading Day and Thanksgiving: No class

Assessment:

This class has no exam. The grading is based on projects and presentations.

1. Presentation: 20% (10% for understanding of the paper, 10% for effective presentation)

2. Assignments: 30% (3 assignments; each 10%)

3. Independent Research Project: 40% (20% for the design and implementation, 10% for a presentation, 10% for a report)

4. Class participation: 10% (Questions and Reviews for the papers discussed in the class)

I will hand out my business card to people who ask good questions. At the end of the semester, return the cards to redeem your credit.

5. Extra credit: Extra assignments: TBD% (To be announced)

Topics:

Dynamic Program Analysis

  • Data-flow tracking
  • Control-flow tracking

Static Program Analysis

  • Data-flow analysis
  • Control-flow analysis
  • Pointer/alias analysis

Reverse-engineering

  • Evasive techniques
  • Code obfuscation/de-obfuscation

Operating System Security

  • Sandboxing/isolation, Fault localization
  • Record and replay based analysis
  • Audit-logging

Web Security

  • Script Language Security (JS/Flash)
  • Browser Security
  • Malicious Advertisement

Mobile Security

  • Security Issues on Android/iOS
  • Program Analysis techniques for Mobile Platforms

IoT Security

  • Security Issues on heterogeneous IoT platforms
  • Improving IoT security via program analysis

Reading List:

This reading list includes representative publications that will be covered during this class. Papers will be added during the semester. Please use them to understand high-level themes of the class topics.

Particularly for systems security papers: (1) Read Abstract -> Introduction -> Conclusion, (2) Find and read the motivating (representative) example or case studies. These usually tell a complete (and often realistic) story and show how the newly proposed methods solve the problem.

Dynamic/Static Analysis Frameworks

Data-flow tracking and Data-flow analysis

Control-flow tracking and Control-flow analysis

Evasive techniques

Code obfuscation/de-obfuscation

Record and replay / N-version systems

Audit-logging

Web/Browser Security

Sandboxing/isolation, Fault localization

Mobile/IoT Security

Machine Learning (Added)