Software Security via Program Analysis
Class: 16741 CS 6501 Section 008
Instructor:
Yonghwi Kwon (yongkwon@virginia.edu, http://yongkwon.info)
Time and Place:
Tu., Thurs., 9:30AM - 10:45AM @ Rice 340
Office Hours: Tu 10:45AM ~ 12:00PM @ Rice 505
We use a Google class site for homework assignments and announcements.
Course Description:
Cyberattacks are becoming more and more sophisticated. State-funded attackers spend tremendous time and effort infiltrating organizations (e.g., enterprises and government agencies) by leveraging stealthy and sophisticated attack mechanisms (e.g., zero-day exploits).
To fight back against those attackers, researchers and industry have proposed various advanced techniques. Because attackers break into systems in many different ways, building fundamental protection against them requires techniques that span multiple layers of software, along with a fundamental understanding of both the systems and the attackers.
- This course will cover recent advances in cyberattack prevention and analysis via program analysis and reverse-engineering. In particular, we will focus on understanding recent advances in these topics by reading, presenting, and discussing the details of recent publications in top security conferences (S&P, USENIX Security, CCS, and NDSS).
- You will learn (1) how to analyze vulnerabilities and exploits to understand the root causes of an attack and come up with fundamental solutions, (2) how to investigate sophisticated cyberattacks in order to pinpoint how attackers infiltrated systems and what they did (e.g., leaked secrets), and (3) how to leverage program analysis techniques to automate the above tasks and make software more secure.
- You will (1) read recent academic papers carefully and present their essence, (2) learn how to implement the advanced dynamic and static analyses used in attack prevention and investigation, and (3) learn how to conduct systems security research that builds fundamental software.
- Basic knowledge of or experience in systems programming in C is recommended (assembly is a big plus). Experience with dynamic program analysis tools (e.g., Intel Pin, DynamoRIO), static program analysis tools (e.g., LLVM), and/or reverse-engineering tools (e.g., IDA Disassembler, OllyDbg, Immunity Debugger) is very welcome.
Opportunities:
- If your individual project is well developed, I will help you turn it into a research paper. Once it is accepted (at a conference or workshop), the expenses for your travel to the conference (or workshop) and for your presentation will be covered.
- If we both agree that your research interests and potential are well aligned with mine, we may seek potential funding sources for your Ph.D. program.
Information flow can be tracked via program analysis, meaning that we can understand how Siri and Alexa laughed.
Schedule
Week 1 (Aug 27, 29): Class Introduction / Project Descriptions / How to Read/Critique/Present System Security Papers
Week 2 (Sep 3, 5): Program Tracing / (Project 1 and Pin introduction)
What is tracing? Why is it needed? How can a program be traced automatically?
- Project 1 Start -- 10% of the grade (Will be released at midnight on Friday)
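To make the Week 2 tracing questions concrete, here is a minimal sketch of automatic tracing. It is only an illustration in Python using the standard `sys.settrace` hook; it is not the Pin-based tooling used in Project 1, and the names `run_with_trace` and `gcd` are made up for this example.

```python
import sys


def run_with_trace(func, *args):
    """Run func(*args) and collect (event, function name, line number) tuples."""
    events = []

    def tracer(frame, event, arg):
        # Record function entry, each executed line, and function exit.
        if event in ("call", "line", "return"):
            events.append((event, frame.f_code.co_name, frame.f_lineno))
        return tracer  # keep tracing inside the new frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the previous hook
    return result, events


def gcd(a, b):
    # Hypothetical target program for the illustration.
    while b:
        a, b = b, a % b
    return a


if __name__ == "__main__":
    result, events = run_with_trace(gcd, 12, 18)
    print("result:", result)
    for event in events:
        print(event)
```

The trace is just an ordered log of what executed. Binary instrumentation frameworks such as Pin collect similar logs at the instruction level rather than at the source-line level.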
Week 3 (Sep 10, 12): Dynamic Analysis (Dynamic Slicing / Information Flow)
What is dynamic analysis? What are slicing and information flow? Why do we need them? How can we perform them effectively? What can we do with them?
- Paper Assignment
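As a rough illustration of dynamic information flow tracking, the sketch below attaches taint labels to values and propagates them through arithmetic. It is a toy model under simplifying assumptions (only `+` and `*` are tracked, and implicit flows through control dependences are ignored); the `Tainted` class and the label names are hypothetical and do not come from libdft or any other tool on the reading list.

```python
class Tainted:
    """A value paired with the set of source labels that influenced it."""

    def __init__(self, value, labels=()):
        self.value = value
        self.labels = frozenset(labels)

    def _combine(self, other, op):
        # The result carries the union of the labels of both operands.
        if isinstance(other, Tainted):
            return Tainted(op(self.value, other.value), self.labels | other.labels)
        return Tainted(op(self.value, other), self.labels)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    __radd__ = __add__  # addition is commutative for the numeric values used here

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b)

    __rmul__ = __mul__  # multiplication is commutative for the numeric values used here


def reaches_sink(value):
    """Return the labels that would leak if value were written to a sink."""
    return value.labels if isinstance(value, Tainted) else frozenset()


if __name__ == "__main__":
    secret = Tainted(42, {"password"})
    public = Tainted(7)
    derived = secret * 2 + public
    print(derived.value, sorted(reaches_sink(derived)))
```

A real tracker does the same bookkeeping per register and memory byte for every executed instruction, which is why efficiency is one of the central questions of this week.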
Week 4/5/6 (Sep 17, 19, 24, 26, Oct 1, 3): Dynamic Analysis (Information Flow) / Reverse Engineering (Disassembly)
Topic 1: How can we leverage our knowledge of information flow to build secure systems?
Topic 2: What is reverse engineering? Basic principles of disassembly. Recovering semantics.
Week 7/8 (Oct 1, 3, Oct 10): Reverse Engineering (Advanced Disassembly, Decompiler, Anti-Debugging Techniques) / (Project 2 and LLVM introduction)
Recovering semantics, Decompilers, Anti-debugging techniques and solutions against them
- Project 2 Start -- 10% of the grade. (Will be released at midnight on Friday)
Week 9 (Oct 15, 17): Reverse Engineering / (Project 2 and LLVM introduction)
Recovering semantics, Decompilers, Anti-debugging techniques and solutions against them
- (Tentative) Project 1 Due (Sunday midnight)
Week 10 (Oct 22, 24): Reverse Engineering / Static Analysis
What is static analysis? How does it compare to dynamic analysis? What are the common static analysis techniques for security applications? => Answer: Value-set Analysis, Control-flow Integrity, Data-flow Integrity.
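One way to picture control-flow integrity is as two phases: a static phase that computes the legal targets of each indirect call site, and a runtime phase that checks every indirect transfer against that set. The sketch below is a deliberately coarse toy that assumes targets are matched by type signature. The function names, signatures, and call-site identifiers are hypothetical, and this is not the scheme from the Control-Flow Integrity paper on the reading list.

```python
def build_cfi_policy(call_sites, functions):
    """Static phase: map each indirect call site to its set of legal targets.

    call_sites: dict of call-site id -> signature expected at that site
    functions:  dict of function name -> signature of that function
    """
    by_signature = {}
    for name, signature in functions.items():
        by_signature.setdefault(signature, set()).add(name)
    return {
        site: frozenset(by_signature.get(signature, ()))
        for site, signature in call_sites.items()
    }


def is_allowed(policy, site, target):
    """Runtime phase: permit a transfer only if the target is in the site's set."""
    return target in policy.get(site, frozenset())


if __name__ == "__main__":
    # Hypothetical program model for the illustration.
    functions = {
        "log_message": "void(char*)",
        "print_banner": "void(char*)",
        "system": "int(char*)",
    }
    call_sites = {"handler_dispatch": "void(char*)"}
    policy = build_cfi_policy(call_sites, functions)
    print(is_allowed(policy, "handler_dispatch", "log_message"))
    print(is_allowed(policy, "handler_dispatch", "system"))
```

The precision of the static phase determines how much freedom an attacker keeps: the smaller each target set, the fewer hijacked transfers pass the runtime check.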
Week 11 (Oct 29, 31): Research proposal discussion (tentative) - Oct 29, Guest Lecture - Oct 31
Guest lecture about PKI security and certificate abuse in the real world.
- Open discussion sessions for the projects, with peer feedback.
Week 12 (Nov 5, 7): Cyber Forensics
TBD
- Project 3 Start -- 10% of the grade
Week 13 (Nov 12, 14): No classes
Week 14 (Nov 19, 21): Script Language Security
Analyzing real-world vulnerabilities in web browsers to understand and prevent attacks.
Week 15 (Nov 26, 28): Thanksgiving -- No classes
Materials will be provided even though there are no classes this week.
Understand how and why the web is insecure. Learn various methods to prevent attacks on the web ecosystem across server and client.
- (Tentative) Project 2 Due
Week 16 (Dec 3, 5): Final Presentations (Outcomes of your individual projects)
Individual students present final results of the course project (individual project) -- 40% of the grade. (breakdown: 10% for presentation, 10% for a report, 20% for the project artifacts)
- (Tentative) Project 3 Due
Reading Days (Oct 8), Thanksgiving (Nov 28), Final week (Dec 10, 12): No class
Assessment:
This class has no exam. The grading is based on projects and presentations.
1. Presentation: 20% (10% for understanding of the paper, 10% for effective presentation)
2. Assignments: 30% (3 assignments; each 10%)
3. Independent Research Project: 40% (20% for the design and implementation, 10% for a presentation, 10% for a report)
4. Class participation: 10% (Questions and Reviews for the papers discussed in the class)
I will hand out my business cards to students who participate actively in class. At the end of the semester, return the cards to redeem your credit.
5. Extra credit: Extra assignments: TBD% (To be announced)
Topics:
Dynamic Program Analysis
- Data-flow tracking
- Control-flow tracking
Static Program Analysis
- Data-flow analysis
- Control-flow analysis
- Pointer/alias analysis
Reverse-engineering
- Evasive techniques
- Code obfuscation/de-obfuscation
Operating System Security
- Sandboxing/isolation, Fault localization
- Record and replay based analysis
- Audit-logging
Web Security
- Script Language Security (JS/Flash)
- Browser Security
- Malicious Advertisement
Mobile Security
- Security Issues on Android/iOS
- Program Analysis techniques for Mobile Platforms
IoT Security
- Security Issues on heterogeneous IoT platforms
- Improving IoT security via program analysis
Reading List:
This reading list includes representative publications that will be covered during this class. Papers will be added during the semester. Please use them to understand high-level themes of the class topics.
For systems security papers in particular: (1) Read Abstract -> Introduction -> Conclusion, (2) Find and read the motivating (representative) example or the case studies. These usually tell a complete (and often realistic) story and show how the newly proposed methods solve the problem.
Dynamic/Static Analysis Frameworks
- Pin: building customized program analysis tools with dynamic instrumentation [PLDI'05]
- Valgrind: A Framework for Heavyweight Dynamic Binary Instrumentation [PLDI'07]
- LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation [CGO'04]
Data-flow tracking and Data-flow analysis
- libdft: Practical Dynamic Data Flow Tracking for Commodity Systems [VEE'12]
- A General Approach for Efficiently Accelerating Software-based Dynamic Data Flow Tracking on Commodity Hardware [NDSS'12]
- LDX: Causality Inference by Lightweight Dual Execution [ASPLOS'16]
Control-flow tracking and Control-flow analysis
- Control-Flow Integrity [MSR-TR-05-18]
- Efficient Path Profiling [MICRO'96]
- Precise Calling Context Encoding [ICSE'10]
- LDX: Causality Inference by Lightweight Dual Execution [ASPLOS'16]
Evasive techniques
- Evading android runtime analysis via sandbox detection [ASIACCS'14]
- X-Force: Force-Executing Binary Programs for Security Applications [USENIX'14]
- Revolver: An Automated Approach to the Detection of Evasive Web-based Malware [USENIX'13]
Code obfuscation/de-obfuscation
- Deobfuscation of virtualization-obfuscated software: a semantics-based approach [CCS'11]
- LOOP: Logic-Oriented Opaque Predicate Detection in Obfuscated Binary Code [CCS'15]
- Code obfuscation against symbolic execution attacks [ACSAC'16]
Record and replay / N-version systems
- TightLip: Keeping Applications from Spilling the Beans [NSDI'07]
- Intrusion recovery using selective re-execution [OSDI'10]
- Record and transplay: partial checkpointing for replay debugging across heterogeneous systems [SIGMETRICS'11]
- Transparent Mutable Replay for Multicore Debugging and Patch Validation [ASPLOS'13]
- Varan the Unbelievable: An Efficient N-version Execution Framework [ASPLOS'15]
Audit-logging
- High Accuracy Attack Provenance via Binary-Based Execution Partition [NDSS'13]
- LogGC: Garbage Collecting Audit Log [CCS'13]
- Efficient patch-based auditing for web application vulnerabilities [OSDI'12]
Web/Browser Security
- The Security Architecture of the Chromium Browser
- UCognito: Private Browsing without Tears [CCS'15]
- WebCapsule: Towards a Lightweight Forensic Engine For Web Browsers [CCS'15]
- Riding out DOMsday: Toward Detecting and Preventing DOM Cross-Site Scripting [NDSS'18]
- FlashDetect: ActionScript 3 malware detection [RAID'12]
- The Postman Always Rings Twice: Attacking and Defending postMessage in HTML5 Websites [NDSS'13]
Sandboxing/isolation, Fault localization
Mobile/IoT Security
- iRiS: Vetting Private API Abuse in iOS Applications [CCS'15]
- GUITAR: Piecing Together Android App GUIs from Memory Images. [CCS'15]
- Fear and Logging in the Internet of Things [NDSS'18]
- IoTFuzzer: Discovering Memory Corruptions in IoT Through App-based Fuzzing [NDSS'18]
- Sensitive Information Tracking in Commodity IoT [USENIX'18]
Machine Learning (Added)