You are given a simple malware program, a keylogger (keylogger.zip).
Anti-virus software commonly detects malware by matching a signature extracted from a known malware binary. In this project, we assume that the anti-virus software creates its signature from one of the strings found in the binary.
For example, if you run "strings keylogger", the output includes the string "/dev/input/by-path/platform-...". Because the program uses this path to access the keyboard device, the string is a good indicator of a keylogger. Conversely, hiding the string makes the malware harder to detect.
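As a rough illustration of the detection side described above, the following Python sketch mimics the basic behavior of "strings" (runs of printable ASCII, with a minimum length of 4) and then checks the extracted strings against a signature. The function names, the sample byte blob, and the signature prefix are assumptions made for this example. Real anti-virus engines are considerably more sophisticated than a substring match.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII of at least min_len characters.

    This approximates what the `strings` tool reports; it is not a
    reimplementation of it.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def matches_signature(data: bytes, signature: str) -> bool:
    """Hypothetical string-based detector: flag the blob if any extracted
    string contains the signature."""
    return any(signature in s for s in extract_strings(data))


# Toy stand-in for a binary: a few header bytes followed by two
# NUL-terminated constant strings taken from the assignment text.
SAMPLE = (
    b"\x7fELF\x02\x01\x00\x00"
    b"/dev/input/by-path/platform-i8042-serio-0-event-kbd\x00"
    b"\x01\x02log.txt\x00"
)

if __name__ == "__main__":
    for s in extract_strings(SAMPLE):
        print(s)
    print(matches_signature(SAMPLE, "/dev/input/by-path/"))
```

A detector of this kind only sees strings that are stored in the binary in plain form, which is why the assignment focuses on the global constant strings.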
Now, assume that you are the malware developer.
Your goal is to use LLVM to diversify the program, specifically its global constant strings, so that the binary avoids detection by the anti-virus software.
[Important!] Note that the “strings” program may find additional strings beyond the global constant strings. You do not need to handle those. The exact list of strings you need to randomize is presented below.
Essentially, you should randomize all of the global constant strings. However, some of the strings are left for extra credit. Note that the extra-credit strings are not necessarily harder to handle.
Must handle (the three strings below are strong indicators of a keylogger):
c"/dev/input/by-path/platform-i8042-serio-0-event-kbd\00"
c"log.txt\00"
c"a+\00"
Extra credit: all other global constant strings you can find in the program (there are a few more).
Please read this document.
Project 2 Template Source: Project2_src.zip
Project 2 Template Build Files: Project2_Build.zip
Submit the following:
1. Your LLVM tool code
2. A report that includes
(1) a high-level description of how your LLVM tool works,
(2) specific code snippets showing how you encode the strings,
(3) specific code snippets showing how you decode the strings (including how you instrument global variables and function calls)