This section first explains how Mystique is useful for generating malware and evaluating anti-malware tools. We then show how representative the malware generated via the evolution mechanism is.
FODA of Android malware (§4.1) helps capture valuable domain knowledge about Android attack behaviors. In Mystique, we maintain traceability between all features and their corresponding code. Hence, for each malware sample generated by Mystique, all selected features (e.g., triggers, sources, and sinks) are labeled. These labels provide a good indexing mechanism for the collection of generated malware, which facilitates malware management.
In addition to using IBEA to generate malware, Mystique also supports customized malware generation: the user can decide which features are selected. Mystique verifies the feature constraints and generates the malware if no feature conflicts are found. The usefulness and flexibility of Mystique stem from the SPLE architecture. Moreover, rather than randomly mutating the malware, Mystique produces more aggressive yet less detectable malware via an evolution process.
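The constraint verification step can be illustrated with a minimal sketch. The `REQUIRES` table below is an assumption made for illustration only: it encodes a few well-known Android permission requirements and is not Mystique's actual constraint set. The function name `missing_permissions` is likewise hypothetical.

```python
# Hypothetical "requires" constraints: a feature mapped to the permission
# it needs. Illustrative only; not taken from Mystique's feature model.
REQUIRES = {
    "android.intent.action.BOOT_COMPLETED":
        "android.permission.RECEIVE_BOOT_COMPLETED",
    "android.provider.Telephony.SMS_RECEIVED":
        "android.permission.RECEIVE_SMS",
    "SMS::SEND_TEXT_MESSAGE": "android.permission.SEND_SMS",
    "HTTP::APACHE_POST": "android.permission.INTERNET",
}


def missing_permissions(selected):
    """Return the permissions required by the selection but not included in it."""
    chosen = set(selected)
    return sorted(
        {perm for feat, perm in REQUIRES.items()
         if feat in chosen and perm not in chosen}
    )


selection = [
    "android.intent.action.BOOT_COMPLETED",
    "HTTP::APACHE_POST",
    "android.permission.INTERNET",
]
# A non-empty result signals a feature conflict under the assumed constraints.
print(missing_permissions(selection))
```

Under these assumed constraints, a selection is accepted only when the returned list is empty.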
We conduct a controlled experiment to assess the effectiveness of Mystique in obtaining optimal malware from the attacker’s view. The basic idea is to use a small set of AFs and EFs for fast convergence, and to evaluate the resulting malware when the evolution stops.
The experiment is conducted as follows: 1) pick 10 malware samples that can be detected; 2) use the IBEA algorithm to generate new variants by combining or mutating features in the initial population of malware; 3) stop when no more optimal malware is generated. The 10 samples are from the malware families DroidKungFu3, AnserverBot, BaseBridge, DroidKungFu4, Geinimi, Pjapp, KMin, GoldDream, DroidKungFu1 and DroidKungFu2, as shown in Table 3.
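As a rough illustration of how IBEA ranks candidates in step 2, the sketch below computes the indicator-based fitness for a small population of objective vectors. It assumes the additive epsilon indicator, minimization of every objective, and a scaling factor `kappa` of 0.05; none of these settings is stated in this section, and the normalization of objectives used in the original IBEA formulation is omitted for brevity. The vectors are illustrative.

```python
import math


def eps_indicator(a, b):
    """Additive epsilon indicator for two single points (minimization):
    the smallest shift that makes `a` no worse than `b` in every objective."""
    return max(x - y for x, y in zip(a, b))


def ibea_fitness(population, kappa=0.05):
    """Return one fitness value per objective vector; higher is better."""
    fitness = []
    for i, x in enumerate(population):
        # Each other individual y penalizes x according to how easily
        # y can be shifted to weakly dominate x.
        loss = sum(
            math.exp(-eps_indicator(y, x) / kappa)
            for j, y in enumerate(population) if j != i
        )
        fitness.append(-loss)
    return fitness


# Illustrative objective vectors [F1, F2, F3], all assumed to be minimized.
population = [[-12.0, 3.0, 0.0], [-12.0, 4.0, 0.0], [-10.0, 4.0, 0.5]]
scores = ibea_fitness(population)
best = population[scores.index(max(scores))]
print(best)
```

In a full IBEA run, the individual with the lowest fitness is removed repeatedly until the population fits the pool size, and the survivors are recombined and mutated.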
The extracted features are as follows. In addition, we consider all 14 types of evasion features in this experiment.
All selectable features in this experiment
Triggers:
[T1] STARTUP
[T2] android.intent.action.BOOT_COMPLETED
[T3] android.intent.action.BATTERY_CHANGED
[T4] android.intent.action.NEW_OUTGOING_CALL
[T5] android.provider.Telephony.SMS_RECEIVED
Sources:
[SU1] PACKAGE::INSTALLED_APK
[SU2] SMS::ALL
[SU3] SMS::INCOMING_SMS
[SU4] TELEPHONY::IMEI
[SU5] TELEPHONY::IMSI
[SU6] TELEPHONY::PHONE_NUMBER
[SU7] TELEPHONY::SIM_SERIAL
Sinks:
[SI1] HTTP::APACHE_GET
[SI2] HTTP::APACHE_POST
[SI3] HTTP::SOCKET_POST
[SI4] SMS::SEND_TEXT_MESSAGE
Permissions:
[P1] android.permission.INTERNET
[P2] android.permission.PROCESS_OUTGOING_CALLS
[P3] android.permission.RECEIVE_BOOT_COMPLETED
[P4] android.permission.READ_PHONE_STATE
[P5] android.permission.RECEIVE_SMS
[P6] android.permission.SEND_SMS
Evasion:
[E1] Control based evasion
[E2] Data based evasion
[E3] Transformation attacks (12 types of transformation)
Initially, Mystique selects features randomly to construct 10 malware samples as the initial population. Mystique then evolves the population based on the fitness values of the newly generated malware. After 30 iterations, Mystique obtains the optimal malware, whose fitness values reach the optimum in all three objectives. The optimal malware contains 16 attack features and 3 evasion features. The AFs in the optimal malware are {T1, T3, T5, SU1, SU2, SU4, SU5, SU7, SI1, SI2, SI3, SI4, P1, P2, P3, P6}, and the EFs comprise control based evasion, data based evasion and one transformation. We show the details of the generated malware in the following textbox.
Text Box
Generation 10:
{"id":"1001111011010101010011101000100001100","selected":false,"eval":[-10.0,4.0,0.5]},
{"id":"1101011011011111000011101000100001100","selected":false,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011111010011101000100001100","selected":true,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011111010011101000100001100","selected":true,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011111010011101000100001100","selected":true,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011011010011101000100001100","selected":false,"eval":[-11.0,4.0,0.0625]},
{"id":"1101011011011111000010101000100001100","selected":false,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011111010011101000100001100","selected":true,"eval":[-12.0,4.0,0.0]},
{"id":"1101011011011101000011101000100001100","selected":true,"eval":[-11.0,4.0,0.015625]},
{"id":"1101011011011111010011101000100000100","selected":true,"eval":[-12.0,3.0,0.0]}
Generation 20:
{"id":"1101011011011111011111000000011001101","selected":false,"eval":[-12.0,5.0,0.015625]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.0]},
{"id":"1101011011011111000111101111101011000","selected":false,"eval":[-12.0,8.0,0.046875]},
{"id":"0001010101000010011110101010001000001","selected":false,"eval":[-5.0,4.0,0.046875]},
{"id":"1101011011011111010000001111010000101","selected":false,"eval":[-12.0,7.0,0.0625]},
{"id":"0001010101000010011110101010011000001","selected":false,"eval":[-5.0,5.0,0.375]},
{"id":"1101011011011111010011101001011101111","selected":false,"eval":[-12.0,9.0,0.375]},
{"id":"0001010101000010011111101010001100001","selected":false,"eval":[-5.0,5.0,0.25]},
{"id":"1101011011011111011100101001010100001","selected":false,"eval":[-12.0,5.0,0.375]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.5]},
{"id":"1101011011011111001110100011001011000","selected":true,"eval":[-12.0,5.0,0.0]},
{"id":"1101011011011111000001000011000001011","selected":false,"eval":[-12.0,5.0,0.5]},
{"id":"0001010101000010011111101010001000001","selected":false,"eval":[-5.0,4.0,0.25]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.5]}
Generation 30:
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.0625]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.0]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.0625]},
{"id":"1101011011011111011100100000000101001","selected":true,"eval":[-12.0,3.0,0.0]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.375]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.25]},
{"id":"1101011011011111000011101000100001100","selected":false,"eval":[-12.0,4.0,0.5]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.046875]},
{"id":"1101011011011111011101100000000101001","selected":false,"eval":[-12.0,3.0,0.0]},
{"id":"1101011011011111000010101000100001100","selected":false,"eval":[-12.0,4.0,0.0625]}
Generation 40:
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.0625]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.046875]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.5]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.07017543859649122]},
{"id":"1101011011011111010011101000100001100","selected":false,"eval":[-12.0,4.0,0.046875]},
{"id":"1101011011011111011100100000000101001","selected":false,"eval":[-12.0,3.0,0.06349206349206349]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.047619047619047616]},
{"id":"1101011011011111000010101000100001100","selected":false,"eval":[-12.0,4.0,0.5]},
{"id":"1101011011011111011101100000000101001","selected":false,"eval":[-12.0,3.0,0.03125]},
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.0625]}
In this experiment, we attempt to obtain the optimal malware based on the selected features. As shown above, we generate 10 malware samples in the initial population during the evolution. We use {"id":XXX, "selected":true|false, "eval":[AF, EF, DR]} to represent one malware sample. "id" is the binary representation of the selection of AFs and EFs, where a 0 in the i-th position means that the i-th feature is not selected and a 1 means that it is selected. "eval" has three elements, which are the fitness values for the three objectives, i.e., aggressiveness (F1(x)), evasiveness (F2(x)) and detectability (F3(x)). As we aim to generate more aggressive malware, we negate the value of F1(x) by multiplying it by -1. "selected" indicates whether the sample is selected as a candidate to produce offspring. All "selected" malware samples are put into the population pool, waiting to be selected according to their fitness values, i.e., "eval".
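The record format described above can be read with any JSON parser. The following minimal sketch parses one record from the text box and undoes the negation of F1(x); the helper name `summarize` and the keys of its result are illustrative and not part of Mystique.

```python
import json

# One record copied from the text box (generation 10).
record = json.loads(
    '{"id":"1101011011011111010011101000100000100",'
    '"selected":true,"eval":[-12.0,3.0,0.0]}'
)


def summarize(rec):
    """Unpack a population record into named fields."""
    f1, f2, f3 = rec["eval"]
    return {
        "bits": len(rec["id"]),               # length of the selection vector
        "features_on": rec["id"].count("1"),  # number of selected features
        "aggressiveness": -f1,                # undo the negation of F1(x)
        "evasiveness": f2,
        "detectability": f3,
        "selected": rec["selected"],
    }


summary = summarize(record)
print(summary)
```

For this record, the 19 set bits agree with the 16 attack features and 3 evasion features reported for the optimal malware.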
For the sake of simplicity, we only show the malware information every 10 generations, and we only consider triggers, sources and sinks when calculating the first fitness value, as permissions often depend on other features. In the 10-th generation, we select 7 malware samples as candidates to produce offspring, since IBEA ranks them as more viable. Further malware samples are obtained by crossing over and mutating the selected candidates. When the evolution proceeds to the 20-th generation, only one malware sample is more viable than its siblings, so it is selected as the candidate for further generations. In the 30-th generation, we again obtain one malware sample as a candidate, while we cannot obtain a proper candidate in the 40-th generation. The evolution therefore stops, and the DNA of the malware converges to several optimal samples. There are five optimal malware samples generated in this experiment:
{"id":"1101011011011111010011101000100000100","selected":true,"eval":[-12.0,3.0,0.0]}
{"id":"1101011011011111010011101000100000100","selected":false,"eval":[-12.0,3.0,0.0]}
{"id":"1101011011011111011100100000000101001","selected":true,"eval":[-12.0,3.0,0.0]}
{"id":"1101011011011111011101100000000101001","selected":false,"eval":[-12.0,3.0,0.0]}
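The optimal records can be recovered from the listed populations with a plain Pareto-dominance filter. The sketch below assumes that all three objective values are minimized; this section states that only for the negated F1(x), so treat it as an assumption. The helper names `dominates` and `non_dominated` are illustrative, and the input is a subset of the records shown in the text box.

```python
# (id, selected, eval) tuples copied from generations 10, 20, and 30.
RECORDS = [
    ("1101011011011111010011101000100000100", True,  [-12.0, 3.0, 0.0]),
    ("1101011011011111010011101000100000100", False, [-12.0, 3.0, 0.0]),
    ("1101011011011111010011101000100000100", False, [-12.0, 3.0, 0.0]),
    ("1101011011011111011100100000000101001", True,  [-12.0, 3.0, 0.0]),
    ("1101011011011111011101100000000101001", False, [-12.0, 3.0, 0.0]),
    ("1101011011011111010011101000100001100", True,  [-12.0, 4.0, 0.0]),
    ("1001111011010101010011101000100001100", False, [-10.0, 4.0, 0.5]),
    ("1101011011011111010011101001011101111", False, [-12.0, 9.0, 0.375]),
]


def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and strictly better once."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(records):
    """Keep the records whose eval vector no other record dominates."""
    return [
        r for r in records
        if not any(dominates(other[2], r[2]) for other in records)
    ]


front = non_dominated(RECORDS)
distinct_ids = {rec_id for rec_id, _, _ in front}
print(len(front), len(distinct_ids))
```

On this subset, the filter keeps five records that share the eval vector [-12.0, 3.0, 0.0] and cover three distinct feature selections.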
The features selected in the second optimal malware
[T1] MAIN::STARTUP
[T3] android.intent.action.BATTERY_CHANGED
[T5] android.provider.Telephony.SMS_RECEIVED
[SU1] PACKAGE::INSTALLED_APK
[SU2] SMS::ALL
[SU4] TELEPHONY::IMEI
[SU5] TELEPHONY::IMSI
[SU7] TELEPHONY::SIM_SERIAL
[SI1] HTTP::APACHE_GET
[SI2] HTTP::APACHE_POST
[SI3] HTTP::SOCKET_POST
[SI4] SMS::SEND_TEXT_MESSAGE
[P1] android.permission.INTERNET
[P2] android.permission.PROCESS_OUTGOING_CALLS
[P3] android.permission.RECEIVE_BOOT_COMPLETED
[P4] android.permission.READ_PHONE_STATE
[P6] android.permission.SEND_SMS
[E1] control based
[E2] data based
[E3] partial transformation