You are a cyber forensic investigator. You obtain a criminal's computer and locate a suspicious program that the criminal was running when he was arrested. The criminal was trying to send a secret message through this program, and that message is what we want to recover. Unfortunately, the program reads its input (i.e., the secret message) from a file and then deletes the file right after reading it.
Luckily, you obtain a memory dump from the criminal's computer. Your goal is to identify the original message that the criminal wanted to send, from the memory dump. You have the program's binary too.
Consider the following program.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>

char* mask_buffer( char* buf, int size )
{
    char* key = "KEYSTRING";
    int i;
    for ( i = 0; i < size; i++ )
        buf[i] = buf[i] ^ key[i % strlen(key)]; // XOR
    return buf;
}

char* read_file_and_dispose( char* filename )
{
    char* ret = (char*)malloc(1024);
    FILE* f = fopen( filename, "rt" );
    if( f ) {
        memset( ret, 0, 1024 );
        fgets (ret, 1023, f);
        fclose(f);
        remove( filename );
        return ret;
    }
    return 0;
}

int main(int argc, char** argv)
{
    char* pBuffer;
    char* pszInput;
    int nSize = 16384;
    int i, j;

    pBuffer = (char*)malloc( nSize );
    memset( pBuffer, 0, nSize );

    /* step 1 */
    pszInput = read_file_and_dispose( "msg" );
    if( !pszInput ) return 1;

    j = strlen(pszInput);
    for ( i = 0; i < j; i++ ) {
        pBuffer[i] = pszInput[i];
        /* step 2 */
        pszInput[i] = 0; // wipe out.
    }
    free(pszInput);

    /* step 3 */
    mask_buffer( pBuffer, nSize );

    /* ======== MEMORY OBTAINED AT THIS VERY MOMENT ======== */

    /* step 4: send secret buffer */
    free( pBuffer );
    return 0;
}
Focus on Steps 1, 2, 3, and 4.
Step 1: First, the program reads its input from a file via read_file_and_dispose(). Inside that function, the program reads a line from the file with fgets and then calls remove to delete the file.
Step 2: After the program gets the input, it copies it byte by byte into pBuffer. As it copies, it overwrites each byte of pszInput with 0 to clear the original input from memory.
Step 3: It then calls mask_buffer, which uses XOR to mask the entire buffer.
Step 4: It sends the buffer to another computer. (The code is omitted.)
Assume that we obtain the memory dump between Steps 3 and 4. As you can see from the code, pszInput no longer contains the original input at that moment, and pBuffer is already masked. Your goal is to recover the original input.
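To make the state of pBuffer at dump time concrete, here is a minimal sketch that replays Steps 2 and 3 on an in-memory string instead of a file. The helper simulate_dump, the message "HI\n", and the 16-byte buffer are all hypothetical and chosen only for illustration; they are not part of the suspect's program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in the original program): replays Steps 2 and 3
   on an in-memory message so the resulting buffer can be inspected. */
void simulate_dump(const char *msg, unsigned char *buf, size_t size)
{
    const char *key = "KEYSTRING";      /* same key as mask_buffer() */
    size_t key_len = strlen(key);
    size_t msg_len = strlen(msg);

    assert(msg_len <= size);            /* the message must fit in the buffer */

    memset(buf, 0, size);               /* pBuffer starts out zero-filled */
    memcpy(buf, msg, msg_len);          /* Step 2: copy the input into pBuffer */
    for (size_t i = 0; i < size; i++)   /* Step 3: mask the whole buffer */
        buf[i] ^= (unsigned char)key[i % key_len];
}
```

For the hypothetical message "HI\n", the first three bytes of the buffer become 3, 12, and 83 (that is, 72 ^ 75, 73 ^ 69, and 10 ^ 89). Note that the loop masks all size bytes, not only the bytes occupied by the message, just as mask_buffer( pBuffer, nSize ) does.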
The memory dump you obtained is shown below:
31, 0, 10, 7, 94, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, 89, 83, 84, 82, 73, 78, 71, 75, 69, ...
Remember the XOR truth table? If "x ^ y = z", then "z ^ y = x". You can therefore recover the original values by running a second round of XOR operations with the same key. With the source code, you can simply call the mask_buffer function again on pBuffer. The following is the output.
84, 69, 83, 84, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Now it is time to consult the ASCII code table: 84 is T, 69 is E, 83 is S, 84 is T, and 10 is a newline.
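The decoding step described above can be sketched as follows. This is only an illustration under stated assumptions: dump_prefix holds just the first 18 bytes of the dump shown earlier, and xor_with_key and decode_prefix are hypothetical helper names. xor_with_key performs the same XOR as mask_buffer, but on unsigned bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* First 18 bytes of the memory dump shown above. */
static const unsigned char dump_prefix[18] = {
    31, 0, 10, 7, 94, 82, 73, 78, 71,
    75, 69, 89, 83, 84, 82, 73, 78, 71
};

/* Same operation as mask_buffer(): XOR each byte with the repeating key. */
void xor_with_key(unsigned char *buf, size_t size, const char *key)
{
    size_t key_len = strlen(key);
    assert(key_len > 0);                /* avoid a modulo by zero */
    for (size_t i = 0; i < size; i++)
        buf[i] ^= (unsigned char)key[i % key_len];
}

/* Copy the dump prefix into out, unmask it, and print each byte in decimal. */
void decode_prefix(unsigned char out[18])
{
    memcpy(out, dump_prefix, sizeof dump_prefix);
    xor_with_key(out, sizeof dump_prefix, "KEYSTRING");
    for (size_t i = 0; i < sizeof dump_prefix; i++)
        printf("%d%s", out[i], i + 1 < sizeof dump_prefix ? ", " : "\n");
}
```

Calling decode_prefix from a main function prints 84, 69, 83, 84, 10 followed by thirteen zeros, which matches the start of the decoded output above.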
Assume that you do not know the key ("KEYSTRING" here). How can you recover the original values?