More From Strings

If you have viewed the analysis page for a sample on the #totalhash site, you may have seen a section entitled “strings”. Strings are a great way to get more information from a sample quickly, without having to resort to dynamic analysis.

What are strings?

They are exactly that: sequences of printable characters present in the compiled executable. To demonstrate, let’s write some simple C code, compile it into an executable, and analyze the binary.

My simple helloworld.c code looks like this:

[Image: helloworld.c source listing]

If we compile this code with the command gcc -o helloworld helloworld.c, we get a compiled executable, “helloworld”. If we execute it, we see that it simply displays “Hello World”.

To show how useful strings can be, I run the strings command on this executable and grep for “Hello World”, and I find a match.
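To make the idea concrete, here is a minimal Python sketch of roughly what a strings-style scan does: it walks the raw bytes of a file and reports runs of printable ASCII characters. The function name extract_strings and the sample bytes are made up for illustration. The four-character minimum mirrors the default of GNU strings, but this is not that tool's actual implementation.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # Printable ASCII runs from 0x20 (space) to 0x7e (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Illustrative bytes only, not taken from a real executable.
blob = b"\x7fELF\x02\x01\x00\x00Hello World\x00\x01\x02GCC: (GNU)\x00ab\x00"
print(extract_strings(blob))  # ['Hello World', 'GCC: (GNU)']
```

Short runs such as “ELF” and “ab” are dropped because they fall below the minimum length, which is how the scan filters out most accidental matches in binary data.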

String Obfuscation

An age-old and simple technique to stop strings being readable by the strings command is to store a string in an alternative format. A simple alternative is to use a character array. Let’s try it out. Here is the code for helloworld_obfs.c:

[Image: helloworld_obfs.c source listing]

If we compile and execute this code, we see it does exactly the same job as helloworld.c: it outputs “Hello World”. If we run strings over this new file, though, we will not find the string “Hello World”. The string is still there, but we need to resort to another technique to read it.
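One way to see why the scan misses the string is to compare how the bytes might be laid out in each binary. The Python sketch below assumes an unoptimized x86-64 build in which the compiler emits one byte-store instruction per character, as the movb instructions in the disassembly later in this post suggest. The encoding c6 45 <offset> <character> and the stack offsets starting at 0xe0 are illustrative assumptions, not bytes taken from the real helloworld_obfs file.

```python
def longest_printable_run(data: bytes) -> int:
    """Length of the longest run of consecutive printable ASCII bytes."""
    longest = current = 0
    for byte in data:
        if 0x20 <= byte <= 0x7E:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


text = b"Hello World"

# String literal stored contiguously, as in the helloworld.c case.
contiguous = b"\x00" + text + b"\x00"

# Assumed layout for the character-array case: each character is the
# immediate operand of its own instruction (c6 45 <offset> <character>).
interleaved = b"".join(
    bytes([0xC6, 0x45, 0xE0 + i, ch]) for i, ch in enumerate(text)
)

print(longest_printable_run(contiguous))   # 11
print(longest_printable_run(interleaved))  # 1
```

Under these assumptions every character is separated from the next by non-printable instruction bytes, so no run ever reaches the four-character minimum that strings looks for.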

I’m now going to use the objdump tool found on Linux to disassemble the compiled code. Here is some output from the command objdump -D helloworld_obfs:

[Image: objdump disassembly of helloworld_obfs]

The disassembly shows the character array being built up instruction by instruction and enables us to reconstruct the string. As you can see, there are a number of consecutive movb instructions in the above disassembly. If you pay close attention to the hex values next to the movb instructions, you will see our “Hello World” string (0x48 is “H”, 0x65 is “e”, and so on).
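The reconstruction can be sketched in a few lines of Python. The disassembly below is a hand-written illustration in objdump’s default AT&T syntax: the addresses and stack offsets are made up, and it is not the output from the original screenshot. The name decode_movb_strings and the rule that a run ends at the first line that is not a movb of a printable byte are assumptions for this sketch.

```python
import re

# Matches e.g. "movb   $0x48,-0x20(%rbp)" and captures the immediate byte.
MOVB_IMM = re.compile(r"\bmovb\s+\$0x([0-9a-fA-F]{1,2}),")


def decode_movb_strings(disassembly: str, min_len: int = 4) -> list:
    """Rebuild strings from runs of consecutive movb immediates."""
    found, current = [], []
    for line in disassembly.splitlines():
        match = MOVB_IMM.search(line)
        value = int(match.group(1), 16) if match else None
        if value is not None and 0x20 <= value <= 0x7E:
            current.append(chr(value))
            continue
        # Any other line, or a non-printable byte such as the NUL
        # terminator, ends the current run.
        if len(current) >= min_len:
            found.append("".join(current))
        current = []
    if len(current) >= min_len:
        found.append("".join(current))
    return found


sample = """\
  40052a:  48 83 ec 20    sub    $0x20,%rsp
  40052e:  c6 45 e0 48    movb   $0x48,-0x20(%rbp)
  400532:  c6 45 e1 65    movb   $0x65,-0x1f(%rbp)
  400536:  c6 45 e2 6c    movb   $0x6c,-0x1e(%rbp)
  40053a:  c6 45 e3 6c    movb   $0x6c,-0x1d(%rbp)
  40053e:  c6 45 e4 6f    movb   $0x6f,-0x1c(%rbp)
  400542:  c6 45 e5 20    movb   $0x20,-0x1b(%rbp)
  400546:  c6 45 e6 57    movb   $0x57,-0x1a(%rbp)
  40054a:  c6 45 e7 6f    movb   $0x6f,-0x19(%rbp)
  40054e:  c6 45 e8 72    movb   $0x72,-0x18(%rbp)
  400552:  c6 45 e9 6c    movb   $0x6c,-0x17(%rbp)
  400556:  c6 45 ea 64    movb   $0x64,-0x16(%rbp)
  40055a:  c6 45 eb 00    movb   $0x0,-0x15(%rbp)
  40055e:  48 8d 45 e0    lea    -0x20(%rbp),%rax
"""

print(decode_movb_strings(sample))  # ['Hello World']
```

Requiring a minimum run length, as strings does, keeps isolated single-byte stores from showing up as noise in the results.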

Application

Let’s put this knowledge to use against some malware samples. To save decoding the hex values by hand, I can use some simple Perl code to do the job. You can download that code here:

http://totalhash.cymru.com/download/strdeob.pl.txt

Now let’s run this against a known Zegost sample in our repository.

http://totalhash.cymru.com/analysis/50cf59b3baf2101465b25d5823494d6431600a57

[Image: strings recovered from the Zegost sample]

This gives us lots of new output to help determine what this sample might do at runtime.