About six weeks ago, when I blogged about the Adobe Reader/Flash 0-day that was making the rounds back then, I talked about generating automated shellcode dumps with Pin. In this post I want to talk a bit about Pin, dynamic binary instrumentation, and the shellcode dumper Pintool we developed at zynamics.
Dynamic binary instrumentation is a technique for analyzing binary files by executing the files and injecting analysis code into the binary file at runtime. This method is not exactly new. It has been in use for many years already, for example in program verification, profiling, and compiler optimization. However, despite its amazing power and ease of use, dynamic binary instrumentation is still not widely used by binary code reverse engineers.
The two most important dynamic binary instrumentation tools for binary code reverse engineers are Pin and DynamoRIO. Pin is developed by Intel and provided by the University of Virginia while DynamoRIO is a collaboration between Hewlett-Packard and MIT. Both are free to use but only DynamoRIO is open source.
In general, Pin and DynamoRIO are very similar to use. With either tool, you write a C or C++ plugin that contains your analysis code, and Pin or DynamoRIO injects this code into the target process. For most reverse engineering purposes, both tools are equally useful. When you talk to reverse engineers who use dynamic binary instrumentation, which tool they prefer often seems to be a matter of personal taste. At zynamics we use Pin because the API for analysis code seems cleaner to us.
Let’s get back to shellcode dumping now. The idea behind shellcode detection is rather simple: Whenever an instruction is executed, check if that instruction belongs to a section of a loaded module in the address space of the target process. If that’s the case, then the instruction is considered legit (not shellcode). If, however, the instruction is outside of any module section, and therefore most likely on the stack or allocated heap memory, the instruction is considered shellcode. This heuristic is not perfect, of course. However, it works surprisingly well in practice.
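To make the heuristic concrete, here is a minimal sketch of the address check in plain C++, without the Pin API. It is not the code of our Pintool. The `Section` struct and the `isLegit` function are hypothetical names chosen for illustration, and the sketch assumes the section ranges of all loaded modules have already been collected.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one section of a loaded module.
struct Section {
    std::uint64_t start;  // first address of the section
    std::uint64_t end;    // one past the last address of the section
};

// Returns true if addr lies inside any known module section ("legit").
// Returns false if addr is outside every section; the heuristic then
// treats the instruction at addr as shellcode.
bool isLegit(const std::vector<Section>& sections, std::uint64_t addr) {
    for (const Section& s : sections) {
        assert(s.start < s.end);  // a section must not be empty or inverted
        if (addr >= s.start && addr < s.end) {
            return true;
        }
    }
    return false;
}
```

A linear scan is enough for a sketch. Since the check runs for every executed instruction, a sorted range list with binary search would probably be the better choice for a target with many modules.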
In fact, the Pintool we developed does exactly this. For every executed instruction in the target process it performs the check described in the above paragraph. Until shellcode is found, the Pintool keeps track of up to 100 legit instructions executed before the shellcode. Then, when shellcode is found, it dumps the legit instructions before the shellcode and the shellcode itself. The big value of our Pintool is not that it dumps the shellcode. The big value is that it tells you exactly where control flow is transferred from the legit code to the shellcode. With this information you can quickly find the vulnerability in the exploited program. If you are really interested in the shellcode, you can also just set a breakpoint on the last legit instruction before the shellcode and do a manual analysis from there.
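The bookkeeping described above can be sketched as a small bounded history. This is an illustration in plain C++ and not the actual Pintool source. The class name `TransitionTracker` and its methods are hypothetical, and the `legit` flag is assumed to come from the module section check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical tracker: remembers the most recent legit instruction
// addresses and flags the first instruction that is not legit.
class TransitionTracker {
public:
    explicit TransitionTracker(std::size_t capacity = 100)
        : capacity_(capacity) {}

    // Feed one executed instruction. Returns true exactly once, namely
    // for the first shellcode instruction. At that point history() holds
    // the legit instructions that led up to it, oldest first.
    bool onInstruction(std::uint64_t addr, bool legit) {
        if (found_) {
            return false;  // transition already reported
        }
        if (legit) {
            history_.push_back(addr);
            if (history_.size() > capacity_) {
                history_.pop_front();  // drop the oldest entry
            }
            return false;
        }
        found_ = true;
        shellcodeStart_ = addr;
        return true;
    }

    const std::deque<std::uint64_t>& history() const { return history_; }

    std::uint64_t shellcodeStart() const {
        assert(found_);  // only meaningful after the transition was seen
        return shellcodeStart_;
    }

private:
    std::size_t capacity_;
    std::deque<std::uint64_t> history_;
    bool found_ = false;
    std::uint64_t shellcodeStart_ = 0;
};
```

The last entry of `history()` is the instruction that transferred control to the shellcode, which is where you would set your breakpoint.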
You can find the complete documented source of our shellcode dumper Pintool on the zynamics GitHub. The source code is surprisingly short and I did my best to document it. If you have any questions about the source code, please leave a comment on this blog entry or contact me in some other way.
Let’s take a look at the output of a sample run now. You can find the full trace log on GitHub but here are the important parts.
0x238C038E::EScript.api E8 D2 72 F6 FF call 0x23827665
0x238BBF4F::EScript.api 8B 44 24 04 mov eax, dword ptr [esp+0x4]
0x238BBF53::EScript.api C6 40 FF 01 mov byte ptr [eax-0x1], 0x1
[ … more EScript.api instructions … ]
0x2D841E82::Multimedia.api 56 push esi
0x2D841E83::Multimedia.api 8B 74 24 08 mov esi, dword ptr [esp+0x8]
0x2D841E87::Multimedia.api 85 F6 test esi, esi
0x2D841E89::Multimedia.api 74 22 jz 0x2d841ead
0x2D841E8B::Multimedia.api 56 push esi
[ … more Multimedia.api instructions … ]
0x2D841E96::Multimedia.api 8B 10 mov edx, dword ptr [eax]
0x2D841E98::Multimedia.api 8B C8 mov ecx, eax
0x2D841E9A::Multimedia.api FF 52 04 call dword ptr [edx+0x4]
0x0A0A0A0A:: 0A 0A or cl, byte ptr [edx]
0x0A0A0A0C:: 0A 0A or cl, byte ptr [edx]
0x0A0A0A0E:: 0A 0A or cl, byte ptr [edx]
[ … more shellcode … ]
In case you are wondering about gaps in the instruction trace (for example at calls), please note that each instruction is only dumped once. So, if a function is called twice, the second call is not dumped to the output file anymore. This behavior was added to keep log files small.
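One simple way to get this dump-once behavior is to remember every address that has already been written to the log. The sketch below is plain C++ and only illustrates the idea. `DumpOnce` and `shouldDump` are hypothetical names, and the actual Pintool may implement this differently.

```cpp
#include <cstdint>
#include <unordered_set>

// Hypothetical filter that lets each instruction address through once.
class DumpOnce {
public:
    // Returns true the first time addr is seen, so the caller logs it.
    // Returns false on every later execution of the same address.
    bool shouldDump(std::uint64_t addr) {
        // insert() reports in .second whether the element was newly added.
        return seen_.insert(addr).second;
    }

private:
    std::unordered_set<std::uint64_t> seen_;
};
```

The price for the smaller log is the one mentioned above: repeated executions of the same code, such as a second call to a function, no longer show up in the trace.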
I think the shellcode dumper is a good example of a first Pintool. The idea behind it is really simple and interested readers can improve the actual Pintool in many ways (dump register values or improve the shellcode detection heuristic, for example). If you make improvements to the tool, please let us know.