As mentioned in the conference circus post, last month I was also giving a talk in CodeGate 2010. Just after Thomas’s talk about BinCrowd it was my turn to present my “Introduction to mobile reversing”:
As you can see in the slides (Flash required, direct pdf download here), the presentation was divided into three parts: Windows Mobile, Android and iPhoneOS. Almost half of the time was spent in the last section talking about iPhone applications and Objective-C reversing.
One of the main problems when doing iPhone application reverse engineering is to deal with Objective-C artifacts. This is not a new issue. Everything developed over the Cocoa Framework has being using Objective-C for a long time. Proof of that is the existence of previous work in the reversing field (Cameron Hotchkies, itsme) but those scripts have a problem with iPhone applications: they are crafted to analyze x86 binaries (Intel MacOS X). That means the scripts try to parse x86 assembly code and look for compiler structures in certain sections/segments. Both the sections and the structures used are completely different for ARM-iPhoneOS binaries. Other scripts are specific for iPhone (KennyTM), but don’t take care of, in my opinion, the most annoying Objective-C side effect: callgraphs and cross-references.
During my talk at CodeGate, I pointed out that one of the main drawbacks was the fact that with Objective-C we have a useless callgraph, because all calls (take all not as 100% but as a really high percentage) are made to the method objc_msgSend() as you can see in this sample callgraph:
The red dots are methods and arrows represent calls. There’s only one arrow per dot because we removed duplicates to simplify it. The three top-called functions are three different flavours of msgSend: the standard one (objc_msgSend); the one to send messages to the super-class (objc_msgSend_Super); and the standard one that returns a struct instead of an integer (objc_msgSend_stret).
Continuing with the presentation, in the last part I used an script to patch the calls to objc_msgSend() and make the callgraph a bit more useful. Basically the script parses the calls to objc_msgSend() and traces the arguments passed in R0 and R1 that are the target class and method name. That way, the script creates a new segment in the binary where it places dummy functions with the names “classname_methodname” and patches the call to objc_msgSend() to point to the corresponding dummy-function. The callgraph after using the script looks a bit better (in blue the dummy functions):
As promised, the script has been released and you can download it from the zynamics GitHub account. This is an initial version with only the objc_msgSend() patching. We will be updating the script with the tricks disclosed in further posts, including a few improvements and adding static Objective-C class reconstruction.