Author Archive

Objective-C reversing (Part II)

2010-07-02

It’s been a long time since the first part about static Objective-C reverse engineering so it’s time for a second one and provide another script to play with. In this second part we will be covering static class reconstruction for Objective-C binaries. Some class reconstruction was made previously in Cameron Hotchkies’s and itsme’s work, but those were for Mac OS X binaries and as we said in the first part, the structure of the binaries changed, as well as the internal structures that contain information about classes. Along this post and all its examples, we will use DigiClock as an example for our analysis. Given that the application comes with source code, it will be useful to verify our findings.

Our first step is to look into the __objc_classlist section. As the name says, this section contains a list of the classes defined inside the binary. Note that general classes as NSObject are not included even if used/referenced in the source code, just the classes we implement. Each DWORD in this section points to an address in __objc_data where the class definition is stored. For example:


00004980 DCD _OBJC_CLASS_$_DigiClockAppDelegate
00004984 DCD _OBJC_CLASS_$_FlipsideView
00004988 DCD _OBJC_CLASS_$_MainView
0000498C DCD _OBJC_CLASS_$_MainViewController
00004990 DCD _OBJC_CLASS_$_RootViewController
00004994 DCD _OBJC_CLASS_$_FlipsideViewController

And going to _OBJC_CLASS_$_DigiClockAppDelegate:


000040D8 _OBJC_CLASS_$_DigiClockAppDelegate DCD _OBJC_METACLASS_$_DigiClockAppDelegate
000040DC DCD 0
000040E0 DCD 0
000040E4 DCD 0
000040E8 DCD dword_43AC

Here we have our first struct, that we can define as (all fields are size 4 bytes):

Offset Description
0 Pointer to Meta-Class
4 Pointer to Super-Class
8 Pointer to class cache
12 Pointer to vtable related struct
16 Pointer to class definition

As you can see (if you’re looking into the binary while reading this, which you should) there’s always a meta-class but in this case we have no super-class. Super-class are usually references to high level classes like NSObject, UIViewController, UITableViewController, etc.

The cache and vtable struct are also empty so let’s move to the class definition:

Offset Description
0 Boolean that indicates if it’s a meta-class
4 Instance size (disk?)
8 Instance size (memory?)
12 Always zero?
16 Pointer to class name (ASCII)
20 Pointer to method struct (list of implemented methods)
24 Pointer to protocol struct (inherited protocols)
28 Pointer to ivar names (list of declared variables)
32 Always zero?
36 Pointer to properties struct (list with encoded types)

First of all, about the two “instance size” fields, it is unclear which one refers to either disk or memory size but my guess is inside parenthesis.

Method struct

This one is a simple struct that contains a lot of useful information. The struct works like an array of method definitions. That way the first two fields indicate the size of each “method definition” (or field) and the second one the total number of fields. Following the example, DigiClockAppDelegate has 6 methods:


00004314 dword_4314 DCD 0xC
00004318 DCD 6
0000431C DCD aSetwindow ; "setWindow:"
00004320 DCD aV12048 ; "v12@0:4@8"
00004324 DCD __DigiClockAppDelegate_setWindow__+1
00004328 DCD aWindow ; "window"
0000432C DCD a804 ; "@8@0:4"
00004330 DCD __DigiClockAppDelegate_window_+1
00004334 DCD aSetrootviewcon ; "setRootViewController:"
00004338 DCD aV12048 ; "v12@0:4@8"
0000433C DCD __DigiClockAppDelegate_setRootViewController__+1
00004340 DCD aRootviewcontro ; "rootViewController"
00004344 DCD a804 ; "@8@0:4"
00004348 DCD __DigiClockAppDelegate_rootViewController_+1
0000434C DCD aDealloc ; "dealloc"
00004350 DCD aV804 ; "v8@0:4"
00004354 DCD __DigiClockAppDelegate_dealloc_+1
00004358 DCD aApplicationdid ; "applicationDidFinishLaunching:"
0000435C DCD aV12048 ; "v12@0:4@8"
00004360 DCD __DigiClockAppDelegate_applicationDidFinishLaunching__+1

The information stored in every field is, in order: the method name, an encoded string that specifies the function prototype (return value and parameter types), and the method address. The encoded prototype can look a bit tricky at first but with the help of some available information we can see how setWindow prototype would be something like:

void setWindow(self@0, id@8)

We know that id is a class instance, but we don’t know which one. And that’s all about methods for now.

Protocol struct

The protocol struct has two parts. The first one specifies how many protocols that class inherits:


0000437C dword_437C DCD 1
00004380 DCD dword_49B8

In this case, as shown by the first DWORD is only one protocol, and the second field points to the protocol definition:

Offset Description
4 Pointer to name of the class the protocol is inherited from.
8 Pointer to protocol struct (N-th level inheritance)
12 Pointer to a method struct (instance methods)
20 Pointer to a method struct (class methods).

Here we have a recursive reference, where a protocol usually points to higher level protocols. It would be possible to build a hierarchy tree using this information but unfortunately, in most cases protocol information is related to standard classes (UIApplicationDelegate, NSObject…) so the resulting tree uses to be the same or quite similar.

About the difference between instance and class methods, the “Objective-C Programming Language” says:

“Protocols can’t be used to type class objects. Only instances can be statically typed to a protocol, just as only instances can be statically typed to a class. (However, at runtime, both classes and instances will respond to a conformsToProtocol: message.)”

Ivar struct

This struct lists the variables defined inside the interface of the class. In our example, DigiClockAppDelegate defines two variables:


000042E4 dword_42E4 DCD 0x14
000042E8 DCD 2
000042EC DCD _OBJC_IVAR_$_DigiClockAppDelegate.window
000042F0 DCD aWindow ; "window"
000042F4 DCD aUiwindow ; "@\"UIWindow\""
000042F8 DCD 2
000042FC DCD 4
00004300 DCD _OBJC_IVAR_$_DigiClockAppDelegate.rootViewController
00004304 DCD aRootviewcontro ; "rootViewController"
00004308 DCD aRootviewcont_0 ; "@\"RootViewController\""
0000430C DCD 2
00004310 DCD 4

The structure is similar to the one used with methods. First we have a field size (0x14) followed by the total number of fields. Each field contains information about the offset, the variable name and the type. The remaining integer values are related to the size in memory.

Property struct

This is the last struct of the class definition, and again uses the same structure with the first DWORD telling the field size and the second the number of fields. In our example, the properties are applied over the two variables we saw in the previous structure (“window” and “rootViewController”):


00004364 dword_4364 DCD 8
00004368 DCD 2
0000436C DCD aRootviewcontro ; "rootViewController"
00004370 DCD aTRootviewcontr ; "T@\"RootViewController\",&,N,VrootViewCon"...
00004374 DCD aWindow ; "window"
00004378 DCD aTUiwindowNVwin ; "T@\"UIWindow\",&,N,Vwindow"

For properties, fields contain only the variable name and the encoded properties. The encoded string usually follows this format:

T@”VAR_TYPE”,[&N],V_var_name

Where N represents the nonatomic property and “&” the retain property.

Final words

Well, that’s all for this second part. There’s an idapython script that parses all this information on zynamics github. In the third and upcoming parts of Objective-C reversing we will be filling the gaps on the structures and using all this information to reconstruct the application’s header files and to improve the objc_helper script that we introduced on the first part.

Defcon CTF quals: bin400 writeup

2010-06-02

A few days ago, between May 21st – 24th, DDTEK organized the Defcon 18 Capture The Flag qualifiers. For all of you that are not familiar with this kind of contest, Defcon CTF is a hacking offense/defense contest held during the conference in Vegas. In order to play the final round, a previous online competition takes place to select 9 top-teams that will join last year’s winner. The qualification contest contained 30 challenges through different categories like Pursuits Trivial (general questions), Crypto Badness (cryptography), Packet Madness (network traffic analysis), Binary L33tness (reversing), Pwtent Pwnables (exploiting) and Forensics. We at zynamics had a couple of guys playing in different teams so we decided to join the writeup fever and release a solution for the Binary L33tness 400 challenge

(more…)

Objective-C script update

2010-05-25

The objc_helper script we presented earlier in Objective-C Reversing Part I has been updated. Check the new version in Zynamics’ GitHub. This is a summary of the main changes and fixes:

  • Fixed a problem when tracking R0 register that was modified by previous calls. Now if the script is tracking R0 and finds a BX/BLX, it assumes that is modifying R0 and stops, marking the tracking as failed.
  • Changed the way the script parses the data references so it works both with release and debug binaries. Instead of getting the raw offset we now use recursive calls to idautils.DataRefsFrom(). For the references to work properly we had to make a pre-process converting all dwords to offsets in the classrefs and superrefs sections (similar to the offsetize() used by KennyTM).
  • In some cases, compiler can decide to use LR as a general register so the search for R0..R15 fails. Now the script includes the handling of this special case.
  • Added check of Thumb/non-Thumb code for patching the calls correctly.
  • Fixed bug that was getting the incorrect parameters for other flavours of msgSend(). Now it should be easier to add others.

Thanks a lot to everybody that reported bugs, and also to the betatesters!

Soon we will come with the Objective-C reversing part II with more improvements and details on static analysis. Stay tuned!

Objective-C reversing (Part I)

2010-04-27

As mentioned in the conference circus post, last month I was also giving a talk in CodeGate 2010. Just after Thomas’s talk about BinCrowd it was my turn to present my “Introduction to mobile reversing”:

As you can see in the slides (Flash required, direct pdf download here), the presentation was divided into three parts: Windows Mobile, Android and iPhoneOS. Almost half of the time was spent in the last section talking about iPhone applications and Objective-C reversing.

One of the main problems when doing iPhone application reverse engineering is to deal with Objective-C artifacts. This is not a new issue. Everything developed over the Cocoa Framework has being using Objective-C for a long time. Proof of that is the existence of previous work in the reversing field (Cameron Hotchkies, itsme) but those scripts have a problem with iPhone applications: they are crafted to analyze x86 binaries (Intel MacOS X). That means the scripts try to parse x86 assembly code and look for compiler structures in certain sections/segments. Both the sections and the structures used are completely different for ARM-iPhoneOS binaries. Other scripts are specific for iPhone (KennyTM), but don’t take care of, in my opinion, the most annoying Objective-C side effect: callgraphs and cross-references.

During my talk at CodeGate, I pointed out that one of the main drawbacks was the fact that with Objective-C we have a useless callgraph, because all calls (take all not as 100% but as a really high percentage) are made to the method objc_msgSend() as you can see in this sample callgraph:

Original callgraph of an iPhone Objective-C sample

The red dots are methods and arrows represent calls. There’s only one arrow per dot because we removed duplicates to simplify it. The three top-called functions are three different flavours of msgSend: the standard one (objc_msgSend); the one to send messages to the super-class (objc_msgSend_Super); and the standard one that returns a struct instead of an integer (objc_msgSend_stret).

Continuing with the presentation, in the last part I used an script to patch the calls to objc_msgSend() and make the callgraph a bit more useful. Basically the script parses the calls to objc_msgSend() and traces the arguments passed in R0 and R1 that are the target class and method name. That way, the script creates a new segment in the binary where it places dummy functions with the names “classname_methodname” and patches the call to objc_msgSend() to point to the corresponding dummy-function. The callgraph after using the script looks a bit better (in blue the dummy functions):

Callgraph of the same sample after using the script

As promised, the script has been released and you can download it from the zynamics GitHub account. This is an initial version with only the objc_msgSend() patching. We will be updating the script with the tricks disclosed in further posts, including a few improvements and adding static Objective-C class reconstruction.