It has been a long time since the first part on static Objective-C reverse engineering, so it is time for a second one, along with another script to play with. In this second part we cover static class reconstruction for Objective-C binaries. Some class reconstruction was done previously in Cameron Hotchkies’s and itsme’s work, but that targeted Mac OS X binaries and, as we said in the first part, the structure of the binaries has changed, as have the internal structures that hold information about classes. Throughout this post we use DigiClock as the example for our analysis. Since the application comes with source code, we can use it to verify our findings.
Our first step is to look at the __objc_classlist section. As the name suggests, this section contains a list of the classes defined inside the binary. Note that framework classes such as NSObject are not included even if they are used or referenced in the source code; only the classes the application implements appear here. Each DWORD in this section holds the address in __objc_data where the class definition is stored. For example:
00004980 DCD _OBJC_CLASS_$_DigiClockAppDelegate
00004984 DCD _OBJC_CLASS_$_FlipsideView
00004988 DCD _OBJC_CLASS_$_MainView
0000498C DCD _OBJC_CLASS_$_MainViewController
00004990 DCD _OBJC_CLASS_$_RootViewController
00004994 DCD _OBJC_CLASS_$_FlipsideViewController
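Since the section is just a flat array of 32-bit pointers, reading it takes a few lines. The following is a minimal sketch in plain Python, assuming a little-endian 32-bit binary and that the raw bytes of the section have already been extracted; the function name and the second pointer value are made up for illustration.

```python
import struct

def read_classlist(section_bytes):
    """Return the class pointers stored in a raw __objc_classlist section.

    Assumes 32-bit little-endian pointers, as in the listing above.
    """
    count = len(section_bytes) // 4
    return list(struct.unpack("<%dI" % count, section_bytes[:count * 4]))

# Toy section with two entries. 0x40D8 is the address of
# _OBJC_CLASS_$_DigiClockAppDelegate; 0x40EC is an invented value.
raw = struct.pack("<2I", 0x40D8, 0x40EC)
for address in read_classlist(raw):
    print(hex(address))
```

Inside IDA the same loop would read the DWORDs straight from the database instead of from a byte string.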
And going to _OBJC_CLASS_$_DigiClockAppDelegate:
000040D8 _OBJC_CLASS_$_DigiClockAppDelegate DCD _OBJC_METACLASS_$_DigiClockAppDelegate
000040DC DCD 0
000040E0 DCD 0
000040E4 DCD 0
000040E8 DCD dword_43AC
Here we have our first struct, which we can define as follows (every field is 4 bytes long):
| Offset | Field |
|---|---|
| 0 | Pointer to Meta-Class |
| 4 | Pointer to Super-Class |
| 8 | Pointer to class cache |
| 12 | Pointer to vtable related struct |
| 16 | Pointer to class definition |
As you can see (if you’re looking at the binary while reading this, which you should), there is always a meta-class, but in this case the super-class field is empty. Super-classes are usually references to high-level classes such as NSObject, UIViewController, UITableViewController, etc.
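The five-field class struct described above can be unpacked in one call. This is a sketch only: the field names are my own labels for the table rows, and the meta-class pointer value is a placeholder because the listing shows it only as a symbol.

```python
import struct
from collections import namedtuple

# Field names are informal labels for the five DWORDs in the table.
ObjcClass = namedtuple("ObjcClass", "metaclass superclass cache vtable data")

def parse_class(buf, offset=0):
    """Unpack one 20-byte class struct (five little-endian DWORDs)."""
    return ObjcClass(*struct.unpack_from("<5I", buf, offset))

# Values follow the DigiClockAppDelegate listing, except 0x40C4, which is
# a placeholder for the meta-class address.
raw = struct.pack("<5I", 0x40C4, 0, 0, 0, 0x43AC)
cls = parse_class(raw)
print("class definition at", hex(cls.data))
```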
The cache and vtable struct are also empty so let’s move to the class definition:
| Offset | Field |
|---|---|
| 0 | Boolean that indicates if it’s a meta-class |
| 4 | Instance size (disk?) |
| 8 | Instance size (memory?) |
| 16 | Pointer to class name (ASCII) |
| 20 | Pointer to method struct (list of implemented methods) |
| 24 | Pointer to protocol struct (inherited protocols) |
| 28 | Pointer to ivar names (list of declared variables) |
| 36 | Pointer to properties struct (list with encoded types) |
First of all, regarding the two “instance size” fields: it is unclear which one refers to the size on disk and which to the size in memory, so my guess is given in parentheses.
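The class definition can be read the same way. The sketch below follows the offsets in the table and labels the two DWORDs the table does not describe (offsets 12 and 32) as unknown. The field names are my own; for reference, the open-source objc4 runtime declares a struct of this shape called class_ro_t, whose first three words are named flags, instanceStart and instanceSize, but whether that naming applies to this exact binary is an assumption.

```python
import struct

# One label per DWORD, offsets 0 through 36. Names are informal.
FIELDS = ("is_meta", "size_a", "size_b", "unknown_12", "name",
          "methods", "protocols", "ivars", "unknown_32", "properties")

def parse_class_definition(buf, offset=0):
    """Unpack the 40-byte class definition into a dict keyed by FIELDS."""
    return dict(zip(FIELDS, struct.unpack_from("<10I", buf, offset)))

# The four list pointers are the addresses of the DigiClockAppDelegate
# listings in this post. The sizes (4, 12) and the name pointer (0x3A40)
# are invented for the example.
raw = struct.pack("<10I", 0, 4, 12, 0, 0x3A40,
                  0x4314, 0x437C, 0x42E4, 0, 0x4364)
info = parse_class_definition(raw)
print(hex(info["methods"]), hex(info["ivars"]))
```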
The method struct is a simple struct that contains a lot of useful information. It works like an array of method definitions: the first field indicates the size of each “method definition” (or entry) and the second one the total number of entries. Following the example, DigiClockAppDelegate has 6 methods:
00004314 dword_4314 DCD 0xC
00004318 DCD 6
0000431C DCD aSetwindow ; "setWindow:"
00004320 DCD aV12048 ; "v12@0:4@8"
00004324 DCD __DigiClockAppDelegate_setWindow__+1
00004328 DCD aWindow ; "window"
0000432C DCD a804 ; "@8@0:4"
00004330 DCD __DigiClockAppDelegate_window_+1
00004334 DCD aSetrootviewcon ; "setRootViewController:"
00004338 DCD aV12048 ; "v12@0:4@8"
0000433C DCD __DigiClockAppDelegate_setRootViewController__+1
00004340 DCD aRootviewcontro ; "rootViewController"
00004344 DCD a804 ; "@8@0:4"
00004348 DCD __DigiClockAppDelegate_rootViewController_+1
0000434C DCD aDealloc ; "dealloc"
00004350 DCD aV804 ; "v8@0:4"
00004354 DCD __DigiClockAppDelegate_dealloc_+1
00004358 DCD aApplicationdid ; "applicationDidFinishLaunching:"
0000435C DCD aV12048 ; "v12@0:4@8"
00004360 DCD __DigiClockAppDelegate_applicationDidFinishLaunching__+1
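The information stored in every entry is, in order: the method name, an encoded string that specifies the function prototype (return value and parameter types), and the method address. The encoded prototype can look a bit tricky at first, but with the help of some available information on type encodings it becomes readable. Each type code is followed by a number: for the return type it is the total size of the arguments in bytes, and for each argument it is its offset. With that in mind, the setWindow: prototype would be something like:
void setWindow(id self@0, SEL _cmd@4, id arg@8)
The first two arguments are the implicit receiver and selector that every method receives. We know that the last id is a class instance, but we don’t know which one. And that’s all about methods for now.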
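Decoding these prototype strings is easy to automate. The following sketch handles only a handful of single-character type codes, which is enough for the strings in the listing; struct, pointer and quoted class types are deliberately left out, and the function name is hypothetical.

```python
import re

# Subset of the Objective-C type encoding codes.
TYPE_NAMES = {"v": "void", "@": "id", ":": "SEL", "#": "Class",
              "c": "char", "i": "int", "f": "float", "d": "double",
              "B": "bool", "*": "char *"}

# One type code followed by a decimal number.
TOKEN = re.compile(r"([v@:#cifdB*])(\d+)")

def decode_signature(encoding):
    """Split an encoded prototype into return type, frame size and arguments.

    The number after the return type is the total size of the arguments;
    the number after each argument is its offset.
    """
    pairs = [(TYPE_NAMES[code], int(num)) for code, num in TOKEN.findall(encoding)]
    (return_type, frame_size), arguments = pairs[0], pairs[1:]
    return return_type, frame_size, arguments

print(decode_signature("v12@0:4@8"))
print(decode_signature("@8@0:4"))
```

The first two arguments of every decoded method are the receiver and the selector, so the interesting parameters start at the third entry.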
The protocol struct has two parts. The first one specifies how many protocols that class inherits:
0000437C dword_437C DCD 1
00004380 DCD dword_49B8
In this case, as the first DWORD shows, there is only one protocol. The second field points to the protocol definition:
| Offset | Field |
|---|---|
| 4 | Pointer to name of the class the protocol is inherited from. |
| 8 | Pointer to protocol struct (N-th level inheritance) |
| 12 | Pointer to a method struct (instance methods) |
| 20 | Pointer to a method struct (class methods). |
Here we have a recursive reference, where a protocol usually points to higher-level protocols. It would be possible to build a hierarchy tree from this information but, unfortunately, in most cases the protocol information relates to standard classes (UIApplicationDelegate, NSObject…), so the resulting tree tends to be the same or very similar.
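The recursive walk described above could look like the sketch below. It assumes the layout given in this post (a count followed by protocol pointers, with the name at offset 4 and the parent list at offset 8 of each protocol definition). The Image helper and the whole toy memory image are invented for the example; only the protocol names come from the text.

```python
import struct

class Image:
    """Toy memory image: a base address plus raw bytes (hypothetical helper)."""
    def __init__(self, base, data):
        self.base, self.data = base, data

    def u32(self, addr):
        return struct.unpack_from("<I", self.data, addr - self.base)[0]

    def cstr(self, addr):
        start = addr - self.base
        return self.data[start:self.data.index(b"\0", start)].decode("ascii")

def protocol_tree(img, list_addr):
    """Return [(name, parents)] for the protocol list at list_addr, recursively."""
    if list_addr == 0:
        return []
    tree = []
    for i in range(img.u32(list_addr)):           # first DWORD: protocol count
        proto = img.u32(list_addr + 4 + 4 * i)    # pointer to protocol definition
        name = img.cstr(img.u32(proto + 4))       # offset 4: name
        parents = protocol_tree(img, img.u32(proto + 8))  # offset 8: parent list
        tree.append((name, parents))
    return tree

# Invented image: one protocol list whose single protocol has one parent.
BASE = 0x1000
NAMES = BASE + 0x40
name1, name2 = b"UIApplicationDelegate\0", b"NSObject\0"
blob = (struct.pack("<2I", 1, BASE + 0x08)
        + struct.pack("<6I", 0, NAMES, BASE + 0x20, 0, 0, 0)
        + struct.pack("<2I", 1, BASE + 0x28)
        + struct.pack("<6I", 0, NAMES + len(name1), 0, 0, 0, 0)
        + name1 + name2)
tree = protocol_tree(Image(BASE, blob), BASE)
print(tree)
```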
About the difference between instance and class methods, the “Objective-C Programming Language” says:
“Protocols can’t be used to type class objects. Only instances can be statically typed to a protocol, just as only instances can be statically typed to a class. (However, at runtime, both classes and instances will respond to a conformsToProtocol: message.)”
The ivar struct lists the variables defined inside the interface of the class. In our example, DigiClockAppDelegate defines two variables:
000042E4 dword_42E4 DCD 0x14
000042E8 DCD 2
000042EC DCD _OBJC_IVAR_$_DigiClockAppDelegate.window
000042F0 DCD aWindow ; "window"
000042F4 DCD aUiwindow ; "@\"UIWindow\""
000042F8 DCD 2
000042FC DCD 4
00004300 DCD _OBJC_IVAR_$_DigiClockAppDelegate.rootViewController
00004304 DCD aRootviewcontro ; "rootViewController"
00004308 DCD aRootviewcont_0 ; "@\"RootViewController\""
0000430C DCD 2
00004310 DCD 4
The structure is similar to the one used for methods. First we have the entry size (0x14), followed by the total number of entries. Each entry contains a reference to the offset, the variable name and the type. The remaining integer values are related to the size in memory.
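Because the method, ivar and property structs all start with an entry size and an entry count, one generic reader covers the three of them. The sketch below assumes every entry is made of whole 32-bit words, which holds for the listings in this post; the pointer values in the sample data are invented, while the entry size, the count and the trailing 2 and 4 come from the ivar listing.

```python
import struct

def read_entries(buf, offset=0):
    """Read a size-prefixed list: DWORD entry size, DWORD count, then entries.

    Each entry is returned as a tuple of little-endian DWORDs.
    """
    entsize, count = struct.unpack_from("<2I", buf, offset)
    words = entsize // 4
    entries = [struct.unpack_from("<%dI" % words, buf, offset + 8 + i * entsize)
               for i in range(count)]
    return entsize, count, entries

# Ivar list shaped like the DigiClockAppDelegate one: 0x14-byte entries, 2 of them.
# The three pointers in each entry (offset, name, type) are made-up addresses.
raw = struct.pack("<2I", 0x14, 2)
raw += struct.pack("<5I", 0x49A0, 0x3B10, 0x3B20, 2, 4)
raw += struct.pack("<5I", 0x49A4, 0x3B30, 0x3B40, 2, 4)
entsize, count, ivars = read_entries(raw)
for offset_ptr, name_ptr, type_ptr, field_a, field_b in ivars:
    print(hex(name_ptr), hex(type_ptr), field_a, field_b)
```

The same function returns three-word tuples for a method struct (entry size 0xC) and two-word tuples for a properties struct (entry size 8).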
The properties struct is the last struct of the class definition, and it again uses the same layout, with the first DWORD giving the entry size and the second the number of entries. In our example, the properties apply to the two variables we saw in the previous structure (“window” and “rootViewController”):
00004364 dword_4364 DCD 8
00004368 DCD 2
0000436C DCD aRootviewcontro ; "rootViewController"
00004370 DCD aTRootviewcontr ; "T@\"RootViewController\",&,N,VrootViewCon"...
00004374 DCD aWindow ; "window"
00004378 DCD aTUiwindowNVwin ; "T@\"UIWindow\",&,N,Vwindow"
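For properties, each entry contains only the property name and the encoded attributes. As the two entries above show, the encoded string usually follows this format: T followed by the encoded type, then the comma-separated attributes, and finally V followed by the name of the backing variable. Here N represents the nonatomic attribute and “&” the retain attribute.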
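Splitting the encoded string into its parts is straightforward. This sketch assumes the attributes are separated by plain commas and that the type itself contains none, which is true for the two strings in the listing but not guaranteed in general. The C (copy) and R (readonly) codes are not used in this example; they are taken from Apple’s runtime documentation.

```python
# Attribute codes: & and N appear in the listing, C and R are documented by Apple.
FLAGS = {"&": "retain", "N": "nonatomic", "C": "copy", "R": "readonly"}

def parse_property(attributes):
    """Split an encoded property string into type, backing ivar and flags."""
    info = {"type": None, "ivar": None, "flags": []}
    for part in attributes.split(","):
        if part.startswith("T"):
            info["type"] = part[1:]        # encoded type, e.g. @"UIWindow"
        elif part.startswith("V"):
            info["ivar"] = part[1:]        # name of the backing variable
        else:
            info["flags"].append(FLAGS.get(part, part))
    return info

window = parse_property('T@"UIWindow",&,N,Vwindow')
print(window)
```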
Well, that’s all for this second part. There’s an IDAPython script that parses all this information on the zynamics GitHub. In the third and upcoming parts of Objective-C reversing we will fill the gaps in the structures, use all this information to reconstruct the application’s header files, and improve the objc_helper script that we introduced in the first part.