As mentioned in the conference circus post, last month I also gave a talk at CodeGate 2010. Just after Thomas’s talk about BinCrowd it was my turn to present my “Introduction to mobile reversing”:
As you can see in the slides (Flash required, direct pdf download here), the presentation was divided into three parts: Windows Mobile, Android and iPhoneOS. Almost half of the time was spent in the last section talking about iPhone applications and Objective-C reversing.
One of the main problems in iPhone application reverse engineering is dealing with Objective-C artifacts. This is not a new issue: everything built on top of the Cocoa framework has been using Objective-C for a long time. Proof of that is the existence of previous work in the reversing field (Cameron Hotchkies, itsme), but those scripts have a problem with iPhone applications: they are crafted to analyze x86 binaries (Intel Mac OS X). That means the scripts try to parse x86 assembly code and look for compiler structures in certain sections/segments. Both the sections and the structures used are completely different for ARM iPhoneOS binaries. Other scripts are specific to the iPhone (KennyTM), but they don’t address what is, in my opinion, the most annoying Objective-C side effect: broken callgraphs and cross-references.
During my talk at CodeGate, I pointed out that one of the main drawbacks is that Objective-C leaves us with a useless callgraph, because all calls (read “all” not as 100% but as a really high percentage) go to the function objc_msgSend(), as you can see in this sample callgraph:
The red dots are methods and the arrows represent calls. There’s only one arrow per caller/callee pair because we removed duplicates to simplify the graph. The three top-called functions are three different flavours of msgSend: the standard one (objc_msgSend); the one that sends messages to the superclass (objc_msgSendSuper); and the variant of the standard one that returns a struct instead of an integer (objc_msgSend_stret).
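To make the “one arrow per pair” idea concrete, here is a minimal sketch in plain Python. It is not the code used to produce the graph above: the call records and method names are made up, and the DOT text is built by hand so the example has no dependencies.

```python
from collections import Counter

# Hypothetical (caller, callee) records; the method names are made up.
calls = [
    ("-[AppDelegate start]", "objc_msgSend"),
    ("-[AppDelegate start]", "objc_msgSend"),       # duplicate edge
    ("-[AppDelegate start]", "objc_msgSendSuper"),
    ("-[LoginView draw]", "objc_msgSend"),
    ("-[LoginView draw]", "objc_msgSend_stret"),
    ("-[LoginView draw]", "objc_msgSend"),          # duplicate edge
]

def unique_edges(call_list):
    """Drop duplicate (caller, callee) pairs, keeping first-seen order."""
    return list(dict.fromkeys(call_list))

def top_callees(edges, n=3):
    """Count how many distinct callers reach each callee."""
    return Counter(callee for _, callee in edges).most_common(n)

def to_dot(edges):
    """Render the edges as Graphviz DOT text."""
    lines = ["digraph callgraph {"]
    lines += ['    "%s" -> "%s";' % (src, dst) for src, dst in edges]
    lines.append("}")
    return "\n".join(lines)

edges = unique_edges(calls)
print(to_dot(edges))
print(top_callees(edges))
```

Even in this tiny sample the msgSend variants are the only callees, which is exactly why the raw callgraph says so little about the program.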
Continuing with the presentation, in the last part I used a script to patch the calls to objc_msgSend() and make the callgraph a bit more useful. Basically, the script parses the calls to objc_msgSend() and traces the arguments passed in R0 and R1, which hold the target class and the method name. With that information, the script creates a new segment in the binary where it places dummy functions named “classname_methodname” and patches each call to objc_msgSend() to point to the corresponding dummy function. The callgraph after using the script looks a bit better (dummy functions in blue):
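The idea can be sketched with a toy model in plain Python. This is not the released script and it uses no IDA API: instructions are simple tuples, and LDR_CLASS and LDR_SEL are made-up pseudo-ops standing in for the real ARM load sequences.

```python
# Toy model: each instruction is (address, mnemonic, operands).
# LDR_CLASS / LDR_SEL are hypothetical pseudo-ops, not real ARM instructions.
code = [
    (0x1000, "LDR_CLASS", ("R0", "NSString")),
    (0x1004, "LDR_SEL", ("R1", "length")),
    (0x1008, "BL", ("objc_msgSend",)),
]

def last_write(code, index, reg):
    """Walk backwards from code[index] and return the value last loaded into reg."""
    for _, mnem, ops in reversed(code[:index]):
        if mnem in ("LDR_CLASS", "LDR_SEL") and ops[0] == reg:
            return ops[1]
    return None

def patch_msgsend_calls(code):
    """Retarget each resolvable objc_msgSend call to a named dummy function."""
    patched, dummies = [], []
    for i, (addr, mnem, ops) in enumerate(code):
        if mnem == "BL" and ops[0] == "objc_msgSend":
            cls = last_write(code, i, "R0")   # target class
            sel = last_write(code, i, "R1")   # method name
            if cls and sel:
                name = "%s_%s" % (cls, sel)
                if name not in dummies:
                    dummies.append(name)      # would live in the new segment
                ops = (name,)
        patched.append((addr, mnem, ops))
    return patched, dummies

patched, dummies = patch_msgsend_calls(code)
print(patched[-1])
print(dummies)
```

Calls whose arguments cannot be resolved are left pointing at objc_msgSend, so the sketch never invents a target it did not see.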
As promised, the script has been released and you can download it from the zynamics GitHub account. This is an initial version with only the objc_msgSend() patching. We will be updating the script with the tricks disclosed in further posts, including a few improvements and adding static Objective-C class reconstruction.
Could you write a small tutorial about how to use the script ? Do we just have to run it over the binary located inside our .ipa file ? Does it help to compile for Debug in order to retain symbols ?
Thanks for this, very interesting graphs.
Sure, I promise to update the readme file asap.
About your questions… you’re right on the first one: just extract the ipa, open the Mach-O binary using IDA and run the script. The second question is an interesting one. The symbols and class information included in the binaries are the same, but Debug-compiled binaries use some arithmetic ops to reference the selectors that the current “register tracking” doesn’t like so much. In short, at this moment it’s better to work with Release binaries 😉
Interesting work, thank you. Out of curiosity, what application did you use to generate those callgraphs?
pydot to export the callgraph from IDA and OmniGraffle to lay it out.
I wish I had a bigger monitor to admire the graphs. 🙂 Good job, Tora. Keep it up!
That script looks a little dangerous to me. Say you had this code:
NSData *data = [[NSString stringWithFormat:@"blah"] dataUsingEncoding: NSUTF8StringEncoding];
NSLog(@"%d", [data length]);
as far as I can tell, the [data length] objc_msgSend call will be patched to _[NSString_length]
Right, the script doesn’t do anything special when tracking R0, so if there are calls/messages in the middle the result is not correct. I’ll add a temporary fix for that, and in later posts we’ll see what we can get from the class reconstruction (return types), identifying the use of “self”… which will improve this same part a bit. Thanks for pointing that out 😉
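One possible shape for such a temporary fix, sketched as a toy model in plain Python (an assumption for illustration, not the code that went into the script): stop tracking R0 as soon as an earlier call is found, because R0 then holds a return value of unknown type. LDR_CLASS and LDR_SEL are made-up pseudo-ops, and the instruction list mirrors the example from the comment above.

```python
def resolve_receiver(code, index):
    """Return the class loaded into R0 before code[index], or None when an
    earlier call may have overwritten R0 with its return value."""
    for _, mnem, ops in reversed(code[:index]):
        if mnem == "BL":
            return None  # R0 now holds a return value of unknown type
        if mnem == "LDR_CLASS" and ops[0] == "R0":
            return ops[1]
    return None

# Hypothetical instruction stream for the chained messages in the comment.
code = [
    (0x2000, "LDR_CLASS", ("R0", "NSString")),
    (0x2004, "LDR_SEL", ("R1", "stringWithFormat:")),
    (0x2008, "BL", ("objc_msgSend",)),
    (0x200C, "LDR_SEL", ("R1", "dataUsingEncoding:")),
    (0x2010, "BL", ("objc_msgSend",)),
    (0x2014, "LDR_SEL", ("R1", "length")),
    (0x2018, "BL", ("objc_msgSend",)),
]

call_indexes = [i for i, ins in enumerate(code) if ins[1] == "BL"]
receivers = [resolve_receiver(code, i) for i in call_indexes]
print(receivers)
```

With this check only the first message resolves to NSString; the other two stay unresolved instead of being mislabelled, until return types are available from class reconstruction.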
good job
but instead of IDA Python you might have created an IDC script; it would be nicer and simpler.
Awesome job Tora, nice to read you once again, as usual.
Keep up the good work!
@tasz uhm. which part is nicer and simpler about IDC (than python), please? i think I missed something.
did you fix the ‘bad character’ issues or you’ve abandoned the project?
The project is still alive. We’ve been a bit busy but fixes for the ‘bad character’ and other issues you people kindly reported are already being tested. If nothing strange happens it will be released this week.
[…] script update By Jose Duart The objc_helper script we presented earlier in Objective-C Reversing Part I has been updated. Check the new version in Zynamics’ GitHub. This is a summary of the main […]
cool stuff.
I’m still getting bad character errors. Was this supposed to have been fixed in the 2010/05/25 release?
86C44: can't rename byte as '_[_‡¤K4]' because it contains a bad character '‡'.
Hi Kevin,
some issues were in fact fixed in that release, but due to the way I do register backtracking (to analyze the arguments to msgSend), any new code optimization that I didn’t see/analyze can still produce wrong names. If it is still (or again) happening a lot, as a quick patch I could add checks to allow only alphanumeric names. If the sample giving problems is public, send me an email and I’ll take a look 😉
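A minimal sketch of what such a check could look like, in plain Python. The function name, the accepted character set and the mapping of “:” to “_” are all assumptions for illustration; the characters IDA actually accepts in names may differ.

```python
import re

# Assumption: only ASCII letters, digits and underscore count as safe.
ALNUM = re.compile(r"[A-Za-z0-9_]+")

def safe_dummy_name(class_name, selector):
    """Return 'class_selector' if both parts are plain ASCII alphanumerics
    (after mapping ':' to '_'), otherwise None so the caller can skip it."""
    parts = [class_name, selector.replace(":", "_")]
    if all(ALNUM.fullmatch(part) for part in parts):
        return "_".join(parts)
    return None

print(safe_dummy_name("NSString", "length"))
print(safe_dummy_name("_\u2021\u00a4K4", "length"))  # garbage from bad backtracking
```

Returning None rather than a cleaned-up string keeps a badly tracked argument from ever becoming a plausible-looking name.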
Jose, Thanks! I can’t seem to locate an email address for you, though.
[…] iPhone app by Hana Bank, and it was created by Jose Duart using IDA and pydot for his presentation “Introduction to mobile reversing” at CodeGate 2010, a security conference held in Seoul, South Korea. The image originally had […]
Is there something like this, but annotates the output of otool, instead of requiring IDA Pro?
[…] first and simplest technique was to hinder quick Objective-C method name retrieval scripts; this is certainly the least interesting of the transforms, but would remove a large amount of […]