We’re pleased to announce that zynamics has been acquired by Google! If you’re an existing customer and do not receive our email announcement within the next 48 hours, please contact us at email@example.com. All press inquiries should be sent to firstname.lastname@example.org.
A while ago I posted a blog entry called “challenging conventional wisdom on AV signatures (Part 1 of 2)”. There, I argued that the fundamental problem with traditional AV byte signatures is the lack of information asymmetry: the defender and the attacker both have access to the same information, and the attacker can run a potentially infinite number of test runs to make sure he can reliably bypass all of the defensive measures the defender has taken.
The important thing to take away from that blog post is that the problem with AV signatures is not inherent to “signatures” – it is a matter of information symmetry.
Now, how can one change this situation? Is there a clever way to make traditional byte signatures useful again? Can we somehow introduce information asymmetry in a productive manner?
To investigate this, we have to remember another blog post where I described some of our results on generating “smart” signatures (this appears to be AV lingo for signatures that are not checksum-based, but which consist of bytes and wildcards). The summary of this blog post is more or less: “With the algorithms underpinning VxClass, we can not only automatically cluster malicious software into groups, we can also generate signatures for each group automatically. And one signature will match the entire group.”
There was one small bit of information missing in that post that will make this post interesting: We can usually generate dozens, if not hundreds, of different signatures for the same cluster of malware. These signatures match, by construction, on all samples of a particular cluster, but they have nothing in common – they match on different bits of the code.
Where does this leave us? Well, it leaves us with a pretty cool system that we call VxClass for Financials (although it is possible to substitute ‘Financials’ with other large verticals that are often victims of targeted attacks). The system works as follows:
- Different financial institutions each get a user account on a centralized VxClass server
- Users upload malicious software that they have recovered (using tools such as Memoryze) from their own systems
- Users are anonymous by default
- Users can see how malware they upload clusters; they can also see how similar their malware is to malware other users uploaded
- Users can only download their own malware, not the malware of other users
- For each cluster, users can generate a personalized detection signature that no other user will ever see
Why is this cool? Well, for one thing, every user profits from uploading to the system: the more samples are present in one cluster, the better the predictive power of the signatures. At the same time, users do not have to share any confidential information with each other. They are encouraged to, but they do not have to. Finally, even if some users of the system are sloppy and leak their signatures to the attacker, they only endanger themselves. Everybody else has their own signatures, and will thus not be affected by this signature leak. This is important: normally, when I share methods of detection with others, I risk losing them. Not here.
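To make the per-user signature idea a bit more concrete, here is a minimal sketch in Python. It is not how VxClass is implemented: the `SignaturePool` class, the `matches` helper, the three signatures and the sample bytes are all made up for illustration, and it assumes a signature is simply a sequence of byte values in which `None` acts as a wildcard.

```python
def matches(signature, sample):
    """Return True if the byte/wildcard signature occurs anywhere in sample."""
    length = len(signature)
    for start in range(len(sample) - length + 1):
        window = sample[start:start + length]
        if all(sig is None or sig == byte for sig, byte in zip(signature, window)):
            return True
    return False


class SignaturePool:
    """Hands out each signature of a cluster to at most one user."""

    def __init__(self, signatures):
        self._unused = list(signatures)
        self._assigned = {}

    def signature_for(self, user):
        # A user always gets the same signature back; a new user gets an unused one.
        if user not in self._assigned:
            if not self._unused:
                raise LookupError("no unused signatures left for this cluster")
            self._assigned[user] = self._unused.pop(0)
        return self._assigned[user]


# Hypothetical signatures for one cluster; None is a wildcard byte.
pool = SignaturePool([
    (0x55, 0x8B, None, 0x83),
    (0xE8, None, None, 0x5D),
    (0x6A, 0x00, None, 0xFF),
])

# Made-up sample from the cluster, built so that every signature matches it.
sample = bytes([0x55, 0x8B, 0xEC, 0x83, 0xE8, 0x01, 0x02, 0x5D, 0x6A, 0x00, 0x10, 0xFF])

sig_a = pool.signature_for("bank_a")
sig_b = pool.signature_for("bank_b")
print(sig_a != sig_b, matches(sig_a, sample), matches(sig_b, sample))
```

Both users detect the same sample, but with signatures that share no bytes, so a leak of one signature tells the attacker nothing about the other.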
We are starting an evaluation/beta program of the system in the next 1-2 weeks, at the moment targeted at the financial sector. If you happen to find this interesting, are working for a financial institution and want to participate in our test drive, please contact us at email@example.com!
My name is Jan Newger and I’ve just recently joined the team at zynamics. Previously, I finished my diploma thesis at the RWTH Aachen. The work was about code virtualization techniques and the mechanisms that can be applied to extract readable code from such a protection system. In the past, I mostly fiddled with reverse engineering and especially anti-reverse engineering techniques, and wrote some libraries as well as a few IDA Pro plugins.
From now on, I’ll be working together with Tim and Sebastian on BinNavi and I’m looking forward to developing my skillz in such an excellent team of researchers here at zynamics.
It’s been a long time since the first part about static Objective-C reverse engineering, so it’s time for a second one and another script to play with. In this second part we will cover static class reconstruction for Objective-C binaries. Some class reconstruction was done previously in Cameron Hotchkies’s and itsme’s work, but those were for Mac OS X binaries and, as we said in the first part, the structure of the binaries has changed, as well as the internal structures that contain information about classes. Throughout this post we will use DigiClock as the example for our analysis. Given that the application comes with source code, it will be useful for verifying our findings.
Our first step is to look into the __objc_classlist section. As the name says, this section contains a list of the classes defined inside the binary. Note that general classes such as NSObject are not included even if they are used or referenced in the source code; only the classes we implement are listed. Each DWORD in this section points to an address in __objc_data where the class definition is stored. For example:
00004980 DCD _OBJC_CLASS_$_DigiClockAppDelegate
00004984 DCD _OBJC_CLASS_$_FlipsideView
00004988 DCD _OBJC_CLASS_$_MainView
0000498C DCD _OBJC_CLASS_$_MainViewController
00004990 DCD _OBJC_CLASS_$_RootViewController
00004994 DCD _OBJC_CLASS_$_FlipsideViewController
And going to _OBJC_CLASS_$_DigiClockAppDelegate:
000040D8 _OBJC_CLASS_$_DigiClockAppDelegate DCD _OBJC_METACLASS_$_DigiClockAppDelegate
000040DC DCD 0
000040E0 DCD 0
000040E4 DCD 0
000040E8 DCD dword_43AC
Here we have our first struct, which we can define as follows (all fields are 4 bytes in size):
|Offset|Field|
|---|---|
|0|Pointer to Meta-Class|
|4|Pointer to Super-Class|
|8|Pointer to class cache|
|12|Pointer to vtable related struct|
|16|Pointer to class definition|
As you can see (if you’re looking into the binary while reading this, which you should) there’s always a meta-class, but in this case the super-class field is empty. Super-classes are usually references to high-level classes like NSObject, UIViewController, UITableViewController, etc.
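As a rough illustration, the five-DWORD class struct can be unpacked with Python’s `struct` module. This is only a sketch: the `ObjcClass` tuple and its field names are my own labels, and the meta-class address `0x40C4` is a made-up placeholder, since the listing only shows a symbol name for it. The other values come from the listing.

```python
import struct
from collections import namedtuple

# Field names are illustrative labels for the five pointers described in the text.
ObjcClass = namedtuple("ObjcClass", "metaclass superclass cache vtable definition")


def parse_class(data, offset=0):
    """Unpack five little-endian 32-bit pointers starting at offset."""
    return ObjcClass(*struct.unpack_from("<5I", data, offset))


# Bytes as they would appear at 0x40D8 in the example.
# 0x40C4 is a placeholder for the meta-class address, which the listing does not give.
raw = struct.pack("<5I", 0x40C4, 0, 0, 0, 0x43AC)

cls = parse_class(raw)
print(hex(cls.definition))  # 0x43ac
```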
The cache and vtable struct are also empty so let’s move to the class definition:
|Offset|Field|
|---|---|
|0|Boolean that indicates if it’s a meta-class|
|4|Instance size (disk?)|
|8|Instance size (memory?)|
|16|Pointer to class name (ASCII)|
|20|Pointer to method struct (list of implemented methods)|
|24|Pointer to protocol struct (inherited protocols)|
|28|Pointer to ivar names (list of declared variables)|
|36|Pointer to properties struct (list with encoded types)|
About the two “instance size” fields: it is unclear which one refers to the disk size and which to the memory size; my guess is given in parentheses.
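The class definition can be read the same way, one DWORD per documented offset. The sketch below is an assumption-laden illustration: the field names are my own, the two instance sizes and the name pointer are placeholders, and the method, protocol, ivar and properties pointers are assumed to be the addresses of the listings shown in this post (0x4314, 0x437C, 0x42E4 and 0x4364).

```python
import struct

# Offsets come from the table; the names are only labels chosen for this sketch.
CLASS_DEF_FIELDS = {
    "is_metaclass": 0,
    "instance_size_a": 4,
    "instance_size_b": 8,
    "name": 16,
    "methods": 20,
    "protocols": 24,
    "ivars": 28,
    "properties": 36,
}


def parse_class_definition(data, offset=0):
    """Read one little-endian DWORD at each known field offset."""
    return {
        field: struct.unpack_from("<I", data, offset + field_offset)[0]
        for field, field_offset in CLASS_DEF_FIELDS.items()
    }


# Ten DWORDs (40 bytes). The sizes (4, 12) and the name pointer (0x3F00)
# are placeholders; the DWORDs at offsets 12 and 32 are left as zero.
raw = struct.pack("<10I", 0, 4, 12, 0, 0x3F00, 0x4314, 0x437C, 0x42E4, 0, 0x4364)

definition = parse_class_definition(raw)
print({field: hex(value) for field, value in definition.items()})
```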
The method struct is a simple struct that contains a lot of useful information. It works like an array of method definitions: the first field indicates the size of each “method definition” (or field) and the second one the total number of fields. Following the example, DigiClockAppDelegate has 6 methods:
00004314 dword_4314 DCD 0xC
00004318 DCD 6
0000431C DCD aSetwindow ; "setWindow:"
00004320 DCD aV12048 ; "v12@0:4@8"
00004324 DCD __DigiClockAppDelegate_setWindow__+1
00004328 DCD aWindow ; "window"
0000432C DCD a804 ; "@8@0:4"
00004330 DCD __DigiClockAppDelegate_window_+1
00004334 DCD aSetrootviewcon ; "setRootViewController:"
00004338 DCD aV12048 ; "v12@0:4@8"
0000433C DCD __DigiClockAppDelegate_setRootViewController__+1
00004340 DCD aRootviewcontro ; "rootViewController"
00004344 DCD a804 ; "@8@0:4"
00004348 DCD __DigiClockAppDelegate_rootViewController_+1
0000434C DCD aDealloc ; "dealloc"
00004350 DCD aV804 ; "v8@0:4"
00004354 DCD __DigiClockAppDelegate_dealloc_+1
00004358 DCD aApplicationdid ; "applicationDidFinishLaunching:"
0000435C DCD aV12048 ; "v12@0:4@8"
00004360 DCD __DigiClockAppDelegate_applicationDidFinishLaunching__+1
The information stored in every field is, in order: the method name, an encoded string that specifies the function prototype (return value and parameter types), and the method address. The encoded prototype can look a bit tricky at first, but with the help of some available information we can see that the setWindow prototype would be something like:
void setWindow(self@0, id@8)
We know that id is a class instance, but we don’t know which one. And that’s all about methods for now.
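The encoded prototypes in the listing can be split mechanically. The sketch below handles only simple single-character type codes followed by a number, which is all the DigiClock examples need. `decode_method_types` and `TYPE_NAMES` are names invented for this example. In these strings the number after the return type is the total size of the arguments, and the `:` at offset 4 is the selector that every method receives right after `self`.

```python
import re

# A handful of single-character Objective-C type codes; anything else is passed through.
TYPE_NAMES = {
    "v": "void", "@": "id", ":": "SEL",
    "c": "char", "i": "int", "f": "float", "d": "double",
}


def decode_method_types(encoded):
    """Split e.g. 'v12@0:4@8' into (return type, argument size, [(type, offset), ...])."""
    if not re.fullmatch(r"(?:\D\d+)+", encoded):
        raise ValueError("unsupported type encoding: %r" % encoded)
    tokens = re.findall(r"(\D)(\d+)", encoded)
    (ret_code, frame_size), arg_tokens = tokens[0], tokens[1:]
    args = [(TYPE_NAMES.get(code, code), int(offset)) for code, offset in arg_tokens]
    return TYPE_NAMES.get(ret_code, ret_code), int(frame_size), args


ret, size, args = decode_method_types("v12@0:4@8")
print(ret, size, args)  # void 12 [('id', 0), ('SEL', 4), ('id', 8)]
```

Encodings that embed class names or structs (such as the ivar type strings further down) are rejected with a `ValueError` rather than guessed at.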
The protocol struct has two parts. The first one specifies how many protocols that class inherits:
0000437C dword_437C DCD 1
00004380 DCD dword_49B8
In this case, as shown by the first DWORD, there is only one protocol. The second field points to the protocol definition:
|Offset|Field|
|---|---|
|4|Pointer to name of the class the protocol is inherited from.|
|8|Pointer to protocol struct (N-th level inheritance)|
|12|Pointer to a method struct (instance methods)|
|20|Pointer to a method struct (class methods).|
Here we have a recursive reference, where a protocol usually points to higher-level protocols. It would be possible to build a hierarchy tree using this information but unfortunately, in most cases protocol information is related to standard classes (UIApplicationDelegate, NSObject…), so the resulting tree tends to be the same or quite similar.
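Both parts of the protocol struct are easy to read programmatically. The following sketch assumes little-endian 32-bit values. The function and field names are my own, the protocol list bytes are the ones from the listing (count 1, pointer 0x49B8), and the protocol definition bytes are placeholders because the post does not show them.

```python
import struct


def parse_protocol_list(data, offset=0):
    """Read a count DWORD followed by that many protocol pointers."""
    (count,) = struct.unpack_from("<I", data, offset)
    return list(struct.unpack_from("<%dI" % count, data, offset + 4))


# Offsets from the table; names are labels for this sketch only.
PROTOCOL_FIELDS = {"name": 4, "protocols": 8, "instance_methods": 12, "class_methods": 20}


def parse_protocol(data, offset=0):
    """Read one DWORD at each documented offset of a protocol definition."""
    return {
        field: struct.unpack_from("<I", data, offset + field_offset)[0]
        for field, field_offset in PROTOCOL_FIELDS.items()
    }


# Protocol list as shown at 0x437C in the listing.
raw_list = struct.pack("<2I", 1, 0x49B8)
protocol_pointers = parse_protocol_list(raw_list)
print([hex(pointer) for pointer in protocol_pointers])  # ['0x49b8']

# Placeholder definition: name at 0x3F20, no parent protocols, methods at 0x4400.
raw_protocol = struct.pack("<6I", 0, 0x3F20, 0, 0x4400, 0, 0)
protocol = parse_protocol(raw_protocol)
```

Following the `protocols` pointer of each definition with `parse_protocol_list` again would give the hierarchy tree mentioned above.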
About the difference between instance and class methods, the “Objective-C Programming Language” says:
“Protocols can’t be used to type class objects. Only instances can be statically typed to a protocol, just as only instances can be statically typed to a class. (However, at runtime, both classes and instances will respond to a conformsToProtocol: message.)”
The ivar struct lists the variables defined inside the interface of the class. In our example, DigiClockAppDelegate defines two variables:
000042E4 dword_42E4 DCD 0x14
000042E8 DCD 2
000042EC DCD _OBJC_IVAR_$_DigiClockAppDelegate.window
000042F0 DCD aWindow ; "window"
000042F4 DCD aUiwindow ; "@\"UIWindow\""
000042F8 DCD 2
000042FC DCD 4
00004300 DCD _OBJC_IVAR_$_DigiClockAppDelegate.rootViewController
00004304 DCD aRootviewcontro ; "rootViewController"
00004308 DCD aRootviewcont_0 ; "@\"RootViewController\""
0000430C DCD 2
00004310 DCD 4
The structure is similar to the one used with methods. First we have a field size (0x14) followed by the total number of fields. Each field contains information about the offset, the variable name and the type. The remaining integer values are related to the size in memory.
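Since the method and ivar structs share the same “field size, number of fields, fields” layout, one small parser covers both. This is a sketch under the assumption that every field is made of little-endian DWORDs. `parse_entry_list` is a name I made up, and the pointer values in the sample data are placeholders because the listing only shows symbol names; the trailing 2 and 4 of each entry are taken from the listing.

```python
import struct


def parse_entry_list(data, offset=0):
    """Parse: entry size, entry count, then the entries (as tuples of DWORDs)."""
    entry_size, count = struct.unpack_from("<2I", data, offset)
    if entry_size % 4:
        raise ValueError("entry size %d is not a multiple of 4" % entry_size)
    words_per_entry = entry_size // 4
    entries = []
    position = offset + 8
    for _ in range(count):
        entries.append(struct.unpack_from("<%dI" % words_per_entry, data, position))
        position += entry_size
    return entries


# Ivar list modelled on the one at 0x42E4: field size 0x14, two fields.
# The three pointers of each entry (offset, name, type) are placeholders.
raw = (
    struct.pack("<2I", 0x14, 2)
    + struct.pack("<5I", 0x4A00, 0x3E00, 0x3E10, 2, 4)
    + struct.pack("<5I", 0x4A04, 0x3E20, 0x3E30, 2, 4)
)

ivars = parse_entry_list(raw)
for offset_ptr, name_ptr, type_ptr, *rest in ivars:
    print(hex(offset_ptr), hex(name_ptr), hex(type_ptr), rest)
```

Fed with a method struct (field size 0xC), the same function returns one `(name, types, implementation)` tuple per method.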
The properties struct is the last struct of the class definition, and again uses the same structure, with the first DWORD telling the field size and the second the number of fields. In our example, the properties are applied over the two variables we saw in the previous structure (“window” and “rootViewController”):
00004364 dword_4364 DCD 8
00004368 DCD 2
0000436C DCD aRootviewcontro ; "rootViewController"
00004370 DCD aTRootviewcontr ; "T@\"RootViewController\",&,N,VrootViewCon"...
00004374 DCD aWindow ; "window"
00004378 DCD aTUiwindowNVwin ; "T@\"UIWindow\",&,N,Vwindow"
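For properties, fields contain only the variable name and the encoded properties. As the two strings above show, the encoded string usually starts with “T” followed by the encoded type, continues with comma-separated attributes, and ends with “V” followed by the variable name. Here N represents the nonatomic property and “&” the retain property.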
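The property strings from the listing can be taken apart with a few lines of Python. This is a minimal sketch that only knows the two attribute codes mentioned in the text and passes any other code through unchanged. `parse_property_attributes` is a name chosen for this example, and it assumes the encoded type itself contains no commas.

```python
# Only the two attribute codes discussed in the text are translated.
ATTRIBUTE_NAMES = {"&": "retain", "N": "nonatomic"}


def parse_property_attributes(encoded):
    """Split an encoded property string into type, attributes and backing variable."""
    parts = encoded.split(",")
    if not parts[0].startswith("T"):
        raise ValueError("expected the string to start with 'T': %r" % encoded)
    result = {"type": parts[0][1:], "attributes": [], "ivar": None}
    for part in parts[1:]:
        if part.startswith("V"):
            result["ivar"] = part[1:]
        else:
            result["attributes"].append(ATTRIBUTE_NAMES.get(part, part))
    return result


window = parse_property_attributes('T@"UIWindow",&,N,Vwindow')
print(window)
```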
Well, that’s all for this second part. There’s an IDAPython script that parses all this information on the zynamics GitHub. In the third and upcoming parts of Objective-C reversing we will be filling the gaps in the structures and using all this information to reconstruct the application’s header files and to improve the objc_helper script that we introduced in the first part.
A few days ago, from May 21st to 24th, DDTEK organized the Defcon 18 Capture The Flag qualifiers. For those of you who are not familiar with this kind of contest, Defcon CTF is a hacking offense/defense contest held during the conference in Vegas. To decide who plays in the final round, an online competition takes place beforehand to select the 9 top teams that will join last year’s winner. The qualification contest contained 30 challenges across different categories: Pursuits Trivial (general questions), Crypto Badness (cryptography), Packet Madness (network traffic analysis), Binary L33tness (reversing), Pwtent Pwnables (exploiting) and Forensics. We at zynamics had a couple of guys playing in different teams, so we decided to join the writeup fever and release a solution for the Binary L33tness 400 challenge.
- Fixed a problem with tracking the R0 register when it is modified by previous calls. Now if the script is tracking R0 and finds a BX/BLX, it assumes that the call modifies R0 and stops, marking the tracking as failed.
- Changed the way the script parses the data references so it works with both release and debug binaries. Instead of getting the raw offset we now use recursive calls to idautils.DataRefsFrom(). For the references to work properly we had to add a preprocessing step that converts all dwords to offsets in the classrefs and superrefs sections (similar to the offsetize() used by KennyTM).
- In some cases, the compiler can decide to use LR as a general register, so the search for R0..R15 fails. The script now handles this special case.
- Added a check for Thumb/non-Thumb code so the calls are patched correctly.
- Fixed a bug that was getting the incorrect parameters for other flavours of msgSend(). Now it should be easier to add others.
Thanks a lot to everybody who reported bugs, and also to the beta testers!
Soon we will be back with Objective-C reversing part II, with more improvements and details on static analysis. Stay tuned!