One of the big problems of static code analysis is function calls with non-static call targets. Depending on the current program state, such a call can reach different target functions: at one moment it calls one function, and a moment later it might call a completely different one. Popular examples of such dynamic function calls are virtual function calls (as in C++) and calls through function pointers to callback functions.
Statically finding the set of potential call targets of a dynamic function call is very difficult. While this area of program analysis has seen a lot of research in recent years, the problem is undecidable in general and can become really ugly really quickly. A simpler way to resolve the call targets of dynamic function calls is to execute the target program and log where the dynamic function calls go.
In BinNavi we have implemented a way to resolve dynamic function calls within modules as well as dynamic function calls that cross module boundaries. The general idea behind our code is this:
- Figure out where the dynamic function calls are located and put breakpoints on them
- Every time such a breakpoint is hit, execute a single step and find out where the call is going
- Keep going until enough data has been collected
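The three steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's code: `ToyDebugger` is a made-up stand-in that replays a hard-coded list of (call site, target) events instead of controlling a real process, and all addresses are invented.

```python
from collections import defaultdict


class ToyDebugger:
    """Hypothetical debugger stand-in: replays recorded (call_site, target)
    events instead of running a real target process."""

    def __init__(self, events):
        self._events = list(events)
        self._breakpoints = set()
        self._pending_target = None

    def set_breakpoint(self, address):
        self._breakpoints.add(address)

    def breakpoint_hits(self):
        """Yield the address of every executed call that has a breakpoint."""
        for call_site, target in self._events:
            if call_site in self._breakpoints:
                self._pending_target = target
                yield call_site

    def single_step(self):
        """Model executing one instruction (the call itself) and return the
        new instruction pointer, which is the call target."""
        return self._pending_target


def resolve_calls(debugger, dynamic_call_sites):
    """Map each dynamic call site to the set of targets seen at run time."""
    # Step 1: put breakpoints on the dynamic function calls.
    for site in dynamic_call_sites:
        debugger.set_breakpoint(site)

    # Steps 2 and 3: on every hit, single-step and record where the call went.
    targets = defaultdict(set)
    for call_site in debugger.breakpoint_hits():
        targets[call_site].add(debugger.single_step())
    return dict(targets)


# Invented trace: 0x401005 is seen with two different targets.
events = [
    (0x401005, 0x402000),
    (0x40100F, 0x403000),
    (0x401005, 0x402500),
    (0x401020, 0x404000),  # no breakpoint here, so it is never recorded
]
resolved = resolve_calls(ToyDebugger(events), [0x401005, 0x40100F])
for site, seen in sorted(resolved.items()):
    print(hex(site), "->", sorted(hex(t) for t in seen))
```

Collecting the targets in a set per call site means a call that always goes to the same place shows up once, while a truly polymorphic call accumulates several entries.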
You can see how it all works in the 5-minute (13 MB) Flash video that plays when you click on the image below.
Here is some more information about the process which I could not put into the video itself:
The whole Call Resolver functionality is not part of BinNavi itself but implemented as a plugin. This shows how easily users of BinNavi can extend the BinNavi GUI with new functionality and how powerful the debugging and graphing API of BinNavi is. In fact, you can download the code of the plugin here if you want to check it out yourself. This plugin was written in Java but it could have been written in Jython or JRuby as well.
Storing disassembly data in a MySQL database gives the plugin an enormous advantage: It is really, really simple to find the addresses of dynamic function calls. A single SQL query does the trick. In most other reverse engineering tools the plugin would need to go through all functions/basic blocks/instructions of the modules to find the dynamic function call instructions.
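To give a feeling for what such a query can look like, here is a rough sketch. It does not use BinNavi's real database schema: the `instructions` table and its columns are invented for this example, and an in-memory SQLite database stands in for MySQL so that the snippet runs on its own. The assumption is that a dynamic call is any `call` instruction whose operand is not an immediate address.

```python
import sqlite3

# Hypothetical schema, NOT BinNavi's actual tables: one row per instruction,
# with the kind of its operand stored alongside it.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE instructions (
        address      INTEGER PRIMARY KEY,
        mnemonic     TEXT NOT NULL,
        operand      TEXT NOT NULL,
        operand_kind TEXT NOT NULL  -- 'immediate', 'register' or 'memory'
    )
    """
)
conn.executemany(
    "INSERT INTO instructions VALUES (?, ?, ?, ?)",
    [
        (0x401000, "call", "0x402000", "immediate"),        # static call
        (0x401005, "call", "eax", "register"),              # dynamic call
        (0x40100A, "mov", "eax, ebx", "register"),          # not a call
        (0x40100F, "call", "dword ptr [ebx+8]", "memory"),  # dynamic call
    ],
)


def find_dynamic_call_sites(connection):
    """Return the addresses of all calls whose target is not an immediate."""
    rows = connection.execute(
        """
        SELECT address
        FROM instructions
        WHERE mnemonic = 'call' AND operand_kind <> 'immediate'
        ORDER BY address
        """
    ).fetchall()
    return [address for (address,) in rows]


dynamic_sites = find_dynamic_call_sites(conn)
print([hex(address) for address in dynamic_sites])
```

The point is that the filtering happens inside the database, so the plugin gets back a plain list of breakpoint addresses without walking functions, basic blocks and instructions itself.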
Setting breakpoints only on dynamic function call instructions brought a big speed improvement compared to just tracing the whole target program. As you can see in the video, the target program stays responsive enough to be used. This is very useful because it allows the user of the Call Resolver to control what functionality is executed and therefore what dynamic function calls are traced.
Of course the dynamic approach has downsides too. We need a way to execute the target program. If all we have is a non-executable memory dump of some suspicious file, we cannot use dynamic function call analysis. Even if it is possible to execute the target program, it is easy to miss function calls that are never executed, or function call targets that are never reached, while the tracer is attached to the process. This is especially true with a heuristic like the one BinNavi uses, which stops resolving function calls that “always” (really, more than 20 times) seem to go to the same target address.
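The following sketch shows how such a cutoff can hide a target. It is my own simplified reading of the heuristic, not the plugin's code: a call site is retired once it has been hit more than 20 times and has only ever gone to one target. The function name, the event list and the addresses are all made up.

```python
from collections import defaultdict

THRESHOLD = 20  # "more than 20 times", as described in the text


def trace_with_cutoff(events, threshold=THRESHOLD):
    """Replay (call_site, target) events, retiring call sites that look
    monomorphic. Returns the recorded targets and the retired call sites."""
    targets = defaultdict(set)
    hits = defaultdict(int)
    retired = set()

    for call_site, target in events:
        if call_site in retired:
            continue  # breakpoint was removed, so this call goes unobserved

        targets[call_site].add(target)
        hits[call_site] += 1

        # Assumed rule: more than `threshold` hits and still a single target.
        if hits[call_site] > threshold and len(targets[call_site]) == 1:
            retired.add(call_site)

    return dict(targets), retired


SITE = 0x401005

# 21 calls to the same target, then one call to a different target.
late_change = [(SITE, 0x402000)] * 21 + [(SITE, 0x403000)]
missed_targets, missed_retired = trace_with_cutoff(late_change)

# Only 20 calls to the same target before the second target shows up.
early_change = [(SITE, 0x402000)] * 20 + [(SITE, 0x403000)]
found_targets, found_retired = trace_with_cutoff(early_change)

print("late change: ", sorted(hex(t) for t in missed_targets[SITE]))
print("early change:", sorted(hex(t) for t in found_targets[SITE]))
```

In the first trace the call site is retired just before the second target appears, so that target is never recorded. In the second trace the breakpoint is still active and both targets are found. That is the price paid for keeping the target program responsive.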
So, what about you? I’d like to hear about your experiences with resolving dynamic function calls. Are you more of a fan of a static solution or a dynamic solution?