Resolving dynamic function calls with BinNavi

One of the big problems of static code analysis is function calls with non-static call targets. These calls can reach different target functions depending on the current program state: at one moment they call one function, and at the next they might call a completely different one. Popular examples of such dynamic function calls are virtual functions (as in C++) and function pointers to callback functions.

Statically finding the set of potential call targets of a dynamic function call is very difficult. While this area of program analysis has seen a lot of research in recent years, the problem is undecidable in general and can become really ugly really quickly. A simpler way to resolve the call targets of dynamic function calls is to execute the target program and log where the dynamic function calls are going.

In BinNavi we have implemented a way to resolve dynamic function calls within modules as well as dynamic function calls that cross module boundaries. The general idea behind our code is this:

  • Figure out where the dynamic function calls are located and put breakpoints on them
  • Every time such a breakpoint is hit, execute a single step and find out where the call is going
  • Keep going until enough data has been collected
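The three steps above can be sketched in a few lines of Python. This is only a toy model and not the plugin's code: the `TRACE` list is a made-up stand-in for a debugger, where each entry holds the address of an executed instruction and the address reached after one single step. All addresses and the `resolve_calls` name are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical trace standing in for a live debugger session:
# (address of executed instruction, address reached after one single step).
TRACE = [
    (0x401000, 0x401005),    # ordinary instruction, no breakpoint set here
    (0x401010, 0x402000),    # dynamic call going to 0x402000
    (0x401010, 0x403000),    # same call site, different target this time
    (0x401020, 0x71AB0000),  # dynamic call that leaves the module
    (0x401010, 0x402000),    # a target that was already seen
]

def resolve_calls(call_sites, trace):
    """Record where each dynamic call site went while the trace ran."""
    breakpoints = set(call_sites)          # step 1: breakpoints on the call sites
    targets = defaultdict(set)
    for address, next_address in trace:
        if address in breakpoints:         # step 2: breakpoint hit, single step
            targets[address].add(next_address)
    # step 3 is implicit here: the loop ends when the trace is exhausted
    return {site: sorted(found) for site, found in targets.items()}

resolved = resolve_calls([0x401010, 0x401020], TRACE)
for site, found in sorted(resolved.items()):
    print(hex(site), "->", [hex(t) for t in found])
```

In a real debugger the loop would of course be driven by breakpoint events instead of a prepared list, and "enough data" would be decided by the user or by a heuristic.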

You can see how it all works in the 5-minute (13 MB) Flash video, which you can watch by clicking on the image below.

[Image: Resolved dynamic function calls to ws2_32.dll]

Here is some more information about the process which I could not put into the video itself:

The whole Call Resolver functionality is not part of BinNavi itself but implemented as a plugin. This shows how easily users of BinNavi can extend the BinNavi GUI with new functionality and how powerful the debugging and graphing API of BinNavi is. In fact, you can download the code of the plugin here if you want to check it out yourself. This plugin was written in Java but it could have been written in Jython or JRuby as well.

Storing disassembly data in a MySQL database gives the plugin an enormous advantage: It is really, really simple to find the addresses of dynamic function calls. A single SQL query does the trick. In most other reverse engineering tools the plugin would need to go through all functions/basic blocks/instructions of the modules to find the dynamic function call instructions.
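To give a rough idea of what such a query could look like, here is a small sketch. It does not use the real BinNavi database schema: the `instructions` table, its columns, and the sample rows are invented for this example, and Python's built-in `sqlite3` module stands in for MySQL so that the snippet runs on its own.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL disassembly database.
conn = sqlite3.connect(":memory:")

# Hypothetical, heavily simplified schema (not the real BinNavi tables).
conn.execute(
    "CREATE TABLE instructions ("
    " address INTEGER PRIMARY KEY,"
    " mnemonic TEXT,"
    " operand_kind TEXT)"
)
conn.executemany(
    "INSERT INTO instructions VALUES (?, ?, ?)",
    [
        (0x401000, "push", "register"),
        (0x401001, "call", "immediate"),  # static call target
        (0x401006, "call", "register"),   # e.g. call eax
        (0x401008, "mov", "memory"),
        (0x40100E, "call", "memory"),     # e.g. call [ebx+8]
    ],
)

# One query finds every call whose target is not a fixed address.
rows = conn.execute(
    "SELECT address FROM instructions"
    " WHERE mnemonic = 'call' AND operand_kind <> 'immediate'"
    " ORDER BY address"
).fetchall()

dynamic_call_sites = [address for (address,) in rows]
print([hex(address) for address in dynamic_call_sites])
```

The resulting address list is exactly what a tracer needs as input for setting its breakpoints.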

Setting breakpoints only on dynamic function call instructions brought a big speed improvement compared to just tracing the whole target program. As you can see in the video, the target program stays responsive enough to be used. This is very useful because it allows the user of the Call Resolver to control what functionality is executed and therefore what dynamic function calls are traced.

Of course the dynamic approach has downsides too. We have to have a way to execute the target program. If all we have is a non-executable memory dump of some suspicious file, then we cannot use dynamic function call analysis. Even if it is possible to execute the target program, it is easy to miss function calls that are never executed or function call targets that are never reached while the tracer is attached to the process. This is especially true if you have a heuristic like the one BinNavi has, where you stop resolving function calls that “always” (really, more than 20 times) seem to go to the same target address.
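The effect of such a cutoff can be illustrated with a small sketch. This is one possible reading of the heuristic, not the plugin's actual logic: a call site is retired (its breakpoint is removed) once it has been hit more than `limit` times and has only ever gone to a single target. The function name, the `limit` parameter, and the addresses are assumptions.

```python
from collections import defaultdict

def resolve_with_cutoff(hits, limit=20):
    """hits: (call_site, target) pairs in the order the calls are executed."""
    targets = defaultdict(set)
    hit_count = defaultdict(int)
    retired = set()
    for site, target in hits:
        if site in retired:
            continue  # breakpoint was removed, so this call goes unobserved
        targets[site].add(target)
        hit_count[site] += 1
        # Assumed rule: more than `limit` hits and still only one target seen.
        if len(targets[site]) == 1 and hit_count[site] > limit:
            retired.add(site)
    return {site: sorted(found) for site, found in targets.items()}, retired

# 21 calls to the same target, then one call to a different target.
late_hits = [(0x401010, 0x402000)] * 21 + [(0x401010, 0x403000)]
found_late, retired_late = resolve_with_cutoff(late_hits)

# Only 20 calls to the same target before the second target shows up.
early_hits = [(0x401010, 0x402000)] * 20 + [(0x401010, 0x403000)]
found_early, retired_early = resolve_with_cutoff(early_hits)

print(found_late, retired_late)
print(found_early, retired_early)
```

In the first run the second target is missed because the call site was already retired; in the second run it is still recorded. That is the trade-off between tracing speed and completeness.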

So, what about you? I’d like to hear about your experiences with resolving dynamic function calls. Are you more of a fan of a static solution or a dynamic solution?

6 Responses to “Resolving dynamic function calls with BinNavi”

  1. Ange says:

    Interesting.
    It would be easier to watch if it was in a lower resolution though.

    • Sebastian Porst says:

      Hi Ange,

      true, other people have complained about this too. The thing is, I consider BinNavi to be unusable at a screen resolution of 1024×768. The graphs that show the disassembled code are unfortunately not very compact yet so at low resolutions you just never see a lot of code on the screen.

      I’ll think about making future videos at a lower resolution.

  2. Hi,

    There's a big problem with this solution: code coverage.

    If the process doesn't hit the dynamic calls at runtime, you can never know the addresses …

    I'm researching static solutions for the most popular compilers and their optimization options; it will soon be published on my blog. I hope so, he he he ..

    Regards.

    • Sebastian Porst says:

      Hi Ivan,

      yeah, the problem is mentioned in my post. I still think it’s a really good approach for getting a first impression of what is going on.

      I would prefer a static solution too, but static solutions can be arbitrarily difficult (even undecidable). How does your solution work? There is a lot of research into points-to analysis ( http://en.wikipedia.org/wiki/Pointer_analysis ) and I am wondering if you are using it or doing your own thing.

  3. Ivan Hernanz says:

    Hi Sebastian,
    Excuse my English; I am Spanish.

    Yes, my solution works with shape analysis and other techniques, like SuperGraph and so on.

    I talked with Ero Carrera about implementing the POC with monoREIL, but I have some questions:

    1. Can I change REIL by adding more instructions? How?
    2. Can the monotone framework (monoREIL) work with other sets of values that are not lattices? If yes, how can I implement this?
    3. I guess I can implement any type of flow function, is that true? (For the other sets of values.)
    4. Can I change the fields and tables of the database schema to introduce more BBs?

    I would like to work with your APIs and to collaborate with you. Are you interested?

    Regards,

    • Sebastian Porst says:

      Hi Ivan,

      it’s good to hear that you are doing stuff like shape analysis.

      Now, for your questions:

      1. Are you asking whether you can change the REIL language itself (as in adding your own custom REIL instructions)? Or are you asking whether you can add new translators that translate native instructions into REIL code? The answer to the first question is ‘yes’, the answer to the second question is ‘yes, but probably not easily through the API’.

      2. Hm, a difficult question. I have never thought about this. What other sets of values are you thinking about?

      3. Yes.

      4. Yes, but depending on what you want to do it might be good to talk to me first.

      Of course I am interested in collaboration. I think Ero has given you our support instant messengers. You can contact me there (or at sebastian.porst@zynamics.com).