Code coverage and BinNavi

I have already explained in my previous posts how much I love static analysis; nonetheless, sometimes you have to get your hands dirty and use a debugger. In this post we will take a look at the BinNavi debugging APIs and how to use them to create a code coverage plugin. In an earlier blog post I described how to use BinNavi “without BinNavi”, so to fully understand the rest of this post it is probably best to read that one first.

We implement code coverage at the basic block level; that is, we set a breakpoint at the beginning of each basic block inside a module. So the first thing to do is to retrieve the basic blocks of a given module. BinNavi exports a method that reads the start address of each basic block of a given module directly from the database, instead of iterating through the functions and retrieving the basic block structures. Note, though, that this method cannot be used to modify the basic block structures.

[sourcecode language="python"]
for module in mods:
    addresses = ModuleHelpers.getBasicBlockAddresses(module)
    for address in addresses:
        addr = address.toLong()
        # filter them using user-supplied lower and upper bound addresses
        if start_addr <= addr <= end_addr:
            blocks.append(addr)

print "Total basic blocks", len(blocks)[/sourcecode]
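The bounds filter itself does not depend on BinNavi and can be tried on plain integers. The following is only a minimal sketch, assuming the addresses have already been converted with toLong(); `filter_blocks` and the sample addresses are hypothetical and not part of the BinNavi API. Unlike the snippet above, it also drops duplicates, so that each block would count only once towards the total.

```python
def filter_blocks(addresses, start_addr, end_addr):
    """Return the sorted, de-duplicated addresses within [start_addr, end_addr]."""
    return sorted(set(a for a in addresses if start_addr <= a <= end_addr))


# Hypothetical basic block start addresses, for illustration only.
sample = [0x401000, 0x401020, 0x402000, 0x401020, 0x403000]
selected = filter_blocks(sample, 0x401000, 0x402000)
```

Both bounds are inclusive, matching the chained comparison used in the snippet above.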

Of course, those addresses need to be relocated at run-time, so the next task is to locate the module in memory and relocate each address accordingly. To do so we need to attach to the remote process and watch the loaded modules until we find the one we are interested in:

[sourcecode language="python"]
import time

def getRunningModule(self, moduleName):

    if self.debugger is None:
        return None

    # register for process events before attaching
    self.debugger.process.addListener(self)
    self.name = moduleName

    if not self.debugger.isConnected():
        print "attaching to the remote target"
        self.debugger.connect()

    # self.module is set by the addedModule() callback;
    # sleep between checks instead of spinning at full speed
    while self.module is None:
        time.sleep(0.1)

    self.debugger.suspend()[/sourcecode]

We suspend the target process here because we need to relocate the addresses and set the breakpoints before it runs any further. We will resume it once both operations are complete.

As you might have noticed, we register a listener for the target process before attaching to the remote target.
There are a few types of listener classes useful for our purposes, most notably IDebuggerListener and IProcessListener. Both of them are notified when common debugging events happen. To learn more about those listeners, I suggest taking a look at the documentation.
In our class we implement a few methods of the IProcessListener class, which are called by the dispatcher inside BinNavi when certain messages are delivered from the remote debugger.

[sourcecode language="python"]
def changedTargetInformation(self, process):
    self.debugger.resume()

def addedModule(self, process, mod):
    if self.module is not None:
        return
    for module in process.modules:
        if module.name.find(self.name) != -1:
            self.module = module[/sourcecode]

The first method is called when the debugger attaches to the target process and retrieves some basic information about it. We need to resume the process at that point because the debugger suspends it after initialization (notice that the call to suspend() in the previous code snippet happens after we locate the module in memory, that is, after we call resume() here).

The second method is called whenever a new image is loaded in the process address space. In our code, as soon as we find the module we are looking for, we no longer care about other images.
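The wait-until-notified pattern used here (one thread waits while a callback from another thread reports the module) can also be expressed with threading.Event, which allows a timeout. This is only a sketch in plain Python under that assumption: `ModuleWaiter`, `on_module_loaded` and `fake_debugger` are hypothetical names, and the fake thread merely stands in for BinNavi's dispatcher.

```python
import threading


class ModuleWaiter(object):
    """Blocks until a module whose name contains `name` has been reported."""

    def __init__(self, name):
        self.name = name
        self.module = None
        self._found = threading.Event()

    def on_module_loaded(self, module_name):
        # Called from the notification thread for every loaded image;
        # only the first match is kept.
        if self.module is None and self.name in module_name:
            self.module = module_name
            self._found.set()

    def wait(self, timeout=None):
        # Returns the module, or None if the timeout expired first.
        self._found.wait(timeout)
        return self.module


waiter = ModuleWaiter("calc.exe")


def fake_debugger():
    # Stand-in for the dispatcher: reports a few image names in order.
    for image in ["ntdll.dll", "kernel32.dll", "calc.exe", "user32.dll"]:
        waiter.on_module_loaded(image)


thread = threading.Thread(target=fake_debugger)
thread.start()
found = waiter.wait(timeout=5.0)
thread.join()
```

Compared with polling, the waiting thread sleeps until it is signalled and can give up after the timeout if the module never shows up.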

Now that we have located the module in memory, we can relocate the addresses:

[sourcecode language="python"]
inMemoryAddr = runningModule.baseAddress.toLong()
originBaseAddr = self.naviDB.module.imagebase.toLong()
# arguments follow the order of the format string: original base first
print "Original Base Address: %x In-Memory one: %x\n Relocating..\n" % (originBaseAddr, inMemoryAddr)
for bb in self.blocks:
    addresses.append((bb - originBaseAddr) + inMemoryAddr)[/sourcecode]
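The relocation is plain arithmetic: every address keeps its offset from the image base, and only the base changes. A minimal, BinNavi-independent sketch follows; `relocate` and the example base addresses are hypothetical and chosen purely for illustration.

```python
def relocate(addresses, original_base, runtime_base):
    """Rebase each address from original_base to runtime_base."""
    delta = runtime_base - original_base
    return [addr + delta for addr in addresses]


# Hypothetical values: image base 0x01000000, module loaded at 0x00A50000.
relocated = relocate([0x01001000, 0x01002A40], 0x01000000, 0x00A50000)
```

If the module happens to be loaded at its preferred image base, the delta is zero and the addresses come back unchanged.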

I lied when I said we need to set breakpoints; in fact, BinNavi takes care of that internally by means of a TraceLogger!

[sourcecode language="python"]
for address in addresses:
    naviAddresses.append(TracePoint(self.traceEntity, Address(address)))

print "Starting the process...\n"
tracer = TraceLogger(self.debugger, self.traceEntity)
self.traceManager = myTraceListener(addresses)
self.trace = tracer.start("codeCoverage", "", naviAddresses)
self.trace.addListener(self.traceManager)
self.debugger.resume()[/sourcecode]

TraceLogger is a class that lets us create a log of echo breakpoint events: we create a list of TracePoints (locations where the trace logger sets echo breakpoints) and the TraceLogger takes care of the rest.

Echo breakpoints are a ‘lightweight’ version of regular breakpoints. In essence, echo breakpoints get removed after they are initially hit. This leads to better performance of the application that is being debugged, as execution speed of a particular path is only slowed down during the -first- execution.
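To make the one-shot behaviour concrete, here is a toy model in plain Python. It is not how BinNavi implements echo breakpoints; `EchoBreakpoints` and its methods are hypothetical and only mimic the "removed after the first hit" rule described above.

```python
class EchoBreakpoints(object):
    """Toy model: each breakpoint fires once and is then removed."""

    def __init__(self, addresses):
        self.armed = set(addresses)
        self.hits = []  # addresses in the order they were first executed

    def execute(self, address):
        # Returns True only when an armed breakpoint is hit.
        if address in self.armed:
            self.armed.remove(address)
            self.hits.append(address)
            return True
        return False


bps = EchoBreakpoints([0x1000, 0x1010, 0x1020])
# Block 0x1000 runs three times and block 0x1010 once; 0x1020 never runs.
events = [bps.execute(a) for a in [0x1000, 0x1000, 0x1010, 0x1000]]
```

In this model only the first execution of a block pays the breakpoint cost, and each block is reported at most once, which is all a coverage measurement needs.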

So first we set up the tracer and then we create the trace. A trace can have a listener which is notified when a new event is added; we use such a listener to keep track of the blocks touched during the execution.

[sourcecode language="python"]
class myTraceListener(ITraceListener):
    def __init__(self, addresses):
        self.addyCount = []
        for address in addresses:
            self.addyCount.append((address, 0))

    def addedEvent(self, trace, event):
        # update the entry in place and stop at the first match; removing and
        # appending while iterating would visit the same entry twice
        for i, (addy, counter) in enumerate(self.addyCount):
            if addy == event.address:
                self.addyCount[i] = (addy, counter + 1)
                break[/sourcecode]

When a new event is added, we retrieve the address and update the address counter accordingly.
At this point we are all set, and we can get the code coverage score:

[sourcecode language="python"]
def getCodeCoverage(self):
    # get the list of all the executed blocks at a given program point
    touched_blocks = self.naviTracer.getExecBlocks()
    coverage = float(len(touched_blocks)) / float(len(self.getBlocks()))
    return coverage

def printStatistics(self):
    print "CODE COVERAGE = %f\n" % self.getCodeCoverage()
[/sourcecode]
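The score is simply the fraction of selected blocks that were executed at least once. A self-contained sketch of that ratio is shown below; `coverage_ratio` is a hypothetical helper and not part of the plugin. It additionally guards against an empty block list, which the snippet above does not handle.

```python
def coverage_ratio(touched_blocks, all_blocks):
    """Fraction of distinct blocks in all_blocks that appear in touched_blocks."""
    total = set(all_blocks)
    if not total:
        return 0.0  # avoid dividing by zero when no blocks were selected
    return float(len(total & set(touched_blocks))) / len(total)


# Hypothetical data: two of the four selected blocks were executed.
ratio = coverage_ratio([0x1000, 0x1010, 0x1000],
                       [0x1000, 0x1010, 0x1020, 0x1030])
```

Working on sets means a block that is reported more than once, or that lies outside the selected range, does not inflate the score.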

Let’s run it then! On the target machine we run:

[code]
client32.exe C:\WINDOWS\system32\calc.exe
[/code]

On the local machine we run:

[code]
jython NaviCoverage.py databaseHost databaseUser databasePassword databaseName calc.exe
[/code]

And here’s a screenshot

That’s all for now.

5 Responses to “Code coverage and BinNavi”

  1. […] blog.zynamics.com the official zynamics company blog « Code coverage and BinNavi […]

  2. beist says:

    Nice posting. Binnavi seems very awesome. I see many good API it has. 🙂

    By the way, I’m using IDA and I often have to do code-coverage as well. I wanted to attach to a process but not load a binary. But, IDA didn’t analyze functions in a process that I attached. So, I used a dirty way to solve this problem.

    First off, I loaded the binary of the process in IDA, And recorded all function information through IDAPython(+sqllite). Next, I ran the binary and attached to the process and recovered the function information using the recorded data. IDA gives us a nice API to easily do this, MakeFunction(). 😀

    Anyway, my friend’s company is using BinNavi and maybe I’ll meet him soon. He may will give some demo to me, meaning, I’m so excited. 🙂

  3. Sebastian Porst says:

    Hi beist,

    that’s a good approach but I am wondering about the speed. Without testing it I fear that IDA + IDAPython is really slow for something that produces as much data as your trace. Have you ever thought about trying some other tool to generate the trace for you? (OllyDBG’s trace mode, Pin, DynamoRIO, …)?

  4. beist says:

    Hi!

    I’ve tested only in immunity debugger and PIN. I think PIN gives us the best speed and very nice API, but I just wanted to take a chance to play on python. Just started IDAPython a month ago, yeah, I know now python language is the best for prototypes. 😉 How about Binnavi though?

  5. Sebastian Porst says:

    Hi beist,

    I wouldn’t use BinNavi for anything speed-critical because all BinNavi debuggers are remote debuggers and even if you’re on localhost you still need a round-trip through TCP/IP for each event. Mind you, BinNavi is fast enough for most things (our trace mode can process like 150 breakpoint hits per second on my PC) but for serious instruction tracing I would rather use a dedicated dynamic instrumentation tool.