ida2sql: exporting IDA databases to MySQL

by

Today we are finally making it easier to get your hands on ida2sql, our set of scripts to export information contained in an IDA database into MySQL.

As a short recap, ida2sql is a set of IDAPython scripts to export most of the information contained in an IDB into a MySQL database. It has existed and evolved already for a few years and has been the main connection between IDA and BinNavi for the most of the life of the latter.

The last development efforts have been geared towards making the schema a bit more friendly (see below) and making it work in a fair range of IDA (5.4 to 5.7) and IDAPython versions (including some that shipped with IDA which had minor problems, ida2sql will automatically work around those issues). The script runs under Windows, Linux and OSX.

ida2sql is comprised of a ZIP archive containing the bulk of the scripts that simply needs to be copied to IDA’s plugin folder. No need to extract its contents. A second script, ida2sql.py, needs to be run from within IDA when we are ready to export data. You can keep it in any folder, it should be able to automatically locate the ZIP file within the plugins folder. You can download here a ready built package containing the ZIP file, the main script, a README and an example configuration file.

When the main Python file is executed in IDA and if all the dependencies are successfully imported the user will be presented with a set of dialogs to enter the database information. If the database is empty there will also be a message informing that the basic set of tables is about to be created at that point. Once all configuration steps have been completed the script will start processing the database, gathering data and finally inserting it all into the database.

The configuration process can be simplified by creating a config file ida2sql.cfg in IDA’s main directory (or by pointing to it the IDA2SQLCFG environment variable). If ida2sql can find that file it will not ask for any of the configuration options and go straight into the exporting.

Automation

ida2sql has a batch mode that comes handy when you need to export a collection of IDBs into the database. To run ida2sql in batch mode it’s enough to set the corresponding option in the configuration file.

mode: batch

An operation mode of “batch” or “auto” indicates that no questions or other kind of interaction should be requested from the user. (Beware though that IDA might still show dialogs like those reminding of a license or free-updates period about to expire. In those cases run IDA through the GUI and select to never show again those reminders). The batch mode is specially useful when running ida2sql from the command line, for instance:

idag.exe -A -OIDAPython:ida2sql.py database.idb|filename.exe

Requirements

  • mysql-python
  • A relatively recent IDA (tested with 5.4, 5.5, 5.6 and the latest beta of 5.7)
  • IDAPython. Chances are that you already have it if you are running a recent IDA version
  • A MySQL database. It does not need to reside on the same host

The schema

A frequent criticism to the schema design has always been the use of a set of tables per each module. People have asked why not use instead using a common table-set for all modules in the database. While we considered this approach in the original design, we opted for using a set of tables per module. We are storing operand trees in an optimized way aiming at reducing redundant information by keeping a single copy of all the common components of the operand’s expression tree. Such feature would be extremely difficult to support were we to use a different a different schema. Additionally tables can easily grow to many tens of millions of rows for large modules. Exporting hundreds of large modules could lead to real performance problems.

The table “modules” keeps track of all IDBs that have been exported into the database and a set of all the other tables exists for each module.

BinNavi DB Version 2

BinNavi DB Version 2

The following, rather massive, SQL statement shows how to retrieve a basic instruction dump for all exported code from an IDB. (beware of the placeholder “_?_”)

SELECT
   HEX( functions.address ) AS functionAddress,
   HEX( basicBlocks.address ) AS basicBlockAddress,
   HEX( instructions.address ) AS instructionAddress,
   mnemonic, operands.position,
   expressionNodes.id, parent_id, 
   expressionNodes.position, symbol, HEX( immediate )
FROM 
    ex_?_functions AS functions
INNER JOIN ex_?_basic_blocks AS basicBlocks ON
    basicBlocks.parent_function = functions.address
INNER JOIN ex_?_instructions AS instructions ON 
    basicBlocks.id = instructions.basic_block_id
INNER JOIN ex_?_operands AS operands ON 
    operands.address = instructions.address
INNER JOIN ex_?_expression_tree_nodes AS operandExpressions ON
    operandExpressions.expression_tree_id = operands.expression_tree_id
INNER JOIN ex_?_expression_nodes AS expressionNodes ON
    expressionNodes.id = operandExpressions.expression_node_id
ORDER BY 
    functions.address, basicBlocks.address,
    instructions.sequence, operands.position, 
    expressionNodes.parent_id,
    expressionNodes.position;

Limitations and shortcomings

The only architectures supported are x86 (IDA’s metapc), ARM and PPC. The design is pretty modular and supports adding new architectures by simply adding a new script. The best way to go about it would be to take a look at one of the existing scripts (PPC and ARM being the simplest and most manageable) and modify them as needed.

ida2sql has been designed with the goal in mind of providing an information storage for our products, such as BinNavi. It will only export code that is contained within functions. If you have an IDB that has not been properly cleaned or analyzed and contains snippets/chunks of code not related to functions, those will not be exported. Examples of some cases would be exception handlers that might only be referenced through a data reference (if at all) or switch-case statements that haven’t been fully resolved by IDA.

The scripts have exported IDBs with hundreds of thousands of instructions and many thousands of functions. Nonetheless the larger the IDB the more memory the export process is going to require. ida2sql’s performance scales mostly linearly when exporting. It should not degrade drastically for larger files. It will also make use of temporary files that can grow large (few hundred MBs if the IDB is tens of MBs in size when compressed). Those should not be major limitations for most uses of ida2sql.

Also it’s worth noting that IDA 5.7 has introduced changes to the core of IDAPython and the tests we have made so far with the current beta the performance of ida2sql has improved significantly. In the following figures you can see the export times in seconds for some IDBs exported with IDA 5.5, 5.6 and 5.7.

ida2sql export times

ida2sql export times for a medium size file

ida2sql export times

ida2sql export times for a set of small IDBs

Summing up. We hope this tool will come handy for anyone looking into automating mass analysis and has been bitten by the opaque and cumbersome IDBs. Give it a spin, look at the source code, break it and don’t forget to let us know how it could be improved! (patches are welcome! ;-) )

Tags: , , , , ,

37 Responses to “ida2sql: exporting IDA databases to MySQL”

  1. haliddavasana Says:

    Seems to be a crash in arch\metapc.py line 228 (i think) when dealing with xmm operations / operands

    • Ero Carrera Says:

      IDA 5.7 changed the way those registers are exposed through the API. ida2sql has been fixed in order to support the new format as well as to continue working with older IDA/IDAPython versions.

      The revised version has been uploaded to GitHub

      • haliddavasana Says:

        Also a crash in dealing with the ‘aeskeygenassist’ (!?!?!) instruction … looks similar

  2. Ero Carrera Says:

    Thanks, will look into it.

    • Ero Carrera Says:

      With IDA 5.7 and the latest ida2sql, which fixed SSE support (XMM operands & more) , the AES extensions work like a charm. I’m not seeing any problems with aeskeygenassist or its siblings

  3. Sanjay Says:

    Hi Ero,
    As advised by you, I started looking at ida2sql for my work. It is really cool!!. However, I notice one thing. IDA pro provides a list of arguments and local variables pertaining to a function. This information makes further analysis very convenient. I don’t see this information available with ida2sql. Is it not possible to include 2 more fields in “functions” table viz. “arguments” and “local_var” and fill them up while fetching information from IDA DB to SQL DB?

    • Ero Carrera Says:

      We considered that but decided it would be something we would implement in BinNavi, once it gets its own datatype system.
      The data type recognition in IDA works relatively well but it’s something that benefits a lot from IDA’s interactivity, which is something harder to model with the SQL export, which we mainly designed to be static.
      I think you might be better off working directly on the IDB through IDAPython if you need to access that kind of information.

  4. virvdova Says:

    I have problem:

    Environment variable IDA2SQLPATH not found
    IDA2SQL> Using architecture module [metapc]
    IDA2SQL> Initialization sucessful.

    Exporting manually …
    Traceback (most recent call last):
    File “”, line 1, in
    File “C:\Program Files\IDA\python\init.py”, line 62, in runscript
    execfile(script, globals())
    File “C:/Program Files/IDA/python/ida2sql.py”, line 52, in
    ida_to_sql.ida_to_sql.main()
    File “C:\Program Files\IDA\python\ida_to_sql\ida_to_sql.py”, line 848, in main
    idc.GetInputFilePath(), arch.get_architecture_name(), idaapi.get_imagebase(), module_comment, batch_mode)
    File “C:\Program Files\IDA\python\ida_to_sql\sql_exporter.py”, line 836, in new_module
    status = self.db.new_module(path, architecture, base_address, module_comment, batch_mode)
    File “C:\Program Files\IDA\python\ida_to_sql\sql_exporter.py”, line 358, in new_module
    if self.verify_database_version() is not True:
    File “C:\Program Files\IDA\python\ida_to_sql\sql_exporter.py”, line 319, in verify_database_version
    rows = self.cursor.fetchall()
    AttributeError: DummyCursor instance has no attribute ‘fetchall’

    Help me!

    • Ero Carrera Says:

      Do you have mysql-python installed in your system? What database are you trying to export it for?

  5. Ishant Says:

    Hello sir, i am unable to make the database of the disassembled binary file.I am currently doing my final year project on detecting the malwares using different approach which involves disassembling it in the first step and then make the database of the apis called by the binary file.I have ida pro with me, i have ida2sql plugin.but i have no clue how to use it effectively.

    Plz help!

  6. Sanjay Says:

    hi Ishant,
    as written in the supplied README, u need to keep “ida2sql.zip” in the IDApro’s plugin director and then run the “ida2sql.py” script from within IDApro (after loading your binary and having IDApro finished its analysis). You need to have setup your mysql database already (by creating a new database. Remember to install MYSQL-python module.). You can fill the corresponding entries in the supplied “ida2sql.cfg” file and paste it in IDA directory. You can also supply this info while running python script within IDApro. that’s all to fetch data from IDA DB to mysql DB. After that, you use your favorite language (python, i guess!!) to access that DB and do whatever u want :)

    • Ishant Says:

      Hello sanjay i got it..but a new error says error connecting to the database

      bytes pages size description
      ——— —– —- ——————————————–
      286720 35 8192 allocating memory for b-tree…
      311296 38 8192 allocating memory for virtual array…
      262144 32 8192 allocating memory for name pointers…
      —————————————————————–
      860160 total memory allocated

      Loading IDP module C:\Program Files\IDA\procs\pc.w32 for processor metapc…OK
      Loading type libraries…
      Autoanalysis subsystem has been initialized.
      Database for file ‘notepad.exe’ is loaded.
      Compiling file ‘C:\Program Files\IDA\idc\ida.idc’…
      Executing function ‘main’…
      —————————————————–
      IDAPython version 0.9.61 final (serial 0) initialized
      Python interpreter version 2.5.0 final (serial 0)
      —————————————————–
      Environment variable IDA2SQLPATH found: [C:\Program Files\IDA\plugins]
      IDA2SQL> Using architecture module [metapc]
      IDA2SQL> Initialization sucessful.
      Error connecting to the database
      Exception exceptions.AttributeError: “DBWrap instance has no attribute ‘db_con'” in <bound method DBWrap.__del__ of > ignored
      Traceback (most recent call last):
      File “”, line 1, in
      File “C:\Program Files\IDA\python\init.py”, line 60, in runscript
      execfile(script, globals())
      File “C:/Program Files/IDA/ida2sql.py”, line 42, in
      ida_to_sql.ida_to_sql.main()
      File “C:\Program Files\IDA\plugins\ida_to_sql\ida_to_sql.py”, line 657, in main
      if not sqlexporter.is_database_ready():
      UnboundLocalError: local variable ‘sqlexporter’ referenced before assignment

  7. Ishant Says:

    Hello Sanjay

    Thanks 4 the reply.Plz tell me how to run ida2sql from within the ida.I am using ida pro 5.2 and python 2.5.Is it like we have to run it thru cli?Plz help

  8. Sanjay Says:

    Hi Ishant,
    As mentioned by Ero, install setuptools which is, anyway, required to install python modules “easily”. However, you may experience that when u want to install mysql_python on windows, it cries a lot. So, better you download the binary (http://sourceforge.net/projects/mysql-python/files/mysql-python/1.2.2/) and install it on your system. Remember to check your python version against the binary.

    • Ishant Says:

      Thank you sanjay.I had installed it now one different error this time..it says
      bytes pages size description
      ——— —– —- ——————————————–
      286720 35 8192 allocating memory for b-tree…
      311296 38 8192 allocating memory for virtual array…
      262144 32 8192 allocating memory for name pointers…
      —————————————————————–
      860160 total memory allocated

      Loading IDP module C:\Program Files\IDA\procs\pc.w32 for processor metapc…OK
      Loading type libraries…
      Autoanalysis subsystem has been initialized.
      Database for file ‘notepad.exe’ is loaded.
      Compiling file ‘C:\Program Files\IDA\idc\ida.idc’…
      Executing function ‘main’…
      —————————————————–
      IDAPython version 0.9.61 final (serial 0) initialized
      Python interpreter version 2.5.0 final (serial 0)
      —————————————————–
      Environment variable IDA2SQLPATH found: [C:\Program Files\IDA\plugins]
      IDA2SQL> Using architecture module [metapc]
      IDA2SQL> Initialization sucessful.
      Error connecting to the database
      Exception exceptions.AttributeError: “DBWrap instance has no attribute ‘db_con'” in <bound method DBWrap.__del__ of > ignored
      Traceback (most recent call last):
      File “”, line 1, in
      File “C:\Program Files\IDA\python\init.py”, line 60, in runscript
      execfile(script, globals())
      File “C:/Program Files/IDA/ida2sql.py”, line 42, in
      ida_to_sql.ida_to_sql.main()
      File “C:\Program Files\IDA\plugins\ida_to_sql\ida_to_sql.py”, line 657, in main
      if not sqlexporter.is_database_ready():
      UnboundLocalError: local variable ‘sqlexporter’ referenced before assignment
      Environment variable IDA2SQLPATH found: [C:\Program Files\IDA\plugins]
      IDA2SQL> Using architecture module [metapc]
      IDA2SQL> Initialization sucessful.
      Error connecting to the database
      Exception exceptions.AttributeError: “DBWrap instance has no attribute ‘db_con'” in <bound method DBWrap.__del__ of > ignored
      Traceback (most recent call last):
      File “”, line 1, in
      File “C:\Program Files\IDA\python\init.py”, line 60, in runscript
      execfile(script, globals())
      File “C:/Program Files/IDA/ida2sql.py”, line 42, in
      ida_to_sql.ida_to_sql.main()
      File “C:\Program Files\IDA\plugins\ida_to_sql\ida_to_sql.py”, line 657, in main
      if not sqlexporter.is_database_ready():
      UnboundLocalError: local variable ‘sqlexporter’ referenced before assignment

      PLZ HELP

    • Ero Carrera Says:

      Just before the exception there’s the line “Error connecting to the database” pointing out that it was not possible to establish a successful connection.

      What database are you using? MySQL? is it properly installed and are you providing the right database name, username and password? can you connect to the database through MySQL’s command line utility?

      • Ishant Says:

        Yes mysql

        I was able to connect it but only once
        it showed exporting segments but wen i restarted my machine it again gives the same error

  9. Ishant Says:

    Its done boss…thanx a ton
    database is created i am able to see the function calls i.e api calls, i am able to export it to excel sheet.thanx a lot
    one last doubt i also want to know the frequency of each api.Is it possible??

  10. Sanjay Says:

    good. what do you mean by “frequency of each api”? are you referring to calls to functions in windows API? if so, just look for CALL assembly instruction and record the operand which should be name of the function. But remember: it all depends on how well IDA Pro did in recognizing the functions (which is pretty good).

  11. Ishant Says:

    Tables_in_new1 |
    +—————————-+
    | address_comments_1 |
    | address_references_1 |
    | basic_blocks_1 |
    | callgraph_1 |
    | control_flow_graph_1 |
    | data_1 |
    | expression_substitutions_1 |
    | expression_tree_1 |
    | functions_1 |
    | instructions_1 |
    | metainformation |
    | modules |
    | operand_expressions_1 |
    | operand_strings_1 |
    | operand_tuples_1 |
    | sections_1 |

    hi sanjay i have the above mentioned tables.
    and the query i want to execute should be able to give me function address, src block address, and name where type should be import or function

  12. Sanjay Says:

    Hi Ishant,
    Long back, I did some hands-on with ida2sql. Following is a very basis code which can give you some idea on how to start with it (this is not the solution of your problem!!). I am using another module called pydot to make a graphical representation of the information that I get from database, populated by ida2sql script. As I mentioned earlier, all you need to do is step-down to instruction level and look for CALL instruction. It will have only one operand which you can match with library functions, you want to profile. So, basically, you need to follow this path: modules -> functions (call_graph) -> basic_blocks -> instructions ->operand_*

    '''
    Created on Aug 5, 2010

    @author: SanjayR
    '''
    '''
    Following is the database schema for idasql database.
    +-------------------------------+
    | Tables_in_idasql |
    +-------------------------------+
    | ex_1_address_comments |
    | ex_1_address_references |
    | ex_1_basic_blocks |
    | ex_1_callgraph |
    | ex_1_control_flow_graphs |
    | ex_1_data |
    | ex_1_expression_nodes |
    | ex_1_expression_substitutions |
    | ex_1_expression_tree_nodes |
    | ex_1_expression_trees |
    | ex_1_functions |
    | ex_1_instructions |
    | ex_1_operands |
    | ex_1_sections |
    | modules |
    +-------------------------------+
    '''
    import MySQLdb
    import pydot
    g = pydot.Dot(graph_type='digraph')

    conn = MySQLdb.connect (host = "localhost",user = "idasql",passwd = "idasql",db = "idasql")
    cursor = conn.cursor ()
    cursor.execute ("""SELECT * FROM ex_1_functions""")
    functions = cursor.fetchall()
    funcDict={}

    print "function info:"
    for func_rows in functions:
    cfg = pydot.Dot(graph_type='digraph')
    #print "%08x %s %d %d %s" % func_rows
    funcDict[func_rows[0]]=func_rows[1]
    cursor.execute(""" SELECT id, address FROM ex_1_basic_blocks WHERE parent_function = %d""" % (func_rows[0],))
    BBrow = cursor.fetchall()
    if len(BBrow)<= 1:
    continue
    #print "BB for function %s:" % func_rows[1]
    for BB_rows in BBrow:
    #print "ID: %d\taddress: %08x"%(BB_rows[0],BB_rows[1])
    cfg.add_node(pydot.Node("%d"%BB_rows[0], label = "Loc_%08x"%BB_rows[1]))
    cursor.execute(""" SELECT source, destination, parent_function FROM ex_1_control_flow_graphs WHERE parent_function = %d""" % (func_rows[0],))
    cfgrow = cursor.fetchall()
    for cfg_rows in cfgrow:
    #print "source: %d\tDest: %d\tparent: %d"%(cfg_rows[0],cfg_rows[1], cfg_rows[2])

    cfg.add_edge(pydot.Edge("%d"%cfg_rows[0],"%d"%cfg_rows[1]))

    cfg.write_jpeg("%s.jpg"%func_rows[1],prog='dot')
    cursor.execute ("""SELECT * FROM ex_1_callgraph""")
    callrow = cursor.fetchall()
    print "call graph info:"
    for call_rows in callrow:
    print "%d %08x %d %08x %08x" % call_rows
    g.add_edge(pydot.Edge("%s" % funcDict[call_rows[1]], "%s" % funcDict[call_rows[4]]))

    g.write_jpeg('testingGraph.jpg',prog='dot')

    #cursor.execute ("""SELECT * FROM ex_1_basic_blocks""")
    #BBrow = cursor.fetchall()
    #print "BB info:"
    #for BB_rows in BBrow:
    # print "%d %08x %08x" % BB_rows

    cursor.close ()
    conn.close ()

    hope this helps
    -Sanjay

  13. Ishant Says:

    Thanks 4 d help sanjay but i didnot understand anythin in dis.:(
    And ya i want to know the frequency of each api so thai can select that as a feature and feed it to my classifier.plz help soon

  14. Chossing Says:

    Seems to very slow for exporting some idb… Any approaches to speed it up?

  15. somaiya Says:

    Hi i am not being able to use the plugin. I am getting the following error.
    Error connecting to the database, error importing required module:

    I am using the database host as localhost and user as root.

    Please advise.

    Thank you

  16. Krish Godiawala Says:

    hello Ero we are getiing the following error:

    Using FLIRT signature: SEH for vc7/8
    Propagating type information…
    Function argument information has been propagated
    The initial autoanalysis has been finished.

    Environment variable IDA2SQLPATH not found
    IDA2SQL> Using architecture module [metapc]
    IDA2SQL> Initialization sucessful.
    Error connecting to the database, error importing required module:

    And also we are using IDA PRO 6.1..so will the version matter???could you please help us out on the same??

    • Ero Carrera Says:

      For some reason there’s still no mention of the failing required module in the error message. Is it simply failing to be displayed?
      Given the error I would suspect the problem lies in the module attempting the connection to the database. I would first make sure that it is working. I would advise you to first make sure that your default Python installation can connect to MySQL by firing up the interpreter and manually attempting a direct connection. If that works I’d try similar steps within the Python command line in IDA. If all these work, then it may be worth looking into ida2sql as a source of the problem. Let me know how it goes.

      • Somaiya Says:

        Hi,

        I am not well versed with python. I would really appreciate if I could personally connect with you or anyone on your team who could help me out with this. I really need to get this working asap as a lot of my final dissertation depends on this.

        Regards

  17. Hisham Says:

    Hello Ero,
    i have problem with creating mysql database
    here is log of output window

    Environment variable IDA2SQLPATH not found
    IDA2SQL> Using architecture module [metapc]
    IDA2SQL> Initialization sucessful.
    Connecting to the database
    Error connecting to the database, Reason:

    i installed latest Mysql for windows 64, mysql for python , mysql to python 2.7 connector and idapython

    • Ero Carrera Says:

      Hi, it seems there’s some sort of problem connecting to the database but no error message is provided. I would first start debugging the issue by launching a Python shell and trying to connect to MySQL, outside IDA, to get an idea of what the problem may be, that should give you a better idea of where the issue may lie.

  18. yanchen qiao Says:

    Hello Ero,
    Does it support ida pro 6.5, I can’t get any result when running the shell as the blog above.
    I think you must be proficient in the IDA, so could you help me solve a problem. I want to save the “Function Name” windows to a .txt in batch mode, but the idaw could not do it.
    Thank you very much.
    My english is so poor.

    • Ero Carrera Says:

      Saving function names should a relatively simple script. It should be something along the lines of:

      with open('my_file.txt', 'w') as f:
      for function_start in Functions():
      f.write('%x:%s\n' % (function_start, GetFunctionName(function_start)))

      The formatting is bad, there should be indentation, but I hope it’s helpful.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Follow

Get every new post delivered to your Inbox.

Join 40 other followers

%d bloggers like this: