Malicious PDF file analysis: zynamics style

If you are interested in PDF file analysis, we might soon have something for you. We have developed a nifty little application that not only parses PDF files but also helps you analyze them very quickly. The main features include:

  • Views of PDF files as content trees as well as hex data.
  • Decoding and display of embedded JavaScript.
  • Refactoring functionality for JavaScript code, for example variable renaming.
  • An integrated JavaScript interpreter for debugging malicious scripts.
  • An extensible Adobe Reader emulator that simulates arbitrary versions and configurations of Adobe Reader.
  • Interception of all called functions to log calls or modify arguments and return values.
  • Automated exploit recognition.
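
To give a rough idea of what intercepting a function call means, here is a minimal sketch in plain JavaScript. It is not the code of our tool: the `intercept` helper, the hook names `before` and `after`, and the `fakeReader` object with its `getVersion` method are all made up for illustration and are not part of any Adobe API.

```javascript
// Hypothetical helper (not the tool's real API): wrap obj[name] so that
// every call is logged and its arguments and return value can be changed.
function intercept(obj, name, hooks) {
  var original = obj[name];
  var log = [];
  obj[name] = function () {
    var args = Array.prototype.slice.call(arguments);
    if (hooks && hooks.before) {
      // A "before" hook may return a replacement argument list.
      args = hooks.before(args) || args;
    }
    var result = original.apply(this, args);
    if (hooks && hooks.after) {
      // An "after" hook may return a replacement return value.
      var replaced = hooks.after(result, args);
      if (replaced !== undefined) {
        result = replaced;
      }
    }
    log.push({ name: name, args: args, result: result });
    return result;
  };
  return log;
}

// Stand-in for an emulated reader object; the name and method are assumptions.
var fakeReader = {
  getVersion: function () { return 9.1; }
};

// Log every call and pretend to be an older reader version.
var calls = intercept(fakeReader, "getVersion", {
  after: function () { return 7.0; }
});

var seen = fakeReader.getVersion();
```

After the call, `seen` holds the value supplied by the `after` hook and `calls` contains one entry describing the call, which is the kind of record an analyst would want when a script probes the reader version.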

To see it all in action, you can watch a preview video by clicking this link.

There are a few things to explore in the coming weeks:

  • We will improve the PDF parser.
  • We will add more JavaScript refactoring functions.
  • We need to figure out how to limit the memory available to scripts, because if a script sprays the heap, the PDF analysis tool gets heap-sprayed too, which is uncool.
  • We will add a plugin API so that you can automatically process large quantities of files. This should be very useful for everyone who wants to do batch analysis of PDF files.
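
The memory problem can be sketched as a simple budget that gets charged for every string a script creates. This is only an illustration under strong assumptions: `MemoryBudget` and `charge` are hypothetical names, the cost model (two bytes per character, nothing is ever freed) is deliberately crude, and a real limit would have to be enforced inside the interpreter rather than by the script calling a helper.

```javascript
// Hypothetical budget object: tracks how many bytes of string data
// a script has been allowed to create so far.
function MemoryBudget(limitBytes) {
  this.limit = limitBytes;
  this.used = 0;
}

// Charge the budget for a newly created string and return the string.
// Assumes 2 bytes per character and never gives memory back.
MemoryBudget.prototype.charge = function (str) {
  var cost = str.length * 2;
  if (this.used + cost > this.limit) {
    throw new RangeError("script memory budget exceeded");
  }
  this.used += cost;
  return str;
};

// Simulated heap spray: keep doubling a block until the budget stops it.
var budget = new MemoryBudget(1024 * 1024); // 1 MiB, an arbitrary limit
var block = "\u0c0c\u0c0c";
var aborted = false;
try {
  while (true) {
    block = budget.charge(block + block);
  }
} catch (e) {
  aborted = e instanceof RangeError;
}
```

In this sketch the doubling loop is cut off with a `RangeError` once the budget is exhausted, instead of growing until the analysis tool itself runs out of memory.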

11 Responses to “Malicious PDF file analysis: zynamics style”

  1. neox says:

    Pretty cool. Looking forward to the tool. 🙂

  2. Tyler says:

    If you need any beta testers, PLEASE contact me! Great tool!

    • Sebastian Porst says:

      I’ll keep that in mind, but realistically it’s at least another four weeks before we can beta-test this baby.

  3. This looks like it will be a real help in developing PDFs that contain JavaScript!

    How do you deal with PDFs (XFA) that were created with LiveCycle Designer?

    • Sebastian Porst says:

      Hi Jan,

      we have not yet looked at anything created with LiveCycle Designer.

  4. jf says:

    Watched the video; neat. My only question would be (and this may be ignorance of Adobe JS) relating to refactoring code and scripts that use arguments.callee.toScript() to decrypt themselves; although assuming adobe supports that I suppose you could just emulate the call and return the original script, keeping the refactoring working as expected.

    all in all, interesting work, as always.

    • Sebastian Porst says:

      Hi jf,

      Adobe JavaScript does indeed support the callee method of obfuscation and I have a few PDF samples that make use of it.

      Our PDF tool does not yet handle this properly, but we are planning to emulate the callee method exactly the way you mention: by overriding the method and returning the original script text.
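
      A minimal sketch of that idea in plain JavaScript, assuming the script reads its own text through `arguments.callee.toString()` and uses it as a key. The `pinSource` helper and the sample function text are hypothetical and only illustrate the approach; this is not how the tool is implemented.

```javascript
// Hypothetical helper: make a function report a fixed source text,
// no matter how its real text was changed by refactoring.
function pinSource(fn, originalSource) {
  fn.toString = function () {
    return originalSource;
  };
  return fn;
}

// Text of the function as it appeared in the PDF before refactoring
// (a made-up sample, kept by the analysis tool).
var originalSource =
  "function x1(){return arguments.callee.toString().length;}";

// The refactored function as the interpreter sees it. It derives a "key"
// from its own text, so any change to that text would change the key.
// new Function is used here so the body runs in non-strict mode,
// where arguments.callee is available.
var refactored = new Function(
  "return arguments.callee.toString().length;"
);

// Pin the original text, then run the function.
pinSource(refactored, originalSource);
var key = refactored();
```

      With the source pinned, `key` equals the length of the original text, so a decryption routine that depends on it would keep working after variables have been renamed.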

  5. […] PDF Tool (Malware PDF analysis tool; more info here) […]

  6. Awesome! This is EXACTLY the tool I want for PDFs. Even if the first few versions turn out buggy, this really seems like a good approach and design. Can’t wait.

  7. […] Porst I have talked about PDF Dissector, our new tool for analyzing malicious PDF files, on this blog before. After a few weeks of beta testing we are releasing PDF Dissector 1.0 […]

  8. […] obfuscation in PDF: Sky is the limit (getAnnots,arguments.callee) 2010-04-09: Malicious PDF file analysis: zynamics style (PDF Dissector video) 2010-04-22: Will there be new viruses exploiting /Launch vulnerability in […]