A few days ago, between May 21st – 24th, DDTEK organized the Defcon 18 Capture The Flag qualifiers. For all of you that are not familiar with this kind of contest, Defcon CTF is a hacking offense/defense contest held during the conference in Vegas. In order to play the final round, a previous online competition takes place to select 9 top-teams that will join last year’s winner. The qualification contest contained 30 challenges through different categories like Pursuits Trivial (general questions), Crypto Badness (cryptography), Packet Madness (network traffic analysis), Binary L33tness (reversing), Pwtent Pwnables (exploiting) and Forensics. We at zynamics had a couple of guys playing in different teams so we decided to join the writeup fever and release a solution for the Binary L33tness 400 challenge
The binary for this challenge was a Linux x86 ELF binary, that you can find inside the additional contents package. By running the binary we knew that it listens on port 5566/TCP and if you connect you will be asked for a key, so the objective was to find the right key. After a first glance, the binary was not giving us much information about its behaviour and the callgraph was too complex to do a step-by-step analysis. There are, however, some strings referenced from main() that attracted our attention. One of the first is “DVM requires 16kB minimum heap”, and if we search for it verbatim we will not feel lucky, but looking only for “requires 16kB minimum heap” we get references to J2ME’s KVM (Sun’s K Virtual Machine).
After reading a bit of documentation and checking some other strings, we understood that the challenge was a modification of KVM. The main difference was that KVM accepts jar/class input file but we had no input file so that meant the java code was somehow embedded inside the binary. But before starting any further analysis, we had to make our life easier and knowing that our binary was a modified KVM, we did look for a KVM binary (that we found somewhere, among with really useful source code and after spending so much time lost in Sun’s official website).
Once we got that binary with debug information, it was time to import it (essentially function names) so we could get part of the analysis done automatically. For that purpose we used a tool we know well: BinDiff. Diffing the challenge binary against the downloaded KVM binary gave us 295 matched functions (72,3% of the total). With BinDiff, for each matched function we have two parameters: similarity that is a value between 0.0 and 1.0 where 1.0 means identical functions, and confidence that is also between 0.0 and 1.0 telling how much we can trust the similarity value (that’s because of the different matching algorithms used). So, in order to import function names from KVM, we can do it manually for each function (figure 1)
or use a batch mode based on similarity and confidence (figure 2).
We were in a hurry so we used the “Import Symbols and Comments” option, using any value for similarity (default option) but at least 0.5 in confidence. After that the binary was making much more sense, specially the calls to getRawClassX (0x0804A1D0) that receive as a parameter a pointer to class structs. Those structs are really interesting and it was not hard to reach the bytecode of each one, but the problem was how to dump that to a working class file we could decompile. Being not sure about this approach we decided to look for the VM loop. KVM source code shows that the main loop is in FastInterpret(), with the help of the source and the diff results we found that function at 0x804E000.
At FastInterpret we have the main switch with all the opcodes (it’s easy to figure out how many of them are not implemented in this modified version, because all of them jump to show the “illegal bytecode” message). Having reached this point, it was time for live analysis. Using GDB, we launched the binary and placed a “log breakpoint” inside the VM loop. It’s possible to do this by executing:
gdb $ b *0x804E01B
gdb $ commands 1
> printf "%08x %x %x %x\n", $esi, $eax, *$ebx, *($ebx-4)
That way we get a trace of all executed opcodes inside the VM ($esi is the VM instruction pointer and $eax is the opcode), and a hint about the parameters pointed by $ebx. The full log with opcodes executed between key submission and reply from the binary is also included inside the additional contents package. At that point we had a huge list of executed opcodes and somewhere inside that list there was the key check but, how to find it? First we looked for our input string that was “1234567890…” that in hex is “31 32 33 34…”. Following the hits of those strings we could get a hint about what was going on, like the checks made over all bytes of our key and around 0x807CACC (VM code) that our key length was supposed to be 34 bytes, so we did all the process again with a 34-byte length key (“1234567890″*4 + “1234”)
As we used grep to keep track of the key (grep ” 3[0-9]”) we also got a lot of hits in IXOR instructions at 0x807b638. Our key was being xor’ed against other 34 byte array (“3d f6 bb 16 5f…”) and stored in a third array using BASTORE. We tried to continue analyzing what was happening to both our key and the xor’ed key. But we found out that the java code was implementing RC4’s typical initialization loop (array[i] = i for i = 0,1,2…256) and the RC4 key initialization algorithm. At this point, continuing with the opcode analysis of the RC4 cryptography code seemed a crazy approach so it was time to change a bit our plans.
Thinking a bit, the whole problem involves a check what means the use of any arithmetic or boolean opcode. The obvious one is IF_ICMPEQ. That opcode is used twice in the log we had, one for the key length and another at 0x807CAFF checking 0x59 against 0xC. This rang a bell, since 0xC is the result of 0x31 IXOR 0x3D (our dummy key’s first character versus the xor key we saw before). The next step was obviously to use the proper byte in our dummy key and see what was going to happen. 0x3D IXOR 0x59 is 0x64 or the ‘d’ letter so our next dummy key was “d234567890…”. For the further tests we just used a breakpoint on the IF_ICMPEQ opcode (0x0804E294) to check what was being compared until we got the full 34 enconded key:
>>> xorkey = [0x3d, 0xf6, 0xbb, 0x16, 0x5f, 0x22, 0x25, 0xca, 0x93, 0xbf, 0x46, 0xb6, 0x9d, 0x10, 0x06, 0xdf, 0x94, 0x45
, 0x9b, 0x9f, 0x44, 0x5e, 0xe5, 0x54, 0x51, 0x19, 0x70, 0x95, 0x29, 0x07, 0xfd, 0x39, 0xd3, 0x91]
>>> finalkey = [0x59, 0x92, 0xcf, 0x73, 0x34, 0x51, 0x05, 0xa6, 0xfc, 0xc9, 0x23, 0x96, 0xfb, 0x7f, 0x74, 0xff, 0xfe, 0x
24, 0xed, 0xfe, 0x64, 0x37, 0x96, 0x74, 0x34, 0x6f, 0x15, 0xe7, 0x50, 0x70, 0x95, 0x5c, 0xa1, 0xf4]
>>> key = ''
>>> for i in range(len(xorkey)):
... key += chr(xorkey[i] ^ finalkey[i])
>>> print key
ddteks love for java is everywhere
Finally it wasn’t necessary to look at the RC4 part to solve the challenge. However, our analysis was deep enough to know that it was using “ddtek” as the RC4 key. Our guess is that it was being used to decrypt something related to the final key.