The purpose of November 2004 SOTM is to analyze a heavily armored binary.
The file 0x90.exe was downloaded and transferred to the "test machine" were all the reversing took place. The first step of the analysis was to simply look at the binary itself using simple tools like "strings" or a hexeditor. These are the meaningful strings found in the binary
msvcrt.dll KERNEL32.dll printf GetTickCount GetCommandLineA ExitProcess You really thought you would find strings eh? ;-) Scan of the month coded by Nicolas Brulez / Digital River
Seems that the 0x90.exe author have a lot of sense of humor ^_^''
The absence of strings, the one found are from the import table, is an evidence that the target is probably packed and/or crypted. Using file analysis tools like PeID/GetTyp don't seem to confirm my hypothesis. Maybe the target use a custom packer/encrypter?? To confirm this hypothesis I tried to open the file using a hex editor, Biew, but it shutdown showing a message "Not enough memory. PE initialization". The same happen trying to open the file using the debugger, Ollydbg and reading from Ida Pro board seems that the same happen trying to open 0x90 using Ida.
Both error however showed where was the problem, the PE Header.
To continue the analysis and load the target into our favorite tools we must normalize the PE header. Opening the target in a hex editor (one that doesn't know about PE structure) we see that some fields of the PE header are filled with "uncommon" values; values that confuses common tools as debugger and disassembler. To fully normalize the header many fields should be changed but many of them can be ignored as the most troublesome for the tool and the only one we'll change are:
After applying the above patches the file can be easily opened by Biew and Ollydbg. Opening the file in the disassembler mode of Biew confirmed what we thought; the executable is crypted/packed. This can be readily seen from the amount of "strange code" present, jump into other instruction, fake prefixes, privileged instruction and btw the execution start in a section called 'DATA'.
Loaded the file into OllyDbg we are faced with a classic.
00DE2000 > 60 PUSHAD 00DE2001 E8 00000000 CALL 0x90.00DE2006 00DE2006 5D POP EBP 00DE2007 8BC5 MOV EAX,EBP 00DE2009 83E8 06 SUB EAX,6 00DE200C 81ED 0620DE00 SUB EBP,0x90.00DE2006
The above code load into the EAX register the code's starting address (0xde2000 in this case) and into the EBP register the difference "delta" from the "normal" expected address. This code is commonly found in virus/packer/crypter stub code that is relocated and need to reference data not placed at absolute addresses.
Stepping after we could see a lot of meaningless code, useless register assignment, jump and the like; that code is just "junk code" used to distract and hinder the analysis. The presence of "junk code" make really difficult to use static analysis tool like disassembler and make "boring" working with a debugger since you have to distinguish between "valid" instruction and junk one. The author however made a big mistake; the code look as follow:
VALID INSTRUCTION BLOCK PUSHA . . . A LOT OF JUNK CODE . . . POPA NEXT VALID INSTRUCTION BLOCK PUSHA . . . A LOT OF JUNK CODE . . . POPA
The pusha/popa pair save and restore the register status on the stack and are needed to preserve the value of the register between two valid instructions block. This make the "junk code" creation a lot easier for the author since the algorithm used have a simple mean to make "safe/nop code" but make up an evident signature that can be used to distinguish between valid and junk code. So to find the next meaningful instruction one have only to step till the register value are restored by a popa. This seem simple but actually is still a lot of work since the "junk code" blocks are long. We need same kind of automation....
OllyDgb is a great debugger; it's free (actually shareware but the author, Oleh Yuschuk, claim is only for copyright reason :) it's stable and have a nice set of feature. It's also have a plugin interface that can be used to extend its feature set. One of the most useful plugin is OllyScript written by SHaG that can be used to automatize the work with Ollydbg using script written in some kind of asmlike language.
Actually what we want to do is really simple; placed on the beginning "pusha" of the "junk code" block, step till the corresponding "popa" is executed and than fill all the block with nop. The following script do the trick.
lin.txt var cop var start var count eoe exit eob exit top: mov cop, esp mov start, eip sti l: cmp cop, esp je endtrash sti jmp l endtrash: mov count, eip sub count, start fill start, count, 90 ret exit: ret
I think it's pretty self explanatory; with eip on the "pusha", running the script actually only step (sti) through the junk code till "popa" is executed; the execution of popa is detected via the value of ESP, if ESP is equal to the value previous the execution of "pusha" mean we executed a "popa" and we must stop. The code block executed is then filled (fill) by 0x90 (nop instruction).
Using this script we can move through the code fastly and identify the true behavior of the programm; as a bonus the nopped code can be dumped and inserted in the exe file using a hexeditor eliminating the need to step through junk code more than once and making the executable cleaner and more easily analyzable using for example a disassembler. (a little note we could dump the full exe image as OllyDbg allow that but doing so would dump also some modification that the programm make to "the data"). With the document is available a fully cleaned 0x90.exe file.
After some junk code snipped we meet the first trick:
00DE2289 60 PUSHAD 00DE228A E8 48000000 CALL 0x90.00DE22D7 00DE228F 8B4C24 0C MOV ECX,DWORD PTR SS:[ESP+C] 00DE2293 8381 B8000000 02 ADD DWORD PTR DS:[ECX+B8],2 00DE229A 33C0 XOR EAX,EAX 00DE229C 8941 04 MOV DWORD PTR DS:[ECX+4],EAX 00DE229F 8941 08 MOV DWORD PTR DS:[ECX+8],EAX 00DE22A2 8941 0C MOV DWORD PTR DS:[ECX+C],EAX 00DE22A5 8941 10 MOV DWORD PTR DS:[ECX+10],EAX 00DE22A8 8941 14 MOV DWORD PTR DS:[ECX+14],EAX 00DE22AB C741 18 55010000 MOV DWORD PTR DS:[ECX+18],155 00DE22B2 8B81 B0000000 MOV EAX,DWORD PTR DS:[ECX+B0] 00DE22B8 50 PUSH EAX 00DE22B9 0FA2 CPUID 00DE22BB 0F31 RDTSC 00DE22BD 2B0424 SUB EAX,DWORD PTR SS:[ESP] 00DE22C0 83C4 04 ADD ESP,4 00DE22C3 3D 00000E00 CMP EAX,0E0000 00DE22C8 77 03 JA SHORT 0x90.00DE22CD 00DE22CA 33C0 XOR EAX,EAX 00DE22CC C3 RETN 00DE22CD 8381 B8000000 63 ADD DWORD PTR DS:[ECX+B8],63 00DE22D4 33C0 XOR EAX,EAX 00DE22D6 C3 RETN 00DE22D7 33C0 XOR EAX,EAX 00DE22D9 64:FF30 PUSH DWORD PTR FS:[EAX] 00DE22DC 64:8920 MOV DWORD PTR FS:[EAX],ESP 00DE22DF 0FA2 CPUID 00DE22E1 0F31 RDTSC 00DE22E3 33DB XOR EBX,EBX 00DE22E5 8F03 POP DWORD PTR DS:[EBX] 00DE22E7 64:67:8F06 0000 POP DWORD PTR FS: 00DE22ED 83C4 04 ADD ESP,4 00DE22F0 61 POPAD
During the analysis we'll encounter this code many times. It's simply a check for debugger presence. Before explaining the above code meaning we need to introduce what is called SEH. SEH is the acronym of Structured Exception Handling and it's a way an application running under Windows can cope with code exception (as invalid memory reference, invalid opcode, ecc...). The idea of exception handling is that the application installs a callback function and then, if an exception occurs, the system will call the function to let the application deal with the exception. The system provide to the handler the state of the thread at the exception's moment so it can diagnose the cause of the exception and eventually resume the execution or gracefully abort the program. To install a per thread SEH handler the thread need to modify a pointer in the Thread Information Block (TIB) located at FS: with the address of a ERR structure containing the address of the exception handler and the address of the next ERR structure (SEH can be chained to behave in a way similar to the C++ try/catch block, if an exception cannot be handled by the current handler it's passed to the "containing one").
Coming back to our code it install a SEH handler:
00DE22D9 64:FF30 PUSH DWORD PTR FS:[EAX] 00DE22DC 64:8920 MOV DWORD PTR FS:[EAX],ESP
using a ERR structure builded on the stack by the "CALL 0x90.00DE22D7" and the "PUSH DWORD PTR FS:[EAX]"; the handler address is the "CALL 0x90.00DE22D7" return address so it's first instruction is the "MOV ECX,DWORD PTR SS:[ESP+C]" located after the call. Then it call the RDTSC instruction (Read time-stamp counter into EDX:EAX) and deliberately cause an exception trying to pop into ds:0. The system than call the SEH handler that load into ECX the address of the CONTEXT structure, passed by the system and containing info on the thread state at the time of exception, and modify some value. In detail it add 2 to the EIP value (offset 0xb8 of the CONTEX structure) to skip over the faulting instruction ("POP DWORD PTR DS:[EBX]") it then reset the DRx register (offset 0x4-0x18 in the CONTEX structure) in the attempt to disable some hardware breakpoint, it then access the EAX register (offset 0xb0 of the CONTEX structure) and, after calling again RDTSC, calculate di difference in clock cicle regards to before the exception; if that difference is below 0xe0000 it resume the execution, returning with EAX=0, else before returning it modify again the EIP. This is done to detect the presence of a debugger; in fact running the code under a debugger is more clock cicle intensive than running the code "alone" since the debugger is called by the os in the event of an exception and usually handle the exception for the user (eg: stopping the execution showing the exception address). The value 0xe0000 is of course empirical since in a multitasking environment like Windows there is no way to tell how many clock cicle will pass, in fact some times the above code fail even when running outside a debugger. The above code also contain a bug, the instruction "ADD DWORD PTR DS:[ECX+B8],63" would never alter the EIP value in the CONTEXT structure since the CPUID istruction doesn't preserve the ECX value.
All this just to say that the above code could be skipped without influencing the programm execution.
After a couple more of the above trick the code check for breakpoint in the first 4 byte of some of the imported API:
00DE2950 B8 0D10DE00 MOV EAX,
00DE2C2C 8B40 02 MOV EAX,DWORD PTR DS:[EAX+2] 00DE2EF1 8B00 MOV EAX,DWORD PTR DS:[EAX] 00DE2F5B 8BF8 MOV EDI,EAX 00DE2F5D B9 04000000 MOV ECX,4 00DE31FA B8 60060000 MOV EAX,660 00DE31FF C1E8 03 SHR EAX,3 00DE3202 F2:AE REPNE SCAS BYTE PTR ES:[EDI] 00DE3204 85C9 TEST ECX,ECX 00DE3206 74 04 JE SHORT 0x90.00DE320C 00DE3208 0F31 RDTSC 00DE320A 50 PUSH EAX 00DE320B C3 RETN
The address of the above disassembly are not contiguous due the the elimination of the junk code. This snippet of code show another trick the programm use to make analysis more difficult. Immediate constant are frequently obfuscated as 0xcc in this case 0x660 >> 3. This is obviously done to camouflage the "valid" instruction in the junk code.
At this point we understood that this part of the code is only a stub used to decode some other code in fact from now on, using value in some table, part of the 0x90 code are decoded using a xor operation and the execution control is then passed to them.
Stepping through this part of the code is really boring since there are many layer (175); but since the stub are all the same is easy, using OllyScript, to write a script to step through them all:
sc.txt: var count var cbase var boop var tmpbp var tmpseh var i gmi eip, CODEBASE mov cbase, $RESULT log cbase var csize gmi eip, CODESIZE mov csize, $RESULT log csize mov count, 0 eoe expt chk: run jmp chk expt: inc i log i cmp i, af jne asd ret asd: mov count, esp mov tmpseh, [count] cmp tmpseh, 12e000 jb nval cmp tmpseh, 12fffc ja nval add count, 4 sbp: mov tmpbp, [count] bp tmpbp esto find eip, #3d00000e00# go $RESULT sti mov !cf, 1 bc tmpbp jmp chk nval: add count, 24 jmp sbp jmp chk
This script simply execute the stubs handling the SEH trick seen above. At the end we find the "real" 0x90 and we can start to analyze it's true behavior.
The first thing I expected reaching the "inner" 0x90 was to see "clean" code and no trick but, surprisingly, the code showed the same "style" of the first part; a lot of junk code and SEH trick. Using the script above was pretty easy to step through the code and the first thing I noted was that the code seemed table driven. Value stored in a table are loaded and seem to drive the execution flow, operand loaded and the like; stepping through this kind of code was pretty useless since the code executed seemed unrelated and was a lot difficult to understand. If we want to analyze the behavior of 0x90 we must understand how those tables are structured and used.
Analyzing the code accessing the tables showed that the "inner" 0x90 is actually some kind of script language interpreter or some kind of little virtual CPU emulator. After a brief initialization (0x90 writes in the memory where are stored the virtual register value the string "Evil Has No Boundaries !") the code starts to load value from a table located at 0x00e1ba3d. This table (CODE) is the script executed by 0x90 or, if you prefer, the code executed by the virtual CPU:
CODE 00E1BA3D 02 02 0A 15 02 51 03 00 01 2E A7 11 53 00 03 66 00E1BA4D 02 02 4F 13 02 51 02 01 03 02 00 00 02 04 00 02 00E1BA5D 03 03 04 01 16 00 00 00 02 02 2D 21 CA AD 01 02 00E1BA6D 01 2F DE 36 02 02 06 20 55 02 04 02 2E 01 00 00 00E1BA7D 02 08 00 02 01 04 05 45 DC 9B 1D 01 04 04 45 97 00E1BA8D 51 74 01 05 04 E2 DF 45 AD 01 04 04 EF BE AD DE 00E1BA9D 01 04 05 6C 6C 65 68 01 05 03 65 41 85 17 01 05 00E1BAAD 04 69 61 77 41 01 04 04 77 6F 68 73 01 04 05 20 00E1BABD 73 74 69 01 05 03 20 6F 6E 20 01 04 05 76 69 72 00E1BACD 64 01 04 04 63 72 65 6D 01 05 04 73 74 75 6E 01 00E1BADD 05 03 21 21 21 79 01 05 04 21 3F 68 65 01 06 07 00E1BAED FF FF FF DF 00 02 45 00 02 46 03 00 3D CE E4 00 00E1BAFD 02 05 03 04 02 2E 01 00 00 02 04 00 01 04 03 02 00E1BB0D 00 00 00 02 04 00 02 0B 00 01 00 02 46 00 03 64 00E1BB1D 00 02 47 00 01 D6 22 08 53 00 03 66 01 04 03 48 00E1BB2D 85 09 00 02 03 00 01 03 02 01 00 00 03 66 02 0A 00E1BB3D 00 02 01 04 03 02 00 00 00 02 0A 00 01 01 09 02 00E1BB4D 05 05 00 86 BB E1 00 47 42 03 00 02 44 00 01 F3 00E1BB5D A8 11 53 02 01 23 DE 36 02 00 04 4D 04 00 D7 DC 00E1BB6D F6 F2 00 00 11 54 19 37 00 01 DC A8 11 53 02 01 00E1BB7D 23 DE 36 02 00 04 4D 02 00 02 02 48 13 02 51 01 00E1BB8D 02 04 01 02 04 01 01 04 04 05 00 00 00 02 03 01 00E1BB9D 01 05 03 04 00 00 00 01 05 04 5A 00 00 00 01 0A 00E1BBAD 04 02 04 01 31 01 00 00 02 03 00 02 0A 00 02 01 00E1BBBD 04 03 02 00 00 00 02 0A 00 01 01 09 02 05 02 04 00E1BBCD 02 01 05 04 4E 00 00 00 00 01 2D CA 10 53 00 03 00E1BBDD 66 01 04 03 AC DE 00 00 02 04 00 01 03 02 00 00 00E1BBED 02 02 45 13 02 51 03 00 01 F3 A8 11 53 00 03 66 00E1BBFD 02 04 02 01 03 02 00 00 02 04 00 02 03 03 04 01 00E1BC0D C2 01 00 00 47
A bit of info on this "virtual architecture". The CPU can address 6 register (which i call r0-r5) whose value are stored by 0x90 in the memory between 0xde8736 and 0xde874a, the CPU also have a "zero flag" located at 0xde874a. The instruction of that virtual CPU are of variable length the first two byte indicate the operation the remaining byte are operand. Instruction are grouped in six groups. The first byte of the opcode indicate the group the second the instruction in the group. The third instruction of the first group (opcode 01 03) have a byte that specify the address type used by the instruction (similar to the modrm byte of x86). 0x90 maintain a table (M) for the groups and 6 table (G0-G5) for each of the instruction groups also use a table (F) for the different 0103 instruction addressing mode. These are the table used with their address inside 0x90:
M 00E1B991 A9 B9 E1 00 BD B9 E1 00 F5 B9 E1 00 25 BA E1 00 00E1B9A1 29 BA E1 00 35 BA E1 00
G0 00E1B9A9 FA A4 DE 00 6A C3 DE 00 B7 73 E1 00 C1 B0 E1 00 00E1B9B9 C1 4C E1 00
G1 00E1B9BD 7B 1A DF 00 2F 44 DF 00 D2 59 DF 00 EF 74 DF 00 00E1B9CD A6 81 E1 00 5B A0 E1 00 6D 22 E1 00 28 41 E1 00 00E1B9DD ED 16 E0 00 53 53 E0 00 48 B9 DF 00
G2 00E1B9F5 6F E1 DE 00 47 E9 DE 00 A3 DC DF 00 B5 9A E0 00 00E1BA05 BC D4 E0 00 77 14 E1 00 11 5B E1 00 2B B9 E1 00 00E1BA15 16 E8 E0 00 C9 C6 E0 00 E7 B5 E0 00 6A 3D E0 00
G3 00E1BA25 DC EF DF 00
G4 00E1BA29 FD 34 E0 00 89 79 E0 00 A7 F8 E0 00
G5 00E1BA35 04 36 E1 00 54 55 E1 00
F 00E1B9E9 CB 7F DF 00 48 90 DF 00 19 A3 DF 00
Analyzing each of the function addressed by the above table, always using the script to step through the junk code, I learned what each instruction was doing. Following is a table showing the all instruction used by the code above; some are absent since, surprisingly, unused even if their code is in 0x90.
|00 00 dw dw dw dw||push dwdwdwdw ^ 0x3719553F|
|00 01 dw dw dw dw||push dwdwdwdw+0xADD01337|
|00 02 rn||push rn xor 0x47|
|00 03 rn||pop rn xor 0x66|
|00 04 rn||add esp, rn xor 0x45|
|01 03 rn 02||xor [rn], dwdwdwdw|
|01 03 rn 01||xor word [rni], rn|
|01 03 rn 00||xor byte [rni], rn|
|01 04 rn dw dw dw dw||add rn-3, dwdwdwdw|
|01 05 rn dw dw dw dw||sub rn-2, dwdwdwdw|
|01 06 rn dw dw dw dw||and rn-5, dwdwdwdw|
|01 09 rn1 rn2||add rn1-1, rn2-3|
|01 0a rn1 rn2||cmp rn1-2, rn2-1|
|02 01 dw dw dw dw||apicall dwdwdwdw + 0xFEA731DE eax in r0|
|02 02 dw dw dw dw rn||mov rn, dwdwdwdw + 0xAEFDED04|
|02 03 rn||dec rn|
|02 04 rn||inc rn|
|02 05 rn dw dw dw||xor rn, rn|
|02 06 by wd wd||r0 = memchr( r0, by, wdwd )|
|02 07||int 3|
|02 08 rn1 rn2||mov rn2, dword [rn1]|
|02 09 rn||bswap rn|
|02 0a rn1 rn2||mov rn2, byte [rn1]|
|02 0b rn1 rn2||mov rn2, word [rn1]|
|03 00 dw dw dw dw||if stack1 == stack2 goto dwdwdwdw - 0x31337|
|04 00 dw dw dw dw||jmp dwdwdwdw + 0x0DEADEAD|
|04 01 dw dw dw dw||jne relative dw|
|04 02 dw dw dw dw||je relative dw|
|05 00 dw dw dw dw||call dw dw dw dw|
Using this knowledge it's easy now to understand what the code is trying to do, I've written a small "disassembler" for this CPU; it's output on the code above is:
0x00> mov r3, 0x20e 0x07> push 0xe1ba65 0x0d> pop r0 0x10> mov r2, 0x0 0x17> xor byte [r0], r2 0x1c> inc r0 0x1f> dec r3 0x22> jne 0x17 0x28> mov r1, 0x5cc80e31 0x2f> r0 = apicall GetCommandLineA 0x35> memchr( r0, 0x20, 0x255 ) 0x3a> je 0x132 0x40> mov r2, dword [r0] 0x44> add r2, 0x1d9bdc45 0x4b> add r1, 0x74519745 0x52> sub r2, 0xad45dfe2 0x59> add r1, 0xdeadbeef 0x60> add r2, 0x68656c6c 0x67> sub r1, 0x17854165 0x6e> sub r2, 0x41776169 0x75> add r1, 0x73686f77 0x7c> add r2, 0x69747320 0x83> sub r1, 0x206e6f20 0x8a> add r2, 0x64726976 0x91> add r1, 0x6d657263 0x98> sub r2, 0x6e757473 0x9f> sub r1, 0x79212121 0xa6> sub r2, 0x65683f21 0xad> and r2, 0xdfffffff 0xb4> push r2 0xb7> push r1 0xba> cmpandgoto 0xc9 0xc0> xor r3, r3 0xc3> je 0x132 0xc9> inc r0 0xcc> add r0, 0x2 0xd3> inc r0 0xd6> mov r1, word [r0] 0xda> push r1 0xdd> pop r2 0xe0> push r0 0xe3> push 0xd8360d 0xe9> pop r0 0xec> add r0, 0x98548 0xf3> dec r0 0xf6> xor word [r0], r2 0xfb> pop r0 0xfe> mov r2, byte [r0] 0x102> add r0, 0x2 0x109> mov r1, byte [r0] 0x10d> add r2, r1 0x111> call 0x149 0x117> dec r3 0x11a> push r3 0x11d> push 0xe1bc2a 0x123> r0 = apicall printf 0x129> add esp, 8 0x12c> jmp 0xe42a43ed 0x132> push 0x12e 0x138> push 0xe1bc13 0x13e> r0 = apicall printf 0x144> add esp, 8 0x147> exit 0x149> mov r1, 0x4c 0x150> inc r1 0x153> inc r1 0x156> add r1, 0x5 0x15d> dec r1 0x160> sub r1, 0x4 0x167> sub r2, 0x5a 0x16e> cmp r2, r1 0x172> jne 0x132 0x178> dec r0 0x17b> mov r2, byte [r0] 0x17f> add r0, 0x2 0x186> mov r1, byte [r0] 0x18a> add r2, r1 0x18e> inc r2 0x191> sub r2, 0x4e 0x198> push 0xe0dd64 0x19e> pop r0 0x1a1> add r0, 0xdeac 0x1a8> inc r0 0x1ab> xor byte [r0], r2 0x1b0> mov r3, 0x49 0x1b7> push 0xe1bc2a 0x1bd> pop r0 0x1c0> inc r2 0x1c3> xor byte [r0], r2 0x1c8> inc r0 0x1cb> dec r3 0x1ce> jne 0x1c3 0x1d4> ret
The above "disassembly" is fixed to show the correct API name; these name can be obtained by simply breakpoint on the apicall function (we know it's address from the tables). Saying the truth the above disassembly refer to the "final version" of the code in fact as you can see from the first few instructions the "virtual CPU" decode, with a xor, part of the instruction it's going to execute.
So what's the behavior of 0x90? It access the command line (via GetCommandLineA api) and find the first argument passed to the programm (not 100% true it find the first space and use what's after, it so fail if in the path of 0x90 are present space) it add/subtract some value from the first 4 character of the first argument and compare the value obtained with one calculated adding and subtracting values from a constant, if they are not equal it print the string "Please Authenticate!" and exit. So we know 0x90 is waiting some kind of string/password on the command line and we can calculate the first four character of that password inverting the sub/address operation made on the constant. I've written a small programm that do the above (a calculator is actually enough) the first 4 character of the password are '1D3N'. The "virtual CPU" then xor the word at 0xe1bb54 with the argument 5-6 character; since the word at 0xe1bb54 is part of the code that the virtual CPU will execute this operation prevent "correct" execution if the user supplied a wrong argument to 0x90. At moment we don't have enough information to calculate the value of the 5-6 character. It then add the 5 and 7 character then add the result with some constant and compare it against another constant, exit if the they are different. The execution continue adding the 6 and 8 character then the result is added with some constant and the resulting byte is put in xor with the last but one byte of the CODE table and, incremented by 1, with the byte between 0xe1bc2a-0xe1bc71.
Now we have enough information to infer the value of the 5-8 character of the password the program is expecting on the command line. One thing to note is that it never directly compare the password against know value and it use the character password to decode part of it's own data and code; this way it's impossibile to find the correct password in memory or patch the programm to continue the execution even if the password is wrong.
Since it xor the sum of the 6 and 8 character, added with some constant, with the last but one character of the CODE table and since the last instruction of that table must be a "ret" (as that part of code is reached using a call and it's the only way the programm can continue execution) we know the expected value of the sum. Similar consideration can be taken in account trying to determine the value of the 5 and 7 character. Small snippet of code available with this document show how to determine that value.
At the end we know that 0x90 is expecting as argument the string '1D3NEAcN'. Using that string the programm decode and print the string:
Welcome... Exploit for it doesn't matter 1.x Courtesy of Nicolas Brulez
After printing that string the program exit gracefully so at the end its only purpose was to print a string if supplied with the correct "password".
1. Identify and explain any techniques in the binary that protect it from being analyzed or reverse engineered.
The binary's PE header has invalid value in many fields (most notably the value of NumberOfRvaAndSizes and the PhysicalSize of the section named Nicolas) that confuse/crash many analysis tools. Junk code is used to hinder analysis and SEH are used to detect debugger presence and to make "not easy to step" code jump. The main code is encoded by many xor encryption layer. In the code immediate constant, addresses are often obfuscated to camouflage the valid code in the junk one and to prevent static reference analysis.
2. Something uncommon has been used to protect the code from beeing reverse engineered, can you identificate what it is and how it works?
The main code is actually an interpreter or a virtual CPU emulator; the behavior of the executable is controlled by a script/virtual CPU code. A, mostly complete, analysis of the virtual CPU instruction code/architecture is available above.
3. Provide a means to "quickly" analyse this uncommon feature.
A set of script for OllyDbg/OllyScript and a small disassembler for that virtual architecture were developed to quickly analyze 0x90 behaviors.
4. Which tools are the most suited for analysing such binaries, and why?
The presence of junk code and obfuscated constant/address make the use of static analysis tool like disassembler difficult. Live analysis tools like debugger, file/api monitor is best suited for such kind of binaries. Once reversed how the virtual CPU work, dedicated tools like disassembler or emulator can be used to simplify the analysis. However once "cleaned out" of the junk code, using some script/tool for example, even interactive disassembler (PVDasm, BDasm, Ida Pro) can be useful in the analysis work.
5. Identify the purpose (fictitious or not) of the binary.
Maybe some kind of password activated backdoor/tools. Actually it only print a string :)
6. What is the binary waiting from the user? Please detail how you found it.
The binary check for the presence of password passed as argument to the program. (there is however a bug on how it identify the argument) The password is '1D3NEAcN' it print a "greeting" string if the password is correct. It was found simply reading the disassembly of the virtual CPU code.
Bonus Question: What techniques or methods can you think of that would make the binary harder to reverse engineer?
Complexity is the only way to deter reverse engineering. One of the error that made the analysis easier was the easy identifiable structure of the junk code and the repetitivity of the "trick" used. This repetitivity made easy the making of tools to automatize the analysis. I don't believe that the technique used in 0x90 will become anywhere common as making such a binaries involve quite a in deep knowledge of assembler language and x86 architecture and common malware is written in much more high level languages.
I'm Nicola Cuomo a student of Computer Science (Informatica) at Universita' Federico II di Napoli. I'm currently, still -_-, working on my thesis about the formal analysis of security protocol. I'm reachable at ncuomo(at)studenti.unina.it and you can see my poor web page at http://studenti.unina.it/~ncuomo/.
If you have read this document please forgive my quasi-english.
Wim Koomen, Thijs Bosschert and Bas Bosschert who inspired, unknowingly, the style this document use :P
www.honeynet.org and Nicolas Brulez for the funny challenge
Oleh Yuschuk for OllyDbg and SHaG for OllyScript
When reading the scripts and tools source code below keep in mind that they are just quick hacks made during the analysis of 0x90.