In this analysis I've used the following tools, which are all free (as in beer) for trial purposes:
- IDA 4.2.1 evaluation version (wouldn't it be nice to own a fully working version *hint* ;)).
- VMware workstation to create a safe sandbox system in which I could observe the binary in the wild.
- GDB. Debuggers are always among the most powerful tools in reverse engineering; someone go and port SoftICE to Unix!
- Biew, to do some hex editing and make some changes to the-binary so that it is easier to debug.
- Strace, a very powerful reverse engineering tool that lets you view system calls as they occur.
- Ethereal, my favorite packet dumper.
- Sendip, a utility to send raw IP packets from the command line.
Please note that the tool fenris also looks very promising and interesting and might have made my work a lot
easier but I did not know of its existence when I did my analysis.
The webpage reads:
This is an un-trusted tool developed and used by the blackhat community, do not use a production system to analyze it, nor any system with a connection to a production network.
I took this warning seriously, and because I did not have a spare computer lying around to do my analysis on, I
used VMware to solve this problem. After getting the trial version and trial key I installed a clean Debian system
on it, fetched the-binary, checked its MD5, and installed the tools I wanted to run on the system (gdb, strace, biew, etc.) using apt-get.
After the installation was done I shut down the system and configured it to use a 'host-only' network connection, so that it could not reach the internet or my local network
through my system. I could now run ethereal on the host system to capture packets sent out by the infected OS, and use ssh
on the host system to log into the infected system.
Starting the analysis
Now we have a nice sandbox where we can play with the-binary all we want and it can't do any harm to our system or network. Let's use strace to find out what the-binary really does.
I fired up the-binary using strace -f -i (follow forks and print eip). We see that it calls
geteuid (to see if it's running as root), then forks twice in rapid succession (probably to fool some debuggers),
then chdirs to / and closes all its file handles. So far nothing really interesting. The interesting part
starts when it creates a RAW socket for IP protocol 11. According to protocol lists this is
NVP, the network voice protocol, a protocol that is ignored by default by most IDS systems.
After that the program hangs on a recv on this socket. In other words, it is waiting for traffic using ip protocol 11. So let's
feed it some: I used the tool sendip to send it a string of 100 A's. This simply puts it back at the recv
call to receive more data. It's now time to use the most powerful tool in reverse engineering history: IDA.
Of course our hacker applied some tricks so that IDA no longer recognizes the names of the libc functions,
but an experienced reverse engineer can easily re-identify most of them (when I was done I saw there was even a signature
file released for this version of libc that can identify all libc functions in the-binary).
When we apply this method to all other functions called so far we get a better view of the binary. Let's look at
the interesting part:
.text:080482B0 push 0
.text:080482B2 push 800h
.text:080482B7 lea eax, [ebp+var_800]
.text:080482BD push eax
.text:080482BE mov ecx, [ebp+var_44C8]
.text:080482C4 push ecx
.text:080482C5 call recv ; receive command
.text:080482CA mov esi, eax
.text:080482CC add esp, 10h
.text:080482CF mov edx, [ebp+var_44D0]
.text:080482D5 cmp byte ptr [edx+9], 0Bh ; check if it is protocol 0xb
.text:080482D9 jnz loc_8048EB8 ; if not receive again
.text:080482DF mov ecx, [ebp+var_44D4]
.text:080482E5 cmp byte ptr [ecx], 2 ; check if first byte is 0x02
.text:080482E8 jnz loc_8048EB8 ; receive again
.text:080482EE cmp esi, 0C8h ; check if length is greater than 0xC8 (200) bytes
.text:080482F4 jle loc_8048EB8 ; receive again
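The three checks in this listing can be restated as a small predicate, which is handy for testing candidate packets before sending them. This is only a sketch of my reading of the listing, not code taken from the-binary. It assumes that the buffer returned by recv starts with the IP header (as it does for raw sockets on Linux), that var_44D0 points at that header, and that var_44D4 points just past a 20-byte header. The names passes_filter and fake_packet are made up for this example.

```python
IP_HEADER_LEN = 20        # assumption: no IP options, payload starts at byte 20
IP_PROTO_NVP = 0x0B       # value compared against byte 9 of the IP header
MIN_LEN_EXCLUSIVE = 0xC8  # cmp esi, 0C8h / jle: received length must exceed 200


def passes_filter(packet: bytes) -> bool:
    """Return True if a raw IP packet would get past all three checks.

    The length is tested first here only to avoid indexing past the end;
    the binary tests protocol, first byte and length in that order.
    """
    if len(packet) <= MIN_LEN_EXCLUSIVE:
        return False
    if packet[9] != IP_PROTO_NVP:
        return False
    return packet[IP_HEADER_LEN] == 0x02


def fake_packet(payload: bytes, proto: int = IP_PROTO_NVP) -> bytes:
    """Build a dummy 20-byte IPv4 header plus payload, for offline testing only."""
    header = bytearray(IP_HEADER_LEN)
    header[0] = 0x45      # version 4, header length 5 words
    header[9] = proto
    return bytes(header) + payload
```

Under these assumptions the 100 A's from the earlier test fail on both the first byte and the length, while a payload of 0x02 repeated 300 times passes all three checks.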
We can conclude the message should start with a 0x02 and has to be more than 200 bytes long, otherwise it is
simply ignored. So let's fire up sendip again and send it a message of over 200 bytes starting with 02. I sent
it 02 300 times now. Still no luck: strace shows it simply going back into the receive loop again. Let's see what
the-binary does with our received command after the first 3 checks.
The decryption routine
We see it feeds our message to a sub at 804A1E8. This turns out to be a decryption function that is
in essence very simple, yet programmed in a very stupid way.
If you are interested in an analysis of decryption routines, read one of the hundreds of essays on key generators, which are very similar to encryption/decryption routines.
The algorithm used can be expressed using the following piece of perl:
$dec .= chr((ord(substr($enc,$x+1,1))-ord(substr($enc,$x,1))-0x17) % 256);
In other words: d[x] = (e[x+1]-e[x]-0x17) mod 256, where e is the encrypted string and d is the decrypted
string. After the decryption it checks the second byte of the decrypted string and executes a different piece
of code depending on its value. There are 12 different types of packets, which we will analyze one by one below.
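For experimenting with your own packets it helps to have the formula and its inverse in one place. The sketch below implements d[x] = (e[x+1]-e[x]-0x17) mod 256 exactly as written above, plus a matching encoder. The encoder is my own addition: the formula leaves the first encrypted byte e[0] free, so it is taken as a parameter here, and the names decode and encode are made up for this example.

```python
KEY = 0x17


def decode(enc: bytes) -> bytes:
    """Apply d[x] = (e[x+1] - e[x] - 0x17) mod 256 for x = 0 .. len(enc) - 2."""
    return bytes((enc[x + 1] - enc[x] - KEY) % 256 for x in range(len(enc) - 1))


def encode(dec: bytes, first: int = 0) -> bytes:
    """Inverse of decode().

    `first` is e[0]; the formula does not constrain it, so any value works
    (assumption: the-binary does not check it separately).
    """
    enc = bytearray([first % 256])
    for d in dec:
        # Solve the decode formula for e[x+1].
        enc.append((d + enc[-1] + KEY) % 256)
    return bytes(enc)
```

Note that the encoded string is one byte longer than the plaintext, because every decoded byte needs its predecessor in the encrypted string.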
Packet type 0
When we send a packet of type 0 to the-binary and look at the strace output we can see that it sends some data over the network
to 0.0.0.0 and then goes back into the receive loop to receive its next command. We can have a look at the data it sends by using
strace -xx -s 50000 -f which will cause strace to output the data the-binary sends in hexadecimal form. When we decode this data using the same algorithm the
program uses to decode its commands we see the decoded packet is the entire encrypted packet we have sent it, with some 0's and
garbage appended. So packet type 0 appears to be a ping packet that lets the blackhat see whether the-binary
is running. The destination address 0.0.0.0 can be changed as we will see later on (when we analyze packet type 1).
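Copying the hexadecimal strings out of the strace output by hand gets old quickly. A minimal sketch for turning such a string into raw bytes is shown below; it assumes the data appears as a run of \xNN escapes, which is how strace -xx prints strings, and the name bytes_from_strace is made up for this example.

```python
import re

# Matches one escaped byte as printed by strace -xx: a backslash, 'x', two hex digits.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def bytes_from_strace(quoted: str) -> bytes:
    """Collect every hex escape in a string copied from strace -xx output.

    Anything that is not a hex escape (quotes, trailing dots) is ignored.
    """
    return bytes(int(pair, 16) for pair in _HEX_ESCAPE.findall(quoted))
```

The resulting bytes can then be run through the same decoding formula the program uses for its commands.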
Packet type 1
In IDA we can see that packet type 1 gets a byte and then 4 bytes from the beginning of the message and writes these
values to internal variables in the program. (i.e. it sets some internal variables according to the received data). The first
thing that came to my mind was that these could be the 4 bytes of the ip address. A simple test can prove this:
send ABCDEFGHIJKM followed by a lot of 0's in a packet type 1 and then send a type 0 packet. In strace we now see it
sends the packet to 10 different hosts, one of them being 66.67.68.69 (BCDE). When we review the rest of the sub in
IDA we derive the following piece of pseudo code (when we call the first received byte decoymode):
check whether decoymode == 2; if so, add 10 addresses from the message, else add 10 random ip addresses and put the
first ip in the message at a random position among these 10. These decoy ips are used to hide the real source ip of the
attacker among other ip addresses.
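The pseudo code above can be sketched in Python as follows. This is my reconstruction, not a translation of the disassembly: it assumes the mode byte is the first byte of the message, that addresses follow it as packed 4-byte values back to back, and that the message is long enough to hold them. The name destinations and the rng parameter are made up for this example.

```python
import random
import socket
from typing import List

DECOY_SLOTS = 10


def destinations(message: bytes, rng: random.Random) -> List[str]:
    """Return the 10 addresses replies would be sent to, as dotted quads."""
    decoymode = message[0]
    if decoymode == 2:
        # Assumption: ten addresses packed back to back after the mode byte.
        raw = [message[1 + 4 * i:5 + 4 * i] for i in range(DECOY_SLOTS)]
    else:
        # Ten random addresses, one of which is replaced by the real one.
        raw = [bytes(rng.randrange(256) for _ in range(4))
               for _ in range(DECOY_SLOTS)]
        raw[rng.randrange(DECOY_SLOTS)] = message[1:5]
    return [socket.inet_ntoa(addr) for addr in raw]
```

Passing the random generator in explicitly keeps the sketch repeatable, which is useful when comparing its output against what strace shows.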
Packet type 2
Using the string references that IDA generates we can easily see the purpose of this type of packet: it executes
a command, puts its stdout and stderr in a file called /tmp/.hj237349, then reads that file, encrypts its contents,
sends them back to the ip set using packet 1 and then removes the temporary file. Please note that the packet
contains a lot of garbage data at the end that, unfortunately for the attacker, also contains unencrypted data, so
his packets still look suspicious.
Packet type 3
What we see here is something we will see again in other subs: it first compares a value with 0. If it is not 0
it jumps back into the receive loop; if it is zero it forks off and puts the pid of the child in that variable.
(We will later see this pid is used in packet 7 to stop forked child processes.) After forking off it calls
a sub with parameters from the received command. The sub uses either the first 4 bytes of the message as an ip address
or the last part of the message as a hostname for the target (the spoofed source of the attack); a boolean indicates
whether to use the ip or the hostname. The sub does a domain lookup on a part of the message; if it fails it sleeps for 10 minutes and then tries
again. If it is successful it starts spitting out a lot of packets. Ethereal can capture these packets
and decodes them as dns lookups for .com, .net etc. (a request that takes a packet of about 60 bytes to send and gets
back a lot more). The source port is also configurable from the command. These kinds of attacks are documented in this CERT Incident Note and
reveal the-binary's real purpose: DDoS attacks.
Packet type 4
In IDA we see that this packet is handled in a similar way as packet type 3. A lot of bytes from the message are
put on the stack together with a pointer to the rest of the message, and then a sub is called which does a hostname lookup and then
starts creating packets. The difference is the type of attack: here it sends out fragmented udp packets
with a spoofed source address (this address is in the received message).
Packet type 5
In strace we see that when a packet type 5 is received it opens up a listening socket at port 23281 and waits for someone
to connect to it. When we use netcat to connect to this port, we see it reads our input, sends back some garbage,
exits and listens again. Let's have a look in IDA at what it is expecting us to send. In IDA we see again one of
the hacker's brilliant encryption schemes at work (actually this is a scheme invented by Caesar).
It takes the input, replaces every letter by the next letter in the alphabet and then checks if the first 6 letters
equal TfOjG (so the string to send is: SeNiF). When we send SeNiF to port 23281 it binds a root shell to the port, which
we can then use over our netcat connection. If we send something else it sends 0xFF 0xFB 0x01 0x00.
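Working back from the compared string to the password is a one-letter shift in the other direction. A small sketch of that step is below. It assumes the shift wraps around within the alphabet (z to a), which the five letters involved here do not actually exercise, so the-binary may well behave differently at the edges. The name shift_letters is made up for this example.

```python
EXPECTED = "TfOjG"  # string the shifted input is compared against


def shift_letters(text: str, step: int) -> str:
    """Shift ASCII letters by `step` positions within their own case.

    Wraparound at the end of the alphabet is an assumption; other
    characters are passed through unchanged.
    """
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + step) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


# Undo the "next letter" substitution to recover what has to be sent.
password = shift_letters(EXPECTED, -1)
```

Shifting the recovered password forward by one again gives back the compared string, which is a quick sanity check on the direction of the shift.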
Packet type 6
This packet does the same thing as packet type 2, only it does not send the output back. So it is used to execute
a single command. This could be used for example to install a new version remotely.
Packet type 7
This mode reads the pid stored in the internal variable set by the DoS and root shell modes and then calls the kill
command, to stop the attacks.
Packet type 8
This packet is very similar to 3; the only difference is that it passes slightly different parameters to the DNS flood
subroutine. This function seems to make packet 3 redundant, as this packet can do everything packet type 3 can do as well.
Packet type 9
Also similar to number 3, only it starts a different type of flood, this time a SYN flood, as described in
this CERT advisory.
Packet type A
Similar to 9, only passes slightly different parameters on the stack.
Packet type B
Again the same as 3.
Why some attack modes have multiple commands
One might wonder why, for example, the DNS DoS attack can be called using different commands. The answer appears to be
that these modes only differ in the speed of the attack (mode 3 uses a fixed speed and mode 8 allows the speed to be set). The speed
is expressed in units of 100 requests per second, so speed 6 means 600 requests per second.
The-binary is a tool that can be used by the attacker to take over your system and execute a few different DoS attacks.
While some of the code in the-binary is well written, a lot of code is not. For example, the implementation of the decryption
routine, which reads the string backwards and uses a lot of printf calls, is inefficient to say the least. In other
words, a tool to be aware of, but nothing really new here. Tools like stacheldraht and tfn2k have been around for a pretty
long time and contain almost identical functionality.
General reverse engineering tricks
Now that the-binary is analyzed, let's discuss some of the tricks I used when creating this analysis.
Saving your work in IDA evaluation
The evaluation version of IDA doesn't allow saving, but does allow you to dump its database to an IDC file (using the
file produce command). Using this method you can simply save all your comments etc. in these IDC files and then load them again.
This is equivalent to saving, only a bit slower (but if you have a very fast machine it is not that annoying).
Altering the program to allow easier debugging
Often when analyzing a program there are some inconvenient things. For example, in the-binary, when I wanted to analyze
some of the code in packet type 3 using gdb I could not attach to it as it forked. There is a simple way around this:
alter the binary so that it does not fork here. Simply overwriting the call to fork with nops allows easier debugging.
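I did this patch by hand in a hex editor, but the same edit is easy to script. The sketch below assumes the call is the common 5-byte form (opcode 0xE8 followed by a 32-bit relative displacement) and that you already know its offset within the file, which is not the same number as the virtual address IDA shows. The name nop_out_call is made up for this example.

```python
NOP = 0x90
CALL_REL32 = 0xE8
CALL_LEN = 5  # one opcode byte plus a 32-bit relative displacement


def nop_out_call(image: bytes, offset: int) -> bytes:
    """Return a copy of `image` with the call at file offset `offset` replaced by nops.

    Refuses to patch if the byte at `offset` is not a call rel32 opcode, as a
    cheap guard against using a wrong offset.
    """
    if image[offset] != CALL_REL32:
        raise ValueError("no call rel32 opcode at offset %#x" % offset)
    patched = bytearray(image)
    patched[offset:offset + CALL_LEN] = bytes([NOP]) * CALL_LEN
    return bytes(patched)
```

Keep in mind that once the call is gone, whatever happens to be in eax is treated as fork's return value by the code that follows, so check in the disassembly which branch will be taken.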
Becoming a reverse engineer takes some steps:
1. Be a master in assembly, for example by reading The Art of Assembly.
2. Start at the basics, by doing some crackmes and reversemes on the net.
3. Read a lot, for example on fravia's old pages or search for more to read using fravia's new pages.
Experience is also very important, and you have a zillion programs to practice on. A very good method of practice is to alter existing binaries to do something new. For example,
you can read this tutorial about adding (useless) support for encrypted mp3's to the winamp binary. If you practice things like this,
in no time you will be an expert reverse engineer!