Table of Contents


1. Decompiling the Executable
1.1 Tools required
1.2 Initial Inspection
1.2.1 Determining the Linux Distribution used for compilation
1.3 Reconstructing the Symbol Table
1.4 Disassembling.
1.4.1 Tidying up the disassembly listing.
1.5 Decompilation / Analysis
1.5.1 Partial analysis for when speed is important.
1.5.2 Methods undertaken while decompiling the binary.
1.5.3 Determining compilation optimisation options.
1.5.4 Hand decompilation
1.5.5 Decompiling the DOS attacks.

2. Analysing the decompiled source code.
2.1 Program startup
2.2 Features
2.2.1 Fast actions
2.2.2 Slow actions
2.3 Network protocols.
2.3.1 Packets from Agent -> Handler
2.3.2 Packets from Handler -> Agent
2.4 Fingerprints
2.5 Weaknesses

3. Summary


Appendix A
Appendix B


This is an analysis of an unknown binary, for the HoneyNet Reverse Challenge. This is a technical analysis, and assumes the reader has prior programming experience, but does not necessarily have experience in the field of reverse engineering.

This analysis is structured in two main sections.

The first section details the chronological process of analysing the unknown executable 'the-binary'. The emphasis is on the process of analysis, rather than the results. This is to assist newcomers to the field of reverse engineering. As a result, this analysis spends more time on basics and low level details than it otherwise might, if aimed at a more experienced audience. The section finishes with decompiled source code of the executable.

The second section analyses the decompiled executable and describes what the executable does.

1. Decompiling the Executable

The process of reverse engineering is a lot like tracking. There are so many fascinating questions just waiting to be answered. Many of these questions may seem unrelated to the final goal (be it completely understanding the reversed program, or finding the animal still in its footsteps). However, following a gut feeling and taking the time to answer one of these unrelated questions often brings you closer to your goal. In this analysis questions are provided which may seem irrelevant, but may just bring that final goal one step closer.

1.1 Tools required:

Standard unix development environment, including the tools used throughout this analysis (file, strings, objdump, ar, grep and gcc).

Patience and perseverance are useful, but not required.

1.2 Initial Inspection

Upon receipt of the unknown executable, the first step is not to execute it. The binary could conceivably do anything, from exploiting an unknown kernel bug and rm -rf /'ing the system, to sending a rude email message to your boss.

What does the standard unix file(1) command tell us?

   $ file the-binary
   the-binary: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, stripped
The executable is an elf binary for an 80386 class pc. This means that the operating system is either linux or *bsd for 386.
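The fields that file(1) reports can also be read directly from the first bytes of the file. The following is a minimal sketch, not part of the original toolset: the function name is_elf32_i386_exec() is hypothetical, and the offsets and constants are those of the standard ELF header (EI_CLASS at byte 4, EI_DATA at byte 5, e_type at byte 16, e_machine at byte 18).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if buf starts with the header of a 32-bit, little-endian
 * ELF executable for the Intel 80386, otherwise 0.
 * (Hypothetical helper, written for illustration only.) */
int is_elf32_i386_exec(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 20)
        return 0;                       /* too short to hold the fields */
    if (memcmp(buf, "\x7f" "ELF", 4) != 0)
        return 0;                       /* missing ELF magic number */
    if (buf[4] != 1)
        return 0;                       /* EI_CLASS: 1 = 32-bit */
    if (buf[5] != 1)
        return 0;                       /* EI_DATA: 1 = LSB (little-endian) */

    /* Both fields are stored little-endian in an LSB file. */
    uint16_t e_type    = (uint16_t)(buf[16] | (buf[17] << 8));
    uint16_t e_machine = (uint16_t)(buf[18] | (buf[19] << 8));

    return e_type == 2 && e_machine == 3;   /* ET_EXEC, EM_386 */
}
```

Reading the first 20 bytes of the-binary into a buffer and passing them to this function should agree with the output of file(1) shown above.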

What does a strings dump show?

   $ strings the-binary | more
Searching through the output yields this highly useful line.
   @(#) The Linux C library 5.3.12
So the binary was written in C, for the linux operating system. The version of the C library used to compile was 5.3.12. This is quite an old version of the linux C library. Current versions are at glibc2.0 (aka libc6). This raises a question: why such an old version of the C library? Possibilities include: a deliberate choice to keep the file size down due to the static linking, or the binary was written and compiled a long time ago.

Performing a web search, we find that this version of the C library was included in the Slackware 3.1 and RedHat 4.x linux distributions. These were released in 1996 and 1997. It seems safe to say that the executable was compiled on one of these linux distributions. The question is, which one? Having had experience with both Slackware 3.1 and RedHat 4.0 (they can be found on the InfoMagic LINUX Developer's Resource Dec 1996 edition) and being a programmer, slackware feels like a more "programmer friendly" environment. Instinct says that the program was written and compiled on slackware, but how to be sure?

1.2.1 Determining the Linux Distribution used for compilation

The binary is statically linked. This means that the executable contains not just user written code, but also object code from the libc library. As we have the libc libraries for Slackware and Redhat plus the unknown executable, it seems that a type of grep could determine which object files are found inside the executable. The distribution with the matching object files would be the distribution the binary was compiled on.
      Short notes on the elf file format.  For full details see [elf].  The elf
      object file format contains different types of related data in separate
      sections.  The sections we are concerned with are:
      * .text   - contains executable code.
      * .rodata - contains unmodifiable data e.g. string constants
      * .data   - contains initialised modifiable data e.g. initialised static arrays
      * .bss    - contains uninitialised modifiable data e.g. uninitialised static
                  arrays

      The different sections can be seen with the command:
           objdump -h <file>

      Code can be either position independent (e.g. jump forward 15 bytes), or
      relocatable (e.g. call this function, the address of which will not be known
      till after loading).  Position independent code will be identical in the
      original object file and the linked executable.  Relocatable code will be
      different, as the final addresses are filled in during linking.

      Relocatable code offsets can be seen with this command:
           objdump -r <file>

To determine if a given object file was linked into the statically linked binary, we search for matching sections of position independent code. This is achieved by converting all relocatable code into a wildcard, and then performing a linear search. An example of this is illustrated below.

         The unknown executable is represented by ABCDEFGHIJKLMNOP

         Object file 1 is represented by IJ*LM
         Object file 2 is represented by BCD
         Object file 3 is represented by IJ*L
         Object file 4 is represented by E*GH
         Object file 5 is represented by AEPK
         ('*' represents a wildcard)

         After searching for matches:

         conflict            IJ*LM       at offset 9
         match        BCD                at offset 2
         conflict            IJ*L        at offset 9
         match           E*GH            at offset 5
Both object files 2 and 4 were found in the executable. Object files 1 and 3 match, but conflict with each other (as they can't both exist in the executable without overlapping). Manual intervention is required to resolve the conflict between object files 1 and 3.

The results of this example are 2 definite matches, and 1 unresolved.
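The matching step in the example above can be sketched in C. This is a minimal illustration and not the source of search_static: the names find_object() and ranges_overlap() are hypothetical, and a literal '*' character is used as the wildcard only to mirror the example. A real tool working on object code would need a separate mask, since every byte value can occur in code.

```c
#include <assert.h>
#include <stddef.h>

#define WILDCARD '*'   /* stands for a relocated byte in this toy example */

/* Returns the 1-based offset of the first place where pattern matches
 * inside image, or 0 if it does not match anywhere.  A WILDCARD byte
 * in the pattern matches any byte of the image. */
size_t find_object(const char *image, size_t image_len,
                   const char *pattern, size_t pattern_len)
{
    assert(image != NULL && pattern != NULL);
    if (pattern_len == 0 || pattern_len > image_len)
        return 0;

    for (size_t start = 0; start + pattern_len <= image_len; start++) {
        size_t i = 0;
        while (i < pattern_len &&
               (pattern[i] == WILDCARD || pattern[i] == image[start + i]))
            i++;
        if (i == pattern_len)
            return start + 1;          /* linear search found a match */
    }
    return 0;
}

/* Two matches conflict when the byte ranges they claim overlap. */
int ranges_overlap(size_t off_a, size_t len_a, size_t off_b, size_t len_b)
{
    return off_a < off_b + len_b && off_b < off_a + len_a;
}
```

Run against the example, object files 2 and 4 are found at offsets 2 and 5, object files 1 and 3 are both found at offset 9 and overlap, and object file 5 is not found.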

The libc from Slackware 3.1, and from RedHat4.x (plus all updates) were obtained (these can be found by doing a web search). Each version of libc.a was placed in its own directory and unpacked with ar x libc.a.

Writing a program to perform the matching and executing it results in:

   $ bin/search_static the-binary slackware3.1 > slackware3.1.out
   $ bin/search_static the-binary rh_5.3.12-8 > rh_5.3.12-8.out
   $ bin/search_static the-binary rh_5.3.12-17 > rh_5.3.12-17.out
   $ bin/search_static the-binary rh_5.3.12-18.2 > rh_5.3.12-18.2.out
   $ bin/search_static the-binary rh_5.3.12-18.5 > rh_5.3.12-18.5.out
   Distribution          Definite matches   Unresolved matches
   Slackware 3.1         169                5
   RedHat 5.3.12-8       149                9
   RedHat 5.3.12-17      105                10
   RedHat 5.3.12-18.2    27                 0
   RedHat 5.3.12-18.5    27                 0

Table 1. Summary of searching for libc matches

This shows that slackware 3.1 was used to compile the executable. (Most matches wins).

Why are there differences in the results when all 5 versions are the same C library? The code differs slightly because different patches were applied, and because different compile options (or different versions of the compiler) were used.

Which compiler was used to compile the executable? Slackware 3.1 shipped with gcc 2.7.2. Was this the compiler used? Examining the comment section of the executable implies yes.

   $ objdump -j .comment -s the-binary | head

   the-binary:     file format elf32-i386

   Contents of section .comment:
    0000 00474343 3a202847 4e552920 322e372e  .GCC: (GNU) 2.7.
    0010 322e6c2e 32000047 43433a20 28474e55  2.l.2..GCC: (GNU
    0020 2920322e 372e3200 00474343 3a202847  ) 2.7.2..GCC: (G
    0030 4e552920 322e372e 322e6c2e 32000047  NU) 2.7.2.l.2..G
    0040 43433a20 28474e55 2920322e 372e322e  CC: (GNU) 2.7.2.
    0050 6c2e3200 00474343 3a202847 4e552920  l.2..GCC: (GNU) 

1.3 Reconstructing the Symbol Table

The executable has been stripped. This means there is no symbol table. So a statement such as:
   call   0x080571e8
could be a call to a library function, or it could be a call to user written code. Without a symbol table, this is unknown.

But, the original object files contain symbol tables. As the object files used in the linking of the executable are already known from the diversion into finding the compilation distribution, perhaps the symbol table for the stripped executable can be reconstructed.

As an example: Say foo.o was found to exist in the executable at offset 0x8001000. The known symbol table of foo.o states that there are two functions inside foo.o with the following offsets:

   0x000     bar
   0x120     baz
This allows us to reconstruct symbol table entries for bar and baz, by adding the offsets, which yields:
   0x8001000 bar
   0x8001120 baz
Repeating this for all object files found inside the executable will result in a partially reconstructed symbol table.
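The address arithmetic is simple enough to sketch. The following is a minimal illustration and not the source of gensymbols: struct obj_symbol and relocate_symbol() are hypothetical names, and the output line format simply mirrors the listing above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One entry from the symbol table of an object file (hypothetical type). */
struct obj_symbol {
    uint32_t    offset;   /* offset of the symbol inside the object file */
    const char *name;
};

/* Given the address at which the object file was found in the
 * executable, computes the absolute address of one of its symbols and
 * writes a "0x<address> <name>" line into line. */
uint32_t relocate_symbol(uint32_t base, const struct obj_symbol *sym,
                         char *line, size_t line_len)
{
    assert(sym != NULL && line != NULL);
    uint32_t addr = base + sym->offset;
    snprintf(line, line_len, "0x%lx %s", (unsigned long)addr, sym->name);
    return addr;
}
```

With a base of 0x8001000, the entries bar (offset 0x000) and baz (offset 0x120) come out at 0x8001000 and 0x8001120, as in the example.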

Note: the compiler and libc versions used have been identified. As well as searching libc.a for object files, libgcc.a is also searched, as this library is included by gcc when compiling.

Manual resolution of conflicts is discussed in Appendix A.

   $ bin/search_static the-binary slackware3.1 > object_files
   $ vi object_files

   $ bin/gensymbols object_files > symbols
Now we have a partial symbol table for the-binary.

1.4 Disassembling.

The objdump program can create a disassembly listing for an executable. Object code disassembly is not an easy task because code and data can be interspersed (a result of the von Neumann architecture). This is frequently the case with indexed jumps (e.g. switch statements). The objdump disassembler will try to interpret the indexed jump offsets as code (instead of data), and the resulting disassembled code will look rather confusing. A program was written which uses objdump as the disassembler and then cleans up the indexed jumps.

Running this program yields a disassembly output which we will use to decompile the-binary.

Note: only the .text section is disassembled. The .init and .fini sections contain compiler specific startup and shutdown code, not user code.

   $ bin/gendump the-binary > dump1
This yields a rather unwieldy 43000 line listing, containing user code as well as statically linked object code. As the size and offsets of the libc5 object code have been determined, this object code can be stripped from the disassembly.
   $ bin/decomp_strip object_files < dump1 > dump2
This yields a much more manageable 2500 line listing.

Now we need to insert the symbol table entries we previously extracted.

   $ bin/decomp_insert_symbols symbols < dump2 > dump3
Now library function calls are readable.
     call   0x080571e8
Is replaced with:
     call   0x080571e8 <__libc_fork>
It follows that all calls to functions which are not labeled, are either gcc startup/shutdown code, user functions, or functions from other statically linked libraries. Searching for these functions:
   $ grep 'call   0x' dump3 | grep -v '<' | cut -c18- | sort | uniq
This yields only 11 unknown called functions.

How to determine the gcc startup and shutdown code? The compiler and C library used to compile the executable are known, so we use that same compiler to compile a dummy program. The dummy program is then disassembled and commonality between the two disassemblies will be the gcc startup and shutdown code.

Note: A minimal installation of slackware3.1 was set up in a chroot jail. The software installed into the jail was gcc and the libc development libraries.

   $ echo 'int main(void) { return(0); }' > dummy.c ; gcc -o dummy dummy.c
   $ bin/gendump dummy > dummy.dump
Comparing the two disassembly listings we can establish labels for the following functions.
   0x08048080 is common gcc startup code
   0x080675a8 is __do_global_dtors_aux but is not called from the .text section
   0x08048134 is the main function
We now add the following symbol table entries to symbols:
   0x08048134 main
   0x08048ecc func1
   0x08048f94 func2
   0x08049138 func3
   0x08049174 func4
   0x08049564 func5
   0x080499f4 func6
   0x08049d40 func7
   0x0804a194 func8
   0x0804a1e8 func9
   0x080675a8 __do_global_dtors_aux

   $ cp symbols symbols.modified
   $ vi symbols.modified
   $ bin/decomp_insert_symbols symbols.modified < dump3 > dump4
Now the disassembly listing has all symbols inserted.

1.4.1 Tidying up the disassembly listing.

A brief examination of the listing shows lines like the following.
   lea    0xfffff800(%ebp),%edx
It would be more pleasing to have them of the form:
   lea    -0x800(%ebp),%edx
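The conversion is a two's complement reinterpretation of the 32-bit displacement. The following is a minimal sketch of that step, not the source of decomp_fixup_signs; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Reinterprets a 32-bit displacement as a signed value without relying
 * on implementation-defined casts, e.g. 0xfffff800 becomes -0x800. */
int32_t signed_displacement(uint32_t disp)
{
    if (disp & 0x80000000u)
        return -(int32_t)(~disp) - 1;
    return (int32_t)disp;
}

/* Formats the displacement the way the tidied listing shows it.
 * Returns the number of characters snprintf() reports. */
int format_displacement(uint32_t disp, char *out, size_t out_len)
{
    assert(out != NULL && out_len > 0);
    if (signed_displacement(disp) < 0)
        return snprintf(out, out_len, "-0x%lx",
                        (unsigned long)(~disp + 1u));   /* magnitude */
    return snprintf(out, out_len, "0x%lx", (unsigned long)disp);
}
```

Positive displacements such as 0x17 pass through unchanged, while 0xfffffff4 is printed as -0xc.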

When the code references data in the .rodata and .data sections, it would be nice to cross reference this in the disassembly output. So:
   080481aa: mov    0x80675d8,%eax
would be displayed as:
   # Possible reference to rodata '[mingetty]'
   080481aa: mov    0x80675d8,%eax

A visual display of the end point of a jump instruction would be nice.
   0804a1be: mov    $0x1,%ecx
   0804a1c3: cmp    %edi,%ecx
   0804a1c5: je     0x0804a1dd
   0804a1c7: nop
   0804a1c8: movzbl 0xffffffff(%ebx,%ecx,1),%edx
   0804a1cd: movzbl (%ecx,%esi,1),%eax
   0804a1d1: lea    0x17(%edx,%eax,1),%eax
   0804a1d5: mov    %al,(%ecx,%ebx,1)
   0804a1d8: inc    %ecx
   0804a1d9: cmp    %edi,%ecx
   0804a1db: jne    0x0804a1c8
   0804a1dd: lea    0xfffffff4(%ebp),%esp
The above code would be easier to examine at a glance when displayed as:
   0804a1be: mov    $0x1,%ecx
   0804a1c3: cmp    %edi,%ecx
   0804a1c5: je     0x0804a1dd                     *
   0804a1c7: nop                                   |
   0804a1c8: movzbl 0xffffffff(%ebx,%ecx,1),%edx  *|
   0804a1cd: movzbl (%ecx,%esi,1),%eax            ||
   0804a1d1: lea    0x17(%edx,%eax,1),%eax        ||
   0804a1d5: mov    %al,(%ecx,%ebx,1)             ||
   0804a1d8: inc    %ecx                          ||
   0804a1d9: cmp    %edi,%ecx                     ||
   0804a1db: jne    0x0804a1c8                    *|
   0804a1dd: lea    0xfffffff4(%ebp),%esp          *
Performing these cleanups:
   $ bin/decomp_fixup_signs < dump4 > tmp1
   $ bin/decomp_xref_data the-binary < tmp1 > tmp2
   $ bin/decomp_xref_jumps < tmp2 > dump5

1.5 Decompilation / Analysis

With only 2500 lines of assembly code, hand decompilation is feasible. If the program were much larger, a combination of blackbox testing, tracing in a debugger and some hand decompilation would be used.

A complete understanding of the 80386 instruction set is not needed to perform hand decompilation. A solid understanding of the various addressing modes is a requirement. An estimated 80% decompilation can be achieved knowing only the following instructions:

For those not familiar with hand decompiling, see Appendix B which contains tips for those new to this area.

1.5.1 Partial analysis for when speed is important.

This section provides a walkthrough of an analysis for when speed is important. This may be the case if your system is currently under attack, and you were lucky enough to obtain a copy of the executable.

We will start with the disassembly of the executable obtained so far. (Like all good system administrators, we are prepared for attack and have tools in place to obtain a disassembly with symbol table intact).

For a quick overview of the program, we take a look at all library functions called by the program

   $ cat dump4 | grep call | cut -c29- | sort | uniq -c
         1	<_IO_fclose>
         1	<_IO_fopen>
         1	<_IO_fread>
        11	<_IO_sprintf>
         1	<__libc_chdir>
         6	<__libc_close>
         3	<__libc_dup2>
        14	<__libc_fork>
         2	<__libc_free>
         1	<__libc_geteuid>
         1	<__libc_init>
         3	<__libc_kill>
         1	<__libc_malloc>
         4	<__libc_setsid>
         3	<__libc_time>
         1	<__libc_unlink>
        23	<__random>
         1	<__setfpucw>
         3	<__srandom>
         8	<_exit>
         1	<accept>
         1	<atexit>
         5	<bcopy>
         1	<bind>
         1	<execl>
         9	<exit>
         2	<func1>
         2	<func2>
         1	<func3>
         2	<func4>
         1	<func5>
         1	<func6>
         2	<func7>
         2	<func8>
         1	<func9>
         5	<gethostbyname>
         7	<inet_addr>
         1	<listen>
         1	<main>
         4	<memcpy>
         3	<memset>
        13	<rand>
         2	<recv>
         1	<send>
         6	<sendto>
         2	<setenv>
         1	<setsockopt>
        15	<signal>
         6	<sleep>
         7	<socket>
         2	<system>
         1	<unsetenv>
        10	<usleep>
The listen() and accept() calls imply right away that this program is some kind of server.

The calls to system() seem particularly ominous, as does the call to execl(). As both system() and execl() take strings as parameters, let's take a look at string constants in the executable.

   $ objdump -s -j .rodata the-binary | head -20

   the-binary:     file format elf32-i386

   Contents of section .rodata:
    80675d8 5b6d696e 67657474 795d002f 00002f74  [mingetty]./../t
    80675e8 6d702f2e 686a3233 37333439 002f6269  mp/.hj237349./bi
    80675f8 6e2f6373 68202d66 202d6320 22257322  n/csh -f -c "%s"
    8067608 20313e20 25732032 3e263100 72620054   1> %s 2>&1.rb.T
    8067618 664f6a47 00fffb01 002f7362 696e3a2f  fOjG...../sbin:/
    8067628 62696e3a 2f757372 2f736269 6e3a2f75  bin:/usr/sbin:/u
    8067638 73722f62 696e3a2f 7573722f 6c6f6361  sr/bin:/usr/loca
    8067648 6c2f6269 6e2f3a2e 00504154 48004849  l/bin/:..PATH.HI
    8067658 53544649 4c45006c 696e7578 00544552  STFILE.linux.TER
    8067668 4d007368 002f6269 6e2f7368 002f6269  M.sh./bin/sh./bi
    8067678 6e2f6373 68202d66 202d6320 22257322  n/csh -f -c "%s"
    8067688 20002564 2e25642e 25642e25 64008d36   .%d.%d.%d.%d..6
    8067698 15000000 15000000 14000000 15000000  ................
    80676a8 15000000 19000000 14000000 14000000  ................
    80676b8 14000000 476e0100 00010000 00000000  ....Gn..........
    80676c8 03636f6d 00000600 01000000 00000000  .com............
Several strings catch our attention straight away:
   '/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/bin/:.'
   'PATH'
These two strings would go nicely together with a call to setenv.
The other call to setenv probably involves
   'linux'
   'TERM'
Setting the PATH and TERM environment variables is only really useful for interactive use, so this binary most likely provides a shell. The call to execl(), and the existence of the string "/bin/sh" support this.
   '/bin/csh -f -c "%s" '
   '/bin/csh -f -c "%s" 1> %s 2>&1'
These two strings are most likely passed to sprintf, given that they contain the format string "%s". The second string contains the shell redirections "1> %s 2>&1". This is likely to be used as an argument to the call to system(). There are two calls to system(), which most likely match up with these strings. This implies that the binary executes shell commands determined at runtime.

Given that the binary is a server, provides shell access and has the capability to execute shell commands determined at runtime, it is easy to assume that the binary provides a backdoor for remote attackers to obtain shell access, or execute a command on the system running this binary. If this binary was running on a machine with network access, the network access to that machine would be dropped immediately.

Time to examine the disassembly. A quick examination of the main() function shows that it never returns, as it contains an infinite loop.

The first few library calls include fork(), setsid(), chdir("/"). These calls are commonly used by a process to become a daemon. This is stated in [stevens1].

One of the last calls before the infinite loop is a call to socket(), and the first call inside the infinite loop is a call to recv().

We now know that the program becomes a daemon, opens a socket and loops forever, reading from that socket.

It is curious that the program calls recv() on the socket, without calling bind() or connect() first. Examining the arguments passed to socket():

   socket(2, 3, 11);
Using the socket manpage and the /usr/include/sys/socket.h header file, this can be clarified as:
   socket(AF_INET, SOCK_RAW, 11)
A raw socket using protocol 11 is unusual. Looking in /etc/protocols we see that there is no listing for protocol 11. So using this protocol must be a way to try to hide the communication from client to server.
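The mapping from the numeric arguments to their symbolic names, and the lookup in the protocols database, can be checked with a few lines of C. This is a minimal sketch for a Linux system, where AF_INET is 2 and SOCK_RAW is 3; the function names are hypothetical, and getprotobynumber() consults the local protocols database, so its answer depends on the machine it runs on.

```c
#include <assert.h>
#include <netdb.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Name listed for an IP protocol number in the local protocols
 * database (/etc/protocols), or "(not listed)" if there is no entry. */
const char *protocol_name(int proto)
{
    struct protoent *ent = getprotobynumber(proto);
    return ent != NULL ? ent->p_name : "(not listed)";
}

/* Tries to open the same kind of socket as the binary and closes it
 * again.  Returns 1 on success and 0 on failure; raw sockets normally
 * require root, so failure is the expected result for other users. */
int can_open_raw_socket(int proto)
{
    assert(proto >= 0 && proto <= 255);
    int fd = socket(AF_INET, SOCK_RAW, proto);
    if (fd < 0)
        return 0;
    close(fd);
    return 1;
}
```

Calling protocol_name(11) shows what the local system lists for the protocol, and can_open_raw_socket(11) illustrates why the tool needs root privileges.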

At this point, all border routers would be reconfigured to deny packets using protocol 11. This would prevent clients from communicating with any servers which have possibly been installed.

Examining the other functions, we see that functions func4(), func5(), func6() and func7() all have infinite loops. Also, these functions all perform a call

   socket(AF_INET, SOCK_RAW, IPPROTO_RAW)
before the loop. Inside the loop, each function makes calls to sendto().

The IPPROTO_RAW flag to socket() allows the program to manually set the fields of the IP header of the IP datagrams being sent. The most common (blackhat) reason for doing this would be to spoof the source IP address. This together with an infinite loop sending IP datagrams implies a denial of service attack.

To summarise so far, the tool is a server which communicates with its clients using IP datagrams with a protocol field of 11 (most likely to attempt to avoid detection). The tool provides a remote shell to remote users, as well as the capability to execute shell commands. The tool can also perform 4 types of DOS attack. This sounds very similar to the Tribe Flood Network (TFN). We can now conclude that this tool is most likely (yet) another distributed denial of service attack tool.

1.5.2 Methods undertaken while decompiling the binary.

This section follows on from the previous partial analysis. We have a rough idea of what the executable does, but we wish to know everything. There are 10 user written functions (including main). First we build a call graph to get an overall feel of the program structure.
         +-- func1
         |     +-- func2
         |           +-- func3
         +-- func4
         +-- func5
         +-- func6
         +-- func7
         +-- func8
         +-- func9
Briefly examining main(), we see the following structure
go_daemon                 # fork() setsid() fork() chdir() close()
loop {
   switch {
      case 0x1:    func8() ; func1()
      case 0x2:
      case 0x3:    sprintf("/bin/csh -f -c \"%s\" 1> %s 2>&1")
                   system() ; fopen() ; fread() ; func8() ; func1() ; fclose()
      case 0x4:    func4()
      case 0x5:    func6()
      case 0x6:    socket() ; bind() ; listen() ; accept() ; setenv() ; execl()
      case 0x7:    sprintf("/bin/csh -f -c \"%s\" "); system()
      case 0x8:
      case 0x9:    func4()
      case 0xa:    func7()
      case 0xb:    func7()
      case 0xc:    func5()
   }
}
So after receiving a packet, func9() is somehow used to process that packet, and then switch on the result of the processing.

We can say that cases 0x4, 0x5, 0x9, 0xa, 0xb and 0xc are all for performing DOS attacks, since we have determined that func4(), func5(), func6() and func7() are DOS attacks.

case 0x6 provides a shell backdoor (determined previously).

case 0x7 executes a shell (csh) command.

case 0x3 is interesting, as it performs a shell command, redirecting stdout and stderr into a file. That file is then opened and read, func8() then func1() are called, and the file is closed. So the output of the executed command is stored in a file, then something ( func8() ; func1() ) is done with the contents. It seems reasonable to presume that the contents are sent back to the attacker. Let's test this theory.

Examining func8(), we see that it is just a small function which looks like it does something with character arrays. It appears to be some kind of encoding.

   0804a1c8: movzbl -0x1(%ebx,%ecx,1),%edx     ; edx = ebx[ecx - 1]
   0804a1cd: movzbl (%ecx,%esi,1),%eax         ; eax = esi[ecx]
   0804a1d1: lea    0x17(%edx,%eax,1),%eax     ; eax = eax + edx + 0x17
   0804a1d5: mov    %al,(%ecx,%ebx,1)          ; ebx[ecx] = al
   0804a1d8: inc    %ecx                       ; ecx++
func1() does some small processing then calls func2().

Examining func2(), we see that there is a call to socket() with the same parameters as in main(), plus a call to sendto().

Our theory seems to hold. func8() encodes data, and func1() sends it.

If data is encoded before being sent from this binary to the attacker, it seems reasonable to assume that data is also encoded when sent from the attacker to this binary. That would imply that func9() is a decoder of some kind. Perhaps it is even a decoder for data encoded by func8()? With curiosity piqued, both func8() and func9() were hand decompiled. Examining the code, it seems that yes, func9() does decode data encoded with func8(). Note that while func9() does decode data encoded with func8(), it does so in a roundabout manner, with lots of double copying of data. This is presumably to confuse someone examining it. However, working from the assumption that func9() was a decoder for the encoder func8() allowed faster understanding of the code, and quicker hand decompilation.
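The loop shown in the func8() excerpt can be written in C along with a straightforward inverse. This is a minimal sketch, not the decompiled source: the names encode() and decode() are hypothetical, the treatment of the first byte is an assumption (the excerpt only shows the loop starting at index 1), and decode() is the direct inverse rather than the roundabout func9().

```c
#include <assert.h>
#include <stddef.h>

#define ENC_CONST 0x17   /* the constant seen in the lea instruction */

/* Each output byte is the previous output byte plus the current input
 * byte plus 0x17, truncated to 8 bits (the "mov %al" store).
 * ASSUMPTION: the first byte is encoded as in[0] + 0x17. */
void encode(const unsigned char *in, unsigned char *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n == 0)
        return;
    out[0] = (unsigned char)(in[0] + ENC_CONST);
    for (size_t i = 1; i < n; i++)
        out[i] = (unsigned char)(out[i - 1] + in[i] + ENC_CONST);
}

/* Direct inverse of encode(): subtract the previous encoded byte and
 * the constant from each encoded byte. */
void decode(const unsigned char *in, unsigned char *out, size_t n)
{
    assert(in != NULL && out != NULL);
    if (n == 0)
        return;
    out[0] = (unsigned char)(in[0] - ENC_CONST);
    for (size_t i = 1; i < n; i++)
        out[i] = (unsigned char)(in[i] - in[i - 1] - ENC_CONST);
}
```

Because every byte depends on the one before it, identical plaintext bytes encode to different values, but the scheme has no key and is trivially reversible once the constant is known.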

At this point we have a general understanding of the program, and the purpose of most functions.

   func1 - calls func2
   func2 - send data to attacker
   func4 - DOS attack
   func5 - DOS attack
   func6 - DOS attack
   func7 - DOS attack
   func8 - encodes data
   func9 - decodes data
Browsing the code, we notice that there is a function which is never called.
   08048f30: push   %ebp
   08048f31: mov    %esp,%ebp
   08048f33: sub    $0x4,%esp
   08048f36: push   %ebx
   08048f37: mov    0xc(%ebp),%edx
   08048f3a: mov    0x8(%ebp),%ebx
   08048f3d: xor    %ecx,%ecx
   08048f3f: movw   $0x0,-0x2(%ebp)
   08048f45: cmp    $0x1,%edx
   08048f48: jle    0x08048f5c                                  *
   08048f4a: lea    (%esi),%esi                                 |
   08048f4c: movzwl (%ebx),%eax                                *|
   08048f4f: add    %eax,%ecx                                  ||
   08048f51: add    $0x2,%ebx                                  ||
   08048f54: add    $-0x2,%edx                                 ||
   08048f57: cmp    $0x1,%edx                                  ||
   08048f5a: jg     0x08048f4c                                 *|
   08048f5c: cmp    $0x1,%edx                                   *
   08048f5f: jne    0x08048f6c                                        *
   08048f61: mov    (%ebx),%al                                        |
   08048f63: mov    %al,-0x2(%ebp)                                    |
   08048f66: movzwl -0x2(%ebp),%eax                                   |
   08048f6a: add    %eax,%ecx                                         |
   08048f6c: mov    %ecx,%edx                                         *
   08048f6e: sar    $0x10,%edx
   08048f71: movzwl %cx,%eax
   08048f74: lea    (%eax,%edx,1),%ecx
   08048f77: mov    %ecx,%eax
   08048f79: sar    $0x10,%eax
   08048f7c: add    %eax,%ecx
   08048f7e: mov    %ecx,%eax
   08048f80: not    %ax
   08048f83: mov    %ax,-0x2(%ebp)
   08048f87: and    $0xffff,%eax
   08048f8c: mov    -0x8(%ebp),%ebx
   08048f8f: mov    %ebp,%esp
   08048f91: pop    %ebp
   08048f92: ret    
What is interesting about this code is that it appears to be operating on two byte words (shorts in C). We can see this by these lines:
   08048f3f: movw   $0x0,-0x2(%ebp)
   08048f4c: movzwl (%ebx),%eax
   08048f66: movzwl -0x2(%ebp),%eax
   08048f80: not    %ax
   08048f83: mov    %ax,-0x2(%ebp)
The code appears to sum an array of shorts, then take the one's complement (the not opcode) of the result. This sounds very familiar. Browsing [stevens3], we see that the IP checksum algorithm does just this. [stevens3] p. 672 has source for this algorithm in the function in_cksum(). Comparing the in_cksum() C code against the assembly, we find an exact match.
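For readers without [stevens3] to hand, the steps visible in the assembly (sum the 16-bit words, add a trailing odd byte, fold the carries twice, complement) can be sketched as follows. This is an independent rewrite for illustration, not the text from [stevens3] and not the decompiled source; it uses memcpy() to avoid unaligned reads, which the original does not do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Internet checksum over len bytes of data.  The result is in the
 * byte order of the data, so it can be copied straight into a header. */
uint16_t in_cksum(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t sum = 0;
    uint16_t word;

    assert(data != NULL || len == 0);

    while (len > 1) {                 /* the movzwl / add loop */
        memcpy(&word, p, 2);
        sum += word;
        p += 2;
        len -= 2;
    }
    if (len == 1) {                   /* odd trailing byte, zero padded */
        word = 0;
        memcpy(&word, p, 1);
        sum += word;
    }
    sum = (sum >> 16) + (sum & 0xffff);   /* fold carries into low 16 bits */
    sum += sum >> 16;                     /* fold the carry from the fold */
    return (uint16_t)~sum;                /* the "not %ax" */
}
```

A useful property for checking such a routine is that a buffer which already contains its own checksum sums to zero.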

As the functions performing DOS attacks hand construct IP datagrams, why don't they call this function? Browsing functions func4(), func5(), func6(), func7(), we see that the reason the checksum function is not called, is because it has been inlined. This allows us to identify and replace large chunks of assembly, with a simple call to in_cksum().

1.5.3 Determining compilation optimisation options.

What optimisation options were used to compile the original code? The option -fdefer-pop was used, as can be seen below. Without this optimisation, parameters passed to a function would be removed from the stack immediately after each call. We can see below that this does not occur.
   08048271: call   0x080569bc <signal>
   08048276: push   $0x1
   08048278: push   $0xf
   0804827a: call   0x080569bc <signal>
   0804827f: push   $0x1
   08048281: push   $0x11
   08048283: call   0x080569bc <signal>
   08048288: add    $0x24,%esp
Examining the man page for gcc, we see that -fdefer-pop is turned on when the -O1 option is selected. Let's use a trial and error method of determining the optimisation options selected. We hand decompile the first portion of the main function. Now we compile the decompiled function several times with different optimisation levels and options. After each try, we disassemble the result and compare to the original. This is repeated until a match occurs. After doing this, it seems that the -O1 option was used.

1.5.4 Hand decompilation

Now we have a good understanding of what each function does. From here on, we begin the tedious task of hand decompiling line by line. Following the guidelines in Appendix B, this is not too difficult a task. With a general understanding of what a function does, it is easier to decompile.

After a few late nights, we finally have the finished product. How do we know that our hand decompilation is correct? We know the compiler and C library versions used to compile the original, so we use that environment to compile the reconstructed code. Then we disassemble the newly compiled executable, and compare the two disassembly listings. Apart from different register allocations, they are pretty much the same, so we can conclude that the hand decompilation is correct.

1.5.5 Decompiling the DOS attacks.

How do we work out what the decompiled DOS functions do? Examining func7(), it is fairly obvious it is a synflood. The syn flag in the TCP header is explicitly set.

func6() is more difficult. The IP fragment offset field is set, so it seems to be a type of fragment based attack. Going to our favourite blackhat site on the net, we download all the fragment based attacks we can find. These attacks are teardrop, bonk, boink and jolt2. Searching for similarities we find:


   . . . = htons(0x455);
   . . .
   pdu.ip.frag_off = htons(8190);
   . . . = htons(0x455);
   . . .
   pkt.ip.frag_off = htons (8190);
Examining func6() and jolt2 further, we see that they are essentially the same. In fact, looking at the order of the lines, it looks like func6() is a cut and paste of jolt2 with some changes.

func4() and func5() share large portions of code. At the heart of these functions is the sending of a UDP datagram to port 53 of a remote server. For func4(), the IP address of the remote server is stored in the .data section of the executable, while for func5() it is a parameter of the function. Examining the code for func4(), we see that there is an array of IP addresses (terminated by zero), which are used as destinations. Extracting these addresses as dotted quads, we see that it is a list of 11441 IP addresses. This list has been sorted lexicographically, not numerically (i.e. the list was sorted using sort, instead of sort -n).
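The extraction step can be sketched in C as follows. This is an illustration only: the function names are ours, and we assume the table holds 32-bit addresses in network byte order, ending with a zero entry.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Count the entries in an address table terminated by a zero entry. */
size_t count_addresses(const uint32_t *table)
{
    size_t n = 0;
    assert(table != NULL);
    while (table[n] != 0)
        n++;
    return n;
}

/* Format an address held in network byte order as a dotted quad.
 * Reading the bytes through memcpy keeps this independent of the
 * endianness of the host. */
void format_dotted_quad(uint32_t addr_net, char *out, size_t out_len)
{
    unsigned char b[4];
    assert(out != NULL);
    memcpy(b, &addr_net, sizeof b);
    snprintf(out, out_len, "%u.%u.%u.%u", b[0], b[1], b[2], b[3]);
}

/* Text comparison, which is what plain sort does on the dotted quads. */
int sorts_before_lexicographically(const char *a, const char *b)
{
    return strcmp(a, b) < 0;
}
```

Under a text comparison "10.0.0.1" sorts before "9.0.0.1", which is the kind of ordering that reveals that sort was used rather than sort -n.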

Picking a few addresses at random and resolving them yields:

   $ host

   $ host

   $ host

The list appears to be a list of name servers. This correlates with the sending of a UDP datagram to port 53 (domain) of these servers.

The payload of this datagram is stored in the .rodata segment of the original executable. We examine this data with the knowledge that it is probably a DNS query or response, utilising [rfc1035] (Domain Names - Implementation and Specification).

We see that the data consists of 9 DNS queries. The queries are start of authority (SOA) queries for the domains

However, the queries for the domains de, es, gr and ie are malformed. The length field for the names is incorrect: it is 3 instead of 2. This is most likely due to using cut and paste and forgetting to fix the length.
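For reference, [rfc1035] encodes a name as a sequence of labels, each preceded by a one-byte length, ending with a zero-length root label. The sketch below shows a correct encoder so that the error is easy to see; it is our own illustration (the function name is hypothetical), not code recovered from the binary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode a dotted domain name (e.g. "de" or "example.com") into RFC 1035
 * wire format. Returns the number of bytes written, or 0 on error. */
size_t encode_dns_name(const char *name, unsigned char *out, size_t out_len)
{
    size_t pos = 0;
    assert(name != NULL && out != NULL);

    while (*name != '\0') {
        const char *dot = strchr(name, '.');
        size_t label_len = dot ? (size_t)(dot - name) : strlen(name);

        /* Labels may not be empty and are limited to 63 bytes. */
        if (label_len == 0 || label_len > 63)
            return 0;
        /* Room for the length byte, the label, and the final root byte. */
        if (pos + 1 + label_len + 1 > out_len)
            return 0;

        out[pos++] = (unsigned char)label_len;   /* 2 for "de", not 3 */
        memcpy(out + pos, name, label_len);
        pos += label_len;

        name += label_len;
        if (*name == '.')
            name++;
    }
    if (pos + 1 > out_len)
        return 0;
    out[pos++] = 0;                              /* root label */
    return pos;
}
```

A correctly encoded "de" is the four bytes 02 'd' 'e' 00. With a length byte of 3, a parser would consume the terminating zero as part of the label and then continue reading into whatever follows.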

Both func4() and func5() send these queries with a spoofed source address. The result would be the name servers responding to the spoofed source address and generating a flood of traffic. The reason for using an SOA query is that it guarantees a result.

2. Analysing the decompiled source code.

Starting with a copy of the source code, this section performs an analysis upon it, determining its features and capabilities.

2.1 Program startup

The tool starts by checking whether it is running as root, exiting if not. This is because the tool must run as root to access raw sockets. The tool then changes the value of argv[0] (the process name) to "[mingetty]". This is an attempt to hide the process from casual detection with the ps command (mingetty is the default getty on RedHat and its derivatives, as well as on Slackware versions before 7.0). The square brackets around the process name follow a ps convention for processes which are swapped out.
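The two startup steps can be sketched as below. This is a simplified illustration with names of our own choosing. We assume the check is made on the effective user id, and we use a bounded copy, whereas the binary copies the string with what appears to be an inlined strcpy().

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Raw sockets need root, so anything other than uid 0 is rejected. */
int has_root_privileges(uid_t euid)
{
    return euid == 0;
}

/* Overwrite the buffer behind argv[0] so that ps shows a different name.
 * capacity is the number of bytes available in that buffer. */
void overwrite_process_name(char *argv0, size_t capacity, const char *name)
{
    assert(argv0 != NULL && name != NULL && capacity > 0);
    memset(argv0, 0, capacity);
    strncpy(argv0, name, capacity - 1);   /* always leaves a final NUL */
}
```

In a real program these would be called from main() as has_root_privileges(geteuid()) and overwrite_process_name(argv[0], ...), with the capacity taken from the length of the original argv[0] string.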

Next the tool becomes a daemon by detaching from the controlling terminal, creating a new session, ensuring that it is not a session leader, changing the current working directory to the root directory (/), and closing unneeded file descriptors. This is the standard way of becoming a daemon (using SVR4 rules), as described by [stevens1] p. 417.
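The sequence just described corresponds roughly to the following sketch. It is not the decompiled code: the function name is ours, and closing only descriptors 0 to 2 is an assumption made to keep the example short.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 in the daemon process and -1 on failure.
 * The intermediate parent processes exit. */
int become_daemon(void)
{
    pid_t pid;
    int fd;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);                   /* original process exits */

    if (setsid() < 0)               /* new session, no controlling terminal */
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid > 0)
        _exit(0);                   /* the session leader exits */

    assert(getsid(0) != getpid());  /* we are no longer a session leader */

    if (chdir("/") < 0)
        return -1;

    for (fd = 0; fd < 3; fd++)      /* assumption: only stdio is closed */
        close(fd);
    return 0;
}
```

The second fork() is what guarantees that the surviving process is not a session leader, and so cannot reacquire a controlling terminal under SVR4 rules.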

After minor initialisation, the daemon opens a raw socket to accept IP packets using protocol 11. According to [rfc1700], this protocol is reserved for the Network Voice Protocol (NVP-II). This is an old protocol which is not in current use. The purpose of sending data with a protocol field of 11 is to avoid being noticed. Many firewalls, IDSs and system administrators only concern themselves with ICMP, TCP and UDP (protocols 1, 6 and 17). Using something other than these is likely to escape the notice of a novice system administrator.

The daemon now enters an infinite loop, processing requests.

As stated in the quick analysis, this tool is most likely used for distributed denial of service (DDOS) attacks. Examining a diagram of a typical DDOS system, we see that this tool neatly fits as the agent part of the system. From here on, the tool will be referred to as the agent.

                  +----------+           +----------+
                  | attacker |           | attacker |
                  +----------+           +----------+
                       |                      |
        . . . --+------+---------------+------+----------------+-- . . .
                |                      |                       |
                |                      |                       |
          +-----------+          +-----------+           +-----------+
          |  handler  |          |  handler  |           |  handler  |
          +-----------+          +-----------+           +-----------+
                |                      |                       |
                |                      |                       |
. . . ---+------+-----+------------+---+--------+------------+-+-- . . .
         |            |            |            |            |
         |            |            |            |            |
     +-------+    +-------+    +-------+    +-------+    +-------+
     | agent |    | agent |    | agent |    | agent |    | agent |
     +-------+    +-------+    +-------+    +-------+    +-------+

Diagram of a distributed denial of service attack tool (David Dittrich).

2.2 Features

There are two classes of actions which can be performed by the agent: fast and slow. A fast action is one which will complete quickly, such as executing a command or responding to a status query. Slow actions are ones which take a significant amount of time, such as performing a particular DOS attack or providing a remote shell. Fast actions are performed as soon as their request arrives. Only one slow action can be performed at any one time. If the agent is performing a slow action and a request to perform a different slow action arrives, the first action is terminated immediately, before the second is started.

Possible actions performed by the agent are:

2.2.1 Fast actions:

2.2.2 Slow actions:

2.3 Network protocols.

Communication between handler and agent is connectionless, unauthenticated and unreliable. There is no attempt at retransmission of lost packets, and the only error checking on the packets themselves is the IP checksum (processed by the kernel, and covering only the fields of the IP header, not the payload). Specially constructed IP datagrams with a protocol field of 11 are used. The payload is encoded. The general format of a datagram used for communication follows this form:

   | IP Header   | dir | res |    Encoded data           |

IP Header - a standard IP header with protocol 11
dir       - a direction byte: 2 for handler -> agent, 3 for agent -> handler
res       - reserved byte, unused
data      - encoded data.

Total length of the IP packet is always greater than 200 bytes, though it is generally between 400 and 600 bytes in length.

The encoding process used is quite simple. Treat the data to encode as an array of unsigned chars. For each element of the array, add the current data plus 23 to an accumulator. The value of the accumulator modulo 256 is the encoded data for that element of the array.
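The encoding, together with the inverse that follows from it, can be written as below. This is a sketch based on the description above rather than the decompiled routine; the function names are ours, and we assume the accumulator starts at zero.

```c
#include <assert.h>
#include <stddef.h>

#define ENCODE_OFFSET 23   /* constant added for every byte */

/* Encode len bytes from in to out (in and out may be the same buffer). */
void encode_payload(const unsigned char *in, unsigned char *out, size_t len)
{
    unsigned int acc = 0;  /* assumption: the accumulator starts at zero */
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        acc = (acc + in[i] + ENCODE_OFFSET) % 256;
        out[i] = (unsigned char)acc;
    }
}

/* Decode by undoing each step: the accumulator before byte i is simply
 * the previous encoded byte. */
void decode_payload(const unsigned char *in, unsigned char *out, size_t len)
{
    unsigned char prev = 0;
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++) {
        unsigned char cur = in[i];   /* saved first so in == out is safe */
        out[i] = (unsigned char)(cur - prev - ENCODE_OFFSET);
        prev = cur;
    }
}
```

For example, the bytes 0, 1, 2 encode to 23, 47, 72. Each encoded byte depends on the running sum of everything before it, but because the previous encoded byte is that sum, decoding needs no key.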

Once the encoded data is decoded, it can be processed. There are 18 different formats this decoded data can take. The formats consist of combinations of command flags, IP addresses and port numbers. These formats are discussed below.

Notes on PDU format.

2.3.1 Packets from Agent -> Handler

2.3.2 Packets from Handler -> Agent

2.4 Fingerprints

Detecting communication between agents and handlers is simple. The communication method uses IP datagrams with a protocol field of 11, and the registered owner of that protocol (Network Voice Protocol) is not currently in use, so any datagrams using that protocol can be assumed to be communication between agents and handlers.
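A packet filter for this fingerprint only needs to look at one byte of the IPv4 header: the protocol field at offset 9. A minimal sketch, with names of our own choosing:

```c
#include <assert.h>
#include <stddef.h>

#define AGENT_IP_PROTOCOL 11

/* Return 1 if buf starts with an IPv4 header whose protocol field is 11. */
int is_agent_datagram(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len < 20)                 /* shorter than a minimal IPv4 header */
        return 0;
    if ((buf[0] >> 4) != 4)       /* not IP version 4 */
        return 0;
    return buf[9] == AGENT_IP_PROTOCOL;
}
```

The same test can be expressed as a one-line rule in most firewalls and intrusion detection systems.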

Strings unique to the executable of the agent are:

/bin/csh -f -c "%s" 1> %s 2>&1
TfOjG                                    <===== password shifted 1 character
/bin/csh -f -c "%s"

Detecting running agents can be achieved by looking for processes with open raw sockets waiting for protocol 11. The netstat command can be used to do this. Here are the results of finding a running agent.

# netstat -pan | grep raw | grep :11
raw    0      0*      7     5226/[mingetty]

2.5 Weaknesses

The most obvious weakness is that the agent and handlers communicate entirely with datagrams that have the protocol field set to 11. Simply blocking these datagrams at a border router would render the DDOS system ineffective. As the registered owner of that protocol (Network Voice Protocol) is not currently in use, nothing is broken by blocking those packets.

As the agent does not require any authorisation, anyone can send commands to an existing agent. If there is an agent performing an attack from inside an organisation, sending terminate commands to all hosts in the organisation would terminate the attack. The sample handler tool that was written to test the agent could easily be modified to achieve this.

The simple encoding scheme used can be easily decoded. This makes it possible to obtain the IP address of the handler when an initialisation command is sent, which renders the technique of the agent sending packets to multiple IP addresses (to hide the IP address of the handler) worthless.

3. Summary

The binary is the agent half of a distributed denial of service attack tool. The following are notable points discussed in the analysis.


[dittrich] Dittrich, D. The DoS Project's "trinoo" distributed denial of service attack tool.
[elf] ELF (Executable and Linkable Format)
[phrack] Phrack Magazine Issue 48 File 13 of 18
[rfc1035] RFC 1035: Domain Names - Implementation and Specification.
[rfc1700] RFC 1700: Assigned Numbers
[stevens1] Stevens, W.R. Advanced Programming in the UNIX Environment. 1993 Addison Wesley.
[stevens2] Stevens, W.R. TCP/IP Illustrated Vol 1. 1994 Addison Wesley.
[stevens3] Stevens, W.R. Unix Network Programming Vol 1. 2nd Ed. 1998 Prentice Hall

Appendix A - manually resolving object file matches

The list of matching object files for the libc5 that shipped with slackware3.1 is:
#              -- This is an automatically generated file. --
# This file contains the results of using elfgrep to search a binary file
# for contained object file.  Manual resolution of the conflicts at the
# end of this file may be necessary.
slackware3.1/setenv.o - match at 0x0804a2a8 (0x00000249 bytes)
. . .
slackware3.1/stpcpy.o - match at 0x08067300 (0x00000044 bytes)
slackware3.1/_udivdi3.o - match at 0x08067344 (0x00000108 bytes)
slackware3.1/_umoddi3.o - match at 0x0806744c (0x0000015c bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/auth_none.o - match at 0x08067094 (0x0000010f bytes)
# slackware3.1/pthread_stubs.o - match at 0x08067190 (0x00000013 bytes)
# slackware3.1/__errno_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/_clear_cache.o - match at 0x0806717c (0x00000007 bytes)
# slackware3.1/_clear_cache.o - match at 0x0806719c (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x0806717c (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x0806719c (0x00000007 bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/getsockopt.o - match at 0x08056c9c (0x00000055 bytes)
# slackware3.1/setsockopt.o - match at 0x08056c9c (0x00000055 bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x0804f534 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x0804f534 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x0804f534 (0x0000000c bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x08056e64 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x08056e64 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x08056e64 (0x0000000c bytes)

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/asprintf.o - match at 0x0804f808 (0x00000018 bytes)
# slackware3.1/iosprintf.o - match at 0x0804f808 (0x00000018 bytes)
# slackware3.1/iosscanf.o - match at 0x0804f808 (0x00000018 bytes)

There are 5 unresolved matches.

Resolving 1st conflict

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/auth_none.o - match at 0x08067094 (0x0000010f bytes)
# slackware3.1/pthread_stubs.o - match at 0x08067190 (0x00000013 bytes)
# slackware3.1/__errno_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x08067184 (0x0000000c bytes)
# slackware3.1/_clear_cache.o - match at 0x0806717c (0x00000007 bytes)
# slackware3.1/_clear_cache.o - match at 0x0806719c (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x0806717c (0x00000007 bytes)
# slackware3.1/_udiv_w_sdiv.o - match at 0x0806719c (0x00000007 bytes)

Examining the appropriate gap between object files for this conflict, we see that there is a 0x111 byte gap. Of all the candidates, auth_none.o (with size 0x10f) fits the gap with the least wasted space.

slackware3.1/loadlocale.o - match at 0x08066bfc (0x00000497 bytes)
# 0x111 byte gap
slackware3.1/xdr_ref.o - match at 0x080671a4 (0x00000105 bytes)
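
The gap arithmetic and the "least wasted space" rule can be checked mechanically. The sketch below is our own illustration, not part of elfgrep; the structure and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct candidate {
    const char *name;     /* object file name */
    unsigned long size;   /* size of the matched code in bytes */
};

/* Bytes between the end of one matched object and the start of the next. */
unsigned long gap_between(unsigned long prev_start, unsigned long prev_size,
                          unsigned long next_start)
{
    return next_start - (prev_start + prev_size);
}

/* Index of the largest candidate that still fits in the gap, or -1. */
int best_fit(const struct candidate *c, size_t n, unsigned long gap)
{
    int best = -1;
    size_t i;
    assert(c != NULL);
    for (i = 0; i < n; i++) {
        if (c[i].size > gap)
            continue;
        if (best < 0 || c[i].size > c[(size_t)best].size)
            best = (int)i;
    }
    return best;
}
```

With loadlocale.o at 0x08066bfc (0x497 bytes) and xdr_ref.o at 0x080671a4, gap_between() gives 0x111, and auth_none.o at 0x10f bytes is the best fit among the candidates.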

Resolving 2nd conflict

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/getsockopt.o - match at 0x08056c9c (0x00000055 bytes)
# slackware3.1/setsockopt.o - match at 0x08056c9c (0x00000055 bytes)

Examining the disassembled code (from objdump -d the-binary) we find this interesting piece of code:

 8056cc2: mov    $0xe,%edx
 8056cc7: lea    0xffffffec(%ebp),%ecx
 8056cca: mov    $0x66,%eax
 8056ccf: mov    %edx,%ebx
 8056cd1: int    $0x80

int $0x80 is the interrupt used by Linux for system calls. %eax holds the system call number (the "major"); for a socketcall, %ebx holds the first argument, which selects the socket operation (the "minor"). Note that the major is 0x66 (102 decimal), and the minor is 0xe (14 decimal).

Examining the linux source code:


#define __NR_socketcall         102

or alternatively

        .long SYMBOL_NAME(sys_fstatfs)          /* 100 */
        .long SYMBOL_NAME(sys_ioperm)
        .long SYMBOL_NAME(sys_socketcall)
        .long SYMBOL_NAME(sys_syslog)
        .long SYMBOL_NAME(sys_setitimer)
        .long SYMBOL_NAME(sys_getitimer)        /* 105 */

So this code is for a socketcall, which looks promising.

Now in /usr/src/linux/include/linux/net.h

#define SYS_SOCKET      1               /* sys_socket(2)                */
#define SYS_BIND        2               /* sys_bind(2)                  */
#define SYS_CONNECT     3               /* sys_connect(2)               */
#define SYS_LISTEN      4               /* sys_listen(2)                */
#define SYS_ACCEPT      5               /* sys_accept(2)                */
#define SYS_GETSOCKNAME 6               /* sys_getsockname(2)           */
#define SYS_GETPEERNAME 7               /* sys_getpeername(2)           */
#define SYS_SOCKETPAIR  8               /* sys_socketpair(2)            */
#define SYS_SEND        9               /* sys_send(2)                  */
#define SYS_RECV        10              /* sys_recv(2)                  */
#define SYS_SENDTO      11              /* sys_sendto(2)                */
#define SYS_RECVFROM    12              /* sys_recvfrom(2)              */
#define SYS_SHUTDOWN    13              /* sys_shutdown(2)              */
#define SYS_SETSOCKOPT  14              /* sys_setsockopt(2)            */
#define SYS_GETSOCKOPT  15              /* sys_getsockopt(2)            */
#define SYS_SENDMSG     16              /* sys_sendmsg(2)               */
#define SYS_RECVMSG     17              /* sys_recvmsg(2)               */

So the minor is 14, which is setsockopt.

So the conflict resolves to setsockopt.o.
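
The lookup performed by hand above can be captured in a small table. The sketch below simply restates the constants from net.h; the function name is ours.

```c
#include <assert.h>
#include <stddef.h>

/* Names indexed by the socketcall number passed in %ebx (1..17). */
static const char *const socketcall_names[] = {
    NULL,          "socket",      "bind",        "connect",    "listen",
    "accept",      "getsockname", "getpeername", "socketpair", "send",
    "recv",        "sendto",      "recvfrom",    "shutdown",   "setsockopt",
    "getsockopt",  "sendmsg",     "recvmsg"
};

/* Return the name for a socketcall number, or NULL if it is out of range. */
const char *socketcall_name(int call)
{
    size_t count = sizeof socketcall_names / sizeof socketcall_names[0];
    assert(count == 18);          /* entries 0..17, entry 0 unused */
    if (call < 1 || (size_t)call >= count)
        return NULL;
    return socketcall_names[call];
}
```

The two conflicting object files differ only in this one constant (14 for setsockopt, 15 for getsockopt), which explains why the byte pattern search could not tell them apart.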

Resolving 3rd conflict

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x0804f534 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x0804f534 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x0804f534 (0x0000000c bytes)

Examining the names of the surrounding object files, __res_loc.o seems the most likely match.

slackware3.1/res_comp.o - match at 0x0804d02c (0x00000717 bytes)
slackware3.1/res_init.o - match at 0x0804d744 (0x0000089c bytes)
slackware3.1/res_query.o - match at 0x0804dfe0 (0x00000655 bytes)
slackware3.1/res_send.o - match at 0x0804e638 (0x00000ef9 bytes)
# 0xf byte gap
slackware3.1/iofclose.o - match at 0x0804f540 (0x00000081 bytes)

Resolving 4th conflict

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/__errno_loc.o - match at 0x08056e64 (0x0000000c bytes)
# slackware3.1/__h_errno_loc.o - match at 0x08056e64 (0x0000000c bytes)
# slackware3.1/__res_loc.o - match at 0x08056e64 (0x0000000c bytes)

__res_loc.o was already resolved in the previous conflict. This leaves a choice of __errno_loc.o and __h_errno_loc.o. __errno_loc.o seems more likely as it provides errno, which is set by many system calls.

Resolving 5th conflict

# Possible conflict below requiring manual resolution:
# ----------------------------------------------------
# slackware3.1/asprintf.o - match at 0x0804f808 (0x00000018 bytes)
# slackware3.1/iosprintf.o - match at 0x0804f808 (0x00000018 bytes)
# slackware3.1/iosscanf.o - match at 0x0804f808 (0x00000018 bytes)

The surrounding object files are all of the form io(.*)printf.o, so it seems likely that the most appropriate match is iosprintf.o.

slackware3.1/ioprintf.o - match at 0x0804f7ec (0x00000019 bytes)
# 0x1b byte gap
slackware3.1/iovsprintf.o - match at 0x0804f820 (0x00000065 bytes)
slackware3.1/iovfprintf.o - match at 0x0804f888 (0x000035f7 bytes)

Appendix B - Tips for hand decompilation.

Here follow a number of tips that can be used to help with hand decompilation. Like anything, hand decompilation requires practice, so starting early with simple examples is good preparation for a large task such as this.

C function calling convention.

C passes function arguments on the stack. Arguments are pushed onto the stack from right to left. Any return value of the function is returned in the register %eax. With this knowledge, we can determine the number of arguments a function requires by counting the number of pushes.

For example, consider the following piece of code.

0804842d: push   $0x0
0804842f: call   0x08057444 <__libc_time>
08048434: add    $0x4,%esp
08048437: push   %eax
08048438: call   0x080559a0 <__srandom>
0804843d: add    $0x4,%esp

The first 3 lines call the libc function time(), passing 1 parameter. Line 1 pushes the argument. Line 2 performs the call. Line 3 removes the pushed argument from the stack.

The second 3 lines call the function srandom(), passing as parameter %eax. However %eax was the returned value from the call to time().

Therefore the above 6 lines are decompiled to:

   %eax = time(0x0);
   srandom(%eax);

Checking the manpages for time(2) and srandom(3), we see that time takes a pointer as parameter. The number of arguments passed was as expected. The information from the manpages allows us to rewrite the decompiled code as:

   srandom(time(NULL));

Using known functions to determine argument types

Known functions - such as those in the C library - can be used to determine the types of the arguments.

We will use the call to recv() as an example. The prototype for recv(), as given in the man page, is:

   int recv(int s, void *buf, size_t len, int flags);

080482b0: push   $0x0
080482b2: push   $0x800
080482b7: lea    -0x800(%ebp),%eax
080482bd: push   %eax
080482be: mov    -0x44c8(%ebp),%ecx
080482c4: push   %ecx
080482c5: call   0x08056b44 <recv>
080482ca: mov    %eax,%esi
080482cc: add    $0x10,%esp

This code fragment decompiles to

%esi = recv(-0x44c8(%ebp), -0x800(%ebp), 0x800, 0x0);

From the man page, we can infer that the data at -0x44c8(%ebp) is an integer holding a socket descriptor, and that the address -0x800(%ebp), computed by the lea instruction, is passed as the void * buffer argument. The register %esi holds an integer (the number of bytes received).

As the address of the start of the buffer is 0x800 bytes back from the frame pointer, and the length of the buffer is given as 0x800, it seems sensible to make the inference that the data at -0x800(%ebp) is of type:

char buf[0x800]       (or unsigned as the case may be).

As another example, consider this code:

0804827f: push   $0x1
08048281: push   $0x11
08048283: call   0x080569bc <signal>

This decompiles to

signal(0x11, 0x1);

but this is still pretty meaningless.

Examining the manpage for signal

       sighandler_t signal(int signum, sighandler_t handler);

So 0x11 is the signal number. Examining the relevant header file, /usr/include/signal.h, we find these lines:

#define SIGCHLD         17      /* Child status has changed (POSIX).  */
#define SIG_IGN ((__sighandler_t) 1)            /* Ignore signal.  */

So we can rewrite the call as

signal(SIGCHLD, SIG_IGN);

which is much more readily understandable.

Using the compiler to help.

Looking at the code below, it seems to be copying the string "[mingetty]" to the address loaded from (%ebx), in blocks of up to 4 characters. Why would anyone deliberately write code this way? Perhaps they didn't; perhaps this is really an inlined version of the strcpy() function. How could we be sure of this?

080481a8: mov    (%ebx),%edx
# Possible reference to rodata '[mingetty]'
080481aa: mov    0x80675d8,%eax
080481af: mov    %eax,(%edx)
# Possible reference to rodata 'getty]'
080481b1: mov    0x80675dc,%eax
080481b6: mov    %eax,0x4(%edx)
# Possible reference to rodata 'y]'
080481b9: mov    0x80675e0,%ax
080481bf: mov    %ax,0x8(%edx)
# Possible reference to rodata ''
080481c3: mov    0x80675e2,%al
080481c9: mov    %al,0xa(%edx)

We write a quick test program, compile it, then examine the disassembly of the compiled program.

$ printf '#include <string.h>\nint main(void) { char p[100]; strcpy(p, "[mingetty]"); return 0; }\n' > test.c
$ gcc -finline-functions test.c

Doing this and comparing confirms our hypothesis. This technique only really works if the test program is compiled with the same set of optimisation options as the original. A bit of experimentation with different options is needed to determine the correct ones to use.
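
To make the equivalence concrete, the moves in the listing can be written back as C. The sketch below is our own restatement (the function name is hypothetical): it copies the 11 bytes of the string in the same 4 + 4 + 2 + 1 pattern, and the result is indistinguishable from a call to strcpy().

```c
#include <assert.h>
#include <string.h>

/* Copy "[mingetty]" (10 characters plus the terminating NUL) using the
 * same block sizes as the four pairs of mov instructions. */
void copy_mingetty_in_blocks(char *dst)
{
    static const char src[] = "[mingetty]";
    assert(dst != NULL);
    memcpy(dst,      src,      4);   /* "[min", the 32-bit move at offset 0 */
    memcpy(dst + 4,  src + 4,  4);   /* "gett", the 32-bit move at offset 4 */
    memcpy(dst + 8,  src + 8,  2);   /* "y]",   the 16-bit move at offset 8 */
    memcpy(dst + 10, src + 10, 1);   /* NUL,    the 8-bit move at offset 10 */
}
```

Because the length of a string literal is known at compile time, the compiler can unroll the copy in this way instead of calling the library routine.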