Honeynet Project: Scan of the Month - Scan 23 (September 2002)
By Volker Kindermann <vk at xbsd dot de>
Sun, 22. Sep 2002
The Honeynet Project's Scan of the Month
for September is the first one intended for beginners. The Project's members scanned
a honeypot via five different methods and captured the resulting network traffic using
Snort, an open source Intrusion detection system.
The task is to analyze these portscans according to the questions.
Ok, welcome to my first Scan of the month challenge. Just downloaded this big binary file, checked md5 hashes and extracted it. Got a much bigger file to examine. The Honeynet Project proposed three software-tools to master the challenge: tcpdump, ethereal and snort.
After examining tcpdump a little, I decided to leave it for later. It's a nice tool, but the output is not very meaningful for a beginner like me. I just remembered the chapter about tcpdump filters in Northcutt's Network Intrusion Detection book; it might be helpful at a later time.
Next candidate to view the binary was ethereal. Ah, quite a lot of data, but a very nice interface. And the ability to write filters, too. But not so suited to get an overall view of the file.
So why not use snort? The file was created with snort, so why not use it to get an overview? I installed snort 1.8.7 from the FreeBSD ports tree, looked at the config file, enabled the portscan preprocessor (portscans are the subject of the challenge) and ran it on the binary.
To view the results, I installed snortsnarf (also from the ports tree) and gave it the snort output. The result was a very interesting summary page, full of hints pointing to the popular portscanner nmap. Fine. A good hint for question 6. So I decided to read the nmap man page to get a feeling for the capabilities of nmap.
Next I created an "analysis file" which I will use as a reference. It's too big to include, but I will explain how I created it. I started ethereal and opened the binary. Then I chose "print" with the "packet summary" option activated. The result was a file with one line per packet, each with a number for identification and the time relative to the start of the file. All packet numbers refer to this file.
The first scan type.
After some reading in the nmap man page it was easy to recognize the first 4 packets as the default nmap behaviour to test whether a target host is reachable (see explanation in question 3).
This default probe helps a lot, because it will always mark the beginning of an nmap scan. Great!
Ok, let's get to work. This "nmap initialization" occurs first in the first 4 packets. Therefore the next following packets may be a nmap scan technique. Let's have a look at packet 5 to 8:
|5||10.346091||192.168.0.9||192.168.0.99||TCP||52198 > 52156 [SYN] Seq=68054434 Ack=0 Win=2048 Len=0|
|6||10.346199||192.168.0.99||192.168.0.9||TCP||52156 > 52198 [RST, ACK] Seq=0 Ack=68054435 Win=0 Len=0|
|7||10.346137||192.168.0.9||192.168.0.99||TCP||52198 > 28494 [SYN] Seq=68054434 Ack=0 Win=2048 Len=0|
|8||10.346235||192.168.0.99||192.168.0.9||TCP||28494 > 52198 [RST, ACK] Seq=0 Ack=68054435 Win=0 Len=0|
What do we have here? The attacker sends tcp packets with only the SYN Flag set to two presumably random ports. He gets back two
packets with the RST and ACK flags set. The nmap man page describes this as "half-open" scanning:
"You send a SYN packet, as if you are going to open a real connection and you wait for a response. A SYN|ACK indicates the port is listening. A RST is indicative of a non-listener. If a SYN|ACK is received, a RST is immediately sent to tear down the connection (actually our OS kernel does this for us)."
Ok, sounds like we have our first scan type, despite the fact that we are receiving RST and ACK, not only RST. To shed light on this, let's have a look at the holy bible of TCP/IP: Stevens, Richard: «TCP/IP Illustrated, Volume 1: The Protocols». Chapter 18 explains the TCP three-way handshake (page 231 f.) and the default behaviour for a connection request to a nonexistent port (page 247). The example shows that the ACK flag is used to tie the RST packet to the SYN packet that triggered it, so we really got our first hit here: a TCP SYN (half-open) scan.
To determine the open ports that this scan found, I had to look for responses from 192.168.0.99 that have the SYN and ACK flags set.
I did this by printing the packet summary to a file (with ethereal), separating out packets no. 5 to 148006, which seem to contain the TCP SYN
scan, and grepping the resulting file for "SYN, ACK". This reveals the following ports as open:
as shown in packets no. 18332, 18686, 19610, 20890, 23772 and 28876.
as shown in packets no. 123013, 123571, 124339, 126049, 128841 and 134737.
as shown in packets no. 97157, 97175, 97581, 97665, 98291, 98353, 99920, 99976, 102900, 102962, 108608 and 108668.
as shown in packets no. 81061, 81643, 82313, 83653, 86180 and 91488.
as shown in packets no. 98978, 99496, 100278, 101852, 104608 and 109882.
as shown in packets no. 85062, 85086, 85494, 85528, 86178, 86328, 87502, 87552, 90064, 90122, 95630 and 95690.
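The grep step above can also be sketched in a few lines of Python. This is only an illustration, assuming the pipe-delimited summary rows shown in this write-up; the function names `parse_row` and `open_ports_syn_scan` are made up for this sketch and are not part of ethereal or snort.

```python
import re

# Info column layout assumed here: "<src port> > <dst port> [FLAGS] Seq=..."
# The flag list is optional, because NULL-scan probes carry no flags at all.
INFO_RE = re.compile(r"^(\S+) > (\S+)\s*(?:\[([^\]]*)\])?")


def parse_row(row):
    """Split one pipe-delimited summary row into a dict, or return None."""
    fields = row.strip().strip("|").split("||")
    if len(fields) != 6 or fields[4] != "TCP":
        return None
    match = INFO_RE.match(fields[5])
    if match is None:
        return None
    flag_text = match.group(3) or ""
    return {
        "no": int(fields[0]),
        "src": fields[2],
        "dst": fields[3],
        "sport": match.group(1),
        "dport": match.group(2),
        "flags": {f.strip() for f in flag_text.split(",") if f.strip()},
    }


def open_ports_syn_scan(rows, target):
    """Collect the source ports of SYN+ACK replies sent by the target."""
    ports = set()
    for row in rows:
        pkt = parse_row(row)
        if pkt and pkt["src"] == target and pkt["flags"] == {"SYN", "ACK"}:
            ports.add(pkt["sport"])
    return ports
```

Calling `open_ports_syn_scan(rows, "192.168.0.99")` on the rows of the SYN scan segment should give the same set of ports as the grep. Ports come back in whatever form the summary prints them, so service names and numbers may be mixed.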
The second scan type.
Our next "nmap initialization" sequence takes us to packet 148007 and is again 4 packets long. Let's have a look at the following four packets:
|148011||1284.924394||192.168.0.9||192.168.0.99||TCP||42294 > iclpv-nls  Seq=0 Ack=0 Win=4096 Len=0|
|148012||1284.924445||192.168.0.99||192.168.0.9||TCP||iclpv-nls > 42294 [RST, ACK] Seq=0 Ack=0 Win=0 Len=0|
|148013||1284.924437||192.168.0.9||192.168.0.99||TCP||42294 > intecourier  Seq=0 Ack=0 Win=4096 Len=0|
|148014||1284.924476||192.168.0.99||192.168.0.9||TCP||intecourier > 42294 [RST, ACK] Seq=0 Ack=0 Win=0 Len=0|
What's that? The attacker doesn't send any TCP flags at all. Looks like a NULL scan. The nmap man page is again very informative: «The idea is that closed ports are required to reply to your probe packet with an RST, while open ports must ignore the packets in question (see RFC 793 pp 64). [...] The Null scan turns off all flags.»
So our next scan type seems to be an nmap NULL scan. Any closed port will respond with a packet containing the RST and ACK flags, so we have to look for probes without an answer to find the open ports. These are the following ports (found by manual inspection): ssh (22)
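The manual inspection can be approximated by a small script: collect every port the attacker probed, then remove every port the target answered from. The following is a sketch under the assumption that the input is the same pipe-delimited summary format as above; `unanswered_ports` is a hypothetical helper name. A missing reply only means "open or filtered" (a lost packet looks the same), so the result is a list of candidates, not a proof.

```python
import re

# Only the two port fields are needed; flags are irrelevant for this check.
PORTS_RE = re.compile(r"^(\S+) > (\S+)")


def unanswered_ports(rows, attacker, target):
    """Return destination ports probed by attacker that target never replied from."""
    probed = set()
    answered = set()
    for row in rows:
        fields = row.strip().strip("|").split("||")
        if len(fields) != 6 or fields[4] != "TCP":
            continue
        match = PORTS_RE.match(fields[5])
        if match is None:
            continue
        sport, dport = match.groups()
        if fields[2] == attacker and fields[3] == target:
            probed.add(dport)        # probe towards the target
        elif fields[2] == target and fields[3] == attacker:
            answered.add(sport)      # any reply (here: RST, ACK) marks the port closed
    return probed - answered
```

The same function works for the Xmas tree segment, since both scan types rely on silence from open ports.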
The third scan type.
The next scan type shows up beginning with packet 150759. Four typical packets of this type are:
|150759||1417.570462||192.168.0.9||192.168.0.99||TCP||58163 > 963 [FIN, PSH, URG] Seq=0 Ack=0 Win=1024 Urg=0 Len=0|
|150760||1417.570495||192.168.0.99||192.168.0.9||TCP||963 > 58163 [RST, ACK] Seq=0 Ack=1 Win=0 Len=0|
|150761||1417.570519||192.168.0.9||192.168.0.99||TCP||58163 > 700 [FIN, PSH, URG] Seq=0 Ack=0 Win=1024 Urg=0 Len=0|
|150762||1417.570530||192.168.0.99||192.168.0.9||TCP||700 > 58163 [RST, ACK] Seq=0 Ack=1 Win=0 Len=0|
The characteristic combination of the FIN, PSH and URG flags reveals this one as an nmap Xmas tree scan. Right from the man page: "...while the Xmas tree scan turns on the FIN, URG and PUSH flags."
Again, any closed port will respond with a packet containing the RST and ACK flags, so we have to look for probes without an answer to find the open ports. These are the following ports (found by manual inspection): ssh (22)
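The three scan types identified so far differ only in the TCP flags of the probe, so the distinction can be written down as a small lookup. This is a sketch of the reasoning used in this analysis, not of how snort or nmap classify anything; the names are invented for illustration. Note that a lone SYN is also the first packet of an ordinary connection, so the label only makes sense in the context of a sweep over many ports.

```python
# Map the exact flag combination of a probe to the scan type inferred in the text.
SCAN_TYPES = {
    frozenset({"SYN"}): "SYN (half-open) scan",
    frozenset(): "NULL scan",
    frozenset({"FIN", "PSH", "URG"}): "Xmas tree scan",
}


def classify_probe(flags):
    """Return the scan type suggested by a probe's TCP flags, or 'unclassified'."""
    return SCAN_TYPES.get(frozenset(flags), "unclassified")


if __name__ == "__main__":
    for probe in (["SYN"], [], ["FIN", "PSH", "URG"], ["RST", "ACK"]):
        print(probe, "->", classify_probe(probe))
```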
The fourth scan type.
To get a clue about the fourth scan type, let's look at two probes of it:
|156008||1612.404870||192.168.0.1||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156009||1612.404914||192.168.0.254||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156010||1612.404949||192.168.0.9||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156011||1612.404972||192.168.0.99||192.168.0.9||TCP||qbikgdp > 35964 [RST, ACK] Seq=0 Ack=1 Win=0 Len=0|
|156012||1612.404992||192.168.0.199||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156013||1612.405153||192.168.0.1||192.168.0.99||TCP||35964 > gridgen-elmd [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156014||1612.405159||192.168.0.254||192.168.0.99||TCP||35964 > gridgen-elmd [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156015||1612.405221||192.168.0.9||192.168.0.99||TCP||35964 > gridgen-elmd [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
|156016||1612.405245||192.168.0.99||192.168.0.9||TCP||gridgen-elmd > 35964 [RST, ACK] Seq=0 Ack=1 Win=0 Len=0|
|156017||1612.405236||192.168.0.199||192.168.0.99||TCP||35964 > gridgen-elmd [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|
So what do we have here? It looks similar to the Xmas tree scan, except that three other source IP addresses are involved. They send exactly the same packets to the target, but the capture never shows an answer to them. Here we have an nmap Xmas tree scan with the decoy option, which is used to cloak the attacker's real IP. But here it seems that two of the other three IP addresses, namely 192.168.0.1 and 192.168.0.199, do not belong to machines that exist on the network: there are no responses. 192.168.0.254 gets no response either. It is the highest host address of the network 192.168.0.0/24 rather than its broadcast address (which would be 192.168.0.255), so the most likely explanation is again that no machine uses it. Had a decoy really been a broadcast address, there would be no answer in any case, because a tcp/udp datagram whose source address does not define a single host is not answered (Stevens, page 71).
The decoy method only really cloaks the attacker if the other hosts are up and responding, but the pattern used here looks very much like an nmap Xmas tree scan with the decoy option.
A short analysis of a similar scan can be found on whitehats.com.
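The observation that only one of the four sources ever gets an answer can be checked mechanically. The sketch below assumes the same pipe-delimited summary rows; `split_real_and_decoy` is a hypothetical name. It only works for captures like this one, where the decoys are silent: if the decoy hosts were up, the target would answer all of them and this simple test could not tell them apart.

```python
def split_real_and_decoy(rows, target):
    """Split sources probing the target into (answered, never answered) sets."""
    senders = set()
    replied_to = set()
    for row in rows:
        fields = row.strip().strip("|").split("||")
        if len(fields) < 4:
            continue
        src, dst = fields[2], fields[3]
        if dst == target:
            senders.add(src)         # someone probing the target
        elif src == target:
            replied_to.add(dst)      # the target answering someone
    return senders & replied_to, senders - replied_to


if __name__ == "__main__":
    sample = [
        "|156008||1612.404870||192.168.0.1||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|",
        "|156010||1612.404949||192.168.0.9||192.168.0.99||TCP||35964 > qbikgdp [FIN, PSH, URG] Seq=0 Ack=0 Win=3072 Urg=0 Len=0|",
        "|156011||1612.404972||192.168.0.99||192.168.0.9||TCP||qbikgdp > 35964 [RST, ACK] Seq=0 Ack=1 Win=0 Len=0|",
    ]
    print(split_real_and_decoy(sample, "192.168.0.99"))
```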
The fifth scan type.
After eliminating the scan data recognised so far from our analysis file, the remaining packets don't show a particular scan type at first glance. Knowing the nmap man page, there is one big item left: OS detection.
The nmap man page did not give me a concrete idea of what such an OS detection scan looks like, so I decided to run one against my own computers (nmap -sS -p22 -O 192.168.17.7) and compare the results.
After doing this, the result is clear: the fifth method is OS detection. You can check this by comparing the analysis file with the capture of my test.
Question 1. What is a binary log file and how is one created?
Normally, a logfile contains some sort of metadata: data about events that happened. This type of logfile is used on Unix systems to log the system messages, the actions of the mailserver, etc. Only the type of event is logged, not the actual data of the event. These logfiles are created by the syslog daemon.
A binary log file is a file with data that is not human readable. The logfiles mentioned above are often in ascii text format, so that they can easily be read with a pager like "more" or "less" or an editor like "vi". Before you can read a binary logfile, some software must convert the information into a human readable format. This may be the software that created the logfile, but it can also be done by other programs that understand the binary format.
As opposed to the syslogd system logs mentioned in paragraph one, the tcpdump binary log format does not contain metadata; instead it contains the raw network data that passed the interface. This raw data can later be converted to human readable formats without losing any of the information in it. It's comparable to a database, where you create "views" of the contained data to suit your needs.
A binary log file is created by some type of software program, in our case it's snort, an opensource software that has three main modes:
sniffer, packet logger and network intrusion detection system.
The binary file was created by running snort in packet logger mode. Normally it would log the packets to a file in ascii text format, but if you run it with the command "snort -l ./log -b", the "-b" switch tells it to log in binary mode. The authors of snort recommend this if you're on a high speed network or if you want to log the packets in a more compact form for later analysis (SnortUsersManual.pdf, page 5).
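To make "not human readable" a bit more concrete, here is a minimal sketch of how a program can read such a file, assuming it uses the classic tcpdump/libpcap layout: a 24-byte global header followed by one 16-byte record header per packet. The function names are invented for this illustration, and newer variants of the format (for example pcapng) are not handled.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4  # magic number of the classic pcap format


def read_pcap_header(data):
    """Decode the 24-byte global header and detect the file's byte order."""
    if len(data) < 24:
        raise ValueError("file too short for a pcap global header")
    if struct.unpack("<I", data[:4])[0] == PCAP_MAGIC:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] == PCAP_MAGIC:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    _magic, major, minor, _zone, _sigfigs, snaplen, linktype = struct.unpack(
        endian + "IHHiIII", data[:24]
    )
    return {"endian": endian, "version": (major, minor),
            "snaplen": snaplen, "linktype": linktype}


def iter_packets(data):
    """Yield (timestamp, raw packet bytes) for every record in the file."""
    endian = read_pcap_header(data)["endian"]
    offset = 24
    while offset + 16 <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        yield ts_sec + ts_usec / 1e6, data[offset:offset + incl_len]
        offset += incl_len
```

Reading the file with `open(path, "rb").read()` and passing the bytes to `iter_packets` gives the raw frames; turning them into readable lines is exactly the job that tcpdump, ethereal or snort do.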
Question 2. What is MD5 and what value does it provide?
MD5 is a member of the family of one-way hash functions. These are like digital fingerprints: small pieces of data that can serve to identify much larger digital objects.
They are called one-way because of their mathematical nature. Anyone can compute the one-way hash of anything. However, it is meant to be computationally infeasible to create another digital object that hashes to the same value, or to derive the digital object's original state from the hash.
Hash functions can also provide a measure of authentication and integrity. With the md5 hash of the Binary given, I could compute the md5 of the downloaded binary and compare them. If they are the same, the file was not altered. This way I have only to compare a short string instead of a huge file.
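The comparison described above takes only a few lines with Python's standard `hashlib` module. This is a sketch; the function names and the chunk size are arbitrary choices for the example, and the published hash has to come from the challenge page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's MD5 digest with a published hex string."""
    return md5_of_file(path) == published_hex.strip().lower()
```

Reading in chunks keeps memory use constant, which matters for a capture file of this size.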
Hash functions have an enormous range of applications in cryptography and computer security. Almost every Internet protocol uses
them to process keys, chain a sequence of events together, or authenticate events. They are essential for digital signature algorithms.
(compare: Schneier, page 94)
MD5 is explained in great detail in RFC 1321.
Question 3. What is the attacker's IP address?
There are 5 IP addresses involved in the binary: 192.168.0.1, 192.168.0.9, 192.168.0.99, 192.168.0.199 and 192.168.0.254.
As a result of the analysis (see section 2), we already know that the scanning was done using the nmap portscanner. In the nmap man page it says that nmap pings a target host by default to check if a host is responding. Then only hosts that respond are scanned. "Nmap can do this by sending ICMP echo request packets to every IP address on the networks you specify. Hosts that respond are up. Unfortunately, some sites such as microsoft.com block echo request packets. Thus nmap can also send a TCP ack packet to (by default) port 80. If we get an RST back, that machine is up. [...] By default (for root users), nmap uses both the ICMP and ACK techniques in parallel. [...] Note that pinging is done by default anyway, and only hosts that respond are scanned." (man 1 nmap).
Looking at the first 4 packets of the binary, we can see exactly that behaviour:
|1||0.000000||192.168.0.9||192.168.0.99||ICMP||Echo (ping) request|
|2||0.000078||192.168.0.99||192.168.0.9||ICMP||Echo (ping) reply|
|3||0.000044||192.168.0.9||192.168.0.99||TCP||52218 > http [ACK] Seq=2347237379 Ack=4094787819 Win=2048 Len=0|
|4||0.000119||192.168.0.99||192.168.0.9||TCP||http > 52218 [RST] Seq=4094787819 Ack=0 Win=0 Len=0|
From these packets and the information from the nmap man page we can say that the attacker's IP address is 192.168.0.9.
One could argue that this address is spoofed, but that makes no sense: in that case the replies would never reach the attacker, his nmap would not recognise the target system as up, and it would therefore not scan it.
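The "ICMP echo plus TCP ACK to port 80" pattern can also be searched for automatically, which is handy for finding the start of every scan in the capture. The sketch below assumes the pipe-delimited summary rows used in this write-up, in which port 80 may be printed as "http"; `find_nmap_ping_pairs` is a hypothetical name, and matching on the info text is a simplification.

```python
def find_nmap_ping_pairs(rows):
    """Return (src, dst) pairs that sent both an ICMP echo request and a bare TCP ACK to port 80."""
    echo_pairs = set()
    ack80_pairs = set()
    for row in rows:
        fields = row.strip().strip("|").split("||")
        if len(fields) != 6:
            continue
        _no, _time, src, dst, proto, info = fields
        if proto == "ICMP" and info.startswith("Echo (ping) request"):
            echo_pairs.add((src, dst))
        elif proto == "TCP" and "[ACK]" in info and " > " in info:
            dport = info.split(" > ", 1)[1].split()[0]
            if dport in ("http", "80"):
                ack80_pairs.add((src, dst))
    return echo_pairs & ack80_pairs
```

Applied to the four packets above, the only pair that shows both probes is 192.168.0.9 scanning 192.168.0.99.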
Question 4. What is the destination IP address?
As shown in the answer of question 3, the destination IP address is 192.168.0.99.
Question 5. We scanned the honeypot using five different methods. Can you identify the five different scanning methods, and describe how each of the five works?
Question 6. Which scanning tool was used to scan our honeypot? How were you able to determine this?
The analysis in chapter 2 showed that nmap was used for scanning.
nmap is a very widespread and also very sophisticated scanning tool. As with most other networking software, the packets created by nmap have some unique identifiers that allow network intrusion detection tools like snort to recognize the scanning tool.
In addition, nmap is heavily used by whitehats and blackhats alike, which is why intrusion detection systems include detailed "nmap recognition rules".
Question 7. What is the purpose of port scanning?
One of the first phases in any attempt to break into a host on a network is to do some kind of reconnaissance on the network or a particular host. An attacker might have a new piece of code that was just released which enables him to get root access to a host if he can find a vulnerable host. Or, an attacker might just be interested in getting into a host or multiple hosts in any way possible. Different hackers have different goals for hacking. Perhaps the host or network is being sought to participate in a distributed denial-of-service attack. Or perhaps the interest is in compromising a host from which to launch other attacks and hide the true identity of the hacker.
The attacker must scan the network in some fashion to discover live hosts and later discover hosts susceptible to exploits by scanning
service ports. For instance, the attacker might have acquired some software that could gain root access on hosts offering vulnerable
imap servers. Chances are good that he would scan the network for any host listening on the imap port. After discovering those, the
attacker might try to execute the imap exploit code on hosts running imap.
(See: Northcutt et al., page 79.)
Question 8. What ports were found open on our honeypot?
As shown by the analysis, the following ports were open on the honeypot: ssh (22)
Bonus Question. What operating system was the attacker using?
That's a really difficult question. The only information is the binary file that the Project provided. To reveal the attacker's operating system, we depend on passive fingerprinting as explained in Chapter 7 of "Know your Enemy". The database of signatures provided by this book didn't help me. But I found a paper by Lance Spitzner mentioning a tool for passive fingerprinting, called p0f.
I quickly installed it from the FreeBSD ports tree and gave it the binary to eat. After a few seconds, it spit out:
192.168.0.9 [1 hops]: Linux 2.4.2 - 2.4.14 (1).
The attacker was running a Linux distribution with a kernel between version 2.4.2 and 2.4.14.