Project Honeynet Scan of the Month 24

 Table of Contents

Brief Answers to Questions
    Question 1
    Question 2
    Question 3
    Question 4
    Question 5
    Question 6
Detailed Analysis
    Initial Approach
    Second Approach
    The Bonus Question
Data Files

redhive laboratories

Pedram Amini
Doug Brown
Eric Clore

This month's project involved the forensic analysis of a floppy that was seized by the police from a suspected drug dealer (a made-up scenario). You may want to read the police report for the appropriate background information. The following is a detailed write-up covering our methodologies and conclusions.

We begin with brief answers to the six questions that were asked. We then discuss our analysis and conclusions in detail. It is important to note that we do not just discuss the correct approach but our entire approach, incorrect paths and all. We end the write-up with a listing of the tools and reference materials we used to conduct our analysis.

Brief Answers to Questions

1. Who is Joe Jacob's supplier of marijuana and what is the address listed for the supplier?

By examining Jimmy Jungle.doc, which is apparently a letter from Joe to his supplier, we determined that the following is the name and address of Joe Jacobs' supplier:

Jimmy Jungle
626 Jungle Ave Apt 2
Jungle, NY 11111

2. What crucial data is available within the coverpage.jpg file and why is this data crucial?

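The cover page file contains the password to the encrypted zip file that holds the scheduled visits spreadsheet. The password can be found at the end of the file (offset 0x3d20 within the file) as "pw=goodtimes". This is crucial because the spreadsheet cannot be read without the password.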

3. What (if any) other high schools besides Smith Hill does Joe Jacobs frequent?

After successfully opening the encrypted scheduled visits spreadsheet it is apparent that Joe Jacobs also frequents the following high schools:

Key High School
Leetch High School
Birard High School
Richter High School
Hull High School

4. For each file, what processes were taken by the suspect to mask them from others?

The restored and repaired floppy image contains 3 files. These files were modified to mask them from others in the following ways:

Jimmy Jungle.doc:

  • This file was deleted which means the FAT table entry for this file was wiped and the beginning of the directory entry for this file (offset 0x2600) was set to 0xe5.

cover page.jpg:

  • Renamed to a .jpgc extension.
  • At offset 0x26ba (bytes 26-27 of the directory entry) the starting cluster was changed to 0x01a4. The correct value is 0x002a.

schedu~1.exe:

  • Renamed from .zip to .exe.
  • The file length at offset 0x271c (bytes 28-31 of the directory entry) was changed from 2416 bytes (0x0970, stored on disk as 70 09) to 1000 bytes (0x03e8, stored on disk as e8 03).

    5. What processes did you (the investigator) use to successfully examine the entire contents of each file?

    The answer to this question lies in the Detailed Analysis section.

    6. Bonus Question: What Microsoft program was used to create the Cover Page file. What is your proof (Proof is the key to getting this question right, not just making a guess).

    The cover page file contains padding in the 4 byte repeating sequence 0x28a28a00. We created JFIF images using Microsoft Paint and other programs. Of the programs we tested, only Microsoft Paint produced the 0x28a28a00 padding, so we conclude that the file was created with Microsoft Paint.

    Detailed Analysis

    Initial Approach

    First things first: we need to download, verify, and decompress the floppy image. We began the analysis on our Linux box as it provides all of our favorite standard tools (grep, strings, perl, etc.) plus excellent forensic tools.

        $ curl >
          % Total    % Received % Xferd  Average Speed          Time             Curr.
                                         Dload  Upload Total    Current  Left    Speed
        100 18146  100 18146    0     0  14077      0  0:00:01  0:00:01  0:00:00 34284
        $ md5sum 
        $ unzip 
        inflating: image                   
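
The verification step can also be scripted. The following is a minimal Python sketch, not part of the original session: it hashes a downloaded file with MD5 and compares the result with a published checksum. The function names and the chunk size are assumptions made for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_hex):
    """Compare case-insensitively, ignoring surrounding whitespace."""
    return md5_of_file(path) == published_hex.strip().lower()
```

Reading in chunks keeps memory use flat, which matters little for a floppy image but costs nothing.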

    We check to see what 'file' has to say about image and then we mount it.
        $ file image
        image: x86 boot sector, system MSDOS5.0, FAT (12 bit)
        # insmod vfat
        # insmod loop
        # losetup /dev/loop1 image
        # mkdir mounted
        # mount -o ro /dev/loop1 mounted/
        # cd mounted/
        # ls
        cover page.jpgc             schedu~1.exe

    'file' reports that image is a FAT12 MS-DOS disk. We switch to root, load the FAT module, load the loop module, set up a loop device, and mount the image on that device. You'll notice that we don't use the noexec and nosuid flags for mount because we trust 'file' and because we like to live dangerously. The listing shows two files: a JPEG and an executable. The familiar .exe extension further confirms that we are indeed dealing with a DOS image. Let's try to figure out what these files are.
        # file *
        cover page.jpgc           : PC formatted floppy with no filesystem
        schedu~1.exe:               Zip archive data, at least v2.0 to extract
        # hexdump cover\ page.jpgc\ \ \ \ \ \ \ \ \ \ \  
        0000000 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6 f6f6
        0000200 0000 0000 0000 0000 0000 0000 0000 0000
        # cp schedu~1.exe ..
        # cd ..
        # unzip
        # mv schedu~1.exe 
        # unzip 
          End-of-central-directory signature not found.  Either this file is not
          a zipfile, or it constitutes one disk of a multi-part archive.  In the
          latter case the central directory and zipfile comment will be found on
          the last disk(s) of this archive.
        note: may be a plain executable, not an archive
        unzip:  cannot find zipfile directory in one of or
      , and cannot find, period.

    The initial thought that cover page.jpgc was a JPEG was dismissed by 'file'. The contents of the file were extremely confusing: a slew of 0xf6's followed by a slew of nulls. This raised a series of questions. What is this file? An XOR table to be used in the future? Perhaps the file's statistics (offsets, sizes, etc.) provide some kind of important information. Why does the filename contain trailing spaces?

    The executable is reportedly a zip file. We renamed it and attempted to decompress it, but failed. Is it corrupt or is it not a zip file? We jumped to Google and found a reference on zip file formats [4]. Examining the contents of the file verifies that it is indeed a zip file; however, the trailing PK tag was missing, which rendered the file unrecognizable by standard zip utilities. We restored the zip file in two ways: by hand (beyond the scope of this document) and the easy way, using PKZIPFIX, a DOS utility from the old-school pkzip package. We transferred the fixed zip file to our Windows box and were able to successfully open it using Windows compressed folders. The zip file contained an encrypted Excel spreadsheet, "scheduled visits.xls". Despite our best efforts we were never able to completely restore the zip file so that it could be recognized by *all* compression utilities.
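
To make the check against the zip format reference concrete, here is a minimal Python sketch that scans a byte string for the three record signatures defined by the ZIP format. It is an illustration rather than the procedure used above, and the function names are hypothetical. A file that lacks the end-of-central-directory signature is exactly the kind that 'unzip' rejects with the message shown earlier.

```python
ZIP_SIGNATURES = {
    b"PK\x03\x04": "local file header",
    b"PK\x01\x02": "central directory file header",
    b"PK\x05\x06": "end of central directory record",
}


def find_zip_records(data):
    """Return a sorted list of (offset, description) for each ZIP signature found."""
    hits = []
    for signature, description in ZIP_SIGNATURES.items():
        position = data.find(signature)
        while position != -1:
            hits.append((position, description))
            position = data.find(signature, position + 1)
    return sorted(hits)


def has_end_record(data):
    """True if the end-of-central-directory signature is present anywhere."""
    return data.find(b"PK\x05\x06") != -1
```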

    At this point we thought we had everything we needed and came up with three methods to crack the password:

    • Dictionary attack
      • Using our standard dictionary which contains 1,450,244 words.
      • Using a small dictionary that we would create out of the words from the police report. This was initially generated by feeding report.txt to " tr ' ' '\n' ", and then cleaned up and tweaked by hand.
    • Partial known-plaintext attack [1, 3].
    • Brute force attack.
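
The small dictionary described in the list above was built with tr and hand editing. A rough Python equivalent, offered only as a sketch, is shown here: it splits text on whitespace, strips surrounding punctuation, and keeps each word once in both its original and lowercase form. The function name and the minimum length of 3 are assumptions.

```python
import re


def build_wordlist(text, min_length=3):
    """Split text on whitespace, strip surrounding punctuation, and de-duplicate."""
    seen = set()
    words = []
    for token in text.split():
        word = re.sub(r"^\W+|\W+$", "", token)
        for candidate in (word, word.lower()):
            if len(candidate) >= min_length and candidate not in seen:
                seen.add(candidate)
                words.append(candidate)
    return words
```
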

    Searching Google produced a great many tools dedicated to zip password cracking. Experimentation and availability whittled the list down to four tools:

    • Advanced ZIP Password Recovery, Windows, used in dictionary and plaintext attacks.
    • Zip Key, Windows, used in dictionary and plaintext attacks.
    • FCrackZip, Linux, used in plaintext and brute force attacks.
    • Zip Cracker, Linux, used in brute force attack.

    This may be a good time to note that our Windows box is a P4 1.7 GHz with 512 MB of RAM and our Linux box is an AMD Athlon 800 MHz with 256 MB of RAM. The dictionary attack using the police report was a quick failure. The dictionary attack utilizing our standard wordset was also quick to fail, although as we will discuss later this should not have been the case.

    Our next attempt at cracking the zip file was a partial known-plaintext attack [1, 3]. There are two prerequisites to launching this attack. The first is that you need at least 13 bytes of known plaintext. The second is that you must compress the known plaintext using the exact same method used to compress the target zip file. The target zip file contains only one file. Though we cannot be absolutely sure what kind of file it is, the extension suggests that it is probably an Excel spreadsheet. Viewing various Excel files, it is clear that they share a standard header. We extracted 48 bytes of this header and stored it as excel_header.dat. We then compressed that file with various utilities and at various compression levels, and launched the known-plaintext attack against the target zip file with each of the compressed header files using Advanced ZIP Password Recovery. Again we were disappointed, as the results were not fruitful. Our initial thought in response to this failure was that the file was not an actual Excel spreadsheet and was merely named as such to serve as a diversion.
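
One way to find the header that a set of sample files has in common is to compute their longest shared byte prefix. The Python sketch below illustrates the idea and is not the method used to produce excel_header.dat; the function names are hypothetical. The eight-byte OLE2 compound-document magic that begins legacy .xls files is included as a quick sanity check.

```python
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def common_prefix(samples):
    """Return the longest byte prefix shared by every sample."""
    if not samples:
        return b""
    shortest = min(samples, key=len)
    for index, byte in enumerate(shortest):
        if any(sample[index] != byte for sample in samples):
            return shortest[:index]
    return shortest


def looks_like_ole2(data):
    """True if the data begins with the OLE2 compound-document magic."""
    return data.startswith(OLE2_MAGIC)
```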

    At this point we had no option left but to launch a brute force attack. We started the brute force attack on our Linux box using Zip Cracker and, while waiting for the results, restarted our analysis with an entirely new approach. In the end it took Zip Cracker 9 days to determine the password.
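
To give a sense of why brute force is slow, the following Python sketch counts and enumerates candidate passwords. It does not reproduce Zip Cracker's settings, which are not recorded here; the lowercase-only alphabet is an assumption. Under that assumption, searching every password of up to nine characters (the length of "goodtimes") means roughly 5.6 trillion candidates.

```python
import itertools
import string


def keyspace_size(alphabet_size, max_length, min_length=1):
    """Number of strings with length between min_length and max_length inclusive."""
    return sum(alphabet_size ** length for length in range(min_length, max_length + 1))


def candidates(alphabet=string.ascii_lowercase, max_length=3):
    """Yield every string over `alphabet`, shortest first, up to max_length."""
    for length in range(1, max_length + 1):
        for letters in itertools.product(alphabet, repeat=length):
            yield "".join(letters)
```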

    Second Approach

    If you run 'strings' against the image file and examine the output, you will notice that there is significantly more data than what appears when the image is mounted. It's time to throw the image file into our favorite hex/text editor, UltraEdit. We can clearly identify 3 different files:

    • offset 0x4200:0x9060 - Microsoft Word Document (20064 bytes)
    • offset 0x9200:0xcf30 - JPG File (15664 bytes)
    • offset 0xd000:0xd970 - ZIP File (2416 bytes)
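
Extracting the three files by hand amounts to slicing the image at the offsets listed above. A minimal Python sketch follows; it treats each end offset as exclusive, which matches the stated sizes. The dictionary keys are illustrative labels, not names recovered from the image.

```python
FILE_RANGES = {
    "Jimmy Jungle.doc": (0x4200, 0x9060),
    "cover page.jpg": (0x9200, 0xCF30),
    "schedu~1.zip": (0xD000, 0xD970),
}


def carve(image, start, end):
    """Return image[start:end]; `end` is exclusive."""
    if not 0 <= start <= end <= len(image):
        raise ValueError("range lies outside the image")
    return image[start:end]


def carve_all(image, ranges=FILE_RANGES):
    """Carve every named range and return a name-to-bytes mapping."""
    return {name: carve(image, start, end) for name, (start, end) in ranges.items()}
```
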

    Starting at 0xda00 and continuing to the end of the file we see the familiar 0xf6 sequence. Apparently 'file' is smarter than we were because it knew exactly what it was looking at: "PC formatted floppy with no filesystem". This is simply padding that showed up as a file due to incorrect header information. It would be a trivial matter to simply extract these files by hand, but we decided that the more elegant approach would be to restore the obviously damaged filesystem header. As none of us are masters of the FAT12 filesystem, we had to turn to our trusty friend, Google. The best reference we found was "Fun with FAT, undeleting files" [2]. The following table is an extract from [2]:
     --------------------
    |  Dos Boot Code     | <-- 1 sector  -- Starts at 0
    |     FAT #1         | <-- 6 sectors -- Starts at 200h
    |     FAT #2         | <-- 6 sectors -- Starts at 1400h
    |    Directory       | <-- 8 sectors -- Starts at 2600h
    |   Data Section     | <-- Remainder of disk -- Starts at 4200h
     --------------------
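
The start offsets in the table can be derived from the filesystem geometry. The Python sketch below assumes the standard 1.44 MB floppy parameters (1 reserved sector, 2 FATs of 9 sectors each, 224 root directory entries); these values are an assumption, not read from the image. They reproduce the start offsets shown in the table. The sector counts printed in the extract do not match those offsets, so the offsets are the figures to rely on.

```python
SECTOR = 0x200


def fat12_layout(reserved_sectors=1, fat_count=2, sectors_per_fat=9, root_entries=224):
    """Compute byte offsets of the FATs, root directory, and data section."""
    fat_offsets = [
        (reserved_sectors + index * sectors_per_fat) * SECTOR
        for index in range(fat_count)
    ]
    root_offset = (reserved_sectors + fat_count * sectors_per_fat) * SECTOR
    # Each directory entry is 32 bytes; round up to whole sectors.
    root_sectors = (root_entries * 32 + SECTOR - 1) // SECTOR
    data_offset = root_offset + root_sectors * SECTOR
    return {
        "fats": fat_offsets,
        "root": root_offset,
        "root_sectors": root_sectors,
        "data": data_offset,
    }
```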

    We began the reconstruction process with the Microsoft Word document. This file did not appear when we mounted the original image. It is apparent that the file was deleted, as its FAT entries did not exist. To restore them we extracted the file by hand to a fresh floppy and copied the appropriate entries from that floppy's FAT. The file is also marked as erased in the directory entry. This is evident from the 0xe5 at offset 0x2600, which we changed to 0x42.
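
An alternative to copying FAT entries from a fresh floppy is to write the cluster chain directly. The Python sketch below is such an alternative, not the method described above. It assumes the file occupies contiguous clusters, which the offsets found in the image suggest, and it uses 0xFFF as the end-of-chain marker. FAT12 packs two 12-bit entries into three bytes, which is what the get and set helpers handle. The function names are hypothetical.

```python
def fat12_get(fat, cluster):
    """Read the 12-bit FAT entry for `cluster`."""
    offset = cluster + cluster // 2
    word = fat[offset] | (fat[offset + 1] << 8)
    return word >> 4 if cluster & 1 else word & 0x0FFF


def fat12_set(fat, cluster, value):
    """Write a 12-bit FAT entry without disturbing its neighbour."""
    offset = cluster + cluster // 2
    word = fat[offset] | (fat[offset + 1] << 8)
    if cluster & 1:
        word = (word & 0x000F) | ((value & 0x0FFF) << 4)
    else:
        word = (word & 0xF000) | (value & 0x0FFF)
    fat[offset] = word & 0xFF
    fat[offset + 1] = word >> 8


def clusters_needed(size_bytes, cluster_size=0x200):
    """Round a byte count up to whole clusters."""
    return -(-size_bytes // cluster_size)


def link_contiguous(fat, first_cluster, cluster_count):
    """Chain cluster_count consecutive clusters and terminate with 0xFFF."""
    last = first_cluster + cluster_count - 1
    for cluster in range(first_cluster, last):
        fat12_set(fat, cluster, cluster + 1)
    fat12_set(fat, last, 0xFFF)
```

For the 20064-byte Word document starting at cluster 2, `link_contiguous(fat, 2, clusters_needed(20064))` links clusters 2 through 41, which is consistent with the JPG beginning at cluster 42.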

    We next moved on to the reconstruction of the JPG image. To begin with, it was obvious that the long filename in the directory entry had been changed from .jpg to .jpgc. This was fixed by nulling out the 'c' at offset 0x2662. We then wondered why this file showed up as the 0xf6's instead of the actual file. Analyzing the directory entry for this file, it is apparent that the starting cluster value was modified. The current starting cluster at offset 0x26ba (bytes 26-27) is 0x01a4, which needs to be remedied. The data section at 0x4200 begins with cluster 2, so cluster 0 would notionally start at offset 0x3e00. The JPG file starts at offset 0x9200, and each cluster is 0x200 bytes. A little math: 0x9200 - 0x3e00 = 0x5400, and 0x5400 / 0x200 = 42, or 0x002a (the bytes have to be swapped on disk), so we replaced 0x01a4 with 0x002a.
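
The cluster arithmetic can be checked with a few lines of Python. This sketch uses the data section offset from the layout table (0x4200) and the FAT convention that the first data cluster is number 2; the function names are hypothetical. The patch helper writes the value in little-endian order, which is how the directory entry stores it.

```python
import struct

DATA_START = 0x4200
CLUSTER_SIZE = 0x200
FIRST_DATA_CLUSTER = 2


def cluster_for_offset(offset):
    """Map a byte offset in the data section to its cluster number."""
    return (offset - DATA_START) // CLUSTER_SIZE + FIRST_DATA_CLUSTER


def offset_for_cluster(cluster):
    """Map a cluster number to the byte offset where it starts."""
    return DATA_START + (cluster - FIRST_DATA_CLUSTER) * CLUSTER_SIZE


def patch_start_cluster(image, field_offset, cluster):
    """Write `cluster` as a 16-bit little-endian value into a mutable image."""
    struct.pack_into("<H", image, field_offset, cluster)
```

Here `cluster_for_offset(0x9200)` gives 42, while the tampered value 0x01a4 maps to offset 0x38600, well inside the 0xf6-filled part of the image.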

    The final file that needed restoration was the zip file. The first thing that needed to be done was to change the extension from .exe back to .zip (offset 0x2708). Checking through the rest of the directory entry, it becomes apparent that the file length was incorrectly set to 1000 bytes (e8 03 on disk) when it should really be 2416 bytes (70 09 on disk). This change was made at offset 0x271c (bytes 28-31).
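
The fields touched in these repairs all live in the 32-byte short directory entry: the 8.3 name in bytes 0-10, the starting cluster in bytes 26-27, and the file length in bytes 28-31, the latter two little-endian. The Python sketch below decodes just those fields. It is an illustration with a hypothetical function name, not a full FAT parser, and it ignores long-filename entries.

```python
import struct


def parse_short_entry(entry):
    """Decode the fields of a 32-byte FAT short directory entry used in this analysis."""
    if len(entry) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    # "<HI" reads a 16-bit cluster and a 32-bit size with no padding.
    start_cluster, size = struct.unpack_from("<HI", entry, 26)
    return {
        "deleted": entry[0] == 0xE5,
        "name": entry[0:8].decode("ascii", "replace").rstrip(),
        "ext": entry[8:11].decode("ascii", "replace").rstrip(),
        "start_cluster": start_cluster,
        "size": size,
    }
```

Writing the corrected length back is the mirror image: `struct.pack_into("<I", entry, 28, 2416)` stores the bytes 70 09 00 00.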

    At offset 0xcf20 we find this very intriguing line: "pw=goodtimes". Could this be the password to the encrypted zip file? Apparently it is. What is interesting is that "goodtimes" exists in our standard dictionary, so the dictionary attack must have failed due to the improper formatting of the zip file. This line falls right below the JPG file. Is this the crucial data that question number 2 is referencing? We weren't convinced at first. The fact that the JPG was grainy led us to believe that perhaps the image contained some steganographic data. We ran Stegdetect against the file and did not find anything. This was enough evidence to convince us to change the file length of the JPG to officially contain the "pw=goodtimes" string.
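
Strings like this one can be located programmatically, along with their offsets. The Python sketch below behaves roughly like 'strings' with offsets enabled: it yields every run of printable ASCII of a minimum length together with its position. The function name and the default minimum length of 4 are assumptions. Subtracting the JPG's start offset (0x9200) from an image offset gives the position within the file, which is how 0xcf20 in the image corresponds to 0x3d20 in the cover page.

```python
import re


def find_ascii_strings(data, min_length=4):
    """Yield (offset, text) for each run of printable ASCII in `data`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")
```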

    The Bonus Question

    There really isn't much more to this than what was mentioned in the answers section. The repeating 0x28a28a00 pattern really caught our attention, and we thought that perhaps this sequence was unique to a certain Microsoft program. It didn't take much effort to create a series of JPG (JFIF) images using different image applications and compare them all. Our theory quickly proved to hold water. While we could be wrong, we've already beaten this thing to death and didn't feel like proceeding any further.
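
Comparing the padding across many test images is easier with a small helper. The Python sketch below finds the longest back-to-back run of a 4-byte pattern in a file's contents. It assumes the pattern appears in the file in the byte order written above (28 a2 8a 00); the function name is hypothetical, and this is an illustration rather than the comparison procedure actually used.

```python
PAINT_PADDING = bytes.fromhex("28a28a00")


def longest_repeat(data, pattern=PAINT_PADDING):
    """Return (offset, count) of the longest back-to-back run of `pattern` in `data`."""
    best_offset, best_count = -1, 0
    position = data.find(pattern)
    while position != -1:
        count = 1
        while data.startswith(pattern, position + count * len(pattern)):
            count += 1
        if count > best_count:
            best_offset, best_count = position, count
        # Resume the search just past the run that was measured.
        position = data.find(pattern, position + count * len(pattern))
    return best_offset, best_count
```

A result of `(-1, 0)` means the pattern does not occur at all, which is what one would expect from an image saved by a program that pads differently.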

    Data Files

    Files extracted from repaired image:
    cover page.jpg JPG (JFIF) image that contained zip password
    Jimmy Jungle.doc Word document from Joe Jacobs to his supplier
    Scheduled Visits.xls Excel spreadsheet containing Joe Jacobs' high school visitation schedule

    Other files:

    image.dat Final repaired floppy image
    excel_header.dat Standard Excel header contents used in known plain-text attack
    report.txt Original police report
    police_report_words The original police report split into one word/phrase per line and used in the dictionary attack

    Tools

    Advanced ZIP Password Recovery
    Zip Cracker
    Zip Key

    References

    [1] "A Known Plaintext Attack on the PKZIP Stream Cipher", Eli Biham & Paul C. Kocher
    [2] "fat12 undelete text"
    [3] "ZIP Attacks with Reduced Known-Plaintext", Michael Stay
    [4] "ZIP archive file format"