09 May, 2009

Extracting emails from archived Sguil transcripts

Here is a Perl script I wrote to extract emails and attachments from archived Sguil transcripts. It's useful for grabbing suspicious attachments for analysis.

In Sguil, whenever you view a transcript it will archive the packet capture on the Sguil server. You can then easily use that packet capture to pull out data with tools like tcpxtract or tcpflow along with Perl's MIME::Parser in this case. The MIME::Parser code is modified from David Bianco's blog.

As always with Perl or other scripts, I welcome constructive feedback. The first regular expression is fairly long and may scroll off the page, so make sure you get it all if you copy it.


# by nr
# 2009-05-04
# A perl script to read tcpflow output files of SMTP traffic.
# Written to run against a pcap archived by Sguil after viewing the transcript.
# 2009-05-07
# Updated to use David Bianco's code with MIME::Parser.
# http://blog.vorant.com/2006/06/extracting-email-attachements-from.html

use strict;
use MIME::Parser;

my $fileName; # var for tcpflow output file that we need to read
my $outputDir = "/var/tmp"; # directory for email+attachments output

if (@ARGV != 1) {
print "\nOnly one argument allowed. Usage:\n";
die "./emailDecode.pl /path/archive/\:62313_192.168.1.8\:25-6.raw\n\n";

$ARGV[0] =~ m
or die "\nIncorrect file name format or dst port is not equal to 25. Try again.\n\n";

system("tcpflow -r $ARGV[0]"); # run tcpflow w/argument for path to sguil pcap

my $srcPort = sprintf("%05d", $2); # pad srcPort with zeros
my $dstPort = sprintf("%05d", $4); # pad dstPort with zeros

# Put the octest and ports into array to manipulate into tcpflow fileName
my @octet = split(/\./, "$1\." . "$srcPort\." . "$3\." . "$dstPort");

foreach my $octet(@octet) {
my $octetLength = length($octet); # get string length
if ($octetLength < 5) { # if not a port number
$octet = sprintf("%03d", $octet); # pad with zeros
$fileName = $fileName . "$octet\."; # concatenate into fileName

$fileName =~ s/(.+\d{5})\.(.+\d{5})\./$1-$2/; # replace middle dot with hyphen
my $unusedFile = "$2-$1"; # this is the other tcpflow output file

# open the file and put it in array
open INFILE, "<$fileName" or die "Unable to open $fileName $!\n";
my @email = <INFILE>;
close INFILE;

my $count = 0;
# skip extra data at beginning
foreach my $email(@email) {
if ($email =~ m/^Received:/i) {
else {
delete @email[$count];
$count ++;

my $parser = new MIME::Parser;
my $entity = $parser->parse_data(\@email); # parse the tcpflow data
$entity->dump_skeleton; # be verbose when dumping

unlink($fileName, $unusedFile); # delete tcpflow output files

No comments:

Post a Comment