This directory contains all the spam that I have received since early 1998. I have employed various "bait" addresses, such as <bait@em.ca>, to trick email address harvesters into putting them on spam lists. The archives have been (re)compressed with p7zip, which produces files about half the size of tar+bzip2, and smaller even than I was able to achieve with RAR on Linux.
This archive is provided for the purposes of researching the behavior of spammers and developing new spam management techniques. Permission is hereby granted to use this archive without restriction. If you publish any research or software based on this archive, I would appreciate a reference to this archive in said work, but it is not required. I would also like to see any such work, and may link it here.
If you have any comments, please e-mail me at bruce@untroubled.org.
NOTE: Most of the messages in this archive contain forged headers in one form or another. The fact that a message claims to have come from one particular address or another does not mean it actually originated at that address. The only way to determine where a message originated is to do a careful study of the Received: headers, and even then much of the information cannot be trusted.
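As a starting point for that kind of study, the following is a minimal sketch, in Python, of pulling the Received: headers out of a single message using the standard library's email package. The function name received_chain and the sample message are hypothetical and are not taken from the archive. The sketch assumes the message is available as a text string. Each mail server normally prepends its own Received: header, so the first entry is the most recent hop and later entries are progressively easier for a sender to forge.

```python
from email import message_from_string


def received_chain(raw_message):
    """Return the Received: header values in the order they appear
    in the message (most recently added hop first)."""
    msg = message_from_string(raw_message)
    # get_all() returns None when the header is absent.
    hops = msg.get_all("Received") or []
    # Collapse folded (multi-line) header values onto single lines.
    return [" ".join(hop.split()) for hop in hops]


# Hypothetical sample message; addresses use reserved example ranges.
sample = (
    "Received: from mail.example.net (unknown [192.0.2.10])\n"
    "\tby mx.example.org with SMTP; Mon, 1 Jan 2001 00:00:00 -0000\n"
    "Received: from forged.example.com (HELO forged.example.com)\n"
    "\tby mail.example.net; Mon, 1 Jan 2001 00:00:00 -0000\n"
    "From: someone@forged.example.com\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

for hop in received_chain(sample):
    print(hop)
```

This only lists the hops; deciding which of them to believe still requires comparing each one against servers you know to be under your own control.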
This archive was used in the following reports or sites:
Several people have asked about the file names within the archives. There have been a variety of methods used to move messages from my mailboxes into the archive over time. The formats are:
Name                          Modification Time    Size
Parent Directory              2008-07-02 11:37     -
1997-1998-headers.tar.bz2     1998-03-26 11:15     67k
1997-1998-spam-headers.bz2    1998-03-26 10:48     66k
1998.7z                       2005-09-05 13:33     754k
1999.7z                       2005-09-05 13:33     898k
2000.7z                       2005-09-05 13:34     1.5M
2001.7z                       2005-09-05 13:39     5.8M
2002.7z                       2005-09-05 17:04     11.2M
2003.7z                       2005-09-07 12:57     28.9M
2004.7z                       2005-09-07 12:57     53.6M
2005.7z                       2006-06-19 12:30     37.4M
2006.7z                       2007-08-06 12:25     103.5M
2007.7z                       2008-01-01 15:15     87.1M
2008-01.7z                    2008-02-14 16:58     10.0M
2008-02.7z                    2008-03-11 11:13     10.6M
2008-03.7z                    2008-04-01 10:59     12.8M
2008-04.7z                    2008-05-05 09:23     12.5M
2008-05.7z                    2008-06-01 05:33     15.4M
2008-06.7z                    2008-07-01 05:35     17.8M
2008-07.7z                    2008-07-04 05:32     1.5M