Using Sharkbait

Sharkbait requires a server with PHP 4 or later. You will also need access to an email service that will let you create new addresses on the fly. For this I recommend SpamGourmet.

There are several things to consider when setting up Sharkbait. First, you don't want it to be immediately obvious that you're using Sharkbait. If the spammers catch on, their bots will quickly learn to throw out pages that look like they contain sharkbait. To prevent this, you must take several steps:

This brings up an important point. Most website packages ask that you place some notice on your site "Powered by phpMyFoobar" or the like. I would ask that you not mention Sharkbait at all, as this will give the harvester bots something to help them identify Sharkbait-produced pages. However, I would ask that you drop me a line and let me know the URL of the bait page you've set up.

When setting up Sharkbait, your first impulse will likely be to place hidden sharkbait at the bottom of every page of your site. This is not a good idea. Legitimate bots, such as GoogleBot, would then be generating sharkbait and storing it in a cache for the rest of the world to use. Then, when you got spam, the IP address in the sharkbait would point back to GoogleBot. Not a good situation. Instead, you should put sharkbait on pages specifically written to display sharkbait, and place a rule in your robots.txt file excluding all bots from viewing those pages. The good bots will honor this rule, and the really nasty harvester bots will ignore it.
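
As a rough sketch of the robots.txt rule described above, assuming the bait page lives at the hypothetical path /bait/email.php (substitute whatever path you actually use):

```
# Ask every well-behaved bot to stay away from the bait page.
# The path below is a placeholder, not something Sharkbait defines.
User-agent: *
Disallow: /bait/email.php
```

Keep in mind that robots.txt is itself publicly readable, so an innocuous file name for the bait page is a better choice than anything containing "sharkbait" or "trap".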

After you set up your sharkbait-generating page, you obviously need to link to it on your site. However, you don't want the link visible to most users. I have several ideas for the link, all of which involve the use of CSS. Harvester bots generally ignore CSS, which makes our job a lot easier. Here are some options:

All of these have the same pitfalls. People without CSS will see the links, so you should word the link such that people won't be inclined to click it. I don't know many people who use non-CSS browsers, though; those who do are usually using text browsers, so they are geeks and will know a spam trap when they see one. Another problem is that using the "Select All" feature in most browsers and then copying the text will also grab the link.
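
A minimal sketch of one way to build a CSS-hidden link, offered only as an illustration. The helper name hidden_bait_link(), the URL, and the link text are all hypothetical, and the use of an inline display:none style is an assumption rather than a technique taken from Sharkbait itself:

```php
<?php
// Hypothetical helper, not part of Sharkbait: builds a link to the bait
// page that browsers applying CSS will not display.
function hidden_bait_link($url, $text)
{
    // Escape both values so they are safe to place inside HTML.
    $href  = htmlspecialchars($url, ENT_QUOTES, 'UTF-8');
    $label = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // display:none hides the anchor from CSS-aware browsers; a bot that
    // ignores CSS still finds an ordinary link in the markup.
    return '<a href="' . $href . '" style="display:none">' . $label . '</a>';
}

// Placeholder URL and wording; choose text nobody would want to click.
echo hidden_bait_link('/bait/email.php', 'Do not follow this link'), "\n";
```

The pitfalls above still apply: a browser without CSS shows the link, and "Select All" will copy it.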

Configuration Variables

In sharkbait.inc.php there are several variables that you can configure. Here's a list of them and what they do:

TRAP_UNIQUE_KEY

Specifies the encryption key that will be used to scramble and descramble generated sharkbait. Try to pick something at least 10 characters long, and try not to make it obvious. For example, using your website's name is a very bad idea.

It may not be immediately obvious why you need a key, but remember that Sharkbait is open-source, and therefore the format of sharkbait can be found easily. Spammers could use this to generate sharkbait with someone else's IP address, and implicate them as a spammer. But if generating "working" sharkbait requires a key, it will be very difficult to forge sharkbait with a specific IP address and time.

For this reason, you should never, ever give out your site key.
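
If you would rather not invent a key by hand, a one-off script along these lines can produce one. This is only a sketch: make_site_key() is a hypothetical helper, it needs PHP 7 or later for random_bytes(), and it assumes Sharkbait accepts any string as a key. Run it once and paste the result into sharkbait.inc.php; do not generate the key on every request, or old sharkbait could no longer be decoded.

```php
<?php
// Hypothetical one-off helper, not part of Sharkbait.
// Requires PHP 7 or later because of random_bytes().
function make_site_key($bytes = 16)
{
    // random_bytes() returns cryptographically secure random bytes;
    // bin2hex() turns them into a hex string twice that length.
    return bin2hex(random_bytes($bytes));
}

// Prints a 32-character key when called with the default of 16 bytes.
echo make_site_key(), "\n";
```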

TRAP_ALLOW_ALL_DECODE

This controls whether or not the provided decode.php file will use your site key when someone decodes sharkbait without providing a key. Note that your key will not be disclosed to the person using the decoder; it will only be used to decode the sharkbait they provide.

I would recommend that you leave this on. If you file a spam complaint on some sharkbait that got used, the ISP will likely want to know how the IP address and time of the retrieval are contained in the address. You can provide the URL of your decoder as proof.

However, I would not recommend making your decoder URL public -- it could be used to brute-force your key (although that would take quite a while). This is another good reason to rename the decoder.

TRAP_ALLOW_NONALPHA

This will allow non-alphanumeric characters (specifically, "-" and "_") to appear in sharkbait. While such characters are valid in an email address, harvester bots don't pay attention to RFCs. Occasionally, two non-alphanumeric characters will appear side-by-side, or worse, at the beginning of some sharkbait. Some harvester bots will ignore these addresses, and of course we don't want them to.

TRAP_RECURSE

If this is on, the default email.php script provided with Sharkbait will display a link to itself, with a slightly different URL each time, providing harvester bots with a whole barrel of sharkbait. However, many harvester bots are wise to this trick, and will ignore such pages. At the same time, you'll be burning a lot of bandwidth.

Note that if you write your own version of the front-end script, you can ignore this variable entirely, in which case its setting doesn't matter.

TRAP_PATTERN

Specifies the format that the sharkbait should appear in. The sample front-end script uses this value, and if you write your own front end I would recommend that you use it too instead of hard-coding the pattern that you want. The value should be a string containing exactly one "%s", which will be replaced with the generated sharkbait. So if you use SpamGourmet, you might use "%s.3.herefishyfishy@wronghead.com".
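
To illustrate the "%s" substitution, here is a small sketch using PHP's sprintf(). It assumes TRAP_PATTERN is a constant set in sharkbait.inc.php (it is defined inline here only so the example runs on its own); format_bait() and the bait string are made up for the example and are not Sharkbait's own code or output.

```php
<?php
// Assumption: TRAP_PATTERN is a constant normally set in sharkbait.inc.php.
// It is defined here only so that this sketch is self-contained.
if (!defined('TRAP_PATTERN')) {
    define('TRAP_PATTERN', '%s.3.herefishyfishy@wronghead.com');
}

// Hypothetical helper: wraps already-generated sharkbait in the pattern.
function format_bait($bait)
{
    // sprintf() replaces the single %s in the pattern with the bait.
    return sprintf(TRAP_PATTERN, $bait);
}

// 'a1b2c3d4' is a placeholder, not real Sharkbait output.
echo format_bait('a1b2c3d4'), "\n";
```

Reading the pattern from the configuration this way means a custom front end keeps working if you later switch email services and change only TRAP_PATTERN.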

Summary


Sharkbait is not produced by Pixar Animation Studios. But they make good movies.