A failed attempt at stopping spam bots

I am spam man - I am here to annoy your users and hopefully sell some sweet herbal concoctions!

This post is just a quick overview of an experiment I ran recently with the Mindscape website. It’s not about our products, but I thought the web devs out there might enjoy hearing this story. Late last year we upgraded our website and moved the core site over to MVC, which made it nice and fun to develop with. However, we inadvertently opened the door to spambots.

We noticed within a day or two that our forums were filling up with posts advertising cheap watches, Viagra and adult links. Not something we really want, and especially annoying since a lot of folks subscribe to forum update posts and would get this rubbish in their inbox. The obvious thing to do was to add either an email validation process (register -> account not active until you click a link emailed to you) or a captcha. Now, I hate captchas and I hate email validation even more, so I decided to try to think of another way to solve the problem.

1. Most spam bots will try and fill out every form field
I started by adding a hidden (as in display:none;) input box to the registration screen. If the controller action received a value for this field then the submission was probably from a bot, and the site said so in the validation message.

2. Most spam bots won’t execute JavaScript
I added a small script that dynamically added a new input box – also hidden – with a specific key value. If the controller action didn’t get this specific value back it could safely assume you were either a bot or you were running your browser with JavaScript disabled.
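
For the web devs following along, here is a rough sketch of the two server-side checks described above. It is written in Python rather than being our actual MVC controller code, and the field names and the key value are made up for illustration, so treat it as one way the idea could look rather than what we shipped.

```python
from typing import Mapping

# Hypothetical names: the post does not give the real field names or key value.
HONEYPOT_FIELD = "website"       # input hidden with display:none; humans should leave it empty
JS_TOKEN_FIELD = "js_token"      # hidden input that a small script adds to the form
JS_TOKEN_VALUE = "expected-key"  # the specific key value that script writes into it


def looks_like_bot(form: Mapping[str, str]) -> bool:
    """Return True if the submitted form fails either of the two checks."""
    # Check 1: a bot that fills out every field will put something in the honeypot.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True
    # Check 2: a client that never ran the script cannot send the key value back.
    return form.get(JS_TOKEN_FIELD, "") != JS_TOKEN_VALUE
```

A submission passes only if the hidden box comes back empty and the script-added box comes back with the key, for example `looks_like_bot({"js_token": "expected-key"})` is `False`. Note that the check only sees the posted values, so it cannot tell who or what put a value in the hidden box.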

So how did this go? Generally pretty good – bot activity dropped to almost zero as we banned the existing accounts that posted dodgy links in our forum. Some of them were fairly cunning in that they lay dormant for months before finally posting, but we just deleted those accounts as they posted.

Then we ran a huge promotion and had thousands of folks registering on our website trying to claim a prize. Next thing you know we’re getting a bunch of unhappy people complaining that our website is calling them bots! Not an ideal situation at all. We finally tracked it down to the fact that most of those affected were running form-filling software (like 1Password) or had a browser that tried to complete the form (e.g. Google Chrome). Those tools would dutifully assign values to the hidden input box from point 1, so real people tripped the bot check. Not so good.

So, not wanting to keep alienating people but still wanting to reduce spam in our forums, I’ve relented: Jeremy recently switched registration and newsletter subscriptions over to reCAPTCHA. Speaking of which, why not test it out by subscribing to our newsletter? ;-)

I hope sharing our experiment is useful for others thinking about ways to defeat spam bots.

Tagged as General

4 Responses to “A failed attempt at stopping spam bots”

  • Spam man is awesome. I think we just found your costume to the next office Halloween party.

  • No Captcha here,

    Now let me tell you about a really great deal on viagra….

  • Hey, the method you have described is actually called a Honeypot Captcha. Honeypot being

    ‘In computer terminology, a honeypot is a trap set to detect, deflect, or in some manner counteract attempts at unauthorized use of information systems.’

    We used the technique on a couple of projects, it reduced spam for a period of time, but spammers would circumvent the honeypot and eventually submit spam anyway.

    Personally I like this method:

    http://www.webappers.com/2009/02/20/drag-and-drop-ajax-fancy-captcha-jquery-plugin/

  • I’ve taken a different approach… I define each letter/number as being a series of coordinates — think dot matrix printer. These coordinates are transformed into a series of divs with CSS classes dictating background color and position. The divs are then randomized so patterns are not likely to be predicted. These divs, and corresponding styles, render as easy to read letters/numbers. For a bot to enter the correct value, the page would have to be rendered, captured as an image, then OCRed. To reduce the likelihood of replay submissions, the values are tied to a HMAC hashed time code that causes a value to expire. So far it’s been working well.

    Markup example: ……

    Render example: http://www.daphault.com/Share/captcha.jpg
