Ounce of prevention

Freshness Warning
This article is over 8 years old. It's possible that the information you read below isn't current.

At the risk of this starting to look like a blog about comment spam, I have some additional thoughts on the matter.

I’ve made some changes to my comment forms here. The first is that the CGI script that comments get posted to is no longer the default mt-comments.cgi. I’ve created a clone of the comments script and renamed it fbda07e9fd3bb656bbf62c5b0ed6480e.cgi. That should stop bots that search for copies of mt-comments.cgi.

The next thing I’ve done is included a hidden field in each comment form that contains a MD5 hash of the entry ID and a secret word. Then I modified MT to check for that field. The comments script now creates a hash of the entry id and secret word and compares it to the one submitted with the comment. If that field isn’t submitted or it doesn’t match, the comment is rejected and the user is shown an error message.

But I wonder if these steps are useful at all. What I question is how spam bots are finding entries on which to comment. The entries that get the most spam comments here are those that have a large number of incoming links. The SimpleComments page is one of the hardest hit. That seems to suggest that bots are crawling from blog to blog, following links and posting comments.

This means that in order to post a comment, the bots must be parsing the HTML in order to find out if there’s a comment form on it. They aren’t apparently searching Google for common comment scripts otherwise the top search results would have the most spam comments.

Since the bots are parsing the HTML adding hidden form fields probably won’t deter them. If the authors of the bots have any brains whatsoever, they’re submitting all the hidden fields along with the forms. My hidden hash will be submitted by a bot just like it would by a person. What will probably be the biggest help is the thing that was easiest to do: changing the comment script name.

What else would be effective is changing the names of all the form fields. Making them short random strings would make it impossible for a bot to recognize the comment form using only the field names. People would be able to understand the form because of the labels, but bots would have to implement a large amount of fuzzy logic in order to recognize that “Name,” “Your Name:,” and other forms are really the same thing.

JK
October 26, 2003 4:27 AM

Maybe it would be enough if the intermediate page had a ‘robots.txt’ tag. Google wouldn’t index the link.

Trackback from Spam-Block Specialists
November 10, 2003 10:24 AM

SPEWS works for --YOU-- to eradicate SPAM

Excerpt: SPEWS-- the spam reduction specialists!

Paul Makepeace
September 30, 2004 4:56 PM

I fully agree, and really despise this solution. Especially with MT Blacklist it is essentially redundant anyway. Are you aware of any patches or ways of turning it off?

David
October 29, 2004 7:35 PM

Ok, so I have a question: Did this end up working sufficiently for you?

Wil
February 19, 2006 8:41 AM

I’ve been purging our forum membership page of spurious spambot placed addies, but many of them have some sort of cloaking device that prevents me from identifying, and hence deleting them. Short of turning our forum into a closed enter by invitation only site, is their a simple way to attack these listings? I am a simple poet and not very conversant with techno skills.

These are the last 15 comments. Read all 24 comments here.

This discussion has been closed.

Follow me on Twitter

Lijit Search

Best Of

  • Comment Spam Manifesto Spammers are hereby put on notice. Your comments are not welcome. If the purpose behind your comment is to advertise yourself, your Web site, or a product that you are affiliated with, that comment is spam and will not be tolerated. We will hit you where it hurts by attacking your source of income.
  • Best of Newly Digital There have been dozens of Newly Digital entries from all over the world. Here are some of the best.
  • Let it go Netscape 4 is six years old.
  • The importance of being good Starbucks is pulling CD burning stations from their stores. That says something interesting about their brand.
  • Google on the desktop Google picks up Picasa, giving them an important foothold on people's PCs.
  • More of the best »

Recently Read

Get More

Subscribe | Archives

8

Recently

invisible Fence (Mar 22)
The New York Times has a paywall now. Sorta. If you don't choose to ignore it.
Black status icon for Chrometa (Mar 17)
Replacing the status icon of Chrometa
Using Google Voice as your voicemail on AT&T (Oct 26)
How I set up my iPhone to use Google Voice as it's voicemail system.
Don Mattingly forced to make coaching change (Sep 17)
New LA Dodgers coach starts to wonder if he knows the rules of baseball at all.
In which Vonage pretends their prices haven't changed (Apr 12)
Translating what Vonage marketing says about their price increase into plain English.
Twitter app competition (Apr 12)
Life as a Twitter app developer is far from over.
Twitter app competition (Apr 12)
Life as a Twitter app developer is far from over.
The rest of the world is not like you (Apr 5)
Normal people are different. Keep that in mind when creating or marketing a product.

Subscribe to this site's feed.

Elsewhere

IMified
Build instant messaging applications. (My company)
SacStarts
The Sacramento technology startup community.
Pinewood Freak
Pinewood Derby tips and tricks

Contact

Adam Kalsey

Mobile: 916.600.2497

Email: adam AT kalsey.com

AIM or Skype: akalsey

Resume

PGP Key

©1999-2012 Adam Kalsey.
Content management by Movable Type.