Ounce of prevention

Freshness Warning
This article is over 15 years old. It's possible that the information you read below isn't current.

At the risk of this starting to look like a blog about comment spam, I have some additional thoughts on the matter.

I’ve made some changes to my comment forms here. The first is that the CGI script that comments get posted to is no longer the default mt-comments.cgi. I’ve created a clone of the comments script and renamed it fbda07e9fd3bb656bbf62c5b0ed6480e.cgi. That should stop bots that search for copies of mt-comments.cgi.

The next thing I’ve done is included a hidden field in each comment form that contains a MD5 hash of the entry ID and a secret word. Then I modified MT to check for that field. The comments script now creates a hash of the entry id and secret word and compares it to the one submitted with the comment. If that field isn’t submitted or it doesn’t match, the comment is rejected and the user is shown an error message.

But I wonder if these steps are useful at all. What I question is how spam bots are finding entries on which to comment. The entries that get the most spam comments here are those that have a large number of incoming links. The SimpleComments page is one of the hardest hit. That seems to suggest that bots are crawling from blog to blog, following links and posting comments.

This means that in order to post a comment, the bots must be parsing the HTML in order to find out if there’s a comment form on it. They aren’t apparently searching Google for common comment scripts otherwise the top search results would have the most spam comments.

Since the bots are parsing the HTML adding hidden form fields probably won’t deter them. If the authors of the bots have any brains whatsoever, they’re submitting all the hidden fields along with the forms. My hidden hash will be submitted by a bot just like it would by a person. What will probably be the biggest help is the thing that was easiest to do: changing the comment script name.

What else would be effective is changing the names of all the form fields. Making them short random strings would make it impossible for a bot to recognize the comment form using only the field names. People would be able to understand the form because of the labels, but bots would have to implement a large amount of fuzzy logic in order to recognize that “Name,” “Your Name:,” and other forms are really the same thing.

MojoMark
September 18, 2003 2:18 PM

I wonder if you could place a random number in a cookie that is placed by the comment entry form. When the comment is submitted, it would only be accepted if the browser provides a valid number back. If the spammers have the smarts to capture and provide the cookie during comment submission, you could then measure the time between generating the random number, and when it comes back in a comment. Conceivably, the time delta would unusually small (I know I can't create a comment in under 30 seconds) if done by a spammer, and could be filtered. Defeatable sure, but will the spammers take the time to find out where its failing, decode why its failing, and put a stall in to handle it? Seems kinda unlikely to me if they lust for speed and coverage.

Frederic
September 18, 2003 2:33 PM

"Back to the drawing board." Unfortunately you are right. Yes, brute force would be a way around what I suggested and random field name would kill autofill. Adding steps that can be automated to the comment posting procedure will not stop spammer. I think that because you put a system that is open to interaction such as a comment system, an email or something else, you can only loose the game. Your line of defense is broken because in order to work, the comment system should accept comment and if regular user can use it, so can spammers. If you want it to make 1 time harder for spammers, you will make it 2 times harder for regular users. I think that the only thing we can do is either only allow comments from a closed group of people that we trust or let the system be open and clean the spam after they appear. Now if we can't avoid spam, we can make it easier to clean it. What about a link such as "Report comment spam" that would send you an email with the comment and a link to delete it ? Remember ... we used to put an email address on our webpages and when we received spam we used web form 2 email systems and now these systems are broken by spammer as well as comment forms. I think that all these have the same weakness. Find a way to stop spam for email and you will find a solution that can be applied to other problems.

Frederic
September 18, 2003 2:44 PM

MojoMark, timing issue are easy to break. Forum scripts usually forbid users to post more than X post in X minutes. I guess that if I'm a comment script spammer it's something that I would try. "This comment script is made for real user that take time to type a comment, emulate user interaction and put a delay where needed" You know, with multi-threading, while I spend 30 seconds on a website before submitting my comment spam, I can move to the next blog and go on ... You know, it's easy to reverse engineer system that are open. Look at the Google toolbar, it use a checksum algorithm that run against the url you are watching so Google backend know that the request come from their toolbar or from a software built to run hundred of queries to get the pagerank of your competitors. Even this can of thing is easy so beleive me, professional spammers will think about that delay thing and they will take the time to find a work around. When you loose time to keep your comment system free of time you loose money, when spammers take the time to analyse your line of defense this is an investment. They will make more money later.

Frederic
September 18, 2003 2:55 PM

Here is a thread on this subject in Movable Type support forum: http://www.movabletype.org/support/index.php?act=ST&f=10&t=26946&hl=comment+spam

Trackback from Noch'n Blogg.
September 19, 2003 1:41 AM

Effektive Massnahmen gegen Comment-Spam

Excerpt: Immer mehr Leute beschweren sich über Comment-Spam und einige Maßnahmen wurden ergriffen, um dieser Methode entgegenzuwirken. Ich hatte bislang noch...

Saintjude
September 29, 2003 6:42 PM

Not that I know anything of course... Perhaps you could try the system that Yahoo uses to prevent automated registrations. Namely that the poster has to enter a codeword that is presented as a distressed image on the page. I'm sure that anyone who wants to post won't mind a few extra characters - and it can be fun. My most recent word was "death".

Adam Kalsey
September 29, 2003 7:07 PM

The problem with those is that blind users won't be able to comment.

anthony
September 30, 2003 11:22 AM

For your "honeypot" form idea: Use standard HTML comments to block it off.

Trackback from cce blog
October 6, 2003 2:54 PM

quick-n-dirty comment spam fix

Excerpt: i started getting a LOT of comment spam ... so i just renamed mt-comments.cgi to mt-c0mments.cgi to keep the robots away. i haven't received any comment spam since then, and i used to get several every day, so i suppose it must be working. publicizin...

JK
October 26, 2003 4:25 AM

The best response I have seen addresses not how the spammers do this, but rather denying them the payoff they seek. If you make all of your comments' URL links go to an intermediate page which has no inbound links (and hence no pagerank value) then that page can give the user's own URL, which is clickable, but the spammer's purpose will have been defeated. Some sort of blog software upgrade broadly implementing this type of fix appears to be the best medium term way out of this mess.

JK
October 26, 2003 4:27 AM

Maybe it would be enough if the intermediate page had a 'robots.txt' tag. Google wouldn't index the link.

Trackback from Spam-Block Specialists
November 10, 2003 10:24 AM

SPEWS works for --YOU-- to eradicate SPAM

Excerpt: SPEWS-- the spam reduction specialists!

Paul Makepeace
September 30, 2004 4:56 PM

I fully agree, and really despise this solution. Especially with MT Blacklist it is essentially redundant anyway. Are you aware of any patches or ways of turning it off?

David
October 29, 2004 7:35 PM

Ok, so I have a question: Did this end up working sufficiently for you?

Wil
February 19, 2006 8:41 AM

I've been purging our forum membership page of spurious spambot placed addies, but many of them have some sort of cloaking device that prevents me from identifying, and hence deleting them. Short of turning our forum into a closed enter by invitation only site, is their a simple way to attack these listings? I am a simple poet and not very conversant with techno skills.

These are the last 15 comments. Read all 24 comments here.

This discussion has been closed.

Follow me on Twitter

Best Of

  • How not to apply for a job Applying for a job isn't that hard, but it does take some minimal effort and common sense.
  • Movie marketing on a budget Mark Cuban's looking for more cost effective ways to market movies.
  • California State Fair The California State Fair lets you buy tickets in advance from their Web site. That's good. But the site is a horror house of usability problems.
  • Customer reference questions. Sample questions to ask customer references when choosing a software vendor.
  • Comment Spam Manifesto Spammers are hereby put on notice. Your comments are not welcome. If the purpose behind your comment is to advertise yourself, your Web site, or a product that you are affiliated with, that comment is spam and will not be tolerated. We will hit you where it hurts by attacking your source of income.
  • More of the best »

Recently Read

Get More

Subscribe | Archives

Recently

Assumptions and project planning (Feb 18)
When your assumptions change, it's reasonable that your project plans and needs change as well. But too many managers are afraid to go back and re-work a plan that they've already agreed to.
Feature voting is harmful to your product (Feb 7)
There's a lot of problems with using feature voting to drive your product.
Encouraging 1:1s from other managers in your organization (Jan 4)
If you’re managing other managers, encourage them to hold their own 1:1s. It’s such an important tool for managing and leading that everyone needs to be holding them.
One on One Meetings - a collection of posts about 1:1s (Jan 2)
A collection of all my writing on 1:1s
Are 1:1s confidential? (Jan 2)
Is the discussion that occurs in a 1:1 confidential, even if no agreed in the meeting to keep it so?
Skip-level 1:1s are your hidden superpower (Jan 1)
Holding 1:1s with peers and with people far below you on the reporting chain will open your eyes up to what’s really going on in your business.
Do you need a 1:1 if you’re regularly communicating with your team? (Dec 28)
You’re simply not having deep meaningful conversation about the process of work in hallway conversations or in your chat apps.
What agenda items should a manager bring to a 1:1? (Dec 23)
At least 80% of a 1:1 agenda should be driven by your report, but if you also to use this time to work on things with them, then you’ll have better meetings.

Subscribe to this site's feed.

Contact

Adam Kalsey

Mobile: 916.600.2497

Email: adam AT kalsey.com

Twitter, etc: akalsey

Resume

PGP Key

©1999-2019 Adam Kalsey.