Ounce of prevention

Freshness Warning
This blog post is over 18 years old. It's possible that the information you read below isn't current and the links no longer work.

At the risk of this starting to look like a blog about comment spam, I have some additional thoughts on the matter.

I’ve made some changes to my comment forms here. The first is that the CGI script that comments get posted to is no longer the default mt-comments.cgi. I’ve created a clone of the comments script and renamed it fbda07e9fd3bb656bbf62c5b0ed6480e.cgi. That should stop bots that search for copies of mt-comments.cgi.

The next thing I’ve done is include a hidden field in each comment form that contains an MD5 hash of the entry ID and a secret word. Then I modified MT to check for that field. The comments script now creates a hash of the entry ID and secret word and compares it to the one submitted with the comment. If that field isn’t submitted or it doesn’t match, the comment is rejected and the user is shown an error message.
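
The hidden-field check can be sketched roughly as follows. This is an illustration in Python, not the actual Movable Type change (which lives in MT's Perl code). The names `SECRET_WORD`, `form_token`, and `is_valid_submission` are made up for the example, and joining the entry ID and secret with a colon is an assumption.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical secret word; choose your own and keep it out of the templates.
SECRET_WORD = "example-secret"


def form_token(entry_id: int) -> str:
    """Return the MD5 hex digest of the entry ID combined with the secret word."""
    data = f"{entry_id}:{SECRET_WORD}".encode("utf-8")
    return hashlib.md5(data).hexdigest()


def is_valid_submission(entry_id: int, submitted_token: Optional[str]) -> bool:
    """Reject the comment when the hidden field is missing or does not match."""
    if not submitted_token:
        return False
    expected = form_token(entry_id)
    # Compare as bytes in constant time so odd input cannot raise or leak timing.
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))
```

The template would write `form_token(entry_id)` into the hidden field, and the comment script would call `is_valid_submission` before accepting the post.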

But I wonder if these steps are useful at all. What I question is how spam bots are finding entries on which to comment. The entries that get the most spam comments here are those that have a large number of incoming links. The SimpleComments page is one of the hardest hit. That seems to suggest that bots are crawling from blog to blog, following links and posting comments.

This means that in order to post a comment, the bots must be parsing the HTML to find out whether there’s a comment form on the page. They apparently aren’t searching Google for common comment scripts; otherwise the top search results would have the most spam comments.

Since the bots are parsing the HTML, adding hidden form fields probably won’t deter them. If the authors of the bots have any brains whatsoever, they’re submitting all the hidden fields along with the forms. My hidden hash will be submitted by a bot just as it would be by a person. What will probably be the biggest help is the thing that was easiest to do: changing the comment script name.

Something else that would be effective is changing the names of all the form fields. Making them short random strings would make it impossible for a bot to recognize the comment form using only the field names. People would still understand the form because of the labels, but bots would have to implement a large amount of fuzzy logic to recognize that “Name,” “Your Name:,” and other variations are really the same thing.
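
A minimal sketch of the random field name idea, in Python for illustration only. The canonical field names and both helper functions are hypothetical, not Movable Type's own. The form template and the comment script would need to share the same stored mapping.

```python
import secrets

# Hypothetical canonical field names for a comment form.
CANONICAL_FIELDS = ("author", "email", "url", "text")


def make_field_map(fields=CANONICAL_FIELDS):
    """Map each real field name to a short random alias that reveals nothing."""
    mapping = {}
    for name in fields:
        alias = "f" + secrets.token_hex(3)
        while alias in mapping.values():  # regenerate on the rare collision
            alias = "f" + secrets.token_hex(3)
        mapping[name] = alias
    return mapping


def decode_submission(posted, field_map):
    """Translate a posted form keyed by alias back to the real field names.

    Keys that are not known aliases (including the real names) are dropped.
    """
    reverse = {alias: name for name, alias in field_map.items()}
    return {reverse[key]: value for key, value in posted.items() if key in reverse}
```

Generating the map once and saving it alongside the templates keeps the published form and the script in agreement. Regenerating it from time to time would invalidate any names a bot had memorized.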

MojoMark
September 18, 2003 2:18 PM

I wonder if you could place a random number in a cookie that is set by the comment entry form. When the comment is submitted, it would only be accepted if the browser provides a valid number back. If the spammers have the smarts to capture and provide the cookie during comment submission, you could then measure the time between generating the random number and when it comes back in a comment. Conceivably, the time delta would be unusually small (I know I can't create a comment in under 30 seconds) if done by a spammer, and could be filtered. Defeatable, sure, but will the spammers take the time to find out where it's failing, decode why it's failing, and put a stall in to handle it? Seems kinda unlikely to me if they lust for speed and coverage.
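
A rough Python sketch of this timing check. It swaps the stored random number for a signed timestamp so that the server does not need to remember anything per visitor; that substitution is an assumption, not something proposed in the comment. `SECRET_KEY`, `issue_token`, and `accept_comment` are hypothetical names.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"example-secret"  # hypothetical; keep the real key private
MIN_SECONDS = 30  # threshold taken from the comment above; tune as needed


def _sign(issued_str):
    """Return an HMAC-SHA256 signature over the issue time."""
    return hmac.new(SECRET_KEY, issued_str.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(now=None):
    """Build the cookie value set when the comment form is served."""
    issued = int(time.time() if now is None else now)
    return f"{issued}.{_sign(str(issued))}"


def accept_comment(token, now=None):
    """Accept only a validly signed token that is at least MIN_SECONDS old."""
    now = time.time() if now is None else now
    try:
        issued_str, sig = token.split(".", 1)
        issued = int(issued_str)
    except (AttributeError, ValueError):
        return False  # missing or malformed cookie
    if not hmac.compare_digest(_sign(issued_str).encode("utf-8"), sig.encode("utf-8")):
        return False  # forged or altered timestamp
    return now - issued >= MIN_SECONDS
```

A real deployment would probably also want an upper age limit so that one captured token cannot be replayed indefinitely.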

Frederic
September 18, 2003 2:33 PM

"Back to the drawing board." Unfortunately you are right. Yes, brute force would be a way around what I suggested and random field name would kill autofill. Adding steps that can be automated to the comment posting procedure will not stop spammer. I think that because you put a system that is open to interaction such as a comment system, an email or something else, you can only loose the game. Your line of defense is broken because in order to work, the comment system should accept comment and if regular user can use it, so can spammers. If you want it to make 1 time harder for spammers, you will make it 2 times harder for regular users. I think that the only thing we can do is either only allow comments from a closed group of people that we trust or let the system be open and clean the spam after they appear. Now if we can't avoid spam, we can make it easier to clean it. What about a link such as "Report comment spam" that would send you an email with the comment and a link to delete it ? Remember ... we used to put an email address on our webpages and when we received spam we used web form 2 email systems and now these systems are broken by spammer as well as comment forms. I think that all these have the same weakness. Find a way to stop spam for email and you will find a solution that can be applied to other problems.

Frederic
September 18, 2003 2:44 PM

MojoMark, timing checks are easy to break. Forum scripts usually forbid users from posting more than X posts in X minutes. I guess that if I were a comment spammer it's something that I would try: "This comment script is made for real users who take time to type a comment, so emulate user interaction and put a delay where needed." You know, with multi-threading, while I wait 30 seconds on one website before submitting my comment spam, I can move on to the next blog and keep going... You know, it's easy to reverse engineer systems that are open. Look at the Google toolbar: it uses a checksum algorithm that runs against the URL you are viewing, so the Google backend knows whether the request comes from their toolbar or from software built to run hundreds of queries to get the PageRank of your competitors. Even this kind of thing is easy, so believe me, professional spammers will think about that delay thing and they will take the time to find a workaround. When you lose time keeping your comment system free of spam, you lose money; when spammers take the time to analyse your line of defense, it is an investment. They will make more money later.

Frederic
September 18, 2003 2:55 PM

Here is a thread on this subject in the Movable Type support forum: http://www.movabletype.org/support/index.php?act=ST&f=10&t=26946&hl=comment+spam

Trackback from Noch'n Blogg.
September 19, 2003 1:41 AM

Effective measures against comment spam

Excerpt: More and more people are complaining about comment spam, and some measures have been taken to counteract this method. So far I had not yet...

Saintjude
September 29, 2003 6:42 PM

Not that I know anything of course... Perhaps you could try the system that Yahoo uses to prevent automated registrations. Namely that the poster has to enter a codeword that is presented as a distressed image on the page. I'm sure that anyone who wants to post won't mind a few extra characters - and it can be fun. My most recent word was "death".

Adam Kalsey
September 29, 2003 7:07 PM

The problem with those is that blind users won't be able to comment.

anthony
September 30, 2003 11:22 AM

For your "honeypot" form idea: Use standard HTML comments to block it off.

Trackback from cce blog
October 6, 2003 2:54 PM

quick-n-dirty comment spam fix

Excerpt: i started getting a LOT of comment spam ... so i just renamed mt-comments.cgi to mt-c0mments.cgi to keep the robots away. i haven't received any comment spam since then, and i used to get several every day, so i suppose it must be working. publicizin...

JK
October 26, 2003 4:25 AM

The best response I have seen addresses not how the spammers do this, but rather denying them the payoff they seek. If you make all of your comments' URL links go to an intermediate page which has no inbound links (and hence no pagerank value) then that page can give the user's own URL, which is clickable, but the spammer's purpose will have been defeated. Some sort of blog software upgrade broadly implementing this type of fix appears to be the best medium term way out of this mess.

JK
October 26, 2003 4:27 AM

Maybe it would be enough if the intermediate page had a robots meta tag. Google wouldn't index the link.
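
One reading of the intermediate-page idea, sketched in Python. It assumes the tag meant is the robots meta element (robots.txt itself is a separate site-level file, not a tag). The `/go` path and all function names are hypothetical.

```python
from html import escape
from urllib.parse import quote, urlsplit


def is_web_url(url):
    """Allow only http and https links so script URLs are never rendered."""
    return urlsplit(url).scheme in ("http", "https")


def intermediate_link(commenter_url):
    """Link used on the blog page: it points at the local redirect page."""
    return "/go?url=" + quote(commenter_url, safe="")


def intermediate_page(commenter_url):
    """Render the in-between page that asks crawlers not to index or follow."""
    if not is_web_url(commenter_url):
        raise ValueError("unsupported URL scheme")
    safe_url = escape(commenter_url, quote=True)
    return (
        "<html><head>"
        '<meta name="robots" content="noindex, nofollow">'
        "</head><body>"
        f'<a href="{safe_url}">{safe_url}</a>'
        "</body></html>"
    )
```

Readers can still click through to the commenter's site, while the comment page itself no longer links directly to it.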

Trackback from Spam-Block Specialists
November 10, 2003 10:24 AM

SPEWS works for --YOU-- to eradicate SPAM

Excerpt: SPEWS-- the spam reduction specialists!

Paul Makepeace
September 30, 2004 4:56 PM

I fully agree, and really despise this solution. Especially with MT Blacklist it is essentially redundant anyway. Are you aware of any patches or ways of turning it off?

David
October 29, 2004 7:35 PM

Ok, so I have a question: Did this end up working sufficiently for you?

Wil
February 19, 2006 8:41 AM

I've been purging our forum membership page of spurious addresses placed by spambots, but many of them have some sort of cloaking device that prevents me from identifying, and hence deleting, them. Short of turning our forum into a closed, invitation-only site, is there a simple way to attack these listings? I am a simple poet and not very conversant with techno skills.

These are the last 15 comments. Read all 24 comments here.

This discussion has been closed.


© 1999-2021 Adam Kalsey.