Comments

Comments for Comment spam

Excerpt: I’m fed up and want your help in devising a solution that will curtail comment and TrackBack spam. Read the whole article…

Wilson
September 3, 2003 10:35 AM

With the number of people talking about RSS replacing e-mail (because of the recent incidents that caused tons of problems with virus payloads all over the place), I'm not surprised that spammers found this new "niche". In fact, I'm somewhat surprised we don't see more of this. One other thing that might help is the "Turing test" that many sites use in registration pages, where the user has to read a word (or a series of digits) from a distorted image and type it in a form field, to confirm that he/she is not a bot. The problem this brings, of course, is accessibility. Even if you allow the option of listening to the word/digits (which I've seen some sites doing), many people would still be locked out for no reason. Not an easy problem.

Jim Ray
September 3, 2003 11:16 AM

I don't know, but ever since I've been taking this great ERBAL VIGRA, my love life has been sensational! (couldn't resist...) It seems like something could tie in with the "remember me" function already implemented on many Movable Type weblogs -- basically, whitelisting for comments. If a user has never posted a comment before, require some kind of validation (email challenge-response, an image of a word on a grid Turing verifier, etc.) and then set a cookie, store their IP, remember their personal info for next time. In general, I'm not a huge fan of challenge-response systems. I tried one myself for the better part of a year on my email and it worked OK, but I found I was spending more time sifting through my "possible spam" directory, so I switched to a heuristic/Bayesian filter (SpamAssassin). I wonder if there's room for content analysis, much like how SpamAssassin works. Before a comment gets posted, the contents get parsed and ranked for their "spamminess". Comments scoring above a threshold would then be directed to a challenge-response that bots wouldn't be able to deal with. On a more global level, some kind of karmic system amongst bloggers and internet citizens may be interesting - think Slashdot's comment ranking for weblogs. A way to digitally identify one's self beyond "name, email, url" is long overdue; fighting comment spam might be what pushes internet identities over the edge. I would imagine a system that uses a unique ID of some sort - PGP signature, signed FOAF - would allow for comments to be ranked on weblogs. So, anonymity would still be allowed and even encouraged, but those that identify themselves would rise to the top and discourage spammers with low karma.
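
A minimal Python sketch of the "rank for spamminess, then challenge" idea in the comment above. It is not how SpamAssassin works internally; the phrase weights, the per-link penalty, and the threshold are made-up values for illustration, and the names `spam_score` and `route_comment` are hypothetical.

```python
import re

# Hypothetical phrase weights; a real filter would learn these from labelled comments.
SPAM_PHRASES = {"viagra": 3.0, "casino": 2.0, "cheap": 1.0, "free": 0.5}
LINK_WEIGHT = 1.5          # assumed penalty for each URL in the comment
CHALLENGE_THRESHOLD = 4.0  # assumed cut-off; would need tuning on real traffic

URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
WORD_RE = re.compile(r"[a-z']+")


def spam_score(text: str) -> float:
    """Crude spamminess score: weighted phrase hits plus a per-link penalty."""
    lowered = text.lower()
    score = sum(SPAM_PHRASES.get(word, 0.0) for word in WORD_RE.findall(lowered))
    score += LINK_WEIGHT * len(URL_RE.findall(lowered))
    return score


def route_comment(text: str) -> str:
    """Publish low-scoring comments; send high-scoring ones to a challenge step."""
    return "challenge" if spam_score(text) >= CHALLENGE_THRESHOLD else "publish"
```

With these example weights, an ordinary remark scores zero and is published, while a comment combining listed phrases with a link is routed to the challenge rather than rejected outright.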

Adam Kalsey
September 3, 2003 11:57 AM

One thing I was discussing with Brad was a way for multiple sites to collaborate and reduce spam. When someone gets a comment or TrackBack that they feel is spammy, they could ping a central clearinghouse with the IP address and URL of the spammer. When the clearinghouse receives pings from multiple site owners containing the same URL or IP address within a short period of time, it adds that IP address and URL to its spam database. Sites could subscribe to the spam database and get a ping every time a new IP and URL gets added. When a site gets a ping, it would temporarily ban comments from that IP address or containing that URL. This would make it more difficult for bots to crawl sites and send comment or TrackBack spam. As soon as they start spamming some sites, all the other sites that are subscribed would ignore their spam.
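
The clearinghouse described above could be sketched roughly as follows in Python. This is an in-memory illustration only: the class name, the window, the reporter count, and the ban length are all assumptions, and the step of pinging subscribed sites is left out.

```python
import time
from collections import defaultdict

REPORT_WINDOW = 600  # seconds within which reports must cluster (assumed)
MIN_REPORTERS = 3    # distinct sites required before listing (assumed)
BAN_SECONDS = 3600   # bans are temporary; the length is assumed


class Clearinghouse:
    """Lists an IP or URL once enough different sites report it in a short time."""

    def __init__(self):
        self._reports = defaultdict(dict)  # key -> {reporting site: timestamp}
        self._banned_until = {}            # key -> time the ban expires

    def report(self, site, key, now=None):
        """Record that `site` flagged `key`; return True if `key` is now banned."""
        now = time.time() if now is None else now
        reports = self._reports[key]
        reports[site] = now  # a site reporting twice still counts once
        # Forget reports that have fallen outside the window.
        for stale in [s for s, t in reports.items() if now - t > REPORT_WINDOW]:
            del reports[stale]
        if len(reports) >= MIN_REPORTERS:
            self._banned_until[key] = now + BAN_SECONDS
        return self.is_banned(key, now)

    def is_banned(self, key, now=None):
        """True while a temporary ban on `key` is still in force."""
        now = time.time() if now is None else now
        return self._banned_until.get(key, 0) > now
```

Requiring several distinct reporters guards against one site owner listing an address alone, and the expiry means a hijacked machine or recycled dynamic IP is not blocked forever.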

Cal
September 3, 2003 1:20 PM

What happens if the spammers turn to tactics like simply hijacking other people's machines for quick, one-time attacks? They may get blacklisted quickly, but it's only a temporary solution if they switch to another machine for another attack. Given other theories about the possibilities of combining viruses with spam, comment spam would be hard to fight against coming from thousands of directions. I like the idea of a central blacklist, but the idea has flaws and I think the other validation suggestions (although still not perfect) sound like better solutions.

Jim Ray
September 3, 2003 1:21 PM

I think that some kind of distributed database would be effective as a front line defense, much like how SpamAssassin uses the ORB and Spamhaus databases. However, much like SpamAssassin, it's a stopgap solution that treats the symptom, not the cause. While such solutions may be successful most of the time, they are not bulletproof and end up perpetuating a cold-war-style arms race between the spammers and the anti-spammers. Of course, I use SpamAssassin, so there's something to be said for efficacy over idealism. If everyone digitally signed their emails, there'd be no such thing as spam. If everyone could digitally sign their comments, we could eliminate comment spam, too. Like most things, there will be a balance between ease of use to your users and usefulness to you.

Adam Kalsey
September 3, 2003 1:57 PM

That's not entirely true. I received a signed spam email a week or two ago. The signature was valid, but I assume that the identity was fake. That's the problem with signatures (ink or digital). Unless you have a trusted third party that verifies identities, signatures are only as trustworthy as the person using them. That's why we use notaries for important transactions and why companies like Verisign make so much money.

Abdelmalek
September 3, 2003 2:59 PM

The simple way to do it is to remove all URLs in comments. No way to steal visitors = no reason to put comment spam on a page... Another way to fight back: build a link farm where you put a link to all the comment spammers' websites. They will soon be penalized by Google and nobody will find them ;). I like distributed/collaborative approaches to fight spam. For weblogs with a low volume of comments, pre-approval of comments may be the answer. If you know that your comment will first be read by a moderator/blog owner, and you know that it will never be approved, why would you want to post comment spam? Pre-approval via email turns comment spam into regular spam with a smaller audience, and the regular email spam tools already available could be used...

Jim Ray
September 3, 2003 4:12 PM

Wow - that's a first for me. I've never heard of signed spam. Interesting indeed. I knew saying something as final as "If everyone digitally signed their emails, there'd be no such thing as spam" would come back to haunt me... Of course, the need for some kind of mechanism for identifying signatures is pretty obvious. Which takes us back to the distributed database of known spammers in a way.

Adam Kalsey
September 3, 2003 9:22 PM

It was a travel place advertising one of those free timeshare getaway weekends where you have to sit and listen to their sales pitch. I considered keeping it or blogging it because I knew that no one would believe me. It stayed in my inbox for days, but in the end I deleted it. Cal: The idea behind a distributed system would be that it would operate in near real time. Each time a spammer were to start up with a new IP, they would hit a brick wall after sending just a few messages. And since any bans are temporary, you don't have to worry about unjustly banning someone simply because their machine (or dynamic IP) was used as a platform for spam.

Adrian Holovaty
September 3, 2003 9:40 PM

Simon Willison started a decentralized blog-comment-spam blacklist. Check it out. http://simon.incutio.com/archive/2003/09/02/

Trackback from Compendium
September 4, 2003 8:19 AM

Why RSS might not replace e-mail just yet

Excerpt: Given the never-ending arms race between e-mail users and spammers, though, this doesn't bode well for the use of RSS as an alternative to SMTP-based e-mail - unless you decide never to subscribe to the RSS equivalent of mailing lists.

Trackback from Move the Crowd
September 5, 2003 4:29 AM

More on Comment Spam

Excerpt: There's a discussion about possible methods of stopping blog comments spamming over at Adam Kalsey's blog. Go add your thoughts....

trialanderror
September 5, 2003 8:41 AM

I am not sure how comments are being auto-discovered (i.e. directly from search engines or by other blog links). So how about "randomizing" the order of the comment fields? Human readers would still understand the form. Or creating 2 or more sets of comment fields (present in the HTML but not shown to the user; I don't know if this is possible) that appear to be regular comment fields but go nowhere? The "real" comment fields would be randomly placed in the "stream" of HTML code. Maybe having the post button pop up a window for verification (a limitation: ad blockers may block it). These thoughts are to make things programmatically inconvenient for the bots. If these are not useful, please delete (as I am not a programmer).
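
The decoy-field idea above is possible, and one way it might look on the server side is sketched below in Python. Everything here is an assumption for illustration: the decoy names, the secret, and the helper names are hypothetical, and the page template would have to hide the decoy inputs from human visitors (for example with CSS) while rendering the real field under its derived name.

```python
import hashlib
import hmac

SECRET = b"change-me"                       # hypothetical per-site secret
DECOY_FIELDS = ("comment", "email", "url")  # obvious names, hidden from humans


def real_field_name(logical_name: str, entry_id: str) -> str:
    """Derive a hard-to-guess form field name that differs for every entry."""
    message = f"{entry_id}:{logical_name}".encode()
    digest = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    return "f_" + digest[:12]


def extract_comment(form: dict, entry_id: str):
    """Return the comment text, or None if a decoy was filled in or the real field is empty."""
    if any(form.get(name, "").strip() for name in DECOY_FIELDS):
        return None  # only a script would fill in fields a person cannot see
    text = form.get(real_field_name("comment", entry_id), "").strip()
    return text or None
```

A bot that posts to the obvious field names is silently dropped, while a person using the rendered form never touches the decoys. Hidden fields can still confuse some assistive technology, so the accessibility concern raised elsewhere in this thread applies here too.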

Con Tendem
September 8, 2003 8:47 AM

I see that you have made email a hidden field. If more people did that, or allowed non-valid email addresses in the form of name[at]domain.com and the like, we could also reduce the other type of spam -- email harvesting from blog comments by spammers. That is a non-trivial issue for me - I like to leave my address so that people could contact me later if they have something to say about my comment. However, I am finding more and more spam sent to the email addresses used, and have to weigh the value (tiny) of having my email address out in public versus the cost (large) of a spam deluge. Yet I persist, go figure :)

Trackback from Mentalized/Journal
September 9, 2003 3:20 AM

Movable Type: Easier edit/removal of new comments

Excerpt: A small Movable Type hack to add a link to the edit comments page in the email MT sends when a new comment is posted.

Trackback from Yoz Grahame's Cheerleader
September 9, 2003 7:26 AM

Seven quick tips for a spam-free blog

Excerpt: Blog comment spam, while certainly not a pest on the scale of its email equivalent, has still made enough of a presence felt for it to be considered a threat. Nobody wants to spend two hours a day cleaning out...

Mike Steinbaugh
September 9, 2003 10:52 PM

I think the authentication idea is a good one. Maybe we could invent something like PGP keys for weblogs. SixApart could run the authentication server for MT and TypePad (something like authorize.blogs.com). I bet this would zap the comment spam. Changing the form fields around would work as well, I think. That would be a nice feature to have in MT Pro. However, I don't think making users register to post comments (like on a message board) is a good idea because in my case for example, many of my visitors just find my site from Google, leave one post and never return. If they had to create a login, it would be a waste of space on my server. On a side note, I think keeping your e-mail address out of your RSS feed is an excellent idea because the spambots are going to start parsing RSS for sure.

Chris
September 11, 2003 6:12 AM

I want a CAPTCHA, as it's got to be the best anti-robot tool ever and everyone is getting used to it these days. Needs must, and the devil drives Perl these days. Today comments, tomorrow TrackBack, later wikis... *sigh*

Adam Kalsey
September 11, 2003 9:43 AM

For those that didn't follow that, a captcha is a test that is hard for computers to complete but easy for humans. The most common one is an image that contains a passphrase that must be typed in order to submit a form. The image is hard to read with OCR, but easy for humans to decipher. Try signing up for a Yahoo mail account to see how it works. The idea is that if your site requires that an image is read in order to use it, anything that can't read images will be denied access. The problem is that this includes not only spam bots, but blind users, some handheld devices, and text-only browsers as well. This form of a captcha is an accessibility nightmare.

galiel
September 19, 2003 7:48 AM

I am surprised there has been no follow-up discussion about communal post-ranking systems like Slashdot. No need to censor anyone or deal with accessibility problems, you simply have the community rank comments by merit, with the kind of safeguards against ballot-box-stuffing that Slashdot has built in. Trolls, spammers and freepers, who arguably combine the worst attributes of both, still post, but their posts don't get exposure--anyone who is bothered simply sets their filter to level 3 or whatever, and never sees the bottom-feeders. When the community is too small to have a good community filter, you either rank it yourself or appoint a small group of responsible commenters to do the ranking. When the community grows enough, you adopt a Slash-type system. Simple, free-speech-friendly, accessible, non-intrusive, manageable.
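
A toy Python sketch of reader-side threshold filtering as described above. It is only loosely modelled on Slashdot and is not its actual moderation algorithm: the starting score of 1, the clamp to the range -1 to 5, and the one-vote-per-reader rule are assumptions, and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    votes: dict = field(default_factory=dict)  # voter -> +1 or -1

    def rate(self, voter: str, value: int) -> None:
        """Record one vote per voter; authors cannot moderate their own comment."""
        if value not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        if voter == self.author:
            return  # simple guard against ballot-box stuffing
        self.votes[voter] = value  # a repeat vote replaces the earlier one

    @property
    def score(self) -> int:
        """Start at 1, add the votes, and clamp to the assumed range -1..5."""
        return max(-1, min(5, 1 + sum(self.votes.values())))


def visible(comments, threshold: int):
    """Return the comments a reader sees at the chosen filter level."""
    return [c for c in comments if c.score >= threshold]
```

Nothing is deleted in this scheme: a reader browsing at threshold -1 still sees every comment, while one browsing at 3 sees only those the community has voted up.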

Trackback from soundCommons :: weblog ::
October 10, 2003 10:49 AM

Comment Spam

Excerpt: In the past month or so, the blog has become the target of polite comments that seem to have not

Trackback from Reflective Reality
October 10, 2003 11:26 PM

Automated Comment SPAM Solution

Excerpt: I now have a working captcha thanks to James Seng. I really don't care how much of a pain it is on the accessibility front, the spammers have driven me to finding a working solution. The don't allow comments from google searches hack also makes first t...

Trackback from random ruminations
October 11, 2003 9:21 AM

Comment Spam

Excerpt: I've been struck with comment spam three times in the last week. I don't know if this means that, suddenly, my blog has hit the radar screens of whatever search engine spammers use, or if I'm just lucky. Regardless, the first time it was mild, the seco...

Trackback from different strings
October 12, 2003 8:14 PM

More on comment spam

Excerpt: There's a thread over at Making Light about a specific comment spammer who has been posting ads for what is allegedly child pornography. This guy is really obnoxious - one blogger reports having it show up on 89 posts so...

Trackback from Take the First Step
October 16, 2003 7:44 AM

Weblog Software and the Internet Food Chain

Excerpt: it's probably a good thing that TypePad embeds comments and TrackBack pings within the individual entry page. On the other hand, they should expect trackback spam to join the current comment spam. They need to address this before the cure becomes worse...

Richard Rutter
October 16, 2003 8:18 AM

I've started to implement tools to prevent comment spam on my site. So far I've only gone down the blacklist route. I also like the idea of preventing repeat posts within a certain time period - this would also prevent accidental multiple-posting. I figured that you could recognise a repeat post in three ways: 1) same name, email, URL; 2) same IP address; 3) same session ID. Could a PHP session ID prevent robot attacks? Or would a robot always get assigned a session ID anyway? I'm thinking no session ID - no comment.
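
The three repeat-post checks listed above could be combined roughly like this. The sketch is in Python rather than PHP, keeps its state in memory, and uses an assumed 60-second interval; the class name `RepeatPostGuard` is hypothetical.

```python
import time

MIN_INTERVAL = 60  # seconds that must pass between posts (assumed value)


class RepeatPostGuard:
    """Rejects a post that repeats a recent identity, IP address, or session ID."""

    def __init__(self, min_interval=MIN_INTERVAL):
        self.min_interval = min_interval
        self._last_seen = {}  # (kind, value) -> time of the last accepted post

    def allow(self, name, email, url, ip, session_id, now=None):
        now = time.time() if now is None else now
        if not session_id:
            return False  # "no session ID - no comment"
        keys = [
            ("identity", (name, email, url)),
            ("ip", ip),
            ("session", session_id),
        ]
        for key in keys:
            if now - self._last_seen.get(key, float("-inf")) < self.min_interval:
                return False  # one of the three signals was seen too recently
        for key in keys:
            self._last_seen[key] = now
        return True
```

This also stops accidental double-posting. A bot that accepts cookies would still be handed a session ID like any browser, so the session check mainly filters scripts that post directly without loading the form first.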

Lonnon Foster
November 5, 2003 1:31 PM

Jay Allen has an excellent Movable Type plugin for stopping comment spam: MT-Blacklist (http://www.jayallen.org/projects/mt-blacklist/). The plugin hits comment spammers where they live: in the URLs they leave behind. Comment spam is actually a little easier to filter than email spam, because it has to point to a specific URL in order to boost that URL's page ranking in search engines. MT-Blacklist looks for known spam URLs (and comes with a default blacklist of over 450), and adding new ones is as easy as clicking a link in MT's new comment notification mail.
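
To illustrate the general technique of URL-based filtering described above, here is a short Python sketch. It is not MT-Blacklist's code or its list format: the two patterns are invented examples, and `find_blacklisted_urls` is a hypothetical helper.

```python
import re

# Invented example patterns; a real list would be maintained and shared.
BLACKLIST_PATTERNS = [r"cheap-pills\.example", r"casino[\w-]*\.example"]
BLACKLIST = [re.compile(p, re.IGNORECASE) for p in BLACKLIST_PATTERNS]

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def find_blacklisted_urls(*fields):
    """Return every URL in the given comment fields that matches a blacklist pattern."""
    hits = []
    for text in fields:
        for url in URL_RE.findall(text or ""):
            if any(pattern.search(url) for pattern in BLACKLIST):
                hits.append(url)
    return hits
```

A comment would be held or rejected when the returned list is non-empty, checking both the comment body and the commenter's URL field.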

stephen
November 5, 2003 11:36 PM

Convert URLs to links pointing to your server, which in turn redirects each link to the original URL, defeating the purpose of ranking high in search engines.
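
A rough Python sketch of the link-rewriting suggestion above. The `/out` endpoint and the `to` parameter are hypothetical names, and the sketch does not address how a search engine actually treats the redirect, which determines whether ranking credit is really withheld.

```python
import re
from urllib.parse import parse_qs, quote, urlsplit

REDIRECT_ENDPOINT = "/out"  # hypothetical path on your own server
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def rewrite_links(text: str) -> str:
    """Replace each URL in a comment with a link through the local redirect endpoint."""
    def to_redirect(match):
        return f"{REDIRECT_ENDPOINT}?to={quote(match.group(0), safe='')}"
    return URL_RE.sub(to_redirect, text)


def redirect_target(request_path: str):
    """Recover the original URL from a redirect request, or None if it is not http(s)."""
    query = parse_qs(urlsplit(request_path).query)
    target = query.get("to", [""])[0]
    if target.lower().startswith(("http://", "https://")):
        return target
    return None
```

The handler behind the endpoint would answer with an HTTP redirect to the value returned by `redirect_target` and refuse anything else.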

Adam Kalsey
November 6, 2003 9:52 AM

That's an idea that's often floated about. The problem is that spammers would still leave spam, not knowing that your system wasn't giving them Google juice. And this (and Jay Allen's) solution also relies on the concept that spammers leave comment spam solely to increase PageRank. That will change. Spammers will start leaving spam for other reasons as well.

Trackback from Wetware
November 7, 2003 9:08 AM

A New Way to Fight Blog Comment Spam

Excerpt: Spam in blog comments is quite different from email spam and can be fought in a much more direct manner.

Alfred Anderson
November 14, 2003 2:47 PM

You have excellent ideas represented in this blog. Many of them could be used by more than just blogs and could migrate into email, web page comments, IM and other areas where spamming is frequent. However, while select individual sites can be protected with such advanced techniques, do we have an infrastructure that allows such protection to be available on a more global scale? Right now, I sense this is a grass-roots effort for which support is needed (perhaps at the standards committee level). Is anyone lobbying the standards bodies for incorporation of such proven ideas? Will the best of these ideas be incorporated in commercial-ware? Unless these ideas reach the average consumer, they are falling far short of their potential. So how can these ideas be marketed?

kaushal parikh
December 17, 2003 8:45 AM

The simple way to do it is to remove all URLs in comments. No way to steal visitors = no reason to put comment spam on a page... Another way to fight back: build a link farm where you put a link to all the comment spammers' websites. They will soon be penalized by Google and nobody will find them ;). I like distributed/collaborative approaches to fight spam. For weblogs with a low volume of comments, pre-approval of comments may be the answer. If you know that your comment will first be read by a moderator/blog owner, and you know that it will never be approved, why would you want to post comment spam? Pre-approval via email turns comment spam into regular spam with a smaller audience, and the regular email spam tools already available could be used... kaushal parikh http://www.kaushalparikh.com

Trackback from WWWorker - Sascha Carlin
November 15, 2004 10:12 AM

Secret Tags - An alternative to Captchas?

Excerpt: [11/14/2004] Update: [Adam Kalsey has a piece][adam] from Sep 2003 that includes more or less what I call Secret Tags. Since it's from Sep 2003, the credit goes to him, even though I discovered his piece just today. Adam, too, says...

Mark
January 9, 2006 6:14 PM

I agree very much with your point about spamming on comments. Why don't you just make sure that the topic is really addressed honestly? If it is addressed legitimately, then you should allow the link. If it's just a short and meaningless comment, then I would delete it. People should be rewarded for their honest interests in specific topics.

This discussion has been closed.

© 1999-2021 Adam Kalsey.