Referral Abuse

It appears that each time an RSS file from my site is loaded by one of these applications, a referer is deposited in the log file. Each time I load a page in Internet Explorer, I don’t leave a referer for www.microsoft.com/ie in the log files of the site whose page I loaded, so why should any of the RSS readers be different?

RSS readers misusing the referer field? (kottke.org)

Amen. I’ve always found it irritating that news aggregators insert their URL into the referrer field. Some aggregators have taken things a step further by allowing the user to set any arbitrary URL as the referrer. So I get 48 "referrals" each day from www.hardhathosting.com even though there’s not a single link from their site to mine. It’s not that I mind knowing where my readers are coming from. That’s kind of nice, and I’m glad the person behind Hardhat Hosting finds me interesting enough to grab the feeds for this blog and Simplelinks once an hour. It just makes it difficult to distinguish a real referral from an aggregator fetching a feed. If Hardhat Hosting were to put a real link on their site to mine and people followed it, I probably wouldn’t notice. I’d just think the referral logs were crying wolf.

As a temporary measure, you can use a custom URL for the referrals instead of your site’s home page or the home page for the aggregator. L.M. Orchard does this with Amphetadesk, setting the referrer to a thanks page on 0xDECAFBAD, and I do this with Aggie. Sites I visit see a link to http://kalsey.com/blog/thanks/ in their referral logs.

It would be nice if there was some sort of browser header the aggregator could send to identify itself instead of using the referrer field. Oh, that’s right, there is. It’s called User-Agent.

The user agent field is designed for browsers, robots, and other user agents to identify themselves to the Web server. You can even add additional information, like a contact URL or email address. I’d like to see aggregators start using it.
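
As a rough sketch of what that could look like, here is how an aggregator written in Python might identify itself through User-Agent and leave the referrer alone. The aggregator name, version, contact URL, and feed URL are all made up for illustration, and the "Name/version (+contact URL)" layout is just one common convention, not a requirement.

```python
import urllib.request

# Hypothetical aggregator name, version, and contact URL; substitute your own.
AGGREGATOR_NAME = "ExampleAggregator"
AGGREGATOR_VERSION = "1.0"
CONTACT_URL = "http://example.com/about-my-aggregator"


def build_user_agent(name: str, version: str, contact: str) -> str:
    """Return a User-Agent value of the form 'Name/version (+contact)'."""
    return f"{name}/{version} (+{contact})"


def build_feed_request(feed_url: str) -> urllib.request.Request:
    """Create a request that identifies the aggregator via User-Agent only."""
    user_agent = build_user_agent(AGGREGATOR_NAME, AGGREGATOR_VERSION, CONTACT_URL)
    # No Referer header is set, so the site's referral log is not polluted.
    return urllib.request.Request(feed_url, headers={"User-Agent": user_agent})


# Building the request does not touch the network; urlopen() would send it.
request = build_feed_request("http://example.com/index.xml")
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen() would fetch the feed. The site owner would then see the aggregator and its contact URL in the user agent column of the log, and the referrer column would stay empty.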

Ingve
January 31, 2003 7:06 PM

I also think it is a bad idea, but the intentions behind it were good. Most of the people who “discover” this horrible abuse tend to sound like it’s being done for evil purposes and that aggregator writers must be stupid since they’re not using the user-agent header.

The Aggie referrals question was more or less how many “generic” Aggie referrals you get vs. how many Aggie users take advantage of the opportunity to provide you with potentially useful “hey, this is me and I’m reading you” information…

Userland stuffing their address in the referer header provided a benefit for many users even before the user’s weblog url inclusion by providing the count for how many times a resource was accessed. Noise to you and the server logs crowd, helpful to the many people with blogs on Manila sites etc. :-)

Adam Kalsey
January 31, 2003 10:47 PM

Most Aggie users don’t take advantage of this. You do and Anders Jacobsen does, but there are several people who don’t. That might be on purpose (maybe they don’t want me to know who they are), or it might be by accident.

I also just noticed, while looking through the Aggie source, that you were the one that provided the patch to add the “I’m reading you” referrer advertising into Aggie. You might want to take a stab at my patch and see if you can improve things a bit. This was the first time I’d ever looked at C# code, so I’m sure there’s a better way to do what I did.

Ingve
February 1, 2003 12:04 AM

Your patch has a few minor issues (a newline in constant error, and I can’t really see where the aggieBase_ string is used now, probably just cut and paste problems) but I’m not sure that abusing the User-Agent header is a huge improvement. If nobody is using the ability to provide information about their feed reading habits then maybe we should just declare this experiment a failure and move on.

Jacques Distler
February 2, 2003 10:34 PM

Well, if you assume that the Referer is most likely bogus whenever the Request_URI is your RSS feed, you can simply not log it.

Easy to set up with SetEnvIf and CustomLog.

Rod
March 9, 2006 11:18 PM

Hard Hat Hosting got exactly what they wanted: a link from your site to theirs…

©1999-2008 Adam Kalsey.
Content management by Movable Type.