Broken aggregator

I read the contents of several hundred sites each day through my news aggregator. I’ve switched aggregators a few times over the last few years, starting with a home-grown web-based aggregator, followed by Radio, Amphetadesk, then Radio again, back to Amphetadesk, and currently Aggie. If the site doesn’t have a news feed, I either create one through a small scraping tool I built, or don’t bother to read it.

All the aggregators I’ve tried have a serious flaw. They let feed developers break the aggregator display. A single unclosed <em> tag causes all the posts from all the sites from that point on to be italicized. That’s something that I can generally live with, but it’s possible for those unclosed tags to compound (or would that be to aggregate?) and make a mess of my display.

Today, one feed forgot to close an <em>, another forgot to close a <b> and another forgot to close a <small>. What I ended up with was tiny, bold, italicized text that’s next to impossible to read.

The problem could have been worse. If someone had left the closing > off a tag, the browser might have ignored the rest of the text on the page, at least until the next closing angle bracket occurred.

But much worse, it’s possible for a feed developer to inject malicious code into their feed, and most aggregators would happily pass that HTML straight through to the browser. A bit of clever JavaScript in a feed that exploits a browser’s vulnerability to cross-site scripting attacks could do quite a bit of damage.

I suggest that the various aggregator developers take steps to eliminate these problems. Check each entry for unclosed tags and close them. Strip things like script tags from the feed before rendering it.
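Here is a rough sketch of what that cleanup could look like, using only Python's standard library. It is not taken from any of the aggregators mentioned above. The names `FeedSanitizer` and `sanitize_entry`, the tag whitelist, and the list of permitted link schemes are all assumptions made for illustration. The idea is to run each entry through the sanitizer on its own, so an unclosed tag can't leak into the next entry, and to drop script and style blocks along with any attribute other than a plain http, https, or mailto link.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Illustrative whitelist (an assumption, not a vetted security policy).
ALLOWED_TAGS = {"a", "b", "i", "em", "strong", "small", "p", "br",
                "ul", "ol", "li", "blockquote", "code", "pre"}
VOID_TAGS = {"br"}                      # tags that never take a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # remove the tag and everything in it
SAFE_SCHEMES = {"http", "https", "mailto"}


class FeedSanitizer(HTMLParser):
    """Rebuilds an HTML fragment with balanced, whitelisted tags only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []          # pieces of sanitized output
        self.open_tags = []    # stack of tags we have emitted but not closed
        self.skip_depth = 0    # > 0 while inside a script or style block

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tags are dropped, their text is kept
        attr_text = ""
        if tag == "a":
            # Keep only href, and only for an explicitly allowed scheme.
            href = dict(attrs).get("href") or ""
            if urlsplit(href.strip()).scheme.lower() in SAFE_SCHEMES:
                attr_text = ' href="%s"' % escape(href, quote=True)
        self.out.append("<%s%s>" % (tag, attr_text))
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return  # ignore stray closing tags
        # Close anything left open inside this tag, then the tag itself.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))

    def finish(self):
        self.close()  # flush any text the parser is still buffering
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def sanitize_entry(fragment):
    """Sanitize one feed entry's HTML so it cannot affect later entries."""
    parser = FeedSanitizer()
    parser.feed(fragment)
    return parser.finish()
```

Calling `sanitize_entry("<em>unclosed")` returns `<em>unclosed</em>`, and an entry such as `<b>hi<script>alert(1)</script> there` comes back as `<b>hi there</b>`. A whitelist is used rather than a blacklist because it fails closed: a tag or attribute nobody thought about is dropped instead of rendered. A real aggregator would still want a well-tested sanitizing library rather than a sketch like this.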

Meredith
January 18, 2003 3:10 AM

I'd love to hear more about the homegrown web-based aggregator you built. After getting frustrated with those written by others (including some you mentioned above), I'm trying to write my own using MT and mt-rssfeed.

Adam Kalsey
January 18, 2003 8:59 AM

It was built in ASP. All it really did was grab the XML files, parse them, and cache the result. The admin interface was quite primitive. You can see a cached copy of the aggregator output on the Wayback Machine at http://web.archive.org/web/20010803122838/http://kalsey.com/news/

Bill Kearney
January 21, 2003 2:57 PM

While it doesn't serve the immediate need, feeds listed on Syndic8 do get marked as bad when they produce invalid XML. You could have an aggregator double-check with Syndic8 (via XMLRPC) as to whether a feed is known to be working or not. That and you can always run it through the validator and send that URL to the feed author.

© 1999-2023 Adam Kalsey.