Related Entries Revisited

Freshness Warning
This article is over 15 years old. It's possible that the information you read below isn't current.

While my Related Entries plugin does a decent job of displaying blog entries that are similar to the current one, it requires that I stick to a somewhat rigid set of keywords or category classification in order to generate a high-quality list of related entries. As I get more and more entries, it is becoming difficult to stick to that system and I am finding that a number of entries on the site have related entries lists that aren’t very well related. An errant keyword leaves the whole system in disarray.

An architecture that depends on human input to be perfect in order to work is brittle. It breaks easily and isn’t flexible. I don’t like brittle systems, so a week or so ago I set out to find a different method of generating related entries.

MySQL Fulltext indexes to the rescue

I had been working with MySQL fulltext queries to create a database search on a client site and I noticed that when I searched for the exact title of the database entry, I got back a set of results that included not only the entry I was looking for, but several entries that were very similar to it. Later that day, I received a comment on the Related Entries plugin page asking about the algorithm used to generate related entries from keywords. Getting a query that set my mind to work thinking about alternative algorithms on the same day that I noticed the MySQL fulltext behavior was pure serendipity. My mind would probably not have intersected these two ideas otherwise.

If fulltext searches on exact database fields were returning some similar database records, then perhaps including multiple fields would return records that were even more similar. I spent about a week testing various combinations of Movable Type fields, concatenating the contents of several fields and using that text to search against the same fields. What I found was that using the full body of the entry typically returned results that didn’t relate very well. A simple word repeated too many times in the body of several entries would skew the results.

Using some of the shorter entry fields — excerpt, title, and keywords — created lists that had some very solid results. Several of the entries in the list were perfect matches to the current entry, but unfortunately many were not. What I needed now was a way to force the best matches to the top, so it’s a good thing that MySQL provides a way to do this. If you include the fulltext query in your SQL’s SELECT clause, MySQL will add a numeric relevance score to each record. Including the fulltext query twice in your SQL doesn’t adversely affect performance because the MySQL query engine recognizes that they are the same queries and only runs it once. By sorting the resulting records by the relevance score, all the irrelevant records are pushed to the bottom.

Installation and Configuration

And now the moment you’ve all been waiting for. How to install this improved related entries system. The first thing you will need to do is create a fulltext index in your MySQL database. If you are using another data storage method for Movable Type, you’re out of luck. This puppy only works in MySQL.

Run this SQL command against the Movable Type database. The specifics about how to run a query on your database isn’t something I’m going to explain here. If you don’t know how to do that, ask your Web server host for help.

ALTER TABLE mt_entry ADD FULLTEXT ( entry_keywords, entry_title, entry_excerpt )

This tells MySQL that you are going to run fulltext queries on those fields, in that order. MySQL will store some hidden data that optimizes fulltext queries and makes them fast.

Thanks to Inluminent, I discovered that Simon Willison is doing much the same thing as me, but he was generating the related list through PHP each time the entry was shown. This caused a problem with database timeouts and he had to put in a caching mechanism. To prevent this, and to make sure that people who don’t have access to PHP can still use this hack, I’m using Brad Choate’s MT SQL plugin to run the fulltext query and then MT generates the list when the page is built. There’s no need for caching, because the related list is static HTML just like the rest of the page.

So go download Brad’s plugin and install it according to his installation instructions. I’ll wait right here until you get back.

Done? Great. Now go into your individual archive template in Movable Type. Find the spot you want to stick the related entries and add this template code:

<MTSQLEntries query="SELECT entry_id, MATCH (entry_keywords, entry_title, entry_excerpt) AGAINST ('[MTEntryKeywords encode_php='q'] [MTEntryTitle encode_php='q']') AS score FROM mt_entry WHERE MATCH (entry_keywords, entry_title, entry_excerpt) AGAINST ('[MTEntryKeywords encode_php='q'] [MTEntryTitle encode_php='q']') AND entry_id != '[MTEntryID]' AND entry_blog_id = [MTBlogID] ORDER BY score DESC LIMIT 0 , 4"><li><a href="<MTEntryLink>"><MTEntryTitle></a></li></MTSQLEntries>

Now just rebuild your individual entry archives and start enjoying your new and improved related entries.

Trackback from mashby.com
December 7, 2003 10:21 AM

Sprucing Up The Place

Excerpt: After yesterday's entry regarding Plugin and how to tweak MT, I couldn't help but do a little tweaking myself. That coupled with the comments from my review at The Weblog Review, I thought of a few things that I could do to kind of spruce up the place....

Patrick Madden
December 11, 2003 9:22 AM

Thank you for this method. I've just set it up, it works fine, but, like Alexander Payne, I haven't been entering excerpts or keywords, and thus I get only entries with common words in their titles. I've read through your explanation in the comments here, and it seems logical, but I am wondering a couple of things: Your query looks at Titles, Keywords, and Excerpts, but I would expect something about MTEntryExcerpt in the AGAINST ('[MTEntryKeywords encode_php='q'] [MTEntryTitle encode_php='q']') section of the code. Is that missing? or does it check Excerpts some other way? I don't read SQL code at all, but this seems like it's checking only titles and keywords, not excerpts (though it wants to with the entry_excerpt). And if I were daring and wanted to check within MTEntryBody (using the score > # adjustment), would I just add [MTEntryBody encode_php='q'] inside both AGAINST parenthesis? Would I also need to add a entry_body thing inside the MATCH parenthesis? My solution may be to just start using excerpts and keywords (and have the blog's other authors do it too), but I thought I'd give this a try first (as a way of including the old posts). Thanks a lot.

Trackback from Sundown
January 12, 2004 6:03 AM

Related Entries with MTSQLEntries

Excerpt: Results are pretty good and will continue to get better if I can remember to be diligent about keywords.

Junjan
October 19, 2004 10:52 AM

Hi there, I have a problem with your plugin. I have installed it as described and it works, but it always shows up only one result and it is the very same entry. Any idea what can be the problem?

TJ
October 20, 2004 1:45 AM

Hi, congratulations to this wonderful "plugin". It's a great help! One thing we couldn't figure out - how can we alter the statement to search for realted entries in all blogs? We used "...AND entry_blog_id in (8,9,10)..." instead of the existing entry_blog_id statement but it still only produces matches from one weblog? What's the problem? All teh best, Torsten

Amol Hatwar
November 17, 2004 1:19 PM

I was wanting to add a related entry list to my blog as well. I knew things could be done this way, but if the pages were made static, older posts wouldn't be able to link to newer ones. Or you'd have to regenerate all static files at every post (too much over-head). It seems to me that Simon's way is right. I wonder why he'd get MySQL choke-ups but. The function mysql_pconnect() was reported to be buggy.

Lauren Noelle
December 7, 2004 8:47 PM

I have the same problem as Junjan: "it always shows up only one result and it is the very same entry." Well, that happened on some entries, in at least one, there are none (when there should be more, I checked ones that should have lots).

Trackback from eclecticism
January 8, 2005 2:22 PM

Progress: Related Entries

Excerpt: The keyword index will work, but I've got a lot of work to do on my keywords before I can bring it live. In the meantime, I've re-implemented Adam Kalsey's 'related entries' code, listing five similar entries in the sidebar of each individual entry page.

Trackback from Mehh? Mehh!
August 2, 2005 1:16 PM

Useful site for MT users

Excerpt: An extremely useful site I came across while installing pluggins for movable type is Adam Kelseys blog which contains such beauties as the related entries plugin which I'm attempting to implement right now (had it working am just trying to...

Subbu Allamaraju
October 17, 2005 8:45 PM

Thanks for this idea. I tried a few alternatives, and your solution is the most elegant for displaying related entries.

Scott Johnson
May 17, 2006 12:07 PM

I realize that I'm about two years behind the curve on this one, but I just implemented this great little hack on my blog today. Very nice!

Sammy
May 23, 2006 8:12 AM

Is this plugin compatible with WordPress?

Scott Johnson
June 29, 2006 12:44 AM

Now that I have been using this query in my MT individual archives for a bit over a month, I have found one shortcoming. See this page for an example: http://full-speed.org/archives/2006/06/13/inaudible_ring_tones.php (Adam, feel free to delete that link once you've checked it out.) Basically, when the query doesn't find related entries, nothing is shown. I would really like to fix that. Whether it be a simple message that says that there were no related entries found or (preferably in my case) displaying some less relevant links. I have dealt with MySQL quite a bit, but this FULLTEXT stuff is new to me. And MTSQL is something that I have only used in a copy & paste manner. Any ideas?

Scott Johnson
June 29, 2006 12:54 AM

Sorry for so many postings here, but I have been hacking up your query a bit and have a result that at least gives me _something_. It's far from perfect at this point, but I now have _some_ links displaying, however irrelevant. What I did was replace MTEntryKeywords (a field that was returning an empty string for my blog postings) with MTEntryExcerpt in two places in the query. Like I said, it's not perfect, but at least I have some links now. Back to hacking that query... :)

amir
July 22, 2006 1:50 AM

hi there is that possible to we find the re;ated entries of one post, from other blogs entries? HOW??!!! How we should change the code? please mail me the method .. thanks

These are the last 15 comments. Read all 57 comments here.

This discussion has been closed.

Follow me on Twitter

Best Of

  • The mouse and me Not only is the mouse very destructive, but it's evaded all attempts to capture or kill it so far.
  • Comment Spam Manifesto Spammers are hereby put on notice. Your comments are not welcome. If the purpose behind your comment is to advertise yourself, your Web site, or a product that you are affiliated with, that comment is spam and will not be tolerated. We will hit you where it hurts by attacking your source of income.
  • The importance of being good Starbucks is pulling CD burning stations from their stores. That says something interesting about their brand.
  • Rounded corners in CSS There lots of ways to create rounded corners with CSS, but they always require lots of complex HTML and CSS. This is simpler.
  • Where do the RSS ad startups fit in? Yahoo's RSS advertising service could spell trouble for pure-play RSS advertising services unless they adapt their business model.
  • More of the best »

Recently Read

Get More

Subscribe | Archives

15

Recently

Stretching your team (Jun 11)
Stretching your team is one of the best ways to improve your output, your team's happiness, and your velocity. But they'll need coaching.
Physical camera shutter for Cisco Spark Board (Jul 6)
A 3d printable design for a camera shutter for a Cisco Spark Board
My Travel Coffee Setup (Jan 20)
What my travel coffee brewing setup looks like, and how you can build your own for under $100.
Turkey Legs (May 30)
Product naming gone awry.
Speaking for Geeks: Your Slides (Dec 17)
Tips and tricks for creating great slides.
Speaking for Geeks: Writing Your Talk (Dec 14)
Don’t wait until the night before the talk to write it. Crazy, I know.
Speaking for Geeks: Tell a Story (Dec 13)
Telling a story keeps your presentation focused, keeps your audience interested, and makes it easier for you to remember your talk.
Speaking for Geeks: Where to speak (Dec 11)
You've got a great idea for a talk. How do you find conferences to submit it to?

Subscribe to this site's feed.

Elsewhere

Tropo
Voice and communications platforms, including Tropo and Phono. Work.
SacStarts
The Sacramento technology startup community.
Pinewood Freak
Pinewood Derby tips and tricks

Contact

Adam Kalsey

Mobile: 916.600.2497

Email: adam AT kalsey.com

AIM or Skype: akalsey

Resume

PGP Key

©1999-2018 Adam Kalsey.
Content management by Movable Type.