
This is the blog of Adam Kalsey. Unusual depth and complexity. Rich, full body with a hint of nutty earthiness.


Related Entries Revisited

Freshness Warning
This blog post is over 21 years old. It's possible that the information you read below isn't current and the links no longer work.

While my Related Entries plugin does a decent job of displaying blog entries that are similar to the current one, it requires that I stick to a somewhat rigid set of keywords or category classification in order to generate a high-quality list of related entries. As I get more and more entries, it is becoming difficult to stick to that system and I am finding that a number of entries on the site have related entries lists that aren’t very well related. An errant keyword leaves the whole system in disarray.

An architecture that depends on human input to be perfect in order to work is brittle. It breaks easily and isn’t flexible. I don’t like brittle systems, so a week or so ago I set out to find a different method of generating related entries.

MySQL Fulltext indexes to the rescue

I had been working with MySQL fulltext queries to create a database search on a client site and I noticed that when I searched for the exact title of a database entry, I got back a set of results that included not only the entry I was looking for, but several entries that were very similar to it. Later that day, I received a comment on the Related Entries plugin page asking about the algorithm used to generate related entries from keywords. Getting a question that set my mind to work on alternative algorithms on the same day that I noticed the MySQL fulltext behavior was pure serendipity. I probably would not have connected these two ideas otherwise.

If fulltext searches on exact database fields were returning some similar database records, then perhaps including multiple fields would return records that were even more similar. I spent about a week testing various combinations of Movable Type fields, concatenating the contents of several fields and using that text to search against the same fields. What I found was that using the full body of the entry typically returned results that didn’t relate very well. A simple word repeated too many times in the body of several entries would skew the results.

Using some of the shorter entry fields (excerpt, title, and keywords) created lists that had some very solid results. Several of the entries in the list were perfect matches to the current entry, but unfortunately many were not. What I needed now was a way to force the best matches to the top, and fortunately MySQL provides one. If you include the fulltext query in your SQL’s SELECT clause, MySQL will add a numeric relevance score to each record. Including the fulltext query twice in your SQL, once in the SELECT clause and once in the WHERE clause, doesn’t adversely affect performance because the MySQL query engine recognizes that the two are the same query and runs it only once. By sorting the resulting records by the relevance score in descending order, the best matches rise to the top and the irrelevant records are pushed to the bottom.
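As a rough sketch of the idea, the query below repeats the same MATCH ... AGAINST expression in the SELECT list (to expose the score) and in the WHERE clause (to drop rows that don't match). The search text 'movable type plugin' is a made-up example, and the query assumes a fulltext index already exists on exactly these three fields of mt_entry, as set up in the installation section.

```sql
-- Sketch only: 'movable type plugin' is a placeholder search string.
-- Assumes a FULLTEXT index on (entry_keywords, entry_title, entry_excerpt).
SELECT entry_id,
       entry_title,
       MATCH (entry_keywords, entry_title, entry_excerpt)
           AGAINST ('movable type plugin') AS score   -- relevance score for each row
FROM mt_entry
WHERE MATCH (entry_keywords, entry_title, entry_excerpt)
          AGAINST ('movable type plugin')             -- keeps only rows that match
ORDER BY score DESC;                                  -- best matches first
```

The score is a non-negative floating-point number. Rows with no similarity to the search text score zero and are excluded by the WHERE clause.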

Installation and Configuration

And now the moment you’ve all been waiting for: how to install this improved related entries system. The first thing you will need to do is create a fulltext index in your MySQL database. If you are using another data storage method for Movable Type, you’re out of luck. This puppy only works in MySQL.

Run this SQL command against the Movable Type database. The specifics of how to run a query on your database aren’t something I’m going to explain here. If you don’t know how to do that, ask your Web server host for help.

ALTER TABLE mt_entry ADD FULLTEXT ( entry_keywords, entry_title, entry_excerpt )

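This tells MySQL that you are going to run fulltext queries on those fields, in that order. MySQL builds and maintains an index of the words in those fields, which is what makes fulltext queries fast. The field list in the query’s MATCH clause has to name the same fields as this index.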
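One way to confirm that the index was created is to list the indexes on the table. This is a sketch of an optional check, not a required step. The exact rows shown will depend on your MySQL version and on any other indexes Movable Type has defined.

```sql
-- Lists every index on mt_entry. Look for rows whose Index_type is FULLTEXT
-- and whose Column_name values are entry_keywords, entry_title, and entry_excerpt.
SHOW INDEX FROM mt_entry;
```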

Thanks to Inluminent, I discovered that Simon Willison was doing much the same thing as me, but generating the related list through PHP each time the entry was shown. This caused a problem with database timeouts and he had to put in a caching mechanism. To avoid that problem, and to make sure that people who don’t have access to PHP can still use this hack, I’m using Brad Choate’s MT SQL plugin to run the fulltext query, and MT generates the list when the page is built. There’s no need for caching, because the related list is static HTML just like the rest of the page.

So go download Brad’s plugin and install it according to his installation instructions. I’ll wait right here until you get back.

Done? Great. Now go into your individual archive template in Movable Type. Find the spot where you want the related entries to appear and add this template code:

<MTSQLEntries query="SELECT entry_id, MATCH (entry_keywords, entry_title, entry_excerpt) AGAINST ('[MTEntryKeywords encode_php='q'] [MTEntryTitle encode_php='q']') AS score FROM mt_entry WHERE MATCH (entry_keywords, entry_title, entry_excerpt) AGAINST ('[MTEntryKeywords encode_php='q'] [MTEntryTitle encode_php='q']') AND entry_id != '[MTEntryID]' AND entry_blog_id = [MTBlogID] ORDER BY score DESC LIMIT 0 , 4"><li><a href="<MTEntryLink>"><MTEntryTitle></a></li></MTSQLEntries>
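To make the template easier to follow, here is roughly what the query could look like after Movable Type substitutes the bracketed tags for one entry. The keyword string, the title, the entry ID (123), and the blog ID (1) are invented values for illustration only.

```sql
-- Hypothetical expansion of the template query for a single entry.
-- 'weblogs plugins' (keywords), 'Related Entries Revisited' (title),
-- entry ID 123, and blog ID 1 are all made-up example values.
SELECT entry_id,
       MATCH (entry_keywords, entry_title, entry_excerpt)
           AGAINST ('weblogs plugins Related Entries Revisited') AS score
FROM mt_entry
WHERE MATCH (entry_keywords, entry_title, entry_excerpt)
          AGAINST ('weblogs plugins Related Entries Revisited')
  AND entry_id != '123'      -- leave out the entry being displayed
  AND entry_blog_id = 1      -- stay within the current blog
ORDER BY score DESC          -- strongest matches first
LIMIT 0, 4;                  -- keep only the top four
```

The entry's own keywords and title become the search text, so the entries that share the most significant words with it end up at the top of the list.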

Now just rebuild your individual entry archives and start enjoying your new and improved related entries.
