Dirify in PHP


Movable Type has a dirify option that converts strings to something that can be used in a Web filename. It strips certain characters, converts spaces to underscores, and makes some other changes. One of my client sites has a number of pages that use PHP to access the MT database directly and display the entries dynamically (something that the upcoming MT 3.1 does natively).

But since the site uses static pages with dirified titles as the filenames, I needed to convert the MT data from the database to MT’s archive URL format using a dirify function, so I ported the dirify mechanism to PHP.

function dirify($s) {
     $s = convert_high_ascii($s);                 ## convert high-ASCII chars to 7bit.
     $s = strtolower($s);                         ## lower-case.
     $s = strip_tags($s);                         ## remove HTML tags.
     $s = preg_replace('!&[^;\s]+;!','',$s);      ## remove HTML entities.
     $s = preg_replace('![^\w\s]!','',$s);        ## remove non-word/space chars.
     $s = preg_replace('!\s+!','_',$s);           ## change space chars to underscores.
     return $s;
}

function convert_high_ascii($s) {
 	$HighASCII = array(
 		"!\xc0!" => 'A',    # A`
 		"!\xe0!" => 'a',    # a`
 		"!\xc1!" => 'A',    # A'
 		"!\xe1!" => 'a',    # a'
 		"!\xc2!" => 'A',    # A^
 		"!\xe2!" => 'a',    # a^
 		"!\xc4!" => 'Ae',   # A:
 		"!\xe4!" => 'ae',   # a:
 		"!\xc3!" => 'A',    # A~
 		"!\xe3!" => 'a',    # a~
 		"!\xc8!" => 'E',    # E`
 		"!\xe8!" => 'e',    # e`
 		"!\xc9!" => 'E',    # E'
 		"!\xe9!" => 'e',    # e'
 		"!\xca!" => 'E',    # E^
 		"!\xea!" => 'e',    # e^
 		"!\xcb!" => 'Ee',   # E:
 		"!\xeb!" => 'ee',   # e:
 		"!\xcc!" => 'I',    # I`
 		"!\xec!" => 'i',    # i`
 		"!\xcd!" => 'I',    # I'
 		"!\xed!" => 'i',    # i'
 		"!\xce!" => 'I',    # I^
 		"!\xee!" => 'i',    # i^
 		"!\xcf!" => 'Ie',   # I:
 		"!\xef!" => 'ie',   # i:
 		"!\xd2!" => 'O',    # O`
 		"!\xf2!" => 'o',    # o`
 		"!\xd3!" => 'O',    # O'
 		"!\xf3!" => 'o',    # o'
 		"!\xd4!" => 'O',    # O^
 		"!\xf4!" => 'o',    # o^
 		"!\xd6!" => 'Oe',   # O:
 		"!\xf6!" => 'oe',   # o:
 		"!\xd5!" => 'O',    # O~
 		"!\xf5!" => 'o',    # o~
 		"!\xd8!" => 'Oe',   # O/
 		"!\xf8!" => 'oe',   # o/
 		"!\xd9!" => 'U',    # U`
 		"!\xf9!" => 'u',    # u`
 		"!\xda!" => 'U',    # U'
 		"!\xfa!" => 'u',    # u'
 		"!\xdb!" => 'U',    # U^
 		"!\xfb!" => 'u',    # u^
 		"!\xdc!" => 'Ue',   # U:
 		"!\xfc!" => 'ue',   # u:
 		"!\xc7!" => 'C',    # ,C
 		"!\xe7!" => 'c',    # ,c
 		"!\xd1!" => 'N',    # N~
 		"!\xf1!" => 'n',    # n~
 		"!\xdf!" => 'ss'
 	);
 	$find = array_keys($HighASCII);
 	$replace = array_values($HighASCII);
 	$s = preg_replace($find,$replace,$s);
     return $s;
}
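
One caveat worth sketching: the table above matches single bytes, so it appears to assume ISO-8859-1 (Latin-1) input. If your database hands back UTF-8, each accented character arrives as two bytes and the patterns hit the wrong ones. The snippet below is only an illustration of that effect and of one possible workaround. The helper name to_latin1_sketch() is hypothetical and not part of MT or the function above, and it assumes the mbstring extension is available.

```php
<?php
// Two entries copied from the table above, enough to show the effect.
$table = array(
    "!\xe9!" => 'e',    // e-acute as a single Latin-1 byte
    "!\xc3!" => 'A',    // A-tilde in Latin-1, but also a UTF-8 lead byte
);

// Hypothetical helper (an assumption, not part of the original function).
// Requires the mbstring extension.
function to_latin1_sketch($s) {
    if (mb_check_encoding($s, 'UTF-8')) {
        // Characters with no Latin-1 equivalent are replaced by "?" by default.
        return mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
    }
    return $s; // not valid UTF-8, so assume it is already single-byte
}

$utf8 = "caf\xc3\xa9";   // "cafe" with an e-acute, encoded as UTF-8

// Raw UTF-8 bytes: the lead byte \xc3 is rewritten to "A", \xa9 is left behind.
$raw = preg_replace(array_keys($table), array_values($table), $utf8);

// Converted first: the single byte \xe9 matches as intended.
$converted = preg_replace(
    array_keys($table),
    array_values($table),
    to_latin1_sketch($utf8)
);

echo bin2hex($raw), "\n";
echo $converted, "\n";
```

The UTF-8 check is a heuristic. A Latin-1 string can occasionally also be valid UTF-8, so if you know the encoding of your data, convert explicitly instead of guessing. Whether MT itself would produce matching filenames for such titles depends on how your MT install handles encodings.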

To use this function, simply pass in the string you want dirified, like so:

<?php echo dirify("Here's the title of an entry!"); ?>
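
To see what each step contributes, here is a rough trace on an ASCII-only title. It repeats the steps of dirify() in order, leaving out the high-ASCII table, and records the string after each one. The function name dirify_trace_sketch() is made up for this illustration, and the /archives/ directory and .html extension in the final line are assumptions; substitute whatever your MT archive settings actually produce.

```php
<?php
// Illustrative only: repeats the dirify() steps (minus the high-ASCII
// conversion) and records the string after each step.
function dirify_trace_sketch($s) {
    $steps = array();
    $steps['lowercase']   = $s = strtolower($s);
    $steps['strip_tags']  = $s = strip_tags($s);
    $steps['entities']    = $s = preg_replace('!&[^;\s]+;!', '', $s);
    $steps['non_word']    = $s = preg_replace('![^\w\s]!', '', $s);
    $steps['underscores'] = $s = preg_replace('!\s+!', '_', $s);
    return $steps;
}

$steps = dirify_trace_sketch('<b>Hello</b> World &amp; Friends!');
foreach ($steps as $name => $value) {
    echo $name, ': ', $value, "\n";
}

// Hypothetical archive layout: directory and extension are assumptions.
$url = '/archives/' . $steps['underscores'] . '.html';
echo $url, "\n";
```

Removing the entity leaves two adjacent spaces behind, but the last step collapses any run of whitespace into a single underscore, so the final name has no doubled underscores.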

Dennis Pallett
July 28, 2004 1:52 PM

Looks pretty good. Mind if I post it at my php website, PHPit, with full credit to you of course?

July 29, 2004 12:46 AM

Very nice! Looks a bit more complex and complete than the original one I have bookmarked from the early days. :) http://www.movabletype.org/support/index.php?act=ST&f=14&t=12400&hl=dirify,and,function Thanks for posting it!

Gabriel Radic
August 12, 2004 8:58 AM

Hello Adam, nice work with PHP dirify, thanks a lot. Would you be interested in improving the high-ascii filter with a few bug fixes and more supported characters? I did this already for MT. Too bad the developers didn't listen to my whining and didn't implement the changes. You can find the improved high-ascii filter here http://mt-stuff.fanworks.net/plugin/dirify_for_unicode.phtml Thanks.

Adam Kalsey
August 12, 2004 9:57 AM

Not really, unless it's also implemented in MT. The idea here isn't to be perfect, but to be an exact work-alike of the MT function so you can use PHP to construct URLs that point to files created by the MT build process. That way I can have individual archive pages be statically created by MT while the indexes are dynamic. The upcoming MT 3.1 does nearly the exact same thing (in fact the code is so similar it's scary), but there are people who won't be running the dynamic features of 3.1 who would still want to use this code.

Rob Bolton
August 18, 2004 8:15 AM

Very nice, this should work well.

August 18, 2004 11:23 AM

Here's mine in JavaScript:

function makeShortcut( str ){
     str = str.toLowerCase();
     var rExps=[
          /[\xC0-\xC2]/g, /[\xE0-\xE2]/g,
          /[\xC8-\xCA]/g, /[\xE8-\xEB]/g,
          /[\xCC-\xCE]/g, /[\xEC-\xEE]/g,
          /[\xD2-\xD4]/g, /[\xF2-\xF4]/g,
          /[\xD9-\xDB]/g, /[\xF9-\xFB]/g ];
     var repChar=['A','a','E','e','I','i','O','o','U','u'];
     for(var i=0; i<rExps.length; i++){
          str = str.replace(rExps[i],repChar[i]);
     }
     str = str.replace( /\W+/g, '-' );
     str = str.replace( /[_-]+/g, '-' );
     str = str.replace( /^-/, '');
     str = str.replace( /-$/, '');
     return str;
}

I use it "onkeypress" when administering articles: typing the title shows in real time how the shortcut looks.

Trackback from Six Apart Professional Network
September 10, 2004 3:26 PM

Dirify in PHP

Excerpt: This one's a handy reference: Dirify in PHP, courtesy of Adam Kalsey. If you've ever used the "dirify" attribute in a Movable Type template tag, this function performs the same text transformation, but in PHP. It's somewhat underreported that in...

November 2, 2004 3:05 AM

I found the convert_high_ascii() function exceedingly useful for stripping out evyl characters for use with a new flash/xml solution I'm working on. Many thanks! (And yes, full credits in the source :)

November 24, 2004 4:21 AM

Excellent - stripped out x92 (funny apostrophe?) from some Excel generated data for use in xml parser that barfs at anything odd! Thanks

July 17, 2009 4:09 AM

This is exactly what I was looking for. Thanks so much!

July 30, 2009 5:23 PM

Just what I've been looking for! Thank you!!!


© 1999-2022 Adam Kalsey.