Dirify in PHP

Freshness Warning
This article is over 14 years old. It's possible that the information you read below isn't current.

Movable Type has a dirify option that converts strings to something that can be used in a Web filename. It strips certain characters, converts spaces to underscores, and makes some other changes. One of my client sites has a number of pages that use PHP to access the MT database directly and display the entries dynamically (something that the upcoming MT 3.1 does natively).

But since the site uses static pages with dirified titles as the filenames, I needed to convert the MT data from the database to MT’s archive URL format using a dirify function, so I ported the dirify mechanism to PHP.

<?php
function dirify($s) {
    $s = convert_high_ascii($s);                ## convert high-ASCII chars to 7bit.
    $s = strtolower($s);                        ## lower-case.
    $s = strip_tags($s);                        ## remove HTML tags.
    $s = preg_replace('!&[^;\s]+;!', '', $s);   ## remove HTML entities.
    $s = preg_replace('![^\w\s]!', '', $s);     ## remove non-word/space chars.
    $s = preg_replace('!\s+!', '_', $s);        ## change space chars to underscores.
    return $s;
}

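function convert_high_ascii($s) {
 	# Each pattern below matches a single byte, so $s is assumed to be in a
 	# single-byte Latin-1 (ISO-8859-1) encoding rather than UTF-8.
 	$HighASCII = array(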
 		"!\xc0!" => 'A',    # A`
 		"!\xe0!" => 'a',    # a`
 		"!\xc1!" => 'A',    # A'
 		"!\xe1!" => 'a',    # a'
 		"!\xc2!" => 'A',    # A^
 		"!\xe2!" => 'a',    # a^
 		"!\xc4!" => 'Ae',   # A:
 		"!\xe4!" => 'ae',   # a:
 		"!\xc3!" => 'A',    # A~
 		"!\xe3!" => 'a',    # a~
 		"!\xc8!" => 'E',    # E`
 		"!\xe8!" => 'e',    # e`
 		"!\xc9!" => 'E',    # E'
 		"!\xe9!" => 'e',    # e'
 		"!\xca!" => 'E',    # E^
 		"!\xea!" => 'e',    # e^
 		"!\xcb!" => 'Ee',   # E:
 		"!\xeb!" => 'ee',   # e:
 		"!\xcc!" => 'I',    # I`
 		"!\xec!" => 'i',    # i`
 		"!\xcd!" => 'I',    # I'
 		"!\xed!" => 'i',    # i'
 		"!\xce!" => 'I',    # I^
 		"!\xee!" => 'i',    # i^
 		"!\xcf!" => 'Ie',   # I:
 		"!\xef!" => 'ie',   # i:
 		"!\xd2!" => 'O',    # O`
 		"!\xf2!" => 'o',    # o`
 		"!\xd3!" => 'O',    # O'
 		"!\xf3!" => 'o',    # o'
 		"!\xd4!" => 'O',    # O^
 		"!\xf4!" => 'o',    # o^
 		"!\xd6!" => 'Oe',   # O:
 		"!\xf6!" => 'oe',   # o:
 		"!\xd5!" => 'O',    # O~
 		"!\xf5!" => 'o',    # o~
 		"!\xd8!" => 'Oe',   # O/
 		"!\xf8!" => 'oe',   # o/
 		"!\xd9!" => 'U',    # U`
 		"!\xf9!" => 'u',    # u`
 		"!\xda!" => 'U',    # U'
 		"!\xfa!" => 'u',    # u'
 		"!\xdb!" => 'U',    # U^
 		"!\xfb!" => 'u',    # u^
 		"!\xdc!" => 'Ue',   # U:
 		"!\xfc!" => 'ue',   # u:
 		"!\xc7!" => 'C',    # ,C
 		"!\xe7!" => 'c',    # ,c
 		"!\xd1!" => 'N',    # N~
 		"!\xf1!" => 'n',    # n~
 		"!\xdf!" => 'ss'
 	);
 	$find = array_keys($HighASCII);
 	$replace = array_values($HighASCII);
 	$s = preg_replace($find,$replace,$s);
     return $s;
}
?>
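
To make the order of operations concrete, here is a minimal sketch that runs the same steps as dirify() one at a time on a sample title. The sample string is made up for illustration, and it is plain ASCII, so the convert_high_ascii() step would leave it unchanged and is skipped here.

```php
<?php
// Hypothetical sample title containing an entity, a tag, and punctuation.
$s = 'Q&amp;A: <em>Hello</em>, World!';

$s = strtolower($s);                       // q&amp;a: <em>hello</em>, world!
$s = strip_tags($s);                       // q&amp;a: hello, world!
$s = preg_replace('!&[^;\s]+;!', '', $s);  // qa: hello, world!
$s = preg_replace('![^\w\s]!', '', $s);    // qa hello world
$s = preg_replace('!\s+!', '_', $s);       // qa_hello_world

echo $s, "\n";
```

Note that entities are removed rather than decoded, which is why `Q&amp;A` collapses to `qa` instead of keeping anything for the ampersand.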

To use this function, simply pass in the string you want dirified, like so:

<?php echo dirify("Here’s the title of an entry!"); ?>
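
Since the point of the port is to build links to files that MT has already written, that last step might be sketched as follows. The `/archives/YYYY/MM/title.html` layout and the `archive_url()` helper are assumptions for illustration only; the real path depends on the archive file template configured in your MT install. So that the block runs on its own, it defines a reduced, ASCII-only stand-in when the dirify() from this article isn't already loaded.

```php
<?php
// Reduced stand-in used only when the article's dirify() is not loaded.
// It skips the high-ASCII table, so it is only suitable for ASCII titles.
if (!function_exists('dirify')) {
    function dirify($s) {
        $s = strtolower($s);
        $s = strip_tags($s);
        $s = preg_replace('!&[^;\s]+;!', '', $s);
        $s = preg_replace('![^\w\s]!', '', $s);
        return preg_replace('!\s+!', '_', $s);
    }
}

// Hypothetical helper: the date-based directory layout is an assumption,
// not something MT guarantees.
function archive_url($title, $timestamp, $base = '/archives/') {
    return $base . gmdate('Y/m', $timestamp) . '/' . dirify($title) . '.html';
}

// Entry title and creation time as they might come back from the database.
echo archive_url('Dirify in PHP', gmmktime(0, 0, 0, 7, 28, 2004)), "\n";
// prints /archives/2004/07/dirify_in_php.html
```

If your archive template uses a different directory scheme, only `archive_url()` needs to change; the dirified title stays the same.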

Dennis Pallett
July 28, 2004 1:52 PM

Looks pretty good. Mind if I post it at my php website, PHPit, with full credit to you of course?

kristine
July 29, 2004 12:46 AM

Very nice! Looks a bit more complex and complete than the original one I have bookmarked from the early days. :) http://www.movabletype.org/support/index.php?act=ST&f=14&t=12400&hl=dirify,and,function Thanks for posting it!

Gabriel Radic
August 12, 2004 8:58 AM

Hello Adam, nice work with PHP dirify, thanks a lot. Would you be interested in improving the high-ascii filter with a few bug fixes and more supported characters? I did this already for MT. Too bad the developers didn't listen to my whining and didn't implement the changes. You can find the improved high-ascii filter here http://mt-stuff.fanworks.net/plugin/dirify_for_unicode.phtml Thanks.

Adam Kalsey
August 12, 2004 9:57 AM

Not really, unless it's also implemented in MT. The idea here isn't to be perfect, but to be an exact work-alike of the MT function so you can use PHP to construct URLs that point to files created by the MT build process. That way I can have individual archive pages be statically created by MT but the indexes are dynamic. The upcoming MT 3.1 does nearly the exact same thing (in fact the code is so similar it's scary), but there are people who won't be running the dynamic features of 3.1 who would still want to use this code.

Rob Bolton
August 18, 2004 8:15 AM

Very nice, this should work well.

Jerome
August 18, 2004 11:23 AM

Here's mine in JavaScript:

function makeShortcut(str) {
    str = str.toLowerCase();
    var rExps = [
        /[\xC0-\xC2]/g, /[\xE0-\xE2]/g,
        /[\xC8-\xCA]/g, /[\xE8-\xEB]/g,
        /[\xCC-\xCE]/g, /[\xEC-\xEE]/g,
        /[\xD2-\xD4]/g, /[\xF2-\xF4]/g,
        /[\xD9-\xDB]/g, /[\xF9-\xFB]/g
    ];
    var repChar = ['A', 'a', 'E', 'e', 'I', 'i', 'O', 'o', 'U', 'u'];
    for (var i = 0; i < rExps.length; i++) {
        str = str.replace(rExps[i], repChar[i]);
    }
    str = str.replace(/\W+/g, '-');
    str = str.replace(/[_-]+/g, '-');
    str = str.replace(/^-/, '');
    str = str.replace(/-$/, '');
    return str;
}

I use it "onkeypress" when administering articles: typing the title shows in real time how the shortcut will look.

Trackback from Six Apart Professional Network
September 10, 2004 3:26 PM

Dirify in PHP

Excerpt: This one's a handy reference: Dirify in PHP, courtesy of Adam Kalsey. If you've ever used the "dirify" attribute in a Movable Type template tag, this function performs the same text transformation, but in PHP. It's somewhat underreported that in...

Keith
November 2, 2004 3:05 AM

I found the convert_high_ascii() function exceedingly useful for stripping out evyl characters for use with a new flash/xml solution I'm working on. Many thanks! (And yes, full credits in the source :)

Ian
November 24, 2004 4:21 AM

Excellent - stripped out x92 (funny apostrophe?) from some Excel generated data for use in xml parser that barfs at anything odd! Thanks

clotilde
July 17, 2009 4:09 AM

This is exactly what I was looking for. Thanks so much!

Lawrence
July 30, 2009 5:23 PM

Just what I've been looking for! Thank you!!!


©1999-2019 Adam Kalsey.