So, I’ve been working to migrate my ultrasaurus weblog from Movable Type to WordPress (this one will come later). Since all the links were changing anyhow, I decided to move from ugly numbered permalinks to pretty links based on the title of the post. I only had a few hundred posts, but enough so I didn’t want to be doing it by hand. Since I’m learning Ruby and infatuated with it at the moment, I wrote a Ruby script to generate the .htaccess directives.

In case this would be helpful to anyone else, here’s the code:

[sourcecode language=’ruby’]
require ‘date’
require ‘find’

Find.find(‘./archives/’) do |fname|
# puts fname
date_string = “”
title = “”
next if, “r”) do |f|
f.each_line do |line|
if m = line.match(“(


)”) then
date_string = m.captures[1]
elsif m = line.match(“(

)(.*)(</h3)") then
title = m.captures[1].downcase

# for some reason my mt archives include some old drafts with no titles
next if title == ""

oldfilename = File.basename(fname)

# 1) remove chars that I think wordpress removes
# 2) change spaces to dashes
# 3) handle this case to avoid —
# Emmy Award for Outstanding Lead Actress – Miniseries or Movie
# also avoid — which I saw in my data, but not sure why it was there
# 4) remove trailing dash if there is one
# Thank you ruby-talk
title = title.gsub(/[(,?!'":.+=/)]/, '').gsub(' ', '-').gsub('–','-').gsub(/-$/, '')

# get the new path to the entry, like "/2003/03/brand-new-weblog/"
date = Date.parse(date_string)
newpath = sprintf("%d/%02d/%s/", date.year.to_s, date.month.to_s, title);

# we want something like this:
# redirect 301 /old/old.htm

output = "redirect 301 /sarahblog/archives/" + oldfilename + "" + newpath
puts output

end # loop thru files in directory

Leave a reply

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>