Broken Blogger "rel" Tags

I use Operator for Firefox, which parses microformats and lets you take actions on them, such as adding a contact to your address book. Some really cool stuff.

But the “rel” tag is broken on my Blogger blog. Blogger doesn’t follow the spec, which takes the tag name from the last segment of the link’s URL, so a label page named tag.html yields the tag “tag.html” rather than “tag”. Not that they can: with my site being an FTP-published blog, they can’t follow the spec and still publish the “Labels”, as they call tags, correctly. The issue is that Blogger makes the URL end with “.html”, as they would need to in order for that “Labels” page to work at all.
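
To make the problem concrete, here is a minimal sketch of how a rel-tag parser might read the tag from a link, assuming it simply takes the last path segment of the href. The helper name relTagFromHref and the example.com URLs are made up for illustration; this is not how Operator itself is implemented.

```php
<?php
// Hypothetical helper: the tag a rel-tag parser would read from a link's href,
// assuming the tag is the last path segment of the URL.
function relTagFromHref(string $href): string
{
    // parse_url() returns null when there is no path, so cast to string.
    $path = (string) parse_url($href, PHP_URL_PATH);
    $segments = explode('/', rtrim($path, '/'));
    return rawurldecode(end($segments));
}

echo relTagFromHref('http://example.com/labels/php.html'), "\n"; // php.html
echo relTagFromHref('http://example.com/labels/php'), "\n";      // php
```

With the “.html” on the end, the tag comes out as “php.html”, which is why the links need to lose their extension.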

I took matters into my own hands. I can’t fix blogger, but I can fix my site.

My webhost has PHP available to me. I have made 3 changes so I can do some really cool stuff with my blog, like fix the Labels issue.
1) I have all my Blogger-uploaded .html pages parsed as PHP
2) I use mod_rewrite to allow me to link to /labels/tag as well as /labels/tag.html
3) I use PHP’s output buffering to dynamically rewrite the page as it is being served.

In more detail:

1) I created a .htaccess file at the root level of my website (/httpdocs) that contains the following line.

AddType application/x-httpd-php .html .php

This allows all .html pages to be parsed as PHP code.

2) I created a .htaccess file in /labels that has the following code in it.

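RewriteEngine On
RewriteBase /labels/
# Leave URLs alone if they already end in .html ...
RewriteCond %{REQUEST_URI} !.+\.html$
# ... or if they have .html followed by anything else
RewriteCond %{REQUEST_URI} !.+\.html.+$
# Everything else gets .html appended
RewriteRule ^(.+)$ $1.html [L]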

This checks whether the requested URL already ends in .html (or has .html followed by something else) and, if not, appends .html and serves the file.
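
To see which URLs the rules touch, here is a rough PHP imitation of them. It is only a sketch: the function name labelRewriteTarget is made up, and real mod_rewrite has more going on than this.

```php
<?php
// Hypothetical helper that imitates the .htaccess rules in /labels for a request path.
function labelRewriteTarget(string $requestPath): string
{
    $base = '/labels/';

    // The rules live in /labels, so other paths never reach them.
    if (strncmp($requestPath, $base, strlen($base)) !== 0) {
        return $requestPath;
    }

    // The two RewriteCond lines: skip anything that already contains ".html".
    if (preg_match('/.+\.html$/', $requestPath) || preg_match('/.+\.html.+$/', $requestPath)) {
        return $requestPath;
    }

    // RewriteRule ^(.+)$ needs at least one character after /labels/.
    $relative = (string) substr($requestPath, strlen($base));
    if ($relative === '') {
        return $requestPath;
    }

    // Append .html and resolve against the RewriteBase.
    return $base . $relative . '.html';
}

foreach (array('/labels/php', '/labels/php.html', '/labels/') as $path) {
    echo $path, ' => ', labelRewriteTarget($path), "\n";
}
```

So /labels/php is served from /labels/php.html, while a link that already ends in .html passes through untouched.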

3a) I created a file called bloggerrewrite.php and it contains the following code:

<?php
function bloggerlabelrewrite($buffer)
{
    // Let's find the Blogger labels stuff.
    // It is good to be as specific as we can, because we don't have control over any changes Blogger may make.
    // i = case insensitive, U = ungreedy, just in case Blogger changes something (a lowercase u would only turn on UTF-8 mode).
    $pattern = '/(<p class="blogger-labels">.*<\/p>)/iU';

    // $html_array[0] = the stuff before, [1] = the Blogger labels paragraph, [2] = the stuff after
    $html_array = preg_split($pattern, $buffer, -1, PREG_SPLIT_DELIM_CAPTURE);
    // No labels paragraph in this post: hand the page back unchanged.
    if ($html_array === false || count($html_array) < 3) {
        return $buffer;
    }

    // The transformation we want to make: change Labels to Tags, and drop the .html
    $trans = array('Labels:' => 'Tags:', '.html">' => '">');

    return $html_array[0] . strtr($html_array[1], $trans) . $html_array[2];
}

3b) I then modified my Blogger template by inserting the following between <BLOGGER> and </BLOGGER>, so the output buffering from the 3a file is applied to each individual post.



<?php ob_end_flush(); ?>
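
As a rough sketch of how an opening and closing snippet can work together, assume the opening one starts a buffer with a callback and the closing one flushes it. Everything here is a stand-in: demoLabelRewrite is a simplified, made-up callback rather than the function from 3a, and the paragraph in the middle is sample markup, not real Blogger output.

```php
<?php
// Hypothetical, simplified stand-in for the callback defined in bloggerrewrite.php.
function demoLabelRewrite($buffer)
{
    return preg_replace_callback(
        '/<p class="blogger-labels">.*<\/p>/iU',
        function ($match) {
            // Rename the heading and drop the .html from each tag link.
            return strtr($match[0], array('Labels:' => 'Tags:', '.html">' => '">'));
        },
        $buffer
    );
}

// Opening snippet: everything printed from here on is collected in a buffer.
ob_start('demoLabelRewrite');
?>
<p class="blogger-labels">Labels: <a rel="tag" href="/labels/php.html">php</a></p>
<?php
// Closing snippet: PHP passes the buffer through the callback and sends the result.
ob_end_flush();
```

The page that reaches the browser then contains “Tags:” and a link to /labels/php instead of /labels/php.html.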

The results are as you see. My Blogger “Labels” no longer exist, you now see a “Tags” section at the bottom of every post. These tags are properly formatted, and so they work with Operator.

I am kinda wondering: what if a crawler or whatever comes across the site, identifies it as a Blogger site, tries to parse the “Labels” in a BloggerQuirks mode, and doesn’t find any Labels…