Posts Tagged ‘rewriterule’

Using mod_rewrite And Zend Framework To Display Dynamic sitemap.xml

April 3rd, 2009 1 comment

Whilst creating a site the other day I thought about how I would manage the sitemap.xml file. This is an XML file containing a list of the URLs on a site. Most major search engines understand (and look for) this file, so having it present on a site is a definite must.

I have been down the route before of having a sitemap.xml file created by the application every time a new record (or anything else) was added, but as this was a high-traffic, multi-user site this approach just had too many problems. The main problem (aside from the potential performance hit) was that I would have to spend hours tying the calls to the sitemap.xml creation file into my application.

I then hit upon the idea of using a RewriteRule that would mask a controller as the sitemap.xml file. This would mean that the sitemap.xml controls could be kept away from all other parts of the application (so I could use the same template again), but I could also use Zend_Cache to cache the sitemap.xml file daily and therefore save on processing time.

First I needed to create a RewriteRule that would redirect a call to sitemap.xml to the Sitemap controller.

RewriteRule ^(.*)sitemap.xml$ /sitemap/index [L]

Next I created the Sitemap controller and made sure that the index action did not show the layout. The URLs are passed as an array to the view.

class SitemapController extends Zend_Controller_Action
{
    public function indexAction()
    {
        $this->_helper->layout()->disableLayout();
        $urls = array(
            array(
                'loc'        => 'http://www.talkincode.com/',
                'lastmod'    => '2009-04-02T11:34:48+00:00',
                'changefreq' => 'daily',
                'priority'   => '1.0',
            ),
        );
        
        $this->view->urls = $urls;
    }
}
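
The daily caching mentioned earlier is not shown in the controller above. Here is a minimal sketch of how it might look, assuming Zend Framework 1 is on the include path. The cache directory, the cache id sitemap_urls and the _buildUrls() helper are hypothetical names used for illustration, not part of the original code. Note that this caches the array of URLs rather than the rendered XML.

```php
<?php
require_once 'Zend/Cache.php';
require_once 'Zend/Controller/Action.php';

class SitemapController extends Zend_Controller_Action
{
    public function indexAction()
    {
        $this->_helper->layout()->disableLayout();

        // Keep entries for one day and let Zend_Cache serialize the array.
        $cache = Zend_Cache::factory(
            'Core',
            'File',
            array('lifetime' => 86400, 'automatic_serialization' => true),
            array('cache_dir' => '/tmp/') // assumption: any writable directory
        );

        // load() returns false on a miss or once the lifetime has expired.
        $urls = $cache->load('sitemap_urls');
        if ($urls === false) {
            $urls = $this->_buildUrls();
            $cache->save($urls, 'sitemap_urls');
        }

        $this->view->urls = $urls;
    }

    // Hypothetical helper: gather the URL data however the site needs to.
    protected function _buildUrls()
    {
        return array(
            array(
                'loc'        => 'http://www.talkincode.com/',
                'lastmod'    => '2009-04-02T11:34:48+00:00',
                'changefreq' => 'daily',
                'priority'   => '1.0',
            ),
        );
    }
}
```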

In order for the Sitemap controller to display anything it needs to have a view to render. The view script below creates the basic outline of the file and uses the partialLoop() view helper to print out the array of URLs.

<urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd" xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <?php echo $this->partialLoop('sitemap/_urlItem.phtml',$this->urls); ?>
</urlset>

Here is the file _urlItem.phtml, which gets rendered for every item in the $this->urls array.

<url> 
    <loc><?php echo $this->loc; ?></loc> 
    <lastmod><?php echo $this->lastmod; ?></lastmod> 
    <changefreq><?php echo $this->changefreq; ?></changefreq> 
    <priority><?php echo $this->priority; ?></priority> 
</url>
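
One thing the partial above does not do is escape its values. The sitemap protocol requires URLs to be entity-escaped, so a loc containing an ampersand would produce invalid XML as written. In the view script this could be handled with the view's escape() method, for example $this->escape($this->loc). The plain PHP sketch below shows the same idea outside the framework so that it can be run on its own. The function name renderUrlItem() and the sample URL are made up for illustration.

```php
<?php
// Render one <url> entry, escaping every value for use in XML.
function renderUrlItem(array $item)
{
    $xml = "<url>\n";
    foreach (array('loc', 'lastmod', 'changefreq', 'priority') as $key) {
        $value = htmlspecialchars($item[$key], ENT_QUOTES, 'UTF-8');
        $xml .= "    <$key>$value</$key>\n";
    }
    return $xml . "</url>\n";
}

echo renderUrlItem(array(
    'loc'        => 'http://www.example.com/example.php?parameter1=value1&parameter2=value2',
    'lastmod'    => '2009-04-02T11:34:48+00:00',
    'changefreq' => 'daily',
    'priority'   => '1.0',
));
```

The ampersand in the loc value comes out as &amp;amp; in the generated XML, which is what the sitemap format expects.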

I assumed that everything would be working nicely now, but when I went to a browser and tried to find sitemap.xml I was presented with a message that said sitemap.xml was an invalid controller.

Just to test, I added a redirect flag to the end of the RewriteRule to make sure that the rule worked.

RewriteRule ^(.*)sitemap.xml$ /sitemap/index [R=301,L]

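This redirected to the correct place, so the rule itself was fine and it must have been Zend Framework that was causing the error. The likely reason is that an internal rewrite does not change the REQUEST_URI value that PHP sees, and the Zend Framework router works from that original request, so it was still looking for a controller called sitemap.xml. After a bit of thinking I realised that I could create a route that would send the call for the missing sitemap.xml controller to the existing Sitemap controller. Here is the route I created; just add it to your bootstrap file.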

$router = $frontController->getRouter();
$router->addRoute(
    'manageSitemap',
    new Zend_Controller_Router_Route('sitemap.xml', array('controller'=>'sitemap','action'=>'index'))
);

Navigating to sitemap.xml now shows me the output of the Sitemap controller.

Redirect One Directory To Another With .htaccess

May 19th, 2008 No comments

To stop access to a directory (and anything in that directory) all you need is a simple RewriteRule.

RewriteEngine on
RewriteBase /
RewriteRule ^exampledirectory/(.*)$ / [R=301,L]

In this example, if this .htaccess file resides in the root directory of the site and you try to access anything within /exampledirectory you will be redirected back to the root folder. To redirect to another folder (like anotherdirectory) on your web server use the following rule.

RewriteEngine on
RewriteBase /
RewriteRule ^exampledirectory/(.*)$ /anotherdirectory [R=301,L]
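
Note that the rule above sends every request to /anotherdirectory itself, whatever file was asked for. If the files have simply moved, a variant that carries the rest of the path across with the $1 back-reference may be more useful. This is only a sketch, using the same example directory names.

```apache
RewriteEngine on
RewriteBase /
# $1 holds whatever followed exampledirectory/ in the request
RewriteRule ^exampledirectory/(.*)$ /anotherdirectory/$1 [R=301,L]
```

With this in place a request for /exampledirectory/page.html would be redirected to /anotherdirectory/page.html.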

Categories: Apache

Using mod_rewrite On Form Parameters

March 5th, 2008 11 comments

Using mod_rewrite on websites is fairly straightforward and can create some lovely looking URL structures. Instead of having a URL that contains lots of odd looking parameters like this:
http://www.example.com/example.php?parameter1=value1&parameter2=value2

You can use a .htaccess file to rewrite the URL on the server side in order to shorten this to something like this:
http://www.example.com/p-value1

In this case the value of parameter2 will always be value2, so we can just include that in the rewrite rule, which would look something like the following. $1 is a back-reference to the first parenthesized value matched in the RewriteRule.
RewriteRule ^p-(.*)$ /example.php?parameter1=$1&parameter2=value2 [L]

Remember to ensure that the FollowSymLinks option is enabled and that the rewrite engine is turned on before starting your rewrite rules. FollowSymLinks should have been enabled by your server administrator, but you can include it here just in case it hasn’t.
Options +FollowSymLinks
RewriteEngine On

You can also make sure that mod_rewrite is actually installed by enclosing all of this in an IfModule block. This will stop the server throwing an error if you don’t have mod_rewrite.
<IfModule mod_rewrite.c>
Options +FollowSymLinks
RewriteEngine On
</IfModule>

What PHP sees on the server side is exactly the same as normal so you can retrieve the parameters with a standard $_GET lookup.
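
To illustrate, after the rewrite above a request for /p-value1 reaches example.php with the query string parameter1=value1&parameter2=value2. The sketch below uses parse_str() to stand in for the $_GET array that PHP would build from that string, so that it can be run from the command line. In the real script you would simply read $_GET.

```php
<?php
// Query string produced by the RewriteRule for a request to /p-value1.
$queryString = 'parameter1=value1&parameter2=value2';

// parse_str() fills $params the same way PHP fills $_GET.
parse_str($queryString, $params);

echo $params['parameter1'], "\n"; // value1
echo $params['parameter2'], "\n"; // value2
```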

However, the default behaviour of forms messes this up. Let’s say that we had a search page and had created a rewrite rule for it so that the URL read like this:
http://www.example.com/s-value1

This is rewritten to the page search.php, and the value after s- is passed to that page as a parameter. The problem here is if we call the same page through a form. Here is an example search form.
<form action="search.php" method="get">
<input type="text" name="q" value="" />
<input type="submit" name="s" value="Search" />
</form>

When this is run with the string "test" the URL looks like this.
search.php?q=test&s=Search

This is the default browser behaviour, but it still messes up the nice URL structure created previously. There is a way to fix this. Have a look at the following .htaccess file.

<IfModule mod_rewrite.c>
Options +FollowSymLinks
RewriteEngine On
 
RewriteCond %{REQUEST_URI} /search.php$
RewriteCond %{QUERY_STRING} ^q=([A-Za-z0-9\+]+)&s=Search$
RewriteRule ^(.*)$ /s-%1? [R=301,L]
 
RewriteRule ^s-(.*)$ /search.php?q=$1&s=Search&a=1 [L]
</IfModule>

Here we are using the RewriteCond directive which allows us to test for certain conditions. In this case we have two conditions.
RewriteCond %{REQUEST_URI} /search.php$
RewriteCond %{QUERY_STRING} ^q=([A-Za-z0-9\+]+)&s=Search$

The first condition allows us to only run the rule on the page search.php. This stops any annoying confusion if we want to pass a similar query string to a different page. The second condition allows us to test the query string to see if it contains the parameters we are looking for. The %{QUERY_STRING} bit is a reference to the actual query string passed, minus the question mark at the beginning. In this case we want to trap the parameter q with any value and the parameter s with the value of Search. The dollar sign at the end is very important, but I’ll come back to that.

The first rewrite rule redirects the page search.php to the URL /s- followed by the search term. The %1 is a back-reference to the first parenthesized value matched in the most-recently-matched RewriteCond, and the question mark at the end of the target drops the original query string from the redirected URL.
RewriteRule ^(.*)$ /s-%1? [R=301,L]

We then also need to include a rewrite rule that will recognise the new URL and act on it. However, what we don’t want to do is confuse the server and put it into an endless loop, which is quite easy since we are redirecting from search.php to search.php. So what we do is include the parameter "a" at the end of our rewrite rule with the value of 1.
RewriteRule ^s-(.*)$ /search.php?q=$1&s=Search&a=1 [L]

Going back to the second rewrite condition above, we included a dollar sign at the end of the pattern. This means that the query string has to end there, so if anything else comes after s=Search the rewrite condition will return false. So although we don’t actually use the "a" parameter in our script, it is needed there to stop the server going into an infinite loop of redirects.
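
The effect of that dollar sign can be checked outside Apache. Apache 2 uses PCRE regular expressions in mod_rewrite, so PHP's preg_match() should give the same answer for this pattern. This is a small sketch with made-up query strings, not something that needs to go on the site.

```php
<?php
// Same pattern as the second RewriteCond, wrapped in PHP regex delimiters.
$pattern = '/^q=([A-Za-z0-9\+]+)&s=Search$/';

$tests = array(
    'q=test&s=Search',      // submitted by the form
    'q=test&s=Search&a=1',  // produced by the internal rewrite
    'q=two+words&s=Search', // spaces arrive as plus signs
    'q=c%2B%2B&s=Search',   // percent-encoded characters
);

foreach ($tests as $qs) {
    $result = preg_match($pattern, $qs, $m)
        ? "redirect to /s-{$m[1]}"
        : 'no redirect';
    echo $qs, ' => ', $result, "\n";
}
```

The second string is not matched, which is what breaks the loop. The last one is not matched either, because the percent sign is not in the character class, so a search containing punctuation would stay on the plain search.php URL rather than being redirected.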

For more information on mod_rewrite and other .htaccess examples have a look at the excellent tutorial at Ask Apache.