Handle Bad Page Requests in a Web Application

A common error in web applications is failed navigation. In some cases, the user has mistyped a URL; in other cases, the internal navigation of a web site is flawed. In (hopefully) rare cases, users are attempting some shenanigans.

Documenting Bad Requests

Whatever the cause of a bad request, the Istarel Workshop Application Framework (IWAF) traps those failures and invokes ApplicationDelegate->handleBadRequest(). Since I cannot know ahead of time what bad requests might be out there, I begin by logging them and then periodically reviewing the log file.

Partial Listing: /rsrc/ApplicationDelegate.php

function handleBadRequest($page_request)
{
    if (IN_PRODUCTION)
    {
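        $fm = IWFileManager::defaultFileManager();

        // Stripping 'rsrc/' from the resource path presumably leaves the
        // application directory, where the log/ folder lives.
        $log_file = str_replace('rsrc/', '', APPL_RSRC_DIR) . 'log/bad-requests.log';
        // 'H' is the 24-hour clock; 'h' would make 3 AM and 3 PM look identical.
        $message  = date('Y-m-d H:i:s') . ' - ' . $page_request . "\n";

        $fm->appendFile($log_file, $message);

        // Send the user back to the application's starting point.
        header('Location: ' . APPL_ROOT_DIR . $this->startingPoint());
        exit();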
    }
}

All this does is let me document any problems. For example, when I started logging the bad requests for Big Nerd Ranch, there were (generally) three different kinds of page request failures.

First, people were trying to reach the Big Nerd Ranch blog and forums by using URLs like http://www.bignerdranch.com/forums. Not an unreasonable guess, but incorrect: The forums and blog are found via subdomains of bignerdranch.com.

Second, people were trying URLs to classes, but not quite using the correct offering names. For example, the popular Cocoa class at Big Nerd Ranch is reached via http://www.bignerdranch.com/classes/cocoa_i, but many people were trying some variation of http://www.bignerdranch.com/classes/cocoa. Again, not a bad try, but completely inscrutable to the web application.

Third, there are obnoxious malcontents who think they can choose a clever URL and gain illegal access to code or data.
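
Sorting requests into categories like these is easier with a tally of the log. The following is only a minimal sketch of how that review might be automated: tallyBadRequests() is a hypothetical helper, not part of IWAF, and it assumes each log line has the shape "timestamp, separator, request" produced by the handler listed earlier.

```php
<?php
// Hypothetical helper (not part of IWAF): count how often each bad request
// appears in the log, most frequent first.
// Assumes each line looks like "<Y-m-d H:i:s> <separator> <request>".
function tallyBadRequests(array $lines): array
{
    $counts = array();

    foreach ($lines as $line) {
        // Timestamp, one separator token, then the requested page.
        $pattern = '/^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\s+\S+\s+(.+)$/';
        if (!preg_match($pattern, trim($line), $m)) {
            continue; // skip blank or malformed lines
        }

        // Fold case so "Forums" and "forums" are counted together.
        $request = strtolower($m[1]);
        $counts[$request] = ($counts[$request] ?? 0) + 1;
    }

    arsort($counts); // highest count first, keys preserved

    return $counts;
}

// Example use (the path is an assumption):
// print_r(tallyBadRequests(file('log/bad-requests.log')));
```

Folding case before counting is a design choice here; drop the strtolower() call if the application treats page requests as case-sensitive.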

Apache's mod_rewrite

For URLs with any kind of pattern to them, Apache's mod_rewrite is an excellent way to redirect a bad request before it becomes a bad request. My base mod_rewrite implementation simply ensures that the application framework handles all script-based requests.

RewriteEngine On
RewriteRule !\.(js|ico|gif|jpg|png|css|pdf|xml)$ index.php

The forum and blog requests very much represent a simple pattern. Namely, if the request begins with some variation of "forum" or "blog", then I should redirect the user to the appropriate subdomain. Because the target is an absolute URL on another host, Apache answers with an external redirect. To be a bit more versatile, I will also ignore the case of any request (hence the bracketed NC). Whenever you intend to redirect a user, you want to append [L] to the rewrite rule, which tells Apache: Do not apply any later rules!

RewriteEngine On
RewriteRule ^(phpbb|forum|board) http://forums.bignerdranch.com [NC,L]
RewriteRule ^blog http://weblog.bignerdranch.com [NC,L]
RewriteRule !\.(js|ico|gif|jpg|png|css|pdf|xml)$ index.php
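
Since mod_rewrite patterns are Perl-compatible regular expressions, the same expressions can be tried out in PHP before the server configuration is touched. The sketch below is only an approximation of the three rules above, not a replacement for testing against Apache itself. previewRewrite() is a hypothetical name, and the code assumes per-directory (.htaccess) context, where the path is matched without its leading slash.

```php
<?php
// Hypothetical preview of the three rules above, applied in the same order.
// Assumes .htaccess context: the path arrives without a leading slash.
function previewRewrite(string $path): string
{
    // [NC] corresponds to the "i" (case-insensitive) modifier.
    if (preg_match('/^(phpbb|forum|board)/i', $path)) {
        return 'http://forums.bignerdranch.com'; // [L]: no later rule runs
    }

    if (preg_match('/^blog/i', $path)) {
        return 'http://weblog.bignerdranch.com'; // [L]: no later rule runs
    }

    // The leading "!" in the RewriteRule negates the pattern: everything
    // that is NOT a static asset is handed to index.php.
    if (!preg_match('/\.(js|ico|gif|jpg|png|css|pdf|xml)$/', $path)) {
        return 'index.php';
    }

    return $path; // static asset, left untouched
}
```

Running a handful of logged requests through a function like this also exposes a property of the patterns: they are anchored only at the start, so a request such as "boardgames" would be sent to the forums as well.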

Specific Redirects

One module I always build into my applications is a redirect facility. This enables me to have modules reachable via SEO (and user) friendly URLs. So, instead of http://www.bignerdranch.com/OfferingView?id=1 (other URLs might have far more complex query string parameters), the Cocoa page at Big Nerd Ranch is reached via http://www.bignerdranch.com/classes/cocoa_i.

I could use mod_rewrite here as well, putting a series of specific rewrite rules in place to handle mistaken URLs. Here, [PT] (pass through) tells Apache to hand the rewritten URI back to its URL-mapping stage, so that later translators such as Alias still get to process it.

RewriteEngine On
RewriteRule cocoa_i$ classes/cocoa_i [NC,PT]
RewriteRule !\.(js|ico|gif|jpg|png|css|pdf|xml)$ index.php

Instead, however, for these specific kinds of redirects, I create a redirect entry in the administrator modules for the application.
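
The redirect module's storage and schema are not shown in this article, so the following is only a sketch of the idea, assuming each entry maps a friendly path to a module name and its query parameters. The $redirects array, the resolveRedirect() function, and the second (mistyped) entry are all hypothetical.

```php
<?php
// Hypothetical redirect table: friendly path => module and parameters.
// In the real application the entries come from the administrator module;
// this array merely stands in for them.
$redirects = array(
    'classes/cocoa_i' => array('module' => 'OfferingView', 'params' => array('id' => 1)),
    // An extra entry catches the common near-miss seen in the log.
    'classes/cocoa'   => array('module' => 'OfferingView', 'params' => array('id' => 1)),
);

// Returns the matching entry, or null when the path has no redirect.
function resolveRedirect(string $path, array $redirects): ?array
{
    // Normalize: ignore case and any leading or trailing slashes.
    $key = strtolower(trim($path, '/'));

    return $redirects[$key] ?? null;
}
```

Keeping these entries as data rather than as rewrite rules means a new near-miss from the log can be handled without editing the Apache configuration.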

Forbidden Requests

Finally, I like to use the forbiddenViews() method of the ApplicationDelegate. It returns an array of page requests that the framework silently drops, returning the user to the designated starting point for the application (typically the home page).

Partial Listing: /rsrc/ApplicationDelegate.php

function forbiddenViews()
{
    return array('track.php');
}
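
How the framework compares an incoming request against this array is not shown here. As a rough illustration only, a check of this kind could look like the sketch below. isForbiddenRequest() is a hypothetical function, and matching on the final path segment without regard to case is an assumption, not documented IWAF behavior.

```php
<?php
// Hypothetical check; the framework's real logic is not shown in this article.
function isForbiddenRequest(string $page_request, array $forbidden_views): bool
{
    // Assumption: compare only the final path segment, case-insensitively,
    // so "/stats/Track.php" is caught by an entry of "track.php".
    $view = strtolower(basename($page_request));

    return in_array($view, array_map('strtolower', $forbidden_views), true);
}
```

A caller would pass the result of forbiddenViews() as the second argument and, on a match, send the user to the starting point instead of dispatching the request.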

Though not shown here, a page request can also be modified by the ApplicationDelegate, which allows for more sophisticated handling of inappropriate requests. That modifyPageRequest() method is also how I make use of the data from the redirect module to invoke the appropriate application module.