MediaWiki/archive/customizing/URLs

Revision as of 20:33, 26 October 2005 by Woozle (talk | contribs) (→‎Finally: note about weird links to uncreated pages)

computing: software: web: MediaWiki: Customization: Shortening MediaWiki URLs

There are (at least) two "standard" ways of prettifying MediaWiki URLs, documented here. One uses Apache's mod_rewrite. Although it probably works well (I haven't tried it), it just seemed aesthetically unappealing, from a design point of view.

There's also another way of doing it if you have access to httpd.conf or .htaccess. It's fairly tidy and quite flexible, though I don't know how much additional load it puts on the server (see brief discussion at the end).

Using a 404 Handler

This method uses the 404 (missing page) redirect mechanism: a standard /index.php/ request is handled by the standard code (in index.php), but any other URL which doesn't correspond to an existing page (within the wiki or not) is handled by a modified copy of index.php. For any given "nonexistent" URL of the form "http://yourdomain.com/nonexistent/page", the code returns the wiki page entitled "Nonexistent/page", with the "nonexistent" URL displayed as the URL for that page.
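The URL-to-title mapping described above can be sketched in isolation. This is only an illustration, not the actual MediaWiki code: the function name titleFromRequestUri is hypothetical, and the capitalization step is an assumption that mimics the first-letter capitalization the wiki applies to titles (see the caveats further down).

```php
<?php
// Hypothetical helper: turn the path of a "missing" URL into a wiki page title.
// Assumption: the wiki capitalizes the first letter of every title.
function titleFromRequestUri($requestUri) {
    // Strip leading slashes and spaces, then undo URL encoding.
    $path = rawurldecode(ltrim($requestUri, " /"));
    return ucfirst($path);
}

echo titleFromRequestUri('/nonexistent/page'), "\n"; // Nonexistent/page
```

In the real handler the input would come from $_SERVER['REQUEST_URI'], which still holds the originally requested URL when the 404 handler runs.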

There is also a feature wherein you can create a page called MediaWiki:your/url/here and it will redirect to an article whose title is the contents of that page. For example: http://wiki.vbz.net/Currentevents is redirected to vbzwiki:Current events because the page vbzwiki:MediaWiki:Currentevents contains the text "Current events".
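The lookup order behind this redirect feature can be sketched as follows. The $overrides array is a hypothetical stand-in for the wiki's MediaWiki: pages, and resolveTitle is an illustrative name; the real code consults the wiki's message pages rather than an array.

```php
<?php
// Hypothetical stand-in for the wiki's "MediaWiki:" pages: page name => page text.
$overrides = array('Currentevents' => 'Current events');

// Return the article title to load for a requested path.
function resolveTitle($requested, $overrides) {
    // If an override page exists and is non-empty, its contents name the target article.
    if (isset($overrides[$requested]) && '' != trim($overrides[$requested])) {
        return trim($overrides[$requested]);
    }
    // Otherwise the requested path itself is used as the title.
    return $requested;
}

echo resolveTitle('Currentevents', $overrides), "\n";   // Current events
echo resolveTitle('Some/other/page', $overrides), "\n"; // Some/other/page
```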

First, in the main .htaccess file (or in httpd.conf if you prefer), assign a location to handle 404 errors such that a PHP file will be loaded -- either of these will do, for example:

ErrorDocument 404 /errors/404/
ErrorDocument 404 /wiki404error.php

In the first instance, your modified index.php file would go in /errors/404/; in the second, it would be renamed wiki404error.php and go in the same folder as the normal index.php.

The remaining instructions depend on which MediaWiki version you are using.

Version 1.4

These instructions were made from changes that actually worked, but I may have left out some steps. I was more careful when I did the changes for version 1.5, so if these don't work, check the version 1.5 instructions for anything missing.

Second: Make the changes indicated in the 404-handling copy of index.php:

if ( '' == $title && 'delete' != $action ) {
## 2005-06-19 Woozle mods for "missing" page
	# title not passed in parameter; use REQUEST_URI from environment
	$title = rawurldecode(ltrim($_SERVER['REQUEST_URI'], " /"));
	# see if there's a page designated for this URI
	$wgTitle = Title::newFromText( wfMsgForContent( $title ) );
	if ('' == $wgTitle) {
		$wgTitle = Title::newFromText( $title );
	}
## end Woozle mods
} elseif ( $curid = $wgRequest->getInt( 'curid' ) ) {
	/* redirect to canonical url, make it a 301 to allow caching */
	$wgOut->setSquidMaxage( 1200 );
# 2005-06-21 Woozle mods to allow 404 page to summon wiki page without redirecting
#	$wgOut->redirect( $wgTitle->getFullURL(), '301');
	$wgArticle = new Article( $wgTitle );
#  	$mainText = $wgOut->parse( $wgArticle->getContent( false ) );
#	echo $mainText;
	$wgArticle->view();
# end Woozle mods
} else if ( Namespace::getSpecial() == $wgTitle->getNamespace() ) {

Version 1.5

Second: You will need to copy includes/Defines.php and LocalSettings.php into the same folder as the modified index.php. There's probably a better way, but that's what I was able to get working.

Third: Make the changes indicated in the 404-handling copy of index.php:

  • First change - need to point to the copied Defines.php:
# 2005-10-25 Woozle - for 404 handling
require_once( './Defines.php' );
  • Second change - this is optional, but it cleans up the file a lot:
# 2005-10-25 Woozle - config code removed because it will never be executed
#if( !file_exists( 'LocalSettings.php' ) ) {
# ...
#}
  • Third change - this pulls in the title-request from the error URI:
# Query string fields
# 2005-10-25 Woozle - 404 support - parameters have to be parsed from $_SERVER instead of $_REQUEST
	$raw_uri = rawurldecode(ltrim($_SERVER['REQUEST_URI'], " /"));
	$arr_uri = explode('?',$raw_uri);
	$title = $arr_uri[0];
	$uri_qry = isset($arr_uri[1]) ? $arr_uri[1] : '';	# empty when the URL has no "?"
	parse_str($uri_qry,$_REQUEST);
	$action = $wgRequest->getVal( 'action', 'view' );
	$title_force = $wgRequest->getVal( 'title' );
	if ('' != $title_force) {
		$title = $title_force;
	}
# 2005-10-25 END
  • Fourth change - optional and untested - allow title redirection
if ( '' == $title && 'delete' != $action ) {
	$wgTitle = Title::newFromText( wfMsgForContent( 'mainpage' ) );
# 2005-10-26 Woozle - 404 support - optional redirect based on "mediawiki:articlename"
	if ('' == $wgTitle) {
		$wgTitle = Title::newFromText( $title );
	}
# 2005-10-26 END
  • Fifth change - I'm actually not sure if this is necessary, but don't have time to test uncommenting it:
# 2005-10-25 Woozle - for 404 handling - block out redirection code
# was -- if ((action is explicitly "view") AND (title is not passed as param) OR (title is not in canonical form) AND ??
#} else if ( ( $action == 'view' ) && 	(!isset( $_GET['title'] ) || $wgTitle->getPrefixedDBKey() != $_GET['title'] ) && !count( array_diff( array_keys( $_GET ), array( 'action', 'title' ) ) ) )
#{
#	/* redirect to canonical url, make it a 301 to allow caching */
#	$wgOut->setSquidMaxage( 1200 );
#	$wgOut->redirect( $wgTitle->getFullURL(), '301');
} else if ( NS_SPECIAL == $wgTitle->getNamespace() ) {
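
The parsing done in the third change above can be sketched as a standalone function, which may make it easier to test outside the wiki. The name parseMissingUri and the return layout are hypothetical; unlike the modified index.php, this sketch returns the parsed values instead of overwriting $_REQUEST.

```php
<?php
// Hypothetical helper: split a raw REQUEST_URI into a page title, an action,
// and the remaining query parameters.
function parseMissingUri($requestUri) {
    // Split path from query before decoding, so an encoded "?" stays in the title.
    $parts = explode('?', ltrim($requestUri, " /"), 2);
    $title = rawurldecode($parts[0]);
    $params = array();
    // The query part is absent when the URL has no "?".
    if (isset($parts[1])) {
        parse_str($parts[1], $params);
    }
    // An explicit title= parameter overrides the path.
    if (isset($params['title']) && '' != $params['title']) {
        $title = $params['title'];
    }
    // Default to viewing the page when no action is given.
    $action = isset($params['action']) ? $params['action'] : 'view';
    return array($title, $action, $params);
}

list($title, $action) = parseMissingUri('/Some/page?title=Other_page&action=edit');
echo "$title $action\n"; // Other_page edit
```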

Finally

Finally, put the modified index.php where it will be the page used to handle 404 errors.

Caveats:

  • All wiki links on the loaded page will point back to canonical wiki URLs, e.g. http://yourdomain.com/wiki/index.php/Page_Title; to change this, see "Shortening the links" below.
  • Your arbitrary URL will have its first character capitalized before it is displayed as the page's title or used to load another page (if you have set up a MediaWiki: page for it), although the URL shown will remain unchanged.
  • There is probably a lot of excess index.php code which can be stripped out, as it will never be executed in this context
  • URLs ending in slashes appear to be a problem for some namespaces; the wiki code appears to be reading the URL from some place other than the modified code. (This doesn't seem to be a problem for version 1.5.)
  • Some links (e.g. links to other pages not yet created) will show the current page in the URL and the target page in the query ("?title="). The "title=" overrides the URL, so they still work properly in all tests so far -- but I will look towards fixing this.

I suspect none of these things will be difficult to fix, but I am calling it quits for now, as I have already exceeded my time budget for the evening.

CPU load: Obviously it has to do the same URL translation it would normally have to do and then determine that the file doesn't exist, but that shouldn't take any more cycles than locating an existing file; for URLs containing at least one slash, it should be quicker. Given all the processing done by the MediaWiki software for loading "normal" wiki pages, I suspect the difference is negligible.

Shortening the Links

To cause internal links to use the "new style" URL, make the following change in LocalSettings.php:

First change

#$wgScript           = "$wgScriptPath/index.php";
# 2005-10-25 Woozle - to support 404:handling
$wgScript           = "$wgScriptPath";

It's not yet clear whether this ultimately works and some old-style links are just cached for a while, or whether it only fixes links in certain namespaces. Working on that. Also, edit links no longer work properly; working on that too.
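
To illustrate the intended effect of the $wgScript change, here is a rough sketch. It assumes page links are formed by appending "/Page_Title" to $wgScript, which matches the canonical URL shown in the caveats above; articleUrl is a hypothetical helper, not a MediaWiki function, and the "/wiki" install path is an assumption.

```php
<?php
// Hypothetical helper, assuming page links are built as "$wgScript/Page_Title".
function articleUrl($wgScript, $title) {
    // Spaces become underscores in wiki URLs.
    return $wgScript . '/' . str_replace(' ', '_', $title);
}

$wgScriptPath = '/wiki'; // assumed install path

echo articleUrl("$wgScriptPath/index.php", 'Page Title'), "\n"; // /wiki/index.php/Page_Title
echo articleUrl("$wgScriptPath", 'Page Title'), "\n";           // /wiki/Page_Title
```

The second form no longer names index.php, so such a URL only resolves if the 404 handler described above is in place.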

Comments

Please feel free to post comments here or on the Talk page if you try any of these procedures.