In this post, I'll try to explain how URLs (links to items, archives, ...) are built in Nucleus. It turned out to be quite a lengthy article. I'll start in a world without fancy URLs, then explain how fancy URLs work and highlight some problems.
$CONF['Self']
Everything starts with $CONF['Self']. You'll find this variable defined in any access point that can be used to display a Nucleus website (index.php, atom.php, xml-rss2.php, ...). It is almost always equal to the name of the file. When fancy URLs are used, however, it is recommended to change the $CONF['Self'] value to the base URL for the site (see further below).
There should be only one non-admin area scenario where $CONF['Self'] is empty: when action.php is called to perform some action (adding a comment, sending a message to a user, calling a plugin). action.php has no knowledge of its caller, and therefore most actions in there either redirect back to serverVar('HTTP_REFERER') (the referring URL) or get passed an explicit URL to redirect to.
$CONF['IndexURL']
Some functions use $CONF['IndexURL']. This is the Site URL that is defined on the global settings page. Other globally defined URLs are available as well (IndexURL, AdminURL, MediaURL, PluginURL, SkinsURL and ActionURL). For more info, see the documentation on the nucleus_config table.
$CONF['ItemURL'] etc...
Somewhere in globalfunctions.php, you'll find this section:
$CONF['ItemURL'] = $CONF['Self'];
$CONF['ArchiveURL'] = $CONF['Self'];
$CONF['ArchiveListURL'] = $CONF['Self'];
$CONF['MemberURL'] = $CONF['Self'];
$CONF['SearchURL'] = $CONF['Self'];
$CONF['BlogURL'] = $CONF['Self'];
$CONF['CategoryURL'] = $CONF['Self'];
These are the preferred variables to use when generating URLs. They were introduced because of a feature request: a user wanted to use separate files for each of the page types (e.g. item.php for detailed items, archive.php for an archive, etc.). This way, he could more easily track the number of visitors on his site.
To take advantage of this, one's config.php might look like this:
...
// include libs
include($DIR_LIBS.'globalfunctions.php');
// override settings
$CONF['ItemURL'] = 'item.php';
$CONF['ArchiveURL'] = 'archive.php';
...
It could also be done in each of the separate files, after inclusion of config.php.
URL building functions
Onwards to the actual link building: there are a couple of functions taking care of this:
- createItemLink($itemid, $extra = '')
- createMemberLink($memberid, $extra = '')
- createCategoryLink($catid, $extra = '')
- createArchiveListLink($blogid = '', $extra = '')
- createArchiveLink($blogid, $archive, $extra = '')
- createBlogLink($url, $params)
Most of these are pretty obvious (except for the optional $extra parameter): they'll create a link based on the appropriate $CONF['XXXXXURL'] variable, using the current path mode setting (fancy URLs or not). For now, let's stick to 'regular' URLs. The differences for fancy URLs are covered further below.
Let's look at what the following code will do:
$params = array('catid' => 5);
$url = createItemLink(1234, $params);
First, createItemLink will take $CONF['ItemURL'] and append ?itemid=1234 at the end. Then, this generated URL is passed to the addLinkParams function, together with the $params array. This will add a &key=value part for each of the mappings in the array. The result will be a complete URL to use.
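The two steps described above can be sketched roughly as follows. This is a minimal stand-in, not the actual Nucleus code: sketchItemLink and sketchAddLinkParams are hypothetical names, the base URL is passed in as an argument instead of being read from $CONF['ItemURL'], and the URL-encoding of keys and values is my own assumption.

```php
<?php
// Hypothetical stand-in for addLinkParams in regular (non-fancy) mode:
// appends one &key=value part per entry in $params.
function sketchAddLinkParams(string $link, array $params): string
{
    foreach ($params as $key => $value) {
        // Encoding is an assumption; it keeps odd characters from breaking the URL.
        $link .= '&' . rawurlencode((string) $key) . '=' . rawurlencode((string) $value);
    }
    return $link;
}

// Hypothetical stand-in for createItemLink in regular mode.
// $itemUrl plays the role of $CONF['ItemURL'].
function sketchItemLink(string $itemUrl, int $itemid, array $params = []): string
{
    $link = $itemUrl . '?itemid=' . $itemid;
    return sketchAddLinkParams($link, $params);
}

echo sketchItemLink('index.php', 1234, ['catid' => 5]), "\n";
// prints: index.php?itemid=1234&catid=5
```

With $CONF['ItemURL'] overridden to 'item.php', the same call would yield item.php?itemid=1234&catid=5, which is the point of having a separate variable per page type.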
Actually, there's one function I didn't mention in the previous list: createBlogidLink($blogid, $extra = ''). It is used by the BLOG class to generate a URL based on the blog URL (the one defined in the settings of the weblog). This URL is only used in one place: the generation of a category list (it's the <%blogurl%> variable).
In come FancyURLs
The concept of fancy URLs relies on the ability of Apache to stop parsing a URL at the moment it finds a file it can return. The part of the URL that is 'ignored' ends up in the PATH_INFO server variable, and can be accessed from PHP. If, for example, we have a URL http://example.com/index.php/item/1234, the index.php script will be executed, and PATH_INFO will contain /item/1234.
When Nucleus is in Fancy URL mode, it executes some code to parse this PATH_INFO into internal variables. It's located in globalfunctions.php and looks like this:
if ($CONF['URLMode'] == 'pathinfo') {
    $data = explode("/",serverVar('PATH_INFO'));
    for ($i=0;$i<sizeof($data);$i++) {
        switch ($data[$i]) {
            case 'item': // item/1
                $i++;
                if ($i<sizeof($data)) $itemid = intval($data[$i]);
                break;
            ...
            case 'catid':
                $i++;
                if ($i<sizeof($data)) $catid = intval($data[$i]);
                break;
        }
    }
}
What basically happens: the code loops through the key/value pairs present in the pathinfo and fills variables like $itemid and $catid, which would normally be filled using info from the query string (?itemid=1234&catid=5).
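To make the loop easier to try out in isolation, here is a self-contained sketch of the same idea. It is not the Nucleus code: sketchParsePathInfo is a hypothetical helper, it takes the PATH_INFO string as an argument instead of calling serverVar(), it returns an array instead of setting globals, and it only knows the two keywords shown in the excerpt above.

```php
<?php
// Hypothetical helper: turns "/item/1234/catid/5" into
// ['itemid' => 1234, 'catid' => 5].
function sketchParsePathInfo(string $pathInfo): array
{
    // Assumed keyword => variable mapping; only the two cases from the excerpt.
    $map = ['item' => 'itemid', 'catid' => 'catid'];
    $vars = [];
    $data = explode('/', $pathInfo);

    for ($i = 0; $i < count($data); $i++) {
        $key = $data[$i];
        if (!isset($map[$key])) {
            continue; // unknown segment (including the empty one before the first slash)
        }
        $i++; // move to the value that follows the keyword
        if ($i < count($data)) {
            $vars[$map[$key]] = intval($data[$i]);
        }
    }
    return $vars;
}

print_r(sketchParsePathInfo('/item/1234/catid/5'));
```

As in the original, a keyword at the very end of the path (such as /item with nothing after it) is simply ignored, and non-numeric values become 0 through intval().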
Besides accepting fancy URLs, Nucleus must also generate URLs in the same format. Hence the URL generating functions listed above (createItemLink etc.) behave differently when Nucleus is set up to use fancy URLs. If we look back at our earlier example, createItemLink will still use $CONF['ItemURL'], but will now add /item/1234 instead. In the same way, addLinkParams will add /key/value pairs.
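The difference between the two modes can be illustrated with a small sketch. Again, this is an assumption-laden illustration and not the real implementation: sketchModeItemLink is a hypothetical function, and the base URL and mode are passed in as arguments where Nucleus would read $CONF['ItemURL'] and $CONF['URLMode'].

```php
<?php
// Hypothetical function: builds an item link in either URL mode.
// $mode is 'pathinfo' for fancy URLs; anything else means regular URLs.
function sketchModeItemLink(string $itemUrl, string $mode, int $itemid, array $params = []): string
{
    if ($mode === 'pathinfo') {
        // Fancy mode: /item/1234 followed by /key/value pairs.
        $link = $itemUrl . '/item/' . $itemid;
        foreach ($params as $key => $value) {
            $link .= '/' . rawurlencode((string) $key) . '/' . rawurlencode((string) $value);
        }
        return $link;
    }

    // Regular mode: ?itemid=1234 followed by &key=value pairs.
    $link = $itemUrl . '?itemid=' . $itemid;
    foreach ($params as $key => $value) {
        $link .= '&' . rawurlencode((string) $key) . '=' . rawurlencode((string) $value);
    }
    return $link;
}

echo sketchModeItemLink('index.php', 'pathinfo', 1234, ['catid' => 5]), "\n";
// prints: index.php/item/1234/catid/5
echo sketchModeItemLink('index.php', 'normal', 1234, ['catid' => 5]), "\n";
// prints: index.php?itemid=1234&catid=5
```

The mode name 'normal' is just a placeholder for "not pathinfo" in this sketch.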
Fancy URLs without .php extension
With the explanation above, fancy URLs are possible, but they still don't look very pretty: they'll look like http://example.com/index.php/item/1234 while we would like http://example.com/item/1234.
The solution lies in a little trickery: we create an extensionless item file containing PHP code, and tell Apache that this file needs to be executed by the PHP engine. This is done through the following code in .htaccess:
<FilesMatch "^item$">
ForceType application/x-httpd-php
</FilesMatch>
The item file itself looks like this:
<?php
include('./fancyurls.config.php');
include('./config.php');
$data = explode("/",serverVar('PATH_INFO'));
$itemid = intval($data[1]);
selector();
?>
You'll notice that it's nearly identical to the normal index.php, except for the inclusion of fancyurls.config.php and some pathinfo related code.
The only thing fancyurls.config.php does is set $CONF['Self'] to the site URL (http://example.com). It's important to note that the URL does NOT have a trailing slash. This is for a specific reason: createItemLink adds /item/1234, which would lead to a double slash. One could argue that the URL building functions should not add the slash, and that the slash should be in $CONF['Self'] instead. However, this would break the index.php/item/1234 scenarios (without the .htaccess and special files).
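The double-slash problem is easy to reproduce. The sketch below shows the naive concatenation next to a defensive variant that strips a trailing slash first. Both functions are hypothetical; as far as this article goes, Nucleus relies on $CONF['Self'] being configured without the slash and does not normalise it.

```php
<?php
// Hypothetical: plain concatenation, as described in the text above.
function sketchNaiveFancyLink(string $self, int $itemid): string
{
    return $self . '/item/' . $itemid;
}

// Hypothetical defensive variant: tolerates a trailing slash in $self.
// This normalisation is my own addition for illustration.
function sketchSafeFancyLink(string $self, int $itemid): string
{
    return rtrim($self, '/') . '/item/' . $itemid;
}

echo sketchNaiveFancyLink('http://example.com/', 1234), "\n";
// prints: http://example.com//item/1234
echo sketchSafeFancyLink('http://example.com/', 1234), "\n";
// prints: http://example.com/item/1234
echo sketchSafeFancyLink('index.php', 1234), "\n";
// prints: index.php/item/1234
```

The last call shows why the slash has to come from the link builder: with $CONF['Self'] set to index.php there is nothing else that could supply it.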
The code involving PATH_INFO gets the itemid out of the URL immediately. This is needed because the pathinfo will now be /1234 instead of /item/1234. Any other parameters passed in the URL will still be handled by the code in globalfunctions.php.
Problems when an absolute URL is needed
In some scenarios, URL building can be a problem. One example is RSS/Atom feeds: they need an absolute URL to be able to provide a valid link to the item. There are several ways to do this:
- Set $CONF['Self'] in atom.php to the absolute URL, and use <%itemlink%> in the templates.
- Use something like <%blogurl%>index.php?itemid=<%itemid%> or <%blogurl%>item/<%itemid%> in the templates.
The second solution is what is used in Nucleus release versions, since it's the only way to have things work out of the box.
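In PHP terms, the second option amounts to something like the sketch below. The function name sketchFeedItemUrl is hypothetical, and so is the normalisation of the trailing slash; the template variants above simply assume that the blog URL ends with a slash.

```php
<?php
// Hypothetical helper: builds an absolute item URL for use in a feed,
// mirroring <%blogurl%>index.php?itemid=<%itemid%> and <%blogurl%>item/<%itemid%>.
function sketchFeedItemUrl(string $blogUrl, int $itemid, bool $fancy): string
{
    // Make sure there is exactly one slash after the blog URL (my own assumption).
    $base = rtrim($blogUrl, '/') . '/';

    return $fancy
        ? $base . 'item/' . $itemid
        : $base . 'index.php?itemid=' . $itemid;
}

echo sketchFeedItemUrl('http://example.com/', 1234, false), "\n";
// prints: http://example.com/index.php?itemid=1234
echo sketchFeedItemUrl('http://example.com/', 1234, true), "\n";
// prints: http://example.com/item/1234
```

Because the blog URL is stored as an absolute URL in the blog settings, the result is absolute no matter what $CONF['Self'] contains.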
Problems with multiple blogs
Another problem area is when multiple blogs are used and a generic skin is used. While /blog/2 will display the correct blog using the skin for that template and links to items of that blog will work fine, a member link like /member/1 from such a page will bring you back to the default blog/skin. To be correct, the link would have to be /member/1/blog/2. It would be the task of parse_authorlink to add the blog parameter, but the question of when it should be added and when it shouldn't needs to be investigated.
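One conceivable rule, purely as an illustration, would be to append the blog parameter only when the current blog is not the default one. The sketch below shows that rule. sketchMemberLink is a hypothetical function, and the rule itself is an assumption of mine, not something Nucleus does or a settled answer to the question above.

```php
<?php
// Hypothetical: fancy-URL member link that keeps the reader on the current blog.
// The "only when not the default blog" rule is an assumed policy for illustration.
function sketchMemberLink(string $self, int $memberid, int $currentBlogId, int $defaultBlogId): string
{
    $link = $self . '/member/' . $memberid;
    if ($currentBlogId !== $defaultBlogId) {
        $link .= '/blog/' . $currentBlogId;
    }
    return $link;
}

echo sketchMemberLink('http://example.com', 1, 2, 1), "\n";
// prints: http://example.com/member/1/blog/2
echo sketchMemberLink('http://example.com', 1, 1, 1), "\n";
// prints: http://example.com/member/1
```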
Posted by karma at 16:59:40. Filed under: Inside Nucleus
