One of the most important features of WordPress is its permalink rewrite engine, which creates the pretty links we are all used to, with no query elements such as question marks or ampersands. Have you ever wondered how permalinks work and what you can do to customize them? Read on.

This article explains what happens when WordPress resolves a URL request that uses pretty permalinks. It doesn’t include examples of creating custom permalink structures; I am planning additional articles to cover that.

By default, pretty permalinks are disabled in WordPress. The reason is that in some cases they can’t be activated without user intervention. For pretty permalinks to work, WordPress needs write access to the .htaccess file (on an Apache server), where it adds a few lines that are the basis for its rewrite engine. This also requires the Apache mod_rewrite module, which is part of most Apache installations and is very rarely missing. The code added to the .htaccess file sends every request that doesn’t match an existing file or directory to the WordPress index.php (an internal rewrite, not an HTTP redirect). That lets WordPress break the URL into parts and use regular expressions to detect what content to display.

WordPress also supports PATHINFO permalinks, which don’t require the mod_rewrite module, but permalinks in that case must start with index.php. You can find more information on the WordPress Codex Permalinks page (links are at the end of this article). There are some setup differences between Apache, IIS and other servers, but permalinks can work with any of the currently available servers for Windows, Linux or macOS.

So, the basic step to activate the rewrite engine is to enable permalinks from the WordPress Settings/Permalinks panel. If it is set to Default, permalinks are disabled; with any other value they are active.

Why do you need these permalinks?

Well, there are many reasons. The most important are SEO and the fact that permalinks look much nicer and are more user-friendly than long query strings in the URL, especially if you have a complex website structure. Search engine optimization benefits from permalinks because they contain more relevant information about the content, and most search engines make use of that when indexing websites.

Older versions of WordPress were known to run slower with some permalink structures that required additional SQL queries, but since WordPress 3.3 the most critical permalink structures are optimized, and there is no performance penalty if you use only the post name in the URL. Even with older WordPress versions, performance was affected only by some structures and only with a large number of posts.

Permalinks are a must for a website, and very few WordPress websites don’t use them (mostly default installations, left with permalinks disabled). What to choose for permalinks is another matter.

Default permalinks settings

To start using permalinks quickly, you can select one of the predefined rules from the Permalinks panel. You can also use the Custom Structure field to create the structure you want. What this panel doesn’t say is that those rules apply only to posts (the default post type: post). WordPress allows you to customize the rewrite rules for this post type only, nothing else! Pages in WordPress always use the same rewrite structure: only the sanitized version of the page name. And that is a rule you can’t change. For archives, the Permalinks panel offers only two settings: what to use as a base for category and tag archives.

Besides this, there are plenty more rewrite rules that WordPress will not allow you to change (well, not directly anyway). These rules include: archives for authors, date-based archives, attachments, feeds, custom post types, custom taxonomies and generic archives.

When you create rules on the Permalinks panel, you don’t need to handle regular expressions; instead, you use special tags to form the URL. By default, WordPress provides several structure tags (the full list is in the Codex, check the link at the bottom). If you use these tags in the URL, they will be replaced with actual data. If an element in the URL is not recognized as a tag, it is left as it is. This way you can add static parts to the URL.
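
As a rough illustration of that replacement, here is a minimal sketch in plain PHP. The helper `fill_permalink_structure()` and the sample post data are hypothetical and are not part of WordPress; only the tags `%year%`, `%monthnum%` and `%postname%` are real structure tags.

```php
<?php
// Hypothetical helper, not a WordPress function: fills structure tags with post data.
function fill_permalink_structure(string $structure, array $post): string
{
    // Map each structure tag to the matching value for this post.
    $replacements = [
        '%year%'     => $post['year'],
        '%monthnum%' => $post['monthnum'],
        '%postname%' => $post['postname'],
    ];

    // strtr() swaps known tags and leaves every other character untouched,
    // which is why static parts such as "blog" survive in the final URL.
    return strtr($structure, $replacements);
}

// Sample data, made up for this example.
$post = ['year' => '2012', 'monthnum' => '03', 'postname' => 'hello-world'];

echo fill_permalink_structure('/blog/%year%/%monthnum%/%postname%/', $post), "\n";
```

The static `blog` segment passes through unchanged, while the three tags are filled from the post.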

Most things in WordPress use more than one rewrite rule. To resolve a year-based archive, WP needs separate rules to match the basic year archive URL, the URL with page numbers and the feed URL. Some things need more than three rules; a post needs five.
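
To make the year archive case concrete, here is a small sketch in plain PHP. The three patterns are simplified versions modelled on the kind of rules WordPress generates, and the helper `match_year_rule()` is hypothetical; treat it as an illustration rather than the exact rule set of any WordPress version.

```php
<?php
// Simplified year-archive rules, listed in the order they are tried.
$year_rules = [
    'feed'    => '([0-9]{4})/feed/(feed|rdf|rss|rss2|atom)/?$',
    'paged'   => '([0-9]{4})/page/?([0-9]{1,})/?$',
    'archive' => '([0-9]{4})/?$',
];

// Hypothetical helper: returns the label of the first rule that matches.
function match_year_rule(array $rules, string $request): ?string
{
    foreach ($rules as $label => $regex) {
        // Anchor at the start of the request; "#" avoids escaping "/".
        if (preg_match('#^' . $regex . '#', $request)) {
            return $label;
        }
    }
    return null; // No year rule applies.
}

foreach (['2012/', '2012/page/2/', '2012/feed/rss/'] as $request) {
    echo $request, ' => ', match_year_rule($year_rules, $request), "\n";
}
```

Each of the three URL shapes is caught by a different pattern, which is why one archive type needs several rules.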

Resolving rewrite rules

To resolve a URL, WordPress uses a list of rewrite rules. Rules are based on regular expressions, and each regular expression in the list points to a query-based URL. For instance, here is the rule that resolves an author archive feed:

Regular Expression: author/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$
WP Resolved Query: index.php?author_name=$matches[1]&feed=$matches[2]

I am not going to go into how regular expressions work, but the URL is matched against the regular expressions in the rules list, one after another, until we get a match. When the URL matches an expression, the query for that expression is then used to resolve the requested URL to a query-based URL. In the query in the example above you see $matches[1] and $matches[2]. These are the values the regular expression captures from the requested URL, that is, the URL parts marked with ‘(‘ and ‘)’ in the regular expression. So, if your URL is this:

http://www.example.com/author/admin/feed/rss/

The website URL part is removed, and we get the request:

author/admin/feed/rss/

This is what WordPress then matches against the rewrite rules, and it will match our example expression above. The words ‘admin’ and ‘rss’ are captured by the regular expression, they replace $matches[1] and $matches[2], and the resolved query-based URL is now this:

index.php?author_name=admin&feed=rss

And this is something WordPress can use to prepare the page and load the template and data for it. This final query is used to create the WP Query object that is ultimately used to get the posts for that request.
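
The substitution step can be sketched in a few lines of plain PHP. The helper `resolve_request()` is hypothetical and only a simplified stand-in for what WordPress does internally; the rule and the request are the ones from the author feed example.

```php
<?php
// Hypothetical helper: a simplified stand-in for what WordPress does internally.
function resolve_request(string $regex, string $query, string $request): ?string
{
    // Anchor the rule at the start of the request path;
    // "#" is used as the delimiter so "/" needs no escaping.
    if (!preg_match('#^' . $regex . '#', $request, $matches)) {
        return null; // This rule does not apply to the request.
    }

    // Swap every $matches[N] placeholder for the Nth captured group.
    return preg_replace_callback(
        '/\$matches\[(\d+)\]/',
        static function (array $m) use ($matches): string {
            return $matches[(int) $m[1]] ?? '';
        },
        $query
    );
}

$regex = 'author/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$';
$query = 'index.php?author_name=$matches[1]&feed=$matches[2]';

echo resolve_request($regex, $query, 'author/admin/feed/rss/'), "\n";
```

A request that the rule does not match returns `null`, which corresponds to WordPress moving on to the next rule in the list.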

If every other rewrite rule fails, the last rule always matches. This base rule resolves any request to a page request. If that page is not found, you get a 404 error. If the URL is resolved by some rewrite rule but the WordPress query doesn’t find any matching results, you again get a 404 error.

Custom rewrite rules

If you need to customize the URL for posts, you can do it from the Settings/Permalinks panel. But anything else must be done through code. WordPress 3.0 added support for custom post types, and there is one very important thing missing: custom rewrite rules for them. All custom post types by default have the same URL structure, which includes the name of the post type and the name of the post. The same goes for post type archives and date-based archives. Nothing can be customized without custom coding.

But there is an even more important use for custom permalinks. Each URL is resolved to a query that WordPress uses to get posts. So, a date-based URL and its query filter posts by date. If you request this:

http://www.example.com/2012/03/

You will get all posts published in March 2012. And if you want all posts belonging to the custom post type movie, you can use this:

http://www.example.com/movie/

But what if you want all posts of the custom post type movie for March 2012? You would like the URL to be this:

http://www.example.com/movie/2012/03/

But there is no rule for that in WordPress, so this will not work. Instead of this elegant link you must use this:

http://www.example.com/movie/?year=2012&monthnum=3

Anything that is part of the URL, whether resolved by WordPress or added directly as a query element like here, is used by the WordPress query engine to get posts. There are many cases where additional rewrite rules can make a big difference in getting cleaner links.
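
For illustration, a rule for the movie date archive could be registered by hand roughly like this. It is only a sketch: `add_rewrite_rule()` and `add_action()` are real WordPress functions, but the constants, the function name `example_movie_date_rule()` and the movie post type itself are assumptions made for this example.

```php
<?php
// Assumed names: the "movie" post type and everything prefixed "example_" are illustrative.
const MOVIE_DATE_REGEX = '^movie/([0-9]{4})/([0-9]{1,2})/?$';
const MOVIE_DATE_QUERY = 'index.php?post_type=movie&year=$matches[1]&monthnum=$matches[2]';

function example_movie_date_rule(): void
{
    // 'top' places the rule ahead of the built-in rules so it is tried first.
    add_rewrite_rule(MOVIE_DATE_REGEX, MOVIE_DATE_QUERY, 'top');
}

// Register only when running inside WordPress.
if (function_exists('add_action')) {
    add_action('init', 'example_movie_date_rule');
}

// Outside WordPress, the pattern can still be checked on its own.
preg_match('#' . MOVIE_DATE_REGEX . '#', 'movie/2012/03/', $matches);
echo $matches[1], ' ', $matches[2], "\n";
```

New rules only take effect after the stored rules are rebuilt, for example by saving the Settings/Permalinks panel once.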

The most important case is filtering posts by post type and taxonomy. Normally, you have a URL that filters posts by post type or by taxonomy term. If you want to mix those, you need to use query elements. So, if your movie custom post type has the taxonomies category and genre, to filter by both in the URL you need to use something like this:

http://www.example.com/movie/?category_name=short&genre=animated

But what if you could do it like this:

http://www.example.com/movie/short/animated/

That would be great, and if you have more taxonomies you could list more terms like this, one term for each taxonomy. But, again, this can’t be done without custom coding to implement additional rewrite rules, so you can’t use such links…

GD Custom Posts and Taxonomies Tools Pro

…Oh wait, you can!

Ever since custom post types were added to WordPress, I have used them for many things, and I have always hated the limitations with permalinks. So, I started adding customized rewrite rules to my GD Custom Posts and Taxonomies Tools Pro plugin. The plugin now allows fully customized single post permalinks for custom post types (just like for default posts), date-based archives for custom post types, and archives for the intersection of custom post types and taxonomies. All that without any additional coding, just by setting rules for each custom post type you add through this plugin.

Check out this tutorial with examples to see what can be done with this plugin and custom post types. All this is used on Dev4Press for the Central Documentation and Feature of Day sections to filter posts by terms. I have a few more ideas for additional rewrite rules and a more customizable link structure.
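
For readers who prefer code to a plugin, the post type and taxonomy intersection described earlier could be hand-coded along these lines. This is a sketch under assumptions, not how the plugin works: it assumes a movie post type and a genre taxonomy whose query variable is `genre`, and the constants and function name are made up. `category_name` is the query variable WordPress uses for a category slug.

```php
<?php
// Assumed names: "movie" post type, "genre" taxonomy with query var "genre".
const MOVIE_TERMS_REGEX = '^movie/([^/]+)/([^/]+)/?$';
const MOVIE_TERMS_QUERY = 'index.php?post_type=movie&category_name=$matches[1]&genre=$matches[2]';

function example_movie_terms_rule(): void
{
    add_rewrite_rule(MOVIE_TERMS_REGEX, MOVIE_TERMS_QUERY, 'top');
}

// Register only when running inside WordPress.
if (function_exists('add_action')) {
    add_action('init', 'example_movie_terms_rule');
}

// Outside WordPress: show which query variables the URL would produce.
preg_match('#' . MOVIE_TERMS_REGEX . '#', 'movie/short/animated/', $matches);
$query_vars = [
    'post_type'     => 'movie',
    'category_name' => $matches[1],
    'genre'         => $matches[2],
];
echo http_build_query($query_vars), "\n";
```

Note that a pattern this broad also matches other two-segment URLs under movie/, such as movie/page/2/, so a real rule would need to be narrower or ordered carefully.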

Canonical Redirect

This is a very interesting and (usually) useful feature. Sometimes WordPress can only partially match the URL, depending on the rule. It then finds what the URL should resolve to, generates the proper URL for it, and redirects to it. So, you request one URL, and it gets replaced with another. Say you use the category taxonomy in the URL, with the hierarchy ending in the post name, like this:

http://www.example.com/parent-category/child-category/post-name/

If you make this request:

http://www.example.com/parent-category/post-name/

WordPress will find the post it needs, then determine that the URL is not canonical as the rewrite rules require it to be, generate the full URL and redirect to it. But if you don’t want this to happen, or you have additional rewrite rules for the post, canonical redirect will ignore all that and redirect to the URL it decides is correct. In that case, you must disable canonical redirect. To do that, you can use this PHP code:

remove_filter('template_redirect', 'redirect_canonical');
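
Removing the hook turns canonical redirects off for the whole site. If only some URLs need to be left alone, one possible alternative is the `redirect_canonical` filter, where returning false cancels the redirect for that request. The sketch below assumes the site lives at the domain root and that the URLs to protect start with /movie/; the callback name is hypothetical.

```php
<?php
// Hypothetical callback: cancels the canonical redirect for movie URLs only.
function example_skip_canonical_for_movies($redirect_url, $requested_url)
{
    $path = (string) parse_url($requested_url, PHP_URL_PATH);

    // Returning false tells redirect_canonical() not to redirect.
    if (strpos($path, '/movie/') === 0) {
        return false;
    }

    // Otherwise keep whatever WordPress decided.
    return $redirect_url;
}

// Register only when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('redirect_canonical', 'example_skip_canonical_for_movies', 10, 2);
}
```

All other requests keep the default canonical behaviour, which is usually what you want.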

Conclusion

Handling custom rewrite rules is not easy, and very often you will run into conflicts. If your rule uses the same regular expression as a default rule, it replaces that default rule, and that will most likely break something. When working with rules, it is useful to have a list of all the rules in the system. You can see this list with the GD Press Tools Pro debugger, or you can use a few functions (check the debug rewrite rules tutorial, link at the bottom).
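
As a quick way to look at that list without a plugin, something like the following could be run from a debugging context. It is a sketch, not the method from the linked tutorial: `get_option('rewrite_rules')` is the real option that stores the compiled rules, while `example_format_rules()` is a hypothetical helper, and the fallback sample is the author feed rule from earlier in this article.

```php
<?php
// Hypothetical helper: formats a rules array (regex => query) as plain text.
function example_format_rules(array $rules): string
{
    $lines = [];
    foreach ($rules as $regex => $query) {
        $lines[] = $regex . ' => ' . $query;
    }
    return implode("\n", $lines);
}

if (function_exists('get_option')) {
    // Inside WordPress: the compiled rules are stored in the rewrite_rules option.
    $rules = get_option('rewrite_rules');
    $rules = is_array($rules) ? $rules : []; // Empty when permalinks are disabled.
} else {
    // Outside WordPress: a one-rule sample so the script still runs.
    $rules = [
        'author/([^/]+)/feed/(feed|rdf|rss|rss2|atom)/?$'
            => 'index.php?author_name=$matches[1]&feed=$matches[2]',
    ];
}

echo example_format_rules($rules), "\n";
```

The order of the output matters, since the first matching rule wins.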

The WordPress rewriter is very powerful, but by default it uses only a fraction of what it can do: only the basic rules are implemented, and if you need more, you must write code for it. The goal was to keep things simple in the core and to allow freedom to customize things if needed. Not everyone needs such complex rewrite rules, and for many sites the basic ones are enough. If you need more, you have to dig deeper or use specialized plugins.

8 Responses to “How WordPress URL rewriting works?”

  1. EduBoris | April 9, 2012 at 6:34 pm

    Can you tell me how to actually achieve these type of URL structures?
    Is this done on the apache level or php can handle this?

    I would like to have a similar to wordpress structure as you show with /name-of-thing/name-of-post/

    I know this can be found on the open source code but if possible to help me find where it is among the large code base that would be great!

    Great work and thank you in advance.

    • MillaN | April 9, 2012 at 7:32 pm

      I will soon start series of posts on this subject with practical examples for this. But, this is done using WordPress rewriting engine, and it can’t be used for other CMS systems.

  2. Jonathan Head | April 9, 2012 at 11:11 pm

    FWIW, Google itself has already stated, almost four years ago, that it has no problems crawling URLs with question marks and ampersands — and actually goes so far as to specifically recommend _not_ using URL rewriting for dynamic websites (like those created with WordPress), as doing so can cause problems when crawling sites:

    http://googlewebmastercentral.blogspot.com/2008/09/dynamic-urls-vs-static-urls.html

    Pretty URLs _are_ more visually appealing and easier for visitors to understand, but it would be best not to perpetuate the myth that it’s better for SEO or hard for search engines to crawl such URLs.

  3. Marlon | April 27, 2012 at 10:30 pm

    Really useful and very well explained! thanks a lot!

  4. Jose | April 29, 2012 at 4:22 pm

    Interesting post. Curious question? Can I use this to forward this permalink with accented characters:

    /más-de-méxico.htm

    to it’s correspondent permalink in WordPress without the accents:

    /mas-de-mexico.htm

    ? Thanks!

    • MillaN | April 29, 2012 at 4:24 pm

      No, these are not the same characters. It could be done with additional filtering in rewriter to replace characters, but it will not work by default.

  5. Jose | April 29, 2012 at 4:30 pm

    I see. Thanks a million! That was fast!

  6. Ed G. | January 31, 2013 at 7:44 pm

    Thank you thank you!
    redirect_canonical is evil… it was messing up my attempts to parse the slug myself. The one line of code you give above saved me HOURS of work!!
