Before we go on, let's discuss a little bit how Joomla URL routing works. It's long, it's a bit depressing, but read to the end — as you'll see, all problems with the URL routing in Joomla are ultimately a user education issue, not a coding issue.
You would expect web software to have absolutely deterministic, clear routing for URLs. Given a set of URL parameters (the “non-SEF URL”) the URL router (“SEF router”) will always spit out the same route (“SEF URL”). In this ideal world we already have nuclear fusion and world peace, we have solved poverty and hunger and… I digress.
In the imperfect world we live in, web software has to choose one of two evils. URL routing is either extremely technical and inflexible but robust, or it is user-friendly and flexible but can result in some very weird situations. Joomla chose the latter. It's what allows modules to work the way they do, which is one of the many reasons Joomla is an insanely powerful but still user-friendly CMS. But which part of its soul did it have to sell to you-know-who to do that? Let's see…
URL routing in Joomla operates along two axes. On one hand we have the menu structure, which operates as the first pass of URL routing and is defined by users, who are humans and not very good at managing hierarchies (make that doubleplusungood, Winston). On the other hand we have the SEF router of our component which, unlike users, works in a perfectly logical, orderly fashion.
The thing is, they are both working at the same time to do URL routing and this can cause some… uh… complications.
Let's say you have the following data hierarchy:
- Category A, alias alpha.
  - Category B, alias bravo.
    - Article C, alias charlie.
And you have the following menu structure (as to why the user chose this seemingly bonkers menu structure, no, they are not trying to sabotage your code, they have a good reason but we'll get back to that later):
- Category item list for Category B, alias bravo. Item ID 123.
  - Single article view for Article C, alias charlie. Item ID 234.
    - Category A, alias alpha. Item ID 345.
You would expect that the URL for category A is https://www.example.com/alpha, category B's is https://www.example.com/alpha/bravo and article C's is https://www.example.com/alpha/bravo/charlie.
Nope.
The URL for category A is https://www.example.com/bravo/charlie/alpha per the menu structure.
The URL for category B is https://www.example.com/bravo per the menu structure.
The URL for article C is https://www.example.com/bravo/charlie per BOTH the menu structure AND the SEF router of com_content.
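To see where those three URLs come from, here is a minimal sketch of the menu tree as a toy Python model. The dictionary layout and the menu_path helper are my own illustration, not Joomla's data model or API; the only things taken from the example are the aliases, the Item IDs and the nesting.

```python
# Toy model of the example menu tree; NOT Joomla's actual data model.
# Each menu item has an alias and the Item ID of its parent menu item.
MENU = {
    123: {"alias": "bravo", "parent": None},   # Category item list for Category B
    234: {"alias": "charlie", "parent": 123},  # Single article view for Article C
    345: {"alias": "alpha", "parent": 234},    # Category A
}


def menu_path(item_id, menu=MENU):
    """Join the aliases from the menu root down to the given menu item."""
    segments = []
    while item_id is not None:
        item = menu[item_id]
        segments.append(item["alias"])
        item_id = item["parent"]
    return "/".join(reversed(segments))


for item_id in (345, 123, 234):
    print(item_id, "https://www.example.com/" + menu_path(item_id))
# 345 https://www.example.com/bravo/charlie/alpha
# 123 https://www.example.com/bravo
# 234 https://www.example.com/bravo/charlie
```

The data hierarchy (alpha, then bravo, then charlie) never enters the calculation; the path of a menu item depends only on where the user put it in the menu.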
However, it's perfectly possible to access article C as https://www.example.com/bravo/charlie/alpha/bravo/charlie. The first three parts of the route (/bravo/charlie/alpha) are the menu structure to category A. The rest (bravo/charlie) is handled by the SEF router of com_content.
*record scratch*
But, wait, wait a second! How does Joomla know when to use each URL?! I am glad you asked. The answer is the Itemid URL parameter, i.e. the ID of the menu item against which we are going to generate a URL for our… something!
If we are routing the URL to article C using a URL with Itemid=123 Joomla will first figure out that we have a menu item to Article C's parent category (Category B). It will then ask our URL router to route this item in this category, which will make our com_content router return charlie. Therefore Joomla will return the relative URL bravo/charlie.
If we are routing the URL to article C using a URL with Itemid=234 Joomla sees that the Itemid matches exactly what we need to route, therefore it will return the menu structure up to this point, i.e. the relative URL bravo/charlie.
HOLD ON A SECOND! Both of these methods returned… THE SAME URL!
Ah, keen-eyed reader, you are right! I can see you despairing. Oh, please, not yet! It's about to get worse. You see, when Joomla parses the URL it prioritises the menu structure over the SEF router of each component. Since bravo/charlie is, indeed, a valid menu structure it will simply return the non-SEF URL index.php?Itemid=234, in both cases.
But, but, but… Isn't the Itemid how we tell modules when to display? Why, yes, it is! Oh, you had different modules displaying in menu items 123 and 234? Too bad! You don't get to choose. Sorry.
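The parsing priority described above can be sketched with a toy model. This is not Joomla's parser. I am assuming, purely for illustration, that "menu structure first" means "take the longest leading run of segments that matches a menu path and hand whatever is left to the component router". The parse function and the MENU_PATHS table are hypothetical names of mine.

```python
# Toy model of "menu structure wins over the component router" when parsing.
# ASSUMPTION: the longest matching menu path is preferred; this is an
# illustration, not a description of Joomla's real parsing code.
MENU_PATHS = {
    "bravo": 123,
    "bravo/charlie": 234,
    "bravo/charlie/alpha": 345,
}


def parse(route, menu_paths=MENU_PATHS):
    """Return (Itemid, segments left over for the component's SEF router)."""
    segments = route.strip("/").split("/")
    # Try the longest prefix first, then progressively shorter ones.
    for cut in range(len(segments), 0, -1):
        candidate = "/".join(segments[:cut])
        if candidate in menu_paths:
            return menu_paths[candidate], segments[cut:]
    return None, segments  # no menu item matched at all


print(parse("bravo/charlie"))
# (234, [])
print(parse("bravo/charlie/alpha/bravo/charlie"))
# (345, ['bravo', 'charlie'])
```

In this model the URL built with Itemid=123 can never parse back to 123: the whole route is swallowed by menu item 234 before the component router gets a say.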
Back to routing non-SEF to SEF URLs. If you route article C using a URL with Itemid=345 Joomla tells you that you are on category A. So your SEF router has to find a full path from there to your article, which would be bravo/charlie. This is appended to bravo/charlie/alpha, the URL of menu item 345 (Category A), to make the entirely confusing URL bravo/charlie/alpha/bravo/charlie, which works perfectly.
What about trying to route article C without an Itemid? Now things get a bit tastier and testier. Joomla will try to find the most relevant route using the segments returned by your SEF router and trying to match them with the menu structure… Which one is it? Frankly, I don't have the foggiest off the top of my head. I'd have to build that site to figure it out. I would think it's the same as using Itemid=345. If all else fails, Joomla will use the Home item's Itemid and all bets are off.
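The three build cases (Itemid 123, 234 and 345) can be put side by side in a toy model. Again, this is a sketch under my own assumptions and not the com_content router: the CONTENT and MENU tables, the node keys (cat_a, cat_b, art_c) and both functions are hypothetical. The rule modelled is the one described in the text: start from the path of the given menu item, then let the component add the aliases needed to get from that menu item's target down to the article.

```python
# Toy model of building a SEF URL for a content node given an Itemid.
# All names here are hypothetical; this is not Joomla or com_content code.
CONTENT = {
    "cat_a": {"alias": "alpha", "parent": None},
    "cat_b": {"alias": "bravo", "parent": "cat_a"},
    "art_c": {"alias": "charlie", "parent": "cat_b"},
}

# Each menu item: its SEF path in the menu tree and the content node it shows.
MENU = {
    123: {"path": "bravo", "target": "cat_b"},
    234: {"path": "bravo/charlie", "target": "art_c"},
    345: {"path": "bravo/charlie/alpha", "target": "cat_a"},
}


def component_segments(node, stop_at):
    """Aliases from just below `stop_at` down to `node`; empty if they match."""
    segments = []
    while node is not None and node != stop_at:
        segments.append(CONTENT[node]["alias"])
        node = CONTENT[node]["parent"]
    if node is None:
        raise ValueError("menu item target is not an ancestor of the node")
    return list(reversed(segments))


def build(node, item_id):
    """Menu path of the Itemid plus whatever the component has to add."""
    item = MENU[item_id]
    return "/".join([item["path"], *component_segments(node, item["target"])])


for item_id in (123, 234, 345):
    print(item_id, build("art_c", item_id))
# 123 bravo/charlie
# 234 bravo/charlie
# 345 bravo/charlie/alpha/bravo/charlie
```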
Of course, this means that the same article can have three URLs, two of which are identical and one of them does not resolve to what you'd expect. But this is normal! And no, forget about getting a canonical URL for article C because there's none.
This insanity cannot be addressed because it would require decoupling URL routing from Joomla's menu system. However, this would mean that all published modules appear in all pages as there is no longer a way to know which menu item you're in. Of course it's the solution to this problem which broke module display in our example above but this is what you get when the users are “crazy”.
Or are they?
While it sounds convoluted and problematic, this method is not the least bit more convoluted and problematic than that of any other given CMS when you take into account all the possible[1], almost infinite use cases it's called to work with. It works great insofar as the user can be trusted to not create psychopathic menu structures which work against the data structure, reusing the same aliases for a good measure of insanity.
But why would any user even create such a menu structure to begin with?!
As a matter of fact, the menu structure I presented is the wrong way to use Joomla. I confess I misdirected you but I did so for a noble reason. What the user most likely wanted was for the menu's visual structure to be what we presented, for user experience reasons. They most likely don't care about, or don't even want, the crazy URL structure. They would very likely be spending hundreds of Euros every year on various SEF / SEO tools to try and fix their mistake… when they could just be told how to use Joomla the way Joomla was intended to be used to begin with.
The One True Joomla Way™ is to use a “hidden” menu (a menu without a module to display it) to generate the URL structure in a way that's mostly following the data structure and a shown menu with Alias menu items for the visual display.
So, in our example, the hidden menu would be:
- Category A, alias alpha. Item ID 345.
  - Category item list for Category B, alias bravo. Item ID 123.
    - Single article view for Article C, alias charlie. Item ID 234.
And the visible menu:
- Category item list for Category B, alias to menu item 123.
  - Single article view for Article C, alias to menu item 234.
    - Category A, alias alpha, alias to menu item 345.
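To make the effect of this split concrete, here is the same kind of toy model as before. It assumes that an Alias menu item simply links to the URL of the menu item it points to and contributes nothing to the URL itself; the tables and the menu_path helper are hypothetical names for illustration, not Joomla code.

```python
# Toy model of a hidden "URL" menu plus a visible menu made of Alias items.
# ASSUMPTION: an Alias item resolves to the URL of the item it points to.
HIDDEN_MENU = {
    345: {"alias": "alpha", "parent": None},   # Category A
    123: {"alias": "bravo", "parent": 345},    # Category item list for Category B
    234: {"alias": "charlie", "parent": 123},  # Single article view for Article C
}

# Visible menu in display order: (label, nesting depth, aliased Item ID).
VISIBLE_MENU = [
    ("Category B", 0, 123),
    ("Article C", 1, 234),
    ("Category A", 2, 345),
]


def menu_path(item_id, menu=HIDDEN_MENU):
    """Join the aliases from the hidden menu's root down to the given item."""
    segments = []
    while item_id is not None:
        item = menu[item_id]
        segments.append(item["alias"])
        item_id = item["parent"]
    return "/".join(reversed(segments))


for label, depth, target in VISIBLE_MENU:
    print("  " * depth + f"{label} -> /{menu_path(target)}")
# Category B -> /alpha/bravo
#   Article C -> /alpha/bravo/charlie
#     Category A -> /alpha
```

The visitor still sees the nesting the user wanted (B, then C, then A), while the URLs follow the data structure.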
That's the reason I went through this intentionally provocatively named section. When you are doing end user support you WILL come across atrocious menu structures like the one I presented above.
Do NOT try to address this in your router code; you will lose your mind and you will never make it work for all of your users. Remember, there is no routing method which is suitable for every use case and you cannot possibly address infinite use cases. Value your sanity as a developer! Learn when to say “no” to users.
Instead, ask your user what their use case is and what they intend with their menu structure: are they trying to shape the URL structure or just the way things are presented on their site? In 99 out of 100 cases the user intended to affect only the visual presentation of the menu; they have no idea they are shooting themselves in the foot by affecting the URL structure. Patiently explain to them the trick about hidden menus. You'll get grateful clients for life.
[1] For any given, singular use case and a set of routing algorithms it is very easy to find the one algorithm which is most suitable, meaning the rest are unsuitable. However, every single use case has a different most suitable algorithm. Given a very large number of use cases, like the near infinite use cases a CMS is called to address, any given routing algorithm would be just as unsuitable for most use cases as every other. Therefore the task of finding the “best” algorithm is reduced to finding an algorithm which fulfils some secondary or tertiary business goals, such as making it possible for an end user to easily configure it with a GUI or supporting our vision of having different modules show up in each page. The primary business goal of the “best” routing for the generic use case is, by definition, a bust unless we are willing to drastically reduce the use cases we are willing to support.