Google definitely wasn’t the first search engine on the Internet. But Google did things in a better way and provided results that were actually useful, and it hasn’t stopped expanding as a company since.
Though SEO is done for all the search engines that exist today, one major search engine that handles most of the traffic on the Internet is Google. Whenever we talk about SEO, people automatically assume that we are talking about optimizing the website for Google.
When it comes to SEO, we need to check many factors, both onsite as well as offsite. But if your onsite SEO is not up to the mark, no matter how well you do your offsite SEO, you will not get the results you are expecting.
I was checking one of the websites I was doing SEO on, and I found that the website had some serious issues related to Canonicalization. I fixed the issues quickly, but I also decided to write a post explaining what Canonicalization means and how one can perform Canonicalization of a website properly.
What is URL Canonicalization?
The term Canonicalization can be tough to understand. Let me try to explain this in simple terms.
Let’s say there are two URLs of a website:
Both of those URLs show the same content, and neither of them redirects to the other. This can result in a duplicate content issue on Google, and you can face penalties.
Let us see one more example. There are two URLs on a website that resolve to the same page.
If both of these web pages show the same result, then this might cause an issue as well!
You might not pay much attention to this issue, but it can result in serious duplicate content problems. The trouble is that search engine bots can’t decide which version of the URL they should add to their index. If two URLs resolve to the same content, the bots will assume one is a copy of the other, and your website may get penalized.
If your site opens on two URLs showing the same content, then you must fix it. Use your server settings so that whether a user opens the site with www or without www, they always land on one version. In this way, you can fix the canonicalization.
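The www versus non-www rule above can be sketched in a few lines of Python. This is only an illustration of the decision a server makes before sending a 301 redirect, not a real server configuration. The host names `www.example.com` and `example.com`, and the choice of www as the preferred version, are assumptions made for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the site owner prefers the www host. Swap the two values if you prefer the bare domain.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}


def redirect_target(url):
    """Return the URL a request should be 301-redirected to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in ALTERNATE_HOSTS:
        # Keep scheme, path, query and fragment; only the host changes.
        # Note: any explicit port in the original URL is dropped in this sketch.
        return urlunsplit((parts.scheme, PREFERRED_HOST, parts.path, parts.query, parts.fragment))
    return None


print(redirect_target("http://example.com/about?x=1"))
print(redirect_target("http://www.example.com/about"))
```

A request for the non-www URL gets a redirect target on the www host, while a request that is already on the preferred host returns `None` and is served normally.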
At times, though, you may want to publish the same content on two URLs. In that case, you can use the rel="canonical" tag to let search engines know which URL is the original and which one is a copy of it. This can save you from duplicate content penalties.
How to correctly apply URL Canonicalization?
Let us now check how to apply URL Canonicalization. We don’t need to write lots of code to do it. A simple rel="canonical" tag is enough to apply Canonicalization.
For example, suppose there are two URLs on the website that resolve to the same content. These two URLs are:
The second URL results in the same content as the first URL. They both display the same page, so you can add the rel="canonical" tag to that page to indicate that the URL with index.php is the canonical (preferred) version.
This is how it is applied.
<link rel="canonical" href="http://thewebpage.org/index.php">
HTTP Header Canonicalization
The above markup can be used for HTML content, but what if we are dealing with non-HTML content like a PDF document? In those cases, we can use HTTP Header Canonicalization, where the server sends the canonical URL in a Link response header.
> HTTP/1.1 200 OK
> Content-Type: application/pdf
> Link: <http://www.example.com/white-paper.html>; rel="canonical"
> Content-Length: 785710
You can get more information about HTTP Header based Canonicalization on Google’s official Webmaster blog.
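As a rough sketch of how an application might produce that header, the Python below builds the same header list as the response shown above. The function names `canonical_link_header` and `pdf_response_headers` are hypothetical helpers made up for this example. How you attach the headers to a response depends on your server or framework.

```python
def canonical_link_header(canonical_url):
    """Build the (name, value) pair for a Link header that marks the canonical URL."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def pdf_response_headers(canonical_url, content_length):
    """Headers for a PDF response whose canonical version lives at canonical_url."""
    return [
        ("Content-Type", "application/pdf"),
        canonical_link_header(canonical_url),
        ("Content-Length", str(content_length)),
    ]


for name, value in pdf_response_headers("http://www.example.com/white-paper.html", 785710):
    print(f"{name}: {value}")
```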
When should you use Canonicalization?
Now that you know what exactly Canonicalization means, you can move forward and see when you should use it, because there are many more cases than the two I have mentioned in the examples above.
Here are a few situations that can be handled with proper URL Canonicalization.
- Different URLs for the same content
- Various categories and tags that result in the same content
- Mobile website displaying the same content on a different URL/subdomain
- HTTP and HTTPS URLs that both result in the same content
- The same content served on various ports
- When the website has a www and a non-www version
- In case of sharing syndicated content
These are some major conditions in which we can apply URL Canonicalization to save our site from facing any kind of duplicate content penalty.
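To find such cases in a list of URLs (from a crawl export, for instance), one option is to reduce every URL to a comparison key and group the URLs that share a key. The sketch below is a minimal illustration under stated assumptions: it treats HTTP and HTTPS, www and non-www, default ports, and a trailing `index.php` as the same page. It does not handle mobile subdomains, tag pages or syndicated copies, and it keeps non-default ports separate. The name `duplicate_key` and the sample URLs are made up for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def duplicate_key(url):
    """Reduce a URL to a (host, path, query) key that ignores scheme, www and default ports."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Keep a port in the key only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(parts.scheme):
        host = f"{host}:{parts.port}"
    path = parts.path or "/"
    # Assumption: /index.php serves the same page as the directory itself.
    if path.endswith("/index.php"):
        path = path[: -len("index.php")]
    return (host, path, parts.query)


def group_duplicates(urls):
    """Return lists of URLs that share a key, i.e. candidates for one canonical URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[duplicate_key(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]


urls = [
    "http://example.com/",
    "https://www.example.com/index.php",
    "http://example.com:80/",
    "https://example.com/about",
]
for group in group_duplicates(urls):
    print(group)
```

Each printed group is a set of URLs that probably needs a single canonical URL, chosen by you, plus either redirects or rel="canonical" tags pointing to it.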
This is when you should NOT perform URL Canonicalization!
There are scenarios in which we should not perform URL Canonicalization, and this section covers them. You can also think of these as common Canonicalization mistakes. Let me list them one by one and explain each in a simple manner.
Skip pagination canonicalization
If you are planning to canonicalize paginated URLs, then you should know that this is a very bad idea. You should not add a canonical tag on the second page of a paginated series that points to the first page, because then the second page will not be indexed at all by Google.
Multiple Canonical tags are bad
If a web page has multiple rel="canonical" tags, then it can be really harmful to you. Make one specific tag and make it clear which one you prefer.
I have seen that many people apply the Canonical tag like this:
<link rel="canonical" href="index.php">
This style of canonicalization is an invitation to a lot of errors, because a relative path is resolved against whatever URL the page was loaded from. The more complete your canonical markup is, the better it will be for you and your content.
<link rel="canonical" href="http://thewebpage.org/index.php">
The above markup, with an absolute URL, is a better way to apply canonicalization.
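A quick way to see the problem with relative paths is to resolve the same href from two different pages. The sketch below uses Python's standard `urljoin`, which follows the usual URL resolution rules. The page URL `http://thewebpage.org/blog/post.php` is a made-up example.

```python
from urllib.parse import urljoin


def resolve_canonical(page_url, href):
    """Resolve a canonical href the way a relative link is resolved against the page URL."""
    return urljoin(page_url, href)


# The same relative href points to different URLs depending on where the page lives.
print(resolve_canonical("http://thewebpage.org/", "index.php"))
print(resolve_canonical("http://thewebpage.org/blog/post.php", "index.php"))

# An absolute href resolves to the same URL from everywhere.
print(resolve_canonical("http://thewebpage.org/blog/post.php", "http://thewebpage.org/index.php"))
```

From the home page the relative href resolves to `http://thewebpage.org/index.php`, but from the page under `/blog/` it resolves to `http://thewebpage.org/blog/index.php`, which is probably not what was intended.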
Localization means targeting and manipulating the content of the website in order to serve it on the basis of the region it is being viewed in. If you really want to create a better website for your global audience, you can read this guide to create multilingual websites by Google.
Canonicalization on mobile version of websites
Just a canonical tag to differentiate a mobile website on a subdomain of your main website is not enough. Google suggests that you use both rel="alternate" and rel="canonical": the mobile page carries the rel="canonical" tag pointing to the desktop URL, and the desktop page carries the rel="alternate" tag pointing to the mobile URL.
Here is how you can implement it:
> <link rel="canonical" href="http://example.com/">
> <link rel="alternate" href="http://m.example.com/" media="only screen and (max-width: 640px)">
Don’t use a Canonical tag outside of <head>
Search engine bots will totally ignore canonical tags that are set outside the <head> area of the page, so in order to apply a proper canonical tag, you need to add it between <head></head>.
Don’t use multiple Canonical tags on a page
Using multiple Canonical tags on the same page is pointless. Search engines will ignore all of them, and you will face weird SEO behavior and issues. Multiple canonical tags are sometimes caused by plugin glitches, so you might have to keep an eye on that.
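Both of these mistakes (a tag outside <head>, and more than one tag on a page) can be spotted with a small script. The sketch below uses Python's built-in `html.parser`. The class name `CanonicalAudit` and the wording of the messages are made up for this example, and real-world HTML with a missing or broken <head> may need a more forgiving parser.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Collect rel="canonical" link tags and note whether each one sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_head_hrefs = []
        self.outside_head_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr = dict(attrs)
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                bucket = self.in_head_hrefs if self.in_head else self.outside_head_hrefs
                bucket.append(attr.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(html):
    """Return a list of canonical tag problems found in the given HTML string."""
    parser = CanonicalAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if len(parser.in_head_hrefs) > 1:
        problems.append("multiple canonical tags in <head>")
    if parser.outside_head_hrefs:
        problems.append("canonical tag outside <head>")
    return problems


page = (
    '<html><head><link rel="canonical" href="http://example.com/"></head>'
    '<body><link rel="canonical" href="http://example.com/other"></body></html>'
)
print(audit(page))
```

Running a check like this on rendered pages after installing or updating an SEO plugin is one way to catch duplicate tags early.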
Don’t point a Canonical URL to a URL with a non-200 status code
A canonical URL that returns a redirect code like 301 or 302 forces search engines to crawl one extra URL: first the URL in your tag, then the URL it redirects to. Across a large website this adds up, and it can easily deplete your crawl budget.
A canonical URL that returns a 404 is a totally wasted crawl, and search engines will ignore your tag altogether.
Don’t use Canonicalization for PageRank Sculpting
PageRank is no longer a public metric for websites, but it is still considered by the search engines. If you are planning to use Canonical tags for PageRank sculpting and to get better rankings, let me make it clear that it will do more harm to your website than good.
The concept of onsite SEO is much bigger than you might imagine. You need to take care of many things at once, and you also need to keep yourself updated with the changes that take place every day.
This post showed how you can apply Canonical URLs on a website. Keep in mind that Canonicalization is a delicate process, and if it is done in the wrong manner, it can harm your website. Keep your website in check and make sure you perform Canonicalization properly.