The Canonical tag you’ve implemented has been ignored by Google. Why?

Learn why Google chooses to ignore canonical tags

You received a message in Search Console saying that Google chose a different canonical URL than the one you declared on specific URLs.

Why does Google ignore the canonicals?

(Recap) What is the canonical tag?

According to Google, a canonical URL is

the URL of the page that Google thinks is most representative from a set of duplicate pages on your site.

For example, if you have URLs for the same page (example.com?dress=1234 and example.com/dresses/1234), Google chooses one as canonical.

The pages don’t need to be absolutely identical; minor changes in sorting or filtering of list pages don’t make the page unique (for example, sorting by price or filtering by item color).

The canonical URL can be in a different domain than a duplicate URL.

Google chooses the canonical page based on a number of factors (or signals), such as whether the page is served via HTTP or HTTPS, page quality, presence of the URL in a sitemap, and any rel=canonical tag.

The rel=canonical annotation is a <link> element that is inserted in the HTML code of every duplicate page, pointing to the canonical page. It can be added to any number of duplicate pages to consolidate their link value. This, as you can imagine, increases the page size a bit.
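To make the tag concrete, here is a minimal sketch that reads the canonical link element out of a page's HTML using only the Python standard library. The sample page and the names CanonicalFinder and find_canonicals are illustrative assumptions, not part of any Google tooling.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated values, so split before testing
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return all canonical hrefs found in the HTML string, in document order."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical duplicate page that points at its preferred URL
sample_page = """
<html><head>
<title>Dress 1234</title>
<link rel="canonical" href="https://example.com/dresses/1234">
</head><body>...</body></html>
"""
print(find_canonicals(sample_page))  # ['https://example.com/dresses/1234']
```

Running this against each duplicate URL is a quick way to confirm that all of them declare the same canonical.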

Another variant of the same signal is to send the canonical as an HTTP header (a Link header with rel="canonical"). The benefit of doing this is that it doesn’t increase the page size.
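As a rough illustration of the header variant, the sketch below pulls the canonical URL out of a Link header value such as `<https://example.com/dresses/1234>; rel="canonical"`. It is a simplified parser written for this example (it does not handle commas inside quoted parameters), and the function name is an assumption.

```python
import re

# Matches one "<url>; param=value; ..." entry of a Link header
LINK_PATTERN = re.compile(r'<([^>]*)>\s*((?:;\s*[^;,]+)*)')


def canonical_from_link_header(header_value):
    """Return the first URL whose rel parameter includes "canonical", else None."""
    for match in LINK_PATTERN.finditer(header_value):
        url, params = match.group(1), match.group(2)
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() != "rel":
                continue
            rel_values = value.strip().strip('"').lower().split()
            if "canonical" in rel_values:
                return url
    return None


header = '<https://example.com/dresses/1234>; rel="canonical"'
print(canonical_from_link_header(header))  # https://example.com/dresses/1234
```

This is handy for non-HTML files such as PDFs, where there is no HTML code to hold the tag.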

With both methods, the mapping can be complex to maintain on larger sites or on sites where the URLs change often.

While you can show your preferences to Google, it may choose a different page as canonical than you do, for various reasons.

Why has Google ignored the canonical tag you’ve implemented?

Multiple signals can lead Google to choose one page over another as the canonical, regardless of the preference you indicated with the canonical tag on the page.

They fall into two categories: page performance and content relevance.

Let’s examine a handful of them.

#1 Google prefers HTTPS over HTTP pages to use as canonicals.

One of the pages could be served via HTTPS, which will make it the preferred version of the content, unless there are other conflicting signals (e.g. an invalid SSL certificate, insecure dependencies, redirects to an HTTP page, or an existing canonical tag pointing to an HTTP page).

#2 You have not indicated your canonical page in your sitemap.

An outdated sitemap that includes a lot of broken links is a problem in itself.

It is even worse if the page you’ve declared as canonical is not included in the sitemap. The worst case is a sitemap that lists the duplicates pointing to your preferred canonical page, while the canonical page itself is absent.

A whole lot of confusion and mixed signals for Googlebot. Hence why it makes a choice on its own.
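One way to catch this mismatch is to compare the sitemap against your canonical mapping. The sketch below assumes a standard sitemap in the sitemaps.org 0.9 namespace, passed in as a string; the function names and sample URLs are made up for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return the set of <loc> URLs listed in a sitemap XML string."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def audit_canonical(sitemap_xml, canonical_url, duplicate_urls):
    """Report whether the canonical is listed and which duplicates are listed."""
    listed = sitemap_urls(sitemap_xml)
    return {
        "canonical_listed": canonical_url in listed,
        "duplicates_listed": sorted(u for u in duplicate_urls if u in listed),
    }


# Hypothetical sitemap that lists a duplicate but not the canonical page
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/?dress=1234</loc></url>"
    "<url><loc>https://example.com/about</loc></url>"
    "</urlset>"
)
print(audit_canonical(
    sample_sitemap,
    "https://example.com/dresses/1234",
    ["https://example.com/?dress=1234"],
))
```

A result with canonical_listed set to False and a non-empty duplicates_listed is the mixed-signal case described above.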

#3 You’ve implemented a wild-card canonical.

The implemented canonical tag could be a wildcard in terms of structure, meaning it does not name one exact host on a site with several subdomains (e.g. it leaves out the www. host variant), prompting Google to discard it.

Avoid implementing your SSL/TLS certificate for the wrong host variant. For example, example.com serving the certificate for www.example.com. The certificate must match your complete site URL, or be a wildcard certificate that can be used for multiple subdomains on a domain.

#4 You haven’t told Google to ignore dynamic parameters in JS-heavy URLs.

If the URLs include dynamic parameters, specifying that they should be ignored can reduce duplicate content issues and reduce mismatches in the canonicalization of URLs.
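On your own side, the same idea can be applied when generating canonical URLs: drop the parameters that do not change the content before writing the tag. The sketch below is an assumption about how that might look; the parameter names in IGNORED_PARAMS are hypothetical and should be replaced with the ones your site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters that do not change the page content
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Return the URL without ignored query parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(strip_ignored_params(
    "https://example.com/dresses?color=red&sort=price&utm_source=mail"
))  # https://example.com/dresses?color=red
```

URLs that collapse to the same string after stripping are candidates for sharing one canonical.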

#5 Google has chosen a page that is more suited to the user’s intent.

Google reserves the right to choose a page that is better suited to the user’s queries, and index it in search results instead.

#6 The canonical page does not match the user’s technical needs.

If the majority of users visit the site via mobile devices, and the site serves separate mobile, desktop, and AMP versions of its pages, then Google will show the mobile version (even if the desktop one is marked as canonical).

#7 You have canonicalised a no-indexed page.

You shouldn’t mix noindex and rel=canonical as they’re very contradictory pieces of information for us. We’ll generally pick the rel=canonical and use that over the noindex, but any time you rely on interpretation by a computer script, you reduce the weight of your input 🙂 (and SEO is to a large part all about telling computer scripts your preferences)

- John Mueller, Google, source

On a noindexed page with a self-referencing canonical, the canonical tells search engines that this is the preferred version of the page, and the robots noindex tells them not to index it. These pages are not the real issue, though.

In some cases, a page that other pages name as their canonical is itself noindexed. In this situation, the canonical instructions are sending conflicting messages to the search engines: Page A is effectively saying ‘please index Page B’, and Page B is saying ‘don’t index me!’

In this setup, the canonical configuration is completely broken. 

You will first need to establish if Page B is indeed the correct canonical for Page A, and if so remove the noindex directive from Page B. If in fact, Page B should be noindex, then Page A should be updated to change the canonical tag, either to become self-referential or to point to another URL.
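If you already have crawl data, the Page A and Page B situation can be flagged with a few lines of code. The sketch below assumes a simple dictionary of crawled pages; the keys "canonical" and "noindex" and the sample URLs are hypothetical, not the export format of any particular crawler.

```python
def find_broken_canonicals(pages):
    """Return (source, target) pairs where the canonical target is noindexed.

    `pages` maps each URL to a dict with two assumed keys:
      "canonical": the URL named in its canonical tag (or None)
      "noindex":   True if the page carries a robots noindex directive
    """
    broken = []
    for url, info in pages.items():
        target = info.get("canonical")
        if not target or target == url:
            continue  # no canonical or self-referencing: not the case checked here
        target_info = pages.get(target)
        if target_info is not None and target_info.get("noindex"):
            broken.append((url, target))
    return broken


crawl = {
    "https://example.com/page-a": {"canonical": "https://example.com/page-b", "noindex": False},
    "https://example.com/page-b": {"canonical": "https://example.com/page-b", "noindex": True},
    "https://example.com/page-c": {"canonical": "https://example.com/page-c", "noindex": False},
}
print(find_broken_canonicals(crawl))
# [('https://example.com/page-a', 'https://example.com/page-b')]
```

Each pair returned still needs the manual decision described above: either remove the noindex from the target or repoint the source.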

Check out this tutorial from Sitebulb on fixing this issue. Easy diagnostics can be made via Screaming Frog (feel free to ask me how in the comments).

Severe crawlability issues like this will typically need to be resolved on a case-by-case basis. 

#8 Your canonical tag placement is inconsistent.

Google may ignore canonical tags if they are inconsistently placed within the HTML code or if they are nested improperly, hindering Google’s ability to accurately interpret and enforce canonicalization.
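A quick placement check can be sketched with the standard library HTML parser: record every canonical link element, note whether it sits inside the head section (where Google expects it), and flag pages that declare more than one distinct canonical URL. The class and function names and the sample markup are assumptions made for this example.

```python
from html.parser import HTMLParser


class CanonicalAuditor(HTMLParser):
    """Records each canonical link and whether it appears inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, "head" or "outside head")

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attr_map = dict(attrs)
            rel_values = (attr_map.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                location = "head" if self.in_head else "outside head"
                self.found.append((attr_map.get("href"), location))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit_canonical_tags(html):
    """Summarise canonical tags: all tags, misplaced ones, and conflicts."""
    auditor = CanonicalAuditor()
    auditor.feed(html)
    distinct_hrefs = {href for href, _ in auditor.found}
    return {
        "tags": auditor.found,
        "misplaced": [href for href, loc in auditor.found if loc != "head"],
        "conflicting": len(distinct_hrefs) > 1,
    }


# Hypothetical page with a second, conflicting canonical inside the body
sample = """
<html><head>
<link rel="canonical" href="https://example.com/dresses/1234">
</head><body>
<link rel="canonical" href="https://example.com/?dress=1234">
</body></html>
"""
report = audit_canonical_tags(sample)
print(report["misplaced"])    # ['https://example.com/?dress=1234']
print(report["conflicting"])  # True
```

Note that this inspects the raw HTML only; tags injected by JavaScript after rendering would need a rendered copy of the page.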

#9 Your website structure is ever-changing.

Websites undergoing frequent content or structural changes may struggle to maintain consistent canonical tags, leading Google to overlook them due to perceived inconsistency in signaling the preferred version of the content.

#10 Google applies algorithmic discretion to canonical tag selection.

Google’s algorithms have the discretion to prioritize user experience and relevance over webmasters’ preferences, potentially ignoring canonical tags if they determine that a different page better satisfies user intent or provides more valuable content.

#11 You’ve implemented multiple canonical tags.

When multiple versions of a page exist (e.g., desktop, mobile, AMP) with conflicting canonical tags, Google may choose to disregard them altogether or prioritize certain versions based on its algorithms, resulting in unexpected canonicalization outcomes.

Bottom line?

Navigating Google’s treatment of canonical tags requires a nuanced approach, considering factors such as HTTPS implementation, sitemap accuracy, tag specificity, and user intent alignment. By addressing these intricacies and optimizing canonicalization strategies accordingly, webmasters can enhance their sites’ visibility and ensure alignment with Google’s indexing criteria. Hopefully, this gives you a better understanding of how Google perceives the pages you indicate as canonicals.

Always make sure that non-canonical pages also remain available for Google to visit. That means avoiding the use of noindex as a means to prevent the selection of a canonical page.

If the canonical version of the page is hosted on your domain, you can check which one it is via the URL Inspection tool in Google Search Console. If you’d like to move this type of reporting into Looker Studio, check out my free Looker Studio dashboard template for the Google URL Inspection API, and the associated resources to set it up.

For more reading on the topic, visit the Google Search Central article on duplicated content.