Troubleshooting one of the most common issues in e-commerce SEO
You are running an e-commerce site, but your product pages are barely getting any hits.
You hire an SEO specialist, who tells you that 40% of the product pages on your site are not indexed. Ouch!
What are the possible reasons for this issue?
There are multiple possible reasons, and most of them fall into two categories:
- poor crawl efficiency
- duplicate content
Let’s dig deeper.
#1 Robots.txt is blocking access to pages
Your robots.txt file may be blocking crawler access to pages, or it may be accidentally causing whole sections of the website to drop out of the index.
You can check this with a crawler such as Deepcrawl or Screaming Frog to audit the indexability of your pages.
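If you want a quick programmatic spot-check before running a full crawl, here is a minimal sketch using Python’s built-in urllib.robotparser; the domain and URL list are placeholders for your own pages.

```python
from urllib.robotparser import RobotFileParser

# Placeholder domain and URLs -- substitute your own site and page list.
ROBOTS_URL = "https://example.com/robots.txt"
urls_to_check = [
    "https://example.com/products/widget-a",
    "https://example.com/category/widgets?color=blue",
]

parser = RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # fetches and parses robots.txt

for url in urls_to_check:
    # can_fetch() reports whether the named user agent may crawl the URL
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{'OK     ' if allowed else 'BLOCKED'} {url}")
```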
#2 .htaccess server file is poorly configured
If your site runs on an Apache server, the .htaccess file may be misconfigured, causing rendering issues that disrupt the loading of certain pages.
Check for typos, rule placement, conflicting .htaccess files, and incorrect syntax.
Another server-related issue that can cause indexing faults is the domain’s DNS: a misconfigured DNS record can prevent Googlebot from reaching and indexing the page.
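As a first-pass diagnostic, the sketch below checks that the domain resolves and flags 5XX responses, since a malformed .htaccess rule on Apache typically surfaces as a 500 Internal Server Error; the domain and URLs are placeholders.

```python
import socket

import requests  # third-party: pip install requests

# Hypothetical sample of URLs; a broken .htaccess rule usually
# surfaces as a 500 Internal Server Error across affected paths.
urls = [
    "https://example.com/",
    "https://example.com/products/widget-a",
]

# DNS check: a failure here means crawlers cannot reach the host at all.
try:
    socket.getaddrinfo("example.com", 443)
    print("DNS resolves")
except socket.gaierror as exc:
    print(f"DNS lookup failed: {exc}")

for url in urls:
    resp = requests.get(url, timeout=10)
    if resp.status_code >= 500:
        # Suspect .htaccess or other server configuration.
        print(f"SERVER ERROR {resp.status_code}: {url}")
    else:
        print(f"{resp.status_code}: {url}")
```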
#3 Your sitemap is unoptimized, outdated, and irrelevant
An outdated XML sitemap can prevent the discovery of new pages, especially ones that are not interlinked from elsewhere on the site.
In this case, clean up the existing sitemap so that it includes only pages returning a 200 status code, plus the new pages.
E-commerce platforms like Shopify are often criticized for poor sitemap structure and for search engines failing to detect their sitemaps, due to the way their sitemaps are generated. Sometimes the best solution is a custom-built tool that generates the sitemap automatically, combined with a redirect from the Shopify-generated sitemap URL to the custom URL where your sitemap is hosted in Google’s recommended format.
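A custom generator can be very simple. Here is a minimal sketch using only Python’s standard library that writes a sitemap in the sitemaps.org format Google recommends; the page data is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical input: (URL, last-modified date) pairs for live, 200-status pages only.
pages = [
    ("https://example.com/products/widget-a", date(2024, 1, 15)),
    ("https://example.com/products/widget-b", date(2024, 2, 3)),
]

# Build a <urlset> in the sitemaps.org schema.
urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for loc, lastmod in pages:
    url_el = ET.SubElement(urlset, "url")
    ET.SubElement(url_el, "loc").text = loc
    ET.SubElement(url_el, "lastmod").text = lastmod.isoformat()

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
```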
#4 You’ve got index bloat
Check for index bloat: search engines indexing far more URLs than they should, which wastes crawl budget and hurts crawl efficiency.
On an e-commerce site, this is often caused by indexing pages with dynamic parameters. These are typically the filters available to users on category pages or in faceted navigation.
The same issue can also show up as indexed paginated views of the same page (e.g. the product category catalog pages).
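To gauge the scale of the problem, you can scan a crawl export for parameterized URLs. A minimal sketch, with a hypothetical URL list and parameter set:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical crawl export: one URL per line, e.g. from Screaming Frog.
crawled_urls = [
    "https://example.com/category/widgets",
    "https://example.com/category/widgets?color=blue&size=xl",
    "https://example.com/category/widgets?page=3",
]

# Parameters that typically multiply indexable URLs without adding unique content.
BLOAT_PARAMS = {"color", "size", "sort", "page", "filter"}

for url in crawled_urls:
    params = set(parse_qs(urlparse(url).query))
    flagged = params & BLOAT_PARAMS
    if flagged:
        print(f"BLOAT CANDIDATE ({', '.join(sorted(flagged))}): {url}")
```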
Pro Tips:
- Try implementing a protocol like IndexNow to notify search engines when a change has been made to the website and new pages are published (see the sketch after this list).
- Tools like URLMonitor can automatically submit non-indexed pages to Google for indexing on autopilot, which can also help improve indexation rates and speed.
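Submitting to IndexNow is a single HTTP POST. The sketch below follows the payload format described in the public IndexNow documentation; the host, key, and URLs are placeholders.

```python
import requests  # third-party: pip install requests

# Placeholders: your verified IndexNow key must be hosted at keyLocation.
payload = {
    "host": "example.com",
    "key": "your-indexnow-key",
    "keyLocation": "https://example.com/your-indexnow-key.txt",
    "urlList": [
        "https://example.com/products/new-widget",
    ],
}

# Per the IndexNow spec, a single POST notifies all participating engines.
resp = requests.post("https://api.indexnow.org/indexnow", json=payload, timeout=10)
print(resp.status_code)  # 200/202 indicates the submission was accepted
```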
#5 Your site architecture is weak
On a large site, the lack of breadcrumb links, schema mark-up, and internal linking structures limits search engines’ ability to discover new pages and assess the site’s structure (see the breadcrumb sketch below).
Other quick wins in website architecture include related-product sections, “popular products in this category” sections, best-seller sections, and so on, which can be implemented on each product page with a click of a button.
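Breadcrumbs are most useful to search engines when paired with schema.org BreadcrumbList markup. Here is a minimal sketch that builds the JSON-LD from a hypothetical category trail, ready to embed in a `<script type="application/ld+json">` tag:

```python
import json

# Hypothetical category trail for a product page.
trail = [
    ("Home", "https://example.com/"),
    ("Widgets", "https://example.com/category/widgets"),
    ("Widget A", "https://example.com/products/widget-a"),
]

# Build schema.org BreadcrumbList markup.
breadcrumbs = {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
        {"@type": "ListItem", "position": i, "name": name, "item": url}
        for i, (name, url) in enumerate(trail, start=1)
    ],
}
print(json.dumps(breadcrumbs, indent=2))
```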
#6 Content is near-identical across your product pages
Duplicate content issues are prevalent. I’ve written an extensive guide on this topic for the Wix SEO Learning Hub: Audit and fix duplicate content.
If the products on the site are similar to one another, with only small changes in the name, Google can deindex these pages.
In this case, it’s best to implement product schema markup, provide unique product descriptions and sufficiently varied titles, and vary the on-page structure and dynamic content on these pages. This can be achieved via text automation, templatized content structures, or machine-learning models like GPT-3.
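Before rewriting anything, it helps to measure how similar your descriptions actually are. A minimal sketch using Python’s difflib, with hypothetical catalog text:

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical product descriptions pulled from your catalog.
descriptions = {
    "widget-a": "A durable steel widget, available in blue, ideal for home use.",
    "widget-b": "A durable steel widget, available in red, ideal for home use.",
    "gadget-c": "A compact aluminium gadget with a rechargeable battery.",
}

THRESHOLD = 0.9  # flag pairs whose text is 90%+ similar

for (id_a, text_a), (id_b, text_b) in combinations(descriptions.items(), 2):
    ratio = SequenceMatcher(None, text_a, text_b).ratio()
    if ratio >= THRESHOLD:
        print(f"NEAR-DUPLICATE ({ratio:.0%}): {id_a} vs {id_b}")
```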
#7 Your pages are JavaScript-heavy
Let’s say the product pages and all their metadata are unique.
Then the issue might be caused by rendering.
Google sometimes struggles to parse JS-heavy content: pages that rely heavily on JavaScript can pose rendering challenges for search engines, impacting indexing. Address rendering issues promptly to ensure the full page content is visible to crawlers.
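A quick way to spot the problem is to compare what the server returns before JavaScript runs against what users see. The sketch below checks whether a key string (e.g. the product name) is present in the raw HTML; the URLs and expected text are placeholders.

```python
import requests  # third-party: pip install requests

# Hypothetical mapping of product URLs to a string that should appear
# in the server-rendered HTML (e.g. the product name).
pages = {
    "https://example.com/products/widget-a": "Widget A",
}

for url, expected_text in pages.items():
    html = requests.get(url, timeout=10).text
    if expected_text not in html:
        # The content is likely injected client-side by JavaScript,
        # so crawlers may not see it until (or unless) they render the page.
        print(f"MISSING FROM RAW HTML: '{expected_text}' on {url}")
    else:
        print(f"OK: {url}")
```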
#8 Server Downtime
Frequent server downtimes or slow response times can result in incomplete indexing of pages. Ensure consistent server uptime and optimize response times to prevent indexing delays.
Beware of 5XX status errors on your website; consider alternative hosting solutions, or look for underlying technical issues such as redirect loops.
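A lightweight uptime probe can reveal intermittent 5XX errors that a one-off check misses. A minimal sketch, with a placeholder URL and polling interval:

```python
import time

import requests  # third-party: pip install requests

URL = "https://example.com/"     # placeholder
CHECK_INTERVAL_SECONDS = 300     # poll every 5 minutes

while True:
    try:
        resp = requests.get(URL, timeout=10)
        if 500 <= resp.status_code < 600:
            print(f"5XX ERROR {resp.status_code} at {time.ctime()}")
    except requests.RequestException as exc:
        print(f"REQUEST FAILED at {time.ctime()}: {exc}")
    time.sleep(CHECK_INTERVAL_SECONDS)
```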
#9 Canonicalization Issues
Incorrect implementation of canonical tags or conflicting canonical directives can confuse search engines, leading to improper indexing. Regularly audit canonical tags and resolve any inconsistencies to improve indexing accuracy.
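To audit canonicals at scale, fetch each page and compare its canonical tag against the expected URL. A minimal sketch using requests and BeautifulSoup, with hypothetical URLs:

```python
import requests                 # pip install requests
from bs4 import BeautifulSoup   # pip install beautifulsoup4

# Hypothetical list of product URLs to audit.
urls = [
    "https://example.com/products/widget-a",
    "https://example.com/products/widget-a?ref=homepage",
]

for url in urls:
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    tag = soup.find("link", rel="canonical")
    canonical = tag["href"] if tag else None
    if canonical is None:
        print(f"NO CANONICAL: {url}")
    elif canonical != url:
        # Expected for parameterized variants; a mismatch on the clean
        # URL itself is the inconsistency worth investigating.
        print(f"CANONICAL MISMATCH: {url} -> {canonical}")
    else:
        print(f"OK: {url}")
```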
#10 URL Redirect Chains and Redirect Loops
Redirect chains or loops within your URL structure can impede Googlebot’s ability to index pages effectively. Simplify your URL redirects and ensure they lead to the final destination without unnecessary hops.
Even worse are redirect loops, which disrupt indexing and user experience by trapping search engine crawlers and frustrating visitors. Regularly audit your redirection structure to eliminate unnecessary redirects and break any loops; resolving them improves crawl efficiency and ensures a smoother user journey on your website.
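You can trace chains and loops hop by hop by disabling automatic redirect following. A minimal sketch using requests, with a placeholder starting URL:

```python
from urllib.parse import urljoin

import requests  # third-party: pip install requests

def trace_redirects(url: str, max_hops: int = 10) -> None:
    """Follow redirects one hop at a time, reporting chains and loops."""
    seen = set()
    hops = 0
    while hops < max_hops:
        if url in seen:
            print(f"LOOP detected at: {url}")
            return
        seen.add(url)
        resp = requests.get(url, allow_redirects=False, timeout=10)
        if resp.status_code in (301, 302, 303, 307, 308):
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, resp.headers["Location"])
            hops += 1
        else:
            if hops > 1:
                print(f"CHAIN: {hops} hops before reaching {url} ({resp.status_code})")
            else:
                print(f"OK: {url} resolved in {hops} hop(s) ({resp.status_code})")
            return
    print(f"Gave up after {max_hops} hops (possible long chain): {url}")

trace_redirects("https://example.com/old-product-url")  # placeholder URL
```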
#11 Poor Internal Linking
Inadequate internal linking can hinder Googlebot’s ability to discover and index new pages. Implement a robust internal linking strategy to guide search engine crawlers through your site efficiently.
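To get a rough picture of how link equity flows, you can count internal links from a sample of pages. A minimal sketch with hypothetical seed URLs (a full audit would crawl the entire site); pages that never appear in the tally receive no internal links from the sampled pages and will be hard for crawlers to discover.

```python
from collections import Counter
from urllib.parse import urljoin, urlparse

import requests                 # pip install requests
from bs4 import BeautifulSoup   # pip install beautifulsoup4

# Hypothetical seed pages to sample.
seed_pages = [
    "https://example.com/",
    "https://example.com/category/widgets",
]
SITE = "example.com"
inlinks = Counter()

for page in seed_pages:
    soup = BeautifulSoup(requests.get(page, timeout=10).text, "html.parser")
    for a in soup.find_all("a", href=True):
        target = urljoin(page, a["href"])
        if urlparse(target).netloc == SITE:  # keep internal links only
            inlinks[target] += 1

# Pages near the bottom of this list receive the least internal linking.
for url, count in inlinks.most_common():
    print(f"{count:4d}  {url}")
```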
Bottom line?
Investigate and fix these issues immediately.
By addressing factors such as crawl efficiency, duplicate content, technical errors, server downtime, and canonicalization, you improve search engine visibility and user experience. Prioritizing these fixes makes your product pages more discoverable and builds a stronger online presence for your e-commerce business.