Breadcrumbs, Categories, and SEO at Scale

What breaks when your catalog has a million products

If you’re running a large catalog, category structure stops being a simple organizational task and becomes a technical SEO problem. This came up recently in a discussion about cleaning up canonical categories for a catalog with over a million products. The core question was straightforward:

How do I figure out what my canonical category should be, and what should be a breadcrumb?

The role of canonical categories

In Miva, a canonical category is structural. It defines where a product lives in the hierarchy and is directly tied to things like:

  • Breadcrumbs
  • Internal linking
  • Crawl paths for search engines
  • Building authority for your pages, links, and categories

So when you assign a canonical category, you’re not just organizing products. You’re defining how both users and search engines move through your site. Before technical SEO became so critical to a site’s success, back when keywords were all you had to focus on, you might have assigned the canonical category as either the highest-level category a product belongs to or the deepest category in which it can be found:

Home > Car Parts

Vs.

Home > Car Parts > Toyota > Exterior Accessories > Windshield Wipers

Choosing that canonical category requires a little more strategy now. It’s entirely possible that neither the broadest nor the most specific category is the answer. The breadcrumb should end at the deepest page in the path that is still valuable.

If your Windshield Wipers category only has three products, it’s a low-value page and not worth the crawl budget spent reaching it. But if the Exterior Accessories category is more robust, has more products, and is generally a higher-value page, then that’s where your breadcrumb should land.
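As a rough sketch of that rule, the snippet below walks a category path from the deepest level upward and stops at the first category that holds enough products. The MIN_PRODUCTS threshold, the pick_breadcrumb_end function, and the product counts are all hypothetical, and product count is only one possible proxy for how valuable a page is.

```python
# Hypothetical threshold: the fewest products a category needs
# before it is worth ending a breadcrumb on it.
MIN_PRODUCTS = 10


def pick_breadcrumb_end(path, product_counts, min_products=MIN_PRODUCTS):
    """Trim a category path to the deepest category that meets the threshold.

    path: category names ordered from broadest to most specific.
    product_counts: mapping of category name -> number of products in it.
    """
    # Walk from the most specific category back toward the root.
    for depth in range(len(path), 0, -1):
        category = path[depth - 1]
        if product_counts.get(category, 0) >= min_products:
            return path[:depth]
    # No category qualifies: fall back to the broadest one.
    return path[:1]


# Illustrative counts only, not real catalog data.
counts = {
    "Car Parts": 250000,
    "Toyota": 18000,
    "Exterior Accessories": 420,
    "Windshield Wipers": 3,
}
path = ["Car Parts", "Toyota", "Exterior Accessories", "Windshield Wipers"]
trail = pick_breadcrumb_end(path, counts)
print(" > ".join(["Home"] + trail))
# prints: Home > Car Parts > Toyota > Exterior Accessories
```

In practice you would likely combine product count with other signals, such as search demand or existing traffic, before deciding where a breadcrumb should stop.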

Breadcrumbs are not just UI

Breadcrumbs are often treated as a visual feature. They help shoppers understand where they are, and navigate their way back up the chain. But from an SEO standpoint, they’re internal links that carry serious weight. 

When a breadcrumb exists, it’s effectively saying “This product is from this category, and this category is important.”

Search engine crawlers will follow those links. They pass authority through them. They use them to understand the site structure. But that only happens if the destination actually exists. 

The problem with large product catalogs is that this can lead to a LOT of category creation and maintenance.

You can’t point to breadcrumb pages that don’t exist

You may be tempted to assign specific, detailed canonical categories without actually creating those categories in your system. The idea is to flag each product with a very detailed and nuanced canonical. But because a catalog that large would end up with thousands of subcategories (a ton of bloat), you never create the actual category pages.

Unfortunately, that won’t work. If the category doesn’t exist as a real, indexable page, the breadcrumb link has nowhere meaningful to go. At best, it’s a dead end. At worst, it actively works against you. No page means no value, and a wasted crawl. 
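One way to catch this before it reaches production is to compare the canonical category assigned to each product against the list of categories that actually exist. The sketch below assumes you can export both lists from your store; the find_missing_canonicals function, the product codes, and the category codes are hypothetical and do not reflect any particular Miva export format.

```python
def find_missing_canonicals(product_canonicals, existing_categories):
    """Return {product_code: canonical_code} for canonicals with no real category.

    product_canonicals: mapping of product code -> assigned canonical category code.
    existing_categories: iterable of category codes that exist as real pages.
    """
    existing = set(existing_categories)
    return {
        product: canonical
        for product, canonical in product_canonicals.items()
        if canonical not in existing
    }


# Hypothetical export data, for illustration only.
existing_categories = ["car-parts", "toyota", "exterior-accessories"]
product_canonicals = {
    "WIPER-001": "exterior-accessories",
    "WIPER-002": "wiper-blades-and-arms",  # assigned, but never created
}

missing = find_missing_canonicals(product_canonicals, existing_categories)
for product, canonical in sorted(missing.items()):
    print(f"{product}: canonical category '{canonical}' has no page")
```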

You can’t create the pages and just not index them

Let’s say you do create all the necessary subcategories:

Car Parts > Toyota > External Accessories > Windshield and Rear Window Accessories > Wiper Blades and Arms

But instead of indexing all of them, you set the lower-level pages to noindex.

On the surface, that feels like a reasonable compromise. You get structure and live links without having to manage thousands of category pages that need to be optimized, filled in with on-page SEO, populated with multiple products, etc.

The problem is what happens when a crawler hits that structure. When a search engine bot lands on a product page, it follows the breadcrumb links. If those breadcrumbs point to noindex pages, the crawler will:

  • Follow the link
  • Load the page
  • See the noindex directive
  • Drop it from consideration

That might not sound like a big deal once. But scale it. If this happens across tens of thousands of products, you’re forcing crawlers to repeatedly spend time on pages that you’ve explicitly told them not to index.

That does two things:

  • Wastes crawl budget
  • Breaks the flow of link authority

You’re essentially building a structure that leads search engines into dead ends, over and over again.
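To find breadcrumb targets that carry a noindex directive, you can scan the HTML of each linked category page for a robots meta tag. The sketch below uses only Python’s standard library and works on HTML you have already fetched; the URLs and page bodies are made up for illustration, and it does not check the X-Robots-Tag HTTP header, which can also carry noindex.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a page's robots meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            directives = [d.strip().lower() for d in content.split(",")]
            # "none" is shorthand for "noindex, nofollow".
            if "noindex" in directives or "none" in directives:
                self.noindex = True


def is_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def noindex_breadcrumb_targets(pages):
    """pages: mapping of breadcrumb URL -> HTML. Return the URLs flagged noindex."""
    return [url for url, html in pages.items() if is_noindex(html)]


# Hypothetical pages, for illustration only.
pages = {
    "https://example.com/exterior-accessories":
        '<html><head><meta name="robots" content="index, follow"></head></html>',
    "https://example.com/wiper-blades-and-arms":
        '<html><head><meta name="robots" content="noindex, follow"></head></html>',
}
flagged = noindex_breadcrumb_targets(pages)
print(flagged)
# prints: ['https://example.com/wiper-blades-and-arms']
```

Any URL this flags is a breadcrumb link worth either removing or turning into a real, indexable category page.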

When deeper categories actually make sense

Now you know what you can’t do – so what can you do? There’s nothing wrong with building a deeper category structure. In fact, for large catalogs, it’s often necessary.

But there’s a practical filter:

If a category is important enough to include in a breadcrumb, it should be important enough to exist as a real page.

That means:

  • It has enough products to be useful
  • It represents a meaningful grouping
  • It has some potential to capture search traffic, even if it’s long-tail

Not every subcategory meets that bar.

A better approach for large catalogs

When you’re dealing with thousands, hundreds of thousands, or millions of products, you can’t treat every possible category as equal. Search engines give your site a limited crawl budget. Treat every category as equal and you’ll burn that budget on unimportant pages, and you’ll lose both yourself and your shoppers in a maze of subcategories.

Instead:

  • Build out core categories that actually matter
  • Let breadcrumbs stop at a meaningful level
  • Avoid linking to low-value or thin subcategories

Look again at our example from before:

Car Parts > Toyota > External Accessories > Windshield and Rear Window Accessories > Wiper Blades and Arms

Vs.

Home > Car Parts > Toyota > Exterior Accessories > Windshield Wipers

Or even:

Home > Car Parts > Toyota > Exterior Accessories

If “Wiper Blades and Arms” doesn’t justify its own page, don’t force it into the structure.

The underlying issue: scale vs manageability

The root concern here wasn’t SEO theory. It was operational. Managing thousands of category pages is difficult:

  • Content needs to be created
  • Pages need to be optimized
  • Structure needs to stay consistent

That’s a real constraint. But trying to avoid that work by creating “invisible” or non-functional categories creates bigger problems down the line. Instead, scale your breadcrumbs back to the last landing page that you’d pay to have people view, because that is essentially what you’re doing: every page a crawler visits is paid for out of your crawl budget.

The takeaway

If you’re restructuring categories at scale, the rules are pretty simple:

  • Don’t assign canonical categories that don’t exist
  • Don’t breadcrumb to noindex pages
  • Don’t create deep structures you’re not willing to support

And most importantly, if a category is important enough to be part of your site structure, it needs to function as a real page.

Otherwise, you’re not building structure. You’re creating noise that both users and search engines have to work around.
