Indexed but Blocked by Robots.txt

I have an issue with a large e-commerce site: millions (yes, millions!) of long URL strings are being reported by Search Console as "Indexed but blocked by robots.txt".

These strings contain thousands of possible combinations of filterable attributes from our Amasty Improved Navigation bar. The attributes are set up as URL parameters in Search Console, but the URLs are still being indexed. We want the core category and brand pages to be indexed, but not the versions with filters applied; the filters exist only for navigation and UX.
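
To make the distinction concrete, here is a minimal Python sketch that separates clean category or brand URLs from filtered ones, for example when auditing a URL export from Search Console. The parameter names in `FILTER_PARAMS`, the `example.com` URLs, and both helper functions are hypothetical placeholders for illustration, not the actual Amasty configuration.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical filter attribute names; replace them with the
# parameters the navigation bar actually appends to URLs.
FILTER_PARAMS = {"color", "size", "price", "material"}


def is_filtered_url(url, filter_params=FILTER_PARAMS):
    """Return True if the query string contains any filter parameter."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?color=" still counts as a filter.
    names = {name.lower() for name, _ in parse_qsl(query, keep_blank_values=True)}
    return bool(names & set(filter_params))


def split_urls(urls):
    """Partition URLs into (clean, filtered) lists, preserving order."""
    clean, filtered = [], []
    for url in urls:
        (filtered if is_filtered_url(url) else clean).append(url)
    return clean, filtered


if __name__ == "__main__":
    sample = [
        "https://example.com/shoes",
        "https://example.com/shoes?color=red&size=9",
        "https://example.com/brands/acme?page=2",
    ]
    clean, filtered = split_urls(sample)
    print("Should be indexable:", clean)
    print("Filter variants:", filtered)
```

Under these assumptions, only URLs carrying at least one of the listed parameters are treated as filter variants; a parameter such as `page` is left in the clean group.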

I'm at a loss as to why this is happening, and any advice on how to resolve it would be much appreciated!

submitted by /u/RichieHermetic

from Search Engine Optimization: The Latest SEO News https://ift.tt/315KN5C
