
How to deal with data-sheet and product PDFs which are repeated over the web?

I'd like to ask for opinions on the proper way to handle the following situation. A company (acme.com) sells and implements a certain kind of gadget. Each product has its own dedicated HTML product page, but these pages do not provide deep technical information. For the tech-savvy, the website also hosts data sheets and vendor-provided PDF documentation about the products, linked from the HTML product pages for easy access. Altogether, this setup seems useful for real users.

Now for the SEO side: since these PDFs are mostly provided by the manufacturers, they also appear elsewhere on the web. Google detects many of them as duplicate content, and the "Google-selected canonical" is mostly the same PDF on someone else's website!

Question 1: Does this mean that link juice is leaked to those other random websites, even though acme.com is linking to the PDFs under its own domain?

Question 2: If the answer to Q1 is yes, would adding rel="nofollow" to the links pointing to the PDFs fix the issue?

To make things more complicated, when I check the Links section in Google Search Console, I can see that acme.com apparently has loads of external links, but on closer inspection most of them point to identical copies of these PDFs, mostly on other sites.

Question 3: Is it a bug in Google Search Console that it shows links to PDFs on other websites as incoming external links to acme.com?

Looking further, I can also find some PDFs where the Google-selected canonical seems to be the one on acme.com, even though many other sites link to duplicates of the same PDFs.

Question 4: Does that mean that with those PDFs acme.com is stealing link juice from other sites?

Question 5: If the answer to Q4 is yes, does that mean acme.com should examine the PDFs one by one and add rel="nofollow" (or prevent indexing in some other way, as per Q2) to the links to PDFs whose canonical is on another domain, while keeping normal links to the PDFs for which acme.com holds the canonical version?
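
For illustration only, the per-PDF decision described in Question 5 could be sketched roughly as below. It assumes the Google-selected canonical for each PDF has already been looked up by hand (for example in Search Console); the function names and example URLs are hypothetical, and the sketch says nothing about whether this approach is actually advisable.

```python
from html import escape
from urllib.parse import urlparse

OWN_DOMAIN = "acme.com"  # the example domain used in the question


def canonical_is_ours(google_canonical_url: str, own_domain: str = OWN_DOMAIN) -> bool:
    """Return True if the Google-selected canonical is on our domain or a subdomain."""
    host = (urlparse(google_canonical_url).hostname or "").lower()
    return host == own_domain or host.endswith("." + own_domain)


def render_pdf_link(href: str, text: str, google_canonical_url: str) -> str:
    """Build the <a> tag for a PDF, adding rel="nofollow" only when
    the canonical copy lives on someone else's domain (the Q5 idea)."""
    rel_attr = "" if canonical_is_ours(google_canonical_url) else ' rel="nofollow"'
    return f'<a href="{escape(href, quote=True)}"{rel_attr}>{escape(text)}</a>'


if __name__ == "__main__":
    # Hypothetical PDFs: one whose canonical is ours, one whose canonical is elsewhere.
    print(render_pdf_link("/docs/a.pdf", "Data sheet A", "https://www.acme.com/docs/a.pdf"))
    print(render_pdf_link("/docs/b.pdf", "Data sheet B", "https://vendor.example/b.pdf"))
```

The subdomain check compares against "." + domain on purpose, so a look-alike host such as notacme.com is not mistaken for acme.com.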

I know it is a complex topic. Your insights would be much appreciated! Thank you!

submitted by /u/GM8

from Search Engine Optimization: The Latest SEO News https://ift.tt/2UZCTfF
