I think I have unearthed another piece of the SEO puzzle in XenForo forum software. I was browsing Google Search Console (formerly Google Webmaster Tools) and clicked into the Coverage area. I clicked the Excluded tab up top and then the Excluded by Noindex Tag link in the description below. I noticed that my site had thousands of “What’s New” pages that had been crawled but, because they contained the noindex tag, weren’t put into Google’s index. Now, I’ll tell you from my more than 20 years of experience working on the internet and in SEO that having all these noindex pages is not good. Even though they carry the noindex tag, they still accumulate and bleed PageRank all over the place. I’ve battled with this type of thing for ages.
It seems as though the numeric ID at the end of the URL is the only thing that changes. I’m not sure if the IDs are based on sessions or what. All the pages are actually duplicates of one another. For instance, here are two sample URLs:
https://www.mysite.com/forum/whats-new/posts/20770/
https://www.mysite.com/forum/whats-new/posts/20771/
…and so on.
For a while, I’ve wondered where all these pages were coming from and where they were linked from. Then I clicked the small link in the “Latest Posts” widget that I have on the homepage. Actually, I had this widget showing on almost all pages until a few days ago. Even though the link to the What’s New page has a nofollow attribute on it, Google is still following it, and because the destination changes around the clock, each crawl creates yet another duplicate page. By the way, I’m up over 17,000 of these pages in Search Console now. It’s getting out of hand.
I believe I have found a solution to this issue. Since the link on the homepage links to:
https://www.mysite.com/forum/whats-new/posts/?skip=1
and then redirects to one of the numbered URLs that I displayed above, all I did was block the /whats-new/ directory in the robots.txt file. I wouldn’t have done this if there were thousands of different links pointing to all these various pages, but since they all stem from a single link, I think this is okay. It’ll take months for Google to see that these pages are now blocked and to drop its references to them, but I think things will be okay after that. This directory is even blocked on the XenForo site itself.
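For anyone who wants to try the same thing, here is a rough sketch of what the rule might look like, along with a quick way to sanity-check it before waiting on Google. The exact Disallow path is an assumption based on my forum living under /forum/, so adjust it to wherever your install sits. The thread URL in the list is made up purely for comparison, and the is_blocked helper is just a name I picked. The check uses Python’s standard urllib.robotparser module, which does simple prefix matching and may not behave exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents: the path prefix is a guess based on the
# sample URLs in this post (forum installed under /forum/).
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/whats-new/
"""


def is_blocked(url, user_agent="Googlebot"):
    """Return True if the rules above disallow this URL for the crawler."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    urls = [
        # The widget link and two of the numbered pages it redirects to.
        "https://www.mysite.com/forum/whats-new/posts/?skip=1",
        "https://www.mysite.com/forum/whats-new/posts/20770/",
        "https://www.mysite.com/forum/whats-new/posts/20771/",
        # Hypothetical thread URL, included only to show normal pages stay crawlable.
        "https://www.mysite.com/forum/threads/example-thread.1/",
    ]
    for url in urls:
        print("blocked" if is_blocked(url) else "allowed", url)
```

If the three What’s New URLs come back as blocked and the thread URL as allowed, the rule is at least doing what I intended on paper.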
Have you noticed something like this happening on your site? Please let me know. What did you do about it?