Automated no-index of light pages (zero replies, thin content)

btw, the issue with my forums and no-indexing zero-reply threads is that I have a ton of FAQs and guides that are only one post yet are very content-rich and useful.

You can whitelist authors, so that might be your workaround.
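
If this is XenForo, one way to get that workaround is a template conditional in thread_view that only emits the noindex tag for zero-reply threads whose author isn't on your list. This is just a rough sketch, assuming XenForo 2 template syntax; the user IDs are placeholders for your whitelisted FAQ/guide authors:

<xf:if is="$thread.reply_count == 0 AND !in_array($thread.user_id, [1, 2, 3])">
    <meta name="robots" content="noindex" />
</xf:if>

Whitelisting authors rather than individual threads means the rule keeps working as those members post new guides.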
 
It's important to note that it's not just Panda we are dealing with. My site Physics Forums was hit by the same algorithm change that hit MetaFilter three years ago. Google denied that a UGC filter existed until last year, when they finally admitted it. UGC is definitely under attack.

I'm curious how this turned out. Has the meta noindex tag been working well for you? Did your rankings recover after your site was hit? If so, what do you think contributed to the recovery? Thanks ahead of time.

Jay

I've tried many things over the past 5 years with no luck.

I feel your pain. When it comes to this software, I see two huge issues. The first is the /whats-new/ directory. It needs to be blocked in the robots.txt file. If you take a look in your Google Search Console, you should see thousands of pages in this directory that have been "excluded by the 'noindex' tag." On one of my sites, Google crawled and marked 19,000 What's New pages, and I hadn't even been running the software for six months. Just because a page is marked noindex doesn't mean it's out of Google's index, and it's still sucking up PageRank. I'm very skeptical of the noindex tag and feel it's been heavily misrepresented out there on the internet.

As you probably already know, the URLs in the /whats-new/ directory increment to coincide with user sessions, or to phrase that better, with each visitor who views those pages. And unfortunately, it seems that bots count as visitors too. The pages are effectively infinite. That's definitely not good for SEO. It's best to block that directory.
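
For what it's worth, the block itself is only two lines in robots.txt (this assumes the forum runs at the site root; prefix the path if it's installed in a subdirectory):

User-agent: *
Disallow: /whats-new/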

If you ever do decide to block that directory, be prepared for your "Blocked by robots.txt" metric to go sky high. Also be prepared for your rankings to drop some. The minute you block PageRank from flowing to all those useless pages in the /whats-new/ directory, your entire site will be reevaluated by Google. If you've got hundreds of thousands of those pages, it's going to hurt. It'll also take months, maybe even a year, for those pages to clear out of the index. They will eventually clear out, though, and your rankings will return and most likely climb above where they are today.

The second issue surrounds the image pages that we discussed in another post. In that post, I mentioned that I was going to block those pages in the robots.txt file. I've decided not to block them, but to set the permissions so guests (including Googlebot) can't see them. This way, when Googlebot comes by to visit those image pages, it'll be met with a 403 status code and drop the page from the index. Blocking them in the robots.txt file would keep those URLs in the index indefinitely; it's better to get rid of them completely. This will take a while, especially if you've got tens of thousands of those image pages. We can talk about the /members/ directory in another post if you'd like. Ultimately, though, that needs to be blocked as well, preferably with the permission system. Let me know if this helps at all.
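
A quick way to confirm what Googlebot will actually see is to request one of those image pages while logged out, for example with curl (the URL below is just a placeholder; substitute one of your real image-page URLs):

curl -I https://www.example.com/media/some-image.123/

If the permissions are set correctly, the first line of the response should read HTTP/1.1 403 Forbidden.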
 