Automated no-index of light pages (zero replies, thin content)

cmeinck

Well-known member
Google seems intent on pushing UGC (user-generated content) down in the search results. There are some active conversations in their product forums where Google reps have suggested that sites de-index UGC that is light (zero replies, thin content). As a forum admin, this sounds sort of insane. With 88k discussions and almost 1 million messages, this is impossible to do without some level of automation. It would also have to tie in with whatever sitemap tool you are using.

Google's algorithms could change for the better, but I'm not so sure. I think as forum admins, we need to spoon-feed our best content and leave the thin stuff as no-index. I'd welcome ideas on how to best accomplish this going forward with XenForo.

Source: Discussion on Google's Product Forums
 
Stick with what works:

http://en.wikipedia.org/wiki/PageRank

Focus on promoting your highest quality content. As a hypothetical example, if you were running a tech forum you might have a separate forum with technical guides. Those guides are moderated and approved. Then you can go out posting links to your guides on other highly ranked sites with relevant content. That kind of exposure is what increases your page rank.
 
I disallow indexing on thin content forums, such as introduction forums. Also thinking about doing it on the offtopic forum, but from time to time there are interesting discussions there, so I have not yet decided on that.
 
Stick with what works:

http://en.wikipedia.org/wiki/PageRank

Focus on promoting your highest quality content. As a hypothetical example, if you were running a tech forum you might have a separate forum with technical guides. Those guides are moderated and approved. Then you can go out posting links to your guides on other highly ranked sites with relevant content. That kind of exposure is what increases your page rank.


I agree with this, but what Google's asking for now is to not waste their time with thin content. Let's take my site for example. I've got 88k threads. I'm sure that some percentage are thin, low-quality posts. Google's algo now takes thin content into account. If there were a way for me to set no-index on threads with only one post, it would automate the process and give me a better chance of succeeding. The point is that lower-quality, thin content could drag down your great threads.

I'd love to find ways of customizing the sitemap tools we have (created by 3rd parties) to remove these from the sitemaps.

My personal opinion is that we can no longer trust that Google will understand that forums have some thin content, along with robust discussion. I think we'll continue to see forums getting squeezed by these algos.

Daniweb is a perfect example.
 
I disallow indexing on thin content forums, such as introduction forums. Also thinking about doing it on the offtopic forum, but from time to time there are interesting discussions there, so I have not yet decided on that.

Are you using robots.txt or is there a way to set threads within those forums to no-index?
 
So you can use that for a specific type of page (e.g. Members), but not a specific forum category such as Off Topic, because that shares the same template as indexed forums?

You would have to check the node_id when adding that to thread_view.
 
Let's say I want threads in nodes 81 and 80 to be set to noindex using the code you provided here.

Code:
<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

How would you add this snippet to thread_view so that it appears in the head?
 
Code:
<xen:if is="in_array({$thread.node_id}, array(80,81))">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

</xen:if>

I've tried adding it to thread_view and receive the following error.

Screen Shot 2013-07-22 at 9.31.12 AM.webp

I'm grabbing the nodes from view-source and this code I've found:
Screen Shot 2013-07-22 at 9.32.46 AM.webp

Here's what I was using:

Code:
<xen:if is="in_array({$thread.node_id}, array(83,80))">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>
</xen:if>
 
Use this:

Code:
<xen:if is="in_array({$thread.node_id}, array(80,81))">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

</xen:if>

It's the same thing, minus an invisible character that somehow crept in there.
 
Thanks @Jake Bunce. This is invaluable to me. Such an awesome fix to help me combat Google's Panda. Between this thread_view code and member_view, I've effectively removed a ton of non-relevant content from Google's index. (y)(y)

Hoping one of the sitemap devs comes up with an option to not include threads from selected forum categories. I'm submitting Off Topic to Google in my sitemap and then telling them not to index it.
 
Is there a way to do this for a specific forum category? I've been able to tackle the threads. For example, all threads within Off Topic are now set to noindex. However, the category itself is not set to noindex. Setting the category to noindex would complete the job. Would this work if I used it on the forum_view template?

Code:
<xen:if is="in_array({$thread.node_id})">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

</xen:if>
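
A minimal sketch of how the same check might look in forum_view, offered as an assumption rather than a tested fix: in_array() needs both a value and a list to search, and forum_view is assumed here to expose the current node as $forum rather than $thread. The node IDs 80 and 81 are just the examples used earlier in this thread.

Code:
<xen:comment>Assumes forum_view provides $forum; 80 and 81 are example node IDs</xen:comment>
<xen:if is="in_array({$forum.node_id}, array(80,81))">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

</xen:if>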
 
A couple of questions regarding this topic. Is it possible to have something in place that would automate the addition of a noindex tag to threads that have zero replies and are X number of days old? If so, whom could I contact about creating the custom code required to make an add-on or script for this purpose? It should also work in reverse: if an older thread receives a reply, then the noindex tag should be removed, allowing the thread to be indexed in Google.

I'd gladly pay for this to happen.
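
One possible way to approximate this without an add-on, as a sketch under assumptions: the thread_view conditional shown earlier could test the thread's reply count and age instead of its node. This assumes $thread.reply_count, $thread.post_date and $serverTime are available in thread_view and that the condition syntax accepts subtraction. The 30-day cutoff (2592000 seconds) is an arbitrary example value.

Code:
<xen:comment>Assumption: noindex threads with zero replies that are older than 30 days (2592000 seconds)</xen:comment>
<xen:if is="{$thread.reply_count} == 0 AND {$serverTime} - {$thread.post_date} > 2592000">

<xen:container var="$head.robots">
    <meta name="robots" content="noindex" /></xen:container>

</xen:if>

Because the condition would be evaluated each time the page is rendered, the tag should drop out on its own once the thread gets a reply. It does not remove the thread from the sitemap.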
 
Do we know that there's any real benefit from no-indexing thin content? So having thin content hurts quality content on the same url?
 
Thin content is one of the many factors which can lead to a Panda penalty. Based on my research into the topic, there are forums which have been penalized, but have come back by noindexing zero reply content.

I'm just hoping that a. this is possible and b. someone will take me up on the offer.
 