XF 2.0 Noindex members pages

shanew

Active member

WoodiE

Well-known member
Check this https://forumweb.hosting/robots.txt
They use a robots.txt file to prevent spiders from crawling member profile pages. This method is even stronger than using a noindex tag on member pages.
Per Google this is not the preferred method.

Warning
Pages with a warning status might require your attention, and may or may not have been indexed, according to the specific result.

Indexed, though blocked by robots.txt: The page was indexed, despite being blocked by robots.txt (Google always respects robots.txt, but this doesn't help if someone else links to it). This is marked as a warning because we're not sure if you intended to block the page from search results. If you do want to block this page, robots.txt is not the correct mechanism to avoid being indexed. To avoid being indexed you should either use 'noindex' or prohibit anonymous access to the page using auth. You can use the robots.txt tester to determine which rule is blocking this page. Because of the robots.txt, any snippet shown for the page will probably be sub-optimal. If you do not want to block this page, update your robots.txt file to unblock your page.
https://support.google.com/webmasters/answer/7440203#crawl-error
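The help text above separates crawling (robots.txt) from indexing (noindex): if a page is disallowed in robots.txt, the crawler never fetches it and so never sees a noindex tag on it. A minimal sketch of checking this locally, using Python's standard `urllib.robotparser`. The robots.txt rules, the `/members/` path, and the helper name `crawler_can_see_noindex` are assumptions for illustration, not taken from any site in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the /members/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
"""


def crawler_can_see_noindex(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if robots.txt lets the crawler fetch the URL.

    Only a page that can be fetched can have its noindex tag read.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/members/alice.1/",
        "https://example.com/threads/hello.2/",
    ):
        print(url, crawler_can_see_noindex(ROBOTS_TXT, url))
```

If this returns False for a member page, a noindex meta tag on that page would probably never be read, so it makes sense to pick one mechanism or the other rather than combining them.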
 

Chromaniac

Well-known member
Code:
<xf:if is="$__globals.template == 'member_view' OR $__globals.template == 'member_about' OR $__globals.template == 'member_latest_activity' OR $__globals.template == 'member_recent_content'">
<meta name="robots" content="noindex, nofollow">
</xf:if>

would this work?
 

Did you try this? I'm trying to achieve the exact same thing.

Also, how did you find the names of the templates, like "member_view" and "member_latest_activity"? Are they the same as the names in the dashboard when you go to Appearance > Templates?

If you look here: https://xenforo.com/community/resources/conditional-statements-for-xenforo-2.5795/ they use $template in their <xf:if is="..."> conditions, as opposed to $__globals.template. Where did you come across that? I'm thinking it's not necessary.
 

Chromaniac

Well-known member
yeah, it's working on my forum (broadbandforum.co).
found a bookmarklet here. though template names are also mentioned in top of source code iirc.
i did try without $__globals first but it did not work for me. i have seen this happen in the past. several times i have ended up with code that just does not work in the PAGE_CONTAINER template... i found a reference to $__globals in some response on the forum. tried it and the code started working somehow. could be related to the theme i am using. no idea really. it is pretty much trial and error for me!
 

Mr Lucky

Well-known member
<meta name="robots" content="noindex">

This might be useful:


@Rivmedia can probably help more.

Obviously you could add stuff to robots.txt but unless you can also exclude from Sitemap you would probably get Google whingeing at you.
 

Mr Lucky

Well-known member
It's £75 for the whole suite of several SEO addons, but if all you need is that one thing then yes.

More SEO stuff would be useful in the core of xenforo, I think there are several suggestions around.
 

Rivmedia

Member
Thanks for the tag @Mr Lucky, unfortunately we've decided to keep all of our SEO-related addons, both the listed and non-listed suites, for our clients only and are no longer selling them either individually or as a bundle.
 

djbaxter

Well-known member

“In the interest of maintaining a healthy ecosystem and preparing for potential future open source releases, we’re retiring all code that handles unsupported and unpublished rules (such as noindex) on September 1, 2019. For those of you who relied on the noindex indexing directive in the robots.txt file, which controls crawling, there are a number of alternative options,” the company said.

What are the alternatives? Google listed the following options, the ones you probably should have been using anyway:

(1) Noindex in robots meta tags: Supported both in the HTTP response headers and in HTML, the noindex directive is the most effective way to remove URLs from the index when crawling is allowed.
(2) 404 and 410 HTTP status codes: Both status codes mean that the page does not exist, which will drop such URLs from Google’s index once they’re crawled and processed.
(3) Password protection: Unless markup is used to indicate subscription or paywalled content, hiding a page behind a login will generally remove it from Google’s index.
(4) Disallow in robots.txt: Search engines can only index pages that they know about, so blocking the page from being crawled often means its content won’t be indexed. While the search engine may also index a URL based on links from other pages, without seeing the content itself, we aim to make such pages less visible in the future.
(5) Search Console Remove URL tool: The tool is a quick and easy method to remove a URL temporarily from Google’s search results.
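Option (1) says noindex can be delivered either in the HTTP response headers or in the HTML. A rough sketch of checking a page for both, using only the Python standard library. The function name `is_noindexed` and the sample inputs are hypothetical, and the parsing is simplified: it ignores user-agent-specific forms such as `googlebot: noindex` in the `X-Robots-Tag` header and bot-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def is_noindexed(headers: dict, html: str) -> bool:
    """Return True if the header or the HTML carries a noindex directive."""
    header_directives = set()
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            for part in value.split(","):
                if part.strip():
                    header_directives.add(part.strip().lower())

    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
    print(is_noindexed({}, page))
    print(is_noindexed({"X-Robots-Tag": "noindex"}, "<html></html>"))
```

Running something like this against the saved source and headers of a member profile page is one way to confirm that a template edit such as the one earlier in the thread actually emits the tag.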
 