XF 1.4 How to control what gets indexed?

Pavle123

Active member
Hello,

How can I control what pages of my XenForo get indexed in search engines?

I obviously do not want the FAQ section, member pages and low-quality "introduce yourself" type posts to be visible to search engines. It does not benefit anyone.

Do I need some sort of SEO plugin for that? I would guess this sort of thing should be built into the software?

I found this as well: https://xenforo.com/community/resources/*******-advanced-noindex.3580/ Is it any good, given that it has no reviews?

By the way, I am amazed how fast my site got indexed with XenForo; it took less than 48 hours.
 
You should use robots.txt whenever possible. XenForo doesn't ship a robots.txt by default at the moment, idk why o_O

First create a robots.txt file in the root of the forum (the directory where index.php lives) and then put this inside it:
Code:
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /conversations/
Disallow: /members/
Disallow: /online/
Disallow: /recent-activity/
Disallow: /search/
Disallow: /admin.php
Disallow: /proxy.php
Disallow: /help/*
Disallow: /misc/contact
Allow: /

Host: www.rt-networks.com
Sitemap: http://www.rt-networks.com/sitemap.php
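If you want to check rules like these before uploading them, here is a rough sketch using Python's standard urllib.robotparser. The is_allowed helper, the shortened rule list and the www.example.com domain are made up for illustration. Python's parser does plain prefix matching and treats * in a path literally, so the /help/* line is left out; real crawlers may interpret some rules differently.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above. www.example.com is a placeholder domain.
ROBOTS_TXT = """\
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /members/
Disallow: /search/
Disallow: /admin.php
Allow: /
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if `agent` may crawl `path` according to `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    for path in ("/members/pavle123.1/", "/search/", "/threads/some-thread.1/", "/"):
        print(path, "->", "allowed" if is_allowed(path) else "blocked")
```

Member pages and search come back blocked, while threads and the forum index stay crawlable.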

If you're using AdSense, use this code instead:
Code:
User-agent: Mediapartners-Google*
Disallow:

User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /conversations/
Disallow: /members/
Disallow: /online/
Disallow: /recent-activity/
Disallow: /search/
Disallow: /admin.php
Disallow: /proxy.php
Disallow: /help/*
Disallow: /misc/contact
Allow: /

Host: www.rt-networks.com
Sitemap: http://www.rt-networks.com/sitemap.php
Be sure to replace www.rt-networks.com with your own domain. If your forum is installed in a subdirectory, something like www.example.com/community, then chances are you're running a CMS or similar in the site root and a robots.txt already exists there. In that case, add the rules from the code block above to the end of that existing robots.txt. Then, in the lines you added, replace every occurrence of
Code:
Disallow: /
with
Code:
Disallow: /community/
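For a long rule list, that find-and-replace can be scripted. This is only a sketch: prefix_disallow_rules is a made-up helper name and /community is just the example subdirectory from this post. It touches Disallow lines only and leaves everything else (User-agent, Allow, Sitemap, empty Disallow lines) alone.

```python
def prefix_disallow_rules(robots_txt, prefix):
    """Prepend `prefix` (e.g. "/community") to every non-empty Disallow path."""
    prefix = "/" + prefix.strip("/")
    out = []
    for line in robots_txt.splitlines():
        name, sep, value = line.partition(":")
        path = value.strip()
        # Only rewrite "Disallow: /something" lines.
        if sep and name.strip().lower() == "disallow" and path.startswith("/"):
            line = f"Disallow: {prefix}{path}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /members/\nDisallow: /search/\nAllow: /"
    print(prefix_disallow_rules(rules, "/community/"))
```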

If you don't want a particular thread to be indexed, then just below
Code:
Disallow: /misc/contact
add
Code:
Disallow: /threads/how-to-control-what-gets-indexed.90613/
making the entire robots.txt look like
Code:
User-agent: Mediapartners-Google*
Disallow:

User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /conversations/
Disallow: /members/
Disallow: /online/
Disallow: /recent-activity/
Disallow: /search/
Disallow: /admin.php
Disallow: /proxy.php
Disallow: /help/*
Disallow: /misc/contact
Disallow: /threads/how-to-control-what-gets-indexed.90613/
Allow: /

Host: www.rt-networks.com
Sitemap: http://www.rt-networks.com/sitemap.php
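Watch the colon after Disallow when adding per-thread rules. As a rough illustration (thread_allowed is a made-up helper and www.example.com is a placeholder), Python's standard robots.txt parser silently skips a rule line that has no colon, so the thread stays crawlable. Other parsers may be more forgiving, so this only shows one parser's behaviour.

```python
from urllib.robotparser import RobotFileParser

# Placeholder domain; the thread path is the one used in this post.
THREAD_URL = "http://www.example.com/threads/how-to-control-what-gets-indexed.90613/"


def thread_allowed(rule_line, agent="Googlebot"):
    """Parse a tiny robots.txt containing `rule_line` and test THREAD_URL."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", rule_line, "Allow: /"])
    return parser.can_fetch(agent, THREAD_URL)


if __name__ == "__main__":
    with_colon = "Disallow: /threads/how-to-control-what-gets-indexed.90613/"
    without_colon = "Disallow /threads/how-to-control-what-gets-indexed.90613/"
    print("with colon, allowed:", thread_allowed(with_colon))
    print("without colon, allowed:", thread_allowed(without_colon))
```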

Other than that, do you have Google Webmaster Tools and Bing Webmaster Tools set up? Bing doesn't index your site automatically as fast as Google does, so you will need to submit the sitemap there to get indexed.
 
Thanks Brogan, I understood that, but what I actually wanted to ask is why I should include the sitemap and host name (of my own site of course, not the example one).
Is it beneficial? I have never seen a robots.txt like that, although I must admit I have only used WordPress.

Brogan, may I say once again how great the software is? My community has exploded ever since I replaced bbPress with XF!
 
I understood that, but what I actually wanted to ask is why I should include the sitemap and host name (of my own site of course, not the example one).
It's for bots that are not smart enough to index stuff fast. Google's and Bing's bots learn a site extremely quickly, while other bots will probably take months or years, so simply give them the list of what you have and they will index it for you.
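To see how a bot can pick up that Sitemap line, here is a small sketch with Python's standard parser (site_maps() needs Python 3.8 or newer). The listed_sitemaps helper and the shortened rule list are made up for illustration. This parser simply skips the Host line, since that is not a directive it knows about.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the robots.txt from earlier in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /members/
Allow: /

Host: www.rt-networks.com
Sitemap: http://www.rt-networks.com/sitemap.php
"""


def listed_sitemaps(robots_txt):
    """Return the sitemap URLs declared in `robots_txt` (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in listed_sitemaps(ROBOTS_TXT):
        print("sitemap:", url)
```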
 