Robots.txt

This is the robots.txt file at XenForo.com, which is installed in the /community directory.

http://xenforo.com/robots.txt

User-agent: *
Disallow: /community/find-new/
Disallow: /community/forums/-/
Disallow: /community/account/
Disallow: /community/attachments/
Disallow: /community/goto/
Disallow: /community/posts/
Disallow: /community/login/
Disallow: /community/admin.php
Allow: /
 
Any changes with 1.1?
 
I had nothing in my file other than the default file that comes with XF. I often have 30 users and 80 guests on my site, with many of the guests being robots.

I have XF in the root so I have added this to the file:
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /admin.php
Allow: /
 
What is the meaning of find-new, account, attachments, goto and posts? Are they blocking a URL or a folder?

I want to stop Google from indexing my member names. Should I then block x/members/?
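On the URL-versus-folder question: robots.txt rules are compared with the path of the requested URL as a prefix, so a rule can apply whether or not a directory of that name exists on disk. The following is a minimal sketch using Python's standard `urllib.robotparser`, assuming XF is installed in the root. The host `www.example.com`, the added `/members/` rule, and the thread, member and post URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rules from the post above, plus a hypothetical /members/ rule.
RULES = """\
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /admin.php
Disallow: /members/
Allow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser, path, agent="Googlebot"):
    # The host is a placeholder; only the path is compared with the rules.
    return parser.can_fetch(agent, "http://www.example.com" + path)


if __name__ == "__main__":
    parser = build_parser(RULES)
    for path in ("/threads/example.1/", "/members/some-user.1/", "/posts/123/"):
        print(path, "allowed" if is_allowed(parser, path) else "blocked")
```

Keep in mind that a Disallow rule stops compliant crawlers from fetching the page; a blocked URL can still show up in results without a snippet if other sites link to it.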
 

Unless I'm missing something here, the robots.txt used by XenForo is wrong, or at least wrong for everyone else. For starters, there is no folder called "account" in the XenForo root by default (unless that's used for customers specific to this forum). There is also no folder called "/attachments/" there by default in a XenForo installation.

There are two located elsewhere, though: "/data/attachments" and "/internal_data/attachments/". The same goes for most of the other folder paths XenForo is using in its robots.txt file. Most of the entries look wrong when compared with the default XenForo folder structure that we install.

User-agent: *
Disallow: /find-new/  # no such folder
Disallow: /account/  # no such folder
Disallow: /attachments/  # folder located in two areas (data and internal_data), but not in the XenForo root
Disallow: /goto/  # no such folder
Disallow: /posts/  # no such folder
Disallow: /login/  # no such folder
Disallow: /admin.php  # correct
Allow: /

I've also read before that you said most other folders in XenForo are covered and don't need to be added to robots.txt for exclusion. I have a question about that, if I can ask: how are you blocking them, then? For example, let's take indexing of the "styles" folder here.

The index.html files used in many folders like that are blank (there is no content in them), which is supposed to tell spiders not to index those folders. They are not like the example below, which is what you would use in an index.html file to do it, as part of a full page that also includes the html, title and body tags, etc.


<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
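To make the difference between a blank index.html and one carrying the tag above concrete, here is a small sketch that lists the robots directives found in a page. It uses only Python's standard `html.parser`; the class and function names are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, not values.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        self.directives.extend(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def robots_directives(html):
    """Return the robots meta directives in html, lowercased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    print(robots_directives('<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">'))
    print(robots_directives(""))  # a blank index.html carries no directives
```

A blank file yields an empty list, so any protection it offers comes from having nothing to index, not from an explicit instruction to crawlers.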
 
Hey Ryan, I recognise you from SEOMoz. Small world.

I wanted to ask you why you've decided to omit /attachments/ and /search/ from your exclusion list?

I can understand attachments perhaps if you've optimised them, but why would you allow /search/?

Thanks for the list & for others sharing theirs!
 
I'm using the robots.txt file to block bots from the script URLs, like this:

Code:
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /misc
Disallow: /help
Disallow: /search
Disallow: /members
Disallow: /register
Disallow: /online
Disallow: /lost-password
Disallow: /internal_data/
Disallow: /js/
Disallow: /library/
Disallow: /styles/
Disallow: /admin.php
Disallow: /admindav.php
 
Allow: /

Is that right?
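One detail worth checking in a list like this is the trailing slash. Because rules are prefix matches on the URL path, `Disallow: /help` covers more than `Disallow: /help/`. A rough sketch with Python's standard `urllib.robotparser`; the host and the `/helpdesk/` and `/help/terms` paths are invented for illustration:

```python
from urllib.robotparser import RobotFileParser

WITHOUT_SLASH = "User-agent: *\nDisallow: /help\n"
WITH_SLASH = "User-agent: *\nDisallow: /help/\n"

# Illustrative paths only; /helpdesk/ stands for any unrelated page
# whose path happens to begin with the same characters.
PATHS = ["/help", "/help/terms", "/helpdesk/"]


def blocked_paths(rules, paths, agent="*"):
    """Return the paths that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return [
        path
        for path in paths
        if not parser.can_fetch(agent, "http://www.example.com" + path)
    ]


if __name__ == "__main__":
    print("Disallow: /help  blocks", blocked_paths(WITHOUT_SLASH, PATHS))
    print("Disallow: /help/ blocks", blocked_paths(WITH_SLASH, PATHS))
```

Under these assumptions the rule without the slash also catches `/helpdesk/`, while the rule with the slash leaves the bare `/help` page crawlable.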
 
I would like to see a reply of Brogan to this :)
 
===robots.txt===========
User-agent: *
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /misc
Disallow: /help
Disallow: /search
Disallow: /members
Disallow: /register
Disallow: /online
Disallow: /lost-password
Disallow: /internal_data/
Disallow: /js/
Disallow: /library/
Disallow: /styles/
Disallow: /admin.php
Disallow: /admindav.php

Allow: /

Note: green = what xenforo.com uses.
Does anyone know whether it is a good idea to block the non-highlighted links too?
 
If you value SEO (search engine rankings), then your XF site should probably have solid robots.txt settings.

My question is this: how does adding entries to robots.txt improve your SEO? I have seen this stated in several places, but it makes no sense to me, when robots.txt LIMITS how much a bot will see on your site.
 
Another robots.txt

My full robots.txt:
Code:
User-agent: *
 
Disallow: /account/
Disallow: /admin.php
Disallow: /ajax/
Disallow: /attachments/
Disallow: /conversations/
Disallow: /data/
Disallow: /forums/-/
Disallow: /forums/tweets/
Disallow: /goto/
Disallow: /help/
Disallow: /internal_data/
Disallow: /js/
Disallow: /library/
Disallow: /login/
Disallow: /lost-password/
Disallow: /misc/contact/
Disallow: /members/
Disallow: /online/
Disallow: /recent-activity/
Disallow: /register/
Disallow: /posts/
Disallow: /search/
Disallow: /styles/
Allow: /

The /forums/-/ rule just disallows the "Mark forums as read" link, though by default it should not be displayed to Google. I also disallow a few account-specific URLs, like /conversations/ and /account/; Google should never see them, but it is better to be safe than sorry. I also disallow indexing of my member list and profiles, but this is my choice, and I don't know if it has any effect on anything. I also disallow all forms and search pages, since they are very uninteresting for Google. How much these affect SEO, I don't know, except for those that reduce duplicate entries (like /posts/ and /goto/). I don't remember the intention of everything in this one; I think I mostly copied it from a post I found somewhere here.
 
Here is mine, @Brogan. I have the robots.txt in my public_forum/community directory. Is that correct?
User-agent: *
Disallow: /community/find-new/
Disallow: /community/account/
Disallow: /community/attachments/
Disallow: /community/goto/
Disallow: /community/posts/
Disallow: /community/login/
Disallow: /community/admin.php
Disallow: /community/ajax/
Disallow: /community/conversations/
Disallow: /community/events/birthdays/
Disallow: /community/events/monthly
Disallow: /community/events/weekly
Disallow: /community/find-new/
Disallow: /community/help/
Disallow: /community/login/
Disallow: /community/lost-password/
Disallow: /community/online/

Sitemap: http://www.mysite.com/community/sitemap/sitemap.xml.gz

User-agent: baiduspider
Disallow: /

User-agent: Baiduspider-video
Disallow: /

User-agent: Baiduspider-mobile
Disallow: /your-image-directory/

User-agent: Baiduspider-image
Disallow: /image/
Allow: /
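A file with several User-agent groups can be sanity-checked by asking how each crawler would be treated. The sketch below is only an approximation: it uses Python's standard `urllib.robotparser`, a shortened copy of the rules above, and a made-up thread URL on the placeholder host `www.mysite.com`. Real crawlers may differ in details.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the posted file: one general group, one Baidu group.
RULES = """\
User-agent: *
Disallow: /community/account/
Disallow: /community/login/

User-agent: baiduspider
Disallow: /
"""


def can_crawl(agent, path, rules=RULES):
    """Report whether `agent` may fetch `path` under the given rules."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, "http://www.mysite.com" + path)


if __name__ == "__main__":
    thread = "/community/threads/example.1/"  # hypothetical thread URL
    for agent in ("Googlebot", "Baiduspider"):
        print(agent, "may fetch" if can_crawl(agent, thread) else "may not fetch", thread)
```

Each crawler follows the group whose User-agent line matches it and falls back to the `*` group otherwise, so a rule placed under one named bot applies to that group only.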
 
Put the robots.txt file in the root of the site rather than in a subdirectory; crawlers only request it from the top level of the host.

Its configuration depends on your installation.
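As a quick illustration of why the file belongs in the root: the robots.txt address for any page is the page's scheme and host followed by `/robots.txt`, wherever the forum itself is installed. A small sketch using Python's standard `urllib.parse`; the function name and the thread URL are made up for this example.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url):
    """Return the robots.txt URL that governs page_url."""
    # A reference starting with "/" replaces the whole path of page_url,
    # so the result always sits at the top level of the same host.
    return urljoin(page_url, "/robots.txt")


if __name__ == "__main__":
    print(robots_txt_url("http://www.mysite.com/community/threads/example.1/"))
    print(robots_txt_url("http://xenforo.com/community/"))
```

A copy stored at /community/robots.txt would never be requested under this rule, which is why the rules themselves carry the /community/ prefix instead.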
 