What's your robots.txt file look like? - XenForo

No, I hadn't either. I think I got them from this thread or one linked here, so I thought I would add them, which I did today. I guess seeing as it's essentially allowing them, it could not do any harm.

Actually, looking at it more, I don't think I need the mobile bot, as I have User-Agent: * which allows all bots to crawl, and the AdsBot looks like it is for AdSense, so I will be removing those now.

This is what I have now.
Code:
User-agent: Mediapartners-Google
Disallow:

User-agent: AhrefsBot
Disallow: /

User-agent: Baidu
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: Baiduspider-video
Disallow: /

User-agent: Baiduspider-image
Disallow: /

User-agent: Cliqzbot
Disallow: /

User-agent: Diffbot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: EasouSpider
Disallow: /

User-agent: Exabot
Disallow: /

User-agent: linkdexbot
Disallow: /

User-agent: linkdexbot-mobile
Disallow: /

User-agent: magpie-crawler
Disallow: /

User-agent: meanpathbot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: NaverBot
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: proximic
Disallow: /

User-agent: Rogerbot
Disallow: /

User-agent: SiteBot
Disallow: /

User-agent: sogou
Disallow: /

User-agent: sogou spider
Disallow: /

User-agent: Sogou web spider
Disallow: /

User-agent: spbot
Disallow: /

User-agent: trendictionbot
Disallow: /

User-agent: Twiceler
Disallow: /

User-agent: URLAppendBot
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: YoudaoBot
Disallow: /

User-agent: Yeti
Disallow: /

User-Agent: *
Disallow: /?page=
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /admin.php
Disallow: /members/
Disallow: /conversations/
Allow: /

Sitemap: http://mysite.co.uk/sitemap.php
 
Code:
User-agent: AhrefsBot
User-agent: Baidu
User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
User-agent: Cliqzbot
User-agent: Diffbot
User-agent: DotBot
User-agent: EasouSpider
User-agent: Exabot
User-agent: linkdexbot
User-agent: linkdexbot-mobile
User-agent: magpie-crawler
User-agent: meanpathbot
User-agent: MJ12bot
User-agent: NaverBot
User-agent: omgilibot
User-agent: proximic
User-agent: Rogerbot
User-agent: SiteBot
User-agent: sogou
User-agent: sogou spider
User-agent: Sogou web spider
User-agent: spbot
User-agent: trendictionbot
User-agent: Twiceler
User-agent: URLAppendBot
User-agent: Yandex
User-agent: YoudaoBot
User-agent: Yeti
Disallow: /

User-Agent: *
Disallow: /?page=
Disallow: /find-new/
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /posts/
Disallow: /login/
Disallow: /admin.php
Disallow: /members/
Disallow: /conversations/
Allow: /

Sitemap: http://mysite.co.uk/sitemap.php

Slight improvement/fix by removing all the unnecessary repeated Disallow lines: the blocked user agents are now grouped under a single Disallow: /
 
I have Baidu blocked via robots.txt yet they still visit my site
Just remember, an entry in robots.txt isn't really a "block" against those visitors, it is just a request to the visitor that they may or may not honor. Legitimate 'bots' will honor the requests.
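
To illustrate that point: the check happens entirely on the crawler's side. Here is a rough Python sketch using the standard library's urllib.robotparser. The `urls_to_request` helper, the `honours_robots` flag and the example URLs are made up for illustration, and real crawlers may interpret rules a little differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow: /account/
Allow: /
"""


def urls_to_request(agent, urls, honours_robots=True):
    """Return the URLs a crawler would go on to request (hypothetical helper)."""
    if not honours_robots:
        # Nothing in robots.txt itself stops a bot that never consults it.
        return list(urls)
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return [url for url in urls if parser.can_fetch(agent, url)]


urls = ["http://mysite.co.uk/", "http://mysite.co.uk/threads/example.1/"]
polite = urls_to_request("Baiduspider", urls)
rude = urls_to_request("Baiduspider", urls, honours_robots=False)
print("polite bot requests:", polite)
print("rude bot requests:", rude)
```

The polite bot filters its own request list; the rude one simply skips that step, which is why a Disallow line is a request rather than a block.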
 
Yeah, I'm aware of this, but it is a help. To be honest, most of the ones I had problems with, like Baidu, have honoured it.
 
When I researched this, I couldn't find a definitive answer as to whether it is advisable to group several User-agent lines under a single Disallow in this way, hence my leaving them as individual entries in my own robots.txt files.

If anyone has any experience to offer in this regard, to confirm whether grouping disallows works as expected with the bots visiting your site, then please let us know. (y)

Cheers,
Shaun :D
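
For what it's worth, the robots exclusion standard does allow a record to start with more than one User-agent line, so grouping should be valid, though how each individual bot handles it is up to that bot. One way to sanity-check it is to run both layouts through a parser and compare the answers. The sketch below uses Python's built-in urllib.robotparser with a shortened list of bots; the `verdicts` helper and the test URLs are only assumptions for the example, and it shows how Python's parser reads the file, not how Baidu or anyone else does.

```python
from urllib.robotparser import RobotFileParser

# Grouped layout: several User-agent lines share one Disallow.
GROUPED = """\
User-agent: AhrefsBot
User-agent: Baiduspider
User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: /account/
Disallow: /admin.php
Allow: /
"""

# Separate layout: one record per bot.
SEPARATE = """\
User-agent: AhrefsBot
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow: /account/
Disallow: /admin.php
Allow: /
"""

# (user agent, URL) pairs to test; the URLs are hypothetical.
CHECKS = [
    ("AhrefsBot", "http://mysite.co.uk/threads/example.1/"),
    ("Baiduspider", "http://mysite.co.uk/"),
    ("MJ12bot", "http://mysite.co.uk/account/"),
    ("Googlebot", "http://mysite.co.uk/threads/example.1/"),
    ("Googlebot", "http://mysite.co.uk/account/"),
]


def verdicts(robots_txt, checks):
    """Parse robots_txt and return True/False (may fetch) for each check."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [parser.can_fetch(agent, url) for agent, url in checks]


grouped_result = verdicts(GROUPED, CHECKS)
separate_result = verdicts(SEPARATE, CHECKS)
print("same answers for both layouts:", grouped_result == separate_result)
```

If the two lists match, this parser treats the grouped and ungrouped files the same way for those checks.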
 
Code:
User-agent: *
Disallow: /account/
Disallow: /admin.php
Disallow: /attachments/
Disallow: /conversations/
Disallow: /find-new/
Disallow: /goto/
Disallow: /login/
Disallow: /members/*/trophies
Disallow: /misc/style
Disallow: /posts/
Disallow: /register/
Disallow: /search/
Allow: /

Sitemap: https://www.gamingforums.net/sitemap.php
 
Hi,

Can anyone explain to me why you are all using
Disallow: /attachments/ in robots.txt?

Does this block images from being shown in search engines, or does it have nothing to do with that?
 
It stops search engines from seeing anything under yourforum.tld/attachments. Thumbnails for attachments are in internal_data, I think? It makes them ignore anything to do with the attachments directory.
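
A Disallow rule is matched against the start of the URL path, so Disallow: /attachments/ only covers URLs that begin with /attachments/. A quick way to see what a rule covers is to test a few paths with Python's urllib.robotparser, as in this sketch. The three paths are invented examples, not confirmed XenForo URLs, so swap in paths from your own forum.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /attachments/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths for illustration; your forum's real URLs may differ.
paths = [
    "/attachments/screenshot-png.42/",
    "/data/attachments/0/42-abcdef.jpg",
    "/threads/example.1/",
]

# Map each path to True (may be crawled) or False (disallowed).
results = {
    path: parser.can_fetch("Googlebot", "https://yourforum.tld" + path)
    for path in paths
}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Only the first path starts with /attachments/, so it is the only one the rule applies to; anything served from a different directory would need its own Disallow line.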
 
Out of interest, why does XenForo's own robots.txt file, along with those of many other XenForo sites (including @Brogan's), disallow /find-new/?

Could it be because that page always redirects to a new search, so maybe it makes no sense for it to be indexed? Plus, the content of that page is purely links, with no real content, and is always changing (hopefully), so it is not much use for keyword searches anyway. For SEO, Google would see that as not useful content, merely a menu.
 
As discussed a million times, admins should find a way to let us add a noindex tag, but nobody seems to care about that until forums begin to receive messages like that.

I am seeing lots of attachments reported as blocked in search. I have blocked them via robots.txt since day one, but that does not seem to be a good solution anymore.
 