Robots.txt and sitemap questions

Anyway the url seems to work for say mydomain.com/members/

Doesn't work for goto or attachments though. Either nothing comes up or I get an "oops" error.
 
Still a bit confused about /community/, which presumably is the equivalent of /forums/?
Some people install XenForo in the root (@ www.YourDomain.com), some people install it in a sub-directory (@ www.YourDomain.com/community). By default XenForo uses "community" when installing in a sub-directory, but the name can be changed to anything.

A root installation is typically done when your site is only the forum itself. A sub-directory installation is typically done when the forum is only going to be part of your overall site.

Your site is installed in the root.
 
Thank you. It can only see admin.php, members, login, register, account, search and help in a url. Can't see goto or attachments. (I thought attachments were in admin.php?)

And now I've just come across this! Sounds like an argument for no robots.txt at all!

"In fact these malicious bots look at the robots.txt to better map your site. If any point you have a Disallow: this will be used to better attack your site. A hacker that is manually looking at your site should spend extra time examining any files/directories that you are attempting to disallow."

But I guess nothing is perfect. AI says it's still better to have robots.txt as it's an SEO tool, not a security tool. But it mentions other ways of securing something like admin.php, which is maybe something I should look into.
 
Thanks. So is that for attachments? If I disallow attachments then I don't really need to disallow ImageSift

What is goto then?
XenForo uses /goto/ when linking quoted posts. It is another "duplicate" content reference because the content it links to is already covered by a /threads/ URL.

[Attached image: 1748213480787.webp]

The up arrow in the image links to a /goto/ URL.
 
Really appreciate all this. Ok so as I don't seem to be having any server bandwidth issues right now and some of the bots could be good for seo, then maybe I could just go with this (wasn't sure about including help or not). Then I can always disallow/add things later if needed.

User-agent: *
Disallow: /admin.php
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /search/
Disallow: /help/
Disallow: /members/


Sitemap: https://www.thehamsterforum.com/sitemap.xml
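A quick way to sanity-check a rule set like the one above before uploading it is Python's standard-library robots.txt parser. This is only a sketch: www.example.com is a placeholder domain, is_allowed is a hypothetical helper name, and the sample paths are made up for illustration. Real crawlers may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the post above (Sitemap line left out here).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin.php
Disallow: /account/
Disallow: /attachments/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /search/
Disallow: /help/
Disallow: /members/
"""

# Parse the rules once from the string instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="Googlebot"):
    """Return True if `agent` may crawl `path` under the rules above."""
    # www.example.com is a placeholder; only the path matters for matching.
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    # Hypothetical sample paths, not taken from a real site.
    for path in ("/threads/example.1/", "/members/", "/goto/post?id=5", "/admin.php"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

The rules are plain prefix matches, so Disallow: /members/ covers everything under /members/ but, at least with this parser, not /members without the trailing slash.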
 
The only reason I wanted rid of Ahrefs is there are so many instances of it. So having lots of other bots crawling wouldn't restrict Google's crawling then?
 
Ok I think I'm going to go with this. Most of the disallowed ones are apparently no benefit for SEO. I tend to have about 50 bots/guests crawling most of the time, so maybe it won't crowd out google if I disallow some of these (or maybe it won't make much difference! Google is still crawling). If you can see any errors, can you let me know please? :-)

Decided to leave attachments as they could be useful for snippets maybe?

Final robots.txt below




User-agent: AspiegelBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: MauiBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: ImageSift
Disallow: /

User-agent: AnthropicBot
Disallow: /

User-agent: *
Disallow: /admin.php
Disallow: /account/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /search/
Disallow: /help/
Disallow: /members/


Sitemap: https://www.xxxxxxxxxxxxxx.com/sitemap.xml
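To see how the per-bot groups and the catch-all group in a file like this interact, here is a rough sketch using Python's standard-library parser on a shortened copy of the rules. The domain www.example.com, the helper name can_crawl and the sample thread path are assumptions for illustration. It only shows what a well-behaved bot should do; a bot that ignores robots.txt is not affected by any of it.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the file above; example.com stands in for the real domain.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: *
Disallow: /admin.php
Disallow: /goto/
Disallow: /members/

Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def can_crawl(agent, path):
    """Return True if the bot with this short user-agent token may fetch `path`."""
    # Pass the short token (e.g. "AhrefsBot"), not a full browser-style UA string.
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    thread = "/threads/example.1/"  # hypothetical thread URL
    print("AhrefsBot on thread:", can_crawl("AhrefsBot", thread))
    print("Googlebot on thread:", can_crawl("Googlebot", thread))
    print("Googlebot on /members/:", can_crawl("Googlebot", "/members/"))
    # site_maps() needs Python 3.8 or newer.
    print("Sitemaps:", parser.site_maps())
```

A bot named in its own group follows only that group, so the blocked bots get Disallow: / while everything else falls through to the User-agent: * rules.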
 
Well since uploading the robots.txt file I've got a lot more bots and visitors! Yandex has appeared all over the place. Never had that one before.
 
Added Yandex to the disallow list :-) Don't think I need a Russian bot do I?

Also have now secured admin.php via Cloudflare Zero Trust - so a good bit of housekeeping done.
 
Can I do that in Cloudflare as well?
Yes, you can do it through Cloudflare also.

There are a couple different ways to do it through Cloudflare that will work.

I personally create a "Mysite XenForo Admin" Rule Group that includes all emails and IPs of the admins for the site. I then create separate Applications for www.mysite.com/admin.php and www.mysite.com/install, and add my test site (test.mysite.com) as well. This way you only need to maintain one Rule Group, which can be applied to multiple Applications.
 
Thank you. I've been trying to find out more about this and asked the server host. Apparently if I secure admin.php and /install via Cloudflare Zero Trust, then I need to ensure that the server only accepts traffic coming from Cloudflare IP addresses. Is that right? Otherwise someone could bypass Cloudflare and reach the server directly. But they also said there are downsides and limitations to restricting the server to traffic that comes through Cloudflare.
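The idea described here (only accepting requests that arrive from Cloudflare's address ranges) can be sketched in a few lines of Python. This is just to show the logic. In practice the restriction would normally live in the firewall or web server config, not in application code. The helper name is_from_cloudflare is hypothetical, and the ranges listed are an illustrative subset only. The current, complete list should be taken from Cloudflare's published IP ranges.

```python
import ipaddress

# Illustrative subset only (assumption): replace with the full, current list
# from Cloudflare's published IP ranges before relying on this.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13")
]


def is_from_cloudflare(remote_ip):
    """Return True if the connecting address falls inside a listed range."""
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in CLOUDFLARE_RANGES)


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address standing in for a direct visitor.
    for ip in ("104.16.0.1", "203.0.113.10"):
        verdict = "accept" if is_from_cloudflare(ip) else "reject"
        print(ip, verdict)
```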
 