Known Bots

Known Bots 6.1.1

We block the entire ASN at the server level (we don't use Cloudflare). We get the raw IP ranges associated with their AS number and load them into our firewall.
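For anyone who wants to see what "load the ranges into the firewall" amounts to, here is a minimal Python sketch of the matching step. It assumes you have already exported the AS's prefixes as CIDR strings. The prefixes below are reserved documentation ranges used as placeholders, and the function names are made up for illustration.

```python
import ipaddress

# Hypothetical prefixes standing in for an AS's announced ranges.
# These are reserved documentation ranges, not real data for any network.
BLOCKED_PREFIXES = ["203.0.113.0/24", "198.51.100.0/24", "2001:db8::/32"]


def build_blocklist(prefixes):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(p) for p in prefixes]


def is_blocked(ip, networks):
    """Return True if the address falls inside any blocked prefix."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to check in one pass.
    return any(addr in net for net in networks)


networks = build_blocklist(BLOCKED_PREFIXES)
print(is_blocked("203.0.113.45", networks))
print(is_blocked("192.0.2.1", networks))
```

A real firewall does this lookup far more efficiently, but the logic is the same: one rule per prefix, and a request is dropped if its source address falls in any of them.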
So do you block it in htaccess as well as blocking its IP ranges in the server panel?
 
Blocking by ASN works very well with Cloudflare. It can be done using the DigitalPoint addon from the XenForo backend.
Thank you. I already tried blocking it in the Cloudflare app via a firewall rule and it still appeared on the site. So I'm wondering if it bypasses Cloudflare. If that's the case, none of the Cloudflare settings will help.
 
Try getting a new IP from your host and stop IP leaks through unfurl/proxy etc., which I think you already did using the DigitalPoint addon.
 
Yes, I did use the proxy in the DigitalPoint addon. However, I am wondering if ByteDance obtained the server IP from the site before I used the addon, i.e. before that was proxied (it was certainly crawling the site before I used the addon). Can I actually get a new IP from the server if it's shared hosting?
 
It would depend on how flexible the host is. In a shared environment they would need to move you to another server, which is possible; I have done it on shared hosting before. Talk to your host. Tell them you are getting hammered on this IP address and you would like to switch to a different one.
 
Getting a new IP won't really solve the problem if the source of the problem isn't blocked at other levels (excluding CF).
These guys just scan subnets looking for active sites - it's only a matter of time until they come back.

(And, since it's shared hosting, I'm willing to bet that there are other sites on the same IP address, which I'm sure are already exposed, and any IP you move to will likely already have been out there for some time.)
 
I wondered that. There is an option to change to a cloud hosting service with a dedicated IP address which isn't a particularly expensive upgrade - but I expect that IP address has been used before somewhere as well.

So what other levels could I block it at, outside Cloudflare? htaccess?
 
We had to block AS45899, a Vietnamese range. I wanted to avoid this one for the longest time because it seemed like a residential ISP, but we kept getting more and more traffic from it, without any registrations. The activity views the oldest threads, so that's a key indicator it's data scraping.

The AS is also associated with the 'Coccoc bot' crawler. It does not respect robots.txt and will keep scanning.

More information about it all here: https://clashpanda.com/blocking-coccocbot-on-cloudflare-a-step-by-step-guide/

That's enough for us to confirm it should be blocked.
 
I suddenly have a whole lot of bots called Async http client/server framework (aiohttp Python)
It's in the list of known bots but I don't know what it is and if I should try to block it?
 
Seems like an open source python script that can be run by anyone.
Exactly - but there are barely any legitimate reasons imaginable to let it hammer other people's forums at scale. These are scrapers, cheaply made ones - those that are made with a little more intelligence fake their user agent.
You can simply block those in your .htaccess:


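Code:
<IfModule mod_rewrite.c>
    # aiohttp's default User-Agent looks like "Python/3.x aiohttp/3.x",
    # so match the token anywhere in the header rather than anchoring it.
    RewriteCond %{HTTP_USER_AGENT} aiohttp [NC]
    # Return 403 Forbidden for every matching request.
    RewriteRule .* - [F,L]
</IfModule>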
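Before relying on a User-Agent condition like this, it is worth checking the pattern against a realistic header, because anchors change what gets caught. The Python sketch below compares an anchored and an unanchored pattern. The sample header is an assumption based on aiohttp's default format ("Python/<version> aiohttp/<version>"), and `would_block` is just an illustrative helper, not part of any library.

```python
import re

# Assumed example header in aiohttp's default "Python/<ver> aiohttp/<ver>" form.
SAMPLE_UA = "Python/3.11 aiohttp/3.9.1"

# Anchored: the whole header must be exactly "aiohttp" (case-insensitive).
ANCHORED = re.compile(r"^aiohttp$", re.IGNORECASE)
# Unanchored: "aiohttp" may appear anywhere in the header.
UNANCHORED = re.compile(r"aiohttp", re.IGNORECASE)


def would_block(pattern, user_agent):
    """Mimic a RewriteCond test: True if the pattern matches the header."""
    return pattern.search(user_agent) is not None


print("anchored:  ", would_block(ANCHORED, SAMPLE_UA))
print("unanchored:", would_block(UNANCHORED, SAMPLE_UA))
```

If your logs show the bare string "aiohttp" as the full header, the anchored form is enough. If they show the version-tagged form, only the unanchored pattern will catch it.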
 
I'm on XF Cloud and don't have access to .htaccess. :cry:

This is my only gripe with XenForo, how they don't help with keeping these buggers at bay. We've been under attack a few times and they fix it quick enough, no complaints there. But then they tell me to "get Cloudflare" but I don't want to get Cloudflare. I'm paying a lot of money for Cloud and I think it's XenForo's responsibility to keep bad bots that don't respect robots.txt out. That should be the default. If people do want them on their forum they can ask for it. /rant

Being on Cloud my only defence is the robots.txt which bad bots ignore. So what I do is I put them in discourage mode, redirecting them to a dead URL. It's how I got rid of Bytespider. But I can only do so many or else the system can't handle it and it will slow the site to a halt.

The Async bots were using one IP address and I put it in discourage mode. Well, now they are using a gazillion different IP addresses.

I can try adding it to the robots.txt but not sure what the user-agent is?
 
This is my only gripe with XenForo, how they don't help with keeping these buggers at bay.
I agree with you here, all the more so for the hosted version. What I'd expect from decent, modern forum software is that it deals with new developments and threats, and AI bots are one of the biggest. Sure, one could argue that this kind of stuff should be done on the network layer and not on the application layer. Still, behaviour-based blocking could (and in my eyes should) be supported or even done by the forum software. I am self-hosted and in the same boat, not wanting to use Cloudflare. Clearly a point where XF falls short. Very short.

I can try adding it to the robots.txt but not sure what the user-agent is?
This won't work as robots.txt relies on cooperation of the bots and you can be sure that bad bots are not cooperative.
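To make "behaviour-based blocking" a bit more concrete, here is a minimal sketch of one common approach, a sliding-window request counter per client. This is not how XenForo works. The class name, thresholds and client keys are all hypothetical, and timestamps are passed in explicitly so the logic is easy to test.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Refuse clients that exceed max_hits requests within `window` seconds."""

    def __init__(self, max_hits, window):
        self.max_hits = max_hits
        self.window = window
        self.hits = defaultdict(deque)  # client key -> recent request timestamps

    def allow(self, client, now):
        recent = self.hits[client]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False  # over the limit: block or challenge this request
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(max_hits=3, window=10.0)
for t in (0.0, 1.0, 2.0, 3.0):
    print(t, limiter.allow("198.51.100.7", t))
```

Keying on the IP address alone is weak against scrapers that rotate through many addresses, as described earlier in the thread, so a real implementation would likely key on something broader, such as the network prefix or ASN.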
 
But then they tell me to "get Cloudflare" but I don't want to get Cloudflare. I'm paying a lot of money for Cloud and I think it's XenForo's responsibility to keep bad bots that don't respect robots.txt out. That should be the default. If people do want them on their forum they can ask for it.

The thing is that the XenForo devs are really really good at building forum software - it's their thing.

Cloudflare is really really good at identifying and blocking problematic traffic, including bots - it's their thing.

I don't ask my plumber to fix my garden and I don't ask my gardener to fix a leaky tap in my house.

If you expect XenForo Cloud to be able to do everything that Cloudflare can do in relation to bot identification and management - then you're going to have to pay for it, just like you would with Cloudflare - because that expertise doesn't come for free. Except that there's actually a free version of Cloudflare you could use to do exactly what you want because they already have the expertise and are leveraging their free plan to benefit their product development :rolleyes:

Blocking bad bots that don't adhere to the standards is a game of whack-a-mole - and requires a huge amount of time and effort to identify, track and mitigate them. Cloudflare are good at that type of thing. If bot management is an issue for you - implement a bot management solution. 🤷‍♂️
 
Sure one could argue, that this kind of stuff should be done on the network layer and not on the application layer - still: behaviour based blocking could (and in my eyes should) be supported or even done by forum software in my eyes

Don't forget that identifying and processing bots in the forum software will require increasing levels of resources - and thus cost to you directly. Outsourcing that to a 3rd party (Cloudflare or otherwise) prevents the traffic from even getting to your server - thus requiring fewer resources to process. This is why DDoS protection should be done externally - not on-server.

PS. If you want to talk OSI model stuff - bots are generally an Application Layer thing - they use HTTP and DNS and act like regular web browsers. Network Layer stuff is much, much lower in the stack and more typically associated with firewalls and mitigating DDoS attacks rather than higher-level bot traffic.
 
Hmmm, that's a good point and it's well taken.
 
I was going to say you could block them in .htaccess and then just saw that you can't. I use Cloudflare and just the free version - you don't have to have the paid version. The Cloudflare app for XenForo that you can install is also free. Some of the free options re bot blocking are limited. My robots.txt works well and I blocked Bytespider in .htaccess, which got rid of a lot. This is my robots.txt if it helps - with the correct user-agents and syntax (created with help on here from various people). My bots are down dramatically, but then it's only a small forum. These all seem to follow robots.txt, although occasionally Anthropic ignores it, but not often.

Which ones in particular are giving you trouble? Getting rid of Bytespider made a big difference for me. I think I added Amazonbot to the list as well as Alexa was getting pesky.

I'll just add that the Cloudflare XenForo addon is brilliant - even if you don't use many of the options - you can immediately see which country all guests and members are from and the same with IP addresses - which is very useful. You can also protect install and admin with a flick of a switch (to stop the server origin IP being revealed to any unwanted visitors). Plus you can block a whole country in the firewall. I blocked one Eastern bloc country.

And if you're not using Cloudflare Turnstile then it adds that as well - and that makes a massive difference in spam protection (and spots bots to prevent them registering or using the contact us email etc). It's a bit of extra work at first but then just does its thing. And it's less work if you use the app for XenForo. I think you just need to open a Cloudflare account before using the app but @digitalpoint will know - I already had an account. My email goes via Cloudflare as well.


User-agent: AspiegelBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: SemrushBot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: MauiBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: ImageSift
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Yandex
Disallow: /

User-agent: *
Disallow: /admin.php
Disallow: /account/
Disallow: /goto/
Disallow: /login/
Disallow: /register/
Disallow: /search/
Disallow: /help/
Disallow: /members/

Sitemap: https://www.yoursite.com/sitemap.xml
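If you want to sanity-check a robots.txt like this before uploading it, Python's standard `urllib.robotparser` can tell you how a compliant crawler would read it. The sketch below uses a shortened copy of the rules and a placeholder domain (`www.example.com`). It only shows what well-behaved bots should do, and says nothing about bots that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rules above, for illustration only.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /admin.php
Disallow: /members/
"""


def make_parser(text):
    """Build a parser from robots.txt text without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)
base = "https://www.example.com"  # placeholder domain
for agent, path in [("AhrefsBot", "/threads/1/"),
                    ("SomeOtherBot", "/threads/1/"),
                    ("SomeOtherBot", "/members/")]:
    print(agent, path, parser.can_fetch(agent, base + path))
```

Note that the parser matches on the bot's product token (for example `AhrefsBot`), not on the full browser-style User-Agent string, which is why the exact token in each `User-agent:` line matters.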
 