Known Bots

Known Bots 6.1.0

We block the entire ASN at the server level (we don't use Cloudflare). We pull the raw IP ranges associated with their AS number and load them into our firewall.
So do you block it in htaccess as well as blocking its IP ranges in the server panel?
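For anyone wanting to try the ASN approach at the .htaccess level, here is a minimal sketch, assuming Apache 2.4 (which accepts `Require not ip` inside a `<RequireAll>` block). The prefixes below are documentation ranges used as placeholders, not real ASN data, and the function names are made up for illustration. If the site sits behind a proxy, Apache may only see the proxy's address unless something like mod_remoteip is configured, so treat this as a starting point only.

```python
import ipaddress

# Placeholder prefixes (RFC 5737 / RFC 3849 documentation ranges), NOT real ASN data.
SAMPLE_PREFIXES = ["192.0.2.0/25", "192.0.2.128/25", "198.51.100.0/24", "2001:db8::/32"]


def collapse_prefixes(prefixes):
    """Parse CIDR strings and merge overlapping or adjacent ranges."""
    v4, v6 = [], []
    for text in prefixes:
        net = ipaddress.ip_network(text.strip(), strict=False)
        # collapse_addresses() cannot mix IPv4 and IPv6, so keep them apart.
        (v4 if net.version == 4 else v6).append(net)
    return list(ipaddress.collapse_addresses(v4)) + list(ipaddress.collapse_addresses(v6))


def apache_deny_block(prefixes):
    """Render the collapsed prefixes as an Apache 2.4 deny block."""
    lines = ["<RequireAll>", "    Require all granted"]
    lines += [f"    Require not ip {net}" for net in collapse_prefixes(prefixes)]
    lines.append("</RequireAll>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(apache_deny_block(SAMPLE_PREFIXES))
```

Collapsing first keeps the rule list short, which matters when an ASN announces hundreds of small prefixes.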
 
Blocking by ASN works very well with Cloudflare. It can be done using the DigitalPoint addon from the XenForo backend.
Thank you. I already tried blocking it in the Cloudflare app via a firewall rule and it still appeared on the site. So I'm wondering if it bypasses Cloudflare. If that's the case, none of the Cloudflare settings will help.
 
Try getting a new IP from your host and stop IP leaks through unfurl/proxy etc., which I think you already did using the DigitalPoint addon.
 
Yes I did use the proxy in the digitalpoint addon. However I am wondering if ByteDance obtained the server IP from the site before I used the addon - ie before that was proxied (it was certainly crawling the site before I used the addon). Can I actually get a new IP from the server if it's shared hosting?
 
It would depend on how flexible the host is. In a shared environment they would need to move you to another server, which is possible; I have done it on shared hosting before. Talk to your host. Tell them you are getting hammered on this IP address and you would like to switch to a different one.
 
Getting a new IP won't really solve the problem if the source of the problem isn't blocked at other levels (excluding CF).
These guys just scan subnets looking for active sites - it's only a matter of time until they come back.

(And since it's shared hosting, I'm willing to bet there are other sites on the same IP address that are already exposed, and any IP you move to will likely already have been out there for some time.)
 
I wondered that. There is an option to change to a cloud hosting service with a dedicated IP address which isn't a particularly expensive upgrade - but I expect that IP address has been used before somewhere as well.

So what other levels could I block it at, outside Cloudflare? htaccess?
 
We had to block AS45899, a Vietnam range. I wanted to avoid this one for the longest time because it seemed like a residential ISP, but we kept getting more and more traffic from it, without any registration. The activity views the oldest threads, so that's a key indicator it's data scraping.

The AS is also associated with the 'Coccoc bot' crawler. It does not respect robots.txt and will keep scanning.

More information about it all here: https://clashpanda.com/blocking-coccocbot-on-cloudflare-a-step-by-step-guide/

That's enough for us to confirm it should be blocked.
 
I suddenly have a whole number of bots called Async http client/server framework (aiohttp Python)
It's in the list of known bots but I don't know what it is and if I should try to block it?
 
Seems like an open source python script that can be run by anyone.
Exactly - but there are barely any legitimate reasons imaginable to let it hammer other people's forums at scale. These are scrapers, cheaply made ones - those that are made with a little more intelligence fake their user agent.
You can simply block those in your .htaccess:


Code:
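<IfModule mod_rewrite.c>
    RewriteEngine On
    # aiohttp's default user agent looks like "Python/3.x aiohttp/3.x",
    # so match the substring rather than anchoring with ^ and $.
    RewriteCond %{HTTP_USER_AGENT} aiohttp [NC]
    RewriteRule .* - [F,L]
</IfModule>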
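
A quick way to sanity-check a user-agent pattern before relying on it in .htaccess is to run it against a sample header. The sketch below uses Python's `re` module, which is close to but not identical to the PCRE engine Apache uses. The sample string is an assumption based on aiohttp's default "Python/x.y aiohttp/x.y.z" format, and `ua_matches` is just an illustrative helper. It shows that an anchored pattern such as `^aiohttp$` only matches a header that is exactly "aiohttp", while the unanchored form also catches the default header.

```python
import re

# Assumed sample, following aiohttp's default "Python/x.y aiohttp/x.y.z" format.
SAMPLE_UA = "Python/3.11 aiohttp/3.9.1"


def ua_matches(pattern, user_agent):
    """Case-insensitive search, similar in spirit to RewriteCond with [NC]."""
    return re.search(pattern, user_agent, re.IGNORECASE) is not None


if __name__ == "__main__":
    for pattern in (r"^aiohttp$", r"aiohttp"):
        print(pattern, ua_matches(pattern, SAMPLE_UA))
```

Checking a few real user-agent strings from your own access log this way is cheaper than finding out later that the rule never fired.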
 
I'm on XF Cloud and don't have access to .htaccess. :cry:

This is my only gripe with XenForo, how they don't help with keeping these buggers at bay. We've been under attack a few times and they fix it quick enough, no complaints there. But then they tell me to "get Cloudflare" but I don't want to get Cloudflare. I'm paying a lot of money for Cloud and I think it's XenForo's responsibility to keep bad bots that don't respect robots.txt out. That should be the default. If people do want them on their forum they can ask for it. /rant

Being on Cloud my only defence is the robots.txt which bad bots ignore. So what I do is I put them in discourage mode, redirecting them to a dead URL. It's how I got rid of Bytespider. But I can only do so many or else the system can't handle it and it will slow the site to a halt.

The Async bots were using one IP address and I put it in discourage mode. Well, now they are using a gazillion different IP addresses.

I can try adding it to the robots.txt but not sure what the user-agent is?
 
This is my only gripe with XenForo, how they don't help with keeping these buggers at bay.
I agree with you here, all the more so for the hosted version. What I'd expect from decent, current forum software is that it deals with new developments and threats, and AI bots are one of the biggest. Sure, one could argue that this kind of thing should be done on the network layer and not on the application layer. Still, behaviour-based blocking could (and in my eyes should) be supported or even done by the forum software. I am self-hosted and in the same boat, not wanting to use Cloudflare. Clearly a point where XF falls short. Very short.
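
To make "behaviour-based blocking" a bit more concrete, here is a minimal sketch of one common building block, a sliding-window request counter. This is not anything XenForo ships; the class name, the thresholds, and the choice of client key are all assumptions for illustration. Since scrapers rotate IP addresses, keying on a single IP is weak, and a real setup would more likely key on a network range or AS number.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client key within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> timestamps of allowed hits

    def allow(self, client_key, now):
        hits = self._hits[client_key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


if __name__ == "__main__":
    limiter = SlidingWindowLimiter(max_requests=3, window_seconds=10)
    # Hypothetical key: a /24 network rather than a single address.
    for t in (0, 1, 2, 3, 10):
        print(t, limiter.allow("192.0.2.0/24", now=t))
```

Passing `now` in explicitly keeps the logic easy to test; in a live setting it would come from something like `time.monotonic()`.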

I can try adding it to the robots.txt but not sure what the user-agent is?
This won't work as robots.txt relies on cooperation of the bots and you can be sure that bad bots are not cooperative.
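
To illustrate why robots.txt only helps against cooperative crawlers: the check happens on the client side. A minimal sketch using Python's standard `urllib.robotparser` is below; the robots.txt content and the URL are made-up examples, and `is_allowed` is a hypothetical helper. A well-behaved bot runs something like this before fetching a page, while a bad bot simply never performs the check.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content (an assumption for illustration).
ROBOTS_TXT = """\
User-agent: Bytespider
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return what a cooperative crawler would conclude from robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/threads/1"
    print("Bytespider:", is_allowed("Bytespider", url))
    print("SomeOtherBot:", is_allowed("SomeOtherBot", url))
```

Nothing in this exchange is enforced by the server, which is why the blocking has to happen at the firewall, web server, or proxy level instead.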
 