What tooling did you choose to make those work?

On cpanel:
<IfModule mod_setenvif.c>
<Location />
# Headless / automation browsers
SetEnvIfNoCase User-Agent "(selenium|puppeteer|phantomjs|phantom|playwright(-chromium|-webkit|-firefox)?|headlesschrome|cypress|chromiumbot|headlessbot|slimerjs|triflejs|TestCafe|Nightwatch|WebDriverIO|Taiko|RobotFramework|Protractor|Nightmare|CasperJS|ZombieJS|Splash|HtmlUnit|WebKitTestRunner)" bad_bots
# AI / content / data crawlers
SetEnvIfNoCase User-Agent...
# Deny everything flagged above (Apache 2.4 / mod_authz_core)
<RequireAll>
Require all granted
Require not env bad_bots
</RequireAll>
</Location>
</IfModule>
iplists.firehol.org
Don't get me wrong here, but AWS is excellent for CloudFront/S3 hosting of objects - global replication network and all. Everything else after that (IMHO) is big money.

Interesting. Typically people realize that AWS is not a cheap option after they went to the cloud to save money.
This is a high score for Xenforo.com as of a few mins ago:
View attachment 337529
Must be bad out there
Interestingly enough, things were slow for me over the past 72 hours... granted Anubis is just humming along without incident.

"guests: 95,699" for me, atm.
On cpanel:
I also give a 403 to bots scanning for specific files (like WordPress paths).
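One way to hand out those 403s in Apache 2.4 (a sketch, not the poster's exact rule; the paths are just the usual WordPress probe targets):

```apacheconf
# Refuse common WordPress probe paths outright; Require all denied returns 403.
# Only sensible on a server that doesn't actually run WordPress.
<LocationMatch "/(wp-login\.php|xmlrpc\.php|wp-admin)">
    Require all denied
</LocationMatch>
```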
- countries and ASN in firewall cc_deny (csf)
- user agents in apache with SetEnvIfNoCase (headless browsers, missing user agents, scanners, data crawlers, ...)
- firehol 1 in firewall lfd blocklist (csf)
Blocking specific countries in South America had the most effect.
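A sketch of where those csf pieces live on a stock cPanel/csf install (the country codes are placeholders, not the poster's actual list; the FireHOL level-1 netset URL is from iplists.firehol.org):

```
# /etc/csf/csf.conf - deny whole countries by ISO code (cc_deny)
CC_DENY = "BR,PE,AR"

# /etc/csf/csf.blocklists - lfd downloads and refreshes each list
# Format: NAME|INTERVAL(seconds)|MAX_IPS(0 = no limit)|URL
FIREHOL1|86400|0|https://iplists.firehol.org/files/firehol_level1.netset
```

`csf -r` reloads the rules after editing.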
I now have (what appears to be) a massive botnet performing hellacious rainbow/dictionary requests on my name servers.
One of the most recent attacks was more than 20k queries/query-attempts in a matter of mere seconds, spanning more than 1,500 unique IP addresses all querying the same things, across IPv4 and IPv6.
It sure has gotten annoying post 'AI-vibe-coding'. These things have the same style of probing/flooding attacks as decades past, but now it is much more distributed and, with some, much more bandwidth/resource intensive. At least back then, the skiddies would rate limit their attacks and probe over time.

Wow. Sometimes one would like to have the virtual SWAT team with its black minibus in place for the people that do these kinds of things.
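For the name-server floods described above, BIND's built-in response rate limiting is the usual damper (a sketch; the numbers are illustrative, not a recommendation from the thread):

```
// named.conf, inside the options block (BIND 9.10+)
rate-limit {
    responses-per-second 10;  // identical answers per client netblock per second
    window 5;                 // seconds over which rates are averaged
    slip 2;                   // every 2nd dropped response is sent truncated,
                              // so legitimate resolvers retry over TCP
};
```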
That would be a better option. At the moment we temp ban an IP that repeatedly hits 403.

Was going to say, if you're using nginx, throw error 444 back at the bad actors. For some reason, HTTP 444 completely breaks most of these poorly coded bots/scrapers (hint: you'll get request hammering from time to time). But alas, you're using Apache for your web services.
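For anyone on nginx, 444 is nginx's own non-standard code: the server simply closes the connection without writing any response. A minimal sketch (the user-agent regex is an example, not an exhaustive list):

```nginx
# Goes in the http {} context: flag automation user agents, then drop
# their connections with no response at all.
map $http_user_agent $bad_bot {
    default                                             0;
    "~*(selenium|puppeteer|playwright|headlesschrome)"  1;
}

server {
    listen 80;
    server_name forum.example.com;  # placeholder

    if ($bad_bot) {
        return 444;   # close the connection, send nothing
    }
}
```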

They hit 250k a couple weeks back. I uploaded a screenshot somewhere on the forum here.
Very interesting metrics! BrettC, thanks for mentioning this.
I'm not noticing an uptick in bots or CPU consumption here, but I looked at my fail2ban history and noticed it has been very busy rate-limiting 401s.
I rate limit 404, 403, and 401 responses, and also how fast you are submitting POSTs; these are all related to probing activity.
View attachment 337571
They must not have been too well distributed if fail2ban picked up so many of them.
I don't run my own DNS, so no signals there.
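The status-code rate limiting described above can be sketched as a fail2ban filter/jail pair (file names, thresholds, and log path are illustrative, not the poster's actual config):

```ini
# /etc/fail2ban/filter.d/http-probe.local  (hypothetical name)
[Definition]
# Match Apache access-log lines whose response status is 401, 403 or 404
failregex = ^<HOST> .*" (401|403|404)\s

# /etc/fail2ban/jail.d/http-probe.local
[http-probe]
enabled  = true
port     = http,https
filter   = http-probe
logpath  = /var/log/apache2/access.log
# ban for an hour after 20 matches within 60 seconds
findtime = 60
maxretry = 20
bantime  = 3600
```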
This just shows what was said earlier though: if you have a well tuned server, none of it matters. It takes time to chase nasties around the internet, and even when you think you caught them, they change IPs / ASNs / use residential proxies. You can tune and forget, or spend time daily/weekly trying to stop it all. There is a healthy middle ground IMO, maybe 5 minutes weekly, but they still shift faster than you will ever block.
Yes, I have read about the same, but CF is not sharing any data related to it with the general public; I have not seen sites which have implemented this yet.

I did see that Cloudflare is adding a service (I believe it's in a closed beta) where you can offer your data to AI scrapers for a payment. I guess one of the HTTP return codes, 402 Payment Required, is the mechanism they use, and from there they've found a way to implement payment.
This is the worst part of using CF: I have blocked all countries as our site was getting heavily bombed with AI bots, and CF finds some other way to block innocent legitimate users that are beyond my control. Which is what is happening right now.
Are you doing this on CF?

Using the following blocklists:
- countries
- ASN
- user agents
- firehol 1
It's a closed beta at the moment, but they document how it works - https://developers.cloudflare.com/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/

But let's hope it will benefit forum revenue.
No, on a cpanel server with firewall (csf = ConfigServer Firewall).

Are you doing this on CF?
Can you share a list of countries and ASNs? I have blocked most of the common user agents, which helped get rid of spam traffic.
But I sense even legitimate users are unable to clear the CF captcha and enter the forums.
I also blocked Google spiders doing the same.
Not sure what firehol 1 is?
Don't block them, managed-challenge them. Different errors, different reactions.

Block 1 country, they find another and another...
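In Cloudflare terms that means a WAF custom rule whose action is Managed Challenge rather than Block (the expression below uses Cloudflare's Rules language; the country codes and ASN are placeholders):

```
# Security -> WAF -> Custom rules
# Expression:
(ip.geoip.country in {"XX" "YY"}) or (ip.geoip.asnum eq 64496)
# Action: Managed Challenge - real browsers solve it invisibly,
# most scripted clients never do.
```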