What tooling did you choose to make those work?

On cpanel:
<IfModule mod_setenvif.c>
<Location />
# Headless / automation browsers
SetEnvIfNoCase User-Agent "(selenium|puppeteer|phantomjs|phantom|playwright(-chromium|-webkit|-firefox)?|headlesschrome|cypress|chromiumbot|headlessbot|slimerjs|triflejs|TestCafe|Nightwatch|WebDriverIO|Taiko|RobotFramework|Protractor|Nightmare|CasperJS|ZombieJS|Splash|HtmlUnit|WebKitTestRunner)" bad_bots
# AI / content / data crawlers
SetEnvIfNoCase User-Agent...
# Deny everything flagged above (Apache 2.4 / mod_authz_core)
<RequireAll>
Require all granted
Require not env bad_bots
</RequireAll>
</Location>
</IfModule>
iplists.firehol.org
Don't get me wrong here, but AWS is excellent for CloudFront/S3 hosting of objects - global replication network and all. Everything else after that (IMHO) is big money.

Interesting. Typically people realize that AWS is not a cheap option after they went to the cloud to save money.
This is a high score for Xenforo.com as of a few mins ago:
View attachment 337529
Must be bad out there
Interestingly enough, things were slow for me over the past 72 hours... granted Anubis is just humming along without incident.

"guests: 95,699" for me, atm.
On cpanel:
I also give a 403 to bots scanning for specific files (like WordPress paths).
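One way to hand out those 403s in Apache 2.4 (a sketch, not the poster's exact rule; the paths are just the usual WordPress probe targets):

```apacheconf
# Refuse common WordPress probe paths outright; Require all denied returns 403.
# Only sensible on a server that doesn't actually run WordPress.
<LocationMatch "/(wp-login\.php|xmlrpc\.php|wp-admin)">
    Require all denied
</LocationMatch>
```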
- countries and ASN in firewall cc_deny (csf)
- user agents in apache with SetEnvIfNoCase (headless browsers, missing user agents, scanners, data crawlers, ...)
- firehol 1 in firewall lfd blocklist (csf)
Blocking specific countries in South America had the most effect.
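A sketch of where those csf pieces live on a stock cPanel/csf install (the country codes are placeholders, not the poster's actual list; the FireHOL level-1 netset URL is from iplists.firehol.org):

```
# /etc/csf/csf.conf - deny whole countries by ISO code (cc_deny)
CC_DENY = "BR,PE,AR"

# /etc/csf/csf.blocklists - lfd downloads and refreshes each list
# Format: NAME|INTERVAL(seconds)|MAX_IPS(0 = no limit)|URL
FIREHOL1|86400|0|https://iplists.firehol.org/files/firehol_level1.netset
```

`csf -r` reloads the rules after editing.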
I now have (what appears to be) a massive botnet performing hellacious rainbow/dictionary requests on my name servers.
One of the most recent attacks was more than 20k queries/query-attempts in a matter of mere seconds, spanning more than 1,500 unique IP addresses all querying the same things, across IPv4 and IPv6.
It sure has gotten annoying post 'AI-vibe-coding'. These things have the same style of probing/flooding attacks as decades past, but now it is much more distributed and, with some, much more bandwidth/resource intensive. At least back then, the skiddies would rate limit their attacks and probe over time.

Wow. Sometimes one would like to have the virtual SWAT team with its black minibus in place for the people that do these kinds of things.
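For the name-server floods described above, BIND's built-in response rate limiting is the usual damper (a sketch; the numbers are illustrative, not a recommendation from the thread):

```
// named.conf, inside the options block (BIND 9.10+)
rate-limit {
    responses-per-second 10;  // identical answers per client netblock per second
    window 5;                 // seconds over which rates are averaged
    slip 2;                   // every 2nd dropped response is sent truncated,
                              // so legitimate resolvers retry over TCP
};
```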
That would be a better option. At the moment we temp ban an IP that repeatedly hits 403.

Was going to say, if you're using nginx, throw error 444 back at the bad actors. For some reason, HTTP 444 completely breaks most of these poorly coded bots/scrapers (hint: you'll get request hammering from time to time). But alas, you're using Apache for your web services.
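For anyone on nginx, 444 is nginx's own non-standard code: the server simply closes the connection without writing any response. A minimal sketch (the user-agent regex is an example, not an exhaustive list):

```nginx
# Goes in the http {} context: flag automation user agents, then drop
# their connections with no response at all.
map $http_user_agent $bad_bot {
    default                                             0;
    "~*(selenium|puppeteer|playwright|headlesschrome)"  1;
}

server {
    listen 80;
    server_name forum.example.com;  # placeholder

    if ($bad_bot) {
        return 444;   # close the connection, send nothing
    }
}
```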

They hit 250k a couple weeks back. I uploaded a screenshot somewhere on the forum here.
Very interesting metrics! BrettC, thanks for mentioning this.
I'm not noticing an uptick in bots or CPU consumption here, but I looked at my fail2ban history and noticed it has been very busy rate-limiting 401s.
I rate limit 404, 403, and 401 responses, and also how fast you are submitting POSTs; these are all related to probing activity.
View attachment 337571
They must not have been too well distributed if fail2ban picked up so many of them.
I don't run my own DNS, so no signals there.
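The status-code rate limiting described above can be sketched as a fail2ban filter/jail pair (file names, thresholds, and log path are illustrative, not the poster's actual config):

```ini
# /etc/fail2ban/filter.d/http-probe.local  (hypothetical name)
[Definition]
# Match Apache access-log lines whose response status is 401, 403 or 404
failregex = ^<HOST> .*" (401|403|404)\s

# /etc/fail2ban/jail.d/http-probe.local
[http-probe]
enabled  = true
port     = http,https
filter   = http-probe
logpath  = /var/log/apache2/access.log
# ban for an hour after 20 matches within 60 seconds
findtime = 60
maxretry = 20
bantime  = 3600
```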
This just shows what was said earlier though: if you have a well tuned server, none of it matters. It takes time to chase nasties around the internet, and even when you think you caught them, they change IPs / ASNs / use residential proxies. You can tune and forget, or spend time daily/weekly trying to stop it all. There is a healthy middle ground IMO, maybe 5 minutes weekly, but they still shift faster than you will ever block.
Yes, I have read about the same, but CF is not sharing any data related to it with the general public; I have not seen sites which have implemented this yet.

I did see that Cloudflare is adding a service (I believe it's in a closed beta) where you can offer your data to AI scrapers for a payment. I guess one of the HTTP return codes, 402 Payment Required, is the mechanism they use, and from there they've found a way to implement payment.
This is the worst part of using CF: I have blocked all countries as our site was getting heavily bombed with AI bots, and CF finds some other way to block innocent legitimate users that are beyond my control. Which is what is happening right now.
Are you doing this on CF?

Using the following blocklists:
- countries
- ASN
- user agents
- firehol 1
It's a closed beta at the moment, but they document how it works - https://developers.cloudflare.com/ai-crawl-control/features/pay-per-crawl/what-is-pay-per-crawl/

But let's hope it will benefit forum revenue.
No, on a cpanel server with firewall (csf = ConfigServer Firewall).

Are you doing this on CF?
Can you share a list of countries and ASNs? I have blocked most of the common user agents, which helped get rid of spam traffic.
But I sense even legitimate users are unable to clear the CF captcha and enter the forums.
I also blocked Google spiders doing the same.
Not sure what firehol 1 is?
Don't block them, managed-challenge them. Different errors, different reactions.

Block 1 country, they find another and another...
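In Cloudflare terms that means a WAF custom rule whose action is Managed Challenge rather than Block (the expression below uses Cloudflare's Rules language; the country codes and ASN are placeholders):

```
# Security -> WAF -> Custom rules
# Expression:
(ip.geoip.country in {"XX" "YY"}) or (ip.geoip.asnum eq 64496)
# Action: Managed Challenge - real browsers solve it invisibly,
# most scripted clients never do.
```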