Crazy amount of guests

Levina

Active member
I have a small forum, usually with a few hundred guests at most. Over the last few days the number of guests has been unusually high, around 3000 and climbing. We now have over 4800 guests. That's not normal. Many are shown as "Viewing unknown page" or "Latest content", with a warning sign. The IP addresses are from all corners of the world. Many are from South America, mostly Brazil. I also saw quite a number from Vietnam of all places, but everywhere else too.

We're on Cloud and I don't know what to do. Could we be under attack? We have been in the past, a few times I think. For reasons I cannot possibly fathom; we're a small photography forum for crying out loud.
 
A few of our clients (not XenForo) have had crawlers (looking at the traffic patterns, we assume they're poorly written AI-training bots) and huge traffic spikes from Brazil and Vietnam over this last week, so they are evidently actively trawling. Not sure if that's of any comfort, but I think that it's probably not an attack as such. Obviously we tend to shut down traffic from the various networks involved. Beyond enquiring of XF themselves, I'm not sure what your easy options are on their cloud platform.
 
 
Install this one to get more insight
I have that installed but I'm too stupid to know what to look for. There's a long list of the following, but I don't know what that means or how that helps.

[Attached screenshot: Schermafbeelding 2025-10-12 om 21.59.32.webp]

There was also one of these in the list, from OpenAI. I use ChatGPT sometimes, so it would be hypocritical not to allow its search bot.

[Attached screenshot: Schermafbeelding 2025-10-12 om 22.00.28.webp]
 
Yep, that's us too.

I see some 4600 guests even here too.
 
Yeah, the weaknesses of my KnownBots add-on are primarily:
  1. I have to have identified a bot and created a definition for it before it shows up as a bot, so new bots won't show up automatically until I get them added (and new bots are being created at a huge rate now!)
  2. So many bots do not identify themselves in any case. There are a lot of malicious bots out there which ignore robots.txt and don't identify themselves via their user agent, so we can't tell that they are bots.
So many AI crawlers and other bots refuse to identify themselves, and nobody can really stop them unless you want to start blocking entire networks, which may also block legitimate traffic. There's no easy solution here.
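
To make the two weaknesses concrete, here is a minimal Python sketch of definition-based matching on the User-Agent header. It is not the KnownBots add-on's code: the `BOT_DEFINITIONS` table, the `classify_user_agent` function and the sample user agent strings are all made up for illustration.

```python
import re
from typing import Optional

# Hypothetical bot definitions (name -> pattern). Illustrative only,
# not taken from the KnownBots add-on.
BOT_DEFINITIONS = {
    "GPTBot": re.compile(r"GPTBot", re.IGNORECASE),
    "Googlebot": re.compile(r"Googlebot", re.IGNORECASE),
    "bingbot": re.compile(r"bingbot", re.IGNORECASE),
}


def classify_user_agent(user_agent: str) -> Optional[str]:
    """Return the name of the first matching definition, or None if nothing matches."""
    for name, pattern in BOT_DEFINITIONS.items():
        if pattern.search(user_agent):
            return name
    return None


# Sample strings, invented for this example.
declared_bot = "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"
new_bot = "BrandNewCrawler/0.1"  # weakness 1: no definition exists yet
disguised = (  # weakness 2: a scraper sending an ordinary browser string
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
)

for ua in (declared_bot, new_bot, disguised):
    print(classify_user_agent(ua))
```

Only the first string is recognised. The new crawler and the disguised scraper both come back as `None`, so they end up in the undifferentiated "other user agents" pile.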
 
A majority of these bots are lying about what they are, and pretending to be legitimate clients, so they're very hard to identify.
And they are running wild all over the internet lately scraping every piece of info they can find.
Mostly it's Chinese companies training AI.

I use a series of fail2ban rules attached to the Apache web logs to keep them at bay. This results in about half the bot slip-through compared with using Cloudflare, at best.
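
The actual rules aren't shown here, so the following is only a rough Python sketch of the general idea behind log-based banning: count requests per IP inside a time window and flag anything over a threshold (fail2ban expresses this with its `maxretry` and `findtime` settings). The function name `find_heavy_hitters`, the threshold values and the sample log lines are assumptions for illustration, and the sketch assumes the log lines are in chronological order.

```python
import re
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Start of an Apache common/combined log line: client IP, identity, user, [timestamp]
LOG_LINE = re.compile(r"^(\S+) \S+ \S+ \[([^\]]+)\]")


def find_heavy_hitters(lines, max_requests=3, window=timedelta(seconds=60)):
    """Return the IPs that made more than max_requests requests within the window."""
    recent = defaultdict(deque)  # IP -> timestamps still inside the window
    flagged = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        ip, raw_time = match.groups()
        when = datetime.strptime(raw_time, "%d/%b/%Y:%H:%M:%S %z")
        hits = recent[ip]
        hits.append(when)
        # Forget requests that have fallen out of the window.
        while when - hits[0] > window:
            hits.popleft()
        if len(hits) > max_requests:
            flagged.add(ip)
    return flagged


# Invented sample lines using documentation-range IP addresses.
sample = [
    '203.0.113.5 - - [12/Oct/2025:21:59:30 +0200] "GET /threads/1/ HTTP/1.1" 200 512',
    '203.0.113.5 - - [12/Oct/2025:21:59:31 +0200] "GET /threads/2/ HTTP/1.1" 200 512',
    '198.51.100.7 - - [12/Oct/2025:21:59:31 +0200] "GET / HTTP/1.1" 200 1024',
    '203.0.113.5 - - [12/Oct/2025:21:59:32 +0200] "GET /threads/3/ HTTP/1.1" 200 512',
    '203.0.113.5 - - [12/Oct/2025:21:59:33 +0200] "GET /threads/4/ HTTP/1.1" 200 512',
]
print(find_heavy_hitters(sample))
```

A per-IP counter like this only catches crawlers that hammer from a single address. Scrapers that spread requests across many networks stay under the threshold, which fits the observation that some bots still slip through.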

We have a huge amount of content, so without any control this bot count would climb into the tens of thousands.
 
I think that it's probably not an attack as such.
I don't think so either, actually, because with previous attacks our site slowed down to a trickle and a few times just shut down altogether, so I had to ask the good people of XenForo for help. And that is not happening now.

Guest numbers are down at the moment to 4200 from well over 5000 yesterday. Hopefully that is a trend.
 
This seems to be a big issue these days that affects all forums regardless of size. It's not going away anytime soon, as we live in an era of hungry AI sucking up huge amounts of data to train their models.
 
Yeah, the weaknesses of my KnownBots add-on are primarily:
I didn't mean to criticise, Sim. I apologise if I gave that impression. If anything I blame myself because I don't really know what to look for in the "other user agents" list.
 
This seems to be a big issue these days that affects all forums regardless of size. It's not going away anytime soon, as we live in an era of hungry AI sucking up huge amounts of data to train their models.
I don't know about others but I do have a bit of a dilemma when it comes to these AI scrapers. I use ChatGPT regularly. To help me with texts, to help me with code. And to then deny it access to my forum feels, as I said above, hypocritical. But where to draw the line? Which ones to allow, which ones to block? I have no idea.
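
One way to think about where to draw the line is per-bot rules in robots.txt, keeping in mind that this only affects bots that choose to obey it. The Python sketch below uses the standard library's `urllib.robotparser` to check what a sample policy would permit. The policy itself is hypothetical, the `GPTBot` (training crawler) and `OAI-SearchBot` (search crawler) tokens are the names OpenAI publishes as far as I know and should be verified against its current documentation, and whether XenForo Cloud lets you edit robots.txt at all is not something this thread establishes.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow the search crawler,
# and keep everyone out of the admin panel.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /admin.php
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it over HTTP."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
for agent in ("GPTBot", "OAI-SearchBot", "SomeBrowser"):
    # The thread path is a made-up example URL.
    print(agent, parser.can_fetch(agent, "/threads/example.1/"))
```

Under this policy the training crawler is refused, while the search crawler and ordinary visitors can read threads. It does nothing about the bots that ignore robots.txt altogether.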
 
I didn't mean to criticise, Sim. I apologise if I gave that impression. If anything I blame myself because I don't really know what to look for in the "other user agents" list.

No criticism taken. Bot management is a tricky task - I'm pretty realistic about my inability to meaningfully identify the majority of bots - especially when so many are poorly behaved and don't adequately identify themselves these days.

Just to give you a sense of the scale of the issue - here are some stats from my KnownBots management tool:

[Attached screenshot: 1760388606561.webp]

That top number is my TODO list - there are over 48,000 unique user agents that have been sent to me by sites using my add-on, which I need to go through to identify the bots among them and then create a definition for each, so the add-on can tell which user agents are bots and which can be ignored.

The vast majority of those will be genuine browsers (or random strings used by some malicious bots) - but sorting through the browsers to find the bots can be time-consuming.

Still, as you can see - of the 232,000 unique user agent strings I've collected so far, over 46,000 of them have been identified as bots - and I've managed to identify 1,871 different bots so far.

I generally recommend using Cloudflare to help with bot management - paid plan if you can afford it, but even the free plan has some basic bot management tools available.
 