Crazy amount of guests

Funnily enough, I've just noticed an increase in "guests", mainly from Vietnam with some from Brazil and China. About 500 at the moment - not as high as some, but that's high for my site. The average is usually 180, including robots I know about as well as guests. This is 500-ish guests alone.

So what would it be from Vietnam?
 
Oh, the traffic has been bouncing around different Southeast Asian countries for a while now. The enormous network in Brazil was kind of a surprise.
 
If you're getting persistent, random-page-access traffic from parts of the world that traditionally would not visit your site, you may be seeing what are known as residential proxies (RESIPs) 'browsing' your site. Check out this paper if you're curious to learn more: https://ieeexplore.ieee.org/document/10814519 ( https://dl.ifip.org/db/conf/cnsm/cnsm2024/1571050912.pdf )

For example, at various times during 2025 I have seen increased usage from such foreign IPs, effectively doing "AI scraper things": random page accesses with no correlation to the previously visited page(s). It becomes rather apparent with various South American ISPs when I see a slew of requests whose Accept-Language header reports zh-CN or just zh, i.e. Chinese-originating clients coming in via those South American IP addresses. Sure makes you go "hmmmm, well that isn't right!" - especially when it's hundreds of them flooding the site with requests that bear no resemblance to a normal user's page-to-page navigation (Index -> Forum -> Thread).
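A minimal Python sketch of that Accept-Language sanity check, in case it helps anyone. It assumes you already have a country code for the client IP from some GeoIP lookup; the `EXPECTED_LANGS` table and both function names are made up for illustration, and a mismatch is only a hint, not proof (travellers and expats exist).

```python
# Hypothetical table: languages you would normally expect per country code.
# Fill this in from your own traffic; these entries are only examples.
EXPECTED_LANGS = {
    "BR": {"pt", "en"},
    "AR": {"es", "en"},
    "VN": {"vi", "en"},
}


def primary_languages(accept_language: str) -> set[str]:
    """Return the base language codes found in an Accept-Language header."""
    langs = set()
    for part in accept_language.split(","):
        tag = part.split(";")[0].strip().lower()  # drop any ";q=0.9" weight
        if tag and tag != "*":
            langs.add(tag.split("-")[0])  # "zh-CN" -> "zh"
    return langs


def language_mismatch(country_code: str, accept_language: str) -> bool:
    """True when none of the requested languages is expected for the country."""
    expected = EXPECTED_LANGS.get(country_code.upper())
    if expected is None:
        return False  # no expectation recorded for this country
    langs = primary_languages(accept_language)
    if not langs:
        return False  # empty header: judge it with other signals instead
    return langs.isdisjoint(expected)
```

Something like `language_mismatch("BR", "zh-CN,zh;q=0.9")` would flag the pattern described above, while a Brazilian visitor sending `pt-BR` would pass.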

One can begin applying country-code-based rate limits or outright blocks, but these AI scraper companies/institutes then use other means to evade being limited or restricted from your resources. I have seen AI scrapers use vast swaths of AWS datacenter IP ranges, then move to various smaller cloud providers that allow rapid spin-up/shutdown of instances, each getting its own unique IP address. Now they're seemingly using the most roundabout way possible to grab data: proxies. :(
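For what a country-code rate limit could look like, here is a rough sliding-window sketch in Python. The class name, the per-country limits and the 60-second window are all assumptions for illustration, not anything a particular forum package ships with; a limit of 0 acts as an outright block.

```python
import time
from collections import defaultdict, deque


class CountryRateLimiter:
    """Sliding-window request limiter keyed by country code (illustrative)."""

    def __init__(self, limits, default_limit, window_seconds=60.0):
        self.limits = limits              # e.g. {"VN": 30, "BR": 30}; 0 = block
        self.default_limit = default_limit
        self.window = window_seconds
        self.hits = defaultdict(deque)    # country code -> request timestamps

    def allow(self, country_code, now=None):
        """Return True if one more request from this country fits the limit."""
        now = time.monotonic() if now is None else now
        recent = self.hits[country_code]
        # Forget timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        limit = self.limits.get(country_code, self.default_limit)
        if len(recent) >= limit:
            return False
        recent.append(now)
        return True
```

As noted above, this only helps until the scraper moves to another network, so treat it as one layer rather than a fix.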
 
Right. You're playing whack-a-mole with country bans if your server protection isn't smart enough to behaviorally analyze what's going on per IP address, which is technically possible if you do it at the PHP level.
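To make "behavioral analysis per IP" a bit more concrete, here is a small sketch (in Python rather than PHP, purely for illustration). It counts how many of an IP's requests arrive without a referer pointing at a page that same IP already fetched. The class name and both thresholds are assumptions, and proxies that rotate addresses quickly may never accumulate enough requests per IP for this to trigger.

```python
from collections import defaultdict


class NavigationTracker:
    """Flags IPs whose requests rarely follow from pages they already visited."""

    def __init__(self, min_requests=10, max_uncorrelated_ratio=0.8):
        self.min_requests = min_requests
        self.max_uncorrelated_ratio = max_uncorrelated_ratio
        self.seen = defaultdict(set)          # ip -> paths already requested
        self.total = defaultdict(int)         # ip -> request count
        self.uncorrelated = defaultdict(int)  # ip -> requests with no prior link

    def record(self, ip, path, referer_path=None):
        """Record one request; referer_path is the path part of the Referer."""
        visited = self.seen[ip]
        self.total[ip] += 1
        # The very first request has nothing to correlate with, so skip it.
        if visited and referer_path not in visited:
            self.uncorrelated[ip] += 1
        visited.add(path)

    def is_suspicious(self, ip):
        """True once enough requests are seen and most are uncorrelated."""
        total = self.total.get(ip, 0)
        if total < self.min_requests:
            return False
        return self.uncorrelated.get(ip, 0) / total >= self.max_uncorrelated_ratio
```

A normal Index -> Forum -> Thread visitor keeps the uncorrelated count near zero, while a scraper hitting random threads with no referer climbs toward the threshold.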
 
This:

 
Be careful, I recently read on the forum that someone blocked all IPs from Singapore and mistakenly included Google's IPs from that location as well.

That's why it is necessary to find the IP block(s) of Google's crawler and whitelist them.
Their crawler does not fully adhere to robots.txt and will violate many rules you set to try to defeat bots.

I found this out the hard way. Hopefully nobody else has to :)
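A rough sketch of checking the whitelist before any country block, using Python's standard `ipaddress` module. The CIDR ranges below are documentation placeholders, NOT Google's real ranges, so swap in the crawler ranges you have verified yourself; the function names are also just illustrative.

```python
import ipaddress

# Placeholder ranges (reserved documentation blocks), not real crawler ranges.
ALLOWLIST_CIDRS = ["192.0.2.0/24", "2001:db8::/32"]
ALLOWLIST = [ipaddress.ip_network(cidr) for cidr in ALLOWLIST_CIDRS]


def is_allowlisted(ip: str) -> bool:
    """True if the address falls inside any allowlisted network."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP address at all
    return any(addr in net for net in ALLOWLIST)


def should_block(ip: str, country_code: str, blocked_countries: set[str]) -> bool:
    """Apply the country block only to addresses that are not allowlisted."""
    if is_allowlisted(ip):
        return False
    return country_code in blocked_countries
```

With this ordering, a blanket block on a country such as Singapore would still let through any address inside the ranges you have allowlisted.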
 
Had pages and pages of bots trying to DDoS me. All from Amazon, Singapore and China.
I already don't bother with getting indexed by Google because I get stalked by lowlifes who should know better.
 
Seems that both Cloudflare users and I have seen a doubling of the bot count in the last hour or so.
Not happy about it.

Looks like I need to get this next-generation bot protection coded up sooner rather than later.
 