Crazy amount of guests

Give PHP more capacity and ensure the DB is well tuned. Problem solved.
I am a bit curious:
How would this solve the problem that AI crawlers feed their models with knowledge from our forums and thus drive away potential visitors to their services without generating revenue (like human visitors would)?
 
Did you try to do anything about it since you started the thread back in October last year?
As you may remember I'm on Cloud hosting so my choices are limited and I still don't want to switch to Cloudflare.

What I did in the past was use discourage mode. But if I put too many IP addresses in there it would throttle the site and eventually grind to a halt.

The mistake I made was cranking every setting up to 100%. So I made some changes: I turned off everything except the redirect, which I set to 100%. Some still get through just by sheer volume, but it's minor and it doesn't affect the site now. So this works for me.

But not when I have 12k bots all with different IP addresses. If I were smart I would get us on Cloudflare where I could probably block an entire ASN? But these bot swarms don't seem to stay long and they don't throttle the site.
 
As you may remember I'm on Cloud hosting so my choices are limited and I still don't want to switch to Cloudflare.
Understandable.
What I did in the past was use discourage mode. But if I put too many IP addresses in there it would throttle the site and eventually grind to a halt.
Foreseeable issue. And on top of that it is the wrong strategy: it harms you but barely affects the scrapers.
But not when I have 12k bots all with different IP addresses. If I were smart I would get us on Cloudflare where I could probably block an entire ASN? But these bot swarms don't seem to stay long and they don't throttle the site.
You should read the thread that you started to better understand what is happening. Plus to find out that there are solutions that do not involve Cloudflare. I'd recommend using this add on


which should work on cloud and gives you

• rate limiting with challenge
• VPN blocking
• ASN blocking
• Country blocking
• Net and IP blacklisting
• Net and IP whitelisting
• realtime check of IPs against the proxycheck.io API

This is a toolset that can massively limit the problem. I can provide you with the commented list of 450+ ASNs I block and the countries that I block. From both lists you can easily remove the entries you do not want to block because of your regular audience, but well over half of the ASNs are datacenters rather than normal users, so no harm is to be expected. This will alleviate the issue massively.

On top of that this add on


is relatively simple but does the job to provide you with the country an IP visiting your forum is coming from. Pretty neat for finding holes in your blocking strategy and fixing them.
 
As you may remember I'm on Cloud hosting so my choices are limited and I still don't want to switch to Cloudflare.
I haven't read the whole thread, so this may be out of context.
But you don't have to switch hosting or switch registrars to use the free Cloudflare plan.

Just point your domain to cloudflare and then point cloudflare to your cloud site.
(a little more than just "pointing", but you get the idea)

You're not really switching anything. I am on XFcloud.
 
You should read the thread that you started to better understand what is happening. Plus to find out that there are solutions that do not involve Cloudflare. I'd recommend using this add on
I probably deserve that. The problem is that much of what was discussed is a bit over my head.


which should work on cloud and gives you

• rate limiting with challenge
• VPN blocking
• ASN blocking
• Country blocking
• Net and IP blacklisting
• Net and IP whitelisting
• realtime check of IPs against the proxycheck.io API

This is a toolset that can massively limit the problem. I can provide you with the commented list of 450+ ASNs I block and the countries that I block. From both lists you can easily remove the entries you do not want to block because of your regular audience, but well over half of the ASNs are datacenters rather than normal users, so no harm is to be expected. This will alleviate the issue massively.

On top of that this add on


is relatively simple but does the job to provide you with the country an IP visiting your forum is coming from. Pretty neat for finding holes in your blocking strategy and fixing them.
This is interesting. I'm going to look into that. And thank you for offering to share your ASN list! That is much appreciated.
 
I haven't read the whole thread, so this may be out of context.
But you don't have to switch hosting or switch registrars to use the free Cloudflare plan.

Just point your domain to cloudflare and then point cloudflare to your cloud site.
(a little more than just "pointing", but you get the idea)

You're not really switching anything. I am on XFcloud.
Yes, I know, I meant switching from what I do now to using Cloudflare. Thanks.
 
This is interesting. I'm going to look into that. And thank you for offering to share your ASN list! That is much appreciated.
They even have a sale going on that ends in a couple of hours:


And thank you for offering to share your ASN list! That is much appreciated.
You're welcome. The problem is very annoying and can easily (and probably will) go over the head of many forum owners. So why not help each other - we all suffer from the same problem.
 
I am a bit curious:
How would this solve the problem that AI crawlers feed their models with knowledge from our forums and thus drive away potential visitors to their services without generating revenue (like human visitors would)?
It doesn't solve that problem, it solves the never ending issue of time required to track all the bad actors around the www and try to stop them accessing your content. They have your content. You block them one way, they drop that IP block, they get it from another. They have it. Now you have a potentially good IP block, blocked, and now you're chasing to block the next bad IP block.

It is never ending. They move faster than you can. Stopping this traffic is Cloudflare's entire business, and the bad actors still move faster than even that large business can identify and isolate them.

You can turn off bad AI bots in the free CF plan. Hell, you can put everything through a managed challenge in CF if you truly want to stop all the automated nasty stuff, then just give a bypass to known good bots.

What you won't ever achieve though, is catching all the bad actors yourself. This very discussion has outlined how they're moving to residential proxies, even more difficult to isolate, because you will block legitimate ISP users.

Tune your server, sit back, relax and do more constructive things with your time (which you don't get back) OR put everything through a managed challenge with known good bots on skip. Both result in you having your time to do other things. You get one life, you can't recover time.
 
Okay, so I'm ready to purchase Osman's Threat Monitor because my forum is being visited by bot swarm after bot swarm and all with different IP addresses so I can't do anything about them and I want them gone.

So do I understand correctly that the Threat Monitor automatically detects bots?
 
So do I understand correctly that the Threat Monitor automatically detects bots?
Not in the way you probably think it would. It detects a certain behavior (requesting a lot of pages in a short time) and first puts a rate limit on the IP and ultimately blocks it. For everything else it relies on either the API of proxycheck.io or manual configuration. Proxycheck.io detects VPNs, proxies and all sorts of bad actors (which are then blocked by Threat Monitor), so also bots - but only a fraction of the residential proxies that are visiting your forum. It lacks the "intelligence" for that (which is no wonder, as no tool currently has that intelligence).
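To illustrate the behavior described above (many page requests in a short time lead to a rate limit first and a block later), here is a minimal Python sketch. It is not the add-on's actual code: the class name `RateLimiter`, the thresholds, and the "strikes" escalation rule are all assumptions made for illustration.

```python
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window counter per IP: challenge first, block on repeated abuse.

    All thresholds are illustrative assumptions, not the add-on's real defaults.
    """

    def __init__(self, max_requests=30, window_seconds=10.0, strikes_to_block=3):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.strikes_to_block = strikes_to_block
        self.hits = defaultdict(deque)   # ip -> timestamps of recent requests
        self.strikes = defaultdict(int)  # ip -> how often the limit was exceeded
        self.blocked = set()             # ips that are blocked outright

    def check(self, ip, now):
        """Return 'allow', 'challenge' or 'block' for a request at time `now`."""
        if ip in self.blocked:
            return "block"
        window = self.hits[ip]
        # Forget requests that have fallen out of the time window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()
        window.append(now)
        if len(window) <= self.max_requests:
            return "allow"
        # Limit exceeded: record a strike and start a fresh window.
        self.strikes[ip] += 1
        window.clear()
        if self.strikes[ip] >= self.strikes_to_block:
            self.blocked.add(ip)
            return "block"
        return "challenge"
```

A visitor who clicks through pages at human speed never fills the window, while a scraper that hammers the forum gets challenged and, after repeated strikes, blocked. As the post points out, this does little against swarms where every request comes from a different IP.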

But Threat Monitor enables you to block ASNs as well as whole countries, which is pretty effective. How effective it can be depends on the audience of your forum. If you are mainly serving one or a couple of countries you will have no issue blocking some others completely, and you get rid of a huge share of the residential proxies if you happen to be able to block Brazil, Argentina, India, Bangladesh, China, Hong Kong and Singapore (along with a longish list of others that scrape massively but not in the same league). If you can even block the US: even better. However, there is a price to pay: obviously no one from these countries can access your forum as a guest any more, and as a consequence no one from a blocked country will be able to register in your forum. The only other issue affects registered members from blocked countries who are not currently logged in - they are blocked like normal guests and therefore can't log in. So your forum members had better never log out if they live in a blocked country (or visit one) or use a blocked ASN.

Logged in users are let through, no matter where they connect from. Same goes for ASNs: There are a lot that you can block w/o any harm as basically nothing but bad actors come from there and there is no collateral damage as long as we talk about datacenter ranges. But if you block the ASN of Cable or DSL providers the same happens as before: Forum members that are already logged in and come from a blocked ASN do come through, but no one else.
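The rule described above (logged-in members always pass, guests are filtered by country and ASN) can be sketched in a few lines of Python. This is only an illustration of the logic, not the add-on's implementation; the `Visitor` type, the `is_allowed` function and the example lists are assumptions. The ASNs are taken from the range reserved for documentation, and the country codes are two of the countries named in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Visitor:
    logged_in: bool
    country: str  # ISO 3166-1 alpha-2 code, e.g. "BR"
    asn: int      # autonomous system number of the visitor's network


# Illustrative lists only (assumptions, not a recommended blocklist).
BLOCKED_COUNTRIES = {"BR", "IN"}
BLOCKED_ASNS = {64500, 64501}  # ASNs reserved for documentation


def is_allowed(visitor, blocked_countries=BLOCKED_COUNTRIES, blocked_asns=BLOCKED_ASNS):
    """Logged-in members always pass; guests are checked against both lists."""
    if visitor.logged_in:
        return True
    if visitor.country in blocked_countries:
        return False
    if visitor.asn in blocked_asns:
        return False
    return True


# A member who is logged in gets through even from a blocked country ...
print(is_allowed(Visitor(logged_in=True, country="BR", asn=64500)))
# ... but the same member, once logged out, is treated like any other guest.
print(is_allowed(Visitor(logged_in=False, country="BR", asn=64500)))
```

The second call shows the catch mentioned in the post: a logged-out member from a blocked country or ASN cannot reach the login page.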

Threat Monitor acts basically like a firewall with the advantage not to block logged in users. Pretty brilliant. However: It is a pretty big sword and you can do harm to your audience. Plus you have to find out what to block to be effective with as little collateral damage as possible. This is manual effort and time consuming and needs a bit of knowledge and/or learning. That's why I offered to share my blocking list with you - it will give you a massive kick start.

Also worth mentioning: 1,000 calls per day to proxycheck.io are free; if you need more (and you almost certainly will), it costs money. But it is cheap and money well spent. I run on the 20k-calls-per-day plan and this is plenty for my forum.
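Since every lookup counts against a daily quota, one common way to stretch it is to cache results per IP so that repeat visitors do not trigger a new call. The sketch below shows that idea only; whether the add-on caches this way is an assumption, and `CachedLookup` and the `lookup` callable are hypothetical stand-ins rather than the proxycheck.io API.

```python
class CachedLookup:
    """Wrap a reputation lookup with a per-IP cache and a daily call budget.

    `lookup` is any callable taking an IP string; here it stands in for a
    real API client, which this sketch does not implement.
    """

    def __init__(self, lookup, daily_quota=1000):
        self.lookup = lookup
        self.daily_quota = daily_quota
        self.cache = {}        # ip -> cached lookup result
        self.calls_today = 0   # lookups actually sent today

    def check(self, ip):
        """Return the cached or fresh result, or None if the quota is used up."""
        if ip in self.cache:
            return self.cache[ip]
        if self.calls_today >= self.daily_quota:
            return None  # caller decides how to treat unknown IPs
        self.calls_today += 1
        result = self.lookup(ip)
        self.cache[ip] = result
        return result

    def reset_day(self):
        """Call once per day to restore the budget; cached results are kept."""
        self.calls_today = 0
```

With 12k bots on different IP addresses, a cache helps little on the first hit of each address, which is why a swarm can eat through the free 1,000 calls quickly.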

This add on started as a rate limiter, which (in my eyes) can be useful but does not address the most relevant bot issue today. Later it evolved into the firewall-like tool that it is now. As far as I know it is the only add on that can block ASNs and countries from forum access, so if you want that, there is no alternative (plus it is a good tool anyway).

There is another add-on that claims to detect bots intelligently:


This one started as a statistics widget and evolved into something completely different in a very short time. I did not test it, as I wanted the abilities of Threat Monitor for my forum, but I am a little suspicious: it came out of nowhere (the first public resource of the developer) and the development speed is too fast for my taste, so I suspect it is massively AI coded. There are sometimes several feature releases on a single day. The advertising claims made are pretty gigantic (including having detection intelligence) and I don't fully buy them. But again: I haven't tried it. Also, some people seem to have issues with the latest releases. So overall not my first choice, all the more so as I already use Threat Monitor. Other people's mileage may vary, and the developer is very responsive.
 
@smallwheels

Thanks for the write-up!

I already have proxycheck.io active on my site for one of Andy's add-ons.

I'm sure there will be a learning curve, but I'm happy to learn.

I have one last question, if I may. I see that in the Threat Monitor's description it says that "Google, Bing, and other search engines are never blocked." Google sometimes uses IP addresses located in countries I would love to block. So would the good bots get through even then?
 
I have one last question, if I may. I see that in the Threat Monitor's description it says that "Google, Bing, and other search engines are never blocked." Google sometimes uses IP addresses located in countries I would love to block. So would the good bots get through even then?
Yes, it's a whitelist. Whitelist trumps blacklist, simply put. If you combine CF with Threat Monitor, you have a pretty good solution. CF does detect many residential proxies: basically, CF detects bot behaviour and places the bot into its own Labyrinth system, so the bot goes round and round in an endless loop that is not on your server. It does not catch all of them, but CF is getting better at this. At a guess, they see the issue for their own systems and are trying to build systems that detect anomalous behaviour. You just won't stop everything, EVER.

Add the right tools, set and forget. You can be pretty aggressive with Threat Monitor; the choice is yours how nice, or not, you are to misbehaving IPs.
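"Whitelist trumps blacklist" comes down to the order of the checks. A minimal Python sketch, assuming both lists are plain network ranges (the `decide` function is hypothetical and all ranges are reserved documentation addresses, not real crawler or scraper ranges):

```python
import ipaddress

# Placeholder ranges from the documentation address blocks (assumptions).
WHITELIST = [ipaddress.ip_network("192.0.2.0/28")]       # e.g. a known good bot
BLACKLIST = [
    ipaddress.ip_network("192.0.2.0/24"),                # a blocked net that contains it
    ipaddress.ip_network("198.51.100.0/24"),
]


def decide(ip, whitelist=WHITELIST, blacklist=BLACKLIST):
    """Check the whitelist first so it overrides any blacklist entry."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in whitelist):
        return "allow"
    if any(addr in net for net in blacklist):
        return "block"
    return "allow"


print(decide("192.0.2.5"))    # whitelisted, although its /24 is blacklisted
print(decide("192.0.2.200"))  # same /24, but outside the whitelisted /28
```

Because the whitelist is consulted before the blacklist, a search engine crawler coming from an otherwise blocked range still gets through, which is the situation asked about above.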
 