StopBotResources - Stop Bots From Hogging CPU and Bandwidth

tenants

I need some testers (Edit: testers are no longer required, but you can post your results below)

I've written this plugin, but I need testers who get lots of bots yet don't necessarily have many forum users or much server capacity (forums on a shared host: new, but getting lots of bots).

I had to write this since I myself was in this situation with a new forum, and noticed spam bots (not search engine spiders) were taking up gigs of bandwidth per day on a forum that has one or two real users!

This kills start-up forums.

Bots generally visit the index page, look for thread content, go to login/login to get the cookie, then register/register... all of this takes up a fair chunk of bandwidth (particularly if you have thousands of bots).

This plugin checks the bot against an API that holds a high percentage of the proxies used by bots on XenForo.
If it's a proxy used by a spam bot, the spam bot is sent a 404 for forums, threads, the index, registration and login (this does not happen for spiders or humans, just spam bots).

This significantly reduces CPU (only 1 lookup needs to be done, rather than 10-14 queries).
This significantly reduces bandwidth for that page (a few bytes rather than 40-80kb)
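
If you're curious how a check like this hangs together, here's a rough sketch of the idea in PHP (the lookup endpoint, the file-cache approach and the isSpamBotProxy() name are all made up for illustration; they are not the plugin's actual internals):

Code:
<?php
// Sketch: check the visitor's IP against a proxy blacklist API once,
// cache the verdict, and short-circuit the request with a 404 before
// the forum runs its usual 10-14 queries.

function isSpamBotProxy($ip)
{
    $cacheFile = sys_get_temp_dir() . '/botcheck_' . md5($ip);

    // One lookup per IP per day, so repeat hits cost almost nothing
    if (file_exists($cacheFile) && filemtime($cacheFile) > time() - 86400)
    {
        return (bool)file_get_contents($cacheFile);
    }

    // Hypothetical blacklist endpoint that returns "1" for a known proxy
    $result = @file_get_contents(
        'http://proxy-blacklist.example/check?ip=' . urlencode($ip)
    );
    $isBot = ($result === '1');

    file_put_contents($cacheFile, $isBot ? '1' : '0');
    return $isBot;
}

if (isSpamBotProxy($_SERVER['REMOTE_ADDR']))
{
    header('HTTP/1.1 404 Not Found'); // a few bytes instead of a full page
    exit;
}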


Overall, this means spam bots take up significantly fewer resources... this is fairly essential for any start-up forum finding it is running out of server resources.


I've only just installed it and found the following:

CPU before and after:

[Attachment: CPU_Before_After.webp]

Bandwidth before and after:

[Attachment: beforeAfterBandwidth.webp]

I would like to know if it works on other forums as well. If you can PM me your email, I'll send you the plugin, but I need screen caps like the above... I believe this should be tested for more than a week to know the impact (server resources can fluctuate a lot from day to day).

There are no plugin logs (creating logs would increase the CPU, so it would defeat the point), but you can see the 404s in your server access logs, and watch the bandwidth and CPU drop via cPanel tools / AWStats.
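
If you want to put a rough number on the 404s, a few lines of PHP will count them in a combined-format access log (the log path here is an assumption; adjust it for your host):

Code:
<?php
// Count 404 responses in a combined-format access log.
// The path is an example; shared hosts usually expose logs via cPanel.
$log = '/var/log/apache2/access.log';

$count = 0;
foreach (file($log) as $line)
{
    // In combined format the status code follows the quoted request line
    if (preg_match('/" 404 /', $line))
    {
        $count++;
    }
}
echo "404 responses: $count\n";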

I still need to keep track of bandwidth and CPU myself, since I've only just installed it, but it would be good if a few others tried it before I release it.
 
I think you have my email; maybe send me a copy of this one too, as I've not yet installed the other three.. lol. May as well do all four at once haha. I do get a lot of "bots" visiting my site, with not too many members currently, of course (haven't finished cleaning up the forum to email them all). But yeah, would love to test it. :)
 
Okay, take a snapshot of your CPU and bandwidth before... (use whatever you want, AWStats or other).
I'll send it over now
 
I know you're aiming this at low-resource start-ups (good idea), but I wondered if this would be useful for those of us with our own boxes, busy sites, and therefore lots of spam-bot traffic?
 
With your own box, you might not even notice the impact of bots on CPU / Bandwidth

Look at your stats; do you use AWStats?

Look at your countries. Certain countries (cough.. China, Russia... and we can't leave out the USA) use more bots than others; if you have a high % from those compared to your real users, then it's possible you have a high % of bots... possible.

Also look at your origin stats.
If you have a very high % of direct visitors, you either have loyal forum users or a high % of bots.

If you have a high % of bots compared to real users, then you will notice a significant difference when using this addon.

However, having said that, you're free to give it a bash anyway and let me know if it has a positive impact on your resources.
 
I was just asked a question:

"Does this make a lot of 404s show up in server error logs?"
Yes, the access logs for your server will show 404s when the spam bots hit it (so on your own server, yes).
But Google (and other search engine spiders) are not spam bots, so it will not pick any of your pages up as a 404 (there is no impact on spiders/bots that crawl your site; the only impact is on spam bots).

But some people might still see this as a negative thing, so... we could define the error and type of header in the ACP

Then it doesn't have to be a 404, I could just show a blank page and no header,
or even a custom 404:
404 The page you were looking for can't be found, now sod off bot

or maybe a custom 403:
403 Restricted access, I don't like the look of your face!


The header type and message can then be defined by you (even just a blank 200 if you like)


I personally like 404s, since they confuse bot users (and they end up messaging me here, telling me that my site is down..)
 
I've just updated this resource, it's certainly working particularly well for bandwidth

XF seems to have a fairly high peak range for CPU (the CPU peaks quite a lot even at rest.. when there are no users/bots)
But for bandwidth, I'm finding some very promising results

I've now installed this on 2 forums (send me your email address via conversation if you are interested in testing this)


[Attachment: update1Bandwidth.webp]
(it took me 2 days to notice the amount of bandwidth spam bots were taking, and 1/2 a day to panic and create the 1st version of this addon ;) )

It might look like my visits have dropped, but that's because all the spam bots are now sent a 404... I'm just left with my real users (and indexing bots) in the stats.
 
You can now customise your response, so you don't have to send a 404 back to spam bots; instead you can send any of these:

HTTP/1.1 100 Continue
HTTP/1.1 101 Switching Protocols
HTTP/1.1 200 OK
HTTP/1.1 201 Created
HTTP/1.1 202 Accepted
HTTP/1.1 203 Non-Authoritative Information
HTTP/1.1 204 No Content
HTTP/1.1 205 Reset Content
HTTP/1.1 206 Partial Content
HTTP/1.1 300 Multiple Choices
HTTP/1.1 301 Moved Permanently
HTTP/1.1 302 Found
HTTP/1.1 303 See Other
HTTP/1.1 304 Not Modified
HTTP/1.1 305 Use Proxy
HTTP/1.1 307 Temporary Redirect
HTTP/1.1 400 Bad Request
HTTP/1.1 401 Unauthorized
HTTP/1.1 402 Payment Required
HTTP/1.1 403 Forbidden
HTTP/1.1 404 Not Found
HTTP/1.1 405 Method Not Allowed
HTTP/1.1 406 Not Acceptable
HTTP/1.1 407 Proxy Authentication Required
HTTP/1.1 408 Request Time-out
HTTP/1.1 409 Conflict
HTTP/1.1 410 Gone
HTTP/1.1 411 Length Required
HTTP/1.1 412 Precondition Failed
HTTP/1.1 413 Request Entity Too Large
HTTP/1.1 414 Request-URI Too Large
HTTP/1.1 415 Unsupported Media Type
HTTP/1.1 416 Requested Range Not Satisfiable
HTTP/1.1 417 Expectation Failed
HTTP/1.1 500 Internal Server Error
HTTP/1.1 501 Not Implemented
HTTP/1.1 502 Bad Gateway
HTTP/1.1 503 Service Unavailable
HTTP/1.1 504 Gateway Time-out
HTTP/1.1 505 HTTP Version Not Supported

You can also customise the title and text sent via the response (a blank page with no title might be good for those of you getting hit really heavily).
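
For the curious, sending one of these responses boils down to something like the sketch below (the variable values are made-up examples standing in for the ACP options, not the add-on's real option names):

Code:
<?php
// Sketch: send a configurable status line, title and message.
$statusLine = 'HTTP/1.1 403 Forbidden';  // any status line from the list above
$title      = '403';                     // leave empty for a blank page
$message    = "Restricted access, I don't like the look of your face!";

header($statusLine);
header('Content-Type: text/html; charset=utf-8');

// The whole payload is a few bytes, versus 40-80kb for a rendered page
echo '<html><head><title>' . htmlspecialchars($title) . '</title></head><body>'
    . htmlspecialchars($message) . '</body></html>';
exit;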
 
I still need people to test this (for free); those that test it will obviously get the product for free once I release it (send me your email via PM). I now have a few forums testing it... but the more the better.

I've now also added the plugin to SurreyForum
SF doesn't get a huge number of spam bots (so it's not much of an issue), but already the bandwidth is down to about half:

[Attachment: surreyF.webp]
 
I've just installed this today, after first turning off Stop Country Spam. I use KeyCaptcha already, and that does stop 100% of spambot registrations. I also use Xenutiles, which, while bots aren't registering, still picks up attempted registrations.

With Stop Country Spam I had maybe 10 pages of attempted bot registrations in the back end. Since turning it off, within a little over 24 hours I have 150 pages of attempted bot registrations.

I don't put much stock in AWStats other than for comparing whether something is changing significantly or not, so it should do the job for this purpose.

The below are the last few days' snapshots, and I'm very keen to see if the bandwidth usage goes down, and whether the attempted-registration log even picks anything up at all now, as this should stop that completely.

Visitor figures and whatnot are pretty useless in the below IMHO, especially as people are returning to work now after Christmas... so a jump is occurring just from that, returning to normal. The bandwidth IMO is high though, and I believe it has more to do with spambots than legitimate users.

If this works, then it means I don't need Stop Country Spam, as I don't get high numbers of human spammers because we have a 24-hour moderation team located globally.

[Attachment: Screen Shot 2013-01-23 at 8.01.54 AM.webp]

I also cleared the registration log to see whether anything still gets through to the registration page as an attempted registration.
 
It won't stop the bots completely; it will reduce the bandwidth and CPU impact of known bots, but only those whose proxies are known (somewhere in the region of 80-95% of the bots).

This is not a mechanism to get rid of 100% of bots; this is a mechanism to reduce the impact of spam bots on resources. I'll be interested to see how this works out on other forums.

You'll still see registration logs from bots, but far fewer than before
 
Yer, I reactivated Stop Country Spam, removed the blocking of all countries, and am just using it for the proxy blocking, and that stopped most of what was getting through near instantly. I still see a few getting through, though I agree that, in the short period of an hour or so, the logging has nearly come to a stop on my site with both working together.
 
After bot stopper was added on 24th Jan, bandwidth dropped about a gig per day on average thus far... though it is too early to be more conclusive at this stage. That is 30 gig a month. My CPU... nothing noticeable there, as I run a dedicated server.

What I know is that it's stopping about 98% of the bots that were hitting the site. I also installed the FoolBotHoneyPot add-on, and that is picking up the remaining 2% that get through when trying to hit the registration form.

[Attachment: Screen Shot 2013-01-30 at 7.36.42 AM.webp]
 
There is a whitelist for spiders (this is just a tick box in the ACP), and to be honest... it's not even necessary, since spiders will never be detected as spam bots:

An option has been added to allow search engine spiders to bypass the StopProxies check that finds out if a visitor is a spam bot. This option is not necessary, since search engine spiders will never be detected as spam bots; it has been added to remove any fear that search engine bots will not be able to spider your site. The core XenForo functionality is used to check whether the user is a search engine robot (this checks the user_agent, which can be faked by bots).
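
Roughly, the bypass boils down to something like this (isSearchEngineSpider() is a hypothetical stand-in for XenForo's core user-agent based robot detection, not the real class name):

Code:
<?php
// Sketch: skip the StopProxies lookup entirely for known spiders.
// Remember the user agent can be faked, which is why this is optional.

function isSearchEngineSpider($userAgent)
{
    return (bool)preg_match('/googlebot|bingbot|baiduspider|yandex/i', $userAgent);
}

function shouldCheckProxy($userAgent, $whitelistSpiders)
{
    if ($whitelistSpiders && isSearchEngineSpider($userAgent))
    {
        return false; // whitelisted spider, no proxy lookup
    }
    return true;
}

// $whitelistSpiders would come from the ACP tick box
var_dump(shouldCheckProxy('Mozilla/5.0 (compatible; Googlebot/2.1)', true)); // bool(false)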

The StopProxies spam-bot check is only going to block proxies that are known to have been used by automated bots (such as XRumer) within the last few weeks.

What situation are you thinking of where a whitelist might be useful, given a user who has used XRumer within the last few weeks but still wants to register?
 