Crazy amount of guests

Poison the content. Send identified scrapers into a huge mess of false, half-true, or completely made-up information and let them scrape it to poison the AI models.
A few Markov-chain babblers have been written that generate realistic junk pages to tarpit scrapers. You could investigate them and see whether any are more than idle experiments. I recall reading about a few some time back.
 
Funnily enough I'd just been looking at Markov chain generators.
e.g. this PHP one - https://github.com/hay/markov - demo at https://projects.haykranen.nl/markov/demo/

So a starter might be a tarpit add-on which picks a few random thread titles/posts from the forum as the input text, and simulates pages of thread titles and posts based on that text. As well as wasting the rogue scrapers' time, it would poison their input.
Put it behind some nofollow links, hide it from humans, and use robots.txt to keep legit search crawlers out of the tarpit.
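For anyone wanting to play with the idea, here is a minimal Python sketch of a word-level Markov babbler. It is not the PHP library linked above, and the names `build_chain` and `babble` are made up for illustration. It assumes you have already pulled some thread titles/posts into a plain string; wiring it into the forum software and the tarpit URL is left out.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        chain[key].append(words[i + order])
    return dict(chain)


def babble(chain, length=50, seed=None):
    """Walk the chain at random and return `length` words of junk text."""
    if not chain:
        return ""
    rng = random.Random(seed)  # a fixed seed makes the output repeatable
    states = sorted(chain)
    key = rng.choice(states)
    order = len(key)
    out = list(key)
    while len(out) < length:
        followers = chain.get(key)
        if not followers:
            # Dead end: jump to a random state so the page keeps growing.
            key = rng.choice(states)
            out.extend(key)
            continue
        out.append(rng.choice(followers))
        key = tuple(out[-order:])
    return " ".join(out[:length])
```

Something like `babble(build_chain(sample_posts), length=300)` would give one fake post; a higher `order` reads more like the source text, a lower one is more scrambled.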
 
I got hit with a mega-wave last night. They kept consuming all my TCP/IP ports (my current limit is 2048, double the Linux default). I kept the site online for others only by restarting Apache repeatedly. The blast lasted 15 minutes and it cut straight through my protection.

I'm not sure whether I should tune Linux's max TCP/IP ports and Apache's max request workers into the stratosphere or leave them the way they are. I do have lots of available CPU/RAM.

Something about how the bots work on the other end has them holding each TCP/IP port open for much longer than usual. It resembles a Slowloris attack. I already have mitigations against that, but there are too many simultaneous IPs and each one is unique.

I wouldn't be surprised if they cut right through Cloudflare. I think the pool of residential proxies is enormous and rotates too often for something like Cloudflare or some other algorithm to catch them.


Anubis is a good idea, but it may make the stuck TCP/IP ports worse. Same with a Markov generator, etc.
 
We have an enormous number of guests right now (7.7k) but only 50-100 TCP/IP ports in use.
Last night I saw 1.7k guests, and the number of TCP/IP ports in use was 1800+ (Apache starts choking at 1000/sec with )
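If anyone wants to watch the same numbers, here is a rough Python sketch that tallies sockets by TCP state from the text output of `ss -tan`. It assumes the "ports in use" figure corresponds to open TCP sockets, and `count_states` is just a name I made up. It only parses text you hand it, so you can feed it a saved capture.

```python
from collections import Counter


def count_states(ss_output):
    """Count sockets per TCP state in `ss -tan` style text output."""
    counts = Counter()
    for line in ss_output.splitlines():
        fields = line.split()
        # Skip blank lines and the header row, which starts with "State".
        if not fields or fields[0] == "State":
            continue
        counts[fields[0]] += 1
    return counts


# Made-up sample in the usual `ss -tan` column layout (documentation IPs).
SAMPLE = """\
State     Recv-Q Send-Q Local Address:Port  Peer Address:Port
LISTEN    0      511    0.0.0.0:443         0.0.0.0:*
ESTAB     0      0      203.0.113.5:443     198.51.100.7:51234
ESTAB     0      0      203.0.113.5:443     198.51.100.9:40112
TIME-WAIT 0      0      203.0.113.5:443     198.51.100.7:51200
"""

sample_counts = count_states(SAMPLE)
```

On a live box you could pass it `subprocess.run(["ss", "-tan"], capture_output=True, text=True).stdout` instead of the sample. A pile of ESTAB entries that never drains during a wave would fit the long-held-connection pattern described above.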

So yeah, last night's rush was a different kind of traffic.
 