Crazy amount of guests

smallwheels · Mar 4, 2026

Seems another wave or two are currently going on. I'm currently at a deflection rate of more than 80% of the requesting IPs over the last 24 hours. The root cause is mainly a massive rise in requests from residential proxies within the US on the one hand, and on the other a massive flood of requests from Singapore and Hong Kong, a relevant share of which comes from AS132203 (Tencent cloud computing). That ASN was blocked ages ago, but that does not stop it from trying.
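For anyone who wants to reproduce a figure like the deflection rate above from their own logs, here is a minimal Python sketch. It assumes the rate means the share of unique requesting IPs that were blocked at least once. The record layout, the sample data and every ASN other than AS132203 are made up for illustration.

```python
from collections import Counter

def deflection_rate(records):
    """Share of unique requesting IPs that were blocked at least once.

    `records` is an iterable of (ip, asn, blocked) tuples; this layout is
    an assumption for illustration, not a real log format.
    """
    seen, blocked = set(), set()
    for ip, _asn, was_blocked in records:
        seen.add(ip)
        if was_blocked:
            blocked.add(ip)
    return len(blocked) / len(seen) if seen else 0.0

def blocked_ips_by_asn(records):
    """Count unique blocked IPs per ASN, most frequent first."""
    unique = {(ip, asn) for ip, asn, was_blocked in records if was_blocked}
    return Counter(asn for _ip, asn in unique).most_common()

# Made-up sample using documentation IP ranges.
SAMPLE = [
    ("203.0.113.10", "AS132203", True),
    ("203.0.113.11", "AS132203", True),
    ("203.0.113.10", "AS132203", True),   # repeat request from the same IP
    ("198.51.100.7", "AS64500", True),
    ("192.0.2.55", "AS64501", False),
]

rate = deflection_rate(SAMPLE)
top = blocked_ips_by_asn(SAMPLE)
print(f"deflection rate: {rate:.0%}")
print(top)
```

Counting unique IPs rather than raw requests keeps one noisy address from dominating the number, which matters when most proxies only show up once.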

I am once more somewhat baffled by the immense number of residential proxies in the US - this could maybe have been expected in a developing country with a low level of education and a bad economy with very low wages, but in the US? Weird.

chillibear · Mar 4, 2026

I would guess a semi popular app that doesn't advertise that it's doing some "sharing"? I gather some of these proxies are unknowing ones, or maybe there is an exploit on a popular router that hasn't been disclosed yet?

I did think the traffic came up a bit quickly this morning. I'd been doing a major database migration on one XF site and turned everything back on after half an hour or so to find:

I mean it'd been on for about 10 seconds by the time I nipped over to my browser to check everything was back up cleanly!

At least not as bad as it can get. Evidently we had a few optimistic members hitting refresh during my maintenance window!

smallwheels · Mar 4, 2026

chillibear said:
I would guess a semi popular app that doesn't advertise that it's doing some "sharing"? I gather some of these proxies are unknowing ones, or maybe there is an exploit on a popular router that hasn't been disclosed yet?

All of that. A lot has been disclosed, people just are not aware or don't care:

2024:

WHEN YOU BUY a TV streaming box, there are certain things you wouldn’t expect it to do. It shouldn’t secretly be laced with malware or start communicating with servers in China when it’s powered up. It definitely should not be acting as a node in an organized crime scheme making millions of dollars through fraud. However, that’s been the reality for thousands of unknowing people who own cheap Android TV devices. (...)
Human Security researchers found seven Android TV boxes and one tablet with the backdoors installed, and they’ve seen signs of 200 different models of Android devices that may be impacted, according to a report shared exclusively with WIRED. The devices are in homes, businesses, and schools across the US.

Your Cheap Android TV Streaming Box May Have a Dangerous Backdoor

New research has found that some streaming devices and dozens of Android and iOS apps are secretly being used for fraud and other cybercrime.

www.wired.com

2025:

On the surface, the Superbox media streaming devices for sale at retailers like BestBuy and Walmart may seem like a steal: They offer unlimited access to more than 2,200 pay-per-view and streaming services like Netflix, ESPN and Hulu, all for a one-time fee of around $400. But security experts warn these TV boxes require intrusive software that forces the user’s network to relay Internet traffic for others, traffic that is often tied to cybercrime activity such as advertising fraud and account takeovers. (...)
Experts say while these Android streaming boxes generally do what they advertise — enabling buyers to stream video content that would normally require a paid subscription — the apps that enable the streaming also ensnare the user’s Internet connection in a distributed residential proxy network that uses the devices to relay traffic from others.

Is Your Android TV Streaming Box Part of a Botnet? – Krebs on Security

In-depth analysis of multiple Superbox models by researchers at Censys, as reported by Krebs, showed that once the device is online, it immediately begins communicating with:

Chinese services, including Tencent’s QQ platform

Residential proxy services such as Grass (getgrass[.]io), which pays users to “share unused bandwidth”

Grass’s stated model is that users install an app and opt in to sharing their connection. Superbox appears to short-circuit that consent model, enrolling users implicitly through firmware and preinstalled software.

From an owner’s perspective, that means:

Your IP address is being used as an exit node for other people’s traffic.

You never explicitly agreed to this behavior during setup.

There is no obvious switch to disable it.

Your Android TV Box Might Be a Botnet Farm without You Knowing: A Deep Dive

A deep dive into how Android TV streaming devices can conscript your home network into botnets and residential proxy networks without your knowledge.

www.kylereddoch.me

Anthony Parsons · Mar 4, 2026

If it's too good to be true, chances are it is.

ES Dev Team · Mar 4, 2026

It's bad out there. On my Magento site, which gets attacked often now, I'm seeing a flood of residential proxies from US cable providers.
Unfortunately I have to keep 'under attack mode' on.

Everyone is still blissfully running fail2ban.

The only good alternative is to design a friendlier and better captcha -_-

Suzanne O · Mar 4, 2026

Getting heaps from Facebook this morning.

rdn · Mar 5, 2026

Has anyone tried this?

Anubis: Web AI Firewall Utility | Anubis

Weigh the soul of incoming HTTP requests to protect your website!

anubis.techaro.lol

Suzanne O · Mar 5, 2026

rdn said:
Has anyone tried this?

Anubis: Web AI Firewall Utility | Anubis

Weigh the soul of incoming HTTP requests to protect your website!

anubis.techaro.lol

Nup. Cannot stand AI at all.

ES Dev Team · Mar 5, 2026

I've heard of someone here having success with it.

smallwheels · Mar 5, 2026

rdn said:
Has anyone tried this?

Anubis: Web AI Firewall Utility | Anubis

Weigh the soul of incoming HTTP requests to protect your website!

anubis.techaro.lol

Look further up in this thread.

Search results for query: Anubis

xenforo.com

@BrettC has.

akok · Mar 5, 2026

I started using this antibot

GitHub - githubniko/antibot: AWAF protection against behavioral-factor (PF) bots that emulate a browser and JavaScript, as well as classic bots.

AWAF protection against behavioral-factor (PF) bots that emulate a browser and JavaScript, as well as classic bots. - githubniko/antibot

github.com

DeltaHF · Mar 5, 2026

My site was flooded today as well. I've been very vigilant about blocking ASNs and IP addresses with Cloudflare, yet this morning we had over 36,000 currently active users on the forums. Ridiculous.

Anthony Parsons · Mar 5, 2026

I've turned in another direction: I tuned my server and sites to the finest degree, and I just had a million uniques in 24 hours and none of it made a blip on the server. I have tried fighting it, but I'm finding it easier to take a different approach: tune to the finest degree, and then it just doesn't matter what hits the site. I have CF set to high, so they stop any nasty stuff, but otherwise this approach works too. (The spike is the daily server backup.)

smallwheels · Mar 6, 2026

Luckily, load is not an issue with my forum as it is only tiny, and so is the number of bot requests in comparison to yours: a couple of thousand per day. Basically we seem to be at opposite ends of the scale regarding forum size and total number of requests. So for me it is more about protecting the content from scraping. With the rise of AI this has also become a matter of user privacy: many users share fragmented bits about themselves in forums, and with AI it is easy to aggregate those, even across platforms, and in this way to build profiles as well as to identify the real people behind nicknames. See e.g. this study:

Large-scale online deanonymization with LLMs

We show that large language models can be used to perform at-scale deanonymization. With full Internet access, our agent can re-identify Hacker News users and Anthropic Interviewer participants at high precision, given pseudonymous online profiles and conversations alone, matching what would take hours for a dedicated human investigator. We then design attacks for the closed-world setting. Given two databases of pseudonymous individuals, each containing unstructured text written by or about that individual, we implement a scalable attack pipeline that uses LLMs to: (1) extract identity-relevant features, (2) search for candidate matches via semantic embeddings, and (3) reason over top candidates to verify matches and reduce false positives. Compared to classical deanonymization work (e.g., on the Netflix prize) that required structured data, our approach works directly on raw user content across arbitrary platforms. We construct three datasets with known ground-truth data to evaluate our attacks. The first links Hacker News to LinkedIn profiles, using cross-platform references that appear in the profiles. Our second dataset matches users across Reddit movie discussion communities; and the third splits a single user's Reddit history in time to create two pseudonymous profiles to be matched. In each setting, LLM-based methods substantially outperform classical baselines, achieving up to 68% recall at 90% precision compared to near 0% for the best non-LLM method. Our results show that the practical obscurity protecting pseudonymous users online no longer holds and that threat models for online privacy need to be reconsidered.

arxiv.org

(full paper available also at this link)

So on top of all the other measures I've finally limited guest access to my forums - something I had wanted to avoid until now for SEO reasons. Now guests only see the first post of a thread, in some subforums they only see the topic but not the thread content, and some forums and areas are invisible to guests (the latter has been the case for years already). I have had that in place since early February and the effect was - as expected - a massive rise in registrations as well as a massive rise in daily registered visitors. So far I do not (yet) see a crash in SEO / Google ranking / indexing. This will, however, probably be the consequence to some degree.

I've not seen a rise in attempted (let alone successful) bot registrations, rather the opposite: thanks to the massive blocking of bad IPs, VPNs, countries and ASNs via the IP Threat Monitor add-on, the Spaminator add-on, which catches bot registrations very successfully, sits idle most of the time. It caught just two attempts last month (where up to then it was typically a couple of hundred per month at least). The registrations that come through all seem genuine, so generally everything seems fine.

The one issue I still have is identifying residential proxies from Central Europe and blocking them early. As this is where my users are located, I cannot block the ASNs of normal ISPs, let alone whole countries. No idea yet how to do that, especially as the residential proxies typically make only a single request each before rotating, and blocking an IP makes no sense anyway, as tomorrow a genuine user may have that IP. Currently I lack ideas on how to deal with this.
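This does not solve the problem, but the rotation pattern described above can at least be measured. A minimal Python sketch, assuming you can export one (ip, asn) pair per request for a time window: it reports, per ASN, how many unique IPs made exactly one request. The function name, the input layout and the sample ASNs are hypothetical.

```python
from collections import Counter, defaultdict

def one_shot_share(requests):
    """Per ASN, return (unique_ips, one_shot_ips, share).

    `requests` is an iterable of (ip, asn) pairs, one pair per request.
    An IP counts as "one-shot" if it appears exactly once in the window.
    """
    hits = Counter(requests)               # (ip, asn) -> number of requests
    total = defaultdict(int)
    single = defaultdict(int)
    for (_ip, asn), count in hits.items():
        total[asn] += 1
        if count == 1:
            single[asn] += 1
    return {asn: (total[asn], single[asn], single[asn] / total[asn])
            for asn in total}

# Made-up window: AS64500 looks like rotating proxies, AS64501 like normal users.
WINDOW = (
    [("192.0.2.1", "AS64500"), ("192.0.2.2", "AS64500"), ("192.0.2.3", "AS64500")]
    + [("192.0.2.4", "AS64500")] * 5
    + [("198.51.100.1", "AS64501")] * 3
    + [("198.51.100.2", "AS64501")] * 2
)

stats = one_shot_share(WINDOW)
print(stats)
```

A high one-shot share inside an otherwise normal ISP is only a hint, since genuine visitors arriving from a search engine also make a single request. It may still help decide where a challenge is worth showing instead of a block.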

But overall I am pretty pleased with my current setup. For a bigger forum this would however not be sufficient and would probably have massive load issues. For a forum with a world wide audience my current strategy of massive blocking would obviously not work as well.

smallwheels · Mar 8, 2026

smallwheels said:
• scraping via residential proxies costs per GB of traffic and it is expensive (and often slow)

So my number 4 would be:

4.) Poison the content. Send identified scrapers into a huge mess of false, half-true or completely made-up information and let them scrape it to poison the AI models they are feeding, effectively rendering them useless. It is important to mix true and false information and to include references linking to the real world so it is not too obvious. Also include huge pictures and graphics to make traffic expensive. One could create a repository for this stuff ("bogopedia") where people could have fun adding this kind of thing - it could even be done using XenForo.

I stumbled upon the terms of one of those services that rent out residential proxies, and it does indeed seem desirable to poison the requests with bogus content. Some of the proxy providers charge per GB of traffic; another charges per successful request, and this is where it gets interesting:

Bildschirmfoto 2026-03-08 um 15.22.06.webp

This means a strict block via 403 won't cost the requester anything; it would be better to redirect them to a bogus page or at least to present them with a 404 (like XF does if you lack sufficient rights to access the content you requested). The prices at said "service" seem relatively moderate, yet every request has to be paid for, and at 25 API credits per successful request for residential proxies with headless and JS this adds up pretty quickly:

Bildschirmfoto 2026-03-08 um 15.24.46.webp
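The arithmetic behind this can be sketched in a few lines of Python. The 25 credits per successful request is the figure quoted above. Which status codes the provider actually bills as "successful" is an assumption here (200 and 404, following the reading in this post), and the function name is made up.

```python
CREDITS_PER_SUCCESS = 25  # figure quoted above for residential + headless/JS

def credits_burned(responses, billable=frozenset({200, 404})):
    """Credits a scraper spends for a sequence of HTTP status codes.

    Assumption: only status codes in `billable` count as "successful"
    for billing. Which codes really count depends on the provider.
    """
    return CREDITS_PER_SUCCESS * sum(1 for status in responses if status in billable)

hard_block = [403] * 1000      # every request rejected outright
decoy = [200] * 1000           # every request served a bogus page

print(credits_burned(hard_block))   # 0
print(credits_burned(decoy))        # 25000
```

Under these assumptions a thousand hard-blocked requests cost the scraper nothing, while a thousand decoy pages cost 25,000 credits. The price per credit is not included because it is only visible in the screenshots.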

This is the architecture they promote and claim to have:

Bildschirmfoto 2026-03-08 um 15.27.52.webp

They also claim to have 110 million proxies and a success rate of 99.98%. As usual, you can believe this or rather not. ;-) As usual, the company offering these services claims furiously that this is all completely legal business, yet it does not mention the name of anyone working there or a company address, and hides behind a privacy protection service in whois.
However, in this case it does not seem to be run by a company located in Cyprus, UAE, Dubai or Panama but in Wyoming - if the LLC named on their webpage is genuine (which it seems to be). And with the name you can then find more information, including a founder's name

https://www.crunchbase.com/organization/scrape-do

as well as a company number and a legal address in the US

PACKEND, LLC - 2021-001044605 - Wyoming

PACKEND, LLC American (Wyoming) company, Company number: 2021-001044605, Incorporation Date: 18 Oct 2021, Address: 30 N Gould St Ste N Sheridan, WY 82801 USA

b2bhint.com

even though, according to the entry at Crunchbase, the undertaking is based not in the US but in Ankara, Turkey.

Just in case someone wants to dive into a rabbit hole.

BrettC · Mar 9, 2026

smallwheels said:
Seems another wave or two are currently going on. I'm currently at a deflection rate of more than 80% of the requesting IPs over the last 24 hours. The root cause is mainly a massive rise in requests from residential proxies within the US on the one hand, and on the other a massive flood of requests from Singapore and Hong Kong, a relevant share of which comes from AS132203 (Tencent cloud computing). That ASN was blocked ages ago, but that does not stop it from trying.

I am once more somewhat baffled by the immense number of residential proxies in the US - this could maybe have been expected in a developing country with a low level of education and a bad economy with very low wages, but in the US? Weird.

I had noticed a small uptick in Anubis challenge failures, but nothing that was out of the ordinary over the past two weeks.

Granted, I'm slamming the door shut on just about anything from Tencent's various ASNs at Anubis with a challenge level of 16. Some of the worst offenders remain blocked at the edge server firewalls. That does make the botnets use residential proxies to attempt to evade the blocks, but they ultimately end up at the Anubis challenge blocker.
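To give a feel for what a challenge level means, here is a generic hashcash-style proof-of-work sketch in Python. It is not Anubis's actual code. It assumes the level counts leading zero hex digits of a SHA-256 hash, and the challenge string and function names are made up.

```python
import hashlib
from itertools import count

def solve(challenge: str, difficulty: int) -> tuple[int, str]:
    """Find a nonce so that SHA-256(challenge + nonce) starts with
    `difficulty` zero hex digits. Generic sketch, not Anubis's code."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest

def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Checking costs the server a single hash, whatever the difficulty."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)

def expected_attempts(difficulty: int) -> int:
    """Average number of hashes needed: one in 16**difficulty succeeds."""
    return 16 ** difficulty

# A low difficulty so the example finishes instantly.
nonce, digest = solve("example-challenge", 3)
print(nonce, digest)
print(expected_attempts(16))
```

Under that assumption each extra level multiplies the client's work by 16, so a level of 16 would mean roughly 2^64 attempts on average. That behaves like a hard block for the affected ASNs rather than a challenge anyone is expected to pass.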

rdn said:
Has anyone tried this?

Anubis: Web AI Firewall Utility | Anubis

Weigh the soul of incoming HTTP requests to protect your website!

anubis.techaro.lol

Me! When configured correctly, it's quite powerful. It's not 100%, but it's pretty damn close to it.

Suzanne O said:
Nup. Cannot stand AI at all.

It's not AI. There's literally nothing in the code that uses or takes advantage of an AI ecosystem. If anything, the code in it has a lot of legacy stuff, and one part that made me chuckle, DNSBL checking - which is used widely on IRC.
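For anyone who hasn't met DNSBL checking: the lookup is just a DNS query for the client's IPv4 address with its octets reversed, under the blocklist's zone. A minimal Python sketch of building that query name follows. The zone `dnsbl.example` is a placeholder, not a real list, and this is not taken from the Anubis source.

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)          # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

name = dnsbl_query_name("192.0.2.99", "dnsbl.example")
print(name)   # 99.2.0.192.dnsbl.example
```

If resolving that name returns an A record (conventionally somewhere in 127.0.0.0/8), the address is listed. An NXDOMAIN answer means it is not.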

akok said:
I started using this antibot

GitHub - githubniko/antibot: AWAF protection against behavioral-factor (PF) bots that emulate a browser and JavaScript, as well as classic bots.

AWAF protection against behavioral-factor (PF) bots that emulate a browser and JavaScript, as well as classic bots. - githubniko/antibot

github.com

Despite it being all in Russian, this is an interesting implementation. I'm rather interested in the developer's 'FPS Blocking' feature - can't say that I've seen that done before. I'm not too keen on the PHP aspect of this implementation, but it looks fairly transparent when implemented correctly.

DeltaHF said:
My site was flooded today as well. I've been very vigilant about blocking ASNs and IP addresses with Cloudflare, yet this morning we had over 36,000 currently active users on the forums. Ridiculous.

At your expense: bandwidth costs (if you pay by the GB/TB)! Their gain: your website's content, and they don't even give you source/origin credit. Fun, right?

It would be wise to set up an 'in the middle' challenge check such as Anubis or, if using Cloudflare (or a similar service that provides a comprehensive WAF solution), to really ratchet up the security. There is also Wicketkeeper, but I've yet to give that a whirl.

Suzanne O · Mar 10, 2026

They're all AI scraper bots. I have over 580 of them on my site. I'm considering putting all of their IP addresses on my cPanel's IP ban list.
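If you do go down that road, a few hundred single addresses can often be merged into a much shorter list of ranges first. A minimal Python sketch using the standard `ipaddress` module, assuming the ban list accepts CIDR notation; the function name and the sample addresses are made up.

```python
import ipaddress

def collapse_for_ban_list(ips):
    """Merge single IPs into the smallest set of CIDR ranges covering exactly them."""
    nets = [ipaddress.ip_network(ip) for ip in set(ips)]
    merged = []
    for version in (4, 6):                   # IPv4 and IPv6 cannot be collapsed together
        same = [n for n in nets if n.version == version]
        merged.extend(ipaddress.collapse_addresses(same))
    return [str(n) for n in merged]

# Made-up sample from documentation ranges.
bots = ["203.0.113.0", "203.0.113.1", "203.0.113.2", "203.0.113.3", "198.51.100.7"]
ranges = collapse_for_ban_list(bots)
print(ranges)   # ['198.51.100.7/32', '203.0.113.0/30']
```

Only addresses that exactly fill a range get merged, so this never blocks an IP that was not on the original list.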

Anthony Parsons · Mar 11, 2026

smallwheels said:
AS132203 (Tencent cloud computing). That ASN was blocked ages ago, but that does not stop it from trying.

Yer... if you're only going to block one ASN, it would be Tencent. So much nonsense from their ASN global servers. All countries are doing it... China just seems not to care about the blatant abuse of their scrapers, flooding sites, instead of slowing things down and getting the content at a steady pace like most Governments do.

smallwheels · Mar 11, 2026

Another finding: At least one company that offers residential proxies:

Bildschirmfoto 2026-03-11 um 11.55.57.webp

does also offer prescraped datasets for use with AI:

Bildschirmfoto 2026-03-11 um 11.55.37.webp

So it seems they extended their business model. As usual they claim to only have "ethically sourced" residential proxies - which seems barely plausible, given the numbers of IPs they claim to have on offer:

Bildschirmfoto 2026-03-11 um 11.56.20.webp

What distinguishes this company from others is that they advertise ISO certification and GDPR compliance - which by nature is barely possible in the web scraping business, and obviously they must be well aware that people renting residential proxies for web scraping don't give a damn about privacy protection.

Bildschirmfoto 2026-03-11 um 11.57.22.webp

They are located in Israel and seem pretty successful, given the statements on their website:

Bildschirmfoto 2026-03-11 um 11.59.26.webp

Bildschirmfoto 2026-03-11 um 11.59.45.webp

In fact they seem to be not a shady business trying to hide but a "normal" company, which kind of baffled me, given the business model. They even have an entry on Wikipedia:

Bright Data - Wikipedia

en.wikipedia.org

A pretty interesting read including some disturbing bits:

Litigation

In July 2018, Bright Data sued another proxy service provider, Oxylabs, for patent infringement. A jury initially awarded $7.5 million in damages to Bright Data, although another judge followed that ruling by ordering the companies to undergo mediation.

In January 2023, Bright Data was sued by Meta Platforms for harvesting and selling data scraped from Facebook and Instagram. Meta previously hired Bright Data to scrape data from other websites. In January 2024, Bright Data won the dispute with Meta, where a federal judge in San Francisco declared that the company did not breach Meta's terms of use by scraping data from Facebook and Instagram, consequently denying Meta's request for summary judgment on claims of contract breach.

In July 2023, X Corp, formerly known as Twitter, sued Bright Data for scraping data from Twitter, violating its terms of service. Bright Data countersued, asserting its commitment to making public data accessible, claiming legality in its web data collection practices. In May 2024, a federal judge dismissed the suit, ruling that Bright Data did not violate X's terms of service or copyright by scraping publicly accessible data. The judge emphasized that such scraping practices are generally legal and that restricting them could lead to information monopolies, and highlighted that X's concerns were more about financial compensation than protecting user privacy.

So in two cases a US judge ruled that scraping would be legal, against big players like Meta and Twitter, and despite the fact that on Facebook especially there is obviously a ton of privacy-related data about people, and that Facebook as well as Instagram and partly Twitter/X have a kind of login wall before you can access more than just limited content. In my opinion it is simply impossible to scrape data at scale from these platforms (or from most forums) and still be in compliance with GDPR, let alone things like copyright. Either a very strange judge or very strange legislation in the US, or they do something vastly different from others. Pretty baffling.

Anthony Parsons · Mar 11, 2026

smallwheels said:
So in two cases a US judge ruled that scraping would be legal

Public data. Not private data.

For anyone who wants to stop all proxies, without issue or further concern: https://xenforo.com/community/resources/xtr-ip-threat-monitor.10134/

I have been using it since development, been through all the tweaks and performance adjustments, and it is now a very solid and capable system for automatically blocking all proxies and VPNs that are nasty, whilst allowing legitimate ones, such as Apple's iCloud Private Relay, which Safari on iPhones can use.
