Crazy amount of guests

For anyone who wants to stop all proxies, without issue or further concern: https://xenforo.com/community/resources/xtr-ip-threat-monitor.10134/
I am using it too and like it. It is solid and has worked very well since the latest versions. However, I would not endorse this claim:

blocking all proxies and VPN's that are nasty
By default it misses most residential proxies. It does catch data-center-based traffic to a high degree. To block residential proxies you have to block ASNs or countries, with all the collateral damage that may cause.

Also, it gets the data on what to block from the proxycheck.io API, and using this costs money (free up to 1,000 queries/day, but even my small forum needs more than that). The number of queries depends somewhat on the add-on's settings; however, if you want to block ASNs you have to query every IP.
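For context, a lookup against the proxycheck.io API is just one HTTP GET per IP. The sketch below shows the general shape; the endpoint, query flags and response fields are my reading of the public proxycheck.io docs, not taken from the add-on, so verify them before relying on this:

```python
# Hedged sketch: one proxycheck.io v2 lookup per IP (endpoint and response
# fields assumed from the public docs; the add-on may do this differently).
import json
from urllib.request import urlopen

def check_ip(ip: str, api_key: str) -> dict:
    # vpn=1 also flags VPNs, asn=1 includes ASN data in the response
    url = f"https://proxycheck.io/v2/{ip}?key={api_key}&vpn=1&asn=1"
    with urlopen(url) as resp:
        # Response is keyed by the queried IP, e.g. {"status": "ok", ip: {...}}
        return json.loads(resp.read())[ip]

def is_blockable(result: dict) -> bool:
    # proxycheck.io marks proxies/VPNs with "proxy": "yes"
    return result.get("proxy") == "yes"

# Example response shape (abbreviated, illustrative values):
sample = {"proxy": "yes", "type": "VPN", "asn": "AS12345"}
print(is_blockable(sample))  # True
```

Each visitor IP that is not already on the local blocklist would need one such query, which is why the free tier's 1,000 queries/day runs out quickly.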

While I really like the add-on, there are some things I would still like to see, especially better analytics. Let's hope development continues as quickly as it has in the past.
 
I've seen tens of thousands to hundreds of thousands of unique IP addresses at times - yeah, that wouldn't work for me.

I'm not surprised that residential proxies walk through it!
 
I've seen tens of thousands to hundreds of thousands of unique IP addresses at times - yeah, that wouldn't work for me.
The pricing of proxycheck.io is affordable in my opinion:


The add-on builds up a blocking list, and countries are now checked via a local MaxMind database. After a couple of weeks, and with the current version of the add-on, roughly two-thirds of the IPs visiting my forum are checked against the API.
I'm not surprised that residential proxies walk through it!
The initial idea of the add-on was flood protection: loads of requests from one IP within a very short time first lead to a captcha and then to a block. So it initially targeted the classic scraper and not residential proxies at all. At least on my forum, classic scraping with loads of requests from a single IP does not happen any more.
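The flood-protection idea described above can be sketched roughly as a sliding-window counter per IP. The thresholds and window below are invented for illustration; the add-on's actual values and mechanism may differ:

```python
# Minimal sketch of per-IP flood protection: count requests per IP in a
# sliding window, escalate to a captcha, then to a block.
# CAPTCHA_AFTER / BLOCK_AFTER / WINDOW are illustrative, not the add-on's.
import time
from collections import defaultdict, deque

CAPTCHA_AFTER = 30   # requests per window before showing a captcha
BLOCK_AFTER = 60     # requests per window before blocking outright
WINDOW = 60.0        # window length in seconds

hits = defaultdict(deque)  # ip -> timestamps of recent requests

def classify(ip: str, now=None) -> str:
    now = time.monotonic() if now is None else now
    q = hits[ip]
    q.append(now)
    # Drop timestamps that have fallen out of the window
    while q and now - q[0] > WINDOW:
        q.popleft()
    if len(q) > BLOCK_AFTER:
        return "block"
    if len(q) > CAPTCHA_AFTER:
        return "captcha"
    return "allow"

# A burst of 31 requests within a fraction of a second escalates to a captcha:
for i in range(31):
    verdict = classify("203.0.113.9", now=float(i) / 100)
print(verdict)  # "captcha"
```

As noted, this kind of check catches the classic one-IP scraper but is useless against fast-rotating residential proxies, where each IP sends only a single request.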
ASN blocking was added after a feature request from my side, and it is a lifesaver for me.
 
I did a bit of reading on the residential proxy issue, and it seems CF does have tools in place to identify them and feed them into Labyrinth. I guess any IP running a constant stream of activity, rate limited or not, would be identifiable. Unless you can randomise the activity from it, and also limit and randomise the daily usage time, CF could certainly find patterns and send the IPs into neverland. Just another tick for using CF.
 
I guess any IP running a constant stream of activity, rate limited or not, would be identifiable.
I think if I were a huge, wealthy company like CF, I'd just create some natty little companies that signed up with the proxy companies and plonked traffic through "their" networks to identify a good quantity of the addresses. You might then not even do much about them, but you'd be able to look at the traffic patterns and see if there were any "tells" to use more generally.
 
I guess any IP running a constant stream of activity, rate limited or not, would be identifiable. Unless you can randomise the activity from it, and also limit and randomise the daily usage time, CF could certainly find patterns and send the IPs into neverland. Just another tick for using CF.
The issue with residential proxies is that they rotate very fast. Typically, a single IP makes only one request (and that one request leaves out elements of the webpage such as JS, tracking or pictures, depending on how the scraper is configured). One can often identify them in hindsight by that and by other patterns, for which one needs knowledge of the content and structure of the webpage. For instance, there are often requests that directly target a user profile, a single picture, or an older, inactive thread, and often two or more IPs request the same unusual target in parallel.

So there are possible ways to identify residential proxies by behaviour, but the number of requests, the user agent or classical fingerprinting are often useless.
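To make that concrete, here is a rough, hypothetical sketch of the kind of hindsight log analysis described above: flag one-shot IPs that hit the same unusual deep target within a short window. The log format, field names and threshold are all invented for illustration, not from any real tool:

```python
# Hedged sketch: flag IPs that made exactly one request and hit the same
# target as another one-shot IP within `window` seconds - the parallel-access
# pattern described above. Purely illustrative thresholds and log format.
from collections import defaultdict

def suspicious_ips(log, window=300):
    """log: iterable of (timestamp, ip, path) tuples."""
    by_ip = defaultdict(list)
    for ts, ip, path in log:
        by_ip[ip].append((ts, path))
    # Collect the targets hit by one-shot IPs (a single request, ever)
    target_hits = defaultdict(list)
    for ip, reqs in by_ip.items():
        if len(reqs) == 1:
            ts, path = reqs[0]
            target_hits[path].append((ts, ip))
    # Flag pairs of one-shot IPs hitting the same target close together
    flagged = set()
    for path, entries in target_hits.items():
        entries.sort()
        for (t1, ip1), (t2, ip2) in zip(entries, entries[1:]):
            if t2 - t1 <= window:
                flagged.update({ip1, ip2})
    return flagged

log = [
    (0,  "198.51.100.1", "/threads/old-inactive-thread.123/"),
    (40, "198.51.100.2", "/threads/old-inactive-thread.123/"),
    (10, "203.0.113.7", "/forum/"),
    (20, "203.0.113.7", "/threads/new.456/"),  # normal multi-request visitor
]
print(sorted(suspicious_ips(log)))  # ['198.51.100.1', '198.51.100.2']
```

The catch, as noted, is that this only works in hindsight and requires site-specific knowledge of what an "unusual" target is.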

CF does not have knowledge of the content and structure of the website or of typical visitor behaviour, so they lack many options for identifying patterns. That's why CF - as has been written in this thread for months - often fails to identify those proxies reliably. CF has the advantage of high numbers, so they can identify suspicious IPs by, for instance, the behaviour of sending single requests to vastly different websites within a short time - but again, this only works after some time.

They may (and hopefully will) have improved their abilities over the last months - but as has been written in this thread before, it is not reliable, and within CF's product range the free tier will probably not offer protection. The proxy providers advertise a success rate of 99+%, and research suggests only about 10% of residential proxies get identified.

I'd assume most website operators do not even recognise this traffic, even less so if it comes from countries where they have their normal audience. And those who do will have a hard time telling who is legitimate and who is a scraper before serving content.
 
I think if I were a huge, wealthy company like CF, I'd just create some natty little companies that signed up with the proxy companies and plonked traffic through "their" networks to identify a good quantity of the addresses.
As written further up in this thread: this is what some companies (like proxycheck.io) do, and it can probably be assumed that CF does this as well. No doubt it helps in many ways with diagnosis. However, the IPs change frequently, there are a ton of these proxy providers, they change their methods and behaviour frequently, and there is a broad range of different tools they use and requests they send. So it is a bit of a Hydra with many heads.

On a side note: we are starting to repeat as "new" everything that has already been mentioned and discussed further up the thread. So it seems the discussion is starting to grind to a halt, get into a loop and repeat itself.
 
CF does not have knowledge of the content and structure of the website or of typical visitor behaviour, so they lack many options for identifying patterns. That's why CF - as has been written in this thread for months - often fails to identify those proxies reliably. CF has the advantage of high numbers, so they can identify suspicious IPs by, for instance, the behaviour of sending single requests to vastly different websites within a short time - but again, this only works after some time.

They may (and hopefully will) have improved their abilities over the last months - but as has been written in this thread before, it is not reliable, and within CF's product range the free tier will probably not offer protection. The proxy providers advertise a success rate of 99+%, and research suggests only about 10% of residential proxies get identified.
Although CF "out-of-the-box" doesn't do anything particularly 'intelligent' with requests from residential proxies, by looking at the web server logs for the traffic that is getting through, it's possible to add additional mitigation via Cloudflare rules, even on the free tier.

At the moment, a lot of the traffic I'm seeing from residential proxies is extremely simple - in some cases to the point of being a bit bizarre.
For example, looking at our XF "guests" page, a large majority of the guests are "Viewing an error page" - from the web logs, these are typically 404s for requests to non-existent things in the image/link proxy. Similarly, I see lots of requests to attachments, goto/post and reactions with either no referrer or one which cannot be correct (e.g. the root URL as referrer when our forum is not in the root). No genuine web browser would make these requests, and generally they're not interesting to genuine search indexers.
Using CF security rules to issue a "managed challenge" to these requests takes quite a lot of bot traffic out, with a challenge completion rate of 0.01%.
e.g. last 24 hours:

[attached screenshot: Cloudflare security events]
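For illustration, a rule along those lines might look something like this in Cloudflare's rule expression language (field names as I understand them from the CF rules documentation; the paths and hostname are placeholders - adapt them to your own installation):

```
(http.request.uri.path contains "/attachments/" and http.referer eq "")
or (http.request.uri.path contains "/goto/post"
    and not http.referer contains "example-forum.com")
```

with the rule's action set to "Managed Challenge". Genuine browsers complete the challenge invisibly; the simple scrapers described above mostly don't.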
This is probably more effective than playing whack-a-mole with individual residential proxy IPs - we get 300,000+ "Unique Visitors" per day.
CF also makes it easy to block/challenge countries and ASNs, so the traffic from "bad actor" data centres is also massively reduced.

Even then, we still hit our largest number of "visitors online" (15,000+) in the past week.

There's not going to be one solution; it will probably take a combination of measures, plus an acceptance that this is not actually a battle we can win - we put content on the web, people will steal it; it's always been the case. Hiding more content behind registration/login would impact SEO and lead to a drop in positive activity.
 