@Martyn : This topic has been discussed back and forth in countless threads over the last month. These are bots that use residential proxies to scrape your content for training AI models. Robots.txt won't help at all, and neither will Cloudflare for the most part. The issue is not easy to deal with, and what can be done depends heavily on the character of your forum, the locations your audience comes from, and the technical skills and possibilities you have. Either way, it is a lot of work and very time-consuming. The most active thread about the topic is this:
I have a small forum, usually a few hundred guests at most. Over the last few days the number of guests has been unusually high, around 3000 and climbing. We now have over 4800 guests. That's not normal. Many show "Viewing unknown page" or "Latest content", with a warning sign. IP addresses are from all corners of the world. Many are from South America, mostly Brazil. I also saw quite a number from Vietnam of all places, but from everywhere else too.
We're on Cloud and I don't know what to do. Could we be under attack? We have been in the past, a few times I think. For reasons I cannot possibly fathom; we're a...