Anyone else getting hit by these?

Mendalla

I am getting a pile of guests with addresses in the 57.141 range. The IP addresses belong to Meta so maybe some new bot they are using? Anyone else seen this? I have Known Bots installed and it does not identify them so must be fairly new.
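
A quick way to confirm how many guests fall in that range is to test each address against it. This is only a sketch: it assumes "the 57.141 range" means 57.141.0.0/16, which is a guess and not a published Meta allocation, and the helper name `in_assumed_range` is made up for illustration.

```python
import ipaddress

# ASSUMPTION: "the 57.141 range" is treated as 57.141.0.0/16.
# Check the real allocation with a whois lookup before acting on it.
ASSUMED_RANGE = ipaddress.ip_network("57.141.0.0/16")


def in_assumed_range(ip: str) -> bool:
    """Return True if ip parses as an address inside ASSUMED_RANGE."""
    try:
        return ipaddress.ip_address(ip) in ASSUMED_RANGE
    except ValueError:
        # Not a valid IP address string at all.
        return False


if __name__ == "__main__":
    # Hypothetical guest addresses, not taken from any real log.
    for guest in ["57.141.0.12", "57.142.0.1", "203.0.113.7"]:
        print(guest, in_assumed_range(guest))
```

Feeding it the guest IPs from your online-users list or access log should show whether the spike really is confined to that block.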
 
The AI companies are adding new bots as fast as humans can block them. They keep changing IPs, names and user agents so that they can't be blocked. It's a racket and should be illegal. That's if they even follow robots.txt -- which many don't.
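
For a bot that does follow robots.txt, a rule can be sanity-checked offline before it goes live. A minimal sketch using Python's standard `urllib.robotparser`, assuming the crawler matches on the token `meta-externalagent` (taken from the user agent string reported in this thread, not confirmed against Meta's documentation). The `example.com` URL and the helper name `is_allowed` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# ASSUMPTION: "meta-externalagent" is the token the crawler looks for.
# This only helps against bots that actually honour robots.txt.
ROBOTS_TXT = """\
User-agent: meta-externalagent
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots_txt and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/threads/1"  # placeholder URL
    print(is_allowed(ROBOTS_TXT, "meta-externalagent/1.1", url))
    print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", url))
```

This only tells you what a compliant crawler would do with the file. It says nothing about bots that ignore robots.txt altogether.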
 
This one: meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)? We've seen huge numbers of hits from Meta's IP range over recent weeks, and from that user agent in particular. To quote them:
The Meta-ExternalAgent crawler crawls the web for use cases such as training AI models or improving products by indexing content directly.

I assume they might well run various things on their IP ranges, so given they are giving you a user agent maybe block based on that instead if you want to be more nuanced?
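
Blocking on the user agent instead of the IP range could look roughly like the check below. It is a sketch, not a drop-in for any particular forum software or web server: the token list and the function name `should_block` are illustrative, and a bot that changes or spoofs its user agent will slip straight past it.

```python
from typing import Optional

# Substrings to look for in the User-Agent header (lower-case).
# "meta-externalagent" comes from the agent string quoted in this thread;
# extend the tuple with other crawlers you want to refuse.
BLOCKED_AGENT_TOKENS = ("meta-externalagent",)


def should_block(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent contains a blocked token."""
    if not user_agent:
        # A missing header is left alone here; tighten if you prefer.
        return False
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_TOKENS)


if __name__ == "__main__":
    meta = ("meta-externalagent/1.1 "
            "(+https://developers.facebook.com/docs/sharing/webmasters/crawler)")
    print(should_block(meta))           # the crawler discussed above
    print(should_block("Mozilla/5.0"))  # an ordinary browser-style agent
```

The upside over an IP block is that anything else running on Meta's addresses, such as link preview fetchers, is left untouched.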
 
Am I wrong in thinking that this activity is good for future traffic? I have as many as 700 robots at times. I don’t notice any slowdown of my server; little CPU and memory is consumed according to my Webmin stats. Mind you, I’m self-hosted with tons of resources. I suppose if that wasn’t the case it might be an issue.
 
Meta-ExternalAgent is a new AI crawler and I've read complaints about excessive traffic generated by this bot on other sites.

It's now listed in KnownBots - should show up in the next couple of days once your site runs the update routine.

There are various ways you can mitigate bots like this - my first suggestion would be to block AI bots in Cloudflare as mentioned by @Alpha1
 
Am I wrong in thinking that this activity is good for future traffic?
Who knows, I suppose it depends what they are doing with "your site content" ... I can't imagine they are intent on pushing traffic off the Meta/Facebook platform!

I've read complaints about excessive traffic generated by this bot on other sites
On our forum sites we're seeing roughly 2 to 3 requests a second from Meta's AI bot/scraper. Some of our other clients are getting higher rates than that. It's not been enough anywhere that I've had to curtail it. There have been some more aggressive crawlers floating around recently, which I'd guess are gathering data for AI training and being less well behaved; a couple of those are coming off Alibaba Cloud hosting and OVH SAS.
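
To put a number like "2 to 3 requests a second" on your own site, the access log can be counted per second for one user agent. A rough sketch, assuming the common "combined" log format used by Apache and nginx. The sample lines, the IP addresses in them and the function names are invented for illustration.

```python
import re
from collections import Counter

# Matches the timestamp and the final quoted field (the user agent)
# of a "combined" format access log line. ASSUMPTION: your log uses
# that format; adjust the pattern if it does not.
LOG_LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# Hypothetical sample lines, not real traffic.
SAMPLE = [
    '57.141.0.1 - - [10/Oct/2024:13:55:36 +0000] "GET /threads/1 HTTP/1.1" 200 512 "-" "meta-externalagent/1.1"',
    '57.141.0.2 - - [10/Oct/2024:13:55:36 +0000] "GET /threads/2 HTTP/1.1" 200 734 "-" "meta-externalagent/1.1"',
    '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0"',
    '57.141.0.1 - - [10/Oct/2024:13:55:37 +0000] "GET /threads/3 HTTP/1.1" 200 610 "-" "meta-externalagent/1.1"',
]


def hits_per_second(lines, token):
    """Count requests per log timestamp whose user agent contains token."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and token in match.group("ua").lower():
            counts[match.group("ts")] += 1
    return counts


def peak_rate(lines, token):
    """Highest number of matching requests seen in any single second."""
    return max(hits_per_second(lines, token).values(), default=0)


if __name__ == "__main__":
    print(peak_rate(SAMPLE, "meta-externalagent"))
```

On a real log, pass the open file object in place of `SAMPLE`. The peak figure is what matters for deciding whether the bot needs curtailing.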
 
Facebook is working on their own AI chat system like OpenAI, which could potentially be good. So you’re looking at potentially receiving traffic if your site is ranked and indexed by their AI systems.

Just like if your site is connected to Bing Webmaster Tools, your content will be crawled and appear in ChatGPT’s answers, which will boost your traffic as these answers will link back to your site as the source.

Perplexity does the same thing; they link back to the sites that they crawl and index.

If you don’t want any traffic coming from the AI systems, then block them off.
 
Facebook has invented, perfected and sustained the walled garden principle. Meta is not in the business of sharing traffic nor letting you eat into their business.
If you expect to receive traffic from Facebook's AI then most likely you will be disappointed.
 
If it's scraping the site for content to train an AI rather than for indexing, it could be using posts from your site without any actual link back to the site.
I let free, semi-low-quality content get scraped. Threads with images and videos, very little.
 