Cloudflare Free Plan can block AI Scrapers and Crawlers

Under Security > WAF > Custom rules, you can create a rule that blocks Known Bots that are not in the Search Engine Crawler category. This rule also matches AI scrapers and crawlers.
1719117462779.webp
 
Yes, I set it to Block. Right after changing the setting, I sent a test request to my page with a bot-like User-Agent ("#curl -A .....") and it was not blocked. I think this setup (if it works) would be the best for my community: just block everything other than real users and crawlers that can bring me more users.
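A test like the curl one above can also be scripted. The following is only a minimal sketch: the URL and User-Agent string are placeholders, and the helper names `build_request`, `fetch_status` and `describe` are made up for illustration. It assumes that a Cloudflare block (and usually a challenge page too) comes back as HTTP 403. Note also that, as far as I understand it, Cloudflare identifies Known Bots by more than the User-Agent string, so a spoofed User-Agent sent from an ordinary connection may simply not match the rule, which could explain why the curl test was not blocked.

```python
import urllib.error
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that carries the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_status(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code, including error codes."""
    request = build_request(url, user_agent)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still what we want.
        return err.code


def describe(status: int) -> str:
    """Translate a status code into a rough verdict (assumption: 403 means blocked)."""
    if status == 403:
        return "blocked or challenged"
    if 200 <= status < 300:
        return "allowed"
    return f"other response ({status})"


# Example call (placeholder URL and User-Agent, not run automatically):
# print(describe(fetch_status("https://example.com/", "ExampleBot/1.0")))
```

Comparing the result for a bot-like User-Agent with the result for a normal browser User-Agent shows whether the rule reacts to the header at all.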

Usually the easiest setting is Managed Challenge, as it is the most convenient for real users: if one shows up, they either pass through without any interaction or click the Cloudflare box and then continue to the page.

As for this rule (is "Known Bot" but is not "Search Engine Crawler"): with Managed Challenge, Cloudflare sends a fairly large HTML page to the client (robot or user). I thought it would be fairer to just tell them "no" with an error code.

Anyway, I do not want bots that can "solve" the challenge. There are paid services that give you access to a web crawler that can circumvent Cloudflare's bot challenge, and the higher the Cloudflare setting (within paid plans), the more you pay. Cloudflare would not detect these bots anyway.
 
Cloudflare only recognizes a small subset of bots as "Known Bots".

These are my Cloudflare WAF Custom rules.

Most requests from countries like Russia, China, and Singapore (the 0% Challenge Solved Rate [CSR] strongly indicates these are bots) are not blocked by the "Known Bots other than Search Engine Crawler" rule.

1719285393529.webp
 
Under Security > WAF > Custom rules, you can create a rule that blocks Known Bots that are not in the Search Engine Crawler category. This rule also matches AI scrapers and crawlers.
BTW - if you want to use it in an expression, you don't want cf.verified_bot_category ne "Search Engine Crawler" (that will of course work, but it will also pick up other categories of bots, including things like payment processor webhooks). If you just want to block AI crawlers, you would want this:

cf.verified_bot_category eq "AI Crawler"
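If more than one category should be matched, the expression can be generated rather than typed by hand. This is just a sketch: `category_expression` is a made-up helper, and the only category names taken from this thread are "AI Crawler" and "Search Engine Crawler"; any other name passed in has to be checked against Cloudflare's category list. It relies on the Rules language `in { ... }` set syntax, where values are separated by spaces.

```python
def category_expression(categories: list[str]) -> str:
    """Build an expression that matches any of the given verified bot categories."""
    if not categories:
        raise ValueError("at least one category is required")
    for name in categories:
        # Keep the sketch simple: refuse names that would need escaping.
        if '"' in name or "\\" in name:
            raise ValueError(f"unsupported character in category name: {name!r}")
    quoted = [f'"{name}"' for name in categories]
    if len(quoted) == 1:
        return f"cf.verified_bot_category eq {quoted[0]}"
    # Set values in the Rules language are space-separated, not comma-separated.
    return "cf.verified_bot_category in {" + " ".join(quoted) + "}"


print(category_expression(["AI Crawler"]))
```

With a single category this prints the same expression as above; with several it switches to the `in { ... }` form, so the rule stays an explicit list of blocked categories instead of "everything except search engines".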

You can find various bot categories here:

 
Cloudflare only recognizes a small subset of bots as "Known Bots".

These are my Cloudflare WAF Custom rules.

Most requests from countries like Russia, China, and Singapore (the 0% Challenge Solved Rate [CSR] strongly indicates these are bots) are not blocked by the "Known Bots other than Search Engine Crawler" rule.


Three days later I got this email from Google. It seems there are Cloudflare Known Bots that are search engine crawlers, but Cloudflare puts them in a different category:

1719792837781.webp
 