Not going to lie, they have been tearing my site up, and I didn't know who it was until I installed the KnownBots addon. I even started to block Amazon AWS IP addresses.
User Agent: compatible; "ClaudeBot/1.0; +claudebot@anthropic.com"
Before April 19, it was just: "claudebot"
What should we be blocking as a user-agent? Under our domain in Cloudflare, I can add user agent blocking under Security/WAF and add something like (lower(http.user_agent) contains "claude") or (lower(http.user_agent) contains "anthropic") which should catch them, I think? (I'm thinking "contains" might be enough of a wildcard on both terms to catch both variants...?)
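One way to sanity-check that expression before deploying it is to emulate the same logic locally. The following is only a rough sketch in Python, not Cloudflare's rule engine: `matches_block_rule` is a hypothetical helper, and it assumes that `lower()` plus `contains` behaves like lowercasing followed by a plain substring test, so no wildcard is needed.

```python
def matches_block_rule(user_agent: str) -> bool:
    """Emulate: (lower(ua) contains "claude") or (lower(ua) contains "anthropic").

    Assumption: "contains" is a plain substring test on the lowercased string.
    """
    ua = user_agent.lower()
    return "claude" in ua or "anthropic" in ua


# The two variants mentioned above: the newer string and the older bare name.
new_style = "compatible; ClaudeBot/1.0; +claudebot@anthropic.com"
old_style = "claudebot"

for ua in (new_style, old_style):
    print(ua, "->", matches_block_rule(ua))
```

Under that assumption, either term on its own already matches the newer string, and "claude" also matches the older one.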
Yes, a simple match on claudebot should do the trick.
I capture user agents from sites using my KnownBots addon so I can analyse them to identify bots; I currently have 59,540 user agents in the database. This lets me search the database to see whether a simple match will be sufficient or whether it will lead to false positives.
In this case, the only user agents matching claudebot are:
Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +support@anthropic.com)
Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com),gzip(gfe)
ClaudeBot
(note that I strip certain variables like version numbers from certain strings to minimise duplicates - hence the AppleWebKit/0.0)
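The check described here, searching the captured user agents for a term and reviewing every hit for false positives, could look roughly like the sketch below. This is not the KnownBots addon's code: `find_matches` is a hypothetical helper, and the in-memory list stands in for the real database, with one Googlebot string added purely as a non-matching contrast.

```python
# Stand-in for the user agent database (assumption: a plain list of strings).
CAPTURED_USER_AGENTS = [
    "Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +support@anthropic.com)",
    "Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)",
    "Mozilla/5.0 AppleWebKit/0.0 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com),gzip(gfe)",
    "ClaudeBot",
    # Added for contrast only; not part of the list above.
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
]


def find_matches(term: str, user_agents: list[str]) -> list[str]:
    """Return every user agent containing `term`, ignoring case."""
    needle = term.lower()
    return [ua for ua in user_agents if needle in ua.lower()]


# Every hit should be reviewed by eye: anything that is not the bot
# you meant to block is a false positive for that search term.
hits = find_matches("claudebot", CAPTURED_USER_AGENTS)
for ua in hits:
    print(ua)
```

If every hit is a ClaudeBot variant, as it is for the four strings above, the simple match is safe to use as a block rule.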
I've had to block claudebot at the Cloudflare level on several of my sites because of this bad behaviour.
Wow, thanks for that. And I should probably get that addon one of these days. I had one of my low-traffic sites get slammed a couple of weeks ago and I'm betting that was the culprit, as I've blocked most other bots using Cloudflare.