How to block Robot ByteDance

Alvin63

Well-known member
I've mentioned this on a couple of other threads but Robot ByteDance is hogging my site and appearing multiple times every 60 seconds or so.

It ignores robots.txt.
It also ignored my Cloudflare rules, so it must be bypassing Cloudflare.
I've been unable to whitelist Cloudflare's IPs in my shared hosting (it won't accept IP ranges, only individual IPs, and there are thousands).

Any other suggestions? Block it in htaccess?
 
Thank you. Blocking it in .htaccess won't cause issues with my other Cloudflare settings, will it? I don't mean anything relating to this robot, just Cloudflare settings in general. For example, I apparently can't whitelist my own IP in .htaccess while also using Cloudflare Zero Trust, because the two are incompatible or one could cancel the other out. Sorry, that's not very scientific, but it's hot here tonight!

But I assume this is something different altogether and not related to Zero Trust.
 
My partial .htaccess (kills them DEAD)
Code:
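# Flag unwanted crawlers by User-Agent (case-insensitive match)
BrowserMatchNoCase "Bytedance" bad_bot
BrowserMatchNoCase "Bytespider" bad_bot
BrowserMatchNoCase "Baiduspider" bad_bot
BrowserMatchNoCase "BIDUBrowser" bad_bot
# Refuse any request flagged above (Apache 2.2-style access control)
Order Deny,Allow
Deny from env=bad_bot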
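
If your host runs Apache 2.4 without mod_access_compat, the Order/Deny lines may not be accepted. A rough equivalent in the newer syntax might look like the sketch below. It assumes mod_setenvif and mod_authz_core are enabled and that your host allows these directives in .htaccess, so treat it as untested on your setup:

```apache
# Flag the crawlers by User-Agent, as in the block above
BrowserMatchNoCase "Bytedance" bad_bot
BrowserMatchNoCase "Bytespider" bad_bot
BrowserMatchNoCase "Baiduspider" bad_bot
BrowserMatchNoCase "BIDUBrowser" bad_bot

# Apache 2.4 style: allow everyone except flagged requests
<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>
```

Use one style or the other, not both. Mixing Order/Deny with Require in the same file can give confusing results.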
 
Thanks. That's interesting, because I thought it was ByteDance and Bytespider, not Bytedance?

Also, I've just edited my post above to add a question.
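
On the spelling question: BrowserMatchNoCase compares case-insensitively, so "Bytedance" also catches "ByteDance". The case-sensitive directive is BrowserMatch. A small illustration (bad_bot is just the variable name used earlier in the thread):

```apache
# Case-insensitive: matches ByteDance, Bytedance, BYTEDANCE, ...
BrowserMatchNoCase "Bytedance" bad_bot

# Case-sensitive: only matches the exact spelling "ByteDance"
BrowserMatch "ByteDance" bad_bot
```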
 
Thanks. AI suggested this. Does it look any good? Although I'm sure yours is better 😂 I just wasn't sure whether I could still use Allow and Deny while also using Cloudflare Zero Trust.

RewriteEngine On

# Block bad bots
RewriteCond %{HTTP_USER_AGENT} ByteDance [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bytespider [NC]
RewriteRule .* - [F,L]

It said to put it after the first RewriteEngine On and before any other RewriteRule.
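
To illustrate that placement advice, here is a sketch of how the block could sit ahead of other rewrite rules. The index.php front-controller rules are a made-up stand-in for whatever your software already has in .htaccess, and mod_rewrite is assumed to be enabled:

```apache
RewriteEngine On

# Bot block goes first so it is evaluated before any routing rules
RewriteCond %{HTTP_USER_AGENT} (ByteDance|Bytespider) [NC]
RewriteRule ^ - [F]

# Hypothetical existing rules: send non-file requests to index.php
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^ index.php [L]
```

The [F] flag returns 403 Forbidden and implies [L], so no later rule runs for a blocked request.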
 
Where do you put this? At the top of .htaccess?
 
Done that :-) Copied your text exactly and pasted it at the end. And it's still there: Robot ByteDance. Is it supposed to work straight away, or might it take a while?
 
Cleared the server cache and it was still there. Refreshed the page and it's gone now... touch wood! Thanks for your help.
 
Great solution, thank you very much! It is no longer appearing as a robot. However, my guest count has gone up a lot, so maybe it's being sneaky and pretending to be a guest now? Anyway, at least the bot is no longer listed, and Google and Bing are appearing more now.
 