Yandex Bots Hammering My Site

Mouth

Well-known member
#9
Code:
User-agent: AhrefsBot
User-agent: Baidu
User-agent: Baiduspider
User-agent: Baiduspider-video
User-agent: Baiduspider-image
User-agent: BoardReader
User-agent: BoardTracker
User-agent: Cliqzbot
User-agent: Diffbot
User-agent: dotbot
User-agent: EasouSpider
User-agent: Exabot
User-agent: Gigabot
User-agent: linkdexbot
User-agent: linkdexbot-mobile
User-agent: magpie-crawler
User-agent: meanpathbot
User-agent: MJ12bot
User-agent: NaverBot
User-agent: omgilibot
User-agent: proximic
User-agent: Rogerbot
User-agent: SiteBot
User-agent: sogou
User-agent: sogou spider
User-agent: Sogou web spider
User-agent: Sosospider
User-agent: spbot
User-agent: trendictionbot
User-agent: Twiceler
User-agent: URLAppendBot
User-agent: Yandex
User-agent: Yeti
User-agent: YoudaoBot
Disallow: /
 

Mr Lucky

Well-known member
#10

How likely is it that any of those bots will actually obey robots.txt?
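
Mostly they won't; robots.txt is purely advisory. One way to find out is to count crawler hits in your access log after publishing the file and see whether the numbers drop. A sketch assuming a combined-format log; the sample lines and path here are made up for illustration, so point it at your real log:

```shell
# Made-up sample log lines in combined format; in practice, read your
# server's real access log instead of creating this file.
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [01/Jan/2024:00:00:00 +0000] "GET / HTTP/1.1" 200 123 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
5.6.7.8 - - [01/Jan/2024:00:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 99 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0)"
9.9.9.9 - - [01/Jan/2024:00:00:02 +0000] "GET /page HTTP/1.1" 200 456 "-" "Mozilla/5.0 (Windows NT 10.0)"
EOF

# In combined log format, splitting on double quotes puts the
# User-Agent string in field 6. Count lines from known crawlers.
awk -F'"' '{print $6}' /tmp/sample_access.log \
  | grep -icE 'yandex|baidu|ahrefs|mj12'
```

If the count keeps climbing for a bot you disallowed, that bot is ignoring robots.txt and needs a harder block.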
 

Rudy

Well-known member
#12
I had to block Baidu in our firewall. On a busy forum (about 1,500-1,800 users online during peak hours), the Baidu bot was hitting us with roughly 100 requests at a time from dozens of IP addresses, which was in essence a DoS attack. robots.txt did nothing to stop them. Only after a firewall smackdown did we finally get rid of their traffic, and I then applied the same block on every server I have sites on.

I should probably do the same for Yandex.
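
Since robots.txt is advisory, a web-server-level block is a common complement to the firewall approach for crawlers that ignore it. A minimal sketch for nginx; the crawler patterns just mirror the robots.txt list above, and `example.com` is a placeholder:

```nginx
# Map known aggressive crawler user-agents to a flag.
# This goes in the http{} block; ~* patterns match case-insensitively.
map $http_user_agent $block_bot {
    default        0;
    ~*yandex       1;
    ~*baidu        1;
    ~*mj12bot      1;
    ~*ahrefsbot    1;
}

server {
    listen 80;
    server_name example.com;  # placeholder

    # Reject flagged crawlers before they reach the application.
    if ($block_bot) {
        return 403;
    }
}
```

Blocking by user-agent catches well-behaved-but-aggressive bots that identify themselves; bots that spoof their user-agent still need IP-range blocks at the firewall, as Rudy describes.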