Robots

Robots 2.1

...can I set it up to *blank-page* all robots
Pretty much. All you would do is populate the Robot Names box, then just leave the Allowed Robots box empty.

The issue arises when you stated "independent of robot name". Somehow you want the add-on to identify and blank-page all robots, so you must supply a way for it to identify exactly what user-agent is a robot.
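To make the idea concrete, here is a minimal sketch of how name-based matching of this kind might work. It is written in Python purely for illustration and is not the add-on's actual code; the function name `classify` and the two lists (standing in for the Robot Names and Allowed Robots boxes) are assumptions.

```python
def classify(user_agent, robot_names, allowed_robots):
    """Classify a request by substring-matching its User-Agent header.

    robot_names    -- hypothetical stand-in for the "Robot Names" box
    allowed_robots -- hypothetical stand-in for the "Allowed Robots" box
    """
    ua = user_agent.lower()
    # Find the first configured robot name that appears in the user-agent.
    # Empty names are skipped, since an empty string matches everything.
    matched = next(
        (name for name in robot_names if name and name.lower() in ua),
        None,
    )
    if matched is None:
        return "human"          # no known robot name found: serve normally
    if matched.lower() in {a.lower() for a in allowed_robots}:
        return "allowed_robot"  # known robot on the allow list
    return "blocked_robot"      # known robot, not allowed: blank page
```

With an empty allowed list, every user-agent that matches a configured name comes back as `blocked_robot`, which is the "blank-page all robots" setup described above. Anything that does not match a name is treated as a normal visitor, which is why the names have to be supplied.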

As such, it's up to you to fill in the names of the robots you want to block. User-Agents.org is a pretty good site to see how various spiders, robots, crawlers and browsers identify themselves (i.e., what to put into the Robot Names box). If you're curious about what is currently browsing your website, you ought to be able to review your Raw Access Logs (via cPanel) and get a feel for everything that has been on your website.
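If you download the raw access log, a few lines of script can summarize which user-agents show up most often. The sketch below assumes the log is in the common "combined" format, where the user-agent is the last quoted field on each line; your host's format may differ, and the file name in the usage note is hypothetical.

```python
import re
from collections import Counter

# In the combined log format the user-agent is the final quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

def count_user_agents(lines):
    """Count how many requests each user-agent string made."""
    counts = Counter()
    for line in lines:
        match = UA_FIELD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Usage would look something like `count_user_agents(open("access.log", encoding="utf-8", errors="replace")).most_common(20)`, which lists the twenty busiest user-agents and gives you candidate names for the Robot Names box.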

If you're looking for a way to avoid website scraping, deter bad user agents, and halt referrer spam, you might want to take a look at ZB BLOCK. I've used it for years and have been well pleased with its capabilities.
 
...so you must supply a way for it to identify exactly what user-agent is a robot.

Right! I was a little lost there. I first thought perhaps there was a general way (i.e., without names) to identify whether or not a piece of traffic comes from a robot, but I found out via Andy that there isn't. But this is going to work out great anyway!

I'll check out your stuff. Thanks for your input!

cheers
/Håkan
 
I installed the addon and removed baidu from the list but baidu is still crawling my forum...

This is happening to us as well. As a test I've removed everything except googlebot, but our forum is being crawled right now by Bing, Yahoo, Yandex, and others. Unregistered and Registered group permissions under Robots are Allowed.

Is there anything else to check? This Add-On would be a huge help if it works.
 
I've removed everything except googlebot, but our forum is being crawled right now by Bing, Yahoo, Yandex, and others.

Not that I know of, but don't they then just get blank pages, so they have nothing to index? (That's how I read it from the overview)

(Quote Overview)
Q: What happens if a robot which is not allowed comes to my site?
A: It receives a blank page.
 
For sure, it seemed to be a great idea, but my main goal is to conserve the bandwidth and server executions they're using up. From the Robots tab in "Who's online now" they're moving from thread to thread, so blank page or otherwise, I'm not sure if that's actually happening. I think I might need to find something that just keeps them out entirely.
 
I found a minor bug. If the data being inserted into the "type" column is longer than 30 characters, these bots can fill up the error log with "data too long" error messages for that field. I've temporarily increased the maximum varchar length, but the PHP probably needs to enforce the 30-character limit.
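The kind of guard I mean is just a truncation before the insert. Here is a rough sketch of the idea in Python (the add-on itself is PHP, so this is only an illustration); the constant and function names are made up, and the 30 comes from the column limit mentioned above.

```python
TYPE_MAX_LEN = 30  # assumed to match the varchar length of the "type" column

def clamp_type(value, max_len=TYPE_MAX_LEN):
    """Truncate a value so it fits the column before it is inserted."""
    return value[:max_len]
```

Values that already fit pass through unchanged, and anything longer is cut to the first 30 characters, so the database never sees an oversized value.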

Great mod btw. :)
 
Hi Andy, what is the direct URL to the robots view page? My custom style did not add the link in my profile (I added permissions too).

To those who want to check the bot is working you can use this tool:


Sure enough, bots I have not whitelisted get a blank page. AWESOME!!!
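If you would rather check by hand, the same kind of test can be approximated with a short script that requests a page while pretending to be a given robot. This is only a sketch using Python's standard library; the helper names are made up, and treating "blank" as an empty or whitespace-only body is an assumption about what the add-on sends.

```python
import urllib.request

def build_probe(url, user_agent):
    """Build a request that identifies itself with the given user-agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

def looks_blank(body):
    """Treat an empty or whitespace-only response body as a blank page."""
    return len(body.strip()) == 0

def fetch_as(url, user_agent, timeout=10):
    """Fetch the URL with the spoofed user-agent and return the raw body."""
    request = build_probe(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

For example, `looks_blank(fetch_as(forum_url, "Baiduspider"))` should come back true for a robot you have not allowed and false for one you have, where `forum_url` is your own forum's address.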
 