Known Bots

Known Bots 6.0.3

@Sim Server error log:

Code:
Hampel\KnownBots\Exception\RequestException: Request error fetching bots: cURL error 7: (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) src/addons/Hampel/KnownBots/Api/KnownBots.php:67

Generated by: Unknown account Oct 29, 2023 at 12:50 PM

Stack trace

#0 src/addons/Hampel/KnownBots/SubContainer/Api.php(68): Hampel\KnownBots\Api\KnownBots->fetch(1698380712, false)
#1 src/addons/Hampel/KnownBots/Cron/FetchBots.php(23): Hampel\KnownBots\SubContainer\Api->fetchBots()
#2 src/XF/Job/Cron.php(37): Hampel\KnownBots\Cron\FetchBots::fetchBots(Object(XF\Entity\CronEntry))
#3 src/XF/Job/Manager.php(260): XF\Job\Cron->run(8)
#4 src/XF/Job/Manager.php(202): XF\Job\Manager->runJobInternal(Array, 8)
#5 src/XF/Job/Manager.php(86): XF\Job\Manager->runJobEntry(Array, 8)
#6 job.php(43): XF\Job\Manager->runQueue(false, 8)
#7 {main}

Request state

array(4) {
  ["url"] => string(8) "/job.php"
  ["referrer"] => string(89) "/threads/beautiful.78402/"
  ["_GET"] => array(0) {
  }
  ["_POST"] => array(0) {
  }
}
 
I think I had that one recently, too, and was too busy with other life stuff to follow up. Will check to see if mine was the same later.
 
Yeah, mine was basically the same, but then was followed by:

Server error log
  • ErrorException: No data returned from BotFetcher
  • src/XF/Error.php:77
  • Generated by: Unknown account
  • Oct 14, 2023 at 10:08 PM

Stack trace

#0 src/XF.php(219): XF\Error->logError('No data returne...', false)
#1 src/addons/Hampel/KnownBots/SubContainer/Api.php(46): XF::logError('No data returne...')
#2 src/addons/Hampel/KnownBots/Cron/FetchBots.php(19): Hampel\KnownBots\SubContainer\Api->fetchBots()
#3 src/XF/Job/Cron.php(37): Hampel\KnownBots\Cron\FetchBots::fetchBots(Object(XF\Entity\CronEntry))
#4 src/XF/Job/Manager.php(260): XF\Job\Cron->run(8)
#5 src/XF/Job/Manager.php(202): XF\Job\Manager->runJobInternal(Array, 8)
#6 src/XF/Job/Manager.php(86): XF\Job\Manager->runJobEntry(Array, 8)
#7 job.php(43): XF\Job\Manager->runQueue(false, 8)
#8 {main}

Request state

array(4) {
["url"] => string(8) "/job.php"
["referrer"] => string(50) "https://www.wondercafe2.ca/whats-new/posts/761051/"
["_GET"] => array(0) {
}
["_POST"] => array(0) {
}
}

 
Any time the server is offline (maintenance, reboot, etc), an attempt to fetch new bots will result in this RequestException. The same can also happen if there are network problems between your web host and mine such that the request fails.

Having one or two of those is not a problem and won't affect the function of the addon; they are most likely the result of a temporary outage.

If you get these messages every day, then that indicates a more significant issue that should be investigated.

None of these API fetch errors will be visible to end users - they happen in the background via the cron task.
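One rough way to tell "one or two" from "every day" is to count how many consecutive days have an error logged. The sketch below is only an illustration and not part of the addon: the function names and the three-day threshold are assumptions, and the timestamps are typed in by hand in the format the error log displays.

```python
from datetime import datetime, timedelta

# Format used by the error log entries, e.g. "Oct 29, 2023 at 12:50 PM"
LOG_FORMAT = "%b %d, %Y at %I:%M %p"


def longest_daily_streak(timestamps):
    """Return the longest run of consecutive calendar days that contain an error."""
    days = sorted({datetime.strptime(t, LOG_FORMAT).date() for t in timestamps})
    longest = current = 0
    previous = None
    for day in days:
        if previous is not None and day - previous == timedelta(days=1):
            current += 1   # error on the day right after the previous one
        else:
            current = 1    # gap in the log (or first entry): start a new run
        longest = max(longest, current)
        previous = day
    return longest


def needs_investigation(timestamps, threshold=3):
    """Hypothetical rule of thumb: errors on `threshold` days in a row look persistent."""
    return longest_daily_streak(timestamps) >= threshold


# The two RequestException dates reported earlier in this thread
occasional = ["Oct 14, 2023 at 10:08 PM", "Oct 29, 2023 at 12:50 PM"]
print(needs_investigation(occasional))
```

Two errors a fortnight apart give a streak of 1, so they fall on the "temporary outage" side of the line. A run of errors on several days in a row would cross the threshold.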
 
In my case it was ByteDance's Bytespider that was killing my server. I noticed the activity using this addon and blocked them through Cloudflare, as they do not respect robots.txt.

Has anything specifically worked for you? My site has been crawling this past week and every time I look at the robot list, Bytespider is there in multiples. Them routing through AWS has evaded country blocking and they don't care about robots.txt.
 
I am blocking them through the Cloudflare firewall, using a simple keyword match on the user agent. They are at least keeping that. As long as they have the bot name in the user agent, it should be easy to block them even if they are using AWS IPs!
 
Are you talking about the User Agent Blocking on the Tools tab?
 
That should work as well, but mine is configured as a rule under Web Application Firewall.

(http.user_agent contains "Bytespider")
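For comparison, the same kind of keyword check done at application level might look like the sketch below. It is not how Cloudflare evaluates the rule: `is_blocked` and the keyword tuple are hypothetical names, and the Bytespider user agent string is only an illustrative sample. This version lowercases both sides, on the assumption that you also want to catch case variants.

```python
# Hypothetical keyword list; add other bot names that identify themselves.
BLOCKED_UA_KEYWORDS = ("Bytespider",)


def is_blocked(user_agent, keywords=BLOCKED_UA_KEYWORDS):
    """Return True if any keyword appears in the user agent, ignoring case."""
    ua = (user_agent or "").lower()   # treat a missing header as an empty string
    return any(keyword.lower() in ua for keyword in keywords)


# Illustrative sample strings, not taken from real logs
bot_ua = "Mozilla/5.0 (compatible; Bytespider; spider-feedback@bytedance.com)"
browser_ua = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36")

print(is_blocked(bot_ua), is_blocked(browser_ua))
```

This only works while the bot keeps its name in the user agent, which is the same limitation the Cloudflare rule has.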
 
Been having some pretty egregious bot activity since Christmas Eve. Had upwards of 2.5k "guests" on the forums earlier today, in what was clearly systematic crawling, similar to what we'd experienced with Bytespider and Huawei's Petalbot earlier in 2023. These latest bots are actively pretending to be regular users though, leaving no clear identifier in their user agent. Already got "Send user agents via API" enabled, so hoping this will be able to inform an update that will help us all in blocking them.

Has anyone else experienced an upswing in these kinds of bots?
 
Use Cloudflare!
We already are, and it's easier said than done. With Bytespider and Petalbot, we could use Cloudflare to filter them out since they'd given us a nice obvious thing in their user agent to filter on. We don't have that here, since these new bots are intentionally masquerading as regular users. Permanently enabling "I’m Under Attack Mode" (the only thing we've been able to do thus far that's actually eliminated the bots while enabled) isn't a workable long-term solution: it degrades the user experience with constant Cloudflare challenges, and it's indiscriminate, meaning that legitimate bots that we want to continue accessing our site are also locked out.
 
Wow! Thank you. I had thought Cloudflare took care of everything.
Although we use it, I have not looked at any logs in the last weeks and months.
I remember once spending half a day searching for tips and tricks on how to solve this.
My robots.txt was long, and so was my .htaccess.

In the end, a server admin deleted everything and told me to leave it as it is.
 
I have been getting so many of them since then that it exceeded the bandwidth limit I had set for each of my forums in December (the slowest month of the year for our forums). There were about 100 IPs running scripts that were just looking for known exploits in WordPress (which we don't have). For some reason, they kept trying anyway.

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36

72.239.225.90
47.76.35.19
35.92.70.1
35.91.39.172
103.159.74.133
1.39.78.22
103.238.106.13
5.45.80.13
92.99.17.161
233.233.51.129
hundreds of others

I put this in my .htaccess and it cut back considerably, with a simple text 403 instead of the full graphic response from XenForo.

Code:
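# Flag requests for the probe scripts (dots escaped so they match literally)
SetEnvIf Request_URI "wp-login\.php$" BackOffNow=1
SetEnvIf Request_URI "style\.php$" BackOffNow=1
SetEnvIf Request_URI "xmlrpc\.php$" BackOffNow=1
SetEnvIf Request_URI "task-check\.php$" BackOffNow=1
SetEnvIf Request_URI "sad\.php$" BackOffNow=1
SetEnvIf Request_URI "radio\.php$" BackOffNow=1
# Apache 2.2-style access control; on Apache 2.4 this relies on mod_access_compat
Order allow,deny
Allow from all
Deny from env=BackOffNow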

There may be a better way to handle this?
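One thing that may help before tightening the rules is checking offline which request paths the patterns would actually catch. The sketch below mirrors the `SetEnvIf` list as a single Python regular expression. It is an approximation of Apache's matching, not a replacement for it; `PROBE_RE` and `is_probe` are hypothetical names, and anchoring on a path separator is an added assumption so that a name like `restyle.php` is not caught by the `style.php` rule.

```python
import re

# Same script names as the SetEnvIf lines, combined into one pattern.
# (?:^|/) requires the name to start a path segment; \. matches a literal dot.
PROBE_RE = re.compile(
    r"(?:^|/)(?:wp-login|style|xmlrpc|task-check|sad|radio)\.php$"
)


def is_probe(request_uri):
    """Return True if the path part of the request matches a known probe script."""
    path = request_uri.split("?", 1)[0]   # compare the path only, not the query string
    return bool(PROBE_RE.search(path))


for uri in ("/wp-login.php", "/blog/xmlrpc.php", "/restyle.php",
            "/index.php?threads/beautiful.78402/"):
    print(uri, is_probe(uri))
```

Running a day's worth of logged request paths through something like this shows whether the list is catching real forum URLs by accident before the rules go live.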

Some stragglers after, but most were looking for the above.

my_domain/Blog
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/BLOG
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/blog
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/SITE
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/Site
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/site
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/sito
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/bac
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/sitio
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/bak
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:27 AM
my_domain/shop
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Shop
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/SHOP
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/BACKUP
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Backup
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/bk
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/old-site
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/main
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/2021
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Www
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/WWW
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/www
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/bc
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/demo
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/TEST
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Test
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/test
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/backup
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/2018
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/2019
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/2020
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/2022
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/NEW
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/wp-old
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/New
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/new
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/oldsite
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/OLD
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Old
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/old
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/WP
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Wp
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/wp
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/WordPress
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/WORDPRESS
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/wordpress
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM
my_domain/Wordpress
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36 Today at 5:26 AM

I had to disable Known Bots from uploading the report, because it gives a server 500 error every morning when it runs.
 
I had good luck with this:


Ever since, I've hardly had any robots lurking on the forum.
 
And there’s an Nginx version too:
 
Check the IPs they are using. Is it an easily identifiable hosting provider (i.e. non-residential connections)?
Nothing that was immediately obvious, though there are a lot of IPs to go through and we might be more successful at finding commonalities as we systematically go through them all. As it is, the IPs are all Cloudflare ones in the Apache logs, so we'd have to make changes to Apache to pass the actual IPs through to logging. The actual IPs are there, but they're usually in X-Forwarded-For or CF-Connecting-IP, or some other similar header.
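On the Apache side this is usually handled with mod_remoteip. For analysing requests in a script, a minimal sketch of the same idea is below, assuming Cloudflare puts the visitor address in the CF-Connecting-IP header. The header should only be trusted when the connection really comes from Cloudflare. `client_ip` is a hypothetical helper, and the network list is a small illustrative subset that would need replacing with Cloudflare's current published ranges.

```python
import ipaddress

# Illustrative subset only; load the full, current list from Cloudflare.
TRUSTED_PROXIES = [
    ipaddress.ip_network(net)
    for net in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13")
]


def client_ip(peer_ip, headers):
    """Return the visitor IP, trusting proxy headers only from known proxy ranges."""
    peer = ipaddress.ip_address(peer_ip)
    if not any(peer in net for net in TRUSTED_PROXIES):
        return peer_ip   # direct connection: the headers could be forged

    lowered = {name.lower(): value for name, value in headers.items()}
    candidate = lowered.get("cf-connecting-ip")
    if not candidate and "x-forwarded-for" in lowered:
        # The first entry is the original client; later entries are proxies
        candidate = lowered["x-forwarded-for"].split(",")[0].strip()
    if not candidate:
        return peer_ip
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        return peer_ip   # malformed header value: fall back to the peer address


print(client_ip("104.16.1.1", {"CF-Connecting-IP": "72.239.225.90"}))
```

With the real addresses recovered, grouping them by network or hosting provider becomes much easier than working from a log full of Cloudflare IPs.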

I have been getting so many of them since then that it exceeded the bandwidth limit I had set for each of my forums in December (the slowest month of the year for our forums). There were about 100 IPs running scripts that were just looking for known exploits in WordPress (which we don't have). For some reason, they kept trying anyway.
We've already had those blocked, so that won't be the cause in this specific instance, but certainly we do get those targeting us on a daily basis.

I had good luck with this:

Ever since, I've hardly had any robots lurking on the forum.
I don't think I've seen this one before, I'll pass it on to our tech.

On a related note, just noticed that the sending of user agents for this bot failed yesterday.
Code:
Hampel\KnownBots\Exception\RequestException: Request error sending user agents: src/addons/Hampel/KnownBots/Api/KnownBots.php:240 
Generated by: Unknown account Jan 16, 2024 at 5:46 PM 

Stack trace
#0 src/addons/Hampel/KnownBots/SubContainer/Api.php(109): Hampel\KnownBots\Api\KnownBots->sendUserAgents('87|Q2jnjrKI7UJ0...', Array)
#1 src/addons/Hampel/KnownBots/Service/UserAgentSender.php(47): Hampel\KnownBots\SubContainer\Api->sendUserAgents('87|Q2jnjrKI7UJ0...', Array)
#2 src/addons/Hampel/KnownBots/Cron/SendAgents.php(71): Hampel\KnownBots\Service\UserAgentSender->sendUserAgents()
#3 src/addons/Hampel/KnownBots/Cron/SendAgents.php(42): Hampel\KnownBots\Cron\SendAgents::sendApi(Array)
#4 src/XF/Job/Cron.php(37): Hampel\KnownBots\Cron\SendAgents::send(Object(XF\Entity\CronEntry))
#5 src/XF/Job/Manager.php(260): XF\Job\Cron->run(8)
#6 src/XF/Job/Manager.php(202): XF\Job\Manager->runJobInternal(Array, 8)
#7 src/XF/Job/Manager.php(86): XF\Job\Manager->runJobEntry(Array, 8)
#8 job.php(43): XF\Job\Manager->runQueue(false, 8)
#9 {main}
 
It should try and re-authenticate automatically.

Let me know if you receive further errors.
Going back through the error logs, it appears it's failed to send for the past 4 days in a row. Failed again today at the same time.

We implemented a block around that time on blank user agents to try and mitigate some of the attacks we were under, would that be impacting on this?
 