XF 2.0 Bot/Crawler Detection

katsulynx

Well-known member
#1
Judging from a few hints, I'm pretty sure there is a way, but I can't figure out exactly how. Is there a way to determine whether the current visitor is a crawler? Does this also work for Facebook's Open Graph crawler/user agent?
 

katsulynx

Well-known member
#4
I doubt there's actually any need to build in anything extra. The online list is perfectly capable of filtering robots, so if that functionality is somehow available for the current visitor as well, that'd be all I need.
 

katsulynx

Well-known member
#6
@katsulynx, the XF bot detection is fairly limited, but if you are only interested in the major, well-behaved bots, it should work.
Yes. I only want to filter them out so that they're not shown the age gate. I've tried showing the page under the following condition, but it still blocks the Facebook Open Graph crawler, even though it's listed as a robot on the online list afterwards.
PHP:
if (empty(\XF::visitor()->Activity->robot_key)) {
    // User is not a robot.
}
 

Xon

Well-known member
#7
Check the session object instead:
PHP:
/** @var \XF\Session\Session $session */
$session = \XF::session();
if (!empty($session['robot']))
{
    // is a robot
}
Note: robot may not be set if the session hasn't been started, so check it with isset()/empty().
 

katsulynx

Well-known member
#8
Hm. I'm running the code inside a listener bound to the controller_post_dispatch event, so I'd assume the session and user are already set at that point (and dumping them on the page gives the expected results). However, your code still blocks the Facebook Open Graph crawler. It seems the robot entry is empty for it at that point.
 

Xon

Well-known member
#9
If that is the case, the Facebook Open Graph crawler isn't being detected as a bot. Sounds like a bug report!
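Until the built-in detection picks it up, one possible workaround is to look at the User-Agent header directly. The following is only a minimal sketch, not XenForo's own detection: isFacebookCrawler() is a hypothetical helper name, the code assumes PHP 7.1 or newer, and it matches the two tokens Facebook's crawlers are known to send (facebookexternalhit and Facebot).

```php
<?php

/**
 * Hypothetical helper (not part of XenForo): returns true when the
 * User-Agent string looks like one of Facebook's link-preview crawlers.
 */
function isFacebookCrawler(?string $userAgent): bool
{
    if ($userAgent === null || $userAgent === '') {
        return false;
    }

    // Tokens sent by Facebook's crawlers; matched case-insensitively.
    foreach (['facebookexternalhit', 'Facebot'] as $token) {
        if (stripos($userAgent, $token) !== false) {
            return true;
        }
    }

    return false;
}

// Usage: read the header from the request and decide whether to skip the age gate.
$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? null;
$skipAgeGate = isFacebookCrawler($userAgent);
```

Keep in mind that the User-Agent header is trivially spoofed, so anyone who sends one of these strings would bypass the age gate as well. Combining this with the session check above (treat the visitor as a robot if either test passes) keeps the existing detection for other bots.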