Best way to handle a protracted cronjob?

Jaxel

Well-known member
I have a cronjob I want to run daily. The problem with this cronjob is that it could possibly involve 200 queries to an external API.

If I ran this all at once, I would get rate limited immediately. So ideally, I would want to space these queries to about 5 seconds apart.

What would be the best way to handle this? Would it inhibit the browsing ability of an end user?
 
Would something as simple as this be the way to go?

Code:
    public static function runSync()
    {
        $discordModel = XenForo_Model::create('EWRdiscord_Model_Discord');
        $extAuths = $discordModel->getExtAuths();
       
        foreach ($extAuths AS $auth)
        {
            $discordModel->patchGuildMember($auth['user_id'], $auth['provider_key']);
            sleep(5);
        }
    }

My question is, the random person who ends up being the one to trigger this cronjob... will this affect their viewing experience? If they close their browser, will it interrupt the cronjob?

What about server timeouts?
 
I have cron entries triggered by an actual unix cronjob so it's much more predictable as to when it will actually run and the execution time becomes less of an issue.

Even so, I'd generally avoid something which will take more than a few seconds to run.

I'd take one of two approaches:

A) use an external queuing mechanism to do these asynchronously ... although rate limiting can be an issue there

B) add a column to the database as an INT(10) and store the unix timestamp of the last time the task was run for that user and then have a cron job that runs once every minute, queries the database to find the first entry where the unix timestamp is more than 60*60*24= 86,400 seconds before the current time and then run the task and update the timestamp for that entry.

So something like (pseudo code):

Code:
$timenow = CURRENT_TIME;
$threshold = $timenow - 86,400;

$query = "SELECT user_id, provider_key FROM auths WHERE lastsync <= $threshold ORDER BY lastsync ASC LIMIT 1";

$auth = $db->execute($query);

if ($auth)
{
    $success = $discordModel->patchGuildMember($auth['user_id'], $auth['provider_key']);

    if ($success) $db->execute("UPDATE auths SET lastsync = $timenow WHERE user_id = $auth['user_id']);
}

The main limitation of this approach is that you'll only get through a maximum of 86,400 executions a day - but if you need more you can always have it run them in something like chunks of 5 with a sleep between each loop if necessary. Running 5 at a time with a 5 second sleep between them is a lot more reasonable than 200+ at a time.
 
I did the code I put in my second post with about 30 users and it seems to have worked fine... Though I don't know if it made any user's browsing experience hang or anything.
 
Top Bottom