XF 2.2 Pruned 25k members and now the server is grinding to a halt

dethfire

Well-known member
The initial processing ended an hour ago, but the site is still being crushed. The jobs table has 20k entries for XF:UserDeleteCleanUp. Is it working through these, or do I have a bigger problem?

I am also getting this error:
"There are scheduled jobs outstanding which have not run. Jobs may not be getting triggered when expected."
 
Batch updating (users) is extremely slow. Even on a powerful server it can take ages to update.

I had to remove a few thousand posts in a staging environment and that took a few hours in total.
 
Sadly, this is normal. If you look at MySQL, you will almost certainly find that it is stuck cleaning up the user reactions table. The cost seems to grow exponentially with the number of users you delete, as the batch cleanup job kicks in some time after the users have actually been deleted by your batch update.

I would suggest working in much smaller batches of fewer than 100 users at a time if you don't want to run into problems. Also make sure you wait between batches to allow the cleanup job to run.
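To make the batching advice concrete, here is a minimal Python sketch of the idea: delete a small batch, then wait until the cleanup jobs have drained before starting the next one. It is not XenForo code. The callbacks `delete_batch` and `pending_cleanup_jobs` are hypothetical placeholders for whatever you use to delete users and to count the outstanding XF:UserDeleteCleanUp entries in the jobs table, and the default batch size and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable, Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def prune_in_batches(
    user_ids: Sequence[int],
    delete_batch: Callable[[List[int]], None],   # hypothetical: deletes the given users
    pending_cleanup_jobs: Callable[[], int],     # hypothetical: counts outstanding cleanup jobs
    batch_size: int = 50,                        # assumption: well under 100, as suggested above
    poll_seconds: float = 30.0,                  # assumption: arbitrary polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Delete users in small batches and return the number of batches processed."""
    batches = 0
    for batch in chunked(user_ids, batch_size):
        delete_batch(batch)
        batches += 1
        # Do not start the next batch until the cleanup queue is empty again.
        while pending_cleanup_jobs() > 0:
            sleep(poll_seconds)
    return batches
```

The point of the inner loop is that the next batch never starts while cleanup work from the previous one is still queued, so the jobs cannot pile up into the thousands.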
 
Pruning 50k users has just resulted in downtime for us because of the resulting XF:UserDeleteCleanUp jobs, and the expensive query in XF\Reaction\AbstractHandler::updateRecentCacheForUserChange in particular. Our xf_post table has roughly 28,000,000 rows and there are roughly 200,000 rows in xf_reaction_content with reaction_user_id = 0. Each resulting query took roughly 2 seconds, and since many were run in parallel, I guess the jobs were continuously locking (significant parts of) the xf_post table, so that web requests started to pile up.

It would be nice if this could be improved. Maybe this could also be improved by not executing multiple jobs in parallel (which might be easier to implement when running jobs via CLI).
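As a rough illustration of the "no parallel jobs" idea, the Python sketch below lets only one runner work through the queue at a time; a second runner that starts while the first is active simply gives up. This is not how XenForo schedules jobs. The name `run_jobs_serially` is made up for the example, and an in-process lock like this only illustrates the principle. Separate PHP requests or CLI processes would need a shared lock (for example in the database or on disk) instead.

```python
import threading
from typing import Callable, Iterable

# One lock shared by every runner in this process (illustrative only).
_runner_lock = threading.Lock()


def run_jobs_serially(jobs: Iterable[Callable[[], None]]) -> bool:
    """Run the jobs one after another.

    Returns False without running anything if another runner already holds the lock.
    """
    if not _runner_lock.acquire(blocking=False):
        return False  # someone else is already processing jobs
    try:
        for job in jobs:
            job()
        return True
    finally:
        _runner_lock.release()
```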
 
Pruning 50k users has just resulted in downtime for us because of the resulting XF:UserDeleteCleanUp jobs, and the expensive query in XF\Reaction\AbstractHandler::updateRecentCacheForUserChange in particular. Our xf_post table has roughly 28,000,000 rows and there are roughly 200,000 rows in xf_reaction_content with reaction_user_id = 0. Each resulting query took roughly 2 seconds
Yeah, the query is rather bad: it unconditionally updates all rows for the target user ID, even if there is nothing to update.
Adding a condition should significantly speed up that query.
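To show what "adding a condition" means in general, here is a small, self-contained Python sketch using an in-memory SQLite database. It is not the XenForo query or schema: the table `demo_post`, its columns and the search strings are invented for illustration. It only compares how many rows an unconditional `UPDATE` touches with how many a conditional one touches.

```python
import sqlite3

# Invented search/replace fragments for the toy data below.
OLD = '"user_id":42,'
NEW = '"user_id":0,'


def count_touched_rows(conditional: bool) -> int:
    """Build a toy table, run the UPDATE, and return how many rows it touched."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_post (post_id INTEGER PRIMARY KEY, reaction_users TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO demo_post VALUES (?, ?)",
        [
            (1, '[{"user_id":42,"reaction_id":1}]'),
            (2, '[{"user_id":7,"reaction_id":1}]'),
            (3, '[]'),
            (4, '[{"user_id":42,"reaction_id":2}]'),
            (5, '[{"user_id":9,"reaction_id":1}]'),
        ],
    )
    sql = "UPDATE demo_post SET reaction_users = replace(reaction_users, ?, ?)"
    params = [OLD, NEW]
    if conditional:
        # Only rewrite rows that actually contain the old value.
        sql += " WHERE instr(reaction_users, ?) > 0"
        params.append(OLD)
    touched = conn.execute(sql, params).rowcount
    conn.close()
    return touched
```

With this toy data, `count_touched_rows(False)` rewrites all five rows, while `count_touched_rows(True)` rewrites only the two that contain the old value. How much a similar condition helps on MySQL depends on the real query and the available indexes.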

 
My suggestion is to delete their posts and other content (profile posts, etc.) before deleting multiple users.

Even using the "spam" tool on a single user with more than 1000 posts can bring the server to a halt (on my server at least :D)
 
I don't know what, if any, database changes were made for 2.2.15, but when we upgraded to it, the deadlocks and UserDeleteCleanUp issues seemed to disappear for a while. It may just have been a coincidence, as we seem to be having the problems again now during regular housekeeping.
 