Fixed Lock contention on thread-read tracking under very high load

Xon

Well-known member
Affected version
2.2.13
I've had a client report that under very high number of readers, the insert on duplicate key update query that is part of markThreadReadByUser causes snowballing slow queries due to lock contention.

MySQL was set to report deadlocks, but did not report anything during this catastrophically slow queries. innodb_flush_log_at_trx_commit had the default value of 1, but changing it to 2 did not appear to help.

A sample of the slow query log for this query:
Code:
# Time: 2023-xx-yyT00:00:37.264033Z
# Query_time: 54.301389  Lock_time: 54.032017  Rows_sent: 0  Rows_examined: 0  Rows_affected: 2  Bytes_sent: 19
# Time: 2023-xx-yyT00:00:37.263187Z
# Query_time: 31.912307  Lock_time: 31.637843  Rows_sent: 0  Rows_examined: 0  Rows_affected: 0  Bytes_sent: 11
Then with thousands of those the insert xf_thread_read on duplicate key update queries within the same second all reporting as being very slow.

The solution of patching markThreadReadByUser to not use insert xf_thread_read on duplicate key update similar to xf_session_activity handling appears to have resolved the unexpected query pileup.

ie this:
PHP:
        $noForceWrite = '-- XFDB=noForceAllWrite'."\n";
        if (!$thread->Read->offsetExists($userId))
        {
            // new record
            $this->db()->query($noForceWrite.'
                INSERT INTO xf_thread_read
                    (thread_id, user_id, thread_read_date)
                VALUES
                    (?,?,?)
                ON DUPLICATE KEY UPDATE thread_read_date = VALUES(thread_read_date)
            ', [$thread->thread_id, $userId, $newRead]);
        }
        else if ($readDate < $cutOff + 60)
        {
            // read-data was near the pruning cutoff
            $this->db()->query($noForceWrite.'
                REPLACE INTO `xf_thread_read`
                    (thread_id, user_id, thread_read_date)
                VALUES
                    (?, ?, ?)
            ', [$thread->thread_id, $userId, $newRead]);
        }
        else
        {
            // use update which is not affected by the locking strategy issues
            $this->db()->query($noForceWrite.'
                UPDATE `xf_thread_read`
                SET `thread_read_date` = ?
                WHERE `user_id` = ? AND `thread_id` = ?
            ', [$newRead, $userId, $thread->thread_id]);
        }
Note; the $noForceWrite bit is a separate so phpstorm highlights the queries as SQL properly. The cut-off variance of +60 was because it was observed the original insert on duplicate update queries could be slowed by ~50 seconds when under extreme load.

The fix I provided the client also does extra work to reduce potenetial read-locks caused by joins by not including the xf_thread_read join for thread/forum/node queries. Instead xf_thread_read rows are fetched in a batch via a simple select on demand via a custom FinderCollection class.

This approach has also been used for xf_session_activity table to reduce contention, which helped unexpectedly slow queries against that table as well.
 
Thank you for reporting this issue, it has now been resolved. We are aiming to include any changes that have been made in a future XF release (2.2.14).

Change log:
Reduce lock contention when writing thread read records
There may be a delay before changes are rolled out to the XenForo Community.
 
Top Bottom