XF 1.4 Double Posts, MySQL Lock Timeout errors

Denny Crane

Member
I've been getting a lot of both on my site lately.

http://www.sportstwo.com.

What I think is going on is that users on mobile are clicking the submit post button two or more times because there's a lot of lag.

I just read a thread where one guy submitted the same post 4 times in a row.

I'm only seeing a handful of these MySQL lock timeout errors in the server log (around 12-15 per day).

Any suggestions?
 
These are in the WWW servers' config.php:

[screenshot of the cache settings in config.php]


cache1, cache2, and cache3 are defined in /etc/hosts and are the three WWW server VPSes themselves.
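For illustration, a typical XF 1.x memcached setup in config.php looks roughly like this (a representative sketch only, not the exact contents of the screenshot; the hostnames mirror the /etc/hosts entries above):

Code:
<?php
// Illustrative XF 1.x cache block -- the actual backend and options may differ.
$config['cache']['enabled'] = true;
$config['cache']['cacheSessions'] = true;           // cache sessions as well as data
$config['cache']['frontend'] = 'Core';
$config['cache']['frontendOptions']['cache_id_prefix'] = 'xf_';
$config['cache']['backend'] = 'Memcached';
$config['cache']['backendOptions'] = array(
    'compression' => false,
    'servers' => array(
        array('host' => 'cache1', 'port' => 11211),  // the three WWW VPSes
        array('host' => 'cache2', 'port' => 11211),
        array('host' => 'cache3', 'port' => 11211),
    ),
);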

I am using XenForo Enhanced Search 1.1.1.

How do I give it more memory to handle its indexes?
 
You're caching sessions, so that's good.

Elasticsearch allocates 1 GB of memory by default, which it considers very conservative. Like MySQL, the goal is to get as much of the index as possible (ideally all of it) into memory. The exact method of changing this depends on how you have installed Elasticsearch; you may be able to configure it in your service wrapper or init.d script. See these pages for a couple of discussions on it:

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/heap-sizing.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup-configuration.html
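For example, with the standard packages the heap is usually set via the ES_HEAP_SIZE environment variable; the exact file depends on the install (values below are illustrative only):

Code:
# /etc/default/elasticsearch (Debian/Ubuntu) or /etc/sysconfig/elasticsearch (RHEL/CentOS)
# Keep the heap at or below roughly half of the machine's RAM.
ES_HEAP_SIZE=2g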

The size of your XF search index can be seen on the Elasticsearch setup page in the XF ACP.

If you're running a cluster and sharding your index, the memory requirements would go down per shard.

That said, it may be worth just waiting to see if the MySQL changes help (going from a very small amount of memory allocated to InnoDB to 6G will be a huge change). Changing more things at once may make it harder to see the effects of each individually.
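For reference, that InnoDB allocation is the innodb_buffer_pool_size setting in my.cnf; something along these lines, sized to your available RAM (illustrative only):

Code:
# my.cnf -- illustrative; the 6G figure matches the change discussed above
[mysqld]
innodb_buffer_pool_size = 6G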
 
I see.

I would need to set up a redundant cluster for Elasticsearch and add it to the hosting setup.

How much better is the enhanced search than MySQL's built-in search?

If the current MySQL configuration doesn't resolve things, I might disable search for a couple of days and see whether that helps.
 
How much better is the enhanced search than MySQL's built-in search?
MySQL's full text search traditionally hasn't scaled. The InnoDB version may have different performance characteristics, though it also differs in what counts as a "valid query", so it's effectively a different system wrapped up in the same API; we only officially support the MyISAM version. The MyISAM version often runs into problems beyond about 1 million entries, and MySQL full text performance varies wildly based on the number of documents matched by a query. It's also generally lacking out of the box unless you adjust MySQL to index shorter words and possibly trim the extensive stopword list.
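If you did stick with MySQL search, those adjustments would be along these lines (illustrative values; the index needs rebuilding afterwards):

Code:
# my.cnf -- MyISAM full-text settings (illustrative values)
[mysqld]
ft_min_word_len  = 3     # default is 4; index shorter words
ft_stopword_file = ''    # disable the built-in stopword list

# After restarting MySQL, rebuild the full-text index on the search table,
# e.g. (assuming the default xf_search_index table):
#   REPAIR TABLE xf_search_index QUICK;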

Elasticsearch/XF Enhanced Search is significantly faster (especially with a large number of documents), has a more extensive index out of the box (shorter words, fewer stopwords), supports things like stemming to improve results, and supports relevancy and custom scoring (including title- and recency-weighting). So all in all, it's significantly better I'd say.
 
I disabled search and the few add-ons I had installed. It's still happening, and not under a particularly heavy user load. There's no game tonight, so people aren't congregating in one thread.
 
Error Info
Zend_Db_Statement_Mysqli_Exception: Mysqli statement execute error : Lock wait timeout exceeded; try restarting transaction - library/Zend/Db/Statement/Mysqli.php:214
Generated By: truebluefan, 6 minutes ago
Stack Trace
#0 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/Zend/Db/Statement.php(297): Zend_Db_Statement_Mysqli->_execute(Array)
#1 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/Zend/Db/Adapter/Abstract.php(479): Zend_Db_Statement->execute(Array)
#2 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter/DiscussionMessage.php(928): Zend_Db_Adapter_Abstract->query('?????UPDATE xf_...', 22679)
#3 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter/DiscussionMessage/Post.php(200): XenForo_DataWriter_DiscussionMessage->_updateUserMessageCount(false)
#4 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter/DiscussionMessage.php(560): XenForo_DataWriter_DiscussionMessage_Post->_updateUserMessageCount()
#5 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter.php(1409): XenForo_DataWriter_DiscussionMessage->_postSave()
#6 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter/Discussion.php(477): XenForo_DataWriter->save()
#7 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter/Discussion.php(426): XenForo_DataWriter_Discussion->_saveFirstMessageDw()
#8 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/DataWriter.php(1409): XenForo_DataWriter_Discussion->_postSave()
#9 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/ControllerPublic/Forum.php(745): XenForo_DataWriter->save()
#10 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/FrontController.php(347): XenForo_ControllerPublic_Forum->actionAddThread()
#11 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/library/XenForo/FrontController.php(134): XenForo_FrontController->dispatch(Object(XenForo_RouteMatch))
#12 /home/mschwartz/github/mschwartz/sportstwo-xenforo/htdocs/index.php(13): XenForo_FrontController->run()
#13 {main}
Request State
array(3) {
["url"] => string(62) "http://www.sportstwo.com/forums/calendar-excess.610/add-thread"
["_GET"] => array(0) {
}
["_POST"] => array(10) {
["title"] => string(41) "Formula One Grand Prix of Malaysia Mar 29"
["message_html"] => string(0) ""
["message"] => string(41) "4:00 AM ET
Sepang International Circuit"
["_xfRelativeResolver"] => string(65) "http://www.sportstwo.com/forums/calendar-excess.610/create-thread"
["attachment_hash"] => string(32) "2ec48f061e32c0f9ea3fe423a59354e4"
["watch_thread_state"] => string(1) "1"
["discussion_open"] => string(1) "1"
["_set"] => array(2) {
["discussion_open"] => string(1) "1"
["sticky"] => string(1) "1"
}
["poll"] => array(8) {
["question"] => string(0) ""
["responses"] => array(10) {
[0] => string(0) ""
[1] => string(0) ""
[2] => string(0) ""
[3] => string(0) ""
[4] => string(0) ""
[5] => string(0) ""
[6] => string(0) ""
[7] => string(0) ""
[8] => string(0) ""
[9] => string(0) ""
}
["max_votes_type"] => string(6) "single"
["max_votes_value"] => string(1) "2"
["change_vote"] => string(1) "1"
["view_results_unvoted"] => string(1) "1"
["close_length"] => string(1) "7"
["close_units"] => string(4) "days"
}
["_xfToken"] => string(8) "********"
}
}
 
Is there a particular pattern to when these errors appear, such as certain minutes past the hour or a certain time of day? It's hard to see a pattern after the fact when I can only see a relative timestamp.

The lock wait timeout issues you've shown all seem to relate to the xf_user table, which is also updated as part of the "hourly clean up" cron script. That runs at 10 minutes past the hour by default. There is a transaction in it that could update a large number of xf_user records, but those updates should still be fast as they're primary key lookups changing an integer value. Even with a large number of updates, I wouldn't expect it to take long.

Otherwise, it's possible that we're just seeing the result of a long running transaction elsewhere. That other transaction would be the real problem, because it's holding on to the locks and thereby triggering the issue with the posts. Did you manage to run the SHOW ENGINE INNODB STATUS query while the errors were happening? If not, or if it didn't have any useful information, we may need to do some further debugging via a script.

On a side note, there's an example of MySQL replication actually being the cause of the lock: http://www.chriscalender.com/debugg...ew-innodb-plugin’s-information-schema-tables/ That post also shows what may be a better way of debugging if you're using the InnoDB plugin (or a version of MySQL that includes it), since the locks and the lock waits can be queried directly.
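For example, something like this run while a post is hanging will show which transaction is blocking which (requires the InnoDB plugin / MySQL 5.5+ information_schema tables; illustrative query):

Code:
-- Who is waiting on whom, with the blocking transaction's current query
SELECT r.trx_mysql_thread_id AS waiting_thread,
       r.trx_query           AS waiting_query,
       b.trx_mysql_thread_id AS blocking_thread,
       b.trx_started         AS blocking_started,
       b.trx_query           AS blocking_query
FROM information_schema.INNODB_LOCK_WAITS w
JOIN information_schema.INNODB_TRX r ON r.trx_id = w.requesting_trx_id
JOIN information_schema.INNODB_TRX b ON b.trx_id = w.blocking_trx_id;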
 
[screenshot of the server error log entries]

There were about 50 total yesterday. Those are the ones since I cleared the log.

It does look like they happen at 10 past the hour. We do have 26,000 records in the user table. Far fewer than the 3.6M in the posts table ;)

We get roughly 1000 posts a day and I'm pretty sure most of them come in the evening Pacific Time.

I presume there are other sites with more users and posting activity than mine, and that you've tested the software under load and with large data sets.

I could try shutting down the 2nd MySQL server for a bit to see if it resolves anything. There has to be a way to scale XenForo and make it redundant and reliable without resorting to ever bigger/badder/faster servers to host the database.
 
If you see this message before 10 minutes past the hour, can you go into the ACP "Cron Entries" page and choose the option to run "Hourly Clean Up" now? How long does it take to run (approximately)?
 
I just ran it and it seemed to take less than a second. It was REALLY fast.


I shut down the 2nd MySQL server, so replication is not in effect. I also do my backups from the second server via an automated script (the last one ran 4 hours ago).

I've only encountered the behavior a couple of times myself. When I do, I hit the submit post button and it goes grey for about a second and then becomes enabled again. Really quick.
 
If you're doing some sort of synchronous (or semi-synchronous) replication, then actions taken on the slave that cause locking could lead to locking on the master. If you're doing a mysqldump, for example, that would normally lock all the tables; you can pass --single-transaction as an option, which prevents this. You could also do asynchronous replication (master-slave), though that may be a different approach from what you have now.
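For example, if the backup script uses mysqldump against InnoDB tables, something like this avoids holding table locks for the duration of the dump (illustrative command; "dbname" stands in for the actual database name):

Code:
mysqldump --single-transaction --quick dbname > backup.sql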

I believe we do have some customers using various MySQL clustering approaches and this isn't an issue that I've heard of before.

In terms of people double posting, if the delay is coming from MySQL, you could enable the slow query log to see if any specific queries show up. This would only identify a particular query that's slow; it wouldn't necessarily catch a quick query that takes a lock and then a process that sits for a minute without hitting the DB. I haven't identified anything that would do that specifically, though.
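Turning the slow log on can be done at runtime, roughly like this (thresholds and file path are just a starting point):

Code:
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL slow_query_log_file = '/var/log/mysql/slow.log';
SET GLOBAL long_query_time = 1;   -- log anything slower than 1 second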

The only potentially "slow" part of replying relates to sending emails and alerts for new replies. If you're using an external SMTP server to send email, we'd generally recommend switching to the default PHP mail option, which hands the message to a local MTA so the MTA can queue it; you can then have that MTA forward to your SMTP server of choice. Otherwise, PHP is talking (synchronously) to the SMTP server, which is not a speedy process. (This happens outside of a transaction, though.)
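For example, with Postfix the forwarding is a single relayhost line (sendmail has an equivalent SMART_HOST macro); the hostname below is just a placeholder:

Code:
# /etc/postfix/main.cf -- smtp.example.com stands in for your SMTP provider
relayhost = [smtp.example.com]:587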
 
The replication slave/standby server isn't running, so it can't be locking anything. The site was fine for much of the day, but now it's triple posting.

The three WWW instances are running sendmail already.

I'm considering changing hosting providers. Maybe their network is swamped, or there's some other technical issue.

No errors in the server log.
 
FWIW, I was writing the first part of that post before your responses regarding double posts. I was responding to the comments about redundancy and why the replication may have been triggering an issue.

Without an error in the log, we're really shooting in the dark. If it's something that can be reproduced consistently (e.g. when you reply to a thread at a certain time, it always seems to happen), then hooking up one of the PHP profilers to capture profiling output would let us see specifically where the issue is happening and whether PHP, MySQL, or something else is the bottleneck.
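As a rough sketch of what that could look like with XHProf (assuming the xhprof extension is installed; Xdebug's profiler would work just as well, and the file paths here are illustrative):

Code:
<?php
// profile_prepend.php -- minimal XHProf wrapper, loaded for every request via
// php.ini: auto_prepend_file = /path/to/profile_prepend.php
xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);

register_shutdown_function(function () {
    // Save the raw run somewhere the bundled xhprof_html viewer can read it
    $data = xhprof_disable();
    file_put_contents(
        sys_get_temp_dir() . '/' . uniqid('xf_') . '.xenforo.xhprof',
        serialize($data)
    );
});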
 