XF 1.4 1.4.1 upgrade issue

RyanC

Active member
since upgrading to 1.4.1 we're seeing a lot table locks.

http://screencast.com/t/lCfBOnCIt

This didn't exist before the upgrade...the problem we've had in tuning around this is that while tuning brings down load as soon as the board gets busy again it seems that everything returns to as it was and needs to be re-optimized.

This is a fairly large site that pre upgrade load was rarely over 5 on a 16 core machine early today we saw over 1200.

Any thoughts on getting this back to as it was?
 
"Sorting result" doesn't mean it's going to disk. It likely means a "filesort" which is just badly named: http://www.percona.com/blog/2009/03/05/what-does-using-filesort-mean-in-mysql/ You can confirm that with an EXPLAIN query though if you wish.

Beyond that, you may simply want to turn the online status indicator off or consider switching the xf_session_activity table to InnoDB. The status indicator requires reading it to view a thread, but it's a table that is written to in each page view. It's a MEMORY table by default so it uses a table lock but the trade off is that it doesn't touch the disk then (as InnoDB would need to do).
 
we can disable the online updates as we already converted the table asked about to InnoDB.


trying to understand the differences between 1.3.6 and 1.4.x social forum is still in the mix as its important to the users but we did try disabling from admincp


Here's an example of one of the slow queries. possible the xf_bb_code_parse_cache may be culprit. Checking.



# User@Host: vvvvv_user[vvvvv_user] @ localhost []

# Query_time: 6.073281 Lock_time: 0.000050 Rows_sent: 1 Rows_examined: 5

SET timestamp=1413410936;

SELECT post.*

,

bb_code_parse_cache.parse_tree AS message_parsed, bb_code_parse_cache.cache_version AS message_cache_version,

user.*, IF(user.username IS NULL, post.username, user.username) AS username,

user_profile.*,

signature_parse_cache.parse_tree AS signature_parsed, bb_code_parse_cache.cache_version AS signature_cache_version,

session_activity.view_date AS last_view_date,

0 AS like_date,

social_forum.logo_date,

social_forum.logo_width,

social_forum.logo_height,

social_forum.logo_crop_x,

social_forum.logo_crop_y,

social_forum.title AS social_forum_title,

social_forum_combination.cache_value AS secondary_social_forums

FROM xf_post AS post



LEFT JOIN xf_bb_code_parse_cache AS bb_code_parse_cache ON

(bb_code_parse_cache.content_type = 'post' AND bb_code_parse_cache.content_id = post.post_id)

LEFT JOIN xf_user AS user ON

(user.user_id = post.user_id)

LEFT JOIN xf_user_profile AS user_profile ON

(user_profile.user_id = post.user_id)

LEFT JOIN xf_bb_code_parse_cache AS signature_parse_cache ON

(signature_parse_cache.content_type = 'signature' AND signature_parse_cache.content_id = post.user_id)

LEFT JOIN xf_session_activity AS session_activity ON

(post.user_id > 0 AND session_activity.user_id = post.user_id)

LEFT JOIN xf_social_forum AS social_forum ON

(social_forum.social_forum_id = user_profile.primary_social_forum_id)

LEFT JOIN xf_social_forum_combination AS social_forum_combination ON

(social_forum_combination.social_forum_combination_id = user_profile.social_forum_combination_id)

WHERE post.thread_id = '941265'

AND (post.position >= 0 AND post.position < 30)

AND (post.message_state IN ('visible'))

ORDER BY post.position ASC, post.post_date ASC;
 
Last edited:
If you're seeing high load, you're naturally going to see more queries there and more contention. The sorting would just indicate the state it's in at the time and that's a state the query would always go through. You can learn more specifics by running an EXPLAIN. I was pointing out that a sorting state is very different from writing a temporary table (to memory or disk); it's a much simpler process, especially given that it's sorting no more than one page worth of posts.

Regardless, the rest of my previous post stands. The online status indicator in posts creates a lot more read activity against the session activity table and I laid out several ways that you may be able to adjust for this (including just disabling the feature).
 
Mike-

Was there any substantial query changes between 1.4.0 and 1.4.2? The only major changes were the upgrade of the XF suite.

We understand that the high load is a factor in the queries running for a long time, but there are many queries we haven't noticed before and generally very low disk I/O which is also strange given the time these queries are taking to run. Also a MySQL restart seems to help for a while, but the issue recurs. Is there anything else you can tell us in regards to how the queries have changed between versions?
 
Mike-

xf_bb_code_parse_cache came up quite a bit as the server was performing quite a few inserts against this table all at once. While this is most likely not the culprit, should this table have 8.3 million records or is there an option we can use to scale this back for testing?
 
The xf_bb_code_parse_cache table is used to cache some of the BB Code parse tree.

It can be disabled in Admin CP > Options > Performance: Cache BB Code output

It isn't new in 1.4. It has been around for a while. Since 1.2 IIRC.

When a page in a thread is loaded, any posts on that page which have not been previously cached will trigger an insert. It's possible that the inserts are slow due to poor I/O performance in MySQL or other delay which adds up with what else is being mentioned here.

The number of records in there seems consistent with the number of posts that RyanC's forum has.

I think the benefits of BB Code output caching are highest on more complex posts. It might be worth switching it off. Switching it off doesn't delete the existing data as far as I know so if there is no benefit found by switching it off or it makes things worse then it could be switched back on.
 
There have been effectively no query changes from 1.4.0 to 1.4.2. The only query change I can think of relates to the online user list and it was actually changed to remove a join.

The BB code cache was wholly untouched. You may not find it to be particularly beneficial given the potential write vs read ratio. You can trim it based on the "Remove data from BB code cache (days)". As implied by Chris, the benefit of the cache is in saving web-server on-thread-read load by caching the data in the DB. It may not be beneficial in your circumstance (we don't run it here).
 
Chris/Mike-

Thanks for the information. We're actively tuning the MySQL configuration as well. The installation had been performing well and we had not made any changes before the upgrade. We'll update you within 24 hours on the status of our tuning efforts.
 
Hello,

We've done quite a bit of sleuthing so far and found that there appears to be "bursts" of slow queries which are in sorting state. These queries do not appear to be hitting disk and do not appear to be caused by an attack. The queries run long because there is 50+ when looking at the active process list. Here's the query: http://screencast.com/t/yccYDz6k

We temporarily disabled the social media plugin referenced above which removed it from the hanging query but did not rectify the issue.

Is there any way we can tell why that query started running slow after the upgrade? We understand it could be an add on module issue as some modules don't fully support newer versions of XF immediately. However, we've disabled that plugin and still see issues. Sometimes the MySQL server can recover and keep moving, sometimes it has too many in the run queue and takes 10-15 minutes.

Any help is appreciated as we're running out of idea's here. The server ran with a very low TTFB for the last 100 days consistently until the upgrade.
 
Hello,

Also as a correction the issue started after an upgrade from 1.3.6 to 1.4.0. Can you please let us know of any substantial query changes and also update us on our previous post?
 
My previous posts have discussed the pertinent change to the query you've pointed out and several things that can be tried in regards to changing it (notably, disabling the online status indicator).
 
Hello,

The only difference in our question is what was changed between 1.3.6 and 1.4.x that could also introduce these queries? Also can you help validate that the queries that are hanging are XF only queries? (in the previous screenshot)
 
These queries aren't new. You're looking at the query that gets the posts for a page of a thread. A version of this query have been there since 1.0 and it has not fundamentally changed since then. Various elements have been added to the query over time, based on options configured (online status, BB code caches, etc) and add-ons.

The listed query is likely coming from XF itself, yes.
 
Mike-
Okay we will have a look in depth. The issue doesn't surface until 40-50 of these queries are running at the same time and MySQL is using 1200% CPU. There is virtually no disk I/O and the forum had been running well before. We'll test some other theories but we're running out of options at this time. Thank you.
 
Top Bottom