1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.

Fixed Query to fetch threads in a forum

Discussion in 'Resolved Bug Reports' started by digitalpoint, Dec 24, 2012.

  1. digitalpoint

    digitalpoint Well-Known Member

    The XenForo_Model_Thread::prepareThreadFetchOptions() method has some less than ideal joins and calculations going on...

    INNER JOINs are noticeably slower than LEFT JOINs (because they need to actually be looked at for exclusions). This gets *really* slow when you have a forum with hundreds (or thousands) of pages of threads... All the threads from previous pages need to actually perform the join in order to see if they are to be excluded from the list.

    In the case of this method, the INNER JOINs are really serving no purpose other than to do a double check on data integrity (don't show threads unless the node exists, don't show threads unless the forum exists, don't show the thread unless it has a first post, etc) Doing data integrity checks at the query level is really not worth the huge amount of resources needed when you start getting into high page numbers.

    The other issue in this method is SQL IF() statements are even more resource intensive than the JOINs because they also need to be evaluated for all threads in previous pages. This part in particular:

    IF(thread_user_post.user_id IS NULL, 0, thread_user_post.post_count) AS user_post_count
    I've modified the method so all the INNER JOINs are LEFT JOINs and changed the above IF() to just return the count regardless if it's NULL or not by changing it to this:

    thread_user_post.post_count AS user_post_count
    Doing this made the query to get threads in a forum at a very high page number go from 4-5 MINUTES to 2-3 seconds with no noticeable ill effects. The JOINS are just data integrity JOINs that really should never happen... and if something WAS wrong with a thread, wouldn't you want to know/see it? The user_post_count is used for hovering over the small avatar, and it works the same after the change.
  2. Adam Howard

    Adam Howard Well-Known Member


    Please fix this.

    * drools over improved optimization *
  3. Chris D

    Chris D XenForo Developer Staff Member

    Also confirmed.

    Before fix loading page 854:
    Timing: 15.4578 seconds Memory: 6.539 MB DB Queries: 17

    After fix:
    Timing: 1.1214 seconds Memory: 6.649 MB DB Queries: 17

    I have a question, however.

    Why didn't they use LEFT JOIN's instead of INNER in the first place?

    They did choose to use LEFT JOIN in some circumstances in this function, but others are INNER.

    I can only assume there must have been some sort of reason.

    Nice work though, Shawn. It clearly works and makes a huge difference.
  4. digitalpoint

    digitalpoint Well-Known Member

    I skimmed over it... it looks like they are using LEFT JOIN for records that may or may not be there (things like if the thread was read or not, if the forum was read or not, deletion logs, etc.)

    The INNER JOINs are used on stuff that *should* be there, but if parts are not there, you have a data integrity problem so the INNER JOIN essentially hides threads that have data integrity problems.

    I ended up using this method to extend the default XenForo_Model_Thread::prepareThreadFetchOptions() method... optimizes the query without needing to edit the files:

        public function prepareThreadFetchOptions(array $fetchOptions)
    $response parent::prepareThreadFetchOptions($fetchOptions);

    $response['joinTables'] = str_replace ('INNER JOIN''LEFT JOIN'$response['joinTables']);
    $response['selectFields'] = str_replace (array('IF(thread_user_post.user_id IS NULL, 0, thread_user_post.post_count) AS user_post_count''IF(user.username IS NULL, thread.username, user.username) AS username'), array('thread_user_post.post_count AS user_post_count''thread.username AS username'), $response['selectFields']);
    When it comes down to it, INNER JOIN should never be used on large sets of data, as it simply can't scale well (since you have to actually perform the JOIN on every record before the record you are actually looking at). IF() calculations don't scale well either because the records all need to be evaluated (and dumped to a temporary table) for every record up to the one you are looking at in case you are doing an evaluation of the result of the calculation.

    While there are cases where you can't avoid using INNER JOIN (hopefully just on small sets of data), using it for full data integrity checking every time you are getting records is just not a good idea if you want to be able to scale out.

    Maybe I'll put together an article (or add-on) that optimizes stuff like I did for vBulletin 4 (see this).
    thumped, Deebs, simbolo and 16 others like this.
  5. Mouth

    Mouth Well-Known Member

    That would be awesome!
  6. Adam Howard

    Adam Howard Well-Known Member

    yes, please. :love:
  7. CyclingTribe

    CyclingTribe Well-Known Member

    Ohhhhh yes, add-on please Shawn - save us all having to edit the core - can't wait!! (y)

    Shaun :D
  8. melbo

    melbo Well-Known Member

    This would be great!
    CyclingTribe likes this.
  9. Slavik

    Slavik XenForo Moderator Staff Member

    Also confirmed. Well spotted.
  10. Slavik

    Slavik XenForo Moderator Staff Member

    Hah, just came to suggest / do the same thing.

    How many optimisations have you found?
  11. digitalpoint

    digitalpoint Well-Known Member

    A decent number... But I haven't started actively looking yet, so just stuff I've run across randomly so far.
  12. Mike

    Mike XenForo Developer Staff Member

    I've made these changes, though they are slightly less correct, it's a situation that should virtually never happen (and if it did, it'd probably be caught elsewhere first) and if it gives more speed....

Share This Page