Slow Query

# Time: 120301 9:58:30
# User@Host: xfuser[xfuser] @ localhost []
# Thread_id: 477815 Schema: plusgsm_xf QC_hit: No
# Query_time: 16.573269 Lock_time: 0.000020 Rows_sent: 20 Rows_examined: 2371700
SET timestamp=1330606710;
SELECT user.*
,
user_profile.*,
user_option.*,
user_privacy.*
FROM xf_user AS user

INNER JOIN xf_user_profile AS user_profile ON
(user_profile.user_id = user.user_id)
INNER JOIN xf_user_option AS user_option ON
(user_option.user_id = user.user_id)
INNER JOIN xf_user_privacy AS user_privacy ON
(user_privacy.user_id = user.user_id)
WHERE (user.user_state = 'valid') AND (user.is_banned = 0)
ORDER BY user.username
LIMIT 20 OFFSET 474120;
 
You're joining four tables together and running some conditions on the result as well as a sort. I would expect that to be relatively slow in any case and really slow if data is not cached. Try running an EXPLAIN SELECT against that query and see what MySQL tells you.
 
SET timestamp=1330606710;
SELECT user.*
user_profile.*,
user_option.*,
user_privacy.*;

I am not sure why you would ever want to pull back all of the fields from the four tables, but Ok, whatever.
If you have a large user table you need to create a temp table before joining it.

Code:
SELECT    user.*,
    xf_user_profile.*,
    xf_user_option.*,
    xf_user_privacy.*
FROM    (
    select  xf_user.*
    FROM    xf_user
    WHERE    xf_user.is_banned = 0
        AND xf_user.user_state = 'valid'
    LIMIT    20 OFFSET 4710
    ) AS user
LEFT JOIN    xf_user_profile
        ON user.user_id = xf_user_profile.user_id
LEFT JOIN    xf_user_option
        ON user.user_id = xf_user_option.user_id
LEFT JOIN    xf_user_privacy
        ON user.user_id = xf_user_privacy.user_id;

The way you wrote your query, it was taking about 12 seconds to process, by using the temp table, it should take less than 1/10 of a second.

Best of luck to you.
 
Actually I did not write this query, it came in xenforo .. As I leave enabled the slow-log mysql did this result ... My big problem now is to locate where it is.

My user table has more than 470,000 records with seven custom fields
 
Your mysql is probably misconfigured (likely not using anywhere near as much RAM as it should)

That kind of query should be nearly instant even on a table that size. The query itself seems perfectly optimised
 
first query 9.52 seconds
second query 0.02 seconds
xf_user - 475,275 rows - 77mb
xf_user_filed - 2,850,249 rows - 108mb
xf_user_option - 475,275 rows - 13,7mb
xf_user_privacy - 475,275 rows - 9,2mb

is the server configuration or query that is poorly designed for a larger volume?
 
others query from : http://xenforo.com/community/members/
  1. SELECT user.*

    FROM xf_user AS user

    WHERE (user.user_state = 'valid') AND (user.is_banned = 0)
    ORDER BY user.register_date DESC
    LIMIT 8
    Run Time: 1.038485
    Select TypeTableTypePossible KeysKeyKey LenRefRowsExtra
    SIMPLE user ref user_state user_state 1 const 475076 Using where; Using filesort
  2. SELECT user.*

    FROM xf_user AS user

    WHERE (user.user_state = 'valid') AND (user.is_banned = 0)
    ORDER BY user.message_count DESC
    LIMIT 12
    Run Time: 1.051779
    Select TypeTableTypePossible KeysKeyKey LenRefRowsExtra
    SIMPLE user ref user_state user_state 1 const 475076 Using where; Using filesort

is it really necessary to select one of all the fields in the users table to only display the number of posts the user, or the registration date?

anyone who understands the basics of programming that makes querying know just what you want to use, generating queries faster even with a greater volume of data
 
my.cnf
[mysqld]
datadir=/mysql/mysqldata
socket=/mysql/mysqldata/mysql.sock
long_query_time=3
slow_query_log
#log-queries-not-using-indexes

safe-show-database
back_log = 50
skip-innodb
max_connections = 200
key_buffer_size = 1536M
myisam_sort_buffer_size = 256M
myisam_max_sort_file_size = 2048M
join_buffer_size = 2M
read_buffer_size = 2M
sort_buffer_size = 2M
table_definition_cache =4000
table_open_cache = 4000
thread_cache_size = 256
wait_timeout = 300
connect_timeout = 10
tmp_table_size =256M
max_heap_table_size =256M
max_allowed_packet = 128M
max_seeks_for_key = 1000
group_concat_max_len = 1024
max_length_for_sort_data=1024
net_buffer_length = 16384
max_connect_errors = 10000
concurrent_insert = 2
read_rnd_buffer_size = 4M
bulk_insert_buffer_size = 8M
query_cache_limit = 10M
query_cache_size = 256M
query_cache_type = 1
query_prealloc_size = 262144
query_alloc_block_size = 65536
transaction_alloc_block_size = 8192
transaction_prealloc_size = 4096
default-storage-engine = MyISAM
max_write_lock_count = 4
max_delayed_threads=0
thread_concurrency=8


[mysqlcheck]
socket=/mysql/mysqldata/mysql.sock
#nice = -5
#open-files-limit = 8192

[mysqldump]

socket=/mysql/mysqldata/mysql.sock
quick
max_allowed_packet = 128M

[myisamchk]
datadir=/mysql/mysqldata
socket=/mysql/mysqldata/mysql.sock
key_buffer_size = 1560M
sort_buffer_size = 16M
read_buffer_size = 16M
write_buffer_size = 16M
 
Well nothing majorly wrong there that I can see apart from the lack of innodb, which doesn't appear to the be cause of this issue (going by lock_time)

It's a nasty filesort from the order by user.username, perhaps try increasing key_buffer_size if it's not big enough (with 2 million users it's quite likely)
 
The advantage of creating the temp table is that since you only need twenty records, you aren't creating massive hash table lookups, matching all of the user id's. It is a badly written query for large user tables.

That's likely the query cache kicking in
No. The query I wrote limits things to 20 records.

My big problem now is to locate where it is.
it comes from the /library/XenForo/Model/User.php
Line #413
public function prepareUserFetchOptions(array $fetchOptions)
They wrote a factory function to join the tables, depending on fetchOptions.
Which will cause some queries to generate ugly, ugly joins.

However on line #429 the comment states:
// TODO: optimise the join on user_option with serialization to user or user_profile

So Kier and Mike are aware of the issue they just haven't addressed it yet.
 
The advantage of creating the temp table is that since you only need twenty records, you aren't creating massive hash table lookups, matching all of the user id's. It is a badly written query for large user tables.


No. The query I wrote limits things to 20 records.


it comes from the /library/XenForo/Model/User.php
Line #413
public function prepareUserFetchOptions(array $fetchOptions)
They wrote a factory function to join the tables, depending on fetchOptions.
Which will cause some queries to generate ugly, ugly joins.

However on line #429 the comment states:
// TODO: optimise the join on user_option with serialization to user or user_profile

So Kier and Mike are aware of the issue they just haven't addressed it yet.

You can't limit things to 20 records and have the order by username any more efficiently than is already there

Unless mysql does something silly like performing the joins and then doing the order by, but I would expect it to optimise that out. If that's not the case then fair enough with regards to your query
 
The joins are fine - they're pkey based. It's not directly the joins that's slow - it's a full table scan query. Unfortunately, MySQL (pre-5.6) is rather stupid when it comes to how it executes this query, by only applying the limit at the end. 5.6 has optimizations for this AFAIK, though that will just limit the data.

So while retrieving the data does slow it down, ultimately it still has to go through hundreds of thousands of rows. Early pages will be fast because of the index on username, so it can just stop when it finds what it needs.

But really, this is just a great example of why the member list is junk. Aside from being completely useless with that many users, it becomes rather easy to end up with queries like this.
 
But really, this is just a great example of why the member list is junk. Aside from being completely useless with that many users, it becomes rather easy to end up with queries like this.

Are you saying a large size community should disable/remove member list?
 
Might be an idea to have an option to control what groups appear on it if there's not already, would make it actually useful on large sites and would allow for a much more efficient query
 
Are you saying a large size community should disable/remove member list?
Probably, as it's not useful anyway. :)

Might be an idea to have an option to control what groups appear on it if there's not already, would make it actually useful on large sites and would allow for a much more efficient query
That will actually make it worse, as right now MySQL can use the username index to help it (in the earlier pages). But that version would have to query the group relation table to determine if a user is to be shown... though I suppose if MySQL started with that table and the result was suitably small, it could be faster.
 
Unfortunately, MySQL (pre-5.6) is rather stupid when it comes to how it executes this query, by only applying the limit at the end. 5.6 has optimizations for this AFAIK, though that will just limit the data.
Considering most people are stuck on MySQL 5.1; we need to help those people out.

Originally, I didn't notice we were sorting on xf_user.username. Sorting on the username is going to cause a filesort, it is, what it is. And because of the way it must be done, MySQL is going to scan the hash twice; during the where clause and the value pairs.
Use show session status like "Handler_read%"; to confirm the # of reads.

However that isn't the real issue, the performance hit can really come if there isn't enough memory (sort_buffer_size) and MySQL needs to write to disk during the scan.

SO.........
Change the structure of the database to have
Key(user_id, username_id)
OR
Write a better query; without a need to change the structure of the table.

Code:
SELECT    squser.*,
    xf_user.*,
    xf_user_profile.*,
    xf_user_option.*,
    xf_user_privacy.*
FROM    (
    select  xf_user.user_id
    FROM    xf_user
    WHERE    xf_user.is_banned = 0
        AND xf_user.user_state = 'valid'
    ORDER BY xf_user.username
    LIMIT    20 OFFSET 470000
    ) AS squser
LEFT JOIN    xf_user
        ON squser.user_id = xf_user.user_id
LEFT JOIN    xf_user_profile
        ON squser.user_id = xf_user_profile.user_id
LEFT JOIN    xf_user_option
        ON squser.user_id = xf_user_option.user_id
LEFT JOIN    xf_user_privacy
        ON squser.user_id = xf_user_privacy.user_id;

Don't take my word for it. Test it. It works. The small penalty for using the sub-query is more than made up for it when doing deep page offsets.

For reference see:
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html : Description of how file sort works
http://www.codinghorror.com/blog/2009/06/all-abstractions-are-failed-abstractions.html
http://www.facebook.com/note.php?note_id=206034210932
http://www.xarg.org/2011/10/optimized-pagination-using-mysql/
They can explain things better than I can.
 
Are you saying a large size community should disable/remove member list?
Imagine if Facebook had some master memberlist you could page through. Sorting ~1,000,000,000,000 records in order to output a page of 20 users. :)

The memberlist is more or less useless for all but the smallest sites. When was the last time you used it for something productive like finding a member that you couldn't remember their username? The #1 use for it on my site was spam bots using it to harvest all our usernames so it could later go and try to send mass PM spam.
 
Also something I've always wondered, it would almost be preferable to display the members page without any ordering (just in order of id) - that way there wouldn't be a page that consisted of nothing but members whose names start with obscure symbols which looks terrible to the eye.

And the user group filter would be great too, maybe even make it a frontend thing defaulting to (configurable) admins/moderators etc, because unless I'm mistaken there's no easy way of finding a list of staff besides staff online now.
 
Top Bottom