Performance Hit with Nested SQL Selects?

Jaxel

Well-known member
So I have a query, its a single query, so it doesn't take a lot of resources:
Code:
SELECT EWRgrado_ranks.*, xf_user.*, EWRgrado_ladders.*
FROM EWRgrado_ranks
    RIGHT JOIN xf_user ON (xf_user.user_id = EWRgrado_ranks.user_id)
    LEFT JOIN EWRgrado_ladders ON (EWRgrado_ladders.ladder_id = EWRgrado_ranks.ladder_id)
WHERE EWRgrado_ranks.rank_active = 1
ORDER BY EWRgrado_ranks.rank_score DESC

However, something I would like to do is determine the position of each row in the table. So I add a small nested select statement to my SQL query.
Code:
SELECT EWRgrado_ranks.*, xf_user.*, EWRgrado_ladders.*,
    (SELECT COUNT(*) FROM EWRgrado_ranks AS ranks WHERE ranks.rank_score > EWRgrado_ranks.rank_score) AS position
FROM EWRgrado_ranks
    RIGHT JOIN xf_user ON (xf_user.user_id = EWRgrado_ranks.user_id)
    LEFT JOIN EWRgrado_ladders ON (EWRgrado_ladders.ladder_id = EWRgrado_ranks.ladder_id)
WHERE EWRgrado_ranks.rank_active = 1
ORDER BY EWRgrado_ranks.rank_score DESC

So technically this is a single query. But I wonder if this code has a major performance hit because I think it may actually be an extra query PER ROW. So if the script returns 50 rows, it has the performance hit of 51 queries. Am I right in this assumption?

Would there be a better way to get what I'm looking for?
 
The number of queries means nothing, generally, but if that does indeed generate an additional 50 queries, then I'd suggest it's a bad idea. And your assumption probably is correct as technically it's "50" queries, even if they are nested.

I don't fully understand what you're trying to do of course, but generally XenForo would opt for doing something at save time or similar, rather than during run time.

So at a moment in the code where the position you're after would change (maybe in the DataWriter after a certain field has been updated or a record is added or similar event), they would perform the query that you're trying to nest and populate that data to an appropriate place.

That means the data is always available with no additional overhead.
 
That query will not scale well at all. The way you have it set up, it more or less can be considered an extra query (as far as resources are concerned) for every row returned. A better way to do it would be with MySQL user variables.

First set the variable with this query (this has more or less no overhead because this query isn't actually searching or hitting any tables):
Code:
SET @position = 0
Adjust the initial value as needed (for example if you are using a LIMIT).

Then your query above would be written as:
Code:
SELECT EWRgrado_ranks.*, xf_user.*, EWRgrado_ladders.*,
   @position:=@position+1 AS position
FROM EWRgrado_ranks
    RIGHT JOIN xf_user ON (xf_user.user_id = EWRgrado_ranks.user_id)
    LEFT JOIN EWRgrado_ladders ON (EWRgrado_ladders.ladder_id = EWRgrado_ranks.ladder_id)
WHERE EWRgrado_ranks.rank_active = 1
ORDER BY EWRgrado_ranks.rank_score DESC
 
Last edited:
It's also possible to set mysql variables directly in the query:
Code:
SELECT IF(@position, @position:=@position+1, @position:=1) AS position
 
Indeed... but if you want to min/max performance, you may want to do it with the two queries... IF() statements don't perform fantastically within SQL queries, and there's no real reason to dip into the IF() logic for each and every row returned.
 
Top Bottom