As designed Redundant index coverage on xf_post table

Xon

Well-known member
Affected version
1.x
xf_post has the following indexes related to post_date or thread_id
Code:
        KEY thread_id_post_date (thread_id, post_date),
        KEY thread_id_position (thread_id, position),
        KEY post_date (post_date)
Semi-modern versions of MySQL can do partial index matches, which means there are 2 indexes which can match thread_id, and 2 indexes which can match post_date if thread_id is available.
ie
with post_date column match post_date index
with post_date, thread_id column match post_date or thread_id_post_date index
with thread_id column match thread_id_position or thread_id_post_date index

Dropping post_date and reordering columns:
Code:
        KEY post_date_thread_id (post_date, thread_id),
        KEY thread_id_position (thread_id, position),
ie
with post_date column match post_date_thread_id index
with post_date, thread_id column match post_date_thread_id index
with thread_id column match thread_id_position or thread_id_post_date index.

This should improve insert performance (less indexes to update), and slightly improve diskspace usage.

This affects XF2 as well.
 
Which? I can see maybe MySQL 5.5 in which case this wouldn’t be applied to XF 1.x.
This stack overflow answer references MySQL 5.1, but after some digging I found an old copy of MySQL 5.0/5.1 manual text with the following:

If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on(col1), (col1, col2), and (col1, col2, col3).
http://download.nust.na/pub6/mysql/doc/refman/5.0/en/mysql-indexes.html
http://download.nust.na/pub6/mysql/doc/refman/5.1/en/mysql-indexes.html

What I think has changed, is later versions of MySQL/Mariadb can use the index even if the left-most prefix doesn't match a WHERE clause if the query analyzer estimates it will reduce the search space
 
The 5.7 version of that page still references left-most prefixes: https://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html There is an index merge optimization (5.6+) but that wouldn't really apply to the queries in question there.

Indexes are generally stored as a B-tree, so there isn't a way to search the second part of the tree in order without traversing all of the first parts individually. For these indexes, there are two primary types of queries we use:
  • WHERE post_date > ? AND post_date < ? (for stats)
  • WHERE thread_id = ? ORDER BY post_date (often for rebuilding threads, but also for getting the most recent things, etc)
Both of these generally involve ranges which still require the left most prefixing. It's not really explicitly stated, but how indexes are used for ranges is explained here: https://dev.mysql.com/doc/refman/5.7/en/range-optimization.html The multi-part indexes example demonstrates this.

Essentially, to run both of these queries efficiently, the current index setup is needed.
 
Top Bottom