• This forum has been archived. New threads and replies may not be made. All add-ons/resources that are active should be migrated to the Resource Manager. See this thread for more information.

Rebuilding the Search Index for a Large Site - Experts Only

Kier

XenForo developer
Staff member
In order to provide multi-content search without locking tables, XenForo maintains a separate search index table with a MySQL FullText index. Sometimes it is necessary to rebuild that index, such as when major changes to MySQL configuration are made (like changing the minimum searchable word length) or after completing an import of threads and posts from another system.

The search index rebuilding process is intensive and can take quite a while for large sites, but there are a few things that you can do to speed it up if you know what you are doing.

The first thing to do is to increase the number of items to be processed in each rebuild cycle. The default is 500 items, but with a fast server you may see a speed boost by bumping that number up to 5,000 or even 10,000 items per page.

Screen shot 2011-01-27 at 07.54.35.webp
Secondly, and this will have a big impact, is dividing up the work that MySQL has to do. The search index rebuild process must do two jobs - generating the search index data from the original content, and then inserting that data into the search index table. Normally, part of this insert process involves MySQL generating its own FullText index for the search index data, but you can defer this process until later by temporarily disabling the FullText index.

To do this, you will need to run a query manually to disable the FullText index before you start the rebuild process from the XenForo Admin Control Panel. The query is as follows:
Code:
ALTER TABLE xf_search_index DISABLE KEYS;
Having run this query, you can start the rebuild process from the XenForo control panel.

After the rebuild process has completed, you must turn the FullText index back on before search will work. This will require another query to be run manually:
Code:
ALTER TABLE xf_search_index ENABLE KEYS;
This query will take quite a while to complete, especially on a large database, but when it's done, search will be ready to use.

To get some idea of how long the query might take, I recently ran the query on a 5.5 million post database on a test machine here (Core 2 Quad 2.4GHz, 8GB RAM, average speed SATA hard drive) and the ENABLE KEYS query took just over two hours to complete.

Screen shot 2011-01-27 at 08.04.47.webp

It is important to note that if you forget to re-enable the FullText index using ENABLE KEYS, the search system on your site will not work at all, which is why we recommend disabling keys to experts only.
 
Great article, I will be sure to include it in our administration documentation.

I've closed the site, dumped the database to .sql via command line. Logged in to MySQL and selected the database. I've disabled the keys and rebuild from admin.php > search index. 750k posts went by pretty fast as expected. The server load went up to 4, not 40. An obvious improvement.

The next step is to enable keys again. 250gb SATA2 7200rpm drive. Quad Core cpu with 2GB system ram. *runs query*
 
Oh well ...

and compared to all that it takes like 3 minutes to re-index a 2 million post database with Sphinx
indexing index 'xenforo_post'...
collected 2155544 docs, 1112.4 MB
sorted 102.4 Mhits, 100.0% done
total 2155544 docs, 1112440510 bytes
total 133.438 sec, 8336745 bytes/sec, 16153.87 docs/sec

indexing index 'xenforo_thread'...
collected 191810 docs, 8.6 MB
sorted 1.2 Mhits, 100.0% done
total 191810 docs, 8574031 bytes
total 10.425 sec, 822383 bytes/sec, 18397.57 docs/sec

:)
 
thanks Kier, that way is way faster here than the usual one without disabling keys.
Test yesterday: ~4 hours to generate the table (after changing to min word length=3)
Test today: <10 minutes to generate the table & 4 min 37.45 sec to reenable the keys.
Discussions:89k & Messages:687k , both times 'Items to process the page' = 5000
Testserver is a i7-920 Quad-Core with (just) 2GB RAM dedicated to a xen-instance, usual SATA-II HDD RAID1
 
.. and done. Wfew! That was a lot better experience than the previous time I tried to do disable/enable. What a frustrating experience, but I am glad it seems to have ended ok.
 
We can do the search index rebuilding on an open/live forum when searching is temporary disabled right? Because this seems a better solution for us since we can have the forums online again faster after an import. Then we can just explain to members searching is temporary unavailable due to maintenance.
 
We can do the search index rebuilding on an open/live forum when searching is temporary disabled right? Because this seems a better solution for us since we can have the forums online again faster after an import. Then we can just explain to members searching is temporary unavailable due to maintenance.
Yea, that's correct.
 
Just did a test import on my SQL server, specs:
  • 20GB RAM
  • Percona MySQL 5.5-15
  • Dual CPU dual core, Intel 5150 2.66ghz
  • Raid 5, 3 x 15k RPM 146GB SAS drives
  • 4 million posts
Table details:
-rw-rw---- 1 mysql mysql 8.7K 2011-08-20 10:26 xf_search_index.frm
-rw-rw---- 1 mysql mysql 1.4G 2011-09-08 18:28 xf_search_index.MYD
-rw-rw---- 1 mysql mysql 1.8G 2011-09-08 18:44 xf_search_index.MYI

Total time :
mysql> ALTER TABLE xf_search_index ENABLE KEYS;
Query OK, 0 rows affected (14 min 22.82 sec)

One thing to note, ALTER TABLE is single threaded.
 
The search index rebuilding process is intensive and can take quite a while for large sites, but there are a few things that you can do to speed it up if you know what you are doing.
Would it be smart to put a link to this thread in the Xenforo Manual ?
http://xenforo.com/help/importing/

Finalizing the Import

A required action once the import is complete is to rebuild the search index. Failure to do so will make it impossible for searches to be performed on the imported data. To rebuild it, click on the Tools tab and then the Rebuild Caches icon. Depending on the number of posts and quantity of data, this could take a long time. Larger sites may benefit reading about ways to speed up rebuilding the search index.


 
One thing to note, ALTER TABLE is single threaded.
Quite a lot, if not all mySQL actions are. Its the one thing I wish they would change, but I would surmise that changing mySQL to support threading would be quite a challenge, although I'd have thought a reindexing operation would be one of the easier ones to do.

But multi-core is the way of the future for a while and single thread performance usually isn't as good as you think.
 
Quite a lot, if not all mySQL actions are. Its the one thing I wish they would change, but I would surmise that changing mySQL to support threading would be quite a challenge, although I'd have thought a reindexing operation would be one of the easier ones to do.

But multi-core is the way of the future for a while and single thread performance usually isn't as good as you think.

ALTER TABLE can become multithreaded if you use partitions on the table being altered but yes I agree, multithreaded operations would be a good thing to be implemented.
 
I ran the first process in about an hour:

ALTER TABLE xf_search_index DISABLE KEYS;


Then I ran the process.

When it came time to re-enable the keys, it processed for about 3 1/2 hours then the load spiked to 30 (dual-core server) and my MySQL server reset. I repaired the table and left the search feature disabled.

Is it safe to attempt to re-enable keys at a later date?
 
Top Bottom