1. This site uses cookies. By continuing to use this site, you are agreeing to our use of cookies. Learn More.
  2. This forum has been archived. New threads and replies may not be made. All add-ons/resources that are active should be migrated to the Resource Manager. See this thread for more information.

Rebuilding the Search Index for a Large Site - Experts Only

Discussion in 'Tips and Guides [Archive]' started by Kier, Jan 27, 2011.

  1. Kier

    Kier XenForo Developer Staff Member

    In order to provide multi-content search without locking tables, XenForo maintains a separate search index table with a MySQL FullText index. Sometimes it is necessary to rebuild that index, such as when major changes to MySQL configuration are made (like changing the minimum searchable word length) or after completing an import of threads and posts from another system.

    The search index rebuilding process is intensive and can take quite a while for large sites, but there are a few things that you can do to speed it up if you know what you are doing.

    The first thing to do is to increase the number of items to be processed in each rebuild cycle. The default is 500 items, but with a fast server you may see a speed boost by bumping that number up to 5,000 or even 10,000 items per page.

    Screen shot 2011-01-27 at 07.54.35.png

    Secondly, and this will have a big impact, is dividing up the work that MySQL has to do. The search index rebuild process must do two jobs - generating the search index data from the original content, and then inserting that data into the search index table. Normally, part of this insert process involves MySQL generating its own FullText index for the search index data, but you can defer this process until later by temporarily disabling the FullText index.

    To do this, you will need to run a query manually to disable the FullText index before you start the rebuild process from the XenForo Admin Control Panel. The query is as follows:
    ALTER TABLE xf_search_index DISABLE KEYS;
    Having run this query, you can start the rebuild process from the XenForo control panel.

    After the rebuild process has completed, you must turn the FullText index back on before search will work. This will require another query to be run manually:
    ALTER TABLE xf_search_index ENABLE KEYS;
    This query will take quite a while to complete, especially on a large database, but when it's done, search will be ready to use.

    To get some idea of how long the query might take, I recently ran the query on a 5.5 million post database on a test machine here (Core 2 Quad 2.4GHz, 8GB RAM, average speed SATA hard drive) and the ENABLE KEYS query took just over two hours to complete.

    Screen shot 2011-01-27 at 08.04.47.png

    It is important to note that if you forget to re-enable the FullText index using ENABLE KEYS, the search system on your site will not work at all, which is why we recommend disabling keys to experts only.
    Eagle, Hoffi, DeltaHF and 5 others like this.
  2. Floris

    Floris Guest

    Great article, I will be sure to include it in our administration documentation.

    I've closed the site, dumped the database to .sql via command line. Logged in to MySQL and selected the database. I've disabled the keys and rebuild from admin.php > search index. 750k posts went by pretty fast as expected. The server load went up to 4, not 40. An obvious improvement.

    The next step is to enable keys again. 250gb SATA2 7200rpm drive. Quad Core cpu with 2GB system ram. *runs query*
    Mikey and Kier like this.
  3. mlx

    mlx Well-Known Member

    Oh well ...

    and compared to all that it takes like 3 minutes to re-index a 2 million post database with Sphinx
    Vladislav Rastrusny likes this.
  4. high1976

    high1976 Active Member

    thanks Kier, that way is way faster here than the usual one without disabling keys.
    Test yesterday: ~4 hours to generate the table (after changing to min word length=3)
    Test today: <10 minutes to generate the table & 4 min 37.45 sec to reenable the keys.
    Discussions:89k & Messages:687k , both times 'Items to process the page' = 5000
    Testserver is a i7-920 Quad-Core with (just) 2GB RAM dedicated to a xen-instance, usual SATA-II HDD RAID1
    Kier likes this.
  5. Floris

    Floris Guest

    .. and done. Wfew! That was a lot better experience than the previous time I tried to do disable/enable. What a frustrating experience, but I am glad it seems to have ended ok.
    Andy.N and Kier like this.
  6. dutchbb

    dutchbb Well-Known Member

    We can do the search index rebuilding on an open/live forum when searching is temporary disabled right? Because this seems a better solution for us since we can have the forums online again faster after an import. Then we can just explain to members searching is temporary unavailable due to maintenance.
  7. Kier

    Kier XenForo Developer Staff Member

    Yea, that's correct.
  8. Deebs

    Deebs Well-Known Member

    Just did a test import on my SQL server, specs:
    • 20GB RAM
    • Percona MySQL 5.5-15
    • Dual CPU dual core, Intel 5150 2.66ghz
    • Raid 5, 3 x 15k RPM 146GB SAS drives
    • 4 million posts
    Table details:
    -rw-rw---- 1 mysql mysql 8.7K 2011-08-20 10:26 xf_search_index.frm
    -rw-rw---- 1 mysql mysql 1.4G 2011-09-08 18:28 xf_search_index.MYD
    -rw-rw---- 1 mysql mysql 1.8G 2011-09-08 18:44 xf_search_index.MYI

    Total time :
    One thing to note, ALTER TABLE is single threaded.
    Walter likes this.
  9. Digital Doctor

    Digital Doctor Well-Known Member

    Would it be smart to put a link to this thread in the Xenforo Manual ?

  10. MGSteve

    MGSteve Well-Known Member

    Quite a lot, if not all mySQL actions are. Its the one thing I wish they would change, but I would surmise that changing mySQL to support threading would be quite a challenge, although I'd have thought a reindexing operation would be one of the easier ones to do.

    But multi-core is the way of the future for a while and single thread performance usually isn't as good as you think.
  11. Deebs

    Deebs Well-Known Member

    ALTER TABLE can become multithreaded if you use partitions on the table being altered but yes I agree, multithreaded operations would be a good thing to be implemented.
  12. Douglas Taylor

    Douglas Taylor Active Member

    I ran the first process in about an hour:

    Then I ran the process.

    When it came time to re-enable the keys, it processed for about 3 1/2 hours then the load spiked to 30 (dual-core server) and my MySQL server reset. I repaired the table and left the search feature disabled.

    Is it safe to attempt to re-enable keys at a later date?
  13. Douglas Taylor

    Douglas Taylor Active Member

    Turns out that my problem was that my /tmp was way too small to handle it.

Share This Page