Migrating a large forum to XenForo (test migration)

Deebs

The following describes my experience of migrating to XenForo (in a test environment, to 1.1 beta 2).


Hardware/software specifications:


SQL Server:

  • 2 x dual-core Intel 2.6GHz
  • 20GB RAM
  • Several 146GB Enterprise SAS drives in RAID 5 and RAID 1 configurations
  • Ubuntu 11.04 64-bit
  • Percona-Server-5.5.15-rel21.0, custom compiled

Web Server:

  • 1 x quad-core AMD Opteron 2.3GHz
  • 4GB RAM
  • Ubuntu 11.04 64-bit
  • PHP 5.3.5
  • Nginx v1.06, custom compiled, using PHP-FPM


The source forums are running vBulletin 3.8.6 along with vBSEO and several other mods, many of which I will not be migrating.


Steps:
  1. Create a new virtual host to hold the new migrated forums. One thing to note: make sure the virtual host is on the same server as the vBulletin installation, as this makes it easier to migrate the attachments and custom avatars. I configured the virtual host to support "Friendly URLs" before I started installing any software. I also ensured that it was firewalled off to only my IP.
  2. Create a new database to hold the migrated forums.
  3. Install XenForo on the new virtual host.
  4. Redirect all traffic for the vBulletin site to a holding page using HTTP status code 503. This basically tells spiders there is a technical difficulty and to try again later.
  5. Edit the vBulletin configuration file and change the MySQL username to something random. This ensures that if someone/something got past my redirect they could not make any changes to the vBulletin database.
  6. Create a backup of the vBulletin database. I have done many imports from vB to XF and have never encountered a problem, but to be safe I still take a backup I can revert to, in case anything (most likely me) does something to the live database.
  7. Log in to the XF admin CP and start the import process. Things run extremely smoothly up until the post/thread import. Here comes the fun part: on boards with millions of posts, threads etc. (and private messages), this takes a massive amount of time. In my previous test imports this step alone was taking around 13 hours!
So this was the crucial point, and time to put the secret weapon to work: a multi-process vBulletin importer! After a long conversation with Kier & Mike I was given access to a new importer. This is no ordinary importer; for a start, it requires shell access to your server. That's right, you have to run a command from the CLI (command-line interface).

After a few plays with it I noticed that the import time did drop dramatically, but I wanted more, and I knew I could get more because the PHP processes were not being properly scheduled for whatever reason. Several file changes later (submitted back to Kier & Mike), I had the script assigning each process to a processor in round-robin fashion. This meant I could load up the CPUs in the web server heavily. One warning: the load on the SQL server shot up dramatically, mostly from extremely high I/O.

The other thing I recommended was that the script be run inside screen (think of a terminal that keeps running even if you disconnect), so that I could wander off and not worry about whether my connection had died. The good news is that if you have root shell access to your web server, the importer will ask you some questions and perform all of the magic above, apart from starting screen. Some timings:


Using the stock importer:



Imported 242,362 items. (13 hours 58 minutes 53.09 seconds)


Using the new and improved CLI importer:


Start: 19:10:58 Approximately 246,300 threads remaining to import.

Stop: 21:21:30
Using 4 cores, 8 PHP processes, each process was using around 32% of the core.
Total time: 2hrs, 11mins or so
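As a quick sanity check on the two sets of timings above, here is a minimal Python sketch that works out the elapsed time of the CLI run and the speed-up over the stock importer. The numbers are copied from this post; the variable and function names are just mine for illustration, and the item counts differ slightly between the two runs, so treat the rates as rough.

```python
from datetime import datetime, timedelta

# Figures copied from the timings in the post.
stock_duration = timedelta(hours=13, minutes=58, seconds=53.09)
stock_items = 242_362

cli_start = datetime.strptime("19:10:58", "%H:%M:%S")
cli_stop = datetime.strptime("21:21:30", "%H:%M:%S")
cli_duration = cli_stop - cli_start  # same-day run, so plain subtraction works
cli_threads = 246_300  # "approximately", per the importer output

# Dividing one timedelta by another gives a plain float ratio.
speedup = stock_duration / cli_duration


def per_minute(count, duration):
    """Average number of items handled per minute over the whole run."""
    return count / (duration.total_seconds() / 60)


if __name__ == "__main__":
    print(f"CLI run took {cli_duration}")
    print(f"Speed-up over the stock importer: {speedup:.1f}x")
    print(f"Stock: {per_minute(stock_items, stock_duration):.0f} items/min")
    print(f"CLI:   {per_minute(cli_threads, cli_duration):.0f} threads/min")
```

The CLI run works out at 2 hours 10 minutes 32 seconds, which makes it roughly 6.4 times faster than the stock importer for this step.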

Notice that the number of threads had also increased by approximately 4,000. After the thread/post import I carried on with the rest of the import but DID NOT click the "Complete import" button. Next up was checking my new XF site. Viewing the homepage, I could see that everything was where it should be, but the layout needed changing to get a better look under XenForo. So on to the next set of tasks: the tidy-up. Btw, instead of taking around 14 hours to get to this point (based on several test imports), I was here in under 3 hours.


Under vBulletin I had a large number of groups. I culled many of them prior to the migration but could not rid myself of them all, that is, until now. Sorting out my permissions took around an hour, and the private node function within XenForo was an absolute godsend. I have many private forums, and just being able to mark the node (forum) as private and then grant explicit permissions is heavenly. One of the main gotchas that caught me under vBulletin was that if you create a new group and forget to customise its permissions, it defaults to having access to all forums. Oops.

So the permissions are done. What next? I won't bore you with the details of installing the mods, but that was my next step, including uploading Kier's excellent vBulletin redirection scripts. After installing several mods it was time to move on to reorganising the forums. This involved moving many threads around and, having done many practice migrations before, it was a breeze using simple MySQL statements (which I had recorded during my test migrations). Once I was satisfied all looked well, it was time to import my style. The moment I saw Themesinc.com I knew the style I wanted to base mine upon: XenFracture. Installation was again a breeze. So far things were looking extremely good, and nearly everything was complete. Time taken so far: around 4 hours.

Next up was configuring the XenForo installation to mimic my test installation, which my users have had access to since around April this year. Basically a copy-and-paste piece of work. The finish line was nearly in sight. Next was creating the forum signup questions and answers. Finally the penultimate task had arrived: rebuilding the caches. I had installed the Sphinx search add-on but realised it only indexed certain content types. This was no good, so I decided to fall back on MySQL fulltext search. Time to rebuild. First I disabled the keys on the table xf_search_index, then started the search cache rebuild using a value of 10000 items and a 0 second delay. Once it completed I ran "ALTER TABLE xf_search_index ENABLE KEYS;" and watched MySQL try to melt the disks. After around 15 minutes the query completed. I was ready to take the redirection page down.

In case I needed to roll back, I made a complete backup of my vBulletin web installation; then, following a little configuration work, I took the redirect off. We were back on the air after 5 hours of migration time.

The migration was quite nerve-racking: here was a community that had been online since before 2000 and had been using vBulletin since that time, and it was migrating to a completely new software solution. I have to say that throughout I knew I was in safe hands; Kier & Mike have done wonders with the software and their level of support has been second to none. I have no doubt my users will love the new environment once they are used to it; all the time they had access to the test forum they kept on at me to migrate now, not when v1.1 came out.
 
These will be the steps I take when I migrate to XF in 1 week and 3 days, barring any critical issues with XF 1.1...
 
A great read, Deebs. Likewise, I'll be going through similar steps with mine. Kudos for the work on the CLI. Does the importer print progress to the screen every so often? My BT connection drops if it sees a stale connection; as long as there is data coming down it, it stays open.

I know what you said about screen, but I haven't got it installed on the server (although I could always do this).

Alternatively, could I start it as a background process? It's been a while since I've done that; something like ending the command line with an &?

Anyway, I won't be migrating quite as quick as you, we've got a fair bit of work ahead of us yet.
 
Screen starts the process as a background service so if you drop the connection to the host it continues to run. Install screen and find a new friend :)

The importer does output text, every 30 seconds, but still, use screen. It is a godsend.
 
I have to say that since I set up the test forums way back, the community has begged me to migrate to XF. They love it. Now that 1.1 beta is "available" I am ready to migrate. Just need to fix the thread prefixes and we are there. Also, the whole time the test forums have been up I have had to resist the urge to migrate; XF is so much fun to post on, but personal preference had to take a back seat.
 
Deebs, you have 20GB of memory there. Have you considered setting up a RAM disk (either with proprietary software or tmpfs)?

I imagine you would get that 2 hours down into minutes, probably with the longest part being the wait for the data to be copied from disk to memory and back.
 
Hmm, I've got 24GB of RAM in mine, so that's not a bad idea... apart from the fact it's InnoDB. Without Googling, how easy is it to move the InnoDB files around in the MySQL config?

Btw, does the CLI do more than just threads & posts?
 
Haven't reviewed the CLI yet, so I don't know!

However, moving the InnoDB tables should be a pretty simple process:

Make 2 new DBs and load your vB database and XenForo database into them
Create your tmpfs folder
Move the database directories to the tmpfs
Create a symlink to the tmpfs

Done :) In theory*

*Disclaimer, never done it myself
 
I did think about this, but to be honest I decided against it as it would mean restarting MySQL at the end to move the files from RAM back to disk, something I am loath to do as the server hosts several other databases.

Looking at the performance stats under my new setup, the SQL server is not the bottleneck, though it does push over 200MB/s to disk at times. My bottleneck is now the server that runs the CLI importer.
 
The CLI only does threads/posts in a multi process way at the moment.
 
I have imported from vB4 with the importer from Paul M; attachments are stored in the file system. I noticed that only a small part of the attachments were imported. That could be due to the importer in general, to the file system attachments, or to my simply not having allowed these kinds of attachments in the XF ACP before the import. Have you imported all your attachments from vB, and did you allow the same attachment types in XF?

For 1 million posts it took me less than 2 hours. Considering I have a slightly faster environment than you, our figures match.
 
So far everything looks like it has imported correctly including the attachments. Will do another test import tonight. As for time, looking at the above pic it took 2hrs 8mins to import 4 million posts, this is something I am extremely happy with and I believe with some additional tweaking of both MySQL and PHP I can get that down even more.
 
I'm interested in optimising the speed of my own migration - what are the specs of your server and what optimisations have you done to MySQL to improve the import time?

Cheers,
Shaun :D
 
The specs of my setup are in the first post of this thread. With regard to MySQL I have tuned quite a few of the InnoDB settings to make better use of the hardware and disk subsystem beneath it.

As the MySQL system still hosts some rather large MyISAM tables, I have had to be mindful of ensuring there is enough RAM available to cater for the demands of the applications that use them. So out of the 20GB available I have split it as follows: InnoDB - 12GB, MyISAM key files - 2GB, OS for file caching etc - 6GB.

Also, to prevent double buffering by the OS I have used "innodb_flush_method=ALL_O_DIRECT". This tells the OS not to do any form of buffering; InnoDB has its own inbuilt caching to handle this. Another thing to tune is the value of "innodb_io_capacity". From memory I believe the default is 200; if your disk subsystem can handle more IOPS then you can benefit from changing this value to suit.
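To make the memory split and the two settings above a bit more concrete, here is a rough Python sketch that checks the budget adds up and prints a matching my.cnf fragment. It rests on some assumptions: the post does not say which variables carry the 12GB and 2GB figures, so mapping them to innodb_buffer_pool_size and key_buffer_size is my guess, the function name is made up, and the innodb_io_capacity value is only the default mentioned above, not a tuned number.

```python
TOTAL_RAM_GB = 20

# Split quoted in the post: InnoDB, MyISAM key files, and the OS file cache.
budget_gb = {"innodb": 12, "myisam_keys": 2, "os_cache": 6}


def render_mycnf(budget, io_capacity=200):
    """Build a [mysqld] fragment from the budget (hypothetical helper)."""
    if sum(budget.values()) != TOTAL_RAM_GB:
        raise ValueError("memory budget does not add up to the installed RAM")
    lines = [
        "[mysqld]",
        # Assumption: the InnoDB share goes to the buffer pool.
        f"innodb_buffer_pool_size = {budget['innodb']}G",
        # Assumption: the MyISAM share goes to the key buffer.
        f"key_buffer_size = {budget['myisam_keys']}G",
        # Value quoted in the post; ALL_O_DIRECT is a Percona Server option.
        "innodb_flush_method = ALL_O_DIRECT",
        # 200 is the default; raise it only if the disks can sustain more IOPS.
        f"innodb_io_capacity = {io_capacity}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_mycnf(budget_gb))
```

The OS share is deliberately left out of the generated fragment: it is simply whatever RAM MySQL is not told to use.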
 
Nice write-up, very helpful. I just tested the CLI importer as well and it works wonders with thread importing; it does a nice job of saving us some time :)

Now if only the CLI importer worked with the attachments step. I've got a car-related forum that is heavy on attachments, almost 20GB, and that step takes twice as long as importing threads with the regular non-multithreaded importer.
 
Upgraded my test forums to XF 1.1 beta 3 and did another import. I can confirm that the issue I had with thread prefixes not importing correctly has been fixed. I have managed to shave a few seconds off here and there. So far everything looks fine from a user perspective, and based on this I have notified the users that the migration will go ahead this coming Friday (barring any critical showstoppers I discover along the way, which I suspect will be zero). This time the import used 12 PHP processes spread across 4 cores, and each core only reached around 70% CPU, so I could possibly increase the number. However, I have to be mindful of SQL contention and locking within MySQL, as the vBulletin tables where most of the work is carried out are MyISAM.

Results below:

[Image: import3.webp]
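For anyone wondering what spreading the importer processes across the cores "in a round robin fashion" looks like, here is a minimal Python sketch of the idea. It is only an illustration: the real importer is PHP and its code is not shown in this thread, the function names are made up, and the pinning call is Linux-only.

```python
import os
from collections import Counter


def round_robin_cores(num_workers, num_cores):
    """Return the core index for each worker, cycling through the cores in order."""
    if num_workers < 0 or num_cores < 1:
        raise ValueError("need at least one core and a non-negative worker count")
    return [worker % num_cores for worker in range(num_workers)]


def pin_to_core(core):
    """Pin the calling process to a single core. Linux only; a no-op elsewhere."""
    if hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, {core})  # pid 0 means "this process"


if __name__ == "__main__":
    # 12 worker processes over 4 cores, as in the latest test import.
    assignment = round_robin_cores(12, 4)
    print(assignment)
    print(Counter(assignment))  # how many workers land on each core
```

Each worker would call pin_to_core() with its assigned index when it starts. With 12 workers on 4 cores every core ends up with 3 workers, and with the earlier 8 workers each core gets 2.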
 