- Our servers are on AWS OpsWorks (I will use the terms "server" and "instance" interchangeably)
- I've been tasked with combining all of our 'community' sites (corporate, support, and two XF forums) onto a single instance
- Our databases for these sites are on a separate instance (security groups are correctly updated for communication between instances)
- XF data/internal_data directories are on an EBS volume and symlinked to the appropriate places on the XF installations
- I have successfully performed test migrations of all of these sites onto a staging server (with staging databases restored from dumps of the production databases) with no issues, other than XF cookie-prefix/frontend-cache-prefix and database gotchas from having two forum sites on the same server (yes, we have licenses for each). These test migrations were done without using maintenance mode.
- dump the database to a backup SQL file on the database server (just in case)
- put the forum into maintenance mode so that no updates can occur during the migration
- snapshot the EBS volume on the original instance (chef recipe)
- restore the volume from snapshot on the new instance (chef recipe)
- build the application on the new instance (chef recipe)
- update DNS in Route 53
- bring the forum out of maintenance mode
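For the last two steps, a quick check that DNS really resolves to the new instance before touching maintenance mode might look like the sketch below. The hostname and IP address are hypothetical placeholders, and the check only reflects what the local resolver sees (cached records elsewhere can lag until the TTL expires).

```python
import socket


def resolves_to(hostname, expected_ip, resolver=socket.gethostbyname_ex):
    """Return True if expected_ip is among the IPv4 addresses the resolver returns."""
    # gethostbyname_ex returns (canonical_name, alias_list, ip_address_list)
    _name, _aliases, addresses = resolver(hostname)
    return expected_ip in addresses


# Hypothetical usage (placeholder hostname and address):
# if resolves_to("forum.example.com", "203.0.113.10"):
#     print("DNS points at the new instance")
```

The `resolver` parameter exists only so the check can be exercised without real DNS; by default it uses the standard library lookup.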
- After DNS switched over to the new instance, I could not log into the admin panel (or even the main board) to take the site out of maintenance mode, even from an incognito window. My password wouldn't work, and our other sysadmin tried as well with no success
- Even setting the board to active directly in the database would not take it out of maintenance mode, or let me log in:
  UPDATE xf_option SET option_value = 1 WHERE option_id = 'boardActive';
- After trying to troubleshoot for about half an hour (3x as long as I should have), I switched DNS back to the original instance, and only then was I able to regain control and log into the admin panel. I then had to toggle maintenance mode: the admin panel claimed the board was active (as I had set this in the database), but the site itself was still in maintenance mode, so I had to make the board 'inactive' and then active again to trick it into coming out of maintenance mode.
- This must be some sort of caching issue (we use localhost memcached for backend caching in our config). However, it makes no sense to me why I would be locked out of the admin panel after the migration (I am a superAdmin in the config/database). Can caching seriously break authentication?
- After rolling back to the original server, the admin panel claimed the board was active when it was clearly not. What the cache?
- The backend caching settings in the config (memcached) were like that prior to my time with the company, and I've never had a chance to assess whether they should even actually be used (usually I have way bigger things on my plate than forum site configs). Should I just ditch these? Am I correct in assuming that this is the root of my issues?
- I already have an obvious workaround for performing the migration without putting the forum into maintenance mode (rsync ftw), so mainly just venting here. Apologies.
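On the caching question, one way to test the theory would be to flush the localhost memcached right after the migration (or after editing xf_option by hand) and see whether the board state and logins recover. A minimal sketch, assuming memcached listens on the default 127.0.0.1:11211 and speaks the standard text protocol. Note that `flush_all` invalidates everything in that memcached, so any sessions kept there would be dropped too.

```python
import socket


def flush_memcached(host="127.0.0.1", port=11211, timeout=5.0):
    """Send memcached's text-protocol flush_all command; return True on an OK reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"flush_all\r\n")
        # A successful flush is acknowledged with the single line "OK\r\n"
        reply = conn.recv(64)
    return reply.strip() == b"OK"


# Assumed usage on the instance that runs memcached (default host and port):
# flush_memcached()
```

If the board comes out of maintenance mode and logins work after a flush, that would point at stale cached data rather than the migration itself.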