XF 1.4 Server Migration Weirdness (Can't log into admin panel/etc)

lord8bit

Member
Last night I attempted to migrate xenforo to another server and had to roll back.

Background:
- Our servers are on AWS OpsWorks (I will use the terms server and instance interchangeably)
- I've been tasked to combine all of our 'community' sites (corporate, support, and two xf forums) onto a single instance
- Our databases for these sites are on a separate instance (security groups are correctly updated for communication between instances)
- XF data/internal_data directories are on an EBS volume and symlinked to the appropriate places on the XF installations
- I have successfully performed test migrations of all of these sites onto a staging server (with staging databases, restored from dumps of the production databases) with no issues, other than an XF cookie/frontend-cache prefix and database gotchas from having two forum sites on the same server (yes, we have licenses for each). These test migrations were done without using maintenance mode.

Attempted Process:
- dump the database to backup sql file on the database server (just in case)
- put the forum into maintenance mode so that no updates can occur during the migration
- snapshot the EBS volume on the original instance (chef recipe)
- restore the volume from snapshot on the new instance (chef recipe)
- build the application on the new instance (chef recipe)
- update dns in route53
- bring forum out of maintenance mode

Results:
- After DNS switched over to the new instance I could not log into the admin panel (or even the main board) to take the site out of maintenance mode, even from an incognito window. My password wouldn't work (our other sysadmin tried as well with no success).
- Even changing the board to active on the database would not take it out of maintenance mode, or let me log in:
update xf_option set option_value=1 where option_id = 'boardActive';
- After trying to troubleshoot for about half an hour (3x as long as I should have), I switched dns back to the original instance and only then was I able to get back control, and log into the admin panel. I then had to toggle the maintenance mode (admin panel was claiming board was active, as I had set this in the database, but the site itself was still in fact in maintenance mode, so had to make the board 'inactive' and then active again to trick it into coming out of maintenance mode.)

Comments/Questions:
- This must be some sort of caching issue (we use localhost memcached for backend caching in our config), however it makes no sense to me why I would be locked out of logging in to the admin panel after the migration (I am a superAdmin in the config/database). Authentication seriously broken by caching?
- After rolling back to the original server, the admin panel claimed the board was active when it was clearly not. What the cache?
- The backend caching settings in the config (memcached) were like that prior to my time with the company, and I've never had a chance to assess whether they should even actually be used (usually I have way bigger things on my plate than forum site configs). Should I just ditch these? Am I correct in assuming that this is the root of my issues?
- I already have an obvious workaround for performing the migration without putting the forum into maintenance mode (rsync ftw), so mainly just venting here. Apologies.
 
This must be some sort of caching issue (we use localhost memcached for backend caching in our config), however it makes no sense to me why I would be locked out of logging in to the admin panel after the migration (I am a superAdmin in the config/database). Authentication seriously broken by caching?
You said the password didn't work, but you didn't specify the exact behavior. Did you get a password error or could you just not stay logged in? They would have very different implications.

- After rolling back to the original server, the admin panel claimed the board was active when it was clearly not. What the cache?
This is one of the reasons we don't recommend editing the DB directly. Doing that is bypassing the various other updates that happen when an option is changed.

- The backend caching settings in the config (memcached) were like that prior to my time with the company, and I've never had a chance to assess whether they should even actually be used (usually I have way bigger things on my plate than forum site configs). Should I just ditch these? Am I correct in assuming that this is the root of my issues?
It's difficult to say if they're your issue. If you're getting an incorrect password error, they're not; if you can't stay logged in, then maybe they are, but that further depends on the specific cache configuration. Memcached usually doesn't cause issues like that itself.

But if you want to take that out as a possible issue, you can remove it. If you restore it, I would recommend changing the prefix to ensure that you're not using stale data (from before you removed it). XF is generally optimized to run without the cache, so you won't see too much difference.
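A minimal sketch of what changing the prefix could look like in library/config.php on XF 1.x. It assumes the cache front end accepts the Zend_Cache option cache_id_prefix through frontendOptions; the value 'xfnew_' is only a placeholder, so check the option name against the XF manual before relying on it.

```php
<?php
// library/config.php, XenForo 1.x (sketch, not a verified config)

// Existing backend setting stays as it is.
$config['cache']['backend'] = 'Memcached';

// Assumption: the front end honours Zend_Cache's cache_id_prefix option.
// 'xfnew_' is a hypothetical value; any new prefix means keys written
// under the old prefix are simply never read again.
$config['cache']['frontendOptions']['cache_id_prefix'] = 'xfnew_';
```

With two forums sharing one memcached instance, giving each installation its own prefix should also keep their cache entries from colliding.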
 
Hi Mike,
Thank you for your response.

In regards to the password behaviour, there was no incorrect password error. After entering my username and password in the fields on the Admin Control Panel Login page, the site would begin its normal page transition (logo shrinks to the top middle of the page), but instead of entering the Admin Control Panel, the login page would just reset itself with the aforementioned fields blanked (again, no "Incorrect password. Please try again." message below the fields).

A further detail I neglected to mention: when I initially put the forum into maintenance mode and reloaded the main board to make sure, it gave me some message (can't recall the exact wording, and don't want to put the board into maintenance mode right now), about the board being inactive, and only admins could make changes. However, after switching the dns and reloading, I was definitely logged out and could only see our custom maintenance message (set in the ACP), and couldn't log in to the actual board (which I did mention). When rolling back to the original server and reloading the main page I was once again logged in on the board, with that message showing again.

All of this would indicate to me that there must be something weird going on between caching and the database/authentication.
Backend cache settings from config.php:

$config['cache']['backend'] = 'Memcached';
$config['cache']['backendOptions'] = array(
    'compression' => false,
    'servers' => array(
        array('host' => 'localhost', 'port' => 11211),
    )
);

Good to know that removing the above shouldn't cause any (or at least much) degradation in performance though. I will probably just get rid of it in that case.

I will keep in mind not to bother trying to edit the database directly in the future. After not being able to log into the admin control panel I was just trying to force it out of maintenance mode, and it seemed worth trying at the time.

Cheers
 
Right, if you're not being logged in but you're not getting an error, there are likely 2 potential causes:

1. You're putting sessions in the cache but there are issues writing to it. Your cache config doesn't mention the line to do this, but it's worth checking anyway by removing the cache config to see if that resolves it.

2. The IP seen by XF is changing from page view to page view. This can be common if there's a load balancer (such as ELB in this case) or other reverse proxies (like CloudFlare). I suspect this is the cause.

You can check this if you create a PHP info page and look at the REMOTE_ADDR value. If it's not your IP, look for the $_SERVER value that does have it. (Probably something like $_SERVER['HTTP_X_FORWARDED_FOR'].) You can either force PHP to use this value or (perhaps preferably) get your web server to trust that as the real IP. There's some discussion of one way of doing this here: http://serverfault.com/questions/331531/nginx-set-real-ip-from-aws-elb-load-balancer-address If that doesn't help or isn't totally clear, let me know what your setup is and where the real IP is found and we'll go from there.
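As a rough illustration of the "force PHP to use this value" option, here is a small sketch. The function name resolveClientIp and the proxy address 10.0.0.5 are hypothetical, and it assumes the load balancer appends the client address to X-Forwarded-For. It only trusts the header when the request actually came from a listed proxy, since the header is otherwise trivial to spoof. Real ELB nodes change address, so a production version would need to match a CIDR range rather than exact IPs.

```php
<?php
// Hypothetical helper: pick the address PHP should treat as the client IP.
// $trustedProxies is an assumed list of load balancer / proxy addresses.
function resolveClientIp(array $server, array $trustedProxies)
{
    $remote = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '';
    $forwarded = isset($server['HTTP_X_FORWARDED_FOR'])
        ? $server['HTTP_X_FORWARDED_FOR']
        : '';

    // Ignore the forwarded header unless the direct peer is a trusted proxy.
    if ($forwarded === '' || !in_array($remote, $trustedProxies, true)) {
        return $remote;
    }

    // The header can hold a comma-separated chain. Walk it right to left
    // and return the first valid address that is not itself a trusted proxy.
    $chain = array_map('trim', explode(',', $forwarded));
    foreach (array_reverse($chain) as $ip) {
        if (filter_var($ip, FILTER_VALIDATE_IP) !== false
            && !in_array($ip, $trustedProxies, true)) {
            return $ip;
        }
    }

    // Nothing usable in the header: fall back to the direct peer.
    return $remote;
}

// Possible use near the top of library/config.php (address is an assumption):
// $_SERVER['REMOTE_ADDR'] = resolveClientIp($_SERVER, array('10.0.0.5'));
```

Handling this at the web server level, as in the linked discussion, keeps the logic out of the application and is probably the cleaner choice where it is available.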
 