XF 2.2 Forum is inaccessible when remote file storage is down

K a M a L

Well-known member
We use DigitalOcean Spaces to store our site files. We've noticed that whenever DigitalOcean Spaces is having trouble, the forum is extremely slow or inaccessible, even on the forum index and pages that don't have attachments. I know XenForo keeps its code cache on the local file system, so there shouldn't be any need to connect to Spaces to retrieve files, but it seems this unintentionally happens somewhere. I haven't dug into the issue yet, but you can probably do so faster.
 
Not had any other reports of this, nor would it be expected.

As you say, code cache should be local, and the only time your server physically has to do any work remotely is when uploading or streaming an attachment download - but as you noted this isn't supposed to happen on the forum index or pages without attachments.

So I can only guess that this may be down to an add-on that might be storing and accessing files more frequently.

You should also consider raising the issue with DO Spaces. Them having trouble shouldn't be a frequent occurrence, so I'm not sure whether this is a symptom of a wider problem rather than the cause.
 
Worth noting there are other providers, though only Amazon S3 and DO Spaces have been tested to work with the S3 library we ship in our tutorial resource.

You'll likely have to do some trial and error with the configuration, but if it works you might want to consider Backblaze (cheaper than Amazon S3, and I'd like to think more reliable than DO Spaces is currently).

 
I am not using either of the mentioned providers, but I can confirm that if the storage provider is down, your site will have issues loading pages (in my case I saw it happen on 3-4 occasions).
 
I can confirm this as well. I was using B2 via their S3 API and it worked fine, until one day the forum just wouldn't load. Backblaze does not have a very good notification system; I spent around an hour trying to find the cause. Eventually I tried disabling B2 in config.php and that fixed it. I decided to move back to my host right then. I never got any alert from B2, but saw several posts about this issue in their subreddit later that day.

Digital Ocean is still suffering from Spaces related issues in one of their locations.

 
If this ever happens again, we really need to know about it at the time to see if we can identify what the cause is.

A good test too would be to disable all add-ons to see if that alleviates it. Plus, keeping an eye on the network tab of the browser dev tools to see if it can be identified what part of the page is struggling to load.

It sounds very much like there's something making requests for internal data during the request. As we know, that only happens in a finite number of cases.

Perhaps a slightly more unexpected one is that we denote whether an XF installation is installed via the internal_data/install-lock.php file. This prevents the install being rerun and accidentally wiping out all of the data.

This is not something that should be checked routinely. When it is checked, however, because the file may be stored remotely, it requires your server to perform an HTTP request. If it was being checked routinely for some reason, and the remote side was experiencing latency issues or similar, then this could make sense.
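To make the remote check concrete, here is a minimal sketch assuming the league/flysystem 1.x setup with the AWS S3 adapter from the tutorial resource (the client variable and bucket name are illustrative placeholders): a simple existence check translates into an HTTP HeadObject call against the remote storage.

PHP:
// $s3Client is an \Aws\S3\S3Client configured as in the tutorial
$adapter = new \League\Flysystem\AwsS3v3\AwsS3Adapter($s3Client, 'my-bucket');
$fs = new \League\Flysystem\Filesystem($adapter);

// This issues a HeadObject HTTP request to the remote storage, which
// blocks the PHP worker until the remote responds or the request times out
$exists = $fs->has('install-lock.php');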

I think the most likely culprit is XF\Error::logException. This is what typically writes entries to the server error log. It should not be called frequently, but if you have a particular add-on or bug that is continually trying to log an exception, then it could certainly cause things to slow down.

So it would be good to hear whether this could have happened in these cases. Perhaps we need to make some changes here but, significantly, if you have a bit of a grumbling add-on or some other error that is being logged frequently, then ideally steps should be taken to solve those issues.
 
@Chris D, I've just done some testing to check the cause of the issue. When I'm on the index page, for example, there are no requests sent to external file storage.
My conclusion: when the file storage API is slow or down, there are hundreds of PHP processes/workers (or whatever they're called) waiting for a response, so new requests are queued behind the previous ones. I was able to replicate the issue with internal storage by adding sleep(100); to the attachment controller and loading a lot of attachments at once.
I think setting a low connect_timeout value could be a solution, but there is probably something smarter.
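The worker-exhaustion scenario described above is bounded by how many concurrent PHP workers the server allows. A minimal sketch, assuming PHP-FPM (the directive names are real; the values and file path are illustrative):

Code:
; PHP-FPM pool config, e.g. /etc/php/fpm/pool.d/www.conf (path varies by distro)
pm = dynamic
pm.max_children = 50
; once all 50 workers are blocked waiting on a slow remote,
; every new request queues behind them, including the forum index
request_terminate_timeout = 30s
; kills requests stuck past 30 seconds so workers are freed eventually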
 
It's certainly feasible that this may be a compound effect rather than a rogue process.

There is some suggestion that timeout / connect_timeout may be defaulted to 0 (no limit) in the AWS SDK which seems to back that up.

According to the documentation that addresses this, most client configurations support these options:

PHP:
'http' => [
   'connect_timeout' => 5,
   'timeout' => 5
]

Which would mean your typical client settings would look something like this:

PHP:
new \Aws\S3\S3Client([
   'credentials' => [
      'key' => 'ABC',
      'secret' => '123'
   ],
   'region' => 'ams3',
   'version' => 'latest',
   'endpoint' => 'https://ams3.digitaloceanspaces.com',
   'http' => [
      'connect_timeout' => 5,
      'timeout' => 5
   ]
]);

If this is happening frequently, it may be worth adjusting your config as above to see if that changes anything. If it does, I'll update the tutorial resource.
 
For what it's worth, their entire NYC region was down for a period of time today; it wasn't limited to just Spaces as their initial incident report says.

 
@Chris D, the smarter solution in my opinion would be serving attachments via presigned URLs. This could be a good improvement regardless of the current issue: it would allow files to be served directly from the storage API without routing through the forum, saving a lot of server and network work.
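For reference, a minimal sketch of what presigned URLs could look like with the AWS SDK for PHP (the bucket, key, and expiry are illustrative placeholders; this is a suggestion, not something XF does today):

PHP:
// $s3 is an \Aws\S3\S3Client configured as in the tutorial
$cmd = $s3->getCommand('GetObject', [
   'Bucket' => 'my-bucket',
   'Key' => 'attachments/123-example.data'
]);

// Sign the request so it's valid for 5 minutes, then redirect the
// visitor to $url; the download is served by the storage provider
// directly instead of being streamed through the forum server
$request = $s3->createPresignedRequest($cmd, '+5 minutes');
$url = (string) $request->getUri();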
 