XF 1.2 502 Bad Gateway

Felomeno

Member
Hello

I'm running XF 1.2.1 on nginx and everything works as expected for the most part.
However, when I try to edit a big post, it throws a Bad Gateway error. I've tried changing my php-fpm configuration and tuning my MySQL settings, but neither has helped.

Any help would be appreciated.
 
2013/09/04 01:26:39 [warn] 29153#0: *474534 an upstream response is buffered to a temporary file /var/lib/nginx/tmp/fastcgi/8/47/0000005478 while reading upstream, client: myyp, server: www.mysite.com, request: "GET /threads/internet-real-warriors-revelations.452620/ HTTP/1.1", upstream: "fastcgi://unix:/tmp/php5-fpm.sock:", host: "www.mysite.com"


2013/09/04 01:26:47 [error] 29153#0: *474557 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: myyp, server: www.mysite.com, request: "GET /posts/3646151/edit-inline?&_xfRequestUri=%2Fthreads%2Finternet-real-warriors-revelations.452620%2F&_xfNoRedirect=1&_xfToken=1%2C1378250799%2C61a135bec05c68f414e5e1664ac074977a67bd25&_xfResponseType=json HTTP/1.1", upstream: "fastcgi://unix:/tmp/php5-fpm.sock:", host: "www.mysite.com", referrer: "http://www.mysite.com/threads/internet-real-warriors-revelations.452620/"
 
I know you said you're on 1.2.1, but can you confirm that? There's a fix relating to editing long posts in it.

That said, that error looks like PHP is crashing or being killed. A PHP crash can be a nightmare to debug, though it could be related to the fix I mentioned above. However, it's also very possible that there is a limit on how long an FCGI process can run and that the process is being killed when it hits that limit. Check for any specific limits there.
 
Hello Mike, I'm definitely on 1.2.1.
I'll be doing more debugging in the morning as there are a lot of users on the site right now. Will get back to you then.

Here's another error

2013/09/05 13:01:45 [error] 27692#0: *2973910 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 83.59.180.81, server: www.mysite.com, request: "GET /posts/3646151/edit-inline?&_xfRequestUri=%2Fthreads%2Finternet-real-warriors-revelations.452620%2F&_xfNoRedirect=1&_xfToken=1%2C1378378897%2C40cf14a1c081d8febf5c06e82a094ee0ae5b2c7e&_xfResponseType=json HTTP/1.1", upstream: "fastcgi://unix:/tmp/php5-fpm.sock:", host: "www.mysite.com", referrer: "http://www.mysite.com/threads/internet-real-warriors-revelations.452620/"
 
I am having this problem too, XF 1.2.1. There is no information in the logs other than what has been said here. Both PHP 5.3.10 and PHP 5.5.3 have this problem.

5.3.10 used to put lots of errors in php5-fpm.log about terminating on SIGSEGV. 5.5.3 does not have this problem, or it doesn't log it.
 
Yes, this only happens on long posts. Unfortunately on my forum long posts are common, and editing them is also common. While we can work around for now it would be nice to fix this :)

Also, regarding this:
"However, it's also very possible that there is a limit on the length of time that a FCGI process can run for and is being killed. Check for any specific limits there."

For me at least, the gateway error is instant, so I don't think it has anything to do with the amount of time allowed.
 
After doing a little bit more debugging, this seems to be an issue with PCRE. Is XenForo creating large regexes, by chance?

Sep 8 20:01:20 smogon-dev kernel: [210415.320568] php5-fpm[32737]: segfault at 7fff9529cee0 ip 00007f2af2213a7a sp 00007fff9529ce70 error 6 in libpcre.so.3.12.1[7f2af2201000+3c000]

Appears relevant: https://bugs.php.net/bug.php?id=61579. At least for that bug, the cause is a stack overflow triggered by an inefficient regex.
 
Yes, this process is regex heavy. 1.2.1 increased the performance of this regex to deal with this issue.

If you submit a ticket with a link to the post in question, an account that can edit that post, and FTP details, I can look into it.
 
I solved the problem (for now) by upping the stack size for PHP.

We aren't really set up to have people edit directly on the server via FTP, especially since we maintain our own patch set (the official XenForo source is tracked in a separate git branch, which is merged in). If direct editing on the server is a common request for bug reports, we can hack around all of that in the future.

(To be fair I blame this problem mostly on PHP/PCRE, which should really raise an exception or something instead of just segfaulting...)
 
It was lower, yes. Unfortunately it's a game of cat and mouse: I don't know how much stack PCRE consumes per recursive call, and it probably changes from version to version due to implementation details. So it is difficult to select a stack size that is both safe and useful (the Debian/Ubuntu defaults don't guarantee this, as this occurrence shows). I hope my current settings prevent segfaults while allowing XenForo to work, but I won't know until someone reports a crash, I suppose.
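To make the trade-off above concrete, here is a rough sketch of the arithmetic in Python (used only for illustration; it is not part of XenForo or PHP). It derives a candidate recursion limit from a stack size. The figure of 500 bytes per recursion is an assumption: it is a rough rule of thumb, not a guarantee, and as noted above the real cost can vary between PCRE versions. The function names and the `headroom` parameter are hypothetical.

```python
import resource  # Unix-only standard library module

# Assumption: rough stack cost of one PCRE recursion, in bytes.
# This is a rule-of-thumb value, not a measured or guaranteed figure.
ASSUMED_BYTES_PER_RECURSION = 500


def safe_recursion_limit(stack_bytes,
                         bytes_per_recursion=ASSUMED_BYTES_PER_RECURSION,
                         headroom=0.5):
    """Return a recursion limit that fits within `headroom` of the stack.

    `headroom` is the fraction of the stack we allow recursion to use;
    the rest is left for everything else running on that stack.
    """
    if stack_bytes <= 0 or bytes_per_recursion <= 0:
        raise ValueError("sizes must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    usable = int(stack_bytes * headroom)
    return usable // bytes_per_recursion


def current_stack_bytes():
    """Soft stack limit of this process in bytes, or None if unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return None if soft == resource.RLIM_INFINITY else soft


if __name__ == "__main__":
    stack = current_stack_bytes()
    if stack is None:
        print("stack is unlimited; no bound to derive from it")
    else:
        print("stack bytes:", stack)
        print("candidate recursion limit:", safe_recursion_limit(stack))
```

A number derived this way would be a candidate value for PHP's `pcre.recursion_limit` setting. Note that the stack limit that matters is the one the PHP-FPM worker actually runs under, which may differ from the limit of the shell where this script is run.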

Apparently you can custom build PCRE to use the heap instead of stack for recursion. I'll look into that if it's too problematic in the future.

On XenForo's side, if the recursion limit is set too low, the post editor simply shows an empty post. Is it possible to report an error instead?
 
I'm still on 1.2.1 and have the same problem with large posts: when I try to edit one, I get an instant 502 Bad Gateway. Is this solved in version 1.2.3?
 