B

Bogdan Bultrini

Hi,
for no reason, last week, our Xen forum got this error and users can no longer write messages or do anything else.

We have: xenforo 2.2.8, php 8.0.28, mariadb 10.5.18;
dedicated server; cpu AMD EPYC 7313; ram 64gb; disks nvme).

Log server errors
XF\Db\Exception: MySQL query error [1205]: Lock wait timeout exceeded; try restarting transaction src/XF/Db/AbstractStatement.php:230

UPDATE xf_forum SET message_count = ?, last_post_date = ?, last_post_id = ?, last_post_user_id = ?, last_post_username = ? WHERE node_id = 36

J

Jon

Hi there,

This will be an error with your hosting package, you will need to have your host look into what has gone wrong.

Kind Regards

Jon

J

Jon

Hi there,

The information in the thread provided by JoshA is a valid set of steps to begin troubleshooting.

However ultimately the error provided is one that is a server level error, the update that is being attempted:

UPDATE xf_forum SET message_count = ?, last_post_date = ?, last_post_id = ?, last_post_user_id = ?, last_post_username = ? WHERE node_id = 36

is a very basic query and should finish almost instantly. If it is not, this is indicative of a server level problem and you should contact your hosts or sysadmin.

Regards

Jon

B

Bogdan Bultrini

We already did everything, and also much more than that, of what suggested in the thread.
As an example we updated Xenforo, among many other things.
We have a very high level of skills in Php, and server and database management. We have our own specialist system and network manager, and this problem occured suddenly after years of good functioning, without doing anything.
We believe is not a question of our server.

B

Bogdan Bultrini

We are totally available to pay for a dedicated assistance, in which your team can directly talk, also by Skype, with my team. My direct email is bogdan.bultrini@gmail.com
Please, it is very urgent, we have a forum of around 40 thousand users, 23 years of activity.

J

Jon

Hi there,

I am afraid we do not offer server level support in ticket support, that is an issue for your host / sysadmin to resolve.

XenForo likewise would not just stop working in such a way. It suggests an issue with an update or similar that has occured on your hosting environment.

Regards

Jon

B

Bogdan Bultrini

Xenforo is working, but sometimes, during the day, with no reason, it is difficult or impossible, to write new posts.
Please give us a set of hours of direct assistance, with Skype, we are willing to pay them.
It is the best way for us to show you our situation and for you to understand.

J

Jon

Hi there,

Again, this is not a XenForo related issue, it will be hosting related. If your site suddenly stops working as normal this is nothing to do with the software, it will be hosting related and your hosts will need to investigate.

If they are unable to provide a solution, we suggest finding a new host.

Regards

Jon

B

Bogdan Bultrini

We are on OVH, one of the best providers in Europe.
In the past there was only 1 server for our forum and website. For years everything ran well, and suddenly the forum started to give problems. The rest of our website, that uses Wordpress, continued to work well, without any problems.
At that point we bought 2 new servers, and we migrated the forum on one of the 2 new servers, and we migrated the rest of the website on the other new server. Therefore now we have 1 dedicated server for the forum, and 1 server for the rest of the website.
We manage the server ourselves, we have our Senior system and network manager.
The forum has problems also on the new server. We updated the forum to the latest version, and still gives problems.
Not all the forum gives problems. Some parts yes, some no, some "nodes are working" and some others no, and the parts that are working and not working changes continuously. This can't be a server/hosting related problem. Please let us buy a package of some hours of direct assistance via Skype. My email is bogdan.bultrini@gmail.com

J

Jon

Hi there,

I am afraid we do not offer such a service.

You can also attempt to narrow the problem down by reverting to the default XenForo style and disabling all addons.

If the issue persists, then you know it is either a) addon related or b) server related.

Regards

Jon

B

Bogdan Bultrini

As I already wrote in the thread: we already disabled all addons. It is a test that we already did, and the problem exists with or without addons.
We did this test for many times, and many days. And at the moment all addons are deactivated.

We are using our style since 4 years ago, never gave any problem, and we did not absolutely changed anything in this style. It is not style related, I am absolutely sure of this.

You have, I think, the possibility to see inside our Xenforo, you already did it years ago.
Can you connect with us on Skype tomorrow, after 10AM (italian hour, I don't know your time, because I don't know your location) at the time that you can, only after 10AM italian hour, please?
As an example in Italy now it is 6,47PM.
Thanks

J

Jon

Hi there,

As I have stated we do not offer that as a service.

The issue is not related to XenForo, and we do not provide server level support.

Regards

Jon

B

Bogdan Bultrini

Hi Jon,
i am the sysadmin of the server; let me clarify some points otherwise i do not think you can understand the problem.

The setup

  • problem started on previous server, with Xenforo 2.1, where the forum has been running for over 2 years; the problem happened without any config change; the system was php 7.3, mariadb 10.3
  • we tried first to upgrade to the last Xenforo 2.2.x, then we upgraded to a bigger and new server to avoid at once any resource issue, hardware problem and so on
  • we upgraded to php 7.4 and, later, to php 8.0; we also upgraded to mariadb 10.5
  • the server (both the old and the new one) is greatly overperforming respect our needs but we picked that one to exclude any problem that could be related to resource overuse; it has enough ram configured in the innodb pools to cache the entire db and indexes (while still keeping dozen of gigabytes for the system and other services); it has a big dual amd epyc cpu and nvme disks so, trust me, it is NOT a problem of resources

The problem
The "Lock wait timeout" that Bogdan reported is not actually the problem but just the effect. At some point, sometimes after a couple of hours, sometimes after 10-12h, a transaction starts locking almost all the rows. This is the cause of the lock timeout.

This is from innodb engine status:

---TRANSACTION 127566, ACTIVE 760 sec updating or deleting
mysql tables in use 12, locked 12
368888 lock struct(s), heap size 59908216, 30520712 row lock(s), undo log entries 29926318
MySQL thread id 144344, OS thread handle 140139549271808, query id 2437522 127.0.0.1 io_forum Updating
UPDATE xf_post
SET position = IF(position > 0, position - 1, 0)
WHERE thread_id = ?
AND position >= ?
AND post_id <> ?
Trx read view will not see trx with id >= 127566, sees < 127566

Also in the MySQL general log, which we enabled for debugging (attached to the ticket) you can see Xenforo at some point starts to push new savepoints to the db, withouth being able to commit/rollback them. You can easily spot the problem at line 3050 where you see that Xenforo starts to push savepoints with increasing labels (savepoint saveN... usually N is 0-4 depending on the stack Xenforo keeps in memory while here it starts to ramp up and at the end of the log it is already SAVEPOINT save200).

Thanks,

Attachments

B

Bogdan Bultrini

By the way, i forgot to add: the same issue appears even w/o addons, w/o crons enabled and with the default theme/style. It does not appear if the forum is kept in maintenance mode so it is indeed related to the interactions of the users with the forum, not to some hosting issue.

J

Jon

Hi there,

Thanks for this additional information.

Based on what I can see at this point, I believe the most likely scenario is that something is interupting the web service or php during a transaction.

This means that MySQL is waiting for the commit, rollback or close query to be sent, and it is not receiving it. This is why you are getting these locks on the table that are not resolving.

At this stage I would look to swap from your existing web server / php install to an alternative, for example if you are using Apache, to swap to LiteSpeed or Nginx.

I would also look to disable any opcaching or similar that you may have in place. If you use any web application firewalls or similar also, I would disable those while investiating the issue.

Regards

Jon

B

Bogdan Bultrini

1 - We already swapped web server, from Nginx to Apache: did not solve!
2 - also the Php was swapped: did not solve!
3 - We disabled this morning opache e waf+cache cloudflare. We wait to see what happens.

On our forum we have threads with even 250k answers. We allow 10 posts per page, therefore we have threads with 25k pages.
I attach a screenshot.
How does Xenforo manages this? Is it a problem, or is irrelevant?

Attachments

  • forum-bigthreads-xenforo.jpg
    forum-bigthreads-xenforo.jpg
    144.6 KB · Views: 38
J

Jon

Hi there,

Such large threads are exceptionally rare, and while there could be some performance impact, it is unlikely to be the cause of your SQL problems.

Regards

Jon

B

Bogdan Bultrini

On our forum they are common, like 40% of the active threads (our users use a thread like a chat, we have 36.280 users, 78.341 threads, 5.881.545 posts). Which could be a "good" limit of posts per thread? I ask this even if it is not the cause of our problem.

B

Bogdan Bultrini

We are here:
1 - We already swapped web server, from Nginx to Apache: did not solve!
2 - also the Php was swapped, we tried divverent versions: did not solve!
3 - We disabled opache e waf+cache cloudflare. Did not solve!

So the problem still exists, and every day, 1 time a day, the forum becomes impossible to use.

B

Bogdan Bultrini

Why starts this transaction? Who is this transaction?
Can you look deeper into this aspect?

J

Jon

Hi there,

While we have been unable to reproduce your exact error, we do have a potential cause with your extremely large threads and performing moderating actions on them.

At this time, can you test this by locking any of those very long active threads, and disabling any 3rd party addons and see if the issue persists.

Regards

Jon

B

Bogdan Bultrini

We will do it. A maximum of 20.000 post per thread will be allowed, the long threads will be splitted and locked.
I will inform in this ticket if the problem goes away.

Bogdan Bultrini (bo...@gmail.com)
The name used in this ticket will always default to the name of the person who opened the ticket.
You may use Markdown to format your message.
If you have any sensitive data (such as FTP credentials), place them here. These details will not appear in the ticket history.
Attach files
Scroll to top