xf_thread_read auto_increment out of range

mattrogowski · Feb 16, 2022

If you set up a MySQL cluster for replication, the auto_increment values generated by MySQL will go up by more than 1 at a time (see auto_increment_increment) - the default (we've been told) is to increment them by 10. This means you use up IDs 10x quicker (so you'd run out of IDs at 420 million rows, not 4.2 billion).

If you have a particularly a) large, b) active, and c) long running forum, combined with the auto_increment values going up by 10 instead of 1, you will eventually (after many, many years) exhaust the IDs available for this table. Unlike the xf_thread_view table, xf_thread_read is never truncated, it just has stale data deleted after the thread read cut-off period - so over time, the IDs keep growing and growing (quicker than usual when they're going up 10 at a time). When you end up reaching ID 4294967295 (which it did on a forum at 2am of all times), you will get an "out of range" error for that table. Our xf_thread_read table currently has ~5m rows in it - as that's a 30 day rolling total, that can be extrapolated out for the number of rows created over the course of a year, and with the IDs going up 10 at a time, it's about 7 years until 420m rows are created and the IDs are exhausted.

The solution would be to change thread_read_id to be a BIGINT. This will probably only ever affect 0.1% of forums, but considering some forums maybe have been running XF for over a decade now and could well be on a MySQL cluster after all that time, there may be some boards unknowingly creeping close to this issue. Updating the column type should stop this ever being an issue.

Xon · Feb 17, 2022

MariaDB's Galera Clustering also increases the auto-increment by more than 1, so can also experience this issue

xf_user_alert table is also similar. Especially if you've got add-ons which cause many more alerts to be generated.

xf_forum_read probably isn't as bad, but has similar usage patterns.

mattrogowski · Feb 17, 2022

xf_user_alert is another good one yeah. Anything directly user-generated is probably going to be fine (users, posts etc), but anything where there's multiple rows being inserted for a single thing (user ID * threads read, new post * users watching the thread) will be affected - hundreds of threads sending alerts to hundreds of users will add up quick. Just checked actually and the alerts table is up to ID 3.2bn, so is 75% of the way there.

PaulB · Feb 21, 2022

We're a good chunk of the way to exhausting ip_ids in xf_ip, so that's another problematic one.

It's not too hard to change all of these to BIGINT UNSIGNED on our end, but I'm concerned XF may try to override that during future updates.

Chris D · Feb 21, 2022

I’m not sure why you think we’d try to arbitrarily undo schema changes? We wouldn’t really have anything to gain by doing that.

So, for now, please feel confident in changing to BIGINT if it mitigates the issue as it stands.

PaulB · Feb 21, 2022

Chris D said:
I’m not sure why you think we’d try to arbitrarily undo schema changes? We wouldn’t really have anything to gain by doing that.

I appreciate the reassurance, but the conversion of polls to thread types is still fresh in our team's memory. We lost a lot of data to that, and we're still working out how to get polls working on blogs again and restore the missing data.

Chris D · Feb 21, 2022

Definitely not the same thing though.

Also not really sure what the relevance is. The poll data didn't change. Threads containing polls changed. Did you report any issues to us at the time? During the beta?

Xon · Feb 21, 2022

Updating xf_ip.ip_id is annoying as it means touching a lot of tables and a number of 3rd party add-ons probably also need updating

PaulB · Feb 22, 2022

Chris D said:
Definitely not the same thing though.

Well-intended feature updates can have unintended side effects. This tiny little bit of code introduced recently has caused a lot of headaches:

PHP:

public function onThreadLeaveType(Thread $thread, array $typeData, bool $isDelete)
{
   if ($thread->Poll)
   {
      $thread->Poll->delete();
   }
}

It's reasonable for us to assume that a significant schema change like the one proposed in this thread could break things in the future, given that it's entirely unsupported.

Chris D said:
Also not really sure what the relevance is.

The lesson learned was, "don't trust that feature updates, even when used as intended, won't result in data loss." If your workflow involved posting blog posts in a discussion forum prior to moving them to an article forum, you're in for a surprise, especially if a blog post with a poll is receiving a lot of publicity. That lesson is applicable here: it's yet another item to add to the pre-upgrade checklist.

For example, if new tables have been introduced that join on the same IDs, that's something that needs to be caught prior to deploying an upgrade.

Chris D said:
Did you report any issues to us at the time?

By the time we noticed, the data was already gone and others had already reported it. The response was clear: it was functioning as intended, so we didn't report it as a bug. That's a discussion for another thread, though.

Chris D · Feb 22, 2022

PaulB said:
It's reasonable for us to assume that a significant schema change like the one proposed in this thread could break things in the future, given that it's entirely unsupported.

It isn't. It's unreasonable. It's a change to a column type. Nowhere near the same thing. It is supported. I'm telling you, that's the workaround and if a change is going to be made, that's the change we will make (via the schema manager, so if the change has already been done, no changes will be made).

PaulB · Feb 22, 2022

If XenForo introduces a new table that has an ip_id column, the change proposed here will break. I'm not sure which aspect of that is confusing or unreasonable. It's going to work fine with test data and break in production.

Chris D · Feb 22, 2022

That wasn't your original concern:

PaulB said:
It's not too hard to change all of these to BIGINT UNSIGNED on our end, but I'm concerned XF may try to override that during future updates.

Nor was it your concern here:

PaulB said:
I'll leave that as an exercise to the reader, as ensuring XF doesn't overwrite your changes during upgrades is non-trivial.

You then attempted to approximate it to something entirely different, and now you're trying to claim the issue is something else entirely; e.g. adding new tables.

My one and only point is it is ludicrous to imply that we'd arbitrarily change or reset custom schema changes like you've implied twice now.

jeetkune · Feb 22, 2022

mattrogowski said:
If you set up a MySQL cluster for replication, the auto_increment values generated by MySQL will go up by more than 1 at a time (see auto_increment_increment) - the default (we've been told) is to increment them by 10. This means you use up IDs 10x quicker (so you'd run out of IDs at 420 million rows, not 4.2 billion).

If you have a particularly a) large, b) active, and c) long running forum, combined with the auto_increment values going up by 10 instead of 1, you will eventually (after many, many years) exhaust the IDs available for this table. Unlike the xf_thread_view table, xf_thread_read is never truncated, it just has stale data deleted after the thread read cut-off period - so over time, the IDs keep growing and growing (quicker than usual when they're going up 10 at a time). When you end up reaching ID 4294967295 (which it did on a forum at 2am of all times), you will get an "out of range" error for that table. Our xf_thread_read table currently has ~5m rows in it - as that's a 30 day rolling total, that can be extrapolated out for the number of rows created over the course of a year, and with the IDs going up 10 at a time, it's about 7 years until 420m rows are created and the IDs are exhausted.

The solution would be to change thread_read_id to be a BIGINT. This will probably only ever affect 0.1% of forums, but considering some forums maybe have been running XF for over a decade now and could well be on a MySQL cluster after all that time, there may be some boards unknowingly creeping close to this issue. Updating the column type should stop this ever being an issue.

why don't you try to keep your mariadb version up to date?

PaulB · Feb 23, 2022

Chris D said:
That wasn't your original concern:

From my perspective, it’s the same concern. The steps our team needs to take are the same regardless of whether we’re anticipating a reversion or a new table. The test employed is going to be the same either way. My apologies if that wasn’t clear.

I think there’s a bit of a disconnect between how some forums operate and how the XenForo team expects them to operate, and that causes a great deal of frustration on both sides. I’m not pulling these hypothetical scenarios out of thin air; they’re coming from a decade of experience trying to compensate for unexpected behavior during XenForo upgrades. I don’t usually complain about these caveats—you see a small fraction of the issues we encounter—but I do try to point them out to other large forums who are likely to encounter similar problems. I’d be happy to discuss these issues with you in more detail if you’d like, but I don’t think they’re applicable to the majority of your customers, and our team understands that.

mattrogowski · Feb 23, 2022

jeetkune said:
why don't you try to keep your mariadb version up to date?

Not sure I follow, for one it’s not my server, but not sure what the version would have to do with an increased auto-increment step meaning an INT column gets exhausted quicker than usual.

Xon · Feb 23, 2022

jeetkune said:
why don't you try to keep your mariadb version up to date?

Updating to the same patch level is fine, but a major version updates (ie 10.x => 10.y) generally has all sort of optimizer bugs because they have new defaults which don't make sense (for XenForo).

Neither would have migrated this issue of the INT column running out of space.

eva2000 · Feb 23, 2022

Chris D said:
It isn't. It's unreasonable. It's a change to a column type. Nowhere near the same thing. It is supported. I'm telling you, that's the workaround and if a change is going to be made, that's the change we will make (via the schema manager, so if the change has already been done, no changes will be made).

Can't the Xenforo upgrade routine add an optional check/flag for all these outlined impacting auto increment id fields to see if they are indeed running close to the current column type limits and run/act and make changes (if needed) accordingly?

Chris D · Feb 23, 2022

Certainly one option I’ve been considering.

PaulB · Feb 23, 2022

Chris D said:
Certainly one option I’ve been considering.

It's an awesome idea, but unfortunately it would probably cause problems with third-party add-ons that introduce their own tables with columns such as ip_id. Migrating to BIGINT unconditionally, rather than waiting until AUTO_INCREMENT approaches the limit, would have a similar issue.

eva2000 · Feb 23, 2022

PaulB said:
It's an awesome idea, but unfortunately it would probably cause problems with third-party add-ons that introduce their own tables with columns such as ip_id. Migrating to BIGINT unconditionally, rather than waiting until AUTO_INCREMENT approaches the limit, would have a similar issue.

Maybe like admin file healthcheck, a database health check can be added to check for auto increment issue and any other potential database issues that may come up in future and have set as a cron run to have issues logged to server error log or to a new database error log?

xf_thread_read auto_increment out of range

Well-known member

Well-known member

Well-known member

Well-known member

XenForo developer

Well-known member

XenForo developer

Well-known member

Well-known member

XenForo developer

Well-known member

XenForo developer

Member

Well-known member

Well-known member

Well-known member

Well-known member

XenForo developer

Well-known member

Well-known member

Similar threads

We value your privacy