XF 2.2 The table ‘xf_session_activity’ is full

bburton

Member
What might cause the xf_session_activity table to become repeatedly full, resulting in the contents of forum pages not loading and an error message displaying instead?

The 'hourly clean up' cron job is running, so that's not the reason; this is happening between those hourly clean-ups, about every 20 minutes or so. Running the 'hourly clean up' manually temporarily 'fixes' the problem, then 20 minutes or so later the xf_session_activity table fills up again and this error occurs repeatedly:

XF\Db\Exception: MySQL query error [1114]: The table 'xf_session_activity' is full
src/XF/Db/AbstractStatement.php:230


There is not an unusual number of logged-in registered users; nothing about that looks out of the ordinary.

Thanks,
 
I found several threads on this same issue, but no explanations.

Question: Can unregistered guest activity fill up this xf_session_activity table?
 
What might cause the xf_session_activity table to become repeatedly full, resulting in the contents of forum pages not loading and an error message displaying instead?
A server that is not properly configured (tuned) for the given application/traffic.

I would rather ask "how full is full?"
What's the actual MySQL configuration, and what resources are available?
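
To put a number on "how full", something like the following shows the table's current size next to the configured limit. This is just a sketch; the schema name xenforo is an assumption, so substitute your own database name:

SHOW VARIABLES LIKE 'max_heap_table_size';

SELECT ENGINE,
       TABLE_ROWS,
       ROUND((DATA_LENGTH + INDEX_LENGTH) / 1024 / 1024, 1) AS size_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'xenforo'
  AND TABLE_NAME = 'xf_session_activity';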

Question: Can unregistered guest activity fill up this xf_session_activity table?
That's exactly what this table contains.
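
A quick way to see how much of the table is guest traffic versus registered users (a rough sketch; in XenForo's schema, user_id is 0 for guests and robot_key is non-empty for crawlers it recognizes):

SELECT SUM(user_id = 0)     AS guest_rows,
       SUM(user_id > 0)     AS registered_rows,
       SUM(robot_key <> '') AS identified_robot_rows,
       COUNT(*)             AS total_rows
FROM xf_session_activity;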
 
XenForo has been running just fine for months (years) with no issues. Nothing has been done recently to change the configuration.

I thought the xf_session_activity table just kept track of logged-in registered users(?)
 
Here's an entry from the server error log:

Server error log
  • XF\Db\Exception: MySQL query error [1114]: The table 'xf_session_activity' is full
  • src/XF/Db/AbstractStatement.php:230
  • Generated by: Unknown account
  • Jan 9, 2025 at 11:08 PM

Stack trace

-- XFDB=noForceAllWrite
INSERT IGNORE INTO xf_session_activity (
ip,
controller_name,
controller_action,
view_state,
params,
view_date,
robot_key,
user_id,
unique_key
) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)

------------
 
I also ran into the same problem on a large installation with 20M posts. I increased the max_heap_table_size variable several times, but it did not help for long.
Then I looked at the access logs and realized that the traffic was AI bots from Alibaba Cloud and similar services reading content for training (I had ~40 requests per second). I simply blocked the Alibaba autonomous system and a couple of data centers like OVH in nginx, and it helped. Welcome to the age of AI technologies. :-)

After a while I found a topic:
 
Bots are going crazy. The bots are hitting our site twice from each IP address (first they get a 301 redirect, as they are going to a specific post, then they get redirected to the page for the respective thread). Then they use another IP address. Since 08:00 today, we have had 244,896 different IPs hit the forums. This is filling up the session table.

Then since the system can't display data, the bots eventually go away until next time.

If you let the session table have more space, all that will happen is the bots will keep coming until it is full, no matter how big.
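
To check for the same pattern on your own installation, a rough count of the distinct source addresses currently sitting in the session table looks like this (a sketch; note that XenForo stores the ip column as packed binary, so the raw values need conversion before they are human-readable):

SELECT COUNT(*)           AS total_rows,
       COUNT(DISTINCT ip) AS distinct_ips
FROM xf_session_activity;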
 
There's a session table that XenForo uses that is held in memory and gets too big.
We use the lines below in the MySQL configuration to increase the size of these in-memory tables, and it has been successful so far.

[Attached image: MySQL configuration settings]
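
For anyone who can't view the attachment, the general shape of that change is to raise MySQL's in-memory table limits; the values below are illustrative, not the exact ones in the screenshot, and the same settings belong in my.cnf so they survive a restart:

-- illustrative values, not the actual settings from the attachment
SET GLOBAL max_heap_table_size = 512 * 1024 * 1024;
SET GLOBAL tmp_table_size      = 512 * 1024 * 1024;
-- an existing MEMORY table only picks up the new limit when it is rebuilt, e.g.:
-- ALTER TABLE xf_session_activity ENGINE=MEMORY;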

We have a LOT of bots, but we ban the badly behaved ones with lots of fail2ban rules and let the rest suck up the data, as long as they follow the rules stated in robots.txt.

This cuts out a ton of bots, because most are AI scrapers. But as you can see, we still have a crapton.

[Attached image: bot activity statistics]
 
I set max_heap_table_size, as well as the other necessary options, to 2GB. The bots are so aggressive that they were still able to fill it up.

In the end, the only thing that helped was proxying through Cloudflare.
 
Thanks for the feedback.

The problem is these bots do not identify themselves as bots in the user-agent string; they pretend to be a normal user's browser. Along the same lines, they hit the site once and the next hit comes from a completely different IP. The intent is absolutely to evade being blocked.

I've even seen one bad actor that does all the scraping from a single IP. As soon as you block that IP, he respawns and distributes the scraping across his botnet, and you get the aforementioned behavior.
 
I set max_heap_table_size, as well as the other necessary options, to 2GB. The bots are so aggressive that they were still able to fill it up.

In the end, the only thing that helped was proxying through Cloudflare.

That was my conclusion also.

Do you know if Cloudflare can be set up to handle just a single subdomain, or do they have to manage the entire domain? I.e., I just want them to handle forum.site.com and not all of site.com.
 
I set max_heap_table_size, as well as the other necessary options, to 2GB. The bots are so aggressive that they were still able to fill it up.

In the end, the only thing that helped was proxying through Cloudflare.

Oh, you must not have any protection layer at your server level.
I wouldn't dare turn fail2ban off just to see what the traffic looks like.
I have spent a lot of time tuning fail2ban protection to perfection anyway.

I don't like Cloudflare because it produces a lot of false positives and inconveniences users by making them click a captcha. It also makes your website dependent on their infrastructure working. But I wouldn't say fail2ban is anywhere near as convenient to set up.

Does anyone know what these bad bots hope to achieve? We have a large site, but 95% of it is just senseless chit-chat of little value to the world.

Based on what I see in the logs, a lot of them are AI scrapers or they are looking for vulnerabilities to exploit.
The internet is a nasty place these days.
 
That was my conclusion also.

Do you know if Cloudflare can be set up to handle just a single subdomain, or do they have to manage the entire domain? I.e., I just want them to handle forum.site.com and not all of site.com.
Yes, you can set it up for just the subdomain by simply choosing not to proxy the main domain.

I don't like Cloudflare because it produces a lot of false positives and inconveniences users by making them click a captcha. It also makes your website dependent on their infrastructure working. But I wouldn't say fail2ban is anywhere near as convenient to set up.
So far the only problem I have had with Cloudflare is with API endpoints that other websites use to connect to your website. Those pretty much stop working.

I haven't really had any issues with false positives. My test with them hasn't run long enough to give a definite opinion, but it certainly does not produce "a lot" of false positives. Maybe some, but so far nobody has complained, and Google Analytics doesn't show any visible difference in traffic levels day to day. I have had one complaint that the site showed a timeout error, which was fixed by a refresh, so issues like that are probably more likely.

Users don't have to click a captcha; that only happens in "Under Attack" mode, which is not necessary to stop bots.
 
As a visitor I get false positives frequently on websites.
It's extremely evident that a site is running on Cloudflare, because you'll get a captcha on your first visit.

This only happens in "Under Attack" mode, i.e. the site operator has turned it on by hand?
 
As a visitor I get false positives frequently on websites.
It's extremely evident that a site is running on Cloudflare, because you'll get a captcha on your first visit.

This only happens in "Under Attack" mode, i.e. the site operator has turned it on by hand?
Yes, the site operator has turned it on.

But you can also get a captcha challenge if Cloudflare considers you a bot.

So you are right that it is hard to be 100% certain what is going on. But I don't think Cloudflare has to be a permanent thing. Once the bots are disrupted, I will most likely turn it off; I hope the bots don't realize Cloudflare has been turned off and come back the next day. I have been running it for too short a time to have a definite opinion.
 