XF 1.3 Database max connections reached

Hi guys,
We've been having a problem where our database suddenly gets slammed and reaches it's max connection limit in the span of 5 minutes. It goes from ~20 connections to ~2500 in minutes. This could be the result of several things (like DDOS) but we are trying to investigate from a couple of different angles.

Is there anything Xenforo-wise that would possibly slam the database like this in a short amount of time? We have a ton of users and large tables so I was wondering if cron jobs or the new deferred system would possibly try running a bunch of queries at once (we just upgraded and are now seeing these issues).

Thanks in advance!
Caleb
 
If you're running a very busy system, you may see benefits from taking out the deferred.php user trigger (it's a class on the <html> tag) and just calling it once per minute or so via a real cron task. That said, I don't think that would explicitly trigger it.

Are the connections doing/waiting on anything in the process list?
 
If you're running a very busy system, you may see benefits from taking out the deferred.php user trigger (it's a class on the <html> tag) and just calling it once per minute or so via a real cron task. That said, I don't think that would explicitly trigger it.

Are the connections doing/waiting on anything in the process list?
Hm. Just saw the error 'The table 'xf_thread_view' is full' in the database logs. That doesn't seem good.
 
That's a memory table by default so it can theoretically fill up, but it's normally cleared every hour and it only contains a single 4 byte column. That makes me think that the deferred tasks aren't actually running.

Alternatively, that could point to a potential DDoS if it is getting filled up "legitimately".
 
That's a memory table by default so it can theoretically fill up, but it's normally cleared every hour and it only contains a single 4 byte column. That makes me think that the deferred tasks aren't actually running.

Alternatively, that could point to a potential DDoS if it is getting filled up "legitimately".
Memory table? And is there anyway to verify deferred tasks are running via some log?
 
Memory table: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html It doesn't have to be MEMORY, but it's done this way to avoid hitting the disk on every thread view (same as xf_session_activity). There is a default limit of 16M.

In the current code, the act of checking whether deferred tasks need to be run would probably end up causing them to run (as they are run in the ACP and that doesn't have the customizations that normally block them on the front end). As such, you may need to look at the next run time for cron entries directly in the database; if the times for a number of them are in the past, then the deferred system likely isn't being triggered correctly.
 
Memory table: http://dev.mysql.com/doc/refman/5.0/en/memory-storage-engine.html It doesn't have to be MEMORY, but it's done this way to avoid hitting the disk on every thread view (same as xf_session_activity). There is a default limit of 16M.

In the current code, the act of checking whether deferred tasks need to be run would probably end up causing them to run (as they are run in the ACP and that doesn't have the customizations that normally block them on the front end). As such, you may need to look at the next run time for cron entries directly in the database; if the times for a number of them are in the past, then the deferred system likely isn't being triggered correctly.

Ah thanks for the info (sorry not a mysql expert). So that makes a lot of sense. I'll check the cron jobs in the database to see if there is an issue there.

I work at IGN and we have a very large forum so xf_thread_view filling up with default MySql values is a distinct possibility. From reading the documentation you posted, memory tables will never be converted to disk tables. So errors will be logged but it shouldn't slow down the database a ton as we aren't going to be hitting disk with each thread view.
 
So errors will be logged but it shouldn't slow down the database a ton as we aren't going to be hitting disk with each thread view.
True, sort of. If the query triggers an error immediately (which it may not if it's INSERT DELAYED), it may block the thread view and cause XF to log it (to disk). If it isn't blocking the thread view and XF isn't logging it in the error log in the control panel, then it shouldn't interfere with XF itself.
 
True, sort of. If the query triggers an error immediately (which it may not if it's INSERT DELAYED), it may block the thread view and cause XF to log it (to disk). If it isn't blocking the thread view and XF isn't logging it in the error log in the control panel, then it shouldn't interfere with XF itself.

Yeah I'm thinking you're right about the logging part. I'm confirming it now but it looks like it was logging each time it tried and the table is full.

Our current max_heap_table_size is 33MB. Which is larger than the default but may not be enough for how many users we have? I'm still not 100% a cron job didn't fail but I'm hoping bumping the size of the table up will potentially solve this problem. Running a query to determine the size of the xf_thread_view table keeps giving 7.99 MB. So I'm not sure why the xf_thread_view table isn't 33MB like it should be according to the documentation.
 
Db just crashed again. This query was shown as being run a ton of times via different connections:

PHP:
INSERT INTO xf_session_activity
                    (user_id, unique_key, ip, controller_name, controller_action, view_state, params, view_date, robot_key)
                VALUES
                    ('0', '\naBÃ', '\naBÃ', 'Tapatalk_ControllerPublic_Tapatalk', 'sign_in', 'valid', 'useragent=tapatalk', '1411601156', '')
                ON DUPLICATE KEY UPDATE
                    ip = VALUES(ip),
                    controller_name = VALUES(controller_name),
                    controller_action = VALUES(controller_action),
                    view_state = VALUES(view_state),
                    params = VALUES(params),
                    view_date = VALUES(view_date),
                    robot_key = VALUES(robot_key)
 
Db just crashed again. This query was shown as being run a ton of times via different connections:

PHP:
INSERT INTO xf_session_activity
                    (user_id, unique_key, ip, controller_name, controller_action, view_state, params, view_date, robot_key)
                VALUES
                    ('0', '\naBÃ', '\naBÃ', 'Tapatalk_ControllerPublic_Tapatalk', 'sign_in', 'valid', 'useragent=tapatalk', '1411601156', '')
I think I remember @MattW making a post over on the Tapatalk site inquiring why some IP's from them were slamming the crud out of his server. I haven't checked back lately over there to see if they ever responded. Had to do with scraping content through the app for their search interface for Tapatalk.

EDIT:
Here's the thread: https://support.tapatalk.com/threads/tapatalk-bot.26316/#post-140069.
 
I think I remember @MattW making a post over on the Tapatalk site inquiring why some IP's from them were slamming the crud out of his server. I haven't checked back lately over there to see if they ever responded. Had to do with scraping content through the app for their search interface for Tapatalk.

EDIT:
Here's the thread: https://support.tapatalk.com/threads/tapatalk-bot.26316/#post-140069.

Yeah we also had a bunch of normal xf_session_activity inserts as well. It wasn't just Tapatalk.
 
Yeah we also had a bunch of normal xf_session_activity inserts as well. It wasn't just Tapatalk.

Your going to have a session activity record for every user, guest and robot performing an action on the site within a specified time. About 1.5 hours on our site is the oldest record I've seen. That's pretty normal and maybe trouble for a busy site. Its the thread views I'm finding interesting.
 
What's the status of those queries? Are they waiting for a table lock?

I don't think you guys are running 1.4 so this is less significant (but still valid), but you can run this query to change one of the index types to be more usable: https://xenforo.com/community/threa...ading-database-performance.82725/#post-823355

Alternatively, flipping the table to InnoDB would allow row-level locks though at the expense of needing to hit the disk. (This row is updated on each view.)
 
have you run "show processlist" on mysql directly to see what the queries are? You could run show full processlist, but that could be very large.

What about the performance of the DB, has it been tuned at all? Are you hitting a I/O block casuing the queries to bank up?

Is there any other apps that use the database instance that might be hogging connections?
 
What's the status of those queries? Are they waiting for a table lock?

I don't think you guys are running 1.4 so this is less significant (but still valid), but you can run this query to change one of the index types to be more usable: https://xenforo.com/community/threa...ading-database-performance.82725/#post-823355

Alternatively, flipping the table to InnoDB would allow row-level locks though at the expense of needing to hit the disk. (This row is updated on each view.)

MySql Workbench showed the status of a few queries we sampled as "update" so I don't think they were waiting on a lock but on an available connection to the db (since the initial issue was that the # of connections spiked to the max available for the db). We also noticed that it doesn't look like the xf_session_activity table doesn't appear to be a memory table. It has engine=InnoDb in it's create statement. We believe that it was changed by the previous engineers that worked on this. Possibly because we have too much traffic to do an in-memory table for xf_session_activity?

have you run "show processlist" on mysql directly to see what the queries are? You could run show full processlist, but that could be very large.

What about the performance of the DB, has it been tuned at all? Are you hitting a I/O block casuing the queries to bank up?

Is there any other apps that use the database instance that might be hogging connections?

There shouldn't be any other apps hitting the db as far as I know. The database doesn't appear to be hitting any IO/CPU/memory limits. Mainly just the max number of connections. But they spike so quickly that it seems like it can't be the result of normal traffic increase.
 
We also noticed that it doesn't look like the xf_session_activity table doesn't appear to be a memory table. It has engine=InnoDb in it's create statement. We believe that it was changed by the previous engineers that worked on this. Possibly because we have too much traffic to do an in-memory table for xf_session_activity?
Being an InnoDb table it will require writing to disk on a very frequent basis which could be the cuase. The xf_session_activity table isn't all that big and isn't really an issue if you lose it, I'd put it back to Memory, which of course doesn't write to disk. If your hardware hasn't enough capacity to to have it in memory then it is probably lacking in general.

Another sugestion is to use phpmyadmin, which has a good Advisory functionality - you click on status and then advisor. Use the latest version rather than a packaged version from your distro. Obviously make sure that access is tied down, preferably by ip address. We have a site that can be busy, eg 5000 page impressions a minute, and the database runs fine after I tuned it using the Advisor as a guide.
 
Top Bottom