Database server performance

Sim

Well-known member
I'm looking at whether further optimisations of my database server are required.

I host 8 XenForo sites (9 actually if you include the staging server) plus about 20 WordPress sites and a few other PHP-based sites (HelpDesk and various other tools/utilities).

I have multiple web servers running nginx connecting to a standalone MySQL server.

These are all running on Linode VPS servers of various size. The MySQL server has 8GB of RAM and I run mysqltuner.pl to optimise configuration.

I'm generally happy with the performance of the database server, but I'm wondering if there are further optimisations I could apply.

The key approach I've used to this point for database server performance has been to allocate as much RAM as I can to InnoDB buffers, while still maintaining server stability. On this server that means 4.5G of RAM has been allocated to InnoDB buffers (4.8G total buffers + 200 threads @ 1.1M per thread = 5G total allocated to MySQL).
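
(In my.cnf terms, that allocation works out to roughly the following - the values below are rounded approximations of the figures above, shown just for illustration:)
Code:
# approximate translation of the allocation described above (values rounded)
innodb_buffer_pool_size = 4608M
max_connections         = 200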

I tried to go higher, but the server became unstable.

However, the total size of my databases has grown to the point where I'm no longer able to retain all of the data in memory - although I'm not sure whether that's actually having a significant impact, given there are large amounts of data which simply never get read and so don't really need to be held in the buffers.
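
One rough way to check whether that matters is to compare the buffer pool's logical reads against the reads that actually had to hit disk:
Code:
# logical reads vs reads that missed the buffer pool and went to disk
mysql -e "SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'"
# Innodb_buffer_pool_read_requests = logical reads
# Innodb_buffer_pool_reads         = reads served from disk

If Innodb_buffer_pool_reads stays a tiny fraction of Innodb_buffer_pool_read_requests, the cold data falling out of memory probably isn't costing much.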

My main challenge is that the next price point up for a Linode VPS is 2x the price for only 1.5x the amount of RAM (12GB). I already have a 2nd separate 8GB MySQL server running for my largest site (although that site is fairly unique in that it has been archived - no posting allowed, so database traffic is 99.9%+ read-only).

I was considering moving to a "high memory" instance @ 16GB RAM, but it only has 1 CPU core. I don't seem to see much in the way of CPU usage - my 8GB Linode has 4 cores, but rarely gets above 30% CPU usage anyway. I'm unsure if moving to a single core machine with more RAM will help or hinder performance.

The other factor I was considering was moving to a read/write setup with two database servers, one for reads and the other for writes - once XF2 is available. Again, I'm not sure if this is going to significantly impact performance. Would this also offer redundancy (can the read slave be automatically promoted to master in a XenForo setup if the master went offline)? Right now I need to take the database server (and hence the websites) offline for a full system backup occasionally to ensure data integrity - I'd like to be able to do backups without taking sites offline.
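
(For the InnoDB tables at least, as I understand it a single-transaction dump should already give a consistent copy without taking anything offline - something along these lines, with the paths just as placeholders:)
Code:
# consistent online dump of InnoDB tables; destination path is a placeholder
mysqldump --single-transaction --routines --triggers --all-databases \
  | gzip > /backup/all-databases-$(date +%F).sql.gz

And if a read slave were in place, running the dump against the slave would keep even that load off the master.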

Given the price/performance ratio, I'm thinking it may be more cost-effective to deploy multiple 8GB database servers rather than trying to consolidate on one larger server? But then I won't get the benefit of the read/write setup because it becomes too expensive to have 4 or 6 servers.

I'm also considering splitting the servers based on workload - XenForo databases on one server and WordPress databases on another, so each can be optimised more specifically for the workload they experience. Again, I'm unsure if this will have a significant impact.

These are the kind of response times I'm currently seeing for forum home page and new thread searches:

[attached screenshots showing query response times for the forum home page and new thread searches]

... given how little these contribute to the overall page load time (350ms on average for Google crawler to download the HTML), I'm not sure if any more optimisations would be worth it at this point?

I'm not sure I actually have a problem, but either way, I would appreciate some guidance on possible growth strategies from here as my sites get busier and database sizes get larger:
  1. split busy sites onto multiple database servers first before worrying about scaling individual servers? (possibly cheaper overall, but much more work to maintain)
  2. work on economy of scale and go for a bigger server (more RAM) - scale up (more expensive, but less work)
  3. scale out with separate read/write servers (but what size?) (is the benefit worth the cost?)
  4. would love some redundancy so I can back up the server without taking sites offline - but how to structure that? (again, is the benefit worth the cost?)
  5. some combination of the above?
My XenForo sites are my prime source of business income, so availability is starting to become increasingly important.

The cost of having the database server go down and require rebuilding, or of losing up to 24 hours' worth of data, isn't yet high enough to outweigh the cost of maintaining a highly available setup (although there are reputational costs as well, which are difficult to quantify).

But either way, I can foresee a time in the not-too-distant future where the equation does start to tip towards the cost being justified, so I'd like to start thinking about how to architect things in preparation for that time.
 
I run a ~70GB database (with a 20GB compressed xf_post table!) on an 8GB Linode VM with a 5GB innodb_buffer_pool_size. It has a master/slave setup and I've got a custom DB adaptor which shards pure read loads across the master and the slave. But the entire setup works fine with everything hitting the master.

One critical change you need to make, which most tuners do not, is to ensure MySQL is configured to use all the IOPS your storage can handle. For an 8GB Linode I use:
Code:
innodb_read_io_threads=32
innodb_write_io_threads=32
innodb_io_capacity=2000
innodb_io_capacity_max=3000
innodb_flush_neighbors=0

innodb_flush_neighbors picks a flushing strategy that is better for SSDs, while the rest dramatically increase the number of allowed in-flight IO requests hitting the disk layer.

The defaults for these are:
Code:
innodb_read_io_threads=4
innodb_write_io_threads=4
innodb_io_capacity=200
innodb_io_capacity_max=200
innodb_flush_neighbors=1

These are incredibly low, and are defaults for what is basically a single 7200 rpm SATA disk which physically cannot handle many simultaneous requests. Whereas with SSDs you need a large number of simultaneous requests to get the full throughput of the drive.
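
If you want to see what a given volume can actually sustain before raising innodb_io_capacity, something like fio gives a reasonable baseline (fio needs to be installed; the test file, size and depth below are just example values):
Code:
# random-read IOPS test using 16k blocks (roughly InnoDB page sized)
fio --name=randread --filename=/tmp/fio-test --size=1G --direct=1 \
    --rw=randread --bs=16k --ioengine=libaio --iodepth=32 \
    --runtime=60 --time_based --group_reporting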
 
So you disabled flush neighbors for Linode which uses SSDs?
Yeah. innodb_flush_neighbors works on the assumption that bytes near one logical address are physically near that address. For SSDs that isn't the case, so flushing that extra data isn't a "free win" like it is with a hard drive.
 
In my experience, it's always cheaper to scale horizontally than vertically. In the longer run, the cost of scaling vertically outweighs the cost of scaling horizontally.
I would recommend sharding the databases, but since you mentioned that you are running a lot of websites from a single database server, the easiest big win would be to pull out the site with the highest traffic from the pooled server to a separate server.

You can easily incorporate redundancy by setting up RAID-1 drives on your server. This would allow you to take backups without taking your server down.
 
In my experience, it's always cheaper to scale horizontally than vertically. In the longer run, the cost of scaling vertically outweighs the cost of scaling horizontally.

When you can get double the capacity for a VPS for exactly double the price, I think it's pretty easy to justify scaling vertically. Horizontal scaling comes with a huge management overhead. Of course, there is an assumption that double the capacity will equal double the performance - which isn't necessarily true, but if you're able to keep your entire dataset in memory, then I think that would be close. Not sure that will continue to scale though - and prices aren't always linear (e.g. the move from 8GB to 12GB Linode is 2x the cost for only 1.5x the capacity).

the easiest big win would be to pull out the site with the highest traffic from the pooled server to a separate server.

Yes, I'm kind of thinking the same thing myself and have already done that with my largest site.

Although there is still an economy of scale to be had when you consider that a new 8GB server only gets you around 5GB or so of buffer capacity (assuming you're trying to keep the entire dataset in memory for maximum performance, which isn't always necessary), compared to a 16GB server which will probably give you 11 or 12GB (or more) of buffer capacity.

You can easily incorporate redundancy by setting up RAID-1 drives on your server. This would allow you to take backups without taking your server down.

How does RAID-1 allow you to take backups without taking your server down? Do you mean the fact that you have two copies of the data on mirrored drives? That's not a backup ... redundancy != backups and doesn't help with data corruption. Or am I missing something?
 
I run a ~70GB database (with a 20GB compressed xf_post table!) on an 8GB Linode VM with a 5GB innodb_buffer_pool_size. It has a master/slave setup and I've got a custom DB adaptor which shards pure read loads across the master and the slave. But the entire setup works fine with everything hitting the master.

What kind of page generation times are you getting from that setup in XF debug mode?

I'd imagine that with a compressed table of that size, your CPU would be pretty heavily utilised?

Given that I'm so far able to keep most of my data in memory, I'm not experiencing any kind of IO bottlenecks, so I doubt I'll see any real change after implementing your suggestions - but reading the docs, they all make sense - thanks for pointing them out!
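
(Something like iostat should confirm that either way - if await and %util stay low at peak, the disks aren't the constraint. iostat is part of the sysstat package:)
Code:
# extended per-device stats every 5 seconds; watch await and %util during peak
iostat -xm 5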

Good to know that there's plenty of performance capacity available in the servers I'm already running - if anything, I'm vastly over-provisioned.
 
What kind of page generation times are you getting from that setup in XF debug mode?

Index:
Page Time: 0.0984s
Memory: 4.6739 MB (Peak: 5.8844 MB)
Queries (5, time: 0.0398s, 40.4%)

New Posts:
Page Time: 0.0714s
Memory: 5.1443 MB (Peak: 6.2859 MB)
Queries (5, time: 0.0204s, 28.5%)

With XF reporting: Total: 2,920 (members: 1,517, guests: 1,389, robots: 14)

This isn't peak time, but during peak the timing can double if waiting for a php worker slot under load.

This is obviously a heavily non-standard copy of XenForo, but all my performance-enhancing add-ons are public. Only 2 direct code edits:
Code:
sed -i "s|'url' => \$requestPaths\['fullUri'\],|'host' => gethostname(),'url' => \$requestPaths\['fullUri'\],|" /var/www/sites/forums.spacebattles.com/html/library/XenForo/Error.php
sed -i "s|throw \$this->responseException(\$this->responseMessage(\$options->boardInactiveMessage), 503);|throw \$this->responseException(\$this->responseMessage(\$options->boardInactiveMessage), 200);|" /var/www/sites/forums.spacebattles.com/html/library/XenForo/ControllerPublic/Abstract.php

I run multiple web front ends, so having the hostname in the XF error log is critical. And the 2nd one stops Linode's NodeBalancer from freaking out when the board is in maintenance mode.

I'd imagine that with a compressed table of that size, your CPU would be pretty heavily utilised?
Peaking at ~30% CPU atm, absolute peaks are maybe 40-50%?
 
So you disabled flush neighbors for Linode which uses SSDs?
@Brent W you don't have to worry about this with Centmin Mod 123.09beta01 LEMP stack installs - at install time, MariaDB 10.1.x is auto-tuned for InnoDB disk I/O: it detects whether SSDs are in use, sets flush neighbors appropriately, and sets the InnoDB I/O related options based on values measured at install time. Basically, the Centmin Mod installer measures your disk I/O performance and tunes your MariaDB InnoDB disk I/O options for your specific server hardware :)
 
@Sim sometimes the simplest solution might be the answer for some of your site setup(s). So none of your sites can handle using the web server's local MySQL server to serve their data? Seems very underutilised on the web server end if you have some sites which would run fine with a local MySQL server running on the web servers instead of hogging resources on the dedicated remote MySQL server :)
 
Seems very underutilised on the web server end if you have some sites which would run fine with a local MySQL server running on the web servers instead of hogging resources on the dedicated remote MySQL server

Most of my WordPress sites generate relatively little traffic and also have quite small databases - I wouldn't exactly call that hogging resources.

That being said - I was already looking at the possibility of splitting the database server by load type, my high volume sites are all XenForo - so it would make sense to optimise for that type of traffic on my main database server and move off the other sites to at least simplify things.

The database requirements for the other sites are probably low enough that there wouldn't be a significant impact on server utilisation from moving to an all-in-one server for those sites, and I don't especially care about availability for most of them either (i.e. not important enough to warrant the time/cost).

Actually now that I think about it - my smaller XenForo sites fit the same criteria - relatively low load and smaller datasets, with availability less of a priority - no reason I couldn't do the same on the server that runs them - move to an all-in-one. Looking at resource utilisation on that server and the size of the databases, it would hardly have any impact on server load at all.

Ironically, availability also improves in this scenario because even if I need to take my main database server offline, the other sites remain unaffected because they all have their own local database server.
 
Index:
Page Time: 0.0984s
Memory: 4.6739 MB (Peak: 5.8844 MB)
Queries (5, time: 0.0398s, 40.4%)

New Posts:
Page Time: 0.0714s
Memory: 5.1443 MB (Peak: 6.2859 MB)
Queries (5, time: 0.0204s, 28.5%)

What is the mechanism which gets your query count so low? Is that just caching?
 
Forum Index
Page Time: 0.0573s
Memory: 3.9599 MB (Peak: 4.2389 MB)
Queries (15, time: 0.0073s, 12.7%)

Thread
Page Time: 0.0519s
Memory: 3.9745 MB (Peak: 4.1951 MB)
Queries (21, time: 0.0095s, 18.3%)

New Posts Results Page
Page Time: 0.0281s
Memory: 3.6023 MB (Peak: 3.7971 MB)
Queries (7, time: 0.0034s, 12.1%)

Though I must admit that we have quite powerful machines running MariaDB :)
 
Queries (15, time: 0.0073s, 12.7%)

That is pretty impressive. Is it all just raw server power (specs?) or have you made other optimisations?

That being said, shaving < 100ms off your database response time (i.e. going from around 80ms to just under 10ms) is probably not going to achieve as much of an overall performance boost (from an end-user perspective) as optimising the HTML page load times would - given they are typically measured in seconds, not milliseconds!
 
The DB server isn't actually optimised that well as I'm not a MySQL tuning expert like eva2000, but we've done a few things to do caching on the PHP side.

HW is a Dell PE730xd with 2x Xeon E5-2630 v3 @ 2.4GHz, 128 GB RAM and 2x DC S3610 SSDs, handling about 500 GB of databases for our ~70 forums.
 
Forum List
Timing: 0.0430 seconds Memory: 4.606 MB DB Queries: 17
Timing: 0.0475 seconds Memory: 5.371 MB DB Queries: 14
Timing: 0.0436 seconds Memory: 5.371 MB DB Queries: 14
Timing: 0.0492 seconds Memory: 5.371 MB DB Queries: 14
Timing: 0.0472 seconds Memory: 5.371 MB DB Queries: 14

Thread View
Timing: 0.0566 seconds Memory: 5.422 MB DB Queries: 20
Timing: 0.0590 seconds Memory: 5.422 MB DB Queries: 20
Timing: 0.0577 seconds Memory: 5.422 MB DB Queries: 20
Timing: 0.0573 seconds Memory: 5.422 MB DB Queries: 20
Timing: 0.0571 seconds Memory: 5.422 MB DB Queries: 20

34 add-ons & 10+ template-mod-only add-ons
5.6M posts

Server Specs:
Intel Xeon E3-1245v5 - 4c/8t
64GB DDR4 ECC 2133 MHz
SoftRAID 2x 450GB NVMe SSD
 