Does anyone run their website on a 20-HDD server?

agison

Well-known member
I have my database installed on an 8-HDD (2.5", 15k SAS) server. My website runs well most of the time, but sometimes when I perform a hard delete or another heavy operation on my post table (16 million records now), the site becomes laggy.

Because XF doesn't support MySQL replication, I'm thinking about upgrading my server or moving to MySQL Cluster. I assume a server with 8 SSDs or 20 SAS drives would be much better.

What do you think?
 
Some of my servers have 8 SAS 15k drives too (same setup as yours).

Instead of doubling to 16 SAS drives, I would go with an SSD array and plenty of RAM. But make sure that I/O is really the root of your problem.
 
But make sure that I/O is really the root of your problem.

I was just going to say the same thing. Adding more disks to the array won't necessarily clear the bottleneck, and I would expect a 15k array to be pretty responsive.

SSH into your server and keep "top" running while you delete - do you see a high percentage for the "wa" value? (I/O wait state)
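For example, run something like this in a second SSH session while the delete executes (standard Linux tools; adjust the interval as needed):

Code:
# Watch per-device I/O while the hard delete runs; %iowait and %util
# climbing into double digits point at the disks as the bottleneck.
iostat -x 5

# Alternative view: the "wa" column is I/O wait, "b" is blocked processes.
vmstat 5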
 
Just out of curiosity what does your my.cnf look like?

And presumably you run everything on the one server - you don't have separate machines for web server / SQL / mail server?
 
I have a load-balancing setup for the web tier and another server for mail; this server is used only for the database.
 
Here is my my.cnf:

Code:
[mysqld]
local-infile=0
 
skip-name-resolve
back_log = 128
max_connections = 50000
key_buffer_size = 128M
myisam_sort_buffer_size = 256M
myisam_max_sort_file_size = 2048M
join_buffer_size = 512M
read_buffer_size = 8M
sort_buffer_size = 64M
table_definition_cache = 4096
table_open_cache = 4096
 
thread_cache_size = 512
wait_timeout = 45
connect_timeout = 15
tmp_table_size = 2048M
max_heap_table_size = 512M
max_allowed_packet = 256M
max_seeks_for_key = 1000
group_concat_max_len = 1024
max_length_for_sort_data = 1024
net_buffer_length = 16384
max_connect_errors = 100000
#concurrent_insert = 2
read_rnd_buffer_size = 256K
bulk_insert_buffer_size = 8M
query_cache_limit = 2M
query_cache_size = 128M
query_cache_type = 1
query_cache_min_res_unit = 2K
query_prealloc_size = 262144
query_alloc_block_size = 65536
transaction_alloc_block_size = 8192
transaction_prealloc_size = 4096
 
# innodb settings
innodb_read_ahead = none
innodb_file_per_table = 1
innodb_open_files = 5000
innodb_buffer_pool_size = 32G
innodb_additional_mem_pool_size = 1G
innodb_thread_concurrency = 0
innodb_lock_wait_timeout=30
innodb_flush_method = O_DIRECT
 
Would something like Fusion-io help here? That's what Facebook uses to speed up delivery of DB data. There are some used ones around for a reasonable price, although new ones are expensive!

I'm not sure how big they have to be, but they are effectively very fast flash storage on a PCIe card - much faster than ordinary SSDs, though not as fast as RAM.

http://www.fusionio.com/
 
Here is the iostat output (iostat -x):

Code:
avg-cpu:  %user  %nice %system %iowait  %steal  %idle
          10.03    0.00    1.51    0.77    0.00  87.69
 
Device:        rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda              1.54  110.12  15.89  85.56  384.31  4043.14    43.64    0.04    6.22  2.09  21.22
dm-0              0.00    0.00  12.96  50.80  283.01  406.40    10.81    0.15    2.26  0.39  2.50
dm-1              0.00    0.00    2.05    1.82    16.40    14.52    8.00    0.10  24.59  0.57  0.22
dm-2              0.00    0.00    2.36  132.44    77.92  3537.02    26.82    0.03    0.24  1.46  19.63
dm-3              0.00    0.00    0.06  10.65    6.97    85.20    8.61    0.55  51.07  0.03  0.03
 
avg-cpu:  %user  %nice %system %iowait  %steal  %idle
          6.95    0.00    2.35    1.76    0.00  88.94
 
Device:        rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda              0.00    9.00    1.00  149.00    8.00 24344.00  162.35    3.72  30.34  3.01  45.10
dm-0              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-1              0.00    0.00    1.00    0.00    8.00    0.00    8.00    0.01    5.00  5.00  0.50
dm-2              0.00    0.00    0.00  60.00    0.00  640.00    10.67    92.42 2011.87  7.42  44.50
dm-3              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
 
avg-cpu:  %user  %nice %system %iowait  %steal  %idle
          9.51    0.00    2.52    1.22    0.00  86.76
 
Device:        rrqm/s  wrqm/s    r/s    w/s  rsec/s  wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda              0.00    4.00    0.00  58.00    0.00  544.00    9.38    0.54    9.33  6.02  34.90
dm-0              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-1              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
dm-2              0.00    0.00    0.00  62.00    0.00  544.00    8.77    0.56    9.00  5.63  34.90
dm-3              0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  0.00  0.00
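If those dm-* devices are LVM volumes, you can map them back to named volumes to see which filesystem the busy dm-2 actually is (a rough sketch, assuming the usual device-mapper tools are installed):

Code:
# Device tree with mount points (needs a reasonably recent util-linux)
lsblk

# The symlinks show which named volume each dm-N device belongs to
ls -l /dev/mapper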
 
Well, %util is basically device saturation when it reaches 100... the fact that you are seeing 45% on a random sample makes me think your drives are indeed overworked, and the 2011 ms await on one of the devices means I/O requests were waiting more than 2 seconds on average to be served.

Also... seeing sda do 24,000+ sector writes per second makes me cringe a bit...

You may just need more memory for MySQL... looks to me like it's doing some "disk thrashing" and using it as swap memory or something. How big are all databases being served by MySQL (including data and indexes)? And is the 48GB RAM allocated JUST for MySQL (not also a web server or anything)?
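A quick way to confirm whether the box is actually swapping during a delete, and how many MySQL connections you really peak at (worth checking because join_buffer_size, sort_buffer_size and read_buffer_size in your my.cnf are allocated per connection, so max_connections = 50000 makes the theoretical worst case enormous):

Code:
# Non-zero "si"/"so" columns mean the kernel is actively swapping
vmstat 5

# How much swap is in use overall
free -m

# Actual connection high-water mark since the server started
mysql -e "SHOW GLOBAL STATUS LIKE 'Max_used_connections';"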
 

My database is 30GB. Only ElasticSearch is installed alongside MySQL on this server - nothing else.
 
You will need more memory if you want to keep your server from hitting disk swap... ElasticSearch itself is going to want 15-20GB with a database your size (at least it should... if you haven't given it that much memory, the disk swapping might be coming from ElasticSearch).
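You can see what ElasticSearch is actually allowed to use from its Java command line; how you raise it depends on your ES version (older builds commonly take an ES_HEAP_SIZE environment variable):

Code:
# Look for the -Xms / -Xmx flags on the Java process
ps aux | grep -i elasticsearch

# On older ElasticSearch builds, set before starting the service, e.g.:
# export ES_HEAP_SIZE=16g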

Also, depending on how far you want to go without upgrading your server memory, you might want to convert your tables from InnoDB to MyISAM. They use less memory and are typically faster, as long as you don't run into table-locking issues. We don't use ANY InnoDB tables on our DB server and it hums along just fine, doing thousands of SQL queries per second 24/7 with almost no load. I just checked: the DB server load is 0.07 right now (less than 1% utilization) while processing about 2,500 queries per second in the window I sampled.
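If you do decide to test it, the conversion itself is one statement per table - xf_post and your_database below are just example names, and on a 30GB database each ALTER locks the table and can run for a long time:

Code:
# Convert one table (repeat per table)
mysql your_database -e "ALTER TABLE xf_post ENGINE=MyISAM;"

# Verify which engine each table is using now
mysql your_database -e "SHOW TABLE STATUS;"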

But again... that may or may not fix the issue, and it's quite a bit of downtime to convert 30GB of databases to MyISAM just to see...
 

I should move ElasticSearch to another server first, then think about adding more memory.
Thank you.
 
Yeah, that's what I would do... or at the very least disable/shut down ElasticSearch for a bit to see how your server reacts to it not running on your DB server.
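Something like this, assuming ElasticSearch was installed as a system service (adjust to however you start it):

Code:
# Stop ElasticSearch temporarily...
sudo service elasticsearch stop

# ...then watch whether %util and await on the busy device drop
iostat -x 5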
 