Xenforo on Amazon EC2

Thank you again!
We did not have 200GB on the 5th. At that time we had only 20GB. We increased to 200GB on 11-11.

Yes, the IOPS peak... but that was on 11-04, the day before, when we switched RDS instances from t2.small (not t2.micro as said above) to m4.large.

The small red read IOPS peak on the 5th is earlier, at 3pm UTC (around 150 read IOPS). I don't remember exactly; maybe I took an SQL dump. I don't think we really hit the IOPS limit - not even with 20GB at 7-8pm on the 5th.

Since then the CPU has often been heavily loaded, even when not really critical. But it becomes laggy at 8pm UTC almost every day, so we approach the limit even with 200GB. That's why I feel "the RDS is too small". I somehow fear that tuning parameters and such won't significantly increase performance?!

Might the DB connections be problematic? But a maximum of 64 should not kill the server?

Do you run those 10k users on XenForo?

 
If you had only 20GB on the 5th, it's no wonder your performance was woeful then - you get 3 IOPS per GB, so you effectively had only 60 IOPS. Given your user base, it was never going to cut it.

Focus only on the performance since you allocated the extra space.

I run what is possibly the single busiest XenForo site at its peak. Depending on your definition of 'concurrent' users, we regularly surpass 10,000 separate users within a two-minute period.

You mentioned 'extended search'; did you mean enhanced search? Also, are you running a caching server? These can have big impacts on database performance. Also get phpMyAdmin running against the database; that can give insights into tuning your DB. The sort of parameters you may tune include binlog_cache_size, innodb_lock_wait_timeout, max_allowed_packet, max_heap_table_size, query_cache_limit, query_cache_size, sort_buffer_size and table_open_cache.
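For illustration only, in a my.cnf (or an RDS parameter group) those settings look like the sketch below - the numbers are placeholder assumptions to benchmark against your own workload, not recommendations.

Code:
[mysqld]
# per-session cache for binary log transactions
binlog_cache_size        = 1M
# seconds before an InnoDB row-lock wait gives up
innodb_lock_wait_timeout = 50
max_allowed_packet       = 64M
max_heap_table_size      = 64M
query_cache_limit        = 2M
query_cache_size         = 64M
sort_buffer_size         = 2M
table_open_cache         = 4000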
 
I mean "users online" with 15-minute session times.

I really want to stay with AWS, but at the moment it does not feel right. But your comments make me hope again ;) Wooow, this must be a huge site, congratulations. And you are using just the next RDS size, so 4 CPUs and 16GB RAM?

Yes, I mean enhanced search. @MattW checked my nginx and installed caching. Do you suppose a special database caching?
 
Yes - 4 CPUs and 16GB RAM.

nginx caching is fine for things like JavaScript etc., although we offload much of this to S3. The caching I was referring to was mainly around session management. For example, the settings in our config.php are:
Code:
$config['cache']['enabled'] = true;
$config['cache']['frontend'] = 'Core';
$config['cache']['frontendOptions']['cache_id_prefix'] = 'bf_';

$config['cache']['backend'] = 'Memcached';
$config['cache']['backendOptions'] = array(
        'compression' => false,
        'servers' => array(
                array(
                        // your memcached server IP/address
                        'host' => 'XXXXXX.znjku1.cfg.usw2.cache.amazonaws.com',
                        
                        // memcached port
                        'port' => 11211,
                )
        )
);
You need only a small cache server; a t2.small would be plenty I would think, and you may even get away with a micro. We went with memcached, but I am thinking about swapping to Redis and using @Xon's Redis add-on - it will give you more flexibility going forward.
 
The sentinel support works really well, and in practice the failover works well for XenForo due to the largely read-only nature of the caching.


My add-on allows you to set up read/write splitting between the Redis master and connected slaves. This also allows you to set up a tiny Redis instance per webserver which auto-connects to the master Redis instance (without being a failover target). This dramatically reduces the number of network reads which need to happen, while only requiring ~32MB of memory.

Just an FYI: check my Redis Cache FAQ for some annoying bits about running Redis in a virtual environment. The biggest one amounts to: don't touch the disk. XenForo doesn't require a persistent cache!
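For anyone following along, "don't touch the disk" comes down to a couple of redis.conf directives - a minimal sketch, assuming the instance is dedicated to caching:

Code:
# disable RDB snapshots - the XenForo cache is safe to lose on restart
save ""
# disable the append-only file as well
appendonly no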
 
Thank you all. I hope again that with your comments/support we'll manage to solve the bottleneck.
  • We do use memcached already; @MattW did great work.
  • At the moment XenForo Gallery (the official add-on by ChrisD) and attachments are loaded from the EC2 instance, but I already have @Xon's attachment store add-on to move them to S3/Cloudflare.

As I understand it, this will only help improve/save performance on the EC2 instance?
But the real problem at the moment is the RDS database. How will that influence the database, please?

What really makes me wonder is that even with only 200-300 users (15-minute sessions) and no extra-hot threads (like a competition result announcement), the CPU load on the RDS instance is mostly 30 up to 50% or more, with a small number of DB connections and low IOPS. Why, what could that be, please? I think something fundamental is causing that.
  • Will install @Xon's slow query add-on
  • Will reduce shown images in the gallery to 15 (now 30) per page
  • Will turn off enhanced search
  • Will turn off the thread preview add-on as well as 2-3 more add-ons which might cause load
I want CPU usage on RDS to calm down, hopefully.

BTW, we have 3-4 threads with around 2,000 posts each, which are the focus for most users. Could long, frequently-read threads be causing trouble?
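One generic way to see what is actually burning CPU on a MySQL/RDS instance (a standard technique, not specific to this setup) is to enable the slow query log and snapshot the running statements:

Code:
-- on RDS these are normally set via the parameter group
SET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;  -- log anything slower than 1 second
-- what is executing right now?
SHOW FULL PROCESSLIST;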
 
Be careful that they don't rip you off. They do charge you for their traffic.
I had that experience and moved.

Might be better to have a cheap backup host just in case.
 
Traffic is not a real financial problem at the moment, and it will improve after moving the gallery + attachments to S3.
Currently only the RDS performance is crucial - and the price we pay for the (at the moment) very bad performance.
 
They charge you thousands of dollars for traffic use.
 
Yes, I will keep an eye on the costs. At the moment the traffic costs much less than that.

@Jim Boy
May I ask what setup you currently run for the 10k users? Is it still what you described on the first page of this thread?

I installed the "long query add-on" and it shows what I assumed before: especially pages from XenForo Media Gallery often take longer than one second; attachments and Crispin's user map too.
The use of an Ad Manager seems to add noticeable load too, at least in the way I use it (switching 10 ads with several rotating images in the same position). Turning that fully off gives visible relief, but it does not "save" us the performance needed for the next few weeks.
If RDS CPU usage rises above 60% or so, more and more pages show long loading times in the "long query add-on", even the forum index and threads...

We are just about to start using [bd] Attachment Store with S3 and CloudFlare - do you, or does anybody (@Marcus, @eva2000, @fly ...?), have a comment on our setup question with S3/Route 53 and CloudFlare? We are currently unable to turn it on.

Thank you for any hint!
 
Hello all

I've arrived at this conversation looking for answers on how to hit both servers for read requests in an AWS Aurora cluster.

Apparently, if you just configure your application to hit the RDS endpoint, then none of your traffic will ever touch the read replica. This is a known issue with WordPress, and Automattic has a specific plugin to address it:
https://wordpress.org/plugins/hyperdb/
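(For context: an Aurora cluster also exposes a separate read-only endpoint that round-robins across the replicas - the hostnames below are illustrative - but the application still has to be coded to send reads to it, which is exactly the gap HyperDB fills for WordPress.)

Code:
writer: mycluster.cluster-abc123example.us-west-2.rds.amazonaws.com
reader: mycluster.cluster-ro-abc123example.us-west-2.rds.amazonaws.com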

-----------------
However, I note that others are having performance issues and I'd like to weigh in with my setup for comparison.

My web server is an AWS EC2 t2.large running nginx.

My database server is an AWS Aurora db.t2.medium/db.t2.small cluster. The medium is the writer.

(I do not use CloudFront; I simply set my CDN URL to alias static assets.)

I comfortably handle upwards of 800 concurrent users on my site, 50:50 lurkers vs. logged in.

I have found that the single biggest performance leverage comes from getting both nginx and php-fpm configs just right.

Specifically:

The nginx leverage is with worker connections - research values for your server size - the following is for my t2.medium EC2.

Code:
# one worker process per CPU core, pinned automatically
worker_processes  auto;
worker_cpu_affinity auto;
# raise the per-worker open file limit to cover connections plus static files
worker_rlimit_nofile 30000;

events {
    # maximum simultaneous connections per worker
    worker_connections 4096;
    use epoll;
    multi_accept on;
}

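As a sanity check on those numbers: the theoretical ceiling is worker_processes × worker_connections, so on a 2-vCPU t2.medium the settings above allow roughly 2 × 4096 = 8192 simultaneous connections (my arithmetic, assuming one worker per vCPU) - far more headroom than a few hundred concurrent users need.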

The most gain, however, comes from getting the php-fpm child processes just right - the default php-fpm values are for small servers and many people do not adjust them. The following are my settings for my t2.medium EC2.

Code:
; dynamic process management - children scale between the spare limits
pm = dynamic
; hard ceiling on simultaneous PHP workers, sized to available RAM
pm.max_children = 120
pm.start_servers = 30
pm.min_spare_servers = 20
pm.max_spare_servers = 40
; recycle each child after 500 requests to contain memory leaks
pm.max_requests = 500

There is a great video on working out your php-fpm values here.

https://serversforhackers.com/c/php-fpm-process-management
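As a rough rule of thumb (my own worked assumption, not taken from the video): pm.max_children ≈ (RAM available to PHP) / (average PHP-FPM process size). On a 4GB t2.medium, reserving ~1GB for nginx and the OS and assuming ~25MB per PHP-FPM child, that gives (4096 - 1024) / 25 ≈ 120 - which is where a value like pm.max_children = 120 comes from.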

--------------------------------


All that aside - has anyone solved how to hit both AWS DB cluster servers with XenForo?
 
XF doesn't support a read-only DB slave. I'm still not sold on Aurora being a better solution than RDS either.
 
Hi,

I installed XenForo on AWS with an ALB and auto-scaling.

The XF core code is deployed via CodeCommit -> CodeDeploy.

The current problem is that each time a new EC2 instance is spawned by the auto-scaling group, the internal_data directory will be empty.

Do you have any suggestions on how to efficiently regenerate the code_cache on each new EC2 instance?

Thanks
 
NFS or some other shared network storage is about your only sane solution. You might as well keep the entire XF install on that NFS/shared storage too, tune PHP's opcache to cache aggressively, and just use the auto-scaling nodes as compute with no state stored on them.
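A minimal php.ini sketch of what "cache aggressively" might look like - the values are my illustrative assumptions, not settings from this thread:

Code:
opcache.enable=1
; enough shared memory for the whole XF codebase
opcache.memory_consumption=256
opcache.max_accelerated_files=20000
; never re-stat files on disk - restart php-fpm when you deploy instead
opcache.validate_timestamps=0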
 
+1 on a shared NFS store. We use Kubernetes and have our PVC backed by an NFS server master, which creates a slave pod on each node we have, allowing our code to scale without much effort.

There are more industrial solutions out there, like using a third-party tool to mount an S3 bucket as a FUSE mount point, or accessing your files from a distributed program like MinIO.
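As a sketch of the FUSE approach (using s3fs-fuse; the bucket name and mount point here are made up):

Code:
# mount the bucket using the instance's IAM role for credentials
s3fs my-xf-data /mnt/xf-data -o iam_role=auto,allow_other,use_cache=/tmp/s3fs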
 
Have you looked at this solution? It moves the internal and external data directories to S3.


 
When distributed, the code_cache still has to be local, even with the S3 implementation. That only offloads attachment data and other associated data.
 