XF 1.0 Real-World Page Loading Speed

This isn't so much a 'have you seen this feature' post as one that goes some way to explaining why interacting with XenForo is such an immediate, pleasurable experience.

The following is screen-grabbed from Google Webmaster Tools.

[Image: Google Webmaster Tools page load speed graph]
 

Excuse my ignorance, but what else is there to do other than
Code:
SELECT * FROM posts WHERE thread_id=x LIMIT 0,10
?

EDIT: Unless...
Code:
SELECT * FROM posts WHERE thread_id=x AND (post_num BETWEEN 0 AND 10)
would theoretically be faster, as MySQL wouldn't fetch the whole result set and then truncate; instead it would fetch only the records you need. :)

LIMIT is stupidly slow with a large result set, can't wait to see what they've done :D

MySQL only has to fetch extra rows and truncate if the query cannot calculate which rows to fetch using an index.

In the first example, if there is an index on (thread_id), then no extra rows will be fetched.

In the second example, if there is an index on (thread_id, post_num), no extra rows will be fetched. However, if there is only an index on (thread_id), MySQL will have to fetch all the rows matching thread_id=x, then check whether post_num is between 0 and 10.

LIMIT is only stupidly slow with poorly constructed schema, indexes, queries, and data access patterns. If done correctly, LIMIT is instantaneous. All it really involves is making sure your queries are fully indexed.
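To make that concrete, here is a sketch under the schema the earlier posts assume (post_num is a hypothetical column numbering each post within its thread):
Code:
-- the composite index pins down exactly which rows to fetch
ALTER TABLE posts ADD INDEX thread_post_num (thread_id, post_num);

-- MySQL seeks straight to thread_id=x in the index and touches
-- only the rows it actually returns
SELECT * FROM posts WHERE thread_id = x AND post_num BETWEEN 0 AND 10;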
 
If I were writing a post moderation/deletion feature, I wouldn't do it the traditional way, i.e. leaving the post in the posts table and changing a field to mark it as hidden. Under the assumption that most people won't have the permissions to view the post, I'd delete it from the DB, put it in a different table, and use a UNION query to allow people with the appropriate permissions to see it. That way, the extra load falls only on requests made by moderators and admins, instead of everyone paying for a deeper index.
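A sketch of the two code paths that idea implies (the posts_hidden table and column names are hypothetical):
Code:
-- regular members: one indexed lookup against the live table
SELECT post_id, message FROM posts WHERE thread_id = x;

-- moderators/admins: UNION in the hidden table, paying the extra cost only here
SELECT post_id, message FROM posts WHERE thread_id = x
UNION ALL
SELECT post_id, message FROM posts_hidden WHERE thread_id = x;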

So, the post_num query might even work, unless you're a mod or an admin, in which case you'd have to do something totally different, but perhaps that's fine. In fact, if you were crazy enough, you could go completely to town with it and put a composite primary key on the table over thread_id and post_id: just set post_id to auto_increment and it happily starts a new thread at post_id 1, all by itself. You'd need some pretty heroic code to make that work system-wide, though. I wouldn't want to write the split/merge thread features for it anyway.
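On MyISAM, that composite-key trick does work out of the box; a minimal sketch (column names hypothetical):
Code:
CREATE TABLE posts (
    thread_id INT UNSIGNED NOT NULL,
    post_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    message MEDIUMTEXT,
    PRIMARY KEY (thread_id, post_id)
) ENGINE=MyISAM;

-- AUTO_INCREMENT on the second part of a MyISAM composite key
-- restarts at 1 for each new thread_id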

Why would you want to denormalize your posts? Keep them in the same table, use a second column to mark status (deleted, awaiting moderation, approved, etc.), and use a proper index. It's way less complicated. You already have to have an index on the thread id (since you can't use the primary key, post id), and creating it over (thread_id, status) is hardly much overhead: if you use a tinyint for status, you're only adding 1 byte per post to the index. And if you filter by thread_id alone (not status), the same index will still serve the query.
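As a sketch of that status-column approach (names and status values are hypothetical):
Code:
-- 0 = approved, 1 = awaiting moderation, 2 = deleted
ALTER TABLE posts
    ADD COLUMN status TINYINT NOT NULL DEFAULT 0,
    ADD INDEX thread_status (thread_id, status);

-- uses the composite index; its leftmost prefix also serves
-- queries that filter on thread_id alone
SELECT * FROM posts WHERE thread_id = x AND status = 0 LIMIT 0, 10;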

The only place where this gets complicated is when a specific user has hidden posts from another specific user. Because of the many-to-many relationship, you can't stuff this into the same table, and even if you used a couple of joins to handle the m-to-m relationship, the filtering on the joined tables precludes MySQL from using only the index on the primary table. The best solution for blocked users is to filter on the PHP side and give the user a reason why the post count is off ("you're only seeing 8 posts because 2 were by members you blocked").
 
Quote:
Why would you want to denormalize your posts? Keep them in the same table, use a second column to mark status (deleted, awaiting moderation, approved, etc), and use a proper index. It's way less complicated.

I can think of at least one case off the top of my head. My site has 65,000,000 posts. While SELECT queries against it (LIMIT or otherwise) are still reasonably fast, operations against that table such as changing the table schema (or adding an index) are not. MySQL essentially copies the table being modified to a temporary table, performs the modifications, then replaces the existing table (all while locked). Even on our quad quad-core box with 16 GB of RAM, modifying a table of this size takes a very, very long time (during a vBulletin 4 test upgrade, an index addition took 6+ hours).

Personally, I'd rather the post table were sharded, with the id of the shard kept in the thread table, and new posts inserted into the correct shard. Yes, it's a lot more complicated, but it scales. It's easy enough to use a MySQL MERGE table to select against; we've been doing it for many years.
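A rough sketch of the MERGE approach (table and column names are hypothetical; the underlying tables must be structurally identical MyISAM tables):
Code:
CREATE TABLE posts_all (
    post_id INT UNSIGNED NOT NULL,
    thread_id INT UNSIGNED NOT NULL,
    message MEDIUMTEXT,
    KEY thread_id (thread_id)
) ENGINE=MERGE UNION=(posts_1, posts_2) INSERT_METHOD=NO;

-- reads go through the merge table; writes go straight to the right shard
SELECT * FROM posts_all WHERE thread_id = x;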

I realize this is "enterprise" functionality, but I figured I'd mention it anyway. :)
 
Quote:
I can think of at least one case off the top of my head. My site has 65,000,000 posts. While SELECT queries against it (LIMIT or otherwise) are still reasonably fast, operations against that table such as changing table schema (or adding an index) are not. MySQL essentially copies the table being modified to a tmp table, performs the modifications, then replaces the existing table (all while locked). Even on our quad quadcore box with 16 gb of ram, modifying a table of this size takes a very, very long time (during a vBulletin 4 test upgrade, an index addition took 6+ hours).

Personally, I'd rather the post table is sharded, with the id of the sharded post table kept in the thread table, and new posts being inserted into the correct table. Yes, it's a lot more complicated, but it scales. It's easy enough to use a mysql_merge table to select against, we've been doing it for many years.

I realize this is "enterprise" functionality, but I figured I'd mention it anyway. :)

I've been there. I've got one table with 45M rows, and alters do take forever. That's one place where Oracle and the like have an advantage.

In that case, I would separate the indexes into different tables, and only keep a primary key on the data table. In essence, turn the data table into a simple key-value store. This has the side benefit of being able to use a key-value cache (e.g. memcache) for the post retrieval. You pretty much have to do this anyway if you horizontally partition the data, unless you enjoy reducing the results of multiple index queries outside of the database (un-fun and inefficient).
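One way to sketch that separation (names are hypothetical):
Code:
-- data table: primary key only, effectively a key-value store
-- (easy to front with memcache, and alters stay cheap)
CREATE TABLE post_data (
    post_id INT UNSIGNED NOT NULL PRIMARY KEY,
    message MEDIUMTEXT
);

-- the index lives in its own narrow table
CREATE TABLE post_index (
    thread_id INT UNSIGNED NOT NULL,
    post_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (thread_id, post_id)
);

-- look up the ids first, then fetch rows (or hit the cache) by key
SELECT post_id FROM post_index WHERE thread_id = x LIMIT 0, 10;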
 
That would make a lot of sense, I'd love to see an implementation of that idea that used memcache with expiring keys (which could alleviate the need for a "postcache" ala vB), where keys are invalidated when a thread is updated.

The whole MERGE table thing is a hassle to maintain, but on an ongoing basis has treated us pretty well. The last time we upgraded our hardware, I stopped doing ongoing maintenance, and now our post table has grown larger than the aggregate size of all our sharded tables, and performance is just fine.. for now.
 
I've thought about writing forum software that would work that way, but there's a lot of work involved in writing a forum.

I have written a sharded project that works that way. I believe there are only two SQL joins in the entire code base (which could easily be eliminated). Performance is much easier to control when you separate the indexing from the data. And rather than joins, I use object composition. That makes it easy to change one component (data or even structural changes) without invalidating cache entries all over the place.
 
A bit of a cheat (fewer members, and more of them UK based), but the thread and homepage loads on my site are a little quicker in vB3. :) My site will likely be slower once XenForo enables compression (they haven't compressed yet as they are still fiddling).

I'm hoping they give us the ability to NOT use any kind of compression at all. I don't have metered bandwidth, so I'd like to have the CPU cycles do other things than compressing pages.
 
Quote:
I'm hoping they give us the ability to NOT use any kind of compression at all. I don't have metered bandwidth so i'd like to have the cpu cycles do other things than compressing pages.

You realize using gzip compression on your webserver imparts a negligible amount of load and actually speeds up page loads for users, right? It's not really about saving bandwidth anymore.
 
http://forums.Spacebattles.com is a forum approaching five million posts, and it maintains a high level of concurrency. It is highly reliant upon Sphinx as its search engine to prevent it from murdering this little server. Therefore, Spacebattles won't switch to XenForo until a viable Sphinx implementation is available.

Have you done a test run with XF on that box after converting that data, without Sphinx? If so, does XF scale on its own without things like Sphinx? It wouldn't necessarily be a problem if not, but it would tell me the practical limits of a base install of XF. :)
 
Quote:
You realize using gzip compression on your webserver imparts a negligible amount of load and actually speeds up page loads for users, right? It's not really about saving bandwidth anymore.

I misspoke. I have unmetered transfer but metered bandwidth. What folks call bandwidth isn't bandwidth: bandwidth is how fast something can go, irrespective of how much you transfer; transfer is how much you move, regardless of speed. I have unmetered transfer and a good port speed (an unmetered 10 megabit connection, so unlimited transfer at 10 megabits). So I would rather use those CPU cycles for other things, considering my target audience has more than 90% broadband penetration.
 
I'm well aware of the difference between the two, but my point is that gzip compression still means faster page loads, especially for people on mobile devices (even when using a mobile skin). If you've got big pipes and want to use them, surely you have decent enough servers to handle the overhead of enabling gzip in Apache or Nginx.
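For reference, turning it on in nginx is only a few lines (a minimal sketch; the directives are standard nginx, and the values are just reasonable defaults, not a tuned recommendation):
Code:
gzip on;
gzip_comp_level 2;    # low level: tiny CPU cost, most of the size win
gzip_types text/css application/javascript application/json;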
 
Actually, I have my sites on my company's 512 MHz VPS right now (I have access to 16 GHz though), so every cycle counts. We've already bought our license, so it's going to be interesting to see how it runs. :) If the VPS has to dip into the 3 GHz overflow pool quite a bit, then I'll be looking to disable the gzipping. :) This will also be a good test of how clients can expect it to run if they install this on their own VPS accounts. :)
 
You can head to godsfrozenchosen.com/xf; that's the testing site. It's small. Feel free to sign up and beat on it. When we go live, it gets scraped and rebuilt from the ground up anyway. :) I've also redone the VPS allocations, so I now have access to a shared pool of 4 GHz (shared with another VPS), and my clients have a shared 8 GHz to play with. I have some GHz left over in holdback to add to either pool if needed.
 
Looking for a bit of advice and this thread looks a good place.

We currently run VB 3.72 on a server with the following specs:


Processors: 4
Model: Intel(R) Xeon(TM) CPU 3.60GHz
CPU Speed: 3.6 GHz
Cache Size: 1024 KB
RAM: 4 GB


Typically we have around 400-500 users online at a time, but during busy periods (it's a football forum, so for example during the transfer window) we can have up to 1500 users online at once. At that level it really pushes our server to its limits.

Does Kier (or anyone else) have a feel for how XF would handle 1500 users at the same time? As with everything, my gut feeling is that it'll be far better than VB, but I'm not aware of anyone using XF on a site that gets that busy yet.
 
What kind of HDDs are in your box? XF will handle it; it's a matter of your hardware. P4 Xeons are a bit long in the tooth. That and your disk configuration are probably your biggest limitations. Give us a top when you are at max load so we can see CPU/RAM/IO usage.
 
FYI, I've switched to another machine. It's a dual P4 2.8 GHz with HT and 4 GB of RAM, running 2 x 74 GB 10K SCSI HDDs in RAID 1 with a hot spare. While I had the VPS going, the system didn't touch the pool. I now have 10 domains on this box along with the XF install, and there's no sign of distress at all. :)
 