XF 1.4 Which URLs have unique content for every logged-out visitor?

jeffwidman

Active member
Which XenForo URLs have unique content for every logged-out user?

For example, any pages with server-generated honeypots or URLs that are unique to specific users such as password-reset links.

I'm working on setting up FastCGI caching for Nginx (similar to Varnish) and I already don't cache anything if the user is logged in, if it's anything other than a `GET`, or if it includes any query strings/arguments.
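
For reference, here is a rough sketch of those three conditions in Nginx. It is not a tested production config: the zone name `XFCACHE`, the cache path, the PHP-FPM address, the one-minute lifetime, and the `xf_` cookie prefix (XenForo's default) are all assumptions to adjust for your own install.

```nginx
# --- http {} context ---
# Assumed cache location and zone name.
fastcgi_cache_path /var/cache/nginx/xf levels=1:2 keys_zone=XFCACHE:50m inactive=10m;

# 1 = skip the cache for anything other than GET.
map $request_method $skip_method {
    default 1;
    GET     0;
}

# 1 = skip the cache when any query string is present.
map $args $skip_args {
    default 1;
    ""      0;
}

# 1 = skip the cache for logged-in users (assumes the default xf_ cookie prefix).
map $http_cookie $skip_cookie {
    default 0;
    "~*xf_(user|session_admin)=" 1;
}

# --- server {} context ---
location ~ \.php$ {
    include       fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass  127.0.0.1:9000;   # assumed PHP-FPM address

    fastcgi_cache       XFCACHE;
    fastcgi_cache_key   "$scheme$request_method$host$request_uri";
    fastcgi_cache_valid 200 1m;     # guests may see content up to a minute old

    # If any of these is non-empty and not "0", the cache is neither read nor written.
    fastcgi_cache_bypass $skip_method $skip_args $skip_cookie;
    fastcgi_no_cache     $skip_method $skip_args $skip_cookie;

    # Shows HIT / MISS / BYPASS in the response headers, handy while testing.
    add_header X-Cache-Status $upstream_cache_status;
}
```

Using `map` keeps each condition in its own variable, so nothing has to be nested. Note that Nginx honors upstream `Cache-Control`, `Expires`, and `Set-Cookie` headers by default, so if nothing ever shows as a HIT, `fastcgi_ignore_headers` is the directive to look at.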

I've searched around a good bit for both FastCGI cache and Varnish, and there are a number of folks with questions, but no one has managed to put together a canonical list of which URLs are okay to cache for guests and which aren't. So I suspect that if we can put this together, a good number of folks will find it handy.

I don't care if the page content changes on every new post--it's fine if my guests don't see the latest posts for a minute or two--it's only when the content changes for every single visitor that I don't want it cached.

Here's my blacklist so far in Regex form:
Code:
# For sure don't cache
  search.* # Search queries have unique value appended that changes every time
  find-new/.* # URL changes every query
  lost-password.* # lost password requests append random string, won't have cookie set yet
# Pages with honeypots that change every pageload:
  login/login/?
  register/?
# Shouldn't be accessible to logged-out users, but uber-important not to cache, so including just to be safe:
  admin\.php.*
  conversations/.*
  account/.*
  logout.*
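
In case it helps anyone, the blacklist above could be wired into Nginx roughly like this. This is a sketch, assuming XenForo is installed at the web root with friendly URLs enabled; the variable name `$skip_uri` and the leading `^/` anchors are my own additions, not part of the list itself.

```nginx
# --- http {} context ---
# 1 = never cache this URI. Patterns mirror the blacklist above.
map $request_uri $skip_uri {
    default 0;

    # For sure don't cache
    "~*^/(search|find-new|lost-password)"             1;

    # Pages with honeypots that change on every pageload
    "~*^/(login/login|register)"                      1;

    # Shouldn't be reachable by guests, listed anyway to be safe
    "~*^/(admin\.php|conversations/|account/|logout)" 1;
}

# --- inside the PHP location that has fastcgi_cache enabled ---
# fastcgi_cache_bypass $skip_uri;
# fastcgi_no_cache     $skip_uri;
```

If XenForo lives in a subdirectory, the `^/` anchors would need that prefix added.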

However, I'm unsure whether the following should be blacklisted or not:
Code:
# Does Nginx ever access these url subfolders, or only PHP? Do logged-out users ever need to access?
  internal_data
  library
  data
# Does the normal login page have honeypots?
  login/?

Any other urls that have honeypots or otherwise shouldn't be cached for logged-out users?

Alternatively, I've considered using a whitelist. Do I open any security holes if I whitelist the following URLs for logged-out users?
Code:
Whitelist:
  homepage
  /forums/.*
  /threads/.*
  /members/.*
  /posts/.*
  /media/.*
  /resources/.*

However, it's really tricky to set/check a bunch of nested if statements in Nginx, so if possible I'd much prefer to use a blacklist.
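
For comparison, a whitelist does not strictly need nested ifs if it is written as a `map`. A minimal sketch, again assuming a web-root install with friendly URLs, and with `$not_whitelisted` as a made-up variable name:

```nginx
# --- http {} context ---
# Default is 1 (do not cache); only the listed sections flip it to 0.
map $request_uri $not_whitelisted {
    default 1;

    # Homepage
    "~*^/$"                                               0;

    # Sections from the whitelist above
    "~*^/(forums|threads|members|posts|media|resources)/" 0;
}

# --- inside the PHP location that has fastcgi_cache enabled ---
# fastcgi_cache_bypass $not_whitelisted;
# fastcgi_no_cache     $not_whitelisted;
```

The tradeoff is the usual one: with `default 1`, a page that gets forgotten is simply never cached, whereas a forgotten page on a blacklist gets cached for everyone.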
 
From @eva2000's tutorial here: http://centminmod.com/nginx_configure_wordpress.html
With XenForo, there are just a few lines to add or modify.
Code:
set $no_cache 0;

# Don't cache uris containing the following segments
if ($request_uri ~* "(/register|/validate-field|/captcha|/login)") {
    set $no_cache 1;
}

# Don't cache logged in users
if ($http_cookie ~* "xf_session_admin|xf_user") {
    set $no_cache 1;
}
Then modify your theme so that the "Stay logged in" checkbox is checked by default and hidden.
 
@RoldanLT I appreciate you chiming in, but your list is incomplete. Among other pages, "/login/login" and "/login/login/" both have honeypots that are unique to each visitor, and I don't think your regex will catch them. You probably also want to prevent the lost-password page from being cached to minimize any potential security issues (I haven't looked into how those URLs are generated).

Also, if you carefully examine the WordPress guide from Centminmod, which is very helpful, you'll find the regex has a few errors. I submitted them as bug reports to @eva2000 over the past few days and he's cleaning them up. I also think I found a fairly big bug last night in how the Nginx nested includes work. It may open a security hole in WordPress, and it certainly opens one if you transcribe them straight across to XenForo. I haven't had time to fully diagnose or validate that it's a bug, but I hope to do so later this week.

Does anyone else, particularly @Mike or @Kier, know whether my blacklist is missing any pages that have honeypots for logged-out users?

I saw a css.php that generates CSS with an argument string at the end... is that argument string unique for each visitor or just for each thread?
 
I couldn't really give a specific list. It could happen on any page if something like a CAPTCHA is added. That's likely to be uncommon though. The login and registration stuff are probably the most likely things to ensure aren't cached.

If you cache something that you shouldn't (as long as it's cached by a guest), you're really just more likely to break something for another guest than to create a security issue/information leak.
 