Partial fix Caching Issues

digitalpoint

Well-known member
Couple things...


1.

CSS stored in the cache never expires. You end up with a lot of stale cache items, because every old cached CSS entry stays around forever each time you edit your CSS files.

In the XenForo_CssOutput::renderCss() method, the call to $cacheObject->save($css, $cacheId) passes no expiration. Maybe set the expiration to 24 hours or so, so we aren't left with a cached copy of every CSS combination for every edit ever made?
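To make the 24-hour idea concrete, here is a rough sketch. It does not use the real XenForo or Zend classes: ExpiringArrayCache is a hypothetical in-memory stand-in whose save() follows the argument order I believe Zend_Cache_Core::save() uses ($data, $id, $tags, $specificLifetime), and the 3600-second default and the $now clock are assumptions made for the demo.

```php
<?php
// Hypothetical in-memory stand-in, NOT the real cache class. save() mirrors
// the argument order assumed for Zend_Cache_Core::save().
class ExpiringArrayCache
{
    public $now = 0;                // injectable clock (seconds) for the demo
    public $defaultLifetime = 3600; // assumed default; null would mean "forever"
    private $items = array();

    public function save($data, $id, array $tags = array(), $specificLifetime = false)
    {
        // false means "use the default lifetime"; null means "never expire".
        $lifetime = ($specificLifetime === false) ? $this->defaultLifetime : $specificLifetime;
        $expiresAt = ($lifetime === null) ? null : $this->now + $lifetime;
        $this->items[$id] = array($data, $expiresAt);
        return true;
    }

    public function load($id)
    {
        if (!isset($this->items[$id])) {
            return false;
        }
        list($data, $expiresAt) = $this->items[$id];
        if ($expiresAt !== null && $this->now >= $expiresAt) {
            unset($this->items[$id]); // stale CSS is dropped instead of kept forever
            return false;
        }
        return $data;
    }
}

$cacheObject = new ExpiringArrayCache();
// Explicit 24-hour lifetime, as suggested above.
$cacheObject->save('body { color: red; }', 'css_old', array(), 86400);
$cacheObject->now = 86401; // a day and one second later
var_dump($cacheObject->load('css_old'));
```

With a real memcache backend the server would evict the entry itself; the point is only that passing a lifetime to save() puts a bound on how long an outdated CSS variation can linger.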


2.

Is the $cacheObject->test() call in that same method maybe pointless? I didn't check how the other caching backends work, but with memcache, test() really just calls $memcache->get(), and so does load(). So we use $memcache->get() to check whether something exists, and if it does, we call $memcache->get() again to fetch it. It's probably more efficient to get it once and return it if there is something.

Maybe this:
PHP:
if ($cacheObject = XenForo_Application::getCache())
{
    if ($cacheObject->test($cacheId))
    {
        return $cacheObject->load($cacheId, true) . "\n/* CSS returned from cache. */";
    }
}

would be more efficient as (ultimately 1 memcache get call instead of 2):
PHP:
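if ($cacheObject = XenForo_Application::getCache())
{
    // load() returns false on a miss, so compare strictly: an empty string is
    // still a hit. The second argument is left at its default so that backends
    // which check validity inside load() still do so now that test() is gone.
    if (($css = $cacheObject->load($cacheId)) !== false)
    {
        return $css . "\n/* CSS returned from cache. */";
    }
}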
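A quick way to see the difference between the two patterns above is to count backend gets. This is only a sketch: CountingCache is a hypothetical stand-in in which test() and load() each cost one get, which is how I described memcache behaving, and fetchWithTest() / fetchWithLoadOnly() are made-up names for the two variants.

```php
<?php
// Hypothetical stand-in for a memcache-backed cache object.
class CountingCache
{
    public $gets = 0; // number of backend get calls made so far
    private $items = array();

    public function save($data, $id)
    {
        $this->items[$id] = $data;
        return true;
    }

    private function backendGet($id)
    {
        $this->gets++;
        return isset($this->items[$id]) ? $this->items[$id] : false;
    }

    public function test($id)
    {
        return $this->backendGet($id) !== false;
    }

    public function load($id)
    {
        return $this->backendGet($id);
    }
}

// Current pattern: test() first, then load().
function fetchWithTest(CountingCache $cache, $cacheId)
{
    if ($cache->test($cacheId)) {
        return $cache->load($cacheId);
    }
    return false;
}

// Suggested pattern: load() once and check the result.
function fetchWithLoadOnly(CountingCache $cache, $cacheId)
{
    return $cache->load($cacheId);
}

$cache = new CountingCache();
$cache->save('body {}', 'css_1');

fetchWithTest($cache, 'css_1');
$getsWithTest = $cache->gets;

$cache->gets = 0;
fetchWithLoadOnly($cache, 'css_1');
$getsWithLoadOnly = $cache->gets;

echo "test+load: $getsWithTest gets, load only: $getsWithLoadOnly gets\n";
```

On a miss both variants cost a single get, so the saving only shows up on cache hits, which should be the common case for CSS.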


3.

Does it make sense to save sessions for search engine spiders when they ignore cookies and don't pass cookies back?

I ended up extending the XenForo_Session::saveSessionToSource() method like so:
PHP:
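public function saveSessionToSource($sessionId, $isUpdate)
{
    // Save as usual unless this is a known robot that sent no cookies at all;
    // such a client will never present the session ID again.
    if (!empty($_COOKIE) || !$this->get('robotId'))
    {
        parent::saveSessionToSource($sessionId, $isUpdate);
    }
}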

If they pass back a cookie, then fine: either it's a human pretending to be a spider (annoying, though), or spiders have gotten smarter and started passing back session cookies.

This keeps around 1.5-2M needless sessions from being created on my site every day (otherwise every individual page a search spider requests creates a new session). It gets even worse when you consider that a new session is created for every request, even a redirect: if a URL redirects to the canonical URL, 2 sessions are created for the spider for that one "page view".

Bonus: start tracking Google's AdSense spider as a bot... "Mediapartners-Google" is what it goes by.
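For the bonus item, matching on the user agent could look roughly like this. It is a sketch only: detectRobotId() and the marker list are hypothetical and are not XenForo's own robot detection. Only "Mediapartners-Google" comes from this post; Googlebot is in there as a second, well-known example.

```php
<?php
// Hypothetical helper: returns a short robot ID if the User-Agent contains a
// known marker, or an empty string for everything else.
function detectRobotId($userAgent)
{
    // Marker (lowercase) => robot ID used for tracking.
    $markers = array(
        'mediapartners-google' => 'adsense',
        'googlebot'            => 'google',
    );

    foreach ($markers as $marker => $robotId) {
        // stripos() makes the match case-insensitive.
        if (stripos($userAgent, $marker) !== false) {
            return $robotId;
        }
    }

    return '';
}

var_dump(detectRobotId('Mediapartners-Google'));
```

A non-empty result would be what ends up as the session's robotId, so the cookie check in item 3 would cover the AdSense spider too.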
 
1. The default lifetime on cache elements appears to be 3600 seconds. Have you changed this in your cache config? Not to say we shouldn't change it, but an unlimited-lifetime cache by default seems dangerous.

2. That does seem right.

3. Reasonable idea but not something I'd consider right now (possibly for a more significant version though).
 