
Fixed All your <base> are belong to us - Broken links on Google/Bing Cached Views

Discussion in 'Resolved Bug Reports' started by inph, May 2, 2011.

  1. inph

    inph Active Member

    Google's and Bing's "Cached" views (and possibly those of other search engines) contain broken links. The cached page has an overriding <base> tag injected at the top, so the relative page links generated by XenForo resolve against the wrong base.

    Browsers:
    Chrome 11, Firefox 4, and IE7 and later all ignore any <base> tag that appears after the first one.

    Opera 11 and IE6 process the additional <base> tag, so the CSS and JavaScript are also referenced correctly in the cached view.

    Edit: In Opera, viewing "Text Only Version" will also produce broken links.
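The browser behaviour described above can be modelled roughly as follows. This is only a sketch: `baseHrefs` and `effectiveBase` are hypothetical helpers, the regex is a simplification rather than a real HTML parser, and "last tag wins" for Opera 11 and IE6 is an assumption drawn from the description in this post.

```javascript
// Hypothetical helper: collect every <base href="..."> value in document order.
// A regex is used for brevity; it is not a substitute for a real HTML parser.
function baseHrefs(html) {
  const pattern = /<base\b[^>]*?\bhref\s*=\s*["']([^"']*)["'][^>]*>/gi;
  return Array.from(html.matchAll(pattern), (match) => match[1]);
}

// policy "first": Chrome 11, Firefox 4, IE7 and later (as reported above).
// policy "last":  Opera 11, IE6 (assumed from the report, not verified here).
function effectiveBase(html, policy) {
  const hrefs = baseHrefs(html);
  if (hrefs.length === 0) return null;
  return policy === "last" ? hrefs[hrefs.length - 1] : hrefs[0];
}

// Trimmed-down stand-in for a cached page: the cache's <base> comes first,
// followed by XenForo's own <base> inside <head>.
const cachedPage = [
  '<base href="http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/">',
  "<html><head>",
  '<base href="http://xenforo.com/community/" />',
  "</head></html>",
].join("\n");

console.log(effectiveBase(cachedPage, "first"));
console.log(effectiveBase(cachedPage, "last"));
```

Under the "first" policy the cache's thread URL becomes the base, which is what breaks the relative links. When a page contains only the cache's tag, both policies return the same value, which would be consistent with the broken "Text Only Version" links.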

    One solution would be to switch from generated relative links to root-relative links (starting with /):

    <a href="threads/redirection-scripts-for-vbulletin-3-x.5030/page-2
    to
    <a href="/community/threads/redirection-scripts-for-vbulletin-3-x.5030/page-2
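To illustrate why the root-relative form survives the cache's injected <base> tag, here is a minimal sketch using the standard `URL` constructor (available in browsers and Node.js). The `resolve` helper and the variable names are made up for this example; the URLs are the ones from this report.

```javascript
// The two <base> values seen in the cached page source.
const cacheBase = "http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/";
const xenforoBase = "http://xenforo.com/community/";

// The link as currently generated, and the proposed root-relative form.
const relativeLink = "threads/redirection-scripts-for-vbulletin-3-x.5030/page-2";
const rootRelativeLink = "/community/threads/redirection-scripts-for-vbulletin-3-x.5030/page-2";

// Hypothetical helper: resolve a link against a base URL using standard URL resolution.
function resolve(link, base) {
  return new URL(link, base).href;
}

// Relative link against XenForo's own base: the intended page.
console.log(resolve(relativeLink, xenforoBase));
// Relative link against the cache's base: the thread path is duplicated.
console.log(resolve(relativeLink, cacheBase));
// Root-relative link against the cache's base: the base path is discarded,
// so the result is the intended page again.
console.log(resolve(rootRelativeLink, cacheBase));
```

A root-relative link only takes the scheme and host from the base, so it resolves to the same address whichever <base> tag the browser honours, provided the host in that tag is still xenforo.com.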

    http://www.google.co.uk/search?q=site:xenforo.com google cache links
    xf-google-cache_1.png

    http://webcache.googleusercontent.com/search?q=cache:mA4H1uMotIkJ:xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/ site:xenforo.com google cache links
    xf-google-cache_2.png

    http://xenforo.com/community/thread...vbulletin-3-x.5030/forums/add-on-releases.32/
    xf-google-cache_3.png

    http://xenforo.com/community/thread...-3-x.5030/attachments/import-301-v2-zip.7288/
    xf-google-cache_4.png

    Opera 11: http://webcache.googleusercontent.com/search?q=cache:mA4H1uMotIkJ:xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/ site:xenforo.com google cache links
    xf-google-cache_5.png

    Google Cached Page Source
    PHP:
    <!DOCTYPE html><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <base href="http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/">

    [snip google generated header html]

    <!DOCTYPE html>
    <html id="XenForo" lang="en-US" class="Public LoggedOut" xmlns:fb="http://www.facebook.com/2008/fbml">
    <head>

    <meta charset="utf-8" />
    <base href="http://xenforo.com/community/" />

    <title>Redirection Scripts for vBulletin 3.x XenForo Community</title>
    Bing Cached Page Source
    PHP:
    <base href="http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/" /><meta http-equiv="content-type" content="text/html; charset=utf-8" /><!-- Banner:Start -->

    [snip bing generated header html]

    <!-- Banner:End --><div style="position:relative"><!DOCTYPE html>

    <html id="XenForo" lang="en-US" class="Public LoggedOut" xmlns:fb="http://www.facebook.com/2008/fbml">
    <head>

    <meta charset="utf-8" />
    <base href="http://xenforo.com/community/" />
    Google Cached Text Only Version Page Source
    PHP:
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <base href="http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/">

    [snip google generated header html]

    <html id="XenForo" lang="en-US" class="Public LoggedOut" xmlns:fb="http://www.facebook.com/2008/fbml">
    <head>

        <meta charset="utf-8" />

        <title>Redirection Scripts for vBulletin 3.x XenForo Community</title>

        <meta name="description" content="If you have imported your vBulletin 3.x database into XenForo, you can automatically redirect all traffic destined for your vBulletin content to its new..." />
     
  2. Brogan

    Brogan XenForo Moderator Staff Member

  3. Mike

    Mike XenForo Developer Staff Member

    Brogan, the specific issue is a bit different, though conceptually the same.

    This looks like a Google issue on the whole though. Adding an additional base tag seems like a very strange behavior, if one already exists. You can see it coming up with other people as well:
    http://www.google.vu/support/forum/p/Webmasters/thread?tid=190a40c928b55355&hl=en

    Generating absolute URLs isn't really a good option, and it's rather wasteful in general (repetition and worse for caching). I don't think a change is feasible here.
     
  4. inph

    inph Active Member

    Whilst I agree that Google/Bing should respect the page's existing <base href /> tag, they don't.
    Absolute URL: http://xenforo.com/community/thread...oken-links-on-google-bing-cached-views.15459/
    Relative URL: threads/all-your-base-are-belong-to-us-broken-links-on-google-bing-cached-views.15459/
    Root-Relative URL: /community/threads/all-your-base-are-belong-to-us-broken-links-on-google-bing-cached-views.15459/

    I can understand why generating absolute URLs is not a good option. How about root-relative URLs?

    Another option that keeps the current structure intact would be to extend Router.php. In the same way that XF checks for the presence of page-x, it could additionally check for the presence of forums/members/threads/attachments/etc. and respond with a 301 redirect that strips the unnecessary parts of the URL out (the leading threads/redirection-scripts-for-vbulletin-3-x.5030/ segment in the broken URLs below).

    The .js files on xenforo.com appear to already be referenced via absolute URLs (although I'm sure they're relative on my installation). css.php would also require some extra work, although I wouldn't be too bothered by the stylesheet or JavaScript not working correctly as long as the links to the content work.

    Broken URLs
    http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/members/kier.2/
    http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/forums/add-on-releases.32/
    http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/attachments/import-301-v2-zip.7288/
    http://xenforo.com/community/threads/redirection-scripts-for-vbulletin-3-x.5030/threads/redirection-scripts-for-vbulletin-3-x.5030/page-3
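The redirect idea can be sketched as a path-rewriting function. This is not XenForo's Router.php (which is PHP) and not how any fix was actually implemented; `ROUTE_PREFIXES` and `canonicalPath` are hypothetical names, the prefix list is just the four mentioned in this post, and `/community/` is assumed as the board path. A server-side version would answer with a 301 redirect to the returned path.

```javascript
// Route prefixes named in the post; a real installation has more (assumption).
const ROUTE_PREFIXES = ["forums", "members", "threads", "attachments"];

// Hypothetical helper: given a request path, return the path it should be
// 301-redirected to, or null when no stray leading route is present.
function canonicalPath(path, boardPath = "/community/") {
  if (!path.startsWith(boardPath)) return null;

  const segments = path.slice(boardPath.length).split("/");

  // Find the last segment that is exactly a known route prefix.
  let lastRouteIndex = -1;
  segments.forEach((segment, index) => {
    if (ROUTE_PREFIXES.includes(segment)) lastRouteIndex = index;
  });

  // Index 0 means the path already starts with its only route; -1 means none found.
  if (lastRouteIndex <= 0) return null;

  // Drop everything before the last route prefix.
  return boardPath + segments.slice(lastRouteIndex).join("/");
}

const broken = "/community/threads/redirection-scripts-for-vbulletin-3-x.5030/members/kier.2/";
console.log(canonicalPath(broken));
```

The sketch relies on content segments such as `redirection-scripts-for-vbulletin-3-x.5030` never being exactly equal to a route prefix, since they carry a numeric ID suffix. Whether that holds for every route type would need checking against a real installation.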
     
  5. Mike

    Mike XenForo Developer Staff Member

    I believe I have a workaround for this issue as part of another fix. It worked with the Google HTML stuck at the beginning of the page, though I'll have to confirm once Google re-caches some of our pages.
     
  6. inph

    inph Active Member

    Code:
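    <script type="text/javascript">
    // Take the first <base> tag on the page; if it does not point at the forum root
    // (e.g. a search-engine cache injected its own), reset it.
    var _b = document.getElementsByTagName('base')[0], _bH = "http://xenforo.com/community/";
    if (_b && _b.href != _bH) _b.href = _bH;
    </script>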
    Seems to work fine on the cached pages I checked.
    I'm also assuming the broken images are due to referrer checking and rewrite rules.
     
