XML Sitemap for XenForo 1.3 [Not needed, included in 1.4]

Ok.

For some reason, I have some authorization errors for threads from private forums. Thought maybe they were coming from the sitemap. Good to know private forums aren't included.
 
You could always download the sitemap and verify that they are not there. It would be the sitemap.forums.xml.gz file in the /sitemap directory on your server.
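If you'd rather check programmatically, something like this would do it (a rough sketch; the sitemap path and the private forum's URL fragment are placeholders for your own):

PHP:
<?php
// Rough sketch: decompress the generated forums sitemap and search it
// for a private forum's URL. The path and URL below are placeholders.
$xml = file_get_contents('compress.zlib://sitemap/sitemap.forums.xml.gz');
if (strpos($xml, 'forums/your-private-forum') !== false)
{
    echo "Private forum IS in the sitemap\n";
}
else
{
    echo "Private forum is not in the sitemap, as expected\n";
}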
 
Error Info
PHP:
ErrorException: DOMDocument::save(sitemap/sitemap.forums.1.xml): failed to open stream: Permission denied - library/XfAddOns/Sitemap/Helper/Base.php:74
Generated By: Unknown Account, Yesterday at 11:01 PM
Stack Trace

PHP:
#0 [internal function]: XenForo_Application::handlePhpError(2, 'DOMDocument::sa...', '/home/sociall1/...', 74, Array)
#1 /home/sociall1/public_html/forums/library/XfAddOns/Sitemap/Helper/Base.php(74): DOMDocument->save('sitemap/sitemap...')
#2 /home/sociall1/public_html/forums/library/XfAddOns/Sitemap/Model/Sitemap.php(180): XfAddOns_Sitemap_Helper_Base->save('sitemap/sitemap...')
#3 /home/sociall1/public_html/forums/library/XfAddOns/Sitemap/Model/Sitemap.php(70): XfAddOns_Sitemap_Model_Sitemap->generateForums()
#4 /home/sociall1/public_html/forums/library/XfAddOns/Sitemap/CronEntry/RebuildSitemap.php(31): XfAddOns_Sitemap_Model_Sitemap->generate()
#5 [internal function]: XfAddOns_Sitemap_CronEntry_RebuildSitemap::run(Array)
#6 /home/sociall1/public_html/forums/library/XenForo/Model/Cron.php(356): call_user_func(Array, Array)
#7 /home/sociall1/public_html/forums/library/XenForo/Cron.php(29): XenForo_Model_Cron->runEntry(Array)
#8 /home/sociall1/public_html/forums/library/XenForo/Cron.php(64): XenForo_Cron->run()
#9 /home/sociall1/public_html/forums/cron.php(12): XenForo_Cron::runAndOutput()
#10 {main}

Request State

PHP:
array(3) {
  ["url"] => string(59) "http://www.sociallyuncensored.eu/forums/cron.php?1364526110"
  ["_GET"] => array(1) {
    [1364526110] => string(0) ""
  }
  ["_POST"] => array(0) {
  }
}
 
^ Never mind.

This seems to be a hosting issue. Permission denied (it seems the directory is stuck at chmod 644).

Sorry to have bothered you with this.
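
For anyone else who hits this: a quick way to confirm the problem before re-running the cron, assuming (as in my case) the sitemap directory sits at the forum root:

PHP:
<?php
// Rough sketch: the sitemap cron runs as the web server user, so that
// user needs write access to the sitemap directory.
$dir = 'sitemap';
if (!is_writable($dir))
{
    // A directory stuck at 0644 is not writable (or even enterable);
    // 0755 plus the right owner is the usual fix, but check with your host.
    echo "$dir is NOT writable by the PHP process\n";
}
else
{
    echo "$dir is writable, sitemap generation should work\n";
}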
 
I'm still getting these two authorization errors for private-forum threads, and I swear I've checked to make sure they weren't being included.
I've disabled the sitemap and deleted all sitemaps until I get more time to investigate.
 
If you like this add-on, please post a review. It takes 1 minute and I enjoy the feedback! :)

Several updates on this one:
  1. I refactored the PHP classes to help add-on makers integrate their sitemaps. And I will be dogfooding this myself with my own add-ons.

    Add-on makers only need to extend the XfAddOns_Sitemap_Model_Sitemap class through a standard load_model hook.

    A getAdditionalSitemaps() method is provided for extensibility (see the sketch after this list).
  2. I added an option to let the sitemap add-on manage your robots.txt file. To accomplish this you will need to add a mod_rewrite rule. Instructions are provided in the zip file; you need to forward requests for robots.txt to the provided robots.php file.

    After you set the rewrite rule, there are several options in the admin control panel to block common pages that are irrelevant to the spiders.
[Screenshot: the robots.txt options in the admin control panel]
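
On point 1, an integration would look roughly like this (a sketch only: the listener and subclass names are made up, and the exact return shape of getAdditionalSitemaps() may differ from the add-on's real API):

PHP:
<?php
// Sketch of how an add-on could hook into the sitemap model.

class MyAddOn_Listener
{
    // Registered against the load-model code event, so XenForo resolves
    // the sitemap model to our subclass.
    public static function loadModel($class, array &$extend)
    {
        if ($class == 'XfAddOns_Sitemap_Model_Sitemap')
        {
            $extend[] = 'MyAddOn_Model_Sitemap';
        }
    }
}

// XenForo rewrites XFCP_MyAddOn_Model_Sitemap to the class we extend.
class MyAddOn_Model_Sitemap extends XFCP_MyAddOn_Model_Sitemap
{
    // Hypothetical override: append our own sitemap to the ones the
    // parent already generates.
    public function getAdditionalSitemaps()
    {
        $sitemaps = parent::getAdditionalSitemaps();
        $sitemaps[] = 'sitemap.articles';    // illustrative entry
        return $sitemaps;
    }
}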
 
I already have my own robots.txt. Will this update be compared with my robots file?
Sorry

It would override your robots.txt file, though it won't delete it; it would just be ignored if it's there (provided you set up the mod_rewrite).

This is an example of what it generates
Code:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /account*
Disallow: /help*
Disallow: /misc/style*
Disallow: /login*
Disallow: /logout*
Disallow: /lost-password*
Disallow: /register*
Disallow: /reports*
Disallow: /search*
Disallow: /conversations*
Disallow: /css.php
Disallow: /cron.php
Disallow: /admin.php
Disallow: /js
Disallow: /styles
Disallow: /attachments/*
Disallow: /online/*
Disallow: /recent-activity/*
Sitemap: http://yourdomain.com/sitemap/sitemap.xml.gz

There is a textarea to add custom rules should you need to disallow any URL.

By default it does not block member pages and attachments, but it has the options to do so. Files like css.php, styles and help are always blocked.
 
My robots.txt:
Code:
User-agent: *
Disallow: /misc/
Disallow: /help/
Disallow: /members/
Disallow: /register/
Disallow: /login/
Disallow: /online/
Disallow: /lost-password/
Disallow: /recent-activity/
Disallow: /admin.php
Disallow: /find-new/
Disallow: /conversations/
Disallow: /account/
Disallow: /goto/
Disallow: /search/
Disallow: /attachments/
Disallow: /sorry.html
Allow: /
If I use this add-on, do I still need my robots file? I ask because I saw that the new update provides a new robots file.
 

If you update to the latest version, you do not need the robots.txt file anymore.
Though, you would need to go to the admin control panel and, in the Robots.txt options, add the following:

/goto
/search
/sorry.html

... as they are not in the defaults.

... and, just as a generic comment, the robots.txt feature is optional; you could just keep using yours. This will mostly benefit new users that do not have a robots.txt in place.
 
Thank you :)
 
Can someone help me create a 'News' sitemap for a XenPorta category's articles? I have a category called 'News' in XenPorta and I need to submit all the articles in that category to Google News.
 
Can you please check if it works? It seems that the htaccess lines have something wrong. I've inserted this in my htaccess:

Code:
    <IfModule mod_rewrite.c>
        RewriteEngine On
 
        #    If you are having problems with the rewrite rules, remove the "#" from the
        #    line that begins "RewriteBase" below. You will also have to change the path
        #    of the rewrite to reflect the path to your XenForo installation.
        # RewriteBase /
 
        #    This line may be needed to enable WebDAV editing with PHP as a CGI.
        #RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
 
        RewriteCond %{REQUEST_FILENAME} -f [OR]
        RewriteCond %{REQUEST_FILENAME} -l [OR]
        RewriteCond %{REQUEST_FILENAME} -d
        RewriteRule ^.*$ - [NC,L]
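        #   (Could this passthrough be the problem? robots.txt exists as a
        #   real file, so the -f condition above matches and [L] stops
        #   rewriting before the robots.php rule below is ever reached?)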
        RewriteRule ^(data/|js/|styles/|install/|favicon\.ico|crossdomain\.xml) - [NC,L]
        RewriteRule (robots\.txt)$ robots.php [NC,L]
        RewriteRule ^.*$ index.php [NC,L]
    </IfModule>

So if you go to http://forum.kog.it/robots.txt you should see the lines generated by the add-on. Instead you see my old robots.txt, which is empty.

But if you go to http://forum.kog.it/index.php?xfa-robots/index it works perfectly, so it must be something wrong with the htaccess.

Can you help?
 
I "think" I have the rewrite converted to nginx format, but would someone a little more familiar with it double check my rewrite rule
Code:
rewrite /(robots.txt)$ /robots.php last;

EDIT:
And it DOES work... I just went to http://twowheeldemon.com/robots.txt and it kicked up the options selected and the additional info I put in.
 
So, instead of
Code:
RewriteRule (robots\.txt)$ robots.php [NC,L]
I should insert
Code:
rewrite /(robots.txt)$ /robots.php last;
Sorry for asking but I really don't know how to write these rules so I'd better ask...
 
Don't use my example if you are using Apache2. That one is STRICTLY for use with the nginx server.
 