XF 1.4 Sitemap XML

Alfa1

Well-known member
Is there a way to see if the sitemap was downloaded by Google or Bing?

All sitemap generators I have used so far have this function, and it has proven very handy for detecting problems with the sitemap. It's pretty reassuring to see regular downloads of the sitemaps by Google and Bing. But it can happen that Bing changes its requirements (it has happened in the past) or somehow has trouble accessing the sitemap. In such a case it's very useful to see in the logs whether the search engine downloaded it after it was pinged.
 

Alfa1

Well-known member
Yes, simply add the sitemap to Google Webmaster Tools.
What I mean is something similar to this:
http://awesomescreenshot.com/076370gc88
http://awesomescreenshot.com/042370gb72

The first screenshot shows a failure by Bing to download the sitemap.
The second screenshot shows that other search engines and bots have found the sitemap and regularly download it. This is important to know so you can stay up to date on what is indexing your website, and also because the sitemap can be abused by scrapers.
 

Carlos

Well-known member
What I mean is something similar to this:
http://awesomescreenshot.com/076370gc88
http://awesomescreenshot.com/042370gb72

The first screenshot shows a failure by Bing to download the sitemap.
The second screenshot shows that other search engines and bots have found the sitemap and regularly download it. This is important to know so you can stay up to date on what is indexing your website, and also because the sitemap can be abused by scrapers.
I would like this feature, as well.

I'm lovin' this announcement, and then the video ends with a nice new feature along with a cool closing quote. :)
Is there any advantage to that? I looked in my Webmaster Tools; this option is enabled, but it looks like it indexed only the main thread page.
No real "advantage"; it's just that some of us want all pages to be indexed. That option simply tells the software to write links for threads. Some people don't like having certain areas indexed, like the member list. Likewise, some people would like to have everything indexed: blogs, resource manager entries (downloads, reviews, etc.), and of course pages.
 
Last edited:

Hoffi

Well-known member
What I mean is something similar to this:
http://awesomescreenshot.com/076370gc88
http://awesomescreenshot.com/042370gb72

The first screenshot shows a failure by Bing to download the sitemap.
The second screenshot shows that other search engines and bots have found the sitemap and regularly download it. This is important to know so you can stay up to date on what is indexing your website, and also because the sitemap can be abused by scrapers.
Sorry, you are wrong. That shows that the ping to Bing failed, i.e. the request in which the forum tells the crawler that a new sitemap exists.
In Bing and Google Webmaster Tools you can check the indexing status of your sitemap.

You can get something like the second screenshot by running an analysis tool over your access.log.
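The access.log approach Hoffi mentions can be sketched in a few lines. This is a hypothetical example (the log lines and helper name are invented for illustration): it parses Apache "combined"-format entries and counts successful sitemap downloads per user agent, which is exactly the "who fetched my sitemap, and how often" view Alfa1 is asking for.

```python
import re
from collections import Counter

# Hypothetical sample lines in Apache "combined" log format; in practice
# you would read lines from your real access.log instead.
SAMPLE_LOG = '''\
66.249.66.1 - - [10/Jul/2014:06:12:01 +0000] "GET /sitemap.xml HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"
157.55.39.2 - - [10/Jul/2014:07:30:44 +0000] "GET /sitemap.xml HTTP/1.1" 200 5120 "-" "bingbot/2.0 (+http://www.bing.com/bingbot.htm)"
203.0.113.9 - - [10/Jul/2014:08:01:13 +0000] "GET /index.php HTTP/1.1" 200 812 "-" "Mozilla/5.0"
'''

# Regex for the combined log format: IP, identity, user, timestamp,
# request line, status, size, referrer, user agent.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def sitemap_fetches(log_lines):
    """Count successful (HTTP 200) sitemap downloads per user agent."""
    counts = Counter()
    for line in log_lines:
        m = LOG_PATTERN.match(line)
        if not m:
            continue  # skip malformed lines
        if m.group('path').startswith('/sitemap') and m.group('status') == '200':
            counts[m.group('agent')] += 1
    return counts

counts = sitemap_fetches(SAMPLE_LOG.splitlines())
for agent, n in counts.items():
    print(agent, n)
```

Running this over a real log would also surface the scrapers Alfa1 warns about, since any unfamiliar user agent fetching the sitemap shows up in the same tally.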
 

TBDragon

Active member
Wow, really good. Now another add-on can be retired after this upgrade. :D
Also, will there be a new version of the RM so it is compatible with the sitemap feature?
 

Blue

Well-known member
How much would this impact your server on a forum with, say, 1 million posts?
 
Last edited:

Chris D

XenForo developer
Staff member
It won't impact your server.

It runs as a cron task and, no doubt, as a deferred task in the background, which basically means the work is done in small chunks that are continually re-queued until the entire job is finished.
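The chunk-and-re-queue pattern Chris D describes can be sketched as follows. This is an illustrative sketch only: XenForo's actual deferred-task API is in PHP and differs, and every name and figure below (`run_sitemap_chunk`, `CHUNK_SIZE`, the fake thread IDs) is hypothetical. The point is the mechanism: each run processes a bounded batch, then re-queues itself with the position it reached, so no single run can monopolize the server.

```python
CHUNK_SIZE = 500  # entries written per run (assumed figure)

def run_sitemap_chunk(state, fetch_ids, write_urls):
    """One deferred run: write one chunk of sitemap entries.

    Returns the state to re-queue with, or None when the job is done.
    """
    position = state.get('position', 0)
    batch = fetch_ids(position, CHUNK_SIZE)  # next IDs after `position`
    if not batch:
        return None                          # nothing left: job finished
    write_urls(batch)                        # append these entries to the sitemap
    return {'position': batch[-1]}           # re-queue, resuming from here

# Driver simulating the re-queue loop over 1200 fake thread IDs.
thread_ids = list(range(1, 1201))

def fetch_ids(after, limit):
    return [i for i in thread_ids if i > after][:limit]

written = []
state, runs = {}, 0
while state is not None:
    state = run_sitemap_chunk(state, fetch_ids, written.extend)
    runs += 1

print(runs, len(written))
```

Because progress is carried in the re-queued state rather than in memory, an interrupted job simply resumes from its last position on the next run, which is what keeps the load per request small.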
 

MQK8

Active member
Kier, you and Mike continue to amaze me with this awesome piece of software! Nice addition.
 

Jim Boy

Well-known member
On the face of it, I've got some really serious reservations about this - essentially it would appear to be the first enhancement in a while that doesn't scale at all.

My site has tens of millions of posts and a couple of hundred thousand users. I am assuming any sitemap building is going to be a heavy and lengthy task. Given that our public-facing servers can drop out at any moment, it would be much better if we could run this on a non-public-facing server kicked off by a Unix cron job, but that doesn't seem possible with this 'enhancement'.

There are easy things that could have been done to reduce the load as well, such as limiting the results found, generating video and image sitemaps, and combining it with the introduction of structured data. I'm sorry, but this improvement appears to be a big disappointment compared to the other enhancements in 1.4, which are all great (except email bounce handling, which is again let down by not being all that suitable for very large forums).
 