XF 1.5 Google not indexing all pages / sitemap.php issue

snoopy5

Well-known member
Hi,

I have a couple of different forums. All are XF 1.5.21, all are older than four years, and all have the same setup on the same server. On each of them I created the sitemap and its cron job in the same way, and I submitted the sitemap.php link to Google Search years ago.

But two of them differ dramatically in how Google Search indexes them. They do get indexed, but only around 15-20% of all submitted pages. All the other sites are at something like 98%.

According to Google, the test with the sitemap.php is o.k.

Now I have looked into this, and when I type the path to the sitemap.php file into my browser, the two "problem kids" display it differently than the well-indexed sites do. See screenshots.

What can be the reason that they look so different? Is there any setting I did wrong?

The good ones:

sitemap_result2.webp


The bad ones:

sitemap_result1.webp
 
o.k., thanks. But I want to use a sitemap for various reasons.

And I need to know why the sitemap on two of my sites gives such a strange output, as shown in the screenshot.
 
What’s wrong with the screenshot exactly?

The first screenshot looks like that because it isn’t fully loaded yet.

The second screenshot has loaded successfully.
 
Ah, o.k., so the second one is the one that is actually "o.k."?

Then my theory that Google is not indexing all pages because of a wrong setting in my sitemap.php is not true.

What else could be a reason for Google indexing only a fraction of a forum (10-20%)? Are there any common mistakes an admin can make in XF that would cause that?
 
Yep, the second version is correct. Most browsers apply some default styling to XML documents once they are fully loaded. The first example just indicates that your browser had not yet downloaded the full file. But that's really nothing to worry about. Sitemap files can be as large as 50MB IIRC, though XF limits them to a max of 10MB. It takes quite a significant amount of time to download and render such a document, but I'm sure Google's crawler is more than able to cope.
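If you want to confirm the file itself is complete regardless of how your browser renders it, a short script can fetch the sitemap and count its entries. This is only a minimal sketch; the URL is a placeholder you would replace with your own forum's sitemap.php, and the output is just informational.

# Minimal sketch: fetch a sitemap and report its size and number of <loc> entries.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_URL = "https://example.com/sitemap.php"  # placeholder: your forum's sitemap URL
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

with urllib.request.urlopen(SITEMAP_URL) as response:
    data = response.read()

root = ET.fromstring(data)

# A plain sitemap has a <urlset> root; a sitemap index has <sitemapindex>.
print("Root element:", root.tag)
print("Downloaded size: %.0f KiB" % (len(data) / 1024))

# Count <loc> entries (page URLs in a urlset, child sitemaps in an index).
locs = root.findall(".//sm:loc", NS)
print("Number of <loc> entries:", len(locs))

If the entry count matches what Google Search Console reports as submitted, the sitemap output itself is fine and the browser rendering really is irrelevant.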

Google will only index pages it can access, and the figure reported against the sitemap entries may not include pages that it has already indexed naturally.

Really, the only mistake that could be made is blocking certain things with robots.txt, but that seems unlikely.
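To rule that out, you can ask Python's built-in robot parser whether Googlebot is allowed to fetch a few sample URLs. Again, just a sketch; the base URL and the paths below are placeholders you would swap for real URLs taken from your own sitemap.

# Minimal sketch: check whether robots.txt blocks Googlebot from some sample paths.
from urllib.robotparser import RobotFileParser

BASE = "https://example.com"  # placeholder: your forum's base URL

rp = RobotFileParser(BASE + "/robots.txt")
rp.read()

# Hypothetical XenForo-style paths; substitute real URLs from your own sitemap.
for path in ("/threads/example-thread.123/", "/forums/example-forum.4/", "/sitemap.php"):
    verdict = "allowed" if rp.can_fetch("Googlebot", BASE + path) else "BLOCKED"
    print(path, "->", verdict)

If everything prints "allowed", robots.txt is not the cause and the discrepancy is almost certainly just down to how Google chooses what to index.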

I really wouldn’t worry about any discrepancy here.
 