Sitemap for 2nd and next pages too

Google doesn't use it anymore (it actually stopped a few years before that post), but it still finds pages perfectly fine.


 
Those interested may want to support having additional sitemap control/options also:

 
As reported, a few days ago, I built a sitemap manually for a long forum thread, and the results are finally in.

As expected, Google never crawled most pages of these long threads. These threads are not chit-chat or meme threads or "say hello and post an introduction" kind of threads for newbies. We're talking about thousands of pages with extensive, unique, valuable content, people writing lengthy posts discussing science and economics. If these were WordPress or Medium pages, they would be crawled and indexed in a split second. XenForo's pagination is simply insufficient to signal to Google the importance of these pages.

As soon as I submitted every page of that thread in a separate sitemap, Google began to report status on these URLs. They are now shown as "Discovered," but note that the last crawl date is "Never." That's different from "Crawled, not indexed," which is for pages that Google determines are not worthy of indexing. These were excluded simply because Google's bots never got to them. This just breaks my heart.

GhM6N1O.png
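
For anyone who wants to try the same thing, here is a rough Python sketch of how a sitemap listing every page of one thread could be generated. It is only an illustration, not the exact method used above: the `/page-N` URL pattern, the example thread URL, and the helper names `thread_page_urls` and `build_sitemap` are all assumptions, so adjust them to your own forum's routes. It needs Python 3.8 or newer.

```python
from xml.etree import ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def thread_page_urls(thread_url, page_count):
    """Return the URL of every page of a thread, page 1 first.

    Assumes pages 2..N live at <thread_url>/page-N (hypothetical pattern,
    check your own forum's URLs).
    """
    base = thread_url.rstrip("/")
    urls = [base + "/"]
    urls += [f"{base}/page-{n}" for n in range(2, page_count + 1)]
    return urls


def build_sitemap(urls):
    """Serialize a list of URLs as sitemap XML (UTF-8 bytes)."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical thread with 1200 pages.
    pages = thread_page_urls("https://example.com/threads/some-thread.123/", 1200)
    with open("thread-sitemap.xml", "wb") as fh:
        fh.write(build_sitemap(pages))
```

The resulting file can then be submitted as a separate sitemap in Search Console. Keep in mind that a single sitemap file is limited to 50,000 URLs, so very large forums would need several files.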
 
Just so I have this straight, you are worried about SEO, but yet you have site(s) still running HTTP?
 
Switching to HTTPS has very little effect on search engine ranking. It is good for other things that site users like to see, for instance secure browser indications, but its effect on SEO is almost negligible.
 
Just wanted to post an update here. I have been using lgraubner/sitemap-generator-cli ("Creates an XML-Sitemap by crawling a given site") to generate a sitemap manually on my Pi. It takes about 4 days on my mid-sized forum. I have seen Google indexing internal pages since I added these to Search Console. My site has very little traffic, so it's hard to say whether it will provide any long-term benefit. But it seems like something I can do once a month to ensure that third-party sitemap files do not have dead links in them.
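
If you want to check a generated sitemap for dead links before submitting it, something along these lines might work. This is a minimal sketch and not part of sitemap-generator-cli: the function names `sitemap_urls`, `http_status`, and `dead_links` are made up for illustration, and it assumes your server answers HEAD requests.

```python
import urllib.error
import urllib.request
from xml.etree import ElementTree as ET

# Fully qualified name of <loc> in the sitemaps.org namespace.
LOC_TAG = "{http://www.sitemaps.org/schemas/sitemap/0.9}loc"


def sitemap_urls(xml_bytes):
    """Extract every <loc> URL from sitemap XML."""
    root = ET.fromstring(xml_bytes)
    return [el.text.strip() for el in root.iter(LOC_TAG) if el.text]


def http_status(url, timeout=10):
    """Return the HTTP status for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return err.code
    except urllib.error.URLError:
        return None


def dead_links(xml_bytes, get_status=http_status):
    """Return the sitemap URLs that do not answer with HTTP 200."""
    return [url for url in sitemap_urls(xml_bytes) if get_status(url) != 200]


if __name__ == "__main__":
    with open("sitemap.xml", "rb") as fh:
        for url in dead_links(fh.read()):
            print("dead:", url)
```

The `get_status` parameter is only there so the check can be tried with a fake status function before pointing it at a live site. On a large forum you would probably want to add a delay between requests.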
 
Here is another update. There is now an add-on that adds pages to the sitemap. It also has another feature which might be useful for some folks.

 