For the most part, generating sitemap entries for additional pages is superfluous. One objective of a sitemap is to tell search engines where your important pages are, even if they aren't sufficiently linked together: the kind of pages that Google might have difficulty finding through normal crawling.
With that in mind, and given the way content is structured in forums, sitemaps probably have limited benefit anyway, though they are certainly more useful for a new site or one that has just migrated from somewhere else.
Setting that point aside, if Google can find your threads (whether via crawling or via sitemap submission), it should very easily be able to find each page of the thread. There's plenty of metadata and links in the HTML for it to follow.
I think you might be missing an important factor: the last page of a thread has the newest posts. By including last pages you inform Google that threads have new content, which is an important purpose of sitemaps, because it signals the need to crawl again and because fresh content is an important ranking signal.
Google knows there's new content because we change the lastmod date in the Sitemap entry using the last_post_date of the thread. That's sufficient in this case.
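To illustrate the point about lastmod, here is a minimal Python sketch of how a thread's sitemap entry could carry the date of its last post. This is not XenForo's actual code: the function names, the example URL, and the assumption that last_post_date is a Unix timestamp are all hypothetical.

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_url_entry(loc: str, last_post_date: int) -> ET.Element:
    """Build one <url> element; last_post_date is assumed to be a Unix timestamp."""
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # Use the time of the newest post as the entry's last-modified date (UTC).
    lastmod = datetime.fromtimestamp(last_post_date, tz=timezone.utc)
    ET.SubElement(url, "lastmod").text = lastmod.strftime("%Y-%m-%dT%H:%M:%S+00:00")
    return url


def build_sitemap(threads) -> str:
    """threads: iterable of (thread_url, last_post_date) pairs (hypothetical shape)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_post_date in threads:
        urlset.append(build_url_entry(loc, last_post_date))
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([("https://example.com/threads/example.1/", 1700000000)]))
```

Whenever a new post lands on any page of the thread, the lastmod of the single thread entry changes, which is the recrawl signal being described.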
One objective of the sitemap is to tell search engines where your important pages are, even if they aren't sufficiently linked together. The kind of pages that Google might have difficulty in finding via crawling in the normal way.
So which is better for Google to find? The first page of a thread, or page 19?
Very often the best answer to a question is not on the first page. That's the reason why all pages should be in the sitemap.
I am testing this now because page 300, 500 or 800 of a thread is a deep link and probably harder for Google to reach. There could be some interesting content on those pages worth indexing. For large forums it could be a plus, and it is easy to implement.
With 9 million posts on my forum, the XF sitemap contains 270K threads, 190K of which are indexed. With all pages of threads included, 580K URLs are sent, so not that many more, and 240K are indexed. This means 50K additional thread pages are already indexed and 340K of the submitted URLs are not.
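For anyone who wants to check the arithmetic, the figures above work out as follows. This is just a sketch in Python using the rounded numbers from the post; the variable names are mine, and it assumes the 190K first pages that were already indexed are still among the 240K.

```python
# Rounded figures quoted in the post.
threads_sent = 270_000      # thread first pages in the default XF sitemap
threads_indexed = 190_000   # of those, indexed
all_sent = 580_000          # URLs sent once every thread page is included
all_indexed = 240_000       # of those, indexed

# Pages beyond the first that were added to the sitemap.
extra_pages_sent = all_sent - threads_sent            # 310_000
# Newly counted indexed pages, assuming the original 190K are still indexed.
extra_pages_indexed = all_indexed - threads_indexed   # 50_000
# Submitted URLs (first pages and later pages together) that are not indexed.
not_indexed = all_sent - all_indexed                  # 340_000

print(extra_pages_sent, extra_pages_indexed, not_indexed)
```

Note that the 340K not-indexed figure covers all submitted URLs, so it includes the 80K thread first pages that were not indexed to begin with.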
I am sending these in sitemaps of 10K URLs apiece, which makes them easier to open and check. After a few days I am seeing a small increase in indexed pages; after a few weeks I will have more relevant data.
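A rough Python sketch of that approach, for illustration only: list every page of a thread, then split the full URL list into files of 10K each. The function names, the file naming, and the "page-N" URL suffix are assumptions here, not the actual implementation used on the forum.

```python
from typing import Iterator, List

# Batch size from the post; the sitemap protocol itself allows up to 50,000 URLs per file.
URLS_PER_FILE = 10_000


def thread_page_urls(thread_url: str, page_count: int) -> List[str]:
    """Return the URL of every page of a thread.

    Assumes page 1 is the bare thread URL and later pages append "page-N"
    (a hypothetical URL scheme for this sketch).
    """
    urls = [thread_url]
    urls.extend(f"{thread_url}page-{n}" for n in range(2, page_count + 1))
    return urls


def split_into_sitemaps(urls: List[str], size: int = URLS_PER_FILE) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` URLs, one batch per sitemap file."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


if __name__ == "__main__":
    urls = thread_page_urls("https://example.com/threads/example.1/", 3)
    for i, batch in enumerate(split_into_sitemaps(urls), start=1):
        # Hypothetical file naming; each batch would be written to its own file.
        print(f"sitemap-{i}.xml: {len(batch)} URLs")
```

At 580K URLs and 10K per file, this would produce 58 sitemap files.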
Hello, I just noticed on my install, and also on this website, that the sitemaps do not include individual thread pages (beyond the first page, obviously). When the sitemap was an external add-on, all the pages were included. Basically, a lot of canonical URLs are missing from XF sitemaps...