I just tried to submit our sitemap to Google for the first time, and it was rejected because 11 of the 1,887,104 pages returned 404 errors.
I had a look through the problem files and I think they relate to forums which I have removed from public view. After moving all the threads out, I made them private and removed them from the node list.
E.g. /sitemap/sitemap.forums.pags.4.xml.gz with a processed date of March 6th references
Code:
<url>
<loc>http://www.avforums.com/forums/forza-xbox-one.553/page-2</loc>
<lastmod>2014-03-07</lastmod>
</url>
and the Forza Xbox One forum was retired a couple of days ago.
Now I'm damn sure I deleted all the sitemap files before I ran the sitemap generation today. So why would the processed date be two days ago, a time before I removed those forums from public view?
Another error was with /sitemap/sitemap.threads.113.xml.gz which has 10,000 links in it. The processed date on this is March 6th also.
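To narrow down which entries in a file like that are stale, something along these lines could be run against the generated files. This is only a rough sketch: `find_entries` is a name I made up, and it assumes the files are gzipped XML using the standard sitemaps.org namespace on the root element, which I haven't confirmed for these particular files.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol (assumed to be what the generator emits).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_entries(path, needle):
    """Return (loc, lastmod) pairs from a gzipped sitemap whose URL contains needle."""
    with gzip.open(path, "rb") as fh:
        root = ET.parse(fh).getroot()
    hits = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default="", namespaces=NS).strip()
        if needle in loc:
            hits.append((loc, lastmod))
    return hits
```

For example, `find_entries("sitemap.threads.113.xml.gz", "/forums/forza-xbox-one.553/")` would list any entries in that file pointing at the retired forum, along with their lastmod dates (adjust the path to wherever the sitemap directory lives on disk).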
Why would these URLs be included in the latest sitemap?