Not a bug Attachments detected as Soft 404 in Google WMT

nrep

Well-known member
Affected version
1.5
For some reason, Google WMT detects image attachments as soft 404s. The permissions are set so that guests can view the attachments, but it looks like Google gets confused because the attachment URLs appear to be a directory rather than a file with an image extension. Of course, this technically works when viewing it in a browser, but it's causing problems with Google crawling.

Check the new WMT stats under Coverage > Excluded > Soft 404 to see the problem. I imagine it's the same in XF2 also, as it uses the same structure.
 
Soft 404s also happen when a 'page' has thin content, like an attachment or image. Google does indeed get confused at first because these are not .png or .jpg URLs, but the header corrects this.


A soft 404 means that a URL on your site returns a page telling the user that the page does not exist and also a 200-level (success) code to the browser. (In some cases, instead of a "not found" page, it might be a page with little or no usable content--for example, a sparsely populated or empty page.)

I have the same thing reported, and some images are also indexed as pages. Not a lot, about 50 out of 350k images.
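
For anyone who wants to triage their own URLs against the definition quoted above, here is a rough sketch of that rule in Python. It is only a guess at the idea, not how Google actually classifies pages; the phrase list, the `MIN_TEXT_CHARS` threshold, and the function names are all made up for illustration.

```python
import re

# Hypothetical phrases and threshold; tune them for your own site.
NOT_FOUND_PHRASES = ("page not found", "does not exist", "no longer available")
MIN_TEXT_CHARS = 200


def visible_text(html: str) -> str:
    """Crudely strip tags and collapse whitespace (not a real HTML parser)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_soft_404(status: int, html: str) -> bool:
    """Return True for a 2xx response whose body reads like an error or is nearly empty."""
    if not 200 <= status < 300:
        return False  # a real error status is not a *soft* 404
    text = visible_text(html).lower()
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        return True
    return len(text) < MIN_TEXT_CHARS
```

A page that contains nothing but an image has almost no visible text, so it trips the "thin content" branch of this check even though nothing is actually wrong with it.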
 
I don't think there is a lot that XF can do here. It's worth noting that the headers we send with inline attachments (full attachment URLs) should sufficiently indicate to Google that these are proper image attachments and not real HTML pages. I think we'll have to write it off as a false positive on their end.
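
If you want to confirm the header point for your own install, a minimal Python sketch like the one below may help. It assumes the server answers HEAD requests (some setups may not, in which case a GET would be needed), and the function names are illustrative, not part of XenForo.

```python
from urllib.request import Request, urlopen


def is_image_response(status: int, content_type: str) -> bool:
    """True when the response is a 200 whose Content-Type is an image type."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return status == 200 and media_type.startswith("image/")


def fetch_status_and_type(url: str) -> tuple[int, str]:
    """Send a HEAD request and return (status code, Content-Type header)."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.status, response.headers.get("Content-Type", "")


# Usage, with a placeholder URL (substitute one of your own attachment URLs):
# status, content_type = fetch_status_and_type("https://example.com/attachments/example.1/")
# print(is_image_response(status, content_type))
```

If this reports an image type for a URL that WMT flags as a soft 404, that supports the false-positive reading.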

Going forward, having some sort of "friendly URLs" for attachments is something we've talked about in the past and that may help, but that's definitely a longer term thing.
 