XF 2.1 Google reports "soft 404" for attachment URLs

dethfire

Well-known member
Anyone else seeing this? All 84K of my attachments are marked as "soft 404". Going to the attachment URL shows the attachment just fine.
 

djbaxter

Well-known member
I think it's a Google bug. I received notifications about "soft 404s" on two threads for one of the forums I manage. Both URLs loaded quickly and correctly. Neither was in a private forum; both were visible to guests.


What is a soft 404?
A soft 404 is a URL that returns a page telling the user that the page does not exist and also a 200-level (success) code. In some cases, it might be a page with little or no content--for example, a sparsely populated or empty page.
Why does it matter?
Returning a success code, rather than 404/410 (not found) or 301 (moved), is a bad practice. A success code tells search engines that there’s a real page at that URL. As a result, the page may be listed in search results, and search engines will continue trying to crawl that non-existent URL instead of spending time crawling your real pages.
What should I do?
  • If your page is no longer available, and has no clear replacement, it should return a 404 (not found) or 410 (Gone) response code. Either code clearly tells both browsers and search engines that the page doesn’t exist. You can also display a custom 404 page to the user, if appropriate: for example, a page containing a list of your most popular pages, or a link to your home page.
  • If your page has moved or has a clear replacement, return a 301 (permanent redirect) to redirect the user as appropriate.
  • If you think that your page is incorrectly flagged as a soft 404, use the URL Inspection tool to examine the rendered content and the returned HTTP code. If the rendered page is blank, or nearly blank, it could be that your page references many resources that can't be loaded (images, scripts, and other non-textual elements), which can be interpreted as a soft 404. Reasons that resources can't be loaded include blocked resources (blocked by robots.txt), having too many resources on a page, or slow loading/very large resources. The URL Inspection tool should list which resources could not be loaded, and also show you the rendered live page.
Use the URL Inspection tool to verify whether your URL is actually returning the correct code.
None of this applied in my case. I suspect either that something timed out for Google (unlikely: the forum is on a fast dedicated server and there have been no recent server outages) or, more likely, that this is one of the numerous bugs Google has been experiencing in recent months.
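
If you want to check the returned code yourself outside of Search Console, here is a rough Python sketch. It fetches a URL and applies a couple of guessed heuristics for what might look like a soft 404. The phrase list, the 512-byte threshold, the User-Agent string, and the function names are all my own assumptions for illustration; this is not how Google actually decides, and small but valid attachments could be flagged as "thin".

```python
import urllib.request
import urllib.error

# Assumed phrases that hint at an error page; adjust for your own forum.
NOT_FOUND_HINTS = ("not found", "does not exist", "no longer available")
# Assumed cutoff for a "nearly empty" response; purely a guess.
MIN_BODY_BYTES = 512


def classify(status: int, body: bytes) -> str:
    """Label a response by its HTTP status and, for 2xx, by a rough look at the body."""
    if status in (404, 410):
        return "hard 404"
    if 300 <= status < 400:
        return "redirect"
    if 200 <= status < 300:
        # Only inspect the start of the body; binary attachments decode to noise.
        text = body[:4096].decode("utf-8", errors="ignore").lower()
        if any(hint in text for hint in NOT_FOUND_HINTS):
            return "possible soft 404 (error text)"
        if len(body) < MIN_BODY_BYTES:
            return "possible soft 404 (thin content)"
        return "ok"
    return "other error"


def check(url: str) -> str:
    """Fetch a URL and classify it. urlopen follows redirects, so this sees the final response."""
    req = urllib.request.Request(url, headers={"User-Agent": "soft404-check/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return classify(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; classify by the code alone.
        return classify(err.code, b"")
```

Calling `check()` on one of the flagged attachment URLs as a guest (not logged in) is the interesting case, since that is closer to what a crawler sees.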