Soft 404 errors issue

gordy

Well-known member
Greetings all,

I've been getting several "soft 404" errors reported by Google Analytics.

They are for pages that don't exist, for example: http://www.planetfigure.com/wiki/index/Cold_War

This page does not exist and I have the "ErrorDocument" directive set in .htaccess like so:

Code:
ErrorDocument 401 default
ErrorDocument 403 default
ErrorDocument 404 default
ErrorDocument 500 default

Is there another way to reconcile these error messages? Maybe a page redirect within the AdminCP?

Thank you for any help—
 
You could redirect the pages using mod_rewrite in htaccess but I don't think soft 404 errors are a major issue.

I believe they eventually get removed from the index and have little to no impact on SEO.
 
You could redirect the pages using mod_rewrite in htaccess but I don't think soft 404 errors are a major issue.

Tried that, but the problem is there's no way to wildcard it without redirecting real pages :/

I believe they eventually get removed from the index and have little to no impact on SEO.

Hmm, I'd like to think so, but Google is squawking about it.

Since I have an hourly cron that scrapes my access-logs and reports them as sitemaps, I might just have that portion awked out...
 
I see you've got your .htaccess specifying the default error pages, which should be fine. Soft 404 errors usually happen when an error page is shown to the visitor but the response code is 200 (OK) instead of 404.

From what I'm reading, one cause of that is specifying a custom error document with an absolute URL (http://www.yoursite.com/error.html) as opposed to a relative URL (/error.html). With an absolute URL, Apache sends the client a redirect to that address, so the original 404 status never reaches the client.

You're not redirecting to a custom error at all, but I'm just wondering what happens if you do... Maybe it's worth a try.
 
Before that, it might be worth fetching your page as Google.

You can do that in Google Webmaster Tools > Health > Fetch as Google.

Fetch that page, and see what the response code is.

The top line:

Code:
HTTP/1.1 200 OK
Date: Thu, 08 Mar 2012 10:55:15 GMT
Server: Apache/2.2.14 (Ubuntu)

A page that doesn't exist should be returning 404. If yours isn't, then that's what's causing the soft 404.

If that turns out to be the case, it may be worth setting up a custom error page (with a relative URL) and then running the Fetch as Google again. Hopefully it will then indicate the correct HTTP response code.
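If you'd rather check the status code from a script than through Fetch as Google, something like the following Python sketch might do. It is only an illustration: the function names (parse_status_line, looks_like_soft_404, fetch_status) are made up for this post, and fetch_status follows redirects, so it reports the status of the page you finally land on.

```python
import urllib.error
import urllib.request


def parse_status_line(line):
    """Return the numeric status code from a line like 'HTTP/1.1 200 OK'."""
    parts = line.split()
    if len(parts) < 2 or not parts[0].startswith("HTTP/"):
        raise ValueError("not an HTTP status line: %r" % line)
    return int(parts[1])


def looks_like_soft_404(status, page_exists):
    """A page that does not exist but answers 200 is the classic soft 404."""
    return (not page_exists) and status == 200


def fetch_status(url):
    """Fetch url and return the final HTTP status code (hypothetical helper).

    urlopen follows redirects, so a redirect to an error page that
    answers 200 will show up here as 200.
    """
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is still available.
        return err.code
```

For the header shown above, parse_status_line("HTTP/1.1 200 OK") gives 200, and since the page doesn't exist, looks_like_soft_404(200, False) comes back True.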
 
I decided to split the Apache logs by status code with mod_log_config: 404s go into one log file, other non-200 responses into a separate file, and everything else into an access file. Then I pointed my sitemap script at just the access file. It's rockin' now, thanks for the help!
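For anyone who can't change the Apache logging setup, the same split could be done after the fact on a single access log. This is not the mod_log_config configuration described above, just a rough Python sketch of the idea. It assumes common/combined log format lines, and the name split_by_status and the sample log lines are invented for illustration.

```python
import re

# Picks the request path and status out of a common/combined log line:
#   ... "GET /path HTTP/1.1" 404 1234 ...
LOG_LINE = re.compile(r'"(?:[A-Z]+) (\S+) [^"]*" (\d{3}) ')


def split_by_status(lines):
    """Sort request paths into 200s, 404s, and every other status."""
    buckets = {"ok": [], "not_found": [], "other": []}
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that don't look like request entries
        path, status = match.group(1), int(match.group(2))
        if status == 200:
            buckets["ok"].append(path)
        elif status == 404:
            buckets["not_found"].append(path)
        else:
            buckets["other"].append(path)
    return buckets


# Made-up sample entries, not taken from a real log.
sample = [
    '203.0.113.5 - - [08/Mar/2012:10:55:15 +0000] "GET /forums/ HTTP/1.1" 200 5120',
    '203.0.113.5 - - [08/Mar/2012:10:55:16 +0000] "GET /wiki/index/Cold_War HTTP/1.1" 404 312',
    '203.0.113.5 - - [08/Mar/2012:10:55:17 +0000] "GET /old HTTP/1.1" 301 -',
]
result = split_by_status(sample)
```

Only the paths in result["ok"] would then be handed to the sitemap script, so missing pages never get advertised to Google in the first place.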
 