Handling Removed Content: 403 vs. 404 Server Responses

ENF

tl;dr summary:
We need 404 response codes for hidden content that was once public to fix access errors in the search console. Suggestions?

------------

In the normal course of business, we remove content for various reasons...

1) Expired, Promoted Content. (Contract for publication has ended.)
2) Removed Content. (Rules, Content Violations, Expired/Invalid Information)

As a general rule, we don't delete content but instead move it to an archived content pool within XF. As a result, people who reach the content from a Google search get '403' errors, and our access error table in the Google Search Console fills up.

What we want is to keep our normal process but somehow get either XenForo or NGINX to return 404 responses for content that has been removed from public view.

As you may be aware, Google's search console won't remove the index data for URLs that result in 403 errors, so we can't mark these as 'fixed' in the console. Since the URLs always return 403s instead of 404s, Google treats the content as not removed and keeps reporting access errors, which we can't clear unless we fully delete the content.

Anyone have any suggestions?

Ideally, I wish it were possible to tell XenForo to give out 404 error codes for specific forum nodes, or rather, for any thread or content located in a specific node. Since the URL doesn't give the node ID, we can't do this with NGINX configs.
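
One possible workaround, offered only as a rough sketch: the URL doesn't carry the node ID, but it does carry the thread ID, so a list of archived thread IDs could be turned into an NGINX `map` block. The Python below assumes thread URLs look like `/threads/some-title.12345/` (or `/threads/12345/`) and that the IDs of archived threads have already been exported somehow. The function names and the `$is_archived_thread` variable are made up for illustration, not anything XenForo or NGINX provides.

```python
import re

# Assumption: thread URLs look like /threads/some-title.12345/ or /threads/12345/
THREAD_URL = re.compile(r"^/threads/(?:[^/]*\.)?(\d+)(?:/|$)")


def thread_id_from_path(path):
    """Return the numeric thread ID from a URL path, or None if it is not a thread URL."""
    match = THREAD_URL.match(path)
    return int(match.group(1)) if match else None


def nginx_map_for_archived(thread_ids):
    """Render an nginx map block that sets a (hypothetical) flag for archived thread URLs."""
    lines = ["map $uri $is_archived_thread {", "    default 0;"]
    for tid in sorted(set(thread_ids)):
        # One regex entry per archived thread ID, mirroring THREAD_URL above.
        lines.append(f'    "~^/threads/(?:[^/]*\\.)?{tid}(?:/|$)" 1;')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical IDs of threads that were moved to the archive node.
    print(nginx_map_for_archived([1042, 87, 1042]))
```

The generated block would go in the `http` context, and the matching `server` block could then use something like `if ($is_archived_thread) { return 404; }`. The list would have to be regenerated whenever threads are archived or restored, and the output should be checked with `nginx -t` before reloading. Note that this returns 404 to everyone, administrators included, unless the rule is narrowed further.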


A few notes:
- All archived content is moved to a special section that is not indexed and not accessible by any non-administrator account.
- As noted, we can't get the links removed from the search console unless it reports a 404 code.
- Sitemaps have been updated.
- We cannot and do not want to delete the content: we keep it for historical purposes, and there are times when the content is recycled.
 
This would need a custom add-on, and I don't think anything covers this :(

I've got some cases where Amazon will harass ebook owners if they have a copy of some content on my sites. Easier to just return a 404 than jump through all the authorization hoops.
 