As designed HTTP 403 when no Unregistered access

HolyK

Affected version
2.0.0 RC2
Hello,

XenForo 2 returns HTTP/1.1 403 Forbidden when "Unregistered" has no view access to any part of the forum. That might make some sense, but I think it is wrong. Moreover, this state confuses server monitoring (Nagios), which throws alerts all over the place. And the "funniest" part is that it returns a 403 but also sends some data :D. See details below.

Having this (and no explicit allow on any node):
[Attached screenshot: 1511169407946.webp]

Causing this:
[Attached screenshot: 1511169488557.webp]

Which actually returns this:
Code:
wget:
--2017-11-20 10:04:52-- https://xxx.xxx.xx/
Connecting to xxx.xxx.xx (xxx.xxx.xx)|###.###.###.###|:443... connected.
HTTP request sent, awaiting response... HTTP/1.1 403 Forbidden

Nagios monitoring:
Code:
HTTP WARNING: HTTP/1.1 403 Forbidden - 20290 bytes in 1.357 second response time

As per RFC2616:
403 Forbidden
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
 
A 403 error is certainly correct if the user is logged in and it's a no permission page. (Note that there is no restriction over data being sent with a 4xx error.)

Regardless though, 200 is certainly not the correct response (we are not serving the requested content), so it's unlikely any change we may make would change Nagios's reaction. 401 is technically more correct, though the RFCs require that we use WWW-Authenticate which isn't really an option. Depending on your reading, 403 is actually a reasonable response here -- "Authorization" is very likely referring to the header by that name and that won't help.

It also appears that RFC2616 has been superseded. RFC7231 has an extended description:
The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it. A server that wishes to make public why the request has been forbidden can describe that reason in the response payload (if any).

If authentication credentials were provided in the request, the server considers them insufficient to grant access. The client SHOULD NOT automatically repeat the request with the same credentials. The client MAY repeat the request with new or different credentials. However, a request might be forbidden for reasons unrelated to the credentials.

An origin server that wishes to "hide" the current existence of a forbidden target resource MAY instead respond with a status code of 404 (Not Found).
By that description, 403 is the correct code. (RFC7235 describes 401, though it's still the same usage: https://tools.ietf.org/html/rfc7235#section-3.1)

Thus, I think this is the correct status for these requests.
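For what it's worth, a monitoring check can be told that 403 is the expected answer for a guest-restricted board, so it stops alerting on it. A minimal sketch, assuming a hypothetical `classify_check` helper; the state names only loosely mirror the WARNING shown above and this is not Nagios's actual logic:

```python
def classify_check(status, expected=(200,)):
    """Map an HTTP status code to a monitoring state name.

    `expected` lists the status codes the operator considers healthy.
    This is an illustrative policy, not the behaviour of any real plugin.
    """
    if status in expected:
        return "OK"
    if 400 <= status < 500:
        # Client-error range that was not explicitly expected
        return "WARNING"
    # Everything else (including redirects, which this sketch does not special-case)
    return "CRITICAL"


# A private board that answers guests with 403 is healthy if we say so
print(classify_check(403))                       # default expectation
print(classify_check(403, expected=(200, 403)))  # 403 declared as expected
```

With the default expectation a 403 is still flagged; declaring it as expected makes the same response count as healthy.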
 
I want to bump this because I'm currently stuck in a pesky situation: XF sends out a 403 when you are not logged in but a non-guest visitor is required to view a page. For third party applications this means they will always get a 403, and they are refusing to change their system (for obvious reasons).
XF sends out a full-blown login page. That's not what 403s are meant to be used for (https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/403 https://httpstatuses.com/403). IMHO this should be a solid 200 because on server level there are no restrictions, the restrictions are made on application level. It's not Apache or whatever that is refusing the request, it's XF. And the server will serve the requested content - it's just XF which decides what content will be displayed and not the server.
Additionally, the 4xx range is for client side errors and this is not a client side error.
tl;dr: It's not the server who is refusing the request, thus, 403 is wrong.
 
Responding with a 403 on pages which require further credentials (either via logging in or otherwise) is standard practice. As far as I know, that is the intended use (the visitor is unauthorized to view the page), and I don't see anything in those links that says otherwise. I also don't think the distinction between the web server and the application layer is all that important. If that were the case, status codes would be rather useless. I'd wager the majority of special status codes found in the wild come from the application layer.
 
"I also don't think the distinction between the web server and the application layer is all that important."
I agree.

To see why HTTP 200 would be completely wrong, think about a web scraper. If you were scraping my site for threads for whatever reason, and you hit a thread ID that is under lock and key (e.g. in the admin forum), but the scraper received an HTTP 200, then you would need to program into your scraper's application logic that just because the server returned 200 doesn't mean the content is what you think it is.

You wouldn't be able to make assumptions that certain CSS classes exist to signal the start of the "list of posts" block, for instance. The CSS for the error page would be completely different.

Might not be the best example, but then think about a search engine. Is a "no permissions" page something that should be indexed in a search engine? Obviously it shouldn't be.

If only there were some way of telling search engines that this is a "no permissions" page, something like an indicator code... a ...status code maybe, and we could give them numbers to make them easy for computers to parse...

:P
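The scraper argument can be sketched in a few lines. This is only an illustration: `count_posts` and the `<article class="message ...">` marker are hypothetical, not XF's actual markup. The point is that the status code lets the scraper skip parsing entirely instead of guessing from the HTML:

```python
import re

# Hypothetical marker for the start of a post block; real markup may differ
POST_BLOCK = re.compile(r'<article class="message[^"]*"')


def count_posts(status, body):
    """Return the number of post blocks, or None if the page is not a thread."""
    if status != 200:
        # 403/404/etc.: the body is an error or login page, so don't parse it
        return None
    return len(POST_BLOCK.findall(body))


thread_html = '<article class="message message--post"></article><article class="message"></article>'
login_html = '<form class="login"></form>'

print(count_posts(200, thread_html))  # parsed as a thread page
print(count_posts(403, login_html))   # skipped because of the status code
```

If the login page came back with a 200, the second call would parse it as a thread and report zero posts, which is exactly the ambiguity described above.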


Fillip
 
I don't see a 403 on facebook.com, admin.google.com, mail.yahoo.com etc. So nope, that's clearly not standard practice. These are pages designed to ask you for a login. They are not error pages. They are landing pages. It's their purpose to ask for credentials.
The first visit to a page is without credentials. You are a guest. Authorize yourself and then get a response with a 403 if needed. As 403 stands, it clearly states that the server refuses to authorize the user - there is no authentication process involved in the first visit at all.
It's like throwing an error when you never had a chance to do anything about it. Actually, it's not only like that, this is pretty accurately the case.

I would agree with you if it was a logged in user or maybe if it was not the landing page (not sure on this one).

Edit: In fact, admin.google.com is a good example of how someone can deal with scrapers without the necessity to change any code.
Edit2: Another prime example is our admin control panel. Never seen a 403 there either.
 
To an extent @S Thomas is correct.

A 302 - Found redirect is normally issued when attempting to view content that exists and you're redirected to a login page (à la Google, etc).
A 403 - No permission error is generated when displaying the content is not allowed.

So, either one would technically be correct for a guest, depending on the permissions set and what is displayed.

If the content can be confirmed to exist and if a login page is displayed, then the 302 would probably be more appropriate.

If it's a flat out denial without displaying the login page OR when you don't want to even confirm the content exists but would like the user to log in to see if they have permission to view the content, then the 403 would be correct.

So, since a forum can contain content that unregistered users (guests) can't view and its existence shouldn't even be confirmed, the 403 error is correct in most cases.
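The 302-versus-403 rule of thumb above could be written down roughly like this. A sketch only: `denial_response`, its two flags, and the `/login/` target are made-up names for illustration, not anything XF exposes:

```python
def denial_response(show_login_page, hide_existence):
    """Pick a (status, redirect_target) pair for a guest who may not view content.

    302 when the content's existence can be confirmed and a login page is shown;
    403 for a flat-out denial or when existence should not be confirmed.
    """
    if show_login_page and not hide_existence:
        return 302, "/login/"  # hypothetical login route
    return 403, None


print(denial_response(show_login_page=True, hide_existence=False))
print(denial_response(show_login_page=True, hide_existence=True))
print(denial_response(show_login_page=False, hide_existence=False))
```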
 
If I remember correctly, if you go far back enough (maybe XF1 beta), the no permissions page didn't have a login prompt for guests. I believe it was added at some point to ease UX. As such, I guess I've always seen these pages as error pages which just happen to have a login prompt, rather than proper login/landing pages.

In the examples you provided (and many others that I tried), it seems a lot of applications handle this by 302 redirecting to a login page, or simply returning a 404 (as the 403 links you provided suggested for pages where you don't necessarily want to confirm the content exists). I would agree you could argue those are more correct behaviours, but returning 200 is certainly not.
 
"These are pages designed to ask you for a login. They are not error pages. They are landing pages. It's their purpose to ask for credentials."
As far as I can tell, this is not technically correct. The "You must be logged in to do that." message is (or should be) used when the page the user (a guest is also a user, just a user without a registered account) is trying to view is only visible to users (with permission).

If your argument was "if the entire community is turned off to guests, the first page the user sees when browsing to the Board URL should not return 403" then I would agree with you. Maybe I've misunderstood the posts in this thread, but I did not get the impression this was the argument being presented.

In other words: landing page on a private community = 200, manually navigating to a forum or thread inaccessible to guests = 403.

That would make the most sense to me.
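Spelled out as code, that proposal might look like the following. This is a sketch of the suggestion, not of what XF does today; `guest_status` and its parameters are hypothetical, and `"/"` stands in for the board URL:

```python
def guest_status(path, board_private, guest_can_view):
    """Status code a guest would get under the 'landing page = 200' proposal."""
    if guest_can_view:
        return 200
    if board_private and path == "/":
        # Entire community closed to guests: the landing page shows a login prompt
        return 200
    # Manually navigating to a forum or thread inaccessible to guests
    return 403


print(guest_status("/", board_private=True, guest_can_view=False))
print(guest_status("/threads/1/", board_private=True, guest_can_view=False))
```

The only special case is the board URL of a fully private community; every other restricted route keeps its 403.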


Fillip
 
"So, since a forum can contain content that unregistered users (guests) can't view and its existence shouldn't even be confirmed, the 403 error is correct in most cases."
I would argue that if someone lands on a view restricted page without link manipulation, then the link already is public, hence it would make no sense to hide the existence of content.
But anyways, a landing page displaying a XF login page already tells you that it exists. Additionally, there is no redirect to the outside because it's handled internally (that's why I differentiated between server and application level @Jeremy P).
"If I remember correctly, if you go far back enough (maybe XF1 beta), the no permissions page didn't have a login prompt for guests. I believe it was added at some point to ease UX. As such, I guess I've always seen these pages as error pages which just happen to have a login prompt, rather than proper login/landing pages."
Yeah, but that was a design failure per se and not really a UX improvement in the first place. Sure, first impressions are hard to get rid of, so I can understand your thought process, still that's really not how you want to design your application to welcome guests.
"In the examples you provided (and many others that I tried), it seems a lot of applications handle this by 302 redirecting to a login page, or simply returning a 404 (as the 403 links you provided suggested for pages where you don't necessarily want to confirm the content exists). I would agree you could argue those are more correct behaviours, but returning 200 is certainly not."
That's for non-landing pages. I'm literally talking about landing pages. Logged in users and maybe other routes on the page could be a different story.
Please don't tell me that the board URL (forum index) or page URL (homepage) is not a landing page even if XF confronts you with an error message :D
"If your argument was 'if the entire forum is turned off to guests, the first page the user sees when browsing the forum URL should not return 403' then I would agree with you. Maybe I've misunderstood the posts in this thread, but I did not get the impression this was the argument being presented.

"In other words: landing on a private forum = 200, manually navigating to a forum or thread inaccessible to guests = 403."
As I said in my previous post (and this one), it's about the landing page. To be more precise, there is a service a customer wishes to use, but the forum is completely view restricted. The service now does a simple curl on the landing page / homepage / board url / page url or whatever and gets a 403. Hence, they refuse to offer their services although there is no reason to - the service would be integrated in the user system anyways because guests can't use the system (no permissions by default). But that's something the service does not know nor needs to know.
That's why 403s on landing pages suck. Even if you place a view restriction on your forum, unless there are explicit rules or a server-side layer of protection, there is actually no reason for spiders not to crawl this landing page. It's open to the public, no matter if your application is closed.
So tl;dr: Yes, private forum = 200, manually navigating = 403 (or 404, depends), guest on internal links I'm not sure but I would go with a redirect personally.
 
"That's for non-landing pages. I'm literally talking about landing pages"
Sure, but most of the examples you provided don't have landing pages, and neither does XF. https://mail.google.com, https://mail.yahoo.com and https://admin.google.com all exhibit this redirect behaviour when visited. The only one which has a true landing page is Facebook (and props to them for that).

"Please don't tell me that the board URL (forum index) or page URL (homepage) is not a landing page even if XF confronts you with an error message :D"
Given that https://mail.google.com et al. are not landing pages, I would, in fact, say that the homepage is not necessarily a landing page. Granted, I would think it a step up if XF did have a proper landing page for forums where the index page is not available to guests, and I agree that not having a landing page is not really a good design insofar as welcoming guests.
 
Also, while I understand that this is about what is the correct behaviour out of the box, and I'm sure you know this anyways, you can change the response code via an add-on pretty easily if it is required for compatibility with a 3rd party service by extending the error controller plugin:

PHP:
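public function actionRegistrationRequired()
{
    // Let the stock plugin build the "registration required" reply first
    $view = parent::actionRegistrationRequired();
    // Override only the HTTP status code; the rendered page itself is unchanged
    $view->setResponseCode(200);
    return $view;
}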

Just a thought :)
 
My bad for not clarifying that I was referring to the 4xx part of your quoted post. I mean I literally took admin.google.com as an example of how scrapers could easily work around that. Here are some better examples:
pinterest.com
twitter.com
netflix.com
They are all like facebook (landing page, private / public content), which is the exact case how example.com would work when everything was view restricted via XF. And none of these pages throw out a 403. Or a 30x. It's a plain dead 200.

On the other hand, I would stick with mail.whatever.com as example when a guest would view an internal route, for example example.com/threads/1. I personally would throw a 30x there resulting in a 200 landing page.

So everything still better than a 403.

Yes, thanks for the snippet (didn't look into that) but I actually worked around that by removing the guest view restriction for the time the external scraper was doing its thing. Yet, as you said, I would like to see if the XF staff has any revised opinion on this topic with, what I believe to be, this new information from all of us.
 