XF 1.5 Google Crawl error 403 on disallowed pages

snoopy5

Well-known member
Hi,

I switched to HTTPS and created a new sitemap, and Google has now started indexing my site. However, I now get a Google crawl error 403 on a few pages. The surprising thing is that some of them are pages which are disallowed in my robots.txt, like the member pages:

I have this in my robots.txt
------------------------------
User-agent: *
Disallow: /account/
Disallow: /admin.php
Disallow: /ajax/
Disallow: /attachments/
Disallow: /conversations/
Disallow: /find-new/
Disallow: /goto/
Disallow: /help/
Disallow: /login/
Disallow: /lost-password/
Disallow: /members/
Disallow: /mobiquo/
Disallow: /online/
Disallow: /posts/
Disallow: /recent-activity/
Disallow: /register/
Disallow: /search/
Disallow: /find-new/


Sitemap: https://www.mydomain.com/sitemap.php

-------------------------------

So why do I then get an error for this link:

https://www.mydomain.com/index.php?members/username.12345/

Any idea?
 
As you don't appear to be using friendly URLs, your various disallow rules won't match the URLs being generated, so they'll effectively be ignored. I presume you want friendly URLs to be enabled, so that may be the most straightforward option to change.
 
I do use friendly URLs. See the screenshot of the ACP settings:

xf_seo_settings.webp
 
You gave a link that wasn't using friendly URLs, so robots.txt won't apply to it. I can't really comment on where that link came from, as we wouldn't generate it while friendly URLs are enabled.
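
To illustrate the mismatch, here is a minimal sketch using Python's standard `urllib.robotparser`. It is only an approximation of how a crawler matches rules (Google's own matcher is not this module), and the `is_blocked` helper and the shortened rule list are assumptions made for the example. It shows that `Disallow: /members/` is a prefix match on the path, so it covers the friendly URL but not the `index.php?members/...` form:

```python
from urllib.robotparser import RobotFileParser

# A shortened subset of the rules from the robots.txt posted above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /admin.php
Disallow: /members/
"""


def is_blocked(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above disallow `url` for `user_agent`.

    Hypothetical helper for this example only.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


# Friendly URL: the path starts with /members/, so the rule applies.
friendly = "https://www.mydomain.com/members/username.12345/"
# Non-friendly URL: the path is /index.php, so no rule matches it.
unfriendly = "https://www.mydomain.com/index.php?members/username.12345/"

print(is_blocked(friendly))    # True
print(is_blocked(unfriendly))  # False
```

Since the second URL is not blocked, a crawler is free to request it, and whatever status the page returns (here a 403) shows up in the crawl report.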

We do canonicalize the URL to the "correct" one, though that only happens when we know that the page is viewable. If you don't have profiles exposed to guests (or users use privacy to hide their profiles), returning a 403 on a request would be the correct behavior.
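
The order of checks described above can be sketched roughly as follows. This is not XenForo's actual code: the `respond` function, the `Response` type, and the use of a 301 for the canonical redirect are all assumptions made purely to illustrate why a hidden profile answers with 403 before any redirect to the friendly URL can happen.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    status: int
    location: Optional[str] = None  # redirect target, if any


def respond(requested_url: str, canonical_url: str, viewable: bool) -> Response:
    """Hypothetical decision order: the permission check comes first."""
    if not viewable:
        # The visitor (e.g. a guest or a crawler) may not see the page,
        # so no canonical redirect is attempted.
        return Response(403)
    if requested_url != canonical_url:
        # Viewable page requested under a non-canonical URL: redirect.
        # The 301 status is an assumption for this sketch.
        return Response(301, canonical_url)
    return Response(200)


old = "https://www.mydomain.com/index.php?members/username.12345/"
new = "https://www.mydomain.com/members/username.12345/"

print(respond(old, new, viewable=False).status)  # 403
print(respond(old, new, viewable=True).status)   # 301
```

Under these assumptions, a crawler requesting the old-style member URL as a guest never reaches the redirect, which matches the 403 reported in the crawl errors.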
 