Future fix: Google tries to crawl links from style variation chooser menu

Kirby

Affected version: 2.3.4
We nofollow the links and noindex the pages. Google doesn't listen or care. There's not a whole lot more that we can do out of the box, but blocking the pages via robots.txt should help.
This does not really help:
If crawling is blocked via robots.txt, Google still indexes those pages without crawling them, even though the links are nofollowed.

What could help:
Don't use links for live switching; either omit href in this case or do not use <a> at all.
Live switching is done via JS exclusively, so those links are not required in this case.
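Schematically, something like this instead of a crawlable link would work; the markup, class name and attribute below are purely illustrative and not XenForo's actual implementation:

    <button type="button" class="styleVariation" data-variation="dark">Dark</button>

    <script>
    // Illustrative sketch: apply the chosen variation entirely client-side.
    // No URL is involved, so there is nothing for a crawler to discover or follow.
    document.querySelectorAll('.styleVariation').forEach(function (btn) {
        btn.addEventListener('click', function () {
            document.documentElement.setAttribute('data-variation', btn.dataset.variation);
        });
    });
    </script>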
 
You're right! If you're using JavaScript for live switching, there's no need for those links to have href attributes. Omitting them or using another method to trigger the switch would be cleaner and avoid unnecessary crawling.
 
Why is Google crawling these pages (as reported in Search Console)?

/community/misc/style-variation?reset=

and

/misc/style-variation?variation=

How to fix it?
 
Those links are required for graceful fallback
Hmm, I don't really understand why those links in the variation chooser menu would be required as a graceful fallback, and I'd appreciate some more information:

As far as I can see, style variation switching for guests via the menu is done entirely via JS if JS is available.
So in this case the links are not required there.

If JS is not available, the whole menu isn't displayed at all; only the link to misc/style-variation is shown.
So the links in the menu aren't required in this case either.

The variation chooser page (due to the way it works) does require those links, but that page is noindex, and if that doesn't help, crawling could easily be blocked via robots.txt, effectively eliminating any possibility of Google ever seeing those links.
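For the record, a rule along these lines is what I have in mind (the path prefix is taken from the URLs quoted above and would need adjusting to the actual installation):

    User-agent: *
    Disallow: /community/misc/style-variation

Disallow matches by prefix, so a single rule covers both the ?reset= and ?variation= URLs.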

What am I missing here?
 
The fallback page uses the same code/macro as the menu, which is pretty convenient. We could split them in theory, I guess, but it doesn't really seem worthwhile when it's only 4 URLs without the token. The (fallback/GET) URLs for each variation are noindex too in 2.4.

To be honest, I'm not sure why their being indexed but not crawled when blocked via robots.txt in XF 2.3 would be a huge problem. Usually the inverse is what's problematic, due to crawl budgets.

Even so, Google has attempted to index/crawl URLs which only appear in HTML/JS and are never actually requested by clients (e.g. https://xenforo.com/community/js/__SENTINEL__?_v=bac5be75). If Google sees something that looks like a URL in any form there's a chance they'll try to index it, and crawl it if they can.
 
Thanks for the explanation :)

I just wrapped the links in the macro in conditionals that check $live, which already exists and is false when the macro is used for the menu. At least for me this seems to work fine, is simple and doesn't affect functionality.
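Roughly like this; it's only a sketch of the idea, not the actual macro code, and the variable names and link parameters are illustrative:

    <xf:if is="$live">
        <a href="{{ link('misc/style-variation', null, {'variation': $variation}) }}">{$title}</a>
    <xf:else />
        <a data-variation="{$variation}" role="button" tabindex="0">{$title}</a>
    </xf:if>

With $live being false in the menu, the menu entries are no longer rendered as crawlable links, while the fallback page keeps its plain GET links.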

If Google sees something that looks like a URL in any form there's a chance they'll try to index it, and crawl it if they can.
Yeah, that's why I think it's best to try to avoid showing URLs that should not be crawled or indexed.
 