
XF 1.5 Migrated from vBulletin to XenForo - robots.txt question

Discussion in 'XenForo Questions and Support' started by LaxmiSathy, Apr 20, 2016.

  1. LaxmiSathy

    LaxmiSathy Member

    Hello,

    Recently I migrated my vBulletin board to XenForo.
    vBulletin was installed at /public_html/forums
    and the XenForo installation is at /public_html/community.

    Now I have a few questions regarding robots.txt:

    1) Should I include

    Disallow: /forums/

    in robots.txt to tell Google not to crawl the old /forums pages?

    2) I have included

    Disallow: /community/members/

    for all the member pages, but I am getting "Access Denied" crawl errors for them. I have submitted the updated robots.txt and also did a "Fetch as Google", but the console still shows 300,000+ member pages with "Access Denied" crawl errors.
    Is there anything else I should do?

    3) My vBulletin board also had blogs, with pages like
    /forums/blogs/anitap
    which no longer exist now that I have imported all of the blog content into XenForo threads. Should I also disallow these URLs in robots.txt?
     
  2. Jake Bunce

    Jake Bunce XenForo Moderator Staff Member

    1) A robots.txt file probably is not appropriate for this. Normally you would set up redirects for the old vB URLs.

    2) I am not sure this is a problem. "Access Denied" might be the expected response given your robots.txt file, which tells the robot not to visit that page. Or, if that response is not expected, perhaps the record just needs time to update.

    3) I guess you could. Most people prefer to set up specific redirects for threads, posts, etc. and then use a catchall for all other URLs, so that anyone visiting the old forum is redirected to the new forum. A rough sketch of that catchall is below.
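
    For illustration only, a minimal .htaccess sketch of that catchall (assuming Apache with mod_rewrite and the /forums/ and /community/ paths from the first post; the specific thread/post redirects would run before this rule):

        # Catchall: permanently redirect any old vB URL that no
        # specific rule handled to the new forum index.
        RewriteEngine On
        RewriteRule ^forums/ /community/ [R=301,L]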
     
  3. Alfa1

    Alfa1 Well-Known Member

    Use the appropriate redirection script from here: https://xenforo.com/community/resources/categories/redirection-scripts.2/
    Don't use robots.txt for this. Use redirects. If the number of blogs is small you can use .htaccess redirects from blogs to threads, as sketched below.
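
    For example, a pair of hand-written 301s (a sketch assuming Apache's mod_alias; the destination thread URLs are hypothetical placeholders, substitute the threads the blogs were actually imported into):

        # One Redirect line per old blog, pointing at its imported thread.
        # The thread slugs/IDs below are made-up examples.
        Redirect 301 /forums/blogs/anitap /community/threads/anitaps-blog.12345/
        Redirect 301 /forums/blogs/otherblogger /community/threads/otherbloggers-blog.12346/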

    No need to disallow /members/ because you can handle that with permissions. Robots can never access those pages if you disallow guest viewing of the member list and profiles.
     
  4. LaxmiSathy

    LaxmiSathy Member

    @Jake Bunce @Alfa1

    Thanks for the responses.
    #2) But I am getting lots of "Access Denied" crawl errors for the member profile pages. (screenshot: crawl_error_accessDenied.jpg)


    The error detail for the member page community/members/rith.278948 shows as below:
    (screenshot: crawl_error_accessDenied1.jpg)

    So how do I fix these 300,000+ member pages showing as Access Denied in the Google Search Console?

    #3) Yes, I have set up a catch-all URL redirect, so this URL:
    http://www.indusladies.com/forums/blogs/anitap/
    is redirected to http://www.indusladies.com/community/

    (screenshot: crawl_error_soft404.jpg)
    But the Search Console still shows 249 soft 404 errors, and most of them have links like /forums/blogs/<blogger's user name>.
    How do I fix this error in the Search Console?
     
  5. Jake Bunce

    Jake Bunce XenForo Moderator Staff Member

    2) Your member pages are denied to guests, so XF returns a 403. This is normal. It is not a problem to be fixed; 403 is the appropriate response when a user is not allowed to view a page. (You can verify the response code yourself; see the quick check at the end of this post.)

    I am not sure what a robots.txt disallow will show as in that report. Presumably it hasn't taken effect yet.

    3) The redirect appears to be working so you shouldn't get a 404 on that page in the future.
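
    For what it's worth, a quick way to check the status code a crawler receives is a HEAD request from a logged-out context, for example with curl (using the member URL from the post above):

        # -I fetches headers only; expect "HTTP/1.1 403 Forbidden"
        # while guest viewing of member profiles is disabled.
        curl -I http://www.indusladies.com/community/members/rith.278948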
     
  6. LaxmiSathy

    LaxmiSathy Member

    #3) Should I click "Mark as Fixed" for those 404 errors for /forums/blogs? How long will it take to update in the Search Console?
     
  7. LaxmiSathy

    LaxmiSathy Member

    Jake, for the soft 404 errors happening with the /forums/blogs/ URLs, should I do a "Mark as Fixed" as below:

    (screenshot: crawl_error_soft404a.jpg)

    Or should I do a "Fetch as Google" and submit to index, as below:

    (screenshot: crawl_error_soft404b.jpg)
     
  8. LaxmiSathy

    LaxmiSathy Member

    @Jake Bunce @Alfa1
    Additionally, I changed the setting below in the admin CP:
    XML Sitemap Generator > Included sitemap content > Users - unchecked
    so that the community/members URLs do not get submitted for Google to crawl.
    But I am still seeing these URLs under Google Search Console > Crawl Errors > Access Denied.

    (screenshot: crawlError_AccessDenied.jpg)

    How do I fix the Access Denied errors for these URLs?
     
  9. Mike

    Mike XenForo Developer Staff Member

    The sitemap just gives Google an indication of things to index. It still indexes things outside of it, which would include your members. The display here is informational, to let you know that it ran into these errors in case you weren't expecting them. As I assume your member profiles aren't public, that would be expected.
     
  10. LaxmiSathy

    LaxmiSathy Member

    @Mike
    I understand that the Access Denied errors are expected since the member profile pages aren't public on my board. But is there any specific action that needs to be taken on my end so that these errors do not show up in my Google Search Console?
     
  11. Mike

    Mike XenForo Developer Staff Member

    Robots.txt would be the main way to prevent Google from attempting to crawl the pages. Looking at your robots.txt file, I don't see anything attempting to block /community/members/.
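
    For reference, a minimal robots.txt sketch of such a block (assuming the /community/ install path from the first post, applied to all crawlers):

        # Ask all crawlers not to fetch the member profile pages
        User-agent: *
        Disallow: /community/members/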
     
