Googlebot found an extremely high number of URLs on your site

Discussion in 'XenForo Questions and Support' started by TheBigK, Oct 15, 2012.

  1. TheBigK

    TheBigK Well-Known Member

    Looks like our affair with the Google Bot isn't ending. I checked Google Webmaster Tools and it's reporting -

    I've absolutely no clue what's going wrong. In the last ~11 months of running XenForo I've never had this issue, and I haven't made any changes to the website that could have caused it.

    Can someone inspect our site and see if anything needs to be fixed?
  2. Walter

    Walter Well-Known Member

  3. TheBigK

    TheBigK Well-Known Member

  4. DBA

    DBA Well-Known Member

  5. TheBigK

    TheBigK Well-Known Member


    High: 101,242
    Average: 60,233
    Low: 4,442
  6. DBA

    DBA Well-Known Member

    Hmm, I also noticed a very sharp drop in my crawl stats around the end of Sept. When was it that Google changed its algorithm?

    Double checked my Analytics and my Google traffic is still on an upward trend though.
  7. CyclingTribe

    CyclingTribe Well-Known Member

    Yup, we've seen a similar indexing dip too. Our graph looks very similar to CrazyE's. ;)
  8. TheBigK

    TheBigK Well-Known Member

    Have you experienced a traffic drop as well?
  9. DBA

    DBA Well-Known Member

    So you're also seeing a drop in traffic? How bad is it?
  10. TheBigK

    TheBigK Well-Known Member

    Kind of bad. It's 40% down.
  11. DBA

    DBA Well-Known Member

    It happened at the same time that your crawl stats dropped?
  12. MagnusB

    MagnusB Well-Known Member

    You can add a few more to that robots.txt:
    Disallow: /forums/-/
    Disallow: /help/
    Disallow: /recent-activity/
    Disallow: /login/
    Disallow: /lost-password/
    Disallow: /misc/contact/
    Disallow: /online/
    Disallow: /register/
    Disallow: /search/
    Not sure just how many that would block out though. My entire robots.txt only blocks 920 URLs, but mine is by no means a big board (it's very small, with relatively few posts). Do you also get a lot of duplicate content warnings? If you have a high number of duplicate title tags etc., it is usually an indication that Google is not ignoring something it should be. That would be a good place to start.
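
    For anyone who wants to sanity-check rules like the ones above offline, here is a rough sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line and the sample paths are assumptions added for illustration (the post lists only the Disallow lines). Python's parser does plain prefix matching, so it is only an approximation of how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Disallow lines quoted from the post above. The "User-agent: *" line is an
# assumption: the post does not show which user-agent group the rules sit in.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forums/-/
Disallow: /help/
Disallow: /recent-activity/
Disallow: /login/
Disallow: /lost-password/
Disallow: /misc/contact/
Disallow: /online/
Disallow: /register/
Disallow: /search/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_blocked(parser: RobotFileParser, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules disallow this path for the given user agent."""
    return not parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)

# Hypothetical sample paths, not taken from any real site.
for path in ["/search/123/", "/login/", "/threads/example.1/"]:
    print(path, "blocked" if is_blocked(parser, path) else "allowed")
```

    Running the same check against a list of URLs from the crawl report would show which ones the rules actually cover.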
  13. lazer

    lazer Well-Known Member

    Pretty similar for us too... (not the "extremely high number of URLs", just the dropping crawl stats)...
    [Attachment: Screen Shot 2012-10-15 at 17.49.52.png]

    ..although traffic is on a steady climb week on week. Back to pre-conversion figures finally (converted in April).
  14. Digital Doctor

    Digital Doctor Well-Known Member

    Is this somehow related to Google not liking you deleting your WordPress tags?
  15. TheBigK

    TheBigK Well-Known Member

    The traffic has been slightly on the incline in the last few days, but GWT is now reporting this new error. I'm convinced that an error-free website would make Google send the love again.

    I'm wondering why Googlebot is indexing URLs that I've blocked through robots.txt. Can someone check if my robots.txt is correct?
  16. TheBigK

    TheBigK Well-Known Member

    Yes, I do have duplicate content reported (about 1000 URLs), but I can't do anything about it, as it seems to be clearly an error on Google's side. There are URLs that don't exist on our site that are marked as 'duplicate'.
  17. MagnusB

    MagnusB Well-Known Member

    Ahh, are those pages 404 pages? Do they return 404, or 200? 'Cause if they don't return 404, Google will mark them as duplicates.
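
    A quick way to see what a supposedly non-existent URL really returns is to request it and look at the status code. This is only a sketch with Python's standard library; the function names and the example URL in the comment are made up for illustration, and redirects are followed, so the code reported is the one for the final page.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (after following redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses; the code is still useful.
        return error.code


def answers_like_a_real_page(status: int) -> bool:
    """True if a URL that should not exist responds with a 2xx success code.

    That is the situation described above: the page is missing, but the
    server does not say so with a 404.
    """
    return 200 <= status < 300


# Example usage (hypothetical URL, needs network access):
# status = fetch_status("https://example.com/threads/does-not-exist.999999/")
# print(status, answers_like_a_real_page(status))
```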
  18. TheBigK

    TheBigK Well-Known Member

    This is how it reports 'Pages with duplicate meta description' -

    [Attachment: Screen Shot 2012-10-16 at 12.24.10 PM.png]

    It's basically forum pagination being reported as 'duplicate'.
  19. TheBigK

    TheBigK Well-Known Member

    I've already begun the process of removing the duplicate content. Just found out that a few of the new members have created duplicate threads in multiple forums to attract attention and responses. :(

    I'm not sure why Google is indexing 'Find-New'. Is my robots.txt correct?
  20. MagnusB

    MagnusB Well-Known Member

    Do you see the same robots.txt in WMT? You can also test URLs in that tool. You can also do a folder removal of /find-new/, but then it has to be in robots.txt. It seems right to me.
