Traffic Down Since VB to XF Migration

Ok. Then the 4XX responses are your biggest problem.

If that's 46% of your 1.94M crawl requests, that's a ton. It wasn't originally clear whether Google was even crawling your site, but 1.94M crawl requests is a solid amount.

Yes I agree...the 46% "Other client error (4XX)" value in the Crawl Stats area does seem to stand out as something to investigate further.

Here are my Crawl Stats (By Response):

[Attachment: Crawl Stats (By Response) screenshot]

Here's my confusion...what does that 46% represent (46% of what number)? If it's 46% of the 1.94M crawl requests...this could most definitely be the BIG issue I've been looking for...especially since 46% is very close to the 50%+ drop in traffic I've been referring to since the migration to XF.

Curiously...if I click on the Other Client Error section...it only lists 11 URLs for the last 90 days. If this 46% is of the 1.94M...I would think we would expect to see a whole lot more than 11 URLs listed for the past 90 days. Also, when I investigated all 11 URLs...it turns out most of them are deleted threads...or threads in the private Staff area.

For comparison...if I click on the "Moved Temporarily (302)" section (which is only 1%)...it lists 999 URLs. Same thing with "Not found (404)" (2%)...it lists 1000 URLs. 46% must surely represent way more than 11 URLs...so I'm very curious why Google Search Console is only listing 11.

Of course the BIG question is...how can this be resolved?...I'm not exactly sure where to start.

I know you listed some options:

  • Permissions set wrong
  • Google is being directed to pages it's not supposed to be. Sitemap? Bad linking somewhere?
  • other less common ones
I'm not sure how I would go about tackling each of these (where would I start?).

Surely if we're talking 46% of 1.94M (or at least a very, very big number)...the issue must be something that's set up incorrectly to have this large of an effect (it's certainly not a bunch of individual dead links that need to be investigated & fixed one by one).

Additional thoughts I had...could this be something set up incorrectly in one of the following (a quick redirect check sketch follows the list):

  • robots.txt file
  • .htaccess file
  • incorrect redirects
  • etc
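
A quick way to sanity-check the redirect piece of that list (a sketch, not a definitive test; the showthread.php URL is a hypothetical old vB-style link, so substitute a real pre-migration URL from your site):

Code:
# -s silences progress output, -I requests headers only, -L follows redirects.
# This prints every hop, so you can see whether an old vB URL 301s to the
# correct XF thread or dead-ends in a 403/404.
curl -sIL "https://example.com/showthread.php?t=12345"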

I'm open to ideas (anyone). Thanks:)
 
When you click on "Other Client Error (4XX)", on the page with the 11 URLs listed, there is a graph on top and it should have a number, "Total Crawl Requests". That's how many requests threw that error in the last 90 days.
 
Hopefully I followed your instructions correctly.:) Here's the Google Analytics graph for Pageviews for the same time period as the previous graphs (May 1, 2020 thru Sept 30, 2020)...Direct Traffic & Organic Traffic. The red circle is when the migration to XF took place (1st week of July 2020):

[Attachments 244849 & 244848: Google Analytics Pageviews graphs, Direct Traffic & Organic Traffic]

Hope this helps,

Thanks
From the looks of it, the "direct traffic" was also negatively affected quite a lot. Those are usually the visitors coming in from other sites through backlinks. To me it really does seem like the URL structure and redirects are not working correctly after the migration.

Did you try to enter your domain here: https://ahrefs.com/broken-link-checker ? It will tell you which pages have many backlinks but are "not found" on your site (which means lost ranking potential). Make sure to select "inbound links":
[Attachment: Ahrefs broken link checker screenshot with "inbound links" selected]
At the bottom of the report it will say the total number of broken inbound links.
If that number is high, it might make sense to do the 7-day trial for $7 and export that data (they have that functionality).

GSC should also show similar data somewhere as explained by other posters, but I prefer Ahrefs as it is a really powerful tool :)
 
Hello mazzly. Thanks very much for the help & ideas.

This subject has been taking up a lot of my time...especially figuring out exactly what the issue is...and how to fix it. I want to install the XF add-on you suggested earlier...and see what it says...as well as try the 7-day $7 Ahrefs trial you mentioned.

With the number of links/URLs involved...I'm thinking it's got to be some sort of setup issue.

If you have any other ideas/suggestions on how I should proceed in tackling this...please post.

Thanks:)
 
Thanks Arn. I did this...and I think things are starting to make more sense.

"Total Crawl Requests" on the page with the 11 URL's [Other client error (4xx)] = 898K. Taking 1.94M (total crawl requests) x 0.46 = 892K (roughly the total crawl requests for "Other client error (4xx)".

Knowing this...the one thing that's confusing is why doesn't this 898K show up in the GSC Coverage report (why isn't it called out more clearly)?

  • The "red" Coverage "Error" category does not call it out (it only shows a total of 110).
  • The 898K could possibly fall into the "gray" Coverage "Excluded" category...it has a much higher value (1.77M).

Looking thru the individual "Types" listed for the Excluded category..."Page with redirect" is the largest at 1.24M. Maybe the 898K of "Other client error (4xx)" requests is the major portion of this 1.24M "Page with redirect" category?

Thanks
 
I would search for "404" and "403" status codes in there, as that is likely what is problematic.

If on Linux, something like:
Code:
# The spaces around 404 keep it from matching digits inside URLs or byte counts.
grep " 404 " /var/log/your-server-logs.log
should do the trick :)
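
To catch the 403s in the same pass (a variant of the above, assuming the same hypothetical log path):

Code:
# " 40[34] " matches a 403 or a 404 status code; the surrounding spaces
# keep it from matching digits inside URLs or byte counts.
grep -E " 40[34] " /var/log/your-server-logs.log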

Cool...thank you sir. Will look for both!:)
 
404s are counted separately (GSC gives them their own "Not found (404)" bucket), so his error codes are non-404 4XX responses.

Maybe grep for "Googlebot" first and then you can look through that.
 
Unfortunately this is something we noticed ourselves after shifting to XF from vB years back. We still have no idea what the reason behind it was.
 
The problem is that under vB, for example, user profiles were freely visible, so Google indexed thousands of these profiles.
With the migration to XF, user profiles are mostly visible only to registered users. Google crawls as a guest, so it gets a 403 error when crawling these profiles. Since Google no longer has access to them, it throws these profile links out of the index, and thousands of 403 errors pile up. As we have seen ourselves, this also comes with a devaluation, and your "values" on Google decline.

In addition, there are things like guests (and therefore Google) no longer having access to downloads, postings that were still delivered to Google in the vB sitemap no longer appearing in the XF sitemap, and goto/post URLs being forbidden via robots.txt. Millions of items of content get thrown out of the index by Google, as happened here. Of course, this results in worse ratings on Google. (...)
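
A quick way to verify this from the command line (a sketch; the member URL is a made-up example, so substitute one of the URLs from your GSC list):

Code:
# Fetch the page as a guest with Googlebot's user agent and print only
# the HTTP status code. A 403 confirms the page is blocked for
# logged-out visitors.
curl -s -o /dev/null -w "%{http_code}\n" -A "Googlebot" "https://example.com/members/someuser.12345/"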
 
Awesome detailed explanation...thanks very much!:)
 
I'm working with my site host to search the server logs for the "4xx" error codes. They're saying that since there are close to 30 different "4xx" error codes...it can be difficult to search the logs for everything with a "4" in it.

What would be the next most common "4xx" error codes (after 403s and 404s) to search the server logs for...that might be related to the traffic decrease discussed in this thread?
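
Rather than enumerating all ~30 codes, a single pattern can match any 4xx status (a sketch, assuming a standard access-log line where the status code sits between spaces, and a hypothetical log path):

Code:
# " 4" followed by two more digits and a space matches any 4xx status code.
grep "Googlebot" /var/log/your-server-logs.log | grep -E " 4[0-9]{2} "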

Thanks
 
That was not all. ;)

Not only can you increase your values on Google again as already described, you can also use the powerful possibilities of XF and convert old topics into question/answer (Q&A) topics, articles (news etc.), and solution topics.
If Google recognizes such topics, separate categories appear in GSC, e.g. https://search.google.com/search-console/q-and-a?resource_id=yoururl. This increases the "values" on Google again. :D
 

In the 1st week of July 2020 you have a drop in direct traffic pageviews as well. I would guess that visitors started to hit the "you have no rights to see this page" (403) error. Check the same thing in Google Analytics as on your screenshot, but choose Bounce Rate instead of Pageviews. I bet it jumped up for both organic & direct traffic. That would mean you need to check permissions to find out what became unavailable to logged-out visitors (but was available before): can view attachments, can view user profiles, can view threads in this node, etc.
 
I searched the server log for "Googlebot"...and got a total of 1875 hits (first 12 hours of January 26th). Searching thru all these hits for any "4xx" errors is quite a task. Any suggestions for making it easier?

If there were specific individual "4xx" error codes to search for (other than 403 & 404)...I'm thinking this would be a lot easier.:)

Thanks
 
Did a quick page-by-page scan of the server log for "Googlebot" (first 12 hours of January 26th). Mostly seeing 200s...some 301s...a few 303s...and a few 403s.

From the quick scan I didn't see any 404s. Didn't see any non-403/404 "4xx" error codes either. But again...it was a quick scan of 1875 Googlebot hits.

Thanks
 
grep "Googlebot" logfile.txt | grep -v " 200 " | grep -v " 301 " | grep -v " 302 "

that greps for googlebot and excludes lines with 200, 301, 302
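
If the remaining output is still long, a per-status tally gives a quick overview (a sketch, assuming the common combined log format where the status code is the 9th whitespace-separated field; adjust $9 to match your log format):

Code:
# Count Googlebot requests per HTTP status code, most frequent first.
grep "Googlebot" logfile.txt | awk '{print $9}' | sort | uniq -c | sort -rn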
 