
XF 1.5 Not Getting Any Crawlers On Site

#1
Hi, I barely know what I'm doing.

Our website stopped getting crawlers at some point and no longer seems to be indexed all that well on the Googles. We realized that we didn't have something called an SSL Certificate, which my admins tell me was very important to have for this sorta thing, so we got it and... still nothing!

From my research it seems we may just be in the "wait it out" phase and Google may get on top of it sometime down the line but I thought hey, in the meantime, why not ask around and see what other things may be contributing to our lack of Google indexing? Couldn't hurt, right?

Our site here if you need to take a look for some reason: www.sakugacity.com

Thank you very much for your time.
 

Matt C.

Active member
#2
If you go to the Google Search Console, add your site, and then submit your sitemap, it will speed up the process considerably. You can download your sitemap from /admin.php?logs/sitemap
 

OperaManiac

Well-known member
#3
SSL is one of the hundreds of parameters that Google considers when deciding your ranking. It has nothing to do with whether your website gets indexed.
If your domain is new, Google may take a few days to start indexing it. It might already be indexing it but not yet showing it in search results.
Submitting a sitemap is one of the best things you can do to help Google map out your website's structure. There are also options in the console where you can ask Google to index a particular web page. So, try that and see if it helps with a non-indexed page.
You can also use the webmasters console to check that your website is not blocking Google's search engine spiders. The entire platform is pretty handy for a web administrator.
Your post does indicate that the website was indexed fine in the past. I have seen something similar happen on my domain too. For a few days, indexing becomes incredibly slow, and new content is indexed only after 2-3 days. But it usually sorts itself out.
Your main job is to check that your website is not blocking crawlers and is not infected with malware that might be redirecting traffic.
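
For anyone who wants to double-check the "not blocking crawlers" part outside the console, here is a minimal Python sketch using the standard library's robots.txt parser. The rules in SAMPLE_ROBOTS are made up for illustration and are not this site's actual robots.txt; is_allowed is just a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only, NOT the real robots.txt of the site.
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /admin.php",
]

print(is_allowed(SAMPLE_ROBOTS, "Googlebot", "https://www.sakugacity.com/sitemap.php"))  # True
print(is_allowed(SAMPLE_ROBOTS, "Googlebot", "https://www.sakugacity.com/admin.php"))    # False
```

To test a live site, the same parser can load the real file with set_url() and read() instead of parse(). Note that this only covers robots.txt rules; it says nothing about blocks applied by the server configuration itself.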
 
#4
Quote (post #2):
If you go to the Google Search Console, add your site, and then submit your sitemap, it will speed up the process considerably. You can download your sitemap from /admin.php?logs/sitemap

I've figured out the adding my site thing but the sitemap process eludes me. The search console is asking me to submit something "www.sakugacity.com/" but I'm not sure what I'm supposed to put here. Am I putting the .XML file that XenForo spat out somewhere specific? Sorry, I've never ever done this before!

Quote (post #3):
SSL is one of the hundreds of parameters that Google considers when deciding your ranking. It has nothing to do with whether your website gets indexed.
If your domain is new, Google may take a few days to start indexing it. It might already be indexing it but not yet showing it in search results.
Submitting a sitemap is one of the best things you can do to help Google map out your website's structure. There are also options in the console where you can ask Google to index a particular web page. So, try that and see if it helps with a non-indexed page.
You can also use the webmasters console to check that your website is not blocking Google's search engine spiders. The entire platform is pretty handy for a web administrator.
Your post does indicate that the website was indexed fine in the past. I have seen something similar happen on my domain too. For a few days, indexing becomes incredibly slow, and new content is indexed only after 2-3 days. But it usually sorts itself out.
Your main job is to check that your website is not blocking crawlers and is not infected with malware that might be redirecting traffic.

Using your guidance I've determined that all of Google's bots are "Allowed" to visit my site. So, that must be good news!

Unfortunately, as I stated above, I have no idea how this sitemap wizardry works.
 

OperaManiac

Well-known member
#5
Your forum's sitemap is here: https://www.sakugacity.com/forums/-/index.rss
You need to add this to the sitemap field.
 

Matt C.

Active member
#6
Quote (post #4):
I've figured out the adding my site thing but the sitemap process eludes me. The search console is asking me to submit something "www.sakugacity.com/" but I'm not sure what I'm supposed to put here. Am I putting the .XML file that XenForo spat out somewhere specific? Sorry, I've never ever done this before!
Using your guidance I've determined that all of Google's bots are "Allowed" to visit my site. So, that must be good news!
Unfortunately, as I stated above, I have no idea how this sitemap wizardry works.

On the Search Console, it will ask you to enter the URL for the sitemap. Yours is https://sakugacity.com/sitemap.php, so just enter sitemap.php like so:
Screenshot (23).png

Quote (post #5):
Your forum's sitemap is here: https://www.sakugacity.com/forums/-/index.rss
You need to add this to the sitemap field.

That is not correct. That is the forum's RSS feed. The sitemap is located at /sitemap.php.
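
If it helps to tell the two apart, a sitemap and an RSS feed are both XML but have different root elements: a sitemap starts with urlset (or sitemapindex), while a feed starts with rss (or feed for Atom). Below is a rough Python sketch of that check. The two sample documents are invented for illustration, and classify_xml is a hypothetical helper, not something XenForo or Google provides.

```python
import xml.etree.ElementTree as ET


def classify_xml(xml_text):
    """Classify an XML document as 'sitemap', 'feed', or 'unknown' by its root tag."""
    root = ET.fromstring(xml_text)
    # Namespaced tags look like "{namespace}name"; keep only the local name.
    local_name = root.tag.rsplit("}", 1)[-1]
    if local_name in ("urlset", "sitemapindex"):
        return "sitemap"
    if local_name in ("rss", "feed"):
        return "feed"
    return "unknown"


# Invented samples for illustration only.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.sakugacity.com/</loc></url>
</urlset>"""

SAMPLE_RSS = """<rss version="2.0">
  <channel><title>Example forum</title></channel>
</rss>"""

print(classify_xml(SAMPLE_SITEMAP))  # sitemap
print(classify_xml(SAMPLE_RSS))      # feed
```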
 
#10
Been talking to my host a lot and they are baffled, there's no good reason that Google or anyone does not have permission to view my sitemap. They advised to use .xml format and to generate a new sitemap. XenForo's admin panel is no longer letting me download one (I get a vague error that it just didn't work) so I used a sitemap generating site to whip one up and slapped it in there.

I've been able to ascertain that file viewing services like Redleg are unable to see my sitemap.xml file due to inadequate permissions. My sitemap.xml file is sitting in my public_html folder (and also in the root, just in case) so this shouldn't be happening.

Random question: I don't know if this is relevant or not but I checked permission on my public_html folder itself and it reads thusly, could this be my problem, or is this how it should be?

permissions.jpg
 
#12
I'm sorry but the stuff they're talking about in there is beyond the scope of my limited abilities. I don't really understand what Brogan is saying in there.

Is it possible that by generating a sitemap.xml and deleting the sitemap.php that I've somehow goofed up XenForo's ability to even generate new ones? XenForo assures me it's rebuilt a new sitemap when I try to do it but when I attempt to view it I get an error like the page doesn't exist.
 

OperaManiac

Well-known member
#13
Based on that link, you should put 5 in the last field so that the three values become 7 5 5, and then check again. The third field currently shows 0 in your screenshot.
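
In case the three numbers are unclear: they are octal digits for owner, group, and everyone else, where 4 means read, 2 means write, and 1 means execute (for a folder, "execute" means being allowed to enter it). Here is a small Python sketch that spells them out. I am assuming the current value is 7 5 0 based on the description of the screenshot; describe_mode is just a hypothetical helper.

```python
import stat


def describe_mode(octal_digits):
    """Turn permission digits such as '755' into the ls-style string for a directory."""
    mode = int(octal_digits, 8)
    return stat.filemode(stat.S_IFDIR | mode)


print(describe_mode("755"))  # drwxr-xr-x  (everyone else can read and enter the folder)
print(describe_mode("750"))  # drwxr-x---  (assumed current value: no access for everyone else)
```

Whether the third digit matters depends on which user the web server runs as on your host, so treat this as an explanation of the notation rather than a diagnosis.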
 
#14
Quote (post #13):
Based on that link, you should put 5 in the last field so that the three values become 7 5 5, and then check again. The third field currently shows 0 in your screenshot.

Weird. I can't change the permissions in my file manager. Gonna go ahead and guess that's a web host problem?

Just found this in the Google Search Console. Is this helpful for diagnosing what's going wrong? fdsfdfdsfsdfsdsfd.jpg
 
#17
Okay, I think my host found the target: our htaccess file.

The moment he disabled it, Fetch as Google worked like a charm and the Google Search Console was able to pick up our sitemap. Unfortunately, a bunch of stuff on the forum broke and began returning 404 errors, so we had to re-enable the htaccess file so that the forum could still function. So, something in that file is both blocking Googlebot and keeping the forum running.
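
One way to narrow this down from the outside is to request the same URL once with Googlebot's user agent string and once with a browser-like one, then compare the HTTP status codes. This is only a sketch: BROWSER_UA is a made-up string, compare_user_agents is a hypothetical helper, and a spoofed user agent cannot reproduce rules that key on Google's IP addresses. Also note that urlopen follows redirects, so the status reported is that of the final response.

```python
import urllib.error
import urllib.request

# Googlebot's published desktop user agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Made-up browser-like string, for comparison only.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


def fetch_status(url, user_agent):
    """Return the HTTP status code the server sends for url with this user agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions; the code is still useful.
        return error.code


def compare_user_agents(url, fetch=fetch_status):
    """Fetch url with both user agents and return the two status codes."""
    return {
        "googlebot": fetch(url, GOOGLEBOT_UA),
        "browser": fetch(url, BROWSER_UA),
    }


# Example call (needs network access, so it is left commented out):
# print(compare_user_agents("https://www.sakugacity.com/sitemap.php"))
```

If both codes match, the block is probably not based on the user agent, and the rewrite rules or file permissions become the more likely suspects.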
 
#19
Quote:
Post the contents of the htaccess file here.

This part's a bit confusing to me; we seem to have two of them.

One is sitting in the root directory and it contains this:
Code:
<IfModule mod_deflate.c>
    SetOutputFilter DEFLATE
    <IfModule mod_setenvif.c>
        # Netscape 4.x has some problems...
        BrowserMatch ^Mozilla/4 gzip-only-text/html
        
        # Netscape 4.06-4.08 have some more problems
        BrowserMatch ^Mozilla/4\.0[678] no-gzip
        
        # MSIE masquerades as Netscape, but it is fine
        # BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
        
        # NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
        # the above regex won't work. You can use the following
        # workaround to get the desired effect:
        BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html
        
        # Don't compress images
        SetEnvIfNoCase Request_URI .(?:gif|jpe?g|png)$ no-gzip dont-vary
    </IfModule>
    
    <IfModule mod_headers.c>
        # Make sure proxies don't deliver the wrong content
        Header append Vary User-Agent env=!dont-vary
    </IfModule>
</IfModule>

# Use PHP56 as default
AddHandler application/x-httpd-php56 .php

...and one is sitting in our public_html directory but is named htaccess.txt. It contains this:
Code:
#    Mod_security can interfere with uploading of content such as attachments. If you
#    cannot attach files, remove the "#" from the lines below.
#<IfModule mod_security.c>
#    SecFilterEngine Off
#    SecFilterScanPOST Off
#</IfModule>

ErrorDocument 401 default
ErrorDocument 403 default
ErrorDocument 404 default
ErrorDocument 405 default
ErrorDocument 406 default
ErrorDocument 500 default
ErrorDocument 501 default
ErrorDocument 503 default

<IfModule mod_rewrite.c>
    RewriteEngine On

    #    If you are having problems with the rewrite rules, remove the "#" from the
    #    line that begins "RewriteBase" below. You will also have to change the path
    #    of the rewrite to reflect the path to your XenForo installation.

    #RewriteBase /xenforo

    #    This line may be needed to enable WebDAV editing with PHP as a CGI.

    #RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]

    RewriteCond %{HTTP_HOST} ^[^.]+\.[^.]+$
    RewriteRule ^(.*)$ https://www.%{HTTP_HOST}/$1 [L,R=301]

    RewriteCond %{REQUEST_FILENAME} -f [OR]
    RewriteCond %{REQUEST_FILENAME} -l [OR]
    RewriteCond %{REQUEST_FILENAME} -d
    RewriteRule ^.*$ - [NC,L]
    RewriteRule ^(data/|js/|styles/|install/|favicon\.ico|crossdomain\.xml|robots\.txt) - [NC,L]
    RewriteRule ^.*$ index.php [NC,L]
</IfModule>

<IfModule mod_deflate.c>
  # Compress HTML, CSS, JavaScript, Text, XML and fonts
  AddOutputFilterByType DEFLATE application/javascript
  AddOutputFilterByType DEFLATE application/rss+xml
  AddOutputFilterByType DEFLATE application/vnd.ms-fontobject
  AddOutputFilterByType DEFLATE application/x-font
  AddOutputFilterByType DEFLATE application/x-font-opentype
  AddOutputFilterByType DEFLATE application/x-font-otf
  AddOutputFilterByType DEFLATE application/x-font-truetype
  AddOutputFilterByType DEFLATE application/x-font-ttf
  AddOutputFilterByType DEFLATE application/x-javascript
  AddOutputFilterByType DEFLATE application/xhtml+xml
  AddOutputFilterByType DEFLATE application/xml
  AddOutputFilterByType DEFLATE font/opentype
  AddOutputFilterByType DEFLATE font/otf
  AddOutputFilterByType DEFLATE font/ttf
  AddOutputFilterByType DEFLATE image/svg+xml
  AddOutputFilterByType DEFLATE image/x-icon
  AddOutputFilterByType DEFLATE text/css
  AddOutputFilterByType DEFLATE text/html
  AddOutputFilterByType DEFLATE text/javascript
  AddOutputFilterByType DEFLATE text/plain
  AddOutputFilterByType DEFLATE text/xml

  # Remove browser bugs (only needed for really old browsers)
  BrowserMatch ^Mozilla/4 gzip-only-text/html
  BrowserMatch ^Mozilla/4\.0[678] no-gzip
  BrowserMatch \bMSIE !no-gzip !gzip-only-text/html
  Header append Vary User-Agent
</IfModule>

<ifModule mod_expires.c>
     ExpiresActive On

     ############################################
     ## Add default Expires header
     ## http://developer.yahoo.com/performance/rules.html#expires

     <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
     ExpiresDefault "access plus 1 year"
     </FilesMatch>
</ifModule>
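
A side note on the listing above, in case it is useful to whoever reads it: the RewriteCond on HTTP_HOST only matches host names with exactly one dot, so a request to the bare domain gets a 301 redirect to the https://www. address, while the www host is left alone. Here is a rough Python sketch of just that pair of lines, with Python's re module standing in for Apache's regex engine. redirect_target is a hypothetical helper, it ignores every other rule in the file, and it assumes the live .htaccess contains the same two lines, which has not been confirmed.

```python
import re

# Same pattern as the RewriteCond: one or more non-dot characters, a dot, more non-dot characters.
BARE_HOST = re.compile(r"^[^.]+\.[^.]+$")


def redirect_target(host, path):
    """Return the redirect URL for a bare-domain request, or None if no redirect applies.

    path is given without the leading slash, as RewriteRule sees it in .htaccess context.
    """
    if BARE_HOST.match(host):
        return f"https://www.{host}/{path}"
    return None


print(redirect_target("sakugacity.com", "sitemap.php"))      # https://www.sakugacity.com/sitemap.php
print(redirect_target("www.sakugacity.com", "sitemap.php"))  # None
```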
 

OperaManiac

Well-known member
#20
There should be a file named .htaccess in the public_html folder. Your FTP client might not show it by default, as some apps hide files whose names start with a dot. Try to find this particular file and paste its contents here.