XF 1.4 Sitemap Full of Strange Characters

Discussion in 'Troubleshooting and Problems' started by mjda, Aug 7, 2014.

  1. mjda

    mjda Active Member

    I built the sitemap on my test install, downloaded it, opened it up, and there is just a bunch of strange characters and symbols. When I download, and view, the sitemap here on Xenforo I see plain text. I'm assuming this has something to do with the way my server is configured (gzip maybe?), but I have no idea how to fix it.

    Anyone have any ideas?
  2. Mike

    Mike XenForo Developer Staff Member

    The sitemap is generally gzipped.
  3. mjda

    mjda Active Member

    Well, yeah, I'm talking about the XML that is left after it's extracted. It's unreadable. I could send you a copy of it if you give me an email address, or I can PM you a screenshot of it open in Notepad++ if that will work.
  4. Mike

    Mike XenForo Developer Staff Member

    Sounds like it didn't extract properly to me. Ideally, send me the URL to the sitemap script itself.
  5. mjda

    mjda Active Member

    I downloaded it a few times and, like I said, the one from xenforo.com looks fine. I used the same program to extract them both. In any case, I just sent you a PM with a link to my sitemap URL.
  6. Mike

    Mike XenForo Developer Staff Member

    Just for reference, 2 people have sent me links and I'm 99.9% sure it's just down to the program opening it. The file is a gzip file (not a zip file). I think WinZip may be the problematic program here.

    (And for completeness' sake, the sitemap index file is not run through gzip as it stands, so it's not a fair comparison.)
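
    To make the gzip-versus-zip distinction concrete, here is a minimal Python sketch that identifies a file by its leading bytes rather than by its extension. The function name `sniff_archive_type` is made up for illustration and is not part of XenForo; the magic numbers are the standard gzip and zip signatures.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"      # first two bytes of every gzip stream
ZIP_MAGIC = b"PK\x03\x04"     # local file header of a non-empty zip archive


def sniff_archive_type(head: bytes) -> str:
    """Classify data as 'gzip', 'zip', or 'unknown' from its first bytes."""
    if head.startswith(GZIP_MAGIC):
        return "gzip"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


if __name__ == "__main__":
    # Build one small sample of each format in memory.
    gz_bytes = gzip.compress(b"<urlset/>")

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("sitemap-1.xml", "<urlset/>")

    print(sniff_archive_type(gz_bytes))
    print(sniff_archive_type(buf.getvalue()))
    print(sniff_archive_type(b"<?xml version='1.0'?>"))
```

    For a downloaded file, reading the first four bytes with `open(path, "rb").read(4)` and passing them to the function is enough.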
  7. TJA

    TJA Well-Known Member

    http://www.rarlab.com/download.htm that might help for extracting it
  8. mjda

    mjda Active Member

    I just realized something new concerning this. If I download the xml.gz file directly via FTP, I can extract it and read it just fine. That tells me it's not the program at all; some server setting is altering the sitemap file when it's sent over HTTP. To take it a step further, I downloaded the file over HTTP, then uploaded it to my server again so I could extract it using gunzip. Even then, the file was unreadable.
  9. mjda

    mjda Active Member

    WinRAR is what I'm using. The other person Mike was talking about who sent him a link is the one using WinZip.
  10. Mike

    Mike XenForo Developer Staff Member

    Well, I've downloaded the sitemap from both sites and extracted it without issue. (The first one, I also checked with wget and gunzip from our server.) You can try manually submitting it to Google via Webmaster Tools -- that will show you the status of Google processing it.

    That said, are you able to identify what the differences are between the 2 files you downloaded? It's worth noting that sitemap.php literally just spits out some headers and calls readfile() to output it. Any variation in content may imply something on the server modifying the content.
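
    One way to look for such differences is to compare the two downloads byte for byte. The following is only a sketch: the helper names `describe` and `first_difference` are hypothetical, and the sample data stands in for the FTP and HTTP copies of the file.

```python
import gzip
import hashlib
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def describe(data: bytes) -> dict:
    """Summarize a download: size, SHA-256 digest, and whether it looks like gzip."""
    return {
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "is_gzip": data[:2] == GZIP_MAGIC,
    }


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One file is a prefix of the other; they diverge where the shorter ends.
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    # Stand-ins for the two copies: the file as stored on disk (FTP) and a
    # hypothetical copy that picked up an extra gzip layer in transit (HTTP).
    via_ftp = gzip.compress(b"<urlset></urlset>", mtime=0)
    via_http = gzip.compress(via_ftp, mtime=0)

    print(describe(via_ftp))
    print(describe(via_http))
    print("first differing offset:", first_difference(via_ftp, via_http))
```

    With real files, the two byte strings would come from `open(path, "rb").read()` on each download.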
  11. Karelke

    Karelke Active Member

    I had to rename the file and add the .gz extension before unzipping.

    Works perfectly.

    I think WinRAR/WinZip cannot recognize the file without the extension.
  12. EQnoble

    EQnoble Well-Known Member

  13. mjda

    mjda Active Member

    I was actually able to find 1 difference. The sitemap on xenforo.com was compressed to 20%. The one I downloaded from my site didn't show as compressed at all. That got me thinking, so I started looking at the file a bit closer. I had a file called sitemap-1.xml.gz (downloaded from my test server). I extracted that to sitemap-1.xml. I then renamed sitemap-1.xml to sitemap-1-2.xml.gz. I was able to extract it again and view it just fine. So, basically, I had to extract the same file twice to get it to work.

    I went and checked my server settings and realized I had server gzip compression disabled (don't ask me why). I enabled it and restarted Apache. Now things are working as expected: I can download the file from sitemap.php, extract it, and read it just fine.
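
    For anyone who hits the same "extract it twice" symptom, here is a rough Python sketch that counts how many gzip layers wrap a file. The function name `peel_gzip_layers` and the `max_layers` cap are made up for illustration, and the sample data simulates the double-compressed download rather than reproducing the actual file.

```python
import gzip
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def peel_gzip_layers(data: bytes, max_layers: int = 5) -> Tuple[bytes, int]:
    """Decompress repeatedly while the data still starts with the gzip magic.

    Returns the innermost payload and the number of layers removed.
    max_layers is an arbitrary safety cap for this sketch.
    """
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers


if __name__ == "__main__":
    xml = b'<?xml version="1.0" encoding="UTF-8"?><urlset></urlset>'

    # Simulate a sitemap that ended up gzipped twice.
    double = gzip.compress(gzip.compress(xml))

    payload, layers = peel_gzip_layers(double)
    print("gzip layers:", layers)
    print(payload.decode("utf-8"))
```

    A result of 1 layer is what a normally served sitemap-1.xml.gz should give; 2 matches the symptom described above.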