Partial fix vB 3.8 to XF 1.2 RC 1: Some Posts Are Empty

rellek

Hi,

it came to my attention that some posts were empty after being imported. I cannot tell you exactly what caused this, and maybe it's a combination of unfortunate circumstances, but well...

How it all started, as far as I know:
There was a WoltLab Burning Board 2 in which we posted some source code we wrote at school with Turbo Pascal (yes, for DOS).
Then there was an import to vBulletin 3.6, with upgrades to 3.7 and 3.8. And now there has been the import to XF.

Some posts contain umlauts in an old extended-ASCII (DOS) encoding inside the source code, and that looks like the issue.

Some other posts were imported empty when they contained raw GZip output (as when headers have already been sent and GZip-compressed output follows).

This thread is just for reference. I will contact Mike privately with further information, as he already has my database (unless he deleted it).
 
This looks fine to me, even when the umlauts don't import correctly -- I just get a semi-corrupted looking post. If you've moved servers, you may need to try forcing the character set (to latin1).

Both types of conversions shouldn't just blank the post out though -- it should simply ignore/strip/replace invalid characters. There may be an underlying library issue that breaks that (which we couldn't do anything about).
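To illustrate both points (forcing the charset, and ignoring or replacing invalid bytes rather than blanking the post), here is a minimal Python sketch. It is not the importer's code, which is PHP; the helper name `to_text` and the sample bytes are made up for illustration, and the sample assumes the old post was stored as latin1.

```python
def to_text(raw: bytes, charset: str = "utf-8", errors: str = "replace") -> str:
    """Decode post bytes without ever failing on invalid sequences.

    errors="replace" swaps invalid bytes for U+FFFD,
    errors="ignore" strips them.
    """
    return raw.decode(charset, errors=errors)


# Hypothetical sample: "Grüße" as it would be stored in a latin1 database.
raw_post = "Grüße".encode("latin-1")

forced = to_text(raw_post, charset="latin-1")   # charset forced to latin1
replaced = to_text(raw_post)                    # treated as UTF-8, invalid bytes replaced
stripped = to_text(raw_post, errors="ignore")   # treated as UTF-8, invalid bytes stripped

print(forced)
print(replaced)
print(stripped)
```

With the forced charset the umlauts survive; with the lenient UTF-8 modes the post looks semi-corrupted, but it is never empty.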
 
As far as I can tell, the umlauts already broke back when I imported the WBB2 to vB. The servers are the same (source and target).

Did your test import actually import those posts I sent you? Maybe that's a problem with PHP; I'm using 5.4.17 from dotdeb.org. I could give you server access if you want to confirm that it's a library issue and therefore nothing that can be fixed in software.

I do, however, know of one situation in which a string ends up blank: htmlentities returns an empty string when the input is NOT in the default encoding (UTF-8 since PHP 5.4) and contains characters that are invalid in it (like umlauts or '). Is this function called anywhere during the import?
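For reference, the PHP manual documents that htmlentities returns an empty string when the input contains a byte sequence that is invalid in the given encoding, unless ENT_IGNORE or ENT_SUBSTITUTE is set. The following is only a rough Python emulation of that behavior, not PHP and not anything taken from the importer; the function name `escape_like_htmlentities` is hypothetical, and `html.escape` only escapes the special HTML characters rather than every character that has a named entity.

```python
import html


def escape_like_htmlentities(raw: bytes, substitute: bool = False) -> str:
    """Roughly mimic htmlentities() with UTF-8 as the assumed encoding."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        if not substitute:
            # Default behavior being emulated: the whole string is blanked.
            return ""
        # Emulates ENT_SUBSTITUTE: invalid bytes become U+FFFD.
        text = raw.decode("utf-8", errors="replace")
    return html.escape(text)


# Hypothetical post body stored as latin1, so the umlaut bytes are not valid UTF-8.
latin1_post = "<b>Grüße</b>".encode("latin-1")
utf8_post = "<b>Grüße</b>".encode("utf-8")

blank = escape_like_htmlentities(latin1_post)
kept = escape_like_htmlentities(latin1_post, substitute=True)
clean = escape_like_htmlentities(utf8_post)

print(repr(blank))
print(kept)
print(clean)
```

If the import does pass post bodies through such a call without a substitute or ignore flag, a single invalid byte would be enough to blank the whole post.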
 
Yeah, it actually imported them. Most of them looked fine, though I'll admit that, at a cursory glance, I didn't notice any offending characters in them after the import. This could be something on the MySQL end (I did 2 tests, one without forcing the charset and one with, which is what allowed umlauts to work in other posts).

I'm importing in PHP 5.4 as well, so it's not the htmlentities behavior per se. I do see something that could be related, but then I should be seeing the same behavior.

If you want to send me server access, that would help.
 