Fixed Import vBulletin encoding / charset problem

Ouard

Member
Affected version
2.0.1
Hello,

We are migrating from vBulletin 4 to XenForo. You did a great job!

However, we have a problem with XenForo 2.0.1.
Our vBulletin database uses the latin1_swedish_ci collation.

During our first import with version 2.0.0, there were no encoding problems for posts, threads, forums...
When we tried an import with version 2.0.1, some fields were badly encoded (for example, the forum titles in table xf_node).

We found this test in XF\Import\DataManager:

PHP:
if (preg_match('/[\x80-\xff]/', $string))

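This test is true for a latin1 string with accented characters, and it is still true for the same string after it has been converted to UTF-8, so the method converts the string again if it is called a second time.
As a result, the fields are converted twice.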
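
To illustrate the effect, here is a minimal sketch in Python rather than the importer's PHP. The function convert_to_utf8 is a hypothetical stand-in for the conversion step, not the actual XenForo code, and the sample title is made up. It only shows why a high-byte test cannot tell latin1 input apart from text that is already UTF-8.

```python
import re

# Byte-level equivalent of the PHP check preg_match('/[\x80-\xff]/', $string).
HIGH_BYTE = re.compile(rb"[\x80-\xff]")


def convert_to_utf8(raw: bytes, source_charset: str = "latin-1") -> bytes:
    """Hypothetical stand-in for the importer's conversion step."""
    if HIGH_BYTE.search(raw):
        # Any byte >= 0x80 triggers a conversion, even if raw is already UTF-8.
        return raw.decode(source_charset).encode("utf-8")
    return raw


title = "Général".encode("latin-1")  # bytes as stored in a latin1 table
once = convert_to_utf8(title)        # correct UTF-8
twice = convert_to_utf8(once)        # UTF-8 bytes are >= 0x80 too, so it converts again

print(once.decode("utf-8"))   # Général
print(twice.decode("utf-8"))  # GÃ©nÃ©ral
```

The second output is the kind of garbled title we see in xf_node.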

Thank you for your answer
 
Are you doing an import on a different server from where the old forum was running? If so, you may have different MySQL settings and you may need to use the option to force a character set at the beginning of the import (using "latin1").

The code to convert to UTF-8 is necessary and it assumes the content will be received in the character set you have configured within vB's languages (IIRC).
 
I am importing from vBulletin 4.5 to XF2.0.1, with collation latin1, and I have many instances of special characters in my database (such as begin/end quote characters 0x93 and 0x94) from Windows-1252/cp1252.

Take stepPrivateMessages() in vBulletin.php, for instance.
The problem, from what I can gather, is that bulkSet() calls convertToUtf8, and ->save() at the end of the routine calls convertToUtf8 again, and the conversion cannot safely be applied twice. Note that $options['convertUtf8'] is true in the EntityEmulator.
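
As a side note on the 0x93 and 0x94 bytes, here is a small Python sketch (not importer code) of how the same bytes decode under ISO-8859-1 versus Windows-1252. The sample string is made up. It suggests why the source character set matters even before any double conversion happens.

```python
# Curly quotes as stored by a Windows-1252 client: 0x93 ... 0x94.
raw = b"\x93quoted\x94"

as_latin1 = raw.decode("latin-1")  # strict ISO-8859-1: C1 control characters
as_cp1252 = raw.decode("cp1252")   # Windows-1252: real quotation marks

print(repr(as_latin1))  # '\x93quoted\x94'
print(as_cp1252)        # “quoted”

# The correct UTF-8 form, produced by converting exactly once from cp1252.
utf8_bytes = as_cp1252.encode("utf-8")
```

Converting with strict ISO-8859-1 keeps the bytes as invisible control characters, while Windows-1252 yields the intended quotes.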
 
Hello,

We are doing the import on the same server.
We tried another import this morning, with "latin1" in the "Force character set" field.
We still have the encoding problem.

It's strange because some columns in the database are converted correctly. For example, this is a phpMyAdmin screenshot of xf_node:
Capture d’écran 2017-12-28 à 09.59.40.webp
The description is correct, but the title is not.
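
For anyone who wants to check which values are affected, here is a rough heuristic sketched in Python. The helper looks_double_encoded is hypothetical and assumes the extra conversion went through latin1. It can miss cases or give false positives, so treat it as a diagnostic aid only.

```python
def looks_double_encoded(text: str) -> bool:
    """Heuristic: True if undoing one latin1 -> UTF-8 step gives different, valid text."""
    try:
        repaired = text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in latin1, or not valid UTF-8 once reversed:
        # most likely the text was converted only once.
        return False
    return repaired != text


print(looks_double_encoded("GÃ©nÃ©ral"))  # True
print(looks_double_encoded("Général"))    # False
print(looks_double_encoded("General"))    # False
```

In our case, a title would be flagged while the matching description would not.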

Thanks
 
I think we've got this sorted now, hopefully. The main problem was that validTextOrDefault was calling set() for a second time (via a magic method), so the UTF-8 conversion was being done twice. That should now be resolved.
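
For readers following along, the pattern can be sketched roughly as below. This is a simplified Python illustration under stated assumptions, not the actual XenForo code: EmulatedEntity and both valid_text_or_default_* methods are hypothetical names, and the real fix may look different.

```python
class EmulatedEntity:
    """Hypothetical, simplified stand-in for an import entity emulator."""

    def __init__(self, source_charset: str = "latin-1"):
        self.source_charset = source_charset
        self.values = {}

    def _to_utf8(self, raw: bytes) -> bytes:
        return raw.decode(self.source_charset).encode("utf-8")

    def set(self, key: str, raw: bytes) -> None:
        # Every call to set() converts from the source charset to UTF-8.
        self.values[key] = self._to_utf8(raw)

    def valid_text_or_default_buggy(self, key: str, default: bytes) -> None:
        # Re-enters set() with an already converted value: second conversion.
        self.set(key, self.values.get(key) or default)

    def valid_text_or_default_fixed(self, key: str, default: bytes) -> None:
        # Only writes when the value is empty, so existing text is left alone.
        if not self.values.get(key):
            self.values[key] = self._to_utf8(default)


buggy = EmulatedEntity()
buggy.set("title", "Café".encode("latin-1"))
buggy.valid_text_or_default_buggy("title", b"Untitled")

fixed = EmulatedEntity()
fixed.set("title", "Café".encode("latin-1"))
fixed.valid_text_or_default_fixed("title", b"Untitled")

print(buggy.values["title"].decode("utf-8"))  # CafÃ©
print(fixed.values["title"].decode("utf-8"))  # Café
```

The point is only that a validation helper should not route an already converted value back through the converting setter.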
 
Hi everyone!

I'm doing an import from vBulletin 3.8 to XenForo 2.0.1 and I'm having the same problem.

Has this been fixed in the 2.0.1 importer, or should I wait for the next release to do my import?

The importer in 2.0 works fine!
 
This issue was reported and fixed after 2.0.1 was released. It should be addressed in 2.0.2 when it is released.
 