Fixed Error when importing from vb4: Received Invalid utf-8 for string column [filename]

Hello everyone,

While importing a vBulletin 4.x database into XF 2.0, I got this error:

[InvalidArgumentException] Received Invalid utf-8 for string column [filename]

Can anybody help? Thanks very much.

 
This looks like a bug -- with how the code works, I don't think we're converting the filename to UTF-8.

In terms of a workaround, unfortunately you'd need to identify the attachment in the source that has a filename that contains non-ASCII characters or you'd need to do the import into 1.5 and then upgrade to 2.0.
 
Do you have any solution? My site has more than 1 million users, so I cannot use the XenForo 1.5 import tool.

 
Unfortunately, at this time, we don't really have a specific workaround other than what I've mentioned. You may need to look at the filenames stored in your source database to see if there are ones with non-ASCII characters in them. It would be in the first ~500 attachments (ordered by ID); you would need to remove the attachment or rename the value.
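
As a rough sketch of that filename check, assuming the attachment IDs and filenames have already been read from the source database into an array (the helper name and the sample rows are hypothetical, not part of XenForo or vBulletin):

```php
<?php
// Hypothetical helper: returns the rows whose filename contains any byte
// outside the 7-bit ASCII range. $rows maps attachment ID => filename.
function findNonAsciiFilenames(array $rows): array
{
    return array_filter($rows, function ($filename) {
        // Without the /u modifier the pattern is matched byte by byte.
        return preg_match('/[^\x00-\x7F]/', $filename) === 1;
    });
}

// Sample data only; real rows would come from the source database.
$rows = [
    1 => 'report.pdf',
    2 => "r\xE9sum\xE9.doc", // ISO-8859-1 bytes for "résumé.doc"
    3 => 'photo_01.jpg',
];

// Print the ID and the raw bytes (as hex) of each suspicious filename.
foreach (findNonAsciiFilenames($rows) as $id => $filename) {
    echo $id, ': ', bin2hex($filename), PHP_EOL;
}
```

Printing the hex bytes rather than the filename itself avoids the terminal guessing at the encoding.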

The 2.0 importers are currently all in beta, mostly because of uncommon problems such as this. The 1.5 importers have been tested over a much longer period and thus would be our current recommendation if you can't wait until a fix for this is available (I couldn't comment as to when that would be yet).
 
Might you be able to show a complete error trace for this problem? You should find it in your server error logs.
 
Hello,

I have the same error.
I wrote a patch in the file src/XF/Import/Importer/vBulletin.php:

PHP:
$import->setDataUserId($this->lookupId('user', $attachment['userid']));
                
// Patch: attachment filename encoding problem
$attachment['filename'] = $this->_getUtf8Cb( $attachment['filename'] );
// End patch

$import->setSourceFile($attachTempFile, $attachment['filename']);

And the method, in the same file:

PHP:
private static function _getUtf8Cb($str){
        // Encode the ISO-8859-1 characters as HTML entities
        $text = htmlentities( $str, ENT_COMPAT | ENT_HTML401, 'ISO-8859-1');
        
        // Windows-1252 bytes that htmlentities() leaves untouched are mapped by hand
        $text = str_replace("\x80", "&euro;", $text); // euro sign
        $text = str_replace("\x92", "&rsquo;", $text); // right single quote
        $text = str_replace("\x91", "&lsquo;", $text); // left single quote
        $text = str_replace("\x96", "&ndash;", $text);  // Word long dash
        $text = str_replace("\x85", "&hellip;", $text); // ellipsis
        $text = str_replace("\x8C", "&OElig;", $text);
        $text = str_replace("\x9C", "&oelig;", $text);
        $text = str_replace("\xC6", "&AElig;", $text);
        $text = str_replace("\xE6", "&aelig;", $text);
        
        // Decode every entity back into UTF-8 characters
        $text = html_entity_decode($text, ENT_QUOTES,'UTF-8');
        
        return $text;
    }
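
A shorter route to the same result might be to let mbstring do the conversion. This is only a sketch: the function name is hypothetical, it assumes the mbstring extension is installed, and it assumes that any filename that is not already valid UTF-8 is encoded as Windows-1252.

```php
<?php
// Hypothetical helper, not part of XenForo: converts a filename to UTF-8,
// assuming any non-UTF-8 input is Windows-1252. Requires mbstring.
function filenameToUtf8(string $filename): string
{
    // Leave strings that are already valid UTF-8 untouched.
    if (mb_check_encoding($filename, 'UTF-8')) {
        return $filename;
    }

    // Assumption: everything else is Windows-1252.
    return mb_convert_encoding($filename, 'UTF-8', 'Windows-1252');
}

// Example call with the ISO-8859-1 byte for "é".
echo filenameToUtf8("caf\xE9.txt"), PHP_EOL;
```

The validity check avoids double-encoding names that are already UTF-8, although a short Windows-1252 string can occasionally pass as valid UTF-8 by coincidence.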
 
I believe we've fixed this, by moving a $this->convertToUtf8() call into the attachment/setSourceFile() method.
 
The fix will roll out with XenForo 2.0.2, but the changes are spread over too many files to post a patch here.
 
Hi,

It is still broken when the source forum is ISO-8859-1 and there are umlauts in the description of an attachment (the settings field in the attachment table).
I've added the following code starting at line 3396 in /addons/XFI/Import/Importer/vBulletin.php:
PHP:
if ( isset($settings['title']) ) {
   $settings['title'] = $this->convertToUtf8($settings['title']);
}
if ( isset($settings['description'])) {
   $settings['description'] = $this->convertToUtf8($settings['description']);
}
 