Fixed vB4 Import: Post Titles are never imported (getPostMessage seems broken)

Steffen

Affected version
2.0.2
I think this method is supposed to prepend the post title to the post message unless the title is "Re: [thread title]". But the code cannot work because it doesn't know the thread title. At the moment it essentially checks whether $postTitle matches (Re: )?$postTitle, which is always true.
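To illustrate why the check can never fire, here is a minimal standalone sketch that builds the regex the same way the current method does. The function name originalCheckMatches is hypothetical and not part of XenForo; it only mirrors the pattern construction.

```php
<?php
// Hypothetical helper (not XenForo code): mirrors how the current
// getPostMessage() builds its regex from $title and tests it against $title.
function originalCheckMatches(string $title): bool
{
    $titleRegex = '/^(re:\s*)?' . preg_quote($title, '/') . '$/i';

    // The optional "re:" group matches the empty string, so the quoted
    // title always matches itself.
    return preg_match($titleRegex, $title) === 1;
}

var_dump(originalCheckMatches('Re: Some thread'));        // bool(true)
var_dump(originalCheckMatches('A completely new title')); // bool(true)
```

Because the match always succeeds, the negated condition in the method is never true and the title is never prepended.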

Partial fix:
Diff:
diff --git src/XF/Import/Importer/vBulletin.php src/XF/Import/Importer/vBulletin.php
--- a/htdocs/xenforo/src/XF/Import/Importer/vBulletin.php
+++ b/htdocs/xenforo/src/XF/Import/Importer/vBulletin.php
@@ -3559,11 +3559,11 @@ class vBulletin extends AbstractForumImporter
         "), [$threadId, $startDate]);
     }

-    protected function getPostMessage($title, $message)
+    protected function getPostMessage($title, $message, $threadTitle)
     {
         if ($title !== '')
         {
-            $titleRegex = '/^(re:\s*)?' . preg_quote($title, '/') . '$/i';
+            $titleRegex = '/^(re:\s*)?' . preg_quote($threadTitle, '/') . '$/i';

             if (!preg_match($titleRegex, $title))
             {

Unfortunately, I think the thread title is unknown in stepPosts and might have to be fetched from the DB.
 
I think you're right - it's certainly how it worked in XF1.

Would you mind testing this to confirm it now works as you'd expect?

Diff:
diff --git a/src/XF/Import/Importer/vBulletin.php b/src/XF/Import/Importer/vBulletin.php
index 6381d00682..faf1105901 100644
--- a/src/XF/Import/Importer/vBulletin.php
+++ b/src/XF/Import/Importer/vBulletin.php
@@ -3053,7 +3053,7 @@ class vBulletin extends AbstractForumImporter
             {
                 $state->extra['postDateStart'] = $post['dateline'];
 
-                $message = $this->getPostMessage($post['title'], $post['pagetext']);
+                $message = $this->getPostMessage($post['title'], $post['pagetext'], $post['threadtitle']);
 
                 /** @var \XF\Import\Data\Post $import */
                 $import = $this->newHandler('XF:Post');
@@ -3129,10 +3129,13 @@ class vBulletin extends AbstractForumImporter
         return $this->sourceDb->fetchAll($this->prepareImportSql($this->prefix, "
             SELECT post.*,
                 IF(user.username IS NULL, post.username, user.username) AS username,
+                thread.title AS threadtitle,
                 editlog.dateline AS editdate,
                 editlog.userid AS edituserid
             FROM post AS
                 post
+            LEFT JOIN thread AS
+                thread ON (post.threadid = thread.threadid)
             LEFT JOIN user AS
                 user ON (user.userid = post.userid)
             LEFT JOIN editlog AS
@@ -3143,11 +3146,11 @@ class vBulletin extends AbstractForumImporter
         "), [$threadId, $startDate]);
     }
 
-    protected function getPostMessage($title, $message)
+    protected function getPostMessage($title, $message, $threadTitle)
     {
         if ($title !== '')
         {
-            $titleRegex = '/^(re:\s*)?' . preg_quote($title, '/') . '$/i';
+            $titleRegex = '/^(re:\s*)?' . preg_quote($threadTitle, '/') . '$/i';
 
             if (!preg_match($titleRegex, $title))
             {
 
Thanks for the patch, Chris! :)

It's working fine (remember to update vBulletin5.php accordingly but I think you've got that covered), but unfortunately it has uncovered another issue: At least our German vB4 forum does not use the "Re" prefix but "AW". :eek:

I've just changed the code slightly to accommodate this (so this is not a problem for us anymore). As a general solution for other customers, you could perhaps look for [a-z]{2}: and not only for Re:. (But I'm not sure whether that covers all cases.)
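A rough sketch of what the generic two-letter variant could look like, assuming the thread title is available. The function name isReplyTitle is made up for illustration, and as said above the pattern may well miss prefixes that are not exactly two letters.

```php
<?php
// Hypothetical helper (not XenForo code): treats any two-letter prefix
// followed by a colon ("Re:", "AW:", ...) as a reply prefix.
function isReplyTitle(string $title, string $threadTitle): bool
{
    $titleRegex = '/^([a-z]{2}:\s*)?' . preg_quote($threadTitle, '/') . '$/i';

    return preg_match($titleRegex, $title) === 1;
}

var_dump(isReplyTitle('AW: Foo', 'Foo'));      // bool(true)
var_dump(isReplyTitle('Re: Foo', 'Foo'));      // bool(true)
var_dump(isReplyTitle('Antwort: Foo', 'Foo')); // bool(false), prefix is longer than two letters
var_dump(isReplyTitle('Bar', 'Foo'));          // bool(false), a genuinely different title
```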
 
It's working fine (remember to update vBulletin5.php accordingly but I think you've got that covered), but unfortunately it has uncovered another issue: At least our German vB4 forum does not use the "Re" prefix but "AW". :eek:
What exactly is the mechanism to change that in vBulletin? Is it a phrase, an option or an add-on or something?

Ideally we'd be able to read what the prefix is set as, and just use that.
 
It's working fine (remember to update vBulletin5.php accordingly but I think you've got that covered)
The vBulletin5 importer inherits code from the vBulletin4 importer so I don't think there's any special changes required there btw.
 
What exactly is the mechanism to change that in vBulletin? Is it a phrase, an option or an add-on or something?
Seems to be the phrase "reply_prefix". Its value is "AW:" in our forum.

The vBulletin5 importer inherits code from the vBulletin4 importer so I don't think there's any special changes required there btw.
The vBulletin5 importer class overrides the "getPostMessage" method and therefore needs to be adjusted.
 
I had to modify the class because otherwise the method signatures didn't match and the new $threadTitle parameter was missing.
Diff:
--- a/src/XF/Import/Importer/vBulletin5.php
+++ b/src/XF/Import/Importer/vBulletin5.php
@@ -712,9 +712,9 @@ class vBulletin5 extends vBulletin4
         "), [$threadId, $startDate]);
     }

-    protected function getPostMessage($title, $message)
+    protected function getPostMessage($title, $message, $threadTitle)
     {
-        return parent::getPostMessage($title, $this->rewriteMediaBbCodes($message));
+        return parent::getPostMessage($title, $this->rewriteMediaBbCodes($message), $threadTitle);
     }

     protected function rewriteMediaBbCodes($text)


Btw, in vB4 the maximum length of a thread title and of a post title is 85 characters in both cases. Therefore, if a thread title uses all 85 characters, the title of a reply will be "Re: " followed by only the first 81 characters of the thread title (i.e. the post title is truncated). I think it would therefore be good to compare only the first 85 minus strlen(reply_prefix) minus 1 (whitespace) characters.
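The arithmetic could be sketched like this. The function name comparableThreadTitle and its parameters are hypothetical; the sketch also assumes single-byte titles, since strlen and substr count bytes rather than characters.

```php
<?php
// Hypothetical helper (not XenForo code): returns the part of the thread
// title that can still fit into a reply title after the prefix and one space.
function comparableThreadTitle(string $threadTitle, string $replyPrefix, int $maxChars = 85): string
{
    // 85 - strlen("Re:") - 1 (whitespace) = 81
    $compareLen = $maxChars - strlen($replyPrefix) - 1;

    return substr($threadTitle, 0, $compareLen);
}

$longTitle = str_repeat('x', 85);
var_dump(strlen(comparableThreadTitle($longTitle, 'Re:'))); // int(81)
var_dump(comparableThreadTitle('Short title', 'Re:'));      // string(11) "Short title"
```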
 
I had to modify the class because otherwise the method signatures didn't match and the new $threadTitle parameter was missing.
Oh yeah, I've changed that :) I thought you meant there was code to change in there. My bad, sorry.

Btw, in vB4 the maximum length of a thread title and of a post title is 85 characters in both cases. Therefore, if a thread title uses all 85 characters, the title of a reply will be "Re: " followed by only the first 81 characters of the thread title (i.e. the post title is truncated). I think it would therefore be good to compare only the first 85 minus strlen(reply_prefix) minus 1 (whitespace) characters.
Fair point...
 
Hmm, furthermore, we seem to have a lot of threads whose title was slightly changed in the course of their existence. This yields lots of posts where "getPostMessage" prepends the post title although it's more or less equivalent to the thread title. I think it cannot be detected automatically whether this is useful or not.

I think I'll modify the code to only prefix the post title if it does not start with "Re:" / "AW:":
PHP:
if ($title !== '' && $title !== $threadTitle && substr($title, 0, 3) !== 'AW:')
{
    $message = "[b]{$title}[/b]\n\n" . ltrim($message);
}

I still think you should keep the current default because then nobody can be surprised by "data loss".
 
I don't particularly like this, but I think it should work for a number of different cases:
Diff:
diff --git a/src/XF/Import/Importer/vBulletin.php b/src/XF/Import/Importer/vBulletin.php
index 6381d00682..579242c83e 100644
--- a/src/XF/Import/Importer/vBulletin.php
+++ b/src/XF/Import/Importer/vBulletin.php
@@ -3053,7 +3053,7 @@ class vBulletin extends AbstractForumImporter
             {
                 $state->extra['postDateStart'] = $post['dateline'];
 
-                $message = $this->getPostMessage($post['title'], $post['pagetext']);
+                $message = $this->getPostMessage($post['title'], $post['pagetext'], $post['threadtitle']);
 
                 /** @var \XF\Import\Data\Post $import */
                 $import = $this->newHandler('XF:Post');
@@ -3129,10 +3129,13 @@ class vBulletin extends AbstractForumImporter
         return $this->sourceDb->fetchAll($this->prepareImportSql($this->prefix, "
             SELECT post.*,
                 IF(user.username IS NULL, post.username, user.username) AS username,
+                thread.title AS threadtitle,
                 editlog.dateline AS editdate,
                 editlog.userid AS edituserid
             FROM post AS
                 post
+            LEFT JOIN thread AS
+                thread ON (post.threadid = thread.threadid)
             LEFT JOIN user AS
                 user ON (user.userid = post.userid)
             LEFT JOIN editlog AS
@@ -3143,15 +3146,41 @@ class vBulletin extends AbstractForumImporter
         "), [$threadId, $startDate]);
     }
 
-    protected function getPostMessage($title, $message)
+    protected function getPostMessage($title, $message, $threadTitle)
     {
         if ($title !== '')
         {
-            $titleRegex = '/^(re:\s*)?' . preg_quote($title, '/') . '$/i';
+            if (!isset($this->session->extra['reply_prefixes']))
+            {
+                $this->session->extra['reply_prefixes'] = $this->sourceDb->fetchAllColumn($this->prepareImportSql($this->prefix, "
+                    SELECT `text`
+                    FROM phrase
+                    WHERE varname = 'reply_text'
+                "));
+            }
+
+            if (!isset($this->session->extra['titlemaxchars']))
+            {
+                $this->session->extra['titlemaxchars'] = $this->sourceDb->fetchOne($this->prepareImportSql($this->prefix, "
+                    SELECT value
+                    FROM setting
+                    WHERE varname = 'titlemaxchars'
+                "));
+            }
+
+            $replyPrefixes = $this->session->extra['reply_prefixes'];
+            $titleMaxChars = $this->session->extra['titlemaxchars'];
 
-            if (!preg_match($titleRegex, $title))
+            $titleRegex = '/^(' . preg_quote(implode('|', $replyPrefixes), '/') . ')(\s*).*$/i';
+            if (preg_match($titleRegex, $title, $matches))
             {
-                $message = "[b]{$title}[/b]\n\n" . ltrim($message);
+                $trimLen = $titleMaxChars - (strlen($matches[1]) + strlen($matches[2]));
+                $threadTitle = substr($threadTitle, 0, $trimLen);
+
+                if ($title !== ($matches[1] . $matches[2] . $threadTitle))
+                {
+                    $message = "[b]{$title}[/b]\n\n" . ltrim($message);
+                }
             }
         }
 
I haven't had a chance to try the patch yet. I think it should work fine in general, but from looking at the code I think there are a few (smaller) issues:

1) preg_quote will quote the "|" character so I think multiple $replyPrefixes won't work. I think preg_quote needs to be applied to the individual items before they are imploded to a string.

2) If the post title does not start with any of the $replyPrefixes then it'll be discarded.

3) Nit: The ".*$" at the end of the regex seems unnecessary because $matches[0] isn't used. :)
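A small standalone demonstration of point 1, assuming two sample prefixes ("Re:" and "AW:") chosen only for illustration. It compares quoting the imploded string with quoting each prefix before imploding.

```php
<?php
// Sample prefixes for illustration only.
$replyPrefixes = ['Re:', 'AW:'];

// Quoting after implode() also escapes the "|", so the group only matches
// the literal text "Re:|AW:".
$broken = '/^(' . preg_quote(implode('|', $replyPrefixes), '/') . ')(\s*)/i';

// Quoting each prefix first keeps "|" as the alternation operator.
$quoted = array_map(function ($prefix) {
    return preg_quote($prefix, '/');
}, $replyPrefixes);
$fixed = '/^(' . implode('|', $quoted) . ')(\s*)/i';

$brokenMatches = preg_match($broken, 'AW: Some thread') === 1;
$fixedMatches = preg_match($fixed, 'AW: Some thread') === 1;

var_dump($brokenMatches); // bool(false)
var_dump($fixedMatches);  // bool(true)
```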
 
1) preg_quote will quote the "|" character so I think multiple $replyPrefixes won't work. I think preg_quote needs to be applied to the individual items before they are imploded to a string.
Oh no! It's obvious I didn't test it now ;) Thanks, good point.

2) If the post title does not start with any of the $replyPrefixes then it'll be discarded.
Good point. I think making the matching of the prefix and the whitespace optional solves this (I've actually done some testing now ;), and it seems to work the same as the original version).

3) Nit: The ".*$" at the end of the regex seems unnecessary because $matches[0] isn't used. :)
(y)

Diff:
diff --git a/src/XF/Import/Importer/vBulletin.php b/src/XF/Import/Importer/vBulletin.php
index 6381d00682..2076cbc294 100644
--- a/src/XF/Import/Importer/vBulletin.php
+++ b/src/XF/Import/Importer/vBulletin.php
@@ -3053,7 +3053,7 @@ class vBulletin extends AbstractForumImporter
             {
                 $state->extra['postDateStart'] = $post['dateline'];
 
-                $message = $this->getPostMessage($post['title'], $post['pagetext']);
+                $message = $this->getPostMessage($post['title'], $post['pagetext'], $post['threadtitle']);
 
                 /** @var \XF\Import\Data\Post $import */
                 $import = $this->newHandler('XF:Post');
@@ -3129,10 +3129,13 @@ class vBulletin extends AbstractForumImporter
         return $this->sourceDb->fetchAll($this->prepareImportSql($this->prefix, "
             SELECT post.*,
                 IF(user.username IS NULL, post.username, user.username) AS username,
+                thread.title AS threadtitle,
                 editlog.dateline AS editdate,
                 editlog.userid AS edituserid
             FROM post AS
                 post
+            LEFT JOIN thread AS
+                thread ON (post.threadid = thread.threadid)
             LEFT JOIN user AS
                 user ON (user.userid = post.userid)
             LEFT JOIN editlog AS
@@ -3143,15 +3146,46 @@ class vBulletin extends AbstractForumImporter
         "), [$threadId, $startDate]);
     }
 
-    protected function getPostMessage($title, $message)
+    protected function getPostMessage($title, $message, $threadTitle)
     {
         if ($title !== '')
         {
-            $titleRegex = '/^(re:\s*)?' . preg_quote($title, '/') . '$/i';
+            if (!isset($this->session->extra['reply_prefixes']))
+            {
+                $this->session->extra['reply_prefixes'] = $this->sourceDb->fetchAllColumn($this->prepareImportSql($this->prefix, "
+                    SELECT `text`
+                    FROM phrase
+                    WHERE varname = 'reply_text'
+                "));
+            }
+
+            if (!isset($this->session->extra['titlemaxchars']))
+            {
+                $this->session->extra['titlemaxchars'] = $this->sourceDb->fetchOne($this->prepareImportSql($this->prefix, "
+                    SELECT value
+                    FROM setting
+                    WHERE varname = 'titlemaxchars'
+                "));
+            }
 
-            if (!preg_match($titleRegex, $title))
+            $replyPrefixes = $this->session->extra['reply_prefixes'];
+            $replyPrefixes = array_map(function($prefix)
             {
-                $message = "[b]{$title}[/b]\n\n" . ltrim($message);
+                return preg_quote($prefix, '/');
+            }, $replyPrefixes);
+
+            $titleMaxChars = $this->session->extra['titlemaxchars'];
+
+            $titleRegex = '/^(' . implode('|', $replyPrefixes) . ')?(\s*)?.*/i';
+            if (preg_match($titleRegex, $title, $matches))
+            {
+                $trimLen = $titleMaxChars - (strlen($matches[1]) + strlen($matches[2]));
+                $threadTitle = substr($threadTitle, 0, $trimLen);
+
+                if ($title !== ($matches[1] . $matches[2] . $threadTitle))
+                {
+                    $message = "[b]{$title}[/b]\n\n" . ltrim($message);
+                }
             }
         }
 
At what point does .diff become a valid extension for attachments here... 🤔

I would help test this for English imports but @ DBTech there's not a single case of post titles other than with the OP of a thread, as far as I can tell, so our database is quite useless in this regard 😅


Fillip
 
I've given the patch a try. I had to replace WHERE varname = 'reply_text' with WHERE varname = 'reply_prefix' but then it was working fine. (y)

(Maybe add 'Re:' as a fallback if no "reply_prefix" phrase is found? Should never happen but still...)
 
If the truncated title ends with whitespace then the saved reply title can be shorter than "titlemaxchars" (because of trimmed whitespace).

Example:
Thread title: "Vega Frontier liquid Edition (vergleichbar mit RX Vega 64 liquid) FAN im IDLE zu laut" (85 characters)
Reply title: "AW: Vega Frontier liquid Edition (vergleichbar mit RX Vega 64 liquid) FAN im IDLE zu" (only 84 characters)

This should fix it:
Diff:
diff --git a/src/XF/Import/Importer/vBulletin.php b/src/XF/Import/Importer/vBulletin.php
index a534a1a5e..d334360fa 100644
--- a/src/XF/Import/Importer/vBulletin.php
+++ b/src/XF/Import/Importer/vBulletin.php
@@ -3592,7 +3592,7 @@ class vBulletin extends AbstractForumImporter
         if (preg_match($titleRegex, $title, $matches))
         {
             $trimLen = $titleMaxChars - (strlen($matches[1]) + strlen($matches[2]));
-            $threadTitle = substr($threadTitle, 0, $trimLen);
+            $threadTitle = rtrim(substr($threadTitle, 0, $trimLen));

             if ($title !== ($matches[1] . $matches[2] . $threadTitle))
             {
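To make the example concrete, here is a standalone sketch that replays the comparison with the two titles from above, with and without rtrim. The variable names are only for illustration, and the prefix and whitespace are hard-coded instead of being taken from the regex matches.

```php
<?php
$threadTitle = 'Vega Frontier liquid Edition (vergleichbar mit RX Vega 64 liquid) FAN im IDLE zu laut';
$replyTitle = 'AW: Vega Frontier liquid Edition (vergleichbar mit RX Vega 64 liquid) FAN im IDLE zu';

$prefix = 'AW:';     // stands in for $matches[1]
$whitespace = ' ';   // stands in for $matches[2]
$titleMaxChars = 85;

// 85 - (3 + 1) = 81 characters of the thread title remain.
$trimLen = $titleMaxChars - (strlen($prefix) + strlen($whitespace));

// Without rtrim() the truncated title keeps the space after "zu".
$withoutRtrim = $prefix . $whitespace . substr($threadTitle, 0, $trimLen);
// With rtrim() that trailing space is dropped, like in the stored reply title.
$withRtrim = $prefix . $whitespace . rtrim(substr($threadTitle, 0, $trimLen));

var_dump($replyTitle === $withoutRtrim); // bool(false)
var_dump($replyTitle === $withRtrim);    // bool(true)
```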
 
The trailing whitespace does not always seem to be removed (please don't ask me why, maybe when using Tapatalk or the mobile style?).

Diff:
diff --git a/src/addons/XFI/Import/Importer/vBulletin.php b/src/addons/XFI/Import/Importer/vBulletin.php
index 55705d4eb..3191bf030 100644
--- a/src/addons/XFI/Import/Importer/vBulletin.php
+++ b/src/addons/XFI/Import/Importer/vBulletin.php
@@ -3740,7 +3740,7 @@ class vBulletin extends AbstractForumImporter
                 $trimLen = $titleMaxChars - (strlen($matches[1]) + strlen($matches[2]));
                 $threadTitle = rtrim(substr($threadTitle, 0, $trimLen));

-                if ($title !== ($matches[1] . $matches[2] . $threadTitle))
+                if (rtrim($title) !== ($matches[1] . $matches[2] . $threadTitle))
                 {
                     $message = "[b]{$title}[/b]\n\n" . ltrim($message);
                 }
 