XF 1.5 I rewrote the MyBB importer to bring PMs in as conversations

Vekseid

Active member
I only rewrote the PrivateMessages step, naturally. There are other problems with the importer but nothing so drastic.

I did this because the standard importer creates an utterly unmanageable mess when tens of millions of PMs are involved. Currently running through a full test import.

Comments and criticism welcome. It is a bit hackish as I'm not planning on doing this again.

Code:
	public function stepPrivateMessages($start, array $options)
	{
		$options = array_merge(array(
			'current' => 1,
			'max' => false
		), $options);

		$sDb = $this->_sourceDb;
		$prefix = $this->_prefix;

		/* @var $model XenForo_Model_Import */
		$model = $this->_importModel;

        $next = $start;

		if ($options['max'] === false)
		{
            $options['max'] = $sDb->fetchOne('
				SELECT MAX(uid)
				FROM ' . $prefix . 'users
			');
		}

		$startName = $sDb->fetchOne('
				SELECT username
				FROM ' . $prefix . 'users
				WHERE uid = ' . $sDb->quote($start) . '
			');

        $currName = $sDb->fetchOne('
				SELECT username
				FROM ' . $prefix . 'users
				WHERE uid = ' . $sDb->quote($options['current']) . '
			');

        $pms = $sDb->fetchAll(
            '
				SELECT pms.*
				FROM ' . $prefix . 'privatemessages AS pms
				WHERE (pms.toid = ' . $sDb->quote($start) . ' AND pms.fromid = ' . $sDb->quote($options['current']) . ')
				OR (pms.fromid = ' . $sDb->quote($start) . ' AND pms.toid = ' . $sDb->quote($options['current']) . ')
				ORDER BY pms.pmid
			'
        );

        if ($pms) {
            $userIds = array();
            foreach ($pms AS $pm) {
                $userIds[$pm['toid']] = $pm['toid'];
                $userIds[$pm['fromid']] = $pm['fromid'];
            }

            $mapUserIds = $model->getImportContentMap('user', $userIds);

            $startUserId = $this->_mapLookUp($mapUserIds, $start);
            $currUserId = $this->_mapLookUp($mapUserIds, $options['current']);

            if ($startUserId || $currUserId) {
                $recipients = [];

                $startIsFrom = false;
                if ($start === $pms[0]['fromid']) {
                    $startIsFrom = true;
                }

                $lastpm = count($pms) - 1;

                $startRead = (int) $pms[$lastpm]['readtime'];
                $currRead = (int) $pms[$lastpm]['dateline'];

                $startLast = false;
                if ($start === $pms[$lastpm]['fromid']) {
                    $startLast = true;
                    $startRead = (int) $pms[$lastpm]['dateline'];
                    $currRead = (int) $pms[$lastpm]['readtime'];
                }

                if ($lastpm === 1) {
                    if ($startIsFrom) {
                        $startRead = (int) max($startRead, $pms[0]['dateline']);
                        $currRead = (int) max($currRead, $pms[0]['readtime']);
                    }
                    else {
                        $startRead = (int) max($startRead, $pms[0]['readtime']);
                        $currRead = (int) max($currRead, $pms[0]['dateline']);
                    }
                }
                else if ($lastpm && $startLast) {
                    $currDateline = $sDb->fetchOne('
				        SELECT max(dateline)
				        FROM ' . $prefix . 'privatemessages
				        WHERE (fromid = ' . $sDb->quote($start) . ' AND toid = ' . $sDb->quote($options['current']) . ')
			        ');
                    $currReadtime = $sDb->fetchOne('
				        SELECT max(readtime)
				        FROM ' . $prefix . 'privatemessages
				        WHERE (fromid = ' . $sDb->quote($start) . ' AND toid = ' . $sDb->quote($options['current']) . ')
			        ');
                    $currRead = (int) max($currReadtime, $currDateline);
                }
                else if ($lastpm) {
                    $startDateline = $sDb->fetchOne('
				        SELECT max(dateline)
				        FROM ' . $prefix . 'privatemessages
				        WHERE (toid = ' . $sDb->quote($start) . ' AND fromid = ' . $sDb->quote($options['current']) . ')
			        ');
                    $startReadtime = $sDb->fetchOne('
				        SELECT max(readtime)
				        FROM ' . $prefix . 'privatemessages
				        WHERE (toid = ' . $sDb->quote($start) . ' AND fromid = ' . $sDb->quote($options['current']) . ')
			        ');
                    $startRead = (int) max($startDateline, $startReadtime);
                }

                if ($startUserId) {
                    $recipients[$startUserId] = array(
                        'username' => $this->_convertToUtf8($startName, true),
                        'last_read_date' => $startRead,
                        'recipient_state' => 'active'
                    );
                }
                else {
                    $startUserId = 0;
                }

                if ($currUserId && $currUserId !== $startUserId) {
                    $recipients[$currUserId] = array(
                        'username' => $this->_convertToUtf8($currName, true),
                        'last_read_date' => $currRead,
                        'recipient_state' => 'active'
                    );
                }

                $convoSubject = $pms[0]['subject'];
                if (substr($convoSubject, 0, 4) === 'Re: ') {
                    $convoSubject = substr($convoSubject, 4);
                }

                $conversation = array(
                    'title' => $this->_convertToUtf8($convoSubject, true),
                    'start_date' => $pms[0]['dateline'],
                    'open_invite' => 0,
                    'conversation_open' => 1
                );

                if ($startIsFrom) {
                    $conversation['user_id'] = $startUserId;
                    $conversation['username'] = $this->_convertToUtf8($startName, true);
                }
                else {
                    $conversation['user_id'] = $currUserId;
                    $conversation['username'] = $this->_convertToUtf8($currName, true);
                }

                $currentText = '';
                $messages = [];

                foreach ($pms as $pm) {
                    $pmUserId = $this->_mapLookUp($mapUserIds, $pm['fromid']);
                    if (!$pmUserId) {
                        $pmUserId = 0;
                    }
                    $pmName = $currName;
                    if ($pmUserId === $startUserId) {
                        $pmName = $startName;
                    }

                    if ($currentText === $pm['message']) {
                        continue;
                    }
                    else {
                        $currentText = $pm['message'];
                    }

                    $messages[$pm['pmid']] = array(
                        'message_date' => $pm['dateline'],
                        'user_id' => $pmUserId,
                        'username' => $this->_convertToUtf8($pmName),
                        'message' => $this->_sanitizeBbCode($pm['message'])
                    );
                }

                XenForo_Db::beginTransaction();

                $model->importConversation($pms[0]['pmid'], $conversation, $recipients, $messages);

                XenForo_Db::commit();
            }
        }

        $nFrom = $startName = $sDb->fetchOne('
				SELECT min(fromid)
				FROM ' . $prefix . 'privatemessages
				WHERE (toid = ' . $sDb->quote($start) . ' AND fromid > ' . $sDb->quote($options['current']) . ')
			');
        $nTo = $startName = $sDb->fetchOne('
				SELECT min(toid)
				FROM ' . $prefix . 'privatemessages
				WHERE (fromid = ' . $sDb->quote($start) . ' AND toid > ' . $sDb->quote($options['current']) . ')
			');

        $oldcurrent = $options['current'];

        if (!$nTo && !$nFrom) {
            $this->_session->incrementStepImportTotal(1);
            $next += 1;
            $options['current'] = $next;
        }
        else if ($nTo) {
            $options['current'] = $nTo;
        }
        else if ($nFrom) {
            $options['current'] = $nFrom;
        }
        else {
            $options['current'] = (int) min($nTo, $nFrom);
        }

		return array($next, $options, $this->_getProgressOutput($next, $options['max']));
	}
 
Updated this to handle conversations by subject. Also made faster and more efficient.

This can end up being very memory hungry - one conversation pair between members, with over 40,000 PMs between them, took a bit over 2 gigs of RAM for php (it's very memory-inefficient with arrays) and upping MySQL's disconnect times for this to work. There are some potential improvements that could still be made (handling drafts better, handling forward titles better...) but attempting those adds further complications.

@Chris D or anyone, comments are welcome.

Going to do one more full test, then convert the live site next week.

Code:
    public function stepPrivateMessages($start, array $options)
    {
        $options = array_merge(array(
            'current' => 1,
            'max' => false
        ), $options);

        if ($start == 0) {
            $start = 1;
        }

        $sDb = $this->_sourceDb;
        $prefix = $this->_prefix;

        /* @var $model XenForo_Model_Import */
        $model = $this->_importModel;

        $next = $start;

        if ($options['max'] === false)
        {
            $options['max'] = $sDb->fetchOne('
                SELECT MAX(uid)
                FROM ' . $prefix . 'users
            ');
        }
        else if ($start > $options['max']) {
            return true;
        }

        $startName = $sDb->fetchOne('
                SELECT username
                FROM ' . $prefix . 'users
                WHERE uid = ' . $sDb->quote($start) . '
            ');

        $currName = $sDb->fetchOne('
                SELECT username
                FROM ' . $prefix . 'users
                WHERE uid = ' . $sDb->quote($options['current']) . '
            ');

        $startMatch = ';s:' . strlen((string) $start) . ':"' . $start . '";';
        $currMatch = ';s:' . strlen((string) $options['current']) . ':"' . $options['current'] . '";';

        // I am 99.9999% sure only the recipients conditions are needed.
        $pms = $sDb->fetchAll(
            '
                SELECT pmid, toid, fromid, subject, message, dateline, readtime
                FROM ' . $prefix . 'privatemessages
                WHERE (toid = ' . $sDb->quote($start) . ' AND fromid = ' . $sDb->quote($options['current']) . ')
                OR (fromid = ' . $sDb->quote($start) . ' AND toid = ' . $sDb->quote($options['current']) . ')
                OR (fromid = ' . $sDb->quote($start) . ' AND recipients like \'%' . $currMatch . '%\' )
                OR (fromid = ' . $sDb->quote($options['current']) . ' AND recipients like \'%' . $startMatch . '%\' )
                ORDER BY pmid
            '
        );

        if ($pms) {
            $userIds = [
                $start => $start,
                $options['current'] => $options['current']
            ];

            $mapUserIds = $model->getImportContentMap('user', $userIds);

            $startUserId = $this->_mapLookUp($mapUserIds, $start);
            $currUserId = $this->_mapLookUp($mapUserIds, $options['current']);

            if (!$startUserId) {
                $startUserId = 0;
            }
            if (!$currUserId) {
                $currUserId = 0;
            }

            unset($userIds);

            if ($startUserId || $currUserId) {

                $startIsFrom = false;
                if ($start === $pms[0]['fromid']) {
                    $startIsFrom = true;
                }

                $convoSubject = $pms[0]['subject'];
                if (substr($convoSubject, 0, 4) === 'Re: ' || substr($convoSubject, 0, 4) === 'Fw: ') {
                    $convoSubject = substr($convoSubject, 4);
                }

                $subjectKey = strtolower($convoSubject);

                $convos = [$subjectKey => [
                    'conversation' => [
                        'title' => $this->_convertToUtf8($convoSubject, true),
                        'start_date' => $pms[0]['dateline'],
                        'open_invite' => 0,
                        'conversation_open' => 1,
                        'user_id' => $startIsFrom ? $startUserId : $currUserId,
                        'username' => $startIsFrom ? $this->_convertToUtf8($startName, true) : $this->_convertToUtf8($currName, true),
                    ],
                    'recipients' => [],
                    'messages' => []
                ]];

                if ($startUserId) {
                    $convos[$subjectKey]['recipients'][$startUserId] = array(
                        'username' => $this->_convertToUtf8($startName, true),
                        'last_read_date' => (int) ($startIsFrom ? $pms[0]['dateline'] : $pms[0]['readtime']),
                        'recipient_state' => 'active'
                    );
                }

                if ($currUserId && $currUserId !== $startUserId) {
                    $convos[$subjectKey]['recipients'][$currUserId] = array(
                        'username' => $this->_convertToUtf8($currName, true),
                        'last_read_date' => (int) ($startIsFrom ? $pms[0]['readtime'] : $pms[0]['dateline']),
                        'recipient_state' => 'active'
                    );
                }

                $splitConvos = false;
                if (count($pms) > 20) {
                    $splitConvos = true;
                }
                $currentText = '';

                foreach ($pms as $key => $pm) {
                    if ($currentText === $pm['message']) {
                        continue;
                    }
                    else {
                        $currentText = $pm['message'];
                    }

                    $pmUserId = $this->_mapLookUp($mapUserIds, $pm['fromid']);
                    if (!$pmUserId) {
                        $pmUserId = 0;
                    }
                    $pmName = $currName;
                    if ($pmUserId === $startUserId) {
                        $pmName = $startName;
                        $startIsFrom = true;
                    }
                    else {
                        $startIsFrom = false;
                    }

                    $currentSubject = $pm['subject'];
                    if (substr($currentSubject, 0, 4) === 'Re: ' || substr($currentSubject, 0, 4) === 'Fw: ') {
                        $currentSubject = substr($currentSubject, 4);
                    }

                    if ($splitConvos) {
                        $subjectKey = strtolower($currentSubject);
                        if (!array_key_exists($subjectKey, $convos)) {
                            $convos[$subjectKey] = [
                                'conversation' => [
                                    'title' => $this->_convertToUtf8($currentSubject, true),
                                    'start_date' => $pm['dateline'],
                                    'open_invite' => 0,
                                    'conversation_open' => 1,
                                    'user_id' => $startIsFrom ? $startUserId : $currUserId,
                                    'username' => $startIsFrom ? $this->_convertToUtf8($startName, true) : $this->_convertToUtf8($currName, true),
                                ],
                                'recipients' => [],
                                'messages' => []
                            ];

                            if ($startUserId) {
                                $convos[$subjectKey]['recipients'][$startUserId] = array(
                                    'username' => $this->_convertToUtf8($startName, true),
                                    'last_read_date' => (int) ($startIsFrom ? $pm['dateline'] : $pm['readtime']),
                                    'recipient_state' => 'active'
                                );
                            }

                            if ($currUserId && $currUserId !== $startUserId) {
                                $convos[$subjectKey]['recipients'][$currUserId] = array(
                                    'username' => $this->_convertToUtf8($currName, true),
                                    'last_read_date' => (int) ($startIsFrom ? $pm['readtime'] : $pm['dateline']),
                                    'recipient_state' => 'active'
                                );
                            }
                        }
                    }
                    else {
                        if (strtolower($currentSubject) !== $subjectKey) {
                            $pm['message'] = 'Original Subject: ' . $currentSubject . "\n\n" . $pm['message'];
                        }
                    }

                    if ($startUserId) {
                        $convos[$subjectKey]['recipients'][$startUserId]['last_read_date'] =
                            max($convos[$subjectKey]['recipients'][$startUserId]['last_read_date'],
                                $startIsFrom ? (int) $pm['dateline'] : (int) $pm['readtime']);
                    }

                    if ($currUserId && $currUserId !== $startUserId) {
                        $convos[$subjectKey]['recipients'][$currUserId]['last_read_date'] =
                            max($convos[$subjectKey]['recipients'][$currUserId]['last_read_date'],
                                $startIsFrom ? (int) $pm['readtime'] : (int) $pm['dateline']);
                    }

                    $convos[$subjectKey]['messages'][$pm['pmid']] = array(
                        'message_date' => (int) $pm['dateline'],
                        'user_id' => $pmUserId,
                        'username' => $this->_convertToUtf8($pmName),
                        'message' => $this->_sanitizeBbCode($pm['message'])
                    );

                    unset($pms[$key]);
                }

                XenForo_Db::beginTransaction();

                foreach ($convos as $convo) {
                    $model->importConversation(0, $convo['conversation'], $convo['recipients'], $convo['messages']);
                }

                XenForo_Db::commit();
            }
        }

        $nFrom = $startName = $sDb->fetchOne('
                SELECT min(fromid)
                FROM ' . $prefix . 'privatemessages
                WHERE (toid = ' . $sDb->quote($start) . ' AND fromid > ' . $sDb->quote($options['current']) . ')
            ');
        $nTo = $startName = $sDb->fetchOne('
                SELECT min(toid)
                FROM ' . $prefix . 'privatemessages
                WHERE (fromid = ' . $sDb->quote($start) . ' AND toid > ' . $sDb->quote($options['current']) . ')
            ');

        if (!$nTo && !$nFrom) {
            $this->_session->incrementStepImportTotal(1);
            $next += 1;
            $options['current'] = $next;
        }
        else if ($nTo) {
            $options['current'] = $nTo;
        }
        else if ($nFrom) {
            $options['current'] = $nFrom;
        }
        else {
            $options['current'] = (int) min($nTo, $nFrom);
        }

        return array($next, $options, $this->_getProgressOutput($next, $options['max']));
    }
 
@Vekseid

I know, I know, I am 2 1/2 years late. ;-)

We're going to convert a medium sized MyBB 1.8.x forum to XenForo 2.2.2 soon. I experienced several issues with the Importer 1.4.0 so far, so I'm currently thinking of rewriting/extending it.

Question: Does the code above still work? Before I would run into some troubles...
 
@Vekseid

I know, I know, I am 2 1/2 years late. ;-)

We're going to convert a medium sized MyBB 1.8.x forum to XenForo 2.2.2 soon. I experienced several issues with the Importer 1.4.0 so far, so I'm currently thinking of rewriting/extending it.

Question: Does the code above still work? Before I would run into some troubles...

XF 2 completely rewrote how importing works (it was genuinely a mess... honestly still is, but less so). That said converting to 1.5 -> 2.0 should work fine still.
 
XF 2 completely rewrote how importing works (it was genuinely a mess... honestly still is, but less so). That said converting to 1.5 -> 2.0 should work fine still.

Thank you for your feedback.

We replaced the complete MyBB 1.8.x -> XenForo 2.2.x PM import routine inside of the Importer 1.4.0 already, other replacements are in work or will follow also (mentions, thread prefixes, thanks, group applications and more).

Unfortunately we need this code one time only.
 
Back
Top Bottom