Fixed SMF Importer doesn't process BB Code for quotes

jeffwidman

Active member
#1
I understand and agree that most BB Code cleanup should be handled by the admin and not the importer, but saw this and thought it'd be relatively easy to add...

Here's a post in my forum right now (data anonymized). The BB code is just straight text, and didn't transfer as a quote:
Code:
[quote author=Jeff link=topic=8576.msg32075#msg32075 date=13144659]
Moddy - does this mean various things?
[/quote]
I can fix this easily enough using the find and replace tool--but only because I preserved content IDs. If I'd merged in content, it'd be a bit trickier as I'd have to have retained the import log to correctly map the IDs.

Seems like the ideal time to parse this is in the importer itself.
 

Chris D

XenForo developer
Staff member
#2
We do actually handle that in the importer.

https://regex101.com/r/fO8rT3/1

The link above shows the regex we use in the importer applied to the example you provided, along with the resulting substitution.

We only transfer over the username, though; the rest of the data in the quote is ignored.
 

Chris D

XenForo developer
Staff member
#3
The exception may be if the name contains a special character, e.g. Jeff will work but Jeff! won't. In that case the regex may be slightly too strict.
 

Chris D

XenForo developer
Staff member
#4
I've fixed this for the next version of the importer; the regex has changed to:

PHP:
$string = preg_replace('#\[quote\sauthor=(.+)\slink=[^\]]+]#siU', '[quote="$1"]', $string);
That regex may help you with the content replacement tool.
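For anyone who wants to try the updated pattern outside the importer, here is a minimal sketch in Python rather than PHP. It is not the importer's code: Python's `re` has no `U` modifier, so the lazy quantifier is written explicitly as `+?`, and the function name `convert_smf_quotes` is hypothetical.

```python
import re

# Python port of the updated pattern. PHP's U modifier makes quantifiers lazy
# by default; Python has no such flag, so laziness is spelled out with +?.
SMF_QUOTE = re.compile(r'\[quote\sauthor=(.+?)\slink=[^\]]+\]', re.S | re.I)


def convert_smf_quotes(text: str) -> str:
    """Rewrite SMF-style quote openers, keeping only the author name."""
    return SMF_QUOTE.sub(r'[quote="\1"]', text)


# The anonymized example from the first post in this thread.
sample = (
    "[quote author=Jeff link=topic=8576.msg32075#msg32075 date=13144659]\n"
    "Moddy - does this mean various things?\n"
    "[/quote]"
)
print(convert_smf_quotes(sample))
```

As in the importer, only the author name survives; the `link` and `date` attributes are dropped.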
 

jeffwidman

Active member
#5
Chris D said:
The exception may be if the name contains a special character, e.g. Jeff will work but Jeff! won't. In that case the regex may be slightly too strict.
Gotcha. That explains why it transferred most quotes, but not all. The actual username includes multiple spaces, an underscore, and a parenthesis.

Example: "Funky Chicken (aka John_S) "

Thanks for the updated regex.
 

jeffwidman

Active member
#8
The post content find/replace plugin kept timing out on me because there were 280,000+ posts that needed updating.

MariaDB 10 added a regex find-and-replace function, so I just used that:
Code:
UPDATE sj_forum.xf_post SET message = regexp_replace(message, "\\[quote\\sauthor=(.+?)\\slink=[^\\]]+]", "\\[quote=\"\\1\"\\]") WHERE message LIKE '%quote author%';
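A note for anyone adapting this to another engine: the PHP pattern relies on the `U` modifier to make `(.+)` lazy, and a port without that modifier is greedy unless the quantifier is written as `(.+?)`. The sketch below uses Python's `re` purely to illustrate the difference on a made-up line containing two quotes. It has not been checked against a MariaDB server, and the sample text is hypothetical.

```python
import re

# Hypothetical post body with two SMF quote openers on the same line.
line = (
    "[quote author=Ann link=topic=1.msg1#msg1 date=1]a[/quote] "
    "[quote author=Bob link=topic=2.msg2#msg2 date=2]b[/quote]"
)

greedy = re.compile(r'\[quote\sauthor=(.+)\slink=[^\]]+\]')
lazy = re.compile(r'\[quote\sauthor=(.+?)\slink=[^\]]+\]')

# Greedy: (.+) runs to the LAST " link=" on the line, swallowing the first quote.
greedy_result = greedy.sub(r'[quote="\1"]', line)
# Lazy: (.+?) stops at the FIRST " link=", so each opener is rewritten separately.
lazy_result = lazy.sub(r'[quote="\1"]', line)

print(greedy_result)
print(lazy_result)
```

Because `.` does not cross newlines by default, the greedy form only misbehaves when two quote openers share a line, which is easy to miss in a spot check. It is worth testing on a copy of the table first.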