Resource icon

Post Content Find / Replace 1.0.0

No permission to download
We have unfortunately ran into an issue where there are weird symbols in many of our posts
If this is related to an import, you may need to consider importing again with the correct character set as many tables can be affected.

You probably need to set the database character set to 'latin1'.
 
It really depends on the context the  is in (or whatever other characters).

Was this an import from another software and upon import those were added?

I ask because sometimes when importing from one system to another that uses a different character encoding things like   will end up displaying as an  and in that respect removing all of them is probably not such a good idea because there are a bunch of characters that will also be misencoded and just removing them could leave you with bigger problems.

Really the best bet if this was an import is to redo it and make sure to account for the character encoding first.
 
If this is related to an import, you may need to consider importing again with the correct character set as many tables can be affected.

You probably need to set the database character set to 'latin1'.
Unfortunately, in order to redo the import, it will have old data. This was when we were trying to go from IPB4 to Mybb and somehow it happened.

How do i set the database character?

It really depends on the context the  is in (or whatever other characters).

Was this an import from another software and upon import those were added?

I ask because sometimes when importing from one system to another that uses a different character encoding things like   will end up displaying as an  and in that respect removing all of them is probably not such a good idea because there are a bunch of characters that will also be misencoded and just removing them could leave you with bigger problems.

Really the best bet if this was an import is to redo it and make sure to account for the character encoding first.
Yes after converting from IPB4 to Mybb is when it happened. The random A's are literally sporadically just throughout a post....Some posts are worse than others it seems.

It's not as simple to do the reimport as it means losing data so i am hoping there is a way to fix it with this find and replace tool or some other means?

Here is an example. This is actually relatively tame compared to some other posts
FLkZCLO.png
 
To be honest, an import is much simpler than the alternative.

A thing to remember if you are trying to replace any other erroneous characters, some of these errors turn into strings of multiple weird characters and the groups (sometimes 3 or 4) really represent 1 character or maybe 2.

If you remove one at a time you may leave yourself with unfixable errors: imagine that what is supposed to be a single letter followed by a space is now being displayed as æÂø , if you remove that  you get left with æø and there is no way for you to derive what should be there or what to replace it with.

Just dropping characters one at a time is not guaranteed to fix anything in fact it will more than likely make things worse unless you go though every possible error, longest string sequence to shortest which is going to take you days and that is assuming you already know all erroneous characters being displayed.

I have an idea of how to fix this in a way that you wouldn't loose any of your new content but I just don't have the necessary hours to donate that would be needed in order for it to work as I am in the process of finding full time work and time in large slots is a commodity I am short of at the moment.
 
Last edited:
To be honest, an import is much simpler than the alternative.

A thing to remember if you are trying to replace any other erroneous characters, some of these errors turn into strings of multiple weird characters and the groups (sometimes 3 or 4) really represent 1 character or maybe 2.

If you remove one at a time you may leave yourself with unfixable errors: imagine that what is supposed to be a single letter followed by a space is now being displayed as æÂø , if you remove that  you get left with æø and there is no way for you to derive what should be there or what to replace it with.

Just dropping characters one at a time is not guaranteed to fix anything in fact it will more than likely make things worse unless you go though every possible error, longest string sequence to shortest which is going to take you days and that is assuming you already know all erroneous charters being displayed.

I have an idea of how to fix this in a way that you wouldn't loose any of your new content but I just don't have the necessary hours to donate that would be needed in order for it to work as I am in the process of finding full time work and time in large slots is a commodity I am short of at the moment.
I understand :) Thank you anyway.
So you suggest just doing a completely fresh import entirely? With the old data? Instead of trying to fix it.
 
In your case (I'm assuming you don't have a board with millions upon millions of post here) I would say yes re-import because as Brogan pointed out it is not just the posts table that may be affected, there are ways around it but they are way more complicated for someone who doesn't already know and probably something you would be better off paying someone to do for you all things considered.



If you are going to take a stab at fixing this yourself I recommend these steps in the following order.
  • From CLI (not through PMA/cPanel etc unless your db is small and you can get away with it) dump your db and verify that is was a clean backup before proceeding
  • While still in the shell zip/tar your entire file structure from that domain's root.
At this point you have a clean backup and should verify it by setting the site up somewhere else (a subdirectory or on a dummy domain on your server) and walking though the site and making sure your clone 100% works the same as your live site.

At any rate do not delete or actually change anything (on the live site) until you are completely done with all of your fixes and are satisfied with the final outcome just so that you always have a setup which you can revert to and so that you can allow new content to be addressed after.


  • Once that is good run the import process from scratch again on a clean install.
  • After you get there and have no encoding issues you may start a convo with me and I will help you getting the content which was created after your import into your new import just make sure you don't delete anything until you are satisfied with completion.

If you are at all confused feel free to start a convo with me so that this thread which is basically a repository of regular expressions to be used with this addon doesn't get taken too far off the beaten path. Again and I can't stress this enough...do NOTHING to your live site other than back it up...this will give you an unlimited amount of tries to fix it without risking your site and leave me what I need to help you get your new content into the newest import.
 
Hello guys, first of all I have to say great support here from all of you.

Can you help me out with the following replacement. These are images that I want to replace.

Code:
[IMG]http://www.mydomain.nl/forums/members/aaaaaapicturexxxxxx-yyyyyy.jpg[/IMG]

should be replaced with:

Code:
[IMG]http://www.mydomain.nl/media/yyyyyy.xxxxxx/full[/IMG]

where aaaaaa, xxxxxx, yyyyyy can be any string of any length.

a real link is this one:
Code:
[img]http://www.mydomain.nl/forums/members/dirk57-albums-moth-picture46347-blue2.jpg[/img]
[img]http://www.mydomain.nl/forums/members/dirk57-albums-moth-picture44963-r9.jpg[/img]
[img]http://www.mydomain.nl/forums/members/dirk57-albums-algemeen-picture63526-a1.jpg[/img]

And should be replaced by:
Code:
[img]http://www.modelbouwforum.nl/media/blue2.46347/full[/img]
[img]http://www.modelbouwforum.nl/media/r9.44963/full[/img]
[img]http://www.modelbouwforum.nl/media/a1.63526/full[/img]
 
Last edited:
Hello again,

Played a lot with regex today and think I have the right format, can you please verify?

Looks like a more simple link works as well.

Code:
[img]http://www.mydomain.nl/forums/members/aaaaaa-albums-moth-picture46347-blue2.jpg[/img]

can be replaced by:

Code:
[img]http://www.mydomain.nl/media/46347/full[/img]

I think I can use:
Quick Find: http://www.mydomain.nl/forums/members/

Regex: #\[img]http:\/\/www\.mydomain\.nl\/forums\/members\/([a-z0-9_-]+)picture([0-9]+)([a-z0-9_-]+)([.a-z]+)\[\/img]#siu

replacement String: [IMG]http://www.mydomain.nl/media/\2/full

Is this correct?
When testing it looks oke but don't know if I made a bad mistake somewhere[/code]
 
Last edited:
Oke, first wrap -up for those behind me :)

You can replace pictures on the forum with [img]http://www.mydomain.com/media/<pictureID>[/img]

I used a couple of expressions to search and replace the different formats on our forum to the new one.

So first used format on our forum:
[img]http://www.mydomain.com/forums/members/<username>-albums-algemeen-picture63526-a1.jpg[/img]
replaced by:
[img]http://www.mydomain.com/media/63526/full[/img]
(numbers are just a example to make it clear)

Used the following:
Quick Find:
Code:
quick find: http://www.mydomain.com/forums/members/

Regular expression:
Code:
#\[img]http:\/\/www\.mydomain\.com\/forums\/members\/([a-z0-9%_-]+)picture([0-9]+)([a-z0-9%_-]+)([.a-z]+)\[\/img]#siu
(Later added % in regex to replace special characters in member names and picture names)

Replacement String:
Code:
[IMG]www.mydomain.com/media/\2/full[/img]

Second format used on our forum to show pictures was something like this:
[IMG]http://www.mydomain.com/forums/picture.php?albumid=19413&pictureid=194237[/IMG]
Also replaced by:
[img]http://www.mydomain.com/media/194237/full[/img]
(numbers are just a example to make it clear)


For this I used the following:
Quick Find:
Code:
http://www.mydomain.com/forums/picture.php?albumid=

Regular Expression:
Code:
#\[img]http:\/\/www\.mydomain\.com\/forums\/picture\.php\?([albumid\=0-9]+)([\&pictureid\=]+)([0-9]+)\[\/img]#siu

Replacement String:
Code:
[IMG]www.mydomain.com/media/\3/full[/img]


For sure that the specialists here have some comments and perhaps will correct some errors in my regex but this did the trick for me. Needed some passes to allow the S&R tool to correct all the pictures in the postings (over 3 million posts) Off course adjusted the tool to allow max 5000 results at a time.

Replaced all our pictures after migrating from Vb3.8 to XF 1.5.

Now my next challenge is to replace all the internal links to other topics in the forum. Can use some help as my limited knowledge of regex doesn't help me here :).

Hope I didn't made too much typo's :)
Thanks for reading and hope this will help someone out.
 
Last edited:
Wanted to share one that caught the bulk of mine:

Sample URL:
Code:
[ame]http://www.youtube.com/watch?v=NjpG2s37_cQ[/ame]

Quick Find: youtube

Regular expression:
Code:
#\[ame](https?:\/\/www.)?youtube[^=]+=([a-z0-9_-]{11})\[/ame]#siu

Replacement String:
Code:
[media=youtube]\2[/media]
 
I spoke too soon I guess.

In a non "save" run it finds about 50k results and shows them.

When I run this, I keep getting:
Error
  • The requested forum could not be found.
  • Please enter a valid title.
  • Please enter a value for the required field 'node_type_id'.

I searched and didn't see anyone else that ran into this on here. Any ideas?

EDIT:
I found the problem, but I have no idea what is up. It found results in a link (http://www.mydomain.com/forum/posts/8811777/( but when I go to that link it tells me The requested forum could not be found.

How is it finding something that doesn't "exist"?

EDIT 2:
Somehow, during my export/import - it has turned one of my child forums into a category on the xenforo side :/ Need to research this.. not sure how it is finding the post other than via the index somehow?
 
Last edited:
I spoke too soon I guess.

In a non "save" run it finds about 50k results and shows them.

When I run this, I keep getting:


I searched and didn't see anyone else that ran into this on here. Any ideas?

EDIT:
I found the problem, but I have no idea what is up. It found results in a link (http://www.mydomain.com/forum/posts/8811777/( but when I go to that link it tells me The requested forum could not be found.

How is it finding something that doesn't "exist"?

EDIT 2:
Somehow, during my export/import - it has turned one of my child forums into a category on the xenforo side :/ Need to research this.. not sure how it is finding the post other than via the index somehow?

I think you got something wrong. The tool searches for that string in any post on your forum. If you see the results then you could click on the link in the most left column. Then you go to the specified posting where the tool found the match.

Hope you understand what I'm trying to say.
 
This addon works perfect. My problem was related to something else, which in turn stopped the find/replace. After that was fixed, boom, awesomeness :)
 
Are u willing to share that information?

It was unique to me specifically. During my export/import from VB to XF, somehow a subforum I had turned into a category. It caused the find/replace tool to "stop" once it saw a node_id that "didn't exist". After using another addon to convert that category back to a node, all good.
 
How can I use this to convert this:

Code:
[IMG]http://forums.studentfilms.com/groupee_common/emoticons/icon_smile.gif[/IMG]

to a simple ":)"

I keep getting errors.
 
Last edited:
To request help with a replacement issue and get the best/fastest results please follow these steps.

Find the post with the example of the problem in question
|_ Click edit on that post and then switch to the bbcode editor view to see the actual text content​

Copy the relevant text and paste it in between [code][/code] tags here so that someone may more easily help you.
 
Oh shoot... that didn't work because I didn't post the code tag... Let's try this again:

Code:
[IMG]http://forums.studentfilms.com/groupee_common/emoticons/icon_razz.gif[/IMG]
[IMG]http://forums.studentfilms.com/groupee_common/emoticons/icon_smile.gif[/IMG]

They are smilies from when my forum was with Groupee/InfoPop

From a google search they appear to all be the following:


/groupee_common/emoticons/icon_smile.gif = : )
/groupee_common/emoticons/icon_frown.gif = : (
/groupee_common/emoticons/icon_redface.gif = : o (lowercase "O")
/groupee_common/emoticons/icon_biggrin.gif = : D
/groupee_common/emoticons/icon_wink.gif = ; )
/groupee_common/emoticons/icon_razz.gif = : p
/groupee_common/emoticons/icon_cool.gif = : cool :
/groupee_common/emoticons/icon_rolleyes.gif = : rolleyes :
/groupee_common/emoticons/icon_mad.gif = : mad :
/groupee_common/emoticons/icon_eek.gif = : eek :
/groupee_common/emoticons/icon_confused.gif = : confused :
 
Back
Top Bottom