XF 2.2 Looking for a query to remove html from import

beerForo

Well-known member
I can't use the find post/replace tool from XF because this is not in posts, it's in profile posts.

After an import, if somebody posted a URL as a profile post previously (not a profile post comment, those are ok) they show html.

Here is an example as it appears in the profile post. The link is active, but the HTML shows.
Code:
<a href="http://domain.com/topic/7428-hello-there/" rel='nofollow external' class="su_links">http://domain.com/topic/7428-hello-there/</a>

There are a few instances where the URL bbcode is there as well as the html, so I am looking to avoid adding double bbcode in those instances. For example they look like:

Code:
<a href="http://domain.com/topic/7428-hello-there/" rel='nofollow external' class="su_links">[URL]http://domain.com/topic/7428-hello-there/[/URL]</a>

Looking for a query to remove the html and wrap with [URL][/URL], thanks if you can help!
 

otto

Well-known member
This should work, but please make a backup and do test runs before you save anything to your database!
If I understand you correctly, try this:

Fast search:
Code:
://domain.com/topic/
Important: You have to replace "domain" with your real domain!!!

Regex:
Code:
#(\<a\shref=\"https?:\/\/domain\.com\/topic\/[0-9a-zA-Z-]+\/?\")([ a-z=\'\"_]+\>)(https?:\/\/domain\.com\/topic\/[0-9a-zA-Z-]+\/?)(\<\/a\>)#siU
Important: You have to replace "domain" with your real domain!!!

Match:
[Screenshot of the match test: 1628841322651.png]

Replace:
Code:
[url]$3[/url]
[Screenshot of the replace step: 1628841308719.png]
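
For anyone who wants to try the same substitution outside the search and replace tool, here is a rough Python port of the regex and replacement above. It is only a sketch: the function name `replace_links` is made up for illustration, "domain" still has to be swapped for the real domain, and the PHP `U` (ungreedy) flag is dropped on the assumption that greedy matching finds the same spans here, since every group is bounded by literal text.

```python
import re

# Hypothetical port of the regex above; replace "domain" with the real domain.
# Assumption: without the PHP "U" flag the groups still match the same text,
# because each one is delimited by a literal quote, ">" or "</a>".
LINK = re.compile(
    r"""(<a\shref="https?://domain\.com/topic/[0-9a-zA-Z-]+/?")"""
    r"""([ a-z='"_]+>)"""
    r"""(https?://domain\.com/topic/[0-9a-zA-Z-]+/?)"""
    r"""(</a>)""",
    re.IGNORECASE | re.DOTALL,
)


def replace_links(message):
    """Replace each matching anchor with [url]...[/url] built from group 3."""
    return LINK.sub(r"[url]\3[/url]", message)


if __name__ == "__main__":
    sample = (
        '<a href="http://domain.com/topic/7428-hello-there/" '
        "rel='nofollow external' class=\"su_links\">"
        "http://domain.com/topic/7428-hello-there/</a>"
    )
    print(replace_links(sample))
```

Note that this pattern expects the link text to start with http right after the opening tag, so the variant that already has [URL] inside the anchor is left untouched and would need its own pass.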

This should work, but please make a backup and do test runs before you save anything to your database!
 

beerForo

Well-known member
No, these are all different links that people posted as statuses (profile posts), not links to one domain. But they all look like this. And I can't use the XF tool because that is only for posts.
 

otto

Well-known member
"Look like" is not enough. ;)
Regex can be clever but never intelligent, so I need more concrete input to write a regex that matches more kinds of link strings.
I need examples that show me a pattern, so I can write one or more regexes that match as many of the strings as possible for replacement.

If you set up a test account for me on your forum, I can take a better look at what I can do for you. :)
 

beerForo

Well-known member
But the XF tool does not work on profile posts, so don't I need a query, not a regex?

Like 3 steps:

"remove from <a to first >" That removes the first bracket of html
"remove </a>" That removes closing bracket.
"put [URL][/URL] around strings that begin with http if not there"
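
Those three steps can be sketched in Python roughly as follows. This is an illustration under assumptions, not a tested migration: the function name `clean_profile_post` is made up, the function only touches messages that contain an `<a ...>` tag (a narrowing of step 3, to avoid wrapping URLs in untouched posts), and it only checks for an existing [URL] tag, so a URL inside some other BBCode such as [IMG] would still get wrapped.

```python
import re

# Step 1: everything from "<a" up to the first ">".
OPEN_A = re.compile(r"<a\b[^>]*>", re.IGNORECASE)
# Step 2: the closing tag.
CLOSE_A = re.compile(r"</a>", re.IGNORECASE)
# Step 3: match either an existing [URL]...[/URL] block or a bare http(s) URL.
URL_OR_BBCODE = re.compile(
    r"(\[URL[^\]]*\].*?\[/URL\])|(https?://[^\s\[\]<>\"']+)",
    re.IGNORECASE | re.DOTALL,
)


def _wrap(match):
    # Leave URLs that already sit inside [URL] BBCode untouched.
    if match.group(1):
        return match.group(1)
    return "[URL]" + match.group(2) + "[/URL]"


def clean_profile_post(message):
    """Strip imported <a> tags and wrap the remaining bare URLs in [URL]."""
    if not OPEN_A.search(message):
        return message  # nothing imported as HTML, leave the post alone
    message = OPEN_A.sub("", message)
    message = CLOSE_A.sub("", message)
    return URL_OR_BBCODE.sub(_wrap, message)


if __name__ == "__main__":
    sample = (
        '<a href="http://domain.com/topic/7428-hello-there/" '
        "rel='nofollow external' class=\"su_links\">"
        "http://domain.com/topic/7428-hello-there/</a>"
    )
    print(clean_profile_post(sample))
```

To use it against the database, a script would read each profile post, run the text through the function, and write back only the rows that changed, after a backup. Which table and column hold the profile post text (I would expect a `message` column in `xf_profile_post`, but that is an assumption to verify against your own install) is not something the sketch decides for you.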
 

beerForo

Well-known member
The add-on can likely be modified fairly easily.
Good tip thanks. If anyone knows how to modify code so it would find profile posts (does not even need to find the comments as those hyperlinks imported fine) thanks!

And the issue is that URLs posted as a status show HTML, as seen above. I believe the 2 steps I posted would fix it.
 

otto

Well-known member
No, not in profile posts. Without a code change it only works in postings.


For postings, and for links like the ones you posted above:

Fast search: "http" (or a second run with "ftp")

Regex:
Code:
#(\<a\shref=\")((ftp|https?)(:\/\/[0-9a-z_-]+.[0-9a-z]+\/?[0-9a-zA-Z_\-.\/\)]+)+)([0-9a-zA-Z:.\[\]\/\" \=\'_\-\>]+)(\<\/a\>)#siU
That looks for http, https, and ftp links


Match test:
[Screenshot of the match test: 1628873618301.png]
This matches all the link types you see above, and we use only one regex match group in the next step.


Replace:
[Screenshot of the replace step: 1628873645311.png]
So we put that one matching group inside a URL BBCode and voilà, we have simple working links.



But if you need it for profile posts, this is not much help, because the search and replace tool only works on postings. Sorry.
 