UTF-8 URL encoding with PHP 8.2 is broken

Recep Baltaş

Well-known member
Licensed customer
Affected version: 2.2.13
After trying to figure it out for weeks I finally found the solution to this URL problem. Downgrading PHP 8.2 (8.2.12) to 8.1 (8.1.26) fixed the issue.

At the moment I don't know whether it's a PHP issue or a XenForo issue, but I am betting on XenForo, hence this thread.
 
Hello,
I was able to solve the issue on PHP 8.2.13. The problem arises from how the character set is handled. After examining the XenForo source code for a while, I came across a function named "Unfurl" (I reached this function through the code in the message).
  • Unfurl: a function that captures links within the text and, with the help of auxiliary functions, generates HTML that presents them more readably on the site.
  • Following that function led me to another one called metadataFetcher, which leads to "getTitle()" and finally to the "cleanMetadataString()" function.

When I examined the code there, it appeared to strip harmful content, unnecessary whitespace, and so on. The cleanup seems to be written with English text in mind, so the only thing I added was a UTF-8 conversion.
Specifically, I added a character set check before the $string assignments and updated the function as follows.

File location: src/XF/Http/Metadata.php:262 (XenForo 2.2.13)

PHP:
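    public function cleanMetadataString($string, $isUrl = false)
    {
        if (!$string)
        {
            return '';
        }
        // Added Code Start
        if (mb_check_encoding($string, 'UTF-8') === false)
        {
            // Not valid UTF-8: treat the bytes as ISO-8859-1 and convert to UTF-8.
            $string = mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
        }
        else
        {
            // Valid UTF-8: convert down to ISO-8859-1. Characters that do not
            // exist in ISO-8859-1 cannot survive this step.
            $string = mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8');
        }
        // Added Code End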
      
        $string = \XF::cleanString($string);
        $string = utf8_unhtml($string, true);
        $string = html_entity_decode($string, ENT_QUOTES | ENT_HTML5, 'UTF-8');
        $string = utf8_unhtml($string);
        $string = str_replace("\n", ' ', trim($string));
        $string = \XF::cleanString($string);
        if ($isUrl)
        {
            /** @var \XF\Validator\Url $validator */
            $validator = $this->app->validator('Url');
            $string = $validator->coerceValue($string);
            if (!$validator->isValid($string))
            {
                $string = '';
            }
        }
        return $string;
    }
Reference used for the fix: PHP: utf8_encode - Manual
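To make it easier to see what the added branch does to a string, here is a minimal standalone sketch. The function name normalizeMetadataEncoding() is hypothetical and is not part of XenForo; it only copies the two mb_convert_encoding() calls from the snippet above, and the sample byte strings are my own assumptions, not data from the affected forum.

```php
<?php
// Hypothetical helper, NOT part of XenForo: it isolates the branch that
// the post adds at the top of cleanMetadataString().
function normalizeMetadataEncoding(string $string): string
{
    if (mb_check_encoding($string, 'UTF-8') === false) {
        // Invalid as UTF-8: assume ISO-8859-1 bytes and convert up.
        return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
    }

    // Valid UTF-8: convert down to ISO-8859-1, as the workaround does.
    return mb_convert_encoding($string, 'ISO-8859-1', 'UTF-8');
}

// Sample inputs (assumed for illustration), written as byte escapes so the
// result does not depend on how this file itself is saved.
$latin1        = "T\xFCrk";              // "Türk" in ISO-8859-1
$doubleEncoded = "T\xC3\x83\xC2\xBCrk";  // "Türk" encoded to UTF-8 twice
$cleanUtf8     = "T\xC3\xBCrk";          // "Türk" in plain UTF-8

foreach ([$latin1, $doubleEncoded, $cleanUtf8] as $sample) {
    // Print input and output as hex so the byte-level change is visible.
    echo bin2hex($sample), ' => ', bin2hex(normalizeMetadataEncoding($sample)), "\n";
}
```

In this sketch the ISO-8859-1 input and the double-encoded input both come out as plain UTF-8, which may be why the workaround appears to help. The already correct UTF-8 input is converted down to ISO-8859-1, so the change is probably best treated as a site-specific workaround rather than a general fix.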

Screenshots indicating the issue is resolved:

1703531650309.webp
1703531657021.webp
1703531664622.webp
 
We see similar problems here after upgrading from 2.2.11 to 2.2.15. I tried downgrading to PHP 8.1 and 8.0 and disabling all add-ons. None of that helped.

1000037311.webp
Still, when I try the same link on this forum it works fine, as can be seen here: https://xenforo.com/community/threads/testing-url-unfurl-utf-8.219430/#post-1667909

So if disabling all addons does not help, what's the difference between this forum and our forum that causes this problem?
 
Then ask him why they fixed it in the 2.2.14 release if it wasn't a bug. It is a bug, and it happens on forums that use PHP 8.2.
 
This forum also has this error, and it's not PHP's fault.
 
Because it looks like THIS forum doesn't have the issue, and neither do many other forums or the demo. That is why it has been categorized as a PHP issue.

PHP 7.x is still fine for XenForo.
 
We did not see this bug on our forum when running XenForo 2.2.11 with PHP 8.0 or 8.1.

Now, with 2.2.15, we do see it, and switching PHP between 8.0, 8.1 and 8.2 makes absolutely no difference. We don't have any 7.x versions available at the moment, and I really don't think they should be needed.
 
They even accepted it and provided a patch back then:

 

Is this the solution?
 
It was the solution for that problem, but this is not the same bug. That earlier bug showed nasty errors to the user if the URL itself contained special characters.

This time there are no errors with the URL itself, but special characters in the unfurl preview are broken.
 
I found one difference between the XenForo versions that affects this.

/src/vendor/symfony/dom-crawler/Crawler.php line 196

XenForo 2.2.11:
$content = mb_convert_encoding($content, 'HTML-ENTITIES', $charset);

XenForo 2.2.15:
$content = htmlspecialchars_decode(iconv('UTF-8', 'ISO-8859-1', htmlentities($content, ENT_COMPAT, 'UTF-8')), ENT_QUOTES);

Reverting this line to the previous version seems to cure the problem, but I need to do more research to see whether it breaks something else.
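For anyone who wants to compare the two lines outside XenForo, here is a rough sketch that wraps each one in a function and runs both on the same input. The function names toEntitiesOld() and toEntitiesNew() and the sample strings are hypothetical; only the two expressions are taken from the lines quoted above. It is not meant to prove the cause of the broken previews, only to show one way in which the lines can differ: the older one uses $charset, the newer one always assumes UTF-8.

```php
<?php
// Hypothetical wrappers around the two Crawler.php lines quoted above.

// Line quoted for XenForo 2.2.11. The 'HTML-ENTITIES' target has been
// deprecated since PHP 8.2, so the notice is silenced here with @.
function toEntitiesOld(string $content, string $charset): string
{
    return @mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
}

// Line quoted for XenForo 2.2.15. Note that $charset does not appear in it.
function toEntitiesNew(string $content): string
{
    return htmlspecialchars_decode(
        iconv('UTF-8', 'ISO-8859-1', htmlentities($content, ENT_COMPAT, 'UTF-8')),
        ENT_QUOTES
    );
}

// Sample inputs (assumed for illustration), written as byte escapes.
$utf8   = "T\xC3\xBCrk";  // "Türk" encoded as UTF-8
$latin1 = "T\xFCrk";      // "Türk" encoded as ISO-8859-1

var_dump(toEntitiesOld($utf8, 'UTF-8'));
var_dump(toEntitiesNew($utf8));
var_dump(toEntitiesOld($latin1, 'ISO-8859-1'));
var_dump(toEntitiesNew($latin1));
```

With the UTF-8 sample both functions should produce the same entity-encoded text. With the ISO-8859-1 sample the newer line returns an empty string, because htmlentities() returns an empty string when the input is not valid in the encoding it is told to expect. Whether the affected pages hit this path or a different one is something I have not verified.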
 