Fixed New URL romanization options do not enforce intl extension properly

pegasus

Well-known member
The new options require the intl extension, per release notes. However, if you don't have the intl extension, it is still possible to enable the options, which results in errors such as {Emoji} regex property not existing, null URLs, etc.
An exception occurred: [ErrorException] [E_WARNING] preg_replace_callback(): Compilation failed: unknown property name after \P or \p at offset 9 in src/XF/Str/Formatter.php on line 417
If the failure instead makes preg_replace_callback return null rather than throwing an exception (I'm not sure, but this may depend on timing), the following error appears:
An exception occurred: [TypeError] XF\Util\Str::transliterate(): Argument #1 ($string) must be of type string, null given, called in /[path]/src/XF/Mvc/Router.php on line 540 in src/XF/Util/Str.php on line 28

If the intl extension is missing, then possibly the old code (before fixing this thread's issue) should be used, and skip the Emoji related stuff. There should also possibly be a notice under the related options, e.g. "you need to install the intl extension to use this"
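A minimal sketch of the fallback suggested above. The helper name `resolveRomanizationMode()` and the mode names `'transliterate'` and `'legacy'` are hypothetical, chosen only for illustration; this is not XenForo's actual option-handling code:

```php
<?php

/**
 * Hypothetical helper: downgrade an option value that needs the intl
 * extension when that extension is not loaded.
 * The mode names here are illustrative assumptions, not XenForo's.
 */
function resolveRomanizationMode(string $requested, bool $intlLoaded): string
{
    // Assumption: only the transliterating mode depends on intl.
    if ($requested === 'transliterate' && !$intlLoaded) {
        // Fall back to the old behaviour instead of failing later.
        return 'legacy';
    }

    return $requested;
}

$mode = resolveRomanizationMode('transliterate', extension_loaded('intl'));
echo "Effective mode: {$mode}\n";
```

The same boolean could drive a notice under the related options, so an admin sees why the choice was downgraded.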
 
This one has confused me slightly.

You are running into this code:

PHP:
$string = preg_replace_callback(
    '/\p{Emoji}/u',
    $replaceCallback,
    $string
);

But that code should only execute on PHP 7.3 or above. PHP 7.3 is when PCRE2 was added. I'm not sure if there are any other variables, but in all the testing I've done, the character class \p{Emoji} should work. Versions below PHP 7.3 handle this differently.

This particular code path should not be dependent on the intl extension being loaded and needs to be run regardless. It only applies to the "include" option, which simply maintains the emoji without stripping them (which was the original behaviour).

Intl only comes into play if you want to transliterate the emoji to an ASCII representation.

Is there anything about your environment that rings a bell here? Were you getting this error on an old version of PHP or is your version of PHP potentially missing expected dependencies?

I think you're correct that perhaps this is causing the null type error, but the first issue should take care of this.
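To illustrate how the first failure can lead to the second: when a pattern fails to compile, `preg_replace_callback()` raises a warning and returns `null`, and that `null` can then reach code expecting a string. A minimal sketch of guarding against this, using a hypothetical `replaceEmojiSafely()` helper that is not part of XenForo:

```php
<?php

/**
 * Hypothetical wrapper: run the emoji replacement, but never return null.
 */
function replaceEmojiSafely(string $string, callable $replaceCallback): string
{
    // @ suppresses the compile warning on PCRE builds without \p{Emoji}.
    $result = @preg_replace_callback('/\p{Emoji}/u', $replaceCallback, $string);

    // null signals a PCRE error; fall back to the unmodified input.
    return $result ?? $string;
}

// Identity callback: keeps every match unchanged.
$out = replaceEmojiSafely('hello', function (array $match): string {
    return $match[0];
});
var_dump($out);
```

Whether silently keeping the original string is the right fallback depends on the caller; the point is only that the return value stays a string.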
 
In my environment, the Emoji PCRE class did not become available in PHP until I recompiled with the intl extension. My environment was running PHP 8.2 until recently. I upgraded to 8.3 after encountering this issue, and still got that error until I enabled intl. I always download the PHP source directly from php.net and compile it manually. As PCRE2 is bundled with the PHP source obtained this way, it seems to me that either the bundled PCRE version does not have the Emoji class or PHP's build does not include it.
 
Interesting. Well, that doesn't seem to be the case generally nor is that documented to be the behaviour.

The Emoji character class is available with PCRE2, which PHP has used since PHP 7.3, and it has no dependence on the intl extension, so the results are very confusing.

I have a setup without intl and the Emoji class is available.


We've also had no other reports of this. We've had other reports related to intl not being available, but nothing related to this, which they would have run into.

Test script:

PHP:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Intl and PCRE Emoji test</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 20px;
        }
        .result {
            padding: 10px;
            margin-bottom: 10px;
            border-radius: 5px;
        }
        .success {
            background-color: #d4edda;
            color: #155724;
            border: 1px solid #c3e6cb;
        }
        .error {
            background-color: #f8d7da;
            color: #721c24;
            border: 1px solid #f5c6cb;
        }
    </style>
</head>
<body>
    <h1>Intl and PCRE Emoji test</h1>
    
    <?php
    
    if (extension_loaded('intl'))
    {
        echo '<div class="result success">The intl extension is loaded.</div>';
    }
    else
    {
        echo '<div class="result error">The intl extension is not loaded.</div>';
    }
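
    // preg_match() returns false (with a warning) if \p{Emoji} cannot be compiled,
    // so suppress the warning and require an actual match.
    $test_string = "🙂";
    if (@preg_match('/\p{Emoji}/u', $test_string) === 1)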
    {
        echo '<div class="result success">The \\p{Emoji} character class is available and working.</div>';
    }
    else
    {
        echo '<div class="result error">The \\p{Emoji} character class is not available.</div>';
    }
    ?>
</body>
</html>
 
Okay, after further comparison of file changes, I see the following:

It should be noted that PHP does not seem to use the PCRE library installed on the system, but instead bundles its own copy of PCRE2 in its source tree (ext/pcre/pcre2lib). This inevitably leads to situations where PHP is out of step with PCRE, even if a newer version is installed on your server separately.

- We have multiple versions of PHP installed at once, due to varying software requirements. It looks like XenForo 2.3 was still running on PHP 8.1.24 even after we installed 8.3.7, until we noticed that the wrong version was being used and switched to the newer one. As evidence, the error message I posted uses the PHP 8.1.x wording "unknown property name after" rather than the PHP 8.2+ wording "unknown property after" (source)

- Comparing the source files of PHP 8.3.7, ext/pcre/pcre2lib/pcre2_ucptables.c (source) does indeed define the PCRE Emoji class.

- However, the respective source file in PHP 8.1.24, ext/pcre/pcre2lib/pcre2_tables.c (source) does not. So Emoji is not available in PHP 8.1. According to the file's version history, this variation of the PCRE extension was in use until PHP 8.2.0 RC 1 (source), after which the version with the Emoji class appeared (i.e. pcre2_ucptables.c).

Therefore, with the bundled PCRE library, use of the Emoji class in PHP 8.x requires PHP 8.2 or higher.
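Rather than relying on the PHP version alone, the PCRE build can be probed directly. A minimal sketch, assuming a hypothetical helper name `pcreSupportsEmojiProperty()` (this is not the check XenForo ships):

```php
<?php

/**
 * Hypothetical probe: true when the PCRE build used by this PHP
 * binary can compile the \p{Emoji} property.
 */
function pcreSupportsEmojiProperty(): bool
{
    // preg_match() returns false when the pattern fails to compile;
    // @ hides the accompanying warning. An empty subject is enough,
    // because only compilation is being tested.
    return @preg_match('/\p{Emoji}/u', '') !== false;
}

// PCRE_VERSION reports the PCRE library version PHP was built with.
echo 'PHP ' . PHP_VERSION . ', PCRE ' . PCRE_VERSION . "\n";
echo pcreSupportsEmojiProperty()
    ? "\\p{Emoji} is supported\n"
    : "\\p{Emoji} is not supported\n";
```

Running this under each installed PHP binary also shows which PHP and PCRE versions a given setup is really using.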
 
Thank you for reporting this issue, it has now been resolved. We are aiming to include any changes that have been made in a future XF release (2.3.1).

Change log:
Make PCRE character class check more robust.
There may be a delay before changes are rolled out to the XenForo Community.
 