Fixed utf8.php should be updated (php8.2)

Weppa333

Well-known member
Affected version
2.2.15
This has been discussed in the past but the solution is usually to ask plugin authors ton update their plugins for improved PHP8.1/PHP8.2 compatibility, whereas I think this is clearly a problem inside Xenforo 2.2 codebase, up and including 2.2.15

The file
./src/vendor/dokuwiki/utf8/utf8.php

Contains code that is not compatible with PHP8.2, often spitting out warning/deprecations about passing "null" to strlen

Template XXXXXXXXXX: [E_DEPRECATED] mb_convert_encoding(): Passing null to parameter #1 ($string) of type array|string is deprecated (src/vendor/dokuwiki/utf8/utf8.php:142)

This can be "hidden" by not sending null to strlen inside, aot, admin templates, but it's simply hiding the fact this library is a problem waiting to happen.

Imho, this library should be patched for improved php8.2 compatibility

SecureCRT_5aQQqFKnWn.webp

eg here, simply forcing the result to a string

strlen((string)mb_convert_encoding...

Would fix the problem once and for all ( This should be done at multiple places in that library. )

My 2 cents :)

Previous reference of the same issue :
 
Last edited:
Also php8.2 is (afaik) now the recommended version for 2.2.15 and yet it contains code that is fundamentally not php8 friendly.
imho , templates shouldn't be able to trigger php errors/warning/deprecation notices
And this code can be "triggerred" trvivially by doing strlen of an empty (not set) string, as strlen can be used in templates.
 
Last edited:
I am somewhat against "fixing this bug" as that just hides the underlying issue - null is not a string and therefore you can't call strlen() on it.

This only fix that IMHO would make sense here is is to add a typehint string for param $string.
 
Yes that is of course an alternative
What creates a problem here is that this is trivially triggerable just about everywhere by templates, sometime rendering the whole installation unable to save a control panel setting, or just not starting at all potentially (needing to disable plugins) and of course I understand this is the price to pay for flexibility.

You can see multiple people in the forums here with "broken" installs due to plugins sending null to strlen (this is typical in admin panel plugins)

Just a heads up.
 
I am somewhat against "fixing this bug" as that just hides the underlying issue - null is not a string and therefore you can't call strlen() on it.
Heh, we have started discussing this code more generally in Slack just now and I just made basically the same point.

The true bug here is not actually the errors that might happen if someone passes in a non-string. The bug here is people passing non-strings to it at all.

Type safety as I am in favour of and Kirby is suggesting here would help developers identify such errors much sooner, ideally during development and before shipping.

It really would be just the add-ons that should fix this. Ultimately, for example, a developer is passing a value in here and is expecting something to happen. If they pass in the wrong thing and that results in some default behaviour that bug could be hidden as it stands, and if we exacerbate that further by casting random values to string, that further hides what could be a genuine mistake (such as typing a variable name wrong).

Points noted about how enforcing this can potentially break things, but often things like this should be fixed at source which unfortunately probably means we're going to enforce this more strictly and probably generate more errors, not fewer.
 
Just adding my $0.02 to this from the perspective of a C# developer with little to no PHP experience.

In C#, it is possible for string types to be null or empty, which are two distinct cases with two distinct use cases. If the same is possible in PHP, then I would 100% expect the method documentation to specify how it handles null values, whether that's indicating that parameters should not be null or by stating that null and empty strings are handled the same way by returning a 0 length.

With this in mind, I agree that this is the purview of the add on developers to fix, likely by adding a guard clause to handle null strings. I imagine this function would handle empty strings correctly, but this is an assumption on my part.

If this were C#, I would go a step further because type hints don't specify a contract. They just provide a hint to the developer, which can be missed or ignored. So I would also add a guard in the body of the utf8_strlen(string) function to check for a null parameter and to throw a detailed exception. This way there is a clear and distinct error message describing the exact error.
 
From a PHP perspective, it is quite similar. PHP has a long history and it hasn't always cared about what types are passed in where, and often it relied on assumptions, much like OP is suggesting, to automatically cast values to the expected value. As we can see here, that behaviour is now becoming deprecated (in PHP 8.3, so only as recently as 2023!) and PHP now emits a deprecation notice about it but eventually it will hard error.

In PHP, however, argument types (which I was somewhat inaccurately interchangeably referring to as type hints earlier) will result in a type error exception if null is passed in. So we shouldn't have to throw any exceptions ourselves.
 
Okay so this probably isn't the outcome @Weppa333 asked for but I think it's the right solution.

Starting with Beta 6 the following has changed:
  • We are no longer shipping the utf8.php code that provided functions including (but not limited to) utf8_substr, utf8_romanize, utf8_deaccent, utf8_trim, and so on.
  • We have a new XF\Util\Str class which contains ported, new or updated functions relating to UTF-8 handling, the names of these functions are slightly different, example usage: Str::substr($string, $offset, $length)
  • Where possible (no union types yet, boo) functions have accurate return types and argument types
  • A new file is now included from src/utf8.php. This file acts as a shim for the utf8_ functions we were using but are now no longer available. These call the relevant functions in the new Str util. It contains:
    • utf8_isASCII()
    • utf8_romanize()
    • utf8_deaccent()
    • utf8_bad_replace()
    • utf8_substr()
    • utf8_substr_replace()
    • utf8_strlen()
    • utf8_strpos()
    • utf8_ltrim()
    • utf8_rtrim()
    • utf8_trim()
    • utf8_strtolower()
    • utf8_strtoupper()
    • utf8_ucfirst()
    • utf8_ucwords()
    • utf8_unhtml()
    • utf8_to_unicode()
  • This new shim file is immediately deprecated along with all of its proxy functions and it will be removed in XenForo 3.0.
  • We have not attempted, nor will we, to port over all functionality previously provided by the original utf8.php. However, if you are using utf8_ functions not listed above, please let us know (via bug report) and we will consider adding them in a future release or advise of a suitable alternative.
  • Perhaps somewhat significantly, at this point in time, XenForo 2.3.0 starting with Beta 6 will have a hard-requirement for the mbstring extension to be installed. Most PHP builds have this, despite it being optional, and most modern PHP applications and frameworks require it.
With all that being said, we're going to mark the bug as fixed.

Yes, this may result in an increase of issues with some add-ons in some cases, but those issues should be reported to the add-on developers themselves for a fix by them and developers who have any questions can feel free to post in the development discussions forum.
 
FWIW, ran into an issue where using {{ contains() }} in a template was broken by this change. Specifically contains() calls substring method which even though PHP docs says needs to be a string, apparently it wasn't actually the case. In my case, there was a situation where contains() was sometimes being used on a null value. That worked pre beta 6, but not after this change.

Easy enough to fix by adding a check in the template to only check contains if it's not null... but just in case someone else runs across it.
 
If $someVar is null, this worked before beta 6, but not after:

HTML:
{{ contains($someVar, 'some_string') ? 'something to show' : '' }}

Changing it to this makes it work before and after beta 6:

HTML:
{{ $someVar && contains($someVar, 'some_string') ? 'something to show' : '' }}
 
Back
Top Bottom