Awaiting feedback XF\Util\Arr::stringToArray may not be unicode safe

Xon

Affected version
2.2.2
PHP:
public static function stringToArray($string, $pattern = '/\s+/', $limit = -1)
{
   return (array)preg_split($pattern, trim($string), $limit, PREG_SPLIT_NO_EMPTY);
}

The default pattern lacks the u modifier, and trim is used instead of utf8_trim. A number of call sites should probably be updated to pass a pattern with the utf8/unicode modifier.

There are probably other call sites of preg_split that should be updated as well (either to the same standard or to use stringToArray, as appropriate).
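
As a rough illustration of what opting in at a call site could look like, here is a minimal sketch. It assumes PHP 7+, a UTF-8 source file, and the default C locale. The function is a standalone copy of the method quoted above so it runs outside XenForo, and the sample input (an em space, U+2003, between two tags) is made up for the example.

```php
<?php
// Standalone copy of the quoted method, so the sketch runs without XenForo loaded.
function stringToArray($string, $pattern = '/\s+/', $limit = -1)
{
    return (array)preg_split($pattern, trim($string), $limit, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical input: U+2003 EM SPACE between tag1 and tag2, ASCII space before tag3.
$input = "tag1\u{2003}tag2 tag3";

// Default byte-oriented pattern: only the ASCII space acts as a separator.
$default = stringToArray($input);

// Call site opting in to Unicode mode through the existing $pattern parameter.
$unicode = stringToArray($input, '/\s+/u');

var_dump(count($default)); // int(2)
var_dump(count($unicode)); // int(3)
```

Since $pattern is already a parameter, individual call sites can make this choice without the default changing.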
 
Do you have an example where this creates a bug?

It's worth mentioning that utf8_trim is literally just trim unless you pass in a separate character list.

When a regex works on basic ASCII characters only, the Unicode modifier doesn't really make a difference. \s may technically match a few more things in Unicode mode, though I'm not sure that's necessarily desired. The existing code shouldn't have any Unicode-safety issues, though.
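
To make both points concrete, here is a small sketch, assuming PHP 7+, a UTF-8 source file, and the default C locale. The sample strings are invented for illustration. The first half splits multibyte text on an ASCII comma with and without the modifier; the second half shows the one place the results diverge, using an ideographic space (U+3000).

```php
<?php
// ASCII-only pattern on multibyte input: UTF-8 continuation bytes never
// collide with ASCII bytes, so both modes produce the same pieces.
$csv = "café,naïve,日本語";
$bytes = preg_split('/,/', $csv, -1, PREG_SPLIT_NO_EMPTY);
$utf8 = preg_split('/,/u', $csv, -1, PREG_SPLIT_NO_EMPTY);
var_dump($bytes === $utf8); // bool(true)

// \s is where the modes differ: U+3000 IDEOGRAPHIC SPACE is only
// treated as whitespace when the u modifier is present.
$spaced = "日本語\u{3000}テキスト";
$countBytes = count(preg_split('/\s+/', $spaced, -1, PREG_SPLIT_NO_EMPTY));
$countUtf8 = count(preg_split('/\s+/u', $spaced, -1, PREG_SPLIT_NO_EMPTY));
var_dump($countBytes); // int(1)
var_dump($countUtf8);  // int(2)
```

Neither mode corrupts the multibyte characters here; the only change is which characters count as separators.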
 