Steffen
Well-known member
Affected version: 2.0.6
XenForo currently completely removes nonbreaking hyphens (U+2011) from URLs. For example, the title "Cortex‑A72" is turned into the URL "cortexa72". It should be "cortex-a72".
diff --git a/src/XF/Mvc/Router.php b/src/XF/Mvc/Router.php
index 49b09a8a5..aea513f3f 100644
--- a/src/XF/Mvc/Router.php
+++ b/src/XF/Mvc/Router.php
@@ -488,6 +488,9 @@ class Router
);
$string = strtr($string, ['"' => '', "'" => '']);
+ // Non-breaking space and Non-breaking hyphen
+ $string = str_replace([' ', '‑'], ' ', $string);
+
if ($romanize)
{
$string = preg_replace('/[^a-zA-Z0-9_ -]/', '', $string);
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
).

I cannot actually reproduce this. I've just named a thread
Cortex‑-A72
which is a non-breaking hyphen followed by a standard hyphen (if you do Ctrl+F and search for a normal hyphen, only one will be highlighted in the previous inline code). In fact, saving this post will retain it, as further evidence of us not doing anything explicit to strip that out via the input filterer.

threads/cortex‑-a72.12
(again, only the right one is a standard hyphen).

That seems to still strip the non-breaking hyphen in my testing.

Btw, my "final" fix for this issue a few weeks ago was to add
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
after
$string = utf8_romanize(utf8_deaccent($string));
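
For anyone who wants to try the iconv approach outside XenForo, here is a minimal standalone sketch. The helper name sketchTransliterate is made up for illustration and is not part of XenForo. As far as I know, what //TRANSLIT produces depends on the system's iconv implementation and locale, so no expected output is shown.

```php
<?php
// Hypothetical helper, not XenForo code: transliterate a UTF-8 title to ASCII
// and fall back to the untouched input if iconv reports an error.
function sketchTransliterate($string)
{
    // @ suppresses the notice iconv raises for characters it cannot convert
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $string);

    return ($ascii === false) ? $string : $ascii;
}

// "\xE2\x80\x91" is U+2011 NON-BREAKING HYPHEN encoded as UTF-8
echo sketchTransliterate("Cortex\xE2\x80\x91A72"), "\n";
```

Plain ASCII input should pass through unchanged, so the call is safe to place in front of an ASCII-only filter.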
Index: src/XF/Mvc/Router.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- src/XF/Mvc/Router.php (date 1531486827000)
+++ src/XF/Mvc/Router.php (date 1531487582000)
@@ -490,6 +490,8 @@
if ($romanize)
{
+ // Convert non-breaking hyphen to hyphen
+ $string = str_replace('‑', '-', $string);
$string = preg_replace('/[^a-zA-Z0-9_ -]/', '', $string);
}
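
To see why the order matters, here is a small standalone sketch of the patch above. It only reuses the str_replace and preg_replace lines from the diff. The variable names and the final strtolower are assumptions added for the demo, not XenForo's actual code path.

```php
<?php
// U+2011 NON-BREAKING HYPHEN written as UTF-8 bytes so it is visible in source
$nbHyphen = "\xE2\x80\x91";
$title = "Cortex{$nbHyphen}A72";

// ASCII filter alone: the three bytes of U+2011 are not in the allowed set
// and are dropped, which is the behaviour described in the report
$stripped = preg_replace('/[^a-zA-Z0-9_ -]/', '', $title);

// Convert the non-breaking hyphen first, then apply the same filter
$converted = preg_replace(
    '/[^a-zA-Z0-9_ -]/',
    '',
    str_replace($nbHyphen, '-', $title)
);

echo strtolower($stripped), "\n";  // cortexa72
echo strtolower($converted), "\n"; // cortex-a72
```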
$string = str_replace('‑', '-', $string);
works fine for non-breaking hyphens. The advantage of
$string = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
is that it handles other characters, too: × → x and ² → 2 (and non-breaking spaces).

Non-breaking spaces are used when you want to prevent a line break that would make a sentence or title harder to read. For example, you might want to prevent a line break in "Windows 10" or "July 14th" (it would look strange if "Windows" was the last word on one line and "10" the first word on the next line). I agree that this isn't usually something that your average forum user does, but our editors use non-breaking spaces when writing headlines (which are then used as comment thread titles). I don't think that non-breaking spaces should be stripped from URLs; they should be converted to normal spaces (and finally to hyphens).

Do you have an example of how non-breaking spaces are problematic? In my testing, these are stripped, but IMO they should be, so I don't think we need to make any changes there.
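
A rough sketch of the difference between stripping and converting non-breaking spaces, using the "Windows 10" headline as input. The function name sketchSlugWithSpaces, the $convertNbsp flag and the final space-to-hyphen step are assumptions for illustration only. Just the ASCII filter is taken from the patch context.

```php
<?php
// Hypothetical helper, not XenForo code
function sketchSlugWithSpaces($title, $convertNbsp)
{
    // U+00A0 NO-BREAK SPACE written as UTF-8 bytes
    $nbsp = "\xC2\xA0";

    if ($convertNbsp)
    {
        // Turn the non-breaking space into a normal space before filtering
        $title = str_replace($nbsp, ' ', $title);
    }

    // ASCII filter from the patch context; drops any remaining non-ASCII bytes
    $title = preg_replace('/[^a-zA-Z0-9_ -]/', '', $title);

    // Assumed final step: runs of spaces/hyphens become one hyphen, lowercase
    return strtolower(preg_replace('/[ -]+/', '-', trim($title)));
}

echo sketchSlugWithSpaces("Windows\xC2\xA010", false), "\n"; // windows10
echo sketchSlugWithSpaces("Windows\xC2\xA010", true), "\n";  // windows-10
```

With stripping, the two words run together. With conversion, the URL keeps the word boundary that the headline had.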