Fixed: Non-English letters are not lowercased in URLs

Cenk

Member
Example:

threads/turkish-test-ĞğÜüŞşİıÖöÇç.4816/


It should be:

threads/turkish-test-ğğüüşşiıööçç.4816/
 

Shadab

Well-known member
Any reason why strtr was chosen over something like utf8_strtolower, for lowercasing the content titles?
(I really hope it wasn't for performance reasons).
 

Mike

XenForo developer
Staff member
strtr is actually faster, but it was mostly related to the fact that strtolower respects locales, and we don't want that. I am concerned about the performance of utf8_strtolower (without the mb version), though I haven't done any explicit tests. The link builders can be called a large number of times on a page.
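To illustrate the trade-off described above, here is a minimal sketch of locale-independent lowercasing with `strtr`. It is not XenForo's actual link-builder code; the function names `lowerAsciiOnly` and `lowerWithTurkishTable` are hypothetical. The first function shows why the reported titles keep their capitals (only the bytes for A-Z are translated), and the second shows one possible way to extend the same approach with an explicit table for the Turkish letters in the report.

```php
<?php
// Hypothetical helper: lowercase only ASCII A-Z, independent of setlocale().
// Bytes of multibyte UTF-8 characters are all >= 0x80, so they are untouched.
function lowerAsciiOnly($title)
{
    return strtr(
        $title,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
        'abcdefghijklmnopqrstuvwxyz'
    );
}

// Hypothetical extension: an explicit table for the Turkish capitals from the
// report, applied after the ASCII pass. Still no locale involved.
function lowerWithTurkishTable($title)
{
    $table = array(
        'Ğ' => 'ğ', 'Ü' => 'ü', 'Ş' => 'ş',
        'İ' => 'i', 'Ö' => 'ö', 'Ç' => 'ç',
    );
    return strtr(lowerAsciiOnly($title), $table);
}

echo lowerAsciiOnly('Turkish-Test-ĞğÜü'), "\n";                 // turkish-test-ĞğÜü
echo lowerWithTurkishTable('Turkish-Test-ĞğÜüŞşİıÖöÇç'), "\n";  // turkish-test-ğğüüşşiıööçç
```

Note that the ASCII pass always maps `I` to `i`, which is the English rule rather than the Turkish one (`I` to `ı`); a table like this only covers the characters it lists.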
 

Shadab

Well-known member
utf8_strtolower (without the mb version)
Definitely slower. :( I had wrongly assumed that those utf8 helper functions didn't utilize any mbstring functions. I just did a quick profiling and it turns out utf8_strtolower is almost 3 times slower when there are no mbstring functions available to it.

mbstring:
ascii_vs_utf8__mb.png

No mbstring:
ascii_vs_utf8__no_mb.png
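For anyone who wants to repeat a comparison like this, a rough timing sketch follows. It is not the profiling setup used for the screenshots above: the helper names `timeLowercase` and `asciiLowerViaStrtr`, the sample title, and the iteration count are all assumptions, and `mb_strtolower` stands in for the mbstring-backed path because `utf8_strtolower` is not available outside the bundled UTF-8 library.

```php
<?php
// Hypothetical timing helper: calls $fn on $input $iterations times and
// returns the elapsed wall-clock time in seconds.
function timeLowercase($fn, $input, $iterations)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn, $input);
    }
    return microtime(true) - $start;
}

// ASCII-only lowercasing via strtr, as discussed in the thread.
function asciiLowerViaStrtr($title)
{
    return strtr(
        $title,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
        'abcdefghijklmnopqrstuvwxyz'
    );
}

$title = 'Turkish-Test-ĞğÜüŞşİıÖöÇç';
$runs  = 100000;

printf("strtr:         %.4fs\n", timeLowercase('asciiLowerViaStrtr', $title, $runs));

// Only time the multibyte version when the mbstring extension is loaded.
if (function_exists('mb_strtolower')) {
    $mbLower = function ($s) {
        return mb_strtolower($s, 'UTF-8');
    };
    printf("mb_strtolower: %.4fs\n", timeLowercase($mbLower, $title, $runs));
}
```

Absolute numbers will vary by PHP version and hardware, so only the ratio between the two lines is meaningful.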
 

Nickolas

Member
You can modify the UTF-8 library's accent arrays. Create a plugin and try the following code:
PHP:
        global $UTF8_LOWER_ACCENTS, $UTF8_UPPER_ACCENTS;

        $UTF8_LOWER_ACCENTS = array_merge($UTF8_LOWER_ACCENTS, array(
            'ı' => 'i',
            'ü' => 'u',
            'ö' => 'o'
        ));

        $UTF8_UPPER_ACCENTS = array_merge($UTF8_UPPER_ACCENTS, array(
            'İ' => 'I',
            'Ü' => 'U',
            'Ö' => 'O'
        ));
Note that URL romanization must be active for this to work.

I think you also have to override this method signature:

PHP:
public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = false)
replace with:

PHP:
public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = true)
 
You can modify the UTF-8 Library accent array. Create a plugin and try following code...
Wouldn't that take away the meaning of the word? It would be like writing "kat" for "cat" (just an example). I might be wrong there, but the URL is an extremely vital tool for search engines.
 

James

Well-known member
and the URL is an extremely vital tool for search engines.
Not exactly. Search engines care more about content than URLs. We had this discussion earlier about the weight of keywords in URLs, and there isn't much weight in them.
 

yavuz

Well-known member
You can modify the UTF-8 Library accent array. Create a plugin and try following code...
Can someone put together a plugin that accomplishes this?
 

Nickolas

Member
Wouldn't that take away the meaning of the word? It would be like writing "kat" for "cat" (just an example). I might be wrong there, but the URL is an extremely vital tool for search engines.
Well, yes. That solution does not work for every language, but it works for Turkish. That's why I wrote that code.
 

Jarod

Active member
Hello,


Same with French accents: ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿ
 

dihuta

Formerly Dinh Thanh
Same with Vietnamese accents. We need to replace all Vietnamese accented characters with plain English letters:
a á à ã ă â e é è ê i í ì ĩ o ó ò õ ô ơ u ú ù ũ ư y ý ỳ đ
to:
a a a a a a e e e e i i i i o o o o o o u u u u u y y y d
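The replacement table above could be expressed as a byte-safe lookup with `strtr`. This is only a sketch: the function name `romanizeVietnamese` is made up for illustration, and the table covers just the lowercase characters listed in this post, not uppercase forms or every toned Vietnamese vowel.

```php
<?php
// Hypothetical helper: replace the accented characters listed in the post
// with their unaccented ASCII counterparts. Characters not in the table
// (including uppercase and other toned vowels) are left unchanged.
function romanizeVietnamese($title)
{
    $table = array(
        'á' => 'a', 'à' => 'a', 'ã' => 'a', 'ă' => 'a', 'â' => 'a',
        'é' => 'e', 'è' => 'e', 'ê' => 'e',
        'í' => 'i', 'ì' => 'i', 'ĩ' => 'i',
        'ó' => 'o', 'ò' => 'o', 'õ' => 'o', 'ô' => 'o', 'ơ' => 'o',
        'ú' => 'u', 'ù' => 'u', 'ũ' => 'u', 'ư' => 'u',
        'ý' => 'y', 'ỳ' => 'y',
        'đ' => 'd',
    );

    // With an array argument, strtr matches whole keys, so multibyte
    // UTF-8 sequences are replaced as units rather than byte by byte.
    return strtr($title, $table);
}

echo romanizeVietnamese('đã-có'), "\n"; // da-co
```

This assumes the title is in composed (NFC) form; decomposed input, where the accent is a separate combining character, would need normalizing first.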
 
Resolved:

After several attempts, I enabled romanization and it worked perfectly: all the special characters were removed.

Edit library/XenForo/Link.php.

Change this:
public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = false)

to this:
public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = true)
 