Fixed Non-English letters are not lowercased in URLs.

Discussion in 'Resolved Bug Reports' started by Cenk, Oct 5, 2010.

  1. Cenk

    Cenk Member



    It should be:

  2. Vincent

    Vincent Well-Known Member

    I'm not into SEO and stuff, but is this to optimize SEO?
  3. Cezz

    Cezz Well-Known Member

    It is about URL normalization. URLs, as a standard, should all be lowercase, among many other things...

  4. Shadab

    Shadab Well-Known Member

    Any reason why strtr was chosen over something like utf8_strtolower, for lowercasing the content titles?
    (I really hope it wasn't for performance reasons).
  5. Mike

    Mike XenForo Developer Staff Member

    strtr is actually faster, but it was mostly related to the fact that strtolower respects locales, and we don't want that. I am concerned about the performance of utf8_strtolower (without the mb version), though I haven't done any explicit tests. The link builders can be called a large number of times on a page.
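
    The locale-independent lowercasing described above can be sketched roughly as follows. This is only an illustration, not XenForo's actual code: the helper name asciiLowercase is hypothetical, and the source file is assumed to be saved as UTF-8. With the string form of strtr, only the bytes A-Z are mapped, so multi-byte letters pass through unchanged, which matches the behaviour reported in this thread.

```php
<?php
// Hypothetical helper (not part of XenForo): lowercase ASCII A-Z only.
// strtr() with two strings maps byte to byte and ignores the current locale.
// UTF-8 multi-byte sequences contain no bytes in the A-Z range, so they are untouched.
function asciiLowercase($title)
{
    return strtr($title, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

// The ASCII letters are lowercased; "İ" and "Ö" stay uppercase.
echo asciiLowercase('Hello İstanbul ÖZEL'), "\n"; // prints "hello İstanbul Özel"
```

    The trade-off is the one this report is about: the call is cheap and predictable, but non-English capitals are left as they are.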
  6. Shadab

    Shadab Well-Known Member

    Definitely slower. :( I had wrongly assumed that those utf8 helper functions didn't utilize any mbstring functions. I just did a quick profiling and it turns out utf8_strtolower is almost 3 times slower when there are no mbstring functions available to it.


    No mbstring:
  7. Nickolas

    Nickolas Member

    You can modify the UTF-8 library's accent arrays. Create a plugin and try the following code...

    $UTF8_LOWER_ACCENTS = array_merge($UTF8_LOWER_ACCENTS, array(
    'ı' => 'i',
    'ü' => 'u',
    'ö' => 'o'
    ));

    $UTF8_UPPER_ACCENTS = array_merge($UTF8_UPPER_ACCENTS, array(
    'İ' => 'I',
    'Ü' => 'U',
    'Ö' => 'O'
    ));
    Note that URL romanization must be active for this to take effect.

    And I think you have to override this code:

    public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = false)
    replace it with:

    public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = true)
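
    To see what a map like the one above does to a title, here is a small standalone sketch. It does not use the UTF-8 library or XenForo at all: the function romanizeTitle and the variable $accentMap are hypothetical names, the map holds only the six Turkish entries from this post, and the source file is assumed to be UTF-8.

```php
<?php
// Hypothetical map, using only the entries suggested in the post above.
$accentMap = array(
    'ı' => 'i', 'ü' => 'u', 'ö' => 'o',
    'İ' => 'I', 'Ü' => 'U', 'Ö' => 'O',
);

// Hypothetical helper: replace mapped letters, then lowercase ASCII only.
function romanizeTitle($title, array $map)
{
    // The array form of strtr() replaces whole multi-byte sequences in one pass.
    $ascii = strtr($title, $map);
    // Byte-wise ASCII lowercasing, independent of the current locale.
    return strtr($ascii, 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', 'abcdefghijklmnopqrstuvwxyz');
}

echo romanizeTitle('İstanbul Ömür Üzüm', $accentMap), "\n"; // prints "istanbul omur uzum"
```

    Letters that are missing from the map (for example "ş") would pass through unchanged, so a real map needs an entry for every character you want converted.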
  8. James Freeman

    James Freeman Member

    Wouldn't that take away the meaning of the word? It would be like saying "kat" for "cat" (just an example). I might be wrong there, but the URL is an extremely vital tool for search engines.
  9. James

    James Well-Known Member

    Not exactly. Search engines care more about content than URLs. We had this discussion earlier about the weight of keywords in URLs, and there isn't much weight in them.
  10. yavuz

    yavuz Well-Known Member

    Can someone put together a plugin that accomplishes this?
  11. Nickolas

    Nickolas Member

    Well, yes... That solution does not work for every language, but it works for Turkish. That's why I wrote that code.
  12. dbembibre

    dbembibre Active Member

    I have problems with Spanish accents too. Does this work? Is there any way to remove non-UTF-8 characters from the URL?
  13. Jarod

    Jarod Active Member


    Same with French accents: ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿ
  14. Dinh Thanh

    Dinh Thanh Well-Known Member

    Same with Vietnamese accents. We need to replace all accented Vietnamese characters with English characters.
    a á à ã ă â e é è ê i í ì ĩ o ó ò õ ô ơ u ú ù ũ ư y ý ỳ đ
    a a a a a a e e e e i i i i o o o o o o u u u u u y y y d
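
    The two rows above can be turned into a replacement map directly. This is a rough sketch only: the variable names are made up, the input is assumed to be UTF-8 with precomposed characters, and the list covers just the 29 letters shown here, not every Vietnamese letter with a tone mark.

```php
<?php
// Build a replacement map from the two aligned rows in the post above.
$from = explode(' ', 'a á à ã ă â e é è ê i í ì ĩ o ó ò õ ô ơ u ú ù ũ ư y ý ỳ đ');
$to   = explode(' ', 'a a a a a a e e e e i i i i o o o o o o u u u u u y y y d');

// array_combine() pairs the rows by position; both rows hold 29 entries.
$map = array_combine($from, $to);

// The array form of strtr() replaces each listed character in one pass.
echo strtr('cô bé đi xa', $map), "\n"; // prints "co be di xa"
```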
  15. Sasa

    Sasa Active Member

    Have you tried solving it this way?
  16. Dinh Thanh

    Dinh Thanh Well-Known Member

    We know about it, and this is a suggestion. A new version of XenForo should have this feature in its core.
  17. FabioCesar

    FabioCesar Member

    Same with Portuguese (pt-BR).
  18. FabioCesar

    FabioCesar Member


    After several attempts, I enabled romanization and it worked perfectly: it removed all the special characters.

    Edit library/XenForo/Link.php

    change this
    public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = false)

    to this
    public static function buildIntegerAndTitleUrlComponent($integer, $title = '', $romanize = true)
  19. Mike

    Mike XenForo Developer Staff Member

    1.2 now exposes options for Romanizing titles in URLs. (And it caches the work to be more efficient so, while it will have performance overhead, hopefully it will be mitigated where possible.)
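
    As a rough illustration of the caching idea only (this is not XenForo's implementation), a per-request cache could look like the sketch below. The name cachedRomanize is hypothetical, and the cache is keyed on the title alone, so it assumes the same map is used for every call.

```php
<?php
// Hypothetical helper: remember each romanized title for the rest of the request,
// so repeated link building for the same title does the replacement only once.
function cachedRomanize($title, array $map)
{
    static $cache = array(); // lives for the duration of the request

    if (!isset($cache[$title])) {
        // Assumption: $map never changes between calls, so the title is a sufficient key.
        $cache[$title] = strtr($title, $map);
    }

    return $cache[$title];
}

$map = array('ö' => 'o', 'ü' => 'u');
echo cachedRomanize('önsöz güz', $map), "\n"; // prints "onsoz guz"
echo cachedRomanize('önsöz güz', $map), "\n"; // served from the cache, same output
```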
