Fixed 🐛 with Turkish characters such as ö and ü

In that case, we're a little confused. I just tested the behaviour with "Romanize titles in URLs" enabled and the result is the same as in your first post. What are you expecting to happen? What specifically do you see as the problem with the URLs?

For example, I created a new topic called Çöğü. The address bar URL should be ...com/cogu.4/, not ...com/çöğü.4/.
 
The address bar URL should be: coegue.4/ if "Romanize titles in URLs" is enabled. It should be çöğü.4/ if the option is disabled. So exactly what is the issue you are facing?
 
Maybe this will give you a clearer idea of the problem.

Characters used: Türkçe karakter testi Aşk, Öküz, Çene, Türkiye, İlan, Ğündüz


Romanize titles in URLs: Disabled

url_1.webp


The URL that appears when I copy and paste from the browser: /threads/t%C3%BCrk%C3%A7e-karakter-testi-a%C5%9Fk-%C3%96k%C3%BCz-%C3%87ene-t%C3%BCrkiye-%C4%B0lan-%C4%9E%C3%BCnd%C3%BCz.206/
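
For reference, the percent-encoded sequences in that copied URL are just the UTF-8 bytes of the original Turkish characters, so nothing is lost with the option disabled. A minimal Python sketch (standard library only, not anything XenForo itself runs) that decodes the path and re-encodes it:

```python
from urllib.parse import quote, unquote

# The path exactly as copied from the browser's address bar.
copied = (
    "/threads/t%C3%BCrk%C3%A7e-karakter-testi-a%C5%9Fk-%C3%96k%C3%BCz-"
    "%C3%87ene-t%C3%BCrkiye-%C4%B0lan-%C4%9E%C3%BCnd%C3%BCz.206/"
)

# unquote() turns each %XX run back into the UTF-8 character it encodes.
decoded = unquote(copied)
print(decoded)

# quote() performs the reverse step; "/" is left alone by default.
reencoded = quote(decoded)
print(reencoded == copied)
```

The browser shows the decoded form in the address bar and puts the encoded form on the clipboard, which is why the two look different.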



Romanize titles in URLs: Enabled

url_2.webp

The URL that appears when I copy and paste from the browser: /threads/tuerkce-karakter-testi-ask-oekuez-cene-tuerkiye-lan-guenduez.206/

A lot of characters are lost here: some characters, such as ü, ç, ö and ğ, do not appear.


Our expectation here is that characters like these can be converted:
ı > i
ü > u
ö > o
ğ > g
ş > s
 
It would be nice if it was solved in the core. I hope the answer is different than last time:
 
We have implemented some changes in Beta 7 to address a number of concerns surrounding romanization/transliteration of strings. Some are reflected in this thread; others have been somewhat of an issue for some people in different scenarios over time.

Let's address the issue in this thread first.

Our previous code was based on a solution we've been using in some way or another for many years. A lot of the code was trying to solve problems and edge cases that at one point didn't have a reasonable native solution. We unraveled a few of the more redundant bits of that code in Beta 6 but for the most part the behaviours were generally unchanged from before.

Specifically, the code surrounding romanization, which is often used in URLs, behaved largely the same as it always had, and it turns out that its approach was fairly opinionated.

For example, ö and ü were transliterated to oe and ue. This is not necessarily incorrect, but it depends on your locale. In German and potentially some other languages it is correct, but for Turkish and some others it should be o and u.

We took a little bit of inspiration here from the amazing work produced by the Symfony team, specifically their String component, and we have entirely overhauled the process for both normalizing and transliterating strings. These changes will only apply if you have the intl extension available.

What does this mean in practice? Well, as you can see from the new title I've given this thread, the characters ö and ü are now transliterated to o and u.

But what about German language forums? Well, the specific rules we use for transliteration are now locale based (based on the default language of the forum).

If our default language on this forum was German, the characters ö and ü would be transliterated to oe and ue.
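
To illustrate the idea of locale-based rules, here is a rough Python sketch. It is not our actual implementation (that relies on the intl extension); the names LOCALE_RULES, FALLBACK, transliterate and slugify are made up for this example, and the rule tables are deliberately tiny. Locale-specific replacements are applied first, then any remaining accents are stripped via Unicode decomposition.

```python
import re
import unicodedata

# Hypothetical per-locale overrides, applied before the generic fallback.
LOCALE_RULES = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue",
           "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"},
    "tr": {"ı": "i", "İ": "I"},
}

# Letters that have no Unicode decomposition and need an explicit mapping.
FALLBACK = {"ı": "i", "ß": "ss", "ø": "o", "Ø": "O"}


def transliterate(text: str, locale: str = "en") -> str:
    """Replace locale-specific letters, then strip remaining accents."""
    rules = {**FALLBACK, **LOCALE_RULES.get(locale, {})}
    replaced = "".join(rules.get(ch, ch) for ch in text)
    # NFKD splits e.g. "ö" into "o" plus a combining diaeresis,
    # and the combining marks are then dropped.
    decomposed = unicodedata.normalize("NFKD", replaced)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def slugify(title: str, locale: str = "en") -> str:
    """Build a lowercase, hyphen-separated URL slug from a title."""
    ascii_text = transliterate(title, locale).lower()
    return re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")


print(slugify("Çöğü", "tr"))
print(slugify("Çöğü", "de"))
```

With "tr" as the locale the title Çöğü comes out as cogu, and with "de" it comes out as coegue, which matches the two behaviours discussed earlier in this thread.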

So, with that, I think we can safely call this fixed. But there's more!

On a semi-related note, over the years customers have sometimes expressed concern over the appearance of emoji in URLs. There's nothing invalid about emoji being in URLs. Behind the scenes they are URL encoded, and most browsers will display them as the correct emoji icon. This behaviour remains unchanged unless "Romanize titles in URLs" is enabled, in which case we now have this new option to control how emoji appear in URLs:

1715598230384.webp

By default, we will now convert the emoji into a string based on the emoji name. You may also decide to keep the emoji, something that previously wasn't possible when romanization was enabled. Or you may decide to remove the emoji entirely. You can see examples of these in the screenshot above.

And the end result can be seen in the URL of this thread:

Code:
https://xenforo.com/community/threads/bug-with-turkish-characters-such-as-o-and-u.221332/

The 🐛 character is converted to the word bug. And ö and ü are converted to o and u.
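
The three emoji options described above (convert to name, keep, remove) can be sketched in Python as follows. This is only an illustration under a couple of assumptions: the function name handle_emoji and the mode strings are hypothetical, the emoji name is taken from the Unicode character name, and the Unicode category "So" is used as a crude stand-in for "is an emoji" (it also matches symbols such as ©, and it ignores variation selectors and joined sequences).

```python
import unicodedata


def handle_emoji(text: str, mode: str = "name") -> str:
    """Convert, keep, or remove emoji-like symbols in a title."""
    if mode not in ("name", "keep", "remove"):
        raise ValueError(f"unknown mode: {mode}")

    out = []
    for ch in text:
        # Assumption: category "So" (Symbol, other) approximates emoji.
        if unicodedata.category(ch) == "So":
            if mode == "name":
                # e.g. the Unicode name of the bug emoji is "BUG".
                out.append(" " + unicodedata.name(ch, "").lower() + " ")
            elif mode == "keep":
                out.append(ch)
            # mode == "remove": the character is simply dropped.
        else:
            out.append(ch)

    # Collapse any doubled spaces left behind by the replacement.
    return " ".join("".join(out).split())


title = "Fixed 🐛 with Turkish characters"
for mode in ("name", "keep", "remove"):
    print(mode, "->", handle_emoji(title, mode))
```

Running the converted title through a slug step afterwards would give the kind of URL shown above, with the word bug in place of the emoji.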
 