Fixed Censored words with a replacement including an '*' can break search

shello

Member
A few months ago @Mike pointed out that XenForo removes censored words from search queries.
The call to `XenForo_Helper_String::censorString` does pass an empty string as the censor string, which has the desired effect of removing the censored word. However, this empty string is only used for censored words that have no replacement defined. That makes sense, since the Censoring options can also be used as a tool to correct frequent typos or mistakes.
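
To make the behaviour concrete, here is a toy Python model of per-word censoring as I understand it. It is not XenForo's code: the function name, the dictionary layout and the word "badword" are made up for illustration, and the only thing it tries to capture is that the censor string is used solely when a word has no replacement of its own.

```python
import re

def censor_string(text, censor_words, censor_char):
    """Toy model of per-word censoring (hypothetical, not XenForo's code).

    censor_words maps each censored word to its replacement, or to None
    when no replacement is defined. Only in the None case is censor_char
    used, repeated once per character of the matched word.
    """
    for word, replacement in censor_words.items():
        def substitute(match, replacement=replacement):
            if replacement is not None:
                return replacement  # explicit replacement wins, asterisks included
            return censor_char * len(match.group(0))
        text = re.sub(re.escape(word), substitute, text, flags=re.IGNORECASE)
    return text

censor_words = {"badword": None, "XenForo": "Xen****"}

# With an empty censor string, a word without a replacement disappears...
print(repr(censor_string("Buy badword", censor_words, "")))  # 'Buy '
# ...but a word with its own replacement keeps it, asterisks and all.
print(repr(censor_string("Buy XenForo", censor_words, "")))  # 'Buy Xen****'
```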

If a censored word's replacement includes one or more asterisk characters ('*'), these are included in the search query. There they no longer mean a literal asterisk: they act as a wildcard for both MySQL's Full Text Search and ElasticSearch, and possibly for other third-party search engines.
An effect of this unintended wildcard can be added complexity in the search, which makes it slower, and in some extreme cases leads XenForo to give up on the searcher for taking too long to respond.


One example would be the replacement of "XenForo" by "Xen****":
The search query "Buy XenForo" would be passed to the searcher as "Buy Xen****". In the case of MySQL's Full Text Search and ElasticSearch this works as a wildcard, making the search slower and returning content with "XenForo" but also "XenServer", "Xenoblade", etc.

Another example is setting the replacement string to a fixed number of asterisks instead of relying on the automatic replacement, which "leaks" the string length of the original censored word. In this case, if "XenForo" was set to be replaced by "****", a search query "Buy XenForo" would be passed to the searcher as "Buy ****".
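
A rough way to picture both examples is plain glob matching, used here only as a stand-in for how a search engine might expand a wildcard term. Real engines treat asterisks with their own rules, so this is a sketch under that assumption; the list of indexed terms is invented.

```python
from fnmatch import fnmatchcase

def matching_terms(query_term, indexed_terms):
    """Return the indexed terms that a glob-style wildcard term would match."""
    return [term for term in indexed_terms if fnmatchcase(term, query_term)]

# Hypothetical terms found in the search index.
indexed_terms = ["XenForo", "XenServer", "Xenoblade", "Buy", "forum"]

# "Xen****" behaves like a prefix search on "Xen".
print(matching_terms("Xen****", indexed_terms))
# ['XenForo', 'XenServer', 'Xenoblade']

# "****" on its own matches every term in this model.
print(matching_terms("****", indexed_terms))
# ['XenForo', 'XenServer', 'Xenoblade', 'Buy', 'forum']
```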


The obvious workaround for this issue is, of course, not using any asterisk character in the replacement of censored words.
 
I've added a new approach to ensure that all censored text gets stripped out in this case (which is what it was trying to do). Thanks.
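
For readers following along, the general idea of stripping every censored word from a query, whatever its replacement is, could look something like the Python sketch below. This is not the actual change made in XenForo; the function name and the input list are hypothetical.

```python
import re

def strip_censored_words(query, censored_words):
    """Remove every censored word from a search query, ignoring replacements."""
    for word in censored_words:
        query = re.sub(re.escape(word), "", query, flags=re.IGNORECASE)
    # Collapse the whitespace left behind by the removed words.
    return " ".join(query.split())

print(strip_censored_words("Buy XenForo", ["XenForo"]))  # Buy
```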
 