XF 2.3 Why does XF not consistently stop Chinese spam?

It allows words that are in my Spam Phrases list to be posted.
Without seeing a message, I am assuming they are using full-width characters, as found in CJK (Chinese, Japanese, Korean) text:

你應該嘗試一下這種叫做 ＭＥＤＳ 的新藥
(Roughly: "You should try this new drug called MEDS", with "MEDS" typed in full-width letters.)

I'm unsure how you would put this into a regex for the spam filter to pick up on, but a character class like this would reject full-width upper/lowercase letters and numbers:
[ａ-ｚＡ-Ｚ０-９]

(That is, without having to convert all of your spam phrases to full-width and add those to your list too.)
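To illustrate the idea of catching full-width text without duplicating every phrase, here is a minimal standalone sketch in Python. It is not XenForo code and says nothing about how XenForo matches spam phrases internally; the function names (`contains_fullwidth`, `to_halfwidth`, `hits_spam_phrase`) are made up for the example. It relies on Unicode NFKC normalization, which maps full-width letters and digits to their ASCII equivalents.

```python
import re
import unicodedata

# Full-width digits (U+FF10-U+FF19), uppercase (U+FF21-U+FF3A)
# and lowercase (U+FF41-U+FF5A) letters.
FULLWIDTH_ALNUM = re.compile("[\uFF10-\uFF19\uFF21-\uFF3A\uFF41-\uFF5A]")


def contains_fullwidth(text: str) -> bool:
    """Return True if the text contains any full-width letter or digit."""
    return FULLWIDTH_ALNUM.search(text) is not None


def to_halfwidth(text: str) -> str:
    """Fold full-width forms to ASCII using NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)


def hits_spam_phrase(text: str, phrases: list[str]) -> bool:
    """Hypothetical check: match plain phrases against the normalized text."""
    folded = to_halfwidth(text).casefold()
    return any(phrase.casefold() in folded for phrase in phrases)
```

With this approach a single entry such as `meds` would match both `MEDS` and `ＭＥＤＳ`, so the phrase list does not need full-width copies. NFKC also folds other compatibility characters (ligatures, for example), which may or may not be what you want.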
 
Here is a typical message

View attachment 313231

Here you can see my attempts to block it in Spam Phrases

View attachment 313232
Perhaps you can send "QQ" to moderation. That seems to be the channel they want to be contacted through.

I'm unsure how you could block all Chinese characters so that nothing gets through. I presume you don't use any language other than English on your board, so that could be a way to block it entirely. However, I can't point you to a regex that would match any such character and flag the message.
 
Code:
/[一-龠]+|[ぁ-ゔ]+|[ァ-ヴー]+|[々〆〤ヶ]+/u

That is what I use, and nothing ever gets through.
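If you want to try the pattern against sample messages before adding it to the spam filter, a quick way is to test it outside the forum. This is a rough sketch in Python, with the PHP delimiters and the `u` flag dropped because Python 3 string patterns are Unicode by default; `looks_cjk` is a made-up helper name.

```python
import re

# Same character classes as the posted pattern:
# kanji/hanzi range, hiragana, katakana, and a few Japanese marks.
CJK_PATTERN = re.compile(r"[一-龠]+|[ぁ-ゔ]+|[ァ-ヴー]+|[々〆〤ヶ]+")


def looks_cjk(message: str) -> bool:
    """Return True if the message contains at least one matching character."""
    return CJK_PATTERN.search(message) is not None


if __name__ == "__main__":
    for sample in ["你應該嘗試一下", "ひらがな", "カタカナ", "Plain English text, 123."]:
        print(sample, looks_cjk(sample))
```

Note that Korean hangul falls outside all four classes, so a message written only in hangul would not match this pattern.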
That pattern also covers Japanese, and there are many more Chinese characters than Japanese kanji:
The Dai Kan-Wa Jiten, which is considered to be comprehensive in Japan, contains about 50,000 characters. The Zhonghua Zihai, published in 1994 in China, contains about 85,000 characters.
So I'm wondering whether the remaining characters could slip through. But I think the bases are covered, since a message more likely than not has to use at least one of those 50,000 characters to make sense.
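On whether anything can slip through: the `[一-龠]` class runs from U+4E00 to U+9FA0, which is about 20,900 code points, so rarer ideographs in other Unicode blocks are not matched by it. The Python sketch below compares it with a wider set of ranges; the block boundaries are the standard Unicode ones, but the choice of which blocks to include and the helper name `slips_through` are my own assumptions.

```python
import re

# The ideograph class from the posted pattern: U+4E00 to U+9FA0.
NARROW = re.compile("[一-龠]")

# A wider class covering more of the Han ideograph blocks.
WIDE = re.compile(
    "["
    "\u3400-\u4DBF"          # CJK Unified Ideographs Extension A
    "\u4E00-\u9FFF"          # CJK Unified Ideographs (main block)
    "\uF900-\uFAFF"          # CJK Compatibility Ideographs
    "\U00020000-\U0002A6DF"  # CJK Unified Ideographs Extension B
    "]"
)


def slips_through(ch: str) -> bool:
    """True if ch is an ideograph in the wide set that the narrow class misses."""
    return WIDE.search(ch) is not None and NARROW.search(ch) is None
```

In practice the common characters all sit in the main block, so a real message is very likely to trip the narrow class anyway. If your regex engine supports Unicode script properties (PCRE does, with the `u` modifier), `\p{Han}` is a shorter way to express the wide version.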

The latter ranges are Japanese characters (hiragana and katakana), which I've never seen spammed. But I guess it's good to cover all bases.

Thanks for that regex, though. I'm now adding it preemptively.
 
Look for links instead. Very few spammers post just to type text. They want a click.
 
Yep. Keep it simple.
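For anyone who wants to see what a link check could look like, here is a deliberately simple sketch in Python. The pattern is my own and only catches `http://`, `https://` and `www.` prefixes; it is not what XenForo uses, and `contains_link` is a made-up name.

```python
import re

# Matches a URL scheme or a leading "www." followed by non-space characters.
LINK_PATTERN = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def contains_link(message: str) -> bool:
    """Return True if the message appears to contain a link."""
    return LINK_PATTERN.search(message) is not None
```

Spammers who write out a contact ID instead of a URL (the "QQ" case mentioned earlier) would not be caught by this, so it works best alongside a phrase or character check rather than instead of one.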
 