XF 1.5 Spam Management - detecting gibberish

ainwood

Member
We moderate any of a new member's first 5 posts that contain spam words (links included). In general, it works very well.

What we are seeing now is spammers who have worked this out: they post gibberish for their first 5 posts, then post the advertising crap. Examples:

hasdfeawhafeawhzcfvewahcxhasdfeawhafeawhzcfvewahcxhasdfeawhafeawhzcfvewahcx

wiefowfjweifjewofjwofjiwefjowwiefowfjweifjewofjwofjiwefjowwiefowfjweifjewofjwofjiwefjow

wejiofjwoefjowejfoweifowefjwoefjwefjwwejiofjwoefjowejfoweifowefjwoefjwefjwwejiofjwoefjowejfoweifowefjwoefjwefjw

wefwefowoefjowifjowfjowfiowfjowefjoiwwefwefowoefjowifjowfjowfiowfjowefjoiwwefwefowoefjowifjowfjowfiowfjowefjoiwwefwefowoefjowifjowfjowfiowfjowefjoiw

hndqhwidhqiwdhuqwhdiqwhndqhwidhqiwdhuqwhdiqwhndqhwidhqiwdhuqwhdiqwhndqhwidhqiwdhuqwhdiqwhndqhwidhqiwdhuqwhdiqwhndqhwidhqiwdhuqwhdiqw

GEAWHGFCDAQHGDCFEWAH

I was wondering whether there is a regex method or similar to send any post to the moderation queue if it contains a word longer than (say) 15 characters, or something even smarter that detects repeated character strings. I figure that such a rule would capture most of these.

I can do some other things like ban *asdf* or *hndq* or *wefwef*, but that will quickly become whack-a-mole.
 
Something like this may work:
Code:
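/[a-z0-9]{15,}/i

Where 15 is the minimum length of an unbroken run of letters and digits that triggers a match, so anything up to 14 characters passes. There is no leading ^, so the run is caught anywhere in the post rather than only at the very start.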
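
If you want to try a rule like this before putting it live, here is a rough sketch in Python (not XenForo itself) run against samples from the first post. It assumes an unanchored, case-insensitive pattern with the threshold of 15 suggested above. The second pattern, the helper name looks_like_gibberish and the "block of 4 or more characters, repeated at least three times in a row" rule are my own guesses at the "repeated character strings" idea, not anything built into XF.

```python
import re

# Assumed threshold from the thread: 15 or more unbroken letters/digits,
# matched anywhere in the text (no ^ anchor).
LONG_RUN = re.compile(r"[a-z0-9]{15,}", re.IGNORECASE)

# Hypothetical extra rule: a block of 4+ letters/digits that appears
# at least three times back to back (the block plus two or more repeats).
REPEATED_BLOCK = re.compile(r"([a-z0-9]{4,}?)\1{2,}", re.IGNORECASE)


def looks_like_gibberish(text):
    """Return True if either rule matches somewhere in the text."""
    return bool(LONG_RUN.search(text) or REPEATED_BLOCK.search(text))


samples = [
    "hasdfeawhafeawhzcfvewahcx" * 3,   # first example from the thread
    "GEAWHGFCDAQHGDCFEWAH",            # last example from the thread
    "wefwwefwwefw",                    # made-up short repeat, under 15 chars
    "Thanks, that fixed my problem.",  # made-up normal reply
]

for sample in samples:
    print(looks_like_gibberish(sample), sample[:30])
```

Bear in mind that long real words, hashes and long tokens inside URLs will also trip the length rule, so it is safer to use it to send posts to the moderation queue than to reject them outright.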
 