nrep
Well-known member
I'm looking for a way to scan a forum for spam posts that have already been made: retrospective spam checking, if you will.
Services like Akismet are great for checking posts for spam as they are made, but I'm considering converting a very old, large forum that predates such services. I'd like to make sure I've done everything possible to eliminate any spam from it.
Is anyone aware of a way to retrospectively scan hundreds of thousands of posts for spam? Because of the size of the task, I assume a local service will be required, rather than sending huge numbers of API calls out to a remote service.
Could something like SpamAssassin do this? Perhaps every post could be run through it in a batch job (similar to a cache rebuild) and assigned a spam score, stored in a new xf_post column, so that I can then browse the high-scoring posts and check them manually.
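To make the idea concrete, here is a rough sketch of the batch rescore in Python. It is only an illustration under several assumptions: an in-memory SQLite table stands in for the real database, the spam_score column is the hypothetical new column, and the spam_score() function is a made-up keyword and link heuristic standing in for a real filter such as SpamAssassin. The function names and word list are invented for the example.

```python
import re
import sqlite3

# Hypothetical word list; a real run would call a proper spam filter instead.
SPAM_WORDS = ("viagra", "casino", "cheap pills", "payday loan")
LINK_RE = re.compile(r"https?://", re.IGNORECASE)


def spam_score(message: str) -> float:
    """Toy heuristic: 2 points per spam phrase hit, 1 point per link."""
    text = message.lower()
    score = 2.0 * sum(text.count(word) for word in SPAM_WORDS)
    score += 1.0 * len(LINK_RE.findall(message))
    return score


def rescore_posts(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Walk xf_post in post_id order, one batch at a time, storing a score per post.

    Returns the number of posts scored.
    """
    last_id = 0
    total = 0
    while True:
        rows = conn.execute(
            "SELECT post_id, message FROM xf_post "
            "WHERE post_id > ? ORDER BY post_id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE xf_post SET spam_score = ? WHERE post_id = ?",
            [(spam_score(message), post_id) for post_id, message in rows],
        )
        conn.commit()  # commit per batch so an interrupted run keeps its progress
        last_id = rows[-1][0]
        total += len(rows)


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny stand-in table with the assumed spam_score column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE xf_post ("
        "post_id INTEGER PRIMARY KEY, message TEXT, spam_score REAL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO xf_post (post_id, message) VALUES (?, ?)",
        [
            (1, "Has anyone upgraded to the new version yet?"),
            (2, "Cheap pills and casino bonus http://example.com http://example.org"),
            (3, "See the manual at https://example.com/docs"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    rescore_posts(demo, batch_size=2)
    # List posts from highest to lowest score for manual review.
    for post_id, score in demo.execute(
        "SELECT post_id, spam_score FROM xf_post ORDER BY spam_score DESC"
    ):
        print(post_id, score)
```

Paging by post_id rather than by OFFSET keeps each batch query cheap on a table with hundreds of thousands of rows, and committing per batch means the job can be stopped and resumed, much like a cache rebuild.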
Has anyone seen or done anything like this before, or have any suggestions on how I could go about performing a task like this?