XRumer lacks certain features.
Its ability to look for topic-related forums, then look for topic-related threads/posts and reply to those, is impressive (the way you can use this reminds me of when I used to play with AIML, see AIML).
The ability to spam multiple forums all at once (firing many thousands of threads at once) is impressive, but it might be one of its downfalls (I can't go into why).
It can avoid many APIs for a lot longer now, since many users are adopting xblack.txt to avoid spam-reporting sites (so their proxies remain unreported for longer).
It can break QA CAPTCHA easily, since XRumer users use their local Textcaptcha to input the answer for any forum where XRumer fails the text CAPTCHA (this is then shared centrally, so all bot users can get past the QA).
In short, QA is no longer a viable method (unless you like playing Russian roulette, and updating the QA frequently).
But..
- For registration form filling, it doesn't really have a browser method to jump into, so it can't be automated as if it were a browser. On registration it only looks for certain ids/names/order of fields (although it does have a browser method for solving CAPTCHA).
- It also doesn't extend itself very well to plug-ins (if people could easily create plug-ins for it, all forums/CMS sites would be in a lot of trouble).
Once those two features are possible, forums are in a very vulnerable position... and I don't think it's that far away.
Imagine if users were creating pro-spam plug-ins for XRumer quicker than anti-spam plug-ins were created, that's a scary situation.
The most common CAPTCHA methods are already beaten (reCAPTCHA etc). This isn't because the sets are particularly easy to train against (in fact, it was very hard, because the sets were updated frequently) but because there are sets available to train against. This is the problem with all public CAPTCHAs that provide the sets, even for a multi-billion-dollar corporation such as Google. If the data is available, you can train a neural network to work out most image-based CAPTCHA sets (and this isn't just true for images, but JS/Flash games too). If a set is available, you can train against it, so the process can be automated... this is why customisation (not a common CAPTCHA) needs to be adopted to stop spam bots.
Those text files you mentioned: there are already many thousands publicly available. In fact, there are sites dedicated to them.
Just Google: XRumer linklist
Some link lists are pure XenForo link lists (don't be surprised if your own forum is listed on them).
It is impressive, and it will get worse. Knowing your enemy is important... knowing your enemy's next move and what they are likely to do is just as important (script pausing will now be adopted to get around the core registration timer).
For a system that has a valuable enough population, any weak mechanism* that is added to the core will be coded against. QA was coded against using the Textcaptcha mechanism; registration timers will now be coded against using script pausing (slowing thousands of threads down by a total of 10 seconds at the start, which is nothing).
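To make the timer point concrete, here is a minimal Python sketch. The names (MIN_FILL_SECONDS, passes_timer, added_wall_clock_delay) and the 10-second threshold are my own assumptions for illustration, not XenForo's actual implementation. It shows a fixed server-side minimum-fill-time check, plus the arithmetic for threads that each pause once at the start: the threads run in parallel, so the pauses overlap and the total cost is roughly one pause, however many threads there are.

```python
MIN_FILL_SECONDS = 10.0  # assumed threshold, purely illustrative


def passes_timer(form_served_at: float, form_submitted_at: float,
                 min_seconds: float = MIN_FILL_SECONDS) -> bool:
    """Return True if the form was held open for at least min_seconds."""
    return (form_submitted_at - form_served_at) >= min_seconds


def added_wall_clock_delay(threads: int, pause_seconds: float) -> float:
    """Extra wall-clock time when every parallel thread pauses once at the start.

    The pauses overlap, so the total is a single pause regardless of the
    number of threads (and zero if there are no threads at all).
    """
    return pause_seconds if threads > 0 else 0.0


# A submission 2 seconds after the form was served is rejected...
print(passes_timer(100.0, 102.0))          # False
# ...but one that simply waits out the timer is accepted.
print(passes_timer(100.0, 110.0))          # True
# Thousands of parallel threads still lose only about 10 seconds overall.
print(added_wall_clock_delay(5000, 10.0))  # 10.0
```

The check is 100% effective against anything that submits instantly, and 0% effective against anything that waits, which is exactly what makes it a weak mechanism in the sense used here.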
* When I say weak mechanism, I mean any mechanism that can be coded around. Weak mechanisms can be 100% effective, up until the time they are coded around. APIs are not inherently weak mechanisms (but are rarely even close to 100% effective); however, many APIs do rely on weak mechanisms to gain their data. Once all weak mechanisms have been coded against, we will be left with a smaller set of effective APIs.