Email Newsletter List Bounce Cleanup / Scrubber Services

ProCom

Well-known member
I recently converted a large forum to XF. When I run the "Email Users" script, the list of valid email addresses that I get from XF is 350,000. Unfortunately the list that my old provider was sending through SendGrid was only 270,000. That 80k difference is a concern to me and I'm worried that the data that I got from my previous provider didn't have all the proper bounce details.

I'm worried that my imported list isn't as clean as the provider said it would be, so I decided to run my "0 post members" (about 150k) through a list scrubber to reduce the potential bounces.

As a test, I submitted the same list of 25 email addresses (ones that I knew were a good mix of valid, invalid, catch-all, etc.) to three companies:

https://www.emaillistverify.com/
https://www.hubuco.com/email_verification
https://proofy.io/price
(https://kickbox.io/ was almost a consideration too)

The first two companies processed the lists pretty quickly. The third one is still "processing" a day later. I emailed support 2 days ago to find out what was going on and still no reply.

Results from the other two:

[Image: upload_2017-5-24_10-39-23.png]

hubuco is definitely way cheaper, but their results are not as robust.

Have any of you used any of these services?

I'm debating if I should run the rest of my list through this system too, and not just the zero-post members. Maybe expand it to all old members regardless of post count?
 

eva2000

Well-known member
I've only ever used emaillistverify and it works nicely for vB and XF email list scrubbing :)

as you can save the emaillistverify results to text file you can do all sorts of manipulation of the data to figure stuff out

first what the email fields mean
List and description of sub-statuses:
"ok" - all is ok. server is saying that it is ready to receive a letter to this address, and no tricks have been detected
"error" - server is saying that delivery failed, but no information about the email exists
"smtp_error" - SMTP answer from the server is invalid, destination server reported an internal error to us
"smtp_protocol" - destination server allows us to connect, but SMTP session was closed before the email was verified
"unknown_email" - server saying that delivery failed, and the email does not exist
"attempt_rejected" - delivery failed, reason similar to "rejected"
"relay_error" - delivery failed because a relaying problem took place
"antispam_system" - some anti-spam technology is blocking the verification progress
"email_disabled" - email account is suspended/disabled/limited and can't receive emails
"domain_error" - email server for the whole domain is not installed or is incorrect, so no emails are deliverable
"ok_for_all" - email server is saying that it is ready to accept letters to any email
"dead_server" - email server is dead, no connection to it exists
"syntax_error" - syntax error in email address
"unknown" - email delivery failed, but no information about why
profile my processed emaillistverify list (detailed.txt) to see which statuses make up my list
Code:
awk -F ',' '{print $2}' detailed.txt | sort -u
"dead_server"
"ok"
"ok_for_all"
"p_antispam_system"
"p_smtp_error"
"p_unknown_email"
"smtp_protocol"
"t_email_disabled"
"unknown"
"unknown_email"
figure out only bad emails
Code:
grep -Ev '"p_antispam_system"|"antispam_system"|"ok"|"ok_for_all"' detailed.txt | awk -F ',' '{print $1}' | sed -e 's|"||g' | sort > bademails.txt
20 bad emails on the list :)
Code:
wc -l bademails.txt
20 bademails.txt
find the bad users in xenforo.txt, a file exported from xenforo with username and email fields
Code:
while IFS= read -r u; do grep -F -- "$u" xenforo.txt; done < bademails.txt
will give me a list of all xenforo users with bad emails in xenforo.txt that match bademails.txt list I got from emaillistverify :)

Run a SQL query against xenforo user table for usernames and emails in formatted output required for your newsletter/sendy importing.

You just do the same in inverse for good emails and you will have a filtered text list of good emails for your newsletter/sendy etc :)
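A rough sketch of that inverse step. It assumes detailed.txt is a plain comma-separated file with the quoted address in the first column and the quoted status in the second, as in the commands above. Which statuses count as "good" is a judgment call: this version keeps only "ok" and "ok_for_all", which is an assumption of the sketch, and the sample rows and the name goodemails.txt are made up for illustration.

```shell
# Hypothetical sample in the same shape as detailed.txt: "email","status"
workdir=$(mktemp -d)
cat > "$workdir/detailed.txt" <<'EOF'
"zed@example.com","ok"
"bob@example.com","unknown_email"
"amy@example.org","ok_for_all"
"sam@example.net","p_antispam_system"
EOF

# Keep rows whose second field is exactly "ok" or "ok_for_all",
# then print the address without its quotes, sorted.
good_emails() {
    awk -F ',' '
        { sub(/\r$/, "") }                      # tolerate Windows line endings
        $2 == "\"ok\"" || $2 == "\"ok_for_all\"" {
            gsub(/"/, "", $1)
            print $1
        }
    ' "$1" | sort
}

good_emails "$workdir/detailed.txt" > "$workdir/goodemails.txt"
cat "$workdir/goodemails.txt"
```

Matching on the second field alone, instead of grepping the whole line, avoids accidentally keeping a row just because the string "ok" appears somewhere in the address.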
 

ProCom

Well-known member
Ugh, @eva2000 you're such a geek! ;) Of course, you're able to do amazing stuff with your queries that takes me about 3x longer in excel :(

I'm exporting the list of "Email Users" to a csv, which contains their email and username. Once I get the results back and sorted, I'm going to take all the invalid emails and figure out the best course of action.

Which user setting would you toggle in xf for these invalid emails? Email Invalid Bounced?

Would it work just to run a SQL query like: where email = xyz set email-stat to "Invalid Bounced" ?
 

eva2000

Well-known member

ProCom

Well-known member
Code:
UPDATE xf_user SET user_state = 'email_bounce' WHERE email = 'example@example.com';
You can make a copy of your database to test run queries against too
Thanks!!

What I usually do (since I'm an excel guy, not a shell / sql guy) is to take all the email addresses and the sql command, then do a "concatenate" that creates a SQL set command line for each item, and copy / paste them into PhpMyAdmin.

... definitely a hacked-together approach, but it's the best way I can do it without worrying that I'm doing something wrong.

For example, in cell A1 I have example@example.com, then I have this concatenate:
Code:
=CONCATENATE("UPDATE xf_user SET user_state = 'email_bounce' WHERE email = '",A1,"';")
... which spits this out:
Code:
UPDATE xf_user SET user_state = 'email_bounce' WHERE email = 'example@example.com';
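For anyone who would rather skip the spreadsheet step, the same statements can be generated from a text file with one address per line. This is only a sketch: the file names bademails.txt and bounce.sql and the sample addresses are placeholders, and it assumes MySQL's default escaping rules (single quotes doubled, backslashes doubled).

```shell
# Hypothetical input: one address per line
workdir=$(mktemp -d)
printf '%s\n' 'example@example.com' "o'brien@example.org" > "$workdir/bademails.txt"

# Turn each address into one UPDATE statement.
make_bounce_sql() {
    # Drop empty lines, escape backslashes and single quotes,
    # then wrap each remaining line in the statement.
    sed -e '/^$/d' \
        -e 's/\\/\\\\/g' \
        -e "s/'/''/g" \
        -e "s/.*/UPDATE xf_user SET user_state = 'email_bounce' WHERE email = '&';/" "$1"
}

make_bounce_sql "$workdir/bademails.txt" > "$workdir/bounce.sql"
cat "$workdir/bounce.sql"
```

The resulting file can be read over before anything touches the database, and then loaded with something like `mysql DATABASENAME < bounce.sql`, ideally against a copy first as suggested above.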
 

ProCom

Well-known member
WOW, based on my list so far, I'm going to have at least 20k - 25k emails to set as "bounce" :eek:

My questions:
  1. Is running my huge list (copy / paste the command) into PhpMyAdmin a viable way to set so many of these as "invalid / bounce"?
  2. I'm sure a lot of valid / active members are going to get caught in this process and have their emails marked as "invalid / bounce". Will that impact their engagement in the forum or present them with any notices (or should I create a notice for them)?
UPDATE: I just set a test user as email = invalid / bounce and got the automated notice:

"Attempts to send emails to bounce@email.com have failed. Please update your email.
Update your contact details"

... which makes it so their account is set so they can't post: "(You have insufficient privileges to reply here.)"


 

eva2000

Well-known member
i usually don't use phpmyadmin just log in via ssh and run on command line i.e.
Code:
UPDATE xf_user SET user_state = 'email_bounce' WHERE email = 'example@example.com';
becomes
Code:
mysql -e "UPDATE xf_user SET user_state = 'email_bounce' WHERE email = 'example@example.com';" DATABASENAME
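With 20k or more addresses, one statement per address gets unwieldy. One possible alternative is to group addresses into IN (...) lists so each statement updates a chunk at a time. The sketch below is just that, a sketch: the chunk size, the file name and the sample addresses are arbitrary, and any line containing a quote or backslash is skipped so it never needs escaping.

```shell
# Hypothetical input: one address per line
workdir=$(mktemp -d)
printf '%s\n' a@example.com b@example.com c@example.com > "$workdir/bademails.txt"

# Emit one UPDATE per chunk of addresses instead of one per address.
make_batched_sql() {
    # $1 = file with one address per line, $2 = addresses per statement
    awk -v size="$2" '
        function emit() {
            print "UPDATE xf_user SET user_state = \047email_bounce\047 WHERE email IN (" list ");"
            list = ""; n = 0
        }
        # Skip empty lines and anything containing a quote or backslash.
        $0 == "" || index($0, "\047") || index($0, "\\") { next }
        { list = list (n ? ", " : "") "\047" $0 "\047"; n++ }
        n == size + 0 { emit() }
        END { if (n) emit() }
    ' "$1"
}

make_batched_sql "$workdir/bademails.txt" 2
```

In practice the chunk size would be something like a few hundred. The output can be saved to a file, checked, and then fed to mysql in the same way as a single statement.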
 

semprot

Active member
Just came to this thread from Google.
If there were an add-on to verify emails in XF before sending, that would be very, very handy :(
 

Alfa1

Well-known member
The third one is still "processing" a day later. I emailed support 2 days ago to find out what was going on and still no reply.
proofy.io is way cheaper than the rest. To verify 260k email addresses, emaillistverify charges $375, which is pretty hefty.
If it doesn't work then it's no use though. Did they get back to you?
 

Joeychgo

Well-known member
It would be nice if someone made an addon using this to periodically verify email accounts on our forums
 

delicatebobster

Active member
proofy.io is way cheaper than the rest. To verify 260k email addresses, emaillistverify charges $375, which is pretty hefty.
If it doesn't work then it's no use though. Did they get back to you?
That's still expensive, I'm cleaning 100k emails for $90 :)
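To put the two quotes on the same footing, it helps to normalise them to a price per 1,000 addresses. A quick sketch using only the figures mentioned in this thread (the function name is made up):

```shell
# Price per 1,000 addresses = total price / (addresses / 1000)
per_thousand() {
    LC_ALL=C awk -v price="$1" -v count="$2" \
        'BEGIN { printf "%.2f\n", price / (count / 1000) }'
}

per_thousand 375 260000   # the $375 for 260k figure quoted above
per_thousand 90 100000    # the $90 for 100k figure
```

That works out to roughly $1.44 versus $0.90 per thousand addresses.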
 

Alfa1

Well-known member
Yes, it's expensive. But so are servers, getting thrown off Amazon, or getting blacklisted. I have hundreds of thousands of members, so it will cost quite a lot. But once the database is clean, I only need to check new email addresses, while old email addresses need to be re-checked once every 1-2 years or so. Once the list is checked for the first time, the number of valid email addresses goes down significantly, and I don't think there is reason to re-check invalid email addresses.
 

DragonByte Tech

Well-known member
Mild necro posting in this thread to let you all know I'm going to be implementing an on-site email validation service into the next version of DragonByte Mail for XenForo 2 (v4.1.0 Beta 4). I've discovered a PHP library that essentially does the same thing as these services; attempts to connect to the recipient server over SMTP and checks whether it sends back OK.

There are some limitations:
  • If you're running cPanel / WHM, this feature requires the "SMTP Tweak" and/or equivalent option in the ConfigServer Firewall to be disabled. Essentially, you need to be able to connect to remote servers on port 25. For this reason, the feature comes disabled by default and it is your responsibility to test this accurately before enabling the feature.

  • You need the "Bounce email address" setting filled out in order to use this feature. You do not need automatic bounce handling enabled (though it's well worth taking the time to set this up too!).

  • If your bounce email address is not on the same server as your XenForo forum, and the remote server doesn't designate your XenForo forum's server as a permitted sender, false positives will occur. It is your responsibility to test this accurately before enabling the feature.

  • If your server is on a blacklist and the recipient mail server uses those blacklists, or contains an internal blacklist, an error will be returned. The "take bounce action" cron job will skip validation log entries containing the word "blacklisted".

  • If your server is greylisted anywhere (meaning temporarily blocked), an error will be returned. The "take bounce action" cron job will skip validation log entries containing the word "greylisted" or the word "graylisted".

  • Apple mail services (@icloud.com / @me.com / @mac.com) will send back "Error: too many errors" no matter what. The "take bounce action" cron job will delete validation log entries containing the words "too many errors" in sequence.

  • The SMTP verification does not classify disposable email addresses or catch-all domains. If the server returns "OK" after setting the recipient, the email will be classed as valid. Such features are outside of the scope of this mod.

  • The cron job that validates email addresses will run 25 emails per batch, every 3 hours. Email addresses are re-verified every 6 months. This cannot be changed. These numbers were chosen to limit the potential for your server to be blacklisted for suspicious activity - after all, the method we're using to validate email addresses is exactly the same as what spammers use to harvest email addresses.
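For a sense of scale, the cron schedule in the last point translates into a fairly small daily throughput. The arithmetic below uses the batch size and interval stated above; the 350,000 figure is the list size from the first post, and 182 days is only an approximation of six months.

```shell
batch_size=25        # emails per cron run
interval_hours=3     # one run every 3 hours
per_day=$(( 24 / interval_hours * batch_size ))

# Days needed to get through a list of a given size (ceiling division).
days_for() { echo $(( ($1 + per_day - 1) / per_day )); }

# Addresses the cron job alone can cover in one ~6 month re-verification window.
per_window=$(( per_day * 182 ))

echo "$per_day emails per day"
echo "$per_window emails per 6 month window"
echo "$(days_for 350000) days for 350,000 addresses"
```

So the cron job on its own covers about 200 addresses a day, or roughly 36,400 per window, and a 350,000 address list would take 1,750 days. Larger forums would presumably lean on the manual batch button and the CLI script from the feature list.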
With that out of the way, here's the (tentative) feature list itself:
  • Email verification log, with look & feel identical to the existing "Bounced email log", that stores all emails that failed verification. Log entries are kept for six months.
  • Button in the log viewer to manually verify another batch of 25 emails.
  • CLI script for manually verifying any number of emails, with progress reports & timer similar to the CLI importer.
  • Automatically executes the "hard bounce" action for emails listed in the verification log as "Failed". This process is compatible with 3rd party add-ons that extend the XF2 email bounce processor.
I'm still tweaking this feature and testing it on production (as all good developers do) @ our site, so it'll be a little while before it's released. To be notified when it's released, click the "Watch" button on this resource: https://xenforo.com/community/resources/dbtech-dragonbyte-mail.5867/ :)


Fillip
 

Alfa1

Well-known member
That would be a feature that would make me buy this addon.
I've discovered a PHP library that essentially does the same thing as these services; attempts to connect to the recipient server over SMTP and checks whether it sends back OK.
What EmailListVerify does is this:
  • Email deduplication - Domains that match our existing database of invalid emails are removed
  • Domain validation - DNS entries for every email address are checked and validated
  • Spam-trap removal - Spam-traps and disposable emails are detected
  • Risk validation - Remove all of the domains that match our existing database of invalid emails
  • Syntax verification - Email address syntax is verified according to IETF standard
  • MTA validation - Checks if a Mail-Transfer-Agent has a valid MX Record
Any of the above that you can add would be really useful.
 

DragonByte Tech

Well-known member
Email deduplication - Domains that match our existing database of invalid emails are removed
Risk validation - Remove all of the domains that match our existing database of invalid emails
How ironic, maybe you should deduplicate your own feature lists! 😝

Kidding aside, is there no feature in XF2 that flags users with email addresses that are banned? I seem to recall this being a feature in vB so I would imagine something similar exists in XF2. I'll make a note to investigate that.

Domain validation - DNS entries for every email address are checked and validated
MTA validation - Checks if a Mail-Transfer-Agent has a valid MX Record
This is kinda maybe what this does, sort of... It doesn't check the validity of MX records per se. What happens is essentially this:
  1. Look up host name based on email domain
    1. If host doesn't exist, we're done here
  2. Look up DNS records for the host and fetch MX records
    1. If no MX records exist, we're done here
  3. Go through each MX record and attempt to connect on port 25
    1. If any of the MX record host names do not respond on port 25, we're done here
  4. Once a successful connection has been made, it pretends to send a new email from the bounce email handler email address, but cancels the operation and gracefully disconnects from the MTA after the RCPT TO command.
So I would say that counts as MTA validation and domain validation.
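The SMTP part of the steps above can be sketched roughly as follows. This is not the PHP library's code, just an illustration of the conversation in step 4. The function names are made up, using RSET to cancel is an assumption about how the operation is abandoned, and nothing here opens a network connection.

```shell
# Print the SMTP commands for one probe, CRLF-terminated as SMTP expects.
smtp_probe_commands() {
    # $1 = our host name, $2 = bounce (envelope sender) address, $3 = address to check
    printf 'HELO %s\r\n' "$1"
    printf 'MAIL FROM:<%s>\r\n' "$2"
    printf 'RCPT TO:<%s>\r\n' "$3"
    printf 'RSET\r\n'   # abandon the transaction; no DATA is ever sent
    printf 'QUIT\r\n'
}

# Map the server's reply to RCPT TO onto a coarse verdict.
classify_rcpt_reply() {
    case "$1" in
        2[0-9][0-9]*) echo valid ;;      # e.g. 250: recipient accepted
        4[0-9][0-9]*) echo retry ;;      # temporary failure, e.g. greylisting
        5[0-9][0-9]*) echo invalid ;;    # permanent failure, e.g. 550 no such user
        *)            echo unknown ;;
    esac
}

smtp_probe_commands forum.example.com bounce@forum.example.com user@example.org
classify_rcpt_reply "550 5.1.1 User unknown"
```

The MX lookup in step 2 could be done by hand with something like `dig +short MX example.org`. Note that a catch-all server answers 250 for any recipient, so "valid" here only means the server did not refuse the address.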

The rest of the features are outside the scope of the library and outside of the scope of this mod.


Fillip
 

DragonByte Tech

Well-known member
Unfortunately not. There are APIs that can be used though. https://www.google.com/search?q=detect+disposable+email+address
XF has an email blacklist, but no way to automatically blacklist thousands of ever changing disposable email domains.
Think you quoted the wrong part of the post :p

What I meant by that part of my post was, does the user not receive a notice their email is banned if their current email matches the banned email list?

I'm not sure I like the idea of relying on a 3rd party API to detect disposable emails. Furthermore, disposable emails are technically still valid, I don't think your sender reputation is penalised for sending to disposable email accounts.


Fillip
 

Alfa1

Well-known member
What I meant by that part of my post was, does the user not receive a notice their email is banned if their current email matches the banned email list?
If they try to register with a banned domain then they get a generic error.
If you ban an email address or domain then existing members do not get any notice.

I'm not sure if making the banned email list a mile long would cause issues.
I'm not sure I like the idea of relying on a 3rd party API to detect disposable emails.
There are tens of thousands of disposable email domains and new ones are added daily. It's not possible to keep track of them manually. We do try, because I had functionality developed for XF1 that only allows domains after they are manually checked and whitelisted. This is a daily task for us. Ever since adding a fake email checking service, this work has decreased by 90%.
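As a very rough illustration of the list-based side of this, a domain lookup against a local file might look like the sketch below. The file name and the domains in it are placeholders. A real list would have to come from a maintained source, which is exactly the part that is hard to keep up by hand.

```shell
# Hypothetical blocklist, one lowercase domain per line
workdir=$(mktemp -d)
printf '%s\n' 'throwaway.example' 'tempmail.example' > "$workdir/disposable_domains.txt"

# Succeeds if the address's domain is on the list.
is_disposable() {
    # $1 = email address, $2 = blocklist file
    domain=$(printf '%s' "${1##*@}" | tr '[:upper:]' '[:lower:]')
    grep -Fxq -- "$domain" "$2"     # fixed string, whole-line match
}

if is_disposable 'Someone@TempMail.example' "$workdir/disposable_domains.txt"; then
    echo "disposable"
else
    echo "not on the list"
fi
```

The whole-line match keeps a listed domain from matching as a substring of a longer, unrelated domain.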

Furthermore, disposable emails are technically still valid, I don't think your sender reputation is penalised for sending to disposable email accounts.
You could be correct. I'm not sure how strict amazon is on this matter though. They are very strict.
However, accepting fake/temp email defeats the purpose of email verification for accounts. Why have email verification at all when the email may be fake? It just invites spammers, shills, stalkers and trolls. Very few legitimate users have fake email and very few administrators are happy to accept fake email for this reason.
 