Extract ALL e-mails from GMAIL, GROUP and SORT by frequency contacted - email

I want to extract all e-mail addresses from GMAIL and GROUP and SORT by the email address. The result is a sorted list of email addresses I contact the most.
After some Googling
I have tried to export contacts (there is an option to select most contacted) - but this resulted in 20 most contacted. Does seem Gmail is counting ...
There is a script by labnol that exports ALL e-mail adresses ... but this is rather slow & does not do the count
question: How can I extract all e-mail addresses from GMAIL and GROUP and SORT by the email address?
Would an export to IMAP work (and count from there)? or is there a smarter way
Many thanks

You have two options
You can indeed extract headers (just headers, you don't need the full message body) from IMAP.
You can use the Gmail API. If you want to count who sends you messages, something like this should work.
Sample code using Gmail API in Python
# Retrieve a page of threads
threads = gmail_service.users().threads().list(userId='me',fields='threads(id)').execute()
# Print ID for each thread
pp = pprint.PrettyPrinter(depth=5)
if threads['threads']:
for thread in threads['threads']:
print 'Thread ID: %s has messages from senders:' % (thread['id'])
t = gmail_service.users().threads().get(userId='me',id=thread['id'],fields='messages/payload/headers').execute()
for msg in t['messages']:
for header_from in [v for v in msg['payload']['headers'] if v['name'] == 'From']:
# There should be only one, but sometimes From is missing
from_field_val = header_from['value']
print from_field_val
# TODO Extract email address and increment count
https://gist.github.com/regisd/3b4733e974483d7994fe

Related

MailKit: How to get the To email address when it is an alias

I am sending an email to alias#company.com which is an alias to a real mailbox address. I then connect to the real mailbox (let's say realmailbox#company.com) using MailKit and retrieve messages. When I inspect the To address, all I see is the realmailbox#company.com. How to I see the original alias address that the email was sent to?
For example:
var fullMessage = imapClient.Inbox.GetMessage(uid);
var recipients = fullMessage.To;
recipients only show the realmailbox#company.com, not the alias#company.com.
It sounds to me like your SMTP server is performing a string substitution on the alias before passing it along to the recipient mailbox making it impossible to get the info you are looking for.

Gmail conversation grouping (threads) causing HTML email format issues

I have an HTML email that I send to my users. It looks great in all major Web, Desktop, and Mobile clients even Gmail, EXCEPT when the email is included as part of a "conversation" in Gmail. In that case, some of the table background colors don't show, the text alignment is wrong, etc. Is there a way to prevent this?
Make sure your messages have different subject lines, that'll prevent them from being in a conversation. Or, if just testing, leave the subject line blank.
Tested answer: Change the sender of your email by inserting a random number. So instead of sending from foo#bar.com, send from foo+1#bar.com. Functionally, these are the same email address (replying will send to the same person).
In ES5:
randomNumber = Math.floor(Math.random()*100).toString();
from = 'Foo <foo+' + randomNumber + '#bar.com>';
In ES6:
randomNumber = Math.floor(Math.random()*100);
from = `My Company <foo+${randomNumber}#bar.com>`;

Send email to address alternate from "To:"

I am implementing a sort of dynamic mailing-list system in Rails. I am looking to basically send an email and have the recipient receive it in this form:
From: person#whosentthis.com
To: mailing-list#mysite.com
<body>
Basically, the challenge is how do I send an email to an address while defining a different To: header so that they can easily reply to the mailing list or just the original sender?
Mail vs. Envelope
In emails as in physical mails (paper sheet in paper envelope), the recipient on the envelope may differ from the recipient on the letter sheet itself.
As the mailman would only consider the envelope's recipient, so do mail servers.
That means, one actually can tell the smtp service to send and email to a recipient different than the one listed in the To: field of the emails header.
Trying This Out Manually
You can try this out, manually, for example, by using the sendmail command of postfix.
# bash
sendmail really_send_to_this_address#example.com < mail.txt
where
# mail.txt
From: foo#example.com
To: this_will_be_seen_in_the_to_field_in_email_clients#example.com
Subject: Testmail
This is a test mail.
This will deliver the email to really_send_to_this_address#example.com, not to this_will_be_seen_in_the_to_field_in_email_clients#example.com, but the latter will be shown in the recipient field of the mail client.
Specifying the Envelope_to Field in Rails
The ActionMailer::Base internally uses the mail gem. Currently, Apr 2013, there is a pull request, allowing to specify the envelope_to field on a message before delivery.
Until this is pulled, you can specify the fork in the Gemfile.
# Gemfile
# ...
gem 'mail', git: 'git://github.com/jeremy/mail.git'
You need to run bundle update mail after that, and, depending on your current rails version, maybe also bundle update rails in order to resolve some dependency issues.
After that, you can set the envelope recipient of a mail message like this:
# rails
message # => <Mail::Message ...>
message.to = [ "this_will_be_seen_in_the_to_field_in_email_clients#example.com" ]
message.smtp_envelope_to = [ "really_send_to_this_address#example.com" ]
message.from = ...
message.subject = ...
message.body = ...
message.deliver
Documentation
https://github.com/mikel/mail/pull/477
http://rubydoc.info/github/jeremy/mail/master/Mail/Message#smtp_envelope_to%3D-instance_method
https://github.com/mikel/mail
Why not use the BCC header for this? Put person#whoshouldreceivethis.com into BCC and mailing-list#mysite.com into TO.
Clearification:
There is no technical solution for this. The email headers do what they do, you can't influence them in that once the email is on the way.
I am sure, though, you can find a combined use of TO, CC, BCC and REPLY-TO that gives you what you want.

How do I check if an email address is valid without sending anything to it?

I have a client with 5000 emails from an old list he has that he wants to promote his services to. He wants to know which emails on the list are still valid. I want to check them for him - without sending out 5K emails randomly and then being listed as a spammer or something. Ideas?
You can validate the email via SMTP without sending an actual email.
http://code.google.com/p/php-smtp-email-validation/
You could also send emails out, and check for bounces.
bucabay's answer is the way forward. What a library like that essentially does is checking for existing DNS record for (mail) servers at specified domains (A, MX, or AAAA). After that, it do what's termed callback verification. That's where you connect to the mail server, tell it you want to send to a particular email address and see if they say OK.
For callback verification, you should note greylisting servers say OK to everything so there is no 100% guarantee possible without actually sending the emails out. Here's some code I used when I did this manually. It's a patch onto the email address parser from here.
#
# Email callback verification
# Based on http://uk2.php.net/manual/en/function.getmxrr.php
#
if (strlen($bits['domain-literal'])){
$records = array($bits['domain-literal']);
}elseif (!getmxrr($bits['domain'], $mx_records, $mx_weight)){
$records = array($bits['domain']);
}else{
$mxs = array();
for ($i = 0; $i < count($mx_records); $i++){
$mxs[$mx_records[$i]] = $mx_weight[$i];
}
asort($mxs);
$records = array_keys($mxs);
}
$user_okay = false;
for ($j = 0; $j < count($records) && !$user_okay; $j++){
$fp = #fsockopen($records[$j], 25, $errno, $errstr, 2);
if($fp){
$ms_resp = "";
$ms_resp .= send_command($fp, "HELO ******.com");
$ms_resp .= send_command($fp, "MAIL FROM:<>");
$rcpt_text = send_command($fp, "RCPT TO:<" . $email . ">");
$ms_resp .= $rcpt_text;
$ms_code = intval(substr($rcpt_text, 0, 3));
if ($ms_code == 250 || $ms_code == 451){ // Accept all user account on greylisting server
$user_okay = true;
}
$ms_resp .= send_command($fp, "QUIT");
fclose($fp);
}
}
return $user_okay ? 1 : 0;
I think you need to send the emails to find out. Also, this is pretty much exactly what a spammer is, thus the reason for getting put on spammer lists. Sending in bursts will help you hide this fact though.
You'll have to email them at least once.
Create a new email list. Send the old list an email with a link they need to click on to continue receiving messages (re-subscribe).
Send them all an email and collect all reply-to bounces on a real email account, then purge those bounced emails from your main list.
Send them all an HTML email, and one of the images is remotely hosted and requires a unique ID to request it that you set in each email. When your web server returns that image to their client, you can then consider that email as active. This is called a web bug, and will only work if the person automatically loads remote images in their client.
https://github.com/kamilc/email_verifier is a rubygem that will check that the MX record exists and that the SMTP server says the address has a valid mailbox.
You can use a paid service like Kickbox to do this as well.
You can consider the MailboxValidator service http://www.mailboxvalidator.com/ which should be adequate for your requirement. You can get either a bulk plan where you can upload a CSV file containing your email list or get the API plan if you require programmatic integrations.

How to make an email bot that replies to users not reply to auto-responses and get itself into mail loops

I have a bot that replies to users. But sometimes when my bot sends its reply, the user or their email provider will auto-respond (vacation message, bounce message, error from mailer-daemon, etc). That is then a new message from the user (so my bot thinks) that it in turn replies to. Mail loop!
I'd like my bot to only reply to real emails from real humans. I'm currently filtering out email that admits to being bulk precedence or from a mailing list or has the Auto-Submitted header equal to "auto-replied" or "auto-generated" (see code below). But I imagine there's a more comprehensive or standard way to deal with this. (I'm happy to see solutions in other languages besides Perl.)
NB: Remember to have your own bot declare that it is autoresponding! Include
Auto-Submitted: auto-reply
in the header of your bot's email.
My original code for avoiding mail loops follows. Only reply if realmail returns true.
sub realmail {
my($email) = #_;
$email =~ /\nSubject\:\s*([^\n]*)\n/s;
my $subject = $1;
$email =~ /\nPrecedence\:\s*([^\n]*)\n/s;
my $precedence = $1;
$email =~ /\nAuto-Submitted\:\s*([^\n]*)\n/s;
my $autosub = $1;
return !($precedence =~ /bulk|list|junk/i ||
$autosub =~ /(auto\-replied|auto\-generated)/i ||
$subject =~ /^undelivered mail returned to sender$/i
);
}
(The Subject check is surely unnecessary; I just added these checks one at a time as problems arose and the above now seems to work so I don't want to touch it unless there's something definitively better.)
RFC 3834 provides some guidance for what you should do, but here are some concrete guidelines:
Set your envelope sender to a different email address than your auto-responder so bounces don't feed back into the system.
I always store in a database a key of when an email response was sent from a specific address to another address. Under no circumstance will I ever respond to the same address more than once in a 10 minute period. This alone stopped all loops, but doesn't ensure nice behavior (auto-responses to mailing lists are annoying).
Make sure you add any permutation of header that other people are matching on to stop loops. Here's the list I use:
X-Loop: autoresponder
Auto-Submitted: auto-replied
Precedence: bulk (autoreply)
Here are some header regex's I use to avoid loops and to try to play nice:
/^precedence:\s+(?:bulk|list|junk)/i
/^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i
/^List-/i
/^Auto-Submitted:/i
/^Resent-/i
I also avoid responding if any of these are the envelop senders:
if ($sender eq ""
|| $sender =~ /^(?:request|owner|admin|bounce|bounces)-|-(?:request|owner|admin|bounce|bounces)\#|^(?:mailer-daemon|postmaster|daemon|majordomo|ma
ilman|bounce)\#|(?:listserv|listsrv)/i) {
That really sounds like something that's probably available as a module from CPAN, but I didn't find anything clearly relevant in five minutes of searching. Mail::Lite::Mbox::Processor looks like it might do what you want:
Mail::Lite::Message::Matcher is a
framework for automated mail
processing. For example you have a
mail server and you have a need to
process some types of incoming mail
messages automatically. For example,
you can extract automated
notifications, invoices, alerts etc.
from your mail flow and perform some
tasks based on content of those
messages.
but its docs are sparse enough that it isn't immediately obvious whether it provides those example functions itself or if you have to provide the code to drive them.
In any case, though, if you haven't already checked CPAN, that's where I would start if I wanted to do something like this.
My answer here only deals with bounces which is more straightforward.
Using DSN (Delivery Status Notification) identifier will help you detect a DSN/bounced message. It should go to Return-Path and not Reply-To.
Here's a sample of a typical DSN message. The header information includes the message id, content type has specific values (delivery-status) etc.
Not able to provide you any codes in perl, just my 2 cents of idea.
PS: Do note that not all mail servers or MTA conforms to this, but I guess most do.
There should be a standard way of dealing with this, but the problem is that you'd have to assume that systems that send auto-replies comply to that standard, when most the time, they just don't.
How do you get the address that you reply to? I hope you aren't using the From: header. Check the Reply-to: header first and if that doesn't exist, use the Return-path:.
But whatever you do, you will simply have to keep a log of what you sent to whom and throttle your bot to some sensible value of messages per time.