What tools do I need to set up a script that will email around 1,000 people a day? - email

The email addresses are stored in a database and the number of people to be emailed each day is variable. I'm not sure yet whether the emails would need to be sent individually or as a mass email. I want recommendations as to what language to use to do this and any other components necessary in a solution.
thanks

In this context, 1,000 people is a pretty small number. I probably wouldn't bother with a database, and I would do the whole thing with the scripting language of my choice (ksh or Lua), in either case piping output to sendmail. This is a very Unix-specific sort of solution.
One thing you may have to watch out for is to throttle the outgoing email—depending on your service provider, if you inject messages into the server at too high a rate, your IP address may be temporarily blacklisted. At home I tell postfix not to deliver more than 1 message per second to Verizon's server.
If I had to write platform-independent code, I would use the LuaSocket library to make a TCP connection directly with a SMTP server. They have a reasonably useful setup for building and sending RFC-compliant messages.
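The throttling idea can be sketched in Python, which is used here only because it appears elsewhere in this thread. This is a minimal sketch, not a definitive implementation: the helper names `build_message` and `send_throttled` and the one-second default delay are assumptions made for illustration.

```python
import time
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a simple plain-text message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_throttled(messages, send, delay=1.0, sleep=time.sleep):
    """Hand each message to send(), pausing `delay` seconds between messages.

    `send` is any callable that takes one message, for example the
    send_message method of an smtplib.SMTP connection.
    Returns the number of messages handed to `send`.
    """
    count = 0
    for msg in messages:
        if count:
            sleep(delay)  # pause before every message except the first
        send(msg)
        count += 1
    return count
```

With a real server this might be called as `with smtplib.SMTP("localhost") as s: send_throttled(msgs, s.send_message)`, where `localhost` stands in for whatever relay you actually use.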

Here is a C# implementation for this:
System.Net.Mail.MailMessage message = new System.Net.Mail.MailMessage("from@address.com", "to@address.com", "subject", "body");
System.Net.Mail.SmtpClient client = new System.Net.Mail.SmtpClient("host.address.com", 1234);
client.Send(message);

Just about any modern language can do this: Java, C#, VB.NET, PHP, Perl, Python, and many more.
Sending emails is such a common requirement that most languages and frameworks support it natively.
As for the requirement of up to 1,000 emails a day: that's not many, and the limiting factor will most likely be limits imposed by your ISP.
In short - use the language and platform you are most comfortable with and find out how email works in that.

As others have mentioned, it's easy to do this in just about any modern language. I'm a fan of Python, which features great scripting capabilities as well as a solid base for building applications. Python's library is well documented, and includes a number of sophisticated features (including the ability to do multipart MIME encoding).
This is from the examples:
# Import smtplib for the actual sending function
import smtplib
# Import the email modules we'll need
from email.mime.text import MIMEText
# Open a plain text file for reading. For this example, assume that
# the text file contains only ASCII characters.
fp = open(textfile, 'r')
# Create a text/plain message
msg = MIMEText(fp.read())
fp.close()
# me == the sender's email address
# you == the recipient's email address
msg['Subject'] = 'The contents of %s' % textfile
msg['From'] = me
msg['To'] = you
# Send the message via our own SMTP server, but don't include the
# envelope header.
s = smtplib.SMTP('localhost')
s.sendmail(me, [you], msg.as_string())
s.quit()

I want recommendations as to what language to use to do this and any other components necessary in a solution
You can do this in whatever language you feel comfortable with. .NET has some nice stuff built in, and you can probably do it in less than 20 lines of code.

Related

Why is the DATA in SMTP not null terminated?

I was reading the original RFC about the SMTP protocol and came across this section:
SMTP indicates the end of the
mail data by sending a line containing only a period.
Why did Postel decide to use the period as the terminator? Would it not be easier to use the already existing null terminator?
I see that he would not want the user's content to interfere with the protocol, but I would naively assume that a user is more likely to put a lone period on a line than a null character.
On top of that, wouldn't the mail client's implementation simply cut off the text if the user did put a null terminator in the mail contents?
IMHO: SMTP was designed a long time ago to be human-readable and human-writable.
It is pretty simple to test by typing simple SMTP messages by hand in a telnet session.
Being human-readable makes a null terminator a suboptimal choice.
SMTP's design is a fossil of the pre-spam era. It is bad by current standards, but it is so widely implemented and sufficiently good (after fixes) that any quick revolution is "not sufficiently urgent".
Extra info: see RFC 3030 for BDAT, an alternative to the DATA command.
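On the worry that a user's own line containing only a period would end the message early: SMTP handles this with "dot-stuffing" (RFC 5321, section 4.5.2). The client prepends an extra period to any line that starts with one, and the receiver strips it again. A minimal Python sketch of both directions follows; the function names are made up for illustration.

```python
def dot_stuff(body):
    """Prepare a message body for the SMTP DATA phase.

    Lines are normalised to CRLF, every line that starts with '.' gets
    an extra leading '.', and the end-of-data line '.' is appended.
    """
    lines = body.splitlines()
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    return "\r\n".join(stuffed) + "\r\n.\r\n"


def dot_unstuff(data):
    """Reverse dot_stuff(): drop the terminator and the extra periods."""
    if not data.endswith("\r\n.\r\n"):
        raise ValueError("missing end-of-data marker")
    # Cut off the final ".\r\n", split into lines, drop the trailing empty piece.
    lines = data[: -len(".\r\n")].split("\r\n")[:-1]
    return "\n".join(line[1:] if line.startswith(".") else line for line in lines)
```

So a body line consisting of a single period travels over the wire as two periods and can never be mistaken for the terminator.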

Add date header to incoming email with Sieve

I'm looking for a way to do in Sieve something that I've been doing in Procmail for years, which is to insert an unambiguous date header in incoming messages that makes it clear to me -- independent of buried "received" headers from possibly multiple servers and however my mail client interprets the date the message was sent -- when my server received the message. This is how I did it in Procmail:
# First create the "date_received" variable for my time zone:
date_received=`/bin/date -ud '+2 hour' +'%A %Y-%m-%d %H:%M:%S +0200'`
# Second, insert the header containing the date_received variable:
:0 fh w
| formail -i "X-Local-Date-Received: $date_received"
I found "addheader" (RFC 5293) which will, obviously, add a header, but due to something else I read (sorry, don't remember where) I believe that Sieve won't run the "date" command in the shell due to either a limitation or an intended (and understandable) preference not to run shell commands for security reasons.
Other possibly useful information: I'm doing this through Roundcube 1.3.6, but I have a feeling (also due to something I read) that Roundcube might overwrite a custom Sieve filter set if I edit the raw code within Roundcube. If necessary I'm quite happy to edit or create a Sieve configuration file on the server directly to achieve this for all users on the server, but having run Sendmail and Procmail for years I'm unsure of the best place to do this.
EDIT:
As a test in Roundcube I added this at the top of my Sieve filter set:
require ["fileinto","editheader"];
# rule:[test editheader]
if true
{
addheader "X-Test-Header" "This is a test header.";
}
I didn't actually add the line "require ["fileinto","editheader"];"; I just added "editheader" to the existing line at the top of the filter set, like so:
require ["copy","fileinto","regex","editheader"];
I expect this to add ...
X-Test-Header: This is a test header.
... to every incoming message, but Roundcube won't let me save it:
An error occurred.
Unable to save filter. Server error occurred.
A search for this error returns one related result, with no solution posted.
I'm not intending to focus on Roundcube, however. Like I said earlier, I'll add this Sieve filter from the command line if necessary.
The Pigeonhole Sieve Editheader extension isn't available by default. Per its documentation, you need to ensure it's added in your list of sieve extensions on the server:
plugin {
# Use editheader
sieve_extensions = +editheader
}
If you want to run arbitrary scripts using sieve on Dovecot like you can with procmail, then you can use its external programs plugins, configure in Dovecot which external programs you want to allow users to use, and then the users can use the "vnd.dovecot.execute" extension to run those programs. You might be able to use this to port over whatever scripts you used with procmail.
In the general case, the purpose of sieve is for users to be able to configure their own mail filtering, while it seems like you're trying to actually do something globally for the server. Dovecot should add its own Received header when it processes the mail, which is the standard method for marking when a mail system gets a message, so it's not clear to me why you're not just using that, or what changes you want to make to its default behavior. It may be that what you're looking to do may be better handled in your mail transport agent rather than in your mail delivery agent.
Here is my sieve script that converts Received to Date:
require "editheader";
require "regex";
require "variables";
if not exists "Date" {
if header :regex "Received" "^from[[:space:]]+.*[[:space:]]+by[[:space:]]+mail.mydomain.com[[:space:]]+with[[:space:]]+.*[[:space:]]+for[[:space:]]+.*;(.*)$" {
addheader :last "Date" "${1}";
}
}
Note that mail.mydomain.com is a stand-in for the actual mail server address, which means it only matches the header when the message was received on a specific mail server. I made this work with dovecot-2.3.5.1
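The same idea, taking the timestamp that follows the last semicolon of a Received header, can be sketched outside Sieve. This Python sketch assumes the header follows the usual "...; date" layout; the function name is hypothetical, and it does not check which server added the header the way the regex above does.

```python
from email.utils import parsedate_to_datetime


def received_timestamp(received_header):
    """Return the timestamp that follows the last ';' in a Received header."""
    _, sep, date_part = received_header.rpartition(";")
    if not sep:
        raise ValueError("no ';' found in Received header")
    # strip() also removes folding whitespace left over from header wrapping
    return parsedate_to_datetime(date_part.strip())
```

The result is a timezone-aware datetime, so it can be reformatted into whatever unambiguous form you prefer.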
You can use the date extension (RFC 5260). The "variables" extension is also required for ${1} to work:
require "date";
require "editheader";
require "variables";
if currentdate :matches "std11" "*" {
addheader :last "X-Local-Date-Received" "${1}";
}

building faster webmail script

I want to build a faster webmail.
I've built a small webmail script based on the PHP IMAP functions (connecting over the IMAP port),
but it takes a long time to connect and fetch the mail.
So I decided to read the mail manually, without connecting, using my own functions.
My functions go to the user's mail path and then use scandir
to get all the mails in the folder, and then read them manually.
Here is an example:
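<?php
$current_folder = 'new';
$virtual_user = 'someone';
$path_to_mails = '/home/user/mail/' . $virtual_user . '/' . $current_folder;
$all_emails = scandir( $path_to_mails );
foreach ( $all_emails as $mail_file ) {
    // scandir() also returns "." and "..", so skip anything that is not a file
    if ( ! is_file( $path_to_mails . '/' . $mail_file ) ) continue;
    // scandir() returns bare file names, so prepend the directory
    $file = file_get_contents( $path_to_mails . '/' . $mail_file );
    // Now I have the mail file;
    // I'll explode it and extract the important information from it
}
?>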
Now I get the emails without connecting to any port.
I think it is faster than the PHP IMAP functions,
but it still takes a long time to get and read the files!
Why are Gmail and Yahoo so much faster? Maybe they use a database to store their webmail files?
My questions are:
1 - Are my own functions really faster than the PHP IMAP functions, in theory? (Maybe I am wrong.)
2 - Where do Gmail, Yahoo, and Hotmail store their mail files: in a database or on disk? They are so fast, and
at the same time they let you connect to their servers via IMAP and fetch your mail via PHP, which suggests they store email files on disk.
Or maybe they use a database and have customized their webmail software.
3 - Is there any way to customize Postfix to store mail in a database instead of on disk?
4 - What is the best way to build a fast and robust webmail system?
Please do not ignore any of these questions.
I have been working on this project for 3 months and I'm tired!
1 - Yes.
2 - Depends on the provider. I assume Yahoo and Hotmail might be using actual IMAP servers but I don't think they disclose their infrastructure.
3 - This does not relate to Postfix. Postfix is just the MTA, after all: it doesn't store the mails, it just transfers them. So you can of course code your own database-driven service. Daunting task ;)
4 - Build on existing tools. The easiest choice is to build on top of Horde Webmail.
Webmail is a daunting task. The small snippet of PHP code you showed is really light years away from reality if you consider the complexity of modern webmailers. If you really want something working, you need to start with existing building blocks. Horde is the best option there because it is a development framework and provides efficient IMAP caching capabilities, a decent AJAX backend, and so on. Even so, your own webmail service will remain a daunting task.
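On reading mail files directly from disk: if the mailbox is in Maildir format (an assumption; the question only shows a folder named `new`), a standard library can already do the directory walking and parsing. A minimal Python sketch that lists a couple of headers follows; `list_headers` is a hypothetical helper name, and nothing here is claimed to be faster than IMAP.

```python
import mailbox


def list_headers(maildir_path):
    """Return a (sender, subject) tuple for every message in a Maildir.

    The order of the returned messages is not guaranteed.
    """
    # create=False: fail instead of silently creating an empty mailbox
    box = mailbox.Maildir(maildir_path, create=False)
    try:
        return [(msg["From"], msg["Subject"]) for msg in box]
    finally:
        box.close()
```

For a responsive webmail you would still want to cache this kind of listing rather than re-read every file on each page load, which is the sort of thing the existing frameworks already provide.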

How does the email header field 'thread-index' work?

I was wondering if anyone knew how the thread-index field in email headers work?
Here's a simple chain of emails thread indexes that I messaged myself with.
Email 1 Thread-Index: AcqvbpKt7QRrdlwaRBKmERImIT9IDg==
Email 2 Thread-Index: AcqvbpjOf+21hsPgR4qZeVu9O988Eg==
Email 3 Thread-Index: Acqvbp3C811djHLbQ9eTGDmyBL925w==
Email 4 Thread-Index: AcqvbqMuifoc5OztR7ei1BLNqFSVvw==
Email 5 Thread-Index: AcqvbqfdWWuz4UwLS7arQJX7/XeUvg==
I can't seem to say with certainty how I can link these emails together. Normally, I would use the In-Reply-To or References field, but I recently found that BlackBerrys do NOT include these fields. They only include the Thread-Index field.
They are base64 encoded Conversation Index values. No need to reverse engineer them as they are documented by Microsoft on e.g. http://msdn.microsoft.com/en-us/library/ms528174(v=exchg.10).aspx and more detailed on http://msdn.microsoft.com/en-us/library/ee202481(v=exchg.80).aspx
Seemingly the indexes in your example don't represent the same conversation, which probably means that the software that sent the mails wasn't able to link them together.
EDIT: Unfortunately I don't have enough reputation to add a comment, but adamo is right that it contains a timestamp, a somewhat esoterically encoded partial FILETIME. But it also contains a GUID, so it is pretty much guaranteed to be unique for that mail (of course the same mail can exist in multiple copies).
There's a good analysis of how exactly this non-standard "Thread-Index" header appears to be used, in this post and links therefrom, including this pdf (a paper presented at the CEAS 2006 conference) and this follow-up, which includes a comment on the issue from the evolution source code (which seems to reflect substantial reverse-engineering of this undocumented header).
Executive summary: essentially, the author eventually gives up on using this header and recommends and shows a different approach, which is also implemented in the c-client library, part of the UW IMAP Toolkit open source package (which is not for IMAP only -- don't let the name fool you, it also works for POP, NNTP, local mailboxes, &c).
I wouldn't be surprised if there are mail clients out there which would not be able to link Blackberry's mails to their threads. The Thread-Index header appears to be a Microsoft extension.
Either way, Novell Evolution implements this. Take a look at this short description of how they do it, or this piece of code that finds the thread parent of a given message.
I assume that, because the lengths of the Thread-Index headers in your example are all the same, these messages were all thread starts? Strange that they're only 22-bytes, though I suppose you could try applying the 5-bytes-per-message rule to them and see if it works for you.
If you are interested in parsing the Thread-Index in C# please take a look at this post
http://forum.rebex.net/questions/3841/how-to-interprete-thread-index-header
The snippet you will find there will let you parse the Thread-Index and retrieve the Thread GUID and message DateTime. There is a problem however, it does not work for all Thread-Indexes out there. Question is why do some Thread-Indexes generate invalid DateTime and what to do to support all of them???
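For comparison, here is a minimal Python sketch of the same kind of parsing. It rests on assumptions pieced together from the answers above, not on a verified specification: a 22-byte header whose first six bytes are treated as the high-order bytes of a FILETIME, followed by a 16-byte GUID, with five more bytes appended per reply. The function name and the returned keys are made up for illustration.

```python
import base64
import uuid
from datetime import datetime, timedelta, timezone

HEADER_LEN = 22  # assumed: 6 timestamp bytes + 16 GUID bytes
CHILD_LEN = 5    # assumed: each reply appends 5 bytes


def parse_thread_index(value):
    """Decode a Thread-Index header value into timestamp, GUID, reply count."""
    raw = base64.b64decode(value)
    if len(raw) < HEADER_LEN:
        raise ValueError("Thread-Index too short")
    # Assumption: pad the first six bytes with two zero bytes to get a
    # big-endian 64-bit FILETIME (100 ns ticks since 1601-01-01 UTC).
    filetime = int.from_bytes(raw[:6] + b"\x00\x00", "big")
    epoch = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return {
        "timestamp": epoch + timedelta(microseconds=filetime // 10),
        "guid": uuid.UUID(bytes=raw[6:HEADER_LEN]),
        "replies": (len(raw) - HEADER_LEN) // CHILD_LEN,
    }
```

Applied to the five example values in the question, this yields five different GUIDs and zero replies each, which is consistent with the observation that they do not represent one conversation.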

How to make an email bot that replies to users not reply to auto-responses and get itself into mail loops

I have a bot that replies to users. But sometimes when my bot sends its reply, the user or their email provider will auto-respond (vacation message, bounce message, error from mailer-daemon, etc). That is then a new message from the user (so my bot thinks) that it in turn replies to. Mail loop!
I'd like my bot to only reply to real emails from real humans. I'm currently filtering out email that admits to being bulk precedence or from a mailing list or has the Auto-Submitted header equal to "auto-replied" or "auto-generated" (see code below). But I imagine there's a more comprehensive or standard way to deal with this. (I'm happy to see solutions in other languages besides Perl.)
NB: Remember to have your own bot declare that it is autoresponding! Include
Auto-Submitted: auto-replied
in the header of your bot's email.
My original code for avoiding mail loops follows. Only reply if realmail returns true.
sub realmail {
my($email) = @_;
$email =~ /\nSubject\:\s*([^\n]*)\n/s;
my $subject = $1;
$email =~ /\nPrecedence\:\s*([^\n]*)\n/s;
my $precedence = $1;
$email =~ /\nAuto-Submitted\:\s*([^\n]*)\n/s;
my $autosub = $1;
return !($precedence =~ /bulk|list|junk/i ||
$autosub =~ /(auto\-replied|auto\-generated)/i ||
$subject =~ /^undelivered mail returned to sender$/i
);
}
(The Subject check is surely unnecessary; I just added these checks one at a time as problems arose and the above now seems to work so I don't want to touch it unless there's something definitively better.)
RFC 3834 provides some guidance for what you should do, but here are some concrete guidelines:
Set your envelope sender to a different email address than your auto-responder so bounces don't feed back into the system.
I always store in a database a key of when an email response was sent from a specific address to another address. Under no circumstance will I ever respond to the same address more than once in a 10 minute period. This alone stopped all loops, but doesn't ensure nice behavior (auto-responses to mailing lists are annoying).
Make sure you add any permutation of header that other people are matching on to stop loops. Here's the list I use:
X-Loop: autoresponder
Auto-Submitted: auto-replied
Precedence: bulk (autoreply)
Here are some header regexes I use to avoid loops and to try to play nice:
/^precedence:\s+(?:bulk|list|junk)/i
/^X-(?:Loop|Mailing-List|BeenThere|Mailman)/i
/^List-/i
/^Auto-Submitted:/i
/^Resent-/i
I also avoid responding if any of these are the envelope senders:
if ($sender eq ""
|| $sender =~ /^(?:request|owner|admin|bounce|bounces)-|-(?:request|owner|admin|bounce|bounces)\@|^(?:mailer-daemon|postmaster|daemon|majordomo|mailman|bounce)\@|(?:listserv|listsrv)/i) {
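Since the question welcomes other languages, the header checks listed in this answer can be sketched in Python. This is a minimal sketch under assumptions: the function name `should_reply` is made up, and an Auto-Submitted value of "no" is let through because RFC 3834 defines it as "not automatic", whereas the regex list above would reject any Auto-Submitted header.

```python
import re
from email import message_from_string

# Header-name patterns taken from the regex list in this answer.
LOOP_HEADER_NAME = re.compile(
    r"^(?:X-(?:Loop|Mailing-List|BeenThere|Mailman)|List-|Resent-)", re.I
)
BULK_PRECEDENCE = re.compile(r"^\s*(?:bulk|list|junk)", re.I)


def should_reply(raw_email):
    """Return True if the message shows no sign of being automatic."""
    msg = message_from_string(raw_email)
    # Any mailing-list, loop-marker, or resent header: stay silent.
    if any(LOOP_HEADER_NAME.match(name) for name in msg.keys()):
        return False
    if BULK_PRECEDENCE.match(msg.get("Precedence", "")):
        return False
    # Deviation from the list above: "no" is treated as a human message.
    auto = msg.get("Auto-Submitted", "no").strip().lower()
    if auto != "no":
        return False
    return True
```

The envelope-sender test would still need to run separately, since the envelope sender is not reliably present in the message headers.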
That really sounds like something that's probably available as a module from CPAN, but I didn't find anything clearly relevant in five minutes of searching. Mail::Lite::Mbox::Processor looks like it might do what you want:
Mail::Lite::Message::Matcher is a framework for automated mail processing. For example you have a mail server and you have a need to process some types of incoming mail messages automatically. For example, you can extract automated notifications, invoices, alerts etc. from your mail flow and perform some tasks based on content of those messages.
but its docs are sparse enough that it isn't immediately obvious whether it provides those example functions itself or if you have to provide the code to drive them.
In any case, though, if you haven't already checked CPAN, that's where I would start if I wanted to do something like this.
My answer here only deals with bounces which is more straightforward.
Using DSN (Delivery Status Notification) identifier will help you detect a DSN/bounced message. It should go to Return-Path and not Reply-To.
Here's a sample of a typical DSN message. The header information includes the message id, content type has specific values (delivery-status) etc.
I can't provide any Perl code, just my two cents.
PS: Note that not all mail servers or MTAs conform to this, but I guess most do.
There should be a standard way of dealing with this, but the problem is that you'd have to assume that systems that send auto-replies comply with that standard, when most of the time they just don't.
How do you get the address that you reply to? I hope you aren't using the From: header. Check the Reply-to: header first and if that doesn't exist, use the Return-path:.
But whatever you do, you will simply have to keep a log of what you sent to whom and throttle your bot to some sensible value of messages per time.
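Keeping a log of who was answered and when can be very small. The Python sketch below keeps the log in memory only and uses the ten-minute window mentioned in an earlier answer; the class name `ReplyLog` and its parameters are assumptions for illustration, and a real bot would persist the log in a database or file.

```python
import time


class ReplyLog:
    """Remember when each address was last answered (in memory only)."""

    def __init__(self, min_interval=600.0, clock=time.monotonic):
        self.min_interval = min_interval  # seconds between replies per address
        self.clock = clock                # injectable so it can be tested
        self._last_reply = {}

    def allow(self, address):
        """Return True and record the reply if `address` may be answered now."""
        key = address.strip().lower()
        now = self.clock()
        last = self._last_reply.get(key)
        if last is not None and now - last < self.min_interval:
            return False  # answered too recently: stay silent
        self._last_reply[key] = now
        return True
```

The bot would call `allow(sender)` just before sending and skip the reply when it returns False, which breaks a loop even when the other side sets no recognisable headers.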