I'm interested in sending "system status update" to a kind-of-large amount of users (say in the thousands or so, for the sake of discussion) over XMPP. The recipients would be completely arbitrary XMPP accounts spread over various domains, rather than being on one server/domain or anything like that.
Is there a good way to do so such "broadcasting", or will I have to resort to sending an individual message to each individual recipient? If I have to do the latter, is it likely or unlikely that such practices will be frowned upon by server operators? I'm not planning to send a lot of these messages; hardly even once per day.
Related
For a newsletter mailing, about 50,000 users, using pear, is it convenient to order the list by mail provider or leave it all randomly?
From my experiences using Exim to send large amounts of emails, performance will suffer heavily if your email queue grows too large. Depending on your hardware, once you have around 10,000 emails in the queue, you will start to see significant effects of bogosorting, where the server uses more CPU just juggling the queue than actually getting any useful work done.
One way of avoiding large queues, is of course to get the emails delivered as fast and efficient as possible. One of the many ways of achieving this, is to get Exim to deliver multiple emails over the same TCP connection. This in turn can be achieved by sorting the recipients by domain, but that is not enough! By default, Exim will try to deliver each mail it receives immediately and then each delivery will open its own connection (this gives fast deliveries for very small volumes but will drive server load through the roof for larger volumes). You need to first spool the mails to Exim, and then let a queue runner handle the actual delivery which will then automatically see all other emails in the queue that should go to the same host and will deliver them over the same connection.
Optimizing Exim for sending large amounts of emails is a very complex subject that cannot be solved with just a few magical tricks. Crucial configuration options are (but not limited to): queue_only, queue_run_max, deliver_queue_load_max, remote_max_parallel, split_spool_directory, but also fast spool disk, enough RAM, and making sure Exim starts new queue runners often enough (command-line option when starting the Exim daemon).
How this relates to PEAR escapes me, but perhaps this gives you some ideas of how to approach your problem.
I am currently developing an email server in C, and the end goal is to be able to send millions of emails to millions of people every day. Many organizations have email lists with large numbers of users that they email every week/month/etc.
The big question: how can I prevent the server and the emails from being marked as a spam? All of the SPAM-prevention stuff I've seen so far deals mostly with poor configurations, or at least does not require large numbers of emails to be send every hour. I have yet to see anything that addresses the scope of millions-of-emails-per-hour.
Here are some assumptions you can make:
EVERY single email sent is legitimate
all SPF records and MX records are accurate, up-to-date, and valid
all other common SPAM-prevention tactics are being used (reverse DNS is good, DKIM is used, return-addresses are valid, etc etc etc)
emails are one-to-one (ie, I'm not CC'ing 1000 gmail addresses; I'm sending one email to each address)
Here are some questions to get us moving in the right direction:
should I limit the number of emails sent to X emails per minute per domain? If so, how do sites like GMail and MailChimp get around this? note: there are no ISP restrictions; this is only an issue for the receiving mail server...
should I limit the number of connections to a domain at a given time? (eg, will Google think I'm a spam agent if I open 10/100/1000 simultaneous connections to gmail servers?)
how many bounce-backs (5xx errors on an address) should I accept for automatically removing that email from a subscription list? does this affect a server's spam rating?
is there anything else I should or should not do?
Final note: please remember this is a programming question, NOT a library question - I don't want to use someone else's service; we are writing our own for a reason. I'm looking for practical programming advice.
This is not a programming question, but here goes:
I strongly recommend you join your local mail operators mailing list, as well as "Spam-L" mailing list. Read the archives, and see what issues others are having.
The short answer is that destination servers can, and do, use all sorts of methods to try to prevent spam. THere are many things you will need to be aware of in order to have good deliverability, and those things change all the time.
First and most important, remember:
Free speech also includes free listening.
Nobody has to accept or transmit your mail.
Independent operators, businesses and individuals have a perfect right to refuse your mail for any reason or no reason. ISPs are limited only by their contracts with the customer and common-carrier laws, which generally give them broad discretion in what is considered spam and how they block it.
Their system, their rules. If you want your messages delivered, you must cooperate with receiving ISPs. This may mean jumping through hoops, or complying with requirements you think are stupid, or pointless.
Ensure you are not listed by SpamHaus. Most ISPs small and large use SpamHaus DNSBL service. Presence on one of SpamHaus' lists asserts their opinion that your mail meets their listing criteria. Because of SpamHaus' high reputation, most ISPs will simply block all mail you send based on their opinion.
Make sure you process unsubscribes.
Make sure you process non-delivery reports. You may not want to kill a subscription on the first NDR, as there can be intermittent network or server problems which can result in non-delivery, or even erroneous reports that an address is incorrect. But if you get several over the course of a month or two with no successful deliveries, you should kill the subscription.
Join a pay-for reputation service. These may require posting a bond which you may lose if you send Spam. SpamHaus offer one. There are others.
Get professional advice from someone like Return-Path. You will have to pay for this also.
Monitor. The hoops you have to jump through change all the time. Ensure you are aware of emerging deliverability problems.
Join feedback loops. most large ISPs offer feedback programmes where you can get feedback on how users are perceiving your mail, whether they are reporting it as spam, etc.
Ben had some good practical advice, but for others with this problem, here is what I have discovered in the past month:
Email is all about REPUTATION. You will never be able to throw together a server, ip, and/or domain name and expect to be able to send out millions upon millions of emails.
On Stack Overflow, we have a rating system (up and downvotes) to estimate the value/trust that person has with the SO community. But it takes time and effort to get points. It's the same with email - you have to start sending out small amounts of email that people actually open up and read (and would never mark as spam), and then slowly send out more and more every month until you reach the goal of millions and millions of emails.
Everytime someone "downvotes" - marks the email as spam, flags the domain, flags the ip address, deletes the email without reading it, etc - you get a hit against your reputation. You need to be continually monitoring and putting effort and best-practices into your reputation if you want to gain good standing with people.
So start small, expand in a stable and steady manner, and always keep a watchful eye out for abuse, misuses, good and bad feedback, or anything else that might affect your reputation.
It's not only possible, but very practical; you just need to give it time and effort.
we've got different processes that send mails in case of issues encountered (e.g. not enough permissions to perform an operation on a certain order item). This works fine to the point that sometimes identical messages are sent every 5 minutes. In our environment it is very difficult to synchronize the email sending on application layer (actually there are different applications sending out email, so we'd have to touch every application if we were to implement this inside application layer).
It would seem logical for me that filtering out mails (by duplicate subjects) is best done within the email layer, e.g. the application receiving the SMTP requests.
Yet we'd also prefer not to go down to SMTP layer by ourselves, rather use an existing service/application.
Is anybody aware of a web mailer (like googlemail) which does this kind of filtering? it would be ok for us the pay for such a service, so being "free as in beer" would be nice, but being not free is not a showstopper.
Thanks in advance
Holger
I find the idea of filtering duplicate e-mail message by the Subject: header quite worrisome. If they are produced by multiple applications, how can you be certain that the content of the messages is duplicated and that you are not unwittingly dropping important notifications?
The only unique feature of a message that can be used to filter out duplicates is its Message-ID: header. If that header is the same for two messages, then it's usually reasonable to assume that they are copies of the same original message - e.g. one received directly and one that was CC'ed to a mailing list.
That said, you can do pretty much anything you want on most SMTP servers - at least those that are based on a Unix-like OS. For example, Postfix can use custom shell scripts for filtering.
You can, for example, use formail to extract the body of each message and produce its
MD5 hash. Comparing the message body hashes along with the Date:, Subject:, From:, To: and Cc: headers at the same time is a good start to detect real duplicates.
What are the risks, if any, of sending out massive amounts of emails over SMTP? Specifically, this question is regarding the risks of being labelled/blacklisted as spammers of spoofers.
Our mails are legitimate, however. Our system needs to send out reminders to our corporate users on a daily basis, which may number into the thousands, say. Our worry is that with such a setup, our domain might end up being blacklisted by the receiving organisation, thus rendering our reminder service useless.
Does anyone have any information on what might be a "safe" volume of emails to send out to avoid being blacklisted? Or can we just churn out emails with abandon?
You may be able to contract a third-party organization to take care of this for you. I know there's a lot of "direct marketing" companies that will let you use their API to send mass email (newsletters, etc). They can do the work of negotiating to get off blacklists - that's what you pay them for.
I haven't used Sendloop and don't know if it has the functionality you want, but it's probably a good example.
See: How to conduct legitimate email campaigns
In your reminder service, just follow some basic spam guidelines. Identify where the message came from, why the user got it, the link to "opt-out" or discontinue the reminders, and you'll be fine. Any blacklists you do get on will certainly remove you if you have this information in your messages.
Additionally, should you get blacklisted for some reason, have another server on a different network that you can use as a backup should your primary server get blacklisted temporarily for any reason.
Oh, and one final note - usually your entire "domain" (i.e. whatever.com) doesn't get blacklisted. Specific IP addresses or specific servers are usually what get blacklisted.
As long as you're mailing over clean IPs and domains you should be fine. You say your mailings are "legitimate" so there's no reason to worry about ISPs blocking you.
However, as you also mentioned, the volume can become a challenge. Broadly speaking, sending "thousands" of messages should be a non-issue. But... hundreds of thousands, say 250K messages a day on up, is when you start to qualify as a "high-volume" sender.
Once you start sending at this bulk level, you must run a tight ship. ISP filters will look for any clue that you're a black-hat mailer/spammer and will promptly block your deployment if anything looks off.
Make sure your list(s) are spic-and-span; all bounces, duplicates, typos and honey-pots have been scrubbed-out. Your IPs have been properly warmed-up, your DNS and domains are clean and properly registered and you remain responsive to your list recipients.
Basic common sense and following through on all the tiny, simple but crucial details goes a long way.
I'm implementing a email newsletter sender service using .NET and Windows Server technologies. Are there comprehensive guidelines which could help avoiding emails being trapped by spam filters and other mechanisms?
They should cover all aspects of (legal) bulk mail sending: SMTP configuration, DNS, HTML content, images, links within content etc. A simple example: is it better to embed images or load them from a server?
It would be great if you could provide some empirical data to show the efficiency of some measures taken.
Although I don't have a definitive answer, I think this is a very important question.
Here are few tidbits I know about it
Choose a clean hosting/smtp server. IP addresses of spamming SMTP servers are often black-listed by other ISPs.
Send a simple introductory email to every subscriber, asking them to add your sender address to their safe list.
Be very prudent in sending to only those people who are actually expecting it. You wouldn't want pattern recognizers of spam filters learning the smell of your content.
If you don't know your smtp servers in advance, its a good practice to provide configuration options in your application for controlling batch sizes and delay between batches. Some servers don't like large batches or continuous activity.
Unless you have a very specific reason to host the newsletter yourself, I think you'd be much better off using a third party service. There are lots out there, and some are very cheaply priced.
It'll save you on development work
(no point in re-inventing the
wheel).
Their system will handle all
the unsubscribe link stuff that you
need to include in email newsletters
to comply with CAN SPAM laws or
whatever.
They handle the spam
reports that you will inevitably get
if you have a list of any non-trivial size.
They keep records of who signed up,
how they signed up, and their IP
address, and can present those on
receipt of a spam report to prove
that their service wasn't sending
out spam.
You can use double-opt in
(or confirmed opt in), for extra
evidence to prove that the people
you're sending emails to actually
signed up to receive them.
If you really do need to host it yourself I'd suggest you search the web for "email deliverability". Things that are known to help include properly set up SPF records, DomainKeys/DKIM, correct DNS settings (reverse DNS especially - best to just use an online service to check your DNS settings). You can test a lot of these things by sending an email to check-auth#verifier.port25.com.
It's best to avoid using spammy words in your email - always a bit of guesswork this but you some words can trip filters.
But I'd guess that by far the most important thing is to be sending your email from a trusted server that maintains good relationships with ISPs (i.e. ensuring that ISPs don't think that the server is sending out spam). This is a big reason why it's much much easier to get a third party to handle everything for you.