Detect non-existing e-mail address without sending a message - email

I've read everything I could find on verifying e-mail addresses. The widely encountered solution is this, and it doesn't work (for one, actual nslookup output differs significantly from what the article shows, so I don't get an actual address to telnet to).
But then I thought: I don't need to verify the address. I just want to detect clearly bogus addresses (addresses for which sending a message would yield a "delivery failed" response). Is this possible in principle, and can it be implemented using C++ sockets or the Java networking API in particular?

To check that the recipient's domain is recorded in the DNS with a meaningful MX (mail exchange) record, you could use dig in place of nslookup, depending on which operating system and tools you use. For foo@bar.com,
$ dig bar.com MX
Possibilities for detecting bogus e-mail addresses are typically limited, though. How much you can learn largely depends on how "generously" the MTA offers this information. Most don't, these days. The SMTP protocol includes some verbs you could use for this, such as VRFY. On the other hand, spammers could do just that, hence … (That's one reason why a mail loop is run in order to detect valid e-mail addresses fairly reliably: as I'm sure you know, a verification string is embedded in a message, to be sent back or passed via URL to some web service.)
SMTP, being a text protocol, would be used via some "transport layers" underlying higher level APIs like JavaMail. I'd look for programmability of these with the programming language used. Typically, there is some socket library, for sending and retrieving lines of text.
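A minimal sketch of the probing idea, in Python rather than C++ or Java purely for brevity. It assumes you have already picked an MX host from the dig output; the probe sender address and the function names are made up for illustration. The session issues MAIL FROM and RCPT TO and then quits without ever sending DATA, so no message is delivered. Many servers accept every recipient at this stage (or refuse probes altogether), so only a permanent 5xx reply is treated as a signal here.

```python
import smtplib


def classify_rcpt_reply(code):
    """Map the SMTP reply code for RCPT TO to a coarse verdict."""
    if 500 <= code < 600:
        # Permanent failure (e.g. 550): the server refuses this recipient.
        return "bogus"
    # 2xx only means "accepted for now" (a bounce may still follow), and
    # 4xx is temporary (greylisting etc.), so neither proves anything.
    return "unknown"


def probe_recipient(mx_host, recipient, probe_sender="probe@example.org"):
    """Ask mx_host whether it would accept mail for recipient, without DATA.

    probe_sender is a hypothetical address; use one you control.
    """
    with smtplib.SMTP(mx_host, 25, timeout=10) as smtp:
        smtp.helo()
        smtp.mail(probe_sender)
        code, _message = smtp.rcpt(recipient)
        # Leaving the with-block sends QUIT; nothing is transmitted as mail.
    return classify_rcpt_reply(code)
```

Calling probe_recipient("mx.bar.com", "foo@bar.com") (host name hypothetical) would return "bogus" only when the server answers RCPT TO with a 5xx code.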

Related

Why 2 different servers (incoming and outgoing) in mail servers?

I have a basic question about mail servers. Usually a general server (not necessarily a mail server) handles all kinds of related requests, i.e. interaction with the client in both directions (sending messages to and receiving messages from the client). With mail, however, there are two different servers: an outgoing server for outgoing messages, which speaks SMTP, and an incoming server for incoming messages, which speaks POP3/IMAP. Why is that? For that matter, couldn't the two protocols be combined into a single protocol that handles message flow in both directions? Also, are these two servers typically hosted on the same machine in an ordinary business?
Because the protocols are old. In the old Unix tradition, and ignoring UUCP for now:
SMTP came first, to transfer mail between sites, and mail was read locally on the server/network using a text mode client when logged in. There was no need to fetch mail, you used the 'mail' command, and it accessed your local mail spool (a file containing your mail, sitting on the file system, that the SMTP server appended to).
Later on, came the development that people wanted to read mail on their own hosts so a protocol was invented to serve that. The POP server would read that same spool file, and allow you to download all your messages to your intermittently connected client computer. SMTP was reused for sending mail because it already existed and was easily adapted for that purpose.
These days, there are usually three servers of note: SMTP submission server, SMTP transmission server, and IMAP. The submission server is where end users of the service submit their e-mail to be forwarded onto the final host, for example the Google server a Gmail user submits their email to (on port 465 or port 587, usually, with authentication). The transmission server is responsible for delivering and receiving email between sites (eg. when Yahoo and Gmail exchange mail for their customers with each other, on port 25). And IMAP is used by an end user to fetch their email.
These three services, on large sites, are generally served by separate servers on separate hosts. On very large services, like Gmail, they are separate pools of servers.
On a small business host, they were often just one machine.
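To make the split concrete, here is a rough Python sketch of how a client talks to two of those services: it submits mail over SMTP on port 587 and reads mail over IMAP. Host names, credentials and function names are placeholders, and port 993 (IMAP over TLS) is my assumption, since the text above doesn't name an IMAP port. The server-to-server hop on port 25 is not something an end-user client does, so it only appears as a constant.

```python
import imaplib
import smtplib
from email.message import EmailMessage

SUBMISSION_PORT = 587  # end user -> provider, authenticated
RELAY_PORT = 25        # provider -> provider, not used by end-user clients
IMAPS_PORT = 993       # end user fetches mail (assumed: IMAP over TLS)


def build_message(sender, recipient, subject, body):
    """Create a simple text message to hand to the submission server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def submit(host, user, password, msg):
    """Outgoing direction: authenticated SMTP submission."""
    with smtplib.SMTP(host, SUBMISSION_PORT, timeout=10) as smtp:
        smtp.starttls()
        smtp.login(user, password)
        smtp.send_message(msg)


def count_inbox(host, user, password):
    """Incoming direction: ask the IMAP server how many messages INBOX holds."""
    with imaplib.IMAP4_SSL(host, IMAPS_PORT) as imap:
        imap.login(user, password)
        _status, data = imap.select("INBOX", readonly=True)
        return int(data[0])
```

The two functions never share a connection or even a protocol, which is the separation the question is asking about.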
There are newer and more integrated protocols. For example, both EAS and EWS have mail fetch and submission contained in the same protocol.
Personally, I regret that almost nobody uses UUCP anymore :)
The separation of shipping, management and collection of mail has a historical, philosophical and design basis.
This is in line with the spirit of the Unix world: "do one thing but do it right" and "make everything as simple as possible".
To see how complex e-mail is and how much technology it covers, read here:
https://en.wikipedia.org/wiki/Email
and then see the relevant RFCs in the footnotes for details.
It is difficult to implement a monolithic program that takes into account all the nuances of e-mail while keeping it simple (the "Simple" in SMTP). Such a monolith would be a nightmare to develop, maintain and administer, and e-mail has changed many times since the 1970s.
As for the second question, there is no reason to use the same physical machine, but there are also no contraindications other than the amount of resources available.

Postfix, isolate multiple sites' mail headers so if one gets blocked/blacklisted, the others sharing the server don't also get blacklisted

I have a few separate sites on a server with a single IP.
The sites shouldn't ever send spam, but the customers are free to send emails from their sites so I have no way to prevent them from doing so.
What I'd like to do is when sending the emails via postfix, somehow separate the sites in the headers sent out.
Previously I've set up an IP for each, but I'm trying to avoid doing this.
I've also found that with /etc/postfix/header_checks I can remove headers, but I'm not sure whether removing specific headers will cause issues.
One thing to consider here is that blacklisting is usually based on IP addresses. Separate headers won't help much there. The reason for this is that (a) it's simple and (b) many spam sending servers have been compromised and taken over by an attacker, using custom mail sending software, so headers don't matter anymore.
Different headers might still have their merit though, as spamfilters will check those. It just won't help if your server's IP gets blacklisted.
I guess rolling out DKIM might help here, it would give you artificial separation of domains using different domain keys for each. There are some good tutorials on the net on how to set it up with OpenDKIM.
A better solution, used by big mail providers like GMX, is to send mail from a separate IP if it looks like it's spam. The setup for this is a little complicated, as it requires you to scan outgoing mail with spamassassin (or something similar) and to route mail depending on the respective spam score. Not an easy task. Marking spam as such, without sending it through a separate IP, might be enough to convince the other side that you try to prevent spam being sent from your server, but this really depends on their spam filter.
The way your server identifies itself during an SMTP conversation is through the HELO command. The smtp_helo_name parameter specifies the name used there. One could try to set up a transport mechanism to use a different name for each sender domain. I'm honestly not sure how difficult that would be.
If you are still set on changing headers: the header_checks tables allow you not only to remove headers, but also to alter them via regular expressions.
Use the REPLACE command to do so. Example:
/^(Message-ID:.*)@your-domain.example(.*)/ REPLACE ${1}@other-domain.example${2}
I'd advise against it, though. It provides too little gain for the effort of finding and setting up the right rules.
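If you want to try out such a rewrite rule before putting it into Postfix, the same substitution can be mimicked in a few lines of Python. This is only an illustration of what the regular expression does to a header line, not how Postfix applies it internally. It assumes Message-IDs of the form <id@your-domain.example>; the function name is made up, and unlike the rule above the dots are escaped so they match literally.

```python
import re

# Same shape as the header_checks rule: keep everything before and after
# the domain, swap only the domain part of the Message-ID.
MESSAGE_ID_RULE = re.compile(r"^(Message-ID:.*)@your-domain\.example(.*)")


def rewrite_message_id(header_line, new_domain="other-domain.example"):
    """Return header_line with the Message-ID domain replaced, if it matches."""
    return MESSAGE_ID_RULE.sub(
        lambda m: f"{m.group(1)}@{new_domain}{m.group(2)}",
        header_line,
    )


print(rewrite_message_id("Message-ID: <20240101.abc123@your-domain.example>"))
print(rewrite_message_id("Subject: untouched@your-domain.example"))
```

Lines that are not Message-ID headers come back unchanged, which is also what you want from the real rule.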

How can I recognize different applications in NetFlow dumps?

I'm trying to discover what kinds of applications are used on my network (e.g. Facebook, YouTube, Twitter etc.). Unfortunately I can't do Deep Packet Inspection; all I have are NetFlow traces. I was thinking about resolving the IP addresses via a DNS server and checking the domain names of the flows. But what if an application uses a domain that doesn't contain the app name? Is there any way to find all the IP addresses that a specific app/website uses?
Outside deep packet inspection (in which I include tech like Cisco NBAR) your main tools are probably going to be whois and port/protocol pair. Some commercial NetFlow collectors will do some of the legwork for you, for example by doing autonomous system lookup on incoming IP addresses, or providing the IANA protocol list.
The term "application" is a bit overloaded in this domain, by the way: often it's used to mean HTTP, SSH, POP3 and similar protocols in the OSI Application Layer, which are generally guessed from the port/protocol pair. For Facebook, Hotmail, etc, the whois protocol is probably your best bet. It's a bit better than reverse DNS, but the return formats aren't standardized among the Regional Internet Registries, so your parser is going to need to have some smarts. Get the IP addresses for a few of the major sites and use the command line whois utility with them to get a feel for the output before scripting anything.
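For the port/protocol guess, a lookup table is usually all there is to it. Here's a small Python sketch; the table is a hand-picked subset of well-known ports rather than the full IANA list, and the function name and argument layout are my own, not taken from any particular NetFlow collector.

```python
# IP protocol numbers as they appear in NetFlow records.
TCP, UDP = 6, 17

# (protocol, port) -> protocol name; deliberately tiny and incomplete.
WELL_KNOWN = {
    (TCP, 22): "SSH",
    (TCP, 25): "SMTP",
    (TCP, 53): "DNS",
    (UDP, 53): "DNS",
    (TCP, 80): "HTTP",
    (TCP, 110): "POP3",
    (TCP, 143): "IMAP",
    (TCP, 443): "HTTPS",
}


def guess_application(protocol, src_port, dst_port):
    """Guess the application-layer protocol of a flow from its ports.

    Either end may be the server, so both ports are checked,
    destination first.
    """
    for port in (dst_port, src_port):
        name = WELL_KNOWN.get((protocol, port))
        if name is not None:
            return name
    return "unknown"
```

This only tells you "HTTPS", never "Facebook"; telling sites apart is where whois comes in.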
Fortunately, most of the big ones are handled by ARIN. Look for "NetName" and "OrgName" in the results (and watch for the RIR names (RIPE, APNIC, etc) to indicate where that IP address isn't handled by ARIN). For example, I see www.stackoverflow.com as 198.252.206.16. whois 198.252.206.16 returns (among other things):
NetName: SE-NET01
OrgName: Stack Exchange, Inc.
You didn't specify whether you were shell scripting or programming; if the latter, the WHOIS protocol is standard and has a number of implementations in most languages.
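If you go the programming route, a bare-bones version in Python could look like the sketch below. It speaks WHOIS directly over a TCP socket (port 43) and pulls out the two fields mentioned above. The default server whois.arin.net and the plain "IP plus CRLF" query are assumptions that match ARIN-style output; other registries use different field names, so treat the parser as a starting point only.

```python
import socket


def whois_query(ip, server="whois.arin.net", port=43, timeout=10):
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(f"{ip}\r\n".encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_fields(whois_text, wanted=("NetName", "OrgName")):
    """Pick the first occurrence of each wanted 'Key: value' line."""
    found = {}
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted and key not in found:
            found[key] = value.strip()
    return found
```

Something like extract_fields(whois_query("198.252.206.16")) would then give you a small dict per flow endpoint that you can group by OrgName.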

Mail relays or SMTP services for use in code

I'm looking to start using an SMTP or mail relay service. I've found quite a few out there, but I'm not sure if there are advantages to one vs another. The only requirements I have are:
can send "from" more than 1 domain (possibly >20 for all the different sites I work on)
can pay for a higher limit (I may need to send as many as 15000 in 1 day, although the average is <500)
can send from PHP (although I doubt this will be a problem as most are compatible with any language)
I'm okay with an SMTP service, mail relay service or a site that uses a custom API, although an API would make the conversion more difficult.
Reasons for wanting to do this:
I don't want to host any mail services myself as they just cause headaches
I don't have to worry about being blacklisted. If they are blacklisted they will know about it and have the knowledge to get it fixed.
Reporting on if emails have gone through would be nice.
I'm not sure why you would need this. If you read the proper RFCs (821 and 2821 for SMTP, 822 and 2822 for the message format), you should be able to connect to any given site directly using SMTP. You need to be a little careful with line endings (they should always be CRLF), and should probably add mail.add_x_header = OFF to your php.ini.
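To illustrate the line-ending point, here is a short sketch in Python rather than PHP (the idea carries over). It builds a message and serialises it with the standard library's SMTP policy, which writes CRLF after every header and body line. The function name and addresses are placeholders.

```python
from email.message import EmailMessage
from email.policy import SMTP


def build_wire_message(sender, recipient, subject, body):
    """Return the message as bytes with CRLF line endings, ready for SMTP."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    # The SMTP policy uses "\r\n" as its line separator when serialising.
    return msg.as_bytes(policy=SMTP)


wire = build_wire_message(
    "site@example.org", "user@example.net", "Test", "line one\nline two\n"
)
print(wire.decode("ascii"))
```

Whatever line endings the body had going in, what goes out on the wire uses CRLF throughout.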
However, if you need a relay, I recommend using a spam filtering provider, as then you have protection from being blacklisted due to spammers abusing email-generating forms. I would recommend Red Condor for this task, but that is only because I work there, and know that we can handle it.
I've started using Mandrill and found it to be a great, reliable service provided by MailChimp that includes enough for most sites to use for free. Easy to setup, but also has a lot more functionality available.

Guidelines for email newsletter service

I'm implementing a email newsletter sender service using .NET and Windows Server technologies. Are there comprehensive guidelines which could help avoiding emails being trapped by spam filters and other mechanisms?
They should cover all aspects of (legal) bulk mail sending: SMTP configuration, DNS, HTML content, images, links within content etc. A simple example: is it better to embed images or load them from a server?
It would be great if you could provide some empirical data to show the efficiency of some measures taken.
Although I don't have a definitive answer, I think this is a very important question.
Here are a few tidbits I know about it:
Choose a clean hosting/smtp server. IP addresses of spamming SMTP servers are often black-listed by other ISPs.
Send a simple introductory email to every subscriber, asking them to add your sender address to their safe list.
Be very prudent in sending to only those people who are actually expecting it. You wouldn't want pattern recognizers of spam filters learning the smell of your content.
If you don't know your SMTP servers in advance, it's good practice to provide configuration options in your application for controlling batch sizes and the delay between batches. Some servers don't like large batches or continuous activity.
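The batching tidbit could look roughly like this. It's a Python sketch rather than .NET, just to show the shape; batch_size and delay_seconds stand in for the configuration options, the defaults are arbitrary, and send_batch is whatever function actually hands a batch to your SMTP server.

```python
import time


def batches(recipients, batch_size):
    """Yield consecutive slices of at most batch_size recipients."""
    for start in range(0, len(recipients), batch_size):
        yield recipients[start:start + batch_size]


def send_in_batches(recipients, send_batch, batch_size=50,
                    delay_seconds=30.0, sleep=time.sleep):
    """Send to all recipients, pausing between batches; return the count sent.

    sleep is injectable so the pacing can be tested without waiting.
    """
    sent = 0
    for index, batch in enumerate(batches(recipients, batch_size)):
        if index > 0:
            sleep(delay_seconds)  # pause before every batch except the first
        send_batch(batch)
        sent += len(batch)
    return sent
```

Keeping both values configurable lets you tune them per server once you see how it reacts.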
Unless you have a very specific reason to host the newsletter yourself, I think you'd be much better off using a third party service. There are lots out there, and some are very cheaply priced.
It'll save you on development work (no point in re-inventing the wheel).
Their system will handle all the unsubscribe link stuff that you need to include in email newsletters to comply with CAN SPAM laws or whatever.
They handle the spam reports that you will inevitably get if you have a list of any non-trivial size.
They keep records of who signed up, how they signed up, and their IP address, and can present those on receipt of a spam report to prove that their service wasn't sending out spam.
You can use double-opt in (or confirmed opt in), for extra evidence to prove that the people you're sending emails to actually signed up to receive them.
If you really do need to host it yourself I'd suggest you search the web for "email deliverability". Things that are known to help include properly set up SPF records, DomainKeys/DKIM, and correct DNS settings (reverse DNS especially; it's best to just use an online service to check your DNS settings). You can test a lot of these things by sending an email to check-auth@verifier.port25.com.
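As a quick sanity check on an SPF record before you publish it, something like this Python sketch can help. It only parses a TXT record string you pass in (no DNS lookup), skips modifiers such as redirect=, and the sample record in the usage line is made up. A real validator does far more, so treat it as a reading aid.

```python
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def parse_spf(txt_record):
    """Return [(result, mechanism), ...] for an SPF record, or None if not SPF."""
    terms = txt_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    mechanisms = []
    for term in terms[1:]:
        # The name is whatever precedes the first ':' or '/'.
        head = term.split(":", 1)[0].split("/", 1)[0]
        if "=" in head:
            continue  # modifier (e.g. redirect=...), not a mechanism
        if term[0] in QUALIFIERS:
            mechanisms.append((QUALIFIERS[term[0]], term[1:]))
        else:
            mechanisms.append(("pass", term))  # '+' is the default qualifier
    return mechanisms


print(parse_spf("v=spf1 ip4:192.0.2.0/24 include:_spf.example.net -all"))
```

A record ending in "-all" tells receivers to reject mail from any server the earlier mechanisms don't cover, while "~all" only marks it as suspicious.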
It's best to avoid using spammy words in your email. This is always a bit of guesswork, but some words can trip filters.
But I'd guess that by far the most important thing is to be sending your email from a trusted server that maintains good relationships with ISPs (i.e. ensuring that ISPs don't think that the server is sending out spam). This is a big reason why it's much much easier to get a third party to handle everything for you.