Why is the DATA in SMTP not null terminated? - email

I was reading the original RPC about the SMTP Protocol and came across this section:
SMTP indicates the end of the
mail data by sending a line containing only a period.
Why did Postel decide to use the period as the terminator? Would it not be easier to use the already existing null terminator?
I see, that he would not want the users content to interfere with the protocol, but I would naively assume, that a user is more likely to use a period in one line than a null terminator?
Added to that, would the implementation of the mail client not just cut of the text if the user came to use the null terminator his mail contents?

IMHO: SMTP has be designed long time ago to be human readable/writable.
It is pretty simple to test (send simple SMTP messages) typing them by hand via telnet program.
"Human readable" makes null terminator a suboptimal choice.
EMSMTP design is a fossil of pre-spam era. It is bad (by current standards) but it is so widely implemented and sufficiently good (after fixes) to make any quick revolution "not sufficiently urgent".
Extra info: Seen RFC 3030 for BDAT alternative to DATA command.

Related

Add date header to incoming email with Sieve

I'm looking for a way to do in Sieve something that I've been doing in Procmail for years, which is to insert an unambiguous date header in incoming messages that makes it clear to me -- independent of buried "received" headers from possibly multiple servers and however my mail client interprets the date the message was sent -- when my server received the message. This is how I did it in Procmail:
# First create the "date_received" variable for my time zone:
date_received=`/bin/date -ud '+2 hour' +'%A %Y-%m-%d %H:%M:%S +0200'`
# Second, insert the header containing the date_received variable:
:0 fh w
| formail -i "X-Local-Date-Received: $date_received"
I found "addheader" (RFC 5293) which will, obviously, add a header, but due to something else I read (sorry, don't remember where) I believe that Sieve won't run the "date" command in the shell due to either a limitation or an intended (and understandable) preference not to run shell commands for security reasons.
Other possibly useful information: I'm doing this through Roundcube 1.3.6, but I have a feeling (also due to something I read) that Roundcube might overwrite a custom Sieve filter set if I edit the raw code within Roundcube. If necessary I'm quite happy to edit or create a Sieve configuration file on the server directly to achieve this for all users on the server, but having run Sendmail and Procmail for years I'm unsure of the best place to do this.
EDIT:
As a test in Roundcube I added this at the top of my Sieve filter set:
require ["fileinto","editheader"];
# rule:[test editheader]
if true
{
addheader "X-Test-Header" "This is a test header.";
}
I didn't actually add the line "require ["fileinto","editheader"];"; I just added "editheader" to the existing line at the top of the filter set, like so:
require ["copy","fileinto","regex","editheader"];
I expect this to add ...
X-Test-Header: This is a test header.
... to every incoming message, but Roundcube won't let me save it:
An error occurred.
Unable to save filter. Server error occurred.
A search for this error returns one related result, with no solution posted.
I'm not intending to focus on Roundcube, however. Like I said earlier, I'll add this Sieve filter from the command line if necessary.
The Pigeonhole Sieve Editheader extension isn't available by default. Per its documentation, you need to ensure it's added in your list of sieve extensions on the server:
plugin {
# Use editheader
sieve_extensions = +editheader
}
If you want to run arbitrary scripts using sieve on Dovecot like you can with procmail, then you can use its external programs plugins, configure in Dovecot which external programs you want to allow users to use, and then the users can use the "vnd.dovecot.execute" extension to run those programs. You might be able to use this to port over whatever scripts you used with procmail.
In the general case, the purpose of sieve is for users to be able to configure their own mail filtering, while it seems like you're trying to actually do something globally for the server. Dovecot should add its own Received header when it processes the mail, which is the standard method for marking when a mail system gets a message, so it's not clear to me why you're not just using that, or what changes you want to make to its default behavior. It may be that what you're looking to do may be better handled in your mail transport agent rather than in your mail delivery agent.
Here is my sieve script that converts Received to Date:
require "editheader";
require "regex";
require "variables";
if not exists "Date" {
if header :regex "Received" "^from[[:space:]]+.*[[:space:]]+by[[:space:]]+mail.mydomain.com[[:space:]]+with[[:space:]]+.*[[:space:]]+for[[:space:]]+.*;(.*)$" {
addheader :last "Date" "${1}";
}
}
Note that mail.mydomain.com is a stand-in for the actual mail server address, which means it only matches the header when the message was received on a specific mail server. I made this work with dovecot-2.3.5.1
You can use date plugin. See: rfc5260:
require "date";
require "editheader";
if currentdate :matches "std11" "*" {
addheader :last "X-Local-Date-Received" "${1}";
}

What does X-Sender-Id mean in email raw source (Found in phishing email)?

Somebody in my company is being subject to phishing. My first suggestion was just to change the password. However after awhile I received a fake mail from her address again.
Looking at the raw source of the email I found that there is another person's email in X-Sender-ID and I'm wondering who that might be. Is that the person who sent the email or can it be an account that has been hijacked? (I replaced the email with "somebody#host.com")
X-Virus-Scanned: OK
Received: by smtp5.relay.iad3a.emailsrvr.com (Authenticated sender: somebody-AT-host.com) with ESMTPA id DF2788019C;
Fri, 21 Nov 2014 07:54:42 -0500 (EST)
X-Sender-Id: somebody#host.com
Received: from smtp.emailsrvr.com ([UNAVAILABLE]. [2.133.148.211])
by 0.0.0.0:587 (trex/5.3.2);
Fri, 21 Nov 2014 12:54:46 GMT
What is X-Sender-ID? And what is the email it contains?
My deliberations are based on this RFC which describes the Privacy Enhancement for Emails which you are obviously using.
Basically it says about the X-Sender-ID:
[...] encapsulated header field, required for all
privacy-enhanced messages, identifies a message's sender and provides
the sender's IK identification component.
What does this mean?
First of all you have to check if the mail is properly signed. If thats the case you can be sure that somebody#host.com has a certificate. And you can be sure that the mail you received has been sent from this mail address.
I can't tell you the consequences which result out of this fact as I don't know how your company is deploying the certificates etc. ... the mail address/certificate could also have been hacked and thereby abused.
I hope this helps you for your further research.
While #LMF's answer is useful technical information, I'd like to offer a possible alternative explanation.
Spammers who are not familiar with e-mail (and PHP programmers with no other malicious intent) tend to succumb to cargo cult programming when it comes to email headers. In other words, if there is something they don't understand, they might think it does something useful, and include it in their message template.
Without knowledge about your email infrastructure, or other messages of yours to compare to, I would simply assume everything below the top-most Received: header is forged, and basically without meaning.
If you have a system which runs something called trex (maybe this one?) and it really manages to write a Received: header like that, I might be wrong. The format needlessly deviates from the de-facto standard Sendmail template in a few places, but it's not technically wrong (the format is basically free-form, but introducing ad-hoc syntax makes it harder to guess what the fields mean).
Again, more information about what your typical email (and your correspondent's typical mail) looks like, this is heavy on speculation.
The x-sender-id, along with the x-recipient-id are used to specify which interchange key was used in the broadcast of the message.
X-Sender-ID entity_id : issuing_authority : version
X-Recipient-ID entity_id : issuing_authority : version
The first field contains the identity of the sender or receiver. The first field is mandatory, must be unique, and must be formatted as user#host whereas the host is a fully qualified host address.
The second identifies the name of the authority which issued the interchange key.
The third field specifies the specific type of interchange key which was used. This is represented by an alphanumeric string defined by the issuing authority to label and organize the numerous interchange keys issued by that authority. It is recommmended that they use a timestamp but is not always the case.
If the field values of the x-sender-id second and third field are identical to that of the x-recipient-id they may be only listed in the field which is defined last.
Further Reading
"Distributed Computing & Cryptography: Proceedings of a DIMACS Workshop"

separate email from original email using perl

When people email each other, they generally include the original email in their reply to a sender, adding a little more information each time to the email. Each email client seems to have a different way of adding the original email to a reply.
I need to parse email arriving at our mail server and try and extract the new part of the message, and I'm wondering if there is a sensible way to strip this appended (or prepended) information (the "original message") and just get the new information in a mail body? I believe sadly, that there is no encoding, the original email is simply added to the new message, but I thought I'd check with the experts?
thanks.
No, there is no simple, straightforward algorithm to separate quoted or forwarded text from new content. Quoting and forwarding are poorly standardized and different conventions have existed at different times.
Having said that, e.g. Google's Gmail succeeds fairly well in practice. With enough samples, you can clearly come up with reasonable heuristics.
Good indicators for quoted material are forwarded (pseudo-) headers and indented text, perhaps with a quote indicator along the left margin before the quoted text. You occasionally see outdents as well.
Traditionally, on Usenet in the early 1990s, people would use different, unique quoting styles.
: ~ | This seems to be the original.
: ~ This is the first reply.
: This is the second reply.
This is the third reply, quoting the
previous three messages in sequence.
Around 1995, both clients and standardization initiatives by and large converged on "wedge" quotes;
> >> This seems to be the original.
> > This is the first reply.
> This is the second reply.
This is the third reply, quoting the
previous three messages in sequence.
Then along came Microsoft and ruined it all. I suppose that top quoting makes sense in some corporate settings where you quickly need to collect all the background from a thread to a new participant, but even for that purpose it's a horrible abomination.
This is the third reply, quoting the
previous three messages in sequence.
---- Begin forwarded message ----
From: Him [smtp:bogus]
To: His Friend
Subject: VS: Re: Same as on this message
Date: nothing machine-readable
This is the second reply.
---- Alkuperäinen viesti ----
Lähettäjä: His Friend [smtp:poppycock]
Saaja: Some Guy
Aihe: Re: Same as on this message
Päivämäärä: olisiko eilen ehkä
This is the first reply.
----- Original message ----
From: Somebody Else [smtp:mindless]
To: Some Guy
Subject: Same as on this message
Date: like, the day before
This seems to be the original.

Apache Camel mail to identify auto generated messages

I'm looking for a way to identify auto generated messages like Outlook's "Out of office" replies.
I stumbled upon a header called "Auto-submitted" that's supposed to do the trick, but Camel doesn't seems to provide this header in the "Message" object. Reference: http://www.iana.org/assignments/auto-submitted-keywords/auto-submitted-keywords.xml
Is it possible to know if a message is auto generated or human generated?
I don't know Apache Camel, but I can tell you that there is no simple and safe way to detect automated email messages in general. Headers like auto-submitted are an indicator, but unfortunately lots of automated scripts do not add them. I once had to write an out-of-office implementation that should not send ooo replies to any automated messages (mailing lists, spam, newsletters, etc.). Here is what I finally came up with, maybe this helps in your case as well:
Sender address regular expressions that indicate automated senders:
"^owner-"
"^request-"
"-request#"
"bounce.*#"
"-confirm#"
"-errors#"
"^no[-]?reply"
"^donotreply"
"^postmaster#"
"^mailer[-_]daemon#"
"^mailer#"
"^listserv#"
"^majordom[o]?#"
"^mailman#"
"^nobody#"
"^bounce"
"^www(-data)?#"
"^mdaemon#"
"^root#"
"^news(letter)?#"
"^webmaster#" (role address - may not be a good indicator in your case)
"^administrator#" (role address - may not be a good indicator in your case)
"^support#" (role address - may not be a good indicator in your case)
Headers that indicate automated messages if they exist:
list-help
list-unsubscribe
list-subscribe
list-owner
list-post
list-archive
list-id
mailing-List
x-facebook-notify
x-mailing-list
x-cron-env
x-autoresponse
x-eBay-mailtracker
Headers that indicate automated messages if they have a special value:
'x-spam-flag':'yes'
'x-spam-status':'yes'
'X-Spam-Flag2': 'yes'
'precedence':'(bulk|list|junk)'
'x-precedence':'(bulk|list|junk)'
'x-barracuda-spam-status':'yes'
'x-dspam-result':'(spam|bl[ao]cklisted)'
'X-Mailer':'^Mail$'
'auto-submitted':'auto-replied'

Is it possible to include comments inside a non email host name?

I am working on a more complete email validator in java and came across an interesting ability to embed comments within an email both in the "username" and "address" portions.
The following snippet from http://www.dominicsayers.com/isemail/ has this to say about comments within an email.
Comments are text surrounded by parentheses (like these). These are OK but don't form part of the address. In other words mail sent to first.last#example.com will go to the same place as first(a).last(b)#example(c).com(d). Strange but true.
Do emails like this really exist ?
Is it possible to form hosts like this ?
I tried entering an url such as "google(ignore).com" but firefox and some other browsers failed and i was wondering is this because its it wrong or is it because they dont know about host name comments ?
That syntax -- comments within an addr-spec -- was indeed permissible by the original email RFC, RFC 822. However, the placement of comments like you'd like to use them was deprecated when that RFC was revised by RFC 2822... 10 years ago. It's still marked as obsolete in the current version, RFC 5322. There's no good excuse for emitting anything using that syntax.
Address parsers are supposed to be backwards-compatible in order to cover all conceivable cases, including 10-years-deprecated bits like the one you're trying to take advantage of here. But I'll bet that many, many receiving mail agents will fail to properly parse out those comments. So even though you may have technically found a loophole via the "obsolete addressing" section of the RFC, it's not likely to do you much good in practice.
As for HTTP, the syntax rules aren't the same as email syntax rules. As you're seeing, the comment section from RFC 822 isn't applicable.
Just because you can do it in the spec, doesn't mean that you should. For example, Gmail will not accept that comment format for address.
Second, (to your last point), paren-comments being allowed in email addresses doesn't mean that they work for URLs.
Finally, my advice: I'd tailor the completeness of your validator to your requirements. If you're writing a new MTA (mail transfer agent), you'll probably have to do it all. If you're writing a validator for a user input, keep it simple:
look for one #,
make sure you have stuff before (username) and after (domain name),
make sure you have a "dot" in the hostname string,
[extra credit] do a DNS lookup of the hostname to make sure it resolves.