RFC 2822 format - does it include attachments - email

Some background - I am trying to use the Gmail REST API to send an email with curl/libcurl. I am able to send a regular plain-text email, but I am struggling to make sense of the API docs regarding attachments.
The API requires the email message to be passed in RFC 2822 format. I know almost nothing about this format, but I'm trying to learn. It just dawned on me that the reason why the Gmail API doesn't deal explicitly with attachments could be because RFC 2822 deals with attachments instead.
My question is - does RFC 2822 include the format of attachments as part of the email? If so, I would love to see an example message with a few header fields, a simple body and a simple text file as an attachment. Could anyone point me at such an example for beginners?

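RFC 2822 itself covers only the text part of an email. It does not define attachments or other MIME types; those come from the MIME extensions (RFC 2045, RFC 2046, RFC 2049), which structure a message so that it still conforms to the RFC 2822 syntax.
To quote directly from RFC 2822: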
Scope
This standard specifies a syntax for text messages that are sent
between computer users, within the framework of "electronic mail"
messages. This standard supersedes the one specified in Request For
Comments (RFC) 822, "Standard for the Format of ARPA Internet Text
Messages" [RFC822], updating it to reflect current practice and
incorporating incremental changes that were specified in other RFCs
[STD3].
This standard specifies a syntax only for text messages. In
particular, it makes no provision for the transmission of images,
audio, or other sorts of structured data in electronic mail messages.
There are several extensions published, such as the MIME document
series [RFC2045, RFC2046, RFC2049], which describe mechanisms for the
transmission of such data through electronic mail, either by
extending the syntax provided here or by structuring such messages to
conform to this syntax. Those mechanisms are outside of the scope of
this standard.
In the context of electronic mail, messages are viewed as having an
envelope and contents. The envelope contains whatever information is
needed to accomplish transmission and delivery. (See [RFC2821] for a
discussion of the envelope.) The contents comprise the object to be
delivered to the recipient. This standard applies only to the format
and some of the semantics of message contents. It contains no
specification of the information in the envelope.
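Since the question asks for a beginner example, here is a minimal sketch in Python of a message with a few header fields, a simple body and a small text file attached. It relies on the standard library's email package to generate the MIME structure that the RFC 2822 scope text refers to. The addresses, subject and file name are made up for illustration, and the to_gmail_raw helper assumes the Gmail API wants the whole message base64url-encoded in its "raw" field; check the current API docs before relying on that.

```python
import base64
from email.message import EmailMessage


def build_message(sender, recipient, subject, body, attachment_name, attachment_text):
    """Return a message with a plain-text body and one text-file attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # the text/plain body part
    # Adding an attachment converts the message to multipart/mixed.
    msg.add_attachment(attachment_text, filename=attachment_name)
    return msg


def to_gmail_raw(msg):
    """Assumption: the API expects the full message as a base64url string."""
    return base64.urlsafe_b64encode(msg.as_bytes()).decode("ascii")


# Hypothetical values, for illustration only.
message = build_message(
    "sender@example.com",
    "recipient@example.com",
    "Report attached",
    "Hi,\n\nThe notes are attached.\n",
    "notes.txt",
    "hello from the attachment\n",
)
print(message.as_string())  # shows the headers, the MIME boundary and both parts
```

Printing the message is a quick way to see what the wire format looks like: ordinary header fields at the top, then a multipart/mixed body whose parts are separated by a boundary string.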

Related

Creating an email thread from IMAP?

I would like to create a "thread view" from emails that are on an IMAP server.
To achieve that, I fetch the list of emails in the INBOX and other folders but I need to know which email is answering which. Is there such a link between emails in IMAP?
For example on an IMAP server each email has a unique ID: if email B is an answer to email A, is the ID of A stored inside email B?
If your IMAP server supports the "thread" capability as described in RFC 5256, you can just ask the server to thread the messages for you.
Otherwise, you'll have to fetch the relevant information and do the threading on the client. The RFC describes two algorithms to do that. The simpler one, ORDEREDSUBJECT, just groups messages by subject and then sorts them by date. This gives a flat threading structure. The more complicated one, REFERENCES, looks at the In-Reply-To and References headers of each message, and considers messages with such headers to be children of the message with the given Message-Id.
The classic way is to retrieve the message-id and references fields. If two messages contain the same message-id in either message-id or references, then they are in the same thread.
Gmail has a new and IMO better way: each thread has a numeric ID which you can retrieve using x-gm-thrid. Google has published example code in various languages for using that (there should be links near that code).
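The "classic way" of grouping by shared message IDs can be sketched as follows. This is not the full REFERENCES algorithm from RFC 5256 (it builds no parent/child tree and does no subject merging); it only puts messages into the same group when their Message-ID, In-Reply-To or References headers share an ID. The function names are illustrative, and the messages are assumed to be already fetched and parsed with Python's email package.

```python
import re
from email import message_from_string

# Message IDs are written in angle brackets, e.g. <abc@example.com>.
MSG_ID = re.compile(r"<[^<>\s]+>")


def ids_in(header_value):
    """Extract all <...> message IDs from a header value (None is allowed)."""
    return MSG_ID.findall(header_value or "")


def group_into_threads(messages):
    """Group messages whose ID headers are linked, using a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    own_ids = []
    for msg in messages:
        own = ids_in(msg.get("Message-ID"))
        mid = own[0] if own else None
        own_ids.append(mid)
        if mid is None:
            continue  # cannot be linked without a Message-ID
        find(mid)
        for ref in ids_in(msg.get("References")) + ids_in(msg.get("In-Reply-To")):
            parent[find(mid)] = find(ref)

    threads = {}
    for msg, mid in zip(messages, own_ids):
        if mid is not None:
            threads.setdefault(find(mid), []).append(msg)
    return list(threads.values())


# Hypothetical sample data: b replies to a, c is unrelated.
a = message_from_string("Message-ID: <a@example.com>\nSubject: Hello\n\nhi")
b = message_from_string(
    "Message-ID: <b@example.com>\nIn-Reply-To: <a@example.com>\n"
    "References: <a@example.com>\nSubject: Re: Hello\n\nhi back"
)
c = message_from_string("Message-ID: <c@example.com>\nSubject: Other\n\nhey")
threads = group_into_threads([a, b, c])
```

Sorting each group by date afterwards gives a flat thread view; a nested view would need the parent links from In-Reply-To as well.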

Routing Email Replies as Comments to Appropriate Discussions

The story is simple: one user creates a new discussion, and the system sends out an email notification to other users about it. When these users reply to a notification, their replies should be routed as comments to that particular discussion.
When the system sends out an email notification, it includes a routing code in the subject. For example, the subject of a notification may look like this: 'Discussion "Lets Talk" has been started {123}'. Since all email clients use Re: ORIGINAL SUBJECT, we get {123} back as part of the subject, parse it and know where to put the comment.
We have this working already (had it for years actually), but the current implementation looks a bit dirty (especially when codes become longer), so we would like to explore alternatives if there are any. Is there a more elegant way to approach this that works reliably across most email clients? An email header that we might be missing? Something similar?
Thanks so much
Since you didn't mention it I'm not sure if you looked into this:
There is a field in the email header called In-Reply-To, which should contain the message ID(s) of the email(s) that the mail is replying to, and one named References, which should identify the thread this mail belongs to:
"In-Reply-To:" field may be used to identify the message (or
messages) to which the new message is a reply, while the
"References:" field may be used to identify a "thread" of
conversation.
According to the RFC, the In-Reply-To field should contain the parent message's Message-ID, while the References field should contain the parent message's References field followed by the parent's Message-ID.
The problem with these fields is that there is no guarantee that they contain anything useful: they are not required for mail delivery, so some mail clients might fill them incorrectly or not at all.
I found this article about building a threading algorithm using the In-Reply-To field and claiming to be robust against garbage and malicious input in these fields.
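One way to apply this to the notification scenario is to put the routing code into the Message-ID of the outgoing notification and read it back from the reply's In-Reply-To or References header. The sketch below assumes a made-up ID format, <discussion.ID.RANDOM@DOMAIN>, and hypothetical function names; nothing here is prescribed by the RFC. Because clients are not obliged to fill these headers, the subject code would still be needed as a fallback.

```python
import re
import uuid
from email import message_from_string

# Assumed (hypothetical) format: <discussion.<id>.<random hex>@<domain>>
ROUTING_ID = re.compile(r"<discussion\.(\d+)\.[0-9a-f]+@[^<>\s]+>")


def notification_message_id(discussion_id, domain="example.com"):
    """Build a unique Message-ID that also carries the discussion ID."""
    return f"<discussion.{discussion_id}.{uuid.uuid4().hex}@{domain}>"


def discussion_id_from_reply(reply):
    """Return the discussion ID found in the reply's headers, or None."""
    for header in ("In-Reply-To", "References"):
        match = ROUTING_ID.search(reply.get(header) or "")
        if match:
            return int(match.group(1))
    return None  # caller should fall back to parsing the subject


# Usage with hypothetical data: a reply to the notification for discussion 123.
sent_id = notification_message_id(123)
reply = message_from_string(
    f"In-Reply-To: {sent_id}\nReferences: {sent_id}\n"
    "Subject: Re: Discussion started\n\nMy comment"
)
routed_to = discussion_id_from_reply(reply)
```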

Is there a way to tie emails together other than by the "Subject" text?

I want to tie email "threads" together programmatically, specifically gmail and yahoo email "conversations." Is there a way to do this (some kind of link or pointer or "thread ID" contained within an email), or am I stuck with relying on the emailers not changing the text in the "Subject" line?
And besides, that trick would be barely functional at all, as many unrelated threads may have the same subject (such as "[no subject]" etc.).
Yes. Emails contain a header (Message-ID) that is a unique identifier for that email. It conveys no meaning itself, but another header (In-Reply-To) refers to the Message-ID of the email being replied to. Almost every email client does a passable job with these headers, and many clients use them to provide the threading you refer to.
In addition, you can use the subject plus relative times to allow relative ordering.
Wikipedia has a great article that discusses these, and links you off to the relevant RFCs:
http://en.wikipedia.org/wiki/Email#Message_format

Is it safe to generate an email subject from the body?

I'm writing an app which allows users to send out a text-only email to a bunch of recipients. I want to try to generate the subject of this email from the body of the message, to avoid the need for a subject field.
Is it safe enough to do this? Are these emails likely to fall foul of spam filters?
I'm already scanning the entire email for spam words, so there won't be any in the subject.
You could download the widely used spam filter SpamAssassin and search for 'SUBJ' in the *.cf files. This will give you many spam rules that trigger on the subject (empty subject, all caps, bad words, bad encoding of non-ASCII characters, etc.).
I would suggest that if the mail is from a trusted source, there is no problem. The receiving mailbox also doesn't know that the subject was generated automatically, so that by itself does not matter to it. Finally, you need to check the guidelines that email filters follow; have a look at some open source mail filter.

Is there a "no-reply" email header?

I often see automated emails postfixed with a message like
Amazon:
*Please note: this e-mail was sent from an address that cannot accept incoming e-mail. Please use the link above if you need to contact us again about this same issue.
Twitter:
Please do not reply to this message; it was sent from an unmonitored email address. This message is a service email related to your use of Twitter.
Google Checkout:
Need help? Visit the Google Checkout help center. Please do not reply to this message.
Directly underneath this warning, Gmail shows me a reply input field. It seems to me that there should be some sort of header that could be attached to such automated emails that would tell the recipient's email client to not allow replies.
Is there such a header? If not, has it ever been discussed by the groups that control email formats?
Is there such a header?
No. I'm pretty sure there isn't anything like that; and even if there is, it'd be non-standard and not widely supported, so it'd be pretty much useless at the moment. Even if it were to become standard, any such header would presumably just be informational; and for backwards-compatibility, support would have to be entirely optional for email clients.
Clients would be slow to implement it, and many users would still be on old versions of mail clients.
If not, has it ever been discussed by the groups that control email formats?
Probably. People have had a long time to suggest all manner of things with email, but my gut feeling is that it would never be implemented; well... not unless there is a fundamental shift in the ideas of what email is designed to do.
I'm sure Google would be much happier if you didn't even have a "Reply" button when they email you, so if anyone is pushing for it, it'll be the people who are already sending from donotreply@...
Email is designed to be sent from real mailboxes. RFC 2822 and RFC 5322 say:
In all cases, the "From:" field SHOULD NOT contain any mailbox that
does not belong to the author(s) of the message.
To me, that is a clear indication that email is designed as a method for conversation, rather than broadcast.
Probably the biggest killer to any change would be the little bit above that line, which would need to be entirely redefined; which would cause more problems than would be solved:
The originator fields also provide the information required when
replying to a message. When the "Reply-To:" field is present, it
indicates the address(es) to which the author of the message suggests
that replies be sent. In the absence of the "Reply-To:" field,
replies SHOULD by default be sent to the mailbox(es) specified in the
"From:" field unless otherwise specified by the person composing the
reply.
RFC 6854 updates RFC 5322 to allow the group construct to be used in the From field as well (among other things). A group can be empty, which is likely the only way you've ever seen the group syntax being used: undisclosed-recipients:;.
Section 1 of the RFC explicitly lists "no-reply" among the motivations for allowing the group construct in the From field:
The use cases for the "From:" field have evolved. There are numerous instances of automated systems that wish to send email but cannot handle replies, and a "From:" field with no usable addresses would be extremely useful for that purpose.
It provides the following example: From: Automated System:;
However, at the end of the same section, the RFC also says:
This document recommends against the general use of group syntax in these fields at this time
In section 3, the RFC clarifies that the group syntax in the From field is only for Limited Use.
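For illustration only, the empty-group From field from the RFC's example can be produced with Python's standard email package, as sketched below. The recipient, subject and body are made up, and given the RFC's own recommendation against general use of group syntax in this field, this shows the syntax rather than a recommended practice.

```python
from email.headerregistry import Group
from email.message import EmailMessage

msg = EmailMessage()
# A group with a display name and no member addresses.
msg["From"] = Group("Automated System")
msg["To"] = "user@example.com"  # hypothetical recipient
msg["Subject"] = "Your report is ready"
msg.set_content("This message was sent by an automated system.\n")

from_value = str(msg["From"])
print(from_value)
```

The resulting header has no addresses at all, which is exactly why domain-based checks on the From field have nothing to match against.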
Personally, I think this method should not be used – unless we're certain that all relevant clients display the originating domain in some other way (reconstructed from the Return-Path or a new header). Otherwise, this defeats all the efforts towards domain authentication (SPF, DKIM, and DMARC). Introducing an additional header field which causes clients to simply hide the reply button seems the much better approach to me.
The RFC comments on this aspect in section 5:
Some protocols attempt to validate the originator address by matching the "From:" address to a particular verified domain (for one such protocol, see the Author Domain Signing Practices (ADSP) document [RFC5617]). Such protocols will not be applicable to messages that lack an actual email address (whether real or fake) in the "From:" field. Local policy will determine how such messages are handled, and senders, therefore, need to be aware that using groups in the "From:" might adversely affect deliverability of the message.
What a failed opportunity…
It seems that Thunderbird shows a built-in warning message if the From address has the form no-reply@example.com. (The message I noticed this with also had no-reply@example.com in the To field and my email address only in the Cc field. I haven't tested whether this is important.)
As far as I know, the form no-reply@example.com has not been defined in any RFC.
Update: It appears that this behavior has been implemented in this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1342809
and the actual implementation is a regex
/^(.*[._-])?(do[._-]?not|no)[._-]?reply([._-].*)?@/
If that matches, a confirmation prompt is displayed:
Reply Not Supported
The reply address ({ $email }) does not appear to be a monitored
address. Messages to this address will likely not be read by anyone.
[Reply Anyway] [Cancel]
This seems sensible enough to me, and maybe other vendors could agree on it. Note that this causes all of the following to show the warning before allowing a reply:
service-name-no-reply@example.com
donot-reply@example.com
noreply.xyz@example.com
no-reply-userid@example.com
Unfortunately, it doesn't match
no-reply+eventid@example.com
so you have to use something like
no-reply-productname+eventid@example.com
if you want to encode extra information in the tag part.
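To check a sender address against this pattern before deploying it, the regex can be ported to Python more or less as is. The sketch below is a port of the expression quoted above, written with @ as the address separator and without any flags (so it is case-sensitive); whether Thunderbird applies it in exactly this way is an assumption, and the function name is made up.

```python
import re

# Port of the pattern quoted above; no flags, so matching is case-sensitive.
NO_REPLY = re.compile(r"^(.*[._-])?(do[._-]?not|no)[._-]?reply([._-].*)?@")


def looks_like_no_reply(address):
    """Return True if the local part matches the quoted no-reply pattern."""
    return NO_REPLY.search(address) is not None


candidates = [
    "service-name-no-reply@example.com",
    "donot-reply@example.com",
    "noreply.xyz@example.com",
    "no-reply-userid@example.com",
    "no-reply+eventid@example.com",
    "no-reply-productname+eventid@example.com",
]
results = {addr: looks_like_no_reply(addr) for addr in candidates}
for addr, flagged in results.items():
    print(f"{addr}: {'warning' if flagged else 'no warning'}")
```

The "+" case fails because the pattern only allows ".", "_" or "-" directly after "reply", which is why the tag has to follow an extra dash-separated segment.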
Update: Note that none of this is specified in any RFC related to email so this is about what works in practice instead of in theory.