The story is simple: one user creates a new discussion, and the system sends an email notification about it to other users. When these users reply to a notification, their replies should be routed as comments to that discussion.
When the system sends out an email notification, it includes a routing code in the subject. For example, the subject of a notification may look like this: 'Discussion "Lets Talk" has been started {123}'. Since email clients reply with "Re: ORIGINAL SUBJECT", we get {123} back as part of the subject, parse it, and know where to put the comment.
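For readers who want to see the idea concretely, here is a minimal Python sketch of adding and reading back such a routing code. The function names and the assumption that the code is a run of digits in braces at the very end of the subject are illustrative only, not the poster's actual implementation.

```python
import re
from typing import Optional

# Assumed format: digits in braces at the end of the subject, e.g. "... {123}".
ROUTING_CODE = re.compile(r"\{(\d+)\}\s*$")


def add_routing_code(subject: str, discussion_id: int) -> str:
    """Append the routing code to an outgoing notification subject."""
    return f"{subject} {{{discussion_id}}}"


def extract_routing_code(subject: str) -> Optional[int]:
    """Return the discussion id found in a reply subject, or None if absent."""
    match = ROUTING_CODE.search(subject)
    return int(match.group(1)) if match else None
```

Anchoring the pattern at the end of the subject keeps a "Re:" or "Fwd:" prefix from interfering, but it also means a client that appends text after the code would defeat it.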
We have this working already (had it for years actually), but the current implementation looks a bit dirty (especially when codes become longer), so we would like to explore alternatives if there are any. Is there a more elegant approach that works reliably across most email clients? An email header that we might be missing? Something similar?
Thanks so much
Since you didn't mention it I'm not sure if you looked into this:
There is a field in the email header called In-Reply-To, which should contain the message ID(s) of the email(s) that the mail is replying to, and one named References, which should identify the thread the mail belongs to:
"In-Reply-To:" field may be used to identify the message (or
messages) to which the new message is a reply, while the
"References:" field may be used to identify a "thread" of
conversation.
According to the RFC, the In-Reply-To field should contain the parent message's Message-ID, while the References field should contain the parent message's References field followed by the parent's Message-ID.
The problem with these fields is that nothing guarantees they contain anything useful: they are not required for mail delivery, so some mail clients may fill them in incorrectly or not at all.
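A small Python sketch of reading these fields from an incoming reply, using only the standard library. It assumes the raw message is available as a string; the function name and the fallback order (In-Reply-To first, then the last entry of References) are choices made for illustration, and either header may simply be missing.

```python
import re
from email import message_from_string, policy
from typing import Optional

# A message id is anything between angle brackets without whitespace.
MSG_ID = re.compile(r"<[^<>\s]+>")


def parent_message_id(raw_message: str) -> Optional[str]:
    """Best-guess id of the message this one replies to, or None."""
    msg = message_from_string(raw_message, policy=policy.default)

    in_reply_to = MSG_ID.findall(str(msg.get("In-Reply-To", "")))
    if in_reply_to:
        return in_reply_to[0]

    # The last id in References is normally the direct parent.
    references = MSG_ID.findall(str(msg.get("References", "")))
    return references[-1] if references else None
```

If you generate the Message-ID of the notification yourself, the returned value can be looked up directly to find the discussion.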
I found this article about building a threading algorithm using the In-Reply-To field and claiming to be robust against garbage and malicious input in these fields.
Related
Like many web apps, we use Postmark to send all notifications for server side events. Many of our events are grouped and related by something simple and logical (think multiple replies to the same issue, like in GitHub).
Right now, every email sent for these related events is its own email thread. My question is: how do I send these emails so that related ones get pushed into the same thread?
I'm not sure if this is something at the Postmark level (like include a previous message ID) or if this is something I do with SMTP (like I should format my subject better and inline previous responses), so that's why I'm seeking guidance. Also, every Google search about: "Postmark email threads" returns concerns over the thread safety of the Ruby Gem.
For more information, the app is written in PHP and right now we are using znarkus/postmark-php for sending emails and jjaffeux/postmark-inbound-php for parsing inbound ones. However, I am more than willing to add any extra packages if they help me in my quest.
Thanks in advance!
You can add a few SMTP headers with the original Message-ID that most clients use to link together replies. If the original email had a Message-ID header of <123@mail.example.com>, the new email you send out should keep the subject line the same and add these headers:
In-Reply-To: <123@mail.example.com>
References: <123@mail.example.com>
And that should inform clients that the two emails should be threaded.
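To make the header layout concrete, here is a sketch in Python using only the standard library (the question is about PHP and Postmark, so treat this as a language-neutral illustration rather than Postmark's API). The function name, the addresses and the `mail.example.com` domain are placeholders.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_threaded_notification(subject, body, sender, recipient,
                                parent_id, parent_references=""):
    """Build a follow-up message that points back at an earlier one."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject  # keep it identical to the original subject
    # Every message gets its own fresh id ...
    msg["Message-ID"] = make_msgid(domain="mail.example.com")
    # ... and names the message it follows up on.
    msg["In-Reply-To"] = parent_id
    # References: the parent's References (if any) followed by the parent's id.
    msg["References"] = f"{parent_references} {parent_id}".strip()
    msg.set_content(body)
    return msg
```

You would store the Message-ID of the first notification for each issue and pass it as `parent_id` for every later notification about the same issue.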
Edit:
The value for these headers should be the SMTP Message-ID header, which is slightly confusing because it is a separate concept from the Postmark MessageID value, which is just a UUID for the email.
The SMTP Message-ID header is always in the form of an email address, because that's how it's supposed to be formed, but it doesn't have to be related to the From address.
I would like to create a "thread view" from emails that are on an IMAP server.
To achieve that, I fetch the list of emails in the INBOX and other folders but I need to know which email is answering which. Is there such a link between emails in IMAP?
For example on an IMAP server each email has a unique ID: if email B is an answer to email A, is the ID of A stored inside email B?
If your IMAP server supports the "thread" capability as described in RFC 5256, you can just ask the server to thread the messages for you.
Otherwise, you'll have to fetch the relevant information and do the threading on the client. The RFC describes two algorithms to do that. The simpler one, ORDEREDSUBJECT, just groups messages by subject and then sorts them by date. This gives a flat threading structure. The more complicated one, REFERENCES, looks at the In-Reply-To and References headers of each message, and considers messages with such headers to be children of the message with the given Message-Id.
The classic way is to retrieve the message-id and references fields. If two messages contain the same message-id in either message-id or references, then they are in the same thread.
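A rough Python sketch of that rule: messages that share any id, either as their own Message-ID or in References, end up in the same group. It assumes the headers have already been fetched and reduced to `(message_id, [referenced ids])` pairs, and it only produces flat groups; it is not the full REFERENCES algorithm from RFC 5256, which also builds the parent/child tree.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_into_threads(
    messages: Iterable[Tuple[str, List[str]]]
) -> List[List[str]]:
    """Group message ids into threads using a small union-find."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a: str, b: str) -> None:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    seen: List[str] = []
    for message_id, references in messages:
        seen.append(message_id)
        find(message_id)
        for ref in references:
            # A referenced id joins the thread even if that message is missing.
            union(message_id, ref)

    threads: Dict[str, List[str]] = defaultdict(list)
    for message_id in seen:
        threads[find(message_id)].append(message_id)
    return list(threads.values())
```

Within each group you can then sort by date, or use In-Reply-To to work out who answered whom.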
Gmail has a newer and IMO better way: each thread has a numeric ID which you can retrieve using X-GM-THRID. Google has published example code in various languages for using that (there should be links near that code).
I often see automated emails postfixed with a message like
Amazon:
*Please note: this e-mail was sent from an address that cannot accept incoming e-mail. Please use the link above if you need to contact us again about this same issue.
Twitter:
Please do not reply to this message; it was sent from an unmonitored email address. This message is a service email related to your use of Twitter.
Google Checkout:
Need help? Visit the Google Checkout help center. Please do not reply to this message.
Directly underneath this warning, Gmail shows me a reply input field. It seems to me that there should be some sort of header that could be attached to such automated emails that would tell the recipient's email client to not allow replies.
Is there such a header? If not, has it ever been discussed by the groups that control email formats?
Is there such a header?
No. I'm pretty sure there isn't anything like that; and even if there is, it'd be non-standard and not widely supported, so it'd be pretty much useless at the moment. Even if it were to become standard, any such header would presumably just be informational; and for backwards-compatibility, support would have to be entirely optional for email clients.
Clients would be slow to implement it, and many users would still be on old versions of mail clients.
If not, has it ever been discussed by the groups that control email formats?
Probably. People have had a long time to suggest all manner of things with email, but my gut feeling is that it would never be implemented; well... not unless there is a fundamental shift in the ideas of what email is designed to do.
I'm sure Google would be much happier if you didn't even have a "Reply" button when they email you, so if anyone is pushing for it, it'll be the people who are already sending from donotreply@...
Email is designed to be sent from real mailboxes. RFC 2822 and RFC 5322 say:
In all cases, the "From:" field SHOULD NOT contain any mailbox that
does not belong to the author(s) of the message.
To me, that is a clear indication that email is designed as a method for conversation, rather than broadcast.
Probably the biggest obstacle to any change is the passage just above that line, which would need to be entirely redefined, causing more problems than it solves:
The originator fields also provide the information required when
replying to a message. When the "Reply-To:" field is present, it
indicates the address(es) to which the author of the message suggests
that replies be sent. In the absence of the "Reply-To:" field,
replies SHOULD by default be sent to the mailbox(es) specified in the
"From:" field unless otherwise specified by the person composing the
reply.
RFC 6854 updates RFC 5322 to allow the group construct to be used in the From field as well (among other things). A group can be empty, which is likely the only way you've ever seen the group syntax being used: undisclosed-recipients:;.
Section 1 of the RFC explicitly lists "no-reply" among the motivations for allowing the group construct in the From field:
The use cases for the "From:" field have evolved. There are numerous instances of automated systems that wish to send email but cannot handle replies, and a "From:" field with no usable addresses would be extremely useful for that purpose.
It provides the following example: From: Automated System:;
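For illustration, Python's standard `email` package can produce such a header. This is only a sketch of the syntax, under the assumption that the rest of the message is ordinary; the function name, recipient and subject are placeholders.

```python
from email.headerregistry import Group
from email.message import EmailMessage


def build_no_reply_message() -> EmailMessage:
    """Build a message whose From field is an empty group (RFC 6854 style)."""
    msg = EmailMessage()
    # A group with a display name and no member addresses.
    msg["From"] = Group("Automated System", [])
    msg["To"] = "user@example.org"
    msg["Subject"] = "Your report is ready"
    msg.set_content("This message was generated automatically.")
    return msg
```

The resulting From field has no mailbox at all, which is exactly why the caveats quoted from the RFC apply.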
However, at the end of the same section, the RFC also says:
This document recommends against the general use of group syntax in these fields at this time
In section 3, the RFC clarifies that the group syntax in the From field is only for Limited Use.
Personally, I think this method should not be used – unless we're certain that all relevant clients display the originating domain in some other way (reconstructed from the Return-Path or a new header). Otherwise, this defeats all the efforts towards domain authentication (SPF, DKIM, and DMARC). Introducing an additional header field which causes clients to simply hide the reply button seems the much better approach to me.
The RFC comments on this aspect in section 5:
Some protocols attempt to validate the originator address by matching the "From:" address to a particular verified domain (for one such protocol, see the Author Domain Signing Practices (ADSP) document [RFC5617]). Such protocols will not be applicable to messages that lack an actual email address (whether real or fake) in the "From:" field. Local policy will determine how such messages are handled, and senders, therefore, need to be aware that using groups in the "From:" might adversely affect deliverability of the message.
What a failed opportunity…
It seems that Thunderbird shows a built-in warning message if the From address is of the form no-reply@example.com. (The message I noticed this with also had To with no-reply@example.com and my email address in the Cc field only. I haven't tested if this is important.)
As far as I know, the form no-reply@example.com has not been defined in any RFC.
Update: It appears that this behavior has been implemented in this bug: https://bugzilla.mozilla.org/show_bug.cgi?id=1342809
and the actual implementation is a regex
/^(.*[._-])?(do[._-]?not|no)[._-]?reply([._-].*)?@/
If that matches, a confirmation prompt is displayed:
Reply Not Supported
The reply address ({ $email }) does not appear to be a monitored
address. Messages to this address will likely not be read by anyone.
[Reply Anyway] [Cancel]
This seems sensible enough to me, and maybe other vendors could agree on it. Note that this causes all the following to show the warning before allowing a reply:
service-name-no-reply@example.com
donot-reply@example.com
noreply.xyz@example.com
no-reply-userid@example.com
Unfortunately, it doesn't match
no-reply+eventid@example.com
so you have to use something like
no-reply-productname+eventid@example.com
if you want to encode extra information in the tag part.
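As a quick check, the pattern can be tried against the addresses listed above. This is a Python sketch, not Thunderbird's code: it writes the separator as `@`, as in real addresses, and matches case-sensitively because the quoted pattern carries no flags. The helper name `looks_like_no_reply` is made up for illustration.

```python
import re

# The pattern quoted above, transcribed for Python's re module.
NO_REPLY = re.compile(r"^(.*[._-])?(do[._-]?not|no)[._-]?reply([._-].*)?@")


def looks_like_no_reply(address: str) -> bool:
    """True if the local part matches the no-reply pattern."""
    return NO_REPLY.search(address) is not None


if __name__ == "__main__":
    for candidate in (
        "service-name-no-reply@example.com",
        "donot-reply@example.com",
        "noreply.xyz@example.com",
        "no-reply-userid@example.com",
        "no-reply+eventid@example.com",
        "no-reply-productname+eventid@example.com",
    ):
        print(candidate, looks_like_no_reply(candidate))
```

The `+eventid` case fails because whatever follows "reply" must start with `.`, `_` or `-` before the `@`.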
Update: Note that none of this is specified in any RFC related to email so this is about what works in practice instead of in theory.
I'm making a webstore with integrated customer service. Every few minutes, the system will retrieve emails into the database, parsing headers and relating the message to customers and orders.
Customer threading is fairly reliable via the message's From: header. But what about orders? It seems most people use the Reply-To: header for threading orders ...
From: <orders@company.com>
To: <person@place.com>
Subject: Company Order #314159
Reply-To: <order-314159@company.com>
But a messy Reply-to: obscures and uglifies things, and probably flags spam sensors or something. I definitely don't want to count on the Subject: field, people modify the subject all the time, even when replying. There are other headers that seem suited to the job, like ...
From: <orders@company.com>
To: <person@place.com>
Subject: Company Order #314159
Message-ID: <314159-2>
... or ...
In-Reply-To: <314159-1>
But are these sent back when the person replies? Are there any headers (other than Reply-To:) that are reliably copied into replies and forwards?
You can't rely entirely on headers being preserved. When replying or forwarding, a mail client creates a new message; that mail client can quite legitimately ignore or alter any content, as deemed appropriate.
You might be able to track by the following means, but all are vulnerable to alteration (mainly by the user, but also by a primitive mail client). You should really just use them to make a best-guess.
A disposable Reply-To address. Theoretically, you can also do it with "From" if you want, but Reply-To is better to ensure the user (and their mail server/client) recognise it's from you and act appropriately. I see no reason why a spam filter would care about disposable addresses. Seeing as most spam uses fake addresses anyway and doesn't care about replies, it is not really a spammer trick. It's unlikely to cause a substantial increase in spam filtering. Using a Reply-To for the same domain as your From address is also unlikely to look suspicious.
A unique subject. Yes, it can easily be changed, but usually the existing subject is appended to, rather than deleted (especially if it obviously contains some kind of reference number). You could apply a regular expression match - maybe only using it as a confirmation of your other detection methods.
A unique string in the body (possibly preceded by the words "DO NOT REMOVE THIS LINE")
The In-Reply-To and References headers are probably fine, when supported. There is a small chance that a user will copy their reply into a new blank message and trash the headers anyway.
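The four signals above can be combined into a best-guess lookup. The Python sketch below tries them in order and returns the first order number it finds. Everything about the formats is an assumption made for the example: a disposable address of the form `order-<id>@company.com`, outgoing Message-IDs of the form `<order-<id>.<n>@company.com>`, a subject containing "Order #<id>", and a body marker `[ref:<id>]`.

```python
import re
from email import message_from_string, policy
from email.utils import getaddresses
from typing import Optional

# Hypothetical formats, one per signal.
ADDRESS_RE = re.compile(r"^order-(\d+)@company\.com$", re.IGNORECASE)
MSGID_RE = re.compile(r"<order-(\d+)\.[^@>]*@company\.com>")
SUBJECT_RE = re.compile(r"Order #(\d+)")
BODY_RE = re.compile(r"\[ref:(\d+)\]")


def resolve_order_id(raw_message: str) -> Optional[int]:
    """Best-guess order id for an inbound reply, or None if nothing matches."""
    msg = message_from_string(raw_message, policy=policy.default)

    # 1. Disposable Reply-To: the reply is addressed to order-<id>@company.com.
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, address in getaddresses([str(h) for h in headers]):
        match = ADDRESS_RE.match(address)
        if match:
            return int(match.group(1))

    # 2. Threading headers that quote one of our own Message-IDs.
    for header in ("In-Reply-To", "References"):
        match = MSGID_RE.search(str(msg.get(header, "")))
        if match:
            return int(match.group(1))

    # 3. Reference number left in the subject.
    match = SUBJECT_RE.search(str(msg.get("Subject", "")))
    if match:
        return int(match.group(1))

    # 4. Marker line left in the plain-text body.
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    match = BODY_RE.search(text)
    return int(match.group(1)) if match else None
```

Since any of these can be altered by the user, a stricter variant could require two signals to agree before attaching the message to an order.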
Reply-To is sadly not entirely reliable. All replies should have References: which is better standardized than In-Reply-To: which is not easily machine readable.
Your best bet may be to set the envelope header to a unique identifier, perhaps with a From: and Sender: combo that directs replies to the right place but displays nicely.
See also tangentially Dan Bernstein's notes; http://cr.yp.to/immhf.html and in particular http://cr.yp.to/immhf/thread.html
I don't think you can count on anything when it comes to forwards.
Although you have already received some answers, we had a similar situation: we needed to send emails to customers, read their replies, and associate them with various activities.
During our research, the only header we found that is not replaced or removed by the various email clients (Outlook, Yahoo, Gmail, etc.) was "XREF".
We have thoroughly tested it and it has been working since we first introduced it.
I am trying to determine the best way to persist information from an originating email through to its reply.
Essentially, I want to pass a GUID in the original email (sent from C#) so that when the receiver replies, the GUID is sent back for reference.
I have tried setting the Message-ID; with Outlook, the reply's In-Reply-To value is set to the original ID, but some webmail clients do not create that value on reply. Is there another way to send this information through email headers?
Some variation on VERP is probably the most reliable...
http://en.wikipedia.org/wiki/Variable_envelope_return_path
Specifically, instead of having all your replies coming to the same address, encode the information you want to persist into the From address for the email.
For example, in the case of a helpdesk ticket, you could use something like:
From: Helpdesk <support-ticket-123@example.com>
To: End User <user@example.org>
Subject: Ticket #123 - problem with computer.
That way, regardless of what the user edits in the subject or text, you know what ticket is being referred to by the receiving email address.
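A minimal Python sketch of both directions, building the per-ticket From header and recovering the ticket number from the address a reply was sent to. The `support-ticket-<id>@example.com` scheme follows the example above; the function names are made up, and your mail server must of course accept mail for these addresses.

```python
import re
from email.utils import formataddr, parseaddr
from typing import Optional

# Assumed address scheme: support-ticket-<id>@example.com
TICKET_ADDRESS = re.compile(r"^support-ticket-(\d+)@example\.com$", re.IGNORECASE)


def ticket_from_header(ticket_id: int) -> str:
    """Value for the From (or Reply-To) header of an outgoing ticket email."""
    return formataddr(("Helpdesk", f"support-ticket-{ticket_id}@example.com"))


def ticket_id_from_recipient(recipient_header: str) -> Optional[int]:
    """Ticket id encoded in the address a reply was sent to, or None."""
    _name, address = parseaddr(recipient_header)
    match = TICKET_ADDRESS.match(address)
    return int(match.group(1)) if match else None
```

On the inbound side you would call `ticket_id_from_recipient` with the To header (or the envelope recipient, if your mail server exposes it).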
I don't think you'll be able to do anything that is perfectly reliable by headers alone -- the number of clients that would have to cooperate is immense.
Most systems that do this work by including something in the body of the email that is sent that allows it to identify the message, and including text instructing the recipient to include that block of text in the response. You could also try including it in the subject (and including text in the body to leave the subject unchanged). That's how some mailing list managers I've seen do it.
I stumbled upon this question, and it's been very informative. This, however, leaves me with one question: Will using VERP, or a variation of editing the 'reply-to' or 'from address', cause the messages to be locked up in spam filters?
I have read that spam senders often change the bounce address to prevent their servers from getting clogged with bad email address bounces. Is it a spam risk to assume this approach?
The most reliable way is to put the ID in the subject, which should be preserved throughout replies.
(It doesn't hurt to tell your users that they should keep the subject intact.)
RT, a popular ticketing system, does this. They use a simple subject format like "[Ticket #123]" and key off of 123.