Persistence of custom headers within an email thread - email

This is probably a strange question, but I thought I'd go ahead and ask. Say I send an email, using IMAP/SMTP, through a special client. This client adds a few custom headers to the email message before sending it on its way. The recipient receives this email and responds to me directly (and maybe CCs a few people as well).
My question is this: Given the above example, would these X-headers persist throughout all the new messages within the thread?
One thing I can think of is that the client would be aware of the original email message it sent. All subsequent responses to this email would have a "Reply-To" header whose value equals the "Message-Id" of the previous email. I don't see why I couldn't crawl up this thread of replies until I get to the original message sent by the client, thereby deriving the original custom headers.
Maybe I'm over-thinking this. Any suggestions? :)

A message reply does not necessarily contain anything of the original message. The MUA is likely to suggest a modified (e.g. prepended with "Re:") version of the original subject, and obviously the addresses are utilised for appropriate defaults as well. None of the other content of the message forms part of the reply (unless the sender deliberately includes it, as with quoting or forwarding). Any X- headers that you have in your message will certainly not be included in the reply (unless you have control over that MUA).
However, your plan of tracking the original message is certainly feasible: see Section 3.6.4 of RFC 5322. Every message should (not must) have a Message-ID header, and should have In-Reply-To and References headers when appropriate.
The "Message-ID:" field contains a single unique message identifier. The "References:" and "In-Reply-To:" fields each contain one or more unique message identifiers, optionally separated by [whitespace].
In-Reply-To is meant to identify the message (or messages) being replied to, while References identifies the entire thread of conversation. The References header is meant to contain the entire contents of the References header of the message being replied to, followed by that message's own Message-ID, so you only need the last message to identify the entire thread.
Note that In-Reply-To and Reply-To are not the same thing (the latter specifies the address that the sender wishes replies to be sent to).
Assuming that you have the original message, then you should be able to use the References header of any reply to identify the original message. Not every MUA will handle References or In-Reply-To correctly, but most will.
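The lookup described above can be sketched in Python with the standard library's email package. This is a minimal illustration, not a definitive implementation: the function name thread_root_id and the sample message are made up for the example, and it assumes the replying clients kept References intact (some truncate long chains, in which case the first entry is only the oldest ancestor still listed).

```python
from email import message_from_string


def thread_root_id(raw_message: str):
    """Return the Message-ID of the oldest known message in the thread, or None."""
    msg = message_from_string(raw_message)
    # References lists ancestors oldest-first, so the first entry is the root.
    references = (msg.get("References") or "").split()
    if references:
        return references[0]
    # Fall back to In-Reply-To, which only names the direct parent.
    in_reply_to = (msg.get("In-Reply-To") or "").split()
    if in_reply_to:
        return in_reply_to[0]
    # No threading headers at all: treat the message itself as the root.
    return msg.get("Message-ID")


# Hypothetical reply, two levels deep in a thread.
reply = (
    "Message-ID: <c3@example.org>\n"
    "In-Reply-To: <b2@example.org>\n"
    "References: <a1@example.org> <b2@example.org>\n"
    "Subject: Re: hello\n"
    "\n"
    "body\n"
)
root = thread_root_id(reply)
```

With the root Message-ID in hand, the sending client can look up its own stored copy of the original message and read the custom headers from there.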

As far as I know, there's no reason to think any email client would propagate any header lines it doesn't understand. Most will preserve the subject (usually adding "Re: " if necessary) and derive their "To: " and "Cc: " lines from the previous message's headers, but that's about it. I suppose some (but not all) will generate an "In-Reply-To" line, but that's as far as it goes.
Your idea of having a client crawl back through the thread looking for specific headers sounds like it might be do-able, but you'd have to write your own email client if you want that feature, and you'd still be blocked by the fact that not all email clients preserve message threading in any way.

Related

Is there a URI schema for addressing individual email messages?

When someone loses track of an email that has been sent to them, and brings that to the sender's attention, it is common practice for the sender to simply forward or re-send the original email. I want to know if there is any [semi-]standard way to reference a specific email, such that a mail client could open that email if it has a copy of it. This might be in the form of a URI, or possibly some other form. Such a URI might reference the sender, recipient, date, time, or other headers that [should] remain intact between sender and recipient.
The Message-ID is a globally unique identifier for messages.
Note that the Message-ID header is optional, but recommended:
Though listed as optional in the table in section 3.6, every message SHOULD have a "Message-ID:" field.
RFC 2392 specifies the URI scheme mid (which was already reserved in RFC 1738):
The "mid" scheme uses (a part of) the message-id of an email message to refer to a specific message.
An example from RFC 2392:
previous message, shows how the approach you propose can be used to accomplish ...
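As a rough sketch of how a Message-ID maps to a mid URI, the Python below strips the angle brackets and percent-encodes the rest. The function names are invented for this example, and the sample identifier is the one used in RFC 2392; whether a given mail client actually opens such a URI is a separate matter that this sketch does not address.

```python
from urllib.parse import quote, unquote


def message_id_to_mid(message_id: str) -> str:
    """Turn '<local@domain>' into a 'mid:' URI."""
    bare = message_id.strip().lstrip("<").rstrip(">")
    # Keep "@" readable; everything outside the unreserved set is percent-encoded.
    return "mid:" + quote(bare, safe="@")


def mid_to_message_id(uri: str) -> str:
    """Recover the Message-ID header value from a 'mid:' URI."""
    if not uri.lower().startswith("mid:"):
        raise ValueError("not a mid: URI")
    # A mid URI may carry an optional "/content-id" suffix; ignore it here.
    message_part = uri[4:].split("/", 1)[0]
    return "<" + unquote(message_part) + ">"


uri = message_id_to_mid("<960830.1639@XIson.com>")
```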

how to notice if a mail is a forwarded mail?

I have a very special problem.
If we create a mail in Outlook, we add a UserProperty which contains a DataBase-ID of our System, so we can Link the mail to the representing DataBase-Item. On the service which reads the mails in each Mailbox and imports them automatically I can read this property by using ExtendedPropertyDefinitions. So far everything is fine...
If the user now forwards the message in Outlook, Outlook copies the UserProperty to the new message. And now my problems begin: my service thinks the new message is also linked to our database and updates the DB entry with the new body and new subject.
So does anyone know how to find out if a message is a forwarded one, or how to tell Outlook not to copy the UserProperty to the forwarded (new) message?
thx. Jay
What we thought about, but which doesn't work in our case:
- a second UserProperty containing a simple tag like "fromSystem", because this would be copied too.
- a second UserProperty containing a hash calculated from the subject and body, because both could be changed by the user. We just create the message, add all properties and display it. From this point on we no longer have control over what happens to the mail until the service handles it.
Your service consuming EWS should check the ConversationIndex and only update the database if it's 22 bytes long (original source message). Each forward or reply appends 5 bytes (10 hex characters) to the ConversationIndex, extending it beyond 22 bytes.
Sample ConversationIndexes
Original: 01CDD15D80E51C1D4522172840ACA96287DA28A15D97
Reply: 01CDD15D80E51C1D4522172840ACA96287DA28A15D970000018630
Forward: 01CDD15D80E51C1D4522172840ACA96287DA28A15D970000018630000000FC30
ConversationIndex represents the sequential ordering of the ConversationTopic (essentially GUID + timestamp). See Working with Conversations on MSDN. ConversationIndex is explicitly defined on MSDN here.
if (message.ConversationIndex.Length == 22)
{
    // Exactly 22 bytes: this is the original message, not a reply or forward.
    // update DB body, subject, etc.
}
Also make sure you load the EmailMessageSchema.ConversationIndex before trying to access its value.
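To make the length arithmetic concrete, here is a small Python sketch that works on the hex form of the sample values above. It relies only on the 22-byte header and 5-byte child blocks described in this answer; the function name conversation_depth is made up for illustration, and it does not distinguish a reply from a forward (the length alone cannot).

```python
HEADER_BYTES = 22  # length of an original message's index, per the answer above
CHILD_BYTES = 5    # bytes appended for each reply or forward


def conversation_depth(index_hex: str) -> int:
    """Return how many replies/forwards separate this message from the original."""
    raw = bytes.fromhex(index_hex)
    extra = len(raw) - HEADER_BYTES
    if extra < 0 or extra % CHILD_BYTES:
        raise ValueError("unexpected ConversationIndex length")
    return extra // CHILD_BYTES


original = "01CDD15D80E51C1D4522172840ACA96287DA28A15D97"
reply = original + "0000018630"
forward = reply + "000000FC30"

depths = [conversation_depth(x) for x in (original, reply, forward)]
```

A depth of 0 corresponds to the Length == 22 check in the C# snippet; anything greater means the message was derived from another one.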

Apache Camel mail to identify auto generated messages

I'm looking for a way to identify auto generated messages like Outlook's "Out of office" replies.
I stumbled upon a header called "Auto-Submitted" that's supposed to do the trick, but Camel doesn't seem to provide this header in the "Message" object. Reference: http://www.iana.org/assignments/auto-submitted-keywords/auto-submitted-keywords.xml
Is it possible to know if a message is auto generated or human generated?
I don't know Apache Camel, but I can tell you that there is no simple and safe way to detect automated email messages in general. Headers like auto-submitted are an indicator, but unfortunately lots of automated scripts do not add them. I once had to write an out-of-office implementation that should not send ooo replies to any automated messages (mailing lists, spam, newsletters, etc.). Here is what I finally came up with, maybe this helps in your case as well:
Sender address regular expressions that indicate automated senders:
"^owner-"
"^request-"
"-request@"
"bounce.*@"
"-confirm@"
"-errors@"
"^no[-]?reply"
"^donotreply"
"^postmaster@"
"^mailer[-_]daemon@"
"^mailer@"
"^listserv@"
"^majordom[o]?@"
"^mailman@"
"^nobody@"
"^bounce"
"^www(-data)?@"
"^mdaemon@"
"^root@"
"^news(letter)?@"
"^webmaster@" (role address - may not be a good indicator in your case)
"^administrator@" (role address - may not be a good indicator in your case)
"^support@" (role address - may not be a good indicator in your case)
Headers that indicate automated messages if they exist:
list-help
list-unsubscribe
list-subscribe
list-owner
list-post
list-archive
list-id
mailing-List
x-facebook-notify
x-mailing-list
x-cron-env
x-autoresponse
x-eBay-mailtracker
Headers that indicate automated messages if they have a special value:
'x-spam-flag':'yes'
'x-spam-status':'yes'
'X-Spam-Flag2': 'yes'
'precedence':'(bulk|list|junk)'
'x-precedence':'(bulk|list|junk)'
'x-barracuda-spam-status':'yes'
'x-dspam-result':'(spam|bl[ao]cklisted)'
'X-Mailer':'^Mail$'
'auto-submitted':'auto-replied'
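The three kinds of checks above could be combined roughly as follows. This Python sketch uses only a handful of the listed patterns and headers to stay short, and the function name looks_automated is invented for the example; it is a heuristic under those assumptions, so expect both false positives and misses.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# A small subset of the lists above, for illustration only.
SENDER_PATTERNS = [r"^owner-", r"^request-", r"^no[-]?reply", r"^donotreply", r"^bounce"]
PRESENCE_HEADERS = ["List-Id", "List-Unsubscribe", "X-Autoresponse", "X-Cron-Env"]
VALUE_HEADERS = {
    "Precedence": r"(bulk|list|junk)",
    "X-Spam-Flag": r"yes",
    "Auto-Submitted": r"auto-replied",
}


def looks_automated(raw_message: str) -> bool:
    """Heuristically decide whether a message was sent by a program."""
    msg = message_from_string(raw_message)
    # 1. Sender address patterns (matched against the bare address).
    _, sender = parseaddr(msg.get("From", ""))
    if any(re.search(p, sender.lower()) for p in SENDER_PATTERNS):
        return True
    # 2. Headers whose mere presence is a signal (lookup is case-insensitive).
    if any(name in msg for name in PRESENCE_HEADERS):
        return True
    # 3. Headers that only count when they carry a particular value.
    for name, pattern in VALUE_HEADERS.items():
        value = msg.get(name)
        if value is not None and re.search(pattern, value, re.IGNORECASE):
            return True
    return False
```

Keeping the three lists as data rather than code makes it easy to drop the role-address patterns or add site-specific headers without touching the function.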

What heuristics should I use to prevent an autoresponder war?

I am currently extending an e-mail system with an autoresponse feature. In a dark past, I've seen some awesome mail loops, and I'm now trying to avoid such a thing from happening to me.
I've looked at how other tools ('mailbot', 'vacation') are doing this, grepped my own mail archive for suspicious mail headers, but I wonder if there is something else I can add.
My process at this point:
Refuse if sender address is invalid (this should get rid of messages with <> sender)
Refuse if sender address matches one of the following:
'^root@',
'^hostmaster@',
'^postmaster@',
'^nobody@',
'^www@',
'-request@'
Refuse if one of these headers (after whitespace normalization and lowercasing) is present:
'^precedence: junk$',
'^precedence: bulk$',
'^precedence: list$',
'^list-id:',
'^content-type: multipart/report$',
'^x-autogenerated: reply$',
'^auto-submit: yes$',
'^subject: auto-response$'
Refuse if sender address was already seen by the autoresponder in the recent past.
Refuse if the sender address is my own address :)
Accept and send autoresponse, prepending Auto-response: to the subject, setting headers Precedence: bulk and Auto-Submit: yes to hopefully prevent some remote mailer from propagating the autoresponse any further.
Is there anything I'm missing?
In my research so far I've come up with these rules.
Treat inbound message as autogenerated, ignore it and blacklist the sender if...
Return-Path header is <> or missing/invalid
Auto-Submitted header is present with any value other than "no"
X-Auto-Response-Suppress header is present
In-Reply-To header is missing
Note: If I'm reading RFC3834 correctly, your own programs SHOULD set this, but so far it seems some autoresponders omit this (freshdesk.com)
When sending outbound messages, be sure to...
Set the Auto-Submitted: auto-generated header (or auto-replied as appropriate)
Set your SMTP MAIL FROM: command with the null address <>
Note some delivery services including Amazon SES will set their own value here, so this may not be feasible
Check the recipient against the blacklist built up by the inbound side and abort sending to known autoresponders
Consider sending no more than one message per interval (a long one, such as 24 hours) to any given recipient
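The outbound side of these rules might look like the Python sketch below. It is only an illustration: the names may_reply and build_auto_reply, the in-memory dictionary, and the 24-hour interval are assumptions for the example, and a real autoresponder would persist the send times. Setting the null envelope sender happens when the message is handed to the SMTP server and is left out here.

```python
import time
from email.message import EmailMessage

MIN_INTERVAL_SECONDS = 24 * 60 * 60  # assumed interval: one reply per recipient per day
_last_sent = {}  # recipient address -> time of the last auto-reply (in memory only)


def may_reply(recipient: str, now: float = None) -> bool:
    """Rate-limit auto-replies; records the send time when it returns True."""
    now = time.time() if now is None else now
    key = recipient.lower()
    last = _last_sent.get(key)
    if last is not None and now - last < MIN_INTERVAL_SECONDS:
        return False
    _last_sent[key] = now
    return True


def build_auto_reply(from_addr, to_addr, subject, parent_id, body):
    """Build an auto-reply that other autoresponders can recognise and ignore."""
    reply = EmailMessage()
    reply["From"] = from_addr
    reply["To"] = to_addr
    reply["Subject"] = subject
    reply["Auto-Submitted"] = "auto-replied"  # the RFC 3834 marker
    reply["In-Reply-To"] = parent_id          # ties the reply to the inbound message
    reply["References"] = parent_id
    reply.set_content(body)
    return reply
```

A caller would first run the inbound checks, then call may_reply for the sender, and only build and send the message if both pass.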
Notes on other answers and points
I think ignoring Precedence: list messages will cause false positives, at least for my app's configuration
I believe the OP's "auto-submit" rule is a typo and the official header is Auto-Submitted
References
RFC3834
This SO question about Precedence header has several good answers
Wikipedia Email Loop Article
desk.com article
Comments welcome and I'll update this answer as this is a good question and I'd like to see an authoritative answer created.
Update 2014-05-22
To determine whether an inbound message is an "out-of-office" or other automatic reply, we use this procedure:
First, check whether the "In-Reply-To" header is present. If not, treat the message as an auto-reply.
Otherwise, check whether any of these headers is present:
X-Auto-Response-Suppress (any value)
Precedence (value contains bulk, or junk or list)
X-Webmin-Autoreply (value 1)
X-Autogenerated (value Reply)
X-AutoReply (value YES)
Include a phrase like "This is an automatically-generated response" in the body somewhere. If your message body is HTML (not plain text) you can use a style to make it not visible.
Check for this phrase before responding. If it exists, odds are good it's an automated response.

Forwarded email detection

Is there any way to detect (using RFC 2822 headers) that an email is a forwarded email?
There are two things that are normally referred to as "forwarding".
When you set up automatic account-level forwarding to another email address, your mail system will usually introduce an extra header to enable it to detect and break mail loops. Unfortunately, the name of this header has never been standardized. Some use Delivered-To, some use X-Loop, some use X-Original-To, some use an X-header proprietary to their mail software. But there's no single header field that's present in all cases.
When you manually forward a message by clicking the "Forward" button in your mailer and entering a recipient email address and some descriptive text, a new message with a new Message-ID header is generated. The set of headers on this message will be indistinguishable from a normal reply -- In-Reply-To and References are set in exactly the same way. The only difference is that the Subject header will usually start with "Fwd:" or end with "(fwd)". ("Usually" because some clients format it as "[Fwd: <original subject>]" with square brackets around the new subject, some clients localize the prefix Fwd: into their own language, and some users manually edit the Subject before hitting "send".)
So there are good hints that a message is forwarded, but no hard and fast rules.
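The Subject conventions mentioned above can be checked with a regular expression, as in this Python sketch. It covers "Fwd:", "[Fwd: ...]" and a trailing "(fwd)"; the "FW:" variant is an addition not mentioned in this answer, included because some clients use it. Localized prefixes and hand-edited subjects will slip through, so treat the result as a hint only. The name looks_forwarded is made up for the example.

```python
import re

# Leading "Fwd:", "Fw:" or "[Fwd:" (any case), or a trailing "(fwd)".
FORWARD_SUBJECT = re.compile(r"^\s*\[?\s*fwd?\s*:|\(fwd\)\s*$", re.IGNORECASE)


def looks_forwarded(subject: str) -> bool:
    """Guess from the Subject line whether a message was manually forwarded."""
    return FORWARD_SUBJECT.search(subject) is not None


samples = ["Fwd: hello", "[Fwd: hello]", "hello (fwd)", "Re: hello"]
results = [looks_forwarded(s) for s in samples]
```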
Reading the spec, CTRL+F for "forward" gives the following header fields:
resent-date = "Resent-Date:" date-time CRLF
resent-from = "Resent-From:" mailbox-list CRLF
resent-sender = "Resent-Sender:" mailbox CRLF
resent-to = "Resent-To:" address-list CRLF
resent-cc = "Resent-Cc:" address-list CRLF
resent-bcc = "Resent-Bcc:" (address-list / [CFWS]) CRLF
resent-msg-id = "Resent-Message-ID:" msg-id CRLF
I'm not sure whether the major mail software uses these though.
EDIT
Read the spec a little too quickly, there is also this note:
Note: Reintroducing a message into the transport system and using
resent fields is a different operation from "forwarding".
"Forwarding" has two meanings: One sense of forwarding is that a mail
reading program can be told by a user to forward a copy of a message
to another person, making the forwarded message the body of the new
message. A forwarded message in this sense does not appear to have
come from the original sender, but is an entirely new message from
the forwarder of the message. On the other hand, forwarding is also
used to mean when a mail transport program gets a message and
forwards it on to a different destination for final delivery. Resent
header fields are not intended for use with either type of
forwarding.
There are no other notices of "forwarding", so there are no header fields that you can use to detect the forward, except for the subject = "Fwd: <msg>" convention.