When does the Subject header in a MIME message get encoded? - encoding

I am trying to understand MIME encoding. I have two instances of a message with the following Subject headers:
1) Reply to Comment in Is my stone dead? Constantly quick-flashing white side-LED =?= in ICS Webtop Update = useless lapdock?
2) Reply to Comment in Is? white side-LED =?= in ICS Webtop Update = useless lapdock
I see different behavior in the two cases: the former gets encoded, while the latter does not. Having read through some of the MIME documentation, I understand the theory that clients may encode a header when it contains characters outside 7-bit ASCII. But why do the two messages above behave differently? Is message length a factor too?
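To compare the two subjects, it may help to look at the properties that could plausibly influence a client's decision. The sketch below is only a diagnostic and not any particular client's algorithm. It checks whether the value is pure ASCII, how long the full header line is relative to the 78-character line length recommended by RFC 5322, and whether the value contains the "=?" or "?=" sequences that delimit RFC 2047 encoded-words. The function name describe_subject and the choice of checks are assumptions made for illustration; which rule the sending client actually applies is not known from the question.

```python
def describe_subject(subject, name="Subject"):
    """Report properties of a header value that could plausibly affect encoding."""
    line = f"{name}: {subject}"
    return {
        # True when every character is 7-bit ASCII
        "ascii_only": subject.isascii(),
        # Length of the unfolded header line, including the field name
        "line_length": len(line),
        # RFC 5322 recommends keeping lines to 78 characters or fewer
        "over_78": len(line) > 78,
        # RFC 2047 encoded-words begin with "=?" and end with "?="
        "has_encoded_word_marker": "=?" in subject or "?=" in subject,
    }


first = ("Reply to Comment in Is my stone dead? Constantly quick-flashing "
         "white side-LED =?= in ICS Webtop Update = useless lapdock?")
second = ("Reply to Comment in Is? white side-LED =?= in ICS Webtop Update "
          "= useless lapdock")

for subject in (first, second):
    print(describe_subject(subject))
```

Both values are pure ASCII and both contain the "=?" marker, so of the properties checked here only the line length differs between them.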

Related

MIME multipart emails: how do I persuade Microsoft clients to show two HTML (or even plain) parts?

Executive summary: I can construct a multipart message which contains HTML in two separate parts, and it displays correctly in multiple MUAs. However, outlook.com insists on putting any additional HTML after the first HTML part in a downloadable attachment, instead of displaying it. It also does this for plaintext parts.
In detail: I need to add a signature to an email message, where the structure of the original message is, in general, unknown. I do this by wrapping the original message in a multipart/mixed, and then adding a new multipart/alternative which contains text and html versions of the required signature.
If the original message was itself a multipart/alternative, then the new message now looks like:
multipart/mixed
    multipart/alternative [the original message]
        text/plain
        text/html
    multipart/alternative [the appended signature]
        text/plain [plaintext signature]
        text/html [html signature]
This displays well in various clients (Thunderbird, and webmail from gmail/Yahoo/AOL/gmx), showing the original message with the appended signature. However, it doesn't work for MS clients (I've only tried outlook.com). The two alternative signatures are presented to the user as attachments, and not inline, so the user only sees download boxes.
To get around this, I've historically done this for Microsoft:
multipart/mixed
    multipart/alternative [the original message]
        text/plain
        text/html
    text/html [html-only signature]
This worked for several years for Microsoft, but has now stopped working - the signature is again shown as an attachment.
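For reference, the two structures above can be reproduced with Python's standard email package. This is a minimal sketch, not the code actually used here: the helper names make_alternative, wrap_with_signature, and structure are invented for illustration, the sample bodies are placeholders, and a real original message would also carry top-level headers (From, To, Subject) that would need to move to the outer part. It only builds the MIME tree and says nothing about how outlook.com renders it.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def make_alternative(plain, html):
    """Build a multipart/alternative with a text/plain and a text/html part."""
    alt = MIMEMultipart("alternative")
    alt.attach(MIMEText(plain, "plain", "utf-8"))
    alt.attach(MIMEText(html, "html", "utf-8"))
    return alt


def wrap_with_signature(original, sig_plain, sig_html, html_only=False):
    """Wrap the original message in multipart/mixed and append a signature."""
    outer = MIMEMultipart("mixed")
    outer.attach(original)
    if html_only:
        # Second structure: a bare text/html signature
        outer.attach(MIMEText(sig_html, "html", "utf-8"))
    else:
        # First structure: plain and HTML signature as alternatives
        outer.attach(make_alternative(sig_plain, sig_html))
    return outer


def structure(part, depth=0):
    """Return the content types of a message tree as indented lines."""
    lines = ["    " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(structure(child, depth + 1))
    return lines


original = make_alternative("Hello", "<p>Hello</p>")
wrapped = wrap_with_signature(original, "Signature", "<p>Signature</p>")
print("\n".join(structure(wrapped)))
```

Printing the structure of a generated message this way is a quick check that what is sent matches the intended tree before testing it in a client.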
I've spent some hours experimenting with this, and can't find any way to get outlook.com to show two different HTML (or even plain) text parts in the same message. The second one always appears as an attachment. Some of the things I've tried are:
Replacing the second multipart/alternative above with another multipart/mixed, which encloses the multipart/alternative signature
Forcing Content-Disposition: inline for the signature: this never works, and MS appears to ignore Content-Disposition
Replacing the outer multipart/mixed with multipart/related, with type=multipart/alternative
Any ideas on how I can get MS clients to actually show the signature, short of parsing the internals of the original message and re-writing it?

javax.mail.Part and writeTo, unable to obtain the same "eml" file as the original one

My application parses many messages with JavaMail 1.5.6: it listens for incoming messages and then stores some information about them.
Almost all messages contain a digital signature, so my application also needs to retrieve the full eml, that is, the raw file representing the email, so that application users can always prove the validity of these messages.
So, once I have a javax.mail.Message, I have to produce its eml, which I do like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
m.writeTo(baos);
this.originalMessage = baos.toString(StandardCharsets.UTF_8.name());
This approach generally works, but I had some multipart messages with a part like the following:
This is a multi-part message in MIME format.
--------------55D0DAEBFD4BF19F87D16E72
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
In allegato si notifica ai sensi e per gli effetti dell'art. 11 R.D.
1611/1993, al messaggio PEC, oltre alla Relata di Notifica e
contestuale attestazione di conformità,
--------------55D0DAEBFD4BF19F87D16E72
The word "conformità" is not properly converted in the resulting string: it becomes "conformit�". Opening such an eml with, for example, MS Outlook results in an invalid digital signature, so the message appears corrupted and different from the original.
Do you have any idea? Thank you very much.
The raw message is not a UTF-8 encoded string, nor is an "eml" file a UTF-8 encoded file. They are both byte streams, and your digital signature should work on byte streams.
In your particular example, the content of the message part is encoded using the iso-8859-15 charset, not UTF-8.
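The effect can be reproduced in a few lines. The sketch below is written in Python rather than Java purely as an illustration, and it assumes the raw part contains the quoted line encoded as iso-8859-15. Decoding those bytes as UTF-8 with replacement, which is comparable to what baos.toString("UTF-8") does with malformed input, turns the byte for "à" into U+FFFD, so the bytes written back out no longer match the signed original.

```python
import hashlib

# Bytes as they would appear in the raw message part (charset=iso-8859-15).
# "\u00e0" is the letter a with grave accent from the word "conformita'".
raw = "contestuale attestazione di conformit\u00e0,".encode("iso-8859-15")

# Treating the raw bytes as UTF-8 text: 0xE0 is not valid UTF-8 here,
# so it is replaced by U+FFFD (the replacement character).
as_text = raw.decode("utf-8", errors="replace")

# Writing that text back out as UTF-8 yields different bytes.
round_tripped = as_text.encode("utf-8")

print(as_text)
print(hashlib.sha256(raw).hexdigest() == hashlib.sha256(round_tripped).hexdigest())  # False
```

In the Java code from the question, the corresponding approach would be to keep the byte[] from baos.toByteArray() and store or write those bytes unchanged, instead of converting them to a String.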

Zend mail headers issue - malformed and 'content preview'

I am using zend-mail (updated very recently). I am using IMAP storage to fetch a list of messages, and an inordinate number (more than half) of the messages report a malformed header.
I have reviewed the bug described at: ZendMail - error in headers but I think I have a different problem. Unlike that error, my failure seems to be occurring around a 'content preview' line I receive in many messages.
I've added the failing line text to the error statement:
2018-01-13T11:44:46-05:00 ERR (3): Error reading message 19 - Malformed header detected Content preview: Pacific Operational Science & Technology Conference - POST
2018-01-13T11:44:46-05:00 ERR (3): #0 /var/www/book2/vendor/zendframework/zend-mime/src/Decode.php(149): Zend\Mail\Headers::fromString('Return-Path: <A...', '\r\n')
#1 /var/www/book2/vendor/zendframework/zend-mail/src/Storage/Part.php(112): Zend\Mime\Decode::splitMessage('Return-Path: <A...', 'Return-Path: <A...', '')
The source code isn't much to look at. The body of the email follows the code snippet.
use Zend\Mail\Storage\Imap;
// Connect to the mailbox and fetch message number $mP
$mP = 1;
$mailServer = new Imap(array("host" => "someHost","user" => "someAccount","password" => "somePassword"));
$eMessage = $mailServer->getMessage($mP);
The text from the email follows:
message has been attached to this so you can view it or label
similar future email. If you have any questions, see
root\#localhost for details.
Content preview: =============================================================================
Today's topic summary =============================================================================
Group: canvas-lms-users#googlegroups.com Url: https://groups.google.com/forum/ utm_source=digest&utm_medium=email#!forum/canvas-lms-users/topics
To me, it appears that this issue has more to do with a number of blank lines being interpreted as the end of the header, or with something involving the 'content preview' line. I think the lines in question have been added by spam detection software. When there is no 'content preview' line, the email headers process fine.
Any help?
I believe this is a bug in SpamAssassin. The apparently empty line above the Content preview: actually contains one space. According to RFC 5322 section 3.2.2, this is a MUST NOT, presumably because there is buggy software out there (and I have seen some) that treats such a line as the separator between the headers and the body of the message (the correct separator is a blank line with nothing in it).
So SpamAssassin is producing emails that do not comply with the established Internet standards, and that is a big no-no.
I would be interested to hear of other examples caused by this.
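To check whether a given raw message has this problem, one can scan the header block for lines that contain only spaces or tabs. The following is a minimal sketch in Python, not part of zend-mail or SpamAssassin. The function name and the sample message are hypothetical, and the sample only imitates the shape of the report described above.

```python
def whitespace_only_header_lines(raw):
    """Return 1-based numbers of header lines that contain only spaces or tabs."""
    hits = []
    for number, line in enumerate(raw.split("\r\n"), start=1):
        if line == "":
            # A truly empty line is the real header/body separator
            break
        if line.strip(" \t") == "":
            # Looks empty, but still contains whitespace
            hits.append(number)
    return hits


# Hypothetical message: line 3 holds a single space instead of being empty
sample = (
    "Subject: test\r\n"
    "X-Spam-Report: see root for details.\r\n"
    " \r\n"
    "\tContent preview: Pacific Operational Science\r\n"
    "\r\n"
    "Body text\r\n"
)
print(whitespace_only_header_lines(sample))  # [3]
```

Any line number reported here marks a place where a strict parser may reject the header and a lenient one may end the header block too early.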

"8bit/binary encoded messages are not valid Internet messages"?

8bit and binary are valid values for the Content-Transfer-Encoding header (here is a nice summary on SO).
However, while trying to figure out which one was the most suitable for my needs, I encountered the following notices:
Binary encoded messages are not valid Internet messages.
and
Because not all Message Transfer Agents (MTAs) can handle 8bit data, the 8bit encoding is not a valid encoding mechanism for Internet mail.
Digging a bit, I found out that these warnings likely originate from Microsoft documentation.
What does it actually mean? Should one avoid these values?
NB : It is not clear to me what the quoted "Internet messages" term specifically refers to. For my purposes, I am concerned only with multipart emails.

Handling diacritics in SIP headers

Following the SIMPLE specification of OMA, when sending a SIP INVITE for chat we can use a header named Subject.
Typically, this header contains the first message sent by a user to his contact.
My question is: this message can contain diacritics, so how should I encode them? Is there a standard definition on how to do this?
You should encode them as UTF-8 as specified in the SIP RFC. There are a few SIP headers where UTF-8 is not allowed and US-ASCII with escaping rules is mandated, but the Subject header is not one of those.
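As a minimal sketch of what that means on the wire, the snippet below builds the bytes of a Subject header line by encoding the text as UTF-8. The function name sip_subject_header is made up for illustration, and the CR/LF check is an added precaution rather than something taken from the answer. A real SIP stack would normally do this encoding for you.

```python
def sip_subject_header(text):
    """Return the bytes of a SIP Subject header line with a UTF-8 encoded value."""
    # Reject line breaks so the value cannot start a new header line
    if "\r" in text or "\n" in text:
        raise ValueError("header value must not contain CR or LF")
    return b"Subject: " + text.encode("utf-8") + b"\r\n"


print(sip_subject_header("Caf\u00e9?"))  # b'Subject: Caf\xc3\xa9?\r\n'
```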