ColdFusion cfmail special characters in subject line - unicode

Special characters in the subject line of the mail are converted into question marks or boxes.
I tried wrapping the dynamic subject string in URLEncodedFormat, but that did not help.
<cfset strSubject= URLEncodedFormat(s)>
<cfmail
from="xxxxx@xx.com"
to="yyyyyyy@yyy.com"
subject="#strSubject#"
type="html"
>
#testText#
</cfmail>

Assuming the special characters are Unicode characters, you have to encode the string as Base64 and use that in the subject line, like this:
<cfset strSubject="Demande d’échantillons supplémentaires">
<cfset strSubject=ToBase64(strSubject, "utf-8")>
<cfmail from="test@test.com" to="test@test.com" subject="=?utf-8?B?#strSubject#?=" type="html">
#testText#
</cfmail>
The subject line must be in the format =?<charset>?<encoding>?<encoded text>?=
The ? and = are required.
MIME - Encoded Word
"charset" may be any character set registered with IANA. Typically
it would be the same charset as the message body.
"encoding" can be either "Q" denoting Q-encoding that is similar
to the quoted-printable encoding, or "B" denoting base64 encoding.
"encoded text" is the Q-encoded or base64-encoded text.

Also: add charset="utf-8" to the cfmail tag. If you are using utf-8 in the subject, you will probably also use it in the body.
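The encoded-word format described above is not specific to ColdFusion. As a minimal sketch in Python (the helper name encode_subject_b64 is hypothetical, not part of any library), the same header value can be built by hand and checked by decoding it again with the standard library:

```python
import base64
from email.header import decode_header, make_header


def encode_subject_b64(subject: str, charset: str = "utf-8") -> str:
    """Wrap the subject in a single RFC 2047 'B' (base64) encoded-word."""
    payload = base64.b64encode(subject.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


subject = "Demande d’échantillons supplémentaires"
encoded = encode_subject_b64(subject)

# Decode the header value again to confirm the round trip.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

Note that RFC 2047 limits each encoded-word to 75 characters, so this single-word sketch only suits short subjects; longer ones have to be split into several encoded-words.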

Related

MailMessage subject (with non-Ascii characters) not encoded when SMTP DeliveryFormat is International

When SMTP Delivery format is not set to International, it encodes the Subject line correctly.
Sample:
Subject: =?utf-8?B?QmVzdGVsbGVpbmdhbmdzYmVzdMOkdGlndW5nIEJlc3RlbGxl?=
=?utf-8?B?aW5nYW5nc2Jlc3TDpHRpZ3VuZw==?=
However, setting the DeliveryFormat to International skips the encoding, and non-ASCII characters are displayed as '?' or 'ä'.
We use 'International' to accept email addresses with accented characters.
Is there a workaround for this without having to encode the subject line manually?
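For reference, the correctly encoded sample above consists of two base64 encoded-words on a folded header line. A small Python sketch (used here only to inspect the header, not as a fix for MailMessage) shows how a decoder joins them back into one subject:

```python
from email.header import decode_header, make_header

# The folded Subject value from the sample: two encoded-words,
# separated by a line break plus leading whitespace.
raw_subject = (
    "=?utf-8?B?QmVzdGVsbGVpbmdhbmdzYmVzdMOkdGlndW5nIEJlc3RlbGxl?=\r\n"
    " =?utf-8?B?aW5nYW5nc2Jlc3TDpHRpZ3VuZw==?="
)

# Whitespace between adjacent encoded-words is dropped when decoding,
# so the two pieces are concatenated without an extra space.
subject_text = str(make_header(decode_header(raw_subject)))
print(subject_text)
```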

How to escape a full email address for SMTP in the headers when the email address contains non-ascii chars

This is about sending emails with non-ASCII characters in the email address.
When I send the RCPT TO command to the SMTP server, I know that I need to use punycode there.
But what about the To: and From: headers? I know that if the display name (the user-friendly part) contains a non-ASCII character, I can use the standard header encoding that I also use for the subject. But this encoding applies only to the display name.
But what if the email address itself contains a non-ASCII character? How must the To header be formatted?
So how to encode "Tüst" ?
This is the encoding as far as I know.
"=?iso-8859-1?Q?T=FCst?="<tüst@domain.de>
But what about the email address?
In fact, I don't understand the RFCs. I tried hard but failed.
The answer is: UTF-8 is the correct way to encode the header.
After some more research I found the answer hidden inside this article:
https://en.wikipedia.org/wiki/International_email
Although the traditional format for email header section allows
non-ASCII characters to be included in the value portion of some of
the header fields using MIME-encoded words (e.g. in display names or
in a Subject header field), MIME-encoding must not be used to encode
other information in a header, such as an email address, or header
fields like Message-ID or Received. Moreover, the MIME-encoding
requires extra processing of the header to convert the data to and
from its MIME-encoded word representation, and harms readability of a
header section.
The 2012 standards RFC 6532 and RFC 6531 allow the inclusion of
Unicode characters in a header content using UTF-8 encoding, and their
transmission via SMTP - but in practice support is only slowly rolling
out.[5]
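As an illustration of the RFC 6532 approach quoted above, here is a minimal Python sketch using the standard library's SMTPUTF8 policy, which serializes headers as raw UTF-8 instead of MIME encoded-words. The addresses and subject are placeholders, and this only covers building the message:

```python
from email import policy
from email.message import EmailMessage

# policy.SMTPUTF8 writes non-ASCII header content as UTF-8 bytes
# rather than wrapping it in RFC 2047 encoded-words.
msg = EmailMessage(policy=policy.SMTPUTF8)
msg["From"] = "sender@example.com"
msg["To"] = "Tüst <tüst@domain.de>"
msg["Subject"] = "Grüße"
msg.set_content("Hallo")

raw = msg.as_bytes()

# Pick out the serialized To: header line for inspection.
to_line = next(line for line in raw.splitlines() if line.startswith(b"To:"))
print(to_line.decode("utf-8"))
```

Actual delivery additionally requires a server that advertises the SMTPUTF8 extension (RFC 6531), which, as the quoted text says, is not yet universal.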

What is the best way to send the original email message quoted in a response?

Should I prepend all lines with '> '? Is that sufficient? Will it be accepted and understood by all major email clients?
In this case, will an original.replace(/\n/g, '\n> ') regex replacement do what I want with the message?
What about the HTML version of the email? Should I use one big <blockquote>? Will just prepending a <blockquote> and appending a </blockquote> suffice?
Should I, like Gmail and others, prepend a line saying something like "someone <address@example.com> wrote at some time:"?
For plain text, by which I mean a message sent with these headers:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
you just need ">" to quote the previous message (one per line).
For the HTML version, it depends on the client you're rendering in.
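A rough sketch of the plain-text quoting in Python follows. The function name and the attribution wording are hypothetical. Note that the regex from the question only inserts "> " after each newline, so it would leave the first line unquoted; prefixing every line avoids that:

```python
from typing import Optional


def quote_plain_text(original: str, attribution: Optional[str] = None) -> str:
    """Prefix every line of the original message with '> '."""
    # rstrip() turns the prefix on an empty line into a bare '>'
    # (it also drops trailing whitespace on other lines).
    quoted = [("> " + line).rstrip() for line in original.splitlines()]
    if attribution:
        quoted.insert(0, attribution)
    return "\n".join(quoted)


reply_quote = quote_plain_text(
    "Hi\n\nBye",
    attribution="someone <address@example.com> wrote:",
)
print(reply_quote)
```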

Can an email header have different character encoding than the body of the email?

Is an email with different character encodings for its header and body valid?
The use case: while processing an email, should I check the character encoding of its header separately, or will checking that of its body be sufficient?
Can someone guide me as to how to figure this out?
Thanks in advance!
Email headers should use the ASCII charset. If you want header fields to carry a different encoding, you need to use the encoded-word syntax: http://en.wikipedia.org/wiki/MIME#Encoded-Word
The email body can be sent directly in a different encoding only if the mail servers that transfer it have 8BITMIME enabled (nowadays every mail server should have it enabled, but it's not guaranteed). Otherwise you need to apply a transfer encoding to the body (quoted-printable or base64).
The charset can be different in each case: every encoded word can use a different charset, and every mail part can use a different charset or even a different transfer encoding.
For example you can have:
Subject: =?UTF-8?Q?Zg=C5=82oszenie?= //header value in UTF-8 encoded with quoted printable
and the body encoded:
Content-Type: text/plain; charset="iso-8859-2"
Content-Transfer-Encoding: base64
WmG/87PmIEfqtmyxIEphvPE=
different charsets, different transfer encodings in the same email, no problem.
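To make the point concrete, this short Python sketch decodes the two pieces from the example independently, each with its own charset and transfer encoding. It handles the values in isolation and is not a full message parser:

```python
import base64
from email.header import decode_header, make_header

# Header: UTF-8 text carried in a Q-encoded word.
subject = str(make_header(decode_header("=?UTF-8?Q?Zg=C5=82oszenie?=")))

# Body: ISO-8859-2 text carried in base64, as declared by the
# Content-Type and Content-Transfer-Encoding headers of that part.
body = base64.b64decode("WmG/87PmIEfqtmyxIEphvPE=").decode("iso-8859-2")

print(subject)
print(body)
```

When processing mail, this means the charset has to be determined per encoded-word and per body part rather than once per message.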
From experience I can tell you that such mails are very common. Even worse, you can get an email that states one charset in Content-Type header and another charset in html body meta tag:
Content-Type: text/html; charset="iso-8859-2"
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
It's up to you to guess the actual charset used. Probably it's the one in meta tag.
Assume nothing. Expect everything. Take no prisoners. This is Sparta.

How can I generate an email with a subject line with international characters in it?

The content encoding headers define how the body of the message is to be interpreted, but the subject is a header, and isn't subject (ha ha) to the declaration of the content type/encoding headers.
Is there a way to make international character set subject lines?
https://www.rfc-editor.org/rfc/rfc2047 defines encoding of non-ascii characters in headers.
"=?" charset "?" encoding "?" encoded-text "?="
Yes. RFC 2047 specifies how.
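Many mail libraries apply RFC 2047 automatically. As one example, and only as a sketch with placeholder addresses, Python's standard email package encodes a non-ASCII subject when the message is serialized and decodes it again on parsing:

```python
from email import message_from_string, policy
from email.message import EmailMessage

original_subject = "Zażółć gęślą jaźń"

msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = original_subject
msg.set_content("Body text")

# Serializing with the default policy turns the subject into
# ASCII-only RFC 2047 encoded-word form.
wire = msg.as_string()
subject_line = next(
    line for line in wire.splitlines() if line.startswith("Subject:")
)
print(subject_line)

# Parsing the serialized message restores the original text.
parsed = message_from_string(wire, policy=policy.default)
print(parsed["Subject"])
```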