"8bit/binary encoded messages are not valid Internet messages"? - email

8bit and binary are valid values for the Content-Transfer-Encoding header (here is a nice summary on SO).
However, trying to figure out which one was the most suitable for my needs, I encountered the following notices :
Binary encoded messages are not valid Internet messages.
and
Because not all Message Transfer Agents (MTAs) can handle 8bit data, the 8bit encoding is not a valid encoding mechanism for Internet mail.
Digging a bit I found out these warnings likely origin from Microsoft documentation.
What does it actually means ? Should one avoid these values ?
NB : It is not clear to me what the quoted "Internet messages" term specifically refers to. For my purposes, I am concerned only with multipart emails.

Related

What to sign for DTLSv1.0 Certificate Verify Message with RSA

I'm using DTLS v1.0 to communicate with a server. I'm having some trouble figuring out exactly what to do to generate the certificate verify message. I've been reading the RFCs (DTLSv1.0 and TLS1.1, which DTLS v1.0 is based on) but they're somewhat non-specific when it comes to this particular message.
I see the structure of the message is as below, and I know the signature type is RSA.
struct {
Signature signature;
} CertificateVerify;
The Signature type is defined in 7.4.3.
CertificateVerify.signature.md5_hash
MD5(handshake_messages);
CertificateVerify.signature.sha_hash
SHA(handshake_messages);
Based on what I've read it seems to be a concatenation of the sha1 hash and the md5 hash of all the previous messages sent and received (up to and excluding this one) and then RSA signed.
The piece that's got me a bit confused though is how to assemble the messages to hash them.
Does it use each fragment piece or does it use the re-assembled messages? Also, what parts of the messages does it use?
The RFC for TLS 1.1 says
starting at client hello up to but not including this message,
including the type and length fields of the handshake messages
but what about the DTLS specific parts like message_seq, fragment_offset, and fragment_length, do I include them?
UPDATE:
I have tried doing as the RFC for DTLS 1.2 shows (meaning keeping the messages fragmented, using all the handshake fields including DTLS specific fields, and not including the initial Client Hello or Hello Verify Request messages) but I am still receiving "Bad Signature". I do believe I'm signing properly, so it's my belief that I'm concatenating the data improperly to be signed.
For DTLS 1.2 it is defined. And reading RFC 4347, my impression is, RFC 6347 doesn't differ, it clarifies the calculations.
RFC 6347, 4.2.6. CertificateVerify and Finished Messages
RFC 4347, 4.2.6. Finished Messages

HTTP multipart/form-data. What happends when binary data has no string representation?

I want to write an HTTP implementation.
I've been looking around for a few days about sending files over HTTP with Content-Type: multipart/form-data, and I'm really interested about how browsers (or any HTTP client) creates that kind of request.
I already took a look at a lots of questions about it here at stackoverflow like:
How does HTTP file upload work?
What does enctype='multipart/form-data' mean?
I dig into RFCs 2616 (and newer versions), 2046, etc. But I didn't find a clear answer (obviously I did not get the idea behind it).At most articles and answers I found this piece of request string, that's is simple to me to interpret, all these things are documented at RFCs...
POST /upload?upload_progress_id=12344 HTTP/1.1
Host: localhost:3000
Content-Length: 1325
Origin: http://localhost:3000
... other headers ...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryePkpFF7tjBAqx29L
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="MAX_FILE_SIZE"
100000
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="uploadedfile"; filename="hello.o"
Content-Type: application/x-object
... contents of file goes here ...
------WebKitFormBoundaryePkpFF7tjBAqx29L--
...and it would be simple to implement an HTTP client to construct a piece of string that way in any language.The problem becomes at ... contents of file goes here ..., there's little information about what "contents of file" is. I know it's binary data with a certain type and encoding, but It's difficult to think out of string data, how I would add a piece of binary data that has no string representation inside a string.
I would like to see examples of low level implementations of HTTP protocol with any language. And maybe in depth explanations about binary data transfer over HTTP, how client creates requests and how server read/parse it. PD. I know this question my look a duplicate but most of the answers are not focused on explaining binary data transfer (like media).
You should not try to handle strings on this part of the body, you should send binary data, see it as reading bytes from the resource and sending theses bytes unaltered.
So especially no encoding applied, no utf-8, no base64, HTTP is not a protocol with an ascii7 restriction like smtp, where base64 encoding is applied to ensure only ascii7 characters are used.
There is, by definition, no string version of this data, and looking at raw HTTP transfer (with wireshark for example) you should see binary data, bytes, stuff.
This is why most HTTP servers uses C to manage HTTP, they parse the HTTP communication byte per byte (as the protocol headers are ascii 7 only, certainly not multibytes characters) and they can also read/write arbitrary
binary data for the body quite easily (or even using system calls like readfile to let the kernel manage the binary part).
Now, about examples.
When you use Content-Length and no multipart stuff the body is exactly (content-length) bytes long, so the client parsing your sent data will just read this number of bytes and will treat this whole raw data as the body content (which may have a mime type and and encoding information, but that's just informations for layers set on top of the HTTP protocol).
When you use Transfer-Encoding: chunked, the raw binary body is separated into pieces, each part is then prefixed by an hexadecimal number (the size of the chunk) and the end of line marker. With a final null marker at the end.
If we take the wikipedia example:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
E\r\n
in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
We could replace each ascii7 letter by any byte, even a byte that would have no ascii7 representation, Ill use a * character for each real body byte:
4\r\n
****\r\n
5\r\n
*****\r\n
E\r\n
**************\r\n
0\r\n
\r\n
All the other characters are part of the HTTP protocol (here a chunked body transmission). I could also use a \n representation of binary data, and send only the null byte for each byte of the body, that would be:
4\r\n
\0\0\0\0\0\r\n
5\r\n
\0\0\0\0\0\0\r\n
E\r\n
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\n
0\r\n
\r\n
That's just a representation, we could also use \xNN or \NN representations, in reality these are bytes, 8 bits (too lazy to write the 0/1 representation of this body :-) ).
If the text of the example, instead of being:
Wikipedia in\r\n
\r\n
chunks.
It could have been a more complex one, with multibytes characters (here a é in utf-8):
Wikipédia in\r\n
\r\n
chunks.
This é is in fact 11000011:10101001 in utf-8, two bytes: \xc3\xa9 in \xNN representation), instead of the simple 01100101 / \x65 / echaracter. The HTTP body is now (see that second chunk size is 6 and not 5):
4\r\n
Wiki\r\n
6\r\n
p\xc3\xa9dia\r\n
E\r\n
in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
But this is only valid if the source data was effectively in utf-8, could have been another encoding. By default, unless you have some specific configuration settings available in your web server where you enforce a conversion of the source document in a specific encoding, that's not really the job of the web server to convert the source document, you take what you have, and you maybe add an header to tell the client what encoding was defined on the source document.
Finally we have the multipart way of transmitting the body, like in your question, it's a lot like the chunked version, except here boundaries and intermediary headers are used, but for the binary data between these boundaries, headers, and line endings control characters it is the same rule, everything inside are just bytes...

PlayWS calculate the size of a http call without consuming the stream

I'm currently using the PlayWS http client which returns an Akka stream. From my understanding, I can consume the stream and turn it into a Byte[] to calculate the size. However, this also consumes the stream and I can't use it anymore. Anyway around this?
I think there are two different aspects related to the question.
You want to know the size of the server response in advance to prepare buffer. Unfortunately there is no guaranteed way to do this. HTTP 1.1 spec explicitly allows transfer mode when the server does not know the size of the response in advance via chunked transfer encoding. See also quote from 3.3.1. Transfer-Encoding:
A recipient MUST be able to parse the chunked transfer coding
(Section 4.1) because it plays a crucial role in framing messages
when the payload body size is not known in advance.
Section 3.3.3. Message Body Length specifies how length of a message body is defined and it besides the aforementioned chunked transfer encoding it also contains quite unhelpful
Otherwise, this is a response message without a declared message
body length, so the message body length is determined by the
number of octets received prior to the server closing the
connection.
This is added for backward compatibility and discouraged from usage but is still legally allowed.
Still in many real world scenarios you can use Content-Length header field that the server may return. However there is a catch here as well: if gzip Content-Encoding is used, then Content-Length will contain size of the compressed body.
To sum up: in general case you can't get the size of the message body in advance before you fully get the server response i.e. in terms of code perform a blocking call on the response. You may try to use Content-Length and it might or might not help in your specific case.
You already have a fully downloaded response (or you are OK with blocking on your StreamedResponse) and you want to process it by first getting the size and only then processing the actual data. In such case you may first use getBodyAsBytes method which returns IndexedSeq[Byte] and thus has size, and then convert it into a new Source using Source.single which is actually exactly what the default (i.e. non-streaming) implementation of getBodyAsSource does.

How do I decode a websocket packet?

I'm using Wireshark packet analyzer & when I filter for all "Websocket" packets I see what I am sending /receiving to the host. When I check individual packets mine always show as [MASKED], but you can 'Umask Payload' which shows the data in clear text that looks like this:
<IC sid="52ccc752-6080-4668-8f55-662020d83979" msqid="120l93l9l114l30l104"/>
However, if I 'Follow TCP stream & look at that same packet, the data shows up as encoded in some way like this:
....K#....../...y#..|...}...f...s...~...}...{G..r...kN.."G..z...r...'...'...z...d.
The problem is all Websocket packets I receive from the host come as encoded, it is NOT SSL & I can't figure out how to decode them, I have no idea what they are even encoded as (but yet my browser can decode it).
I assume that whatever method they are coming back to me as encoded data is the same method that my data is encoded when I use 'Follow TCP stream'.
Can someone please help me figure out how to decode the data the host is sending me? See host data below
~.^jVpZc9y4Ef4ryFQ5+yJpeB+JJJdmNPI6G++mrN249kkFkuChIQmG5Fgj//p0AyAJypzxyi5T6P76
QKPRuHz9cUu6IrlZuVYcx75rXXpGYFw6nhdcBqnrXnqeZVhGEtihH65Id7NKWEoPZb8iVfc/FDQt
owztMixN0yltozQNZ3V7ncZkbxrAXZE8vFkZK2g66msJchLIjyuoiWQmvvyApGUY+JsJKPGLkrIF
IHcFALVJNXtTWsl9adMDPlAtQ1AZME0XvoFsShDz5McVn0J6y2z5ceTHlB9pnEltheQVEllIXiGR
z7Ifz6Cz4c2h6XkDLTDUFlkOQYuk/5EUimTnIykUyc5HoeTJjlHVMgWPwifv++Yf6/XLy8uVadlp
Sbs8zml/FfMKAKA4Z2WzLuqEHa/yviqBCEZXJJXeUzC25c2rcIhAEM1LyzBt8jtvtp8+kUee9i+0
ZWTL2+aKkLuyJJ8R2pHPrGPtV5ZcgRIXNVLoF6vh62tpkToy9LIzexnxvRydWEY8lhGPZRxjGccY
IDBEezkMsZSLlRZLtmQQYhm8WCBvr2lAMhFVyDqPpKDmPy1Pi5KtSGaM4Xrlh/aFRV3Rs3Uj+VdN
3rw/QJ9u3v3xuPv8DhSsUw/+ocHtdeKRDNz0wF4GfjpesJrM+CQDx5ACHtFkHdG6Zq159dxkQLPO
jxFa8Ucl7hsl1l9Sss5518vRPa/Ovupe0r+i7qXnTzT5ytq+6Io6e5LiSybMtzacUzbK4ivDZFzo
tmm8UeL+NUeBAKNYsa5jdcbay5TR/tCymZ/rBAYxCbWsuP2ZlSUn7/787Y/Pv9592r27IB9/qsi7
T3+KFklbXpHu0DS87UnPaHVBICKkoq/kI8EeEEif9zI7UFsxU/UCzpGEI4bUjCUT8AsrwWlGek6e
eVGTQ3dBNFHyN5VwSQhwc3I4kA4DN5Ct4oL0OWvZ3yYS+IfTFI0moDt7P2Pl9KvkRYzVGgvI9U89
6YAq2ClvCc1YNyn+gnYm+bxIEsD2kHCMJPS1e080KO1/6sih+Z6W06ZhNbr1HatmL5ND9+g6yThP
wASt950KJ434oUcH2o6V6YT/lcMAcU4imlwQWifDwEjuXUW/gb6pMx9ayI+piYOeSIvIuBoZW34o
EwxMxOBv37N2EvrXoYOAcfg74T+Squg6ESDgVIc413kll8GbaB+E29DPkfI7LfdIkip4PWEfmYx8
ScENzUXax/nEmPzbvDKFWqcmUCxRuBxjqFy+O1WudWoBwiY5TD0Hlkmojz585KKkVVExRaGKYzV9
rGQtBRExF1nF+4LXa5iv6w55auZ6b/h9fqgiDXE7TAuh3ZfK/8uroj+h/CvyziqfEIvKH5sifiUP
kFyn3EfAefdHxNxCqCyUjDWvJ7R/+/btrO6BP9fsKM05j/en3EbeebdHxFy5pbR/weyj5J5nJ0y8
AOCshREwN4B1XSzBOe/5Cd0N8s4qnxBvtKuw/0yr6nR4csk9a0HHLMYfKgdLTo1sJphnDWiQuX5X
6n+gRXtKfYq8s9onxFx5MMSnaV7Jpmj7HEr2CSuRYp81NAMt2trmjLWwBpywEiP70Jw1omPeDPhg
5GSs4h9EKl6M05Cmd3V2UjNF3lndE2JxiH+pYf/TndC+F8yz6jXIYvH5Nz1k+Qn1JfLOap8Qc+We
ch7Wt1OuA+u84wNgOeiPL7ACnyptyDtf2kbEYjp+grX0hO4Kl9lzqkfAYkXYwJZhT/44legRsp9+
kOkz0NyKvRrOLg3vaHmqdir2+fKpgxZXRhxd2OjkcEok20Pb0+JU0GLJ/eGYv8UtDg6sCjWDulSe
7B4CniIA/Gh90GHLtvietafMIO+8hRHxJofVpuj3HPt6Mo17ZJ81MCGWty6HumOnNkadYJ6fJRNk
sXY8yOuI5eUHeeeXnxGxHJ0t5/uCkQ9Fe2qgY4E4n1ETZDiYpzaYckkWSlMZXpEYF6Z5YVoXpn1h
Oheme2F6citrCjH39jp39B2uOd0TfKBdI05AePRri07f5Xb8UCfrDBBXVWNr/RSCRZaV36NzOO0u
oB/hGJnjeW5BAKgLEls8NM7qmMIfWlhwvsf/UsR7OEVULOLJK4GT03eSe0AsCkIdGAQXhGAyL/Qn
hipWfYfuBHkB/yd9qWWY3+6WpeAr8JfM1LydzzAJx22zhl7nzu114ZK9J4cYciI6RBEOT+GpLCgg
C55N8jy7XLjES4VLPBAfmLw8G09Jz3COKnyyN2XaFKAzO7Cux3tGeQdVyAutZ3mn9jwIFv7t9d4y
yd4CU5sVNmxowHnstzSd3UcV/aGGxR02aqWwvj50a2is9RuPdZG4pm/aoed478vuxvw7rZp/Vv1N
gLZANep3FvR3YKApYdcGB+tM6e963jI00a0TBqW67N4XyQ3sI6/Q1Cce4TltIxU74l8TIxzfXncx
yfE67hg+bOytq250jw/iR97FGsdduLt/gNKbOwIpfuR1rHF0LN+59+WtrCaHkTxuLfwjj6LG0RY/
6kR6NIwwxGNsbgkLAVjwVehHZCHkNg/37m4rRwrlPA/DUfgzpKd7VrjSl/BO8BzZCsSlc2HP5KwZ
T3gd24YbYKn2dGTq6/1Lg1lrFjMqWnd3Gx/jSc0ZT9i7e9ia2x3ICd7O2W0eDHmTaxxT8aNudI+g
I8ToUmfGs/URo6K327t7f4clWvYofNg+7OQtrtYHV9dC/ZmcM/Mz0PuQzUYlE3LW1ts6d3JOajxh
3XZtx7bk5DaOgRVu7h5kAmstT/clc/UeZTLPdtBy1MwdPUuEL574kQ8NqMXzxDvCrH+JylbbNobX
B8gewxBaVLYa8iHCmPHMWcsask4g7YGHecaEL5bnBRtPPUCMkWCz/GRqHDwPdTJX7xETOr3dxtxt
5OOEcfTFj3yjgDEy7u8fQvlUoekUXj84MNb36nHiaO683b16o5gi8SzkAtsW9p5lPL3At/BRQ2Wy
729hjj2Hw9wUcrMxehY9cu3A2NjqseMY2hv/fiPfPMBP7z7EmRrN+ideQEavn2e1QDyLTBbMGW+W
u5HMQce7uzPkA4rWmus09QzZ23qP9o4ut1dRklmwn/V27+tREk8dR8f0/PBePnVM2RMLnYbpbjAL
4ll9iUM9l/bCszTxUsNTzyCCJ/y09JjFvs6LVQ3B/JFPJJoF0XfL9bydhwW+w6IuZklcMto+wYZB
7sRwqiiSOinhdIGFnZelhnIm2gDzJ1KaalBvTh/g4BBvcK+n2uA8bMC+0h529WAdVv5qOOPg5NJ5
UTlurXGm5QUubOqcjktRy3kFW98eHzFGP3BVmjGUYVyDMlqxp4ZmQr0iG4ocg9UM1D8V8ShiK95I
sAbCYA3nfA0brhqp0V5jBG8YSgUWArG5mQUPC8JEHaDuQKw15BQIjehoyEEa+suORa+hLE10QBkD
ShFwYau6rKF46lLHc1zeovIgB6VkqdKH69xIbqeTBNaCuOTdFDesB+KNVDWxY6xOol7rGNYFJTWS
wBfxOqCaniY2aA7HlFEE8LXkNBEPWyp5cAnGlCih85OkoRNHGxA13CQVdXPoR3kTdWbqIhMXbdHb
ST90WHcKF3I5FWZhxFV71mVc1IcAKghEJuGTk7hgi/Yggqs0vmNiWikKdLihXffC20QdsnBc9lmL
CTE87kmGr4ZhM9y7gGzLMogAawcaVrOqqIvRh1jNtKfuUCmKmmJP/GXUjYVO0MrxPharXUWPk/NY
4gRh0OwPpkbEvDoooqVU67UCC92I3A/9sUTyPqk6FE2+4N5D44x3N8Y0lk8R73teTQdV2EpHfHMA
4nCX5H6vXisXuE/R2XQ0bs8YCYRNnXVTW+wrh9EaqI7Ym92P7+wAEWcgZ+K18lJCHLzJaqbFMMeT
9CTwyOgcZdpLqEaew3SgFSwAN5wqxyL4bQHwiVHop4RU4vclc3A2Ge11srEIg0nPaJwPQNVc8usV
X0/J2NuO0a5ImI4UVwuPv3z89enu8+ffvpCvFKYgFJq2nXyc2LiG3l7H+R6yH/9PjGGw0LfTyLGN
0Ehdzw4DM6apZaXM8pMA/+NMfLNyXc+IAwcgIaNBEDhmStPYMSPT9izHSMA5QCUA8tMoCFxGHWpa
phn5lhVYNAlTljoueiItwy8ft7f/Bw==
Client to server data is XORed with a mask (included in the dataframe). Some people suggest this is in order to throw off bad caching mechanisms responding to new websocket requests with server messages from older sessions. The masking makes sure that even messages containing identical data will appear differently to applications that do not understand websockets.
Also note that there are many different size options for the headers themselves.
Refer to RFC 6455 Section 5 which defines the masking/unmasking process for payloads sent from the client to the server.
https://www.rfc-editor.org/rfc/rfc6455
If you find any freeware VBA code to do the job of forming packets let me know! :-)

When does subject header in Mime message gets encoded?

I am trying to understand the mime encoding. I have 2 instance of message with subject header :
1) Reply to Comment in Is my stone dead? Constantly quick-flashing white side-LED =?= in ICS Webtop Update = useless lapdock?
2)Reply to Comment in Is? white side-LED =?= in ICS Webtop Update = useless lapdock
I see different behavior in both the cases, while the former is getting encoded the later is not. Having read through some of Mime documentation, I understand the theory that when there isn't a 7bit non-ASCII character clients may encode the message. But why is the difference in above messages ? Is message length a factor too ?