SOAP response MTOM attachment can't be decrypted (AES algorithm) - soap

I'm working on a SOAP client and have a problem reading (and decrypting) a response attachment. The attachment is included in the response using the MTOM mechanism and encrypted with the AES128-CBC algorithm (the secret key is included in the response XML header).
Here is the basic structure of the response:
<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope>
.. the xml data that includes the secret key for the attachment
decryption using the AES algorithm.
</soapenv:Envelope>
--MIMEBoundaryurn_uuid_174A74CB7221A5AF451426570004765
Content-Type: application/octet-stream
Content-Transfer-Encoding: binary
Content-ID: <urn:uuid:174A74CB7221A5AF451426570004768#apache.org>
iQ�<]�+)B�ل�$O:���'�zT�F�x�����������}�t��݄��')#^��&�a�p}Q��¨גZ<G�%_"��|
Ps�<���'9��g](ǧ">�l��� ��XPrJ��jM�f�<$�)Q�*��
--MIMEBoundaryurn_uuid_174A74CB7221A5AF451426570004765--
The MTOM mechanism implies that the attachment is sent as a binary string (without Base64 encoding). As I understand it, this binary string is what must be decrypted with AES. Unfortunately, it has the wrong length for AES decryption: AES uses 16-byte blocks, so the ciphertext length must be a multiple of 16. It is not; in the example above the attachment length is 250.
Maybe I'm missing something, and some kind of transformation must be applied to the attachment binary string before decrypting it?
P.S. Part of the response XML body is encrypted using the same algorithm (AES128-CBC) but is sent as Base64 ciphertext, which must be decoded to get the binary string and then decrypted. That works fine: the decoded body ciphertext has the proper length, a multiple of 16, and can be decrypted without any problems.
Thank you in advance for any thoughts or ideas!

Old, but from the Oracle documentation, about halfway down the page:
Note: Streaming MTOM cannot be used in conjunction with message encryption.
http://docs.oracle.com/cd/E14571_01/web.1111/e13734/mtom.htm#WSADV141
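The length observation in the question can be checked before any decryption is attempted. The following is a minimal sketch; the class and method names are hypothetical and only illustrate the arithmetic (250 is not a whole number of 16-byte blocks, so a CBC decryption of those bytes cannot succeed as-is).

```java
// Hypothetical helper: sanity-checks a ciphertext length before AES-CBC decryption.
final class AesCbcLengthCheck {
    // AES always works on 16-byte blocks, whatever the key size.
    static final int BLOCK_SIZE = 16;

    // Number of bytes left over after the last complete block.
    static int trailingBytes(int ciphertextLength) {
        return ciphertextLength % BLOCK_SIZE;
    }

    // True when the length is a positive multiple of the block size.
    static boolean isWholeBlocks(int ciphertextLength) {
        return ciphertextLength > 0 && trailingBytes(ciphertextLength) == 0;
    }
}
```

For the attachment in the question, `AesCbcLengthCheck.trailingBytes(250)` leaves a remainder, which suggests the bytes being counted are not exactly the bytes that were encrypted.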

Related

Headers for REST API with optional Base64 encoding

We have a media file repository with which other services communicate over a REST API. For various reasons we want the users of the repository to be able to upload and download files over HTTP both directly (plaintext for text files and byte arrays for binary files) and using Base64 encoding. We want the fact that a file is uploaded (PUT, POST) or requested for download (GET) in Base64 encoding to be reflected in the headers of the HTTP request.
How do we reflect the fact that the content of the request or requested response is Base64 encoded in the HTTP header?
So far I'm tending towards appending ;base64 after the mime type in the Content-Type header, for example Content-Type: image/png;base64. Other options (X- header, Content-Encoding) are discussed in this related question but do not offer satisfactory resolution to our question.
You have to use the Content-Transfer-Encoding header.
It is defined in RFC 2045: https://www.rfc-editor.org/rfc/rfc2045#page-14.
It supports the value base64 among others; the full set is "7bit" / "8bit" / "binary" / "quoted-printable" / "base64" / ietf-token / x-token.
This header is designed specifically for your case, as a complement to the MIME type.
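As a rough sketch of what the answer suggests, the request below carries a Base64 body and announces it with Content-Transfer-Encoding. It assumes that the client and the repository have agreed to interpret that header this way. The class name, method name and any target URI passed in are hypothetical; the HTTP client API is the one in `java.net.http` (Java 11+).

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: builds (but does not send) an upload request with a Base64 body.
final class Base64Upload {
    static HttpRequest buildPut(URI target, String mimeType, byte[] fileBytes) {
        // Base64 output is plain ASCII, so it is safe to send as a string body.
        String encoded = Base64.getEncoder().encodeToString(fileBytes);
        return HttpRequest.newBuilder(target)
                // The MIME type still describes the decoded file...
                .header("Content-Type", mimeType)
                // ...and this header says how the body bytes are encoded.
                .header("Content-Transfer-Encoding", "base64")
                .PUT(HttpRequest.BodyPublishers.ofString(encoded, StandardCharsets.US_ASCII))
                .build();
    }
}
```

A server following the same convention would check the header and Base64-decode the body before storing it; requests without the header would be stored as raw bytes.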

javax.mail.Part and writeTo, unable to obtain the same "eml" file as the original one

My application parses many messages via JavaMail 1.5.6; it listens for incoming messages and then stores some information about them.
Almost all messages contain a digital signature, so my application also needs to retrieve the full eml, that is, the raw file representing the email, so that application users can always prove the validity of these messages.
So, once I have a javax.mail.Message, I have to produce its eml, which I do like this:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
m.writeTo(baos);
this.originalMessage = baos.toString(StandardCharsets.UTF_8.name());
This approach generally works, but I had some multipart messages with a part like the following:
This is a multi-part message in MIME format.
--------------55D0DAEBFD4BF19F87D16E72
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 8bit
In allegato si notifica ai sensi e per gli effetti dell'art. 11 R.D.
1611/1993, al messaggio PEC, oltre alla Relata di Notifica e
contestuale attestazione di conformità,
--------------55D0DAEBFD4BF19F87D16E72
The word "conformità" is not properly converted in the resulting string; it becomes "conformit�". Opening such an eml file with, for example, MS Outlook results in an invalid digital signature, so the message appears corrupted and different from the original.
Do you have any idea? Thank you very much.
The raw message is not a UTF-8 encoded string, nor is an "eml" file a UTF-8 encoded file. They are both byte streams, and your digital signature should work on byte streams.
In your particular example, the content of the message part is encoded using the iso-8859-15 charset, not UTF-8.
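The effect described above can be reproduced without JavaMail. The sketch below is only an illustration with hypothetical names: it takes the bytes an 8-bit part would carry for "conformità" and shows that passing them through a UTF-8 string changes them, while keeping them as bytes does not. ISO-8859-1 is used in place of ISO-8859-15 because it is always available in Java and encodes à as the same single byte 0xE0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo: why a raw message must stay a byte stream.
final class RawMessageBytes {
    // "conformità" as an 8-bit part would carry it: à is the single byte 0xE0.
    static byte[] sampleRawBytes() {
        return "conformit\u00e0".getBytes(StandardCharsets.ISO_8859_1);
    }

    // What happens when raw bytes are decoded as UTF-8 and written out again.
    static byte[] throughUtf8String(byte[] raw) {
        String asText = new String(raw, StandardCharsets.UTF_8); // invalid bytes become U+FFFD
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    // True only if the UTF-8 detour leaves every byte untouched.
    static boolean survivesUtf8RoundTrip(byte[] raw) {
        return Arrays.equals(raw, throughUtf8String(raw));
    }
}
```

In the question's code, the equivalent change would be to keep `baos.toByteArray()` (a standard `ByteArrayOutputStream` method) as the stored eml instead of calling `baos.toString(...)`, so that the signed bytes are never reinterpreted.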

HTTP multipart/form-data. What happens when binary data has no string representation?

I want to write an HTTP implementation.
I've spent a few days looking into sending files over HTTP with Content-Type: multipart/form-data, and I'm really interested in how browsers (or any HTTP client) create that kind of request.
I have already looked at a lot of questions about it here on Stack Overflow, like:
How does HTTP file upload work?
What does enctype='multipart/form-data' mean?
I dug into RFCs 2616 (and newer versions), 2046, etc., but I didn't find a clear answer (obviously I did not get the idea behind it). In most articles and answers I found this piece of request text, which is simple for me to interpret; all these things are documented in the RFCs...
POST /upload?upload_progress_id=12344 HTTP/1.1
Host: localhost:3000
Content-Length: 1325
Origin: http://localhost:3000
... other headers ...
Content-Type: multipart/form-data; boundary=----WebKitFormBoundaryePkpFF7tjBAqx29L
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="MAX_FILE_SIZE"
100000
------WebKitFormBoundaryePkpFF7tjBAqx29L
Content-Disposition: form-data; name="uploadedfile"; filename="hello.o"
Content-Type: application/x-object
... contents of file goes here ...
------WebKitFormBoundaryePkpFF7tjBAqx29L--
...and it would be simple to implement an HTTP client that constructs a string like that in any language. The problem arises at "... contents of file goes here ...": there is little information about what "contents of file" is. I know it's binary data with a certain type and encoding, but it's difficult to think outside of string data: how would I add a piece of binary data that has no string representation inside a string?
I would like to see examples of low-level implementations of the HTTP protocol in any language, and maybe in-depth explanations of binary data transfer over HTTP: how the client creates requests and how the server reads and parses them. P.S. I know this question may look like a duplicate, but most of the answers do not focus on explaining binary data transfer (like media).
You should not try to handle strings in this part of the body. You should send binary data: think of it as reading bytes from the resource and sending these bytes unaltered.
So, in particular, no encoding is applied: no UTF-8, no Base64. HTTP is not a protocol with a 7-bit ASCII restriction like SMTP, where Base64 encoding is applied to ensure only 7-bit ASCII characters are used.
There is, by definition, no string version of this data; looking at a raw HTTP transfer (with Wireshark, for example) you should see binary data, just bytes.
This is why most HTTP servers use C to handle HTTP: they parse the HTTP communication byte by byte (the protocol headers are 7-bit ASCII only, certainly not multibyte characters), and they can also read and write arbitrary binary data for the body quite easily (or even use system calls like sendfile to let the kernel manage the binary part).
Now, about examples.
When you use Content-Length and no multipart, the body is exactly (Content-Length) bytes long, so the receiver parsing your data just reads that number of bytes and treats the whole raw data as the body content (which may have a MIME type and encoding information, but that is just information for layers built on top of the HTTP protocol).
When you use Transfer-Encoding: chunked, the raw binary body is split into pieces; each piece is prefixed by a hexadecimal number (the size of the chunk) and an end-of-line marker, and followed by another end-of-line marker. A final zero-size chunk marks the end.
If we take the wikipedia example:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
E\r\n
 in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
We could replace each 7-bit ASCII letter with any byte, even a byte that has no 7-bit ASCII representation. I'll use a * character for each real body byte:
4\r\n
****\r\n
5\r\n
*****\r\n
E\r\n
**************\r\n
0\r\n
\r\n
All the other characters are part of the HTTP protocol (here a chunked body transmission). I could also use a \0 representation of binary data and send only null bytes for the body, which would be:
4\r\n
\0\0\0\0\r\n
5\r\n
\0\0\0\0\0\r\n
E\r\n
\0\0\0\0\0\0\0\0\0\0\0\0\0\0\r\n
0\r\n
\r\n
That's just a representation; we could also use \xNN or \NN representations. In reality these are bytes, 8 bits each (too lazy to write the 0/1 representation of this body :-) ).
The text of the example, instead of being:
Wikipedia in\r\n
\r\n
chunks.
could have been a more complex one, with multibyte characters (here an é in UTF-8):
Wikipédia in\r\n
\r\n
chunks.
This é is in fact 11000011 10101001 in UTF-8, two bytes (\xc3\xa9 in \xNN representation), instead of the simple 01100101 / \x65 / e character. The HTTP body is now (note that the second chunk size is 6 and not 5):
4\r\n
Wiki\r\n
6\r\n
p\xc3\xa9dia\r\n
E\r\n
 in\r\n
\r\n
chunks.\r\n
0\r\n
\r\n
But this is only valid if the source data was actually in UTF-8; it could have been in another encoding. Unless your web server has specific configuration settings that enforce conversion of the source document to a given encoding, converting the source document is not really the web server's job: you take what you have, and maybe add a header to tell the client which encoding the source document uses.
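The chunked framing shown in the examples above can be sketched in a few lines. This is a minimal illustration with hypothetical names, not a full HTTP implementation (no chunk extensions or trailers): the size line is computed from the number of bytes, never from the number of characters, and the body bytes are copied untouched.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: frames raw byte chunks as an HTTP/1.1 chunked body.
final class ChunkedEncoder {
    static byte[] encode(byte[]... chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            if (chunk.length == 0) {
                continue; // a zero-size chunk would prematurely end the body
            }
            // Size line: chunk length in bytes, as hexadecimal ASCII.
            writeAscii(out, Integer.toHexString(chunk.length).toUpperCase(Locale.ROOT) + "\r\n");
            // Chunk data: raw bytes, copied as they are.
            out.write(chunk, 0, chunk.length);
            writeAscii(out, "\r\n");
        }
        // Final zero-size chunk followed by the empty line that ends the body.
        writeAscii(out, "0\r\n\r\n");
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String protocolText) {
        byte[] bytes = protocolText.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```

Calling `ChunkedEncoder.encode` with the UTF-8 bytes of "Wiki" and "pédia" produces size lines 4 and 6, because é occupies two bytes.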
Finally, we have the multipart way of transmitting the body, as in your question. It is a lot like the chunked version, except that boundaries and intermediary headers are used; but for the binary data between these boundaries, headers, and line-ending control characters, the same rule applies: everything inside is just bytes.
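A multipart body can be assembled the same way: the protocol text (boundaries and part headers) is written as ASCII bytes, and the file content is appended as raw bytes with no string conversion at all. The sketch below uses hypothetical names and handles a single file part only. It assumes the caller has chosen a boundary that does not occur in the file content.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a one-part multipart/form-data body as bytes.
final class MultipartBody {
    static byte[] build(String boundary, String fieldName, String filename,
                        String contentType, byte[] fileBytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Opening delimiter and part headers are protocol text (ASCII).
        writeAscii(out, "--" + boundary + "\r\n");
        writeAscii(out, "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + filename + "\"\r\n");
        writeAscii(out, "Content-Type: " + contentType + "\r\n");
        writeAscii(out, "\r\n"); // empty line separates headers from content
        // File content: raw bytes, never turned into a string.
        out.write(fileBytes, 0, fileBytes.length);
        // Closing delimiter.
        writeAscii(out, "\r\n--" + boundary + "--\r\n");
        return out.toByteArray();
    }

    private static void writeAscii(ByteArrayOutputStream out, String protocolText) {
        byte[] bytes = protocolText.getBytes(StandardCharsets.US_ASCII);
        out.write(bytes, 0, bytes.length);
    }
}
```

The length of the returned array is what goes into Content-Length, and the array itself is written to the socket or request stream unchanged.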

MTOM and WS-Security (in CXF)

I am using WS-Security (XML-Signature and XML-Encryption) in my Web Service. For larger, binary objects, I intend to use MTOM.
From what I understand, the binary data is referenced via something like this:
<xop:Include href="SomeUniqueID"/>
I see two problems here:
1) How can I include this binary data in the XML-Signature part of the SOAP header?
2) How can I use XML-Encryption (or, to be more specific, CXF's standard ways of "automatically" doing XML-Encryption)?
You can include the data in the XML-Signature as if you were not using MTOM.
When MTOM is enabled, the data is always encoded in Base64 first, and is then converted to binary data to be sent as a MIME attachment.
CXF will use this temporary Base64 representation of your file to include it in the message signature.

What are those "garbage" 16 bytes at the beginning of an unencrypted EncryptedData tag from an encrypted ws-security SOAP message? (WCF)

I'm inspecting a WCF request message in order to implement part of the WS-Security standard to have iPhone <-> WCF intercommunication (I'm using certificate security over basicHttpBinding).
After reading the xmlenc-core standard I could decrypt both the SignedInfo and the Body tags, but I see 16 bytes at the beginning of both decrypted tags that I cannot explain.
I created a sample application according to the standard in order to send requests from the iPhone to a self-hosted WCF service, but it keeps responding "An error occurred when verifying security for the message".
The only thing I don't know how to implement is those 16 bytes. Does anybody know what to put in those 16 bytes?
Thanks
When using Triple-DES and AES the cipher-text is prefixed by the IV. So when decrypting, you should use the first 16 bytes of the value as the IV and then perform the AES-CBC decryption on the remaining bytes. My guess is that you have forgotten this and thus are decrypting the IV also (which will yield garbage).
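The IV handling described above can be sketched with the standard `javax.crypto` API. The class and method names are hypothetical. The padding step is an assumption based on the xmlenc-core block padding rule (the last byte gives the number of padding bytes), which is why the cipher is opened with NoPadding and the padding is stripped by hand. The `encrypt` helper exists only to produce test input in the same IV-prefixed layout.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: AES-CBC where the value is IV (16 bytes) followed by ciphertext.
final class IvPrefixedAesCbc {
    static final int BLOCK = 16;

    static byte[] decrypt(byte[] key, byte[] ivAndCiphertext) {
        if (ivAndCiphertext.length < 2 * BLOCK || ivAndCiphertext.length % BLOCK != 0) {
            throw new IllegalArgumentException("expected IV plus at least one 16-byte block");
        }
        // The first block is the IV, not encrypted data.
        byte[] iv = Arrays.copyOfRange(ivAndCiphertext, 0, BLOCK);
        byte[] ciphertext = Arrays.copyOfRange(ivAndCiphertext, BLOCK, ivAndCiphertext.length);
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] padded = cipher.doFinal(ciphertext);
            // Assumed padding rule: the last byte holds the padding length.
            int padLength = padded[padded.length - 1] & 0xFF;
            if (padLength < 1 || padLength > BLOCK) {
                throw new IllegalArgumentException("unexpected padding length " + padLength);
            }
            return Arrays.copyOf(padded, padded.length - padLength);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CBC decryption failed", e);
        }
    }

    // Test-only counterpart: returns IV followed by the ciphertext.
    static byte[] encrypt(byte[] key, byte[] iv, byte[] plaintext) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            byte[] result = Arrays.copyOf(iv, BLOCK + ciphertext.length);
            System.arraycopy(ciphertext, 0, result, BLOCK, ciphertext.length);
            return result;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES-CBC encryption failed", e);
        }
    }
}
```

Decrypting the whole value with a fixed IV instead yields 16 extra bytes at the front of the output, which matches the "garbage" prefix described in the question.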