HTTP multipart/form-data multiple files in one <input> - forms

The background:
According to the W3C, multiple files selected in a single <input> field should be sent as a "multipart/mixed" part with its own boundary string and only one "name" parameter (since the name should be unique within the form).
While writing POST data processing code, I noticed that the major browsers send such files as if they originated from different <input> elements with the same name. That is, instead of:
Content-Type: multipart/form-data; boundary=AaB03x
--AaB03x
Content-Disposition: form-data; name="files"
Content-Type: multipart/mixed; boundary=BbC04y
--BbC04y
Content-Disposition: file; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--BbC04y
Content-Disposition: file; filename="file2.gif"
Content-Type: image/gif
...contents of file2.gif...
--BbC04y--
--AaB03x--
...they send something like:
Content-Type: multipart/form-data; boundary=AaB03x
--AaB03x
Content-Disposition: form-data; name="files"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x
Content-Disposition: form-data; name="files"; filename="file2.gif"
Content-Type: image/gif
...contents of file2.gif...
--AaB03x--
The question:
How should I process the POST data? Are there browsers that will send multiple files as "multipart/mixed", or is handling that case unnecessary, so that I should simplify my code?
Note: I am writing a framework for handling HTTP, so using other libraries and frameworks is not an option.

I have confirmed what you found.
I tested Firefox and Chromium, and this is what I get:
Content-Type: multipart/form-data; boundary=---------------------------148152952621447
-----------------------------148152952621447
Content-Disposition: form-data; name="files"; filename="fileOne.txt"
Content-Type: text/plain
this is fileOne.txt
-----------------------------148152952621447
Content-Disposition: form-data; name="files"; filename="fileTwo.txt"
Content-Type: text/plain
this is fileTwo.txt
-----------------------------148152952621447--
After some investigation, I found that the W3C information you cited is based on RFC 2388, which has been obsoleted by RFC 7578.
According to RFC7578 Section 4.3 (with my emphasis):
[RFC2388] suggested that multiple files for a single form field be transmitted using a nested "multipart/mixed" part. This usage is deprecated.
To match widely deployed implementations, multiple files MUST be sent by supplying each file in a separate part but all with the same "name" parameter.
So, your question:
How should I process the POST data?
My recommendation is to ignore that W3C information and follow RFC 7578.
Are there browsers that will send multiple files as "multipart/mixed", or is handling that case unnecessary, so that I should simplify my code?
Very old browsers may use "multipart/mixed", but that usage is deprecated anyway, so there is no need to handle that case.
My recommendation: you should definitely simplify your code.
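As a rough illustration of the RFC 7578 layout, here is a minimal Python sketch that groups parts sharing the same "name" into one list. The function name parse_form_data, the regular expression, and the sample body are all made up for this example; it ignores escaped quotes, filename* parameters, and nested multipart/mixed parts, so treat it as a starting point rather than a complete parser.

```python
import re

# Matches key="value" pairs; a simplification that ignores escaped quotes
# and the extended filename* form.
PARAM = re.compile(r'(\w+)="([^"]*)"')

def parse_form_data(body: bytes, boundary: str) -> dict:
    """Group parts by "name"; each value is a list of (filename, content)."""
    delimiter = b"\r\n--" + boundary.encode("ascii")
    fields = {}
    # Prepend CRLF so the first delimiter matches like all the others.
    for chunk in (b"\r\n" + body).split(delimiter)[1:]:
        if chunk.startswith(b"--"):  # closing delimiter "--boundary--"
            break
        head, _, content = chunk.partition(b"\r\n\r\n")
        headers = {}
        for line in head.decode("utf-8").split("\r\n"):
            if line:
                key, _, value = line.partition(":")
                headers[key.strip().lower()] = value.strip()
        params = dict(PARAM.findall(headers.get("content-disposition", "")))
        if "name" in params:
            entry = (params.get("filename"), content)
            fields.setdefault(params["name"], []).append(entry)
    return fields

# Hypothetical request body with two files under the same field name.
sample_body = (
    b'--AaB03x\r\n'
    b'Content-Disposition: form-data; name="files"; filename="file1.txt"\r\n'
    b'Content-Type: text/plain\r\n'
    b'\r\n'
    b'one\r\n'
    b'--AaB03x\r\n'
    b'Content-Disposition: form-data; name="files"; filename="file2.gif"\r\n'
    b'Content-Type: image/gif\r\n'
    b'\r\n'
    b'two\r\n'
    b'--AaB03x--\r\n'
)
parsed = parse_form_data(sample_body, "AaB03x")
```

With this shape, a field that received several files simply ends up with several entries in its list, so no special multi-file code path is needed.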

Related

Incomplete attachments remain attached to the mail

I am using the MIMEDefang filtering tool. In the configuration, I strip out all the attachments and forward it to another address. For one particular sender, I can see that the milter changes the Content-Type header from application/pdf to multipart/mixed. In the email received in Outlook, when I open the PDF in a text editor, it contains content like "This is a multi-part message in MIME format..." followed by some random numbers such as "------------=_1525668389-64274-8--".
Can anyone guess why this might be happening?
Multi-part messages (like those with attachments) have their parts divided by a boundary. This boundary is between 1 and 70 characters long and must not appear anywhere within the encapsulated parts of the message (between boundaries).
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p
This is a message with multiple parts in MIME format.
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: text/html; charset=UTF-8
<html><head></head><body>This is the HTML body of the message.</body></html>
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: text/plain
This is the body of the message.
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUg
Ym9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--gc0p4Jq0M2Yt08jU534c0p--
I suspect that somewhere between MIMEDefang and your milter configuration, the boundaries are getting mangled or included in the attachment, causing it to be corrupted.
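One way to test that suspicion is to parse a saved copy of the message and look at what each leaf part decodes to. The sketch below uses Python's standard email package; the helper name describe_parts and the tiny sample message are invented for illustration. If the boundaries were intact, a PDF part should decode to bytes starting with %PDF.

```python
import email
from email import policy

def describe_parts(raw: bytes) -> list:
    """Return (content type, filename, first decoded bytes) per leaf part."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    rows = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no payload of their own
        payload = part.get_payload(decode=True) or b""
        rows.append((part.get_content_type(), part.get_filename(), payload[:5]))
    return rows

# Hypothetical message; the Base64 text decodes to "%PDF-1.4\n".
raw_message = (
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p\r\n"
    b"\r\n"
    b"This is a message with multiple parts in MIME format.\r\n"
    b"--gc0p4Jq0M2Yt08jU534c0p\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"This is the body of the message.\r\n"
    b"--gc0p4Jq0M2Yt08jU534c0p\r\n"
    b'Content-Type: application/pdf; name="a.pdf"\r\n'
    b'Content-Disposition: attachment; filename="a.pdf"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"JVBERi0xLjQK\r\n"
    b"--gc0p4Jq0M2Yt08jU534c0p--\r\n"
)
parts = describe_parts(raw_message)
```

A PDF part whose first bytes are "This " or dashes instead of %PDF would point to the MIME wrapper having been written into the attachment itself.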

Adding attachment into MailMessage direct from MIME string

There are a number of good examples of adding attachments from file, but I was wondering if I could do it in a similar way to the way AlternateView.CreateAlternateViewFromString works.
I am separately using IMAP to get the body_text and body_headers of an email from one address and then wish to send it from another (via SMTP) with some changes made, but the attachment added as faithfully as possible. I am using MailMessage and SmtpClient ( System.Net & System.Net.Mail )
I have written code to extract MIME parts from BODY[TEXT] by their "Content-Type:" tag and happily add the plain text and HTML views this way. I have the attachment in the MIME too (see example below), so it seems like I should be able to easily add the attachment directly from the string I can extract from the MIME with "Content-Type: application/octet-stream". At present, I am only interested in it working with application/octet-stream.
I have this available in parts of my code as a string containing either:
Content-Type: application/octet-stream; name="test_file.abc"
Content-Disposition: attachment; filename="test_file.abc"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_j7datrbn0
YWJjIGZpbGU=
Or
Content-Disposition: attachment; filename="test_file.abc"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_j7datrbn0
YWJjIGZpbGU=
Extracted from, for example:
* 53 FETCH (BODY[TEXT] {614}
--001a1140ae52fcc4690558c101ec
Content-Type: multipart/alternative; boundary="001a1140ae52fcc4630558c101ea"
--001a1140ae52fcc4630558c101ea
Content-Type: text/plain; charset="UTF-8"
Blah blah
--001a1140ae52fcc4630558c101ea
Content-Type: text/html; charset="UTF-8"
<div dir="ltr">Blah blah</div>
--001a1140ae52fcc4630558c101ea--
--001a1140ae52fcc4690558c101ec
Content-Type: application/octet-stream; name="test_file.abc"
Content-Disposition: attachment; filename="test_file.abc"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_j7datrbn0
YWJjIGZpbGU=
--001a1140ae52fcc4690558c101ec--
)
$ OK Success
All of the other aspects of my code work, but is there an easy way to add the attachment directly from the MIME data I have? I am not using MimeKit or any libraries other than the standard VS ones.
Thank you for reading this long question.
Not really what I wanted to do, but I achieved what I needed as follows:
// find_item_by_key looks for key="value" in the part headers and returns value.
string name = find_item_by_key(attachment_content, "filename");
// Everything after the last blank line (\r\n\r\n) is the Base64 body.
string attachment_base64 = attachment_content.Substring(attachment_content.LastIndexOf("\r\n\r\n"));
attachment_base64 = attachment_base64.Trim('\r', '\n');
// Decode to raw bytes and attach them from memory under the original filename.
byte[] attachment_bytes = Convert.FromBase64String(attachment_base64);
Attachment att = new Attachment(new System.IO.MemoryStream(attachment_bytes), name);
mail.Attachments.Add(att);
Lots of pointless conversions going on, since the data is already Base64 and mail.Attachments.Add will no doubt be converting it back again.
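For comparison only, the same two steps (read the filename, decode the Base64 body) can be sketched in Python with the standard email package. This is not System.Net.Mail code and does not remove the decode and re-encode round trip; it just shows a header-aware parser doing the string slicing. The variable names are made up for the example.

```python
import email
from email import policy

# The attachment part exactly as it appears in the question,
# with the blank line that separates headers from body.
part_text = (
    'Content-Type: application/octet-stream; name="test_file.abc"\r\n'
    'Content-Disposition: attachment; filename="test_file.abc"\r\n'
    'Content-Transfer-Encoding: base64\r\n'
    'X-Attachment-Id: f_j7datrbn0\r\n'
    '\r\n'
    'YWJjIGZpbGU=\r\n'
)

part = email.message_from_string(part_text, policy=policy.default)
# get_filename() reads Content-Disposition first and falls back to the
# "name" parameter of Content-Type.
attachment_name = part.get_filename()
# decode=True applies the Content-Transfer-Encoding (Base64 here).
attachment_bytes = part.get_payload(decode=True)
```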

What is the correct content-type and document structure for an email that only contains a text/html part

First off, I'm fixing up an existing open source library. While I know that people SHOULD send a plaintext version of a message when they send an HTML email, this isn't a best-practices question. If I don't maintain backwards compatibility, they won't accept my patch.
I'm trying to figure out how to best handle situations where ONLY an html email is sent.
The library currently generates this:
MIME-Version: 1.0
Content-Type: text/html;
Hello, World
But every html-only message I've seen in my inbox shows:
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----=_Part_1935495_1732146301.1367384830372"
----=_Part_1935495_1732146301.1367384830372
Content-Type: text/html;
Hello, World
----=_Part_1935495_1732146301.1367384830372--
I can't figure out whether this is a best practice or a requirement.
I've been through:
https://www.rfc-editor.org/rfc/rfc2557
http://www.ietf.org/rfc/rfc2854.txt
but couldn't find any information.
Use Content-Type: Multipart/related; type=Text/HTML:
From: user1@example.com
To: user2@example.com
Subject: An example
Mime-Version: 1.0
Content-Base: http://www.example.com
Content-Type: Multipart/related; boundary="boundary-example-1";type=Text/HTML
--boundary-example-1
Content-Type: Text/HTML; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
... text of the HTML document, which might contain a hyperlink
to the other body part, for example through a statement such as:
<IMG SRC="/images/ie5.gif" ALT="Internet Explorer logo">
Example of a copyright sign encoded with Quoted-Printable: =A9
Example of a copyright sign mapped onto HTML markup: &#168;
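To make the structural difference concrete, the two layouts compared in the question (a single text/html body versus a multipart container holding one text/html part) can be reproduced with Python's standard email package. This is only a sketch with placeholder addresses and body text; it says nothing about which form a given library or mail client expects.

```python
from email.message import EmailMessage

def html_only_message(html: str) -> EmailMessage:
    """Build a message whose only body is text/html (no container)."""
    msg = EmailMessage()
    msg["From"] = "user1@example.com"   # placeholder address
    msg["To"] = "user2@example.com"     # placeholder address
    msg["Subject"] = "An example"
    msg.set_content(html, subtype="html")
    return msg

# Layout 1: Content-Type: text/html at the top level.
single = html_only_message("<p>Hello, World</p>")

# Layout 2: the same body moved into a multipart/alternative container.
wrapped = html_only_message("<p>Hello, World</p>")
wrapped.make_alternative()
wrapped_parts = list(wrapped.iter_parts())
```

Printing str(single) and str(wrapped) shows the two wire formats side by side, which makes it easier to compare them against what a library currently generates.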

MIME "multipart/related" Structure and Apple Mail. Is it a Bug?

I build an email with the PHP Zend Framework class Zend_Mail. There is one text part and one HTML part with related inline images. I want to attach one PDF file too.
My question is about the mime-structure. Two options are possible:
option 1:
Content-Type: multipart/mixed
  Content-Type: multipart/alternative
    Content-Type: text/plain; charset=UTF-8
    Content-Type: multipart/related
      Content-Type: text/html; charset=UTF-8
      Content-Type: image/jpeg
      Content-Type: image/jpeg
      Content-Type: image/png
  Content-Type: application/pdf
option 2:
Content-Type: multipart/related;
  Content-Type: multipart/alternative;
    Content-Type: text/plain; charset=utf-8
    Content-Type: text/html; charset=utf-8
  Content-Type: image/jpeg
  Content-Type: image/jpeg
  Content-Type: image/png
  Content-Type: application/pdf
Option 2 is built by Zend_Mail, but the PDF is not recognized by the Apple Mail application. It's fine in Thunderbird 3 and Outlook 2007; only in Apple Mail is the PDF attachment not recognized.
Option 1 works in Apple Mail, Thunderbird, and Outlook, but it would be a bit tricky to get this structure out of the Zend Framework class Zend_Mail.
Is this a bug in Apple Mail, or does option 2 not conform to the standard?
kind regards,
sn
Have you tried specifying the type? See this page: http://framework.zend.com/manual/en/zend.mail.attachments.html
I use this:
$obj_MailAttachment = new Zend_Mime_Part($allegato);
$obj_MailAttachment->type = 'application/pdf';
$obj_MailAttachment->disposition = Zend_Mime::DISPOSITION_ATTACHMENT;
$obj_MailAttachment->encoding = Zend_Mime::ENCODING_BASE64;
$obj_MailAttachment->filename = 'ordine'.$ordine['numero'].'.pdf';
...
$mail->addAttachment($obj_MailAttachment);
Both options are violations of RFC 822: header lines MUST start at the first character of their line. This matters because header folding is triggered by that first character being whitespace, SP (#32) or HT (#09), IIRC.
Example:
Content-Type: text/html; charset=UTF-8
and
Content-Type: text/html;
    charset=UTF-8
are exactly equivalent.
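The folding rule can be sketched in a few lines of Python: a line break followed by SP or HT is a continuation, so removing that line break yields the unfolded header. The function name unfold is made up for this example, and it only handles a bare header block, not a full message.

```python
import re

def unfold(header_block: str) -> list:
    """Join continuation lines (starting with SP or HT) onto the previous line."""
    # Remove a line break only when the next character is whitespace.
    unfolded = re.sub(r"\r?\n(?=[ \t])", "", header_block)
    return [line for line in unfolded.splitlines() if line]

folded = unfold("Content-Type: text/html;\r\n charset=UTF-8\r\n")
single_line = unfold("Content-Type: text/html; charset=UTF-8\r\n")
```

This is also why an indented Content-Type line inside a header block is read as part of the previous header rather than as a header of its own.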
The proper way to do what you're (apparently) attempting is to use the boundary attribute, something like this:
Content-Type: multipart/mixed; boundary="1610edf3f7626f0847a5e75c55287644"
OTHER-HEADERS
--1610edf3f7626f0847a5e75c55287644
Content-Type: multipart/mixed; boundary="embedded_boundary"
OTHER-HEADERS
--embedded_boundary
NESTED-MESSAGE-GOES-HERE
--embedded_boundary--
--1610edf3f7626f0847a5e75c55287644--
One of the parts of the nested portion would contain the PDF attachment.
Ref:
http://www.faqs.org/rfcs/rfc2822.html and the links provided here: Are email headers case sensitive?
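For reference, the nesting described as option 1 in the question can be assembled with Python's standard email.mime classes, which generate the boundaries for each container. This is a sketch of that structure only, not of Zend_Mail; the payload bytes, the Content-ID, and the filename are placeholders.

```python
from email.mime.application import MIMEApplication
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# HTML body plus its inline image, grouped as multipart/related.
related = MIMEMultipart("related")
related.attach(MIMEText('<p>Hello <img src="cid:logo"></p>', "html", "utf-8"))
image = MIMEImage(b"placeholder image bytes", "png")  # subtype given explicitly
image.add_header("Content-ID", "<logo>")
related.attach(image)

# Plain text and the HTML group are alternatives of each other.
alternative = MIMEMultipart("alternative")
alternative.attach(MIMEText("Hello", "plain", "utf-8"))
alternative.attach(related)

# The PDF sits next to the body, at the top level of multipart/mixed.
mixed = MIMEMultipart("mixed")
mixed.attach(alternative)
pdf = MIMEApplication(b"%PDF-1.4\n", "pdf")
pdf.add_header("Content-Disposition", "attachment", filename="doc.pdf")
mixed.attach(pdf)

def outline(part, depth=0):
    """List content types, indented by nesting depth."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.get_payload():
            lines.extend(outline(child, depth + 1))
    return lines

structure = outline(mixed)
```

Calling mixed.as_string() shows the nested boundaries that the indented outline stands for.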

How do boundaries work in multipart post requests?

I'm trying to upload a file from an iPhone to a server. I'm trying to avoid using any libraries that aren't made by Apple, and from what I can tell I'll need to go pretty low-level in constructing my request. Can someone tell me what the "boundary" is in a multipart/form-data request and how to use it properly?
The boundary is an arbitrary piece of text which the client uses to delimit the fields of the form being posted. The client declares the boundary it is using as part of the Content-type header.
From the IETF Form-based File Upload in HTML RFC (RFC 1867):
A boundary is selected that does not occur in any of the data. (This
selection is sometimes done probabilisticly.) Each field of the form
is sent, in the order in which it occurs in the form, as a part of
the multipart stream. Each part identifies the INPUT name within the
original HTML form. Each part should be labelled with an appropriate
content-type if the media type is known (e.g., inferred from the file
extension or operating system typing information) or as
application/octet-stream.
...
6. Examples
Suppose the server supplies the following HTML:
<FORM ACTION="http://server.dom/cgi/handle"
ENCTYPE="multipart/form-data"
METHOD=POST>
What is your name? <INPUT TYPE=TEXT NAME=submitter>
What files are you sending? <INPUT TYPE=FILE NAME=pics>
</FORM>
and the user types "Joe Blow" in the name field, and selects a text
file "file1.txt" for the answer to 'What files are you sending?'
The client might send back the following data:
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x
content-disposition: form-data; name="field1"
Joe Blow
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x--
If the user also indicated an image file "file2.gif" for the answer
to 'What files are you sending?', the client might client might send
back the following data:
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x
content-disposition: form-data; name="field1"
Joe Blow
--AaB03x
content-disposition: form-data; name="pics"
Content-type: multipart/mixed, boundary=BbC04y
--BbC04y
Content-disposition: attachment; filename="file1.txt"
In the first example, the boundary is the fixed string AaB03x. In the second example, the boundary is first AaB03x and then switches to BbC04y.
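To show how the pieces fit together when building a request by hand, here is a minimal Python sketch (the question targets iOS, so this is only an illustration of the wire format). The function name build_form_data and the sample field values are made up; the boundary is a random hex string that is regenerated if it happens to occur in any payload.

```python
import uuid

def build_form_data(fields: dict, files: list) -> tuple:
    """fields: {name: text}; files: [(name, filename, content_type, data)].

    Returns (Content-Type header value, body bytes).
    """
    payloads = [v.encode("utf-8") for v in fields.values()]
    payloads += [data for _, _, _, data in files]
    # Pick a boundary that does not occur in any of the data.
    boundary = uuid.uuid4().hex
    while any(boundary.encode("ascii") in p for p in payloads):
        boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    body = b""
    for name, value in fields.items():
        body += delimiter
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8")
        body += value.encode("utf-8") + b"\r\n"
    for name, filename, ctype, data in files:
        body += delimiter
        body += (f'Content-Disposition: form-data; name="{name}"; '
                 f'filename="{filename}"\r\n'
                 f"Content-Type: {ctype}\r\n\r\n").encode("utf-8")
        body += data + b"\r\n"
    # The closing delimiter has two extra dashes at the end.
    body += b"--" + boundary.encode("ascii") + b"--\r\n"
    return f"multipart/form-data; boundary={boundary}", body

content_type, body = build_form_data(
    {"submitter": "Joe Blow"},
    [("pics", "file1.txt", "text/plain", b"... contents of file1.txt ...")],
)
```

The first return value goes into the request's Content-Type header, and the second is sent as the request body.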