I want to implement a REST API for file upload, which has the following requirements:
1) Support for resumable and chunked uploads
2) Support for adding and editing metadata after upload, like tags or descriptions
3) Be resource friendly: uploads should use raw data, i.e. no encoding or encapsulation
So, requirements 1) and 3) already rule out the use of multipart forms (I don't like this approach anyway).
I already have a working solution for requirement 1): the client first issues a POST request that transmits fileName, fileSize and fileModificationDate as JSON. The upload module then creates a temporary file (if none exists!) and responds with a 206 whose header contains an upload token, which is a hash over the data sent in the POST. This token has to be used for the actual upload. The response also contains a byte range that tells the client which part of the file it has to upload. The advantage is that this makes it detectable whether a former partial upload already exists: the trick is to name the temporary file after the token, so the byte range might also start at a value other than 0-. All the user has to do to resume an interrupted upload is upload the same file again!
The actual upload is done via PUT, with the message body containing only the raw binary data. The response then returns the metadata of the created file as JSON, or responds with another 206 containing an updated byte range header if the upload is still incomplete. This also makes chunked uploads possible. All in all I like this solution and it works well; at least I see no other way to implement resumable uploads without a two-stage approach.
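A minimal sketch of the token and resume logic described above (the hash choice, field separator and temp-file layout are assumptions for illustration, not the author's actual implementation):

```python
import hashlib
import os


def make_upload_token(file_name: str, file_size: int, file_mtime: int) -> str:
    """Derive the upload token as a hash over the metadata sent in the initial POST."""
    payload = f"{file_name}:{file_size}:{file_mtime}".encode()
    return hashlib.sha256(payload).hexdigest()


def next_byte_range(temp_dir: str, token: str, file_size: int) -> str:
    """Compute the byte range the client still has to upload.

    Because the temporary file is named after the token, a partial upload
    is detected simply by the file's current size, so the range may start
    at a value other than 0.
    """
    temp_path = os.path.join(temp_dir, token)
    uploaded = os.path.getsize(temp_path) if os.path.exists(temp_path) else 0
    return f"{uploaded}-{file_size - 1}"
```

Re-POSTing the same metadata yields the same token, which is exactly why resuming "just works" when the user uploads the same file again.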
Now, my problem is to make this truly "RESTish".
Let's say we have a collection named /files where I want to upload files. Of course, POST /files seems the natural way to do so. But then we would have a subsequent PUT to the same collection, which is of course not REST compatible in my eyes. The next approach would be for the initial POST to return a new URL pointing to the final resource, /files/{fileId}, with the subsequent PUT(s) writing to this resource instead.
That feels more "RESTish", but it is not as straightforward, and incomplete file resources may then be floating around while an upload is not yet completed. I think the actual resource should only be created once the upload is complete. Furthermore, if I want to update or add metadata later on, that would be a PUT to the same resource, but the request itself would be quite different. Hmmm.
I could change the PUT to a POST, so the upload would consist of multiple POSTs, but this smells fishy as well.
I also thought of splitting the resource into two subresources, like /files/{fileId}/metadata and /files/{fileId}/binary, but this feels a bit overengineered and also requires the file resource to be created before the upload is complete.
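One way to address the "incomplete resources floating around" concern is to keep in-progress uploads under a separate transient resource (say, /uploads/{token}) and only create the final /files/{fileId} resource when the last chunk arrives. A minimal in-memory sketch of that state machine (all resource names and the file-id scheme are hypothetical):

```python
import hashlib


class UploadStore:
    """Keep in-progress uploads separate from finished file resources,
    so the final file resource only ever exists for complete files."""

    def __init__(self):
        self.pending = {}  # token -> (expected_size, buffered bytes)
        self.files = {}    # file_id -> complete file content

    def start(self, token: str, expected_size: int) -> None:
        """POST: register the upload; a repeated POST resumes, not restarts."""
        self.pending.setdefault(token, (expected_size, bytearray()))

    def append_chunk(self, token: str, chunk: bytes):
        """PUT /uploads/{token}: returns ('incomplete', next_offset)
        or ('complete', file_id) once all bytes have arrived."""
        expected, buf = self.pending[token]
        buf.extend(chunk)
        if len(buf) < expected:
            return "incomplete", len(buf)  # respond 206 + updated byte range
        file_id = hashlib.sha1(bytes(buf)).hexdigest()[:8]
        self.files[file_id] = bytes(buf)   # /files/{fileId} is created only now
        del self.pending[token]
        return "complete", file_id         # respond 201 + Location: /files/{file_id}
```

Metadata updates can then target /files/{fileId} without ever colliding with an upload in progress.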
Any ideas how to solve this in a better way?
Related
Is it possible to partially process a multipart/form-data request? I am developing a REST API in which one of the resources is used to upload a large file. The application must decide how to process the request based on the name of the file being uploaded, possibly sending back an alternate response if the file name fails validation.
If the application receives the large file and then performs the validation that triggers that alternate response, the time and resources used for the upload are both wasted. I prefer to preempt the upload of the actual file if the filename validation fails.
How can I implement this? I have considered the approach of first sending a request using the HEAD method and supplying the filename, with a subsequent upload contingent on the response to the first [HEAD] call. I would like to know if there are better alternatives.
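Whichever transport is chosen, the preflight itself only needs the cheap name check. A minimal sketch of such a validation (the rules here are made up for illustration) that a HEAD-style preflight endpoint could expose before the client commits to the large upload:

```python
import re

# Hypothetical rule: simple file names with a whitelisted extension only.
ALLOWED_NAME = re.compile(r"^[\w\-. ]+\.(csv|txt|pdf)$", re.IGNORECASE)


def filename_ok(name: str) -> bool:
    """The cheap check the preflight would run: if it fails, the client
    skips the upload entirely and no bandwidth is wasted."""
    return bool(ALLOWED_NAME.match(name)) and ".." not in name
```

The server would return e.g. 200 from the preflight when the name passes and 422 when it fails, and the client only starts the upload after a 200.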
Note: I am using Spring Boot to develop the RESTful application, although I imagine that will not significantly impact the answer I am seeking.
Simple question: I want to upload/download large files via REST. What is the best practice for doing that? Are there any chunk patterns, do I use multipart on the transport layer, what do you recommend?
Use case: we have an API where you can upload payments (e.g. 500 MB) and download large account statement files. I am aware that other protocols exist for this, but how is it done with REST?
See the answers here; they might help with your problem:
REST design for file uploads
Large file upload though html form (more than 2 GB)
In conclusion:
With REST you can simply use HTTP header fields to specify the content size, e.g. use the Content-Type multipart/form-data in your request for files up to the server limit (usually 2-4 GB); for files larger than that you will have to split the request into multiple parts.
Also check out this answer to see if byte-serving or chunked encoding makes sense in your application:
Content-Length header versus chunked encoding
I know that the title is not quite right, but I don't know what else to call this problem...
Currently I'm trying to design my first REST API, for a conversion service. The user submits an input file to the server and gets back the converted file.
The current problem is that the converted file should be accessible with a simple GET /conversionservice/my/url. However, it is not possible to upload the input file within a GET request; a POST would be necessary (am I right?), but POST isn't cacheable.
Now my question is: what's the right way to design this? I know it would be possible to upload the input file to the server first and then access it with my GET request, but those input files could be anything!
Thanks for your help :)
A POST request is indeed needed for a file upload. The fact that it is not cacheable should not bother the service, because no intermediary (the browser, the server, a proxy, etc.) can know anything about the content of the file. If you need cacheability, you would have to implement it yourself, probably with a hash (MD5, SHA-1, etc.) of the uploaded file. This would keep you from having to perform the actual conversion twice, but you would have to hash each file that was uploaded, which would slow you down on a "cache miss".
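The hash-based caching idea above fits in a few lines; the choice of SHA-256 and the key layout are assumptions, any stable hash of the uploaded bytes would do:

```python
import hashlib


def conversion_cache_key(data: bytes, target_format: str) -> str:
    """Derive a cache key from the uploaded content plus the requested
    output format, so identical inputs skip a repeat conversion."""
    h = hashlib.sha256()
    h.update(target_format.encode())
    h.update(data)
    return h.hexdigest()
```

On a cache hit the stored result is served directly; on a miss the service pays for both the hash and the conversion, which is the trade-off mentioned above.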
The only other way I can think of to solve the problem would be to require the user to pass an accessible URL to the file in the query string; then you could handle GET requests, but your users would have to make the file accessible over the internet. This would allow caching but limit usability.
Perhaps a hybrid approach would be possible where you accept a POST for a file upload and a GET for a URL; this would increase the complexity of the service but maximize usability.
Also, you should look into which caches you are interested in leveraging, as many of them have limits on the size of a cache entry, meaning that a sufficiently large file would not be cached anyway.
In the end, I would advise you to stick to the standards already established. Accept the POST request for the file upload and, if you are interested in speeding up the user experience, maybe make the upload persist; this would allow the user to upload a file once and download it in many different formats.
Your sequence of events can be as follows:
Upload your file/files using POST. For an immediate response, you can return the required information using your own headers. (The response should include a document key for accessing the file in future.)
Then you can use GET for further operations, passing the above-mentioned document key as a query string.
I want to upload a file to a server, but the server requires a token to be posted together with the file. I got the token when I logged in, so how can I post it to the server? Can I write code like this?
var par = [
    "token": "xxxxxxxxxx",
    "file": "filename.file"
]
Alamofire.upload(.POST, "http://www.xxxxx.xxx", parameters: par)
This is most likely not supported in the current version of Alamofire depending on your server implementation.
Multipart Form Data
Your server most likely expects the data to be multipart/form-data encoded. Currently, multipart form data is not supported by Alamofire; you will need to encode the data yourself according to RFC 2388 and RFC 2045.
If this ends up being the case, you can either implement your own version of the specs, or you could use AFNetworking. For the moment I would encourage you to use AFNetworking if this is the case. Here is a thread (courtesy of #rainypixels) to get you started if you decide you really want to implement this yourself.
You need to be careful with this option as it is an in-memory solution. Do NOT attempt to upload videos or large numbers of images this way, or your app will run out of memory very quickly.
File Upload
If the server does not expect multipart/form-data encoding, then you can use the Alamofire upload method.
public func upload(URLRequest: URLRequestConvertible, file: NSURL) -> Request
You could create an NSURLRequest with the token appended as a parameter, then pass the file URL off to Alamofire to be uploaded.
In summary, I'm pretty certain that the first approach is what your server is going to require. Either way, hopefully this helps you get heading in the right direction.
I have a large binary file (a log file) that I want to upload to a server using a PUT request. The reason I chose PUT is simply that I can use it to create a new resource or update an existing one.
My problem is how to handle the situation when a server or network disruption happens during the PUT request.
That is to say, I have a huge file, and during its transfer a network failure happens. When the network resumes, I don't want to restart the entire upload. How would I handle this?
I am using JAX-RS API with RESTeasy implementation.
Some people use the Content-Range header to achieve this, but many people (like Mark Nottingham) state that this is not legal for requests. Please read the comments on this answer.
Besides there is no support from JAX-RS for this scenario.
If you really have a recurring problem with broken PUT requests, I would simply let the client slice the files:
PUT /logs/{id}/1
PUT /logs/{id}/2
PUT /logs/{id}/3
GET /logs/{id} would then return the aggregation of all successfully submitted slices.
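The slicing scheme above can be sketched like this (slice size and the 1-based numbering are assumptions matching the example URLs):

```python
def slice_file(data: bytes, slice_size: int) -> dict:
    """Client side: split the payload into numbered slices, one per
    PUT /logs/{id}/{n}. A failed slice can be retried in isolation."""
    return {n + 1: data[i:i + slice_size]
            for n, i in enumerate(range(0, len(data), slice_size))}


def reassemble(slices: dict) -> bytes:
    """Server side: GET /logs/{id} aggregates all submitted slices in order."""
    return b"".join(slices[n] for n in sorted(slices))
```

Because each slice is an idempotent PUT to its own URL, a network failure only costs the slice in flight, not the whole upload.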