Partial processing of a multipart/formdata request - rest

Is it possible to partially process a multipart/formdata request? I am developing a REST API wherein one of the resources is used to upload a large file. The application must take a call on processing the request based on the name of the file being uploaded, possibly sending back an alternate response if the file name fails validation.
If the application receives the large file and then performs the validation that triggers that alternate response, the time and resources used for the upload are both wasted. I prefer to preempt the upload of the actual file if the filename validation fails.
How can I implement this? I have considered the approach of first sending a request using the HEAD method and supplying the filename, with a subsequent upload contingent on the response to the first [HEAD] call. I would like to know if there are better alternatives.
Note: I am using Spring Boot to develop the RESTful application, although I imagine that will not significantly impact the answer I am seeking.

Related

Which HTTP method to use to build a REST API to perform following operation?

I am looking for a REST API to do following
Search based on parameters sent, if results found, return the results.
If no results found, create a record based on search parameters sent.
Can this be accomplished by creating one single API or 2 separate APIs are required?
I would expect this to be handled by a single request to a single resource.
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint it REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.

REST file download that takes 5 minutes to complete

One of the API calls to my Web API 2 server from my Angular client dynamically generates an XLSX file via a ton of SQL queries and processing. That can take up to five minutes to generate all the data and return it via a file download to the client. Obviously that's bad because Chrome shows an error by then even though the page is still loading.
It feels like this is where I'd use a status code 202 to tell the client that it got the request, but I'm not sure after that how to actually send the file back to the client then.
The only thing I can think of is that the server spawns a background task that will write the file to a specific temp location, after it's been created, and then another API call will download that file if it exists and delete it from the temp location.
Is that what I do and then just have the client poll periodically for that file? Pre-generating the file isn't an option as it has to have realtime data (at the point of request of course).
It feels like this is where I'd use a status code 202 to tell the client that it got the request, but I'm not sure after that how to actually send the file back to the client then.
Usually a HTTP 202 comes with a location header and an indicator of where and when the resource will be available.
Also a possibility is to add a link to a status monitor, like described here.
To achieve this, you could generate an id for that process and use it in the location header url to point to the result.
The client then is able to get that resource when the resource should be ready. This means you would need some short-term persistence.

REST File upload with resume and metadata support

I want to implement a REST API for file upload, which has the following requirements:
Support for resumable and chunked uploads
Support for adding and editing metadata after upload, like tags or descriptions
Be resource friendly, so uploads should use raw data, i.e. no encoding and / or encapsulation
So, requirements 1) and 3) already rule out use of multipart forms (I don't like this approach anyways).
I already have a working solution to satisfy requirement 1), by issuing a POST request first, which transmits fileName, fileSize and fileModification date as JSON. The upload module then creates a temporary file (if none exists!) and responds with a 206, containing an upload token in the header, which in turn is a hash created over the data sent in the POST. This token has to be used for the actual upload. The response also contains a byte range which instructs the client what part of the file it has to upload. The advantage is that by this way, it is detectable if a former partial upload already exists - the trick is to name the temporary file the same as the token. So the byte range might also start with another value than 0-. All the user has to do to resume an interupted upload is to upload the same file again!
The actual upload is done via PUT, with the message body only containing the raw binary data. The response then returns the metadata of the created file as JSON, or responds with another 206 containing an updated byte range header if the upload is still incomplete. So by this way, also chunked uploads are possible. All in all I like this solution and it works well. At least I see no other way to implement resumable uploads without a 2 stage approach.
Now, my problem is to make this truly "RESTish".
Lets say we have a collection named /files where I want to upload files. Of course, POST /files seems to be the natural way to do so. But then, we would have a subsequent PUT to the same collection which is of course not REST compatible in my eyes. Next approach would be that the initial POST returns a new URL that points to the final resource: /files/{fileId}and the subsequent PUT(s) write to this resource instead.
That feels more "RESTish", but is not as straightforward and then it may be possible that there are incomplete file resources floating around when upload is not completed. I think the actual resource should only be created if the upload is complete. Furthermore, if I want to update / add metadata later on, that would be a PUT to the same resource, but the request itself would be quite different. Hmmm.
I could change the PUT to a POST, so the upload would consist of multiple POSTs but this smells fishy as well.
I also thought of splitting the resource in two subresources, like /files/{fileId}/metadata and /files/{fileId}/binary but this feels a bit "overengineered" and also requires the file resource to be created before the upload is complete.
Any ideas how to solve this in a better way?

REST - GET-Respone with temporary uploaded File

I know that the title is not that correct, but i don't know how to name this problem...
Currently I'm trying to design my first REST-API for a conversion-service. Therefore the user has an input file which is given to the server and gets back the converted file.
The current problem I've got is, that the converted file should be accessed with a simple GET /conversionservice/my/url. However it is not possible to upload the input file within GET-Request. A POST would be necessary (am I right?), but POST isn't cacheable.
Now my question is, what's the right way to design this? I know that it could be possible to upload the input file before to the server and then access it with my GET-Request, but those input files could be everything!
Thanks for your help :)
A POST request is actually needed for a file upload. The fact that it is not cachable should not bother the service because how could any intermediaries (the browser, the server, proxy etc) know about the content of the file. If you need cachability, you would have to implement it yourself probably with a hash (md5, sha1 etc) of the uploaded file. This would keep you from having to perform the actual conversion twice, but you would have to hash each file that was uploaded which would slow you down for a "cache miss".
The only other way I could think of to solve the problem would be to require the user to pass in an accessible url to the file in the query string, then you could handle GET requests, but your users would have to make the file accessible over the internet. This would allow caching but limit the usability.
Perhaps a hybrid approach would be possible where you accepted a POST for a file upload and a GET for a url, this would increase the complexity of the service but maximize usability.
Also, you should look into what caches you are interested in leveraging as a lot of them have limits on the size of a cache entry meaning if the file is sufficiently large it would not cache anyway.
In the end, I would advise you to stick to the standards already established. Accept the POST request for the file upload and if you are interested in speeding up the user experience maybe make the upload persist, this would allow the user to upload a file once and download it in many different formats.
You sequence of events can be as follows:
Upload your file/files using POST. For immediate response, you can return required information using your own headers. (It should return document key to access the file for future use.)
Then you can use GET for further operations using the above mentioned document key as a query string.

How to Update a resource with a large attachment with PUT request in JAX-RS?

I have a large byte file (log file) that I want to upload to server using PUT request. The reason I choose PUT is simply because I can use it to create a new resource or update an existing resource.
My problem is how to handle situation when server or Network disruption happens during PUT request.
That is say I have a huge file, during the transfer of which, Network failure happens. When the network resumes, I dont want to start the entire upload. How would I handle this?
I am using JAX-RS API with RESTeasy implementation.
Some people are using the Content-Range Header to achieve this but many people (like Mark Nottingham) state that this is not legal for requests. Please read the comments to this answer.
Besides there is no support from JAX-RS for this scenario.
If you really have the repeating problem of broken PUT requests I would simply let the client slice the files:
PUT /logs/{id}/1
PUT /logs/{id}/2
PUT /logs/{id}/3
GET /logs/{id} would then return the aggregation of all successful submitted slices.