Progress callback for multi-part form POST request

We have a legacy application that uses embedded Jetty and provides functionality to clients making HTTP calls. Most of the information/parameters needed by the server are sent by the client through HTTP headers.
We use a multi-part form POST (built with curl_mime, curl_mimepart, and CURLOPT_MIMEPOST) to send a transfer-multiple-files request to the server; the server accepts the request and transfers the files from volume1 to volume2.
The transfer involves multiple files that are gigabytes in size, so it takes a long time to complete. Currently the client makes the multi-part form POST request and then waits for the result, which takes a long time.
We want some mechanism by which the client can get progress status for this transfer operation.
I went through the curl and Jetty documentation, and it seems the progress callback (CURLOPT_PROGRESSFUNCTION) is not supported when using a multi-part form POST.
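For what it's worth, here is a minimal sketch of such a client in libcurl using CURLOPT_XFERINFOFUNCTION (the modern replacement for CURLOPT_PROGRESSFUNCTION); whether it fires for mime posts depends on the libcurl version, so verify against yours. The URL, field name, and file path are placeholder assumptions. Note that even when it works, this callback only reports upload progress of the request body, not the server-side volume1-to-volume2 copy, which is the long-running part here:

    #include <curl/curl.h>
    #include <stdio.h>

    /* Reports bytes of the request body sent so far; it cannot observe the
       server-side volume1-to-volume2 copy. */
    static int xferinfo(void *clientp, curl_off_t dltotal, curl_off_t dlnow,
                        curl_off_t ultotal, curl_off_t ulnow)
    {
        (void)clientp; (void)dltotal; (void)dlnow;
        fprintf(stderr, "uploaded %ld of %ld bytes\r", (long)ulnow, (long)ultotal);
        return 0; /* returning non-zero aborts the transfer */
    }

    int main(void)
    {
        CURL *curl = curl_easy_init();
        if (!curl) return 1;

        curl_mime *mime = curl_mime_init(curl);
        curl_mimepart *part = curl_mime_addpart(mime);
        curl_mime_name(part, "file");                 /* field name: assumption */
        curl_mime_filedata(part, "/volume1/big.bin"); /* path: assumption */

        curl_easy_setopt(curl, CURLOPT_URL, "http://server:8080/transfer"); /* assumption */
        curl_easy_setopt(curl, CURLOPT_MIMEPOST, mime);
        curl_easy_setopt(curl, CURLOPT_XFERINFOFUNCTION, xferinfo);
        curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0L); /* callbacks are off by default */

        CURLcode res = curl_easy_perform(curl);
        curl_mime_free(mime);
        curl_easy_cleanup(curl);
        return res == CURLE_OK ? 0 : 1;
    }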
One solution I found that would fulfil my requirement is the 102 Processing status code.
But the documentation mentions that a "later update of RFC 2518, RFC 4918, removed the 102 Processing status code for lack of implementation."
If 102 Processing has been removed, is there any alternative that can report transfer progress to the client?
What is the best way to communicate this transfer progress to the client?

Related

Which HTTP status should I return if the client tries to upload a new file while the server is still processing the previous one?

My application has a button that allows the user to upload a file. The upload itself is quick and we send the response to the client right away, but the service takes a while to process the file.
The user can only upload a new file once the previous one has been processed. Therefore, if the user tries to upload a file while the server is still processing one, we should return an error to the client.
My question is: which HTTP response status should I use? I checked all the options, and these are the ones I believe are closest to my situation:
409 Conflict - Indicates that the request could not be processed because of a conflict in the current state of the resource, such as an edit conflict between multiple simultaneous updates.
425 Too Early (RFC 8470) - Indicates that the server is unwilling to risk processing a request that might be replayed.
428 Precondition Required (RFC 6585) - The origin server requires the request to be conditional. Intended to prevent the "lost update" problem, where a client GETs a resource's state, modifies it, and PUTs it back to the server, when meanwhile a third party has modified the state on the server, leading to a conflict.
Which one do you believe is most appropriate to this situation? Or should I use another one entirely?
429 Too Many Requests
The 429 status code indicates that the user has sent too many requests in a given amount of time ("rate limiting").
The response representations SHOULD include details explaining the condition, and MAY include a Retry-After header indicating how long to wait before making a new request.
    HTTP/1.1 429 Too Many Requests
    Retry-After: 60
    Content-Type: text/plain

    Why don't you wait a minute?
Why I'm not advocating for any of the codes you recommended:
425 Too Early: I would avoid this one, as it seems to be specific to the context of early data.
428 Precondition Required: the core problem here is how to communicate to a general-purpose client which precondition should be included in the request. It's also a little off, semantically.
409 Conflict: in practice, you might be able to make this one work. Semantically, the difficulty is that there isn't really a way for the client to resolve the conflict (e.g., by reloading the page to get a fresh copy of the server's copy of the resource).
The important thing to recognize is that HTTP status codes are only incidentally for humans. Status codes are metadata of the transfer-of-documents-over-a-network domain; the intended audience is general purpose HTTP components (browsers, spiders, caches, proxies, etc).
Therefore, the "best" code to use is the one that tells general-purpose components the right thing. Specialization happens in the response body, where we use the payload to communicate the fine-grained details.
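For example, an illustrative payload (the field names here are made up, not a standard):

    HTTP/1.1 429 Too Many Requests
    Retry-After: 60
    Content-Type: application/json

    {"error": "upload-in-progress",
     "detail": "The previous file is still being processed; retry in about a minute."}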
How about receiving the request and placing it in a queue? Create a single consumer for the queue so that no request is processed until the previous one has finished. That way you can accept the request from the customer regardless of the current state and just return a standard acknowledgement response: 200, or whatever response you send upon success.
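A minimal sketch of that design in C with POSIX threads (the handler and file names are stand-ins for the real HTTP layer):

    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Single-consumer job queue: the request handler only enqueues and
       acknowledges; one worker drains the queue, so uploads are processed
       strictly one at a time. */
    struct job { char name[64]; struct job *next; };

    static struct job *head, *tail;
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

    static void enqueue(const char *name)
    {
        struct job *j = calloc(1, sizeof *j);
        snprintf(j->name, sizeof j->name, "%s", name);
        pthread_mutex_lock(&lock);
        if (tail) tail->next = j; else head = j;
        tail = j;
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    static void *worker(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            while (!head) pthread_cond_wait(&nonempty, &lock);
            struct job *j = head;
            head = j->next;
            if (!head) tail = NULL;
            pthread_mutex_unlock(&lock);
            printf("processing %s...\n", j->name);
            sleep(2); /* stand-in for the slow processing step */
            free(j);
        }
        return NULL;
    }

    /* Stand-in for the HTTP upload handler: accept, enqueue, acknowledge. */
    static void handle_upload(const char *filename)
    {
        enqueue(filename);
        puts("HTTP/1.1 200 OK (queued for processing)");
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        handle_upload("a.dat");
        handle_upload("b.dat"); /* accepted even though a.dat is still busy */
        sleep(5); /* let the worker finish before the demo exits */
        return 0;
    }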

Handle REST API timeouts in time-consuming operations

How is it possible to handle timeouts for time-consuming operations in a REST API? Let's say we have the following scenario as an example:
A client service sends a request to insert a resource through a REST API.
The timeout elapses. The client thinks the insertion failed.
The REST API keeps working and finishes the insertion.
The client is never notified of the successful insertion, so on its side the status is "Failed".
I can think of a solution with a message broker: send orders to a queue and wait until they are resolved.
Any other workaround?
EDIT 1:
The POST-PUT pattern, as has been suggested in this thread.
A message broker (adds more complexity to the system).
A callback or webhook: pass a return URL in the request that the server API can call to let the client know that the work is completed.
HTTP offers a set of properties for its methods, primarily safety, idempotency, and cacheability. While the first guarantees the client that no data is modified, the second promises that a request can be reissued when connection issues leave the client unsure whether the initial request ever reached the server or only the response was lost mid-way. PUT, for example, provides exactly this idempotency property.
A simple POST request to "insert" some data does not have any of these properties. A server receiving a POST request processes the payload according to its own semantics; the client does not know beforehand whether a resource will be created or the server will simply ignore the request. If the server did create a resource, it informs the client via the Location HTTP response header, which points to the location the client can retrieve information from.
PUT is usually used only to "update" a resource, though according to the spec it can also be used to create a new resource if one does not yet exist. As with POST, on successful resource creation the PUT response should include such a Location HTTP response header to inform the client that a resource was created.
The POST-PUT creation pattern separates the creation of the URI from the actual persistence of the representation: the client first fires off POST requests to the server until a response arrives containing a Location HTTP response header. That header's URI is then used in a PUT request to actually send the payload to the server. As PUT is idempotent, the client can simply reissue the request until it receives a valid response from the server.
On sending the initial POST request, a client can't be sure whether the request reached the server and only the response got lost, or whether the request never made it to the server at all. As the request is only used to create a new URI (without any content yet), the client may simply reissue it and, in the worst case, create an extra URI that points to nothing. The server may have a cleanup routine that frees unused URIs after a certain amount of time.
Once the client receives the URI, it can simply use PUT to reliably send data to the server: as long as it hasn't received a valid response, it just reissues the request until it gets one.
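A rough libcurl sketch of those two steps, with the endpoint and payload as placeholder assumptions:

    #include <curl/curl.h>
    #include <string.h>
    #include <strings.h>
    #include <unistd.h>

    /* Captures the Location response header from the POST step. */
    static size_t on_header(char *buf, size_t size, size_t nitems, void *userdata)
    {
        size_t len = size * nitems;
        char *out = userdata;
        if (len > 9 && strncasecmp(buf, "Location:", 9) == 0) {
            size_t i = 9, j = 0;
            while (i < len && buf[i] == ' ') i++;
            while (i < len && buf[i] != '\r' && buf[i] != '\n' && j < 255)
                out[j++] = buf[i++];
            out[j] = '\0';
        }
        return len;
    }

    int main(void)
    {
        char location[256] = "";
        CURL *curl = curl_easy_init();
        if (!curl) return 1;

        /* Step 1: POST until we learn the resource URI. Reissuing is safe
           because the request carries no payload; at worst it mints an
           extra URI that points to nothing. */
        curl_easy_setopt(curl, CURLOPT_URL, "http://server/items"); /* assumption */
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "");
        curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, on_header);
        curl_easy_setopt(curl, CURLOPT_HEADERDATA, location);
        while (curl_easy_perform(curl) != CURLE_OK || location[0] == '\0')
            sleep(1); /* retry until a response with a Location header arrives */

        /* Step 2: PUT the payload to that URI. PUT is idempotent, so the
           client can repeat it until it gets a valid response. */
        curl_easy_reset(curl);
        curl_easy_setopt(curl, CURLOPT_URL, location);
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PUT");
        curl_easy_setopt(curl, CURLOPT_POSTFIELDS, "{\"name\":\"example\"}");
        while (curl_easy_perform(curl) != CURLE_OK)
            sleep(1);

        curl_easy_cleanup(curl);
        return 0;
    }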
I therefore do not see the need to use a message-oriented middleware (MOM) using brokers and queues in order to guarantee reliable messaging.
You could also cache the data after a successful insertion, keyed by a previously exchanged request_id or something of that sort. But I believe a message broker with an asynchronous task runner is a much better way to deal with the problem, especially if your request threads are a scarce resource. What I mean is: if you are receiving a good amount of requests all the time, it is a good idea to send your responses as quickly as possible so the workers are available for any requests to come.

Send batched data to Google Calendar with Scala/Spray

We are successfully sending data for new, changed, and removed events to Google Calendar from a Scala app using Spray HTTP. However, we are currently sending one event per request, and this becomes very inefficient when there are multiple events for the current user. In these cases we would like to send batched data, as described here:
https://developers.google.com/google-apps/calendar/batch
The documentation begins with:
A batch request is a single standard HTTP request containing multiple Google Calendar API calls, using the multipart/mixed content type. Within that main HTTP request, each of the parts contains a nested HTTP request.
Since we are already using Spray HTTP we would like to use its support for multipart/mixed requests (spray.http.MultipartContent), but it isn't clear that this is possible: the parts must consist of one or more spray.http.BodyPart instances, and there doesn't seem to be a way to turn a spray.http.HttpRequest into a BodyPart.
Has anyone successfully done this? We are also taking a look at the Google API Client for Java but would rather not go down that path if there is a more Scala-friendly way to do it.
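For reference, the batch body the documentation describes looks roughly like this on the wire (boundary, paths, and payloads are illustrative); each part is itself a serialized HTTP request carried with the application/http content type:

    POST /batch HTTP/1.1
    Host: www.googleapis.com
    Content-Type: multipart/mixed; boundary=batch_boundary

    --batch_boundary
    Content-Type: application/http

    POST /calendar/v3/calendars/primary/events HTTP/1.1
    Content-Type: application/json

    {"summary": "Event one"}
    --batch_boundary
    Content-Type: application/http

    DELETE /calendar/v3/calendars/primary/events/abc123 HTTP/1.1

    --batch_boundary--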

HTTP interface for long-running operations

I have a running system that processes short- and long-running operations through a request-response interface based on Agatha-RRSL.
Now we want to change it a little to allow requests to be sent from a website in JSON format, so I'm trying out various REST server implementations that support JSON.
The REST server will be one module or "shelve" handled by Topshelf; another module will do the processing, and the last will run the NoSQL database.
For communication between the REST module and the processing module I'm thinking about a service bus, but we have two types of request: short requests that complete in 1-2 seconds, and long requests that take around a minute.
Is a service bus the right choice here? For long-running operations I'm thinking about returning a "response" containing a token that can be used to request the operation's status and results in a new request. The problem is that a large portion of the requests must behave like synchronous requests in order to complete the HTTP response.
I think I will also have problems with response size (on the MSMQ message transport) when I have to return a huge list of objects.
Any hints?
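For what it's worth, the token idea described above is usually expressed over plain HTTP like this (URIs illustrative):

    POST /operations HTTP/1.1

    HTTP/1.1 202 Accepted
    Location: /operations/42

    GET /operations/42 HTTP/1.1

    HTTP/1.1 200 OK
    Content-Type: application/json

    {"status": "running"}

Once the operation finishes, the same GET returns a "done" status together with (or linking to) the results.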
NServiceBus is not really suitable for request-response messaging patterns. It's more suited to asynchronous publish-subscribe.
Edit: in order to implement a kind of request-response, you would need to send messages in both directions, consisting of three logical steps:
So your client sends a message requesting the data.
The server would receive the message, process it, construct a return message with the data, and send it to the client.
The client can then process the data.
Because each of these steps takes place in isolation and in an asynchronous manner, there can be no meaningful SLA or timeout enforced between when a client sends a request and when it receives a response. But this works nicely for large processing jobs which may take several minutes to complete.
Additionally, a common value which ties the request to the response will need to be present in both messages; otherwise a client could send more than one request, receive multiple responses, and not know which response was for which request.
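To make the correlation idea concrete, a minimal sketch (in C for neutrality; NServiceBus itself would carry this as a message header, and all names here are invented):

    #include <stdio.h>
    #include <string.h>

    /* The same id is stamped on the request and echoed back on the response,
       so the client can match them even with several requests in flight. */
    struct message {
        char correlation_id[37]; /* e.g. a UUID string */
        char body[128];
    };

    static int belongs_to(const struct message *req, const struct message *resp)
    {
        return strcmp(req->correlation_id, resp->correlation_id) == 0;
    }

    int main(void)
    {
        struct message req  = { "1b4e28ba-2fa1-11d2-883f-0016d3cca427", "fetch report" };
        struct message resp = { "1b4e28ba-2fa1-11d2-883f-0016d3cca427", "report data" };
        printf("response matches request: %s\n", belongs_to(&req, &resp) ? "yes" : "no");
        return 0;
    }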
So you can do this with NServiceBus but it takes a little more thought.
Also, NServiceBus uses MSMQ as the underlying transport, not HTTP.

How to correctly handle HTTP Digest Authentication on iPhone

I'm trying to upload a file onto my personal server.
I've written a small PHP page that works flawlessly so far.
The slightly weird thing is that I generate the whole body of the HTTP message I'm going to send (let's say it amounts to ~4 MB) and then send the request to my server.
The server then issues an authentication challenge, and my delegate connection:didReceiveAuthenticationChallenge: replies to the server with the proper credentials and the data.
But guess what happened? The data was sent twice!
In fact, I noticed after adding a progress bar that the app sends the data (4 MB), the server asks for authentication, and the app re-sends the data with the authentication (another 4 MB). So in the end I've sent 8 MB. That's wrong.
I started googling and searching for a solution but I can't figure out how to fix this.
As I see it, there are two possible approaches:
Share the realm for the whole session (a minimal HTTP request, then the challenge, then the data).
Perform the HTTP connection synchronously (which I do not want to do, since it seems an ugly way to handle this kind of thing).
Thank you
You've run into a flaw in the HTTP protocol: when you send a request with no credentials, you have to send all the data before getting the response with the auth challenge. You can try making a small round trip as the first request in the same session (as you mentioned), such as a HEAD request; future requests will then share the same nonce.
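The same idea sketched with libcurl (the original question uses NSURLConnection, whose API differs; this just shows the shape of the fix). libcurl caches the digest state on the handle, so the cheap HEAD absorbs the 401 challenge and the real upload can authenticate up front:

    #include <curl/curl.h>
    #include <stdio.h>

    int upload_with_digest(const char *url, const char *userpwd,
                           FILE *body, long body_len)
    {
        CURL *curl = curl_easy_init();
        if (!curl) return -1;

        curl_easy_setopt(curl, CURLOPT_URL, url);
        curl_easy_setopt(curl, CURLOPT_HTTPAUTH, (long)CURLAUTH_DIGEST);
        curl_easy_setopt(curl, CURLOPT_USERPWD, userpwd); /* "user:pass" */

        /* Round 1: HEAD only; the 401 + WWW-Authenticate costs a few bytes. */
        curl_easy_setopt(curl, CURLOPT_NOBODY, 1L);
        curl_easy_perform(curl);

        /* Round 2: the real upload, now able to authenticate immediately. */
        curl_easy_setopt(curl, CURLOPT_NOBODY, 0L);
        curl_easy_setopt(curl, CURLOPT_UPLOAD, 1L);
        curl_easy_setopt(curl, CURLOPT_READDATA, body);
        curl_easy_setopt(curl, CURLOPT_INFILESIZE, body_len);
        CURLcode res = curl_easy_perform(curl);

        curl_easy_cleanup(curl);
        return res == CURLE_OK ? 0 : -1;
    }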
Too late to answer the original requester, but in time for anybody else who reads this.
TL;DR: Section 8.2.3 of RFC 2616 describes the 100 Continue status, which is exactly what you need in such a situation.
Also have a look at sections 10.1.1 and 14.20.
The client sends a request with an "Expect: 100-continue" header, pausing before sending the body. The server uses the headers it has already received to decide whether the request may be accepted: whether the entity (the body) to be received is not too large, whether the user's credentials are correct, and so on. If the request is acceptable, the server replies with a "100 Continue" status code; the client sends the body, and the server replies with the final status code for the request. Conversely, if the request is not acceptable, the server replies with a 4xx status code ("413 Request Entity Too Large" if the announced body is indeed too large, or "401 Unauthorized" plus the WWW-Authenticate: header), and the client does not send the body. Having been answered with a 401 status code and the corresponding WWW-Authenticate: information, the client can now perform the request again and provide its credentials.
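Spelled out as a wire exchange (values illustrative):

    PUT /upload HTTP/1.1
    Host: example.org
    Expect: 100-continue
    Content-Length: 4194304

    -- client pauses here; the 4 MB body has not been sent yet --

    Either the server accepts:

    HTTP/1.1 100 Continue

    -- client sends the body; server then replies with the final status --

    Or it refuses:

    HTTP/1.1 401 Unauthorized
    WWW-Authenticate: Digest realm="upload", nonce="abc123"

    -- body never sent; client repeats the request with credentials --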