Should users await for a response after the http request in a saga pattern architecture? - rest

I am designing a microservice architecture, using a database per service pattern.
Following the example of Order Service and Shipping Service, when a user makes an HTTP REST request to the Order Service, this one fires an event to notify shipping service. All this happens asynchronously. So, what happens with the user experience? I mean, the user needs an immediate response from the HTTP request. How can I handle this scenario?

All this happens asynchronously. So, What happen with the user experience? I mean, the user needs an immediately response from the HTTP request. How can I handle this scenario?
Respond as soon as you have stored the request.
Part of the point of microservices is that you have a system composed of independently deployable elements that do not require coordination.
If you want a system that is reliable even though the services don't have 100% uptime, then you need to have some form of durable message storage so that the sender and the receiver don't need to be running at the same time.
Therefore, your basic pattern for data from the outside is that the information from the incoming HTTP request is copied, not directly into a running service, but instead into the message store, to be processed by the service at some later time.
In other words, your REST API is a facade in front of your storage, not in front of the service itself.
The actor model may be a useful analogy; information moves around by copying messages into different inboxes, and are later consumed by the subscribing actor.
From the perspective of the client, the HTTP response is an acknowledgement that the request has been received and recognized as valid. Think "thank you for your order, we'll send you an email when your purchase is ready for pick up."
On the web, we would include in the response links to other useful resources; click here to see the status of your order, click there to see your history of recent orders, and so on.

Related

Idempotent Keys From Client's Perspective

Suppose I have an API that calls a downstream service's API called /charge (POST). Suppose while doing charge, a timeout happened at the reverse-proxy and I got a 5xx. But the charge actually happened.
In this case, I would respond with a 5xx to my consumer. Now, if the consumer calls with the same idempotent key, then his request can succeed as the downstream service would return a cached copy of the response. But if he uses a different idempotent key while calling my API, he would keep getting 409s as the payment was already charged.
Here's my two questions:
How does the client know when to retry with the same idempotentId or initiate a new request altogether?
(Augmenting the previous question) How does the UI make the decision to use different idempotent Ids? Does each new request contain a new Id and only the retry logic reuses the same Id?
Basically, I am trying to understand idempotent keys from the client
's perspective.
A timeout should be retried automatically a few times before returning a failure response to the user. Thus if the error is transient, the user wouldn't notice any issue (except possibly a negligible delay in response).
The request originating system should maintain a log of all requests with their status. Thus if the glitch persists for a longer duration, the system can retry failed requests periodically as well as provide a detailed UI view of the submitted requests to the user. This eliminates the need for the user to ever retry a request. The system will do that on user's behalf.

Two channels for one API

We have a SaaS. It consists of Single Page application (client), Gateway, Data Service 1, Data Service 2 and Notification Service.
Client talk with Gateway (using REST) and service route the request to appropriate Data Service (1 or 2) or do own calculations.
One request from the client can be split on multiple at Gateway service. The result is an aggregation of responses from the sub-services.
Notification Service - is a service which pushing information about changes made by other users using MQ and WebSocket connection to the client. Notification can be published by any service.
With enginers, we had a discussion how the process can be optimized.
Currently, the problem that Gateway spending a lot of time just waiting for the response from Data Services.
One of the proposals is letting Gateway service response 200 Ok as soon as message pushed to the Data Service and let client wait for operation progress throw Notification channel (WebSocket connection).
It means that client always sends HTTP request for operation and get confirmation that operation is executed by WebSocket from the different endpoint.
This schema can be hidden by providing JS client library which will hide all this internal complexity.
I think something wrong with this approach. I have never seen such design. But I don't have valuable arguments against it, except complexity and two points of failure (instead of one).
What do you think about this design approach?
Do you see any potential problems with it?
Do you know any public solutions with
such approach?
Since your service is slow it might makes sense to treat it more like a batch job.
Client sends a job request to Gateway.
Gateway returns a job ID immediately after accepting it from the Client.
Client periodically polls the Gateway for results for that job ID.

Long-running operations in web-application

Web application operations are generally meant to be quick to avoid long wait times to users. However, some operations the web application may perform may be computationally-intensive and take a fair bit of time. What is the best practice in REST to deal with such operations that may be take several minutes yet require an immediate response to users? Is it okay for the web application to take several minutes to return the response of the HTTP request, or is it better to return a 202 response, process in the background somewhere else, and then provide some form of notification to the user?
Is it okay for the web application to take several minutes to return the response of the HTTP request
No. Part of the problem with this approach is that if the server doesn't acknowledge the request in a timely fashion, the client won't know that it reached its intended destination.
is it better to return a 202 response, process in the background somewhere else, and then provide some form of notification to the user?
Yes. That's exactly what 202 Accepted is designed for
The 202 response is intentionally noncommittal. Its purpose is to allow a server to accept a request for some other process (perhaps a batch-oriented process that is only run once per day) without requiring that the user agent's connection to the server persist until the process is completed. The representation sent with this response ought to describe the request's current status and point to (or embed) a status monitor that can provide the user with an estimate of when the request will be fulfilled.
It can help, I think, to remember that we're talking about your integration domain; the client isn't talking to your app. It's instead talking to your API, which pretends to be a web site that the client can integrate with. So your client sends the request to the API, and the API responds with an accepted message accompanied by a bunch of links that will help the client continue with the protocol and eventually reach its goal.

Long running REST API with queues

We are implementing a REST API, which will kick off multiple long running backend tasks. I have been reading the RESTful Web Services Cookbook and the recommendation is to return HTTP 202 / Accepted with a Content-Location header pointing to the task being processed. (e.g. http://www.example.org/orders/tasks/1234), and have the client poll this URI for an update on the long running task.
The idea is to have the REST API immediately post a message to a queue, with a background worker role picking up the message from the queue and spinning up multiple backend tasks, also using queues. The problem I see with this approach is how to assign a unique ID to the task and subsequently let the client request a status of the task by issuing a GET to the Content-Location URI.
If the REST API immediately posts to a queue, then it could generate a GUID and attach that as an attribute on the message being added to the queue, but fetching the status of the request becomes awkward.
Another option would be to have the REST API immediately add an entry to the database (let's say an order, with a new order id), with an initial status and then subsequently put a message on the queue to kick off the back ground tasks, which would then subsequently update that database record. The API would return this new order ID in the URI of the Content-Location header, for the client to use when checking the status of the task.
Somehow adding the database entry first, then adding the message to the queue seems backwards, but only adding the request to the queue makes it hard to track progress.
What would be the recommended approach?
Thanks a lot for your insights.
I assume your system looks like the following. You have a REST service, which receives requests from the client. It converts the requests into commands which the business logic can understand. You put these commands into a queue. You have a single or multiple workers which can process and remove these commands from the queue and send the results to the REST service, which can respond to the client.
Your problem that by your long running tasks the client connection timeouts, so you cannot send a response. So what you can do is sending a 202 accepted after you put the commands into the queue and add a polling link, so the client will be able to poll for the changes. Your tasks have multiple subtasks so there is progress, not just pending and complete status changes.
If you want to stick with polling, you should create a new REST resource, which contains the actual state and the progress of the long running task. This means that you have to store this info in a database, so the REST service will be able to respond to requests like GET /tasks/23461/status. This means that your worker has to update the database when it is completed a subtask or the whole task.
If your REST service is running as a daemon, then you can notify it by progress, so storing the task status in the database won't be the responsibility of the worker. This kind of REST service can store the info in the memory as well.
If you decide to use websockets to notify the client, then you can create a notification service. By REST you have to respond with a task id. After that you send back this task id on the websocket connection, so the notification service will know which websocket connection subscribed to the events of a certain task. After that you won't need the REST service, you can send the progress through the websocket connection as long as the client does not close the connection.
You can combine these solutions the following way. You let your REST service to create a task resource, so you'll be able to access the progress by using a polling link. After that you send back an identifier with 202 which you send back through the websockets connection. So you can use a notification service to notify the client. By progress your worker will notify the REST service, which will create a link like GET /tasks/23461/status and send that link to the client through the notification service. After that the client can use the link to update its status.
I think the last one is the best solution if your REST service runs as a daemon. It is because you can move the notification responsibility to a dedicated notification service, which can use websockets, polling, SSE, whatever you want. It can collapse without killing the REST service, so the REST service will stay stable and fast. If you send back a manual update link too with the 202, then the client can do manual update (assuming a human controlled client), so you will have something like graceful degradation if the notification service is not available. You don't have to maintain the notification service because it won't know anything about the tasks, it will just send data to the clients. Your worker won't have to know anything about how to send notifications and how to create hyperlinks. It will be easier to maintain the client code too, since it will be almost a pure REST client. The only extra feature will be the subscription for the notification links, which does not change frequently.

http interface for long operation

I have a running system that process short and long running operations with a Request-Response interface based on Agatha-RRSL.
Now we want to change a little in order to be able to send requests via website in Json format so i'm trying many REST server implementation that support Json.
REST server will be one module or "shelve" handled by Topshelf, another module will be the processing module and the last the NoSQL database runner module.
To talk between REST and processing module i'm thinking about a servicebus but we have two types of request: short requests that perform work in 1-2 seconds and long requests that do work in 1 minute..
Is servicebus the right choice for this work? I'm thinking about returning a "response" for long running op with a token that can be used to request operation status and results with a new request. The problem is that big part of the requests must be used like sync request in order to complete http response.
I think I have also problems with response size (on MSMQ message transport) when I have to return huge list of objects
Any hint?
NServiceBus is not really suitable for request-response messaging patterns. It's more suited to asynchronous publish-subscribe.
Edit: In order to implement a kind of request response, you would need to message in both directions, but consisting of three logical steps:
So your client sends a message requesting the data.
The server would receive the message, process it, construct a return message with the data, and send it to the client.
The client can then process the data.
Because each of these steps takes place in isolation and in an asynchronous manner there can be no meaningful SLA or timeout enforced between when a client sends a request and receives a response. But this works nicely for large processing job which may take several minutes to complete.
Additionally a common value which can be used to tie the request to the response will need to be present in both messages. Otherwise a client could send more than one request, and receive multiple responses and not know which response was for which request.
So you can do this with NServiceBus but it takes a little more thought.
Also NServiceBus uses MSMQ as the underlying transport, not http.