Sending response in a Pub-Sub architecture - publish-subscribe

At the moment I have a single AWS EC2 instance which handles all incoming http client requests. It analyses each request and then decides which back end worker server should handle the request and then makes a http call to the chosen server. The back end server then responds when it has processed the request. The front end server will then respond to the client. The front end server is effectively a load balancer.
I now want to go to a Pub-Sub architecture instead of the front end server pushing the requests to the back end instances. The front end server will do some basic processing and then simply put the request into an SNS queue and the logic of which back end server should handle the request is left to the back end servers themselves.
My question is with this model what is the best way to have the back end servers notify the front end server that they have processed the request? Previously they just replied to the http request the front end server sent but now there is no direct request, just an item of work being published to a queue and a back end instance picking it off the queue.

Pubsub architectures are not well suited to responses/acknowledgements. Their fire-and-forget broadcasting pattern decouples publishers and the subscribers: a publisher does not know if or how many subscribers there are, and the subscribers do no know which publisher generated a message. Also, it can be difficult to guarantee sequence of responses, they won't necessarily match the sequence of messages due to the nature of network comms and handling of messages can take different amounts of time etc. So each message that needs to be acknowledge needs a unique ID that the subscriber can include in its response so the publisher can match a response with the message sent. For example:
publisher sends message "new event" and provides a UUID for the
event
many subscribers get the message; some may be the handlers for
the request, but others might be observers, loggers, analytics, etc
if only one subscriber handles the message (e.g. the first
subscriber to get a key from somewhere), that subscriber generates a
message "new event handled" and provides a UUID
the original
publisher, as well as any number of other subscribers, may get that
message;
the original publisher sees the ID is
in its cache as an unconfirmed message, and now marks it as
confirmed
if a certain amount of time passes without receiving a
confirmation with given ID, the original publisher republishes the
original message, with a new ID, and removes the old ID from cache.
In step 3, if many subscribers handled the message instead of just one, then it
less obvious how the original publisher should handle "responses": how does it
know how many subscribers handle the message, some could be down or
too busy to respond, or some may be in the process of responding by the time
the original publisher determines that "not enough handlers have
responded".
Publish-subscribe architectures should be designed to not request any response, but instead to check for some condition that should have happened as a result of the command being handled, such as a thumbnail having gotten generated (it can assume as a result of a handler of the message).

Related

How to test a verticle that does not wait for acks to its messages?

I want to test a worker verticle that receives requests over EventBus and sends the results also over EventBus. A single request may result in 0,1,2,... responses - in general cases we don't know how many responses we'll get.
The business logic is that requests are acked once the processing is complete, however the responses are sent in "fire and forget" manner - therefore we only know the responses were sent, not necessarily that they were delivered already.
I am writing a test for this verticle.
The test code is planned to be like this:
1. set up consumer for responses
2. send a request
3. wait until request is acked by the worker verticle
4. wait until consumer finishes validating the responses
The problem here is step 4 - in general case we don't know if there are still some responses in flight or not.
A brute force solution is obviously to wait some reasonable time - a few milliseconds is usually enough. However. I'd prefer something more conceptual.
A solution that comes to my mind is this:
send some request for which we know for sure that there would be a single response;
wait until the consumer receives the corresponding response.
That should work, but I dislike the fact that I pump two messages through the SUT instead of just a single one.
A different solution would be to send one extra response from test code, once we have a confirmation that the request was processed - but would it be considered to be the same sender? The EventBus only guarantees delivery order from the same sender, not from different ones. The test doesn't run in cluster mode, all operations are performed on the same machine, though not necessarily in the same thread.
Yet another solution would be to somehow check that EventBus is now empty, but as I understand, this is not possible.
Is there any other (better) solution?
The solution I would choose now (after half a year more experience with vertx/EventBus) is to send two messages.
The second message would get acked only after the processing of the first one is complete.
This would only work if you have a single consumer so that your two messages can't be processed in parallel.

Handling REST request with Message Queue

I have two applications as stated below:
Spring boot application - Acts as rest end point, publishes the request to message queue. ( Apache Pulsar )
Heron (Storm) topology - which processes the message received from Message queue ( PULSAR ) and has all logic for processing.
My requirement, i need to serve different user queries through Spring boot application, which emits that query to message queue, and is consumed at spout. Once spout and bolts process the requests, a message is published again from bolt. That response from Bolt is handled at Spring boot (consumer) and replies to the user request. Typlically as shown below:
To serve to the same request, Im right now caching the deferred result object ( I set a reqID to each message which is sent to topology and I also maintain a key, value pair for ) in memory and when the message arrives I parse the request id and set the result to the defferedResult (I know this is a bad design, HOW SHOULD ONE SOLVE THIS ISSUE ?).
How can I proceed to serve the response back to the same request in this scenario where the order of messages received from topology is not sequential ( as each request which is processes takes its own time and producer bolt will fire the response as on when it is receives one ).
Im kind of stuck with this design and not able to proceed further.
//Controller
public DeferredResult<ResponseEntity<?>> process(//someinput) {
DeferredResult<ResponseEntity<?>> result = new DeferredResult<>(config.getTimeout());
CompletableFuture<String> serviceResponse = service.processAsync(inputSource);
serviceResponse.whenComplete((response, exception) -> {
if (!ObjectUtils.isEmpty(exception))
result.setErrorResult(//error);
else
result.setResult(//complete);
});
return result;
}
//In Service
public CompletableFuture processAsync(//input){
producer.send(input);
CompletableFuture result = new CompletableFuture();
//consumer has a listener as shown below
// **I want to avoid below line, how can I redesign this**
map.put(id, result);
return result;
}
//in same service, a listener is present for consumer for reading the messages
consumerListener(Message msg){
int reqID = msg.getRequestID();
map.get(reqID).complete(msg.getData);
}
As shown above as soon as I get a message I get the completableFuture
object and set the result, which interally calls the defferred result
object and returns the response to the user.
How can I proceed to serve the response back to the same request in this scenario where the order of messages received from topology is not sequential ( as each request which is processes takes its own time and producer bolt will fire the response as on when it is receives one ).
It sounds like you are looking for the Correlation Identifier messaging pattern. In broad strokes, you compute/create an identifier that gets attached to the message sent to pulsar, and arrange that Heron copies that identifier from the request it receives to the response it sends.
Thus, when your Spring Boot component is consuming messages from pulsar at step 5, you match the correlation id to the correct http request, and return the result.
Using the original requestId() as your correlation identifier should be fine, as far as I can tell.
To serve to the same request, Im right now caching the deferred result object ( I set a reqID to each message which is sent to topology and I also maintain a key, value pair for ) in memory and when the message arrives I parse the request id and set the result to the defferedResult (I know this is a bad design, HOW SHOULD ONE SOLVE THIS ISSUE ?).
Ultimately, you are likely to be doing that at some level; which is to say that the consumer at step 5 is going to be using the correlation id to look up something that was stored by the producer. Trying to pass the original request across four different process boundaries is likely to end in tears.
The more general form is to store a callback, rather than a CompletableFuture, in the map; but in this case the callback probably just completes the future.
The one thing I would want to check carefully in the design: you want to be sure that the consumer at step 5 sees the future it is supposed to use before the message arrives. In other words, there should be a happens-before memory barrier somewhere to ensure that the map lookup at step 5 doesn't fail.

If websocket messages arrive in order, why does 'sequence number' is needed?

I heard that websocket messages are received in order, because websocket runs over TCP.
Then what is the purpose of 'sequence number'?
This is the explanation of sequence number in websocket.
But I'm wondering why does that sequence number is needed, if we have a 'in-order' received message.
The sequence number allows you to map your requests to responses even if the responses don't come in the order you make them.
HTTP and other relevant protocols support pipelining. Also there is no need for the request responses to be sent back to you in any specific order. Each one may be processed according to its individual cost or dispatched across a server farm and reassembled in an order that is not predetermined. Either way, if they are out of order you will need a key to map the response back to your request.

Netty 4.0 SO_Keeplive one connection send many request to server How to process request concurrently

one connection send many request to server
How to process request concurrently.
Please use a simple example like timeserver or echoserver in netty.io
to illustrate the operation.
One way I could find out is to create a separate threaded handler that will be called as in a producer/consumer way.
The producer will be your "network" handler, giving message to the consumers, therefore not waiting for any wanswear and being able then to proceed with the next request.
The consumer will be your "business" handler, one per connection but possibly multi-threaded, consuming with multiple instances the messages and being able to answer using the Netty's context from the connection from which it is attached.
Another option for the consumer would be to have only one handler, still multi-threaded, but then message will come in with the original Netty's Context such that it can answear to the client, whatever the connection attached.
But the difficulties will come soon:
How to deal with an answear among several requests on client side: let say the client sends 3 requests A, B and C and the answears will come back, due to speed of the Business handler, as C, A, B... You have to deal with it, and knowing for which request the answer is.
You have to ensure all the ways the context given in parameter is still valid (channel active), if you don't want to have too many errors.
Perhaps the best way would be to however handle your request in order (as Netty does), and keep the answear's action as quick as possible.

async call back using scala and play2 or spray

I have a systems design challenge that I would like to get some community feedback on.
Basic system structure:
[Client] ---HTTP-POST--> [REST Service] ---> [Queue] ---> [Processors]
[Client] POSTs json to [REST Service] for processing.
Based on request, [Rest Services] sends data to various queues to be picked up by various processors written in various languages and running in different processes.
Work is parallelized in each processor but can still take up to 30 seconds to process. The time to process is a function of the complexity of the data and cannot be speed up.
The result cannot be streamed back to the client as it is completed because there is a final post processing step that can only be completed once all the sub steps are completed.
Key challenge: Once the post processing is complete, the client either needs to:
be sent the results after the client has been waiting
be notified async that the job is completed and passed an id to request the final result
Design requirements
I don't want to block the [REST Service]. It needs to take the incoming request, route the data to the appropriate queues for processing in other processes, and then be immediately available for the next incoming request.
Normally I would have used actors and/or futures/promises so the [REST Service] is not blocked when waiting for background workers to complete. The challenge here is the workers doing the background work are running in separate processes/VMs and written in various technology stacks. In order to pass these messages between heterogeneous systems and to ensure integrity of the request lifetime, a durable queue is being used (not in memory message passing or RPC).
Final point of consideration, in order to scale, there are a load balanced set of [REST Services] and [Processors] in respective pools. Therefore, since the messages from the [REST Service] to the [Processor] need to be sent asynchronously via a queue (and everything is running is separate processes), there is no way to correlate the work done in a background [Processor] back to its original calling [REST Service] instance in order to return the final processed data in a promise or actor message and finally pass the response back to the original client.
So, the question is, how to make this correlation? Once the all the background processing is completed, I need to get the result back to the client either via a long waited response or a notification (I do not want to use something like UrbanAirship as most of the clients are browsers or other services.
I hope this is clear, if not, please ask for clarification.
Edit: Possible solution - thoughts?
I think I pass a spray RequestContext to any actor which can then response back to the client (does not have to be the original actor that received HTTP request). If this is true, can I cache the RequestContext and then use it later to asynchronously send the response to the appropriate client using this cached RequestContext when the processing is completed?
Well, it's not the best because it requires more work from your Client, but it sounds like you want to implement a webhook. So,
[Client] --- POST--> [REST Service] ---> [Calculations] ---> POST [Client]
[Client] --- GET
For explanation:
Client sends a POST request to your service. Your Service then does whatever processing necessary. Upon completion, your service will then send an HTTP-POST to a URL that the Client has already set. With that POST data, the Client will then have the necessary information to then do a GET request for the completed data.