Asynchronous task queues in Google Container Engine

I'm trying to figure out a portable way to build a custom but scalable task queue in my cluster on Google Container Engine. This is the scenario: I have a front end that captures user details in my Node.js instance. These details are sent to the API system, which in turn contacts the DB, saves the user details, and is expected to send a welcome mail.
My issue is this: I don't want to use the same API endpoint method to process the mail-sending requests. I need another process to handle that. How do I do this with my Kubernetes infrastructure? Do I need to implement a pub/sub type of system to publish to another container? If I do, all subscribers will be notified of my update, but what if I have 2 instances of my subscriber system running? They will both observe the change and send the mail twice. Any thoughts or ideas on this would be appreciated.

I see two reasonable ways to approach this.
1: Have a service that accepts mailing events through an API and returns immediately after receiving them, then processes the mailing asynchronously. Behind a Kubernetes Service, each request hits only one instance, so one mail is sent without blocking the calling service. The downside is failure handling: if something fails after the request is accepted, the mail might not be generated at all.
2: I would probably go for a message queue (Kafka, RabbitMQ, etc.): have the queue consumed by any number of mailing service instances, make sure that only one of them can pick up a given message, and require an ack for each message, returning it to the queue for reprocessing if no ack arrives within N minutes.
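
To make option 2 concrete, here is a minimal in-memory sketch of the "one consumer per message, ack or redeliver" behaviour. It is not how Kafka or RabbitMQ are implemented and it is not thread-safe; the class AckQueue, its method names, and the ack_timeout_s parameter are all hypothetical names made up for illustration. A real broker gives you the same guarantees across processes.

```python
import time
import uuid
from collections import deque


class AckQueue:
    """Hypothetical in-memory stand-in for a broker queue with acks."""

    def __init__(self, ack_timeout_s, clock=time.monotonic):
        self.ack_timeout_s = ack_timeout_s
        self.clock = clock        # injectable so the timeout can be simulated
        self.ready = deque()      # (message_id, body) waiting for a consumer
        self.in_flight = {}       # message_id -> (body, redelivery deadline)

    def publish(self, body):
        message_id = str(uuid.uuid4())
        self.ready.append((message_id, body))
        return message_id

    def _requeue_expired(self):
        # Messages whose consumer never acked go back to the ready queue.
        now = self.clock()
        expired = [m for m, (_, deadline) in self.in_flight.items() if deadline <= now]
        for message_id in expired:
            body, _ = self.in_flight.pop(message_id)
            self.ready.append((message_id, body))

    def receive(self):
        # Hands each ready message to exactly one caller at a time.
        self._requeue_expired()
        if not self.ready:
            return None
        message_id, body = self.ready.popleft()
        self.in_flight[message_id] = (body, self.clock() + self.ack_timeout_s)
        return message_id, body

    def ack(self, message_id):
        # True if the message was in flight and is now removed for good.
        return self.in_flight.pop(message_id, None) is not None
```

With two mailer instances calling receive(), only one gets a given welcome-mail message. If that instance crashes before calling ack(), the message becomes available again once the timeout passes, so the mail is still sent. Note that redelivery means a mail can occasionally be sent twice when a worker is slow rather than dead, so the timeout should be longer than the slowest expected send.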

Related

ExpressJS: expose Event Processing system as a REST Service API

I am looking for a way to expose an existing event processing system to the external world through a REST interface. In the existing design we have RabbitMQ message queues where a publisher can post a message and then wait for the processed result on a separate queue. The message ID is used to match the output on the output queue to the original message.
Now I want to expose this to external consumers, but we don't want to expose our RabbitMQ endpoint, so I was wondering if anyone has managed to achieve something similar using ExpressJS. The diagram above shows the current thought process.
The main challenge I am facing is that some of this message processing can take more than a couple of minutes, so I'm not sure how best to design an API like this. Should I create a polling interface for the client, or is there a technology these days that removes the need for the client to poll the API to check whether the message has been processed and fetch the result?
Can someone please help me with a good approach for managing this sort of requirement?

I finally ended up going the webhook way. Now when the REST API service receives a request, the client must also provide a webhook URL. It is registered with the client's request, and the server calls it back when the results are available.
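
A rough sketch of that webhook registration, written in Python rather than ExpressJS to keep it short and framework-free. Everything here is an assumption for illustration: the class WebhookJobService, the publish hook (which would post to the RabbitMQ request queue), and the send_callback hook (which would make the HTTP POST to the client's webhook URL) are hypothetical names, not part of any library.

```python
import uuid


class WebhookJobService:
    """Hypothetical REST-side wrapper around a request/result queue pair."""

    def __init__(self, publish, send_callback):
        self.publish = publish              # publish(message_id, payload)
        self.send_callback = send_callback  # send_callback(url, body)
        self.webhooks = {}                  # message_id -> client webhook URL

    def submit(self, payload, webhook_url):
        # Register the webhook against the message ID, then hand off the work.
        message_id = str(uuid.uuid4())
        self.webhooks[message_id] = webhook_url
        self.publish(message_id, payload)
        # 202 Accepted: the work is queued, not finished.
        return 202, {"message_id": message_id}

    def on_result(self, message_id, result):
        # Called by whatever consumes the output queue.
        webhook_url = self.webhooks.pop(message_id, None)
        if webhook_url is None:
            return False  # unknown or already delivered
        self.send_callback(webhook_url, {"message_id": message_id, "result": result})
        return True
```

The message ID does the same job it already does on the output queue: it ties the result back to the original request, and here also to the webhook that should receive it. In a real deployment the webhooks mapping would need to live in shared storage so that any instance can deliver the callback.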

Wrap event based system with REST API

I'm designing a system that uses a microservices architecture with event-based communication (using Google Cloud Pub/Sub).
Each of the services is listening and publishing messages so between the services everything is excellent.
On top of that, I want to provide a REST API that users can use without breaking the event-based approach. However, if I have an endpoint that triggers event X, how will I send the response to the user? Does it make sense to create a subscriber for a "ProcessXComplete" event and then return 200 OK?
For example:
I have the following microservices:
Service A
Service B
Frontend Service - REST Endpoints
I want to send the request "POST /posts"; this request is sent to the frontend service.
The frontend service should trigger "NewPostEvent."
Both Service A and Service B will listen to this event and do something.
So far, so good, but here is where things are starting to get messy for me.
Now I want to return a valid response to the user who made the request, saying that the operation completed.
How can I know that all services have finished their tasks, and how do I create the handler that returns this response?
Does it even make sense to go this way, or is there a better design for combining event-based communication between services with a REST API?

What you're describing is absolutely one of the challenges of event-based programming and how eventual-consistency (and lack of atomicity) coordinates with essentially synchronous UI/UX.
It generally does make sense to have an EventXComplete event. Our microservices publish events on completion of anything that could potentially fail. So, there are lots of ServiceA.EventXSuccess events flowing through the queues. I'm not familiar with Google Cloud PubSub specifically, but in general in messaging systems there is little extra cost to publishing messages that have few (or no) subscribers, since nothing on the consuming side requires compute power. So, we tend to over-articulate service status by default; it's easy to come back later and tone down messaging as needed. In fact, some of our newer services have Messaging Verbosity configurable via an Admin API.
The Frontend Service (which here is probably considered a Gateway Service or Facade Layer) has taken on the responsibility of being a responsive backing for your UI, so it needs to, in fact, BE responsive. In this example, I'd expect it to persist the User's POST request, return a 200 response and then update its local copy of the request based on events it's subscribed to from ServiceA and ServiceB. It also needs to provide a mechanism (events, email, webhook, gRPC, etc.) to communicate from the Frontend Service back to any UI if failure happens (maybe even if success happens). Which communication you use depends on how important and time-sensitive the notification is. A good example of this is getting an email from Amazon saying billing has failed on an Order you placed. They let you know via email within a few minutes, but they don't make you wait for the ExecuteOrderBilling message to get processed in the UI.
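
As a sketch of that gateway behaviour, assuming Python and a made-up event shape: the class PostGateway, the publish hook, the on_event signature, and the "success"/"failure" outcome strings are all hypothetical, and the set of required services is taken from the question's example.

```python
import uuid


class PostGateway:
    """Hypothetical frontend/gateway service for POST /posts."""

    REQUIRED = {"ServiceA", "ServiceB"}  # services that must report success

    def __init__(self, publish):
        self.publish = publish  # publish(event_name, payload), e.g. to Pub/Sub
        self.requests = {}      # local copy of each request and its progress

    def create_post(self, body):
        # Persist the request, fire the event, and answer right away.
        request_id = str(uuid.uuid4())
        self.requests[request_id] = {"body": body, "done": set(), "status": "pending"}
        self.publish("NewPostEvent", {"request_id": request_id, "body": body})
        return 200, {"request_id": request_id}

    def on_event(self, service, outcome, request_id):
        # Subscribed handler for completion events from the other services.
        record = self.requests.get(request_id)
        if record is None or record["status"] == "failed":
            return
        if outcome == "failure":
            record["status"] = "failed"  # this is where the user gets notified
            return
        record["done"].add(service)
        if record["done"] >= self.REQUIRED:
            record["status"] = "complete"
```

The HTTP response only says the request was accepted and stored. Whether the operation actually completed is tracked in the gateway's own record, which the UI can read later or which can trigger an email or webhook on failure.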
Connecting Microservices to the UI has been one of the most challenging aspects of our particular journey; avoiding tight coupling of models/data structures, UI workflows that are independent of microservice process flows, and perhaps the toughest one for us: authorization. These are the hidden dark-sides of this distributed architecture pattern, but they too can be overcome. Some experimentation with your particular system is likely required.

It really depends on your business case. If the REST service drops a message onto a message queue, then right after doing so it simply returns a reference ID that the client can poll to check progress.
E.g. a flight search, where your system has to call hundreds of backend services to show flight deals. The search API drops the message onto the queue, saves it in the database under a reference ID, and returns that ID to the client. Once the workers are done with the message they update the record in the DB with the results, and meanwhile your client polls (or, preferably, uses WebSockets) to update the UI with the results.
The idea is that you don't block the request and you keep everything async; this makes the system scalable.
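
The database side of the flight-search example might look roughly like this. A dict stands in for the database table, and SearchStore, expected_parts, and the method names are hypothetical, chosen only to show the shape of the reference-ID pattern.

```python
import uuid


class SearchStore:
    """Hypothetical stand-in for the DB table keyed by reference ID."""

    def __init__(self):
        self.rows = {}

    def create(self, query, expected_parts):
        # Called by the search API right after it queues the message.
        ref_id = str(uuid.uuid4())
        self.rows[ref_id] = {"query": query, "expected": expected_parts, "results": []}
        return ref_id

    def add_result(self, ref_id, result):
        # Called by a worker each time one backend call finishes.
        self.rows[ref_id]["results"].append(result)

    def poll(self, ref_id):
        # Called by the client; returns whatever has arrived so far.
        row = self.rows.get(ref_id)
        if row is None:
            return None
        done = len(row["results"]) >= row["expected"]
        return {"done": done, "results": list(row["results"])}
```

Because poll() returns partial results, the UI can show the first flight deals as soon as they land and keep refreshing until done is true.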

Is a message queue like RabbitMQ the ideal solution for this application?

I have been working on a project that is basically an e-commerce site. It's a multi-tenant application in which every client has its own domain and the website adjusts itself based on the client's configuration.
If the client already has software that manages their inventory, like an ERP, I would need a medium through which, when the e-commerce site generates an order, external applications like the ERP can be notified that this has happened and take action in response. It would be like raising events across different applications.
I thought about storing these events in a database and having the client make requests in a short interval to fetch the data, but something about polling and using a REST Api for this seems hackish.
Then I thought about using Websockets, but if the client is offline for some reason when the event is generated, the delivery cannot be assured.
Then I encountered message queues, RabbitMQ to be specific. With a message queue, modeling the problem in a simplistic manner, the e-commerce site would produce events on one end and push them to a queue that a client's worker would process as events arrive.
I don't know what the best approach is, to be honest, and would love for some of you experienced developers to give me a hand with this.

I do agree with Steve, using a message queue in your situation is ideal. Message queueing allows web servers to respond to requests quickly, instead of being forced to perform resource-heavy procedures on the spot. You can put your events on the queue and let the consumer/worker handle each one when it has time.
I recommend CloudAMQP for RabbitMQ, it's easy to try out and you can get started quickly. CloudAMQP is a hosted RabbitMQ service in the cloud. I also recommend this RabbitMQ guide: https://www.cloudamqp.com/blog/2015-05-18-part1-rabbitmq-for-beginners-what-is-rabbitmq.html

Your idea of using a message queue is a good one, better than a database or WebSockets for the reasons you describe. With the message queue approach (RabbitMQ, or another server/broker based system such as Apache Qpid) you should consider putting a broker in a "DMZ" sort of network location, so that your internal e-commerce system can push events out to it and your external clients can reach into it without risking direct access to your core business systems. You could also run a separate broker per client.

Using SignalR in Azure Worker Roles

I have an Azure hosted web application which works alongside a number of instances of a worker role. Currently the web app passes work to these workers by placing messages in an Azure queue for the workers to pick up. The workers pass status and progress messages back by placing messages into a 'feedback' queue. At the moment, in order to inform my browser clients as to progress, I make ajax based periodic polling calls in the browser to an MVC controller method which in turn reads the Azure 'feedback' queue and returns these messages as json back to the browser.
Obviously, SignalR looks like a very attractive alternative to this clumsy polling/queuing approach, but I have found very little guidance on how to go about it when multiple worker roles (as opposed to the web role) need to send status to individual clients or to all clients.
The SignalR.WindowsAzureServiceBus by Clemens Vasters looks superb but leaves one a bit high and dry at the end, i.e. a good example solution is lacking.
Added commentary: From my reading so far it seems that no direct communication from a worker role (as opposed to a web role) to a browser client via SignalR is possible. It seems that workers have to communicate with the web role using queues. This in turn forces a polling approach, i.e. the queues must be polled for messages from the worker roles. It appears this polling has to be driven from the browser (how can a polling loop be set up in a web role?).
In summary, SignalR, even with the SignalR.WindowsAzureServiceBus scale-out approach of Clemens Vasters, cannot handle direct communication from worker role to browser.
Any comments from the experts would be appreciated.

You can use your worker roles as SignalR clients, so they will send messages to the web role (which is SignalR server) and the web role in turn will forward messages to clients.

We use Azure Service Bus Queues to send data to our SignalR web roles which then forward on to the clients.
The CAT pages have very good examples of how to set up asynchronous loops and sending.

Keep in mind my knowledge of these two technologies is very basic, I'm just starting. I might have misunderstood your question, but it seems pretty obvious to me:
Aren't web roles capable of subscribing to the queue server where the worker role deposits its messages? If so, there would be no client "pulling": the queue service would hand the web server side code each new message, and through SignalR you would push changes to the client without any client requests involved. The communication between web and worker would remain the same (which, in my opinion, is the proper way to do it).

If you are using one of the SignalR scale-out backplanes, you can get workers talking to connected clients via your web application.
How to publish messages using the SignalR SqlMessageBus explains how to do this.
It also links to a fully worked example which demonstrates one way of doing this.

Alternative message bus products such as NServiceBus could be worth investigating. NServiceBus has the ability to deliver messages asynchronously across process boundaries without the need for polling.

First message not arriving over an MSMQ/MassTransit Service Bus

I've got a MassTransit ServiceBus running over MSMQ. It appears that the first message sent over the bus doesn't arrive, but subsequent messages do.
Is there some initialization that needs performing on the queue or bus before the message is sent?

This depends on a few settings and on how much time the system needs to set up before everything routes correctly. If only the first message fails to end up in the right location, then the subscription data likely hasn't propagated everywhere yet. http://readthedocs.org/docs/masstransit/en/develop/overview/subscriptions.html
Using multicast subscriptions, the easiest choice, requires a few seconds after an endpoint comes up for it to register its subscriptions with all other endpoints. If you can control the order in which services start, this can often be avoided by starting them back to front in the flow.
If you are using the subscription service, it can also take a couple of seconds to get data everywhere. Subscriptions have to go through the subscription service, which then sends them to everyone on the bus. It is tied to a SQL database, and latency to that database can affect this timing.
Lastly, if you are using static routing, then that should work immediately, because the subscriptions are set up at startup.
