Which ProjectReactor Processor to choose? - reactive-programming

What is the most appropriate Project Reactor Processor to use with the following use case:
Say the core function of the game service has 10 REST calls. After each of those calls completes its purpose/activity, we need to update the user's score. There's a separate User Score service that tracks each user's score. Most importantly, we want the calls to the User Score service to be asynchronous and not block the responses to the 10 calls supporting the game. In other words, the updates to the User Score service should happen only after (or asynchronously while) the core REST controller methods return their data to the caller.
The REST controller is built using Spring Webflux, and all calls require network resources (database or other service calls), so they are subscribed on the elastic scheduler. As such, the events will be produced from different threads, so this is a multi-threaded source.
Consuming the events from the Processor's stream should be asynchronous as well, ideally with a way to limit the number of concurrent requests so that we can prioritize the responses to the 10 main calls over the score updates.
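For reference, in recent Reactor releases the Processor classes are deprecated in favour of Sinks, so a sketch of the pattern described above might look like the following (ScoreUpdate, UserScoreClient and the concurrency limit of 2 are hypothetical names and values, not from the question):

```java
import reactor.core.publisher.Mono;
import reactor.core.publisher.Sinks;
import reactor.core.scheduler.Schedulers;

public class ScorePipeline {

    // Assumed reactive client for the separate User Score service.
    public interface UserScoreClient {
        Mono<Void> updateScore(String userId, int points);
    }

    // Hypothetical event carrying a score update.
    public record ScoreUpdate(String userId, int points) {}

    // Multicast sink standing in for a Processor; events are emitted from the
    // REST handler threads and consumed asynchronously below.
    private final Sinks.Many<ScoreUpdate> sink =
            Sinks.many().multicast().onBackpressureBuffer();

    public ScorePipeline(UserScoreClient scoreClient) {
        sink.asFlux()
            // Limit score updates to 2 concurrent calls so the 10 main game
            // calls keep priority on threads and connections.
            .flatMap(u -> scoreClient.updateScore(u.userId(), u.points())
                                     .subscribeOn(Schedulers.boundedElastic()), 2)
            .subscribe();
    }

    // Called from the WebFlux handlers after (or while) they return their response.
    public void publish(ScoreUpdate update) {
        // FAIL_FAST is the simplest handler; with many concurrent emitters you may
        // want a retrying handler instead (e.g. Sinks.EmitFailureHandler.busyLooping).
        sink.emitNext(update, Sinks.EmitFailureHandler.FAIL_FAST);
    }
}
```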

Related

Reactor vs thread

I have a very simple question about Java reactive programming.
Thread Model:
User A (let's say Thread 1) sends a GET request to the application.
Thread 1 will wait/block until it gets a response from the DB (an IO request).
Once the response is received, the block is removed and Thread 1 returns the response to the user.
Reactive programming model:
User A (let's say Thread 1) sends a GET request to the application.
In the reactive model, Thread 1 will have a callback to run, so it won't wait/block.
Question:
Who will run that callback? That is, which thread will run the callback?
What is the event loop mechanism in Reactor? Provide an example in layman's terms.
How do I make use of multiple CPU cores in reactive programming for NIO tasks?
Spring WebFlux has an event loop thread group which by default consists of as many threads as the machine running the application has CPU cores. The callback is run by one of the threads from that group. If you use Spring WebFlux and WebClient (or any other reactive data access library) then you get this multi-threaded behaviour out of the box.
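A minimal sketch that makes the thread handoff visible (the URL is just a placeholder; the callback typically runs on one of the reactor-http-nio event loop threads, not on the thread that submitted the request):

```java
import org.springframework.web.reactive.function.client.WebClient;

public class CallbackThreadDemo {
    public static void main(String[] args) throws InterruptedException {
        WebClient client = WebClient.create("https://example.org"); // placeholder URL

        client.get()
              .retrieve()
              .bodyToMono(String.class)
              // This callback is invoked by an event loop thread once the
              // non-blocking IO completes.
              .subscribe(body ->
                      System.out.println("Callback ran on: " + Thread.currentThread().getName()));

        System.out.println("Request submitted from: " + Thread.currentThread().getName());
        Thread.sleep(2000); // keep the JVM alive long enough to observe the callback (demo only)
    }
}
```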

Wrap event based system with REST API

I'm designing a system that uses a microservices architecture with event-based communication (using Google Cloud Pub/Sub).
Each of the services is listening and publishing messages so between the services everything is excellent.
On top of that, I want to provide a REST API that users can use without breaking the event-based approach. However, if I have an endpoint that triggers event X, how will I send the response to the user? Does it make sense to create a subscriber for a "ProcessXComplete" event and then return 200 OK?
For example:
I have the following microservices:
Service A
Service B
Frontend Service - REST Endpoints
I want to send the request "POST /posts" - this request is sent to the Frontend Service.
The Frontend Service should trigger a "NewPostEvent."
Both Service A and Service B will listen to this event and do something.
So far, so good, but here is where things are starting to get messy for me.
Now I want to return a valid response to the user who made the request, indicating that the operation completed.
How can I know that all services finished their tasks, and how do I create the handler to return this response?
Does it even make sense to go this way, or is there a better design that provides both event-based communication between services and a REST API?
What you're describing is absolutely one of the challenges of event-based programming and how eventual-consistency (and lack of atomicity) coordinates with essentially synchronous UI/UX.
It generally does make sense to have an EventXComplete event. Our microservices publish events on completion of anything that could potentially fail. So, there are lots of ServiceA.EventXSuccess events flowing through the queues. I'm not familiar with Google Cloud PubSub specifically, but in general in Messaging systems there is little extra cost to publishing messages with few (or no) subscribers to require compute power. So, we tend to over-articulate service status by default; it's easy to come back later and tone down messaging as needed. In fact, some of our newer services have Messaging Verbosity configurable via an Admin API.
The Frontend Service (which here is probably considered a Gateway Service or Facade Layer) has taken on the responsibility of being a responsive backing for your UI, so it needs to, in fact, BE responsive. In this example, I'd expect it to persist the User's POST request, return a 200 response and then update its local copy of the request based on events it's subscribed to from ServiceA and ServiceB. It also needs to provide a mechanism (events, email, webhook, gRPC, etc.) to communicate from the Frontend Service back to any UI if failure happens (maybe even if success happens). Which communication you use depends on how important and time-sensitive the notification is. A good example of this is getting an email from Amazon saying billing has failed on an Order you placed. They let you know via email within a few minutes, but they don't make you wait for the ExecuteOrderBilling message to get processed in the UI.
Connecting Microservices to the UI has been one of the most challenging aspects of our particular journey; avoiding tight coupling of models/data structures, UI workflows that are independent of microservice process flows, and perhaps the toughest one for us: authorization. These are the hidden dark-sides of this distributed architecture pattern, but they too can be overcome. Some experimentation with your particular system is likely required.
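As a rough sketch of that subscription side with the Google Cloud Pub/Sub Java client (the project, subscription and attribute names are placeholders, and the local persistence is elided):

```java
import com.google.cloud.pubsub.v1.AckReplyConsumer;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;
import com.google.pubsub.v1.PubsubMessage;

public class PostProcessedListener {

    public static Subscriber start() {
        // Placeholder project and subscription names.
        ProjectSubscriptionName subscription =
                ProjectSubscriptionName.of("my-project", "frontend-post-processed");

        MessageReceiver receiver = (PubsubMessage message, AckReplyConsumer consumer) -> {
            String postId = message.getAttributesOrDefault("postId", "");
            // Update the Frontend Service's local copy of the original POST request
            // (persisted before the 200 was returned) to mark it completed.
            System.out.println("Post processed: " + postId);
            consumer.ack();
        };

        Subscriber subscriber = Subscriber.newBuilder(subscription, receiver).build();
        subscriber.startAsync().awaitRunning();
        return subscriber;
    }
}
```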
It really depends on your business case. If the REST service is just dropping a message onto a message queue, then after dropping the message we simply return a reference ID that the client can poll to check the progress.
E.g. a flight search, where your system has to call hundreds of backend services to show you flight deals. The search API drops the message onto the queue, saves it in the database with a reference ID, and returns that ID to the client. Once the workers are done with the message they update the reference in the DB with the results, and meanwhile the client polls (or, preferably, uses web sockets) to update the UI with the results.
The idea is that you can't block the request; keeping everything async is what makes the system scalable.
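A minimal Spring WebFlux sketch of this accept-then-poll pattern (the endpoint paths and the in-memory map standing in for the database are hypothetical):

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
public class SearchController {

    // Stand-in for a database table keyed by reference ID.
    private final Map<String, String> results = new ConcurrentHashMap<>();

    @PostMapping("/searches")
    public Mono<String> startSearch(@RequestBody String query) {
        String referenceId = UUID.randomUUID().toString();
        results.put(referenceId, "PENDING");
        // In a real system this would publish the query to a queue;
        // workers would later write the result under the same reference ID.
        return Mono.just(referenceId); // respond immediately, no blocking
    }

    @GetMapping("/searches/{id}")
    public Mono<String> pollSearch(@PathVariable String id) {
        return Mono.just(results.getOrDefault(id, "UNKNOWN"));
    }
}
```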

Kafka Microservice Proper Use Cases

In my new work's project, I discovered that instead of directly making POST/PUT API calls from one microservice to another, a microservice produces a message to Kafka, which is then consumed by a single microservice.
For example, the Order microservice publishes a record to the "pending-order" topic, which is then consumed by the Inventory microservice (no other consumer). In turn, after consuming the record and doing some processing, the Inventory microservice produces a record to the "processed-order" topic, which is then consumed only by the Order microservice.
Is this a correct use case? Or is it better to just make API calls between the microservices in this case?
There are two strong use cases of Kafka in a microservice-based application:
You need to make a state change in multiple microservices as part of a single end-user activity. If you do this by calling all the appropriate microservice APIs sequentially or in parallel, there are two issues:
Firstly, you lose atomicity, i.e. you cannot guarantee "all or nothing". It is very well possible that the call to microservice A succeeds but the call to service B fails, and that would lead to permanently inconsistent data. Secondly, in a cloud environment unpredictable latency and network timeouts are not uncommon, so when you make multiple calls as part of a single call, the probability of one of those calls being delayed or failing is higher, impacting user experience. Hence, the general recommendation here is to write the user action atomically to a Kafka topic as an event and have multiple consumer groups - one for each interested microservice - consume the event and make the state change in their own database (a minimal consumer sketch follows this answer). If the action is triggered by the user from a UI, you would also need to provide a "read your own writes" guarantee, where the user wants to see his data immediately after writing. Therefore, you'd need to write the event first in the local database of the first microservice and then do log-based event sourcing (using an appropriate Kafka connector) to transfer the event data to Kafka. This will enable you to show the data to the user from the local DB. You may also need to update a cache, a search index, a distributed file system etc., and all of these can be done by consuming the Kafka events published by the individual microservices.
It is also quite common that you need to pull data from multiple microservices to do some activity or to aggregate data to display to the user. This is generally not recommended because of the latency and timeout issues mentioned above. It is usually recommended to precompute those aggregates in the microservice's local DB, based on the Kafka events published by the other microservices when they change their own state. This will allow you to serve the aggregated data to the user much faster. This is called the materialized view pattern.
The only point to remember here is that writing to the Kafka log/broker and reading from it is asynchronous, so there may be a small time delay.
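A minimal sketch of the consumer side of this pattern with the plain Kafka Java client (the broker address, topic and group names are placeholders); each interested microservice would run its own consumer group over the same event topic:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrderEventsConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        props.put("group.id", "inventory-service");          // one consumer group per microservice
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("pending-order"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Apply the state change in this service's own database,
                    // then publish a "processed-order" event if needed.
                    System.out.printf("order event %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```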
A microservice as the consumer seems fishy to me. You probably mean that listeners to that topic consume the message, and maybe they will call your second microservice, i.e. the Inventory microservice.
Yes, the model is fine, especially when you want asynchronous behaviour and lots of traffic handled through it.
Imagine a scenario where you have more than one microservice to call from one endpoint. Here you need either an aggregation layer, which aggregates your services so you call it once, or you publish several messages to Kafka, which then does the job.
Also think about read services: if you need to call a microservice to read some data from somewhere else, then you can't use Kafka.
It all depends on your requirements and design.

Throttle API calls to external service using Scala

I have a service exposing a REST endpoint that, after a couple of transformations, calls a third-party service also via its REST endpoint.
I would like to implement some sort of throttling on my service to avoid being throttled by this third-party service. Note that my service's endpoint accepts only one request and not a list of them. I'm using Play and we also have Akka Streams as dependency.
My first thought was to have my service save the requests into a database table and then have an Akka Streams Source, leveraging the throttle function, pick up tasks, apply the transformations and then call the external service.
Is this a reasonable approach, or does it have any severe drawbacks?
Thanks!
Why save the requests to the database? Does the queue need to survive restarts and/or do you run a load-balanced setup that needs to somehow synchronize the requests?
If you don't need the above, I'd think using only a Source.queue to store the task data would work just as well.
And maybe you already thought of this: if you want to make your endpoint more resilient, you should allow your API to send a 'sorry, busy' response and drop the request instead of queueing it if your queue grows beyond a certain size.
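A minimal sketch of the queue-plus-throttle idea, shown here with the Akka Streams Java DSL for consistency with the other examples (the Scala DSL is analogous; assumes Akka 2.6+, and callExternalService, the rate of 10/second and the buffer size are placeholders):

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

import akka.actor.ActorSystem;
import akka.stream.OverflowStrategy;
import akka.stream.QueueOfferResult;
import akka.stream.javadsl.Sink;
import akka.stream.javadsl.Source;
import akka.stream.javadsl.SourceQueueWithComplete;

public class ThrottledForwarder {

    private final SourceQueueWithComplete<String> queue;

    public ThrottledForwarder(ActorSystem system) {
        this.queue = Source.<String>queue(1000, OverflowStrategy.dropNew())
                // At most 10 calls per second to the third-party service.
                .throttle(10, Duration.ofSeconds(1))
                .mapAsync(4, this::callExternalService)
                .to(Sink.ignore())
                .run(system);
    }

    // Called by the Play controller for each incoming request.
    public CompletionStage<QueueOfferResult> submit(String request) {
        return queue.offer(request); // inspect the result to answer 'sorry, busy'
    }

    // Stand-in for the real HTTP call to the third-party service.
    private CompletionStage<String> callExternalService(String request) {
        return CompletableFuture.completedFuture("ok");
    }
}
```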

Communication between microservices - request data

I am dealing with communication between microservices.
For example (fictive example, just for the illustration):
Microservice A - Store Users (getUser, etc.)
Microservice B - Store Orders (createOrder, etc.)
Now if I want to add a new Order from the client app, I need to know the user's address. So the request flow would be like this:
Client -> Microservice B (createOrder for userId 5) -> Microservice A (getUser with id 5)
Microservice B will create the order with the details (address) from the User microservice.
PROBLEM TO SOLVE: How do we deal effectively with the communication between microservice A and microservice B, given that we have to wait until the response comes back?
OPTIONS:
Use a REST API,
Use AMQP, like RabbitMQ, and deal with this issue via RPC. (https://www.rabbitmq.com/tutorials/tutorial-six-dotnet.html)
I don't know which will be better for performance. Is a call faster via RabbitMQ, or via a REST API? What is the best solution for a microservice architecture?
In your case using direct REST calls should be fine.
Option 1, use a REST API:
When you need synchronous communication, as in your case, this option is suitable.
Option 2, use AMQP:
When you need asynchronous communication. For example, when your order service creates an order you may want to notify the product service to reduce the product quantity, or you may want to notify the user service that the order was successfully placed.
I highly recommend having a look at http://microservices.io/patterns/index.html
It all depends on your services' communication behaviour whether you choose REST APIs, an event-based design, or both.
Based on your requirements, you can choose REST APIs where you see synchronous behaviour between services,
and go with an event-based design where services need asynchronous behaviour; there is no harm in combining both either.
Ideally, for inter-process communication it is better to go with messaging, while for client-to-service communication REST APIs are the best fit.
Check the communication styles on microservices.io.
REST-based architecture
Advantages
Request/response is simple and best fitted when you need a synchronous environment.
Simpler system, since there is no intermediate broker.
Promotes orchestration, i.e. a service can take action based on the response of another service.
Drawbacks
Services need to discover the locations of service instances.
One-to-one mapping between services.
REST uses HTTP, a general-purpose protocol built on top of TCP/IP, which adds an enormous amount of overhead when used to pass messages.
Event-driven architecture
Advantages
Event-driven architectures are appealing to API developers because they function very well in asynchronous environments.
Loose coupling, since it decouples services: on an event from one service, multiple services can take action based on the application's requirements, and it is easy to plug any new consumer into a producer.
Improved availability, since the message broker buffers messages until the consumer is able to process them.
Drawbacks
Additional complexity of the message broker, which must be highly available.
Debugging an event-driven request is not that easy.
Personally I am not a fan of using a message broker for RPC. It adds unnecessary complexity and overhead.
How do you host your long-lived RabbitMQ consumer in your Users web service? If you make it some static singleton, how do you deal with scaling and concurrency in your web service? Or do you make it a stand-alone daemon process? Now you have two User applications instead of one. What happens if your Users consumer slows down? By the time it consumes the request message, the Orders service context might have timed out and sent another message, or given up.
For RPC I would suggest simple HTTP.
There is a pattern involving a message broker that can avoid the need for a synchronous network call. The pattern is for services to consume events from other services and store that data locally in their own database. Then when the time comes when the Orders service needs a user record it can access it from its own database.
In your case, your Users app doesn't need to know anything about orders, but your Orders app needs to know some details about your users. So every time a user is added, modified, removed etc, the Users service emits an event (UserCreated, UserModified, UserRemoved). The Orders service can subscribe to those events and store only the data it needs, such as the user address.
The benefit is that at request time your Orders service has one less synchronous dependency on another service. Testing the service is easier since you have fewer request-time dependencies. There are also drawbacks, however, such as some latency between user record changes occurring and being received by the Orders app. Something to consider.
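A rough sketch of that subscription side using Spring AMQP (the queue name, event class and repository are hypothetical, and a JSON message converter is assumed to be configured):

```java
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.stereotype.Component;

@Component
public class UserEventsListener {

    // Hypothetical event payload published by the Users service.
    public record UserCreatedEvent(long userId, String address) {}

    // Hypothetical local store inside the Orders service.
    public interface UserAddressRepository {
        void save(long userId, String address);
    }

    private final UserAddressRepository repository;

    public UserEventsListener(UserAddressRepository repository) {
        this.repository = repository;
    }

    // Keep only the fields the Orders service actually needs (here, the address).
    @RabbitListener(queues = "orders.user-events") // placeholder queue name
    public void onUserCreated(UserCreatedEvent event) {
        repository.save(event.userId(), event.address());
    }
}
```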
UPDATE
If you do go with RabbitMQ for RPC, then remember to make use of the message TTL feature. If the client will time out, set the message expiration to that period. This will help avoid wasted work on the part of the consumer and avoid a queue getting backed up under load. One issue with RPC over a message broker is that once a queue fills up, it can add long latencies that take a while to recover from. Setting your message expiration to your client timeout helps avoid that.
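With the plain RabbitMQ Java client, a per-message expiration matching the client timeout looks roughly like this (the host, queue names, timeout and payload are placeholders, and the target queue is assumed to already exist):

```java
import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class RpcRequestPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // placeholder host

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            // Expiration is in milliseconds; match it to the RPC client's timeout
            // so stale requests are dropped instead of backing up the queue.
            AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                    .expiration("5000")
                    .replyTo("rpc.replies") // placeholder reply queue for the RPC response
                    .build();

            channel.basicPublish("", "users.rpc", props, "getUser:5".getBytes());
        }
    }
}
```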
Regarding RabbitMQ for RPC: normally we use a message broker for decoupling and durability. Since RPC is synchronous communication, that is, we are waiting for a response, durability is not a consideration. That leaves us with decoupling. The question is: does that decoupling buy you anything over the decoupling you can get with HTTP via a gateway or Docker service names?