Are independent (micro)services a paradox? - rest

Ideas about microservices:
Microservices should be functionally independent
Microservices should specialize in doing some useful work in a domain they own
Microservices are intended to communicate with each other
I find these ideas to be contradictory.
Let me give a hypothetical business scenario.
I need a service that executes a workflow that has a step requiring auto-translation.
My business has an auto-translation service. There is a RESTful API where I can POST a source language, target language and text, and it returns a translation. This is a perfect example of a useful standalone service. It is reusable and completely unaware of its consumers.
Should the workflow service that my business needs leverage this service? If so, then my service has a "dependency" on another service.
If you take the avoidance of dependencies to the extreme, every service in the world would have to implement every functionality in the world.
Now, I know you're thinking we can break this dependency by moving out of request-response (REST) and into messaging. My service publishes a translation request message. A translation response message is published when the translation is complete, and my service consumes this message. OK, but my service has to freeze the workflow and continue when the message arrives. It's still "waiting" even if the waiting is true async waiting (say the workflow state was persisted and the translation message arrives a day later). This is just a delayed request-response.
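The "delayed request-response" described above can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the WorkflowService class, the message field names, and the in-memory dict that stands in for persisted workflow state. The point it shows is that the workflow still has to park its state under a correlation ID and resume when the reply arrives.

```python
import uuid


class WorkflowService:
    """Hypothetical workflow service that pauses on a translation step."""

    def __init__(self, publish):
        self._publish = publish   # callable that hands a message to the broker
        self._paused = {}         # correlation_id -> persisted workflow state

    def start(self, text, source, target):
        correlation_id = str(uuid.uuid4())
        # Park the workflow so it can resume minutes or days later.
        self._paused[correlation_id] = {"step": "awaiting_translation", "text": text}
        self._publish({
            "type": "TranslationRequested",
            "correlation_id": correlation_id,
            "source": source,
            "target": target,
            "text": text,
        })
        return correlation_id

    def on_translation_completed(self, message):
        # Resume only if a workflow is actually waiting for this reply.
        state = self._paused.pop(message["correlation_id"], None)
        if state is None:
            return None           # unknown or duplicate reply: ignore it
        state["step"] = "done"
        state["translation"] = message["translation"]
        return state
```

The service no longer blocks a thread, but the dependency on a translation reply is still there, which is the question's point.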

For me personally, "independent" is a quality that applies to multiple dimensions. A service may not be independent from a runtime perspective, but it can be independent from development, deployment, operations and scalability perspectives.
For example, the translation service may be independently developed, deployed and operated by another team. At the same time, they can scale the translation service independently from your business workflow service according to the demand they get. And you can scale the business workflow service according to your own demand (of course downstream dependencies come into play here, but that's a whole other topic).

Related

Responses in Event Driven Architecture

The classic EDA example involves a command triggering events - like a chain of dominos.
PlaceOrder -> OrderPlaced -> PaymentSucceeded -> OrderShipped
Typically the Order Service listens to events along the way to keep the status of the order updated. Presumably (and this is the part that every article skips!) because at some point the order service will receive a ViewOrder command, which will require a response beyond "OK".
So my question is: In an EDA, do at least some of your services have to react to both events and commands?
If not, what architecture could separate the "command world" (required for supporting an HTTP API) from the "event world" of services performing async processing?
In my experience, every microservice we've built does both things. Participating with the Messaging Plane (publishing and/or subscribing) is always a requirement, and in most cases, exposing at least one API endpoint is also a requirement. In fact I don't believe we have any live services that don't expose an API endpoint although we have a few that probably could be that way.
So far, we've not run into a case where there was value in splitting a service into separate parts for API serving vs event bus interaction. I wouldn't say that's impossible, but we are very focused on services encapsulating a (functional) domain without much concern for implementation. That has allowed us to use a very formulaic approach to creating services themselves which is a big part of why we chose this architecture style.
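As a minimal sketch of one service doing both things, the class below handles commands (PlaceOrder, ViewOrder) and also reacts to events from other services. The class name, method names and status values are hypothetical, and the dict stands in for the service's own data store.

```python
class OrderService:
    """Hypothetical order service that handles commands and reacts to events."""

    # Maps events published by other services to the order status they imply.
    STATUS_BY_EVENT = {"PaymentSucceeded": "PAID", "OrderShipped": "SHIPPED"}

    def __init__(self):
        self._orders = {}   # order_id -> order record (stand-in for a database)

    def handle_place_order(self, order_id, items):
        # Command: change state, then return the event to publish.
        self._orders[order_id] = {"items": list(items), "status": "PLACED"}
        return {"type": "OrderPlaced", "order_id": order_id}

    def on_event(self, event):
        # Event: keep the local order status up to date.
        order = self._orders.get(event["order_id"])
        new_status = self.STATUS_BY_EVENT.get(event["type"])
        if order is not None and new_status is not None:
            order["status"] = new_status

    def handle_view_order(self, order_id):
        # Command that needs a response beyond "OK".
        order = self._orders.get(order_id)
        if order is None:
            return None
        return {"order_id": order_id, **order}
```

The API endpoint and the event subscriber share one data store here, which is why splitting them into separate services rarely pays off.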

How do you ensure that a distributed app is working as expected?

Imagine a very simple user creation flow in an online marketplace:
Service A (user service) receives the request and creates a user object and sends an async request to service B and C (e.g. via Kafka)
Service B (notification service) receives the request and sends an email to the newly created user
Service C (referral service) receives the request and credits some funds to the referrer
While this design might be laid out correctly in a design doc, it is only implicitly defined in code because the services talk to each other. How would you:
Ensure that the services are talking to each other in the correct order when implementing the user creation flow (integration tests might not suffice here since they generally test a very narrow set of paths)?
Define and enforce SLO guarantees between services in production?
Debug which service is to blame when the flow breaks down?
This is a great question, and I think this scenario is a great fit for considering an orchestrator. A microservices orchestrator platform such as Netflix Conductor is designed to handle exactly this kind of scenario.
With Conductor we can decouple the flow and dependencies from the underlying functions themselves, and the functions can be designed to do one simple thing, such as saving a user, notifying via email, crediting referrals, etc. We can then use the orchestration engine to assemble the required flow.
Such flows are executed really fast and the cost of latency is easily offset with the benefits you get.
Flow is defined as a workflow (this means the order can be controlled using the definition)
SLO guarantees - you can monitor for execution delays and failed transactions, and retry or replay them as required. The latency added by an orchestrator is negligible.
Debugging - with Conductor you get a UI in which you can load each transaction and look at what happened, which server executed it, etc.
To explain these concepts better, I defined your use case here using some dummy APIs (this is a sandbox environment for Netflix Conductor):
https://play.orkes.io/workflowDef/simple_user_creation_flow
And you can see an execution of this definition here:
https://play.orkes.io/execution/5095b5ef-3e2d-11ed-9d7b-1a5314838fe6
(For clarity - I work at https://orkes.io which offers a managed service for Netflix Conductor)
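To illustrate the idea of an explicit flow definition with retries and a per-step execution record, here is a minimal in-process sketch. It is not Conductor's API; run_workflow, the step names and the log format are all hypothetical and only show the shape of the technique.

```python
def run_workflow(steps, context, max_attempts=3):
    """Run (name, task) steps in order; return (succeeded, execution_log).

    Each task is a callable that takes the shared context dict.
    A failing step is retried up to max_attempts times before the flow stops.
    """
    log = []
    for name, task in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                task(context)
                log.append({"step": name, "attempt": attempt, "status": "COMPLETED"})
                break
            except Exception as exc:
                log.append({"step": name, "attempt": attempt,
                            "status": "FAILED", "error": str(exc)})
        else:
            # Every attempt failed: stop so later steps never run out of order.
            return False, log
    return True, log
```

The order of the flow lives in the steps list rather than being implied by which service calls which, and the log shows which step failed and how often it was retried.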

Wrap event based system with REST API

I'm designing a system that uses a microservices architecture with event-based communication (using Google Cloud Pub/Sub).
Each of the services is listening and publishing messages so between the services everything is excellent.
On top of that, I want to provide a REST API that users can use without breaking the event-based approach. However, if I have an endpoint that triggers event X, how will I send the response to the user? Does it make sense to create a subscriber for a "ProcessXComplete" event and then return 200 OK?
For example:
I have the following microservices:
Service A
Service B
Frontend Service - REST Endpoints
I want to send the request "POST /posts"; this request is sent to the frontend service.
The frontend service should trigger "NewPostEvent."
Both Service A and Service B will listen to this event and do something.
So far, so good, but here is where things are starting to get messy for me.
Now I want to return a valid response to the user who made the request, telling them that the operation completed.
How can I know that all services have finished their tasks, and how do I create the handler that returns this response?
Does it even make sense to go this way, or is there a better design for combining event-based communication between services with a REST API?
What you're describing is absolutely one of the challenges of event-based programming and how eventual-consistency (and lack of atomicity) coordinates with essentially synchronous UI/UX.
It generally does make sense to have an EventXComplete event. Our microservices publish events on completion of anything that could potentially fail. So, there are lots of ServiceA.EventXSuccess events flowing through the queues. I'm not familiar with Google Cloud PubSub specifically, but in general in Messaging systems there is little extra cost to publishing messages with few (or no) subscribers to require compute power. So, we tend to over-articulate service status by default; it's easy to come back later and tone down messaging as needed. In fact, some of our newer services have Messaging Verbosity configurable via an Admin API.
The Frontend Service (which here is probably considered a Gateway Service or Facade Layer) has taken on the responsibility of being a responsive backing for your UI, so it needs to, in fact, BE responsive. In this example, I'd expect it to persist the User's POST request, return a 200 response and then update its local copy of the request based on events it's subscribed to from ServiceA and ServiceB. It also needs to provide a mechanism (events, email, webhook, gRPC, etc.) to communicate from the Frontend Service back to any UI if failure happens (maybe even if success happens). Which communication you use depends on how important and time-sensitive the notification is. A good example of this is getting an email from Amazon saying billing has failed on an Order you placed. They let you know via email within a few minutes, but they don't make you wait for the ExecuteOrderBilling message to get processed in the UI.
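A minimal sketch of that gateway behaviour follows, assuming hypothetical names throughout (PostGateway, the ProcessingSucceeded and ProcessingFailed event types, and the request_id field). It records the request, publishes NewPostEvent, answers immediately, and later derives the overall status from the completion events of Service A and Service B.

```python
import uuid


class PostGateway:
    """Hypothetical frontend/gateway service for POST /posts."""

    REQUIRED = {"ServiceA", "ServiceB"}   # services that must report success

    def __init__(self, publish):
        self._publish = publish           # callable that sends an event to the bus
        self._requests = {}               # request_id -> local copy of the request

    def create_post(self, body):
        request_id = str(uuid.uuid4())
        self._requests[request_id] = {"status": "PENDING", "done": set()}
        self._publish({"type": "NewPostEvent", "request_id": request_id, "body": body})
        # Respond right away; the outcome is reported later.
        return {"request_id": request_id, "status": "PENDING"}

    def on_event(self, event):
        request = self._requests.get(event["request_id"])
        if request is None or request["status"] != "PENDING":
            return
        if event["type"] == "ProcessingFailed":
            request["status"] = "FAILED"
        elif event["type"] == "ProcessingSucceeded":
            request["done"].add(event["service"])
            if request["done"] >= self.REQUIRED:
                request["status"] = "COMPLETED"

    def get_status(self, request_id):
        return self._requests[request_id]["status"]
```

A status change to FAILED or COMPLETED is the natural place to trigger whichever notification channel (email, webhook, push) fits the use case.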
Connecting Microservices to the UI has been one of the most challenging aspects of our particular journey; avoiding tight coupling of models/data structures, UI workflows that are independent of microservice process flows, and perhaps the toughest one for us: authorization. These are the hidden dark-sides of this distributed architecture pattern, but they too can be overcome. Some experimentation with your particular system is likely required.
It really depends on your business case. If the REST service drops a message into a message queue, then after dropping the message we simply return a reference ID that the client can poll to check the progress.
E.g. a flight search where your system has to call hundreds of backend services to show you flight deals. The search API drops the message into the queue, saves it in the database with a reference ID, and returns that ID to the client. Once the workers are done with the message, they update the record in the DB with the results; meanwhile your client polls (or, preferably, uses web sockets) to update the UI with the results.
The idea is that you don't block the request and you keep everything async; this makes the system scalable.
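The reference-ID approach can be sketched with the standard library alone. In this sketch a queue.Queue stands in for the message queue, a dict stands in for the database, and the function names (submit_search, poll, start_worker) and the fake "deals" are hypothetical.

```python
import queue
import threading
import uuid

jobs = queue.Queue()            # stand-in for the message queue
results = {}                    # stand-in for the database, keyed by reference ID
results_lock = threading.Lock()


def submit_search(query):
    """Called by the search API: enqueue the work and return a reference ID."""
    ref = str(uuid.uuid4())
    with results_lock:
        results[ref] = {"status": "PENDING", "deals": None}
    jobs.put((ref, query))
    return ref


def poll(ref):
    """Called by the client to check progress."""
    with results_lock:
        return dict(results[ref])


def _work():
    while True:
        ref, query = jobs.get()
        deals = [f"{query}: deal {i}" for i in range(3)]   # placeholder for backend calls
        with results_lock:
            results[ref] = {"status": "DONE", "deals": deals}
        jobs.task_done()


def start_worker():
    """Start one background worker; a real system would run many."""
    thread = threading.Thread(target=_work, daemon=True)
    thread.start()
    return thread
```

The API call returns as soon as the job is enqueued; the client keeps calling poll(ref) until the status changes from PENDING to DONE.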

Communication between microservices - request data

I am dealing with communication between microservices.
For example (fictive example, just for the illustration):
Microservice A - Store Users (getUser, etc.)
Microservice B - Store Orders (createOrder, etc.)
Now if I want to add a new Order from the Client app, I need to know the user's address. So the request would look like this:
Client -> Microservice B (createOrder for userId 5) -> Microservice A (getUser with id 5)
The microservice B will create order with details (address) from the User Microservice.
PROBLEM TO SOLVE: How to deal effectively with the communication between microservice A and microservice B, given that we have to wait until the response comes back?
OPTIONS:
Use RestAPI,
Use AMQP, like RabbitMQ and deal with this issue via RPC. (https://www.rabbitmq.com/tutorials/tutorial-six-dotnet.html)
I don't know which will be better for performance. Is a call faster via RabbitMQ or via a REST API? What is the best solution for a microservice architecture?
In your case using direct REST calls should be fine.
Option 1 Use Rest API :
When you need synchronous communication. For example, your case. This option is suitable.
Option 2 Use AMQP :
When you need asynchronous communication. For example, when your order service creates an order you may want to notify the product service to reduce the product quantity. Or you may want to notify the user service that the order for the user was successfully placed.
I highly recommend having a look at http://microservices.io/patterns/index.html
Whether to choose REST APIs, an event-based design, or both depends on how your services need to communicate.
Based on your requirements, you can choose REST APIs where you see synchronous behaviour between services
and go with an event-based design where services need asynchronous behaviour; there is no harm in combining both.
Ideally, messaging is the better fit for inter-process communication between services, and REST APIs are the best fit for client-to-service communication.
Check the Communication style in microservices.io
REST based Architecture
Advantage
Request/Response is easy and best fitted when you need synchronous environments.
Simpler system since there is no intermediate broker
Promotes orchestration, i.e. a service can take action based on the response of another service.
Drawback
Services need to discover the locations of service instances.
One-to-one mapping between services.
REST uses HTTP, a general-purpose protocol built on top of TCP/IP, which adds significant overhead when it is used purely to pass messages.
Event Driven Architecture
Advantage
Event-driven architectures are appealing to API developers because they function very well in asynchronous environments.
Loose coupling, since it decouples services: on an event from one service, multiple services can take action based on the application's requirements. It is easy to plug a new consumer into a producer.
Improved availability since the message broker buffers messages until the consumer is able to process them.
Drawback
Additional complexity of message broker, which must be highly available
Debugging an event request is not that easy.
Personally I am not a fan of using a message broker for RPC. It adds unnecessary complexity and overhead.
How do you host your long-lived RabbitMQ consumer in your Users web service? If you make it some static singleton in your web service, how do you deal with scaling and concurrency? Or do you make it a stand-alone daemon process? Now you have two User applications instead of one. What happens if your Users consumer slows down? By the time it consumes the request message, the Orders service context might have timed out and sent another message or given up.
For RPC I would suggest simple HTTP.
There is a pattern involving a message broker that can avoid the need for a synchronous network call. The pattern is for services to consume events from other services and store that data locally in their own database. Then when the time comes when the Orders service needs a user record it can access it from its own database.
In your case, your Users app doesn't need to know anything about orders, but your Orders app needs to know some details about your users. So every time a user is added, modified, removed etc, the Users service emits an event (UserCreated, UserModified, UserRemoved). The Orders service can subscribe to those events and store only the data it needs, such as the user address.
The benefit is that at request time, your Orders service has one less synchronous dependency on another service. Testing the service is easier as you have fewer request-time dependencies. There are also drawbacks, however, such as some latency between user record changes occurring and being received by the Orders app. Something to consider.
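A minimal sketch of this pattern: the Orders service keeps a local copy of the one field it needs, fed by the UserCreated, UserModified and UserRemoved events. The class name, the event field names (user_id, address) and the in-memory dict are assumptions for illustration; a real service would store this in its own database.

```python
class OrdersUserReplica:
    """Hypothetical Orders-side store of user data built from Users events."""

    def __init__(self):
        self._addresses = {}   # user_id -> address (only the data Orders needs)

    def on_user_event(self, event):
        user_id = event["user_id"]
        if event["type"] in ("UserCreated", "UserModified"):
            self._addresses[user_id] = event["address"]
        elif event["type"] == "UserRemoved":
            self._addresses.pop(user_id, None)

    def create_order(self, user_id, items):
        # No network call to the Users service at request time.
        address = self._addresses.get(user_id)
        if address is None:
            raise LookupError(f"no local record for user {user_id}")
        return {"user_id": user_id, "items": list(items), "ship_to": address}
```

The LookupError branch is where the replication latency mentioned above shows up: an order for a user whose UserCreated event has not arrived yet cannot be filled from local data.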
UPDATE
If you do go with RabbitMQ for RPC then remember to make use of the message TTL feature. If the client will timeout, then set the message expiration to that period. This will help avoid wasted work on the part of the consumer and avoid a queue getting backed up under load. One issue with RPC over a message broker is that once a queue fills up it can add long latencies that take a while to recover from. Setting your message expiration to your client timeout helps avoid that.
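The effect of matching message expiration to the client timeout can be shown without a broker. In RabbitMQ the broker itself discards expired messages (per-message TTL is set through the message's expiration property, or per queue through x-message-ttl); the sketch below only imitates that behaviour in plain Python, and make_request and consume are hypothetical names.

```python
import time


def make_request(payload, client_timeout_s, now=None):
    """Stamp a request with an expiry equal to the caller's timeout."""
    now = time.monotonic() if now is None else now
    return {"payload": payload, "expires_at": now + client_timeout_s}


def consume(messages, handler, now=None):
    """Handle live messages and skip expired ones; return how many were skipped."""
    dropped = 0
    for message in messages:
        current = time.monotonic() if now is None else now
        if current >= message["expires_at"]:
            # The caller has already given up, so doing the work would be wasted.
            dropped += 1
            continue
        handler(message["payload"])
    return dropped
```

Under load, the skipped messages are exactly the ones whose callers have stopped waiting, so the backlog drains faster.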
Regarding RabbitMQ for RPC. Normally we use a message broker for decoupling and durability. Seeing as RPC is a synchronous communication, that is, we are waiting for a response, then durability is not a consideration. That leaves us decoupling. The question is does that decoupling buy you anything over the decoupling you can do with HTTP via a gateway or Docker service names?

Why bother with service discovery when message oriented middleware does the job?

I get the problem that etcd/consul/$whatever are trying to solve. Service consumers need to talk to service providers, a hugely fluid distributed system needs a mechanism to marry the two.
However, the problem of "where do service consumers go with their requests?" is old and IMO has been solved with MOM -- message oriented middleware.
In MOM, the idea is that service consumers do not care where the service providers live. They simply send a message and have the messaging bus take care of routing the message to the appropriate consumer. There can be multiple providers all doing the same thing (queue-based round-robin) or versioned providers (/v1/request goes to one, /v2/request goes to another).
This is a simple, powerful integration pattern that completely decouples a service interface from its implementation.
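As a minimal in-memory sketch of that routing idea, the bus below lets providers register for a route and delivers each message to the next provider in round-robin order, so senders never learn where providers live. MessageBus and its methods are hypothetical; a real MOM adds persistence, acknowledgements and network transport.

```python
import itertools


class MessageBus:
    """Hypothetical in-memory bus: route-based delivery with round-robin."""

    def __init__(self):
        self._providers = {}   # route -> list of handler callables
        self._cursors = {}     # route -> round-robin iterator over the handlers

    def register(self, route, handler):
        self._providers.setdefault(route, []).append(handler)
        # Rebuild the iterator so the new provider joins the rotation.
        self._cursors[route] = itertools.cycle(self._providers[route])

    def send(self, route, message):
        if route not in self._cursors:
            raise LookupError(f"no provider registered for {route}")
        handler = next(self._cursors[route])
        return handler(message)
```

Registering two handlers on /v1/request and one on /v2/request gives both behaviours from the text: load is spread across the v1 providers, and v2 traffic goes to a different implementation.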
And yet I see this bizarre obsession with discovering service providers, which appears to create tight coupling between consumers and providers (in addition to a few other anti-patterns as well.)
So, what am I missing here? TIA.
In MOM, everything flows through the bus, so it might become a bottleneck. With service discovery, a consumer looks up a producer "once" (ok it might have to check back again after a while), and then "directly" (ok could be through a proxy) talks to it.
Or if you prefer catchy phrases: smart endpoints & dumb pipes vs (I guess) dumb endpoints & smart pipes.
Personally I don't see the two as either/or for this type of architecture. You could use service discovery to see which services are available at the moment and subscribe to the MOM for the events you then know will be there. If you can't find services you depend on, you can raise an alert. Not all MOMs let you know when there is no publisher for a channel.
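The "look it up once, check back after a while" behaviour of a discovery client can be sketched as a small cache in front of the registry. DiscoveryClient, its lookup callable and the 30-second default are hypothetical; the lookup would wrap whatever registry (etcd, Consul, etc.) is actually in use.

```python
import time


class DiscoveryClient:
    """Hypothetical client-side cache in front of a service registry."""

    def __init__(self, lookup, ttl_s=30.0, clock=time.monotonic):
        self._lookup = lookup    # callable: service name -> address (registry query)
        self._ttl_s = ttl_s      # how long a cached address is trusted
        self._clock = clock      # injectable clock, useful for tests
        self._cache = {}         # name -> (address, fetched_at)

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached is not None and now - cached[1] < self._ttl_s:
            return cached[0]     # talk to the provider directly, no registry hop
        address = self._lookup(name)
        self._cache[name] = (address, now)
        return address
```

Only the occasional refresh touches the registry; the requests themselves go straight to the resolved address, which is what keeps a central component off the hot path.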
You can also combine them in the way that the service discovery is where you find the services you want to contact directly, for example a data store that does no job, and still use the MOM to subscribe to events for changes that other systems do. Not all use cases fit well with job queuing either, as some tasks must be solved synchronously, and then the service discovery is a great way to have a dynamic environment.
I do prefer the asynchronous MQ myself, and I think that if you do it right, with load balancing, redundancy, clustering with separate readers and writers etc you can easily have great stability, scalability and a standardized way for all your components to communicate.