Low Level Protocol for Microservice Orchestration - rest

Recently I started working with Microservices, I wrote a library for service discovery using Redis to store every service's url and port number, along with a TTL value for the entry. It turned out to be an expensive approach since for every cross service call to any other service required one call to Redis. Caching didn't seem to be a good idea, since the services won't be up all the times, there can be possible downtimes as well.
So I wanted to write a separate microservice which could take care of the orchestration part. For this I need to figure out a really low level network protocol to take care of the exchange of heartbeats(which would help me figure out if any of the service instance goes unavailable). How do applications like zookeeperClient, redisClient take care of heartbeats?
Moreover what is the industry's preferred protocol for cross service calls?
I have been calling REST Api's over HTTP and eliminated every possibility of Joins across different collections.
Is there a better way to do this?
Thanks.

I think the term "Orchestration" is not good for what you are asking. From what I've encountered so far in microservices world the term "Orchestration" is used when a complex business process is involved and not for service discovery. What you need is a Service registry combined with a Load balancer. You can find here all the information you need. Here are some relevant extras that great article:
There are two main service discovery patterns: client‑side discovery and server‑side discovery. Let’s first look at client‑side discovery.
The Client‑Side Discovery Pattern
When using client‑side discovery, the client is responsible for determining the network locations of available service instances and load balancing requests across them. The client queries a service registry, which is a database of available service instances. The client then uses a load‑balancing algorithm to select one of the available service instances and makes a request.
The network location of a service instance is registered with the service registry when it starts up. It is removed from the service registry when the instance terminates. The service instance’s registration is typically refreshed periodically using a heartbeat mechanism.
Netflix OSS provides a great example of the client‑side discovery pattern. Netflix Eureka is a service registry. It provides a REST API for managing service‑instance registration and for querying available instances. Netflix Ribbon is an IPC client that works with Eureka to load balance requests across the available service instances. We will discuss Eureka in more depth later in this article.
The client‑side discovery pattern has a variety of benefits and drawbacks. This pattern is relatively straightforward and, except for the service registry, there are no other moving parts. Also, since the client knows about the available services instances, it can make intelligent, application‑specific load‑balancing decisions such as using hashing consistently. One significant drawback of this pattern is that it couples the client with the service registry. You must implement client‑side service discovery logic for each programming language and framework used by your service clients.
The Server‑Side Discovery Pattern
The client makes a request to a service via a load balancer. The load balancer queries the service registry and routes each request to an available service instance. As with client‑side discovery, service instances are registered and deregistered with the service registry.
The AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router. An ELB is commonly used to load balance external traffic from the Internet. However, you can also use an ELB to load balance traffic that is internal to a virtual private cloud (VPC). A client makes requests (HTTP or TCP) via the ELB using its DNS name. The ELB load balances the traffic among a set of registered Elastic Compute Cloud (EC2) instances or EC2 Container Service (ECS) containers. There isn’t a separate service registry. Instead, EC2 instances and ECS containers are registered with the ELB itself.
HTTP servers and load balancers such as NGINX Plus and NGINX can also be used as a server-side discovery load balancer. For example, this blog post describes using Consul Template to dynamically reconfigure NGINX reverse proxying. Consul Template is a tool that periodically regenerates arbitrary configuration files from configuration data stored in the Consul service registry. It runs an arbitrary shell command whenever the files change. In the example described by the blog post, Consul Template generates an nginx.conf file, which configures the reverse proxying, and then runs a command that tells NGINX to reload the configuration. A more sophisticated implementation could dynamically reconfigure NGINX Plus using either its HTTP API or DNS.
Some deployment environments such as Kubernetes and Marathon run a proxy on each host in the cluster. The proxy plays the role of a server‑side discovery load balancer. In order to make a request to a service, a client routes the request via the proxy using the host’s IP address and the service’s assigned port. The proxy then transparently forwards the request to an available service instance running somewhere in the cluster.
The server‑side discovery pattern has several benefits and drawbacks. One great benefit of this pattern is that details of discovery are abstracted away from the client. Clients simply make requests to the load balancer. This eliminates the need to implement discovery logic for each programming language and framework used by your service clients. Also, as mentioned above, some deployment environments provide this functionality for free. This pattern also has some drawbacks, however. Unless the load balancer is provided by the deployment environment, it is yet another highly available system component that you need to set up and manage.

Related

Kubernetes-services load balancing

I have read this question which is very similar to what I am asking, but still wanted to write a new question since the accepted answer there seems very incomplete and also potentially wrong.
Basically, it seems like there is some missing or contradictory information regarding built in load-balancing for regular Kubernetes Services (I am not talking about LoadBalancer services). For example, the official Cilium documentation states that "Kubernetes doesn't come with an implementation of Load Balancing". In addition, I couldn't find any information in the official Kubernetes documentation about load balancing for internal services (there was only a section discussing this under ingresses).
So my question is - how does load balancing or distribution of requests work when we make a request from within a Kubernetes cluster to the internal address of a Kubernetes service?
I know there's a Kubernetes proxy on each node that creates the DNS records for such services, but what about services that span multiple pods and nodes? There's got to be some form of request distribution or load-balancing, or else this just wouldn't work at all, no?
A standard Kubernetes Service provides basic load-balancing. Even for a ClusterIP-type Service, the Service has its own cluster-internal IP address and DNS name, and forwards requests to the collection of Pods specified by its selector:.
In normal use, it is enough to create a multiple-replica Deployment, set a Service to point at its Pods, and send requests only to the Service. All of the replicas will receive requests.
The documentation discusses the implementation of internal load balancing in more detail than an application developer normally needs. Unless your cluster administrator has done extra setup, you'll probably get round-robin request routing – the first Pod will receive the first request, the second Pod the second, and so on.
... the official Cilium documentation states ...
This is almost certainly a statement about external load balancing. As a cluster administrator (not a programmer) a "plain" Kubernetes installation doesn't include an external load-balancer implementation, and a LoadBalancer-type Service behaves identically to a NodePort-type Service.
There are obvious deficiencies to round-robin scheduling, most notably if you do wind up having individual network requests that take a long time and a lot of resource to service. As an application developer the best way to address this is to make these very-long-running requests run asynchronously; return something like an HTTP 201 Created status with a unique per-job URL, and do the actual work in a separate queue-backed worker.

Synchronize HTTP requests between several service instances in Kubernetes

We have a service with multiple replicas which works with storage without transactions and blocking approaches. So we need somehow to synchronize concurrent requests between multiple instances by some "sharding" key. Right now we host this service in Kubernetes environment as a ReplicaSet.
Don't you know any simple out-of-the-box approaches on how to do this to not implement it from scratch?
Here are several of our ideas on how to do this:
Deploy the service as a StatefulSet and implement some proxy API which will route traffic to the specific pod in this StatefulSet by sharding key from the HTTP request. In this scenario, all requests which should be synchronized will be handled by one instance and it wouldn't be a problem to handle this case.
Deploy the service as a StatefulSet and implement some custom logic in the same service to re-route traffic to the specific instance (or process on this exact instance). As I understand it's not possible to have abstract implementation and it would work only in Kubernetes environment.
Somehow expose each pod IP outside the cluster and implement routing logic on the client-side.
Just implement synchronization between instances through some third-party service like Redis.
I would like to try to route traffic to the specific pods. If you know standard approaches how to handle this case I'll be much appreciated.
Thank you a lot in advance!
Another approach would be to put a messaging queue (like Kafka and RabbitMq) in front of your service.
Then your pods will subscribe to the MQ topic/stream. The pod will decide if it should process the message or not.
Also, try looking into service meshes like Istio or Linkerd.
They might have an OOTB solution for your use-case, although I wasn't able to find one.
Remember that Network Policy is not traffic routing !
Pods are intended to be stateless and indistinguishable from one another, pod-networking.
I recommend to Istio. It has special component which is responsible or routing- Envoy. It is a high-performance proxy developed in C++ to mediate all inbound and outbound traffic for all services in the service mesh.
Useful article: istio-envoy-proxy.
Istio documentation: istio-documentation.
Useful Istio explaination https://www.youtube.com/watch?v=e2kowI0fAz0.
But you should be able to create a Deployment per customer group, and a Service per Deployment. The Ingress nginx should be able to be told to map incoming requests by whatever attributes are relevant to specific customer group Services.
Other solution is to use kube-router.
Kube-router can be run as an agent or a Pod (via DaemonSet) on each node and leverages standard Linux technologies iptables, ipvs/lvs, ipset, iproute2.
Kube-router uses IPVS/LVS technology built in Linux to provide L4 load balancing. Each ClusterIP, NodePort, and LoadBalancer Kubernetes Service type is configured as an IPVS virtual service. Each Service Endpoint is configured as real server to the virtual service. The standard ipvsadm tool can be used to verify the configuration and monitor the active connections.
How it works: service-proxy.

How does Kubernetes lngress handles the concurrent request in GKE

We are currently deploying our spring boot application in kubernetes and using ingress as our loadbalancer. I want to know how does the kubernetes handles the concurrent request, Is there any configuration which i need to be enabled to handle the concurrent request. We have downstream system which has 10 threads and all the threads make a webservice calls to our spring-boot application.I want to how does the kubernetes handles this request concurrently and how does it routes those request to pod We are using google kubernetes engine(gke). we have 2 pod containers are running.enter code here
Long story short: Ingress has nothing to do with concurrency. Traffic will just come to your application's process the same way regardless of you're deploying Node.js, Ruby or Java or another language...
So it's up to your application runtime (Java/SpringBoot) to figure out how to handle incoming connections and multiple requests happening on these connnections simultaneously (HTTP/1.1 vs HTTP/2 work very differently in this regard, but mostly it's abstracted from the app developer by the framework you're using).
Kubernetes networking is not trivial to understand. This talk provides a good overview of how the traffic flows in a Kubernetes cluster:
Ingress just creates an external load balancer on your cloud provider that sends the traffic back to your "Service" (cluster-internal load balancer). So to fully understand how traffic comes from $CLOUD_PROVIDER's load balancer to your app, you need to understand inner workings of "Kubernetes networking" –and it totally depends on your cloud provider and the network plugin (CNI) it uses.

Disadvantages of using eureka for Service Discovery with kubernetes

Context
I am deploying a set of services that are containerised using Docker into AWS. No matter which deployment solution is chosen (e.g. raw EC2/ECS/Elastic Beanstalk/Fargate) we will face the issue of "service discovery".
To name just a few of the options for service discovery that I've considered:
AWS Route 53 Service Registry
Kubernetes
Hashicorp Consul
Spring Cloud Netflix Eureka
Specifics Of My Stack
I am developing Java Spring Boot applications using Spring Cloud with the target deployment environment being AWS.
Given that my stack is Spring based, spring cloud eureka made sense to me while developing locally. It was easy to set up a single node, integrates well with the stack and ecosystem of choice and required very little set up.
Locally, we are using docker compose (not swarm) to deploy services - one of the containers deployed is a single node Eureka service discovery server.
However, when we progress outside of local development and into staging or production environment we are considering options like Kubernetes.
My Own Assessment Of Pros/Cons
AWS Route 53 Service Registry
Requires us to couple code specifically to AWS services. Not a problem per se, we are quite tied in anyway on other parts of the stack (SNS/SQS).
Makes running the stack locally slightly more difficult as it relies on Route 53, I suppose we could open up a certain hosted zone for local development.
AWS native, no managing service registries or extra "moving parts".
Spring Cloud Eureka
Downside is that thus requires us to deploy and manage a high availability service registry cluster and requires more resources. Another "moving part" to manage.
Advantages are that it fits into our stack well (spring ecosystem, spring boot, spring cloud, feign and zuul work well with this). Also can be run locally trivially.
I presume we need to configure the networks and registry zone to ensure that that clients publish their host address rather and docker container internal IP address. e.g. if service A is on host A and wants to talk to service B on host B, service B needs to advertise its EC2 address rather than some internal docker IP.
Questions
If we use Kubernetes for orchestration, are there any disadvantages to using something like Spring Cloud Eureka over the built in service discovery options described here https://kubernetes.io/docs/concepts/services-networking/service/#discovering-services
Given Kube provides this, it seems suboptimal to then use eureka deployed using kube to perform discovery. I presume kube can make some optimisations that impact avaialbility and stability that might nit be possible using eureka. e.g kube would know when deploying a new service - eureka will have to rely on heartbeats/health checks and depending on how that is configured (e.g. frequency) this could result in stale records whereas i presume kube might not suffer from this for planned service shutdown/restarts. I guess it still does for unplanned failures such as a host failure or network partition.
Does anyone have any advice on this, do people use services like Kubernetes but use other mechanisms for service discovery rather than those provided by kube. Is there a good reason to do one or the other?
Possible Challenges I Anticipate
We could replace eureka, but relying on Kube to perform discovery will mean that we need to run kube locally to deploy whereas currently we have a simple tiny docker-compose file. Also, I'll have to look at how easy it'll be to ensure that ribbon, zuul and feign play nicely with this.
Currently we have ribbon configured with a eureka client so that service A can server to service B just as "service-b" for example and have ribbon resolve a healthy host via a eureka client. I guess we can configure ribbon to not use eureka and use an external Kube service name which will be resolved by Kube DNS at runtime...
Final Note
Thanks in advance for any contribution or advice. I know this might elicit a primarily opinion focused response. But I am hoping someone can provide objective guidance on when one solution might be preferable to another.
Service discovery is something you get out-of-the-box with Kubernetes. So having another external service in your platform will be another application to maintain, deploy and can be a point of failure. So I would stick with the the service discovery provided by Kubernetes.

Fabric Service availability on start

I have a scenario where one of our services exposes WCF hosts that receive callbacks from an external service.
These hosts are dynamically created and there may be hundreds of them. I need to ensure that they are all up and running on the node before the node starts receiving requests so they don't receive failures, this is critical.
Is there a way to ensure that the service doesn't receive requests until I say it's ready? In cloud services I could do this by containing all this code within the OnStart method.
My initial thought is that I might be able to bootstrap this before I open the communication listener - in the hope that the fabric manager only sends requests once this has been done, but I can't find any information on how this lifetime is handled.
There's no "fabric manager" that controls network traffic between your services within the cluster. If your service is up, clients or other services inside the cluster can choose to try to connect to it if they know its address. With that in mind, there are two things you have control over here:
The first is whether or not your service's endpoint is discoverable by other services or clients. This is the point at which your service endpoint is registered with Service Fabric's Naming Service, which occurs when your ICommunicationListener.OpenAsync method returns. At that point, the service endpoint is registered and others can discover it and attempt to connect to it. Of course you don't have to use the Naming Service or the ICommunicationListener pattern if you don't want to; your service can open up an endpoint whenever it feels like it, but if you don't register it with the Naming Service, you'll have to come up with your own service discovery mechanism.
The second is whether or not the node on which your service is running is receiving traffic from the Azure Load Balancer (or any load balancer if you're not hosting in Azure). This has less to do with Service Fabric and more to do with the load balancer itself. In Azure, you can use a load balancer probe to determine whether or not traffic should be sent to nodes.
EDIT:
I added some info about the Azure Load Balancer to our documentation, hope this helps: https://azure.microsoft.com/en-us/documentation/articles/service-fabric-connect-and-communicate-with-services/