Colocating related containers on nodes to avoid the cost of network accesses - kubernetes

I'm still new to Kubernetes so please excuse if this is a silly question.
I'm architecting a system which includes:
an MQTT broker
a set of (containerized) microservices that publish and subscribe to it
a Redis cache that the microservices read and write to.
We will certainly need multiplicity of all of these components as we scale.
There is a natural division in the multiplicity of each of these things: they each pertain to a set of intersections in a city. A publishing or subscribing microservice will handle 1 or more intersections. The MQTT broker instance and the Redis instance each could be set up to handle n intersections.
I am wondering if it makes sense to try to avoid unnecessary network hops in Kubernetes by trying to divide things up by intersection and put all containers related to a given set of intersections on one node. Would this mean putting them all on a single pod, or is there another way?
(By the way, there will still be other publishers and subscribers that need to access the MQTT broker that are not intersection-specific.)

This is more of an opinion question.
Would this mean putting them all on a single pod, or is there another way?
I would certainly avoid putting them all in one Pod. In theory, you can put anything in a single pod, but the general practice is to add lightweight sidecars that handle a very specific function.
IMO an MQTT broker, a Redis datastore and a subscribe/publish app seem like a lot to put in a single pod.
Possible Disadvantages:
Harder to debug because you may not know where the failure comes from.
A publisher/subscriber is generally a stateless application, while MQTT and Redis would be stateful. Deployments are recommended for stateless services and StatefulSets for stateful services.
Too much clutter in a pod.
Possible Advantages:
All services sharing the same IP/Context.
Possibly lower networking latency, although with separate pods you can use Node Affinity and Pod Affinity to mitigate that.
It would be cleaner if you had:
Deployment for your sub/pub app.
StatefulSet with its own storage for your Redis server.
StatefulSet with its own storage for your MQTT broker.
Each of these workload resources would create separate pods, and you can scale each one up or down independently.
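To make the Pod Affinity suggestion concrete, here is a minimal sketch that builds a Deployment manifest whose pods prefer to land on a node already running Redis pods. It only uses the Python standard library and prints JSON, which kubectl accepts as well as YAML. The names `pubsub-app`, `redis` and the image URL are placeholders I made up, not anything from the question, and the `app` label convention is an assumption.

```python
import json


def colocated_deployment(name, image, colocate_with_app, replicas=1):
    """Build a Deployment whose pods prefer nodes that already run
    pods labelled app=<colocate_with_app>."""
    affinity_term = {
        # Soft preference: the scheduler tries to honour it but will
        # still place the pod elsewhere if no such node has room.
        "weight": 100,
        "podAffinityTerm": {
            "labelSelector": {"matchLabels": {"app": colocate_with_app}},
            # "Same topology domain" here means "same node".
            "topologyKey": "kubernetes.io/hostname",
        },
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "affinity": {
                        "podAffinity": {
                            "preferredDuringSchedulingIgnoredDuringExecution": [
                                affinity_term
                            ]
                        }
                    },
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical names and image, for illustration only.
manifest = colocated_deployment(
    "pubsub-app", "example.com/pubsub-app:latest", "redis", replicas=3
)
print(json.dumps(manifest, indent=2))
```

The preference is deliberately soft (`preferred...` rather than `required...`), so a full node does not block scheduling; whether that trade-off suits a per-intersection layout is something you would have to judge.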

Related

How to spin up/down workers programmatically at run-time on Kubernetes based on new Redis queues and their load?

Suppose I want to implement this architecture deployed on Kubernetes cluster:
Gateway
Simple RESTful HTTP microservice accepting scraping tasks (URLs to scrape along with postback urls)
Request Queues - Redis (or other message broker) queues created dynamically per unique domain (when a new domain is encountered, the gateway should programmatically create a new queue; if a queue for the domain already exists, just place the message in it).
Response Queue - Redis (or other message broker) queue used to post Worker results as scraped HTML pages along with postback URLs.
Workers - worker processes which should spin-up at runtime when new queue is created and scale-down to zero when queue is emptied.
Response Workers - worker processes consuming response queue and sending postback results to scraping client. (should be available to scale down to zero).
I would like to deploy the whole solution as dockerized containers on Kubernetes cluster.
So my main concerns/questions would be:
Creating Redis or other message broker queues dynamically at run-time via code. Is it viable? Which broker is best for that purpose? I would prefer Redis if possible since I heard it's the easiest to set up and also it supports massive throughput, ideally my scraping tasks should be short-lived so I think Redis would be okay if possible.
Creating Worker consumers at runtime via code - I need some kind of Kubernetes-compatible technology which would be able to react to a newly created queue and spin up a Worker consumer container which would listen to that queue and later on would be able to scale up/down based on the load of that queue. Any suggestions for such technology? I've read a bit about Knative and its Eventing mechanism, so would it be suited for this use-case? I don't know if I should continue investing my time in reading its documentation.
Best tools for Redis queue management/Worker management: I would prefer C# and Node.JS tooling. Something like Bull for Node.JS would be sufficient? But ideally I would want to produce queues and messages in Gateway by using C# and consume them in Node.JS (Workers).
If you mean vertical scaling, it definitely won't be a viable solution, since it requires pod restarts. Horizontal scaling is more viable, but keep in mind that spinning up nodes or pods takes time. It is always advisable to have enough resources in place to serve upcoming traffic; otherwise the delay will affect some features of your application and may have a business impact. Autoscalers alone are not enough; you should also have proper metrics in place for monitoring your application.
This documentation details how to scale your Redis and worker pods respectively using KEDA. KEDA stands for Kubernetes Event-driven Autoscaling. It is a component that sits on top of existing Kubernetes primitives (such as the Horizontal Pod Autoscaler) to scale any number of Kubernetes containers based on the number of events that need to be processed.
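As a rough illustration of what that could look like for per-domain queues, the sketch below builds one KEDA ScaledObject per Redis list, scaling the matching worker Deployment between zero and a maximum. The field names follow my understanding of KEDA's Redis list trigger (`address`, `listName`, `listLength`), so check them against the KEDA documentation for your version. The queue name, Deployment name and Redis address are all hypothetical.

```python
import json


def scaled_object_for_queue(queue_name, worker_deployment, redis_address,
                            max_workers=10, target_list_length=5):
    """Build a KEDA ScaledObject that scales `worker_deployment`
    according to the length of the Redis list `queue_name`."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": f"{worker_deployment}-scaler"},
        "spec": {
            "scaleTargetRef": {"name": worker_deployment},
            # Zero replicas while the queue is empty.
            "minReplicaCount": 0,
            "maxReplicaCount": max_workers,
            "triggers": [
                {
                    "type": "redis",
                    "metadata": {
                        "address": redis_address,
                        "listName": queue_name,
                        # Target number of pending items per worker;
                        # trigger metadata values are strings.
                        "listLength": str(target_list_length),
                    },
                }
            ],
        },
    }


# Hypothetical queue and Deployment names, for illustration only.
scaler = scaled_object_for_queue(
    "queue:example.com",
    "worker-example-com",
    "redis.default.svc.cluster.local:6379",
)
print(json.dumps(scaler, indent=2))
```

The gateway (or a small controller next to it) would still have to create the worker Deployment and this object when it first sees a new domain; KEDA only handles the scaling once both exist.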

involuntary disruptions / SIGKILL handling in microservice following saga pattern

Should I engineer my microservice to handle involuntary disruptions like hardware failure?
Are these disruptions frequent enough to be worth handling in a service running on an AWS-managed EKS cluster?
Should I consider some design change in the service to handle an unexpected SIGKILL, with methods like persisting the data at each step, or will that be considered over-engineering?
What standard way would you suggest for handling these involuntary disruptions if it is
a) a RESTful service that typically responds in 1s (follows the saga pattern).
b) a service that processes a big 1GB file in 1 hour.
There are a couple of ways to handle those disruptions. As mentioned here:
Here are some ways to mitigate involuntary disruptions:
Ensure your pod requests the resources it needs.
Replicate your application if you need higher availability. (Learn about running replicated stateless and stateful applications.)
For even higher availability when running replicated applications, spread applications across racks (using anti-affinity) or across zones (if using a multi-zone cluster.)
The frequency of voluntary disruptions varies.
So:
if your budget allows it, spread your app across zones or racks; you can use Node affinity to schedule Pods on certain nodes,
make sure to configure Replicas, it will ensure that when one Pod receives SIGKILL the load is automatically directed to another Pod. You can read more about this here.
consider using DaemonSets, which ensure each Node runs a copy of a Pod.
use Deployments for stateless apps and StatefulSets for stateful.
the last thing you can do is to write your app to be disruption tolerant.
I hope I cleared the water a little bit for you, feel free to ask more questions.
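For case (b), the long-running file job, "disruption tolerant" could mean checkpointing: record how far you got after each chunk so that a pod restarted after a SIGKILL resumes instead of starting over. Below is a minimal sketch under some assumptions: the checkpoint file lives on storage that survives the pod (for example a persistent volume), and `handle_chunk` is idempotent, because the chunk in flight when the process dies is processed again. The function names and the `max_chunks` parameter (used here only to simulate a crash) are my own.

```python
import os
import tempfile


def save_checkpoint(path, offset):
    """Persist the offset atomically: write a temp file, then rename it,
    so a kill mid-write never leaves a half-written checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(str(offset))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the last saved offset, or 0 if there is none yet."""
    try:
        with open(path) as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def process_file(data_path, checkpoint_path, handle_chunk,
                 chunk_size=1024 * 1024, max_chunks=None):
    """Process the file chunk by chunk, resuming from the checkpoint."""
    offset = load_checkpoint(checkpoint_path)
    done = 0
    with open(data_path, "rb") as f:
        f.seek(offset)
        while max_chunks is None or done < max_chunks:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            handle_chunk(chunk)
            offset += len(chunk)
            save_checkpoint(checkpoint_path, offset)
            done += 1
    return offset


# Simulate a run that dies after one chunk, then a restarted run.
with tempfile.TemporaryDirectory() as d:
    data = os.path.join(d, "input.bin")
    ckpt = os.path.join(d, "input.ckpt")
    with open(data, "wb") as f:
        f.write(b"x" * 10)
    seen = []
    first = process_file(data, ckpt, seen.append, chunk_size=4, max_chunks=1)
    final = process_file(data, ckpt, seen.append, chunk_size=4)
```

For case (a), a request that completes in about a second, this kind of bookkeeping is usually less attractive than running several replicas and letting the client or the saga's compensation logic retry.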

Request buffering in Kubernetes clusters

This is a purely theoretical question. A standard Kubernetes cluster is given with autoscaling in place. If memory goes above a certain targetMemUtilizationPercentage, then a new pod is started and it takes on the flow of requests coming to the contained service. The number of minReplicas is set to 1 and the number of maxReplicas is set to 5.
What happens when the number of pods that are online reaches the maximum (5 in our case) and requests from clients are still coming towards the node? Are these requests buffered somewhere or are they discarded? Can I take any actions to avoid request loss?
Kubernetes does not natively support message queue buffering. Depending on the scenario and the setup you use, your requests will most likely time out. To manage them efficiently you'll need a custom resource running inside the Kubernetes cluster.
In such situations it is very common to use a message broker, which ensures that communication between microservices is reliable and stable, that messages are managed and monitored within the system, and that messages don't get lost.
RabbitMQ, Kafka and Redis appear to be the most popular, but choosing the right one will depend heavily on your requirements and the features you need.
It is worth noting that, since Kubernetes essentially runs on Linux, Linux itself also manages and limits the requests coming in on a socket. You may want to read more about it here.
Another thing is that if you have pod limits set, or a lack of resources, pods are likely to be restarted or the cluster may become unstable. You can usually prevent this by configuring some kind of "circuit breaker" that limits the number of requests reaching the backend so that it is not overloaded. If the number of requests goes beyond the circuit breaker threshold, the excess requests are dropped.
It is better to drop some requests than to have a cascading failure.
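The idea of dropping excess requests can be sketched in a few lines. Strictly speaking this is a concurrency limiter (load shedding) rather than a full circuit breaker, which would also trip on backend failures, but it shows the "reject beyond a threshold" behaviour. The class name, the 503 response text and the threshold are illustrative choices of mine.

```python
import threading


class InFlightLimiter:
    """Reject requests once `max_in_flight` are already being served."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def handle(self, request, backend):
        # Non-blocking acquire: if no slot is free, shed the request
        # immediately instead of queueing it behind the others.
        if not self._slots.acquire(blocking=False):
            return 503, "overloaded, try again later"
        try:
            return 200, backend(request)
        finally:
            self._slots.release()


limiter = InFlightLimiter(max_in_flight=1)


def slow_backend(request):
    # While this request holds the only slot, a second request
    # arriving "concurrently" is rejected.
    return limiter.handle("second", lambda r: "never runs")


outer = limiter.handle("first", slow_backend)
# Once the first request has finished, the slot is free again.
later = limiter.handle("third", str.upper)
```

In a real cluster this logic would more likely live in an ingress, a service mesh or a sidecar than in application code, but the effect on the client is the same: a fast rejection rather than a timeout.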
I managed to test this scenario and I get 503 Service Unavailable and 403 Forbidden on my requests that do not get processed.
Knative Serving actually does exactly this. https://github.com/knative/serving/
It buffers requests and informs autoscaling decisions based on in-flight request counts. It can also enforce a per-Pod maximum of in-flight requests and hold on to requests until newly scaled-up Pods come up, at which point Knative proxies the requests to them. It does this with a container named queue-proxy, which runs as a sidecar to its workload type called "Service".

How to notify a Pod in a StatefulSet about other Pods in Kubernetes

I was reading the tutorial on deploying a Cassandra ring and ZooKeeper with StatefulSets. What I don't understand is: if I decide to add another replica to the StatefulSet, how do I notify the other Pods that there is another one? What are the best practices for it? I want one Pod to be able to redirect a request to another Pod in my custom application in case the request doesn't belong to it (i.e. it doesn't have the data).
Well, it seems like you want to run a clustered application inside Kubernetes. That is not something Kubernetes is directly responsible for. The cluster coordination for a given solution should be handled within that solution, and the answer to a "how to" question cannot be generic.
Most software out there will have some kind of coordination, discovery and registration mechanism, be it preconfigured members, an external discovery catalog/DB or some network broadcasting.
A StatefulSet helps a lot here by retaining network identity under the service/pod and by helping to keep storage, so you can, for example, always point your new replicas to register with the first replica (or preferably one of the first two, because what if your no. 1 is the one that restarted). But as I wrote above, this depends very much on the capabilities of the solution you want to deploy.
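The stable network identity mentioned above means each replica gets a predictable DNS name of the form `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>`, provided the StatefulSet is backed by a headless Service. A new replica can therefore compute the addresses of the first members without being told about them. A small sketch, with `myapp` as a made-up name and `cluster.local` assumed as the cluster domain:

```python
def peer_hostnames(statefulset, service, namespace, replicas,
                   cluster_domain="cluster.local"):
    """DNS names of the first `replicas` pods of a StatefulSet that is
    governed by the headless Service `service`."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def seed_peers(statefulset, service, namespace, count=2):
    """The first `count` replicas, used as registration targets so a
    restart of replica 0 alone does not block newcomers."""
    return peer_hostnames(statefulset, service, namespace, count)


# Hypothetical StatefulSet and Service names.
seeds = seed_peers("myapp", "myapp", "default")
print(seeds)
```

How a new member then announces itself to those seeds (a join request, gossip, a shared catalog) is up to the application, as the answer says.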

Kubernetes: single Pod with many containers, or many Pods with a single container

I have a rather theoretical question which I can't answer with the resources found online. The question is: what's the rule for deciding how to compose containers in a Pod? Let me explain with an example.
I've these microservices:
Authentication
Authorization
Serving content
(plus) OpenResty to forward the calls from one to the other and orchestrate the flow. (Is there a possibility to do so natively in K8? It seems to have services based on nginx+lua, but I'm not sure how it works.)
For the sake of the example I avoid Databases and co, I assume they are external and not managed by kubernetes
Now, what's the correct way here LEFT or RIGHT of the image?
LEFT: this seems easier to get working, since everything works on "localhost". The downside is that it loses some of the benefit of microservices. For example, if auth becomes slow and needs more instances, I have to duplicate the whole pod and not just that service.
RIGHT: this seems a bit more complex, since I need Services to expose each Pod to the other Pods. Yet here I could duplicate auth as needed without duplicating the other containers. On the other hand, I'll have a lot of pods, since each pod is basically a container.
It is generally recommended to keep different services in different pods or, better, in different Deployments that will scale independently. The reasons are the ones generally discussed as benefits of a microservices architecture:
Looser coupling, allowing the different services to be developed independently in their own languages/technologies,
be deployed and updated independently and
also to scale independently.
The exception is what is considered a "helper application" that assists a "primary application". Examples given in the k8s docs are data pullers, data pushers and proxies. In those cases a shared file system or exchange via the loopback network interface can help with performance-critical use cases. For example, a data puller can be a sidecar container for an nginx container, pulling a website to serve from a Git repository.
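The nginx plus data puller case can be sketched as a single Pod with two containers sharing an `emptyDir` volume. The puller image name below is a placeholder I invented, and the nginx mount path is the default document root of the stock nginx image; everything else is the minimum needed to show the shared volume.

```python
import json


def sidecar_pod(name, puller_image):
    """Build a Pod where a helper container writes site content into a
    shared volume and nginx serves it from the same volume."""
    shared = {"name": "site-content", "emptyDir": {}}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "volumes": [shared],
            "containers": [
                {
                    # Primary application: serves whatever is in the volume.
                    "name": "nginx",
                    "image": "nginx",
                    "volumeMounts": [
                        {"name": shared["name"],
                         "mountPath": "/usr/share/nginx/html"}
                    ],
                },
                {
                    # Helper application: pulls content into the volume.
                    "name": "content-puller",
                    "image": puller_image,
                    "volumeMounts": [
                        {"name": shared["name"], "mountPath": "/data"}
                    ],
                },
            ],
        },
    }


# Hypothetical puller image, for illustration only.
pod = sidecar_pod("web-with-puller", "example.com/git-puller:latest")
print(json.dumps(pod, indent=2))
```

The two containers here are useless without each other and always scale together, which is what separates this case from the authentication/authorization/content services in the question.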
Right image, each in its own pod. Multiple containers in a pod should really only be used when they are highly coupled or needed to support the main container, such as a data loader.
Separate pods allow each service to be updated and deployed independently. They also allow for more efficient scaling. In the future, you may need 2 or 3 content pods but still only one authorization pod. If they are all together in the same pod, you have no choice but to scale them all.
The right image is the better option. Easier management, upgrades, scaling.
You should choose the structure on the right. The deployment model on the left is tightly coupled, which makes it hard to scale an individual module according to actual business needs.