I am trying to learn the sidecar pattern, one of the single-node patterns for distributed systems (used for implementing proxies, resource logging, etc.).
I was just wondering if it has anything to do with the cardinality ratios in classes. Does the sidecar to application container have to be one-to-one always?
[ Reference and images from Designing Distributed Systems by Brendan Burns ]
In general, a sidecar container is added to enhance the functionality of the main container. Sidecars are lightweight supporting processes or services that are usually deployed with the main application.
A sidecar mainly performs peripheral operations, mostly without the application container knowing about it. It usually shares the same volumes, namespaces, etc. as the main container, and it runs in the same pod as the application container.
Depending on requirements, it takes care of peripheral functionality such as performing updates, platform abstraction, inter-service communication, monitoring, or security-related handlers on behalf of the main container.
A sidecar is present wherever the main application is, and its lifecycle is tightly coupled to the main application container. Just as each motorcycle can have its own sidecar to meet its additional requirements, an instance of the sidecar is deployed alongside each instance of the application for its peripheral requirements. In general, a sidecar container should be small, pluggable, and simple. Hence, it is predominantly one-to-one with the main application.
If the sidecar needs heavy IPC with the main application, it is usually better to make that logic part of the main application, depending on your requirements. Also, if the sidecar becomes bloated or increasingly complex, or has to scale independently of the main application, it should be made an independent service rather than a sidecar.
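As a rough illustration of the one-to-one pairing described above, here is a minimal sketch in Python that builds a Pod manifest with one application container and one logging sidecar sharing an emptyDir volume. The names (web-app, log-shipper), the images, and the mount path are hypothetical and chosen only for illustration.

```python
import json


def build_pod_with_sidecar(app_image: str, sidecar_image: str) -> dict:
    """Return a Pod manifest with one app container and one sidecar.

    All names, images and paths here are hypothetical placeholders.
    """
    shared_logs = {"name": "logs", "mountPath": "/var/log/app"}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "web-app"},
        "spec": {
            # An emptyDir volume lives exactly as long as the pod, so the
            # sidecar's view of the logs is tied to this one app instance.
            "volumes": [{"name": "logs", "emptyDir": {}}],
            "containers": [
                {"name": "app", "image": app_image, "volumeMounts": [shared_logs]},
                {
                    "name": "log-shipper",
                    "image": sidecar_image,
                    # The sidecar only reads what the app writes.
                    "volumeMounts": [dict(shared_logs, readOnly=True)],
                },
            ],
        },
    }


pod = build_pod_with_sidecar("example/web-app:1.0", "example/log-shipper:1.0")
print(json.dumps(pod, indent=2))
```

Because both containers are declared in the same pod, every replica of the application brings its own sidecar with it, which is where the one-to-one ratio comes from. The JSON output can be fed to kubectl, since kubectl accepts JSON manifests as well as YAML.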
I would like to know what happens if Dapr fails. For example, if my service's sidecar or even the Control Plane fails, what is the expected behavior of my application?
Oh, and would there be any way for me to simulate these error cases?
Context:
In my application I have a service that uses Dapr, but in a non-critical way. Therefore, I would like to ensure that it continues to run normally even if its sidecar or Dapr fails.
Very good question without a straightforward answer, but I'll share how I look at it.
Dapr can be used with monolithic, legacy applications, for migration and modernization purposes for example, but it is more commonly used with distributed applications. In a distributed application, there are many more components that can fail: database, transparent proxy (envoy/), ingress proxy, message broker, producer, consumer... In that regard, Dapr is no different, and it can fail, but there are a few reasons why that is less likely to happen:
Dapr is like a technical microservice, it has no business logic, and your app interacts with it over explicit APIs. It is harder for a failure in the sidecar to spread to your app.
If the sidecar is exploited, it is harder for an attacker to gain control of the application, because the sidecar acts as a security boundary.
As a popular open source project, Dapr has many eyes and users on it. You are more likely to get new bugs found and fixed early.
If that happens, upgrading Dapr is much easier than a library upgrade. You can upgrade Dapr control plane with little to no disruptions to your app, and then upgrade select sidecars (a canary release if you want) - I've also done many middleware/library patching/upgrades and I know how much work the latter is in comparison.
Each sidecar lives co-located with its app. Any hardware or network failure is likely to impact both the app and sidecar, rather than sidecar only.
With Dapr, you get many resiliency and observability benefits OOTB. See my blog on this topic here. It is more likely to improve the reliability of your app than reduce it.
When you follow best practices and enable Kubernetes health checks and resource constraints, Kubernetes will detect an unhealthy sidecar and restart it. Dapr can even detect the health status of your app, and stop interacting with it until it recovers.
In the end, if there is a bug in Dapr, it may fail. But that can happen with a library implementing Dapr-like features too. With Dapr, you can isolate the failure and upgrade faster, without a single line of code change and without rebuilding or retesting the application. That is the difference from the perspective of this question.
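For the non-critical usage described in the question, one option is to treat every call to the sidecar as optional: use a short timeout and fall back to a default when the sidecar does not answer. The sketch below uses only the Python standard library. It assumes the sidecar's HTTP API is reachable on localhost at the port given by DAPR_HTTP_PORT (3500 if unset) and that a state store component named "statestore" exists; the store name and the helper names (best_effort, sidecar_healthy, get_state) are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: the sidecar's HTTP API listens on localhost, on the port from
# DAPR_HTTP_PORT, with 3500 as the fallback used here.
DAPR_BASE = f"http://localhost:{os.environ.get('DAPR_HTTP_PORT', '3500')}"


def best_effort(call, fallback):
    """Run call(); if the sidecar is unreachable or replies badly, return fallback."""
    try:
        return call()
    except (OSError, ValueError):
        # OSError covers refused connections, timeouts and HTTP error statuses
        # raised by urllib; ValueError covers malformed JSON bodies.
        return fallback


def sidecar_healthy(timeout: float = 0.5) -> bool:
    """Probe the sidecar's health endpoint (/v1.0/healthz)."""
    def probe():
        with urllib.request.urlopen(f"{DAPR_BASE}/v1.0/healthz", timeout=timeout) as resp:
            return 200 <= resp.status < 300
    return best_effort(probe, False)


def get_state(key: str, default=None, store: str = "statestore", timeout: float = 0.5):
    """Read `key` from the (hypothetical) state store, or return `default`."""
    def read():
        url = f"{DAPR_BASE}/v1.0/state/{store}/{key}"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read()
            return json.loads(body) if body else default
    return best_effort(read, default)
```

With this shape, a sidecar failure can be simulated locally by simply not starting the sidecar, or by pointing DAPR_HTTP_PORT at a closed port: get_state("cart", default={}) should then return {} after the timeout instead of raising.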
Disclaimer: I work for a company building products for running Dapr, and I'm highly biased on this topic.
I am new to Kubernetes. I am planning to build/deploy an application to EKS. This will likely be deployed on Azure and GCP as well.
I want to separate data plane and control plane in my application deployed in EKS.
Is there any way to accomplish this in EKS/Kubernetes?
Or should we go for two EKS clusters, one for the data plane and another for the control plane?
Here is the problem (copied from the answer below):
I have an application, built using the microservice architecture
(meaning you will have it split into components that will communicate
with each other).
I want to deploy this application on a public cloud (EKS, GCP, AWS).
I want a separation of the APPLICATION control plane (decision making
components like authentication APIs, internal service routing) from
the APPLICATION data plane (the serving of your application data to
the clients through the egress).
My Understanding
What I understand from your description is:
You have an application, built using the microservice architecture (meaning you will have it split into components that will communicate with each other).
You want to deploy this application on a public cloud (EKS, GCP, AWS).
You want a separation of the APPLICATION control plane (decision making components like authentication APIs, internal service routing) from the APPLICATION data plane (the serving of your application data to the clients through the egress).
If those 3 points above are true, then the answer is yes, every cloud platform has these capabilities. You will need to architect your application in a manner that your control microservices are isolated from your data microservices. There are many ways in which you can do it (most/all of them are available in all public clouds).
Some Design Ideas
Here is a conceptual description:
By using authentication and authorization mechanisms to ensure authorized communication between control and data plane applications. Think Kubernetes ServiceAccounts.
By using network policies to restrict unauthorized traffic flow between microservices. You will require a network plugin that enforces these. Think Calico CNI.
By using separate namespaces for your application services as necessary for better isolation.
At one level below, in your cloud, you can limit the traffic using security groups. Or even gateways.
You can use different instance types that match your various workload types, this will not only ensure optimal performance but also separation of failure domains. For example, if a dedicated database instance crashed, the event streaming service will still be running.
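The namespace and network-policy ideas above can be combined. The sketch below builds a NetworkPolicy manifest as a Python dict so that pods in one namespace accept ingress only from another namespace. The namespace names data-plane and control-plane are hypothetical, and the policy only takes effect if the cluster's network plugin enforces NetworkPolicy.

```python
import json


def allow_only_from_namespace(target_ns: str, source_ns: str) -> dict:
    """NetworkPolicy: pods in target_ns accept ingress only from source_ns."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-from-{source_ns}", "namespace": target_ns},
        "spec": {
            # An empty podSelector selects every pod in target_ns.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [
                        {
                            "namespaceSelector": {
                                # Kubernetes sets this label on every namespace.
                                "matchLabels": {"kubernetes.io/metadata.name": source_ns}
                            }
                        }
                    ]
                }
            ],
        },
    }


# Hypothetical namespace names, for illustration only.
policy = allow_only_from_namespace("data-plane", "control-plane")
print(json.dumps(policy, indent=2))
```

Note that once pods are selected by an Ingress policy, all other inbound traffic to them is dropped, including traffic from within their own namespace. A real setup would likely need extra rules, for example one admitting the ingress controller.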
Some Notes
Also understand that in a public cloud solution (even EKS), cluster-to-cluster traffic is more expensive than traffic inside a single cluster. This will be a very important cost factor, and you should consider using a single cluster for your application. (k8s clusters can typically scale to 1000s of nodes.)
I hope this somewhat answers your question. There are a lot of decisions you need to make but in short, yes, it is possible to do this separation, and your entire application will have to be designed in this way.
A great open-source observability control plane for your apps is Odigos. The installation is super easy, and within a few minutes you can get traces, metrics and logs. You get auto-instrumentation for all languages (including Go) as well as a manager for your OpenTelemetry collectors.
Check it out: https://github.com/keyval-dev/odigos
Should I engineer my microservice to handle involuntary disruptions like hardware failure?
Are these disruptions frequent enough to need handling in a service running on an AWS-managed EKS cluster?
Should I consider some design change in the service to handle an unexpected SIGKILL, with methods like persisting the data at each step, or would that be considered over-engineering?
What standard way would you suggest for handling these involuntary disruptions if it is
a) a RESTful service that typically responds in 1s (follows the saga pattern).
b) a service that processes a big 1GB file in 1 hour.
There are a couple of ways to handle those disruptions. As mentioned here:
Here are some ways to mitigate involuntary disruptions:
Ensure your pod requests the resources it needs.
Replicate your application if you need higher availability. (Learn about running replicated stateless and stateful applications.)
For even higher availability when running replicated applications, spread applications across racks (using anti-affinity) or across zones (if using a multi-zone cluster.)
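To make the anti-affinity suggestion concrete, here is a minimal sketch that builds a Deployment manifest as a Python dict, with several replicas and a rule that keeps two replicas out of the same zone. The app name and image are hypothetical; topology.kubernetes.io/zone is the standard well-known node label for zones.

```python
def zone_spread_deployment(name: str, image: str, replicas: int) -> dict:
    """Deployment whose replicas must land in different zones."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "affinity": {
                        "podAntiAffinity": {
                            # "required" is a hard rule: no two pods carrying
                            # these labels may share a zone.
                            "requiredDuringSchedulingIgnoredDuringExecution": [
                                {
                                    "labelSelector": {"matchLabels": labels},
                                    "topologyKey": "topology.kubernetes.io/zone",
                                }
                            ]
                        }
                    },
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


# Hypothetical service name and image.
deployment = zone_spread_deployment("orders", "example/orders:1.0", replicas=3)
```

With the hard rule used here, any replicas beyond the number of available zones stay Pending. The preferredDuringSchedulingIgnoredDuringExecution form is the softer alternative if that is too strict.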
The frequency of voluntary disruptions varies.
So:
if your budget allows it, spread your app across zones or racks; you can use node affinity to schedule Pods on certain nodes,
make sure to configure replicas, so that when one Pod receives SIGKILL the load is automatically directed to another Pod. You can read more about this here.
consider using DaemonSets, which ensure each Node runs a copy of a Pod.
use Deployments for stateless apps and StatefulSets for stateful ones.
the last thing you can do is write your app to be disruption tolerant.
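For the long-running file job from the question, disruption tolerance can be as simple as checkpointing progress so that a restarted pod resumes instead of starting over. SIGKILL cannot be caught, so the only defence is to have already persisted progress. The sketch below is one possible shape, not a standard recipe: the function names are hypothetical, and it assumes the checkpoint file lives on storage that outlives the pod (for example a persistent volume).

```python
import json
import os


def load_offset(checkpoint_path: str) -> int:
    """Return the last committed byte offset, or 0 if none was saved."""
    try:
        with open(checkpoint_path) as f:
            return int(json.load(f)["offset"])
    except (FileNotFoundError, ValueError, KeyError):
        return 0


def save_offset(checkpoint_path: str, offset: int) -> None:
    """Persist the offset: write a temp file, fsync it, then rename over the old one."""
    tmp = checkpoint_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"offset": offset}, f)
        f.flush()
        os.fsync(f.fileno())
    # The rename is atomic on POSIX, so a kill never leaves a half-written checkpoint.
    os.replace(tmp, checkpoint_path)


def process_file(path, checkpoint_path, handle_chunk, chunk_size=1 << 20):
    """Process `path` chunk by chunk, resuming after the last checkpoint."""
    offset = load_offset(checkpoint_path)
    with open(path, "rb") as f:
        f.seek(offset)
        while chunk := f.read(chunk_size):
            # handle_chunk should be idempotent: if the process dies after
            # handling a chunk but before the checkpoint, that chunk is replayed.
            handle_chunk(chunk)
            offset += len(chunk)
            save_offset(checkpoint_path, offset)
    return offset
```

Whether this is over-engineering depends on the cost of a restart: for the 1-second RESTful service, replicas and client retries are usually enough, while for the 1-hour job a checkpoint every few megabytes bounds the lost work to one chunk.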
I hope I cleared the water a little bit for you, feel free to ask more questions.
I'm still new to Kubernetes so please excuse if this is a silly question.
I'm architecting a system which includes:
an MQTT broker
a set of (containerized) microservices that publish and subscribe to it
a Redis cache that the microservices read and write to.
We will certainly need multiplicity of all of these components as we scale.
There is a natural division in the multiplicity of each of these things: they each pertain to a set of intersections in a city. A publishing or subscribing microservice will handle 1 or more intersections. The MQTT broker instance and the Redis instance each could be set up to handle n intersections.
I am wondering if it makes sense to try to avoid unnecessary network hops in Kubernetes by trying to divide things up by intersection and put all containers related to a given set of intersections on one node. Would this mean putting them all on a single pod, or is there another way?
(By the way, there will still be other publishers and subscribers that need to access the MQTT broker that are not intersection-specific.)
This is more of an opinion question.
Would this mean putting them all on a single pod, or is there another way?
I would certainly avoid putting them all in one Pod. In theory, you can put anything in a single pod, but the general practice is to add lightweight sidecars that handle a very specific function.
IMO an MQTT broker, a Redis datastore and a subscribe/publish app seem like a lot to put in a single pod.
Possible Disadvantages:
Harder to debug because you may not know where the failure comes from.
A publisher/subscriber is generally more of a stateless application, while MQTT and Redis would be stateful. Deployments are recommended for stateless services and StatefulSets for stateful services.
Too much clutter in a pod.
Networking latency may be a concern with separate pods, but you can use Node Affinity and Pod Affinity to mitigate that.
Possible Advantages:
All services share the same IP/context.
It would be cleaner if you had:
Deployment for your sub/pub app.
StatefulSet with its own storage for your Redis server.
StatefulSet with its own storage for your MQTT broker.
Each one of these workload resources would create separate pods, and you can scale each one up or down independently.
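The Pod Affinity mentioned earlier is what lets separate pods still end up on the same node. As a hedged sketch, the helper below builds the affinity fragment of a pod spec that prefers nodes already running pods with given labels. The labels (app, intersection-group) are hypothetical and would have to match whatever labels you put on the broker and Redis pods for a given set of intersections.

```python
def colocate_with(labels: dict, weight: int = 100) -> dict:
    """Pod-spec `affinity` fragment preferring nodes that already run pods with `labels`."""
    return {
        "podAffinity": {
            # "preferred" is a soft rule: the scheduler tries to co-locate,
            # but still places the pod elsewhere if the node is full.
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,  # valid range is 1-100
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": labels},
                        # Same hostname means same node.
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


# Hypothetical labels on the MQTT broker pod serving the "north" intersections.
affinity = colocate_with({"app": "mqtt-broker", "intersection-group": "north"})
```

The fragment would go under spec.template.spec.affinity of the publisher/subscriber Deployment for that intersection group, keeping the three workloads separately scalable while avoiding most cross-node hops.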
I have a rather theoretical question which I can't answer with the resources found online. The question is: what's the rule for deciding how to compose containers in a Pod? Let me explain with an example.
I've these microservices:
Authentication
Authorization
Serving content
(plus) OpenResty to forward the calls from one to the other and orchestrate the flow. (Is there a possibility to do so natively in K8s? It seems to have services based on nginx+lua, but I'm not sure how that works.)
For the sake of the example I avoid Databases and co, I assume they are external and not managed by kubernetes
Now, what's the correct way here LEFT or RIGHT of the image?
LEFT: this seems easier to get working, since everything works on "localhost". The downside is that it loses some of the benefit of microservices. For example, if auth becomes slow and needs more instances, I have to duplicate the whole pod and not just that service.
RIGHT: this seems a bit more complex, since it needs Services to expose each Pod to the other Pods. Yet here I could duplicate auth as needed without duplicating the other containers. On the other hand, I'll have a lot of pods, since each pod is basically a container.
It is generally recommended to keep different services in different pods or, better, in different Deployments that will scale independently. The reasons are those generally discussed as benefits of a microservices architecture:
A more loose coupling allowing the different services to be developed independently in their own languages/technologies,
be deployed and updated independently and
also to scale independently.
The exception is what is considered a "helper application" assisting a "primary application". Examples given in the k8s docs are data pullers, data pushers and proxies. In those cases a shared file system or exchange via the loopback network interface can help with performance-critical use cases. A data puller can be a sidecar container for an nginx container, pulling a website to serve from a Git repository, for example.
Right image, each in its own pod. Multiple containers in a pod should really only be used when they are highly coupled or needed to support the main container, such as a data loader.
Separate pods allow each service to be updated and deployed independently. They also allow for more efficient scaling. In the future, you may need 2 or 3 content pods but still only one authorization pod. If they are all together in the same pod, you have no choice but to scale them all.
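The layout on the right can be sketched as one Deployment plus one Service per microservice, each with its own replica count. The helper below builds those manifests as Python dicts; the image names, the port, and the replica counts are hypothetical.

```python
def microservice(name: str, image: str, port: int, replicas: int = 1) -> list:
    """Return [Deployment, Service] manifests for one independently scalable service."""
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {"name": name, "image": image, "ports": [{"containerPort": port}]}
                    ]
                },
            },
        },
    }
    # The Service gives the pods a stable DNS name, so other pods in the
    # namespace can call e.g. http://authentication:8080 instead of localhost.
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {"selector": labels, "ports": [{"port": port, "targetPort": port}]},
    }
    return [deployment, service]


# Only authentication is scaled up; the other services keep a single replica.
manifests = (
    microservice("authentication", "example/authn:1.0", 8080, replicas=3)
    + microservice("authorization", "example/authz:1.0", 8080)
    + microservice("content", "example/content:1.0", 8080)
)
```

Scaling the slow service later is then a change to one replicas field, which is exactly what the left-hand layout cannot offer.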
Right image is better option. Easier management, upgrades, scaling.
Choose the structure on the right. The architecture on the left is tightly coupled, which makes it hard to scale an individual module according to actual business needs.