Kubernetes Autoscaling using VPA - Off or Auto update mode? - kubernetes

For the needs of a project I have created two Kubernetes clusters on GKE.
Cluster 1: 10 containers in one Pod
Cluster 2: 10 containers in 10 different Pods
All containers are connected and together constitute an application.
What I would like to do is generate some load and observe how the VPA autoscales the containers.
So far, using the "Auto" update mode, I have noticed that the VPA changes the values only once, at the beginning, and not while I generate load,
and
that the Upper Bound is so high that no change is ever needed.
What would you suggest:
1) to use Auto or Recommendation mode?
and
2) to create 1 or 2 replicas of my application?
I would also like to mention that 2 of the 10 containers are MySQL and MongoDB. So if I have to create 2 replicas, I should use StatefulSets or operators, right?
Thank you very much!!

I'm not sure you mean this when you say:
Cluster 1: 10 containers in one Pod
Cluster 2: 10 containers in different Pods
First of all, you are not following best practice here; ideally you should be keeping a single container in a single Pod.
Running 10 containers in one Pod is too much. If there are interdependencies, your code should use the Kubernetes Service names so the containers can connect to each other.
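As an illustration, a minimal Service sketch that the other containers could reach by its DNS name (the name, label and port here are placeholders):

apiVersion: v1
kind: Service
metadata:
  name: mysql              # reachable in-cluster as "mysql" (or mysql.<namespace>.svc)
spec:
  selector:
    app: mysql             # must match the labels on the MySQL Pod
  ports:
  - port: 3306             # port exposed by the Service
    targetPort: 3306       # port the container listens on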
to create 1 or 2 replicas of my application?
Yes, it is always better to run multiple replicas of the application, so that even if a node goes down, a Pod on another node keeps running.
I would also like to mention that 2 of the 10 containers are MySQL and MongoDB. So if I have to create 2 replicas, I should use StatefulSets or operators, right?
You can use either operators or StatefulSets; in fact, an operator typically creates the StatefulSets for you.
Implementing MySQL replication across the replicas manually would be hard unless you have solid DBA experience and know what you are doing.
With an operator you get automatic backups, automatic replication management and other such things.
The operator indirectly creates the StatefulSet or Deployment, but you won't have to manage much or worry about replication, failover planning and the overall DB strategy.
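On your first question, here is a minimal sketch of a VPA object with the update mode set explicitly (the target name is a placeholder); "Off" only records recommendations, while "Auto" applies them by evicting and recreating Pods:

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa               # placeholder name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                 # placeholder workload to autoscale
  updatePolicy:
    updateMode: "Off"            # "Off" = recommendation only; "Auto" = evict and resize Pods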

Related

Run different replica count for different containers within same pod

I have a pod with 2 closely related services running as containers. I am running them as a StatefulSet and have set replicas to 5, so 5 pods are created, each having both containers.
Now my requirement is to have the second container run in only 1 pod, not in all 5, while my first service should still run in 5 pods.
Is there a way to define this in the deployment yaml file for Kubernetes? Please help.
a "pod" is the smallest entity that is managed by kubernetes, and one pod can contain multiple containers, but you can only specify one pod per deployment/statefulset, so there is no way to accomplish what you are asking for with only one deployment/statefulset.
however, if you want to be able to scale them independently of each other, you can create two deployments/statefulsets to accomplish this. this is imo the only way to do so.
see https://kubernetes.io/docs/concepts/workloads/pods/ for more information.
Containers are like processes,
Pods are like VMs,
and StatefulSets/Deployments are like the supervisor program controlling the VM's horizontal scaling.
The only way for your scenario is to define the second container in a new Deployment's pod template and set its replicas to 1, while keeping the old StatefulSet with 5 replicas.
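A rough sketch of that split, with placeholder names and images, would be:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: main-app                 # placeholder: the service that needs 5 replicas
spec:
  serviceName: main-app
  replicas: 5
  selector:
    matchLabels:
      app: main-app
  template:
    metadata:
      labels:
        app: main-app
    spec:
      containers:
      - name: main
        image: main-app:latest   # placeholder image
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: support-app              # placeholder: the container that should run only once
spec:
  replicas: 1
  selector:
    matchLabels:
      app: support-app
  template:
    metadata:
      labels:
        app: support-app
    spec:
      containers:
      - name: support
        image: support-app:latest   # placeholder image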
Here are some definitions from documentations (links in the references):
Containers are technologies that allow you to package and isolate applications with their entire runtime environment—all of the files necessary to run. This makes it easy to move the contained application between environments (dev, test, production, etc.) while retaining full functionality. [1]
Pods are the smallest, most basic deployable objects in Kubernetes. A Pod represents a single instance of a running process in your cluster. Pods contain one or more containers. When a Pod runs multiple containers, the containers are managed as a single entity and share the Pod's resources. [2]
A deployment provides declarative updates for Pods and ReplicaSets. [3]
StatefulSet is the workload API object used to manage stateful applications. Manages the deployment and scaling of a set of Pods, and provides guarantees about the ordering and uniqueness of these Pods. [4]
Based on all that information, it is impossible to match your requirements using a single Deployment/StatefulSet.
I advise you to try the idea @David Maze mentioned in a comment under your question:
If it's possible to have 4 of the main application container not having a matching same-pod support container, then they're not so "closely related" they need to run in the same pod. Run the second container in a separate Deployment/StatefulSet (also with a separate Service) and you can independently control the replica counts.
References:
Documentation about Containers
Documentation about Pods
Documentation about Deployments
Documentation about StatefulSet

Duplicate metrics with multiple instances of kube-state-metrics

Problem:
Duplicate data when querying from prometheus for metrics from kube-state-metrics.
Sample query and result with 3 instances of kube-state-metrics running:
Query:
kube_pod_container_resource_requests_cpu_cores{namespace="ns-dummy"}
Metrics
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.142:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.35.17:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-34-25.ec2.internal",pod="app1-appname-6bd9d8d978-gfk7f",service="prom-kube-state-metrics"}
1
kube_pod_container_resource_requests_cpu_cores{container="appname",endpoint="http",instance="172.232.37.171:8080",job="kube-state-metrics",namespace="ns-dummy",node="ip-172-232-35-22.ec2.internal",pod="app2-appname-ccbdfc7c8-g9x6s",service="prom-kube-state-metrics"}
Observation:
Every metric comes up Nx when N pods of kube-state-metrics are running. If only a single pod is running, we get the correct info.
Possible solutions:
Scale down to single instance of kube-state-metrics. (Reduced availability is a concern)
Enable sharding. (Solves duplication problem, still less available)
According to the docs, for horizontal scaling we have to pass sharding arguments to the pods.
Shards are zero indexed. So we have to pass the index and total number of shards for each pod.
We are using the Helm chart, and it is deployed as a Deployment.
Questions:
How can we pass different arguments to different pods in this scenario, if it's possible?
Should we be worried about availability of the kube-state-metrics considering the self-healing nature of k8s workloads?
When should we really scale it to multiple instances and how?
You could rely on the self-healing behaviour of a Deployment with only a single replica of kube-state-metrics: if the container goes down, the Deployment will start a new one. Since kube-state-metrics is not focused on the health of the individual Kubernetes components, a short outage only affects you if your cluster is very big and produces many object changes per second.
It is not focused on the health of the individual Kubernetes components, but rather on the health of the various objects inside, such as deployments, nodes and pods.
For a small cluster there is no problem running it this way, but if you really need a highly available monitoring platform, I recommend you take a look at these two articles:
creating a well designed and highly available monitoring stack for kubernetes and
kubernetes monitoring
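If you do decide to shard, one way to give every pod its own arguments is to deploy kube-state-metrics as a StatefulSet and let each pod derive its shard index from its own name via the downward API. This is only a sketch, assuming a kube-state-metrics version that supports the automated-sharding flags --pod and --pod-namespace; the image tag is a placeholder:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kube-state-metrics
spec:
  serviceName: kube-state-metrics
  replicas: 3
  selector:
    matchLabels:
      app: kube-state-metrics
  template:
    metadata:
      labels:
        app: kube-state-metrics
    spec:
      serviceAccountName: kube-state-metrics
      containers:
      - name: kube-state-metrics
        image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.10.0   # placeholder tag
        args:
        # Each pod resolves its own ordinal and the total number of replicas,
        # so every shard exports a disjoint subset of the metrics.
        - --pod=$(POD_NAME)
        - --pod-namespace=$(POD_NAMESPACE)
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        ports:
        - name: http
          containerPort: 8080

Note that for this automated sharding the ServiceAccount also needs permission to read its own StatefulSet, so it can discover the total number of shards.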

How can I maintain a set of unique number crunching containers in kubernetes?

I want to run a "set" of containers in kubernetes, each which only differs in the docker environment variables (each one searches it's own dataset, which is located on network storage, then cached into the container's ram). For example:
container 1 -> Dataset 1
container 2 -> Dataset 2
Over time, I'll want to add (and sometimes remove) containers from this "set", but don't want to restart ALL of the containers when doing so.
From my (naive) knowledge of kubernetes, the only way I can see to do this is:
Each container could be its own Deployment. However, there are thousands of containers, so this would be a pain to modify and manage.
So my questions are:
Can I use a StatefulSet to manage this?
1.1. When a StatefulSet is "updated", must it restart all pods, even if their "spec" is unchanged?
1.2 Do StatefulSets allow for each unique container/pod to have its own environment variable(s)?
Is there any kubernetes concept to "group" deployments into some logical unit?
Any other thoughts about how to implement this in kubernetes?
Would docker swarm (or another container management platform) be better suited to my use case?
According to your description, a StatefulSet is what you need.
1.1. When a StatefulSet is "updated", must it restart all pods, even if their "spec" is unchanged?
You can choose a proper update strategy. I suggest RollingUpdate but you can try whatever suits you.
Also check out this tutorial.
1.2 Do StatefulSets allow for each unique container/pod to have its own environment variable(s)?
Yes. Pod names in a StatefulSet are consistent (name-0, name-1, name-2, etc.), so you can derive per-pod configuration from the hostname (pod name) index.
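For example, a sketch (image, names and the worker command are placeholders) that exposes the pod name via the downward API and turns the ordinal into a dataset index:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cruncher                     # placeholder name
spec:
  serviceName: cruncher
  replicas: 2
  selector:
    matchLabels:
      app: cruncher
  template:
    metadata:
      labels:
        app: cruncher
    spec:
      containers:
      - name: worker
        image: my-cruncher:latest    # placeholder image
        command: ["/bin/sh", "-c"]
        args:
        # POD_NAME is cruncher-0, cruncher-1, ...; strip the prefix to get the dataset index.
        - DATASET_ID=${POD_NAME##*-}; exec /app/worker --dataset "$DATASET_ID"
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name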
Please let me know if that helped.
If you expect your containers to eventually be done with their workload and terminate (as opposed to processing a single item loaded in RAM forever), you should use a job queue such as Celery on top of Kubernetes to manage the execution. In this case Celery will do all the orchestration, including restarting jobs if they fail. This is much more manageable than using Kubernetes directly.
Kubernetes even provides an official example of such a setup.

How pod replicas sync with each other - Kubernetes?

I have a MySQL database pod with 3 replicas. Now I'm making some changes in one pod (pod data, not pod configuration), say I'm adding a table. How will the change be reflected in the other replicas of the pod?
I'm using kubernetes v1.13 with 3 worker nodes.
Pods do not sync. Think of them as independent processes.
If you want a clustered MySQL installation, the Kubernetes docs describe how to do this by using a StatefulSet: https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/#deploy-mysql
In essence you have to configure master/slave instances of MySQL yourself.
Pods are independent of each other; if you modify one pod, the others will not be affected.
As per your configuration, changes applied in one pod won't be reflected in the others. These are isolated resources.
It is good practice to deploy such things using PersistentVolumeClaims and StatefulSets.
You can always find an explanation with examples and best practices in the Run a Replicated Stateful Application documentation.
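As a rough sketch (name, image, secret and storage size are placeholders; the actual primary/replica replication still has to be configured as in the linked documentation):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:5.7
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret       # placeholder Secret holding the root password
              key: password
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  volumeClaimTemplates:
  # Each replica gets its own PersistentVolumeClaim, so data survives pod restarts.
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi                # placeholder size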
If you have three MySQL server pods, then you have 3 independent databases, even though you created them from the same Deployment. So, depending on what you do, you might end up with a bunch of databases in the cluster.
I would create 1 MySQL pod with persistence, so if the pod dies, the next one takes over where the other one left off and no data is lost.
If what you want is high availability, or a failover replica, you would need to manage it on your own.
Generally speaking, K8s should not be used for storage purposes.
You would do well to have common storage among those 3 pods (a PVC), and also consider a StatefulSet (STS) when running databases on k8s.

Set replicas on different nodes

I am developing an application for dealing with Kubernetes runtime microservices. I have actually done some cool things, like moving a microservice from one node to another. The problem is that all replicas move together.
So, imagine that a microservice has two replicas and is running in a namespace with two nodes.
I want to set one replica on each node. Is that possible? Even in a yaml file, is that possible?
I am trying to write my own scheduler to do that, but I have had no success so far.
Thank you all
I think what you are looking for is inter-pod anti-affinity in your ReplicaSet's pod template. From the documentation:
Inter-pod affinity and anti-affinity allow you to constrain which nodes your pod is eligible to be scheduled based on labels on pods that are already running on the node rather than based on labels on nodes.
Here is the documentation: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#inter-pod-affinity-and-anti-affinity-beta-feature
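A rough sketch of spreading two replicas across nodes with required anti-affinity (the names and labels are placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service                 # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: my-service
            # Never schedule two pods with this label onto the same node.
            topologyKey: kubernetes.io/hostname
      containers:
      - name: my-service
        image: my-service:latest   # placeholder image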
I can't find where it's documented, but I recently read somewhere that replicas will be distributed across nodes when you create the kubernetes service BEFORE the deployment / replicaset.