How to use VPA with database pod in kubernetes? - kubernetes

I want to apply VPA vertical pod auto scaling for database pods. Can we use VPA for database auto scaling (Vertical) as VPA requires at least 2 replicas (ref : https://github.com/kubernetes/autoscaler/issues/1665#issuecomment-464679271) as it delete pods when set criteria is reached. So pods are deleted hence also data.
What is good practice for using VPA with database pods?

As I understand it, the real question is how to run a stateful workload with multiple replicas.
Use StatefulSets to configure n replicas for a database. StatefulSet pods have stable names which are preserved across pod restarts (and reincarnations). Combined with PersistentVolumeClaim templates (accepted with StatefulSet spec) and headless services, it is capable of retaining same volumes and network FQDN across reincarnations.
Take a look at Helm charts for various databases, e.g. MySQL chart, for useful insights.
On a side note, it might be worthwhile to consider using an operator for the database application you're using. Operators for most applications can be found on https://operatorhub.io.

VPA - Vertical pod autoscaler can work in 2 ways:
Recommendation mode - it will recommend the requests and limits for pods based on resources used
Auto mode - it will automatically analyze the usage and set the request and limits on pods. This will result in pod termination to recreate it with new specification as stated here:
Due to Kubernetes limitations, the only way to modify the resource requests of a running Pod is to recreate the Pod. If you create a VerticalPodAutoscaler with an updateMode of "Auto", the VerticalPodAutoscaler evicts a Pod if it needs to change the Pod's resource requests.
Cloud.google.com: Kubernetes Engine: Docs: Concepts: Vertical pod autoscaler
Please refer to above link for more information regarding the concepts of VPA.
The fact that it needs at least 2 replicas is most probably connected with the fact of high availability. As the pods are getting evicted to support new limits they are unable to process the request. If it came to situation where there is only 1 replica at the time, this replica wouldn't be able to respond to requests when in terminating/recreating state.
There is an official guide to run VPA on GKE:
Cloud.google.com: Kubernetes Engine: How to: Vertical pod autoscaling
VPA supports: Deployments as well as StatefulSets.
StatefulSet
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling.
If you want to use storage volumes to provide persistence for your workload, you can use a StatefulSet as part of the solution.
Kubernetes.io: StatefulSet
Configuring StatefulSet with PersistentVolumes will ensure that the data stored on PV will not be deleted in case of pod termination.
To be able to use your database with replicas > 1 you will need to have replication implemented within your database environment.
There are guides/resources/solutions on running databases within Kubernetes environment. Please choose the solution most appropriate to your use case. Some of them are:
Kubernetes.io: Run replicated stateful application
Github.com: Zalando: Postgres operator
Github.com: Oracle: Mysql operator
After deploying your database you will be able to run below command to extract the name of the StatefulSet:
$ kubectl get sts
You can then apply the name of the StatefulSet to the VPA like below:
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: DB-VPA
spec:
targetRef:
apiVersion: "apps/v1"
kind: StatefulSet
name: <INSERT_DB_STS_HERE>
updatePolicy:
updateMode: "Auto"
I encourage you also to read this article:
Cloud.google.com: Blog: To run or not to run a database on Kubernetes, what to consider

Related

What handles a StatefulSet replication?

If a Deployment uses ReplicaSets to scale Pods up and down, and StatefulSets don't have ReplicaSets...
So, how does it manage to scale Pods up and down? I mean, what resource is responsible? What requests does a StatefulSet make in order to scale?
In short StatefulSet Controller handles statefulset replicas.
A StatefulSet is a Kubernetes API object for managing stateful application workloads. StatefulSets handle the deployment and scaling of sets of Kubernetes pods, providing guarantees about their uniqueness and ordering.
Similar to deployments, StatefulSets manage pods with identical container specifications. They differ in terms of maintaining a persistent identity for each pod. While the pods are all created based on the same spec, they are not interchangeable, so each pod is given a persistent identifier that is maintained through rescheduling.
Benefits of a StatefulSet deployment include:
Unique identifiers—every pod in the StatefulSet is assigned a unique, stable network identity, consisting of a hostname based on the application name and instance number. For example, a StatefulSet for a web application with three instances may have pods labeled web1, web2 and web3.
Persistent storage—every pod has its own stable, persistent volume, either by default or as defined per storage class. When the pods in a cluster are scaled down or deleted, their associated volumes are not lost, and the data persists. Unneeded resources can be purged by scaling down the StatefulSet to 0 before deleting the unused pods.
Ordered deployment and scaling—the pods in a StatefulSet are created and deployed in order, according to their increments. Pods are also shut down in (reverse) order, ensuring that the deployment and runtime are reliable and repeatable. The StatefulSet won’t scale until all every required pod is running, so if a pod fails, it will recreate the pod before it attempts to add more instances as per the scaling requirements.
Automated, ordered updates—a StatefulSets can handle rolling updates, shutting down each node and rebuilding it according to the original order, until every node has been replaced and the older versions cleaned up. The persistent volumes can be reused, so data is migrated to the new version automatically.

How to make sure Kubernetes autoscaler not deleting the nodes which runs specific pod

I am running a Kubernetes cluster(AWS EKS one) with Autoscaler pod So that Cluster will autoscale according to the resource request within the cluster.
Also, cluster will shrink no of nodes when the load is reduced. As I observed, Autosclaer can delete any node in this process.
I want to control this behavior such as asking Autoscaler to stop deleting nodes that runs a specific pod.
For example, If a node runs the Jenkins pod, Autoscaler should skip that node and delete other matching node from the cluster.
Will there a way to achieve this requirement. Please give your thoughts.
You can use "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
...
template:
metadata:
labels:
app: jenkins
annotations:
"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
spec:
nodeSelector:
failure-domain.beta.kubernetes.io/zone: us-west-2b
...
You should set a pod disruption budget that references specific pods by label. If you want to ensure that there is at least one Jenkins worker pod running at all times, for example, you could create a PDB like
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
name: jenkins-worker-pdb
spec:
minAvailable: 1
selector:
matchLabels:
app: jenkins
component: worker
(adapted from the basic example in Specifying a Disruption Budget in the Kubernetes docs).
Doing this won't prevent nodes from being destroyed; the cluster autoscaler is still free to scale things down. What it will do is temporarily delay destroying a node until the disruption budget can be met again.
For example, say you've configured your Jenkins setup so that there are three workers. Two get scheduled on the same node, and the autoscaler takes that node offline. The ordinary Kubernetes Deployment system will create two new replicas on nodes that still exist. If the autoscaler also decides it wants to destroy the node that has the last worker, the pod disruption budget above will prevent it from doing so until at least one other worker is running.
When you say "the Jenkins pod" in the question, there are two other important implications to this. One is that you should almost always configure your applications using higher-level objects like Deployments or StatefulSets and not bare Pods. The other is that it is generally useful to run multiple copies of things for redundancy if nothing else. Even absent the cluster autoscaler, disks fail, Amazon occasionally arbitrarily decommissions EC2 instances, and nodes otherwise can go offline outside of your control; you often don't want just one copy of something running in your cluster, especially if you're considering it a critical service.
In autoscaler FAQ on github you can read the following:
What types of pods can prevent CA from removing a node?
Pods with restrictive PodDisruptionBudget.
Kube-system pods that:
are not run on the node by default, *
don't have a pod disruption
budget
set or their PDB is too restrictive (since CA 0.6).
Pods that are not backed by a controller object (so not created by deployment, replica set, job, stateful set etc). *
Pods with local storage. *
Pods that cannot be moved elsewhere due to various constraints (lack of resources, non-matching node selectors or affinity, matching
anti-affinity, etc)
Pods that have the following annotation set: "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
*Unless the pod has the following annotation (supported in
CA 1.0.3 or later): "cluster-autoscaler.kubernetes.io/safe-to-evict": "true"

Does ReplicaSets replace Pods?

I have a conceptional question, does ReplicaSets use Pod settings?
Before i applied my ReplicaSets i deleted my Pods, so there is no information about my old Pods ?
If I apply now the Replicaset does this reference to the Pod settings, so with all settings like readinessProbe/livenessProbe ... ?
My Questions came up because in my replicaset.yml is a container section where I specified my docker image, but why does it need that information, isn't it a redundant information, because this information is in my pods.yml ?
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
name: test1
spec:
replicas: 2
template:
metadata:
labels:
app: web
spec:
containers:
- name: test1
image: test/test
Pods are the smallest deployable units of computing that can be
created and managed in Kubernetes.
A Pod (as in a pod of whales or pea pod) is a group of one or more
containers (such as Docker containers), with shared storage/network,
and a specification for how to run the containers.
See, https://kubernetes.io/docs/concepts/workloads/pods/pod/.
So, you can specify how your Pod will be scheduled (one or more containers, ports, probes, volumes, etc.).
But in case of the node failure or anything bad that can harm to the Pod, then that Pod won't be rescheduled (you have to rescheduled manually). So, in that case, you need a controller. Kubernetes provides some controllers (each one for different purposes). They are -
ReplicaSet
ReplicationController
Deployment
StatefulSet
DaemonSet
Job
CronJob
All of the above controllers and the Pod together are called as Workload. Because they all have a podTemplate section. And they all create some number of identical Pods ass specified by the spec.replicas field (if this field exists in the corresponding workload manifest). They all are upper-level concept than Pod.
Though the Deployment is more suitable than the ReplicaSet, this answer focuses on ReplicaSet over Pod cause the question is between the Pod and ReplicaSet.
In addition, each one of the above controllers has it's own purpose. Like a ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
A ReplicaSet contains a podTemplate field including selectors to identify and acquire Pod(s). A pod template specifying the configuration of new Pods it should create to meet the number of replicas criteria. It creates and deletes Pod(s) as needed to reach the desired number. When a ReplicaSet needs to create new Pod(s), it uses its Pod template.
The Pod(s) maintained by a ReplicaSet has metadata.ownerReferences field, to tell which resource owns the current Pod(s).
A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a controller and it matches a ReplicaSet’s selector, it will be immediately acquired by said ReplicaSet.
Ref: https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/
**
Now, its time to answer your questions
Since ReplicaSet is one of the Pod controller (listed above), obviously, it needs a podTemplate (using this template, your Pods will be scheduled). All of the Pods the ReplicaSet creates will have the same Pod configuration (same containers, same ports, same readiness/livelinessProbe, volumes, etc.). And having this podTemplate is not redundant info, it's needed. So, if you have a Pod controller like ReplicaSet or other (as your need), you don't need the Pod itself anymore. Because the ReplicaSet (or the other controllers) will create Pod(s).
**
Guess, you got the answer.
Does ReplicaSets replace Pods?
Yes, if you have replicaset.yml you don't need pods.yml.
I have a conceptional question, does ReplicaSets use Pod settings?
Before i applied my ReplicaSets i deleted my Pods, so there is no
information about my old Pods ? If I apply now the Replicaset does
this reference to the Pod settings, so with all settings like
readinessProbe/livenessProbe ... ?
No, the ReplicaSet manifest has to contain the Pod specification in order to determine what is the configuration of the pods that should be deployed.
With the labels, you link the ReplicaSet to running Pods.
You don't link the ReplicaSet.yml manifest to the Pods.yml manifest.
Don’t use naked Pods (that is, Pods not bound to a ReplicaSet or Deployment) if you can avoid it. Naked Pods will not be rescheduled in the event of a node failure.
In 99% of the cases, there isn't a separate pods.yml manifest.
The pods + the ReplicaSet are defined in a single manifest, hence the containers section in the replicaset.yml.

StatefulSet, ReplicaSet or DaemonSet. What is the best for a single Pod?

I want to deploy a single Pod on a Node to host my service (like GitLab for the example). The problem is : a Pod will not be re-created after the Node failure (like a reboot). The solution(s) : Use a StatefulSet, ReplicaSet or DaemonSet to ensure the Pod creation after a Node failure. But what is the best for this case ?
This Pod is stateful (I am using volume hostPath to keep the data) and is deployed using nodeSelector to keep it always on the same Node.
Here is a simple YAML file for the example : https://pastebin.com/WNDYTqSG
It creates 3 Pods (one for each Set) with a volume to keep the data statefully. In practice, all of these solutions can feet my needs, but I don't know if there are best practices for this case.
Can you help me to choose between these solutions to deploy a single stateful Pod please ?
Deployment is the most common option to manage a Pod or set of Pods. These are normally used instead of ReplicaSets as they are more flexible and creating a Deployment results in a ReplicaSet - see https://www.mirantis.com/blog/kubernetes-replication-controller-replica-set-and-deployments-understanding-replication-options/
You would only need a StatefulSet if you had multiple Pods and needed dedicated persistence per Pod or you had multiple Pods and the Pods need individual names because they relate to each other (e.g. one is a leader) - https://stackoverflow.com/a/48006210/9705485
A DaemonSet would be used when you want one Pod/replica per Node

What is the difference between a pod and a deployment?

I have been creating pods with type:deployment but I see that some documentation uses type:pod, more specifically the documentation for multi-container pods:
apiVersion: v1
kind: Pod
metadata:
name: ""
labels:
name: ""
namespace: ""
annotations: []
generateName: ""
spec:
? "// See 'The spec schema' for details."
: ~
But to create pods I can just use a deployment type:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: ""
spec:
replicas: 3
template:
metadata:
labels:
app: ""
spec:
containers:
etc
I noticed the pod documentation says:
The create command can be used to create a pod directly, or it can
create a pod or pods through a Deployment. It is highly recommended
that you use a Deployment to create your pods. It watches for failed
pods and will start up new pods as required to maintain the specified
number. If you don’t want a Deployment to monitor your pod (e.g. your
pod is writing non-persistent data which won’t survive a restart, or
your pod is intended to be very short-lived), you can create a pod
directly with the create command.
Note: We recommend using a Deployment to create pods. You should use
the instructions below only if you don’t want to create a Deployment.
But this raises the question of what kind:pod is good for? Can you somehow reference pods in a deployment? I didn't see a way. It looks like what you get with pods is some extra metadata but none of the deployment options such as replica or a restart policy. What good is a pod that doesn't persist data, survives a restart? I think I'd be able to create a multi-container pod with a deployment as well.
Radek's answer is very good, but I would like to pitch in from my experience, you will almost never use an object with the kind pod, because that doesn't make any sense in practice.
Because you need a deployment object - or other Kubernetes API objects like a replication controller or replicaset - that needs to keep the replicas (pods) alive (that's kind of the point of using kubernetes).
What you will use in practice for a typical application are:
Deployment object (where you will specify your apps container/containers) that will host your app's container with some other specifications.
Service object (that is like a grouping object and gives it a so-called virtual IP (cluster IP) for the pods that have a certain label - and those pods are basically the app containers that you deployed with the former deployment object).
You need to have the service object because the pods from the deployment object can be killed, scaled up and down, and you can't rely on their IP addresses because they will not be persistent.
So you need an object like a service, that gives those pods a stable IP.
Just wanted to give you some context around pods, so you know how things work together.
Hope that clears a few things for you, not long ago I was in your shoes :)
Both Pod and Deployment are full-fledged objects in the Kubernetes API. Deployment manages creating Pods by means of ReplicaSets. What it boils down to is that Deployment will create Pods with spec taken from the template. It is rather unlikely that you will ever need to create Pods directly for a production use-case.
Kubernetes has three Object Types you should know about:
Pods - runs one or more closely related containers
Services - sets up networking in a Kubernetes cluster
Deployment - Maintains a set of identical pods, ensuring that they have the correct config and that the right number of them exist.
Pods:
Runs a single set of containers
Good for one-off dev purposes
Rarely used directly in production
Deployment:
Runs a set of identical pods
Monitors the state of each pod, updating as necessary
Good for dev
Good for production
And I would agree with other answers, forget about Pods and just use Deployment. Why? Look at the second bullet point, it monitors the state of each pod, updating as necessary.
So, instead of struggling with error messages such as this one:
Forbidden: pod updates may not change fields other than spec.containers[*].image
So just refactor or completely recreate your Pod into a Deployment that creates a pod to do what you need done. With Deployment you can change any piece of configuration you want to and you need not worry about seeing that error message.
Pod is container instance.
That is the output of replicas: 3
Think of one deployment can have many running instances(replica).
//deployment.yaml
apiVersion: apps/v1beta2
kind: Deployment
metadata:
name: tomcat-deployment222
spec:
selector:
matchLabels:
app: tomcat
replicas: 3
template:
metadata:
labels:
app: tomcat
spec:
containers:
- name: tomcat
image: tomcat:9.0
ports:
- containerPort: 8080
I want to add some informations from Kubernetes In Action book, so you can see all picture and connect relation between Kubernetes resources like Pod, Deployment and ReplicationController(ReplicaSet)
Pods
are the basic deployable unit in Kubernetes. But in real-world use cases, you want your deployments to stay up and running automatically and remain healthy without any manual intervention. For this the recommended approach is to use a Deployment, which under the hood create a ReplicaSet.
A ReplicaSet, as the name implies, is a set of replicas (Pods) maintained with their Revision history.
(ReplicaSet extends an older object called ReplicationController -- which is exactly the same but without the Revision history.)
A ReplicaSet constantly monitors the list of running pods and makes sure the running number of pods matching a certain specification always matches the desired number.
Removing a pod from the scope of the ReplicationController comes in handy
when you want to perform actions on a specific pod. For example, you might
have a bug that causes your pod to start behaving badly after a specific amount
of time or a specific event.
A Deployment
is a higher-level resource meant for deploying applications and updating them declaratively.
When you create a Deployment, a ReplicaSet resource is created underneath (eventually more of them). ReplicaSets replicate and manage pods, as well. When using a Deployment, the actual pods are created and managed by the Deployment’s ReplicaSets, not by the Deployment directly
Let’s think about what has happened. By changing the pod template in your Deployment resource, you’ve updated your app to a newer version—by changing a single field!
Finally, Roll back a Deployment either to the previous revision or to any earlier revision so easy with Deployment resource.
These images are from Kubernetes In Action book, too.
Pod is a collection of containers and basic object of Kuberntes. All containers of pod lie in same node.
Not suitable for production
No rolling updates
Deployment is a kind of controller in Kubernetes.
Controllers use a Pod Template that you provide to create the Pods for which it is responsible.
Deployment creates a ReplicaSet which in turn make sure that,
CurrentReplicas is always same as desiredReplicas .
Advantages :
You can rollout and rollback your changes using deployment
Monitors the state of each pod
Best suitable for production
Supports rolling updates
In Kubernetes we can deploy our workloads using different type of API objects like Pods, Deployment, ReplicaSet, ReplicationController and StatefulSets.
Out of those Pods are the smallest deployable unit in Kubernetes. Any workload/application that runs in Kubernetes, has to run inside a container part of a Pod. A Pod could run multiple containers (meaning multiple applications) within it. A Pod is a wrapper on top of one/many running containers. Using a Pod, kubernetes could control, monitor, operate the containers.
Now using stand alone Pods we can't do lot of things. We can't change configurations, volumes inside Pods. We can't restart the Pod if one is down.
So there is another API Object called Deployment comes into picture which maintains the desired state (how many instances, how much compute resource application uses) of the application. The Deployment maintaines multiple instances of same application by running multiple Pods. Deployments unlike Pods are mutable. Deployments uses another API Object called ReplicaSet to maintain the desired state. Deployments through ReplicaSet spawns another Pod if one is down.
So Pod runs applications in containers. Deployments run Pods and maintains desired state of the application.
Try to avoid Pods and implement Deployments instead for managing containers as objects of kind Pod will not be rescheduled (or self healed) in the event of a node failure or pod termination.
A Deployment is generally preferable because it defines a ReplicaSet to ensure that the desired number of Pods is always available and specifies a strategy to replace Pods, such as RollingUpdate.
May be this example will be helpful for beginners !!
1) Listing PODs
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 92s
webapp-mysql-75dfdf859f-9c54j 1/1 Running 0 92s
2) Deleting web-app pode - which is created using deployment
controlplane $ kubectl -n my-namespace delete pod webapp-mysql-75dfdf859f-9c54j
pod "webapp-mysql-75dfdf859f-9c54j" deleted
3) Listing PODs ( You can see, it is recreated automatically)
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
mysql 1/1 Running 0 2m42s
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 45s
4) Deleting mysql POD whcih is created directly ( with out deployment)
controlplane $ kubectl -n my-namespace delete pod mysql
pod "mysql" deleted
5) Listing PODs ( You can see mysql POD is lost for ever )
controlplane $ kubectl -n my-namespace get pods
NAME READY STATUS RESTARTS AGE
webapp-mysql-75dfdf859f-mqrcx 1/1 Running 0 76s
In kubernetes Pods are the smallest deployable units. Every time when we create a kubernetes object like Deployments, replica-sets, statefulsets, daemonsets it creates pod.
As mentioned above deployments create pods based on desired state mentioned in your deployment object. So for example you want 5 replicas of a application, you mentioned replicas: 5 in your deployment manifest. Now deployment controller is responsible to create 5 identical replicas (no less, no more) of given application with all metadata like RBAC policy, networks policy, labels, annotations, health check, resource quotas, taint/tolerations and others and associate with each pods it creates.
There are some cases when you wants to create pod, for example if you are running a test sidecar where you don't need to run application forever, you don't need multiple replicas, and you run application when you wants to execute in that case pod is suitable. For example helm test, which is a pod definition that specifies a container with a given command to run.
I am also a beginner in k8s so correct me if I am wrong.
We know that a pod is created when we create a deployment. What I observed is that if you see the YAML file of the deployment, you can see its kind:deployment. But if you see the YAML file of the pod, you see its kind:pod.