I am new to Kubernetes and trying to understand when to use the kubectl autoscale and kubectl scale commands.
The scale (replica count) of a Deployment tells how many pods should always be running to ensure proper working of the application. You have to specify it manually.
In YAML you define it in spec.replicas, as in the example below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
The second way to specify the scale (replicas) of a Deployment is via a command:
$ kubectl run nginx --image=nginx --replicas=3
deployment.apps/nginx created
$ kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 3 3 3 3 11s
It means that the Deployment will have 3 pods running and Kubernetes will always try to maintain this number of pods (if any of the pods crashes, K8s will recreate it). You can always change it in spec.replicas and apply the file again with kubectl apply -f <deployment-file>, or via command:
$ kubectl scale deployment nginx --replicas=10
deployment.extensions/nginx scaled
$ kubectl get deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
nginx 10 10 10 10 4m48s
Please read the documentation about scaling and ReplicaSets.
Horizontal Pod Autoscaling (HPA) was invented to scale a Deployment based on metrics produced by its pods. For example, if your application receives about 300 HTTP requests per minute and each pod can handle 100 HTTP requests per minute, 3 pods are enough. However, if you receive a huge number of requests, say 1000 per minute, 3 pods will not be enough and about 70% of requests will fail. With HPA, the Deployment will automatically scale up to 10 pods to handle all the requests. After some time, when the number of requests drops to 500 per minute, it will scale down to 5 pods. Later it might go up or down again, depending on the request rate and your HPA configuration.
The easiest way to apply autoscaling is:
$ kubectl autoscale deployment <your-deployment> --<metrics>=value --min=3 --max=10
It means that the autoscaler will automatically scale the Deployment, based on the given metric, up to a maximum of 10 pods, and later scale it back down to a minimum of 3.
A very good example is shown in the HPA documentation using CPU usage.
Please keep in mind that Kubernetes can use many types of metrics exposed via the metrics APIs (HTTP requests, CPU/memory load, number of threads, etc.).
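For example, a CPU-based setup for the nginx Deployment above could look like the following (a minimal sketch, assuming the metrics-server is running in the cluster and the pod template defines a CPU resource request):
# keep average CPU utilization across pods around 80%, between 3 and 10 replicas
$ kubectl autoscale deployment nginx --cpu-percent=80 --min=3 --max=10
# check the current state of the autoscaler
$ kubectl get hpa nginx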
Hope this helps you understand the difference between scaling and autoscaling.
I am creating deployments using Kubernetes API from my server. The deployment pod has two containers - one is the main and the other is a sidecar container that checks the health of the pod and calls the server when it becomes healthy.
I am using this endpoint to get the deployment. It has a deployment status property with the following structure, as mentioned here.
I couldn't understand the fields availableReplicas, readyReplicas, replicas, unavailableReplicas and updatedReplicas.
I checked the Kubernetes docs and these SO questions too - What is the difference between current and available pod replicas in kubernetes deployment? and Meaning of "available" and "unavailable" in kubectl describe deployment - but could not infer the difference between a pod being ready, running, and available. Could somebody please explain the difference between these terms and states?
The different kinds of replicas in the Deployment's status can be described as follows (an illustrative status snippet is shown after the list):
replicas - describes how many pods this Deployment should have. It is copied from the spec. This happens asynchronously, so for a very brief interval you could read a Deployment where spec.replicas is not equal to status.replicas.
availableReplicas - the number of pods that have been ready for at least some time (minReadySeconds). This prevents flapping of state.
unavailableReplicas - the total number of pods that should be there, minus the number of pods that are currently available; this includes pods that still have to be created and pods that are not yet available (e.g. are failing, or have not been ready for minReadySeconds).
updatedReplicas - the number of pods targeted by the Deployment that match the current pod template in the spec (i.e. that have already been updated during a rollout).
readyReplicas - the number of pods targeted by the Deployment that have a Ready condition, i.e. their containers are running and they pass their readiness probes.
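For illustration only, a healthy Deployment with 3 replicas and no rollout in progress might report a status section roughly like this (the values are an example, not taken from a real cluster):
status:
  observedGeneration: 1
  replicas: 3
  updatedReplicas: 3
  readyReplicas: 3
  availableReplicas: 3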
Let's use the official example of a Deployment that creates a ReplicaSet to bring up three nginx Pods:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
The Deployment creates three replicated Pods, indicated by the .spec.replicas field.
Create the Deployment by running the following command:
kubectl apply -f https://k8s.io/examples/controllers/nginx-deployment.yaml
Run kubectl get deployments to check if the Deployment was created.
If the Deployment is still being created, the output is similar to the following:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 0/3 0 0 1s
When you inspect the Deployments in your cluster, the following fields are displayed:
NAME - lists the names of the Deployments in the namespace.
READY - displays how many replicas of the application are available to your users. It follows the pattern ready/desired.
UP-TO-DATE - displays the number of replicas that have been updated to achieve the desired state.
AVAILABLE - displays how many replicas of the application are available to your users.
AGE - displays the amount of time that the application has been running.
Run kubectl get deployments again a few seconds later. The output is similar to this:
NAME READY UP-TO-DATE AVAILABLE AGE
nginx-deployment 3/3 3 3 18s
To see the ReplicaSet (rs) created by the Deployment, run kubectl get rs. The output is similar to this:
NAME DESIRED CURRENT READY AGE
nginx-deployment-75675f5897 3 3 3 18s
ReplicaSet output shows the following fields:
NAME - lists the names of the ReplicaSets in the namespace.
DESIRED - displays the desired number of replicas of the application, which you define when you create the Deployment. This is the desired state.
CURRENT - displays how many replicas are currently running.
READY - displays how many replicas of the application are available to your users.
AGE - displays the amount of time that the application has been running.
As you can see, in this output there is little practical difference between availableReplicas and readyReplicas, as both of those fields display how many replicas of the application are available to your users; the only distinction is the minReadySeconds delay described above.
And when it comes to the Pod lifecycle, it is important to see the difference between the Pod phase, container states, and Pod conditions, which all have different meanings. I strongly recommend going through the linked docs in order to get a solid understanding of them.
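For instance, you can inspect a Pod's phase and conditions directly (the pod name below is a placeholder):
# Pod phase: Pending, Running, Succeeded, Failed or Unknown
$ kubectl get pod <pod-name> -o jsonpath='{.status.phase}'
# Pod conditions: PodScheduled, Initialized, ContainersReady, Ready
$ kubectl get pod <pod-name> -o jsonpath='{.status.conditions}'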
I have a node pool (default-pool) in a GKE cluster with 3 nodes, machine type n1-standard-1. They host 6 pods with a Redis cluster (3 masters and 3 slaves) and 3 pods with a nodejs example app.
I want to upgrade to a bigger machine type (n1-standard-2), also with 3 nodes.
In the documentation, Google gives an example of upgrading to a different machine type (in a new node pool).
I have tested it while in development, and my node pool was unreachable for a while when executing the following command:
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do
kubectl cordon "$node";
done
In my terminal, I got a message that my connection with the server was lost (I could not execute kubectl commands). After a few minutes, I could reconnect and I got the desired output as shown in the documentation.
The second time, I tried leaving out the cordon command and I skipped to the following command:
for node in $(kubectl get nodes -l cloud.google.com/gke-nodepool=default-pool -o=name); do
kubectl drain --force --ignore-daemonsets --delete-local-data --grace-period=10 "$node";
done
This is because, if I interpret the Kubernetes documentation correctly, the nodes are automatically cordoned when using the drain command. But I got the same result as with the cordon command: I lost connection to the cluster for a few minutes, and I could not reach the nodejs example app that was hosted on the same nodes. After a few minutes, it restored itself.
I found a workaround to upgrade to a new node pool with bigger machine types: I edited the deployment/statefulset YAML files and changed the nodeSelector. Node pools in GKE are labeled with:
cloud.google.com/gke-nodepool=NODE_POOL_NAME
so I added the correct nodeSelector to the deployment.yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
  labels:
    app: example
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example
  template:
    metadata:
      labels:
        app: example
    spec:
      nodeSelector:
        cloud.google.com/gke-nodepool: new-default-pool
      containers:
      - name: example
        image: IMAGE
        ports:
        - containerPort: 3000
This works without downtime, but I'm not sure this is the right way to do it in a production environment.
What is wrong with the cordon/drain command, or am I not using them correctly?
Cordoning a node will cause it to be removed from the load balancer's backend list, and so will draining it. The correct way to do it is to set up anti-affinity rules on the Deployment so the pods are not deployed on the same node, or the same region for that matter. That will give an even distribution of pods throughout your node pool.
Then you have to disable autoscaling on the old node pool if you have it enabled, slowly drain one or two nodes at a time and wait for the pods to appear on the new node pool, making sure at all times to keep at least one pod of the Deployment alive so it can handle traffic.
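For the anti-affinity rules mentioned above, a minimal sketch (assuming the pods carry the label app: example and you want to spread them across nodes) added to the Deployment's pod template could look like:
spec:
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: example
          topologyKey: kubernetes.io/hostname   # prefer one pod per node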
I'm hosting an application on the Google Cloud Platform via Kubernetes, and I've managed to set up this continuous deployment pipeline:
Application code is updated
New Docker image is automatically generated
K8s Deployment is automatically updated to use the new image
This works great, except for one issue - the deployment always seems to have only one pod. Because of this, when the next update cycle comes around, the entire application goes down, which is unacceptable.
I've tried modifying the YAML of the deployment to increase the number of replicas, and it works... until the next image update, where it gets reset back to one pod again.
This is the command I use to update the image deployment:
set image deployment foo-server gcp-cd-foo-server-sha256=gcr.io/project-name/gcp-cd-foo-server:$REVISION_ID
You can use this command if you don't want to edit the deployment YAML file:
kubectl scale deployment foo-server --replicas=2
Also, look at the update strategy with the maxUnavailable and maxSurge properties.
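For reference, a rough sketch of such an update strategy in the Deployment spec (the values here are illustrative):
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod may be down during an update
      maxSurge: 1         # at most one extra pod may be created above the desired count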
In your original deployment.yml file, keep replicas at 2 or more; otherwise you can't avoid downtime when only one pod is running and you re-deploy/upgrade, etc.
Deployment with 3 replicas (example):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Deployment can ensure that only a certain number of Pods may be down
while they are being updated. By default, it ensures that at least 25%
less than the desired number of Pods are up (25% max unavailable).
Deployment can also ensure that only a certain number of Pods may be
created above the desired number of Pods. By default, it ensures that
at most 25% more than the desired number of Pods are up (25% max
surge).
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/
Never mind, I had just set up my deployments wrong - it had something to do with using the GCP user interface to create the deployments rather than console commands. I created the deployments with kubectl run app --image ... instead and it works now.
I have a deployment with a defined number of replicas. I use readiness probe to communicate if my Pod is ready/ not ready to handle new connections – my Pods toggle between ready/ not ready state during their lifetime.
I want Kubernetes to scale the deployment up/ down to ensure that there is always the desired number of pods in a ready state.
Example:
If replicas is 4 and there are 4 Pods in ready state, then Kubernetes should keep the current replica count.
If replicas is 4 and there are 2 ready pods and 2 not ready pods, then Kubernetes should add 2 more pods.
How do I make Kubernetes scale my deployment based on the "ready"/ "not ready" status of my Pods?
I don't think this is possible. If a Pod is not ready, Kubernetes will not make it ready, because readiness depends on your application. Even if it created new Pods, there is no guarantee they would become ready either. So you have to resolve the reasons behind the not-ready status yourself. The only thing Kubernetes does is keep not-ready Pods out of the Service endpoints so they don't receive traffic, to avoid request failures.
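To make that concrete, readiness is defined by a probe in the pod spec; a minimal sketch (the /healthz path and port are assumptions for illustration, not taken from the question):
containers:
- name: app
  image: IMAGE
  readinessProbe:
    httpGet:
      path: /healthz   # hypothetical health endpoint exposed by the application
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10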
Ensuring you always have 4 pods running can be done by specifying the replicas property in your deployment definition:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 4 # here we define a requirement for 4 replicas
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Kubernetes will ensure that if any pods crash, replacement pods will be created so that a total of 4 are always available.
You cannot schedule pods on unhealthy nodes in the cluster. The API server will only create pods on nodes that are healthy, schedulable, and meet the quota criteria for additional pods.
Moreover, what you describe is Kubernetes' self-healing behavior, which in basic terms is taken care of for you.
The following is the file used to create the Deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kloud-php7
  namespace: kloud-hosting
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: kloud-php7
    spec:
      containers:
      - name: kloud-php7
        image: 192.168.1.1:5000/kloud-php7
      - name: kloud-nginx
        image: 192.168.1.1:5000/kloud-nginx
        ports:
        - containerPort: 80
The Deployment and the Pod worked fine, but after deleting the Deployment and the generated ReplicaSet, I cannot delete the spawned Pods permanently. New Pods are created when old ones are deleted.
The Kubernetes cluster was created with kargo and contains 4 nodes running CentOS 7.3, Kubernetes version 1.5.6.
Any idea how to solve this problem?
This is working as intended. The Deployment creates (and recreates) a ReplicaSet and the ReplicaSet creates (and recreates!) Pods. You need to delete the Deployment, not the Pods or the ReplicaSet:
kubectl delete deploy -n kloud-hosting kloud-php7
This is because the ReplicaSet always recreates the pods to match the replica count specified in the Deployment (say 3; Kubernetes always makes sure those 3 pods are up and running), so here we need to delete the ReplicaSet first to get rid of the pods.
kubectl get rs
Then delete the ReplicaSet; this will in turn delete the pods.
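For example (the ReplicaSet name is a placeholder; use the one returned by kubectl get rs, and the namespace from the Deployment above):
$ kubectl delete rs -n kloud-hosting <replicaset-name>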
It could be that DaemonSets need to be deleted.
For example:
$ kubectl get DaemonSets
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
elasticsearch-operator-sysctl 5 5 5 5 5 <none> 6d
$ kubectl delete daemonsets elasticsearch-operator-sysctl
Now running kubectl get pods should not list elasticsearch* pods.