Will setting the minimum number of replicas to more than 1 (say, starting with 2) in one of the code snippets below increase resiliency and availability in production environments?
What are the best practices when setting this value?
I'm new to Kubernetes, but I was not able to find an answer to this question.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:
  replicas: 1
  ...
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  minReplicas: 1
The short answer is 'yes', but there are many cases where the answer would be 'no'. For instance, if all instances of the pod are scheduled on the same node and that node dies, there will be some downtime before the pods start up again on an available node.
You can configure topologySpreadConstraints with topologyKey: kubernetes.io/hostname, as documented at https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/, to raise your resilience level.
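For example, a minimal sketch of such a constraint inside the Deployment's pod template (the app: test label comes from the snippet above; the values are illustrative):

spec:
  replicas: 2
  template:
    spec:
      topologySpreadConstraints:
        - maxSkew: 1                          # allow at most 1 replica of imbalance between nodes
          topologyKey: kubernetes.io/hostname # spread across individual nodes
          whenUnsatisfiable: DoNotSchedule    # refuse to schedule rather than co-locate
          labelSelector:
            matchLabels:
              app: test

With this in place, losing a single node can no longer take down both replicas at once.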
Both the ReplicaSet and the Deployment have the attribute replicas: 3. What's the difference between a Deployment and a ReplicaSet? Does a Deployment work via a ReplicaSet under the hood?
Configuration of the Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    my-label: my-value
spec:
  replicas: 3
  selector:
    matchLabels:
      my-label: my-value
  template:
    metadata:
      labels:
        my-label: my-value
    spec:
      containers:
        - name: app-container
          image: my-image:latest
Configuration of the ReplicaSet:
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: my-replicaset
  labels:
    my-label: my-value
spec:
  replicas: 3
  selector:
    matchLabels:
      my-label: my-value
  template:
    metadata:
      labels:
        my-label: my-value
    spec:
      containers:
        - name: app-container
          image: my-image:latest
Kubernetes Documentation
When to use a ReplicaSet
A ReplicaSet ensures that a specified number of pod replicas are running at any given time. However, Deployment is a higher-level concept that manages ReplicaSets and provides declarative updates to Pods along with a lot of other useful features. Therefore, we recommend using Deployments instead of directly using ReplicaSets, unless you require custom update orchestration or don't require updates at all.
This actually means that you may never need to manipulate ReplicaSet objects: use a Deployment instead, and define your application in the spec section.
A Deployment resource makes it easier to update your pods to a newer version.
Let's say you use ReplicaSet-A to control your pods and you want to update the pods to a newer version. You would create ReplicaSet-B, then scale down ReplicaSet-A and scale up ReplicaSet-B one step at a time, repeatedly (this process is known as a rolling update). Although this does the job, it's not good practice; it's better to let Kubernetes do it.
A Deployment resource does this automatically, without any human interaction, and raises the abstraction by one level.
Note: a Deployment doesn't interact with pods directly; it performs rolling updates via ReplicaSets.
A ReplicaSet ensures that a specified number of Pods is running in a cluster. The Pods are called replicas and are the mechanism of availability in Kubernetes.
But changing the ReplicaSet's pod template will not take effect on existing Pods, so it is not possible to easily change, for example, the image version.
A Deployment is a higher abstraction that manages one or more ReplicaSets to provide a controlled rollout of a new version. When the image version is changed in the Deployment, a new ReplicaSet for this version is created with initially zero replicas. It is then scaled to one replica; once that replica is running, the old ReplicaSet is scaled down. (The number of newly created pods, the step size so to speak, can be tuned; see the sketch below.)
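For reference, that tuning lives in the Deployment's update strategy; a minimal sketch with illustrative values:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1       # create at most 1 pod above the desired replica count
      maxUnavailable: 0 # never drop below the desired replica count during the rollout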
As long as you don't have a rollout in progress, a deployment will result in a single replicaset with the replication factor managed by the deployment.
I would recommend always using a Deployment and not a bare ReplicaSet.
One of the differences between Deployments and ReplicaSets is that changes made to the container template are not reflected in existing Pods once the ReplicaSet is created.
For example:
This is my replicaset.yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 5 # tells the ReplicaSet to run 5 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.13.2
          ports:
            - containerPort: 80
I will apply this ReplicaSet using these commands:
kubectl apply -f replicaset.yaml
kubectl get pods
kubectl describe pod <<name_of_pod>>
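Alternatively, to pull out just the image version instead of reading the whole description, a one-liner (using the app: nginx label from the manifest above):

kubectl get pods -l app=nginx -o jsonpath='{.items[*].spec.containers[*].image}'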
From the pod description we can observe that nginx is using 1.13.2. Now let's change the image version to 1.14.2 in replicaset.yaml.
Apply the changes again:
kubectl apply -f replicaset.yaml
kubectl get pods
kubectl describe pod <<name_of_pod>>
Now we don't see any changes in the Pods; they are still using the old image.
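If you do want the existing Pods to pick up the new template, you have to replace them yourself, for example by deleting them so the ReplicaSet recreates them (using the app: nginx label from the manifest above):

# Delete the old pods; the ReplicaSet recreates them from the updated template
kubectl delete pods -l app=nginx
kubectl get pods   # the replacement pods now run nginx:1.14.2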
Now let us repeat the same with a deployment (deployment.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 5 # tells the Deployment to run 5 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.13.2
          ports:
            - containerPort: 80
I will apply this Deployment using these commands:
kubectl apply -f deployment.yaml
kubectl get pods
kubectl describe pod <<name_of_pod>>
Now change deployment.yaml to use another version of the nginx image:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 5 # tells the Deployment to run 5 pods matching the template
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
I will apply this Deployment again using these commands:
kubectl apply -f deployment.yaml
kubectl get pods
kubectl describe pod <<name_of_pod>>
Now we can see that the Pods have been replaced, and the updated image shows up in the pod description.
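You can also watch the Deployment drive its ReplicaSets during the update with standard kubectl commands:

kubectl rollout status deployment/nginx-deployment    # wait for the rollout to finish
kubectl get replicasets                               # the old ReplicaSet is scaled to 0, the new one to 5
kubectl rollout history deployment/nginx-deployment   # one revision recorded per template change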
TLDR:
Deployment -> manages -> ReplicaSet -> manages -> Pod(s) -> abstraction of -> container (e.g. a Docker container)
A Deployment in Kubernetes is a higher-level abstraction that represents a set of replicas of your application. It ensures that the desired number of replicas of your application is running and available.
A ReplicaSet, on the other hand, is a lower-level resource responsible for maintaining a stable set of replicas of your application; it ensures that a specified number of replicas is running and available.
Example:
Suppose you have a web application that you want to run on 3 replicas for high availability.
You would create a Deployment resource in Kubernetes, specifying the desired number of replicas as 3. The Deployment would then create and manage a ReplicaSet, which would in turn create and manage 3 replicas of your web application Pod.
In summary, a Deployment provides higher-level abstractions for scaling, rolling updates, and rollbacks, while a ReplicaSet provides a lower-level mechanism for ensuring that a specified number of replicas of your application is running.
As mentioned in this answer, Deployments allow for easy updating of a ReplicaSet as well as the ability to roll back to a previous deployment.
So kind: Deployment scales ReplicaSets, which scale Pods, and supports zero-downtime updates by creating and destroying ReplicaSets.
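Rolling back is a single command; for example, with the nginx-deployment from above:

kubectl rollout undo deployment/nginx-deployment                   # back to the previous revision
kubectl rollout undo deployment/nginx-deployment --to-revision=1   # or to a specific revision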
What is the purpose of the HorizontalPodAutoscaler resource type?
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  maxReplicas: 4
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  targetCPUUtilizationPercentage: 70
As you write, with a Deployment it is easy to scale an app horizontally by hand, by changing the number of replicas.
By using a HorizontalPodAutoscaler, you can automate the horizontal scaling, e.g. by configuring metric thresholds; hence the name autoscaler.
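Both variants can also be done from the command line; a sketch using the xyz Deployment from the question:

# Manual horizontal scaling: set the replica count directly
kubectl scale deployment xyz --replicas=4

# Automated horizontal scaling: create an HPA equivalent to the manifest above
kubectl autoscale deployment xyz --min=2 --max=4 --cpu-percent=70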
The documentation of the VPA states that HPA and VPA should not be used together. They can only be used together when the HPA scales on custom metrics.
I have scaling enabled on CPU.
My question is: can I have HPA enabled for one deployment (let's say A) and VPA enabled for another deployment (let's say B)? Or will this also lead to errors?
Using them both at the same time is not suggested, because if they both detect that more memory is needed they will try to resolve the same problem at the same time, which will lead to wrongly allocated resources.
This is not something that can be specified at the application deployment level, but you can specify which Deployment the VPA and the HPA should each scale using their target references (targetRef for the VPA, scaleTargetRef for the HPA).
So for the deployment with app1 you can specify a VPA:
apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: app1-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app1
And for app2 you can specify an HPA:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app2-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app2
If you need to use HPA and VPA together on the same Deployment, you just have to make sure that they base their behavior on different metrics. This way you prevent them from being triggered by the same event. To summarize, VPA and HPA can be used together if the HPA config doesn't use CPU or memory to determine its targets, as stated in the documentation:
"Vertical Pod Autoscaler should not be used with the Horizontal Pod Autoscaler (HPA) on CPU or memory at this moment"
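A sketch of an HPA that stays off CPU and memory by scaling on a Pods-type custom metric (the metric name http_requests_per_second and the target value are illustrative, and a metrics adapter serving that metric is assumed):

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: app1-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app1
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second # illustrative custom metric
        target:
          type: AverageValue
          averageValue: "100"            # illustrative per-pod target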
I am using Kubernetes in GCP. I am scaling my pods using a queue-size metric uploaded to Cloud Monitoring.
The problem:
Kubernetes scales up pods at very short intervals, about 12-15 seconds between each scale-up. My machines take about 30 seconds to boot up, so I would like the scale-up interval to be closer to 30 seconds.
Adding
spec:
  minReadySeconds: 30
to the deployment YAML did not work.
Example HPA:
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: <NAME>
  namespace: <NAMESPACE>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <DEPLOYMENT>
  minReplicas: <MIN_REPLICAS>
  maxReplicas: <MAX_REPLICAS>
  metrics:
    - type: External
      external:
        metricName: "custom.googleapis.com|rabbit_mq|<QUEUE>|messages_count"
        metricSelector:
          matchLabels:
            metric.labels.name: <NAMESPACE>
        targetValue: <TARGETVALUE>
Is there a way to control this scale-up interval?
The delays between scale-ups are determined internally by the HPA algorithm.
From the documentation:
Starting from v1.12, a new algorithmic update removes the need for the upscale delay.
It seems this was a configurable parameter before, but now the algorithm tries to be smart about it and decides on its own how quickly to scale up your app.
To be really sure how the HPA does it and how you could influence it, you can inspect the code.
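That said, on clusters recent enough to support the HPA behavior field (introduced in autoscaling/v2beta2 and part of autoscaling/v2), you can slow scale-ups down declaratively; a sketch, with values chosen to match the roughly 30-second boot time from the question:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: <NAME>
  namespace: <NAMESPACE>
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <DEPLOYMENT>
  minReplicas: <MIN_REPLICAS>
  maxReplicas: <MAX_REPLICAS>
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30 # look back 30s before acting on a higher recommendation
      policies:
        - type: Pods
          value: 1          # add at most 1 pod...
          periodSeconds: 30 # ...per 30-second window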