Will setting the minimum number of replicas higher than 1 (say, starting with 2) in one of the snippets below increase resiliency and availability in production environments?
What are the best practices when setting this value?
I'm new to Kubernetes and was not able to find an answer to this question.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
  namespace: test
  labels:
    app: test
spec:
  replicas: 1
  ...
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: test-hpa
  namespace: test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: test
  minReplicas: 1
The short answer is 'yes', but there are many cases where the answer would be 'no'. For instance, if all replicas of the pod are scheduled on the same node and that node dies, there will be some downtime before the pods start up again on an available node.
You can configure topologySpreadConstraints with topologyKey: "kubernetes.io/hostname", as documented at https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/, to raise your resilience level.
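A minimal sketch of such a constraint in the Deployment's pod template, reusing the app: test label from the snippet above (the maxSkew and whenUnsatisfiable values are illustrative choices, not requirements):
spec:
  topologySpreadConstraints:
    - maxSkew: 1                          # max allowed difference in replica count between nodes
      topologyKey: kubernetes.io/hostname # treat each node as a separate topology domain
      whenUnsatisfiable: DoNotSchedule    # hard constraint; ScheduleAnyway would make it a soft preference
      labelSelector:
        matchLabels:
          app: test
With this in place and replicas: 2, the scheduler spreads the two pods across two nodes, so losing a single node no longer takes the whole workload down.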
Related
Can we set min and max limits for Deployments at the deployment level, not at the cluster or ReplicaSet level, in Kubernetes?
At the Deployment level it is not possible, but there is an option to do this indirectly. You should use a HorizontalPodAutoscaler (HPA for short):
HPA automatically updates a workload resource (such as a Deployment or
StatefulSet), with the aim of automatically scaling the workload to
match demand.
Example code for HPA:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
More information can be found in the Kubernetes documentation.
At the Deployment level, only the replicas attribute is available. When you define an HPA, there are options for min and max replicas.
As mentioned in this answer, Deployments allow for easy updating of a ReplicaSet as well as the ability to roll back to a previous deployment.
So kind: Deployment scales ReplicaSets, which scale Pods, and supports zero-downtime updates by creating and destroying ReplicaSets.
What is the purpose of the HorizontalPodAutoscaler resource type?
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: xyz
spec:
  maxReplicas: 4
  minReplicas: 2
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: xyz
  targetCPUUtilizationPercentage: 70
As you write, with a Deployment it is easy to scale an app horizontally by hand, by changing the number of replicas.
By using a HorizontalPodAutoscaler, you can automate the horizontal scaling by e.g. configuring metric thresholds, hence the name autoscaler.
I have a KEDA-enabled, queue-triggered Azure Function running in a Kubernetes cluster. When I describe the HPA created by KEDA, I am unable to understand the metric values.
In the following image, what does "7309m" represent? I came to the conclusion that "1" is the queueLength parameter I supplied in the ScaledObject.yaml file.
The official documentation shows an example based on the percentage of resources utilized by the system, as follows:
ScaledObject.yaml file
apiVersion: v1
kind: Secret
metadata:
  name: queue-connection-secret
data:
  connection-string: ####
---
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: azure-queue-auth
spec:
  secretTargetRef:
    - parameter: connection
      name: queue-connection-secret
      key: connection-string
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queuetrigfuncscaledobject
spec:
  scaleTargetRef:
    name: queuetrigfuncdeployment
  minReplicaCount: 0
  maxReplicaCount: 120
  pollingInterval: 1
  cooldownPeriod: 900
  triggers:
    - type: azure-queue
      metadata:
        queueName: k8s-poc-queue
        queueLength: "1"
      authenticationRef:
        name: azure-queue-auth
The queueLength specifies the target number of messages you want each pod in k8s to handle. Simply put, it is the per-pod average you want to achieve while scaling the number of pods. Specifying it as 1 will make the HPA scale the pods to the number of messages in the queue, which I assume isn't what you want to achieve. As for the displayed value: Kubernetes prints fractional metric values in milli-units, so "7309m" means an average of about 7.309 messages per pod. There is an official doc available here: https://keda.sh/docs/2.4/scalers/azure-storage-queue/
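For instance, a sketch of the same trigger with a higher target (the value 5 is purely illustrative, not a recommendation):
triggers:
  - type: azure-queue
    metadata:
      queueName: k8s-poc-queue
      queueLength: "5"   # target ~5 messages per pod; KEDA drives replicas toward queue depth / 5
    authenticationRef:
      name: azure-queue-auth
With queueLength: "5" and 50 messages waiting in the queue, the HPA would scale toward roughly 10 replicas instead of 50.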
I am fairly new to the Kubernetes engine and I have a use case that I can't seem to get working. I want each pod to run on its own dedicated node and then autoscale the cluster.
For now I have tried using a DaemonSet to run each pod, and I have created a HorizontalPodAutoscaler targeting the node pool.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: test
spec:
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      containers:
        - name: actions
          image: image_link
      nodeSelector:
        cloud.google.com/gke-nodepool: test
  updateStrategy:
    type: RollingUpdate
---
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: test
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: DaemonSet
    name: test
  minReplicas: 1
  maxReplicas: 2
  metrics:
    - type: Resource
      resource:
        name: cpu
        targetAverageUtilization: 80
I then use the stress utility to test the autoscaling process, but the number of nodes stays constant. Is there something I am missing here? Is there another component I can use for my use case?
HorizontalPodAutoscaler is used to scale pods depending on metric thresholds. It is not applicable to a DaemonSet.
A DaemonSet deploys one pod on each node in the cluster. If you want to scale a DaemonSet, you need to scale your node pool.
HorizontalPodAutoscaler is best used to autoscale Deployment objects. In your case, change the DaemonSet object to a Deployment object, or scale out the node pool. Autoscaling of nodes is supported on Google Cloud Platform; I am not sure about other cloud providers, so you need to check your cloud provider's documentation.
A DaemonSet is a controller that deploys a Pod on each node matching the selector expression; you can't have more than one such Pod running on each node. You should look at another controller. I can't tell what kind of app you want to deploy, so I would suggest:
Deployment: if you have a stateless application that can scale up and down without consistency requirements between the replicas (see the sketch after this list)
StatefulSet: if you have a stateful application that needs some care when scaling, as well as data consistency
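A minimal sketch of the DaemonSet converted to a Deployment, reusing the names and the image_link placeholder from the question; the podAntiAffinity block is one possible way (an assumption, not the only option) to keep at most one Pod per node:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 1            # now scalable by the HPA
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: test
              topologyKey: kubernetes.io/hostname   # schedule at most one of these Pods per node
      containers:
        - name: actions
          image: image_link
      nodeSelector:
        cloud.google.com/gke-nodepool: test
Combined with the cluster autoscaler on the node pool, scaling the Deployment up forces new nodes to be created, since each extra replica needs a node of its own.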
One important thing to note about the HPA: you must have metrics enabled (e.g. the metrics server); otherwise the reconciliation loop cannot observe the metrics it needs to decide on scaling actions.
Just finished reading Nigel Poulton's The Kubernetes Book, but I am somewhat puzzled by Services.
Could a Service be added to the Deployment manifest below somehow? Or does the Service have to be POSTed on its own? Isn't the whole purpose of a Deployment to specify everything needed for the app to run?
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: hello-deploy
spec:
  replicas: 10
  selector:
    matchLabels:
      app: hello-world
  minReadySeconds: 10
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
        - name: hello-pod
          image: nigelpoulton/k8sbook:latest
          ports:
            - containerPort: 8080
They're different objects and you have to submit them separately (HTTP POST, kubectl apply, ...).
There are a couple of tricks you can do to minimize the impact of this:
You can use a multi-document YAML file and submit that as a single thing, like
---
apiVersion: apps/v1
kind: Deployment
...
---
apiVersion: v1
kind: Service
...
There is an undocumented kind: List that can embed multiple objects:
apiVersion: v1
kind: List
items:
  - apiVersion: apps/v1
    kind: Deployment
    ...
  - apiVersion: v1
    kind: Service
    ...
You can use a higher-level deployment manager such as Helm that lets you keep each object in a separate file, but deploy them in a single command.
It's perhaps unfortunate that a couple of Kubernetes objects have names that are different from their plain English meanings (a Deployment doesn't cover all of the steps or parts of deploying a whole application; a Service is just an IP/DNS pointer and not a service implementation) but that's the way it is. I tend to capitalize the Kubernetes object names when it will disambiguate things.
Isn't the whole purpose of a deployment to specify everything needed for the app to run?
The whole purpose of a "Deployment" is to manage the deployment of Pods/ReplicaSets, including replication, scaling, rolling updates, and rollbacks. The DeploymentController is part of the master node's controller manager, and it makes sure that the current state always matches the desired state.
does the Service have to be POSTed on its own?
If you are familiar with load balancer terminology, Services are frontends and Pods are their backends. As the frontend, a Service forwards requests to its backend (the Pods).
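To make that concrete, a minimal sketch of a Service that would match the Deployment above; the name hello-svc and the external port 80 are illustrative assumptions, while the selector and targetPort come from the question's manifest:
apiVersion: v1
kind: Service
metadata:
  name: hello-svc         # hypothetical name, not from the question
spec:
  selector:
    app: hello-world      # matches the pod template labels in the Deployment
  ports:
    - port: 80            # port the Service itself exposes (illustrative choice)
      targetPort: 8080    # the containerPort from the Deployment
This is the "frontend" piece: the Service picks up any Pods labeled app: hello-world, including all ten replicas the Deployment manages, and load-balances traffic across them.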