How does Kubernetes help in reducing the cost of hosting?

I am trying to understand hosting and scaling. Say I have a website with heavy traffic on weekends that needs at least 2 VPS instances to handle the load.
We could do either of two things:
Simply upgrade to a larger VPS plan and forget about it, which is inefficient and also more expensive.
Create 2 VPS instances and set up a load balancer to distribute the traffic between them, just like Kubernetes does.
So how is Kubernetes helpful if we are still paying for the second VPS?
Can Kubernetes spin up a full VPS before deploying new pods on it?

You can use the Cluster Autoscaler for your Kubernetes cluster, which adds or removes nodes on demand.
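To make the cost argument concrete, here is a rough back-of-the-envelope sketch in Python. The hourly price and the 48-hour weekend window are made-up assumptions for illustration, not real provider prices, and the sketch ignores the control plane, the load balancer, and billing granularity.

```python
# Hypothetical numbers for illustration only; real prices differ by provider.
HOURS_PER_WEEK = 7 * 24
WEEKEND_HOURS = 2 * 24        # assumed: the second node is only needed on weekends
PRICE_PER_NODE_HOUR = 0.05    # assumed hourly price of one node/VPS


def weekly_cost_static(nodes: int) -> float:
    """Cost of keeping a fixed number of nodes running all week."""
    return nodes * HOURS_PER_WEEK * PRICE_PER_NODE_HOUR


def weekly_cost_autoscaled(base_nodes: int, extra_nodes: int, peak_hours: int) -> float:
    """Cost when the extra nodes exist only during the peak window."""
    base = base_nodes * HOURS_PER_WEEK * PRICE_PER_NODE_HOUR
    extra = extra_nodes * peak_hours * PRICE_PER_NODE_HOUR
    return base + extra


static = weekly_cost_static(2)
autoscaled = weekly_cost_autoscaled(1, 1, WEEKEND_HOURS)
print(f"always 2 nodes:          {static:.2f}")
print(f"1 node + 1 weekend node: {autoscaled:.2f}")
```

Under these assumptions, the saving comes entirely from not paying for the second node during the five weekdays when it is idle. That only works where nodes can be created and released on demand.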

Kubernetes can run virtually anywhere - on bare metal as well as in a private or public cloud.
However, where you choose to run Kubernetes determines the scalability of your Kubernetes cluster.
Deploying Kubernetes on VPS servers requires more effort on your side, and the cluster is less scalable compared to managed Kubernetes services such as GKE, EKS, and AKS.
In general, the Cluster Autoscaler is available primarily for managed Kubernetes services (see: Supported cloud providers).
Cluster Autoscaler:
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
there are pods that failed to run in the cluster due to insufficient resources.
there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
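The two conditions above can be read as a simple decision rule. The following is a toy sketch, not the actual Cluster Autoscaler code: the `Node` type, the `decide` function, and the 0.5 utilization threshold are assumptions made for illustration. The real autoscaler also checks how long a node has been underutilized and does proper per-pod placement instead of summing free capacity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    cpu_allocatable: float  # cores the node can hand out to pods
    cpu_requested: float    # sum of CPU requests of the pods on the node


def decide(pending_cpu_requests: List[float], nodes: List[Node],
           low_utilization: float = 0.5) -> str:
    """Toy version of the two Cluster Autoscaler conditions."""
    # Condition 1: a pending pod fits on no existing node -> add a node.
    for request in pending_cpu_requests:
        if all(n.cpu_allocatable - n.cpu_requested < request for n in nodes):
            return "scale up"

    # Condition 2: a node is mostly idle (measured by requests) and its pods
    # would fit into the free room of the other nodes -> remove it.
    # Simplification: free room is summed, real bin packing is ignored.
    for node in nodes:
        others = [n for n in nodes if n is not node]
        free_elsewhere = sum(n.cpu_allocatable - n.cpu_requested for n in others)
        underused = node.cpu_requested / node.cpu_allocatable < low_utilization
        if others and underused and node.cpu_requested <= free_elsewhere:
            return "scale down"

    return "no change"


busy = [Node("a", 2.0, 1.8), Node("b", 2.0, 1.9)]
quiet = [Node("a", 2.0, 1.0), Node("b", 2.0, 0.2)]
print(decide([0.5], busy))   # a 0.5-core pod fits on neither node
print(decide([], quiet))     # node "b" is nearly empty and "a" has room
```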
For VPS, you can still use the Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA) to optimize the resource utilization of your application.
Horizontal Pod Autoscaler:
The Horizontal Pod Autoscaler automatically scales the number of Pods in a replication controller, deployment, replica set or stateful set based on observed CPU utilization (or, with custom metrics support, on some other application-provided metrics).
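As a rough illustration of how the HPA arrives at a replica count, the Kubernetes documentation describes the core calculation as desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that formula in Python. The function name, the min/max bounds, and the sample numbers are assumptions for illustration. The 10% tolerance mirrors the documented default, but the real controller also handles stabilization windows, missing metrics, and pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, min_replicas: int = 1,
                     max_replicas: int = 10, tolerance: float = 0.1) -> int:
    """Simplified HPA replica calculation."""
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, wanted))


# 2 pods averaging 90% CPU against a 50% target.
print(desired_replicas(2, current_metric=90, target_metric=50))
# 4 pods averaging 20% CPU against a 50% target.
print(desired_replicas(4, current_metric=20, target_metric=50))
```

On a fixed set of VPS nodes this still helps: the pods scale with load, and only the node count stays constant.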
Vertical Pod Autoscaler:
The Vertical Pod Autoscaler automatically adjusts the amount of CPU and memory requested by pods running in the Kubernetes cluster.

Related

What is the Best Way to Scale an external (non EKS) EC2 Auto Scaling Group from Inside a Kubernetes Cluster Based on Prometheus Metrics?

I am currently autoscaling an HPA via internal Prometheus metrics which then filters down to scale the cluster via the AWS Cluster Autoscaler. That HPA is tied to an external service run on bare EC2 instances. I would like to use the same metrics that I use to scale that HPA to also scale the ASG behind that service that is external to the Kubernetes cluster.
What is the best way to do this? It is preferable that the external EC2 cluster does not have network access to the EKS cluster.
I was thinking about just writing a small service that does it via the AWS API based on polling Prometheus intermittently but I figured that there must be a better way.

Does GKE Autopilot autoscale both pods and nodes?

When I change the replicas: x value in my .yaml file, I can see GKE Autopilot boot pods up or down to match it. But what happens if the load on my deployment gets too big? Will it autoscale the number of pods and nodes to handle the traffic, and then scale back to the value specified in replicas once the request load drops again?
I'm basically asking how horizontal autoscaling works in Autopilot,
and how I can get a minimum of 2 pod replicas that can horizontally autoscale in Autopilot.
By default, GKE Autopilot will not scale the replica count beyond what you specified. This is the default behavior of Kubernetes in general.
If you want automatic scaling, you have to use the Horizontal Pod Autoscaler (HPA), which is supported in Autopilot.
If you deploy an HPA to scale your workload up and down, Autopilot scales the nodes up and down automatically. That is transparent to you, as the nodes are managed by Google.
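For the "minimum of 2 replicas" part of the question, a minimal sketch of such an HPA could look like this. It builds an `autoscaling/v2` HorizontalPodAutoscaler manifest as a Python dict and prints it as JSON. The deployment name `my-app`, the maximum of 10 replicas, and the 60% CPU target are placeholders, not values from the question.

```python
import json


def hpa_manifest(deployment: str, min_replicas: int, max_replicas: int,
                 cpu_percent: int) -> dict:
    """Build an autoscaling/v2 HPA manifest targeting a Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,   # never fewer pods than this
            "maxReplicas": max_replicas,   # never more pods than this
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization",
                               "averageUtilization": cpu_percent},
                },
            }],
        },
    }


# Placeholder values; adjust to your own deployment.
manifest = hpa_manifest("my-app", min_replicas=2, max_replicas=10, cpu_percent=60)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`, which accepts JSON as well as YAML. CPU utilization targets are computed relative to the pods' CPU requests, so the deployment's containers need CPU requests set for this metric to work.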
GKE autoscales only the nodes by default; you have to take care of scaling your deployment with HPA yourself.
Autopilot: GKE provisions and manages the cluster's underlying
infrastructure, including nodes and node pools, giving you an
optimized cluster with a hands-off experience.
We need to configure both scaling options for the deployment: VPA and HPA.
Pre-configured: Autopilot handles all the scaling and configuring of
your nodes.
Default: You configure Horizontal Pod Autoscaling (HPA). You configure
Vertical Pod Autoscaling (VPA).
GKE will manage scaling the nodes in your node pools up and down. You do not have to worry about the infrastructure; you just deploy the application with HPA and VPA autoscaling.
You can read more about the options here: https://cloud.google.com/kubernetes-engine/docs/concepts/autopilot-overview#comparison

Azure Kubernetes Service - can the Cluster Autoscaler get triggered even if I don't set autoscaling explicitly?

I am deploying a service to Azure Kubernetes Service.
The Horizontal Pod Autoscaler scales the number of pods, whereas the Cluster Autoscaler scales the number of nodes based on the number of pending pods. If my understanding is correct, if I don't set up autoscaling in my deployment file, the HPA won't get triggered, and only one pod will run; therefore, the CA won't get triggered either.
My question is - is there a scenario in AKS where the CA would get triggered, even without setting autoscaling in my deployment file?
Cluster autoscaler is typically used together with the horizontal pod autoscaler. The Horizontal Pod Autoscaler increases or decreases the number of pods based on application demand, and the cluster autoscaler adjusts the number of nodes as needed to run those additional pods accordingly.
If your deployment cannot scale up or down automatically via the HPA, and you do not manually increase the number of pods to the point where additional pods cannot run because the nodes lack resources, then the CA will not be triggered. So the answer is no.
You might also find this document from the official Azure docs helpful.

Azure Kubernetes Cluster Autoscaler - set memory threshold for scaling out nodes

In my single-node AKS cluster, I deploy multiple Job resources (kind: Job) that terminate after their task is completed. I have enabled the Cluster Autoscaler to add a second node when too many jobs consume the first node's memory. However, it only scales out after a job's pod cannot be created due to lack of memory.
In my Job YAML I have also defined the memory request and limit.
Is it possible to configure the Cluster Autoscaler to scale out proactively when the node reaches a certain memory threshold (e.g., 70% of the node memory), not just when it cannot deploy a job/pod?
Kubernetes has three autoscaling mechanisms: the Horizontal Pod Autoscaler and the Vertical Pod Autoscaler, both of which are driven by usage metrics, and the Cluster Autoscaler.
As per Cluster Autoscaler Documentation:
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
there are pods that failed to run in the cluster due to insufficient resources.
there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
The AKS Cluster Autoscaler documentation notes that the CA is a Kubernetes component, not something AKS-specific:
The cluster autoscaler is a Kubernetes component. Although the AKS cluster uses a virtual machine scale set for the nodes, don't manually enable or edit settings for scale set autoscale in the Azure portal or using the Azure CLI. Let the Kubernetes cluster autoscaler manage the required scale settings.
The Azure documentation (About the cluster autoscaler) states that AKS clusters can scale in one of two ways:
The cluster autoscaler watches for pods that can't be scheduled on nodes because of resource constraints. The cluster then automatically increases the number of nodes.
The horizontal pod autoscaler uses the Metrics Server in a Kubernetes cluster to monitor the resource demand of pods. If an application needs more resources, the number of pods is automatically increased to meet the demand.
On AKS you can tune the autoscaler profile to change some of the default values. More detail can be found in Using the autoscaler profile.
I would suggest reading the Understanding Kubernetes Cluster Autoscaling article, which explains how the CA works. Its Limitations section says:
The cluster autoscaler doesn’t take into account actual CPU/GPU/Memory usage, just resource requests and limits. Most teams overprovision at the pod level, so in practice we see aggressive upscaling and conservative downscaling.
Conclusion
The Cluster Autoscaler doesn't consider actual resource usage, so it cannot scale out at a usage threshold. A CA scale-up or scale-down might also take a few minutes, depending on the cloud provider.
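The difference between requests and actual usage can be shown with a small sketch. This is a simplified model, not scheduler or autoscaler code; the function name, the 4096 MiB node, and the request sizes are hypothetical.

```python
def fits_by_requests(node_allocatable_mib: int, requested_mib: int,
                     new_pod_request_mib: int) -> bool:
    """Placement is decided on memory requests, not on measured usage."""
    return requested_mib + new_pod_request_mib <= node_allocatable_mib


# Hypothetical single node with 4096 MiB of allocatable memory.
ALLOCATABLE = 4096

# Case A: actual usage may be high (say 90%), but requests are low.
# The new job still fits by requests, so nothing goes pending
# and no node is added.
case_a = fits_by_requests(ALLOCATABLE, requested_mib=1024, new_pod_request_mib=512)

# Case B: actual usage may be low, but requests already reserve most
# of the node. The new job does not fit, its pod stays pending,
# and that pending pod is what triggers a scale-up.
case_b = fits_by_requests(ALLOCATABLE, requested_mib=3840, new_pod_request_mib=512)

print(case_a, case_b)  # True False
```

In this model, memory requests that reflect what the jobs really need are what make the second node appear at the right time.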

In GCP Kubernetes (GKE), how do I assign a stateless pod created by a deployment to a provisioned VM?

I have several operational deployments on minikube locally and am trying to deploy them on GCP with Kubernetes.
When I describe a pod created by a deployment (which created a replica set that spawned the pod):
kubectl get po redis-sentinel-2953931510-0ngjx -o yaml
It indicates it landed on one of the Kubernetes VMs.
I'm having trouble with deployments that work separately failing due to lack of resources, e.g. CPU, even though I provisioned a VM above the requirements. I suspect the cluster is placing the pods on its own nodes and running out of resources.
How should I proceed?
Do I introduce a VM to be orchestrated by Kubernetes?
Do I enlarge the Kubernetes nodes?
Or something else altogether?
It was a resource problem: the node pool size was inhibiting the deployments. I was mistaken in trying to provide Google Compute instances and disks.
I ended up provisioning Kubernetes node pools with more CPU and disk space, which solved it. I also added elasticity by enabling autoscaling.
Here is the node pool documentation.
Here is a Terraform Kubernetes deployment.
Here is the machine type documentation.