EKS Cluster Autoscaler with Spot & On-Demand Node groups

I am working with EKS managed node groups using Terraform. I want the Spot node group to be used as the priority, and the On-Demand node group to be used only when Spot capacity fails. Once capacity is available in the Spot node group again, I want workloads to move back to it.
I am able to implement the first scenario, Spot → On-Demand failover.
But I am not sure how to get from On-Demand back to Spot.
I am new to the whole EKS Cluster Autoscaler topic; please help me understand if there is any way to do this.
Thanks.
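
One commonly suggested approach for the "prefer Spot on scale-up" part is the Cluster Autoscaler priority expander: run the autoscaler with --expander=priority and give the Spot node group a higher priority than the On-Demand one. A minimal sketch, assuming the Spot and On-Demand Auto Scaling group names contain "spot" and "on-demand" respectively:

```yaml
# Sketch of the Cluster Autoscaler priority expander ConfigMap
# (requires the autoscaler to run with --expander=priority).
# Higher numbers win; the regexes below are assumed ASG name patterns.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    20:
      - .*spot.*
    10:
      - .*on-demand.*
```

Note that the expander only influences which group gets scaled up; moving back onto Spot relies on the autoscaler later scaling down the underutilized On-Demand nodes once Spot capacity can hold their pods.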

Related

Tekton on EKS how to work with zones when using volumeClaim?

Update 2022-03-22: I could isolate the problem to the Cluster Autoscaler and not enough pod "slots" left on a node. Still no solution. For a detailed analysis see https://github.com/tektoncd/pipeline/issues/4699
I have an EKS cluster running with the aws-ebs controller. Now I want to use Tekton on this cluster. Tekton has an affinity assistant which should schedule pods on the same node if they share a workspace (aka volumeClaim). Sadly, this does not seem to work for me, as I randomly get an error from my node stating didn't match pod affinity rules and didn't find available persistent volume to bind even though a volume exists. After debugging, I found that the persistentVolumes created are, from time to time, in a different region and on another host than the pod that is spawned.
Does somebody know how to still use "automatic" aws-ebs provisioning with Tekton on EKS, or something similar that makes this work? My fallback solution would be to try S3 as storage, but I assume this may not be the best solution as I have many small files from a git repository. Just provisioning a volume and then running pods only on that one node is not the solution I would opt for, even though it is better than nothing :)
Any help would be appreciated! If more information is needed, please add a comment and I will try to follow up.
Thanks a lot!
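
One thing worth checking for the cross-zone volumes described above is the StorageClass binding mode: with volumeBindingMode: WaitForFirstConsumer the EBS volume is only provisioned after the pod has been scheduled, so it ends up in the same availability zone. A minimal sketch, assuming the aws-ebs CSI driver is installed; the StorageClass name is a placeholder:

```yaml
# Sketch: delay volume binding until the consuming pod is scheduled, so the
# EBS volume is provisioned in the same availability zone as that pod.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-wait-for-consumer        # placeholder name
provisioner: ebs.csi.aws.com         # assumes the aws-ebs CSI driver
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```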
You get this event:
0/6 nodes are available: 2 Too many pods
Your node is essentially full. When you use Tekton Pipelines with the Affinity Assistant, all pods in the PipelineRun will be scheduled to the same node.
If you want to run Tekton Pipelines with space for just a few pods per node, then you should disable the Affinity Assistant in the config map.
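For reference, disabling it is a one-line change in Tekton's feature-flags ConfigMap; a minimal sketch, assuming the default tekton-pipelines namespace:

```yaml
# Sketch: disable the Affinity Assistant in Tekton Pipelines' feature-flags
# ConfigMap (the namespace may differ depending on how Tekton was installed).
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
  namespace: tekton-pipelines
data:
  disable-affinity-assistant: "true"
```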

Reduce costs in EKS cluster outside working hours

I have an EKS cluster with two worker nodes. I would like to "switch off" the nodes or do something else to reduce the costs of my cluster outside working hours. Is there any way to turn off the nodes at night and turn them on again in the morning?
Thanks a lot.
This is a very common concern for anyone using a managed K8s cluster. People take different approaches to this, but what works best for us is a combination of kube-downscaler and cluster-autoscaler.
kube-downscaler helps you scale down / "pause" Kubernetes workloads (Deployments, StatefulSets, HorizontalPodAutoscalers, and even CronJobs!) during non-work hours.
cluster-autoscaler is a tool that automatically:
Scales down the size of the Kubernetes cluster when there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes.
Scales up the size of the Kubernetes cluster when there are pods that failed to run in the cluster due to insufficient resources.
So essentially, during the night, when kube-downscaler scales down the pods and other objects, cluster-autoscaler notices the underutilized nodes and removes them once their pods can be placed on the remaining nodes. In the morning it does the opposite.
Of course, some fine-tuning of the two tools' configuration may be needed to make this work best for you.
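As an example, kube-downscaler can be driven per workload with an annotation; a minimal sketch, assuming a Deployment that should only run during office hours (name, schedule, and image are placeholders):

```yaml
# Sketch: a Deployment annotated for kube-downscaler. Outside the given uptime
# window it is scaled down to zero (or to the configured downtime replicas).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                                        # placeholder workload
  annotations:
    downscaler/uptime: Mon-Fri 08:00-19:00 Europe/Berlin
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25                           # placeholder image
```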
Unrelated to your specific question but, if you are in "savings" mode you may want to have a look at EC2 Spot Instances for EKS assuming you can operate within their boundaries. See here for the details.

Off-loading k8s deployments to a different cluster in case of high load

Since I am unable to find anything on Google or in the official docs, I have a question.
I have a local minikube cluster with deployment, service and ingress, which is working fine. Now when the load on my local cluster becomes too high I want to automatically switch to a remote cluster.
Is this possible?
How would I achieve this?
Thank you in advance
EDIT:
A remote cluster in my case would be a Rancher Kubernetes cluster, but as long as the resources on my local one are sufficient I want to stay there.
So let's say my local cluster has enough resources to run two replicas of my application; when a third one is needed to distribute the load, it should be deployed to the remote Rancher cluster. (I hope that is clearer now.)
I imagine it would be doable with kubefed (https://github.com/kubernetes-sigs/kubefed) using the ReplicaSchedulingPreferences (https://github.com/kubernetes-sigs/kubefed/blob/master/docs/userguide.md#replicaschedulingpreference): weight the local cluster very high and the remote one very low, and set spec.rebalance to true to redistribute replicas in case of high load. But that approach seems a bit like a workaround.
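For illustration, the weighting idea above could look roughly like this as a ReplicaSchedulingPreference; cluster names, namespace, and replica counts are placeholders:

```yaml
# Sketch: a kubefed ReplicaSchedulingPreference that keeps up to two replicas
# on the local cluster and only spills extra replicas over to the remote one.
apiVersion: scheduling.kubefed.io/v1alpha1
kind: ReplicaSchedulingPreference
metadata:
  name: my-app                 # must match the FederatedDeployment name
  namespace: my-namespace      # placeholder namespace
spec:
  targetKind: FederatedDeployment
  totalReplicas: 3
  rebalance: true
  clusters:
    local-cluster:             # placeholder names as registered in kubefed
      weight: 100
      maxReplicas: 2
    remote-rancher-cluster:
      weight: 1
```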
Your idea of using Kubefed sounds good, but there is another option: Multicluster-Scheduler.
Multicluster-scheduler is a system of Kubernetes controllers that
intelligently schedules workloads across clusters. It is simple to use
and simple to integrate with other tools.
To be able to make a better choice for your use case you can read through the Comparison with Kubefed (Federation v2).
All the necessary info can be found in the provided GitHub thread.
Please let me know if that helped.

Where is KOPS located/running from?

I am new to Docker and Kubernetes, though I have mostly figured out how it all works at this point.
I inherited an app that uses both, as well as KOPS.
One of the last things I am having trouble with is the KOPS setup. I know for certain that Kubernetes is set up via KOPS. There are two KOPS state stores in an S3 bucket (corresponding to a dev and a prod cluster respectively).
However, while I can find the server that kubectl/kubernetes is running on, absolutely none of the servers I have access to seem to have a kops command.
Am I misunderstanding how KOPS works? Does it not do some sort of dynamic monitoring (would that just be done by a ReplicaSet by itself?), but rather just set a cluster running and be done?
I can include my cluster.spec or config files, if they're helpful to anyone, but I can't really see how they're super relevant to this question.
I guess I'm just confused - as far as I can tell from my perspective, it looks like KOPS is run once, sets up a cluster, and is done. But then whenever one of my node or master servers goes down, it is self-healing. I would expect that of the node servers, but not the master servers.
This is all on AWS.
Sorry if this is a dumb question, I am just having trouble conceptually understanding what is going on here.
kops is a command line tool, you run it from your own machine (or a jumpbox) and it creates clusters for you, it’s not a long-running server itself. It’s like Terraform if you’re familiar with that, but tailored specifically to spinning up Kubernetes clusters.
kops creates nodes on AWS via autoscaling groups. It’s this construct (which is an AWS thing) that ensures your nodes come back to the desired number.
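For illustration, the desired node count lives in the kops InstanceGroup spec, which kops turns into an Auto Scaling group; a minimal sketch with placeholder values:

```yaml
# Sketch: a kops InstanceGroup. kops materialises this as an AWS Auto Scaling
# group, which is what brings a replacement node back when one goes down.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: my-cluster.example.com   # placeholder cluster name
spec:
  role: Node
  machineType: t3.medium
  minSize: 2
  maxSize: 4
  subnets:
    - eu-west-1a                                  # placeholder zone
```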
kops is used for managing Kubernetes clusters themselves, like creating them, scaling, updating, deleting. kubectl is used for managing container workloads that run on Kubernetes. You can create, scale, update, and delete your replica sets with that. How you run workloads on Kubernetes should have nothing to do with how/what tool you (or some cluster admin) use to manage the Kubernetes cluster itself. That is, unless you’re trying to change the “system components” of Kubernetes, like the Kubernetes API or kubedns, which are cluster-admin-level concerns but happen to run on top of Kubernetes as container workloads.
As for how pods get spun up when nodes go down, that’s what Kubernetes as a container orchestrator strives to do. You declare the desired state you want, and the Kubernetes system makes it so. If things crash or fail or disappear, Kubernetes aims to reconcile this difference between actual state and desired state, and schedules desired container workloads to run on available nodes to bring the actual state of the world back in line with your desired state. At a lower level, AWS does similar things — it creates VMs and keeps them running. If Amazon needs to take down a host for maintenance it will figure out how to run your VM (and attach volumes, etc.) elsewhere automatically.

GKE autoscaling doesn't scale

I am setting up a Kubernetes cluster on Google using the Google Kubernetes Engine. I have created the cluster with auto-scaling enabled on my nodepool.
As far as I understand this should be enough for the cluster to spin up extra nodes if needed.
But when I run some load on my cluster, the HPA is activated and wants to spin up some extra instances but can't deploy them due to 'insufficient cpu'. At this point I expected the auto-scaling of the cluster to kick into action, but it doesn't seem to scale up. I did, however, see that the node the auto-scaler wants to create can't be created, with the following message: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region europe-west1.
I also didn't touch the auto-scaling on the instance group, so when running gcloud compute instance-groups managed list, it shows as 'autoscaled: no'
So any help getting this autoscaling to work would be appreciated.
TL;DR I guess the reason it isn't working is: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region europe-west1, but I don't know how I can fix it.
You really have debugged it yourself already. You need to edit the quotas in the GCP Console. Make sure you select the correct project, and increase all the quotas that are low: probably in-use addresses and CPUs in the region. This process is only semi-automated, so you might need to wait a bit and possibly pay a deposit.