How to Scale up and Scale down cluster instances in AWS ECS

We have an application that creates/starts/stops containers inside AWS ECS. We are not using ECS services because we don't want a container to be restarted if it is stopped by the application.
So how can we automate scale-in/scale-out of the cluster instances in ECS without using ECS services?

The documentation below explains step by step how to scale your container instances:
Scaling Container Instances
Here is how this works:
Say you have one container instance and 2 services running on it.
You need to scale an ECS service up, but it cannot scale because there are no resources left on the single container instance.
Following the documentation, you can set up a CloudWatch alarm on, for example, the MemoryReservation metric for your cluster.
When the memory reservation of your cluster rises above 75% (meaning that only 25% of the memory in your cluster is available for new tasks to reserve), the alarm triggers the Auto Scaling group to add another instance and provide more resources for your tasks and services.
Depending on the Amazon EC2 instance types that you use in your clusters, and the quantity of container instances that you have in a cluster, your tasks have a limited amount of resources that they can use while running. Amazon ECS monitors the resources available in the cluster to work with the schedulers to place tasks. If your cluster runs low on any of these resources, such as memory, you are eventually unable to launch more tasks until you add more container instances, reduce the number of desired tasks in a service, or stop some of the running tasks in your cluster to free up the constrained resource.
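As a rough sketch of that wiring (the cluster name, Auto Scaling group name and thresholds below are placeholders, not values from the question), the scale-out policy and alarm could be created with the AWS CLI along these lines:

# Hypothetical example: add one instance to the ECS cluster's Auto Scaling
# group when the cluster's MemoryReservation stays above 75% for 5 minutes.
# "my-ecs-asg" and "my-ecs-cluster" are placeholder names.
POLICY_ARN=$(aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-ecs-asg \
  --policy-name ecs-memory-scale-out \
  --adjustment-type ChangeInCapacity \
  --scaling-adjustment 1 \
  --query PolicyARN --output text)

aws cloudwatch put-metric-alarm \
  --alarm-name ecs-memory-reservation-high \
  --namespace AWS/ECS \
  --metric-name MemoryReservation \
  --dimensions Name=ClusterName,Value=my-ecs-cluster \
  --statistic Average \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 75 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions "$POLICY_ARN"

A second alarm on a low MemoryReservation threshold, attached to a policy with a negative scaling adjustment, would handle scale-in the same way.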


AWS EKS - Dead container cleanup

I am using Terraform to create infrastructure in an AWS environment. Among many other services, we are also creating an AWS EKS cluster using the terraform-aws-modules/eks/aws module. EKS is primarily used for spinning up dynamic containers that handle asynchronous job execution; once a given task is completed, the container releases its resources and terminates.
What I have noticed is that the dead containers stay on the EKS cluster forever. This leaves too many dead containers sitting on EKS and consuming storage. I came across a few blogs which mention that Kubernetes has a garbage collection process, but none describes how it can be configured using Terraform, or explicitly for AWS EKS.
Hence I am looking for a solution that lets me specify a garbage collection policy for dead containers on AWS EKS. If this is not achievable via Terraform, I am OK with using kubectl with AWS EKS.
These two kubelet flags will cause the node to clean up docker images when the filesystem reaches those percentages. https://kubernetes.io/docs/concepts/architecture/garbage-collection/#container-image-lifecycle
--image-gc-high-threshold="85"
--image-gc-low-threshold="80"
But you also probably want to set --maximum-dead-containers 1 so that running multiple (same) images doesn't leave dead containers around.
In EKS you can add these flags to the UserData section of your EC2 instance/Autoscaling group.
#!/bin/bash
set -o xtrace
# Pass the garbage-collection flags described above to the kubelet
# via --kubelet-extra-args (the apiserver endpoint is omitted here).
/etc/eks/bootstrap.sh --apiserver-endpoint ..... --kubelet-extra-args '--image-gc-high-threshold=85 --image-gc-low-threshold=80 --maximum-dead-containers=1'

Resizing instance groups by schedule

I have a Kubernetes cluster that contains two node pools, and I need to automate resizing the node pools to 0 nodes on weekends to save money.
I know that I can stop Compute Engine instances with a standard schedule.
But I can't stop instances that are members of instance groups; I can only resize the pool to 0. How can I do that on a gcloud schedule?
Cloud Scheduler won't let you resize the node pool directly. You can instead use Cloud Scheduler together with Cloud Functions to call the Container API and resize the node pool. There is an example in the Google public docs that does something like this for a Compute Engine instance; you'll have to convert the function call to use the Container API instead.
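Whatever scheduler ends up triggering it, the operation the function (or any cron job) has to perform is a node-pool resize. For reference, this is roughly the gcloud call it would wrap (cluster, pool and zone names are placeholders):

# Placeholder names: my-cluster, my-pool, us-central1-a
gcloud container clusters resize my-cluster \
  --node-pool my-pool \
  --num-nodes 0 \
  --zone us-central1-a \
  --quiet

Scaling back up on Monday is the same command with the desired node count instead of 0.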
Here are some possible solutions:
Use GKE to manage your cluster, so you can resize the cluster or migrate to a different machine size.
Manage your own Kubernetes cluster that uses a Compute Engine instance group for the nodes; you can then resize the instance group yourself without GKE's help.
If you want automation, you can use Jenkins or Airflow to schedule the resizing jobs.
Hope this helps.

Kubernetes with hybrid containers on one VM?

I have played around a little bit with Docker and Kubernetes and need some advice: is it a good idea to have one Pod on a VM, with all of the components below deployed in multiple (hybrid) containers?
This is our POC plan:
Customers access a public API endpoint (behind an nginx reverse proxy), e.g. abc.xyz.com or def.xyz.com
List of containers that we need:
Identity server, connected to SQL Server
Our API server with Hangfire, connected to SQL Server
The API server that connects to the Redis server
Redis in turn has 3 Hangfire agents, load-balanced (scalable in the future)
Should we set up 1 or 2 VMs?
Is a combination of Windows and Linux containers advisable?
How many Pods per VM? How many containers per Pod?
Should we attach volumes for DB?
Thank you for your help
Cluster size can differ depending on the Kubernetes platform you want to use. For managed solutions like GKE/EKS/AKS you don't need to create a master node, but you have less control over your cluster and you may not be able to use the latest Kubernetes version.
It is safer to have at least 2 worker nodes. (More is better). In case of node failure, pods will be rescheduled on another healthy node.
I'd say Linux containers are more lightweight and have less overhead, but it's up to you to decide what to use.
The number of Pods per VM is determined during the scheduling process by the kube-scheduler and depends on the Pods' requested resources and the amount of resources available on the cluster nodes.
All data inside the running containers of a Pod is lost after a Pod restart/deletion. You can import/restore DB content during Pod startup using init containers (or DB replication), or configure volumes to preserve data between Pod restarts.
You can easily decide which container you need to put in the same Pod if you look at your application set from the perspective of scaling, updating and availability.
If you can benefit from scaling and updating application parts independently, and from having several replicas of crucial parts of your application, it's better to put them in separate Deployments. If the application parts must always run on the same node and it's fine to restart them all at once, you can put them in one Pod.
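To make the resource and volume points above concrete, here is a minimal sketch (all names, the redis:6 image and the sizes are placeholders, and it assumes your cluster has a default StorageClass) of a Deployment that declares resource requests/limits for the scheduler and mounts a PersistentVolumeClaim so data survives Pod restarts:

# Hypothetical example: single-replica Deployment with resource limits and a
# PersistentVolumeClaim-backed volume. All names and sizes are placeholders.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 10Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: db
spec:
  replicas: 1
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
      - name: db
        image: redis:6            # placeholder image
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: db-data
EOF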

ECS auto scaling cluster with EC2 count

To deploy my docker-compose setup, I am using AWS ECS.
Everything works fine except auto scaling.
When creating the ECS cluster, I can choose the number of instances, so I set it to 1.
Next, when creating a service on my cluster, I can also choose the number of tasks. I know that tasks run on the instances, so I set it to 1.
Then I specified an auto scaling policy: if CPU utilization stays above 50% for 5 minutes, a task is automatically added.
After finishing the configuration, I ran a benchmark to test it.
In the service description, the desired task count increased to 2, but no instance was added automatically.
Judging by the event log, the new task can't start, probably because I set the number of instances in my cluster to 1.
Why doesn't auto scaling automatically add a new instance to my cluster? Is there a problem with my configuration?
Thanks.
Your ECS cluster is not autoscaling the number of instances; it autoscales the number of tasks running inside your existing cluster. An EC2 instance can run multiple tasks. To autoscale the instance count, you will need to use CloudWatch alarms:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_alarm_autoscaling.html
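For contrast, the CPU-based policy you configured on the service corresponds to task-level scaling only; as a hedged sketch (the cluster and service names are placeholders), it is roughly equivalent to this Application Auto Scaling setup, which never adds EC2 instances by itself:

# Hypothetical sketch: task-level autoscaling for the ECS service.
# "my-cluster" and "my-service" are placeholder names.
aws application-autoscaling register-scalable-target \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-service \
  --min-capacity 1 --max-capacity 4

aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --scalable-dimension ecs:service:DesiredCount \
  --resource-id service/my-cluster/my-service \
  --policy-name cpu50-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration '{
    "TargetValue": 50.0,
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
    }
  }'

Instance-level scaling (the CloudWatch alarm approach from the link above) has to be configured separately on the cluster's Auto Scaling group.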
You are seeing this issue because of a port conflict when ECS attempts to use the "closest matching container instance", which in this case is the one ending in 9e5e. When attempting to spin up a task on that instance, ECS notices that the instance "is already using a port required by your task".
To resolve this issue, you need to use dynamic port mapping for your ECS cluster. Amazon provides a tutorial on how to do this here:
https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/
Essentially, you will need to modify the port mapping in the task definition that holds the Docker container you are trying to run and scale. The port mapping should use 0 for the host port and the port your application listens on for the container port. The zero value makes each container that ECS runs in the cluster use a different host port, eliminating the port conflict you are experiencing.
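As a hedged sketch (the family, image and container port are placeholders), a container definition using dynamic host port mapping could be registered with the AWS CLI roughly like this:

# Hypothetical task definition: hostPort 0 lets ECS assign a free host port
# to every task, so several copies can run on the same instance.
aws ecs register-task-definition \
  --family my-web-app \
  --container-definitions '[
    {
      "name": "web",
      "image": "my-repo/my-web-app:latest",
      "memory": 256,
      "essential": true,
      "portMappings": [
        { "containerPort": 80, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  ]'

As the linked tutorial explains, the service then needs to sit behind an Application Load Balancer target group, which keeps track of the dynamically assigned host ports.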

Kubernetes vs Google Container Engine: How to use autoscaling?

I have deployed an app using Kubernetes to a Google Cloud Container Engine Cluster.
I got into autoscaling, and I found the following options:
Kubernetes Horizontal Pod Autoscaling (HPA)
As explained here, Kubernetes offers the HPA on deployments. As per the docs:
Horizontal Pod Autoscaling automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization
Google Cloud Container Cluster
Now I have a Google Cloud Container Cluster using 3 instances, with autoscaling enabled. As per the docs:
Cluster Autoscaler enables users to automatically resize clusters so that all scheduled pods have a place to run.
This means I have two places to define my autoscaling. Hence my questions:
Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Are these two settings doing the same thing (i.e. creating/removing VM instances inside my cluster)? If not, how does their behaviour compare?
What happens if e.g. I have a number of pods between 3 and 10 and a cluster with number of instances between 1 and 3 and autoscaling kicks in. When and how would both scale?
Many thanks!
Is a Pod the same as a VM instance inside my cluster, or can multiple Pods run inside a single VM instance?
Multiple Pods can run on the same instance (called a node in Kubernetes). You can define the maximum resources a Pod may consume in the deployment YAML; see the docs. This is an important prerequisite for autoscaling.
Are these two settings doing the same thing (i.e. creating/removing VM instances inside my cluster)? If not, how does their behaviour compare?
The Kubernetes autoscaler (HPA) will schedule additional Pods onto your existing nodes. The Google cluster autoscaler will add worker nodes (new instances) to your cluster. It looks at Pods that are stuck pending because there is no room left in the cluster to schedule them, and adds nodes when it finds any.
What happens if e.g. I have a number of pods between 3 and 10 and a cluster with a number of instances between 1 and 3, and autoscaling kicks in? When and how would both scale?
Based on the maximum resource usage you define for your Pods, the Google cluster autoscaler estimates how many new nodes are required to run all of the queued-up, unschedulable Pods.
Also read this article.
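To make the two levels concrete, here is a rough sketch (the deployment name, cluster name, node counts and thresholds are all placeholders): the HPA is attached to a deployment with kubectl, while the cluster autoscaler is enabled on the GKE node pool with gcloud:

# Horizontal Pod Autoscaler: keep average CPU around 50%, between 3 and 10 pods
kubectl autoscale deployment my-app --cpu-percent=50 --min=3 --max=10

# Cluster autoscaler: let GKE add/remove nodes, between 1 and 3 in the pool
gcloud container clusters update my-cluster \
  --enable-autoscaling --min-nodes 1 --max-nodes 3 \
  --node-pool default-pool \
  --zone us-central1-a

With both in place, the HPA adds Pods when CPU rises, and the cluster autoscaler adds a node only when those new Pods no longer fit on the existing instances.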