ECS auto scaling cluster with EC2 count - amazon-ecs

To deploy my docker-compose setup, I am using AWS ECS.
Everything works fine except auto scaling.
When creating an ECS cluster,
I can choose the number of instances,
so I set it to 1.
Next, when creating a service on my cluster,
I can also choose the number of tasks.
I know that tasks run on the instances, so I set it to 1.
Then I specified an auto scaling policy like this.
With this policy, if CPU utilization reaches 50% within 5 minutes, it automatically adds a task.
After configuring it, I ran a benchmark to test it.
In the service description, the desired task count increased to 2.
But no instance was added automatically.
In the event log,
Maybe because I set the number of instances to 1 in my cluster, it can't start a new task.
Why doesn't auto scaling automatically add a new instance to my cluster?
Is there a problem with my configuration?
Thanks.

Your ECS cluster is not autoscaling the number of instances. It autoscales the number of tasks running inside your existing cluster. An EC2 instance can run multiple tasks. To autoscale the instance count, you will need to use CloudWatch alarms:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/cloudwatch_alarm_autoscaling.html
You are seeing this issue because of a port conflict when ECS attempts to use the "closest matching container instance", which in this case is the one ending in 9e5e.
When it attempts to start a task on that instance, it finds that the instance "is already using a port required by your task".
To resolve this issue, you need to use dynamic port mapping for your ECS cluster.
Amazon provides a tutorial on how to do this here:
https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/
Essentially, you will need to modify the port mapping in the task definition that contains the Docker container you are trying to run and scale.
The port mapping should use 0 for the host port and the port number your application listens on for the container port.
The zero value makes each Docker container that is run in the ECS cluster use a different host port, eliminating the port conflict you are experiencing.
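As a rough illustration of the port mapping described above, here is a minimal Python sketch that builds a task definition fragment with the host port set to 0. The family name, container name, image, and port 8080 are hypothetical placeholders, and the sketch assumes the bridge network mode; it only builds and checks the data, it does not register anything with ECS.

```python
import json


def container_definition(name, image, container_port, memory_mib=256):
    """Build one container definition that uses dynamic host port mapping."""
    return {
        "name": name,
        "image": image,
        "memory": memory_mib,
        "essential": True,
        "portMappings": [
            {
                # hostPort 0 requests a dynamically assigned host port,
                # so two copies of the task can share one instance.
                "hostPort": 0,
                "containerPort": container_port,
                "protocol": "tcp",
            }
        ],
    }


def uses_dynamic_ports(task_def):
    """Return True if no port mapping pins a fixed (non-zero) host port."""
    return all(
        mapping.get("hostPort", 0) == 0
        for container in task_def["containerDefinitions"]
        for mapping in container.get("portMappings", [])
    )


# Hypothetical task definition; all names here are placeholders.
task_definition = {
    "family": "web-app",
    "networkMode": "bridge",
    "containerDefinitions": [
        container_definition("web", "example/web:latest", 8080)
    ],
}

if __name__ == "__main__":
    print(json.dumps(task_definition, indent=2))
```

Because the host port is no longer fixed, clients would typically reach the containers through a load balancer, as in the tutorial linked above, rather than through a known port on the instance.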

Related

Execute command on each node

Background: I have approximately 50 nodes "behind" a namespace, meaning that a given Pod in this namespace can land on any of those 50 nodes.
The task is to test whether an outbound firewall rule (in a firewall outside the cluster) has been implemented correctly. Therefore I would like to run a command on each potential node in the namespace that tells me whether I can reach my target from that node. (I am using curl for the test, but that is beside the point for my question.)
I can create a small containerized app that exits 0 on success. The next step would be to execute it on each potential node and harvest the result. How do I do that?
(I don't have access to the nodes directly, only indirectly via Kubernetes/OpenShift. I only have access to the namespace-level, not the cluster-level.)
The underlying node firewall settings are NOT controlled by K8s network policies. To test network connectivity in a namespace, you only need to run one pod in that namespace. To test the firewall settings of a node, you would typically ssh into the node and run a test command. This is possible from K8s, but it would require the pod to run as privileged root, which is not applicable to you because you only have access to a single namespace.
Then next step would be execute this on each potential node and
harvest the result. How to do that?
As gohm'c's answer says, you cannot run commands on the nodes unless you have access to the worker nodes. You need SSH access to check the firewall on the nodes.
If you just plan to run a containerized app on specific types of nodes, or on all nodes, you can follow the answer below.
You can create a Deployment, or you can use a DaemonSet if you want to run on each node.
A Deployment could be useful if you plan to run on specific nodes; in that case you have to use a node selector or affinity.
A DaemonSet will deploy and run containers on all existing nodes, so you can choose accordingly.
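To make the DaemonSet option more concrete, here is a minimal Python sketch that generates a DaemonSet manifest as JSON. The names `fw-check` and `check`, the target URL, and the use of the `curlimages/curl` image are assumptions for illustration only. A DaemonSet restarts containers that exit, so the sketch logs the result once and then idles instead of exiting 0. It also assumes your namespace-level permissions allow creating DaemonSets, which may not be the case.

```python
import json


def connectivity_daemonset(namespace, target_url, image="curlimages/curl:latest"):
    """Build a DaemonSet manifest that runs one curl check per node."""
    labels = {"app": "fw-check"}
    # Log one RESULT line per node, then sleep so the pod is not restarted.
    check = (
        f'if curl -sS -m 10 -o /dev/null "{target_url}"; '
        'then echo "RESULT node=$NODE_NAME ok"; '
        'else echo "RESULT node=$NODE_NAME failed"; fi; '
        "while true; do sleep 3600; done"
    )
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": "fw-check", "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "check",
                            "image": image,
                            "command": ["sh", "-c", check],
                            # Expose the node name to the script via the downward API.
                            "env": [
                                {
                                    "name": "NODE_NAME",
                                    "valueFrom": {
                                        "fieldRef": {"fieldPath": "spec.nodeName"}
                                    },
                                }
                            ],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Placeholder namespace and URL; replace with your own values.
    print(json.dumps(connectivity_daemonset("my-namespace", "https://example.com"), indent=2))
```

The printed JSON could be piped to `kubectl apply -f -`, and the results collected afterwards with `kubectl logs -l app=fw-check`. If only some nodes serve the namespace, a node selector would probably have to be added to the pod spec.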

Get AWS Batch cluster name CDK

I'm trying to create an alarm based on memory utilization in AWS Batch. However, the metric related to this service is under the ECS cluster that is automatically created when creating a compute environment. I'm trying to provide this cluster name to the alarm dimension, but I'm unable to access the cluster name using CDK. I've searched the CDK API and it doesn't seem to be possible. Does anybody know how this can be done?
I don't know whether you can find the ECS Cluster created by Batch using CDK. Batch hides the details about the work that it does on the backend (i.e. creating an ECS Cluster).
My only guess is that you can write custom code to list the ECS Clusters in your account and match one of the clusters with the name you expect to see. I think Batch initializes the cluster when you initialize the Batch Compute Environment, but I'm not sure whether there is a lag in the timing.
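The "list and match" idea could look roughly like the Python sketch below, run outside of CDK (for example in a script or a custom resource). It assumes that the name of the Batch-created cluster starts with the compute environment name, which is an assumption you should verify against the clusters in your own account. The helper names are hypothetical; `list_cluster_arns` uses the boto3 ECS `list_clusters` paginator and needs boto3 plus AWS credentials.

```python
def cluster_name_from_arn(arn):
    """Extract the cluster name from an ARN like arn:aws:ecs:...:cluster/NAME."""
    return arn.split("/", 1)[1]


def find_batch_clusters(cluster_arns, compute_env_name):
    """Return the ARNs whose cluster name starts with the compute environment name.

    The naming convention is an assumption, not a documented guarantee.
    """
    return [
        arn
        for arn in cluster_arns
        if cluster_name_from_arn(arn).startswith(compute_env_name)
    ]


def list_cluster_arns():
    """Fetch all ECS cluster ARNs in the current account and region."""
    import boto3  # imported lazily so the helpers above work without boto3

    ecs = boto3.client("ecs")
    arns = []
    for page in ecs.get_paginator("list_clusters").paginate():
        arns.extend(page["clusterArns"])
    return arns
```

For example, `find_batch_clusters(list_cluster_arns(), "my-compute-env")` would return the candidate clusters, and the name of the single match could then be used as the `ClusterName` dimension of the alarm.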

How to Scale up and Scale down cluster instances in AWS ECS

We have an application that creates, starts, and stops containers inside AWS ECS. We are not using ECS services because we don't want a container to be restarted if it is stopped by the application.
So how can we automate scale-in/scale-out of the cluster instances in ECS without using ECS services?
The documentation below explains step by step how to scale your container instances.
Scaling Container Instances
Here is how it works:
Say you have one container instance with 2 services running on it.
You need to scale up an ECS service, but it will not scale because the single container instance has no resources available.
Following the documentation, you can set up CloudWatch alarms on, say, the MemoryReservation metric for your cluster.
When the memory reservation of your cluster rises above 75% (meaning that only 25% of the memory in your cluster is available for new tasks to reserve), the alarm triggers the Auto Scaling group to add another instance and provide more resources for your tasks and services.
Depending on the Amazon EC2 instance types that you use in your
clusters, and quantity of container instances that you have in a
cluster, your tasks have a limited amount of resources that they can
use while running. Amazon ECS monitors the resources available in the
cluster to work with the schedulers to place tasks. If your cluster
runs low on any of these resources, such as memory, you are eventually
unable to launch more tasks until you add more container instances,
reduce the number of desired tasks in a service, or stop some of the
running tasks in your cluster to free up the constrained resource.
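As a sketch of the alarm described above, the following Python builds the parameters for a CloudWatch alarm on the cluster's MemoryReservation metric with the 75% threshold from the example. The alarm name, the 60-second period, and the single evaluation period are assumptions, and the scaling policy ARN is a placeholder for a scale-out policy that you would create on your Auto Scaling group. `create_alarm` calls the boto3 CloudWatch `put_metric_alarm` operation and needs boto3 plus AWS credentials.

```python
def memory_reservation_alarm(cluster_name, scale_out_policy_arn, threshold=75.0):
    """Build put_metric_alarm parameters for a cluster-level MemoryReservation alarm."""
    return {
        "AlarmName": f"{cluster_name}-memory-reservation-high",
        "Namespace": "AWS/ECS",
        "MetricName": "MemoryReservation",
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": "Average",
        "Period": 60,            # seconds; an assumed value
        "EvaluationPeriods": 1,  # an assumed value
        "Threshold": threshold,  # percent of registered memory that is reserved
        "ComparisonOperator": "GreaterThanThreshold",
        # When the alarm fires, run the Auto Scaling policy that adds an instance.
        "AlarmActions": [scale_out_policy_arn],
    }


def create_alarm(params):
    """Create or update the alarm in CloudWatch."""
    import boto3  # imported lazily so the builder above works without boto3

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

A matching alarm with a lower threshold and a scale-in policy would be needed to remove instances again when the reservation drops.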

Resizing instance groups by schedule

I have a Kubernetes cluster that contains two node pools. My task is to automate resizing the node pools to 0 nodes on weekends to save money.
I know that I can stop compute instances on a standard schedule.
But I can't stop instances that are members of instance pools; I can only resize the pool to 0. How can I do that on a schedule with gcloud?
Cloud Scheduler won't let you resize the node pool directly. You can instead use Cloud Scheduler together with Cloud Functions to call the container API to resize the node pool. There is an example in the Google public docs that does something like this for a compute instance; you'll have to convert the function call to use the container API instead.
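A Cloud Function for this could look roughly like the Python sketch below. It assumes the `google-cloud-container` client library and a function triggered on a schedule (for example via Pub/Sub from Cloud Scheduler). The project, location, cluster, and pool names are placeholders, and the `(event, context)` entry point signature is an assumption about how the function is deployed.

```python
def node_pool_name(project, location, cluster, node_pool):
    """Build the full resource name of a node pool."""
    return (
        f"projects/{project}/locations/{location}"
        f"/clusters/{cluster}/nodePools/{node_pool}"
    )


def resize_request(project, location, cluster, node_pool, node_count):
    """Build the request body for resizing a node pool."""
    if node_count < 0:
        raise ValueError("node_count must be zero or positive")
    return {
        "name": node_pool_name(project, location, cluster, node_pool),
        "node_count": node_count,
    }


def resize_node_pool(event=None, context=None):
    """Hypothetical entry point: scale one node pool down to 0 nodes."""
    from google.cloud import container_v1  # needs google-cloud-container

    client = container_v1.ClusterManagerClient()
    # Placeholder identifiers; replace with your own values.
    request = resize_request("my-project", "europe-west1-b", "my-cluster", "pool-1", 0)
    return client.set_node_pool_size(request=request)
```

A second scheduled job calling the same code with the original node count would bring the pool back up after the weekend.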
Here are some possible solutions:
Use GKE to manage your cluster, so you can resize the cluster or migrate to a different machine size.
Manage your own Kubernetes cluster that uses a Compute Engine instance group for its nodes; you can then update it without needing GKE's help.
If you want automation, you can use Jenkins or Airflow to schedule resizing jobs.
Hope this can help you.

Can I create a GCP cluster with different machine types?

I'd like to create a cluster with two different machine types.
How would I go about doing this? What documentation is available?
I assume you are talking about a Google Container Engine cluster.
You can have machines of different types by having more than one node pool.
If you are creating the cluster in the Console, start by creating it with one node pool and, after it is created, edit the cluster to add a second node pool with a different instance configuration. This is necessary because the UI only allows one node pool at creation.
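To show what "more than one node pool" means in terms of data, here is a small Python sketch of a cluster description with two pools that use different machine types. The field names follow the node pool shape of the container API as I understand it, so treat them as an assumption; the cluster name, pool names, machine types, and node counts are placeholders. The sketch only builds the structure and does not create anything.

```python
import json


def node_pool(name, machine_type, node_count):
    """Describe one node pool with its own machine type."""
    return {
        "name": name,
        "initialNodeCount": node_count,
        "config": {"machineType": machine_type},
    }


# Hypothetical cluster with two pools of different machine types.
cluster = {
    "name": "mixed-cluster",
    "nodePools": [
        node_pool("small-pool", "e2-small", 2),
        node_pool("highmem-pool", "n1-highmem-4", 1),
    ],
}

if __name__ == "__main__":
    print(json.dumps(cluster, indent=2))
```

Each pool keeps its own machine type and node count, which is why adding a second pool is enough to mix machine types in one cluster.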