Ulimits on AWS ECS Fargate - amazon-ecs

The default ULIMIT "NOFILE" is set to 1024 for containers launched using Fargate. So if I have a cluster of let's say 10 services with two or three tasks each (all running on Fargate), what are the implications if I set them all to use a huge NOFILE number such as 900000?
More specifically, do we need to care about the host machine? My assumption is that if I were using the EC2 launch type and set all my tasks to use as many files as they wanted, the hosting EC2 instance(s) could easily get overwhelmed. Or maybe the hosts wouldn't get overwhelmed, but the containers registered on the hosts would get a first-come, first-served share of the files they can open, potentially leading to one service starving another? But since with Fargate we don't manage the instances as we would on EC2, what's the harm in setting the ulimit as high as possible for all services? Do our containers sit side by side on a host, and would they therefore share the host's resource limits? Or do we get a host per service / per task?
Of course it's possible my assumptions are wrong about how this all works.

The maximum nofile limit on Fargate is 4096.
Amazon ECS tasks hosted on Fargate use the default resource limit values set by the operating system with the exception of the nofile resource limit parameter which Fargate overrides. The nofile resource limit sets a restriction on the number of open files that a container can use. The default nofile soft limit is 1024 and hard limit is 4096.
https://docs.aws.amazon.com/AmazonECS/latest/userguide/task_definition_parameters.html

A slight correction on this answer. As the linked documentation states, these are the DEFAULT soft and hard limits for ulimit nofile. You can override them by updating your ECS Task Definition. The Ulimit settings go under the ContainerDefinitions section of the Definition.
I've successfully set the soft and hard limits for nofile on some of my AWS Fargate Tasks using this method.
So while you cannot use the Linux "ulimit -n" command to change this on the fly, you can alter it via the ECS Task Definition.
EDIT:
I've done some testing, and for my setup, running a Python Bullseye image on AWS ECS Fargate, I was able to max out at NOFILE = 1024 x 1024 = 1048576 files.
{
  "ulimits": [
    {
      "name": "nofile",
      "softLimit": 1048576,
      "hardLimit": 1048576
    }
  ]
}
Anything beyond this (1024 x 1024 x INT for any larger integer) caused ECS to report an error when trying to start up the ECS Fargate Task:
CannotStartContainerError: ResourceInitializationError: failed to create new container runtime task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container
Hope this helps someone.
Please refer to:
https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-ecs-taskdefinition-containerdefinitions-ulimit.html

Related

How to Set Max User Processes on Kubernetes

I built a Docker container for my JBoss application (WildFly 11 is used). The container runs on AWS EKS Fargate. After running the container for several minutes, "java.lang.OutOfMemoryError: unable to create new native thread" occurred.
After reading the article "How to solve java.lang.OutOfMemoryError: unable to create new native thread", I would like to change the max user processes limit from 1024 to 4096. However, I can't find any way to change it in the Kubernetes documentation.
I have tried the methods in How do I set ulimit for containers in Kubernetes?, but they don't seem to help.
I have also edited the file /etc/security/limits.conf in my Dockerfile, but the number still hasn't changed.
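For reference, the limits.conf change was roughly the following Dockerfile snippet (the user name and value are placeholders, not the exact ones used). Note that limits.conf is applied by PAM at login, so it generally has no effect on processes started directly by the container runtime:
# Append nproc limits for the application user (placeholder user name and value)
RUN echo "appuser soft nproc 4096" >> /etc/security/limits.conf && \
    echo "appuser hard nproc 4096" >> /etc/security/limits.conf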
Anyone have an idea about this?
Thank you.

Pin Kubernetes pods/deployments/replica sets/daemon sets to run on specific cpu only

I need to restrict an app/deployment to run on specific CPUs only (say 0-3, or just 1 or 2, etc.). I found out about the CPU Manager and tried to implement it with the static policy, but I am not able to achieve what I intend to.
I tried the following so far:
Enabled cpu manager static policy on kubelet and verified that it is enabled
Reserved the cpu with --reserved-cpus=0-3 option in the kubelet
Ran a sample nginx deployment with limits equal to requests and an integer CPU value, i.e. the Guaranteed QoS class is ensured, and was able to validate the CPU affinity with taskset -c -p $(pidof nginx)
So this restricts my nginx app to running on any of the CPUs other than the reserved ones (0-3), i.e. if my machine has 32 CPUs, the app can run on any of CPUs 4-31, and so can any other apps/deployments that run. As I understand it, the reserved CPUs 0-3 are kept for system daemons, OS daemons, etc.
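For reference, a minimal sketch of the kind of Deployment used above (name, image and values are placeholders): requests equal limits with an integer CPU count, so the pod lands in the Guaranteed QoS class and the static CPU manager grants it exclusive cores.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-pinned
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-pinned
  template:
    metadata:
      labels:
        app: nginx-pinned
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
        resources:
          requests:
            cpu: "2"          # integer CPU count -> exclusive cores under the static policy
            memory: "256Mi"
          limits:
            cpu: "2"          # must equal the request for Guaranteed QoS
            memory: "256Mi"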
My questions-
Using the Kubernetes CPU Manager features, is it possible to pin an app/pod (in this case, my nginx app) to run on a specific CPU only (say 2 or 3, or 4-5)? If yes, how?
If point 1 is possible, can we perform the pinning at the container level too? Say Pod A has two containers, Container B and Container D. Is it possible to pin CPUs 0-3 to Container B and CPU 4 to Container D?
If none of this is possible using the Kubernetes CPU Manager, what alternatives are available at this point in time, if any?
As I understand your question, you want to dedicate a specific set of CPUs to each app/pod. From what I've searched, I was only able to find some documentation that might help, plus a GitHub topic that I think describes a workaround for your problem.
As a disclaimer: based on what I've read, searched and understood, there is no direct solution for this issue, only workarounds. I am still searching further.

Google DATA FUSION CPU

I have a problem when I try to deploy a downstream pipeline; the error I receive in the logs is this:
PROVISION task failed in REQUESTING_CREATE state for program run program_run:default.ListaNomi1_v3.-SNAPSHOT.workflow.DataPipelineWorkflow.182bbf2c-576b-11ec-8095-da8d4f8ab0b3 due to Dataproc operation failure: INVALID_ARGUMENT: Multiple validation errors: - Insufficient 'CPUS' quota. Requested 10.0, available 3.0. - Insufficient 'CPUS_ALL_REGIONS' quota. Requested 10.0, available 7.0. - Insufficient 'IN_USE_ADDRESSES' quota. Requested 3.0, available 1.0. - This request exceeds CPU quota. Some things to try: request fewer workers (a minimum of 2 is required), use smaller master and/or worker machine types (such as n1-standard-2)..
I'm trying to change the worker and master node configuration, but it always fails.
I can't modify the quota because I'm not the project admin, and he says it can't be changed.
To process data with Cloud Data Fusion you need a cluster.
Two options are:
Ephemeral cluster, which is created for each pipeline run. This is the one you are trying to use, but it needs compute quota to create the cluster.
Static cluster (Existing Dataproc). In this case the cluster is created beforehand and you simply point your pipeline at it by creating and selecting a provisioning profile. This can be an option to prevent quota issues during pipeline start, but such a static cluster incurs costs while it's running, even without any jobs.
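As an illustration of the second option, a small static cluster that fits inside the quotas from the error (3 CPUs and 1 free address) could be created with something like the following; the cluster name, region and image version are placeholders, and on the Data Fusion side you then create an "Existing Dataproc" provisioning profile pointing at it:
gcloud dataproc clusters create cdf-static-cluster \
  --region=europe-west1 \
  --single-node \
  --master-machine-type=n1-standard-2 \
  --image-version=2.0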

Airflow Memory Error: Task exited with return code -9

According to both of these links (Link1 and Link2), my Airflow DAG run is returning the error INFO - Task exited with return code -9 due to an out-of-memory issue. My DAG run has 10 tasks/operators, and each task simply:
makes a query to get one of my BigQuery tables, and
writes the results to a collection in my Mongo database.
The size of the 10 BigQuery tables ranges from 1MB to 400MB, and the total size of all 10 tables is ~1GB. My Docker container has the default 2GB of memory, and I've increased this to 4GB; however, I am still receiving this error from a few of the tasks. I am confused about this, as 4GB should be plenty of memory. I am also concerned because, in the future, these tables may become larger (a single table query could be 1-2GB), and I'd like to avoid these return code -9 errors at that time.
I'm not quite sure how to handle this issue, since the point of the DAG is to transfer data from BigQuery to Mongo daily, and the queries / data in-memory for the DAG's tasks is necessarily fairly large then, based on the size of the tables.
As you said, the error message you get corresponds to an out of memory issue.
Referring to the official documentation:
DAG execution is RAM limited. Each task execution starts with two Airflow processes: task execution and monitoring. Currently, each node can take up to 6 concurrent tasks. More memory can be consumed, depending on the size of the DAG.
High memory pressure in any of the GKE nodes will lead the Kubernetes scheduler to evict pods from nodes in an attempt to relieve that pressure. While many different Airflow components are running within GKE, most don't tend to use much memory, so the case that happens most frequently is that a user uploaded a resource-intensive DAG. The Airflow workers run those DAGs, run out of resources, and then get evicted.
You can check it with following steps:
In the Cloud Console, navigate to Kubernetes Engine -> Workloads
Click on airflow-worker, and look under Managed pods
If there are pods that show Evicted, click each evicted pod and look for the "The node was low on resource: memory" message at the top of the window.
What are the possible ways to fix OOM issue?
Create a new Cloud Composer environment with a larger machine type than the current machine type.
Ensure that the tasks in the DAG are idempotent, which means that the result of running the same DAG run multiple times should be the same as the result of running it once.
Configure task retries by setting the number of retries on the task - this way when your task gets -9'ed by the scheduler it will go to up_for_retry instead of failed
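As a rough illustration of the retry setting (the DAG id, schedule and values below are placeholders, not taken from your setup):
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="bq_to_mongo",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={
        "retries": 3,                         # re-run a task up to 3 times after a -9 exit
        "retry_delay": timedelta(minutes=5),  # wait between attempts
    },
) as dag:
    transfer = PythonOperator(
        task_id="transfer_table",
        python_callable=lambda: None,  # replace with the actual transfer function
    )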
Additionally you can check the behavior of CPU:
In the Cloud Console, navigate to Kubernetes Engine -> Clusters
Locate Node Pools at the bottom of the page, and expand the default-pool section
Click the link listed under Instance groups
Switch to the Monitoring tab, where you can find CPU utilization
Ideally, the GCE instances shouldn't be running at over 70% CPU at all times, or the Composer environment may become unstable under heavy resource usage.
I hope you find the above pieces of information useful.
I am going to chunk the data so that less is loaded into any 1 task at any given time. I'm not sure yet whether I will need to use GCS/S3 for intermediary storage.
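A minimal sketch of what that chunking could look like, assuming the google-cloud-bigquery and pymongo clients (the project, table, connection string and page size are placeholders):
from google.cloud import bigquery
from pymongo import MongoClient

bq = bigquery.Client(project="my-project")
mongo = MongoClient("mongodb://mongo-host:27017")
collection = mongo["analytics"]["my_table"]

# Stream the query results page by page so only ~page_size rows are in memory at once.
rows = bq.query("SELECT * FROM `my-project.my_dataset.my_table`").result(page_size=10_000)

for page in rows.pages:
    batch = [dict(row) for row in page]
    if batch:
        collection.insert_many(batch)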

AWS EB should create new instance once my docker reached its maximum memory limit

I have deployed my Dockerized microservices to AWS using Elastic Beanstalk; the services are written in Scala using Akka HTTP (https://github.com/theiterators/akka-http-microservice).
I have allocated 512MB of memory to each Docker container and am seeing performance problems. I have noticed that CPU usage increases when the server gets more requests (like 20%, 23%, 45%, depending on load), then automatically comes back down to a normal state (0.88%). But memory usage keeps increasing with every request and is not released even after CPU usage returns to normal; once it reaches 100%, the Docker container is killed and restarted.
I have also enabled the auto-scaling feature in EB to handle a large number of requests, but it only creates another instance after CPU usage of the running instance reaches its maximum.
How can I set up auto-scaling to create another instance once memory usage reaches its maximum limit (i.e. 500MB out of 512MB)?
Please suggest a way to resolve these problems as soon as possible, as this is very critical for us.
CloudWatch doesn't natively report memory statistics. But there are some scripts that Amazon provides (usually just referred to as the "CloudWatch Monitoring Scripts for Linux") that will get the statistics into CloudWatch so you can use those metrics to build a scaling policy.
The Elastic Beanstalk documentation provides some information on installing the scripts on the Linux platform at http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-cw.html.
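As a rough, unverified sketch of the kind of .ebextensions config that page describes, assuming the monitoring scripts end up installed under /opt/cloudwatch (the path, schedule and metric flags may differ in the official version, and the instance profile needs permission to call cloudwatch:PutMetricData):
files:
  "/etc/cron.d/cloudwatch-memory":
    mode: "000644"
    owner: root
    group: root
    content: |
      */5 * * * * root /opt/cloudwatch/aws-scripts-mon/mon-put-instance-data.pl --mem-util --mem-used --mem-avail --from-cron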
However, this will come with another caveat in that you cannot use the native Docker deployment JSON as it won't pick up the .ebextensions folder (see Where to put ebextensions config in AWS Elastic Beanstalk Docker deploy with dockerrun source bundle?). The solution here would be to create a zip of your application that includes the JSON file and .ebextensions folder and use that as the deployment artifact.
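For example (file names assumed), the deployment bundle could be built with something like:
zip -r app-bundle.zip Dockerrun.aws.json .ebextensions/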
One thing I am unclear on is whether these metrics will be available to choose from under the Configuration -> Scaling section of the application. You may need to create another .ebextensions config file to set the custom metric, such as:
option_settings:
  aws:elasticbeanstalk:customoption:
    BreachDuration: 3
    LowerBreachScaleIncrement: -1
    MeasureName: MemoryUtilization
    Period: 60
    Statistic: Average
    Threshold: 90
    UpperBreachScaleIncrement: 2
Now, even if this works, if the application does not lower its memory usage after scaling out and load goes down, then the scaling policy would just continue to trigger and eventually reach the maximum number of instances.
I'd first see if you can get some garbage collection statistics for the JVM and maybe tune the JVM to do garbage collection more often to help bring memory down faster after application load goes down.
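If it helps, a sketch of the kind of JVM options this might involve for a 512MB container; the exact flags and values depend on the service and JVM version and are illustrative only:
# Cap the heap well below the container limit and encourage the JVM to shrink
# the heap (and return memory) once load drops.
JAVA_OPTS="-Xmx256m -Xms128m -XX:+UseG1GC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20"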