Auto-Scaling removes running tasks in an ECS service (Fargate)

I am running an ECS service using Fargate on AWS. Each task completes a single operation and then exits (it fetches a message from an SQS queue and decodes/encodes a video file). I designed an auto-scaling policy like the one below:
If the SQS queue size is more than 5, increment the desired count by 1 (repeated every 60 seconds).
If the SQS queue size is less than 2, decrement the desired count by 1 (repeated every 60 seconds).
But what AWS does is that when the queue size drops below 2, it kills running tasks, leaving the corresponding operation "broken". I don't want AWS to kill running tasks (they will exit on their own shortly after the command completes); I just want it to set the desired count to 0 so that the tasks don't get respawned. So, literally, I want my tasks to be unstoppable during auto-scaling.
How can I achieve this with the ECS service and aws_ecs_autoscaling_target? Please note that I am using Terraform to provision the service.
Thanks in advance.

I had to solve this with a different approach: a small Lambda function that gets triggered by the CloudWatch alarm and launches a Fargate task via the ECS RunTask API. This workflow suited the use case better than an auto-scaling policy.
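A minimal sketch of such a Lambda handler with boto3, assuming hypothetical cluster, task definition, and subnet names (for the Fargate launch type, tasks are started with RunTask):

```python
import boto3

ecs = boto3.client("ecs")

def handler(event, context):
    # Triggered by the CloudWatch alarm (e.g. via SNS or EventBridge);
    # launches one Fargate task. All names below are placeholders.
    ecs.run_task(
        cluster="video-encoding-cluster",
        taskDefinition="video-encoder",        # latest ACTIVE revision is used
        launchType="FARGATE",
        count=1,
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],   # placeholder subnet
                "assignPublicIp": "ENABLED",
            }
        },
    )
```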

Related

Delay in Kubernetes Job status update when running many jobs in parallel

I have a bit of a unique use case where I want to run a large number (thousands to tens of thousands) of Kubernetes Jobs at once. Each Job consists of a single container, with parallelism 1 and completions 1, and no sidecar or agent. My cluster has plenty of capacity for the resources I'm requesting.
My problem is that the Job status is not transitioning to Complete for a significant period of time when I run many jobs concurrently.
My application submits Jobs and has a watcher on the namespace - as soon as a Job's status shows succeeded: 1, we delete the Job and send information back to the application (the watcher is sketched below). The application needs this to happen as soon as possible in order to define and submit subsequent Jobs.
I'm able to submit new Job requests as fast as I want, and Pod scheduling happens without delay, but beyond about one or two hundred concurrent Jobs I get significant delay between a Job's Pod completing and the Job's status updating to Complete. At only around 1,000 jobs in the cluster, it can easily take 5-10 minutes for a Job status to update.
This tells me there is some process in the Kubernetes Control Plane that needs more resources to process Pod completion events more rapidly, or a configuration option that enables it to process more tasks in parallel. However, my system monitoring tools have not yet been able to identify any Control Plane services that are maxing out their available resources while the cluster processes the backlog, and all other operations on the cluster appear to be normal.
My question is - where should I look for system resource or configuration bottlenecks? I don't know enough about Kubernetes to know exactly what components are responsible for updating a Job's status.
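For context, a stripped-down sketch of the kind of watcher described above, using the official Python client; the namespace is a placeholder and the bookkeeping back to the application is omitted:

```python
from kubernetes import client, config, watch

config.load_kube_config()          # or config.load_incluster_config() inside the cluster
batch = client.BatchV1Api()

w = watch.Watch()
# React as soon as a Job reports .status.succeeded >= 1, then delete it
# (and its Pod, via the Background propagation policy).
for event in w.stream(batch.list_namespaced_job, namespace="jobs"):   # placeholder namespace
    job = event["object"]
    if job.status.succeeded and job.status.succeeded >= 1:
        batch.delete_namespaced_job(
            name=job.metadata.name,
            namespace="jobs",
            body=client.V1DeleteOptions(propagation_policy="Background"),
        )
```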

Apache Flink job is not scheduled on multiple TaskManagers in Kubernetes (replicas)

I have a simple Flink job that reads from an ActiveMQ source and sinks to a database and to a print sink. I deploy the job in Kubernetes with 2 TaskManagers, each with 10 task slots (taskmanager.numberOfTaskSlots: 10). I configured a parallelism greater than the task slots available (i.e., 10 in this case).
In the Flink Dashboard I see that this job runs in only one of the TaskManagers, while the other TaskManager has no jobs. I verified this by checking where every operator is scheduled; in the Task Manager UI page, one of the managers has all slots free. I attach images below for reference.
Did I configure anything wrong? Where is the gap in my understanding? And can someone explain it?
The first task manager has enough slots (10) to fully satisfy the requirements of your job.
The scheduler's default behavior is to fully utilize one task manager's slots before using slots from another task manager. If instead you would prefer that Flink spread out the workload across all available task managers, set cluster.evenly-spread-out-slots: true in flink-conf.yaml. (This option was added in Flink 1.10 to recreate a scheduling behavior similar to what was the default before Flink 1.5.)

Complete parallel Kubernetes job when one worker pod succeeds

I have a simple containerised python script which I am trying to parallelise with Kubernetes. This script guesses hashes until it finds a hashed value below a certain threshold.
I am only interested in the first such value, so I wish to create a Kubernetes job that spawns n worker pods and completes as soon as one worker pod finds a suitable value.
By default, Kubernetes jobs wait until all worker pods complete before marking the job as complete. I have so far been unable to find a way around this (no mention of this job pattern in the documentation), and have been relying on checking the logs of bare pods via a bash script to determine whether one has completed.
Is there a native means to achieve this? And, if not, what would be the best approach?
Hi, take a look at this link: https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/#parallel-jobs.
I've never tried it, but it seems possible to launch several Pods and configure the Job to end when x Pods have finished. In your case, x is 1.
We can define two specifications for parallel Jobs:
1. Parallel Jobs with a fixed completion count:
   - specify a non-zero positive value for .spec.completions.
   - the Job represents the overall task, and is complete when there is one successful Pod for each value in the range 1 to .spec.completions.
   - not implemented yet: each Pod is passed a different index in the range 1 to .spec.completions.
2. Parallel Jobs with a work queue:
   - do not specify .spec.completions; it defaults to .spec.parallelism.
   - the Pods must coordinate amongst themselves or with an external service to determine what each should work on. For example, a Pod might fetch a batch of up to N items from the work queue.
   - each Pod is independently capable of determining whether or not all its peers are done, and thus that the entire Job is done.
   - when any Pod from the Job terminates with success, no new Pods are created.
   - once at least one Pod has terminated with success and all Pods are terminated, then the Job is completed with success.
   - once any Pod has exited with success, no other Pod should still be doing any work for this task or writing any output. They should all be in the process of exiting.
For a fixed completion count Job, you should set .spec.completions to the number of completions needed. You can set .spec.parallelism, or leave it unset and it will default to 1.
For a work queue Job, you must leave .spec.completions unset, and set .spec.parallelism to a non-negative integer (a minimal sketch of this variant follows below).
For more information about how to make use of the different types of job, see the job patterns section.
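Here is a minimal sketch of the work-queue variant (option 2 above) created with the Python client; the image and names are placeholders, and the workers themselves must notice when one of them has found a suitable hash and exit, since the remaining Pods are not killed automatically:

```python
from kubernetes import client, config

config.load_kube_config()

# Work-queue style Job: .spec.completions is left unset and only .spec.parallelism
# is given, so the Job is complete once one Pod succeeds and all Pods have terminated.
job = client.V1Job(
    metadata=client.V1ObjectMeta(name="hash-search"),                     # placeholder name
    spec=client.V1JobSpec(
        parallelism=10,                                                    # n worker Pods
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="worker",
                        image="registry.example.com/hash-guesser:latest",  # placeholder image
                    )
                ],
            )
        ),
    ),
)

client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```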
You can also take a look at a single Job which starts a controller Pod:
This pattern is for a single Job to create a Pod which then creates other Pods, acting as a sort of custom controller for those Pods. This allows the most flexibility, but may be somewhat complicated to get started with and offers less integration with Kubernetes.
One example of this pattern would be a Job which starts a Pod which runs a script that in turn starts a Spark master controller (see spark example), runs a spark driver, and then cleans up.
An advantage of this approach is that the overall process gets the completion guarantee of a Job object, but complete control over what Pods are created and how work is assigned to them.
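As a rough sketch of that controller pattern, under the same placeholder names as above, the script below would run inside the single Job's Pod, create the bare worker Pods itself, and exit successfully as soon as the first worker succeeds (which is what your use case needs). The controller's ServiceAccount would need permission to create, list, watch, and delete Pods.

```python
from kubernetes import client, config, watch

config.load_incluster_config()     # this script runs inside the controller Pod
core = client.CoreV1Api()
NAMESPACE, N_WORKERS = "default", 10

# Create n bare worker Pods, labelled so we can select them later.
for i in range(N_WORKERS):
    core.create_namespaced_pod(
        namespace=NAMESPACE,
        body=client.V1Pod(
            metadata=client.V1ObjectMeta(
                name=f"hash-worker-{i}",
                labels={"app": "hash-worker"},
            ),
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[
                    client.V1Container(
                        name="worker",
                        image="registry.example.com/hash-guesser:latest",  # placeholder image
                    )
                ],
            ),
        ),
    )

# Block until the first worker Pod reaches phase Succeeded.
w = watch.Watch()
for event in w.stream(core.list_namespaced_pod, namespace=NAMESPACE,
                      label_selector="app=hash-worker"):
    if event["object"].status.phase == "Succeeded":
        print("winner:", event["object"].metadata.name)
        w.stop()
        break

# Clean up the remaining workers; the Job then completes because this Pod exits 0.
core.delete_collection_namespaced_pod(namespace=NAMESPACE, label_selector="app=hash-worker")
```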
At the same time, take into consideration that the completion status of a Job is, by default, set when the specified number of successful completions is reached; this ensures that all tasks are processed properly. Marking the Job complete before all tasks are finished is not a safe solution.
You should also know that finished Jobs are usually no longer needed in the system. Keeping them around in the system will put pressure on the API server. If the Jobs are managed directly by a higher level controller, such as CronJobs, the Jobs can be cleaned up by CronJobs based on the specified capacity-based cleanup policy.
Here is the official documentation: jobs-parallel-processing, parallel-jobs.
A useful blog post: article-parallel job.
EDIT:
Another option is to create a special script which continuously checks for the values you are looking for. A Job is then not necessary; you can simply use a Deployment.

AWS Fargate vs Batch vs ECS for a once-a-day batch process

I have a batch process, written in PHP and embedded in a Docker container. Basically, it loads data from several web services, does some computation on the data (for about an hour), and posts the computed data to another web service; then the container exits (with a return code of 0 if OK, 1 if there was a failure somewhere in the process). During the process, some logs are written to STDOUT or STDERR. The batch must be triggered once a day.
I was wondering what the best AWS service is to schedule, execute, and monitor my batch process:
At the very beginning, I used an EC2 machine with a crontab: no high availability there, so I decided to switch to a more PaaS approach.
Then I used Elastic Beanstalk for Docker, with a non-functional web server (only there to answer the health check) and a crontab inside the container to wake up my batch command once a day. With an auto-scaling rule of min=1, max=1, I have HA (if the container or the VM crashes, it is restarted by AWS).
But now, to be more efficient, I decided to move to ECS, with an approach where I do not need EC2 instances awake 23 hours a day for nothing. So I tried Fargate:
With Fargate I defined my task (Fargate launch type, not EC2) and configured everything on it.
I created a cluster to run my task: I can run the task "by hand, one time", so I know all the settings are correct.
Now, going deeper into Fargate, I want to have my task executed once a day.
It seems to work fine when I use the Scheduled Task feature of ECS: the container starts on time, the process runs, then the container stops. But CloudWatch is missing some metrics: CPUReservation and CPUUtilization are not reported. Also, there is no way to know if the batch exited with code 0 or 1 (every execution ends with status "STOPPED"). So I can't send a CloudWatch alarm if the container execution failed.
I also tried the "Services" feature of Fargate, but it can't handle a batch process, because the container is restarted every time it stops. This is normal, because the container does not run a daemon. There is no way to schedule a service. I want my container to be active only when it needs to work (once a day, for at most 1 hour). But with a service, the previously missing metrics are correctly reported in CloudWatch.
Here is my question: what are the most suitable AWS managed services to trigger a container once a day, let it run its task, and provide reporting to track execution (CPU usage, batch duration), including an alarm (SNS) when the task fails?
We had the same issue with identifying failed jobs. I propose you take a look at AWS Batch, where logs for FAILED jobs are available in CloudWatch Logs; take a look here.
One more thing you should consider is total cost of ownership of whatever solution you choose eventually. Fargate, in this regard, is quite expensive.
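If you go the AWS Batch route, here is a hedged sketch of what submitting the daily run and reading back the exit status could look like with boto3; the job queue and job definition names are placeholders:

```python
import boto3

batch = boto3.client("batch")

# Submit the daily run; the job definition would wrap the existing Docker image.
submitted = batch.submit_job(
    jobName="daily-php-batch",
    jobQueue="daily-batch-queue",        # placeholder job queue
    jobDefinition="php-batch-job",       # placeholder job definition
)

# Later (e.g. from a monitoring script), the per-attempt container exit code is
# available, so a non-zero exit can drive an alarm or an SNS notification.
job = batch.describe_jobs(jobs=[submitted["jobId"]])["jobs"][0]
print(job["status"], [a["container"].get("exitCode") for a in job.get("attempts", [])])
```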
This may be too late for your project, but I thought it could still benefit others.
Have you had a look at AWS Step Functions? It is possible to define a workflow, start tasks on ECS/Fargate (or jobs on EKS, for that matter), wait for the results, and raise alarms/send emails...

ECS instance scaling to/from 0

I have a service which runs 1 task. The task takes 2 hours to run and runs daily. My ideal scenario would be this:
1. I update my service from 0 desired tasks to 1 desired task (steps 1 and 3 are sketched below).
2. ECS sees that in order to run the service it needs an EC2 instance, so it spins up an instance to run the task.
3. When the task finishes, it updates the service back to 0 desired tasks.
4. ECS sees that I don't need the instance to run 0 tasks and turns it off.
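Steps 1 and 3 are each a single API call on the service; here is a sketch with boto3, using placeholder cluster and service names:

```python
import boto3

ecs = boto3.client("ecs")

def set_desired_count(count: int) -> None:
    # Cluster and service names below are placeholders.
    ecs.update_service(
        cluster="daily-task-cluster",
        service="daily-task-service",
        desiredCount=count,
    )

set_desired_count(1)   # step 1: kick off the daily run
# ... the task runs for ~2 hours ...
set_desired_count(0)   # step 3: stop the service from respawning the task
```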
From the ECS console it looks like this is possible, but in reality, when I scale my service from 0 to 1 task, it just complains that there are no instances to run the task, rather than auto-scaling an instance. I set the cluster's auto-scaling policy to min=0, desired=1, max=1, but it makes no difference.
I'd like to know if my ideal scenario is indeed possible, or if there is a better way to achieve this goal.
Thanks in advance,
Unfortunately, points 2 and 4 do not hold for ECS (EC2 launch type). By default it will neither launch an EC2 instance nor terminate one.
Actually, Fargate is more costly than ECS (EC2 launch type) in general. But for your use case, Fargate will be much cheaper [1] compared to ECS (EC2 launch type), since you only pay while the task is running.
But again, Fargate would not be the best option. For your use case, the best option would be AWS Batch [2]. Batch uses ECS as a back end, and the main advantage of using Batch is that it will also perform steps 2 and 4 from your scenario.
[1] https://aws.amazon.com/fargate/pricing/
[2] https://docs.aws.amazon.com/batch/latest/userguide/what-is-batch.html