Run task definition after stack creation - aws-cloudformation

The question seems simple enough. I have a bunch of task definitions and a cluster in my CloudFormation template. When setting things up manually, I would create a task based on one of the definitions and give it a CRON schedule; it would then start to run.
I can't seem to find this option in CF. I found Service, but that only works for tasks that run indefinitely, which mine do not (they run once per day for approx. 10-20 minutes).
After some research I found out about AWS::Events::Rule, which people seem to use only in conjunction with Lambda, which I do not. I was unable to find any example that references FARGATE tasks, so I'm not sure it's even possible.
If anyone has any examples of running tasks in CRON using CF, that would be great.

I think that ECS scheduled tasks (cron) would suit you:
Amazon ECS supports the ability to schedule tasks on either a cron-like schedule or in response to CloudWatch Events. This is supported for Amazon ECS tasks using both the Fargate and EC2 launch types.
This is based on CloudWatch Events, which can be used to schedule many things, not only Lambda.
To set it up using CloudFormation, you can use AWS::Events::Rule with a target that specifies EcsParameters.
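For example, here is a minimal (hedged) template sketch. MyTaskDefinition, EcsCluster, EventsInvokeRole, and the subnet ID are placeholders for resources defined elsewhere in your template, and the role must allow events.amazonaws.com to call ecs:RunTask:

```yaml
# Runs the task once per day at 06:00 UTC on Fargate.
ScheduledTaskRule:
  Type: AWS::Events::Rule
  Properties:
    ScheduleExpression: cron(0 6 * * ? *)
    State: ENABLED
    Targets:
      - Id: daily-fargate-task
        Arn: !GetAtt EcsCluster.Arn            # for ECS targets, the rule targets the cluster ARN
        RoleArn: !GetAtt EventsInvokeRole.Arn  # role needs ecs:RunTask permission
        EcsParameters:
          TaskDefinitionArn: !Ref MyTaskDefinition
          TaskCount: 1
          LaunchType: FARGATE
          NetworkConfiguration:
            AwsVpcConfiguration:               # required for awsvpc/Fargate tasks
              Subnets:
                - subnet-0123456789abcdef0     # placeholder subnet ID
              AssignPublicIp: ENABLED          # so the task can pull its image
```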

Related

Trigger a PHP script to run on each Fargate task in an ECS service

What is the best way to trigger a PHP script on all running Fargate tasks for an ECS service?
I need to trigger this from another PHP script on one of the ECS tasks.
The reason I need to do this is that I have NGINX FastCGI Cache on these individual ECS tasks and need to purge the cache for all tasks when an admin makes an update in the CMS.
My NGINX configuration has a /purge/[path] endpoint that will purge the cache using fastcgi_cache_purge, and I do currently have a somewhat working hacky solution that looks something like this:
Admin saves in the CMS
PHP snippet runs that:
a) Fetches the number of running ECS tasks using AWS PHP SDK
b) Runs a for loop for the number of ECS tasks running
c) Calls the /purge/[path] URL using cURL (tries multiple times as it sometimes hits the same ECS task)
The above works but is not optimal.
Here are some other solutions that come to mind, but I can't find much information online on how to implement:
Would it be possible perhaps to change the fastcgi_cache_path to a shared file system like AWS EFS, or does that hurt performance not being tmpfs? I see that there's tmpfs support for ECS, but once mounted, would it be shared across the multiple ECS tasks or individually per ECS task?
Using AWS SNS/SQS (write one PHP script that subscribes and another one that publishes an event)
Use Redis pub/sub similar to above (also not sure exactly how to implement and how to start a long-running subscriber when ECS tasks start)
Use ECS Exec and AWS PHP SDK (almost have a working solution that fetches all ECS tasks and loops them through, and executes an "ECS Exec" command, but "--non-interactive" is not available yet, so it doesn't work. Only "--interactive" mode works currently)
Is there an easier/better solution for this? If using any of the above implementations, can someone point me in the right direction on how to implement this using PHP?
Thanks!
This honestly seems like a duplicate of your previous question, but I'll answer:
Would it be possible perhaps to change the fastcgi_cache_path to a shared file system like AWS EFS, or does that hurt performance not being tmpfs? I see that there's tmpfs support for ECS, but once mounted, would it be shared across the multiple ECS tasks or individually per ECS task?
EFS wouldn't be a good option for this because EFS is SLOW. It would kill any performance benefit of having a cache.
A tmpfs on Fargate will not be shared among instances.
Using AWS SNS/SQS (write one PHP script that subscribes and another one that publishes an event)
SNS could work, as discussed in your previous question. SQS would not work at all, since an SQS message is only delivered to one consumer, and you want it to be delivered to all instances.
Use Redis pub/sub similar to above (also not sure exactly how to implement and how to start a long-running subscriber when ECS tasks start)
Yes, this would work, as discussed in your previous question, but of course it would require a lot of custom coding.
Use ECS Exec and AWS PHP SDK (almost have a working solution that fetches all ECS tasks and loops them through, and executes an "ECS Exec" command, but "--non-interactive" is not available yet, so it doesn't work. Only "--interactive" mode works currently)
This should work. Why do you state that --non-interactive is not available yet? If you just leave off the --interactive parameter, then the command is executed non-interactively.

Google Kubernetes API Cron Job

I have a cluster in Google Kubernetes Engine. In that cluster there is a workload that runs every 4 hours; it's a cron job that was set up by someone. I want to be able to run it whenever I need it. I am trying to achieve this through the Google Kubernetes API, sending a request from my app whenever a button is clicked to run that cron job, but unfortunately the API has no apparent way to do that, or perhaps no way at all. What would be some good advice to achieve my goal?
This is a Community Wiki answer, posted for better visibility, so feel free to edit it and add any additional details you consider important.
The CronJob resource in Kubernetes is not meant to be used for one-off tasks that are run on demand. Rather, it is configured to run on a regular schedule.
Manuel Polacek has already mentioned that in his comment:
For this scenario you don't need a cron job. A simple bare pod or a job would be enough, I would say. You can apply a resource on button push, for example with kubectl – Manuel Polacek Apr 24 at 19:25
So rather than trying to find a way to run your CronJob on demand, regardless of how it is originally scheduled (usually to be repeated at regular intervals), you should copy the relevant parts of its spec into a different kind of resource and run that. A Job fits such a use case ideally, as it is designed to run one-off tasks.
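For example, a minimal Job manifest built from a CronJob's pod spec might look like the sketch below (the name, image, and command are placeholders). Note that kubectl can also create a one-off Job directly from an existing CronJob with kubectl create job --from=cronjob/<cronjob-name> <job-name>:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: manual-run                   # hypothetical name
spec:
  backoffLimit: 2                    # retry the pod up to 2 times on failure
  template:
    spec:
      restartPolicy: Never           # Jobs may not use restartPolicy: Always
      containers:
        - name: task
          image: registry.example.com/task:latest  # placeholder image
          command: ["/bin/run-task"]               # placeholder command
```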

Task definitions AWS Fargate

Let us say I am defining a task definition in AWS Fargate that would be used to start up tasks for a multi-container application consisting of 2 web servers. How many task definitions would I need, how many tasks would I pay for, and how many services are created?
I have read a lot of documentation, but it does not click for me. Can anyone explain the relationship between task definitions, tasks, Docker containers, services, and ECS Fargate clusters?
A task definition is a specification. You use it to define one or more containers (with image URIs) that you want to run together, along with other details such as environment variables, CPU/memory requirements, etc. The task definition doesn't actually run anything; it's a description of how things will be set up when something does run.
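For instance, a hedged CloudFormation sketch of a single task definition covering both web servers (the family name, images, and sizes are placeholders):

```yaml
# One task definition describing two containers that run together.
MyTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Family: two-web-servers              # hypothetical family name
    RequiresCompatibilities: [FARGATE]
    NetworkMode: awsvpc                  # required for Fargate
    Cpu: "512"                           # CPU/memory apply to the whole task
    Memory: "1024"
    ContainerDefinitions:
      - Name: web-a
        Image: registry.example.com/web-a:latest   # placeholder image
        PortMappings:
          - ContainerPort: 80
      - Name: web-b
        Image: registry.example.com/web-b:latest   # placeholder image
        PortMappings:
          - ContainerPort: 8080
```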
A task is an actual thing that is running. ECS uses the task definition to run the task: it downloads the container images and configures the runtime environment based on the other details in the task definition. You can run one or many tasks from any given task definition. Each running task is a set of one or more running containers; the containers in a task all run on the same instance.
A service in ECS is a way to run N tasks that all use the same task definition, and to keep those N tasks running if they happen to shut down unexpectedly. Those N tasks can run on different instances in EC2 (although some may run on the same instance, depending on the placement strategy used for the service); on Fargate, there are no instances and the tasks "just run", so you don't have to think about placement strategies. You can also use services to connect those tasks to a load balancer, so that requests from a client inside or outside of AWS can be routed evenly across all N tasks. You can update the task definition used by a service, which triggers a rolling update (starting up and shutting down running tasks) so that all running tasks use the new version of the task definition after the deployment completes. This is used, for example, when you create a new container image and want your service to pick up the latest version.
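As a hedged illustration, a minimal service that keeps three copies of the task definition above running on Fargate (resource names and the subnet ID are placeholders):

```yaml
MyService:
  Type: AWS::ECS::Service
  Properties:
    Cluster: !Ref EcsCluster           # defined elsewhere; really just a name
    TaskDefinition: !Ref MyTaskDefinition
    DesiredCount: 3                    # ECS keeps 3 tasks running
    LaunchType: FARGATE
    NetworkConfiguration:
      AwsvpcConfiguration:
        Subnets:
          - subnet-0123456789abcdef0   # placeholder subnet ID
```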
A service is scoped to a cluster. A cluster is really just a name. Different clusters can have different IAM policies and roles, so that you can restrict who can create services in different clusters using IAM.

How to schedule jobs in Kubeflow?

I'm setting up a Kubeflow cluster on AWS EKS. Is there a native way in Kubeflow to automatically schedule jobs (e.g. run the workflow every X hours, get data every X hours, etc.)?
I have tried looking at other tools like Airflow, but I'm not really sure whether it would integrate well with the Kubeflow environment.
That is what a recurring run is for. A recurring run uses a run trigger, which has a cron field for specifying cron semantics when scheduling runs.

How to run multiple Kubernetes jobs in sequence?

I would like to run a sequence of Kubernetes jobs one after another. It's okay if they are run on different nodes, but it's important that each one run to completion before the next one starts. Is there anything built into Kubernetes to facilitate this? Other architecture recommendations also welcome!
This requirement to add control flow, even if it's a simple sequential flow, is outside the scope of Kubernetes native entities as far as I know.
There are many workflow engine implementations for Kubernetes; most of them focus on solving CI/CD but are generic enough for you to use however you want.
Argo: https://applatix.com/open-source/argo/
Adds a custom resource definition (CRD) to Kubernetes for a Workflow entity; see the sketch after this answer.
Brigade: https://brigade.sh/
Takes a more serverless-like approach and is built on JavaScript, which is very flexible.
Codefresh: https://codefresh.io
Has a unique approach where you can use the SaaS to get started easily, without complicated installation and maintenance, and you can point Codefresh at your Kubernetes nodes to run the workflows on.
Feel free to Google for "Kubernetes Workflow", and discover the right platform for yourself.
Disclaimer: I work at Codefresh
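To illustrate the sequential flow with Argo specifically, here is a minimal hedged Workflow sketch (names and images are placeholders); each single-entry step group starts only after the previous one has completed:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: sequential-jobs-     # hypothetical name prefix
spec:
  entrypoint: run-in-order
  templates:
    - name: run-in-order
      steps:                         # each outer list item runs after the previous one
        - - name: step-one
            template: job-one
        - - name: step-two           # starts only when step-one completes
            template: job-two
    - name: job-one
      container:
        image: registry.example.com/job-one:latest   # placeholder image
    - name: job-two
      container:
        image: registry.example.com/job-two:latest   # placeholder image
```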
I would try using CronJobs and setting the concurrency policy to Forbid, so that concurrent jobs don't run.
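For example, a minimal CronJob sketch with that policy (the name, schedule, and image are placeholders):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: no-overlap-cron              # hypothetical name
spec:
  schedule: "0 * * * *"              # hourly; adjust to taste
  concurrencyPolicy: Forbid          # skip a run if the previous one is still going
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: task
              image: registry.example.com/task:latest   # placeholder image
```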
I have worked with IBM TWS (Workload Automation), which is a scheduler similar to cron where you can specify dependencies between jobs.
You can specify that a job runs only after its dependencies have run, using the follows keyword.