Reversing the flow of jobs in a workflow - GitHub

I'm working on some Terraform logic and using GitHub workflows to deploy multiple components sequentially, e.g. job2 (the ALB) depends on the completion of job1 (creation of the VPC). This works fine during the apply phase. However, if I delete the infrastructure with terraform destroy, the sequence of jobs fails, because job1 (the VPC) can't succeed while the resources from job2 (the ALB) still exist.
Is there a way to run the workflow in a bottom-up order based on an input?
I know we can leverage Terraform to deploy these components and handle the dependencies at the Terraform level; this is just an example of the use case I'm working on.

You can control the flow of jobs by using the keyword “needs”. Read the docs here: https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idneeds
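For the bottom-up case specifically, one option is to keep two job chains in the same workflow and select one through a workflow_dispatch input, reversing the needs order in the destroy chain. A minimal sketch, assuming a hypothetical action input and placeholder terraform steps (job names and commands are illustrative, not from your setup):

```yaml
name: terraform-sequence
on:
  workflow_dispatch:
    inputs:
      action:
        description: "apply or destroy"
        required: true
        default: "apply"

jobs:
  # apply path: VPC first, then ALB
  apply-vpc:
    if: ${{ github.event.inputs.action == 'apply' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "terraform apply the VPC here"
  apply-alb:
    if: ${{ github.event.inputs.action == 'apply' }}
    needs: apply-vpc
    runs-on: ubuntu-latest
    steps:
      - run: echo "terraform apply the ALB here"

  # destroy path: same components with the needs order reversed (ALB first, then VPC)
  destroy-alb:
    if: ${{ github.event.inputs.action == 'destroy' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "terraform destroy the ALB here"
  destroy-vpc:
    if: ${{ github.event.inputs.action == 'destroy' }}
    needs: destroy-alb
    runs-on: ubuntu-latest
    steps:
      - run: echo "terraform destroy the VPC here"
```

Each job carries its own if condition, so only the chain matching the input actually runs, and needs enforces the order within that chain.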

Related

How to run multiple Copy Files task in a Azure DevOps Release pipeline simultaneously with Custom Conditions?

I am using Azure DevOps Server 2020 and I have a release pipeline with around 21 Copy Files tasks that copy the output of multiple microservices to different target paths, and completing the release pipeline takes around 23 minutes.
I want to optimize the release pipeline and save some time, so I am thinking of running all the copy tasks simultaneously.
Under each copy task, in the Control Options section, I see a "Run this task" option where we can define custom conditions, but I am not sure which custom conditions I need to define so that all my copy tasks get executed in parallel.
Could anyone please let me know what custom conditions will allow all the copy tasks to be executed in one go?
Currently it is not possible to have tasks run in parallel. It has been raised as a suggestion here, but the feature hasn't been implemented.
Just as TheWinterCoder pointed out, it is currently not possible to have tasks run in parallel.
But, as a workaround, you could divide the copy tasks across several different jobs and make those jobs run in parallel.
Note that this requires multiple agents to be available in the local agent pool.
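If moving the release to a YAML pipeline is an option, the same workaround looks roughly like the sketch below: jobs with no dependsOn between them run in parallel, one per available agent. The pool name, job names and paths are placeholders, not taken from your setup.

```yaml
jobs:
  - job: CopyServiceA
    pool:
      name: Default          # local agent pool; needs enough idle agents
    steps:
      - task: CopyFiles@2
        inputs:
          SourceFolder: '$(System.DefaultWorkingDirectory)/ServiceA'
          TargetFolder: '\\fileshare\ServiceA'

  - job: CopyServiceB        # no dependsOn, so it runs alongside CopyServiceA
    pool:
      name: Default
    steps:
      - task: CopyFiles@2
        inputs:
          SourceFolder: '$(System.DefaultWorkingDirectory)/ServiceB'
          TargetFolder: '\\fileshare\ServiceB'
```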

Azure terraform pipeline

I hope somebody can help me to solve this issue and understand how to implement the best approach.
I have a production environment running tons of Azure services (SQL Server, databases, web apps, etc.).
All of that infrastructure was created with Terraform. As powerful as it is, I am terrified of using it in a pipeline for one reason:
Some of my friends often make changes to the infrastructure manually, and since those changes are not in my Terraform state, automating this process might destroy resources ungracefully, which is something I don't want to face.
So I was wondering if anyone can shed some light on the following question:
Is it possible to automate Terraform to check the infrastructure state on every push to GitHub, and to quit if the plan reports any changes?
Edited to make my example clearer:
Let's say I have a Terraform state containing 2 web apps, and somebody manually created a 3rd web app in that resource group, developed some code, and pushed it to GitHub. My pipeline triggers, and as a first step Terraform runs terraform plan and/or terraform apply. If that command reports any changes, I want the pipeline to quit (fail) so I know there is something new there; if it reports no changes, the infrastructure is up to date and the pipeline can continue with the code deployment.
Thank you in advance for any help and clarification.
Yes, you can just run
terraform plan -detailed-exitcode
The exit code is 0 when there are no changes, 1 on error, and 2 when there are changes, so any non-zero exit code tells you the plan did not come back clean. See here for details.
Let me point out that I would highly advise you to lock down your prod environment so that nobody can make manual changes! Your CI/CD pipeline should be the only way to make changes there.
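As a rough illustration of how the -detailed-exitcode check could be wired into a pipeline that runs on every push, here is a hedged GitHub workflow sketch; it assumes Terraform is already available on the runner, and the step layout is only an example:

```yaml
name: drift-check
on: [push]

jobs:
  plan:
    runs-on: ubuntu-latest   # assumes terraform is installed on the runner
    steps:
      - uses: actions/checkout@v4
      - run: terraform init -input=false
      # -detailed-exitcode: 0 = no changes, 1 = error, 2 = changes present.
      # Any non-zero exit fails this step, which fails the job and
      # prevents the deployment stages from running.
      - run: terraform plan -detailed-exitcode -input=false
```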
Adding to the above answer, you can also make use of the terraform import command to import the remote changes into your state file. The terraform import command is used to import existing resources into Terraform. Afterwards, run terraform plan to check whether the state is in sync.
Refer: https://www.terraform.io/docs/cli/commands/import.html

Use Airflow to run parametrized jobs on-demand and with a schedule

I have a reporting application that uses Celery to process thousands of jobs per day. There is a Python module for each report type that encapsulates all the job steps. Jobs take customer-specific parameters and typically complete within a few minutes. Currently, jobs are triggered by customers on demand when they create a new report or request a refresh of an existing one.
Now, I would like to add scheduling, so the jobs run daily, and reports get refreshed automatically. I understand that Airflow shines at task orchestration and scheduling. I also like the idea of expressing my jobs as DAGs and getting the benefit of task retries. I can see how I can use Airflow to run scheduled batch-processing jobs, but I am unsure about my use case.
If I express my jobs as Airflow DAGs, I will still need to run them parametrized for each customer. That means if a customer creates a new report, I will need a way to trigger a DAG with the customer-specific configuration. And for scheduled execution, I will need to enumerate all customers and create a parametrized (sub-)DAG for each of them. My understanding is that this should be possible, since Airflow supports dynamically created DAGs; however, I am not sure whether this is an efficient and correct way to use Airflow.
I wonder if anyone has considered using Airflow for a scenario similar to mine.
Celery workflows do literally the same, and you can create and run them at any point in time. Also, Celery has a pretty good scheduler (I have never seen it fail in 5 years of using Celery) - Celery Beat.
Sure, Airflow can be used to do what you need without any problems.
You can use Airflow to create DAGs dynamically; I am not sure if this will work at a scale of 1000s of DAGs though. There are some good examples on astronomer.io on Dynamically Generating DAGs in Airflow.
I have some DAGs and tasks that are dynamically generated from a YAML configuration with different schedules and settings. It all works without any issue.
The only thing that might be challenging is the "jobs are triggered by customers on-demand" part - I guess you could trigger any DAG with Airflow's REST API, but it's still in an experimental state.
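To make the YAML-driven approach more concrete, here is a purely hypothetical per-customer configuration that a small DAG-factory script could loop over at parse time to create one parametrized DAG per entry; every field, name and schedule below is made up:

```yaml
# reports.yaml - one entry per customer report; a DAG-factory script
# reads this file and generates one DAG per entry
reports:
  - dag_id: sales_report_customer_a
    report_type: sales
    customer_id: customer_a
    schedule: "0 6 * * *"        # daily refresh at 06:00
    params:
      region: eu-west-1
      currency: EUR
  - dag_id: churn_report_customer_b
    report_type: churn
    customer_id: customer_b
    schedule: "30 6 * * *"
    params:
      lookback_days: 30
```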

Rundeck operation from Slack

I'm running a Terraform script from Rundeck. When I run terraform plan, the complete output should go to Slack. If everything is fine, I need to approve it in Slack; then it should run terraform apply.
You can design a job that executes the terraform plan step with a Slack notification and, using this app, call another job that executes terraform apply (in the same way as the first job). When calling your Terraform scripts, it may be a good idea to use the -auto-approve argument to avoid interactive prompts on Rundeck; another alternative is to use expect to drive your Terraform scripts.

How to run multiple Kubernetes jobs in sequence?

I would like to run a sequence of Kubernetes jobs one after another. It's okay if they are run on different nodes, but it's important that each one run to completion before the next one starts. Is there anything built into Kubernetes to facilitate this? Other architecture recommendations also welcome!
This requirement to add control flow, even if it's a simple sequential flow, is outside the scope of Kubernetes native entities as far as I know.
There are many workflow engine implementations for Kubernetes; most of them focus on solving CI/CD but are generic enough for you to use however you want.
Argo: https://applatix.com/open-source/argo/
Adds a custom resource definition (CRD) to Kubernetes for a Workflow entity (see the sketch at the end of this answer)
Brigade: https://brigade.sh/
Takes a more serverless-like approach and is built on JavaScript, which is very flexible
Codefresh: https://codefresh.io
Has a unique approach where you can use the SaaS to easily get started without complicated installation and maintenance, and you can point Codefresh at your Kubernetes nodes to run the workflow on.
Feel free to Google for "Kubernetes Workflow", and discover the right platform for yourself.
Disclaimer: I work at Codefresh
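As an illustration of the Argo option mentioned above, a Workflow that runs two steps strictly one after the other (each step runs as its own pod) could look roughly like this; the image names and commands are placeholders:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: sequential-jobs-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:                     # each outer list item is a sequential step group
        - - name: first
            template: run-first
        - - name: second         # starts only after "first" completes
            template: run-second
    - name: run-first
      container:
        image: busybox
        command: [sh, -c, "echo running job 1"]
    - name: run-second
      container:
        image: busybox
        command: [sh, -c, "echo running job 2"]
```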
I would try to use CronJobs and set the concurrency policy to Forbid so it doesn't run concurrent jobs.
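For reference, a minimal sketch of that policy on a CronJob; the schedule, image and command are placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: sequential-job
spec:
  schedule: "0 * * * *"
  concurrencyPolicy: Forbid      # skip a new run while the previous one is still active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: busybox
              command: ["sh", "-c", "echo doing work"]
```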
I have worked with IBM TWS (Workload Automation), which is a scheduler similar to cron where you can declare the dependencies between jobs.
You can specify that a job runs only after its dependencies have run, using the follows keyword.