Stop GitHub Jobs in Progress if Another Failed (stop on fail) - github

TL;DR: Jobs a and b run in parallel. If a fails, I want to stop b while it is still running.
My company uses GitHub Actions to deploy our code.
The first step in our deployment is building Docker images and pushing them to DockerHub.
We wrote a test for our code, which we want to run in parallel with building the images.
These are two separate jobs, and a few more jobs depend on the success of both.
Right now, if the test job fails, the build job keeps running, even though the downstream jobs will not run because the test job failed.
What I would like to do is cancel the Docker build job while it's running if the test job fails.
Is that possible? After searching the web, StackOverflow and the GitHub Actions page, I haven't found a way to do that.
Thanks!

You can specify the needs option and refer to the job name. See: https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idneeds
An example could be something like:
jobs:
  build:
    ...
  deploy:
    needs: build
    ...

You can use the Cancel this build action.
The basic idea is to add it as a final step in each of your jobs that you want to cause a short-circuit in case of failure:
jobs:
  job_a:
    runs-on: ubuntu-latest
    steps:
      - run: |
          echo 'I am failing'
          exit 1
      - name: Cancelling parallel jobs
        if: failure()
        uses: andymckay/cancel-action@0.2
  job_b:
    runs-on: ubuntu-latest
    steps:
      - run: echo 'long task'
This will basically cancel job_b or any other in the same workflow whenever job_a fails.

Since you are working on an enterprise project, I would avoid unverified actions from public repositories, no matter how many stars they have. Instead, you can add a step to the end of each of jobs a and b that runs only if a previous step failed. When it runs, it sends a cancel-workflow API call.
- if: failure()
  name: Check Job Status
  uses: actions/github-script@v6
  env:
    RUN_ID: ${{ github.run_id }}
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  with:
    script: |
      const runId = process.env.RUN_ID
      const [owner, repo] = process.env.GITHUB_REPOSITORY.split("/");
      // The REST endpoint expects the parameter to be named run_id
      const resp = await github.rest.actions.cancelWorkflowRun({
        owner,
        repo,
        run_id: runId
      })
Note: You may need to supply a custom personal access token (PAT), since this API call may require higher permissions than the default token has. I also suggest taking a look at this post, which I found quite useful.
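As an alternative to a custom PAT, the default token can be granted the extra permission on the job itself. The following is a minimal sketch, not a tested setup: the job names, `./run-tests.sh` and `./build-images.sh` are hypothetical placeholders, and it assumes your organization allows raising `GITHUB_TOKEN` permissions in the workflow file. It uses `context.repo` and `context.runId`, which github-script exposes, instead of environment variables.

```yaml
name: Cancel on failure (sketch)
on: [push]

jobs:
  test:
    runs-on: ubuntu-latest
    permissions:
      actions: write   # needed so the default GITHUB_TOKEN may cancel a run
      contents: read   # still needed for checkout once permissions are listed
    steps:
      - uses: actions/checkout@v4
      - run: ./run-tests.sh          # hypothetical test command
      - name: Cancel the whole run if a previous step failed
        if: failure()
        uses: actions/github-script@v6
        with:
          script: |
            await github.rest.actions.cancelWorkflowRun({
              owner: context.repo.owner,
              repo: context.repo.repo,
              run_id: context.runId,
            });

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./build-images.sh       # hypothetical long-running build
```

Cancelling the run stops every job that is still in progress, so `build` is interrupted as soon as `test` fails. The same final step would have to be added to `build` if a build failure should also stop `test`.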

Related

How to avoid duplicated jobs in GitHub Actions, like building an image?

There are several workflows in a repository, all triggered by push. (Sometimes a push triggers 2 workflows, sometimes 5.)
But they share some common work, like 'build docker image' for integration tests.
I'm not sure how to avoid the duplicated work. I tried a reusable workflow, but that builds the image 2 or more times.
Is there any way, like the needs keyword, to make all triggered workflows depend on the same job result?
Check if your use case is similar to "Avoid re-running workflow for the same commit", with Samuel Ryan's workaround:
Yes, you can use the Check Runs API to identify Workflow runs for a ref.
As luck would have it, someone has already built a comprehensive Action for this use-case: fkirc/skip-duplicate-actions.
Add a new “pre” job to your Workflow, this job uses fkirc/skip-duplicate-actions to determine if your main job should be skipped
Add a condition to your main job using the should_skip output of the “pre” job.
For example, adapted from the fkirc/skip-duplicate-actions README:
jobs:
  pre_job:
    runs-on: ubuntu-latest
    outputs:
      should_skip: ${{ steps.skip_check.outputs.should_skip }}
    steps:
      - id: skip_check
        uses: fkirc/skip-duplicate-actions@v3.4.0
        with:
          skip_after_successful_duplicate: 'true'
  main_job:
    needs: pre_job
    if: ${{ needs.pre_job.outputs.should_skip != 'true' }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Running slow tests..." && sleep 30

GitHub Actions Job is skipped although all needs succeeded

We do have a problem with a GitHub Actions job which is always skipped although all "needed" jobs did run successfully. That's the job:
deploy-api:
  needs: [build-test-api, terraform-apply, set-deployment-env]
  uses: ./.github/workflows/workflow-api-deploy.yml
To verify that all needs did pass, I have added another job for debugging and printed the result of the needed jobs.
debug-deploy-api:
  runs-on: ubuntu-latest
  needs: [build-test-api, terraform-apply, set-deployment-env]
  if: always() # Had to add this, otherwise it would be skipped just as "deploy-api".
  steps:
    - run: |
        echo "Result of build-test-api: ${{ needs.build-test-api.result }}"
        echo "Result of terraform-apply: ${{ needs.terraform-apply.result }}"
        echo "Result of set-deployment-env: ${{ needs.set-deployment-env.result }}"
The output is
Result of build-test-api: success
Result of terraform-apply: success
Result of set-deployment-env: success
I don't understand why deploy-api is skipped.
Job began to be skipped after this change
The behavior started after adding a dependency to build-test-api:
With this version of build-test-api, the deploy job did run just fine:
build-test-api:
  uses: # reusable WF from internal repo
  needs: set-deployment-env
After changing it into
build-test-api:
  uses: # reusable WF from internal repo
  needs: [set-deployment-env, auto-versioning]
  if: |
    always() &&
    (needs.set-deployment-env.result == 'success') &&
    (needs.auto-versioning.result == 'success' || needs.auto-versioning.result == 'skipped')
deploy-api has always been skipped. Despite that change, build-test-api still runs fine and even attaches the created artifact to the workflow run.
Activating runner and step debug logging did not reveal any insights on why the job is still skipped. Any ideas?
Meanwhile I did contact the GitHub Premium Support and they provided a solution:
deploy-api:
  if: success('build-test-api') # This line is required, if any of the previous job did not end with status 'success'.
  needs: build-test-api
  uses: ./.github/workflows/48-reusable-workflow-2.yml
I think I also know why: The documentation says:
You can use the following status check functions as expressions in if conditionals. A default status check of success() is applied unless you include one of these functions.
And definition of success() is as follows:
Returns true when none of the previous steps have failed or been canceled.
The only issue, I think, is that it should be:
Returns true when none of the previous steps have failed, canceled or skipped.
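Another way to express the same intent mirrors the condition already used on build-test-api: replace the implicit success() check with always() and then require each needed result explicitly. This is a sketch under the assumption that the skip is caused by the skipped auto-versioning job further up the dependency chain; it has not been verified against the workflow in the question.

```yaml
jobs:
  deploy-api:
    needs: [build-test-api, terraform-apply, set-deployment-env]
    # always() switches off the default success() check, so a skipped
    # ancestor no longer skips this job; the explicit result checks
    # keep the job from running after a failed or cancelled dependency.
    if: |
      always() &&
      needs.build-test-api.result == 'success' &&
      needs.terraform-apply.result == 'success' &&
      needs.set-deployment-env.result == 'success'
    uses: ./.github/workflows/workflow-api-deploy.yml
```

The trade-off is verbosity: every job listed in needs has to be named in the condition, but the condition then states exactly which results are acceptable.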

Github Actions Job being skipped

Using Github Actions for some CI/CD.
Currently I am experiencing strange behavior where my jobs are being skipped despite the conditions being met. deploy-api has two conditions, if code was pushed to master and test-api was a success. But even though we are meeting those conditions, it is still being skipped.
jobs:
  test-api:
    name: Run tests on API
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Get dependencies
        run: npm install
        working-directory: ./api
      - name: Run tests
        run: npm run test
        working-directory: ./api
  deploy-api:
    needs: test-api # other job must finish
    if: github.ref == 'refs/heads/master' && needs.test-api.status == 'success' #only run if it's a commit to master AND previous success
As seen in the picture the second job is being skipped despite the push being on the master branch (as seen on the top) AND the previous job being successful.
Am I missing something in the code? Does anyone know of a workaround that can be used?
It would be nice if the UI told the user why it was skipped!
Use needs.test-api.result == 'success' (there is no .status) in the if expression.
See https://docs.github.com/en/actions/reference/context-and-expression-syntax-for-github-actions#needs-context.
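Applied to the workflow from the question, the corrected condition could look like the sketch below. The deploy steps were not shown in the question, so the `runs-on` and the single `echo` step of deploy-api are placeholders.

```yaml
jobs:
  test-api:
    name: Run tests on API
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Get dependencies
        run: npm install
        working-directory: ./api
      - name: Run tests
        run: npm run test
        working-directory: ./api

  deploy-api:
    needs: test-api
    # needs.<job_id>.result is one of success, failure, cancelled, skipped
    if: github.ref == 'refs/heads/master' && needs.test-api.result == 'success'
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying API"   # placeholder for the real deploy steps
```

Because `needs.test-api.status` does not exist, the original expression compared an empty value to 'success' and always evaluated to false, which is why the job was skipped without an error.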

Create dependencies between jobs in GitHub Actions

I'm new to GitHub Actions, playing with various options to work out good approaches to CI/CD pipelines.
Initially I had all my CI steps under one job, doing the following:
checkout code from repo
lint
scan source for vulnerabilities
build
test
create image
scan image for vulnerabilities
push to AWS ECR
Some of those steps don't need to be done in sequence though; e.g. we could run linting and source code vulnerability scanning in parallel with the build; saving time (if we assume that those steps are going to pass).
i.e. essentially I'd like my pipeline to do something like this:
job1 = {
- checkout code from repo #required per job, since each job runs on a different runner
- lint
}
job2 = {
- checkout code from repo
- scan source for vulnerabilities
}
job3 = {
- checkout code from repo
- build
- test
- create image
- scan image for vulnerabilities
- await job1 & job2
- push to AWS ECR
}
I have a couple of questions:
1. Is it possible to set up some await jobN rule within a job; i.e. to view the status of one job from another?
2. (only relevant if the answer to 1 is Yes): Is there any way to have the failure of one job immediately impact other jobs in the same workflow? i.e. If my linting job detects issues then I can immediately call this a fail, so would want the failure in job1 to immediately stop jobs 2 and 3 from consuming additional time, since they're no longer adding value.
Ideally, some of your jobs should be encapsulated in their own workflows, for example:
Workflow for testing the source by whatever means.
Workflow for (building and-) deploying.
and then, have these workflows depend on each other, or be triggered using different triggers.
Unfortunately, at least for the time being, workflow dependency is not an existing feature (reference).
Edit: Dependencies between workflows is now also possible, as discussed in this StackOverflow question.
Although I feel that including all of your mentioned jobs in a single workflow would create a long and hard to maintain file, I believe you can still achieve your goal by using some of the conditionals provided by the GitHub actions syntax.
Possible options:
jobs.<job_id>.if
jobs.<job_id>.needs
Using the latter, a sample syntax may look like this:
jobs:
  job1:
  job2:
    needs: job1
  job3:
    needs: [job1, job2]
And here is a workflow ready to be used for testing of the above approach. In this example, job 2 will run only after job 1 completes, and job 3 will not run, since it depends on a job that failed.
name: Experiment
on: [push]
jobs:
  job1:
    name: Job 1
    runs-on: ubuntu-latest
    steps:
      - name: Sleep and Run
        run: |
          echo "Sleeping for 10"
          sleep 10
  job2:
    name: Job 2
    needs: job1
    runs-on: ubuntu-latest
    steps:
      - name: Dependant is Running
        run: |
          echo "Completed job 2, but triggering failure"
          exit 1
  job3:
    name: Job 3
    needs: job2
    runs-on: ubuntu-latest
    steps:
      - name: Will never run
        run: |
          echo "If you can read this, the experiment failed"
Relevant docs:
Workflow syntax for GitHub Actions
Context and expression syntax for GitHub Actions
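For the pipeline described in the question, one possible layout is sketched below. Since a job cannot wait for another job halfway through its steps, the "await job1 & job2" point is modelled here as a separate push job that needs all three. All script paths are hypothetical placeholders for the real lint, scan, build and push commands.

```yaml
name: CI (sketch)
on: [push]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/lint.sh            # hypothetical

  scan-source:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/scan-source.sh     # hypothetical

  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/build-and-test.sh  # hypothetical: build + test
      - run: ./scripts/create-image.sh    # hypothetical
      - run: ./scripts/scan-image.sh      # hypothetical

  push:
    # Starts only after all three jobs above have succeeded
    needs: [lint, scan-source, build]
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/push-to-ecr.sh     # hypothetical
```

Note that each job starts on a fresh runner, so the image created in build is not automatically present in push; it would have to be handed over, for example as an artifact. This layout also does not stop build early when lint fails; it only prevents push from starting.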

How to run a github-actions step, even if the previous step fails, while still failing the job

I'm trying to follow an example Github has for testing my build with github actions, and then compressing the test results and uploading them as an artifact.
https://help.github.com/en/actions/automating-your-workflow-with-github-actions/persisting-workflow-data-using-artifacts#uploading-build-and-test-artifacts
I'm having trouble with what to do when my tests fail though. This is my action. When my tests pass everything works great, my results are zipped and exported as an artifact, but if my tests fail, it stops the rest of the steps in the job, so my results never get published.
I tried adding the continue-on-error: true https://help.github.com/en/actions/automating-your-workflow-with-github-actions/workflow-syntax-for-github-actions#jobsjob_idstepscontinue-on-error
This makes it continue after it fails and uploads my test results. but then the job is marked as passed, even though my test step failed. Is there some way to have it upload my artifact even if a step fails, while still marking the overall job as failed?
name: CI
on:
  pull_request:
    branches:
      - master
  push:
    branches:
      - master
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Test App
        run: ./gradlew test
      - name: Archive Test Results
        uses: actions/upload-artifact@v1
        with:
          name: test-results
          path: app/build/reports/tests
You can add
if: always()
to your step to have it run even if a previous step fails
https://docs.github.com/en/actions/learn-github-actions/expressions#status-check-functions
so for a single step it would look like this:
steps:
  - name: Build App
    run: ./build.sh
  - name: Archive Test Results
    if: always()
    uses: actions/upload-artifact@v1
    with:
      name: test-results
      path: app/build
Or you can add it to a job:
jobs:
  job1:
  job2:
    needs: job1
  job3:
    if: always()
    needs: [job1, job2]
Additionally, as pointed out below, using always() will cause the step to run even if the build is canceled.
If you don't want the step to run when you manually cancel a job, you can instead use:
if: success() || failure()
Alternatively, you can add continue-on-error: true.
It looks like this:
- name: Job fail
  continue-on-error: true
  run: |
    exit 1
- name: Next job
  run: |
    echo Hello
Read more here.
run a github-actions step, even if the previous step fails
If you only need to execute the step if it succeeds or fails, then:
steps:
  - name: Build App
    run: ./build.sh
  - name: Archive Test Results
    if: success() || failure()
    uses: actions/upload-artifact@v1
    with:
      name: test-results
      path: app/build
Why use success() || failure() instead of always()?
Reading the Status check functions documentation on Github:
always
Causes the step to always execute, and returns true, even when canceled. A job or step will not run when a critical failure prevents the task from running. For example, if getting sources failed.
Which means the step will run even when the job gets cancelled. If that's what you want, then go ahead. Otherwise, success() || failure() would be more suitable.
Note: The documentation was clarified thanks to Vladimir Panteleev, who submitted the following PR: GitHub Docs PR #8411
Addendum: consider the following situation. You have 2 steps, build > deploy, and in some cases (e.g. workflow_dispatch with input parameters) you might want to skip build and proceed with deploy. At the same time, you might want deploy to be skipped when build failed.
Logically, the deploy conditional would be something like "skipped or not failed".
if: always() will not work, because it will always trigger deploy, even if build failed.
The solution is pretty simple:
if: ${{ !failure() }}
Mind that you cannot omit the ${{ }} brackets when negating in if:, because it reports a syntax error.
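A minimal sketch of that situation, assuming build and deploy are separate jobs and the skip is controlled by a workflow_dispatch input. The input name `skip_build` and the `echo` commands are hypothetical placeholders.

```yaml
name: Build and deploy (sketch)
on:
  push:
  workflow_dispatch:
    inputs:
      skip_build:
        description: Skip the build job and deploy directly
        type: boolean
        default: false

jobs:
  build:
    # On push the input is empty, so the negation is true and build runs
    if: ${{ !inputs.skip_build }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Building"    # placeholder for the real build

  deploy:
    needs: build
    # Runs when build succeeded or was skipped, but not when it failed
    if: ${{ !failure() }}
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying"   # placeholder for the real deploy
```

Without the `if` on deploy, the default success() check would skip deploy whenever build is skipped.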
The other answers here are great and work, but you might want a little more granularity.
For instance, ./upload only if ./test ran, even if it failed.
However, if something else failed and prevented the tests from running, don't upload.
# ... Other steps
- run: ./test
  id: test
- run: ./upload
  if: success() || steps.test.conclusion == 'failure'
steps.*.conclusion will be success, failure, cancelled, or skipped.
success or failure indicate the step ran. cancelled or skipped means it didn't.
Note there is an important caveat that you must test at least one success() or failure() in if.
if: steps.test.conclusion == 'success' || steps.test.conclusion == 'failure' won't work as expected.
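If you prefer to keep both conclusion checks, one way to satisfy the caveat is to put a status check function in front of them. This is an untested sketch that reuses the `./test` and `./upload` placeholders from above.

```yaml
steps:
  - run: ./test
    id: test
  - run: ./upload
    # success() || failure() replaces the implicit success() check;
    # the conclusion checks then require that ./test actually ran.
    if: ${{ (success() || failure()) && (steps.test.conclusion == 'success' || steps.test.conclusion == 'failure') }}
```

If an earlier step fails and ./test is skipped, its conclusion is skipped, so the second half of the condition is false and ./upload does not run.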
You can add || true to your command. Note that this makes the step succeed even when the tests fail.
Example:
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Test App
        run: ./gradlew test || true
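Since `|| true` hides the test failure, the job ends up marked as passed, which is what the question wanted to avoid. A sketch of one way around this is to let the test step continue on error, upload the results, and then fail the job explicitly based on the step's outcome. The step id `test` and the final step are additions for illustration; the action versions shown are assumptions, so pin whichever versions you actually use.

```yaml
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test App
        id: test
        continue-on-error: true   # later steps still run if the tests fail
        run: ./gradlew test
      - name: Archive Test Results
        uses: actions/upload-artifact@v4
        with:
          name: test-results
          path: app/build/reports/tests
      - name: Fail the job if the tests failed
        # outcome is the step result before continue-on-error is applied
        if: steps.test.outcome == 'failure'
        run: exit 1
```

The artifact is uploaded in both cases, and the last step restores the failed status of the job when the tests did not pass.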