Is there a way to configure retries for Azure DevOps pipeline tasks or jobs? - azure-devops

Currently I have a OneBranch DevOps pipeline that fails every now and then while restoring packages. Usually it fails because of some transient error like a socket exception or timeout. Re-trying the job usually fixes the issue.
Is there a way to configure a job or task to retry?

Azure Devops now supports the retryCountOnTaskFailure setting on a task to do just this.
See this page for further information:
https://learn.microsoft.com/en-us/azure/devops/release-notes/2021/pipelines/sprint-195-update

Update:
Automatic retries for a task was added and when you read this it should be available for usage.
It can be used as follow:
- task: <name of task>
retryCountOnTaskFailure: <max number of retries>
...
Here are a few things to note when using retries:
The failing task is retried immediately.
There is no assumption about the idempotency of the task. If the task has side-effects (for instance, if it created an external resource partially), then it may fail the second time it is run.
There is no information about the retry count made available to the task.
A warning is added to the task logs indicating that it has failed before it is retried.
All of the attempts to retry a task are shown in the UI as part of the same task node.
Original answer:
There is no way of doing that with native tasks. However, if you can script then you can put such logic inside.
You could do this for instance in this way:
n=0
until [ "$n" -ge 5 ]
do
command && break # substitute your command here
n=$((n+1))
sleep 15
done
However there is no native way of doing this for regular tasks.
Automatically retry a task in on roadmap so it could change in near future.

Related

Waiting on job "x" to finish in pipeline "1" before running job "x" in pipeline "2" (Azure DevOps)

We're using SonarQube for tests, and there's one token it uses, as long as one pipeline is running, it goes fine, but if I run different pipelines (all of them have E2E tests as final jobs), they all fail, because they keep calling a token that expires as soon as its used by one pipeline (job). Would it be possible to have -all- pipelines pause at job "x" if they detect some pipeline running job "x" already? The jobs have same names across all pipelines. Yes, I know this is solved by just running one pipeline at a time, but that's not what my devs wanna do.
The best way to make jobs run one by one is set demands for your agent job to run on a specific self-hosted agent. Just as below, set a user-defined capabilities for the self-hosted agent and then require run on the agent by setting demands in agent job.
In this way, agent jobs will only run on this agent. Build will run one by one until the previous one complete.
Besides, you could control if a job should run by defining approvals and checks. By using Invoke REST API check to make a call to a REST API such as Gets a list of builds, and define the success criteria as build count is zero, then, next build starts.

Azure Devops - is it possible to queue stages between pipeline runs

I have a Azure Devops pipeline (yml) with a stage that deploys an application to an environment and then runs a bunch of tests against it. This pipeline is triggered when a PR is created. We sometimes have multiple runs of the same pipelines happening at once resulting in two deployments to the same environment happening at the same time.
Is it possible to configure the pipeline in such a way that the deploy stage can only be executed one at a time?
Simple example of what I'm trying to do:
Pipeline (yml) with stages: 1) Build -> 2) Deploy/Test -> 3) Release
Run 1: Build: complete -> Deploy/Test In progress -> Release waiting for stage 2
Run 2: Build: complete -> Deploy/Test waiting for Run 1 stage 2
If you use YAML then make you release in deployment job where you use environment with enabled exclusive lock. However, this has some drawback:
On developer community you can find a feature request for better handling this. And there is one workaround which involved calling REST API. But, as someone already mentioned:
Workarounds involving polling end up with race conditions where two queued builds can both end up starting.
So there is no ideal solution, but if you can please upvote above mentioned request.

How can I kill (not cancel) an errant Azure Pipeline run, stage, job, or task?

I want to know how to kill an Azure Pipeline task (or any level of execution - run, stage, job, etc.), so that I am not blocked waiting for an errant pipeline to finish executing or timeout.
For example, canceling the pipeline does not stop it immediately if a condition is configured incorrectly. If the condition resolves to true the task will execute even if the pipeline is cancelled.
This is really painful if your org/project only has 1 agent. :(
How can I kill (not cancel) an errant Azure Pipeline run, stage, job, or task?
For the hosted agent, we could not kill that azure pipeline directly, since we cannot directly access the running machine.
As workaround, we could reduce the time that the task continues to run after the job is cancelled by setting a shorter Build job cancel timeout in minutes:
For example, I create a pipeline with task, which will still run for 60 minutes after the job is cancelled. But if I set the value of Build job cancel timeout in minutes to 2 mins, the azure pipeline will be cancelled completely.
For the private agent, we could run services.msc, and look for "VSTS Agent (name of your agent)". Right-click the entry and then choose restart.

Conditionally run an Azure DevOps job only if a specific prior job has failed

In Azure DevOps, I'm trying to make it so that a release job (not a YAML pipeline job) only runs if a specific prior job has failed. One of the pre-defined conditions for running a job is "Only when a previous job has failed", but this is not appropriate because it includes all prior jobs, rather than just the last job (or better yet, a job of the users choosing).
Please note that this question focuses on tasks within a job and is not the answer that I am looking for.
According to the documentation for conditions under "How can I trigger a job if a previous job succeeded with issues?", I can access the result of a previous job -
eq(dependencies.A.result,'SucceededWithIssues')
Looking at the logs for a previous release, the AGENT_JOBNAME for the job that I wish to check is Post Deployment Tests, so that would mean that my condition should look like this. However, Azure DevOps wont even let me save my release -
not(eq(dependencies.Post Deployment Tests.result, 'succeeded'))
Job condition for job "Swap Slots Back on Post Deployment Test Failure" in stage "Dev" is invalid: Unexpected symbol: 'Deployment'.
I've tried to wrap the job name in quotes, but I get similar errors -
not(eq(dependencies.'Post Deployment Tests'.result, 'succeeded'))
not(eq(dependencies."Post Deployment Tests".result, 'succeeded'))
I've also tried to reference my job using underscores, which does allow me to save the release but then results in an error at run time -
not(eq(dependencies.Post_Deployment_Tests.result, 'succeeded'))
Unrecognized value: 'dependencies'.
How can I achieve me goal of conditionally running a job only if a specific prior job has failed?
How can I achieve me goal of conditionally running a job only if a
specific prior job has failed?
1.You should know that Yaml pipeline and Classic pipeline use difference technologies.
See feature ability, they have different features. Also, they have quite different behavior for Conditions though they(Yaml,Classic Build,Classic Release) all support job conditions.
The eq(dependencies.'Post Deployment Tests'.result, 'succeeded') you're trying is not supported in Classic(UI) Release pipeline. It's for Yaml:
It's expected behavior to get error like Unrecognized value: 'dependencies' cause the job dependency is not supported in Release pipeline. See this old ticket.

Timeout for a whole pipeline w/manual steps in Azure DevOps

I have created a release pipeline in Azure DevOps, with several stages, deployments to each environment. On some of the environments (Test and Production), I have manual approval tasks (not set in YAML, but on the environment). If the approval task is not performed within a set time, I want the whole pipeline to cancel.
I have set a timeoutInMinutes on the stage itself, however, the timeout never starts, as the stage is waiting for the approval before it can start at all.
I haven't found a way to set a timeout on the approval/review activity, nor have I found a way to have a different stage/job independent of the others sit and wait for a timeout and cancel the job with a logging command ##vso[task.complete result=Canceled;]DONE
See the screenshot. The pipeline just sits and waits forever. Any ideas?
Timeout for a whole pipeline w/manual steps in Azure DevOps
Yes, you are right. I could reproduced this issue on my side.
As we know, the Timeout is used:
To avoid taking up resources when your job is hung or waiting too
long, it's a good idea to set a limit on how long your job is allowed
to run.
When we set the checks on the stage. Our job is in the pending state not Running, At this time, the timeout we set has not yet started working. It only starts timing after our job starts running. So, we need a timeout for the checks, just like the timeout for Pre-deployment approvals:
I could not find any solution/workaround after spending a long time, but I found this is a highly prioritized feature requirement that has been tracked by the Azure Devops teams after confirming with Azure devops teams.
Now, I could see the status of this feature timeout for the checks request is In Progress at sprint 158,
I believe this feature will meet us soon, please pay attention to the release notes of Azure devops. Thank you for helping us build a better Azure DevOps.
Hope this helps.