Upstart job and dependencies

We have three jobs: job1, job2, and job3, with job3 depending on the other two:
job3 config
start on (started job1 and started job2)
stop on (stopping job1 or stopping job2)
With this configuration, job3 starts on OS start-up, and in the event of a failure of one of the other two, say job1, job3 will stop; but if job1 recovers from its failure, job3 will not be restarted, because Upstart works with events, not states.
One possible workaround would be to check the status of job2 in the post-start script of job1 and, if it is running, emit an event that triggers the start of job3, with the same logic in job2:
job1 config
post-start script
    if job2 is running, emit an event
end script

job2 config
post-start script
    if job1 is running, emit an event
end script

job3 config
start on event
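
Spelled out in actual Upstart syntax, the workaround would be something like this (the event name both-started and the status check are only illustrative):

# job1.conf (job2.conf would mirror it with the job names swapped)
post-start script
    # if job2 is already up, announce that both jobs are running
    if status job2 | grep -q "start/running"; then
        initctl emit both-started
    fi
end script

# job3.conf
start on both-started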
This is obviously not a clean solution and requires further changes whenever more jobs are created, etc.
Is there any better way of doing this?

Related

Delay task in Azure Pipeline cannot be cancelled

I have a recurring issue with an Azure Pipeline YAML template that cannot be cancelled once started. The template defines a release stage that includes 3 jobs:
stage: Release
jobs:
- job: Wait
  steps:
  - task: Delay@1
    inputs:
      delayForMinutes: ${{ parameters.ReleaseDelay }}
- deployment: Deploy
  dependsOn:
  - Wait
  # several more tasks that work fine
- job: Cleanup # works fine also
Our workflow is such that sometimes, we want to go ahead and approve a deployment, but we would like to queue it to wait for a couple hours, e.g. to prep updates to release after business hours. It works fine normally.
The issue comes if we try to cancel the Wait task through the pipeline Web UI. Once the release environment approval has been granted and the wait task has started, the pipeline execution cannot be cancelled.
I've tested this with multiple pipelines that reuse this template and it is a persistent/reproducible issue.
So, my question is: is the Microsoft built-in Delay task inherently uninterruptible, or is the dependency declared in the successor job somehow preventing the Delay task from being cancelled?
The pipeline will show a status of "Cancelled" once I click the confirmation button to cancel the run, but the task continues to execute as if I had not done so. Crucially, it also does not cancel at the end of the Wait task. It will start straight into the deployment job as if it never received the order to cancel.
The Azure Pipelines docs do not mention the Delay task being uninterruptible, and I can cancel other tasks at different places in the pipeline which also have dependencies defined, so I don't think it's the fault of the dependency declaration, but that's also a secondary candidate for investigation.
You could investigate using the ManualValidation task instead of the Delay task.
Using this you can set a timeout, but also have the ability to short-circuit the timeout by resuming the pipeline. Set the task to "resume" once the timeout has been reached.
Your YAML would look something like this:
stage: Release
jobs:
- job: waitForValidation
  displayName: Wait for external validation
  pool: server
  steps:
  - task: ManualValidation@0
    timeoutInMinutes: ${{ parameters.ReleaseDelay }}
    inputs:
      notifyUsers: |
        test@test.com
        example@example.com
      instructions: 'Please validate the build configuration and resume'
      onTimeout: 'resume'
- deployment: Deploy
  dependsOn:
  - waitForValidation
  # several more tasks that work fine
- job: Cleanup # works fine also
Note that this type of task can only be run in an "Agentless" job, so don't forget to set the pool on that job to "server", e.g. pool: server.

How to kill a process on DevOps (ideally with a timeout)

I have a complex DevOps build script in YAML. Is there some way that, if a given step takes too much time, the process is killed (or some task is executed which kills certain processes)?
This would be useful in our case, where we have large test suites spread across several DLLs. I often see that some tests fail and afterwards DevOps hangs. I would like to kill the test runner and other processes which may be hanging, with (and also without) a timeout.
Is this possible on Azure DevOps?
You can specify timeoutInMinutes and cancelTimeoutInMinutes for the job:
jobs:
- job: Test
  timeoutInMinutes: 10 # how long to run the job before automatically cancelling
  cancelTimeoutInMinutes: 2 # how much time to give 'run always even if cancelled tasks' before stopping them
More information: https://learn.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml#timeouts
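If you want to bound a single step (e.g. the test runner) rather than the whole job, timeoutInMinutes can also be set on an individual task. A minimal sketch, assuming the tests run through the VSTest@2 task (the 30-minute value and the assembly pattern are placeholders):

steps:
- task: VSTest@2
  timeoutInMinutes: 30 # cancel just this step if it runs longer than 30 minutes
  inputs:
    testAssemblyVer2: |
      **/*Tests.dll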

Argo Workflow - DAG Task level retry

I have a DAG workflow as below:
taskA runs first; after taskA completes, taskB and taskC run in parallel; once taskB and taskC complete, taskD starts. Suppose taskC fails due to some external issue which needs manual intervention for correction. After the correction, can we manually restart the workflow (from the UI or CLI) so that it resumes directly from the failed taskC, goes on to taskD, and completes the workflow?
Yes, clicking 'retry' from the workflow in the GUI will do just that.
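The same can be done from the CLI; a quick sketch, assuming the workflow is named my-dag:

argo retry my-dag # re-runs only the failed nodes (taskC here) and then continues on to taskD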

How to launch scheduled spark jobs even if previous jobs are still executing on rundeck?

Why is Rundeck not launching scheduled Spark jobs while the previous job is still executing?
Rundeck skips the runs scheduled to launch during the execution of the previous job, and only launches a new run based on the schedule after that execution completes.
But I want to launch a scheduled job even if the previous run is still executing.
Check your workflow strategy; here is an explanation of how it works:
https://www.rundeck.com/blog/howto-controlling-how-and-where-rundeck-jobs-execute
You can design a workflow strategy based on "Parallel" to launch the jobs simultaneously on your node.
Here is an example using the parallel strategy with a parent job.
The example jobs are Job one, Job two, and a Parent Job that uses the parallel strategy.
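In Rundeck's YAML job definition format, such a parent job might look roughly like this (a sketch; the job names are placeholders):

- name: Parent Job
  sequence:
    strategy: parallel
    commands:
    - jobref:
        name: Job one
    - jobref:
        name: Job two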

Does Rundeck support job dependencies?

I've been searching for days on how to lay out a Rundeck workflow with job dependencies. What I need is to have 3 jobs: job-1 and job-2 are scheduled to run in parallel, while job-3 is only triggered after the completion of both job-1 and job-2, assuming that job-1 and job-2 have different execution times.
I tried using job state conditionals to do that, but it seems that, if the condition is not met, they will only halt or fail. My idea is to halt the execution until all the parent jobs complete and then resume the workflow.
You can achieve this by creating a master job which includes 2 steps:
step 1: a sub-job which includes both job-1 and job-2 (they run in parallel if node-oriented execution is selected)
step 2: job-3
But not all 3 in the same flow.
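As a rough sketch in Rundeck's YAML job format (the job names are placeholders), the master job would reference the parallel sub-job and then job-3 sequentially:

- name: master-job
  sequence:
    strategy: sequential
    commands:
    - jobref:
        name: parallel-sub-job # a job that runs job-1 and job-2 with the parallel strategy
    - jobref:
        name: job-3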
Right now you can use the Job State Conditional feature for that: https://docs.rundeck.com/2.9.4/plugins-user-guide/bundled-plugins.html#job-state-plugin
Rundeck cannot do this for you automatically. You can set a schedule for job-3 to run after the maximum expected completion time of job-1 and job-2. Enable "retry" for job-3 in case the dependencies fail.