AWS CloudFormation Template for Orchestration of multiple AWS Glue Jobs (combination of sequential and parallel execution) - aws-cloudformation

I'm looking for help with a CloudFormation template for Glue job orchestration in the following scenario:
Suppose I have 6 AWS Glue jobs. 3 jobs (Job1, Job2, Job3) should be executed in parallel, and the remaining 3 jobs should be executed sequentially (Job3 before Job4, then Job4 before Job5, then Job5 before Job6). If any job fails, send a Workflow "Failure" notification along with the name of the failed Glue job.
Job1
Job2
Job3 ---> Job4 ---> Job5 ---> Job6

You can define a Step Functions state machine using the AWS::StepFunctions::StateMachine resource, and additionally define an AWS::Events::Rule for the following events:
EventPattern:
  source:
    - "aws.states"
  detail-type:
    - "Step Functions Execution Status Change"
  detail:
    status:
      - "FAILED"
      - "TIMED_OUT"
Here is a sample single-job execution state in Step Functions:
Run Job 3:
  Type: "Task"
  Resource: "arn:aws:states:::glue:startJobRun.sync"
  Parameters:
    JobName: "GlueJob3Name"
  Next: "Run Job 4"
You will have to enclose the three parallel jobs within a Parallel state type.
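For example, a minimal sketch of the whole definition (ASL written as YAML, e.g. via the state machine's Definition property; the Glue job names are assumptions):

StartAt: "Run Parallel Jobs"
States:
  Run Parallel Jobs:
    Type: "Parallel"
    Next: "Run Job 4"
    Branches:
      - StartAt: "Run Job 1"
        States:
          Run Job 1:
            Type: "Task"
            Resource: "arn:aws:states:::glue:startJobRun.sync"
            Parameters:
              JobName: "GlueJob1Name"
            End: true
      - StartAt: "Run Job 2"
        States:
          Run Job 2:
            Type: "Task"
            Resource: "arn:aws:states:::glue:startJobRun.sync"
            Parameters:
              JobName: "GlueJob2Name"
            End: true
      - StartAt: "Run Job 3"
        States:
          Run Job 3:
            Type: "Task"
            Resource: "arn:aws:states:::glue:startJobRun.sync"
            Parameters:
              JobName: "GlueJob3Name"
            End: true
  Run Job 4:
    Type: "Task"
    Resource: "arn:aws:states:::glue:startJobRun.sync"
    Parameters:
      JobName: "GlueJob4Name"
    Next: "Run Job 5"
  Run Job 5:
    Type: "Task"
    Resource: "arn:aws:states:::glue:startJobRun.sync"
    Parameters:
      JobName: "GlueJob5Name"
    Next: "Run Job 6"
  Run Job 6:
    Type: "Task"
    Resource: "arn:aws:states:::glue:startJobRun.sync"
    Parameters:
      JobName: "GlueJob6Name"
    End: true

The Parallel state only moves on to "Run Job 4" once all three branches have completed, which also satisfies the "Job3 before Job4" requirement.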

Related

Jfrog Pipeline - Does the cronTrigger resource support triggering a pipeline with predefined variables?

resources:
  - name: nightly_cron_trigger
    type: CronTrigger
    configuration:
      interval: "30 03 * * *" # Every day at 03:30AM UTC
      branches:
        include: *serviceBranchRegexp

pipelines:
  - name: commons_nightly
    steps:
      - name: prepare_nightly_run
        type: Bash
        configuration:
          nodePool: ci_c5large
          inputResources:
            - name: nightly_cron_trigger
            - name: commons_bitbucket
              trigger: false
          outputResources:
            - name: commons_property_bag
          environmentVariables:
            GIT_REPO_PATH:
              default: *serviceGitRepoPath
        execution:
          onStart:
            - source
Currently we have a pipeline (run by cron each night) where each step triggers an embedded pipeline, and each step does the same thing; only the resources and names change. So I thought maybe cron could run the main pipeline a few times at night, but with different params for every run.
The cron resource does not support this, meaning you can't trigger a pipeline with predefined variables using the cronTrigger resource.
But maybe you can use a PropertyBag resource. Maybe you can configure it like this:
the input cronTrigger triggers a pipeline step, and that pipeline step updates the output PropertyBag resource with different parameters:
cronTrigger -> pipelineStep -> propertyBag
This propertyBag resource can then be the input to a different pipeline.
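A rough sketch of that wiring (hedged: the resource names and the runParams property are illustrative, not from the original config):

resources:
  - name: commons_property_bag
    type: PropertyBag
    configuration:
      runParams: "default"

pipelines:
  - name: commons_nightly
    steps:
      - name: prepare_nightly_run
        type: Bash
        configuration:
          inputResources:
            - name: nightly_cron_trigger        # cron fires this step
          outputResources:
            - name: commons_property_bag
        execution:
          onExecute:
            # write_output updates the PropertyBag's properties
            - write_output commons_property_bag runParams=nightly_batch_1

  - name: commons_consumer
    steps:
      - name: use_params
        type: Bash
        configuration:
          inputResources:
            - name: commons_property_bag        # fires when the bag is updated
        execution:
          onExecute:
            - echo "params: ${res_commons_property_bag_runParams}"

(write_output and the res_<resource>_<property> variables are, as far as I know, the standard JFrog Pipelines mechanisms for updating and reading resource properties.)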

How to monitor a quartz scheduler job?

I am very new to the Quartz scheduler. I am aware that we can enable logs for Quartz jobs and triggers with the following configuration:
org.quartz.plugin.jobHistory.class= org.quartz.plugins.history.LoggingJobHistoryPlugin
# Format of log generated
org.quartz.plugin.jobHistory.jobSuccessMessage= Job [{1}.{0}] execution complete and reports: {8}
org.quartz.plugin.jobHistory.jobToBeFiredMessage= Job [{1}.{0}] to be fired by trigger [{4}.{3}], re-fire: {7}
org.quartz.plugin.triggHistory.class= org.quartz.plugins.history.LoggingTriggerHistoryPlugin
# Format of log generated
org.quartz.plugin.triggHistory.triggerFiredMessage= Trigger \{1\}.\{0\} fired job \{6\}.\{5\} at: \{4, date, HH:mm:ss MM/dd/yyyy\}
org.quartz.plugin.triggHistory.triggerCompleteMessage= Trigger \{1\}.\{0\} completed firing job \{6\}.\{5\} at \{4, date, HH:mm:ss MM/dd/yyyy\}
But I am trying to understand whether there is any way to directly get quantitative metrics, like how many jobs are currently running, the duration of each job, etc.
I am also aware of various tools like QuartzDesk which provide a UI for said metrics. But I am more interested in metrics that I could in turn push to my Prometheus instance.
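For what it's worth, the plain Quartz API already exposes most of this, so one option is to collect the numbers yourself and hand them to whatever Prometheus client you use. A minimal sketch (the class name is made up; the Prometheus wiring itself is left out):

import java.util.List;

import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.listeners.JobListenerSupport;

// Collects run counts and durations straight from the Scheduler API;
// export these values as a Prometheus gauge (running count) and
// histogram/summary (durations) with your client library of choice.
public class QuartzMetricsProbe extends JobListenerSupport {

    @Override
    public String getName() {
        return "quartz-metrics-probe";
    }

    // Called by Quartz after every job run; getJobRunTime() is the
    // duration of the just-finished execution in milliseconds.
    @Override
    public void jobWasExecuted(JobExecutionContext context, JobExecutionException jobException) {
        long millis = context.getJobRunTime();
        getLog().info("{} ran for {} ms", context.getJobDetail().getKey(), millis);
    }

    // Number of jobs executing right now.
    public static int currentlyRunning(Scheduler scheduler) throws SchedulerException {
        List<JobExecutionContext> running = scheduler.getCurrentlyExecutingJobs();
        return running.size();
    }

    // Total number of executions since the scheduler started.
    public static int executedSoFar(Scheduler scheduler) throws SchedulerException {
        return scheduler.getMetaData().getNumberOfJobsExecuted();
    }
}

Register the listener with scheduler.getListenerManager().addJobListener(new QuartzMetricsProbe()); and scrape/push the values on whatever schedule suits you.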

Azure DevOps: when a job in a stage is successful, how can it be ensured that the stage is successful or partially successful?

When one of the jobs in a stage succeeds and the other fails, how can I ensure that the stage is marked successful (or partially successful) and the pipeline proceeds to the next stage?
To be more descriptive:
Pipeline:
  Stage A --> Stage B
    Stage A jobs:
      - Job1 (failed)
      - Job2 (succeeded; runs only when the previous job failed)
    Stage B jobs:
      - Job1
      - ...
When Job1 in stage A fails, Job2 runs automatically. But this way, stage A ends up marked 'Failed', so stage B never runs because the previous stage failed.
The "Trigger even when the selected stages partially succeed" option does not work for stage B, because the previous stage is marked "Failed".
Additionally, I tried the following:
If I mark all tasks inside A -> Job1 as "Continue on error" and check the "Trigger even when the selected stages partially succeed" box on stage B, then this time Job2 does not run, because the "run only when previous job failed" option is set on Job2. There is no option like "run only when the previous job partially succeeded".
In addition, a custom condition does not work in Job2; it shows null for the "Agent.JobStatus" variable. If the "SucceededWithIssues" and "Failed" statuses were available in "Agent.JobStatus" it could run, but I couldn't reach a conclusion here because of the null value of JobStatus.
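For reference, a hedged sketch of the same setup in YAML pipelines terms (names are illustrative; there is no exact YAML twin of the classic "Trigger even when the selected stages partially succeed" checkbox, but an explicit stage-level condition is one way to let B run however A ends):

stages:
  - stage: A
    jobs:
      - job: Job1
        steps:
          - script: exit 1            # placeholder for the real work
      - job: Job2
        dependsOn: Job1
        condition: failed()           # run only when the previous job failed
        steps:
          - script: echo "fallback work"

  - stage: B
    dependsOn: A
    # Run B regardless of A's result; narrow this with
    # in(dependencies.A.result, 'Succeeded', 'SucceededWithIssues', ...)
    # if finer control is needed.
    condition: succeededOrFailed()
    jobs:
      - job: Job1
        steps:
          - script: echo "next stage"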

How to stop and resume a spring batch job

Goal: I am using Spring Batch for data processing and I want an option to stop/resume (where it left off).
Issue: I am able to send a stop signal to a running job and it gets stopped successfully. But when I try to send a start signal to the same job, it creates a new instance of the job and starts it as a fresh job.
My question is: how can we achieve resume functionality for a stopped job in Spring Batch?
You just have to run it with the same parameters. Just make sure you haven't marked the job as non-restartable and that you're not using RunIdIncrementer or similar to automatically generate unique job parameters.
See, for instance, this example. After the first run, we have:
INFO: Job: [SimpleJob: [name=myJob]] completed with the following parameters: [{}] and the following status: [STOPPED]
Status is: STOPPED, job execution id 0
#1 step1 COMPLETED
#2 step2 STOPPED
And after the second:
INFO: Job: [SimpleJob: [name=myJob]] completed with the following parameters: [{}] and the following status: [COMPLETED]
Status is: COMPLETED, job execution id 1
#3 step2 COMPLETED
#4 step3 COMPLETED
Note that stopped steps will be re-executed. If you're using chunk-oriented steps, make sure that at least the ItemReader implements ItemStream (and does it with the correct semantics).
Steps marked with allowStartIfComplete will always be re-run.
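A minimal sketch of the restart call itself, assuming a JobOperator bean is available and you have the id of the STOPPED execution (0 in the log above):

import org.springframework.batch.core.launch.JobOperator;

public class ResumeStoppedJob {

    private final JobOperator jobOperator;

    public ResumeStoppedJob(JobOperator jobOperator) {
        this.jobOperator = jobOperator;
    }

    // restart() re-launches the job instance behind the given execution
    // with its original parameters, creating a new execution that resumes
    // at the first step that did not complete.
    public Long resume(long stoppedExecutionId) throws Exception {
        return jobOperator.restart(stoppedExecutionId);
    }
}

Equivalently, running the same job through JobLauncher with identical parameters resumes the stopped instance, which is what the log output above shows.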

Inspect and retry resque jobs via redis-cli

I am unable to run resque-web on my server due to some issues I still have to work on, but I still need to check and retry the failed jobs in my Resque queues.
Does anyone have any experience with how to peek at the failed-jobs queue to see what the errors were, and then how to retry the jobs, using the redis-cli command line?
thanks,
Found a solution on the following link:
http://ariejan.net/2010/08/23/resque-how-to-requeue-failed-jobs
In the rails console we can use these commands to check and retry failed jobs:
1 - Get the number of failed jobs:
Resque::Failure.count
2 - Check the errors exception class and backtrace
Resque::Failure.all(0, 20).each { |job|
  puts "#{job["exception"]} #{job["backtrace"]}"
}
The job object is a hash with information about the failed job; you may inspect it for more details. Also note that this lists only the first 20 failed jobs. Not sure how to list them all, so you will have to vary the (0, 20) values to page through the whole list.
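If you want the whole list rather than pages of 20, one option (a sketch using the same API) is to pass the current failure count as the limit:

# List every failed job in one pass by using the count as the page size.
Resque::Failure.all(0, Resque::Failure.count).each do |job|
  puts "#{job["exception"]} #{job["backtrace"]}"
end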
3 - Retry all failed jobs:
(Resque::Failure.count-1).downto(0).each { |i| Resque::Failure.requeue(i) }
4 - Reset the failed jobs count:
Resque::Failure.clear
Retrying all the jobs does not reset the counter; we must clear it so it goes back to zero.
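Since the question asked about redis-cli specifically: Resque keeps failures as JSON blobs in a Redis list, named resque:failed under the default namespace, so they can also be inspected directly (a sketch, assuming the default namespace):

redis-cli LLEN resque:failed            # number of failed jobs
redis-cli LRANGE resque:failed 0 19     # first 20 failure payloads (JSON)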