deploying Batch with CloudFormation - aws-cloudformation

I've been able to create a Compute Environment, a Job Queue and about a dozen Job Definitions using CloudFormation. Great!
Unless I'm missing something, there doesn't seem to be an element to actually submit my Job Definitions using CloudFormation. :(
At first, I thought I had it figured out because you can create CloudWatch Events that trigger a Job Submission. However, I notice that the Event Rule in CloudFormation does not have support for Batch like the CLI/SDK does. Lame!
Anyone else deploying Batch with CloudFormation? How are you submitting jobs? I guess I can create a Custom Resource, but that seems harder than it should be.

Does https://docs.aws.amazon.com/batch/latest/userguide/batch-cwe-target.html solve your problem?
AWS Batch jobs are available as CloudWatch Events targets. Using simple rules that you can quickly set up, you can match events and submit AWS Batch jobs in response to them.
When you create a new rule, add the batch job as a target.
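For later readers, here is a minimal sketch of what such a rule could look like as a CloudFormation resource, built as a Python dict and dumped to JSON. It assumes your CloudFormation version accepts a BatchParameters block on an AWS::Events::Rule target (the question reports that it did not at the time). The logical IDs JobQueue, JobDefinition and EventsRole, the rule name and the job name are hypothetical placeholders for resources defined elsewhere in your template.

```python
import json


def batch_schedule_rule(job_queue_arn, job_definition_arn, role_arn,
                        schedule="rate(1 day)"):
    """Build an AWS::Events::Rule resource whose target submits a Batch job."""
    return {
        "Type": "AWS::Events::Rule",
        "Properties": {
            "ScheduleExpression": schedule,
            "State": "ENABLED",
            "Targets": [{
                "Id": "SubmitBatchJob",          # hypothetical target id
                "Arn": job_queue_arn,            # the job queue is the target
                "RoleArn": role_arn,             # role allowed to submit jobs
                "BatchParameters": {
                    "JobDefinition": job_definition_arn,
                    "JobName": "scheduled-job",  # hypothetical job name
                },
            }],
        },
    }


# Hypothetical logical IDs; they must match resources in your own template.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "NightlyJobRule": batch_schedule_rule(
            {"Ref": "JobQueue"},
            {"Ref": "JobDefinition"},
            {"Fn::GetAtt": ["EventsRole", "Arn"]},
        ),
    },
}

print(json.dumps(template, indent=2))
```

If the rule is rejected at deploy time, the Lambda or custom resource routes mentioned in the other answers remain the fallback.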

The easiest way would be to create a Lambda function. You can create it via CF and capture your requirement in the function code.
Or like you mentioned, you can create a custom resource.
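A rough sketch of what that Lambda handler might look like in Python. It assumes the function's role is allowed to call batch:SubmitJob and that the queue and job definition are passed in through environment variables. The variable names JOB_QUEUE, JOB_DEFINITION and JOB_NAME are made up for this example, and the optional batch_client argument only exists so the function can be exercised without AWS.

```python
import os


def handler(event, context, batch_client=None):
    """Lambda entry point that submits a single AWS Batch job."""
    if batch_client is None:
        # Imported lazily so the module also loads where boto3 is absent.
        import boto3
        batch_client = boto3.client("batch")

    response = batch_client.submit_job(
        # Hypothetical environment variable names, set in the CF template.
        jobName=os.environ.get("JOB_NAME", "cfn-submitted-job"),
        jobQueue=os.environ["JOB_QUEUE"],
        jobDefinition=os.environ["JOB_DEFINITION"],
    )
    return {"jobId": response["jobId"]}
```

The function itself can be declared in the same template as the Batch resources, with the environment variables filled from Ref values.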

SAS Viya - Environment Manager: Job triggers

I am currently looking into SAS Viya 3.4 to replace SAS 9.4.
Now I was curious to see what the Environment Manager offers for scheduling jobs and for creating and maintaining job flows. However, I noticed that I could only drag and drop jobs into a flow and connect them with very few configurable options. Also, as a trigger to start a job flow, I was only able to select a time event. I am wondering if there are other trigger types to choose from, e.g. a job being triggered if a specific table exists or a file exists [or ...]. Nor did I see a way to trigger/start a job based on the return code of the previous job.
Also it does not seem to be smart enough to make sure two jobs don't access a library with write access at the same time.
I can't see how SAS Viya could replace a Job Orchestration Tool. However, I feel like the tool was built to replace such an Orchestration Tool. Did I miss something or is it just not possible to do so with the Environment Manager in SAS Viya?
Any help/insight is highly appreciated. I already searched through the documentation but could not find anything. Maybe I was just looking in the wrong place?
Why 3.4 and not 3.5 (or Viya 4)?
If you want to use Viya with your own Job Orchestration software you can consider this tool (built by my team): https://cli.sasjs.io/job/
We deployed it on Jenkins for this customer: https://www.sas.com/en_us/news/press-releases/2021/july/sas-partnership-with-lloyds-list-intelligence.html

Automation for ADF V2 Pipeline

I need help with implementation for below requirement:
There is one ADF pipeline that runs every two hours (with a tumbling window trigger). Now I need to create one more pipeline that will be used for performing a maintenance job. This pipeline is scheduled to run once a month (with a schedule trigger). Here is the requirement that I'm trying to implement:
Before running the second pipeline, I need to make sure the first pipeline is not running (basically get its status and, if it's running, wait for its completion) and then disable the trigger associated with it.
Run the second pipeline and, after its completion, enable the trigger that is associated with the first pipeline.
Please let me know if this can be achieved within ADF or whether some kind of custom scripting is needed to achieve the result.
First, your idea is achievable.
Second, if you want to use only the built-in features of Azure Data Factory, then there is no way.
Basically, you need to use an Azure Function (a simple HTTP trigger that takes no input, so you can hit and execute it directly) to achieve what ADF can't do. From your description, the execution of these two pipelines is mutually exclusive, so you can use the SDK inside the Azure Function to check the status of the other pipeline. If the other pipeline is running, wait a few seconds and then re-check its status. (In short, put the main logic and code in the Azure Function.)
Simple azure function:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=csharp
Use SDK to monitor:
https://learn.microsoft.com/en-us/azure/data-factory/monitor-programmatically#net
(The links I give are for C#; you can choose any other supported language.)
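The main logic described above might be sketched like this in Python. This is only the control flow, not real SDK code: the four callables (is_running, stop_trigger, start_trigger, run_maintenance) are hypothetical wrappers that you would implement with the Data Factory SDK calls from the monitoring link. The second wait after stopping the trigger is my own addition, to cover a run that starts between the last status check and the stop.

```python
import time


def run_exclusively(is_running, stop_trigger, start_trigger, run_maintenance,
                    poll_seconds=30, sleep=time.sleep):
    """Run the maintenance pipeline only while the regular pipeline is idle.

    All four callables are hypothetical wrappers around the SDK.
    """

    def wait_until_idle():
        # Poll the status of the regular pipeline until no run is active.
        while is_running():
            sleep(poll_seconds)

    wait_until_idle()
    stop_trigger()
    try:
        # A run may have started between the last check and the stop.
        wait_until_idle()
        run_maintenance()
    finally:
        # Re-enable the regular trigger even if maintenance fails.
        start_trigger()
```

The try/finally is there so that a failed maintenance run does not leave the two-hourly trigger disabled.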

A file prepared by one spring batch job is not accessible to other for deletion

I have a requirement where I have to prepare a file using one job, and another job, which runs once a day, will send the file to an external system and delete or move it from the location. When this second job tries to delete or move the file, it can't access it.
I tried setting writable to true when the file is created, running the jobs at separate times (one job at a time), and adding "delete" as a step to the same job as well. Nothing worked.
I am using file.delete(). Also tried Files.deleteIfExists().
I suspect the first job is not assigning proper permissions, but I don't know a way around it or how to set permissions in Spring Batch.
Are these jobs run by the same user? i.e. Same user and permissions?
Also, what is the actual error message? Does it say permission denied? If so, then it is likely an OS restriction, not a Spring Batch/Java limitation.
An easier solution would be to just add a step to the first job to send the files as part of that job, and drop the job that only transfers the files.
Answering my own question 😀. Hope it helps someone.
The issue was that the last ItemWriter was holding the resources because I was using the composite writer. While using CompositeWriter, the beforeStep and afterStep methods are “hidden”. You have to call them explicitly. I chose to write a custom writer which explicitly calls writer.close().
Adding an afterStep method and calling super.close() should also work, though I have not tried that out.
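To illustrate the idea outside of Spring Batch, here is a small Python analogue (not Spring Batch API, and all class names are made up): a composite writer that forwards close() to every delegate, so that no delegate keeps a file handle open after the step and the file can be deleted afterwards.

```python
import os
import tempfile


class FileWriter:
    """Delegate that keeps a file handle open between writes."""

    def __init__(self, path):
        self.path = path
        self.handle = open(path, "w")

    def write(self, items):
        self.handle.writelines(f"{item}\n" for item in items)

    def close(self):
        self.handle.close()


class ClosingCompositeWriter:
    """Composite that forwards both write() and close() to its delegates."""

    def __init__(self, delegates):
        self.delegates = delegates

    def write(self, items):
        for delegate in self.delegates:
            delegate.write(items)

    def close(self):
        # Without this loop the delegates would keep their handles open.
        for delegate in self.delegates:
            delegate.close()


path = os.path.join(tempfile.mkdtemp(), "out.txt")
writer = ClosingCompositeWriter([FileWriter(path)])
writer.write(["a", "b"])
writer.close()    # releases the handle held by the delegate
os.remove(path)   # the file can now be deleted by a later step or job
```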

Talend Force run order of joblets

My company has a couple of joblets that we put in new jobs to do things like initialization of variables, get system information from the database and sending out error / warning emails. The issue we are running into is that if we go ahead and start creating the components of a job and realize that we forgot to include these 3 joblets, we have to basically re-create the job to ensure that the joblets are added first so they run first.
Is there any way to force these joblets to run first and possibly also in a certain order before moving on to the contents of the job being created? Please let me know if there is any information you may need that I'm missing as I have only been using Talend for a few days. The rest of the team has not been using it too much longer than I have, so they do not have the answer I'm looking for either. Thanks in advance!
In Joblets you can use the components Trigger_Input and Trigger_Output as connection points for on subjob OK triggers. So you can connect joblets and other components in a job with triggers, thus enforcing execution order.
But you cannot get an on subjob OK trigger from a tPreJob. I am thinking of triggering from a tPreJob to a tWarn (on component OK) and then from the tWarn to the joblet (on subjob OK).

Creating a custom scheduler using spring quartz

I have a requirement to create a custom scheduler. I would like all the parameters defining the frequency that my jobs will run to be stored in db tables. This would allow my customers to change the frequency etc via a nice little webapp (webapp is a different application to my main one).
I know that with Quartz you can define all your job triggers programmatically, but is that just at start time? How would it work if my customer logs on and changes the schedule in the webapp? Am I able to re-define a job trigger in the original app by checking for changes periodically?
Does anyone know of any nice examples of this?
regards
You have a bunch of methods in the Scheduler interface. JavaDoc here.
Replace an already scheduled job by:
Adding a new job with replace=true in the addJob method
OR
Deleting the existing job (method: deleteJob)
And then adding a new job with the modified details (addJob)
Replace already scheduled triggers with:
rescheduleJob (if the JobDetail associated with the previous trigger is the same for the new trigger too)
OR
unscheduleJob followed by scheduleJob
If used with Spring, you can set spring.quartz.overwrite-existing-jobs=true
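As for checking the database for changes periodically, the idea can be sketched independently of Quartz. The Python below is only an illustration under assumptions: the table job_schedule, its columns, and the reschedule callback are all hypothetical. In the real app the callback would wrap one of the Scheduler methods listed above, and sync_schedules would be called on a timer.

```python
import sqlite3


def changed_schedules(conn, scheduled):
    """Return {job_name: cron} for rows that differ from what is scheduled."""
    rows = conn.execute("SELECT job_name, cron FROM job_schedule").fetchall()
    return {name: cron for name, cron in rows if scheduled.get(name) != cron}


def sync_schedules(conn, scheduled, reschedule):
    """Apply every changed row through the reschedule callback."""
    for name, cron in changed_schedules(conn, scheduled).items():
        reschedule(name, cron)    # hypothetical wrapper around the scheduler
        scheduled[name] = cron    # remember what is now in effect


# In-memory table standing in for the one the webapp edits.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_schedule (job_name TEXT PRIMARY KEY, cron TEXT)")
conn.execute("INSERT INTO job_schedule VALUES ('report', '0 0 2 * * ?')")

scheduled = {}
calls = []
sync_schedules(conn, scheduled, lambda name, cron: calls.append((name, cron)))
```

Only rows whose cron expression differs from the in-memory copy trigger a reschedule, so polling often is cheap when nothing has changed.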