Talend Force run order of joblets - talend

My company has a couple of joblets that we put in new jobs to do things like initialization of variables, get system information from the database and sending out error / warning emails. The issue we are running into is that if we go ahead and start creating the components of a job and realize that we forgot to include these 3 joblets, we have to basically re-create the job to ensure that the joblets are added first so they run first.
Is there any way to force these joblets to run first and possibly also in a certain order before moving on to the contents of the job being created? Please let me know if there is any information you may need that I'm missing as I have only been using Talend for a few days. The rest of the team has not been using it too much longer than I have, so they do not have the answer I'm looking for either. Thanks in advance!

In Joblets you can use the components Trigger_Input and Trigger_Output as connection-points for on subjob OK triggers. So you can connect joblets and other components in a job with triggers. Thus enforcing execution order.
But you cannot get a on subjob OK trigger from a tPreJob. I am thinking on triggering from a tPreJob to a tWarn (on component OK) and then from tWarn to the joblet (on subjob OK).

Related

SAS Viya - Environment Manager: Job triggers

I am currently looking into SAS Viya 3.4 to replace SAS 9.4.
Now I was curious to see the possibilities of the Environment Manager in scheduling Jobs and mantaining and creating Job flows. However, I noticed that I could only Drag and Drop Jobs in a flow and connect them with very few configurable options. Also as a trigger to start a Jobflow I was only able to select a time event. I am wondering if there are other trigger types to choose from. Like a Job will be triggered if a specific table exists or a file exists [or ...]. Neither did I see the possibility to trigger/start a job based on the return code of the previous job.
Also it does not seem to be smart enough to make sure two jobs don't access a library with write access at the same time.
I can't see how SAS Viya could replace a Job Orchestration Tool. However, I feel like the tool was built to replace such an Orchestration Tool. Did I miss something or is it just not possible to do so with the Environment Manager in SAS Viya?
Any help/insights is highly appreciated. I already searched through the documentation but could not find anything.. Maybe I was just looking at the wrong place?
Why 3.4 and not 3.5 (or Viya 4)?
If you want to use Viya with your own Job Orchestration software you can consider this tool (built by my team): https://cli.sasjs.io/job/
We deployed it on Jenkins for this customer: https://www.sas.com/en_us/news/press-releases/2021/july/sas-partnership-with-lloyds-list-intelligence.html

A file prepared by one spring batch job is not accessible to other for deletion

I have a requirement where I have to prepare a file using one job and another job which runs once a day will send the file to external system and delete/or move from the location. When this job tries to delete/or move the file it can't access it.
I tried setting writable to true when file is created. Running jobs on separate times (Running one job at a time). Tried adding "delete" as a step to the same job as well. Nothing worked.
I am using file.delete(). Also tried Files.deleteIfExists().
I suspect the first job is not assigning proper permissions but don't know a way around it set permissions in spring batch
Are these jobs run by the same user? i.e. Same user and permissions?
Also what is the actual error message? Does it say permissions denied? If so they it is likely an OS restriction not Spring Batch/Java limitation.
An easier solution would be to just add a step to the first job to send the files are part of the job and drop the job that just transfers the files.
Answering my own question 😀. Hope it helps someone.
Issue was the last ItemWriter was holding the resources because I was using the composite writer. While using CompositeWriter beforeStep, afterStep methods are “hidden”. You have to call them explicitely. I selected the approach to write a custom writer which will explicitely call writer.close().
Adding afterStep method and calling super.close() should also work. Though I have nit tries that out.

Dynamics CRM workflow failing with infinite loop detection - but why?

I want to run a plug-in every 30 minutes, to poll an external system for changes. I am in CRM Online, so I don't have ready access to a scheduling engine.
To run the plug-in, I have a 'trigger' entity with a timezone independent date-
Updating the field also triggers a workflow, which in pseudocode has this logic:
If (Trigger_WaitUntil >= [Process-Execution Time])
{
Timeout until Trigger:WaitUntil
{
Set Trigger_WaitUntil to [Process-Execution Time] + 30 minutes
Stop Workflow with status of: Succeeded
}
}
If Trigger_WaitUntil < [Process-Execution Time])
{
Send email //Tell an admin that the recurring task has self-terminated
Stop Workflow with status of: Canceled
}
So, the behaviour I expect is that every 30 minutes, the 'WaitUntil' field gets updated (and the Plug-in and workflow get triggered again); unless the WaitUntil date is before the Execution time, in which case stop the workflow.
However, 4 hours or so later (probably 8 executions, although I haven't verified that yet) I get an infinite loop warning "This workflow job was canceled because the workflow that started it included an infinite loop. Correct the workflow logic and try again. For information about workflow".
My question is why? Do workflows have a correlation id like plug-ins, which is being carried through to the child workflow? If so, is there anyway I can prevent this, whilst maintaining the current basic mechanism of using a single trigger record to manage the schedule (I've seen other solutions in which workflows create new records, but then you've got to go round tidying up the old trigger records as well)
Yes, this behavior is well-known. The only way to implement recurring workflows without issues with infinite loops in Dynamics CRM and using only OOB features is usage of Bulk Deletion functionality. This article describes how to implement it - http://www.crmsoftwareblog.com/2012/08/using-the-bulk-deletion-process-to-schedule-recurring-workflows/
UPD: If you want to run your code every 30 mins then you will have to create 48 bulkdelete jobs with correspond startdatetime like 12:00, 12: 30, 1:00 ...
The current supported method for CRM is to use the Azure Scheduler.
Excerpt:
create a Web API application to communicate with CRM and our external
provider running on a shared (free) Azure web site and also utilize
the Azure Scheduler to manage the recurrence pattern.
The free version of the Azure Scheduler limits us to execution no more
than once an hour and a maximum of 5 jobs. If you have a lot going on
$20 a month will get you executions every minute and up to 50 jobs -
which sounds like a pretty good deal.
so if you wanted every 30 minutes, you could create two jobs, one on the half hour, and one on the hour.
The Bulk Deletion is an interesting work around and something we've used before. It creates extra work and maintenance though so I try to avoid it if possible.
I would generally recommend building a windows application and using the windows scheduling feature (I know you said you don't have a scheduler available but this is often forgotten). This approach works really well and is very easy to troubleshoot. Writing to logs and sending error email alerts is pretty easy to make it robust. The server doesn't need to be accessible externally, it only needs to reach CRM. If you had CRM on-prem, you could just use the same server.
Azure Scheduler is a great suggestion. This keeps you in the cloud which is nice.
SSIS is another option if you already have KingswaySoft or Cozy Roc in place.
You could build a workflow that creates another record and cleans up after itself; however, this is really using the wrong tool for the job. Also, it's very easy for it to fail and then not initiate the next record.
There is a solution called "Scheduled Workflow Runner". You create a FetchXML query to create a record set to run against, and point it at an on-demand workflow that you want it to run on each record.
http://alexanderdevelopment.net/post/2013/05/18/scheduling-recurring-dynamics-crm-workflows-with-fetchxml/

Talend Subjobs and Sundry

Trying to troubleshoot an existing Talend job with many iterations and sub-jobs created by a developer who is no longer with the company. Ran into an issue with subjobs and hoping someone here can answer.
I know by reading the documentation that OnSubjobOk10 indicates that the job will execute after #10 is complete. But in a workflow with no names, how I do know which is Subjob#10? Can I assume it is the one from where the job-job connection is made?
Thanks in advance,
Bee
OnSubJobOK will make te next subjob work if the previous subjob finished without error, from help.talend:
OnSubjobOK (previously Then Run): This link is used to trigger the
next subjob on the condition that the main subjob completed without
error. This connection is to be used only from the start component of
the Job.
These connections are used to orchestrate the subjobs forming the Job
or to easily troubleshoot and handle unexpected errors.

Execute only one Talend Component

In Talend Open Studio, how do I execute only one of my components? If I click Run, all active components will run. So far the only way I know to execute a single component is to deactivate all others in the Job.
How can I execute one component or subjob without having to deactivate all the other components in the job?
Well, I'm afraid you can't.
Two possible solutions :
Deactivating unwanted components / subjobs (like you already stated)
Decompose your job into multiple jobs. This may give you more flexibility. You can then use the tBufferOutput component to pass information from the child job to his parent.