Is it possible to use tBufferInput/Output as a lookup (with tMap) in different subjobs, connected with an OnSubjobOk link?
In a single job it's simple.
But when I try to retrieve values from tBufferInput in Job1 (after running Job2, which writes data to tBufferOutput), the buffer seems to be empty.
JOB 1:
JOB 2:
My sequence:
You can't fill a cache in Job1 and read it in Job2. Caches (tBuffer, tHash) are only available within the same job.
You can, however, use a cache to propagate data from a child job to its parent job:
https://help.talend.com/reader/wDRBNUuxk629sNcI0dNYaA/1I5EGN1E92B_kquoxHqV1Q (Scenario 3)
To my knowledge, the reverse is not possible.
I have built a pipeline with 4 tasks:
Task 1 Builds a VM
Task 2 Add a Data Disk
Task 3 Add a Second Data Disk
Task 4 Add a Third Data Disk
However, if I only want Task 1 and Task 2 to execute, how can I skip Tasks 3 and 4 (for example, if the user only wants 1 data disk)? I know they can be disabled manually, but is there a way to automate this based on a variable?
Every stage, job and task has a condition property. You can use a condition expression to decide which tasks run and when, and you can reference variables in such expressions. By marking those variables as settable at queue time, you let the user control them.
Make sure you prepend each condition with succeeded() so a task only runs when the previous steps have completed successfully:
condition: and(succeeded(), gt(variables.Disks, 2))
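For example, a minimal YAML sketch (the script steps are placeholders for the real VM/disk tasks, and Disks is assumed to be a variable defined in the pipeline settings and marked "Settable at queue time"):

steps:
- script: echo "Build VM"           # Task 1: always runs

- script: echo "Add data disk 1"    # Task 2
  condition: and(succeeded(), ge(variables.Disks, 1))

- script: echo "Add data disk 2"    # Task 3
  condition: and(succeeded(), ge(variables.Disks, 2))

- script: echo "Add data disk 3"    # Task 4
  condition: and(succeeded(), ge(variables.Disks, 3))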
See:
Expressions
Specify Conditions
Define Variables - Allow at queue time
I am solving a job shop scheduling problem in AnyLogic. I have 20 jobs (agents) and 5 machines (resources), and each job has a specific order in which to visit the machines. My question is: how can I make sure that each job follows its order?
This is what I have done. One agent called 'jobs' and 5 agents, each one corresponding to a machine. One resource pool is associated with each of the service blocks. In the collection enterblocks I selected the 5 enter blocks.
In the agent 'jobs' I have this: the parameters associated with each job, read from the database file; the collection 'enternames', where I selected the machine(1,2,3,4,5) parameters; and the collection 'ptimes', where I put the processing times of the job. (These two collections are where I am not sure I have done it correctly.)
My database file
I am not sure how to use the counter used here: How to store routings in job shop production in AnyLogic. In that link the getNextService function is used in the exit blocks, but I am also not sure how to use it in my case because of the counter.
Firstly, to confirm: based on the Job agent and the database view, the first line in the database will result in a Job agent with values such as:
machine1 = 1 and process1 = 23
machine2 = 0 and process2 = 82, and so on
If that is the intent, then a better way is to restructure the database so there are two tables:
A table of jobs to machine sequence, looking something like this:
job | op1      | op2      | op3      | op4      | op5
1   | machine2 | machine1 | machine4 | machine5 | machine3
2   | machine4 | machine3 | machine5 | machine1 | machine2
3   | ...      | ...      | ...      | ...      | ...
A table of jobs to processing times.
Then add a collection of type ArrayList of String to Job (let's call this collection col_machineSequence), and when the Job agents get created, their "on startup" code should be:
for (String param : List.of("op1", "op2", "op3", "op4", "op5")) {
    // getParameter(...) returns Object, so cast it to String before storing
    col_machineSequence.add((String) getParameter(param));
}
As a result, col_machineSequence will contain the sequence of machines each job should visit, in the order defined in the database.
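As a rough sketch of how the counter-based routing could then work (the function goToNextMachine and the int variable opIndex are assumptions rather than existing model elements, and the enterblocks collection is assumed to hold the Enter blocks in machine1..machine5 order):

// Function defined in Main; call it, for example, from each Service block's
// "On exit" action as: goToNextMachine((Job) agent);
void goToNextMachine(Job job) {
    if (job.opIndex >= job.col_machineSequence.size()) {
        return; // all operations done - let the job leave the system
    }
    String next = job.col_machineSequence.get(job.opIndex); // e.g. "machine3"
    job.opIndex++;                                          // advance this job's own counter
    int machineNo = Integer.parseInt(next.substring("machine".length())); // 1..5
    enterblocks.get(machineNo - 1).take(job);               // inject the job into that machine's Enter block
}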
NOTE: Please see help on getParameter() here.
Also:
Putting a Queue in front of the Service isn't necessary
Repeating Enter-Queue-Service-Exit isn't necessary; this can be simplified using this method
Follow-up clarifications:
Collections - these will be enclosed in each Job agent
Queue sorting - Service block has Priorities / preemption which governs the ordering on the queue
Create another agent for the second table (call the agent ProcessingTime and the table processing_time), add it to the Job agent, and then load it from the database, filtering on p_jobid as shown in the picture.
I have a main job "Parent" and 2 child jobs "Child_1" & "Child_2" defined in "Project-A".
The Parent job has 1 option named "childname" with Allowed Values "Child_1,Child_2".
Only one value can be selected from the drop down.
Within Parent, there is a Job Reference step where I'm trying to pass ${option.childname} in the "Job Name" field to call the selected child job.
However, it is resulting in an error:
Job [${option.childname}] not found, project: Project-A
How do I get the Parent to run the child job in this manner?
If not, what is the alternate way to select the child job?
My ultimate goal is:
1) Define several jobs within the project.
2) Define one main control job through which I can select some combination of hostname, application component, environment name, etc. and execute the correct child job. The point is to not have to sift through several jobs (or groups of jobs) to run a particular child job.
Thanks!
A good option is to use the API, or the RD CLI wrapped in a shell script, passing the option as a bash variable (a sketch follows the links below).
API Reference:
https://docs.rundeck.com/docs/api/
Very good Rundeck API examples:
https://documenter.getpostman.com/view/95797/rundeck/7TNfX9k#intro
RD CLI:
https://rundeck.github.io/rundeck-cli/
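For example, a minimal sketch of the RD CLI approach, used as a script step inside the Parent job (the environment variable name assumes the option is called childname, and the rd CLI is assumed to be installed and configured on the node):

#!/bin/bash
# Rundeck exposes job options to script steps as environment variables,
# so the value chosen in the drop-down arrives as RD_OPTION_CHILDNAME.
CHILD_JOB="${RD_OPTION_CHILDNAME}"   # Child_1 or Child_2

# run the selected child job in the same project and follow its output
rd run --project "Project-A" --job "${CHILD_JOB}" --follow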
My requirement is:
Parallel Job1: extracts data from a table; I need to know when the row count is more than 0.
Parallel Job2: should be triggered in the sequence only when the row count from the source query in Job1 is greater than 0.
I want to achieve this without creating any intermediate file in Job1.
So basically you want to take information from a data stream (of your Job1) and use it in the controlling sequence as a parameter.
In your case you want to decide at sequence level whether or not to run subsequent jobs (i.e. only if more than 0 rows are returned).
Two options for that:
Job1 writes the information to a file which is a value file of a parameter set. These files are stored in a fixed directory. The parameter from the value file can then be used in your sequence to decide on further processing. Details for parameter sets can be found here.
You could use a server job for Job1 and set a user status (BASIC function DSSetUserStatus) in a transformer. This is passed back to the sequence and can be referenced in subsequent activities of the sequence. See the documentation, but you will also find plenty of other information on this topic online.
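A minimal sketch of that second option (the routine name and argument are illustrative): create a small server routine that wraps DSSetUserStatus and call it from a transformer derivation in Job1 with the row count as its argument.

* DataStage BASIC transform routine, e.g. SetUserStatus(Arg1).
* Called from a transformer derivation in Job1, it stores the row count
* in the job's user status and passes the input value through unchanged.
Call DSSetUserStatus(Arg1)
Ans = Arg1

In the sequence, the trigger of the Job1 activity (or a Nested Condition stage) can then use a custom expression along the lines of Job1_Activity.$UserStatus > 0 to decide whether to run Job2.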
There are more solutions to this problem, or let us call it a challenge. Another way would be a script called at sequence level which queries the database directly and avoids Job1 altogether...
I have a job with two steps. The first step creates a file in a folder with the following structure:
src/<timestamp>/file.zip
The next step needs to retrieve this file and process it.
I want to add the timestamp as a job parameter. Each job instance is differentiated by the timestamp, but I won't know the timestamp before the first step completes. If I add a timestamp to the job parameters at the beginning of the job, then a new job instance will be started every time and any incomplete job will be ignored.
I think you can make use of JobExecutionContext instead.
Step 1 gets the current timestamp, uses it to generate the file, and puts it into the JobExecutionContext. Step 2 reads the timestamp from the JobExecutionContext and uses it to construct the input path for its processing.
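A minimal sketch of that idea (class names and the "zipPath" key are illustrative, not prescribed by Spring Batch):

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

// Step 1: generate the timestamp, create the file, and store the resulting path
// in the JobExecutionContext so later steps can read it.
public class CreateFileTasklet implements Tasklet {
    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        String timestamp = String.valueOf(System.currentTimeMillis());
        String path = "src/" + timestamp + "/file.zip";
        // ... create the zip file at 'path' ...
        chunkContext.getStepContext().getStepExecution()
                    .getJobExecution().getExecutionContext()
                    .putString("zipPath", path);
        return RepeatStatus.FINISHED;
    }
}

// Step 2: a step-scoped bean reads the value back through late binding.
@Component
@StepScope
class ProcessFileTasklet implements Tasklet {
    @Value("#{jobExecutionContext['zipPath']}")
    private String zipPath;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // ... open and process the file at zipPath ...
        return RepeatStatus.FINISHED;
    }
}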
Just to add something regarding your approach of splitting the steps like this: think twice about whether it is really what you want. If Step 1 finishes and Step 2 fails, then when the job instance is rerun it will restart from Step 2, which means the file will not be regenerated in Step 1 (because that step has already completed). If that is what you want, fine. If not, consider putting Step 1 and Step 2 into a single step instead.