Run a component in master job after all iterations of child job complete successfully - talend

I have a Talend parent job which runs a child job 20 times on an Iterate flow. I am trying to run another tJava after all the iterations of the child job complete.
On connecting tJava to the child job with OnComponentOk, it runs for each iteration.
Is there a way to run it only after all iterations are done?
Flow right now:
tDBInput --OnComponentOk--> tflowtoiterate --Iterate--> Child_Job(tRunJob) --OnComponentOk--> tjava

To run tJava after all iterations are complete, you can connect tDBInput --OnSubjobOk--> tJava.
The flow and iteration connected to tDBInput will execute first, and only after all iterations finish will tJava execute.
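In the same notation as above, the corrected layout would be (OnSubjobOk fires once the whole subjob, including every iteration, has finished):
tDBInput --OnComponentOk--> tflowtoiterate --Iterate--> Child_Job(tRunJob)
tDBInput --OnSubjobOk--> tJava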

Related

Give input to GitHub Actions job after first job is successful

I have a requirement in GitHub Actions where I need to feed input after the first job is finished.
I.e. the first job (Build) starts automatically after each commit without taking any input, and it runs successfully.
Then, for the second job (Deploy) to start, I select the input (Environment) on which Deploy should execute.
I have figured out manual job execution from https://stackoverflow.com/a/73708545/2728619, but I need help with taking input after the first job is finished.
For the manual trigger, I tried the solution in https://stackoverflow.com/a/73708545/2728619.
I am expecting to read input after the first job has run (not at the beginning of the workflow).

Multiple Kafka inputs not starting at the same time in Talend job

I have a simple Talend standard job containing two Kafka inputs, as you can see in the picture. The problem is that when I run the job, only one of the Kafka inputs starts. The ideal condition I expected is both Kafka inputs running at the same time. Is there any configuration that I missed?
You can easily add a tParallelize component at the beginning of the Talend job, and the subjobs connected to it will be executed at the same time; it works with multiple subjobs as well.
I think a Talend job runs its subjobs serially by default; we just can't tell which component runs first because the process is so fast.
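Using the same arrow notation as the first question, a possible layout would be (component names are illustrative):
tParallelize --Parallelize--> tKafkaInput_1 --Main--> ...
tParallelize --Parallelize--> tKafkaInput_2 --Main--> ...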

IBM DataStage: Job activity does not continue in sequence

I have 16 job activities in a sequence. I have already defined the triggers with OK, so they are all connected and run automatically when the previous job finishes. I have already run and recompiled each job activity on its own, but when I recompile and re-run the sequence, only the first job activity runs and finishes as OK; it does not trigger the next job. Here's the log:
job_spi_februari..JobControl (#Coordinator): Summary of sequence run
19:18:01: Sequence started
19:18:01: jenis_kredit (JOB job_jenis_kredit) started
19:18:16: jenis_kredit (JOB job_jenis_kredit) finished, status=2 [Finished with warnings]
19:18:16: Sequence finished OK
I'm very confused why it's like this: the log shows the sequence finishing OK as if nothing went wrong, yet it does not trigger the next job as it should. What is actually happening, and how do I fix it?
In case you're curious about my job activities, they all look like this
If you connect all job activities with an OK trigger, the sequence will end once a single activity does not finish with OK (like "Finished with warnings"), because nothing is left to execute.
If you want the sequence to continue, I suggest defining a custom trigger which fires on both RunOK and RunWarn.
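A sketch of such a custom trigger expression, using the activity name from the log above (the exact syntax may vary slightly across DataStage versions):
jenis_kredit.$JobStatus = DSJS.RUNOK OR jenis_kredit.$JobStatus = DSJS.RUNWARN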

Setting up a Job Schedule

I currently have a setup that creates a job and then collects some metrics about the tasks in the job. I want to do something similar, but by setting up a job schedule instead. In particular, I want to set a job schedule that wakes up at a recurrence interval that I specify and runs the same code that I was running when creating a job. What's the best way to go about doing that?
It seems that there is a CloudJobSchedule that I could use to set up my job schedule, but it only lets me create, say, a job manager task and specify a few properties. How can I run external code on the jobs created by the job schedule?
It would also help to clarify how the CloudJobSchedule works. Specifically, after I commit my job schedule, what happens programmatically? Does the code just move on sequentially and run the rest of the program? In that case, does it make sense to get a reference to the last job created by the job schedule and run code on the job returned?
You'll want to create a CloudJobSchedule. You can specify the recurrence in the Schedule.
If you only need to run a single task per recurrence, your job manager task can simply be the task you need to run. If you need to run multiple tasks per job recurrence, your job manager needs to have logic to submit tasks to Batch and monitor for completion (if necessary).
When you submit a job schedule to Batch, your client-side code will continue running; the behavior is no different than if you were submitting a regular job. You can retrieve the last job run via JobScheduleExecutionInformation and its RecentJob property.
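A minimal sketch of this flow with the Python azure-batch SDK (the answer uses the .NET names; the Python models mirror them, and the account details, pool id, and collect_metrics.py command line below are placeholders):

from datetime import timedelta
import azure.batch.models as batchmodels
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials

# Placeholder credentials and endpoint.
credentials = SharedKeyCredentials("myaccount", "mykey")
client = BatchServiceClient(credentials, batch_url="https://myaccount.myregion.batch.azure.com")

# The recurrence goes in the Schedule; the work goes in the
# JobSpecification's job manager task.
client.job_schedule.add(batchmodels.JobScheduleAddParameter(
    id="metrics-schedule",
    schedule=batchmodels.Schedule(recurrence_interval=timedelta(hours=1)),
    job_specification=batchmodels.JobSpecification(
        pool_info=batchmodels.PoolInformation(pool_id="mypool"),
        # The job manager task submits and monitors the per-recurrence tasks.
        job_manager_task=batchmodels.JobManagerTask(
            id="job-manager",
            command_line="python collect_metrics.py",  # hypothetical script
        ),
    ),
))

# Client code keeps running after the submit. Later you can look up the most
# recent job the schedule created: execution_info is the
# JobScheduleExecutionInformation, recent_job its RecentJob property.
recent = client.job_schedule.get("metrics-schedule").execution_info.recent_job
print(recent.id)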

End Celery worker task on time limit, job stage, or instruction from client

I'm new to celery and I would appreciate a little help with a design pattern (or example code) for a worker I have yet to write.
Below is a description of the desired characteristics of the worker.
The worker will run a task that collects data from an endless source, a generator.
The worker task will run forever feeding from the generator unless it is directed to stop.
The worker task should stop gracefully on the occurrence of any one of the following triggers.
It exceeds an execution time limit in seconds.
It exceeds a number of iterations of the endless generator loop.
The client sends a message instructing the worker task to finish immediately.
Below is some pseudocode for how I believe I need to handle trigger scenarios 1 and 2.
What I don't know is how I send the 'finish immediately' signal from the client and how it is received and executed in the worker task.
Any advice or sample code would be appreciated.
from celery.task import task
from celery.exceptions import SoftTimeLimitExceeded

COUNTLIMIT = ...  # some value sent to the worker task by the client

@task()
def getData():
    try:
        for count, data in enumerate(endlessGeneratorThing()):
            # process data here
            if count > COUNTLIMIT:  # Handle trigger scenario 2
                clean_up_task_nicely()
                break
    except SoftTimeLimitExceeded:  # Handle trigger scenario 1
        clean_up_task_nicely()
My understanding of revoke is that it only revokes a task prior to its execution. For (3), I think what you want is an AbortableTask, which provides a cooperative way to end a task:
http://docs.celeryproject.org/en/latest/reference/celery.contrib.abortable.html
On the client side you can call task.abort(); on the task side you can poll task.is_aborted().
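A minimal sketch combining all three triggers, based on the question's pseudocode and the abortable-task docs (the broker/backend URLs and the two helper functions are placeholders; abortable tasks need a result backend to store the ABORTED state):

from celery import Celery
from celery.contrib.abortable import AbortableTask
from celery.exceptions import SoftTimeLimitExceeded

app = Celery("worker",
             broker="redis://localhost:6379/0",    # placeholder URLs
             backend="redis://localhost:6379/0")

@app.task(bind=True, base=AbortableTask, soft_time_limit=3600)
def get_data(self, count_limit):
    try:
        for count, data in enumerate(endless_generator_thing()):
            # process data here
            if self.is_aborted():       # trigger 3: client called abort()
                clean_up_task_nicely()
                break
            if count > count_limit:     # trigger 2: iteration limit
                clean_up_task_nicely()
                break
    except SoftTimeLimitExceeded:       # trigger 1: execution time limit
        clean_up_task_nicely()

On the client side, delay() on an abortable task returns an AbortableAsyncResult, so the "finish immediately" signal is just:

result = get_data.delay(1000)
# ... later, to finish immediately:
result.abort()  # the worker sees is_aborted() become True on its next check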