How to manually cancel an ADF pipeline when an activity fails or completes? - azure-data-factory

I have a pipeline with two lists of activities running concurrently. The first list ends with an activity that sets a flag, and the Until loop in the second list completes once that flag is true. The problem is that the flag is set only at the end, after all the activities in the first list have completed. If any activity before the Set flag activity fails, the flag is never set to TRUE and the Until loop runs indefinitely, which is not ideal.
Is there a way that I can manually cancel the activity when any one of the activities fails?
Secondly, the Until loop has an inner Wait activity, so once it enters the loop it waits 50 minutes before it checks the flag condition again. The wait is important, so I can't remove it; however, I want the Until loop to end as soon as the flag is set to true, even while the wait is still running. Basically, I'm trying to end the Until loop.
I did try the steps in the MS Docs for cancelling a pipeline run: https://learn.microsoft.com/en-us/rest/api/datafactory/pipelineruns/cancel
But this does not work: even when the run ID is correct, it says the run ID is incorrect.
Could someone please advise how to achieve this?

Related

Stop ADF pipeline execution if there is no data

I must stop the execution of an ADF pipeline if there is no data in a table, but this should not generate an error; it should only stop the execution. Is this possible?
You can use an If Condition activity: first validate whether there is any data in the table. If there is, run the other activities within the True case; otherwise do nothing.
The pipeline will then exit without any issues.

ADF fail error. Property 'output' cannot be selected

I have the following Fail task in my Until:
On my Until I have an expression that fails the pipeline if an error is encountered in my Until:
@or(greater(variables('RunDate'), utcnow()), activity('Fail').output)
I've been getting this error for the last few days. I have 4 other pipelines that work the same way but only this one is giving me this error.
I'm not sure what this error means. It seems it can't select the output of the Fail activity, but I'm not sure why.
You want to stop the Until activity if any of your child activities fails.
I reproduced a similar scenario, gave the Fail activity output in the Until condition, and got the same error.
This is my scenario inside the Until that causes the error.
You will get this error when all of the inner activities inside the Until succeed. Because every activity succeeds, the Fail activity does not run in that iteration, so the pipeline flow does not recognize its existence. When the flow then evaluates the Until condition, it fails because it doesn't know about the Fail activity.
Even if one of your activities fails in the first iteration and the Fail activity executes, you will still get an error in the Until condition, because activity('Fail').output gives a JSON object, which does not fit in the or logic as it expects only boolean values.
So, you can't give the Fail activity output like that in the Until condition.
The workaround for this can be achieved with two set variable activities inside Until.
The Until activity stops the execution of the inner activities when the condition of the Until becomes true. It will execute as long as the condition is false.
First create a pipeline variable of boolean type.
Set the variable to false if you want to continue the execution after success.
In your case, join the success of your last activity to this Set variable activity.
Set the variable to true if you want to stop the execution after a failure, in addition to your date condition.
Now, give the Until condition as per your requirement using or(your date condition, boolean variable), or you can use and as well.
But make sure that it evaluates to true if you want to stop and false if you want to continue.
NOTE: As per my repro, the Until gives the error "Activity failed because an inner activity failed" only if a child activity fails in the final iteration, not if one fails in an earlier iteration.

Cloud Dataflow: Once trigger not working

I have a Dataflow pipeline reading from an unbounded source. My window size is 10 hours, and I am trying to test my trigger using a TestStream. My trigger should emit an early result if the element count reaches at least 2 for the same key within a window. I have the following trigger to achieve this:
input.apply(Window.into(FixedWindows.of(Duration.standardHours(12)))
        .triggering(AfterWatermark.pastEndOfWindow()
            .withEarlyFirings(AfterPane.elementCountAtLeast(2))))
    .apply(Count.perElement())
We also tried:
Repeatedly.forever(AfterPane.elementCountAtLeast(2)).orFinally(AfterWatermark.pastEndOfWindow())
I expect an early firing when asserting the result; however, I don't get all the results in
PAssert.that(pipeline).inWindow(..)..
What am I doing wrong? Also, running the same test repeatedly yields different results, meaning different values are returned from the trigger.
Triggering is non-deterministic. It will give you an early firing some time after the trigger condition is satisfied. It will then give you another early firing some time after the trigger condition is satisfied again.
The actual choice to emit after the trigger is determined by the runner. If you are using a batch runner, it may wait until all the data is available. How much input are you expecting for each key/window? Which runner are you using?

ScheduledExecutorService: modify one or more running tasks

I have a program that loads a few tasks from a file prepared by the user and starts executing them according to the schedule given in the file.
Example: taskFile.txt
Task1: run every hour
Task2: run every 2 seconds
...
TaskN: run every monday at 10:00
This first part is OK; I solved it using ScheduledExecutorService and I am very satisfied. The tasks are loaded and run as they should.
Now, let's imagine that the user, through the GUI (at runtime), decides that Task2 should run every minute and wants to remove Task3.
I cannot find any way to access one specific task in the pool, in order to remove/modify it.
So I cannot update tasks at runtime. When the user changes a task, I can only modify taskFile.txt and restart the application in order to reload all tasks according to the newly updated taskFile.txt.
Do you know any way to access a single task in order to modify or delete it?
Or even a way to remove one given task, so I can insert a new one in the pool with the modifications the user wants?
Thanks
This is not elegant, but works.
Let's suppose you need 10 threads, and sometimes you need to manage a specific thread.
Instead of having one pool with 10 threads, use 10 pools with one thread each, keep them in your favourite data structure, and act on pool_1 when you want to modify thread_1.
It's possible to remove the older Runnable from the pool and put in a new one with the needed changes.
Otherwise, anything put in the pool becomes anonymous and will not be directly manageable.
If somebody has a better solution...
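The one-pool-per-task idea above can be sketched roughly as follows. This is only a minimal illustration, not the answerer's actual code: the class name TaskRegistry and its methods are hypothetical, and it assumes every task is a fixed-rate Runnable identified by a name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: keeps one single-thread scheduler per named task,
// so each task can be stopped or replaced without touching the others.
class TaskRegistry {
    private final Map<String, ScheduledExecutorService> pools = new HashMap<>();

    // Schedule a task, or reschedule it if the name is already registered.
    // The previous pool for that name (if any) is shut down first.
    synchronized void schedule(String name, Runnable task, long period, TimeUnit unit) {
        remove(name);
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleAtFixedRate(task, 0, period, unit);
        pools.put(name, pool);
    }

    // Stop a task and discard its pool; returns false if the name is unknown.
    synchronized boolean remove(String name) {
        ScheduledExecutorService pool = pools.remove(name);
        if (pool == null) {
            return false;
        }
        pool.shutdownNow();
        return true;
    }

    synchronized boolean isScheduled(String name) {
        return pools.containsKey(name);
    }
}
```

With such a registry, changing Task2 to run every minute would be a call like registry.schedule("Task2", task2, 1, TimeUnit.MINUTES), and removing Task3 would be registry.remove("Task3"). The trade-off is one thread per task, which is acceptable for a handful of tasks but wasteful for many.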

Spring Batch - execute a set of steps 'x' times based on a condition

I need to execute a sequence of steps a specific number of times. Any pointers on the best way to do this in Spring Batch? I am able to execute a single step 'x' times, but my requirement is to execute a set of steps 'x' times, based on a condition. Any pointers will help.
Thanks
Lakshmi
You could put all the steps in a job and start the whole job several times. There are different ways a job can be launched in Spring Batch. Have a look at JobOperator and JobLauncher, and then simply implement a loop around the launching of the job.
You can do this after the whole Spring context is initialized, so there will be no overhead concerning that. But you must pay attention to the scope of your beans, especially the readers and writers.
Depending on your needs concerning failure handling and restart, you also have to pay attention to how you manage the execution context of your job and steps.
You can simulate a loop with SB using a JobExecutionDecider:
Put it in front of all steps.
Store x in the job execution context and check its value in the decider: move to 'END' if x equals the desired value, or increment it and move to the first step of the set.
After the last step, move back to the start (the decider).
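The control flow of that decider loop can be modelled in plain Java as below. This is only a sketch of the logic and does not use any Spring Batch classes: the class DeciderLoopSketch, the key "x", and the "CONTINUE"/"END" strings are assumptions made for illustration. In a real job, the decide logic would live in a JobExecutionDecider implementation and the counter in the job execution context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the decider loop; none of these names are Spring Batch API.
class DeciderLoopSketch {
    static final String COUNT_KEY = "x";

    // Mirrors the decider: "END" once the counter has reached the target,
    // otherwise increment the counter and return "CONTINUE".
    static String decide(Map<String, Integer> context, int target) {
        int x = context.getOrDefault(COUNT_KEY, 0);
        if (x >= target) {
            return "END";
        }
        context.put(COUNT_KEY, x + 1);
        return "CONTINUE";
    }

    // Runs the whole set of steps each time the decider says "CONTINUE",
    // then goes back to the decider. Returns a log of the executed steps.
    static List<String> run(List<String> steps, int target) {
        Map<String, Integer> context = new HashMap<>();
        List<String> log = new ArrayList<>();
        while (decide(context, target).equals("CONTINUE")) {
            for (String step : steps) {
                log.add(step + "#" + context.get(COUNT_KEY));
            }
        }
        return log;
    }
}
```

Keeping the counter in a context object rather than in a local variable mirrors the suggestion to store x in the job execution context, which is what would let the count survive across steps.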