ADF fail error. Property 'output' cannot be selected - azure-data-factory

I have the following Fail activity inside my Until:
On the Until I have an expression that should fail the pipeline if an error is encountered inside it:
@or(greater(variables('RunDate'), utcnow()), activity('Fail').output)
I've been getting this error for the last few days. I have 4 other pipelines that work the same way, but only this one gives me this error.
I'm not sure what the error means. It seems it can't select the output of the Fail activity, but I'm not sure why.

You want to stop the Until activity if any of your child activities fails.
I reproduced a similar scenario, used the Fail activity output in the Until condition, and got the same error.
This is my scenario inside the Until that causes the error.
You will get this error when all of the inner activities inside the Until succeed. When every activity succeeds, the Fail activity never runs in that iteration, so the pipeline has no output for it, and when the Until evaluates its condition it fails because it doesn't know about the Fail activity.
Even if one of your activities fails in the first iteration and the Fail activity executes, you will still get an error in the Until condition, because activity('Fail').output returns a JSON object, which does not fit the or() logic that expects a boolean value.
So you can't reference the Fail activity output like that in the Until condition.
The workaround can be achieved with two Set variable activities inside the Until.
The Until activity stops executing its inner activities once its condition becomes true; it keeps iterating as long as the condition is false.
First, create a pipeline variable of boolean type.
Set the variable to false if you want to continue execution after a success.
In your case, connect the success output of your last inner activity to this Set variable activity.
Set the variable to true if you want to stop execution after a failure, alongside your date condition.
Now build the Until condition as per your requirement, using or(your date condition, boolean variable); you can use and as well.
Just make sure the condition evaluates to true when you want to stop and false when you want to continue, as in the sketch below.
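A minimal sketch of the expressions, assuming the boolean variable is named StopLoop (the variable name is an assumption) and reusing the date check from the question:
Set variable on the success path:  StopLoop = @bool(false)
Set variable on the failure path:  StopLoop = @bool(true)
Until condition:  @or(greater(variables('RunDate'), utcnow()), variables('StopLoop'))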
NOTE: In my repro, the Until reports the error "Activity failed because an inner activity failed" only when a child activity fails in the final iteration, not in every iteration.

Related

Trigger Date for reruns

My pipeline's activities need the date of the run as a parameter. Currently I get the current date in the pipeline from the utcnow() function. Ideally this would be something I could set dynamically in the trigger, so that when I rerun a failed day the parameter would be set correctly; right now a rerun runs my pipeline again, but with today's date rather than the date of the failed run.
I am used to Airflow, where such things are pretty easy to do, including scheduling reruns. I probably think too much in Airflow terms, but I can't wrap my head around a better solution.
In ADF, passing the date at which a pipeline run failed back to the trigger is not supported directly.
You can get the trigger time using @pipeline().TriggerTime.
This system variable gives the time at which the trigger started the pipeline run.
You can store this trigger value for every pipeline run and use it as a parameter when you rerun the pipeline whose run failed.
Reference: Microsoft documentation on system variables in ADF
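A rough sketch of that setup, assuming a pipeline parameter named RunDate (the parameter name is an assumption):
Schedule trigger parameter mapping:  RunDate = @trigger().scheduledTime
Inside activities:  @formatDateTime(pipeline().parameters.RunDate, 'yyyy-MM-dd')
For a failed day, run the pipeline manually and supply that day's date for RunDate instead of relying on utcnow().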
To resolve my problem I had to create a nested structure of pipelines: the top pipeline sets a variable for the date and then calls the other pipelines, passing that variable.
With this I still can't rerun the top pipeline, but rerunning Execute Pipeline1/2/3 reruns them with the right variable set. It is still not perfect, since the top pipeline run stays in an error state and it is difficult to keep track of what needs to be rerun, but it is a partial solution.
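A sketch of that nested structure (the variable and parameter name RunDate is an assumption):
Top pipeline:  Set variable  RunDate = @pipeline().TriggerTime
Execute Pipeline1/2/3 parameter mapping:  RunDate = @variables('RunDate')
Child pipelines read the value as  @pipeline().parameters.RunDate, which is why, as described above, rerunning an Execute Pipeline activity picks up the date it was originally given.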

Stop ADF pipeline execution if there is no data

I need to stop the execution of an ADF pipeline if there is no data in a table, but this should not raise an error; it should only stop the execution. Is this possible?
You can use an If Condition activity: first validate whether there is any data in the table, and if there is, run the remaining activities inside the True case; otherwise do nothing.
The pipeline would then exit without any issues.
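A minimal sketch of that check, assuming a Lookup activity named LookupTable that runs a count query against the table (the activity, table, and column names are assumptions):
Lookup query:  SELECT COUNT(*) AS cnt FROM dbo.MyTable
If Condition expression:  @greater(int(activity('LookupTable').output.firstRow.cnt), 0)
With no rows in the table the expression is false, the True branch is skipped, and the pipeline finishes successfully.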

How to manually cancel an ADF pipeline when an activity fails or completes?

I have a pipeline with two concurrent branches of activities running at the same time. The first branch sets a flag at its end, which makes the Until loop in the second branch complete when the flag is true. The problem is that the flag is only set at the end of the pipeline, once all the activities in the first branch have completed. If any activity before the Set flag activity fails, the flag is never set to TRUE and the Until loop runs forever, which is not ideal.
Is there a way I can manually cancel the pipeline when any one of the activities fails?
Secondly, the Until loop has an inner Wait activity, so once it enters the loop it waits 50 minutes before checking the flag condition again. The wait is important, so I can't remove it, but I would like the Until loop to end as soon as the flag is set to true, even while the wait is running. Basically, I'm trying to end the Until loop early.
I did try the steps in the Microsoft docs, Pipeline Runs - Cancel: https://learn.microsoft.com/en-us/rest/api/datafactory/pipelineruns/cancel
But this does not work; even when the run ID is correct, it says the run ID is incorrect.
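For reference, the cancel call documented at that link has the following shape (the placeholders in braces are the values for the specific factory and run, and an Azure AD bearer token is required):
POST https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}/pipelineruns/{runId}/cancel?api-version=2018-06-01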
Could someone please advise how to achieve this?

Talend - Error handling without using subjobs

Please see the image.
So here is the flow: the first component executes a database query to find a QAR_ID (a single row); if it is found, all is well. I am trying to add error handling to this. When no rows are found, the flow goes to tJava_11, which raises a Java exception that gets logged by another tJava component.
The problem I am facing is that when the job goes down the error-handling flow, it logs the error and then jumps straight to the post-job section. However, I want Talend to take the OnSubjobOk route so that it continues with the other steps instead of jumping directly to the post-job section.
I know this is possible using subjobs, but I don't want to keep creating n subjobs.
Is there any way this can be done in the same job?
You could remove the Run if trigger and handle both scenarios in the get_QAR_ID into context component: check the database component's NB_LINE after variable, and if it's < 1 raise the error, else set the value. Your job would then flow on through OnSubjobOk. A sketch follows.
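A minimal sketch of that tJava logic, assuming the query component is named tDBInput_1 and the context variable is QAR_ID (both names are assumptions):
Integer nbLine = (Integer) globalMap.get("tDBInput_1_NB_LINE"); // NB_LINE after variable of the query component
if (nbLine == null || nbLine < 1) {
    throw new RuntimeException("No QAR_ID found"); // raise the error only when the query returned no row
}
// otherwise set context.QAR_ID from the returned row and let the job continue via OnSubjobOk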
You can do something like this:
In tJava_1, do your error logging if no row is returned by your query, and continue to the next subjob. There is no need to throw an exception here only to catch it immediately afterwards.
If any row is found, continue to the next subjob (tJava_2) with an If trigger.

Cloud Dataflow: Once trigger not working

I have a Dataflow pipeline reading from an unbounded source. My window size is 10 hours, and I am trying to test my trigger using a TestStream. The trigger should emit an early result when the element count reaches at least 2 for the same key within a window. I have the following trigger to achieve this:
input.apply(Window.into(FixedWindows.of(Duration.standardHours(12)))
        .triggering(AfterWatermark.pastEndOfWindow()
            .withEarlyFirings(AfterPane.elementCountAtLeast(2)))
        .withAllowedLateness(Duration.ZERO).discardingFiredPanes()) // both are required once a trigger is set
    .apply(Count.perElement());
We also tried:
Repeatedly.forever(AfterPane.elementCountAtLeast(2)).orFinally(AfterWatermark.pastEndOfWindow())
I expect an early firing when asserting the result; however, I don't get all the results in
PAssert.that(pipeline).inWindow(..)..
What am I doing wrong? Also, running the same test repeatedly yields different results, meaning different values are returned by the trigger.
Triggering is non-deterministic. It will give you an early firing some time after the trigger condition is satisfied, and then another early firing some time after the trigger condition is satisfied again.
The actual choice of when to emit after the trigger fires is determined by the runner. If you are using a batch runner, it may wait until all the data is available. How much input are you expecting for each key/window? Which runner are you using?
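When the assertion does need to be deterministic, one option is to assert only on the on-time (or final) pane, which fires once the TestStream watermark passes the end of the window. A rough sketch, assuming the Count.perElement() output above is held in a PCollection named counts and that the test data uses a single placeholder key (the key and count values are placeholders):
IntervalWindow window = new IntervalWindow(new Instant(0), Duration.standardHours(12));
PAssert.that(counts).inOnTimePane(window).containsInAnyOrder(KV.of("key", 2L));
Assertions about the early panes themselves remain sensitive to when the runner chooses to fire, which is the non-determinism described above.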