How to execute pipeline at different times - azure-data-factory

I have a requirement like this: I have a pipeline that contains 6 activities. I need to trigger the pipeline at 6 AM and 8 PM. At 6 AM I need to run the first 3 activities, and the remaining 3 activities need to run at 8 PM.
Note: all 6 activities are in one pipeline.

What AbhishekKhandave-MT called out is accurate. In the worst-case scenario you can always clone the existing pipeline; you will then have two pipelines. Keep only the activities you want in each, and schedule them accordingly.
In case you do not want to use the above suggestion, you can always use an If Condition activity to check the time and put the activities inside its branches. I agree it will be very messy that way; see the sketch below.
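For example, if one schedule trigger fires the pipeline at both 6 AM and 8 PM, the If Condition's expression can branch on the hour of the trigger time (note pipeline().TriggerTime is in UTC, so use convertTimeZone if your schedule is in another time zone):
@less(int(formatDateTime(pipeline().TriggerTime, 'HH')), 12)
The True branch holds the first 3 activities (the 6 AM run) and the False branch the last 3 (the 8 PM run).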

Triggers in Azure Data Factory are associated with pipelines only. You cannot run individual activities using a trigger; once you run a pipeline, all activities in it get executed.
Types of Trigger:
Schedule trigger
Tumbling window
Storage events
Custom events
Refer - https://learn.microsoft.com/en-us/azure/data-factory/how-to-create-schedule-trigger?tabs=data-factory
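For the 6 AM / 8 PM case above, a single schedule trigger can fire the pipeline at both times. A minimal sketch with the azure-mgmt-datafactory Python SDK (track 2); all resource names are placeholders:

from datetime import datetime, timezone
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    PipelineReference, RecurrenceSchedule, ScheduleTrigger,
    ScheduleTriggerRecurrence, TriggerPipelineReference, TriggerResource,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# One daily schedule trigger that fires at 06:00 and 20:00
trigger = ScheduleTrigger(
    recurrence=ScheduleTriggerRecurrence(
        frequency="Day",
        interval=1,
        start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
        time_zone="UTC",
        schedule=RecurrenceSchedule(hours=[6, 20], minutes=[0]),
    ),
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="SixActivityPipeline"))],
)
adf.triggers.create_or_update("<resource-group>", "<factory-name>",
                              "TwiceDailyTrigger", TriggerResource(properties=trigger))
# Triggers are created in the stopped state; start it (long-running operation)
adf.triggers.begin_start("<resource-group>", "<factory-name>", "TwiceDailyTrigger").result()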

Related

Run different stages/templates of azure pipeline at different schedules

I have a configuration file for the Azure pipeline that is scheduled through the UI to run Mon to Fri. The file has different stages, and each stage calls a different template. What I want to do is run different stages/templates on different days of the week.
I tried to save different schedules through the triggers UI, but they need to be applied to the entire file.
I was also reading this https://learn.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml but again, the schedule would be applied to the entire file.
Is there a way to apply a different schedule to each step?
No, there is no out-of-the-box way to do that. I think you may try to:
Divide your build into several pipelines and schedule them separately.
Or
Add a planning step that detects the day and sets an execution step variable like:
echo "##vso[task.setvariable variable=run.step1;]YES"
Set variables in scripts
Then use it in the conditions:
and(succeeded(), eq(variables['run.step1'], 'YES'))
Specify conditions
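For instance, the planning step could be a short Python script (a sketch; the variable name run.step1 follows the example above, and the Monday/Wednesday/Friday mapping is made up):

# plan.py - run this early in the pipeline (e.g. "script: python plan.py");
# printing an Azure DevOps logging command to stdout sets a pipeline variable
from datetime import datetime, timezone

weekday = datetime.now(timezone.utc).strftime("%A")  # "Monday", "Tuesday", ...
run_step1 = "YES" if weekday in ("Monday", "Wednesday", "Friday") else "NO"
print(f"##vso[task.setvariable variable=run.step1]{run_step1}")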

ADF pipeline should be triggered after Formrecognizer analysis is completed

I am calling Azure Form Recognizer from my Azure Data Factory v2 pipeline and sending a 200-page document.
Form Recognizer takes around 5 minutes to complete the analysis, and until then the status is "Running".
So I have added a Wait activity to wait for 5 minutes, and then I call Get Analyze Results on the Form Recognizer API.
Question
Is there any way to trigger the ADF pipeline once Form Recognizer completes its analysis?
You can use the Execute Pipeline activity for this. Keep the Form Recognizer activity in a separate pipeline and call that pipeline from the main pipeline.
Here, as a sample, I have used a Set Variable activity in the child pipeline.
You can use your Form Recognizer activity in its place.
Check "Wait on completion" so that the Execute Pipeline activity waits for the child pipeline to complete.
After the Execute Pipeline activity, add your next activities; they will be executed only after the Form Recognizer activity completes. For this demo I have used another Set Variable activity.
Or you can put the follow-up activities in a separate pipeline and run them with another Execute Pipeline activity after this one.
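The key setting is the Execute Pipeline activity's "Wait on completion" (waitOnCompletion) flag. A minimal sketch of the parent pipeline using the azure-mgmt-datafactory Python SDK models; all names are placeholders, and the Wait activity merely stands in for your real downstream activities:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ActivityDependency, ExecutePipelineActivity, PipelineReference,
    PipelineResource, WaitActivity,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Child pipeline "FormRecognizerPipeline" holds the Form Recognizer call
run_child = ExecutePipelineActivity(
    name="RunFormRecognizer",
    pipeline=PipelineReference(reference_name="FormRecognizerPipeline"),
    wait_on_completion=True,  # parent waits until the child pipeline finishes
)
# Placeholder for whatever must run only after the analysis completes
next_step = WaitActivity(
    name="AfterAnalysis",
    wait_time_in_seconds=1,
    depends_on=[ActivityDependency(activity="RunFormRecognizer",
                                   dependency_conditions=["Succeeded"])],
)
adf.pipelines.create_or_update("<resource-group>", "<factory-name>", "MainPipeline",
                               PipelineResource(activities=[run_child, next_step]))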

Run an Azure Data Factory Pipeline Continuously

I have a requirement to incrementally copy data from one SQL table to another SQL table. The watermark or key column is an Identity column. My boss wants me to restart the load as soon as it's done...and as you know, the completion time may vary. In Azure Data Factory, the trigger options are Scheduled, Tumbling Window and Custom Event. Does anyone know which option would allow me to achieve this continuous running of the pipeline and how to configure it?
Create a new pipeline. Call it "run forever". Add an Until activity with a never-true condition, e.g. @equals(1, 2).
Inside the until execute the pipeline which copies between tables. Ensure "wait for completion" is ticked.
If the table copy fails then "run forever" will fail and will have to be manually restarted. You likely do not want "run forever" to be on a schedule, as the scheduled invocations will queue up if it does not, in fact, fail.
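A sketch of that "run forever" pipeline built with the azure-mgmt-datafactory Python SDK models, assuming the table-to-table copy lives in a pipeline named IncrementalCopyPipeline:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ExecutePipelineActivity, Expression, PipelineReference,
    PipelineResource, UntilActivity,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

copy_once = ExecutePipelineActivity(
    name="CopyBatch",
    pipeline=PipelineReference(reference_name="IncrementalCopyPipeline"),
    wait_on_completion=True,  # "wait for completion" ticked
)
loop = UntilActivity(
    name="LoopForever",
    expression=Expression(value="@equals(1, 2)"),  # never true, so keep looping
    activities=[copy_once],
    # note: Until also has a timeout property; the loop stops once it elapses
)
adf.pipelines.create_or_update("<resource-group>", "<factory-name>", "run forever",
                               PipelineResource(activities=[loop]))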
Pipelines in ADF are batch-based. You can set up "micro-batches" of 1 minute with a schedule trigger or 5 minutes with a tumbling window trigger.
ADF will always run as a batch. If you want to load data continuously, you can go for Stream Analytics / Event Hubs, which handle real-time data.

Automation for ADF V2 Pipeline

I need help with implementing the requirement below:
There is one ADF pipeline that runs every two hours (with a tumbling window trigger). Now I need to create one more pipeline that will be used for performing a maintenance job. This pipeline is scheduled to run once a month (with a schedule trigger). Here is the requirement I'm trying to implement:
Before running the second pipeline, I need to make sure the first pipeline is not running (basically get its status and, if it's running, wait for its completion) and then disable the trigger associated with it.
Run the second pipeline and, after its completion, enable the trigger associated with the first pipeline.
Please let me know if this can be achieved within ADF or whether some kind of custom scripting is needed.
First, your idea is achievable.
Second, there is no built-in feature in Azure Data Factory to do it.
Basically, you need to use an Azure Function (a simple HTTP trigger with no input, so you can hit and execute it directly) to achieve the part of the requirement that ADF can't do. From your description, the execution of these two pipelines is mutually exclusive, so in the Azure Function you can use the SDK to check the status of the other pipeline. If the other pipeline is running, wait a few seconds and then re-check its status. (In short, put the main logic and code in the Azure Function.)
Simple azure function:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=csharp
Use SDK to monitor:
https://learn.microsoft.com/en-us/azure/data-factory/monitor-programmatically#net
(The linked example is C#; you can choose another supported language.)
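A sketch of the function's core logic in Python with azure-mgmt-datafactory (the docs above show C#); the trigger/pipeline names and the 3-hour lookback window are assumptions:

import time
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    RunFilterParameters, RunQueryFilter, RunQueryFilterOperand,
    RunQueryFilterOperator,
)

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<factory-name>"

# 1. Stop the tumbling window trigger so no new runs of pipeline 1 are queued
adf.triggers.begin_stop(RG, DF, "TwoHourlyTrigger").result()

# 2. Wait until no run of pipeline 1 is still in progress
def pipeline_busy(name):
    now = datetime.now(timezone.utc)
    runs = adf.pipeline_runs.query_by_factory(RG, DF, RunFilterParameters(
        last_updated_after=now - timedelta(hours=3),
        last_updated_before=now,
        filters=[RunQueryFilter(operand=RunQueryFilterOperand.PIPELINE_NAME,
                                operator=RunQueryFilterOperator.EQUALS,
                                values=[name])]))
    return any(r.status in ("InProgress", "Queued") for r in runs.value)

while pipeline_busy("TwoHourlyPipeline"):
    time.sleep(30)

# 3. Run the maintenance pipeline and wait for it to finish
run_id = adf.pipelines.create_run(RG, DF, "MaintenancePipeline").run_id
while adf.pipeline_runs.get(RG, DF, run_id).status in ("InProgress", "Queued"):
    time.sleep(30)

# 4. Re-enable the trigger for pipeline 1
adf.triggers.begin_start(RG, DF, "TwoHourlyTrigger").result()

Note that on an Azure Functions Consumption plan the default timeout is 5 minutes, which may be shorter than the waiting involved here, so a Premium/dedicated plan or a Durable Functions pattern may be needed.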

Run a lot of pipelines at the same time

Is there a solution to run those pipelines at the same time, instead of adding a trigger for each pipeline?
Just add a trigger with the same schedule to all of your pipelines; a single schedule trigger can also be associated with multiple pipelines.
In the ADF portal, set the same time in the trigger configuration.
If you want to execute them one after another instead, you could use the Execute Pipeline activity, which allows you to invoke another pipeline.
You could also leverage the Lookup activity to look up the pipeline names from metadata or a pipeline parameter table, and then set the ForEach loop to parallel execution so that it processes up to 50 pipelines at once.
See the ForEach article for more info: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity#parallel-execution
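Alternatively, driving the fan-out from outside ADF with the Python SDK is one call per pipeline, since create_run only queues the run and returns immediately (a sketch; names are placeholders):

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
RG, DF = "<resource-group>", "<factory-name>"

# create_run does not block, so all pipelines execute in parallel inside ADF
pipeline_names = ["CopyCustomers", "CopyOrders", "CopyProducts"]  # e.g. from a metadata table
run_ids = {name: adf.pipelines.create_run(RG, DF, name).run_id
           for name in pipeline_names}
print(run_ids)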