Can we call the pipeline in parallel and have multiple instances running? - azure-data-factory

I have a scenario where I have to execute one pipeline from several other pipelines to log validation results.
So I'm planning to have a single pipeline for all the validations (counts, duplicates, count drops, etc.), and this pipeline should be triggered whenever a particular table's execution completes.
For example: there are two pipelines, P1 and P2, both of which invoke this validation pipeline upon completion, so the validation pipeline may be triggered twice at the same time.
Can we run a pipeline like this? Is any lock applied automatically?

Yes, you can reuse a generic pipeline in other pipelines and call it in parallel; there is no locking. Each invocation becomes an independent pipeline run.
Just make sure the generic pipeline allows parallel executions (its concurrency setting), otherwise the extra runs will sit in the queue.
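For illustration, here is a minimal sketch of such a generic pipeline's JSON, with hypothetical names; the concurrency property (Settings > Concurrency in the ADF UI) is the knob that controls how many runs may execute at once:

{
    "name": "pl_validation",
    "properties": {
        "concurrency": 4,
        "parameters": {
            "tableName": { "type": "string" }
        },
        "activities": []
    }
}

If concurrency is left unset, any number of runs can execute in parallel; each caller (P1, P2) simply uses an Execute Pipeline activity pointing at pl_validation and gets its own independent run ID.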

Related

Run different stages/templates of azure pipeline at different schedules

I have a configuration file for the Azure pipeline that is scheduled through the UI to run Mon to Fri. The file has different stages and each stage calls a different template. What I want to do is run different stages/templates in different days of the week.
I tried to save different schedules through the triggers UI, but they need to be applied to the entire file.
I was also reading this https://learn.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml but again, the schedule would be applied to the entire file.
Is there a way to apply a different schedule to each step?
No, there is no out-of-the-box way to do that. I think you may try to:
Divide your build into several pipelines and schedule them separately.
Or
Add a planning step that detects the day of the week and sets a variable for each step that should run, like:
echo "##vso[task.setvariable variable=run.step1]YES"
See: Set variables in scripts
Then use it in that step's condition:
and(succeeded(), eq(variables['run.step1'], 'YES'))
See: Specify conditions
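Putting the two together, a minimal sketch in YAML (the step names and the run.step1 variable are placeholders, and the date command assumes a Linux agent):

steps:
- bash: |
    # Flag step 1 to run on Mondays only (date +%u prints 1-7, Mon = 1)
    if [ "$(date +%u)" = "1" ]; then
      echo "##vso[task.setvariable variable=run.step1]YES"
    fi
  displayName: Plan which steps run today

- script: echo running step 1
  displayName: Step 1
  condition: and(succeeded(), eq(variables['run.step1'], 'YES'))

The same pattern extends to templates: put the condition on the stage or job that consumes the template.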

Can you control how the tasks are executed in Azure Pipelines?

I have built a pipeline with 4 tasks
Task 1 Builds a VM
Task 2 Add a Data Disk
Task 3 Add a Second Data Disk
Task 4 Add a Third Data Disk
However, if I only want Task 1 and Task 2 to execute, how can I skip Tasks 3 and 4? For example, if the user only wants one data disk. I know they can be disabled manually, but is there a way to automate this based on a variable?
Every stage, job, and task has a condition property. You can use a condition expression to decide which tasks run and when, and you can reference variables in such expressions. By promoting a variable to a queue-time variable, you can let the user control it.
Make sure you prepend each condition with succeeded() to ensure the previous steps completed successfully.
condition: and(succeeded(), gt(variables.Disks, 2))
See:
Expressions
Specify Conditions
Define Variables - Allow at queue time
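For the example above, a sketch of the disk tasks (Disks is a hypothetical variable defined in the pipeline's UI variables with "Settable at queue time" / "Let users override this value when running this pipeline" enabled):

steps:
- script: echo add first data disk    # Task 2: always runs
- script: echo add second data disk   # Task 3: only when Disks > 1
  condition: and(succeeded(), gt(variables.Disks, 1))
- script: echo add third data disk    # Task 4: only when Disks > 2
  condition: and(succeeded(), gt(variables.Disks, 2))

Queueing a run with Disks = 1 then skips Tasks 3 and 4 automatically.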

Conditional component

Is there a way to:
Get a particular pipeline's (say P1's) status (failed / completed) in a conditional component in pipeline P2?
Can we call a pipeline from a conditional component?
Use case:
I have functional pipelines F1, F2, F3, etc., and audit pipelines audit_success and audit_failure. If I could get F3's status in one single audit pipeline, I could have two branches in the same pipeline, thereby avoiding the creation of two pipelines.
There is no conditional component that checks another pipeline's status. However, you can achieve this through pipeline triggers, though as you mentioned that does require two different pipelines.

Automation for ADF V2 Pipeline

I need help implementing the requirement below:
There is one ADF pipeline that runs every two hours (with a tumbling window trigger). Now I need to create one more pipeline that will be used for a maintenance job; it is scheduled to run once a month (with a schedule trigger). Here is what I'm trying to implement:
Before running the second pipeline, I need to make sure the first pipeline is not running (basically get its status and, if it's running, wait for its completion) and then disable the trigger associated with it.
Run the second pipeline and, after its completion, enable the trigger associated with the first pipeline.
Please let me know if this can be achieved within ADF or whether some kind of custom scripting is needed.
First, your idea is achievable.
Second, there is no built-in feature in Azure Data Factory that does this.
Basically, you need an Azure Function (a simple HTTP trigger with no input, so you can hit and execute it directly) to do what ADF can't. From your description, the executions of these two pipelines are mutually exclusive, so in the function you can use the SDK to check the status of the other pipeline; if it is running, wait a few seconds and re-check. (In short, put the main logic and code in the Azure Function.)
Simple Azure Function:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-http-webhook-trigger?tabs=csharp
Use the SDK to monitor:
https://learn.microsoft.com/en-us/azure/data-factory/monitor-programmatically#net
(The links are for C#; you can choose another supported language.)
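For illustration, here is a sketch of that function in Python using the azure-identity and azure-mgmt-datafactory packages. The pipeline, trigger, and resource names are placeholders; note it stops the trigger before checking the status, so that a new run cannot start in between:

import time
from datetime import datetime, timedelta

import azure.functions as func
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    RunFilterParameters, RunQueryFilter,
    RunQueryFilterOperand, RunQueryFilterOperator,
)

SUB, RG, DF = "<subscription-id>", "<resource-group>", "<factory-name>"

def main(req: func.HttpRequest) -> func.HttpResponse:
    client = DataFactoryManagementClient(DefaultAzureCredential(), SUB)

    # Stop the tumbling window trigger first so no new run can start
    # while we wait for in-flight runs to drain.
    client.triggers.begin_stop(RG, DF, "TwoHourlyTrigger").result()

    # Wait until no run of the two-hourly pipeline is still active.
    while True:
        runs = client.pipeline_runs.query_by_factory(
            RG, DF,
            RunFilterParameters(
                last_updated_after=datetime.utcnow() - timedelta(hours=3),
                last_updated_before=datetime.utcnow(),
                filters=[
                    RunQueryFilter(
                        operand=RunQueryFilterOperand.PIPELINE_NAME,
                        operator=RunQueryFilterOperator.EQUALS,
                        values=["TwoHourlyPipeline"]),
                    RunQueryFilter(
                        operand=RunQueryFilterOperand.STATUS,
                        operator=RunQueryFilterOperator.IN,
                        values=["InProgress", "Queued"]),
                ]))
        if not runs.value:
            break
        time.sleep(30)  # re-check every 30 seconds

    # Run the monthly maintenance pipeline, then re-enable the trigger.
    run = client.pipelines.create_run(RG, DF, "MaintenancePipeline")
    # (Optionally poll client.pipeline_runs.get(RG, DF, run.run_id)
    # until its status leaves "InProgress" before restarting.)
    client.triggers.begin_start(RG, DF, "TwoHourlyTrigger").result()

    return func.HttpResponse(f"Maintenance run {run.run_id} started.")

Keep in mind an HTTP-triggered function has a limited execution time (about 5-10 minutes on the Consumption plan), so a Durable Function or a longer host timeout may be needed if the wait can be long.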

Run lots of pipelines at the same time

Is there a solution if I want to run those pipelines at the same time, instead of doing it for each pipeline individually?
Just add a trigger with the same start time to all of your pipelines.
In the ADF portal, set the same time in each trigger's configuration.
If you want to execute them one after another, you could use the Execute Pipeline activity, which allows you to invoke another pipeline.
You could also leverage a Lookup activity to read the pipeline list from metadata or a pipeline parameter table, then set the ForEach loop to parallel execution so that it processes up to 50 pipelines at once; see the sketch after the link below.
See ForEach Article for more info: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity#parallel-execution
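For illustration, a sketch of such a ForEach in ADF JSON; LookupPipelineList and pl_child are hypothetical names, and because the Execute Pipeline activity needs a static pipeline reference, the per-row values are passed as parameters into one generic child pipeline:

{
    "name": "RunAllInParallel",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@activity('LookupPipelineList').output.value",
            "type": "Expression"
        },
        "isSequential": false,
        "batchCount": 50,
        "activities": [
            {
                "name": "RunChild",
                "type": "ExecutePipeline",
                "typeProperties": {
                    "pipeline": {
                        "referenceName": "pl_child",
                        "type": "PipelineReference"
                    },
                    "waitOnCompletion": true,
                    "parameters": {
                        "tableName": "@{item().TableName}"
                    }
                }
            }
        ]
    }
}

isSequential: false with batchCount: 50 gives the up-to-50-at-once parallelism mentioned above.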