Azure Data Factory Tasks Queued

I've created a new ADF pipeline which is working well but gives me some concern over performance.
As an example, here's a task from the pipeline that copies a small blob from one container to another in the same storage account.
Notice that it sits at "Queued" for 58 seconds.
The pipeline uses "Managed Virtual Network" integration runtime because it makes use of Azure SQL Private Endpoints.
Any ideas why the copy data tasks are held at "Queued" for so long?

Since you mentioned that your pipeline uses the "Managed Virtual Network" integration runtime, this behavior is expected. Per the documentation on Activity execution time using managed virtual network:
By design, an Azure integration runtime in a managed virtual network takes longer queue time than the global Azure integration runtime, as we are not reserving one compute node per data factory, so there is a warm-up for each activity to start, and it occurs primarily on virtual network join rather than the Azure integration runtime. For non-copy activities, including pipeline activities and external activities, there is a 60-minute Time To Live (TTL) when you trigger them for the first time. Within the TTL, the queue time is shorter because the node is already warmed up.
This 60-minute Time To Live (TTL) feature in the "Managed Virtual Network" IR shortens the queue time because the node is already warmed up, but unfortunately the Copy activity doesn't have TTL support yet.
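If you want to quantify the queue time rather than eyeball it in the monitoring UI, you can pull activity-run details over the Data Factory REST API. Below is a minimal Python sketch, assuming the queryActivityRuns endpoint (api-version 2018-06-01) and a durationInQueue field on the copy activity output; all resource names are placeholders, and the exact output shape should be verified against your own factory's monitoring data.

```python
# Minimal sketch: list activity runs for one pipeline run and print how long
# each sat in the queue. Resource names are placeholders; inspect the raw
# "output" if your factory reports queue time under a different field.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FACTORY = "<factory-name>"
RUN_ID = "<pipeline-run-id>"

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default").token
url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
    f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.DataFactory"
    f"/factories/{FACTORY}/pipelineruns/{RUN_ID}/queryActivityRuns"
    "?api-version=2018-06-01"
)
# The endpoint requires a time window bounding the run's last-updated time.
body = {"lastUpdatedAfter": "2024-01-01T00:00:00Z",
        "lastUpdatedBefore": "2024-01-02T00:00:00Z"}
resp = requests.post(url, json=body,
                     headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
for run in resp.json().get("value", []):
    # Copy activity output typically reports queue time under
    # "durationInQueue" (treated here as an assumption).
    queue_info = (run.get("output") or {}).get("durationInQueue", {})
    print(run["activityName"], run["status"], queue_info)
```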

Related

Azure Data Factory: what happens if Self-Hosted IR is down

Let's say we need to maintain and reboot a Self-Hosted Integration Runtime machine, and we only have one node. At the same time, some pipelines may be running. What will happen to activities that are normally scheduled on this SHIR? Will they fail immediately once it's not available, or will they remain in the "waiting" state up to their maximum Timeout value until a runtime comes back up?
I'd assume it's the latter but wanted to confirm.
I did a quick trial by stopping the Self-hosted IR service.
In ADF, the test connection from linked services returned an error, and a Copy activity that involves the self-hosted IR failed immediately rather than waiting for its timeout.
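Given that activities fail immediately, it can help to check the runtime's state before taking the node down for maintenance. Here is a rough sketch using the azure-mgmt-datafactory SDK, assuming its integration_runtimes.get_status operation and the node-level attribute names shown; all resource names are placeholders, and attribute names may differ across SDK versions.

```python
# Sketch: check a Self-Hosted IR's state before rebooting its only node.
# Assumes azure-mgmt-datafactory's get_status operation; inspect the returned
# object if node attribute names differ in your SDK version.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(),
                                     "<subscription-id>")
status = client.integration_runtimes.get_status(
    "<resource-group>", "<factory-name>", "<shir-name>")
props = status.properties
print("IR state:", props.state)  # e.g. Online / Offline
for node in getattr(props, "nodes", None) or []:
    print("node:", node.node_name, "status:", node.status)
```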

Build agent metrics in Azure DevOps pipelines

We pay for a number of Microsoft-hosted build agents in Azure Pipelines. We have a lot of build pipelines, many of which run jobs in parallel.
Are there any metrics I can use to see the utilization of the build agents and, even more interesting, how many jobs are queued waiting for a free build agent?
Since this would be for the whole Azure DevOps instance, the Dashboard feature doesn't seem appropriate because it only holds project-specific metrics.
Go to your Organization Settings > Parallel jobs blade. This will give you the ability to view the jobs in progress.
As for metrics, a public preview for this just came out; however, I do not have it available yet.
Agent pool usage data is sampled and aggregated by the Analytics service every 10 mins. The number of jobs is plotted based on the max number of running jobs for the specified interval of time.
This feature is enabled by default. To try it out, follow the guidance below.
Within project settings, navigate to the pipelines "Agent pools" tab.
From the agent pool, select a pool (e.g., Azure Pipelines).
Within the pool, select the "Analytics" tab.
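If you want the raw numbers outside the Analytics tab, the job-request data that backs the pool UI is also reachable over REST. Below is a rough Python sketch, assuming the distributedtask/pools/{poolId}/jobrequests endpoint (it drives the pool UI but is only lightly documented, so treat it as an assumption) and a PAT with agent-pool read scope; ORG and PAT are placeholders.

```python
# Sketch: count queued vs. running jobs per agent pool. A job request with no
# assignTime is still waiting for a free agent; one with an assignTime but no
# finishTime is currently running.
import requests

ORG = "<organization>"
PAT = "<personal-access-token>"
BASE = f"https://dev.azure.com/{ORG}/_apis/distributedtask"

pools = requests.get(f"{BASE}/pools?api-version=7.1",
                     auth=("", PAT)).json()["value"]
for pool in pools:
    reqs = requests.get(
        f"{BASE}/pools/{pool['id']}/jobrequests?api-version=7.1",
        auth=("", PAT)).json()["value"]
    queued = sum(1 for r in reqs if "assignTime" not in r)
    running = sum(1 for r in reqs
                  if "assignTime" in r and "finishTime" not in r)
    print(f"{pool['name']}: queued={queued} running={running}")
```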

Why is my Azure DevOps Migration timing out after several hours?

I have a long-running Migration (don't ask) being run by an Azure DevOps Release pipeline.
Specifically, it's an "Azure SQL Database deployment" activity, running a "SQL Script File" Deployment Type.
Despite having configured maximums in all the timeouts in the Invoke-Sql Additional Parameters settings, my migration is still timing out.
Specifically, I get:
We stopped hearing from agent Hosted Agent. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error.
So far it's timed out after:
6:13:15
6:13:18
6:14:41
6:10:19
So "after 6 and a bit hours". It's ~22,400 seconds, which doesn't seem like any obvious kind of number either :)
Why? And how do I fix it?
It turns out that Azure DevOps uses Hosted Agents to execute each Task in a pipeline, and those Agents have innate lifetimes, independent of whatever task they're running.
https://learn.microsoft.com/en-us/azure/devops/pipelines/troubleshooting/troubleshooting?view=azure-devops#job-time-out
A pipeline may run for a long time and then fail due to job time-out. Job timeout closely depends on the agent being used. Free Microsoft hosted agents have a max timeout of 60 minutes per job for a private repository and 360 minutes for a public repository. To increase the max timeout for a job, you can opt for any of the following.
Buy a Microsoft hosted agent which will give you 360 minutes for all jobs, irrespective of the repository used
Use a self-hosted agent to rule out any timeout issues due to the agent
Learn more about job timeout.
So I'm hitting the "360-minute" limit (presumably they give you a little extra on top, so that no one complains?).
The solution is to use a self-hosted agent (or make my Migration run in under 6 hours, of course).
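For what it's worth, the "not an obvious number" observation checks out once you convert the logged failure times to seconds and compare them with the 360-minute cap; each run dies roughly 10 to 15 minutes past the nominal 21,600-second limit. A quick check:

```python
# Convert the observed failure durations to seconds and compare against the
# 360-minute (21,600 s) Microsoft-hosted job limit.
observed = ["6:13:15", "6:13:18", "6:14:41", "6:10:19"]
CAP = 360 * 60  # 21,600 seconds
for t in observed:
    h, m, s = map(int, t.split(":"))
    total = h * 3600 + m * 60 + s
    print(f"{t} -> {total} s ({total - CAP:+} s vs. the nominal cap)")
```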

Stopping Cloud Data Fusion Instance

I have production pipelines in Google Data Fusion which only run for a couple of hours. I would like to stop the Data Fusion instance and start it again the next day. I don't see an option to stop the instance. Is there any way we can stop the instance and start the same instance again?
By design, a Data Fusion instance runs in a GCP tenancy unit, which guarantees the user a fully automated way to manage all the cloud resources and services (GKE cluster, Cloud Storage, Cloud SQL, Persistent Disk, Elasticsearch, Cloud KMS, etc.) used for storing, developing, and executing customer pipelines. Therefore, there is no way to stop or terminate a Data Fusion instance; the pipeline execution resources are launched on demand and cleared after pipeline completion. See the pricing documentation for how charges are calculated.

Azure DevOps - Running multiple releases from a single release definition

I am trying to invoke multiple releases of a release definition using the REST API. I have also enabled multiple agents for each Agent job. But even after triggering multiple releases, the second release stays in the queue and never starts. Is there any way to start deployments in parallel from a single release definition?
Parallel jobs have different restrictions depending on the agents you use and on whether your project is public or private.
Microsoft-hosted agent
If your jobs run on the pool of Microsoft-hosted agents, Microsoft provides a free tier of service by default in every organization:
Public project: 10 free Microsoft-hosted parallel jobs that can run for up to 360 minutes (6 hours) each time, with no overall time limit per month.
Private project: One free parallel job that can run for up to 60 minutes each time, until you've used 1,800 minutes (30 hours) per month.
Note: When you purchase your first Microsoft-hosted parallel job, the number of parallel jobs you have in the organization is still one. To be able to run two jobs concurrently, you will need to purchase two parallel jobs if you are currently on the free tier; the first purchase only removes the time limits on the first job.
Self-hosted agent
To use self-hosted parallel jobs, you need to deploy self-hosted agents on your machines. You can register any number of these self-hosted agents in your organization. Microsoft charges based on the number of jobs you want to run at a time, not the number of agents registered.
Public project: Unlimited parallel jobs.
Private project: One self-hosted parallel job. Additionally, for each active Visual Studio Enterprise subscriber who is a member of your organization, you get one additional self-hosted parallel job.
For private projects, when the free tier is no longer sufficient, you can pay for additional capacity per parallel job (buy self-hosted parallel jobs). There are no time limits on self-hosted jobs.
These limits are stated in the documentation; you can refer to the link in Daniel's comment for details.
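To see the queueing behavior concretely, here is a rough sketch of triggering several releases from one definition via the Releases - Create REST call; with a single parallel job they are all accepted but deploy one at a time. DEFINITION_ID and the credentials are placeholders, and definitions whose artifacts lack default versions would also need an artifacts array in the request body.

```python
# Sketch: create three releases from one definition. They will queue behind
# the organization's parallel-job limit rather than deploy simultaneously.
import requests

ORG, PROJECT = "<organization>", "<project>"
PAT = "<personal-access-token>"
DEFINITION_ID = 42  # hypothetical release definition id

url = (f"https://vsrm.dev.azure.com/{ORG}/{PROJECT}"
       "/_apis/release/releases?api-version=7.1")
for i in range(3):
    resp = requests.post(
        url,
        json={"definitionId": DEFINITION_ID,
              "description": f"parallel test {i}"},
        auth=("", PAT))
    resp.raise_for_status()
    print("created release id", resp.json()["id"])
```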