How to compare Dev and Prod Azure Data Factory implementations? - azure-data-factory

I may have accidentally modified a Data Factory pipeline implemented by another developer, but I don't know what was changed because there were so many modifications.
Is there any tool to compare the development and production environments and see how they differ from each other?

You can extract the pipeline code from the two environments and compare the versions using one of the comparison (diff) tools freely available online.
You can also go through this document to use the PowerShell cmdlets to get information about the pipelines in an Azure Data Factory.
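For example, here is a minimal PowerShell sketch (assuming the Az.DataFactory module, a signed-in session, and placeholder resource group, factory, and pipeline names) that exports the same pipeline definition from both factories and diffs the JSON:

# Minimal sketch: export one pipeline's definition from both factories and diff it.
# Requires the Az.DataFactory module and an authenticated session (Connect-AzAccount).
# The resource group, factory, and pipeline names below are placeholders.
$devPipeline  = Get-AzDataFactoryV2Pipeline -ResourceGroupName 'rg-dev'  -DataFactoryName 'adf-dev'  -Name 'MyPipeline'
$prodPipeline = Get-AzDataFactoryV2Pipeline -ResourceGroupName 'rg-prod' -DataFactoryName 'adf-prod' -Name 'MyPipeline'

# Serialize both definitions to JSON so they can be compared line by line.
$devJson  = $devPipeline  | ConvertTo-Json -Depth 20
$prodJson = $prodPipeline | ConvertTo-Json -Depth 20

# Show only the lines that differ between the two environments.
Compare-Object ($devJson -split "`n") ($prodJson -split "`n")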

Related

Where Is The History Of The Azure Classic Pipelines Kept?

I prefer the classic editor in Azure Pipelines because I can see at a glance, within a second or two, what is happening.
I see the benefit of versioning the pipeline; I just don't want to have to learn another DSL/schema if I don't have to.
The classic view has a history of changes and the content of the JSON it creates is very clear.
Where is this history stored, and is it possible to direct it to the same repo I would have used for the YAML pipelines?

How can I get AzureDevOps build variables information through the Rest API's?

I have a tool aztr that summarizes the Azure Build/Release pipeline test results. A recent requirement has come up to save the incoming variable information and the summary information for consumption outside of the tool (say a csv file).
On the Release pipelines side, the Release API provides all the details about the variables passed into the Release. I want the same functionality on the Build side as well, but the Build API does not provide it. Is there a different API I need to use to get the variables passed into the build?
Thank you for your responses.
You can list the custom build variables via this API:
GET https://dev.azure.com/{organization}/{project}/_apis/build/definitions/{definitionId}?api-version=6.0
In addition, some of the variables are predefined Azure DevOps Services system variables; those cannot be listed via this API.
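As a rough sketch (the organization, project, definition ID, and PAT environment variable below are placeholders), you can call that endpoint from PowerShell and read the variables property of the returned definition:

# Rough sketch: read the custom variables of a build definition via the REST API.
# Organization, project, definition ID, and the PAT environment variable are placeholders.
$org          = 'myorg'
$project      = 'myproject'
$definitionId = 123
$pat          = $env:AZURE_DEVOPS_PAT   # personal access token with Build (Read) scope

# Azure DevOps expects the PAT as the password part of a Basic authentication header.
$headers = @{ Authorization = 'Basic ' + [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes(":$pat")) }

$url = "https://dev.azure.com/$org/$project/_apis/build/definitions/${definitionId}?api-version=6.0"
$definition = Invoke-RestMethod -Uri $url -Headers $headers -Method Get

# The custom variables are returned on the definition itself; system variables are not included.
$definition.variables | ConvertTo-Json -Depth 5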

Is it possible to deploy only updated Azure Function Projects when I push a repo of the entire Solution?

I need to refactor a .Net Web API, I'm considering moving to serverless and I'm trying to understand the best option to migrate the code to Azure Functions.
As far as I understand, the correct approach to reduce costs and cold start time is to split the API: it is much better to have many small web APIs than a single one with all the methods. Small APIs consume less memory and cold start more quickly.
Having more Functions in the same project does not solve the problem, as they would all be deployed to the same Function App: one DLL, high memory, slow cold start.
So I should create several Azure Function Projects and deploy each of them in a different Function App.
If all the above is correct we finally got to the problem:
I would structure the code and the repo so that I have one Solution containing several Azure Function Projects. How can I set up CI/CD (Azure DevOps) so that when I push the repo, ONLY the Azure Function Projects that were updated/modified/added are deployed? I need to deploy only the modified Azure Function Projects so as not to have all the Function Apps (including the ones whose code is unchanged) go cold.
This is less important but I'd also need to have one URL for all APIs, so https://myapi.azurewebsites.net/api/Function1, https://myapi.azurewebsites.net/api/Function2, etc and not https://myapi1.azurewebsites.net/api/Function1, https://myapi2.azurewebsites.net/api/Function1, etc. Is this possible using the above structure?
You need to have multiple CI/CD pipelines, each with a trigger limited to a specific folder:
trigger:
  paths:
    include:
      - function-a/*
    exclude:
      - '*'
With this, the pipeline gets triggered only if changes are made in the function-a folder. To limit the work needed to develop the pipelines, you should consider using templates. You can find more info about this here:
Template types & usage
Build templates on Azure DevOps - this is my own blog
In this way you will avoid repeating yourself.
EDIT
To unify your API you can use Azure Functions Proxies
With this feature, you can specify endpoints on your function app that are implemented by another resource. You can use these proxies to break a large API into multiple function apps (as in a microservice architecture), while still presenting a single API surface for clients.

For Azure Data Factories is there a way to 'Validate all' using powershell rather than the GUI?

A working Azure Data Factory (ADF) exists that contains pipelines with activities that are dependent on database tables
The definition of a database table changes
The next time the pipeline runs it fails
Of course we can set up something so it fails gracefully but ...
I need to proactively execute a scheduled PowerShell script that iterates through all ADFs (iterating is easy) to do the equivalent of the 'Validate All' functionality that the GUI provides (validating is impossible?)
I do realise that the Utopian CI/CD DevOps environment I dream about will one day in the next year or so achieve this via other ways
I need the automation validation method today - not in a year!
I've looked at what I think are all of the PowerShell cmdlets available, and short of somehow deleting and redeploying each ADF (fraught with danger), I can't find a simple method to validate an Azure Data Factory via PowerShell.
Thanks in advance
In the ".Net" SDK, each of the models has a "Validate()" method. I have not yet found anything similar in the Powershell commands.
In my experience, the (GUI) validation is not foolproof. Some things are only tested at runtime.
I know it has been a while, and you said you didn't want to wait a year for the validation - but after a couple of years we finally have both the Validate all and Export ARM template features from the Data Factory user experience available via a publicly available npm package, @microsoft/azure-data-factory-utilities. The full guidance can be found in this documentation.
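As a rough PowerShell sketch (assuming a package.json that depends on @microsoft/azure-data-factory-utilities and exposes it through an npm 'build' script as described in that documentation; the subscription ID, resource group, and folder/factory names are placeholders), you could loop over your factories' code folders and validate each one:

# Rough sketch: run the npm-based 'Validate all' for each factory's code folder.
# Assumes package.json wires up @microsoft/azure-data-factory-utilities behind an npm 'build' script.
# Subscription ID, resource group, and folder/factory names are placeholders.
$subscriptionId = '00000000-0000-0000-0000-000000000000'
$resourceGroup  = 'my-rg'

# Map each local ADF code folder to the factory it represents.
$factories = @{
    './adf-one' = 'adf-one'
    './adf-two' = 'adf-two'
}

npm install   # restore @microsoft/azure-data-factory-utilities

foreach ($folder in $factories.Keys) {
    $factoryId = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup" +
                 "/providers/Microsoft.DataFactory/factories/$($factories[$folder])"
    # 'validate' checks all the resources in the folder against the given factory resource id.
    npm run build validate $folder $factoryId
}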

Azure Data Factory development with multiple users without using Source Control

While working on a single Azure Data Factory solution with no source control, is it possible for a team of 3 or more developers to work in parallel without corrupting the main JSON?
Scenario:
All developers are accessing the same ADF and working on different pipelines at the same time. If one of the developers publishes his/her updates, does it somehow overwrite or ignore the changes other developers are publishing?
I tested and found that:
Multiple users can access the same Data Factory and work with different pipelines at the same time.
Publish only affects the current user and the current pipeline the user is developing and editing. It won't overwrite other pipelines.
For your question:
Is it possible for a team of 3 or more developers to work in parallel without corrupting the main JSON?
Yes, it's possible.
If one of the developers publishes his/her updates, does it somehow overwrite or ignore the changes other developers are publishing?
No, it doesn't. For example, if user A only develops pipeline A and then publishes, the publish only affects that pipeline; it won't overwrite or affect other pipelines.
You could test and prove it.
Update:
Thanks @V_Singh for sharing the Microsoft suggestion with us:
Microsoft suggests using CI/CD only; otherwise there will be some disparity in the code.
Reply from Microsoft:
"In Live Mode can hit unexpected errors if you try to publish because you may have not the latest version ( For Example user A publish, user B is using old version and depends on an old resource and try to publish) not possible. Suggested to please use Git, since it is intended for collaborative scenarios."
Hope this helps.