Use Lookup and For Each Iteration to pull data from different analytics.dev.azure.com projects - azure-data-factory

Hi, I would just like to ask if this is possible. I am currently working in ADF. What I want to do is get work items from analytics.dev.azure.com/[Organization]/[Project] and copy them to a SQL database. I am already doing this for one project, but I want to do it for multiple projects without creating multiple copy tasks within ADF, just by running a Lookup into a ForEach to iterate through all the team analytics URLs. Is there any way to do this?

We can use Lookup and ForEach activities to copy data into SQL DB tables from all the URLs. Below are the steps:
Create a lookup table which contains the entire list of URLs.
In the ForEach activity's settings, enter the following in Items to get the output of the Lookup activity:
@activity('Lookup1').output.value
Inside the ForEach activity, use a Copy activity.
In the source, create a dataset and an HTTP linked service. Enter the base URL and relative URL. I have stored the relative URLs in the lookup table, so I have given @{item().url} as the relative URL.
In the sink, create an Azure SQL Database table for each item in the ForEach activity, or use existing tables and copy data into them.
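Putting these steps together, a minimal pipeline sketch could look like the following (names such as Lookup1, UrlListDataset, and the url column are assumptions based on the steps above, and the Copy activity's source and sink are omitted for brevity):

{
  "name": "CopyWorkItemsPipeline",
  "activities": [
    {
      "name": "Lookup1",
      "type": "Lookup",
      "typeProperties": {
        "source": { "type": "AzureSqlSource" },
        "dataset": { "referenceName": "UrlListDataset", "type": "DatasetReference" },
        "firstRowOnly": false
      }
    },
    {
      "name": "ForEach1",
      "type": "ForEach",
      "dependsOn": [ { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] } ],
      "typeProperties": {
        "items": { "value": "@activity('Lookup1').output.value", "type": "Expression" },
        "activities": [ { "name": "CopyToSql", "type": "Copy" } ]
      }
    }
  ]
}

The relative URL of the HTTP source dataset is then set to the dynamic expression @{item().url}.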

Related

How to copy a dashboard to a different Azure DevOps instance?

I have a requirement to copy an existing dashboard in one org (source org) to a different org (target org) under a different ADO instance, by any means possible. A dashboard can have widgets, and widgets can be linked to:
a pipeline
a query - a query can reference a user, a team, a project, custom values of a standard field, or custom fields
some other things that I have not encountered so far
So far my steps are as follows:
1. Get the dashboard details using the Get Dashboard REST API.
2. Identify the widgets in the dashboard details API response.
3. Get any pipeline; if there is no pipeline, create a dummy one and use its details for any widget that uses a pipeline.
4. Identify the distinct queries present across all widgets.
5. Create each query's equivalent in the target org and save its ID.
6. Replace the query ID in the widget settings with the equivalent query ID created in the target org.
7. Create the dashboard in the target org.
I am facing an issue in step 5. There are a lot of moving parts in a query. A query might reference things that do not exist in the target org, such as a particular user, team, custom values of a standard field, or custom fields. In order to create a query successfully I need to know the possible values of a field in the target org. When creating a new query from the UI, it shows the possible values for a field in a dropdown, so I am wondering whether there is any REST API that gives the possible values of a field, and that throws an error if no such field exists in the target org.
Looking forward to suggestions for a simpler or alternative approach to replicating a dashboard across different ADO instances, and/or a better approach for step 5.
If you are looking for a REST API to query the fields in the target process of your organization, you can refer to this doc: Fields - List.
GET https://dev.azure.com/{organization}/_apis/work/processes/{processId}/workItemTypes/{witRefName}/fields?api-version=6.0-preview.2
After that, you can create the fields in your target process; refer to this REST API: Fields - Create.
POST https://dev.azure.com/{organization}/_apis/work/processdefinitions/{processId}/fields?api-version=4.1-preview.1
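As an illustration only, a Fields - Create request body might look roughly like the sketch below; the property names and values are assumptions, so check the linked doc for the exact contract of your api-version:

{
  "name": "MyCustomField",
  "type": "string",
  "description": "Field replicated from the source org"
}

The Fields - List response from the target process can then be compared against the fields a query references before attempting to recreate it.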
Or, could you share more details of your requirement, such as screenshots and the widget definitions or dashboard configuration, so the answer can be updated?

Nested ForEach in ADF (Azure Data Factory)

I have a set of JSON files that I want to browse; in each file there is a field that contains a list of links pointing to images. The goal is to download each image from the links using binary format (I tested with several links and it already works).
My problem is the nested ForEach: I manage to browse all the JSON files, but when I add a second ForEach to browse the links and run a Copy data activity (via an Execute Pipeline) to download the images, I get this error:
"ErrorCode=InvalidTemplate, ErrorMessage=cannot reference action 'Copy data1'. Action 'Copy data1' must either be in 'runAfter' path, or be a Trigger"
Example of file:
t1.json
{
  "type": "jean",
  "image": [
    "pngmart.com/files/7/Denim-Jean-PNG-Transparent-Image.png",
    "https://img2.freepng.fr/20171218/882/men-s-jeans-png-image-5a387658387590.0344736015136497522313.jpg",
    "https://img2.freepng.fr/20171201/ed5/blue-jeans-png-image-5a21ed9dc7f436.281334271512172957819.jpg"
  ]
}
t2.json
{
  "type": "socks",
  "image": [
    "https://upload.wikimedia.org/wikipedia/commons/thumb/5/52/Fun_socks.png/667px-Fun_socks.png",
    "https://upload.wikimedia.org/wikipedia/commons/e/ed/Bulk_tube_socks.png",
    "https://cdn.picpng.com/socks/socks-face-30640.png"
  ]
}
Do you have a solution?
Thanks
As per the documentation, you cannot nest ForEach activities in Azure Data Factory (ADF) or Synapse Pipelines, but you can use the Execute Pipeline activity to create nested pipelines, where the parent has a ForEach activity and the child pipeline does too. You can also chain ForEach activities one after the other, but not nest them.
Excerpt from the documentation:
Limitation: You can't nest a ForEach loop inside another ForEach loop (or an Until loop).
Workaround: Design a two-level pipeline where the outer pipeline with the outer ForEach loop iterates over an inner pipeline with the nested loop.
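A minimal sketch of the outer ForEach calling the inner pipeline could look like this (the pipeline and activity names, and the filename parameter, are assumptions for illustration):

{
  "name": "OuterForEach",
  "type": "ForEach",
  "typeProperties": {
    "items": { "value": "@activity('Get Metadata1').output.childItems", "type": "Expression" },
    "activities": [
      {
        "name": "RunInnerPipeline",
        "type": "ExecutePipeline",
        "typeProperties": {
          "pipeline": { "referenceName": "InnerPipeline", "type": "PipelineReference" },
          "parameters": { "filename": { "value": "@item().name", "type": "Expression" } },
          "waitOnCompletion": true
        }
      }
    ]
  }
}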
It may be that multiple nested pipelines are not what you want, in which case you could pass this looping off to another activity, e.g. a Stored Procedure, Databricks Notebook, Synapse Notebook (if you're in Azure Synapse Analytics), etc. One example here might be to load the JSON files into a table (or dataframe), extract the filenames once, and then loop through that list rather than through each file. Just an idea.
I have repro'd this and was able to copy all the links by looping the Copy data activity inside the ForEach activity and using the Execute Pipeline activity.
Parent pipeline:
If you have multiple JSON files, get the file list using the Get Metadata activity.
Loop over the child items using the ForEach activity and add the Execute Pipeline activity to get the data from each file, passing the current item as a parameter (@item().name).
Child pipeline:
Create a parameter to store the file name from the parent pipeline.
Using the lookup activity, get the data from the current JSON file.
Filename property: @pipeline().parameters.filename
Here I have added https:// to your first image link, as otherwise it does not validate in the Copy activity and gives an error.
Pass the output to the ForEach activity and loop through each image value.
@activity('Lookup1').output.value[0].image
Add a Copy data activity inside the ForEach activity to copy each link from source to sink.
I have created a binary dataset with the HttpServer linked service and created a parameter for the base URL in the linked service.
Passing the linked service parameter value from the dataset.
Pass the dataset parameter value from the copy activity source to use the current item (link) in the linked service.
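A sketch of the child pipeline tying these steps together might look like this (dataset and activity names are assumptions, and the Copy activity's source and sink are omitted for brevity):

{
  "name": "InnerPipeline",
  "parameters": { "filename": { "type": "string" } },
  "activities": [
    {
      "name": "Lookup1",
      "type": "Lookup",
      "typeProperties": {
        "source": { "type": "JsonSource" },
        "dataset": {
          "referenceName": "JsonFileDataset",
          "type": "DatasetReference",
          "parameters": { "filename": { "value": "@pipeline().parameters.filename", "type": "Expression" } }
        }
      }
    },
    {
      "name": "ForEachImage",
      "type": "ForEach",
      "dependsOn": [ { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] } ],
      "typeProperties": {
        "items": { "value": "@activity('Lookup1').output.value[0].image", "type": "Expression" },
        "activities": [ { "name": "CopyImage", "type": "Copy" } ]
      }
    }
  ]
}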

Azure Data Factory "Copy data" activity not working for ODATA source with forward slash ("/") in the endpoint's column list

I am trying to do a simple Copy Data activity in Azure Data Factory.
My source dataset is an OData endpoint, which has a $select filter (to specify columns).
All columns are loaded just fine in my destination (SQL Server); only the column "specialField/Custom:81" is missing. When I click "preview data", or simply run the Copy data activity, I get all fields except this one.
It seems clear that this is because the field name contains special characters. How do I fix this? I can easily retrieve data from this field in Postman, so it is a Data Factory issue.

Azure data factory pipeline - copy blob and store filename in a DocumentDB or in Azure SQL

I set up two blob storage folders called "input" and "output". My pipeline gets triggered when a new file arrives in "input" and copies that file to the "output" folder. Furthermore, I have a Get Metadata activity where I receive the copied filename(s).
Now I would like to store the filename(s) of the copied data into a DocumentDB.
I tried to use the ForEach activity for this, but here I am stuck.
Basically I tried to use parts from this answer: Add file name as column in data factory pipeline destination
But I don't know what to assign as the source in the Copy data activity, since my source is the filenames from the ForEach activity - or am I wrong?
Based on your requirements, I suggest using a blob-triggered Azure Function in combination with your current Azure Data Factory setup.
Step 1: Still use the event trigger in ADF to transfer files between input and output.
Step 2: Point a blob-triggered Azure Function at the output folder.
Step 3: The function will be triggered as soon as a new file is created there. Get the file name and use the Document DB SDK to store it in Document DB.
.NET Document DB SDK: https://learn.microsoft.com/en-us/azure/cosmos-db/sql-api-sdk-dotnet
For blob trigger bindings, please refer to: https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob
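To sketch the wiring, the function's function.json could declare a blob trigger on the output container together with a Cosmos DB (Document DB) output binding; the database, collection, and connection setting names below are assumptions:

{
  "bindings": [
    {
      "name": "inputBlob",
      "type": "blobTrigger",
      "direction": "in",
      "path": "output/{name}",
      "connection": "AzureWebJobsStorage"
    },
    {
      "name": "outputDocument",
      "type": "cosmosDB",
      "direction": "out",
      "databaseName": "filenamesdb",
      "collectionName": "filenames",
      "createIfNotExists": true,
      "connectionStringSetting": "CosmosDBConnection"
    }
  ]
}

The function body then only has to write a small document (for example, one containing the file name) to the output binding; the {name} token in the trigger path gives it the blob's file name.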
You may try using a custom activity to insert the filenames into Document DB.
You can pass the filenames as parameters to the custom activity and write your own code to insert the data into Document DB.
https://learn.microsoft.com/en-us/azure/data-factory/transform-data-using-dotnet-custom-activity
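For illustration, the custom activity definition might look like the sketch below; the activity name, command, and the Azure Batch linked service name are assumptions, and your own executable would read the extendedProperties values at runtime:

{
  "name": "StoreFilenames",
  "type": "Custom",
  "linkedServiceName": { "referenceName": "AzureBatchLinkedService", "type": "LinkedServiceReference" },
  "typeProperties": {
    "command": "StoreFilenames.exe",
    "extendedProperties": {
      "fileName": { "value": "@item().name", "type": "Expression" }
    }
  }
}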

Azure data factory: From Salesforce to REST API

I have created a Data Factory source for Salesforce, from which I am querying leads, and I then want to pass each lead's email ID as an argument (POST request) to a REST endpoint.
I am not sure what sink I should put in the pipeline, and if it is an HTTP file dataset, how do I pass the email ID from the source to the sink in the argument?
The only way I know to surface data in ADF itself is through a Lookup activity. You can then iterate through the results of the lookup using a ForEach activity. Reference the result of the lookup in the ForEach Items parameter:
@activity('nameoflookupactivity').output.value
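Inside the ForEach, a Web activity could then POST each email to your endpoint. A minimal sketch, assuming a hypothetical endpoint URL and an Email column in the lookup results:

{
  "name": "PostLeadEmail",
  "type": "WebActivity",
  "typeProperties": {
    "url": "https://example.com/api/leads",
    "method": "POST",
    "headers": { "Content-Type": "application/json" },
    "body": { "value": "{\"email\": \"@{item().Email}\"}", "type": "Expression" }
  }
}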
If you need to send multiple IDs at once, I think the only way would be to concatenate your IDs in a sink like a SQL database. You would first have to copy your IDs to a table in SQL. Then, in Azure SQL DB/SQL Server 2017, you could use the following query:
SELECT STRING_AGG([salesforce_Id], ',') AS Ids FROM salesforceLeads;
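A Lookup activity over this query would then expose the concatenated list as @activity('LookupIds').output.firstRow.Ids (the activity name here is an assumption), which a Web activity could embed in its POST body.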
The ADF tutorial for incrementally loading multiple tables discusses the ForEach activity extensively; you can find it here:
ADF Incrementally Loading Multiple Tables Tutorial
For more information about the STRING_AGG function, check out: STRING_AGG Documentation
Thanks,
Jan