Is there a way to add a column in the copy activity? - azure-data-factory

I have to call a web service multiple times, each time with a different parameter value.
Therefore I created a ForEach activity containing a Copy activity.
Now I want to save the output of each call along with the parameter value.
Can I somehow add an additional field in the Copy activity mapping for my parameter value? (@item().value)

What is your copy source type? You might use a query to include the item().value.
You could also reference this post.
Add file name as column in data factory pipeline destination
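For example, if the source happens to be a query-capable store such as Azure SQL (rather than the web service in the question), a rough sketch of a Copy activity source that injects the loop value via string interpolation could look like this (the table and column names are made up):
"source": {
    "type": "AzureSqlSource",
    "sqlReaderQuery": {
        "value": "select *, '@{item().value}' as ParameterValue from dbo.SomeTable",
        "type": "Expression"
    }
}
For non-query sources, the Additional Columns option on the Copy activity source (shown in the answers further down) achieves the same thing.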

Related

Additional Column in ADF through HTTP linked service

I would like to add an additional column during a Copy activity. I cannot use the Get Metadata activity because the copy goes through an HTTP linked service.
However, I am using a parameter called filename in order to specify the file. Would it be possible to use the above parameter in the additional column?
You cannot reference the dataset parameter from the pipeline directly.
Alternatively, you can use a pipeline parameter to provide the input to the dataset parameter and use the same pipeline parameter in the Additional Column value.
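As a rough sketch (the activity, dataset and parameter names here are illustrative), the Copy activity can pass the pipeline parameter to the dataset and reuse the same parameter as an additional column:
{
    "name": "CopyWithAdditionalColumn",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "HttpSourceDataset",
            "type": "DatasetReference",
            "parameters": {
                "filename": {
                    "value": "@pipeline().parameters.filename",
                    "type": "Expression"
                }
            }
        }
    ],
    "outputs": [
        { "referenceName": "SinkDataset", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": {
            "type": "DelimitedTextSource",
            "additionalColumns": [
                {
                    "name": "SourceFileName",
                    "value": {
                        "value": "@pipeline().parameters.filename",
                        "type": "Expression"
                    }
                }
            ]
        },
        "sink": { "type": "AzureSqlSink" }
    }
}
This way both the dataset parameter and the additional column are driven by the single pipeline parameter filename.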

Azure Data Factory V2 Copy Activity - Save List of All Copied Files

I have pipelines that copy files from on-premises to different sinks, such as on-premises and SFTP.
I would like to save a list of all files that were copied in each run for reporting.
I tried using Get Metadata and ForEach, but I am not sure how to save the output to a flat file or even a database table.
Alternatively, is it possible to find the list of objects that were copied somewhere in the Data Factory logs?
Thank you
Update:
Items: @activity('Get Metadata1').output.childItems
If you want to record the source file names, yes, we can. As you said, we need to use the Get Metadata and ForEach activities.
I've created a test to save the source file names of the Copy activity into a SQL table.
As we all know, we can get the file list via Child items in the Get Metadata activity.
The dataset of the Get Metadata1 activity specifies the container, which contains several files.
The list of files in the test container is as follows:
Inside the ForEach activity we can traverse this array. I set up a Copy activity named Copy-Files to copy files from source to destination.
@item().name represents every file in the test container. I key in the dynamic content @item().name to specify the file name. It then passes the file names in the test container one by one. This executes the copy task in batches; each batch passes in one file name to be copied, so that we can record each file name in the database table later.
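One way to wire this up (using a dataset parameter for the file name; the dataset and parameter names are illustrative) is roughly:
{
    "name": "ForEach1",
    "type": "ForEach",
    "typeProperties": {
        "isSequential": true,
        "items": {
            "value": "@activity('Get Metadata1').output.childItems",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Copy-Files",
                "type": "Copy",
                "inputs": [
                    {
                        "referenceName": "SourceBlobDataset",
                        "type": "DatasetReference",
                        "parameters": {
                            "FileName": {
                                "value": "@item().name",
                                "type": "Expression"
                            }
                        }
                    }
                ],
                "outputs": [
                    { "referenceName": "DestinationDataset", "type": "DatasetReference" }
                ],
                "typeProperties": {
                    "source": { "type": "DelimitedTextSource" },
                    "sink": { "type": "DelimitedTextSink" }
                }
            }
        ]
    }
}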
Then I set another Copy activity to save the file names into a SQL table. Here I'm using Azure SQL and I've created a simple table.
create table dbo.File_Names(
Copy_File_Name varchar(max)
);
As this post also said, we can use similar syntax, select '@{item().name}' as Copy_File_Name, to access activity data in ADF. Note: the alias must be the same as the column name in the SQL table.
Then we can sink the file names into the SQL table.
Select the table created previously.
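Put together, the source of that second Copy activity can be sketched roughly as follows; the sink simply points at the dbo.File_Names table created above:
"source": {
    "type": "AzureSqlSource",
    "sqlReaderQuery": {
        "value": "select '@{item().name}' as Copy_File_Name",
        "type": "Expression"
    }
},
"sink": {
    "type": "AzureSqlSink"
}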
After I run debug, I can see all the file names are saved into the table.
If you want to add more information, you can reference the post I mentioned previously.

How to get the iteration ID for items in an array using Azure Data Factory

I have a simple ADF pipeline which contains one Lookup (which loads the names of the tables to be migrated) and a ForEach activity (which contains a Copy activity and a function app that loads data into BQ). I want to get the iteration ID and send it to the Azure function app.
Let's say the Lookup returns a JSON array with three tables in it (A, B, C). I want to get the iteration ID inside the ForEach loop, for example 1 for A, 2 for B and 3 for C.
Any help on this will be highly appreciated.
I agree this is a common requirement, but there seems to be no direct way to get the array index inside the ForEach activity. However, you could try my little trick with an Azure Function activity.
Step 1: Create a text file (named index.txt) in some blob storage path and store the value 1 in it (to use as the array index).
Step 2: Inside the ForEach activity, use a Lookup activity to read the value of index.txt. The first time, it is 1.
Step 3: After that, execute an Azure Function activity to increment the value by 1, so that next time it is 2.
Step 4: When the ForEach activity finishes, you can reset the value to 0 with the Azure Function activity.
No need to create two Azure Functions, just one. You can pass a boolean parameter to distinguish whether the invocation is a reset or an increment (see the sketch below).
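A rough sketch of the two inner activities (the dataset, linked service and function names are made up; the function itself is assumed to read index.txt, add 1 and write it back):
{
    "name": "Lookup Index",
    "type": "Lookup",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "dataset": { "referenceName": "IndexFileDataset", "type": "DatasetReference" },
        "firstRowOnly": true
    }
},
{
    "name": "Increment Index",
    "type": "AzureFunctionActivity",
    "dependsOn": [
        { "activity": "Lookup Index", "dependencyConditions": [ "Succeeded" ] }
    ],
    "linkedServiceName": {
        "referenceName": "AzureFunctionLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "functionName": "UpdateIndex",
        "method": "POST",
        "body": "{ \"reset\": false }"
    }
}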
In the lookup table from which I pick the source and destination tables/databases, I added another column with the iterator number, like 1, 2, 3, 4, for each row that the Lookup activity retrieves.
Then, inside Azure Data Factory, I read that column inside the ForEach loop. For each of the source and destination tables I have a self-made iterator and used that for my purpose. It worked perfectly for me.
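Inside the ForEach, assuming the added column is named Iterator, the value can then be forwarded to the function app with an expression along these lines (the body layout and property names are hypothetical):
"body": {
    "value": "{ \"table\": \"@{item().SourceTable}\", \"iteration\": @{item().Iterator} }",
    "type": "Expression"
}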

Azure Data Factory Copy using Variable

I am copying data from a REST API to an Azure SQL database. The copy is working fine, but there is a column which isn't being returned by the API.
What I want to do is add this column to the source. I've got a variable called symbol which I want to use as the source column. However, this isn't working:
(screenshot of the mapping)
Any ideas?
This functionality is available using the "Additional Columns" feature of the Copy Activity.
If you navigate to the "Source" area, the bottom of the page will show you an area where you can add Additional Columns. Clicking the "New" button will let you enter a name and a value (which can be dynamic), which will be added to the output.
Source(s):
https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-overview#add-additional-columns-during-copy
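In pipeline JSON terms, the additional column bound to the variable might look roughly like this (assuming a REST source and a variable named symbol):
"source": {
    "type": "RestSource",
    "additionalColumns": [
        {
            "name": "symbol",
            "value": {
                "value": "@variables('symbol')",
                "type": "Expression"
            }
        }
    ]
}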
To my knowledge, the Copy activity may not meet your requirements. Please see the error conditions in the link:
Source data store query result does not have a column name that is specified in the input dataset "structure" section.
Sink data store (if with pre-defined schema) does not have a column name that is specified in the output dataset "structure" section.
Either fewer columns or more columns in the "structure" of the sink dataset than specified in the mapping.
Duplicate mapping.
I think Mapping Data Flow is your choice. You could add a derived column before the sink dataset and create a parameter named Symbol.
Then set the derived column to the value of Symbol.
You can use the Copy Activity with a stored proc sink to do that. See my answer here for more info.

Retrieve blob file name in Copy Data activity

I download JSON files from a web API and store them in blob storage using a Copy Data activity with a binary copy. Next, I would like to use another Copy Data activity to extract a value from each JSON file in the blob container and store the value together with its ID in a database. The ID is part of the filename, but is there some way to extract the filename?
You can do the following set of activities:
1) A Get Metadata activity: configure a dataset pointing to the blob folder, and add Child Items to the Field list.
2) A ForEach activity that takes every item from the Get Metadata activity and iterates over them. To do this, configure the Items to be @activity('NameOfGetMetadataActivity').output.childItems
3) Inside the ForEach, you can extract the filename of each file using the following expression: @item().name
After this continue as you see fit, either adding functions to get the ID or copy the entire name.
Hope this helped!
After setting up the dataset for the source file/file path with a wildcard and the destination/sink as some table:
Add a Copy activity and set up the source and sink.
Add Additional Columns.
Provide a name for the additional column and the value "$$FILEPATH" (see the sketch below).
Import the mapping and voila: your additional column should be in the list of source columns, marked "Additional".
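In the pipeline JSON, the resulting source block might look roughly like this ($$FILEPATH is the reserved value for the source file path; the column name and blob read settings are just examples):
"source": {
    "type": "DelimitedTextSource",
    "additionalColumns": [
        { "name": "SourceFilePath", "value": "$$FILEPATH" }
    ],
    "storeSettings": {
        "type": "AzureBlobStorageReadSettings",
        "recursive": true,
        "wildcardFileName": "*.csv"
    }
}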