We are executing the pipelines one by one in sequence and loading the data to on-premises SQL.
But we want to run all the copy activities from a single trigger, which means we have to load the data of 15 tables into the on-premises DB. If tomorrow we have to add one more table, we should not have to change the pipeline; we would like a dynamic table insert. Kindly advise.
Thanks to all.
I reproduced the above scenario and got the below results.
Use two Lookup activities: one for your source database and one for the on-prem SQL.
Here I have used Azure SQL Database for both source and target. You can use your database with a SQL Server linked service in the lookup.
Use the below query in both lookups to get the list of tables.
SELECT TABLE_NAME
FROM information_schema.tables;
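If your source also contains views or tables from other schemas that you don't want to copy, a slightly narrower query is an option (a sketch only; the dbo schema name is an assumption, adjust it to your own):
SELECT TABLE_NAME
FROM information_schema.tables
WHERE TABLE_TYPE = 'BASE TABLE'
AND TABLE_SCHEMA = 'dbo';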
Lookup list of source tables:
Give the same query in the second lookup.
List of target tables with another lookup:
Use a Filter activity to get the list of new tables which are not yet copied to the target.
Items: @activity('sql source lookup').output.value
Condition: @not(contains(activity('on-prem lookup').output.value, item()))
Filter result:
Give this Value array to a ForEach activity and use a Copy activity inside the ForEach.
Copy Source:
Copy sink:
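For the Copy activity source, you can use a query built with dynamic content so that each iteration reads the current table. A minimal sketch, assuming the lookups return a TABLE_NAME column as in the query above:
SELECT * FROM @{item().TABLE_NAME}
On the sink side, set the table name of the sink dataset to the same @item().TABLE_NAME expression (and, if your sink supports it, enable the auto create table option), so that adding a new source table tomorrow requires no pipeline change.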
You can see the new tables are copied to my target. Schedule this pipeline every day so that every new table gets copied to the target.
Related
I am using Azure Data Factory to copy data from tables in one database to tables in another database. I am using a lookup table to get the list of tables that need to be copied, and after that a ForEach iterator to copy the data.
I am using the below table to get the list of tables that need to be copied.
The problem: I want to update the flag to 1 when a table is successfully copied. I tried using the log that is generated after the pipeline runs, but I am unable to use it effectively.
I have created an ADF pipeline which picks up files, processes them, and projects the data to Power BI. I want to log all the information to a table as part of logging.
What information should I log into the table, and how do I achieve it?
want to log all the information to a table as part of logging
Use a Copy activity, take its source input from the output of the other activities, and use SQL Server as the sink.
You can use the below query in the source and add it via Add dynamic content:
SELECT '@{pipeline().DataFactory}' as DataFactory_Name,
'@{pipeline().Pipeline}' as Pipeline_Name,
'@{activity('copytables').output.executionDetails[0].source.type}' as Source_Type,
'@{activity('copytables').output.executionDetails[0].sink.type}' as Sink_Type,
'@{activity('copytables').output.executionDetails[0].status}' as Execution_Status,
'@{activity('copytables').output.rowsRead}' as RowsRead,
'@{activity('copytables').output.rowsCopied}' as RowsWritten,
'@{activity('copytables').output.copyDuration}' as CopyDurationInSecs,
'@{activity('copytables').output.executionDetails[0].start}' as CopyActivity_Start_Time,
'@{utcnow()}' as CopyActivity_End_Time,
'@{pipeline().RunId}' as RunId,
'@{pipeline().TriggerType}' as TriggerType,
'@{pipeline().TriggerName}' as TriggerName,
'@{pipeline().TriggerTime}' as TriggerTime
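For the sink, you need a log table whose column names match the aliases above. A possible definition is below (a sketch only; I have kept everything as varchar because the query returns string literals, but you can use stricter types if you prefer):
CREATE TABLE dbo.Pipeline_Log (
DataFactory_Name varchar(200),
Pipeline_Name varchar(200),
Source_Type varchar(100),
Sink_Type varchar(100),
Execution_Status varchar(50),
RowsRead varchar(50),
RowsWritten varchar(50),
CopyDurationInSecs varchar(50),
CopyActivity_Start_Time varchar(50),
CopyActivity_End_Time varchar(50),
RunId varchar(100),
TriggerType varchar(50),
TriggerName varchar(200),
TriggerTime varchar(50)
);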
Refer to this article by Rohit Dhande for more information.
I have pipelines that copy files from on-premises to different sinks, such as on-premises and SFTP.
I would like to save a list of all files that were copied in each run for reporting.
I tried using Get Metadata and ForEach, but I am not sure how to save the output to a flat file or even a database table.
Alternatively, is it possible to find the list of objects that were copied somewhere in the Data Factory logs?
Thank you
Update:
Items: @activity('Get Metadata1').output.childItems
If you want to record the source file names, yes, we can. As you said, we need to use the Get Metadata and ForEach activities.
I've created a test to save the source file names of the Copy activity into a SQL table.
As we all know, we can get the file list via Child items in Get metadata activity.
The dataset of the Get Metadata1 activity specifies the container, which contains several files.
The list of files in the test container is as follows:
Inside the ForEach activity, we can traverse this array. I set a Copy activity named Copy-Files to copy files from source to destination.
@item().name represents each file in the test container. I key in the dynamic content @item().name to specify the file name, so the file names in the test container are passed in sequentially. This executes the copy task in batches, with each batch passing in one file name to be copied, so that we can record each file name into the database table later.
Then I set another Copy activity to save the file names into a SQL table. Here I'm using Azure SQL and I've created a simple table.
create table dbo.File_Names(
Copy_File_Name varchar(max)
);
As this post also said, we can use similar syntax, select '@{item().name}' as Copy_File_Name, to access activity data in ADF. Note: the alias name should be the same as the column name in the SQL table.
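The same pattern extends if you also want to record which run copied the file and when. A sketch reusing expressions shown earlier (the extra columns would of course need to exist in the table; the names here are only illustrative):
SELECT '@{item().name}' as Copy_File_Name,
'@{pipeline().RunId}' as Run_Id,
'@{utcnow()}' as Copied_At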
Then we can sink the file names into the SQL table.
Select the table which was created previously.
After I run debug, I can see all the file names are saved into the table.
If you want to add more information, you can reference the post I mentioned previously.
I have a dataset based on a CSV file. This exposes data as follows:
Name,Age
John,23
I have an Azure SQL Server instance with a table named: [People]
This has columns
Name, Age
I am using the Copy Data activity and trying to copy data from the CSV dataset into the Azure SQL table.
There is no option to indicate the target table name. Instead, I have a space to input a stored procedure name.
How does this work? Where do I put the target table name in the image below?
You should definitely have a table name to write to; if you don't have a table, something is wrong with your setup. Make sure you have a table to write to and that the field names in your table match the fields in the CSV file. Then follow the steps outlined in the article below. There are several steps to click through, but all are fairly intuitive, so just follow the instructions step by step and you should be fine.
http://normalian.hatenablog.com/entry/2017/09/04/233320
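For reference, a minimal target table matching the CSV columns might look like this (the column types are an assumption based on the sample data):
CREATE TABLE dbo.People (
Name varchar(100),
Age int
);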
You can add records into the SQL Database table directly, without stored procedures, by configuring the table on the sink dataset rather than on the Copy activity, which is what is happening here.
Have a look at the below screenshot which shows the Table field within my dataset.
I'm designing a Copy Data task where the Sink SQL Server table contains an Identity column. The Copy Data task always wants me to map that column when, in my opinion, it should just not include the column in the list of columns to map. Does anyone know how I can get the ADF Copy Data task to ignore Sink Identity columns?
If you are using the Copy Data tool, and in your SQL Server the ID column is set as auto-increment (identity), then it should not show up at the mapping step. Please tell us if that is not the case.
If you are creating the pipeline/dataset yourself, you can go to the sink dataset's schema tab and remove the ID column. Then go to the Copy activity's mapping tab and click Import schemas again; the ID column should have disappeared.
You could include a SET IDENTITY_INSERT <table> ON statement for the given table before executing the copy step. After it completes, set it back to OFF.
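As a sketch of that pattern (the table and column names are only for illustration):
SET IDENTITY_INSERT dbo.MyTable ON;
-- insert rows that supply explicit values for the identity column
INSERT INTO dbo.MyTable (Id, Name) VALUES (1, 'John');
SET IDENTITY_INSERT dbo.MyTable OFF;
Note that SET IDENTITY_INSERT is session-scoped, so the ON/OFF statements have to run in the same session as the statements that write the identity values.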