Query configuration table as a one-time activity in Azure Data Factory

Is there a way to query a DB table as a one-time activity, so that the values can be used to drive a repeating pipeline activity?
Let's say I have a set of values that varies based on the environment (DEV/TEST/PROD). Instead of passing the environment-specific values as parameters, can I configure them in a DB table and read them the first time the Data Factory runs, so that a repeating orchestrator task that runs every five minutes can fetch the values obtained from the table?

You can use a Lookup activity for your case.
Specify your query in the Lookup activity to get the row that holds your environment value. You may also want to check the "First row only" option for your case.
To access the value returned from DB, you can get the value from the output of the Lookup. It would be in the "firstRow" object of the output.
For the conditional/switch handling in your use case, put @activity('Lookup config table').output.firstRow.VALUE in the expression for the Switch's dynamic content.
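As a rough sketch, the Lookup and Switch could be wired together in the pipeline JSON like this (the dataset, table, column and case values are illustrative and need to be adapted to your configuration table):

{
    "activities": [
        {
            "name": "Lookup config table",
            "type": "Lookup",
            "typeProperties": {
                "source": {
                    "type": "AzureSqlSource",
                    "sqlReaderQuery": "SELECT VALUE FROM dbo.Config WHERE Environment = 'DEV'"
                },
                "dataset": { "referenceName": "ConfigDataset", "type": "DatasetReference" },
                "firstRowOnly": true
            }
        },
        {
            "name": "Switch on config value",
            "type": "Switch",
            "dependsOn": [ { "activity": "Lookup config table", "dependencyConditions": [ "Succeeded" ] } ],
            "typeProperties": {
                "on": {
                    "value": "@activity('Lookup config table').output.firstRow.VALUE",
                    "type": "Expression"
                },
                "cases": [
                    { "value": "ValueA", "activities": [] },
                    { "value": "ValueB", "activities": [] }
                ]
            }
        }
    ]
}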

You can use a Lookup activity in Azure Data Factory to query the values from the DB, and then use variables or parameters to store them for use in subsequent activities.
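For example, the looked-up value could be stored in a pipeline variable right after the Lookup, so later activities only reference the variable (the variable and activity names below are illustrative):

{
    "name": "Store config value",
    "type": "SetVariable",
    "dependsOn": [ { "activity": "Lookup config table", "dependencyConditions": [ "Succeeded" ] } ],
    "typeProperties": {
        "variableName": "configValue",
        "value": {
            "value": "@activity('Lookup config table').output.firstRow.VALUE",
            "type": "Expression"
        }
    }
}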

Related

How to filter prometheus series based on the results of another query in grafana dashboard?

I am using Grafana 9.3.1 for monitoring of our system. Among other things, I am trying to monitor the remaining FUP of a phone number for each unit we operate.
Basically, we intend to use two data sources.
Database mapping of the unit ID to its phone number (e.g. unit_id=123, phone_number="00 123456789")
Prometheus time series remaining_fup{phone_number="00 123456789"}. However, remaining_fup is 3rd-party data and does not include unit_id.
In my unit-detail dashboard I have a unit_id variable which indicates which unit's FUP should be displayed (among other things depending on unit_id).
My original approach was this:
Create a mixed datasource dashboard
Add the database datasource as query A: SELECT phone_number FROM units WHERE unit_id='$unit_id'
Add prometheus datasource remaining_fup and filter it based on A.phone_number: remaining_fup{phone_number="${A.phone_number}"}
Unfortunately, such use of A isn't supported. I had hoped to apply a transformation like Merge or Join by field and then Filter, but with no success. After a lot of googling and trying, I feel hopeless.
Could you help please? Is such filter even possible? Thanks!
TL;DR: In grafana dashboard I want to query one datasource in order to obtain a value which I subsequently want to use in another datasource query.
1.) Create a variable - name: phone_number, type: Query - and query your database datasource: SELECT phone_number FROM units WHERE unit_id='$unit_id'. You can hide this variable if you don't want it to be visible to the dashboard users.
2.) The variable phone_number may have multiple values, so use advanced variable formatting to create valid regex query syntax for your Prometheus datasource, e.g.
remaining_fup{phone_number=~"${phone_number:pipe}"}
Of course these queries are just examples and they may need some (syntax) tweaking for your use case. Main idea: don't use 2 queries, but one variable and one query (where you use that variable).
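For illustration, the hidden query variable could look roughly like this in the dashboard JSON (field names and values vary with the Grafana version and the datasource type, so treat this only as a sketch):

{
    "templating": {
        "list": [
            {
                "name": "phone_number",
                "type": "query",
                "hide": 2,
                "multi": true,
                "datasource": "MyDatabase",
                "query": "SELECT phone_number FROM units WHERE unit_id='$unit_id'",
                "refresh": 2
            }
        ]
    }
}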

How to call the Copy activity dynamically in a single pipeline

We are executing the pipelines one by one in sequence and loading the data into on-premises SQL.
But we want to load the data for all the copy activities in a single trigger, which means we have to load 15 tables into the on-premises DB. If tomorrow we have to add one more table, we should not have to change the pipeline; we would like dynamic table inserts. Kindly advise.
Thanks to all.
I reproduced the above scenario and got the below results.
Use two Lookup activities: one for your source database and one for the on-prem SQL database.
Here I have used an Azure SQL database for both source and target. You can use your database with a SQL Server linked service in the lookup.
Use the below query in both lookups to get the list of tables.
SELECT TABLE_NAME
FROM information_schema.tables;
The first Lookup returns the list of source tables; use the same query in the second Lookup to get the list of target tables.
Use a Filter activity to get the list of new tables which are not yet copied to the target.
Items: @activity('sql source lookup').output.value
Condition: @not(contains(activity('on-prem lookup').output.value, item()))
The filter output contains only the tables that do not yet exist in the target.
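A minimal sketch of that Filter activity in pipeline JSON (the activity names match the two lookups described above and should be adapted to your pipeline):

{
    "name": "Filter new tables",
    "type": "Filter",
    "typeProperties": {
        "items": {
            "value": "@activity('sql source lookup').output.value",
            "type": "Expression"
        },
        "condition": {
            "value": "@not(contains(activity('on-prem lookup').output.value, item()))",
            "type": "Expression"
        }
    }
}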
Pass this Value array to a ForEach activity and use a Copy activity inside the ForEach.
Configure the Copy activity's source and sink so that the table name comes from the current ForEach item, as in the sketch below.
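A rough sketch of the ForEach with a parameterized Copy activity; the dataset names and the tableName parameter are illustrative, the target dataset is assumed to expose a tableName parameter, and TABLE_NAME is the column returned by the information_schema query above:

{
    "name": "ForEach new table",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@activity('Filter new tables').output.Value",
            "type": "Expression"
        },
        "activities": [
            {
                "name": "Copy table",
                "type": "Copy",
                "inputs": [ { "referenceName": "SourceTableDataset", "type": "DatasetReference" } ],
                "outputs": [
                    {
                        "referenceName": "TargetTableDataset",
                        "type": "DatasetReference",
                        "parameters": {
                            "tableName": { "value": "@item().TABLE_NAME", "type": "Expression" }
                        }
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "AzureSqlSource",
                        "sqlReaderQuery": {
                            "value": "SELECT * FROM @{item().TABLE_NAME}",
                            "type": "Expression"
                        }
                    },
                    "sink": {
                        "type": "SqlServerSink",
                        "tableOption": "autoCreate"
                    }
                }
            }
        ]
    }
}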
You can see the new tables are copied to my target. Schedule this pipeline to run every day so that every new table gets copied to the target.

Azure Data Factory - Inner Activity Failed In For Each

I have used a Lookup activity to pass values to a ForEach iteration activity. The output values of the Lookup are generated from a SQL table. Once the iteration starts, if one of the activities inside the ForEach fails, the ForEach iterator still tries to run for as many times as there are items in the Lookup output. How do I come out of the loop? I have removed the records from the SQL table to come out of the loop, but the loop continues to run. How can I clear the ForEach items set when an inner activity fails?
Regards,
Sandeep
How can I clear the For Each Items set when an inner activity fails?
No, we can't. The ForEach activity doesn't support break for now, even if an internal activity fails.
Many users have posted the same question on Stack Overflow and Data Factory feedback:
It's been voted up 31 times, but still with no response from the Data Factory product team.
Ref: https://feedback.azure.com/forums/270578-data-factory/suggestions/39673909-foreach-activity-allow-break
Update:
Congratulations on finding a solution for your scenario:
"Now used an until activity by comparing the variable values and count of files out put from a lookup activity to resolve the issue."
I'm posting it as an answer so it can benefit other community members.
Hope this helps.
I have replaced the ForEach loop with an Until activity. The input for the Until activity was a SQL query which returns the count of records from the table where the file names are copied, plus a variable value. I used the @greater expression to compare the variable value with the Lookup activity's count. Inside the loop I created logic to increment the variable using a temp variable and the add expression. If an inner activity fails, the variable is set to a value greater than the Lookup output count, so the Until condition is met and the loop exits.
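A rough sketch of that Until pattern (the variable and activity names are illustrative; the copy logic and the on-failure branch that pushes the counter above the lookup count are omitted for brevity, and a temporary variable is used because a Set Variable activity cannot reference the variable it is setting):

{
    "name": "Until all files processed",
    "type": "Until",
    "typeProperties": {
        "expression": {
            "value": "@greater(int(variables('processedCount')), activity('Lookup file list').output.count)",
            "type": "Expression"
        },
        "timeout": "0.12:00:00",
        "activities": [
            {
                "name": "Set temp counter",
                "type": "SetVariable",
                "typeProperties": {
                    "variableName": "tempCount",
                    "value": {
                        "value": "@string(add(int(variables('processedCount')), 1))",
                        "type": "Expression"
                    }
                }
            },
            {
                "name": "Set processed counter",
                "type": "SetVariable",
                "dependsOn": [ { "activity": "Set temp counter", "dependencyConditions": [ "Succeeded" ] } ],
                "typeProperties": {
                    "variableName": "processedCount",
                    "value": {
                        "value": "@variables('tempCount')",
                        "type": "Expression"
                    }
                }
            }
        ]
    }
}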

How to get the iteration ID for items in an array using Azure Data Factory

I have a simple ADF pipeline which contains one Lookup (which loads the names of the tables to be migrated) and a ForEach activity (which contains a Copy activity and a function app that loads data into BQ). I want to get the iteration ID and send it to the Azure function app.
Let's say the Lookup returns a JSON with three tables in it (A, B, C). I want to get the iteration ID inside the ForEach loop, for example 1 for A, 2 for B and 3 for C.
Any help on this will be highly appreciated.
I agree this is a common requirement, but there seems to be no direct way to get the array index inside the ForEach activity. However, you could try my little trick with an Azure Function activity.
Step 1: Create a text file (named index.txt) in some blob storage path and store the value 1 in it (to use it as the array index).
Step 2: Inside the ForEach activity, use a Lookup activity to read the value of index.txt. The first time, it is 1.
Step 3: After that, execute an Azure Function activity to change the value - plus 1 - so that next time it is 2.
Step 4: When you finish the ForEach activity, you can reset the value to 0 with an Azure Function activity.
No need to create 2 Azure Functions, just 1. You can pass a boolean parameter to distinguish whether the invocation is for reset or for plus 1.
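A rough sketch of the two activities inside the ForEach (the dataset pointing at index.txt, the linked service, the function name and the boolean body parameter are all illustrative):

{
    "activities": [
        {
            "name": "Read current index",
            "type": "Lookup",
            "typeProperties": {
                "source": { "type": "DelimitedTextSource" },
                "dataset": { "referenceName": "IndexFileDataset", "type": "DatasetReference" },
                "firstRowOnly": true
            }
        },
        {
            "name": "Increment index",
            "type": "AzureFunctionActivity",
            "dependsOn": [ { "activity": "Read current index", "dependencyConditions": [ "Succeeded" ] } ],
            "linkedServiceName": { "referenceName": "AzureFunctionLinkedService", "type": "LinkedServiceReference" },
            "typeProperties": {
                "functionName": "UpdateIndex",
                "method": "POST",
                "body": { "reset": false }
            }
        }
    ]
}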
In the lookup table from which I pick the source and destination tables/databases, I added another column with the iterator number (1, 2, 3, 4, ...) for each row that the Lookup activity retrieves.
Then inside Azure Data Factory, I read that column inside the ForEach loop. For each of the source and destination tables I have a self-made iterator, and I used that for my purpose. It worked perfectly fine for me.

DataStage: set web service transformer stage URL from query

I need to set "PortAddress" and "WSDL Address" dynamically using the result of a query.
I've created an Oracle Connector stage with my query. For example:
select col1,col2,col3,...,url
from myTable
How can I use the "url" column value in the Web Service stage?
Thanks in advance.
This is a general problem not restricted to your web service transformer. You want to "transfer" data from a data stream to the Sequence level in order to feed it into the next job as a parameter.
Basically there are two main ways to do it:
Parallel Edition: In the first job, select the URL from your database and write it to a value file of a parameter set. Use the parameter set in the second job with the new value file.
Server Edition: In a server job, select the data from your database; in a transformer you can use the DataStage function DSSetUserStatus() to set the so-called UserStatus for this job. This can then be referenced in the next job of the Sequence.