Azure ADF Lookup a distinct list for a following 'For each activity' - azure-data-factory

I'm using a Lookup activity to get a list of values from a CSV file. I want to loop through a distinct list of the values in one of the columns (the other columns are not identical). I can build an array object with just the column I'm interested in, but how can I make it a distinct list for a following loop?

You can use the union() function to get the unique list from the array.
Example:
I have a list of array values in a variable.
Using set variable activity, get the unique list from the array.
@union(variables('array_list'),variables('array_list'))
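The effect of taking the union of an array with itself can be sketched outside ADF. The Python below is only an illustration of the idea, not ADF code: the helper name union_distinct and the sample values are hypothetical, and keeping the first-seen order is an assumption of this sketch rather than documented ADF behavior.

```python
from typing import Hashable, Iterable, List


def union_distinct(*collections: Iterable[Hashable]) -> List[Hashable]:
    """Return every item found in any input, keeping only its first occurrence."""
    seen = set()
    result = []
    for collection in collections:
        for item in collection:
            if item not in seen:
                seen.add(item)
                result.append(item)
    return result


# Hypothetical column values with duplicates, standing in for variables('array_list').
array_list = ["A", "B", "A", "C", "B"]

# Same shape as the expression above: the array is unioned with itself.
distinct_list = union_distinct(array_list, array_list)
print(distinct_list)
```

The distinct array can then be used as the items of the following ForEach activity.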

Related

Filter output is a JSON ids, these ids should be copied in to another json

I am filtering two JSON files that contain ids and picking out the ids that do not exist in the other file.
The Filter activity outputs a list of ids. These ids should be copied to another JSON file, or passed to the next activity so that data can be copied from a REST API using them.
I have reproduced this with sample JSON files.
Use the expression below to pass the output value of the Filter activity to later activities. Alternatively, create a variable, store the value in it with a Set variable activity, and use it later.
Here I created a string variable to store the output.
Expression:
@string(activity('Filter1').output.value)

Dynamic filter rows in Talend

Is it possible to get values from an Excel file and use them to filter rows in Talend?
For example:
I have an Excel file with a list of account numbers.
In Talend, I have a query in a tDBInput linked to a tMap, which is linked to a tFilter. The filter should be based on the account list in the Excel column.
Alternatively, could I use the account list in the WHERE clause of the tDBInput query? The problem is that the account list can change at any time.
Thanks a lot.
There are multiple ways to proceed:
Link your tDBInput to a tMap as the main flow, and your tFileInputExcel as the lookup. Make the join between the two flows in the tMap an inner join: this way you filter the main data with the data coming from Excel. The drawback is that you read all the data from tDBInput and filter afterwards, which is less efficient.
You can also do it with two subjobs:
First subjob: tFileInputExcel -> tJavaRow. Push your list of accounts into a String context variable to construct your WHERE clause. (You can also use a tAggregateRow in 'list' mode to build the list.)
Second subjob: use this context variable as the filter in your tDBInput query. This way the component reads only the data you need.
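The two-subjob idea (read the account list first, then let the database do the filtering) can be sketched in Python. This is not Talend code: the transactions table, its columns, and the account values are hypothetical, and the sketch binds the values as query parameters, whereas the Talend job described above builds the clause as a string in a context variable.

```python
import sqlite3

# Hypothetical account numbers, standing in for the rows read from the Excel file.
accounts_from_excel = ["1001", "1003"]

# In-memory database with a hypothetical table, standing in for the tDBInput source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (account TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [("1001", 10.0), ("1002", 20.0), ("1003", 30.0)],
)

# "First subjob": turn the account list into the IN (...) part of the WHERE clause.
# One placeholder per account, so the values are bound rather than concatenated.
placeholders = ", ".join("?" for _ in accounts_from_excel)
query = (
    "SELECT account, amount FROM transactions "
    f"WHERE account IN ({placeholders}) ORDER BY account"
)

# "Second subjob": run the query, so only the matching rows are read.
rows = conn.execute(query, accounts_from_excel).fetchall()
print(rows)
```

Because the clause is rebuilt on every run, a changed account list in the file is picked up without touching the query itself.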

how to get Iteration Id for items in array using Azure Data factory

I have a simple ADF pipeline which contains one Lookup activity (which loads the names of the tables to be migrated) and a ForEach activity (which contains a Copy activity and a Function App that loads data into BQ). I want to get the iteration ID and send it to the Azure Function App.
Let's say the Lookup returns a JSON with three tables in it (A, B, C). I want to get the iteration ID inside the ForEach loop, for example 1 for A, 2 for B and 3 for C.
Any help on this will be highly appreciated.
I agree this is a common requirement, but there seems to be no direct way to get the array index inside the ForEach activity. However, you could try my little trick with an Azure Function activity.
Step 1: Create a text file (named index.txt) in some blob storage path and store the value 1 in it (to use as the array index).
Step 2: Inside the ForEach activity, use a Lookup activity to read the value of index.txt. The first time, it is 1.
Step 3: After that, execute an Azure Function activity to increment the value by 1, so that next time it is 2.
Step 4: When the ForEach activity finishes, you could reset the value to 0 with an Azure Function activity.
There is no need to create two Azure Functions, just one. You could pass a boolean parameter to distinguish whether the call is a reset or an increment. Note that this only works if the ForEach activity runs sequentially, since parallel iterations would read and update the file at the same time.
In the lookup table from which I pick the source and destination tables/databases, I added another column with an iterator number (1, 2, 3, 4, ...) for each row that the Lookup activity retrieves.
Then, inside Azure Data Factory, I read that column inside the ForEach loop. Each source and destination table has its own self-made iterator, which I used for my purpose. It worked perfectly fine for me.
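The second approach (carrying the iterator number in the data itself) can be sketched in Python. This is only an illustration, not ADF code: the row shape and the field names table and iteration_id are hypothetical, and in the answer above the number is stored as a column in the lookup table rather than computed at run time.

```python
# Hypothetical Lookup output: one row per table to migrate.
lookup_rows = [{"table": "A"}, {"table": "B"}, {"table": "C"}]

# Attach a 1-based iteration id to each row, mirroring the extra column
# added to the lookup table.
rows_with_id = [
    {**row, "iteration_id": index}
    for index, row in enumerate(lookup_rows, start=1)
]

# Inside the loop, each item already carries its own id, so no shared
# counter is needed and the iterations do not depend on each other.
payloads = []
for row in rows_with_id:
    payloads.append({"table": row["table"], "iteration_id": row["iteration_id"]})

print(payloads)
```

Since the id travels with each row, this variant does not rely on the order in which the loop iterations run.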

Using MYSQLI to select rows in which part of a column matches part of an input

I have a database in which one of the columns contains a series of information 'tags' about the row that are stored as a comma-separated list (a string) of dynamic length. I am using mysqli within PHP, and I want to select rows in which any of these items match any of the items in an input string.
For example, there could be a row describing an apple, containing the tags: "tasty, red, fruit, sour, sweet, green." I want this to show up as a result in a query like: "SELECT * FROM table WHERE info tags IN ('blue', 'red', 'yellow')", because it has at least one item ("red") overlapping. Kind of like "array_intersect" in PHP.
I think I could use IN if each row had only one tag, and I could use LIKE if I used only one input tag, but both are of dynamic length. I know I can loop over all the input tags, but I was hoping to put this in a single query. Is that possible? If not, can I use a different structure to store the tags in the database to make this possible (something other than a comma separated string)?
I think the best approach would be to create a tags table (id + label) and a separate "table_tags" table which holds table_id and tag_id.
That means using JOINs to get the final result.
Another (but lazy) solution would be to prefix and suffix the tags with commas so the full column contains something like:
,tasty,red,fruit,sour,sweet,green,
You can then do a LIKE search without worrying about overlapping words (e.g. red vs. bored) and still get a proper match by using LIKE '%,WORD,%'.
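Both suggestions can be tried side by side in a small sketch. It uses Python with SQLite instead of PHP with mysqli, so it only illustrates the SQL shape. The table and column names (items, item_tags, tag_string) and the sample rows are hypothetical, and OR-ing one LIKE per input tag is an assumption about how the single-tag pattern would extend to several tags.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, tag_string TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT UNIQUE);
    CREATE TABLE item_tags (item_id INTEGER, tag_id INTEGER);

    INSERT INTO items VALUES (1, 'apple', ',tasty,red,fruit,');
    INSERT INTO items VALUES (2, 'couch', ',bored,soft,');
    INSERT INTO tags VALUES (1, 'tasty'), (2, 'red'), (3, 'fruit'), (4, 'bored'), (5, 'soft');
    INSERT INTO item_tags VALUES (1, 1), (1, 2), (1, 3), (2, 4), (2, 5);
    """
)

wanted = ["blue", "red", "yellow"]

# Approach 1: normalized tables, a single IN (...) over the tag labels.
placeholders = ", ".join("?" for _ in wanted)
joined = conn.execute(
    f"""
    SELECT DISTINCT items.name
    FROM items
    JOIN item_tags ON item_tags.item_id = items.id
    JOIN tags ON tags.id = item_tags.tag_id
    WHERE tags.label IN ({placeholders})
    ORDER BY items.name
    """,
    wanted,
).fetchall()

# Approach 2: comma-wrapped string, one LIKE '%,WORD,%' per input tag, OR-ed together.
like_clause = " OR ".join("tag_string LIKE ?" for _ in wanted)
liked = conn.execute(
    f"SELECT name FROM items WHERE {like_clause} ORDER BY name",
    [f"%,{tag},%" for tag in wanted],
).fetchall()

print(joined, liked)
```

In this sample the couch row is tagged "bored", which contains "red" as a substring, yet neither query returns it. The second approach depends on the stored string having no spaces after the commas.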

ValueList from ExecuteSQL() based query in FMP 12

I'm looking into upgrading our FMP11 developed solution to FMP12. For us, key functionality exists around the ValueList feature to DISPLAY one value (e.g. a description) while RETURNING another value (e.g. a UID), into the selected field.
I would be interested if you have been able to replicate this feature from the ExecuteSQL() function (I can successfully return a single ValueList ... having trouble with the above)
many thanks in advance
Giles
Based on the core functionality of value lists, you can't use the ExecuteSQL() function to calculate the value directly inside of the value list dialog box.
What you would have to do is create a table with a single record and two fields. Then you would use an ExecuteSQL() calculation to populate the first and second fields with data. It is important to sort the data inside ExecuteSQL() in the same order for both fields, so that the two lists line up.
So your FileMaker calculations would be (assuming the first field is key and the second is name, both from a table called items, and you are looking for rows where key > 100):
keylist =
ExecuteSQL (
"SELECT key
FROM items
WHERE key > 100
ORDER BY key ASC"
; "" ; "" )
namelist =
ExecuteSQL (
"SELECT name
FROM items
WHERE key > 100
ORDER BY key ASC"
; "" ; "" )
You would then create a value list that uses keylist as the first field, and namelist as the second field, only displaying values from the second field.
It would be nice to have the functionality to calculate a value list, but as far as I know FileMaker always needs to pull values from a source outside of the value list dialog box.
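Why the identical ORDER BY matters can be seen in a small sketch. It uses Python with SQLite rather than FileMaker, so it only illustrates the idea of two separately queried lists that pair up by position. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE items ("key" INTEGER PRIMARY KEY, name TEXT)')
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(150, "Widget"), (90, "Bolt"), (101, "Gadget")],
)

# Two separate queries with the same WHERE and ORDER BY, as in the calculations above.
keylist = [
    row[0]
    for row in conn.execute('SELECT "key" FROM items WHERE "key" > 100 ORDER BY "key" ASC')
]
namelist = [
    row[0]
    for row in conn.execute('SELECT name FROM items WHERE "key" > 100 ORDER BY "key" ASC')
]

# Because both lists share one ordering, position i in keylist belongs to
# position i in namelist.
pairs = list(zip(keylist, namelist))
print(pairs)
```

If the two queries were sorted differently, the displayed name and the returned key would no longer refer to the same record.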