Azure Data Factory: If activity expression with array element

I have an array variable HeaderList with a list of names. I have a Lookup activity to read a CSV file header. Then I have an If Condition activity to compare the first element. The expression in the If activity is like this:
@equals(activity('Lookup2').output.firstRow.Prop_0,variables('HeaderList')[0])
That does not work. If I change it to this:
@equals(activity('Lookup2').output.firstRow.Prop_0,'XYZ'), then it works. How do I reference an array element in an expression?
Thanks
@equals(activity('Lookup2').output.firstRow.Prop_0,variables('HeaderList')[0])
The expression editor flags this with: Cannot fit unknown into function parameter any. What does it mean?

I got the same error in the If Condition activity, but when the pipeline is debugged, it does not throw any error. I repro'd the same in my ADF environment. Below are the steps.
A Lookup activity is taken, and it refers to a CSV file.
An array variable 'HeaderList' is taken, and values for the variable are set using a Set Variable activity.
Then an If Condition activity is taken, and the below expression is given as dynamic content.
@equals(activity('Lookup1').output.firstRow.prop_0,variables('HeaderList')[0])
The same error is produced.
Error: Cannot fit unknown into function parameter any.
When the pipeline is debugged, it does not throw any error; it succeeds. The message is only a design-time validation warning, raised because the type of the array element is unknown to the expression validator; the expression evaluates correctly at runtime.
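If the design-time warning itself is a problem, one possible workaround (an untested sketch, not part of the original answer) is to coerce the array element to a concrete type so the validator has something to match:
@equals(activity('Lookup1').output.firstRow.prop_0, string(variables('HeaderList')[0]))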


Azure Data Factory Error 'item' is not a recognized function

I have the following REST configuration in Azure Data Factory.
As you can see, I'm getting the error:
'item' is not a recognized function
The full configuration is
convert?q=USD_@{item().Currency}&compact=ultra&apiKey=xxxxxxxxxxxxxxxxxxx
Do I need to configure @item in Parameters?
The guide suggests I need to follow these steps:
Based on your code in the dynamic content, you are using this REST resource inside a ForEach, since it has the item() function. You can get item().<"Value"> in a ForEach using a Lookup.
item() is a ForEach function and can be used inside a ForEach in an ADF pipeline. You are using the ForEach function inside the Dataset, where it is not known; that's why it gives a warning. When you use that dataset only for that pipeline, it will give you the result without any error, but for any other pipeline the warning is treated as an error.
To use a pipeline function in the Dataset, the best practice is to create a Dataset parameter and give it its value from the pipeline, like below.
Create a Dataset parameter with string type and a default value:
Give this parameter in the Dataset dynamic content:
Now you can give pipeline function values for this parameter inside the ForEach or inside the pipeline:
Here I used a Copy activity as a sample and gave the value as per my URL. You can give your Relative URL with the item() function in the dynamic content.
Based on the item().Currency values, it will give the REST page URL in each iteration.
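A minimal sketch of that wiring, assuming the Dataset parameter is named Currency: the Dataset's Relative URL dynamic content becomes
convert?q=USD_@{dataset().Currency}&compact=ultra&apiKey=xxxxxxxxxxxxxxxxxxx
and inside the ForEach, the value passed to the Dataset parameter is
@item().Currency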

Azure Data Factory - Capture error details of a dataflow activity -> Store into a variable -> Assign this variable to a dataflow parameter

I have a data flow, and my requirement is to capture the error details into a variable when it fails and assign this variable to a parameter in the next data flow. I was able to get as far as the second stage (with help) as below, but I'm unable to get this variable assigned to a parameter in the next data flow. The error I get is: Expression cannot be parsed.
What do I do next?
This parameter is assigned to a column in the data flow, and I use this column to update the table in the dedicated pool with the relevant error message.
I tried to reproduce the same in my environment and got the same error.
The above scenario fails because the data flow fails to parse the ' and \ characters in your error message.
To resolve the above error, please follow the below steps:
I created the Fail activity Fail1 with a message containing those characters.
Go to Set Variable: create a variable and add the dynamic content below as the value (here pipeline().parameters.quote is a pipeline parameter holding the single-quote character, so the expression swaps single quotes for double quotes and backslashes for forward slashes).
@replace(replace(string(activity('Fail1').output.message),pipeline().parameters.quote,'"'),'\','/')
Output:
Updated: assigning the variable to the data flow parameter:
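A minimal sketch, with hypothetical names: if the variable is errorMsg and the data flow parameter is errorDetails, set the parameter's value in the Data Flow activity (as a pipeline expression) to
@variables('errorMsg')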

How to pass the outputs from a Get Metadata stage and use them for file name comparison in a Databricks notebook

I have 2 Get Metadata stages in ADF fetching file names from 2 different folders. I need to use these outputs for file name comparison in a Databricks notebook and return true if all the files are present.
How do I pass the output from the Get Metadata stages to Databricks, perform the string comparison, and return true if all files are present and false if even one file is missing?
Please find the below answer, which I explain with one Get Metadata stage; the same can be replicated for more than one.
Create an ADF pipeline with the below activities.
Now in the Get Metadata activity, add childItems in the Field list as an argument, to pass the output of Get Metadata to the Notebook, as shown below.
In the Databricks Notebook activity, add the below expression as a Base Parameter (named metadata_output here, to match the widget in the notebook), which captures the output of Get Metadata and passes it as an input parameter to the Notebook. This parameter would generally be of object datatype, but I converted it to string datatype to access the names of the files in the notebook, as shown below.
@string(activity('Get Metadata1').output.childItems)
Now we can access the Get Metadata output as a string in the notebook.
import ast
required_filenames = ['File1.csv','File2.csv','File3.csv']  # Names to compare against the Get Metadata output.
metadata_value = dbutils.widgets.get('metadata_output')  # Read the Get Metadata output passed in via the base parameter.
metadata_list = ast.literal_eval(metadata_value)  # Convert the string back into a list of dicts.
blob_output_list = []  # Collects the file names returned by Get Metadata.
for i in metadata_list:
    blob_output_list.append(i['name'])  # Each childItems entry carries the file's 'name' (and 'type').
validateif = all(item in blob_output_list for item in required_filenames)  # True only if every required file was found.
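If the notebook also needs to hand the result back to ADF (not covered in the original answer), one option is to exit with the value:
dbutils.notebook.exit(str(validateif))  # The calling Notebook activity can then read this from its output (runOutput).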
I tried it in the above way and was able to solve the provided requirement. Hope this helps.

Azure Data Factory: Copy data if a certain file exists

I have many files in a blob container. However, I want to run a stored procedure only IF a certain file (e.g. SRManifest.csv) exists in the blob container. I used Get Metadata and an If Condition in Data Factory. Can you please help me with the dynamic script for this? I tried this:
@bool(startswith(activity('Get Metadata1').output.childItems.ItemName,'SRManifest.csv'))
It doesn't work. Then I thought, what if I used @greaterOrEquals(activity('Get Metadata1').output.lastModified,adddays(utcnow(),-2))? But this checks that the blob was last modified within 2 days, not that the file exists. Thank you.
Please see my diagram below.
I think I may have understood your requirement differently.
I wanted to run a Stored procedure only IF a certain file (e.g. SRManifest.csv) exists on the blob Container
1 Change your Get Metadata activity to look for the existence of the sentinel file (SRManifest.csv)
2 Follow with an If Condition activity, using the condition shown after this list
3 Put your stored procedure in the True branch of the If activity
If you also need the file list passed to the stored procedure, you'll need a Get Metadata activity with the childItems option inside the If-True branch.
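For step 2, a sketch of that condition, assuming the Get Metadata activity is named Get Metadata1 and its field list includes the Exists argument:
@activity('Get Metadata1').output.exists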
Based on your diagram, since you are already looping over all the blob names, you can add a Boolean variable to the pipeline and set its default value to false:
Inside the ForEach activity, you only want to attempt to set the variable if its value is still false, and if the blob name is found, set it to true. Since Set Variable cannot be self-referential, do this inside the False branch of an If activity:
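Assuming the variable is named FileFound, the If condition could simply be @variables('FileFound'), so the work happens in its False branch.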
This will only attempt to process while the value is false (so the file name has not been found yet), and will do nothing once the value is true. Now set the variable based on your file name:
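For example, assuming the loop items expose the blob name as item().name, the Set Variable value could be @equals(item().name,'SRManifest.csv').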
[NOTE: This value can be hard coded, parameterized, or based on a variable]
When you execute the pipeline, you'll see the Set Variable stops attempting once the value is set to true:
In the main pipeline, after the ForEach activity has completed, you can use the variable to set the condition of your final If activity. If the blob is never found, the variable will still be false, so put the Stored Procedure activity inside the True branch.
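With the same hypothetical variable name, that final condition is again just @variables('FileFound').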

Debugging values into variables or user properties

How can I spy on my values when I'm in ADF debug mode?
I want to build a simple pipeline that digs into a storage account table. For each row, take the value of the second column, use it to create a URL, and call a web service.
I saw the output of the Lookup activity, but how can I see, for example, the content of each item() in the ForEach activity? Can I use the user properties for debugging?
When debugging, I frequently make use of the Set Variable activity. Viewing the output of a Set Variable activity is spying on the value.
You want to see the input to each iteration of your ForEach. Prepend the inner activity with a Set Variable activity. The dynamic content @string(item()) should be enough.