I am trying to use a Lookup activity to return a row count. I am able to do this, but I would then like to run an If Condition against it: if the count returns more than 20 million rows, I want to execute an additional pipeline for further table manipulation. The issue, however, is that I cannot compare the returned value to a static integer. Below is the current dynamic expression I have for this If Condition:
@greater(int(activity('COUNT_RL_WK_GRBY_LOOKUP').output),20000000)
and when fired, the following error is returned:
{
"errorCode": "InvalidTemplate",
"message": "The function 'int' was invoked with a parameter that is not valid. The value cannot be converted to the target type",
"failureType": "UserError",
"target": "If Condition1",
"details": ""
}
Is it possible to convert this returned value to an integer in order to make the comparison? If not, is there a workaround to achieve my desired result?
It looks like the issue is with your dynamic expression. Please correct it as shown below and retry.
If firstRowOnly is set to true: @greater(int(activity('COUNT_RL_WK_GRBY_LOOKUP').output.firstRow.propertyname),20000000)
If firstRowOnly is set to false: @greater(int(activity('COUNT_RL_WK_GRBY_LOOKUP').output.value[zero based index].propertyname),20000000)
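For the int conversion to work, the property you reference should be the count itself, so alias it in the Lookup's source query — for example (a hypothetical query; substitute your own table):
SELECT COUNT(*) AS Row_Cnt FROM RL_WK_GRBY
The expression then becomes @greater(int(activity('COUNT_RL_WK_GRBY_LOOKUP').output.firstRow.Row_Cnt),20000000).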
The lookup result is returned in the output section of the activity run result.
When firstRowOnly is set to true (the default), the output format is as shown in the following code. The lookup result is under a fixed firstRow key. To use the result in a subsequent activity, use the pattern @{activity('MyLookupActivity').output.firstRow.TableName}.
Sample Output JSON code is as follows:
{
"firstRow":
{
"Id": "1",
"TableName" : "Table1"
}
}
When firstRowOnly is set to false, the output format is as shown in the following code. A count field indicates how many records were returned, and the detailed values are displayed under a fixed value array. In such a case, the Lookup activity is typically followed by a ForEach activity: you pass the value array to the ForEach activity's items field by using the pattern @activity('MyLookupActivity').output.value. To access elements in the value array, use the syntax @{activity('lookupActivity').output.value[zero based index].propertyname}, for example @{activity('lookupActivity').output.value[0].tablename}.
Sample Output JSON Code is as follows:
{
"count": "2",
"value": [
{
"Id": "1",
"TableName" : "Table1"
},
{
"Id": "2",
"TableName" : "Table2"
}
]
}
Hope this helps.
Do this: when you run the pipeline in debug mode, look at the output from your Lookup activity. It will give a JSON string that includes the alias for the result of your query. If firstRowOnly is not set, you get a table; but with firstRowOnly you get output, then firstRow, then your alias, and that path is what you specify.
For example, if you set the alias of your count to Row_Cnt, then:
@greater(activity('COUNT_RL_WK_GRBY_LOOKUP').output.firstRow.Row_Cnt,20000000)
You don't need the int function. You were trying to use it (just like I was!) because the expression was complaining about the data type. That's because you were passing a bunch of JSON text as the parameter instead of the value you were after. It makes sense once you realize how it works, but it is not intuitively obvious: the expression does come back with data, but it's string material from the JSON, not the value you want. Functions like equals are perfectly happy with that; it's not until you try something like greater, which expects a numeric value, that it chokes.
Related
I have a lookup activity (Get_ID) that returns:
{
"count": 2,
"value": [
{
"TRGT_VAL": "10000"
},
{
"TRGT_VAL": "52000"
}
],
(...)
I want to use these two values from TRGT_VAL in a WHERE clause of a query in another activity. I'm using
@concat('SELECT * FROM table WHERE column in ',activity('Get_ID').output.value[0].TRGT_VAL)
but only the first value, 10000, is being taken into account. How can I get the whole list?
I solved it by using a lot of replaces:
@concat('(',replace(replace(replace(replace(replace(replace(replace(string(activity('Get_ID').output.value),'{',''),' ',''),'"',''),'TRGT_VAL:',''),'[',''),'}',''),']',''),')')
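Applied to the sample output, string(activity('Get_ID').output.value) yields [{"TRGT_VAL":"10000"},{"TRGT_VAL":"52000"}]; stripping the braces, spaces, quotes, TRGT_VAL: keys, and brackets leaves 10000,52000, which the outer concat wraps in parentheses.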
Output
{
"name": "AptitudeCF",
"value": "(10000,52000)"
}
Instead of using a big expression with a lot of replace functions, you can use string interpolation syntax to frame your query. Below is a query you can consider:
SELECT * FROM table WHERE column in (@{activity('Get_ID').output.value[0].TRGT_VAL},@{activity('Get_ID').output.value[1].TRGT_VAL})
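Note that this hard-codes the two indexes, so it only works when the number of rows is fixed. If the count varies, one option (a sketch, assuming a SQL Server source; the table and column names are placeholders) is to build the whole list in the Lookup query itself:
SELECT '(' + STRING_AGG(TRGT_VAL, ',') + ')' AS target_list
FROM source_table;
You can then reference it with @{activity('Get_ID').output.firstRow.target_list} in the downstream query.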
I have below JSON
{
"id": " https://xxx.vault.azure.net/secrets/xxx ",
"attributes": {
"enabled": true,
"nbf": 1632075242,
"created": 1632075247,
"updated": 1632075247,
"recoveryLevel": "Recoverable+Purgeable"
},
"tags": {}
}
The above JSON is the output of a Web activity, and I am using this output in a ForEach activity. When the above output goes to the ForEach activity as input, all the values come with escape characters:
{"id":" https://xxx.vault.azure.net/secrets/xxx ","attributes":{"enabled":true,"nbf":1632075242,"created":1632075247,"updated":1632075247,"recoveryLevel":"Recoverable+Purgeable"},"tags":{}}
From this JSON, I am trying to get only the xxx value from the id attribute. How can I do this with a dynamic expression?
Any help is much appreciated.
Thanks
Use the built-in functions lastIndexOf (to find the last occurrence of the forward slash), length (to determine the length of a string), add (to add numbers), sub (to subtract numbers) and substring to do this. For example:
@substring(item().id,add(lastIndexOf(item().id,'/'),1),sub(length(item().id),add(lastIndexOf(item().id,'/'),1)))
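For example, with a hypothetical id of https://myvault.vault.azure.net/secrets/mysecret (48 characters, last '/' at zero-based index 39), the expression reduces to substring(item().id, 40, 8) and returns mysecret.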
I'm using Azure Data Factory to retrieve data and copy it into a database. The source looks like this:
{
"GroupIds": [
"4ee1a-0856-4618-4c3c77302b",
"21259-0ce1-4a30-2a499965d9",
"b2209-4dda-4e2f-029384e4ad",
"63ac6-fcbc-8f7e-36fdc5e4f9",
"821c9-aa73-4a94-3fc0bd2338"
],
"Id": "w5a19-a493-bfd4-0a0c8djc05",
"Name": "Test Item",
"Description": "test item description",
"Notes": null,
"ExternalId": null,
"ExpiryDate": null,
"ActiveStatus": 0,
"TagIds": [
"784083-4c77-b8fb-0135046c",
"86de96-44c1-a497-0a308607",
"7565aa-437f-af36-8f9306c9",
"d5d841-1762-8c14-d8420da2",
"bac054-2b6e-a19b-ef5b0b0c"
],
"ResourceIds": []
}
In my ADF pipeline, I am trying to get the count of GroupIds and store that in a database column (along with the associated Id from the JSON above).
Is there some kind of syntax I can use to tell ADF that I just want the count of GroupIds, or is this going to require some kind of recursive loop activity?
You can use the length function in Azure Data Factory (ADF) to check the length of JSON arrays:
@length(json(variables('varSource')).GroupIds)
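For instance (a sketch, assuming varSource is a string variable holding the JSON above), a Set Variable activity with that expression returns 5 for the sample, and the associated Id can be pulled the same way:
@json(variables('varSource')).Id
which evaluates to w5a19-a493-bfd4-0a0c8djc05.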
If you are loading the data into a SQL database, you could use OPENJSON. A simple example:
DECLARE @json NVARCHAR(MAX) = '{
"GroupIds": [
"4ee1a-0856-4618-4c3c77302b",
"21259-0ce1-4a30-2a499965d9",
"b2209-4dda-4e2f-029384e4ad",
"63ac6-fcbc-8f7e-36fdc5e4f9",
"821c9-aa73-4a94-3fc0bd2338"
],
"Id": "w5a19-a493-bfd4-0a0c8djc05",
"Name": "Test Item",
"Description": "test item description",
"Notes": null,
"ExternalId": null,
"ExpiryDate": null,
"ActiveStatus": 0,
"TagIds": [
"784083-4c77-b8fb-0135046c",
"86de96-44c1-a497-0a308607",
"7565aa-437f-af36-8f9306c9",
"d5d841-1762-8c14-d8420da2",
"bac054-2b6e-a19b-ef5b0b0c"
],
"ResourceIds": []
}'
SELECT *
FROM OPENJSON( @json, '$.GroupIds' );

SELECT COUNT(*) countOfGroupIds
FROM OPENJSON( @json, '$.GroupIds' );
The first query returns one row per GroupId (five rows for the sample); the second returns countOfGroupIds = 5.
If your data is stored in a table, the code is similar. Make sense?
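For example, a sketch of the table variant (the table and column names here are hypothetical):
SELECT t.Id, g.countOfGroupIds
FROM dbo.MyTable AS t
CROSS APPLY (
    SELECT COUNT(*) AS countOfGroupIds
    FROM OPENJSON( t.JsonPayload, '$.GroupIds' )
) AS g;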
Another funky way to approach it, if you really need the count in-line, is to convert the JSON to XML using the built-in functions and then run some XPath on it. It's not as complicated as it sounds and allows you to get the result inside the pipeline.
The Data Factory xml function converts JSON to XML, but that JSON must have a single root property. We can fix up the JSON with concat and a single line of code. In this example I'm using a Set Variable activity, where varSource is your original JSON:
@concat('{"root":', variables('varSource'), '}')
Next, we can just apply the XPath with another simple expression:
@string(xpath(xml(json(variables('varIntermed1'))), 'count(/root/GroupIds)'))
The result is the string "5".
Easy, huh? It's a shame there isn't more built-in support for JSONPath, unless I'm missing something, although you can use limited JSONPath in the Copy activity.
You can use a Data Flow activity in the Azure Data Factory pipeline to get the count.
Step1:
Connect the source to the JSON dataset, and in Source options under JSON settings, select Single document.
In the source preview, you can see there are 5 GroupIDs per ID.
Step2:
Use the Flatten transformation to denormalize the values into rows for GroupIDs.
Select the GroupIDs array in Unroll by and Unroll root.
Step3:
Use the Aggregate transformation to get the count of GroupIDs grouped by ID.
Under Group by: select the column for your aggregation from the drop-down.
Under Aggregate: build the expression to get the count of the column (GroupIDs); a sketch follows below.
The Aggregate data preview then shows one row per Id with the GroupIDs count (5 for the sample).
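The count itself might look like this in the Aggregate expression builder (a sketch in the data flow expression language, assuming the flattened column is named GroupIds):
count(GroupIds)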
Step4: Connect the output to a Sink transformation to load the final output to the database.
We have configured Grafana to use a table input data source; it works very well with the fields already defined (like time, status, values, etc.).
But now a new field has been added to the table that is a serialized JSON object, returned from a process we cannot modify.
We need to use a value (a timestamp) that is a property of this serialized object in that string field.
One serialized field value example is this:
{"timestamp":"2020-02-23T18:25:44.012Z","status":"fail","errors":[{"timestamp":"2020-02-23T18:25:43.511Z","message":"invalid key: key is shorter than minimum 16 bytes"},{"timestamp":"2020-02-23T18:25:43.851Z","message":"unauthorized: authorization not possible"}]}
The pretty print is:
{
"timestamp": "2020-02-23T18:25:44.012Z",
"status": "fail",
"errors": [
{
"timestamp": "2020-02-23T18:25:43.511Z",
"message": "invalid key: key is shorter than minimum 16 bytes"
},
{
"timestamp": "2020-02-23T18:25:43.851Z",
"message": "unauthorized: authorization not possible"
}
]
}
Is there any way to use a value like field.timestamp or field.errors[0].timestamp?
Is there a plugin that allows it, or is it not possible at all?
Use PostgreSQL's JSON operators to select from the column in your Grafana query, e.g.:
SELECT
field->'timestamp',
...
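A fuller sketch, assuming the serialized JSON sits in a text column named field in a hypothetical table named events (cast to json first; ->> returns text, #>> walks a path):
SELECT
  (field::json ->> 'timestamp')::timestamptz AS "time",
  field::json ->> 'status' AS status,
  field::json #>> '{errors,0,timestamp}' AS first_error_time
FROM events;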
I have a structure like this:
select to_json('[{
"11111":
{
"Number":{"11111"},
"createdTime":"2018-06-25 10:30:11.047 +0530",
"errorMessage":"invalid"
}
}]')
If I try to convert it to a JSON structure, I am getting the following error:
ERROR: could not determine polymorphic type because input has type unknown
I need to get a valid JSON format. Thanks.
to_json is used to convert values that are not JSON, e.g. a record, to a proper JSON value.
You apparently want to use a string value as JSON.
In order to do that, you need to supply valid JSON. The part "Number":{"11111"} is, however, invalid JSON; you need to remove the curly braces.
select '[{
"11111":
{
"Number": "11111",
"createdTime":"2018-06-25 10:30:11.047 +0530",
"errorMessage":"invalid"
}
}]'::json
But why are you using a JSON array if you only have a single value? From what you have shown, a single JSON value would make more sense:
select '{
"11111":
{
"Number": "11111",
"createdTime":"2018-06-25 10:30:11.047 +0530",
"errorMessage":"invalid"
}
}'::json
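Once it is a proper json value, the nested properties can be reached with the JSON operators, for example (a sketch):
select ('{"11111": {"Number": "11111", "errorMessage": "invalid"}}'::json -> '11111') ->> 'errorMessage';
-- returns: invalid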